Preprocess module
- bamt.preprocess.discretization.get_nodes_sign(data: DataFrame) → dict
- Determines the sign of each node.
neg – the node has negative values; pos – the node has only positive values
- Parameters:
data (pd.DataFrame) – input dataset
- Returns:
dictionary mapping each node name (key) to the sign of its data (value)
- Return type:
dict
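The neg/pos rule described above can be illustrated with a minimal sketch. The function `sketch_nodes_sign` is a hypothetical stand-in written for this example, not the bamt source; it assumes that only numeric columns receive a sign and that a column containing zeros but no negative values counts as pos.

```python
import pandas as pd


def sketch_nodes_sign(data: pd.DataFrame) -> dict:
    """Hypothetical re-implementation of the documented rule, not the bamt source."""
    signs = {}
    # Assumption: only numeric columns are given a sign.
    for name in data.select_dtypes(include="number").columns:
        # Any negative value makes the node "neg"; otherwise it is "pos".
        signs[name] = "neg" if (data[name] < 0).any() else "pos"
    return signs


df = pd.DataFrame({"temperature": [-3.5, 0.2, 4.1], "pressure": [1.0, 2.5, 3.0]})
signs = sketch_nodes_sign(df)
print(signs)
```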
- bamt.preprocess.discretization.get_nodes_type(data: DataFrame) → dict
- Determines the type of each node.
disc – discrete node; cont – continuous node
- Parameters:
data (pd.DataFrame) – input dataset
- Returns:
dictionary mapping each node name (key) to its node type (value)
- Return type:
dict
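The documentation does not state how disc and cont are decided. The sketch below shows one plausible rule, purely as an assumption: float columns are treated as continuous and all other columns as discrete. `sketch_nodes_type` is a hypothetical name, and the actual bamt rule may differ.

```python
import pandas as pd


def sketch_nodes_type(data: pd.DataFrame) -> dict:
    """Hypothetical rule: float columns are 'cont', everything else is 'disc'."""
    return {
        name: "cont" if pd.api.types.is_float_dtype(data[name]) else "disc"
        for name in data.columns
    }


df = pd.DataFrame(
    {"depth": [1.5, 2.7, 3.1], "rock": ["sand", "clay", "sand"], "wells": [1, 2, 2]}
)
types = sketch_nodes_type(df)
print(types)
```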
- bamt.preprocess.discretization.discretization(data: DataFrame, method: str, columns: list, bins: int = 5) → Tuple[DataFrame, KBinsDiscretizer]
Discretization of continuous parameters
- Parameters:
data (pd.DataFrame) – input dataset
method (str) – discretization approach (equal_intervals, equal_frequency, kmeans)
columns (list) – names of the columns to discretize
bins (int, optional) – number of bins. Defaults to 5.
- Returns:
output dataset with discretized parameters, and the fitted KBinsDiscretizer instance
- Return type:
Tuple[pd.DataFrame, KBinsDiscretizer]
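As a rough illustration of what such a call involves, the sketch below wraps scikit-learn's `KBinsDiscretizer` directly. The mapping from the documented method names to scikit-learn strategies (`uniform`, `quantile`, `kmeans`) is an assumption, and `sketch_discretization` is a hypothetical helper, not the bamt implementation.

```python
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Assumed mapping from the documented method names to scikit-learn strategies.
STRATEGY = {
    "equal_intervals": "uniform",
    "equal_frequency": "quantile",
    "kmeans": "kmeans",
}


def sketch_discretization(data, method, columns, bins=5):
    """Hypothetical sketch, not the bamt source."""
    result = data.copy()  # leave the caller's DataFrame untouched
    est = KBinsDiscretizer(n_bins=bins, encode="ordinal", strategy=STRATEGY[method])
    # Ordinal encoding yields bin indices 0..bins-1 as floats; cast them to int.
    result[columns] = est.fit_transform(result[columns]).astype(int)
    return result, est


df = pd.DataFrame({"x": [float(i) for i in range(10)], "label": list("aabbccddee")})
binned, est = sketch_discretization(df, "equal_intervals", ["x"], bins=5)
print(binned["x"].tolist())
```

Returning the fitted discretizer alongside the data matters because the same object is needed later to map bin indices back to numeric values.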
- bamt.preprocess.discretization.code_categories(data: DataFrame, method: str, columns: list) → Tuple[DataFrame, dict]
Encoding categorical parameters
- Parameters:
data (pd.DataFrame) – input dataset
method (str) – method of encoding (label or onehot)
columns (list) – names of the categorical columns
- Returns:
output dataset with encoded parameters, and a dictionary with values and codes
- Return type:
Tuple[pd.DataFrame, dict]
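Label encoding, one of the two methods listed above, can be sketched in plain pandas. The helper `sketch_label_encode` is hypothetical, and the layout of the returned dictionary (`{column: {label: code}}`) is an assumption for illustration; the structure bamt actually returns may differ.

```python
import pandas as pd


def sketch_label_encode(data, columns):
    """Hypothetical label encoding; the layout of the returned dict is an assumption."""
    result = data.copy()
    encoder_dict = {}
    for col in columns:
        # Sorted unique labels get consecutive integer codes starting at 0.
        labels = sorted(result[col].unique())
        mapping = {label: code for code, label in enumerate(labels)}
        result[col] = result[col].map(mapping)
        encoder_dict[col] = mapping
    return result, encoder_dict


df = pd.DataFrame({"rock": ["sand", "clay", "sand", "shale"]})
encoded, codes = sketch_label_encode(df, ["rock"])
print(encoded["rock"].tolist())
print(codes)
```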
- bamt.preprocess.discretization.inverse_discretization(data: DataFrame, columns: list, discretizer: KBinsDiscretizer) → DataFrame
Inverse discretization for numeric params
- Parameters:
data (pd.DataFrame) – input dataset with discrete values
columns (list) – columns for inverse discretization
discretizer (KBinsDiscretizer) – fitted instance of the discretization class
- Returns:
output dataset with continuous values
- Return type:
pd.DataFrame
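Inverse discretization cannot recover the original values exactly. The sketch below uses scikit-learn's `KBinsDiscretizer.inverse_transform` directly to show this: each bin index is mapped to the centre of its bin. Whether bamt delegates to this method in the same way is an assumption.

```python
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

df = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]})

# Fit five equal-width bins over the range [0, 9] and encode as bin indices.
est = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform")
codes = est.fit_transform(df[["x"]])

# inverse_transform maps every bin index to the centre of its bin,
# so the original values are recovered only approximately.
restored = est.inverse_transform(codes)
df_restored = pd.DataFrame(restored, columns=["x"])
print(df_restored["x"].tolist())
```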
- bamt.preprocess.discretization.decode(data: DataFrame, columns: list, encoder_dict: dict) → DataFrame
Decoding categorical params to initial labels
- Parameters:
data (pd.DataFrame) – input dataset with encoded params
columns (list) – columns for decoding
encoder_dict (dict) – dictionary with values and codes
- Returns:
output dataset with decoded params
- Return type:
pd.DataFrame
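Decoding amounts to inverting the value-to-code dictionary and mapping each column through it. The sketch below assumes `encoder_dict[column]` maps each original label to its integer code; that layout and the helper name `sketch_decode` are assumptions, not taken from the bamt source.

```python
import pandas as pd


def sketch_decode(data, columns, encoder_dict):
    """Hypothetical decoding; assumes encoder_dict[col] maps label -> integer code."""
    result = data.copy()
    for col in columns:
        # Invert the mapping so that codes point back to labels.
        inverse = {code: label for label, code in encoder_dict[col].items()}
        result[col] = result[col].map(inverse)
    return result


encoded = pd.DataFrame({"rock": [1, 0, 1, 2]})
encoder_dict = {"rock": {"clay": 0, "sand": 1, "shale": 2}}
decoded = sketch_decode(encoded, ["rock"], encoder_dict)
print(decoded["rock"].tolist())
```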
- bamt.preprocess.graph.nodes_from_edges(edges: list)
- Retrieves all nodes from the list of edges.
Arguments
edges : list
Returns
nodes : list
Effects
None
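A minimal sketch of collecting nodes from an edge list follows. It assumes each edge is a `(parent, child)` pair and keeps nodes in first-seen order; neither the edge format nor the ordering is stated in the documentation, and `sketch_nodes_from_edges` is a hypothetical name.

```python
def sketch_nodes_from_edges(edges):
    """Hypothetical sketch; assumes each edge is a (parent, child) pair."""
    nodes = []
    for parent, child in edges:
        for node in (parent, child):
            if node not in nodes:  # keep first-seen order, drop duplicates
                nodes.append(node)
    return nodes


edges = [("A", "B"), ("A", "C"), ("B", "C")]
nodes = sketch_nodes_from_edges(edges)
print(nodes)  # ['A', 'B', 'C']
```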
- bamt.preprocess.graph.edges_to_dict(edges: list)
- Converts the list of edges into a dictionary of parents.
Arguments
edges : list
Returns
parents_dict : dict
Effects
None
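The conversion to a dictionary of parents can be sketched as follows, again assuming `(parent, child)` pairs. Giving root nodes an empty parent list is a choice made for this example; `sketch_edges_to_dict` is a hypothetical helper and the bamt output may be shaped differently.

```python
def sketch_edges_to_dict(edges):
    """Hypothetical sketch; assumes each edge is a (parent, child) pair."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, [])  # nodes without parents keep an empty list
        parents.setdefault(child, []).append(parent)
    return parents


edges = [("A", "B"), ("A", "C"), ("B", "C")]
parents_dict = sketch_edges_to_dict(edges)
print(parents_dict)  # {'A': [], 'B': ['A'], 'C': ['A', 'B']}
```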
- bamt.preprocess.numpy_pandas.loc_to_DataFrame(data: array)
Function to convert an array to a DataFrame
- Parameters:
data (np.array) – input array
- Returns:
DataFrame with string column names for filtering
- Return type:
pd.DataFrame
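The effect of string column names can be shown with plain pandas. This is a sketch of the general idea only and does not call bamt; it assumes a 2-D array whose columns are the variables.

```python
import numpy as np
import pandas as pd

arr = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # 3 rows, 2 columns

df = pd.DataFrame(arr)
df.columns = df.columns.map(str)  # integer positions 0, 1 become "0", "1"

# String names allow name-based selection such as filter(items=...).
first = df.filter(items=["0"])
print(list(df.columns))
```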
- bamt.preprocess.numpy_pandas.get_type_numpy(data: array)
- Determines the type of each column of the array.
disc – discrete node; cont – continuous node
- Parameters:
data (np.array) – input array
- Returns:
dictionary mapping each node name (key) to its node type (value)
- Return type:
dict
Notes: – Rows and columns of the input array are easy to confuse; check the array's orientation before calling.