Utilities
Math Utilities
- bamt.utils.MathUtils.get_n_nearest(data, columns, corr=False, number_close=5)
Returns the N nearest neighbors of every column of the dataframe, collected into a list
- Parameters:
data (DataFrame) – Proximity matrix
columns (list) – df.columns.tolist()
corr (bool, optional) – _description_. Defaults to False.
number_close (int, optional) – Number of nearest neighbors. Defaults to 5.
- Returns:
groups
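The idea of picking the closest columns from a proximity matrix can be sketched in plain Python. This is a minimal illustration, not the bamt implementation: the function name, the dict-of-dicts matrix, the `higher_is_closer` flag and the choice to leave a column out of its own group are all assumptions made for the example.

```python
# Hypothetical stand-in for illustration only; not the bamt implementation.
def n_nearest_sketch(matrix, columns, higher_is_closer=False, number_close=5):
    """For each column, return the labels of its closest columns.

    matrix is a dict of dicts: matrix[col][other] -> proximity value.
    Assumption: a column is not listed among its own neighbors.
    """
    groups = []
    for col in columns:
        others = [other for other in matrix[col] if other != col]
        # Ascending order treats values as distances; descending as similarities.
        ranked = sorted(others, key=lambda other: matrix[col][other],
                        reverse=higher_is_closer)
        groups.append(ranked[:number_close])
    return groups


distances = {
    "a": {"a": 0.0, "b": 0.2, "c": 0.9},
    "b": {"a": 0.2, "b": 0.0, "c": 0.5},
    "c": {"a": 0.9, "b": 0.5, "c": 0.0},
}
groups = n_nearest_sketch(distances, ["a", "b", "c"], number_close=1)
print(groups)
```

The result holds one list per column, in the order the columns were passed in.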
- bamt.utils.MathUtils.get_proximity_matrix(df, proximity_metric) → DataFrame
- Returns the proximity matrix of the dataframe. The dataframe must be coded first if it contains
categorical data
- Parameters:
df (DataFrame) – data
df_coded (DataFrame) – same data, but coded
proximity_metric (str) – ‘MI’ or ‘corr’
- Returns:
df_distance – proximity matrix (mutual information or correlation, depending on proximity_metric)
- Return type:
DataFrame
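To make the "code first, then measure proximity" requirement concrete, here is a small sketch that integer-codes a categorical column and builds a pairwise Pearson correlation matrix. It only illustrates the concept: the helper names and the dict-based table are hypothetical, and the library's own coding scheme and metric computation may differ.

```python
import math


# Hypothetical helpers for illustration only; not the bamt implementation.
def encode_categories(values):
    """Map each distinct label to an integer code (labels sorted alphabetically)."""
    codes = {label: i for i, label in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def proximity_matrix_sketch(table):
    """table: dict column -> numeric values. Returns a dict of dicts of correlations."""
    cols = list(table)
    return {c1: {c2: pearson(table[c1], table[c2]) for c2 in cols} for c1 in cols}


raw = {
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "color": ["red", "blue", "red", "blue"],
}
# Code the categorical column before computing proximities.
coded = dict(raw, color=encode_categories(raw["color"]))
prox = proximity_matrix_sketch(coded)
print(prox["x"]["y"], prox["x"]["color"])
```

The resulting matrix is symmetric, with one row and one column per input column.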
- bamt.utils.MathUtils.get_brave_matrix(df_columns, proximity_matrix, n_nearest=5) → DataFrame
Returns a matrix of Brave coefficients for the DataFrame; requires a proximity measure to be calculated first
- Parameters:
df_columns (DataFrame) – data.columns
proximity_matrix (DataFrame) – may be generated by the get_mutual_info_score_matrix() function or computed as a correlation matrix with scipy
n_nearest (int, optional) – Number of nearest neighbors. Defaults to 5.
- Returns:
brave_matrix – DataFrame of Brave coefficients
- Return type:
DataFrame
Graph Utilities
- bamt.utils.GraphUtils.nodes_types(data: DataFrame) → Dict[str, str]
- Function to define the type of the node
disc - discrete node
cont - continuous node
- Args:
data: input dataset
- Returns:
dict: output dictionary where ‘key’ - node name and ‘value’ - node type
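A rough sketch of how such a node-type dictionary could be derived is shown below. The rule used here (columns holding only floats are 'cont', everything else is 'disc') is an assumption made for the example, as are the function name and the dict-based table; the library's actual rule may differ.

```python
# Hypothetical stand-in for illustration only; not the bamt implementation.
def nodes_types_sketch(table):
    """table: dict column -> list of values. Returns dict column -> 'disc' or 'cont'."""
    types = {}
    for name, values in table.items():
        # Assumption: a column is continuous only if every value is a float.
        is_float = all(isinstance(v, float) for v in values)
        types[name] = "cont" if is_float else "disc"
    return types


table = {
    "age": [23, 35, 41],
    "income": [1200.5, 980.0, 1500.25],
    "city": ["A", "B", "A"],
}
types = nodes_types_sketch(table)
print(types)
```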
- bamt.utils.GraphUtils.nodes_signs(nodes_types: dict, data: DataFrame) → Dict[str, str]
- Function to define sign of the node
neg - if node has negative values
pos - if node has only positive values
- Parameters:
nodes_types (dict) – dictionary of node types, e.g. the output of nodes_types()
data (pd.DataFrame) – input dataset
- Returns:
output dictionary where ‘key’ - node name and ‘value’ - sign of data
- Return type:
dict
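The sign rule described above can be sketched as follows. Two choices here are assumptions rather than documented behavior: only nodes typed 'cont' receive a sign, and a column with zeros but no negative values is labeled 'pos'. The function name and the dict-based table are likewise hypothetical.

```python
# Hypothetical stand-in for illustration only; not the bamt implementation.
def nodes_signs_sketch(types, table):
    """Return dict column -> 'neg' or 'pos' for the continuous columns."""
    signs = {}
    for name, kind in types.items():
        if kind != "cont":
            continue  # assumption: discrete nodes get no sign
        has_negative = any(v < 0 for v in table[name])
        signs[name] = "neg" if has_negative else "pos"
    return signs


table = {
    "temp": [-3.5, 2.0, 7.1],
    "mass": [1.2, 0.4, 3.3],
    "city": ["A", "B", "A"],
}
types = {"temp": "cont", "mass": "cont", "city": "disc"}
signs = nodes_signs_sketch(types, table)
print(signs)
```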