Scores redefiner

bamt.redef_info_scores.info_score(edges: list, data: DataFrame, method='LL')

bamt.redef_info_scores.log_likelihood(bn, data, method='LL')

Computes the log-likelihood of the data given the structure and parameters of a Bayesian Network. This is a simple calculation, but it is useful as a straightforward structure learning score.

Semantically, this is the log-likelihood of the data, given the structure and parameters of the BN:

log( P( D | Theta_G, G ) )

where Theta_G are the parameters and G is the structure.
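As a rough illustration of the expression above, the sketch below evaluates log P(D | Theta_G, G) directly for a hypothetical two-node network A -> B. The parameter tables, the toy data, and the helper name `log_likelihood_direct` are assumptions made for this example; they are not part of bamt.

```python
import numpy as np

# Hypothetical parameters Theta_G for the structure G: A -> B (binary variables).
p_a = np.array([0.6, 0.4])                # P(A)
p_b_given_a = np.array([[0.9, 0.1],       # P(B | A=0)
                        [0.2, 0.8]])      # P(B | A=1)

# Toy data set D: each row is one observation (a, b).
data = np.array([[0, 0], [0, 0], [1, 1], [1, 0], [0, 1]])

def log_likelihood_direct(data, p_a, p_b_given_a):
    """Sum log P(a) + log P(b | a) over all rows of the data."""
    a, b = data[:, 0], data[:, 1]
    return float(np.sum(np.log(p_a[a]) + np.log(p_b_given_a[a, b])))

ll = log_likelihood_direct(data, p_a, p_b_given_a)
print(ll)
```

Because the rows are treated as independent, the log-likelihood is a plain sum of per-row, per-node terms, which is what makes the decomposition discussed next possible.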

However, for computational reasons it is best to take advantage of the decomposability of the log-likelihood score.

As an example, if you add an edge A -> B, you only need to calculate log(P’(B|A)) - log(P(B)). If the value is positive, the edge improves the fitness score and should therefore be included.

Furthermore, you can expand and rearrange terms to express the difference between the new graph and the original graph as follows:

Score(G’) - Score(G) = M * I(X,Y), where X and Y are the endpoints of the added edge, M is the number of data points, and I(X,Y) is the marginal mutual information calculated using the empirical distribution over the data.
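The identity above can be checked numerically. The sketch below assumes integer-coded discrete data and maximum-likelihood (empirical) parameters; the helper names `empirical_joint`, `mutual_information` and `score_gain_of_edge` are hypothetical and are not bamt functions.

```python
import numpy as np

def empirical_joint(x, y):
    """Empirical joint distribution of two integer-coded columns."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1.0)
    return joint / len(x)

def mutual_information(x, y):
    """I(X;Y) in nats, computed from the empirical joint distribution."""
    joint = empirical_joint(x, y)
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def score_gain_of_edge(a, b):
    """Sum over rows of log P'(b | a) - log P(b), using empirical estimates."""
    joint = empirical_joint(a, b)
    pa = joint.sum(axis=1)
    pb = joint.sum(axis=0)
    p_b_given_a = joint[a, b] / pa[a]
    return float(np.sum(np.log(p_b_given_a) - np.log(pb[b])))

rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=200)
b = np.where(rng.random(200) < 0.8, a, 1 - a)  # B copies A about 80% of the time

gain = score_gain_of_edge(a, b)
m_times_mi = len(a) * mutual_information(a, b)
print(gain, m_times_mi)  # the two values agree up to rounding error
```

With empirical parameters the gain is never negative, so the plain log-likelihood always favours adding edges; this is one reason penalized variants of the score exist.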

In general, the likelihood score decomposes as follows:

LL(D | Theta_G, G) = M * Sum over variables X ( I( X, Parents(X) ) ) - M * Sum over variables X ( H(X) ),

where I is the mutual information, H is the entropy, and M is the number of data points.

Moreover, H(X) does not depend on the choice of graph structure G. Thus, to compare two graphs we only need the difference in mutual information between a given node with its original parents and the same node with its new parents.
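The decomposition above can be compared against a direct computation of the log-likelihood. This is a minimal sketch for discrete data with empirical parameters; the `parents` dictionary (column index to list of parent column indices) and all function names are assumptions for illustration, not bamt's implementation.

```python
import numpy as np
from collections import Counter
from math import log

def entropy(rows):
    """Empirical entropy (nats) of a sequence of hashable outcomes."""
    m = len(rows)
    return -sum(c / m * log(c / m) for c in Counter(rows).values())

def mutual_information(xs, pas):
    """I(X; Pa) = H(X) + H(Pa) - H(X, Pa); zero when Pa is empty."""
    return entropy(xs) + entropy(pas) - entropy(list(zip(xs, pas)))

def columns(data, x, pa):
    """Return node values and parent configurations as hashable lists."""
    xs = [int(v) for v in data[:, x]]
    block = data[:, np.asarray(pa, dtype=int)]
    pas = [tuple(int(v) for v in row) for row in block]
    return xs, pas

def ll_decomposed(data, parents):
    """M * sum_X I(X; Parents(X)) - M * sum_X H(X)."""
    total = 0.0
    for x, pa in parents.items():
        xs, pas = columns(data, x, pa)
        total += mutual_information(xs, pas) - entropy(xs)
    return data.shape[0] * total

def ll_direct(data, parents):
    """Sum over rows and nodes of log P(x | parents), estimated from counts."""
    total = 0.0
    for x, pa in parents.items():
        xs, pas = columns(data, x, pa)
        joint, marg = Counter(zip(xs, pas)), Counter(pas)
        total += sum(c * log(c / marg[p]) for (_, p), c in joint.items())
    return total

rng = np.random.default_rng(1)
a = rng.integers(0, 2, size=300)
b = np.where(rng.random(300) < 0.7, a, 1 - a)
c = (a + b + rng.integers(0, 2, size=300)) % 3
data = np.column_stack([a, b, c])
parents = {0: [], 1: [0], 2: [0, 1]}  # hypothetical structure: A -> B, {A, B} -> C

print(ll_decomposed(data, parents), ll_direct(data, parents))
```

Only the mutual-information terms change when a node's parent set changes, so a structure search can cache the entropy terms and re-evaluate a single node per candidate edit.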

NOTE: This assumes the parameters have already been learned for the BN’s given structure.

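The score has the general penalized form LL = LL - f(N)*|B|, where |B| is the number of parameters in the network and f(N) is a penalty weight that depends on the number of data points N. For the plain log-likelihood, f(N) = 0, so no penalty is applied.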
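To make the role of f(N) concrete, the sketch below applies the penalty for three values of `method`. Only f(N) = 0 for 'LL' comes from the text above; the AIC and BIC weights are the conventional textbook choices and are an assumption here, as is the function name `penalized_score`. Neither has been checked against bamt's source.

```python
from math import log

def penalized_score(ll, n_rows, n_params, method="LL"):
    """Return LL - f(N) * |B| for a chosen penalty weight f(N)."""
    penalties = {
        "LL": 0.0,                  # stated above: f(N) = 0
        "AIC": 1.0,                 # assumption: conventional AIC weight
        "BIC": 0.5 * log(n_rows),   # assumption: conventional BIC weight
    }
    if method not in penalties:
        raise ValueError(f"unknown method: {method!r}")
    return ll - penalties[method] * n_params

# Example: a log-likelihood of -100 on 50 rows with 4 free parameters.
for name in ("LL", "AIC", "BIC"):
    print(name, penalized_score(-100.0, n_rows=50, n_params=4, method=name))
```

Under these assumed weights the BIC penalty grows with the sample size while the AIC penalty does not, so BIC tends to prefer sparser graphs on large data sets.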

Arguments

bn : a BayesNet object. Must have both structure and parameters instantiated.

Notes

    NROW = data.shape[0]
    mi_score = 0
    ent_score = 0
    for rv in bn.nodes():
        # column index of rv followed by the column indices of its parents
        cols = [bn.V.index(rv)] + [bn.V.index(p) for p in bn.parents(rv)]
        mi_score += mutual_information(data[:, cols])
        ent_score += entropy(data[:, bn.V.index(rv)])

    return NROW * (mi_score - ent_score)

bamt.redef_info_scores.log_lik_local(data, method='LL')

bamt.redef_info_scores.BIC_local(data, method='BIC')

bamt.redef_info_scores.num_params(data)

bamt.redef_info_scores.AIC_local(data, method='AIC')