A posteriori

class deslib.dcs.a_posteriori.APosteriori(pool_classifiers, k=7, DFP=False, with_IH=False, safe_k=None, IH_rate=0.3, selection_method='diff', diff_thresh=0.1, rng=<mtrand.RandomState object>)

A Posteriori Dynamic classifier selection.

This method works similarly to the LCA technique. The only difference is that the competence estimation also considers the scores obtained by the base classifiers and the distance between the test sample and each pattern in the region of competence.

Parameters:

pool_classifiers : list of classifiers: The generated pool of classifiers trained for the corresponding classification problem. The classifiers should support the methods “predict” and “predict_proba”.
k : int (Default = 7): Number of neighbors used to estimate the competence of the base classifiers.
DFP : Boolean (Default = False): Determines if the dynamic frienemy pruning is applied.
with_IH : Boolean (Default = False): Whether the hardness level of the region of competence is used to decide between using the DS algorithm or the KNN for classification of a given query sample.
safe_k : int (default = None): The size of the indecision region.
IH_rate : float (default = 0.3): Hardness threshold. If the hardness level of the region of competence is lower than IH_rate, the KNN classifier is used. Otherwise, the DS algorithm is used for classification.
selection_method : String (Default = “diff”): Determines which method is used to select the base classifier after the competences are estimated.
diff_thresh : float (Default = 0.1): Threshold used to measure the difference between the competence levels of the base classifiers in the random and diff selection schemes. If the difference is lower than the threshold, their performance is considered equivalent.
rng : numpy.random.RandomState instance: Random number generator to assure reproducible results.
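The with_IH and IH_rate rule described above can be illustrated with a small sketch. The hardness measure used here (the fraction of neighbors that disagree with the majority class of the region) is an assumption made for illustration, and the function names are hypothetical; the library may compute hardness differently.

```python
from collections import Counter


def hardness_level(neighbor_labels):
    """Hypothetical hardness measure (an assumption, not the library's):
    fraction of neighbors that disagree with the region's majority class."""
    counts = Counter(neighbor_labels)
    majority_count = counts.most_common(1)[0][1]
    return 1.0 - majority_count / len(neighbor_labels)


def choose_strategy(neighbor_labels, ih_rate=0.3):
    """Return 'knn' when the region is easier than ih_rate, else 'ds'."""
    if hardness_level(neighbor_labels) < ih_rate:
        return "knn"  # easy region: plain KNN classification
    return "ds"       # hard region: dynamic selection


# Labels of the k=7 nearest neighbors of two hypothetical query samples.
easy_region = [0, 0, 0, 0, 0, 0, 1]
hard_region = [0, 0, 0, 0, 1, 1, 1]
print(choose_strategy(easy_region), choose_strategy(hard_region))
```

With the default threshold of 0.3, a region where one of seven neighbors disagrees falls back to KNN, while a region split four to three is handed to the DS algorithm.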

References

G. Giacinto and F. Roli, “Methods for dynamic classifier selection,” 10th Int. Conf. on Image Anal. and Proc., Venice, Italy (1999), 659-664.

Ko, Albert HR, Robert Sabourin, and Alceu Souza Britto Jr. “From dynamic classifier selection to dynamic ensemble selection.” Pattern Recognition 41.5 (2008): 1718-1731.

Britto, Alceu S., Robert Sabourin, and Luiz ES Oliveira. “Dynamic selection of classifiers—a comprehensive review.” Pattern Recognition 47.11 (2014): 3665-3680.

R. M. O. Cruz, R. Sabourin, and G. D. Cavalcanti, “Dynamic classifier selection: Recent advances and perspectives,” Information Fusion, vol. 41, pp. 195 – 216, 2018.

estimate_competence(query)

Estimate the competence of each base classifier ci for the classification of the query sample using the A Posteriori method.

The A Posteriori method considers the probability of correct classification of the base classifier ci, taking into account the supports obtained by the base classifier ci for the samples belonging to the region of competence. The probability of correct classification for a base classifier ci is calculated taking into account only the samples in the region of competence from a specific class wl. In this case, wl is the class predicted by the base classifier ci for the query sample.

This method also weights the influence of each training sample according to its Euclidean distance to the query instance. The closest samples have a higher influence in the computation of the competence level.

Returns an array containing the level of competence estimated using the A Posteriori method for each base classifier. The size of the array is equal to the size of the pool of classifiers.

Parameters:	query : array of shape = [n_features] The query sample.
Returns:	competences : array of shape = [n_classifiers] The competence level estimated for each base classifier.
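The competence estimate described above can be sketched for a single base classifier as follows. This is a minimal illustration, assuming the formulation commonly given in the cited literature: the distance-weighted support for the predicted class wl, summed over the neighbors that truly belong to wl, divided by the same sum over all neighbors. The function name, argument layout, and the inverse-distance weighting are assumptions and may not match the library's exact computation.

```python
import math


def a_posteriori_competence(query, neighbors, neighbor_labels, supports,
                            predicted_class):
    """Sketch of the A Posteriori competence of one base classifier.

    query           : feature vector of the query sample
    neighbors       : feature vectors in the region of competence
    neighbor_labels : true class index of each neighbor
    supports        : supports[j][c] is the classifier's score for class c
                      on neighbor j
    predicted_class : class wl predicted by the classifier for the query
    """
    eps = 1e-10  # avoids division by zero when a neighbor equals the query
    numerator = 0.0
    denominator = 0.0
    for x, label, support in zip(neighbors, neighbor_labels, supports):
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(query, x)))
        weight = 1.0 / (distance + eps)  # closer samples weigh more
        weighted_support = support[predicted_class] * weight
        denominator += weighted_support
        if label == predicted_class:     # only samples from class wl
            numerator += weighted_support
    return numerator / denominator if denominator > 0 else 0.0


competence = a_posteriori_competence(
    query=[0.0, 0.0],
    neighbors=[[1.0, 0.0], [0.0, 2.0]],
    neighbor_labels=[0, 1],
    supports=[[0.8, 0.2], [0.6, 0.4]],
    predicted_class=0,
)
print(round(competence, 4))
```

Repeating this computation for every classifier in the pool would give the array of shape [n_classifiers] that the method returns.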

predict(X)

Predict the class label for each sample in X.

Parameters:	X : array of shape = [n_samples, n_features] The input data.
Returns:	predicted_labels : array of shape = [n_samples] Predicted class label for each sample in X.

predict_proba(X)

Estimates the posterior probabilities for each sample in X.

Parameters:	X : array of shape = [n_samples, n_features] The input data.
Returns:	predicted_proba : array of shape = [n_samples, n_classes] The probability estimates for each class in the classifier model.

score(X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:	X : array-like, shape = (n_samples, n_features) Test samples.
	y : array-like, shape = (n_samples) or (n_samples, n_outputs) True labels for X.
	sample_weight : array-like, shape = [n_samples], optional Sample weights.
Returns:	score : float Mean accuracy of self.predict(X) with respect to y.
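For single-label problems, the mean accuracy amounts to the (optionally weighted) fraction of correct predictions. The helper below is a hypothetical stand-in written for illustration, not the library's implementation:

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of samples whose predicted label matches the truth."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)  # unweighted case
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]  # hypothetical output of predict(X)
print(mean_accuracy(y_true, y_pred))
print(mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1]))
```

Giving the misclassified sample a larger weight lowers the score, which is the effect sample_weight is meant to have.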

select(competences)

Select the most competent classifier for the classification of the query sample given the competence level estimates. Four selection schemes are available.

Best : The base classifier with the highest competence level is selected. In cases where more than one base classifier achieves the same competence level, the one with the lowest index is selected. This method is the standard for the LCA, OLA, and MLA techniques.

Diff : Selects the base classifier that is significantly better than the others in the pool (when the difference between its competence level and the competence level of the other base classifiers is higher than a predefined threshold). If no base classifier is significantly better, the base classifier is selected randomly among the members with equivalent competence levels.

Random : Selects a random base classifier among all base classifiers that achieved the same competence level.

ALL : All base classifiers with the maximum competence level estimate are selected (note that in this case the DCS technique becomes a DES technique).

Parameters:	competences : array of shape = [n_classifiers] The estimated competence level for each base classifier.
Returns:	selected_clf : index of the selected base classifier(s)
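The four schemes above can be sketched in plain Python. This is an illustrative approximation rather than the library's code: the function name select_classifier, the lowercase method strings, and the use of Python's random module in place of a NumPy RandomState are assumptions. In particular, the "random" branch here uses exact ties, whereas the diff_thresh description suggests the threshold may also apply to that scheme.

```python
import random


def select_classifier(competences, method="diff", diff_thresh=0.1, rng=None):
    """Return the index (or list of indices for 'all') of the selected
    base classifier(s), given one competence estimate per classifier."""
    if rng is None:
        rng = random.Random(0)  # fixed seed for reproducible tie-breaking
    best_score = max(competences)
    tied = [i for i, c in enumerate(competences) if c == best_score]

    if method == "best":
        return tied[0]           # lowest index among the top scorers
    if method == "all":
        return tied              # every classifier at the maximum level
    if method == "random":
        return rng.choice(tied)  # random pick among exact ties
    if method == "diff":
        # Classifiers within diff_thresh of the best count as equivalent.
        # If only the best one qualifies, it is returned directly.
        equivalent = [i for i, c in enumerate(competences)
                      if best_score - c < diff_thresh]
        return rng.choice(equivalent)
    raise ValueError("unknown selection method: %r" % method)


competences = [0.9, 0.5, 0.85]
print(select_classifier(competences, method="best"))
print(select_classifier(competences, method="all"))
print(select_classifier(competences, method="diff", diff_thresh=0.1))
```

With these competences, "best" and "all" single out classifier 0, while "diff" treats classifiers 0 and 2 as equivalent and picks one of them at random.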