DES-Kullback-Leibler

class deslib.des.probabilistic.DESKL(pool_classifiers=None, k=None, DFP=False, with_IH=False, safe_k=None, IH_rate=0.3, mode='selection', random_state=None, knn_classifier='knn', knn_metric='minkowski', DSEL_perc=0.5, n_jobs=-1, voting='hard')

Dynamic Ensemble Selection-Kullback-Leibler divergence (DES-KL).

This method estimates the competence of each base classifier from an information-theoretic perspective. The competence of a base classifier is calculated as the KL divergence between the vector of class supports produced by the base classifier and the outputs of a random classifier (RC), RC = 1/L, L being the number of classes in the problem. Classifiers with a competence higher than that of the random classifier are selected.

Parameters:

pool_classifiers : list of classifiers (Default = None)

The generated pool of classifiers trained for the corresponding classification problem. Each base classifier should support the method “predict”. If None, then the pool of classifiers is a bagging classifier.

k : int (Default = None)

Number of neighbors used to estimate the competence of the base classifiers.

DFP : Boolean (Default = False)

Determines whether dynamic frienemy pruning (DFP) is applied.

with_IH : Boolean (Default = False)

Whether the hardness level of the region of competence is used to decide between using the DS algorithm or the KNN for classification of a given query sample.

safe_k : int (default = None)

The size of the indecision region.

IH_rate : float (default = 0.3)

Hardness threshold. If the hardness level of the competence region is lower than IH_rate, the KNN classifier is used. Otherwise, the DS algorithm is used for classification.

mode : String (Default = “selection”)

Whether the technique will perform dynamic selection, dynamic weighting or a hybrid approach for classification.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

knn_classifier : {‘knn’, ‘faiss’, None} (Default = ‘knn’)

The algorithm used to estimate the region of competence:

‘knn’ will use KNeighborsClassifier from sklearn
‘faiss’ will use Facebook’s Faiss similarity search through the class FaissKNNClassifier
None, will use sklearn KNeighborsClassifier.

knn_metric : {‘minkowski’, ‘cosine’, ‘mahalanobis’} (Default = ‘minkowski’)

The metric used by the k-NN classifier to estimate distances.

‘minkowski’ will use the Minkowski distance.
‘cosine’ will use the cosine distance.
‘mahalanobis’ will use the Mahalanobis distance.

DSEL_perc : float (Default = 0.5)

Percentage of the input data used to fit DSEL. Note: This parameter is only used if the pool of classifiers is None or unfitted.

voting : {‘hard’, ‘soft’}, default=’hard’

If ‘hard’, uses predicted class labels for majority rule voting. Else if ‘soft’, predicts the class label based on the argmax of the sums of the predicted probabilities, which is recommended for an ensemble of well-calibrated classifiers.

n_jobs : int, default=-1

The number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. Doesn’t affect fit method.
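
The difference between the two `voting` options described above can be illustrated with a small, library-independent sketch. The helper names `hard_vote` and `soft_vote` and the sample probabilities are hypothetical and are not part of DESlib; ties are resolved in favor of the lowest class index, which is an assumption of this sketch.

```python
def hard_vote(probabilities):
    """Majority vote over the class labels predicted by each classifier."""
    n_classes = len(probabilities[0])
    votes = [0] * n_classes
    for probs in probabilities:
        # Each classifier votes for its most probable class.
        votes[max(range(n_classes), key=lambda c: probs[c])] += 1
    return max(range(n_classes), key=lambda c: votes[c])


def soft_vote(probabilities):
    """Argmax of the summed class probabilities."""
    n_classes = len(probabilities[0])
    sums = [sum(probs[c] for probs in probabilities) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: sums[c])


# Hypothetical outputs of three selected classifiers for one query sample.
ensemble_output = [[0.9, 0.1], [0.4, 0.6], [0.4, 0.6]]
hard_label = hard_vote(ensemble_output)
soft_label = soft_vote(ensemble_output)
```

In this example two of the three classifiers weakly prefer the second class, so hard voting picks it, while the single confident classifier dominates the summed probabilities under soft voting. This is why soft voting is usually suggested only when the probability estimates are well calibrated.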

References

Woloszynski, Tomasz, et al. “A measure of competence based on random classification for dynamic ensemble selection.” Information Fusion 13.3 (2012): 207-213.

Woloszynski, Tomasz, and Marek Kurzynski. “A probabilistic model of classifier competence for dynamic ensemble selection.” Pattern Recognition 44.10 (2011): 2656-2668.

R. M. O. Cruz, R. Sabourin, and G. D. Cavalcanti, “Dynamic classifier selection: Recent advances and perspectives,” Information Fusion, vol. 41, pp. 195 – 216, 2018.

estimate_competence(competence_region, distances, predictions=None)

Estimate the competence of each base classifier \(c_{i}\) using the source of competence \(C_{src}\) and the potential function model. The source of competence \(C_{src}\) for all data points in DSEL is pre-computed in the fit() step.

\[\delta_{i,j} = \frac{\sum_{k=1}^{N} C_{src} \: \exp(-d(\mathbf{x}_{k}, \mathbf{x}_{q})^{2})} {\sum_{k=1}^{N} \exp(-d(\mathbf{x}_{k}, \mathbf{x}_{q})^{2})}\]

Parameters:

competence_region : array of shape (n_samples, n_neighbors)

Indices of the k nearest neighbors for each test sample.

distances : array of shape (n_samples, n_neighbors)

Distances from the k nearest neighbors to the query.

predictions : array of shape (n_samples, n_classifiers)

Predictions of the base classifiers for all test examples.

Returns:

competences : array of shape (n_samples, n_classifiers)

Competence level estimated for each base classifier and test example.
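
The potential function model can be sketched for a single base classifier and a single query as a Gaussian-weighted average of the pre-computed source competences. The function name `potential_competence` is hypothetical, and normalizing by the sum of the potentials over the neighbors is an assumption of this sketch rather than a statement about the library's internals.

```python
import math


def potential_competence(c_src, distances):
    """Weighted average of source competences for one classifier and one query.

    c_src: source competence of the classifier at each neighbor x_k.
    distances: distance d(x_k, x_q) from each neighbor to the query.
    """
    # Gaussian potential: closer validation points get larger weights.
    weights = [math.exp(-(d ** 2)) for d in distances]
    weighted_sum = sum(c * w for c, w in zip(c_src, weights))
    # Assumption: normalize by the total potential. Very large distances can
    # make every weight underflow to zero, which this sketch does not guard.
    return weighted_sum / sum(weights)
```

With equal distances the result is the plain mean of the source competences; as one neighbor gets much closer than the others, its source competence dominates the estimate.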

fit(X, y)

Train the DS model by setting up the KNN algorithm and pre-processing the information required to apply the DS method. For probabilistic techniques, the source of competence (C_src) is calculated for each data point in DSEL to speed up the testing phase.

C_src is estimated with the source_competence() function that is overridden by each DS method based on this paradigm.

Parameters:

X : array of shape (n_samples, n_features)

Data used to fit the model.

y : array of shape (n_samples)

Class labels of each example in X.

Returns:

self : object

Returns self.

predict(X)

Predict the class label for each sample in X.

Parameters:

X : array of shape (n_samples, n_features)

The input data.

Returns:

predicted_labels : array of shape (n_samples)

Predicted class label for each sample in X.

predict_proba(X)

Estimates the posterior probabilities for each sample in X.

Parameters:

X : array of shape (n_samples, n_features)

The input data.

Returns:

predicted_proba : array of shape (n_samples, n_classes)

Probability estimates for each sample in X.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

Mean accuracy of `self.predict(X)` with respect to y.

select(competences)

Selects the base classifiers that obtained a competence level higher than the predefined threshold. In this case, the threshold indicates the competence of the random classifier.

Parameters:

competences : array of shape (n_samples, n_classifiers)

Competence level estimated for each base classifier and test example.

Returns:

selected_classifiers : array of shape (n_samples, n_classifiers)

Boolean matrix containing True if the base classifier is selected, False otherwise.
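
The selection rule can be sketched as a simple threshold on the competence matrix. Because the KL divergence of the random classifier from itself is zero, the sketch uses a threshold of 0. The function name `select_sketch` is hypothetical, and the fallback of keeping every classifier when none exceeds the threshold is an assumption added so that each query always has at least one voter.

```python
def select_sketch(competences, threshold=0.0):
    """Boolean selection matrix from per-query competence levels.

    competences: list of rows, one row per query sample and one value
    per base classifier.
    """
    selected = [[c > threshold for c in row] for row in competences]
    for row in selected:
        # Assumption: if no classifier beats the random classifier,
        # fall back to using the whole pool for that query.
        if not any(row):
            row[:] = [True] * len(row)
    return selected


# Two hypothetical queries evaluated against three base classifiers.
mask = select_sketch([[0.2, -0.1, 0.0], [-0.3, -0.2, -0.1]])
```

For the first query only the first classifier is kept; for the second query no classifier is better than random, so the sketch keeps all three.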

source_competence()

Calculates the source of competence using the KL divergence method.

The source of competence C_src at the validation point \(\mathbf{x}_{k}\) is calculated by the KL divergence between the vector of class supports produced by the base classifier and the outputs of a random classifier (RC) RC = 1/L, L being the number of classes in the problem. The value of C_src is negative if the base classifier misclassified the instance \(\mathbf{x}_{k}\).

Returns:

C_src : array of shape (n_samples, n_classifiers)

The competence source for each base classifier at each data point.
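
The computation described above can be sketched for one base classifier at one validation point. The function names are hypothetical. The use of the natural logarithm, the convention that zero supports contribute nothing, and the absence of any normalization are assumptions of this sketch, so the library may scale the values differently.

```python
import math


def kl_to_uniform(supports):
    """KL divergence between a support vector and the random classifier 1/L."""
    n_classes = len(supports)
    # p * log(p / (1/L)) = p * log(p * L); zero supports contribute nothing.
    return sum(p * math.log(p * n_classes) for p in supports if p > 0.0)


def source_competence_sketch(supports, true_label):
    """Signed source of competence at one validation point."""
    predicted = max(range(len(supports)), key=lambda c: supports[c])
    magnitude = kl_to_uniform(supports)
    # The sign is negative when the classifier misclassifies the point.
    return magnitude if predicted == true_label else -magnitude
```

A classifier that outputs uniform supports gets a source competence of zero, a confident correct prediction gets a large positive value, and a confident wrong prediction gets a large negative value.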