META-DES

class deslib.des.meta_des.METADES(pool_classifiers, meta_classifier=MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True), k=7, kp=5, Hc=1.0, selection_threshold=0.5, mode='selection', DFP=False, with_IH=False, safe_k=None, IH_rate=0.3)

Meta learning for dynamic ensemble selection (META-DES).

The META-DES framework is based on the assumption that the dynamic ensemble selection problem can be considered as a meta-problem. This meta-problem uses different criteria regarding the behavior of a base classifier \(c_{i}\), in order to decide whether it is competent enough to classify a given test sample.

The framework performs a meta-training stage in which meta-features are extracted from each instance of the training set and of the dynamic selection dataset (DSEL). The extracted meta-features are then used to train the meta-classifier \(\lambda\), which learns to predict whether or not a base classifier \(c_{i}\) is competent enough to classify a given input sample.

When an unknown sample is presented to the system, the meta-features of each base classifier \(c_{i}\) in relation to that sample are calculated and presented to the meta-classifier. The meta-classifier then estimates the competence level of the base classifier \(c_{i}\) for the classification of the query sample. Base classifiers whose competence level is higher than a predefined threshold are selected. If no base classifier is selected, the whole pool is used for classification.
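The query-time behavior described above can be sketched for a single sample as follows. This is a minimal illustration and not DESlib's implementation: the function name `select_and_vote` and its inputs are hypothetical, the competence values are assumed to have already been produced by the meta-classifier, and the selected classifiers are assumed to be combined by plain majority voting.

```python
import numpy as np


def select_and_vote(competences, base_predictions, selection_threshold=0.5):
    """Majority vote over the base classifiers deemed competent.

    competences: shape [n_classifiers], competence estimated for one query.
    base_predictions: shape [n_classifiers], non-negative integer label
        predicted by each base classifier for the same query.
    """
    competences = np.asarray(competences, dtype=float)
    base_predictions = np.asarray(base_predictions)

    # Keep only the classifiers whose competence exceeds the threshold.
    selected = competences > selection_threshold

    # If nothing passes the threshold, fall back to the whole pool.
    if not selected.any():
        selected[:] = True

    # Majority vote among the selected classifiers
    # (argmax resolves ties in favor of the smallest label).
    votes = np.bincount(base_predictions[selected])
    return int(votes.argmax()), selected


label, mask = select_and_vote([0.9, 0.2, 0.7, 0.4], [1, 0, 1, 0])
print(label, mask)
```

Raising `selection_threshold` shrinks the ensemble used for each query, and the fallback guarantees that a prediction is always produced.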

Parameters:

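pool_classifiers : list of classifiers: The pool of classifiers trained for the corresponding classification problem. The classifiers should support the methods “predict” and “predict_proba”.
meta_classifier : classifier (Default = MultinomialNB()): Classifier trained as the meta-classifier \(\lambda\) to predict whether a base classifier is competent.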
k : int (Default = 7): Number of neighbors used to estimate the competence of the base classifiers.
kp : int (Default = 5): Number of output profiles used to estimate the competence of the base classifiers.
Hc : float (Default = 1.0): Sample selection threshold.
selection_threshold : float (Default = 0.5): Threshold used to select the base classifiers. Only base classifiers with a competence level higher than selection_threshold are selected to compose the ensemble.
mode : String (Default = “selection”): Determines the mode of META-DES that is used (selection, weighting or hybrid).
DFP : Boolean (Default = False): Determines whether dynamic frienemy pruning is applied.
with_IH : Boolean (Default = False): Whether the hardness level of the region of competence is used to decide between the DS algorithm and the KNN for the classification of a given query sample.
safe_k : int (Default = None): The size of the indecision region.
IH_rate : float (Default = 0.3): Hardness threshold. If the hardness level of the competence region is lower than IH_rate, the KNN classifier is used. Otherwise, the DS algorithm is used for classification.
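The decision rule controlled by with_IH and IH_rate can be illustrated with the sketch below. It rests on an assumption that is not stated in this page: the hardness of a region is taken to be the fraction of neighbors that do not belong to the region's majority class. The helper names `region_hardness` and `use_knn_mask` are hypothetical and are not part of the library.

```python
import numpy as np


def region_hardness(neighbor_labels):
    """Fraction of neighbors outside the majority class of each region.

    neighbor_labels: shape [n_samples, n_neighbors], non-negative integer
        class labels of the neighbors that form each region of competence.
    NOTE: this hardness definition is an assumption made for illustration.
    """
    neighbor_labels = np.asarray(neighbor_labels)
    n_neighbors = neighbor_labels.shape[1]
    # Size of the largest class among the neighbors of each sample.
    majority_counts = np.array(
        [np.bincount(row).max() for row in neighbor_labels]
    )
    return (n_neighbors - majority_counts) / n_neighbors


def use_knn_mask(neighbor_labels, ih_rate=0.3):
    """True where hardness is lower than ih_rate, i.e. KNN would be used."""
    return region_hardness(neighbor_labels) < ih_rate


labels = [[0, 0, 0, 0, 0], [0, 1, 0, 1, 1], [1, 1, 1, 1, 0]]
print(use_knn_mask(labels, ih_rate=0.3))
```

Samples for which the mask is False would be passed to the DS algorithm instead.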

References

Cruz, R.M., Sabourin, R., Cavalcanti, G.D. and Ren, T.I., 2015. META-DES: A dynamic ensemble selection framework using meta-learning. Pattern Recognition, 48(5), pp.1925-1935.

Cruz, R.M., Sabourin, R. and Cavalcanti, G.D., 2015, July. META-DES.H: a dynamic ensemble selection technique using meta-learning and a dynamic weighting approach. In Neural Networks (IJCNN), 2015 International Joint Conference on (pp. 1-8).

Cruz, R.M., Sabourin, R. and Cavalcanti, G.D., 2018. Dynamic classifier selection: Recent advances and perspectives. Information Fusion, 41, pp.195-216.

estimate_competence_from_proba(query, probabilities)

Estimate the competence of each base classifier \(c_i\) for the classification of the query sample. This method receives an array with the pre-calculated probability estimates for each query.

First, the meta-features of each base classifier \(c_i\) for the classification of the query sample are estimated. These meta-features are passed down to the meta-classifier \(\lambda\) for the competence level estimation.

Parameters:

query : array of shape = [n_samples, n_features]: The test examples.
probabilities : array of shape = [n_samples, n_classifiers, n_classes]: Probability estimates obtained by each base classifier for each query sample.

Returns:

competences : array of shape = [n_samples, n_classifiers]: The competence level estimated for each base classifier and test example.
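The final step, in which meta-features are passed to the meta-classifier and the result is arranged into the [n_samples, n_classifiers] competence matrix, might look like the sketch below. Everything in it is illustrative: `ToyMetaClassifier` is a hypothetical stand-in for a fitted meta-classifier, `competences_from_meta_features` is not a library function, and treating column 1 of `predict_proba` as the "competent" class is an assumption.

```python
import numpy as np


class ToyMetaClassifier:
    """Hypothetical stand-in for a fitted meta-classifier (lambda).

    It only mimics the ``predict_proba`` contract: one row per input
    vector, columns ordered as [P(incompetent), P(competent)].
    """

    def predict_proba(self, meta_features):
        # Toy rule: the mean of each meta-feature vector is treated as
        # the probability of being competent.
        p_competent = np.clip(meta_features.mean(axis=1), 0.0, 1.0)
        return np.column_stack([1.0 - p_competent, p_competent])


def competences_from_meta_features(meta_features, meta_classifier):
    """Turn per-(sample, classifier) meta-features into competences.

    meta_features: shape [n_samples, n_classifiers, n_meta_features].
    Returns an array of shape [n_samples, n_classifiers].
    """
    n_samples, n_classifiers, n_meta = meta_features.shape
    # One row per (sample, classifier) pair.
    flat = meta_features.reshape(n_samples * n_classifiers, n_meta)
    # Assumed convention: column 1 is the "competent" class.
    competent_proba = meta_classifier.predict_proba(flat)[:, 1]
    return competent_proba.reshape(n_samples, n_classifiers)


meta = np.array(
    [[[1.0, 1.0], [0.0, 0.0], [1.0, 0.0]],
     [[0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]]
)
print(competences_from_meta_features(meta, ToyMetaClassifier()))
```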

fit(X, y)

Prepare the DS model by setting the KNN algorithm and pre-processing the information required to apply the DS method.

This method also extracts the meta-features and trains the meta-classifier \(\lambda\) if the meta-classifier was not yet trained.

Parameters:

X : array of shape = [n_samples, n_features]: Data used to fit the model.
y : array of shape = [n_samples]: Class labels of each example in X.

Returns:

self

predict(X)

Predict the class label for each sample in X.

Parameters:

X : array of shape = [n_samples, n_features]: The input data.

Returns:

predicted_labels : array of shape = [n_samples]: Predicted class label for each sample in X.

predict_proba(X)

Estimates the posterior probabilities for each sample in X.

Parameters:

X : array of shape = [n_samples, n_features]: The input data.

Returns:

predicted_proba : array of shape = [n_samples, n_classes]: Probability estimates for each sample in X.

score(X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X : array-like, shape = (n_samples, n_features): Test samples.
y : array-like, shape = (n_samples) or (n_samples, n_outputs): True labels for X.
sample_weight : array-like, shape = [n_samples], optional: Sample weights.

Returns:

score : float: Mean accuracy of self.predict(X) with respect to y.
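For the single-label case, the quantity returned by score can be reproduced by hand as in the sketch below. The helper `mean_accuracy` is hypothetical and is shown only to make the role of sample_weight concrete. It does not cover the multi-label subset accuracy mentioned above.

```python
import numpy as np


def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of samples whose predicted label is correct."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    correct = (y_true == y_pred).astype(float)
    # With weights=None, np.average reduces to the plain mean.
    return float(np.average(correct, weights=sample_weight))


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
print(mean_accuracy(y_true, y_pred))
print(mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 0]))
```

In the weighted call, the misclassified third sample counts twice and the fourth sample is ignored, so the score drops relative to the unweighted one.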

select(competences)

Selects the base classifiers whose competence level is higher than the threshold defined in self.selection_threshold.

Parameters:

competences : array of shape = [n_samples, n_classifiers]: The competence level estimated for each base classifier and test example.

Returns:

selected_classifiers : array of shape = [n_samples, n_classifiers]: Boolean matrix containing True if the base classifier is selected, False otherwise.
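A vectorized version of this selection over a whole batch of queries could be written as below. It is a sketch under stated assumptions rather than the library's code: `select_mask` is a hypothetical free function, and applying the whole-pool fallback inside the selection step is an assumption based on the class description.

```python
import numpy as np


def select_mask(competences, selection_threshold=0.5):
    """Boolean matrix marking the base classifiers selected per sample.

    competences: shape [n_samples, n_classifiers].
    """
    competences = np.asarray(competences, dtype=float)
    # Strict comparison: a competence equal to the threshold is rejected.
    selected = competences > selection_threshold
    # Rows without any selected classifier fall back to the whole pool
    # (assumed to happen at this step).
    none_selected = ~selected.any(axis=1)
    selected[none_selected, :] = True
    return selected


competences = [[0.9, 0.2, 0.6],
               [0.1, 0.3, 0.5]]
print(select_mask(competences))
```

The resulting matrix has the same shape as competences, so it can be used directly to mask a [n_samples, n_classifiers] array of base classifier predictions.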