META-DES

class deslib.des.meta_des.METADES(pool_classifiers, meta_classifier=MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True), k=7, kp=5, Hc=1.0, selection_threshold=0.5, mode='selection', DFP=False, with_IH=False, safe_k=None, IH_rate=0.3)[source]

Meta learning for dynamic ensemble selection (META-DES).

The META-DES framework is based on the assumption that the dynamic ensemble selection problem can be considered as a meta-problem. This meta-problem uses different criteria regarding the behavior of a base classifier \(c_{i}\), in order to decide whether it is competent enough to classify a given test sample.

The framework performs a meta-training stage, in which, the meta-features are extracted from each instance belonging to the training and the dynamic selection dataset (DSEL). Then, the extracted meta-features are used to train the meta-classifier \(\lambda\). The meta-classifier is trained to predict whether or not a base classifier \(c_{i}\) is competent enough to classify a given input sample.

When an unknown sample is presented to the system, the meta-features for each base classifier \(c_{i}\) in relation to the input sample are calculated and presented to the meta-classifier. The meta-classifier estimates the competence level of the base classifier \(c_{i}\) for the classification of the query sample. Base classifiers with competence level higher than a pre-defined threshold are selected. If no base classifier is selected, the whole pool is used for classification.

Parameters:
pool_classifiers : list of classifiers

The generated_pool of classifiers trained for the corresponding classification problem. The classifiers should support methods “predict” and “predict_proba”.

k : int (Default = 7)

Number of neighbors used to estimate the competence of the base classifiers.

kp : int (Default = 5)

Number of output profiles used to estimate the competence of the base classifiers.

Hc : float (Default = 1.0)

Sample selection threshold.

selection_threshold : float(Default = 0.5)

Threshold used to select the base classifier. Only the base classifiers with competence level higher than the selection_threshold are selected to compose the ensemble.

mode : String (Default = “selection”)

Determines the mode of META-des that is used (selection, weighting or hybrid).

DFP : Boolean (Default = False)

Determines if the dynamic frienemy pruning is applied.

with_IH : Boolean (Default = False)

Whether the hardness level of the region of competence is used to decide between using the DS algorithm or the KNN for classification of a given query sample.

safe_k : int (default = None)

The size of the indecision region.

IH_rate : float (default = 0.3)

Hardness threshold. If the hardness level of the competence region is lower than the IH_rate the KNN classifier is used. Otherwise, the DS algorithm is used for classification.

References

Cruz, R.M., Sabourin, R., Cavalcanti, G.D. and Ren, T.I., 2015. META-DES: A dynamic ensemble selection framework using meta-learning. Pattern Recognition, 48(5), pp.1925-1935.

Cruz, R.M., Sabourin, R. and Cavalcanti, G.D., 2015, July. META-des. H: a dynamic ensemble selection technique using meta-learning and a dynamic weighting approach. In Neural Networks (IJCNN), 2015 International Joint Conference on (pp. 1-8).

R. M. O. Cruz, R. Sabourin, and G. D. Cavalcanti, “Dynamic classifier selection: Recent advances and perspectives,” Information Fusion, vol. 41, pp. 195 – 216, 2018.

estimate_competence_from_proba(query, probabilities)[source]

Estimate the competence of each base classifier \(c_i\) the classification of the query sample. This method received an array with the pre-calculated probability estimates for each query.

First, the meta-features of each base classifier \(c_i\) for the classification of the query sample are estimated. These meta-features are passed down to the meta-classifier \(\lambda\) for the competence level estimation.

Parameters:
query : array of shape = [n_samples, n_features]

The test examples.

probabilities : array of shape = [n_samples, n_classifiers, n_classes]

Probabilities estimates obtained by each each base classifier for each query sample.

Returns:
competences : array of shape = [n_samples, n_classifiers]

The competence level estimated for each base classifier and test example.

fit(X, y)[source]

Prepare the DS model by setting the KNN algorithm and pre-processing the information required to apply the DS method.

This method also extracts the meta-features and trains the meta-classifier \(\lambda\) if the meta-classifier was not yet trained.

Parameters:
X : array of shape = [n_samples, n_features]

Data used to fit the model.

y : array of shape = [n_samples]

class labels of each example in X.

Returns:
self
predict(X)[source]

Predict the class label for each sample in X.

Parameters:
X : array of shape = [n_samples, n_features]

The input data.

Returns:
predicted_labels : array of shape = [n_samples]

Predicted class label for each sample in X.

predict_proba(X)[source]

Estimates the posterior probabilities for sample in X.

Parameters:
X : array of shape = [n_samples, n_features]

The input data.

Returns:
predicted_proba : array of shape = [n_samples, n_classes]

Probabilities estimates for each sample in X.

score(X, y, sample_weight=None)[source]

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:
X : array-like, shape = (n_samples, n_features)

Test samples.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns:
score : float

Mean accuracy of self.predict(X) wrt. y.

select(competences)[source]

Selects the base classifiers that obtained a competence level higher than the predefined threshold defined in self.selection_threshold.

Parameters:
competences : array of shape = [n_samples, n_classifiers]

The competence level estimated for each base classifier and test example.

Returns:
selected_classifiers : array of shape = [n_samples, n_classifiers]

Boolean matrix containing True if the base classifier is select, False otherwise.