Multiple Classifier Behaviour (MCB)

class deslib.dcs.mcb.MCB(pool_classifiers=None, k=7, DFP=False, with_IH=False, safe_k=None, IH_rate=0.3, similarity_threshold=0.7, selection_method='diff', diff_thresh=0.1, random_state=None, knn_classifier='knn', knn_metric='minkowski', knne=False, DSEL_perc=0.5, n_jobs=-1)

Multiple Classifier Behaviour (MCB).

The MCB method evaluates the competence level of each individual classifier by taking into account the local accuracy of the base classifier in the region of competence. The region of competence is defined using the k-NN and behaviour knowledge space (BKS) methods. First, the k-nearest neighbors of the test sample are computed. Then, the set containing the k-nearest neighbors is filtered based on the similarity between the query sample and its neighbors in the decision space (BKS representation).

A single classifier \(c_{i}\) is selected only if its competence level is significantly higher than that of the other base classifiers in the pool (higher than a pre-defined threshold). Otherwise, all classifiers in the pool are combined using the majority voting rule. The selection methodology can be modified by changing the hyper-parameter selection_method.
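The decision rule described above can be sketched for a single query sample as follows. This is a minimal illustration in plain Python, not DESlib's implementation: the function name `mcb_decide` and its arguments are hypothetical, and using `diff_thresh` as the pre-defined margin is an assumption.

```python
from collections import Counter


def mcb_decide(competences, predictions, diff_thresh=0.1):
    """Return the label chosen for one query sample (hypothetical helper).

    competences: competence level of each base classifier
    predictions: label predicted by each base classifier for the query
    """
    # Index of the most competent classifier (lowest index wins ties).
    best = max(range(len(competences)), key=competences.__getitem__)
    others = [c for i, c in enumerate(competences) if i != best]

    # Select a single classifier only if it beats every other classifier
    # by more than the threshold (assumed here to be diff_thresh).
    if all(competences[best] - c > diff_thresh for c in others):
        return predictions[best]

    # Otherwise combine the whole pool with majority voting.
    return Counter(predictions).most_common(1)[0][0]
```

With competences `[0.9, 0.5, 0.6]` the first classifier is clearly ahead, so its prediction is returned; with `[0.7, 0.65, 0.6]` no classifier stands out and the pool votes.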

Parameters:

pool_classifiers : list of classifiers (Default = None)

The pool of classifiers trained for the corresponding classification problem. Each base classifier should support the method “predict”. If None, then the pool of classifiers is a bagging classifier.

k : int (Default = 7)

Number of neighbors used to estimate the competence of the base classifiers.

DFP : Boolean (Default = False)

Determines if the dynamic frienemy pruning is applied.

with_IH : Boolean (Default = False)

Whether the hardness level of the region of competence is used to decide between using the DS algorithm or the KNN for classification of a given query sample.

safe_k : int (default = None)

The size of the indecision region.

IH_rate : float (default = 0.3)

Hardness threshold. If the hardness level of the competence region is lower than the IH_rate the KNN classifier is used. Otherwise, the DS algorithm is used for classification.

selection_method : String (Default = “diff”)

Determines which method is used to select the base classifier after the competences are estimated.

diff_thresh : float (Default = 0.1)

Threshold to measure the difference between the competence levels of the base classifiers for the random and diff selection schemes. If the difference is lower than the threshold, their performances are considered equivalent.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

knn_classifier : {‘knn’, ‘faiss’, None} (Default = ‘knn’)

The algorithm used to estimate the region of competence:

‘knn’ will use KNeighborsClassifier from sklearn. KNNE is available in deslib.utils.knne.
‘faiss’ will use Facebook’s Faiss similarity search through the class FaissKNNClassifier.
None will use sklearn KNeighborsClassifier.

knn_metric : {‘minkowski’, ‘cosine’, ‘mahalanobis’} (Default = ‘minkowski’)

The metric used by the k-NN classifier to estimate distances.

‘minkowski’ will use the Minkowski distance.
‘cosine’ will use the cosine distance.
‘mahalanobis’ will use the Mahalanobis distance.

knne : bool (Default=False)

Whether to use K-Nearest Neighbor Equality (KNNE) for the region of competence estimation.

DSEL_perc : float (Default = 0.5)

Percentage of the input data used to fit DSEL. Note: This parameter is only used if the pool of classifiers is None or unfitted.

n_jobs : int, default=-1

The number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. Does not affect the fit method.

References

[1] Giacinto, Giorgio, and Fabio Roli. “Dynamic classifier selection based on multiple classifier behaviour.” Pattern Recognition 34.9 (2001): 1879-1881.

[2] Britto, Alceu S., Robert Sabourin, and Luiz ES Oliveira. “Dynamic selection of classifiers—a comprehensive review.” Pattern Recognition 47.11 (2014): 3665-3680.

[3] Huang, Yea S., and Ching Y. Suen. “A method of combining multiple experts for the recognition of unconstrained handwritten numerals.” IEEE Transactions on Pattern Analysis and Machine Intelligence 17.1 (1995): 90-94.

[4] Huang, Yea S., and Ching Y. Suen. “The behavior-knowledge space method for combination of multiple classifiers.” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1993.

[5] Cruz, R. M. O., R. Sabourin, and G. D. Cavalcanti. “Dynamic classifier selection: Recent advances and perspectives.” Information Fusion 41 (2018): 195-216.

estimate_competence(competence_region, distances=None, predictions=None)

Estimate the competence of each base classifier \(c_{i}\) for the classification of the query sample using the Multiple Classifier Behaviour criterion.

The region of competence in this method is estimated taking into account the feature space and the decision space (using the behaviour knowledge space method [4]). First, the k-nearest neighbors of the query sample are found in the feature space to compose the region of competence. Then, the similarity in the BKS space between the query and each instance in the region of competence is estimated using the following equations:

\[S(\tilde{\mathbf{x}}_{j},\tilde{\mathbf{x}}_{k}) = \frac{1}{M} \sum\limits_{i = 1}^{M}T_{i}(\mathbf{x}_{j},\mathbf{x}_{k})\]

\[\begin{split}T_{i}(\mathbf{x}_{j},\mathbf{x}_{k}) = \left\{\begin{matrix} 1 & \text{if} & c_{i}(\mathbf{x}_{j}) = c_{i}(\mathbf{x}_{k}),\\ 0 & \text{if} & c_{i}(\mathbf{x}_{j}) \neq c_{i}(\mathbf{x}_{k}). \end{matrix}\right.\end{split}\]

Here \(S(\tilde{\mathbf{x}}_{j},\tilde{\mathbf{x}}_{k})\) denotes the similarity between two samples based on the behaviour knowledge space (BKS) method, and \(M\) is the number of base classifiers in the pool. Instances with similarity lower than a predefined threshold are removed from the region of competence. The competence level of each base classifier is estimated as its classification accuracy in the final region of competence.

Parameters:

competence_region : array of shape (n_samples, n_neighbors)

Indices of the k nearest neighbors.

distances : array of shape (n_samples, n_neighbors)

Distances from the k nearest neighbors to the query.

predictions : array of shape (n_samples, n_classifiers)

Predictions of the base classifiers for the test examples.

Returns:

competences : array of shape (n_samples, n_classifiers)

Competence level estimated for each base classifier and test example.
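The filtering and competence estimate described for estimate_competence can be sketched for one query sample as follows. This is a plain-Python illustration under stated assumptions, not DESlib's code: the helper names `bks_similarity` and `mcb_competences` are hypothetical, each "profile" is the list of labels the M base classifiers output for a sample, and keeping the full neighborhood when every neighbor is filtered out is an assumed fallback.

```python
def bks_similarity(profile_a, profile_b):
    """S: fraction of base classifiers giving the same output on both samples."""
    agree = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return agree / len(profile_a)


def mcb_competences(query_profile, neighbor_profiles, neighbor_labels,
                    similarity_threshold=0.7):
    """Accuracy of each base classifier on the filtered region of competence."""
    # Keep only neighbors whose BKS similarity to the query reaches the threshold.
    kept = [j for j, profile in enumerate(neighbor_profiles)
            if bks_similarity(query_profile, profile) >= similarity_threshold]

    # Assumption: if no neighbor survives, fall back to the whole neighborhood.
    if not kept:
        kept = list(range(len(neighbor_profiles)))

    n_classifiers = len(query_profile)
    return [
        sum(neighbor_profiles[j][i] == neighbor_labels[j] for j in kept) / len(kept)
        for i in range(n_classifiers)
    ]


# Three base classifiers, three neighbors.
query = [0, 1, 1]
neighbors = [[0, 1, 1], [0, 1, 0], [1, 0, 0]]
labels = [0, 1, 1]
competences = mcb_competences(query, neighbors, labels)
```

In this toy case only the first neighbor has similarity of at least 0.7 with the query, so the competences are computed on that single instance.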

fit(X, y)

Prepare the DS model by setting the KNN algorithm and pre-processing the information required to apply the DS methods.

Parameters:

X : array of shape (n_samples, n_features)

The input data.

y : array of shape (n_samples)

Class labels of each example in X.

Returns:

self

predict(X)

Predict the class label for each sample in X.

Parameters:

X : array of shape (n_samples, n_features)

The input data.

Returns:

predicted_labels : array of shape (n_samples)

Predicted class label for each sample in X.

predict_proba(X)

Estimate the posterior probabilities for each sample in X.

Parameters:

X : array of shape (n_samples, n_features)

The input data.

Returns:

predicted_proba : array of shape (n_samples, n_classes)

Probability estimates for each sample in X.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

Mean accuracy of `self.predict(X)` with respect to y.

select(competences)

Select the most competent classifier for the classification of the query sample given the competence level estimates. Four selection schemes are available.

Best : The base classifier with the highest competence level is selected. In cases where more than one base classifier achieves the same competence level, the one with the lowest index is selected. This method is the standard for the LCA, OLA, MLA techniques.

Diff : Select the base classifier that is significantly better than the others in the pool (when the difference between its competence level and the competence level of the other base classifiers is higher than a predefined threshold). If no base classifier is significantly better, the base classifier is selected randomly among the members with equivalent competence levels.

Random : Selects a random base classifier among all base classifiers that achieved the same competence level.

All : All base classifiers with the maximum competence level estimate are selected (note that in this case the DCS technique becomes a DES technique).

Parameters:

competences : array of shape (n_samples, n_classifiers)

Competence level estimated for each base classifier and test example.

Returns:

selected_classifiers : array of shape (n_samples,)

Indices of the selected base classifier for each sample. If the selection_method is set to ‘all’, a boolean matrix is returned, containing True for the selected base classifiers and False otherwise.
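The four selection schemes described for select can be sketched for a single query sample as follows. This is a minimal plain-Python reading of the text above, not DESlib's implementation: the function name `select_one` and the `rng` argument are hypothetical, and treating classifiers as equivalent when they fall within `diff_thresh` of the best one (for the diff scheme only) is an assumption.

```python
import random


def select_one(competences, selection_method="diff", diff_thresh=0.1, rng=None):
    """Apply one selection scheme to the competences of a single query sample."""
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    best_level = max(competences)

    if selection_method == "best":
        # list.index returns the lowest index among tied classifiers.
        return competences.index(best_level)

    if selection_method == "all":
        # Boolean mask over the pool: True for every classifier at the maximum.
        return [c == best_level for c in competences]

    if selection_method == "random":
        tied = [i for i, c in enumerate(competences) if c == best_level]
        return rng.choice(tied)

    if selection_method == "diff":
        # Classifiers closer than diff_thresh to the best count as equivalent.
        equivalent = [i for i, c in enumerate(competences)
                      if best_level - c < diff_thresh]
        if len(equivalent) == 1:
            return equivalent[0]  # one classifier is significantly better
        return rng.choice(equivalent)

    raise ValueError(f"unknown selection_method: {selection_method!r}")
```

Only the ‘all’ scheme returns a mask rather than a single index, which mirrors the boolean matrix mentioned in the return value description.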