Probabilistic

class deslib.des.probabilistic.Probabilistic(pool_classifiers, k=None, DFP=False, with_IH=False, safe_k=None, IH_rate=0.3, mode='selection', selection_threshold=None)[source]

Base class for DS methods based on the potential function model. All DS methods based on the potential function should inherit from this class.

Warning: This class should not be used directly. Use derived classes instead.

Parameters:
pool_classifiers : list of classifiers

The generated pool of classifiers trained for the corresponding classification problem. The classifiers should support the methods “predict” and “predict_proba”.

k : int (Default = None)

Number of neighbors used to estimate the competence of the base classifiers. If k = None, the whole dynamic selection dataset is used, and the influence of each sample is based on its distance to the query.

DFP : Boolean (Default = False)

Determines whether dynamic frienemy pruning (DFP) is applied.

with_IH : Boolean (Default = False)

Whether the hardness level of the region of competence is used to decide between using the DS algorithm or the KNN for classification of a given query sample.

safe_k : int (default = None)

The size of the indecision region.

IH_rate : float (default = 0.3)

Hardness threshold. If the hardness level of the competence region is lower than IH_rate, the KNN classifier is used. Otherwise, the DS algorithm is used for classification.

mode : String (Default = “selection”)

Whether the technique will perform dynamic selection, dynamic weighting or a hybrid approach for classification.

References

T. Woloszynski, M. Kurzynski, A probabilistic model of classifier competence for dynamic ensemble selection, Pattern Recognition 44 (2011) 2656–2668.

L. Rastrigin, R. Erenstein, Method of collective recognition, Vol. 595, 1981, (in Russian).

Britto, Alceu S., Robert Sabourin, and Luiz ES Oliveira. “Dynamic selection of classifiers—a comprehensive review.” Pattern Recognition 47.11 (2014): 3665-3680.

R. M. O. Cruz, R. Sabourin, and G. D. Cavalcanti, “Dynamic classifier selection: Recent advances and perspectives,” Information Fusion, vol. 41, pp. 195 – 216, 2018.

estimate_competence(query)[source]

Estimates the competence of each base classifier c_i using the source of competence C_src and the potential function model. The source of competence C_src for all data points in DSEL is pre-computed in the fit() step.

Parameters:
query : array of shape = [n_features], containing the test sample
Returns:
competences : array of shape = [n_classifiers]

The competence level estimated for each base classifier
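
The computation described above can be sketched as a potential-weighted average of C_src over the DSEL samples. This is a minimal illustration, not the library's implementation: the function name `estimate_competence_sketch`, the normalization by the sum of potentials, and the use of plain Python lists instead of arrays are all assumptions made to keep the example self-contained.

```python
import math


def estimate_competence_sketch(c_src, dists):
    """Potential-weighted average of the source of competence.

    c_src: one row per DSEL sample, one column per base classifier.
    dists: distance from each DSEL sample to the query.
    Both names are hypothetical and only used for this sketch.
    """
    # Assumed Gaussian potential: nearby samples get weights close to 1.
    potentials = [math.exp(-(d ** 2)) for d in dists]
    total = sum(potentials)
    n_classifiers = len(c_src[0])
    return [
        sum(row[j] * p for row, p in zip(c_src, potentials)) / total
        for j in range(n_classifiers)
    ]


# Two DSEL samples, two classifiers; the first sample coincides with the query.
c_src = [[1.0, 0.0], [0.0, 1.0]]
competences = estimate_competence_sketch(c_src, dists=[0.0, 3.0])
```

Because the first sample sits on the query and the second is far away, the first classifier ends up with almost all of the competence.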

fit(X, y)[source]

Trains the DS model by setting up the KNN algorithm and pre-processing the information required to apply the DS methods. In the case of probabilistic techniques, the source of competence (C_src) is calculated for each data point in DSEL in order to speed up the testing phase.

C_src is estimated with the source_competence() function, which is overridden by each DS method based on this paradigm.

Parameters:
X : matrix of shape = [n_samples, n_features] with the data.
y : class labels of each sample in X.
Returns:
self
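
The relationship between fit() and source_competence() follows a template-method pattern, which can be sketched as below. The classes `ProbabilisticSketch` and `ConstantSource`, the attribute names, and the constant value 0.5 are hypothetical and exist only for illustration; the real class also configures the KNN step, which is omitted here.

```python
from abc import ABC, abstractmethod


class ProbabilisticSketch(ABC):
    """Hypothetical stand-in for the base class (KNN setup omitted)."""

    def fit(self, X, y):
        # Store DSEL, then pre-compute C_src once so that prediction
        # does not have to recompute it for every query.
        self.X_, self.y_ = X, y
        self.C_src_ = self.source_competence()
        return self

    @abstractmethod
    def source_competence(self):
        """Return C_src with one row per sample and one column per classifier."""


class ConstantSource(ProbabilisticSketch):
    """Toy subclass: every classifier gets the same source value everywhere."""

    def __init__(self, n_classifiers):
        self.n_classifiers = n_classifiers

    def source_competence(self):
        return [[0.5] * self.n_classifiers for _ in self.X_]


model = ConstantSource(n_classifiers=3).fit([[0.0], [1.0]], [0, 1])
```

Returning self from fit() allows the construction and fitting calls to be chained, as in the last line.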
static potential_func(dist)[source]

Gaussian potential function that decreases the influence of the source of competence as the distance between a DSEL sample x_k and the query increases.

Parameters:
dist : array of shape = [self.n_samples]

Distance between each sample and the query.

Returns:
The result of the potential function for each value in dist.
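
A minimal sketch of such a potential function follows, assuming the unnormalized Gaussian form exp(-d^2). The exact form used by the library is not stated on this page, and the name `gaussian_potential` is hypothetical.

```python
import math


def gaussian_potential(dists):
    """Return exp(-d**2) for each distance d (assumed form)."""
    return [math.exp(-(d ** 2)) for d in dists]


# The weight is 1 at distance 0 and decays quickly as the distance grows.
weights = gaussian_potential([0.0, 1.0, 2.0])
```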
select(competences)[source]

Selects the base classifiers that obtained a competence level higher than the predefined threshold. In this case, the threshold indicates the competence of the random classifier.

Parameters:
competences : array of shape = [n_classifiers]

The estimated competence level for the base classifiers

Returns:
indices : the indices of the selected base classifiers
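
The selection rule can be sketched as a simple threshold filter. In this illustration the threshold is taken to be 1 / n_classes as a stand-in for the competence of a random classifier, and all classifiers are kept when none exceeds the threshold. Both choices, as well as the name `select_sketch`, are assumptions rather than documented behavior.

```python
def select_sketch(competences, threshold):
    """Return indices of classifiers whose competence exceeds the threshold."""
    indices = [i for i, c in enumerate(competences) if c > threshold]
    if not indices:
        # Assumed fallback: keep the whole pool rather than return nothing.
        indices = list(range(len(competences)))
    return indices


n_classes = 2
selected = select_sketch([0.7, 0.4, 0.55], threshold=1.0 / n_classes)
```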
source_competence()[source]

Method used to estimate the source of competence at each data point.

Each DS technique based on this paradigm should define its own computation of C_src.

Returns:
C_src : array of shape = [n_samples, n_classifiers]

The competence source for each base classifier at each data point.
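
To illustrate the expected shape of C_src, the sketch below builds a toy source of competence that is 1.0 where a classifier labels a DSEL sample correctly and 0.0 otherwise. This indicator is purely hypothetical and is not the definition used by any particular method derived from this class; `correctness_source` and its arguments are names invented for the example.

```python
def correctness_source(predictions, y):
    """Toy C_src: 1.0 where a classifier is correct, 0.0 otherwise.

    predictions: one row per sample, one predicted label per classifier.
    y: true label of each sample.
    """
    return [
        [1.0 if pred == label else 0.0 for pred in row]
        for row, label in zip(predictions, y)
    ]


# Three DSEL samples, two classifiers.
predictions = [[0, 1], [1, 1], [0, 0]]
C_src = correctness_source(predictions, y=[0, 1, 1])
```

The result has one row per sample and one column per classifier, matching the [n_samples, n_classifiers] shape documented above.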