Probabilistic
-
class deslib.des.probabilistic.Probabilistic(pool_classifiers, k=None, DFP=False, with_IH=False, safe_k=None, IH_rate=0.3, mode='selection', selection_threshold=None)
Base class for DS methods based on the potential function model. All DS methods based on the potential function should inherit from this class.
Warning: This class should not be used directly. Use derived classes instead.
Parameters: - pool_classifiers : list of classifiers
The generated pool of classifiers trained for the corresponding classification problem. The classifiers should support the methods “predict” and “predict_proba”.
- k : int (Default = None)
Number of neighbors used to estimate the competence of the base classifiers. If k = None, the whole dynamic selection dataset is used, and the influence of each sample is based on its distance to the query.
- DFP : Boolean (Default = False)
Determines if the dynamic frienemy pruning is applied.
- with_IH : Boolean (Default = False)
Whether the hardness level of the region of competence is used to decide between using the DS algorithm or the KNN for classification of a given query sample.
- safe_k : int (default = None)
The size of the indecision region.
- IH_rate : float (default = 0.3)
Hardness threshold. If the hardness level of the competence region is lower than the IH_rate the KNN classifier is used. Otherwise, the DS algorithm is used for classification.
- mode : String (Default = “selection”)
Whether the technique performs dynamic selection, dynamic weighting or a hybrid approach for classification.
References
T. Woloszynski, M. Kurzynski, A probabilistic model of classifier competence for dynamic ensemble selection, Pattern Recognition 44 (2011) 2656–2668.
L. Rastrigin, R. Erenstein, Method of collective recognition, Vol. 595, 1981, (in Russian).
Britto, Alceu S., Robert Sabourin, and Luiz ES Oliveira. “Dynamic selection of classifiers—a comprehensive review.” Pattern Recognition 47.11 (2014): 3665-3680.
R. M. O. Cruz, R. Sabourin, and G. D. Cavalcanti, “Dynamic classifier selection: Recent advances and perspectives,” Information Fusion, vol. 41, pp. 195–216, 2018.
-
estimate_competence(query, predictions=None)
Estimate the competence of each base classifier \(c_{i}\) using the source of competence \(C_{src}\) and the potential function model. The source of competence \(C_{src}\) for all data points in DSEL is pre-computed in the fit() step.
\[\delta_{i,j} = \frac{\sum_{k=1}^{N}C_{src} \: exp( -d (\mathbf{x}_{k}, \mathbf{x}_{q})^{2} )} {\sum_{k=1}^{N} exp( -d (\mathbf{x}_{k}, \mathbf{x}_{q})^{2} )}\]
Parameters: - query : array of shape = [n_samples, n_features]
The test examples.
- predictions : array of shape = [n_samples, n_classifiers]
Predictions of the base classifiers for all test examples.
Returns: - competences : array of shape = [n_samples, n_classifiers]
Competence level estimated for each base classifier and test example.
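The competence estimate above can be sketched in NumPy as a potential-weighted average of the pre-computed source of competence. This is a minimal illustration, not the library's implementation: the function name `estimate_competence_sketch` and its arguments are hypothetical, the distances are assumed to be already computed, and the weighted sum is assumed to be normalised by the sum of the potentials.

```python
import numpy as np


def estimate_competence_sketch(c_src, dists):
    """Potential-weighted average of the source of competence.

    c_src : array of shape [n_dsel, n_classifiers] (hypothetical input)
    dists : array of shape [n_queries, n_dsel], distance of each query
            to each DSEL sample (assumed to be pre-computed)
    """
    c_src = np.asarray(c_src, dtype=float)
    dists = np.asarray(dists, dtype=float)
    # Gaussian potential: samples far from the query get weights near 0.
    potentials = np.exp(-np.square(dists))      # [n_queries, n_dsel]
    # Numerator: sum over DSEL samples of C_src times the potential.
    weighted = potentials @ c_src               # [n_queries, n_classifiers]
    # Denominator (assumption): sum of the potentials for each query.
    return weighted / potentials.sum(axis=1, keepdims=True)


# Two DSEL samples, two classifiers, one query equidistant from both samples.
c_src_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
competences_demo = estimate_competence_sketch(c_src_demo, np.array([[0.0, 0.0]]))
```

If every distance is very large, all potentials can underflow to zero and the division becomes undefined, so a real implementation would need to guard against that case.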
-
fit(X, y)
Train the DS model by setting up the KNN algorithm and pre-processing the information required to apply the DS methods. For probabilistic techniques, the source of competence (C_src) is calculated for each data point in DSEL in order to speed up the testing phase.
C_src is estimated with the source_competence() function, which is overridden by each DS method based on this paradigm.
Parameters: - X : array of shape = [n_samples, n_features]
Data used to fit the model.
- y : array of shape = [n_samples]
Class labels of each example in X.
Returns: - self
-
static potential_func(dist)
Gaussian potential function that decreases the influence of the source of competence as the distance between \(\mathbf{x}_{k}\) and the query \(\mathbf{x}_{q}\) increases. The function is computed using the following equation:
\[potential = exp( -dist (\mathbf{x}_{k}, \mathbf{x}_{q})^{2} )\]
where dist represents the Euclidean distance between \(\mathbf{x}_{k}\) and \(\mathbf{x}_{q}\).
Parameters: - dist : array of shape = [self.n_samples]
Distance between each sample and the query.
Returns: - The result of the potential function for each value in (dist)
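As a rough illustration of the Gaussian potential, the following NumPy sketch applies exp(-dist^2) element-wise. The name `potential_func_sketch` is hypothetical and the snippet is not taken from the library source.

```python
import numpy as np


def potential_func_sketch(dist):
    """Return exp(-dist**2) for each distance value (element-wise)."""
    dist = np.asarray(dist, dtype=float)
    return np.exp(-np.square(dist))


# The potential is 1 at distance 0 and decays quickly as distance grows.
potentials_demo = potential_func_sketch([0.0, 0.5, 1.0, 2.0])
```

Because the exponent is the squared distance, the result depends on the scale of the features, so distances computed on unscaled data may yield potentials that are all close to zero.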
-
select(competences)
Selects the base classifiers whose competence level is higher than the predefined threshold. In this case, the threshold is the competence of the random classifier.
Parameters: - competences : array of shape = [n_samples, n_classifiers]
Competence level estimated for each base classifier and test example.
Returns: - selected_classifiers : array of shape = [n_samples, n_classifiers]
Boolean matrix containing True if the base classifier is selected, False otherwise.
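The thresholding step can be sketched as a simple element-wise comparison. This is an illustration under assumptions, not the library's code: `select_sketch` is a hypothetical name, the threshold is passed in explicitly, and the fallback that keeps the whole pool when no classifier exceeds the threshold is an assumption made so that every query has at least one classifier.

```python
import numpy as np


def select_sketch(competences, threshold):
    """Boolean mask of classifiers whose competence exceeds the threshold.

    competences : array of shape [n_samples, n_classifiers]
    threshold   : float, e.g. the competence of a random classifier
                  (0.5 is used below purely as an illustrative value)
    """
    competences = np.atleast_2d(np.asarray(competences, dtype=float))
    selected = competences > threshold
    # Assumption: if no classifier passes for a query, keep the whole pool.
    selected[~selected.any(axis=1), :] = True
    return selected


competences_demo = np.array([[0.7, 0.4, 0.6],
                             [0.1, 0.2, 0.3]])
mask_demo = select_sketch(competences_demo, threshold=0.5)
```

In the first row only the first and third classifiers exceed 0.5; in the second row none do, so the fallback marks all three as selected.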
-
source_competence()
Method used to estimate the source of competence at each data point.
Each DS technique based on this paradigm should define its own computation of C_src.
Returns: - C_src : array of shape = [n_samples, n_classifiers]
The competence source for each base classifier at each data point.
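The override pattern described here can be sketched with a small stand-in class hierarchy. Everything below is hypothetical and independent of the library: `PotentialSketchBase`, `CorrectnessSource`, and the attribute names are invented for illustration, and the source of competence used (1 where a classifier labels a DSEL sample correctly, 0 otherwise) is a deliberately simple choice, not one of the published methods.

```python
import numpy as np


class PotentialSketchBase:
    """Hypothetical stand-in for a base class: fit() caches C_src once."""

    def fit(self, dsel_predictions, dsel_labels):
        # dsel_predictions: [n_samples, n_classifiers] predicted labels
        # dsel_labels:      [n_samples] true labels
        self.dsel_predictions_ = np.asarray(dsel_predictions)
        self.dsel_labels_ = np.asarray(dsel_labels)
        # Pre-compute the source of competence so it is not recomputed
        # for every query at test time.
        self.C_src_ = self.source_competence()
        return self

    def source_competence(self):
        raise NotImplementedError("Derived classes define C_src.")


class CorrectnessSource(PotentialSketchBase):
    """Illustrative C_src: 1.0 if the classifier is correct, else 0.0."""

    def source_competence(self):
        correct = self.dsel_predictions_ == self.dsel_labels_[:, None]
        return correct.astype(float)


# Three DSEL samples, two classifiers.
model_demo = CorrectnessSource().fit([[0, 1], [1, 1], [0, 0]], [0, 1, 1])
```

The resulting `C_src_` has shape [n_samples, n_classifiers], matching the return value documented above.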