Probabilistic
-
class
deslib.des.probabilistic.
BaseProbabilistic
(pool_classifiers=None, k=None, DFP=False, with_IH=False, safe_k=None, IH_rate=0.3, mode='selection', voting='hard', selection_threshold=None, random_state=None, knn_classifier='knn', knn_metric='minkowski', DSEL_perc=0.5, n_jobs=-1) Base class for DS methods based on the potential function model. All DS methods based on the potential function should inherit from this class.
Warning: This class should not be used directly. Use derived classes instead.
-
estimate_competence
(competence_region, distances, predictions=None) Estimate the competence of each base classifier \(c_{i}\) using the source of competence \(C_{src}\) and the potential function model. The source of competence \(C_{src}\) for all data points in DSEL is pre-computed during the fit() step.
\[\delta_{i,j} = \frac{\sum_{k=1}^{N} C_{src} \: exp(-d (\mathbf{x}_{k}, \mathbf{x}_{q})^{2})} {\sum_{k=1}^{N} exp( -d (\mathbf{x}_{k}, \mathbf{x}_{q})^{2} )}\]
Parameters: - competence_region : array of shape (n_samples, n_neighbors)
Indices of the k nearest neighbors for each test sample.
- distances : array of shape (n_samples, n_neighbors)
Distances from the k nearest neighbors to the query.
- predictions : array of shape (n_samples, n_classifiers)
Predictions of the base classifiers for all test examples.
Returns: - competences : array of shape (n_samples, n_classifiers)
Competence level estimated for each base classifier and test example.
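The computation described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: it assumes the competence is the potential-weighted average of \(C_{src}\) over the region of competence, normalized by the sum of the potentials. The function name `estimate_competence_sketch`, the helper `potential`, and the use of nested lists in place of arrays are all hypothetical choices made for this example.

```python
import math


def potential(dist):
    """Gaussian potential exp(-dist^2) for a single distance."""
    return math.exp(-(dist ** 2))


def estimate_competence_sketch(c_src, competence_region, distances):
    """Potential-weighted average of the source of competence (illustrative).

    c_src             : n_dsel rows, each with n_classifiers values
    competence_region : per query, indices of its neighbors in DSEL
    distances         : per query, distances to those neighbors
    """
    n_classifiers = len(c_src[0])
    competences = []
    for neighbors, dists in zip(competence_region, distances):
        weights = [potential(d) for d in dists]
        total = sum(weights)  # normalization term (assumed)
        row = [
            sum(w * c_src[k][i] for k, w in zip(neighbors, weights)) / total
            for i in range(n_classifiers)
        ]
        competences.append(row)
    return competences


# Two classifiers, three DSEL points; the query has two equidistant neighbors.
c_src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(estimate_competence_sketch(c_src, [[0, 1]], [[0.0, 0.0]]))  # [[0.5, 0.5]]
```

With equal distances both neighbors carry the same weight, so each classifier ends up with the plain average of its \(C_{src}\) values; moving one neighbor farther away shifts the result toward the closer neighbor.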
-
fit
(X, y) Train the DS model by setting the KNN algorithm and pre-processing the information required to apply the DS methods. In the case of probabilistic techniques, the source of competence (C_src) is calculated for each data point in DSEL in order to speed up the testing phase.
C_src is estimated with the source_competence() function that is overridden by each DS method based on this paradigm.
Parameters: - X : array of shape (n_samples, n_features)
Data used to fit the model.
- y : array of shape (n_samples)
Class labels of each example in X.
Returns: - self : object
Returns self.
-
static
potential_func
(dist) Gaussian potential function to decrease the influence of the source of competence as the distance between \(\mathbf{x}_{k}\) and the query \(\mathbf{x}_{q}\) increases. The function is computed using the following equation:
\[potential = exp( -dist (\mathbf{x}_{k}, \mathbf{x}_{q})^{2} )\]
where dist represents the Euclidean distance between \(\mathbf{x}_{k}\) and \(\mathbf{x}_{q}\).
Parameters: - dist : array of shape = [self.n_samples]
Distance between the corresponding sample and the query.
Returns: - The result of the potential function for each value in dist.
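As a quick illustration of the equation above, the following sketch evaluates the Gaussian potential for a list of distances. The name `potential_func_sketch` is hypothetical and a plain list stands in for the array of distances; it only mirrors the formula and is not taken from the library source.

```python
import math


def potential_func_sketch(dist):
    """Return exp(-d^2) for every distance d in dist (illustrative)."""
    return [math.exp(-(d ** 2)) for d in dist]


# The potential is 1 at distance 0 and decays quickly as the distance grows.
for d, p in zip([0.0, 0.5, 1.0, 2.0], potential_func_sketch([0.0, 0.5, 1.0, 2.0])):
    print(f"dist={d:.1f} potential={p:.4f}")
```

Because the distance is squared inside the exponential, samples far from the query contribute almost nothing to the competence estimate.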
-
select
(competences) Selects the base classifiers that obtained a competence level higher than the predefined threshold. In this case, the threshold indicates the competence of a random classifier.
Parameters: - competences : array of shape (n_samples, n_classifiers)
Competence level estimated for each base classifier and test example.
Returns: - selected_classifiers : array of shape (n_samples, n_classifiers)
Boolean matrix containing True if the base classifier is selected, False otherwise.
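The selection rule can be sketched as a strict comparison against the threshold. This is an assumption-laden illustration: `select_sketch` is a hypothetical name, the threshold value 0.5 is chosen only for the example, and any additional handling the library may apply (for instance when no classifier passes the threshold) is not modeled here.

```python
def select_sketch(competences, selection_threshold):
    """Boolean mask: True where the competence exceeds the threshold."""
    return [[c > selection_threshold for c in row] for row in competences]


# Two test samples, two base classifiers, illustrative threshold of 0.5.
competences = [[0.7, 0.4], [0.5, 0.9]]
print(select_sketch(competences, 0.5))  # [[True, False], [False, True]]
```

Note that a competence exactly equal to the threshold is not selected in this sketch, since the description asks for a level higher than the threshold.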
-
source_competence
() Method used to estimate the source of competence at each data point.
Each DS technique based on this paradigm should define its own computation of C_src.
Returns: - C_src : array of shape (n_samples, n_classifiers)
The competence source for each base classifier at each data point.
-