Probabilistic

class deslib.des.probabilistic.BaseProbabilistic(pool_classifiers=None, k=None, DFP=False, with_IH=False, safe_k=None, IH_rate=0.3, mode='selection', voting='hard', selection_threshold=None, random_state=None, knn_classifier='knn', knn_metric='minkowski', DSEL_perc=0.5, n_jobs=-1)

Base class for dynamic selection (DS) methods based on the potential function model. All DS methods based on the potential function model should inherit from this class.

Warning: This class should not be used directly. Use derived classes instead.

estimate_competence(competence_region, distances, predictions=None)

Estimate the competence of each base classifier \(c_{i}\) using the source of competence \(C_{src}\) and the potential function model. The source of competence \(C_{src}\) for all data points in DSEL is pre-computed in the fit() step.

\[\delta_{i,j} = \frac{\sum_{k=1}^{N}C_{src} \: \exp(-d (\mathbf{x}_{k}, \mathbf{x}_{q})^{2})} {\sum_{k=1}^{N}\exp( -d (\mathbf{x}_{k}, \mathbf{x}_{q})^{2} )}\]

Parameters:	competence_region : array of shape (n_samples, n_neighbors) Indices of the k nearest neighbors for each test sample.
	distances : array of shape (n_samples, n_neighbors) Distances from the k nearest neighbors to the query.
	predictions : array of shape (n_samples, n_classifiers) Predictions of the base classifiers for all test examples.
Returns:	competences : array of shape (n_samples, n_classifiers) Competence level estimated for each base classifier and test example.
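The computation described above can be sketched in NumPy as a potential-weighted average of \(C_{src}\) over each query's neighbors. This is a minimal illustration, not the library's implementation: the function name estimate_competence_sketch and the toy arrays are hypothetical, and the sketch assumes the weighted sum is normalized by the sum of the potentials.

```python
import numpy as np


def estimate_competence_sketch(c_src, competence_region, distances):
    """Potential-weighted average of C_src over each query's neighbors.

    c_src             : (n_dsel_samples, n_classifiers), assumed pre-computed.
    competence_region : (n_samples, n_neighbors), integer indices into c_src.
    distances         : (n_samples, n_neighbors), distances to those neighbors.
    """
    # Gaussian potential of every neighbor distance.
    potentials = np.exp(-(distances ** 2))
    # Gather C_src for each neighbor: (n_samples, n_neighbors, n_classifiers).
    neighbor_src = c_src[competence_region]
    # Weight each neighbor's C_src by its potential and sum over neighbors.
    weighted = np.einsum("ijk,ij->ik", neighbor_src, potentials)
    # Normalize by the total potential of each query (assumption, see lead-in).
    return weighted / potentials.sum(axis=1, keepdims=True)


# Toy data: 3 DSEL samples, 2 classifiers, 2 queries with 2 neighbors each.
c_src = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
competence_region = np.array([[0, 1], [1, 2]])
distances = np.array([[0.0, 0.0], [0.0, 1.0]])
competences = estimate_competence_sketch(c_src, competence_region, distances)
```

For the first query both neighbors are at distance zero, so each classifier receives the plain average of its two \(C_{src}\) values (0.5 each). For the second query the farther neighbor contributes with weight \(\exp(-1)\), so it influences the result less than the neighbor at distance zero.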

fit(X, y)

Train the DS model by setting up the KNN algorithm and pre-processing the information required to apply the DS methods. For probabilistic techniques, the source of competence (C_src) is calculated for each data point in DSEL during fitting in order to speed up the testing phase.

C_src is estimated with the source_competence() function that is overridden by each DS method based on this paradigm.

Parameters:	X : array of shape (n_samples, n_features) Data used to fit the model.
	y : array of shape (n_samples) Class labels of each example in X.
Returns:	self : object Returns self.

static potential_func(dist)

Gaussian potential function to decrease the influence of the source of competence as the distance between \(\mathbf{x}_{k}\) and the query \(\mathbf{x}_{q}\) increases. The function is computed using the following equation:

\[potential = \exp( -dist (\mathbf{x}_{k}, \mathbf{x}_{q})^{2} )\]

where dist represents the Euclidean distance between \(\mathbf{x}_{k}\) and \(\mathbf{x}_{q}\).

Parameters:	dist : array of shape (n_samples) Distance between each sample and the query.
Returns:	The value of the potential function for each element of dist.
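As a quick illustration of how fast the Gaussian potential decays, the formula above can be written directly in NumPy. The function name potential_func_sketch and the sample distances are hypothetical; this is a sketch of the equation, not the library's code.

```python
import numpy as np


def potential_func_sketch(dist):
    """Gaussian potential exp(-dist**2), applied elementwise."""
    dist = np.asarray(dist, dtype=float)
    return np.exp(-(dist ** 2))


# Potentials for distances 0, 1 and 2: the weight shrinks as distance grows.
potentials = potential_func_sketch([0.0, 1.0, 2.0])
```

A sample at distance 0 gets weight 1, while a sample at distance 2 gets weight \(\exp(-4)\), under 2% of that, so distant DSEL samples contribute very little to the competence estimate.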

select(competences)

Selects the base classifiers that obtained a competence level higher than the predefined threshold. In this case, the threshold indicates the competence of the random classifier.

Parameters:	competences : array of shape (n_samples, n_classifiers) Competence level estimated for each base classifier and test example.
Returns:	selected_classifiers : array of shape (n_samples, n_classifiers) Boolean matrix containing True if the base classifier is selected, False otherwise.
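The thresholding step can be sketched as an elementwise comparison. The function name select_sketch, the threshold value of 0.5, and the competence values are all assumptions made for illustration. The sketch also does not cover what happens when no classifier exceeds the threshold for a sample, since the text above does not specify it.

```python
import numpy as np


def select_sketch(competences, threshold):
    """Boolean mask: True where the competence exceeds the threshold."""
    competences = np.atleast_2d(competences)
    return competences > threshold


# Toy competences for 2 test samples and 3 base classifiers.
competences = np.array([[0.7, 0.4, 0.55],
                        [0.2, 0.9, 0.5]])
# Hypothetical threshold standing in for the random classifier's competence.
selected = select_sketch(competences, threshold=0.5)
```

Because the comparison is strict, a classifier whose competence equals the threshold (the last entry of the second row) is not selected.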

source_competence()

Method used to estimate the source of competence at each data point.

Each DS technique based on this paradigm should define its own computation of C_src by overriding this method.

Returns:	C_src : array of shape (n_samples, n_classifiers) The competence source for each base classifier at each data point.
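To make the expected output shape concrete, here is a toy source of competence written as a standalone function. It is purely illustrative and does not correspond to any particular DESlib method: the name source_competence_sketch, its arguments, and the +1/-1 scoring rule are all hypothetical.

```python
import numpy as np


def source_competence_sketch(dsel_predictions, dsel_labels):
    """Toy C_src: +1 where a classifier labels a DSEL sample correctly, -1 otherwise.

    dsel_predictions : (n_samples, n_classifiers) predicted labels on DSEL.
    dsel_labels      : (n_samples,) true labels of the DSEL samples.
    """
    # Compare every classifier's prediction with the true label of each sample.
    correct = dsel_predictions == dsel_labels[:, np.newaxis]
    return np.where(correct, 1.0, -1.0)


# 3 DSEL samples, 2 base classifiers.
dsel_predictions = np.array([[0, 1], [1, 1], [0, 0]])
dsel_labels = np.array([0, 1, 1])
c_src = source_competence_sketch(dsel_predictions, dsel_labels)
```

Whatever rule a concrete method uses, the result has one row per DSEL sample and one column per base classifier, which is the layout estimate_competence() reads from when it weights the neighbors of a query.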