Static Selection
class deslib.static.static_selection.StaticSelection(pool_classifiers=None, pct_classifiers=0.5, scoring=None, random_state=None, n_jobs=-1)
Ensemble model that selects the N classifiers with the best performance on a dataset.
Parameters: - pool_classifiers : list of classifiers (Default = None)
The pool of classifiers trained for the corresponding classification problem. Each base classifier should support the method “predict”. If None, then the pool of classifiers is a bagging classifier.
- scoring : string, callable (default = None)
A single string or a callable to evaluate the predictions on the validation set.
- random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
- pct_classifiers : float (Default = 0.5)
Fraction of the base classifiers that should be selected by the selection scheme; the default of 0.5 selects half of the pool.
- n_jobs : int, default=-1
The number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. Does not affect the fit method.
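The three accepted forms of random_state described above can be illustrated with a small NumPy-only sketch. The helper name resolve_random_state is hypothetical and is not part of DESlib; it only mirrors the documented behaviour (int seed, RandomState instance, or None for the generator used by np.random).

```python
import numpy as np


def resolve_random_state(random_state=None):
    """Hypothetical helper: turn the random_state argument into a RandomState."""
    if random_state is None:
        # The global RandomState instance behind np.random's module-level functions.
        return np.random.mtrand._rand
    if isinstance(random_state, (int, np.integer)):
        # An int is used as the seed of a new generator.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # An existing generator is used as-is.
        return random_state
    raise ValueError(f"{random_state!r} cannot be used to seed a RandomState")


# The same seed yields the same sequence of draws.
draws_a = resolve_random_state(0).randint(0, 100, size=5)
draws_b = resolve_random_state(0).randint(0, 100, size=5)

# An existing generator is passed through unchanged.
rng = np.random.RandomState(1)
same_rng = resolve_random_state(rng)
```

Passing an int is usually the simplest way to make repeated runs reproducible.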
References
Britto, Alceu S., Robert Sabourin, and Luiz ES Oliveira. “Dynamic selection of classifiers—a comprehensive review.” Pattern Recognition 47.11 (2014): 3665-3680.
Kuncheva, Ludmila I. Combining pattern classifiers: methods and algorithms. John Wiley & Sons, 2004.
R. M. O. Cruz, R. Sabourin, and G. D. Cavalcanti, “Dynamic classifier selection: Recent advances and perspectives,” Information Fusion, vol. 41, pp. 195-216, 2018.
fit(X, y)
Fit the static selection model by selecting an ensemble of classifiers containing the base classifiers with the highest accuracy on the given dataset.
Parameters: - X : array of shape (n_samples, n_features)
Data used to fit the model.
- y : array of shape (n_samples)
Class labels of each example in X.
Returns: - self : object
Returns self.
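The selection step described for fit can be sketched in plain NumPy. This is an illustration of the idea, not DESlib's implementation: ThresholdStump and select_best are hypothetical names, and the rule that the ensemble size is int(pool size * pct_classifiers) is an assumption that this page does not state.

```python
import numpy as np


class ThresholdStump:
    """Hypothetical base classifier: predicts 1 when one feature exceeds a threshold."""

    def __init__(self, feature, threshold):
        self.feature = feature
        self.threshold = threshold

    def predict(self, X):
        return (X[:, self.feature] > self.threshold).astype(int)


def select_best(pool, X, y, pct_classifiers=0.5):
    """Return the indices of the top-scoring classifiers and all accuracies."""
    # Accuracy of every base classifier on the given data.
    scores = np.array([np.mean(clf.predict(X) == y) for clf in pool])
    # Assumed rule: keep int(pool size * pct_classifiers) members.
    n_selected = int(len(pool) * pct_classifiers)
    # Stable sort on negated scores: highest accuracy first, ties keep pool order.
    order = np.argsort(-scores, kind="stable")
    return order[:n_selected], scores


X = np.array([[0.1, 0.9], [0.4, 0.2], [0.6, 0.8], [0.9, 0.3]])
y = np.array([0, 0, 1, 1])
pool = [
    ThresholdStump(0, 0.5),   # accuracy 1.0
    ThresholdStump(1, 0.5),   # accuracy 0.5
    ThresholdStump(0, 0.05),  # predicts 1 for every sample, accuracy 0.5
    ThresholdStump(0, 0.3),   # accuracy 0.75
]

selected, scores = select_best(pool, X, y, pct_classifiers=0.5)
print(selected.tolist())  # [0, 3]
```

With four classifiers and pct_classifiers=0.5, the two most accurate ones (indices 0 and 3) form the ensemble.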
predict(X)
Predict the label of each sample in X and return the predicted labels.
Parameters: - X : array of shape (n_samples, n_features)
The data to be classified.
Returns: - predicted_labels : array of shape (n_samples)
Predicted class for each sample in X.
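One common way to turn the selected classifiers' outputs into a single label per sample is majority voting. The sketch below assumes that combination rule, which this page does not specify; majority_vote is a hypothetical helper, and labels are assumed to be integers starting at 0.

```python
import numpy as np


def majority_vote(votes):
    """Combine votes of shape (n_classifiers, n_samples) into (n_samples,) labels."""
    n_classes = votes.max() + 1
    # counts[c, j] is the number of classifiers that voted class c for sample j.
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)])
    # argmax returns the first maximum, so ties go to the lowest class label.
    return counts.argmax(axis=0)


# Predictions of three selected classifiers for four samples.
votes = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
])

predicted_labels = majority_vote(votes)
print(predicted_labels.tolist())  # [0, 1, 1, 0]
```

The result has shape (n_samples), matching the return value documented above.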
predict_proba(X)
Estimate the posterior probabilities for each sample in X.
Parameters: - X : array of shape (n_samples, n_features)
The input data.
Returns: - predicted_proba : array of shape (n_samples, n_classes)
Probability estimates for each sample in X.
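A simple way to obtain ensemble posteriors is to average the probability estimates of the selected classifiers. The sketch below assumes that averaging rule for illustration only; the page does not say how the estimates are combined, and the input values are made up.

```python
import numpy as np

# Hypothetical per-classifier estimates, shape (n_classifiers, n_samples, n_classes).
member_proba = np.array([
    [[0.75, 0.25], [0.5, 0.5]],   # classifier 1
    [[0.25, 0.75], [0.0, 1.0]],   # classifier 2
])

# Average over the classifier axis to get shape (n_samples, n_classes).
predicted_proba = member_proba.mean(axis=0)
print(predicted_proba.tolist())  # [[0.5, 0.5], [0.25, 0.75]]
```

Because every member's rows sum to 1, the averaged rows also sum to 1.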
score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric because it requires the entire label set of each sample to be predicted correctly.
Parameters: - X : array-like of shape (n_samples, n_features)
Test samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for X.
- sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
Returns: - score : float
Mean accuracy of self.predict(X) with respect to y.
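The mean accuracy returned by score, with and without sample_weight, can be reproduced by hand. The following is a minimal NumPy sketch; mean_accuracy is a hypothetical helper, and y_pred stands in for the output of self.predict(X).

```python
import numpy as np


def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Fraction of correct predictions, optionally weighted per sample."""
    correct = (np.asarray(y_true) == np.asarray(y_pred)).astype(float)
    # With sample_weight=None this is the plain mean of the 0/1 indicators.
    return float(np.average(correct, weights=sample_weight))


y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0])  # stands in for self.predict(X)

print(mean_accuracy(y_true, y_pred))                             # 0.75
print(mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 0]))  # 0.5
```

In the weighted call, the misclassified third sample carries half of the total weight, so the score drops from 0.75 to 0.5.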