Instance Hardness
This file contains the implementation of different measures of instance hardness.
deslib.util.instance_hardness.hardness_region_competence(neighbors_idx, labels, safe_k)
Calculate the instance hardness of each sample based on its neighborhood. A sample is deemed hard to classify when different classes overlap in its region of competence. This method does not take the target label of the test sample into account.
This hardness measure is used to decide whether to use DS (dynamic selection) or KNN to classify a given query sample.
Parameters:
- neighbors_idx : array of shape = [n_samples_test, k]
Indices of the nearest neighbors for each considered sample.
- labels : array of shape = [n_samples_train]
Labels associated with each training sample.
- safe_k : int
Number of neighbors used to estimate the hardness of the corresponding region.
Returns:
- hardness : array of shape = [n_samples_test]
The hardness level associated with each test sample.
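The parameters and return value above can be illustrated with a minimal NumPy sketch. It is not the library's code: the function name `region_hardness_sketch`, the toy data, and the `threshold` value are hypothetical, and the sketch assumes that hardness is the fraction of the first `safe_k` neighbors that do not belong to the majority class of the neighborhood, with `safe_k` no larger than `k`.

```python
import numpy as np


def region_hardness_sketch(neighbors_idx, labels, safe_k):
    """Illustrative stand-in (hypothetical name), not the DESlib function.

    Assumes hardness = share of the first `safe_k` neighbors that fall
    outside the majority class of the neighborhood, and safe_k <= k.
    """
    neighbors_idx = np.atleast_2d(neighbors_idx)
    labels = np.asarray(labels)

    # Labels of the safe_k closest training samples for each query.
    neighbor_labels = labels[neighbors_idx[:, :safe_k]]

    hardness = np.empty(neighbor_labels.shape[0], dtype=float)
    for i, row in enumerate(neighbor_labels):
        # Size of the largest class among this query's neighbors.
        _, counts = np.unique(row, return_counts=True)
        hardness[i] = (safe_k - counts.max()) / safe_k
    return hardness


# Toy data: six training labels and three queries with k = 3 neighbors each.
labels = np.array([0, 0, 0, 1, 1, 1])
neighbors_idx = np.array([[0, 1, 2],   # all neighbors in class 0
                          [2, 3, 4],   # mixed neighborhood
                          [0, 3, 1]])  # mixed neighborhood

hardness = region_hardness_sketch(neighbors_idx, labels, safe_k=3)

# Hypothetical routing rule: easy regions go to KNN, hard ones to DS.
threshold = 0.3
use_knn = hardness <= threshold
print(hardness, use_knn)
```

In this toy run, only the first query has a homogeneous neighborhood, so it is the only one the hypothetical rule would route to KNN. Note that the test sample's own label never enters the computation, which matches the description above.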
References
Smith, M.R., Martinez, T. and Giraud-Carrier, C., 2014. An instance level analysis of data complexity. Machine Learning, 95(2), pp.225-256.