Chetty, CA and Simi, VR and Joseph, J and Venugopal, V (2024) A Nonparametric Feature Separability Measure and an Algorithm for Simulating Synthetic Feature Vectors. In: 10th International Conference on Information Management, ICIM 2024, 8 March 2024 through 10 March 2024, Cambridge, pp. 388-397.
Abstract
Measures that quantitatively reflect the separability between feature sets of two classes are required to identify the determinant features and select hyper-parameters of feature extraction algorithms, in binary classification paradigms. State-of-the-art separability measures look for equality of distribution parameters of the feature sets and do not linearly quantify the level of overlap between them. Reliable algorithms for generating synthetic feature sets with known levels of overlap are required to test and compare the performance of the separability measures. A measure of separability of features between two classes termed Thresholding-based Classification Error Estimate (TCEE) and an algorithm for generating synthetic feature vectors for testing the feature separability measures are proposed in this paper. Pearson's correlation coefficient (PCC) of the Bhattacharyya distance (BD), Relative Entropy (RE), p-value of Rank-sum test, Jeffries-Matusita (JM) distance and TCEE with the percentage of overlaps on synthetic feature sets of two distinct classes are −0.6429, −0.6428, 0.3780, −0.9881, and 1. A high value of Pearson's correlation with the percentage of overlap justifies that the TCEE can accurately measure separability of feature sets of two classes. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
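The abstract does not give the definition of TCEE or the steps of the synthetic-feature algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes one-dimensional features, two uniform classes whose supports share a chosen fraction of their width, and a stand-in score defined as the lowest misclassification rate achievable by any single threshold. The names `simulate_overlapping_features` and `threshold_error` are hypothetical.

```python
import numpy as np


def simulate_overlapping_features(overlap, n=2000, seed=0):
    """Hypothetical generator (not the paper's algorithm): two 1-D uniform
    classes of unit width whose supports share a fraction `overlap`."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.0, 1.0, n)
    b = rng.uniform(1.0 - overlap, 2.0 - overlap, n)
    return a, b


def threshold_error(a, b):
    """Assumed stand-in score (not the paper's TCEE definition): the lowest
    misclassification rate of any single threshold, in either direction."""
    values = np.concatenate([a, b])
    labels = np.concatenate([np.zeros(a.size), np.ones(b.size)])
    labels = labels[np.argsort(values, kind="stable")]
    # For a cut placed after the k-th sorted sample (k = 0..N), the rule
    # "a below, b above" misclassifies every b below the cut and every a above it.
    b_below = np.concatenate([[0.0], np.cumsum(labels)])
    a_above = a.size - np.concatenate([[0.0], np.cumsum(1.0 - labels)])
    errors = b_below + a_above
    n_total = values.size
    # The reversed rule ("b below, a above") misclassifies the complement.
    best = min(errors.min(), (n_total - errors).min())
    return best / n_total


# Sweep the known overlap level and correlate it with the score.
overlaps = np.linspace(0.0, 1.0, 11)
errors = [
    threshold_error(*simulate_overlapping_features(p, seed=i))
    for i, p in enumerate(overlaps)
]
r = np.corrcoef(overlaps, errors)[0, 1]
print(f"Pearson correlation between overlap and threshold error: {r:.4f}")
```

Under these assumptions the expected score is half the overlap fraction, so it grows linearly with overlap. That is the property the abstract asks of a separability measure, but the numbers this sketch produces are not the paper's reported results.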
Item Type: | Conference Paper |
---|---|
Publication: | Communications in Computer and Information Science |
Publisher: | Springer Science and Business Media Deutschland GmbH |
Additional Information: | The copyright for this article belongs to Springer Science and Business Media Deutschland GmbH. |
Keywords: | Classification (of information); Feature extraction, Bhattacharya distance; Classification errors; Error estimates; Feature separability measure; Features sets; Jeffrie-matusita distance; Relative entropy; Separability measure; Synthetic feature set; Thresholding, Entropy |
Department/Centre: | Autonomous Societies / Centres > Centre for Brain Research |
Date Deposited: | 09 Sep 2024 10:40 |
Last Modified: | 09 Sep 2024 10:40 |
URI: | http://eprints.iisc.ac.in/id/eprint/86042 |