Variable-length particle swarm optimization for feature selection on high-dimensional classification
journal contributionposted on 18.01.2021, 05:51 by Binh Tran, Bing Xue, Mengjie Zhang
With a global search mechanism, particle swarm optimization (PSO) has shown promise in feature selection (FS). However, most of the current PSO-based FS methods use a fix-length representation, which is inflexible and limits the performance of PSO for FS. When applying these methods to high-dimensional data, it not only consumes a significant amount of memory but also requires a high computational cost. Overcoming this limitation enables PSO to work on data with much higher dimensionality which has become more and more popular with the advance of data collection technologies. In this paper, we propose the first variable-length PSO representation for FS, enabling particles to have different and shorter lengths, which defines smaller search space and therefore, improves the performance of PSO. By rearranging features in a descending order of their relevance, we facilitate particles with shorter lengths to achieve better classification performance. Furthermore, using the proposed length changing mechanism, PSO can jump out of local optima, further narrow the search space and focus its search on smaller and more fruitful area. These strategies enable PSO to reach better solutions in a shorter time. Results on ten high-dimensional datasets with varying difficulties show that the proposed variable-length PSO can achieve much smaller feature subsets with significantly higher classification performance in much shorter time than the fixed-length PSO methods. The proposed method also outperformed the compared non-PSO FS methods in most cases.