A Novel Penalty-Based Wrapper Objective Function for Feature Selection in Big Data using Cooperative Co-Evolution
The rapid progress of modern technologies generates massive amounts of high-throughput data, called Big Data, which provide opportunities to find new insights using machine learning (ML) algorithms. Big Data consist of many features (also called attributes); however, not all of them are necessary or relevant, and irrelevant features may degrade the performance of ML algorithms. Feature selection (FS) is an essential preprocessing step that reduces the dimensionality of a dataset. Evolutionary algorithms (EAs) are widely used search algorithms for FS. When classification accuracy is used as the objective function for FS, EAs such as the cooperative co-evolutionary algorithm (CCEA) achieve higher accuracy, but often with a larger number of features. Feature selection has two goals: reducing the number of features to decrease computation, and improving classification accuracy. These goals conflict, but both can be pursued with a single objective function. To this end, this paper proposes a penalty-based wrapper objective function. The function is used to evaluate the FS process with CCEA, and the resulting approach is called Cooperative Co-Evolutionary Algorithm-Based Feature Selection (CCEAFS). Experiments were performed with six widely used classifiers on six different datasets from the UCI ML repository, with and without FS. The experimental results indicate that the proposed objective function is effective at reducing the number of features in the final feature subset without significantly reducing classification accuracy. Based on different performance measures, naïve Bayes outperforms the other classifiers in most cases when using CCEAFS.
Rashid BANM, Ahmed M, Sikos L, Haskell-Dowland PS (Dowland PS)
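
The idea of a penalty-based wrapper objective can be illustrated with a minimal sketch. The abstract does not state the paper's formula, so the form used here (accuracy minus a weighted fraction of selected features) is an assumption chosen only for illustration, not the authors' function. The names penalized_fitness, penalty_weight, and toy_accuracy are hypothetical, and toy_accuracy is a stub standing in for training and scoring a real classifier on the selected features.

```python
from typing import Callable, Sequence


def penalized_fitness(
    mask: Sequence[int],
    accuracy_fn: Callable[[Sequence[int]], float],
    penalty_weight: float = 0.1,
) -> float:
    """Score a feature subset: wrapper accuracy minus a size penalty.

    ASSUMPTION: this linear penalty is illustrative only; the paper's
    actual objective function is not given in the abstract.
    mask[i] == 1 means feature i is selected.
    """
    n_total = len(mask)
    n_selected = sum(mask)
    if n_total == 0 or n_selected == 0:
        # An empty subset cannot be used to train a classifier.
        return 0.0
    accuracy = accuracy_fn(mask)
    return accuracy - penalty_weight * (n_selected / n_total)


def toy_accuracy(mask: Sequence[int]) -> float:
    """Hypothetical stub for a wrapper evaluation.

    Pretends that features 0 and 2 carry all the signal: accuracy is
    0.9 when both are selected and 0.6 otherwise.
    """
    return 0.9 if mask[0] == 1 and mask[2] == 1 else 0.6


if __name__ == "__main__":
    compact = [1, 0, 1, 0]   # only the two informative features
    full = [1, 1, 1, 1]      # same accuracy, twice as many features
    print(penalized_fitness(compact, toy_accuracy))
    print(penalized_fitness(full, toy_accuracy))
```

With this stub, the compact subset and the full feature set reach the same accuracy, so the penalty term makes the search prefer the compact one. In an evolutionary search such as CCEA, a score of this kind would be computed for each candidate subset; the value of penalty_weight controls how much accuracy the search is willing to trade for fewer features.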