TY - JOUR
T1 - Medical Feature Selection Approach Based on Generalized Normal Distribution Algorithm
AU - Abdel-Basset, Mohamed
AU - Mohamed, Reda
AU - Chakrabortty, Ripon K.
AU - Ryan, Michael J.
AU - Nam, Yunyoung
AU - Abouhawwash, Mohamed
N1 - Funding Information:
Funding Statement: This work has supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2021R1A2C1010362) and the Soonchunhyang University Research Fund.
Publisher Copyright:
© 2021 Tech Science Press. All rights reserved.
PY - 2021
Y1 - 2021
N2 - This paper proposes a new pre-processing technique to separate the most effective features from those that might deteriorate the performance of the machine learning classifiers in terms of computational costs and classification accuracy because of their irrelevance, redundancy, or less information; this pre-processing process is often known as feature selection. This technique is based on adopting a new optimization algorithm known as generalized normal distribution optimization (GNDO) supported by the conversion of the normal distribution to a binary one using the arctangent transfer function to convert the continuous values into binary values. Further, a novel restarting strategy (RS) is proposed to preserve the diversity among the solutions within the population by identifying the solutions that exceed a specific distance from the best-so-far and replace them with the others created using an effective updating scheme. This strategy is integrated with GNDO to propose another binary variant having a high ability to preserve the diversity of the solutions for avoiding becoming stuck in local minima and accelerating convergence, namely improved GNDO (IGNDO). The proposed GNDO and IGNDO algorithms are extensively compared with seven state-of-the-art algorithms to verify their performance on thirteen medical instances taken from the UCI repository. IGNDO is shown to be superior in terms of fitness value and classification accuracy and competitive with the others in terms of the selected features. Since the principal goal in solving the FS problem is to find the appropriate subset of features that maximize classification accuracy, IGNDO is considered the best.
AB - This paper proposes a new pre-processing technique to separate the most effective features from those that might deteriorate the performance of the machine learning classifiers in terms of computational costs and classification accuracy because of their irrelevance, redundancy, or less information; this pre-processing process is often known as feature selection. This technique is based on adopting a new optimization algorithm known as generalized normal distribution optimization (GNDO) supported by the conversion of the normal distribution to a binary one using the arctangent transfer function to convert the continuous values into binary values. Further, a novel restarting strategy (RS) is proposed to preserve the diversity among the solutions within the population by identifying the solutions that exceed a specific distance from the best-so-far and replace them with the others created using an effective updating scheme. This strategy is integrated with GNDO to propose another binary variant having a high ability to preserve the diversity of the solutions for avoiding becoming stuck in local minima and accelerating convergence, namely improved GNDO (IGNDO). The proposed GNDO and IGNDO algorithms are extensively compared with seven state-of-the-art algorithms to verify their performance on thirteen medical instances taken from the UCI repository. IGNDO is shown to be superior in terms of fitness value and classification accuracy and competitive with the others in terms of the selected features. Since the principal goal in solving the FS problem is to find the appropriate subset of features that maximize classification accuracy, IGNDO is considered the best.
KW - Feature selection
KW - Generalized normal distribution optimization
KW - Novel restarting strategy
KW - Transfer function
KW - Uci repository
UR - http://www.scopus.com/inward/record.url?scp=85115907281&partnerID=8YFLogxK
U2 - 10.32604/cmc.2021.017854
DO - 10.32604/cmc.2021.017854
M3 - Article
AN - SCOPUS:85115907281
SN - 1546-2218
VL - 69
SP - 2883
EP - 2901
JO - Computers, Materials and Continua
JF - Computers, Materials and Continua
IS - 3
ER -