Analytics of Heterogeneous Breast Cancer Data Using Neuroevolution

Beibit Abdikenov, Zangir Iklassov, Askhat Sharipov, Shahid HUSSAIN, Prashant K. Jamwal

Research output: Contribution to journal (Article)

Abstract

Breast cancer prognostic modeling is difficult since it is governed by many diverse factors. Given the low median survival and the large-scale breast cancer data generated by high-throughput technology, accurate and reliable prognosis of breast cancer is becoming increasingly difficult. Accurate and timely prognosis may spare many patients painful and expensive treatments, and it may also help oncologists manage the disease more efficiently and effectively. Data analytics augmented by machine-learning algorithms have been proposed in the past for breast cancer prognosis; however, most of these approaches did not perform well owing to the heterogeneous nature of the available data and issues related to model interpretability. A robust prognostic modeling approach is proposed here whereby a Pareto optimal set of deep neural networks (DNNs) exhibiting equally good performance metrics is obtained. The set of DNNs is initialized and their hyperparameters are optimized using the evolutionary algorithm NSGA-III. The final DNN model is selected from the Pareto optimal set using a fuzzy inferencing approach. Rather than treating DNNs as a black box, the proposed scheme shows how various performance metrics (such as accuracy, sensitivity, and F1) change with changes in hyperparameters. This enhanced interpretability can be further used to improve or modify the behavior of DNNs. The heterogeneous breast cancer database requires preprocessing for better interpretation of categorical variables in order to improve prognosis from classifiers. Furthermore, we propose to use a neural network-based entity-embedding method for categorical features with high cardinality. This approach can provide a vector representation of categorical features in multidimensional space with enhanced interpretability. It is shown that DNNs optimized using evolutionary algorithms exhibit improved performance over the other classifiers considered in this paper.
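The notion of a Pareto optimal set of models used in the abstract can be illustrated with a minimal sketch. This is not the NSGA-III procedure from the paper; it only shows the non-dominance test that defines a Pareto set. The metric values are made-up illustrative numbers, not results from the article, and the names `dominates` and `pareto_front` are hypothetical helpers introduced for this example.

```python
from typing import Dict, List

# Each candidate model is described by its metric scores (higher is better).
# These numbers are invented for illustration only.
Candidate = Dict[str, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on every metric and better on at least one."""
    return all(a[m] >= b[m] for m in a) and any(a[m] > b[m] for m in a)


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other is not c)
    ]


models = [
    {"accuracy": 0.90, "sensitivity": 0.80, "f1": 0.84},
    {"accuracy": 0.88, "sensitivity": 0.86, "f1": 0.85},
    {"accuracy": 0.87, "sensitivity": 0.79, "f1": 0.82},  # dominated by the first model
    {"accuracy": 0.85, "sensitivity": 0.88, "f1": 0.83},
]

front = pareto_front(models)
for model in front:
    print(model)
```

The surviving models trade one metric against another, which is why a further selection step (fuzzy inferencing in the paper) is needed to pick a single final model.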
Language: English
Pages: 18050-18060
Number of pages: 11
Journal: IEEE Access
Volume: 7
DOI: 10.1109/ACCESS.2019.2897078
Publication status: Published - 1 Jan 2019

Fingerprint

Evolutionary algorithms
Classifiers
Deep neural networks
Learning algorithms
Learning systems
Throughput

Cite this

Abdikenov, Beibit ; Iklassov, Zangir ; Sharipov, Askhat ; HUSSAIN, Shahid ; Jamwal, Prashant K. / Analytics of Heterogeneous Breast Cancer Data Using Neuroevolution. In: IEEE Access. 2019 ; Vol. 7. pp. 18050-18060.
@article{2eb08a9f01f6417482104e514e3cbfcc,
title = "Analytics of Heterogeneous Breast Cancer Data Using Neuroevolution",
abstract = "Breast cancer prognostic modeling is difficult since it is governed by many diverse factors. Given the low median survival and the large-scale breast cancer data generated by high-throughput technology, accurate and reliable prognosis of breast cancer is becoming increasingly difficult. Accurate and timely prognosis may spare many patients painful and expensive treatments, and it may also help oncologists manage the disease more efficiently and effectively. Data analytics augmented by machine-learning algorithms have been proposed in the past for breast cancer prognosis; however, most of these approaches did not perform well owing to the heterogeneous nature of the available data and issues related to model interpretability. A robust prognostic modeling approach is proposed here whereby a Pareto optimal set of deep neural networks (DNNs) exhibiting equally good performance metrics is obtained. The set of DNNs is initialized and their hyperparameters are optimized using the evolutionary algorithm NSGA-III. The final DNN model is selected from the Pareto optimal set using a fuzzy inferencing approach. Rather than treating DNNs as a black box, the proposed scheme shows how various performance metrics (such as accuracy, sensitivity, and F1) change with changes in hyperparameters. This enhanced interpretability can be further used to improve or modify the behavior of DNNs. The heterogeneous breast cancer database requires preprocessing for better interpretation of categorical variables in order to improve prognosis from classifiers. Furthermore, we propose to use a neural network-based entity-embedding method for categorical features with high cardinality. This approach can provide a vector representation of categorical features in multidimensional space with enhanced interpretability. It is shown that DNNs optimized using evolutionary algorithms exhibit improved performance over the other classifiers considered in this paper.",
keywords = "breast cancer prognostic modelling, entity embedding, deep learning networks, evolutionary algorithms, fuzzy inferencing",
author = "Beibit Abdikenov and Zangir Iklassov and Askhat Sharipov and Shahid HUSSAIN and Jamwal, {Prashant K.}",
year = "2019",
month = "1",
day = "1",
doi = "10.1109/ACCESS.2019.2897078",
language = "English",
volume = "7",
pages = "18050--18060",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",

}

Abdikenov, B, Iklassov, Z, Sharipov, A, HUSSAIN, S & Jamwal, PK 2019, 'Analytics of Heterogeneous Breast Cancer Data Using Neuroevolution', IEEE Access, vol. 7, pp. 18050-18060. https://doi.org/10.1109/ACCESS.2019.2897078

Analytics of Heterogeneous Breast Cancer Data Using Neuroevolution. / Abdikenov, Beibit ; Iklassov, Zangir ; Sharipov, Askhat ; HUSSAIN, Shahid; Jamwal, Prashant K.

In: IEEE Access, Vol. 7, 01.01.2019, p. 18050-18060.

TY - JOUR

T1 - Analytics of Heterogeneous Breast Cancer Data Using Neuroevolution

AU - Abdikenov, Beibit

AU - Iklassov, Zangir

AU - Sharipov, Askhat

AU - HUSSAIN, Shahid

AU - Jamwal, Prashant K.

PY - 2019/1/1

Y1 - 2019/1/1

AB - Breast cancer prognostic modeling is difficult since it is governed by many diverse factors. Given the low median survival and the large-scale breast cancer data generated by high-throughput technology, accurate and reliable prognosis of breast cancer is becoming increasingly difficult. Accurate and timely prognosis may spare many patients painful and expensive treatments, and it may also help oncologists manage the disease more efficiently and effectively. Data analytics augmented by machine-learning algorithms have been proposed in the past for breast cancer prognosis; however, most of these approaches did not perform well owing to the heterogeneous nature of the available data and issues related to model interpretability. A robust prognostic modeling approach is proposed here whereby a Pareto optimal set of deep neural networks (DNNs) exhibiting equally good performance metrics is obtained. The set of DNNs is initialized and their hyperparameters are optimized using the evolutionary algorithm NSGA-III. The final DNN model is selected from the Pareto optimal set using a fuzzy inferencing approach. Rather than treating DNNs as a black box, the proposed scheme shows how various performance metrics (such as accuracy, sensitivity, and F1) change with changes in hyperparameters. This enhanced interpretability can be further used to improve or modify the behavior of DNNs. The heterogeneous breast cancer database requires preprocessing for better interpretation of categorical variables in order to improve prognosis from classifiers. Furthermore, we propose to use a neural network-based entity-embedding method for categorical features with high cardinality. This approach can provide a vector representation of categorical features in multidimensional space with enhanced interpretability. It is shown that DNNs optimized using evolutionary algorithms exhibit improved performance over the other classifiers considered in this paper.

KW - breast cancer prognostic modelling

KW - entity embedding

KW - deep learning networks

KW - Evolutionary algorithms

KW - fuzzy inferencing

UR - http://www.scopus.com/inward/record.url?scp=85062209129&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/analytics-heterogeneous-breast-cancer-data-using-neuroevolution

U2 - 10.1109/ACCESS.2019.2897078

DO - 10.1109/ACCESS.2019.2897078

M3 - Article

VL - 7

SP - 18050

EP - 18060

JO - IEEE Access

T2 - IEEE Access

JF - IEEE Access

SN - 2169-3536

ER -