Accuracy Comparison between Five Machine Learning Algorithms for Financial Risk Evaluation

Haokun Dong, Rui Liu, Allan W. Tham

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Accurate prediction of loan default is crucial in credit risk evaluation. Even a slight loss of predictive accuracy can cause financial losses to lending institutions. This study describes a non-parametric approach that compares five machine learning classifiers, with a focus on sufficiently large datasets. It reports standard performance measures, namely accuracy, precision, recall and F1 score, together with the area under the receiver operating characteristic curve (ROC-AUC). Data pre-processing techniques, including normalization and standardization, imputation of missing values, and the handling of imbalanced data using SMOTE, are discussed and implemented. The study also examines the use of hyper-parameters in the classifiers. During the model construction phase, pipelines feed data to the five classifiers, and performance results are reported for models built with SMOTE or hyper-parameters versus models built without SMOTE and hyper-parameters. The classifiers are compared with one another on accuracy in both the training phase and the prediction phase, the latter based on out-of-sample data. The two datasets used in the experiment contain 1,000 and 30,000 observations, respectively, each with a training/testing ratio of 80:20. The comparative results show that random forest outperforms the other four classifiers in both training and actual prediction.
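The performance measures named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, the convention that label 1 means "default", and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP/FP/TN/FN, assuming label 1 marks a default (the positive class)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if p == 1:
            counts["tp" if t == 1 else "fp"] += 1
        else:
            counts["fn" if t == 1 else "tn"] += 1
    return counts


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 computed from the confusion counts."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    precision = c["tp"] / predicted_pos if predicted_pos else 0.0
    recall = c["tp"] / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance that a random positive outscores a random negative.

    Tied scores count as half a win. This pairwise form is O(n^2) and is
    meant for illustration, not for datasets of 30,000 rows.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy out-of-sample labels and predicted default probabilities (made up).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall and F1 depend on the chosen threshold, whereas ROC-AUC is computed from the raw scores. That is one reason the two kinds of measure are often reported side by side on imbalanced data such as loan defaults.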

    Original language: English
    Article number: 50
    Pages (from-to): 1-19
    Number of pages: 19
    Journal: Journal of Risk and Financial Management
    Volume: 17
    Issue number: 2
    Publication status: Published - Feb 2024
