Pruned extreme learning machines optimization and increasing performance and predictions on large real world problems

  • Lavneet Singh

    Student thesis: Doctoral Thesis


    Feed-forward neural networks have been used extensively in many fields because of their ability (1) to approximate complex nonlinear mappings directly from input samples and (2) to provide models for a large class of natural and artificial phenomena that are difficult to handle with classical parametric techniques. However, fast learning algorithms for neural networks are lacking: traditional learning algorithms are usually far slower than required, and training a network with them can take several hours, several days, or even longer. Huang et al. (2006) proposed a novel algorithm known as the Extreme Learning Machine (ELM) for single-hidden-layer feed-forward neural networks, which requires less computation time and learns faster, even on large datasets. The core of ELM is the random initialization of weights in place of slow, iterative gradient-based learning, known as backpropagation. Significant work has been done on improving the generalization, learning speed and rate of convergence of ELMs. Nevertheless, ELM still suffers from certain limitations, such as sensitivity to outliers and irrelevant variables and the choice of the number of hidden nodes. To address these limitations, constructive and heuristic approaches have been proposed in the literature. The accuracy and performance of machine learning and statistical models still depend on tuning certain parameters and on optimization to generate better predictive models from the training data. Larger datasets are also problematic because of increased computation time, greater complexity and poor generalization caused by outliers.
Motivated by the extreme learning machine (ELM), we propose RANSAC multi-model response regularization to prune the large number of hidden nodes and thereby achieve better optimality, generalization and classification accuracy of the ELM network. Experimental results on different benchmark datasets and real-time problems show that the proposed algorithm prunes the hidden nodes optimally and achieves better generalization and higher classification accuracy than other algorithms, including SVM and OP-ELM, on binary and multi-class classification and regression problems. In the first part of this thesis, I carry out an extensive investigation of various classifiers known to perform very well on different machine learning problems, and propose improved and highly accurate hybrid ensemble classifiers based on Support Vector Machines (SVM) and Extreme Learning Machines for protein fold recognition. In protein fold prediction, it is very hard to classify the various folds from their amino acid attributes because of the limited availability of training data. The proposed classifier therefore applies dimensionality reduction using PCA and LDA prior to classification. The second part of this thesis presents a principled approach for investigating brain abnormalities based on wavelet-based feature extraction, PCA-based feature selection and classification with deep and optimized pruned extreme learning machines, compared against various other classifiers. The third part of this thesis presents the proposed architectural design for email personalization using our deal database, based on gradient boosting with optimized pruned extreme learning machines as base estimators. In this experimental study, we also conducted a deep dive into the data to identify each member's behaviour and the important attributes that play a significant role in increasing click rates in personalized emails.
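
The basic ELM training scheme mentioned in the abstract (random hidden-layer weights, no backpropagation) can be illustrated with a minimal sketch. It follows the commonly described formulation in which the output weights are obtained by a least-squares solve; that step, the `tanh` activation, the function names `train_elm` and `predict_elm`, and the default of 50 hidden nodes are assumptions made for illustration and are not taken from the thesis. The sketch does not implement the RANSAC-based pruning proposed here.

```python
import numpy as np


def train_elm(X, y, n_hidden=50, seed=0):
    """Fit a basic single-hidden-layer ELM (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn at random and never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    # Hidden-layer output matrix, one row per training sample.
    H = np.tanh(X @ W + b)
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Apply the fixed random hidden layer, then the fitted output weights."""
    return np.tanh(X @ W + b) @ beta


if __name__ == "__main__":
    # Toy regression problem (hypothetical data): learn y = sin(x) on [-3, 3].
    X = np.linspace(-3, 3, 200).reshape(-1, 1)
    y = np.sin(X).ravel()
    W, b, beta = train_elm(X, y, n_hidden=30)
    mse = np.mean((predict_elm(X, W, b, beta) - y) ** 2)
    print(f"training MSE: {mse:.2e}")
```

Because the only fitted quantity is `beta`, training reduces to one linear solve, which is where the speed advantage over iterative gradient-based training comes from. The number of hidden nodes remains a free parameter, which is the limitation the pruning approach in this thesis targets.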
    Date of Award: 2015
    Original language: English
    Supervisor: Girija Chetty (Supervisor) & Max Wagner (Supervisor)
