Modelling athletic training and performance: a hybrid artificial neural network ensemble approach

  • Tania Churchill

    Student thesis: Doctoral Thesis


    How an athlete trains is the most influential of the many variables that determine performance in endurance sports. The question facing all coaches and athletes is how to train in order to: a) produce peak performance; and b) produce that peak when desired. The aim of this research was to develop a model that enables the accurate prediction of cycling performance using field-derived training and racing data. A number of techniques were developed to pre-process raw sensor data; a novel training load quantification technique (PTRIMP) was developed to encapsulate the duration and intensity of training load in one metric; the shape of training microcycles was analysed using a data mining approach whereby the time series of PTRIMP was transformed into a symbolic representation; Heart Rate Variability (HRV) indices were calculated from daily orthostatic tests; and a performance quantification technique was developed that allows performance to be calculated from power data obtained from any race or training session in which the athlete makes a 100% effort. These data allowed a model to be created that can assist an athlete in manipulating training load and consequently managing fatigue levels, such that they arrive at a competition in a state from which a high performance is likely and a low performance is unlikely. Artificial Neural Networks (ANN) were the key modelling technique used to model the relationship between training and performance. A hybrid ANN ensemble model, named HANNEM, was developed. To avoid overfitting of the model, the regularisation technique of bagging and adding noise to each bag was used. A combination model was created by using the output from a linear statistical model as an input for the neural network model, resulting in improved model accuracy. In the final modelling step, an ensemble of neural network models was created, resulting in an improvement of predictive performance over a single model.
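The regularisation step described above (bagging, with noise added to each bag, followed by averaging over an ensemble) can be illustrated with a minimal sketch. This is not the HANNEM implementation: the base learner is a least-squares line standing in for an ANN, the noise is assumed to be Gaussian and applied to the inputs only, and the names `fit_line`, `noisy_bag` and `bagged_ensemble` as well as the parameters `n_models` and `noise_sd` are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line. ASSUMPTION: a simple stand-in for the ANN base learner."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # Guard against a degenerate bag in which every x value is identical.
    slope = 0.0 if sxx == 0 else sum(
        (x - mx) * (y - my) for x, y in zip(xs, ys)
    ) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def noisy_bag(xs, ys, noise_sd, rng):
    """Draw one bootstrap bag (sampling with replacement) and jitter its inputs.

    ASSUMPTION: zero-mean Gaussian noise on the inputs; the thesis abstract
    does not state the noise distribution or where it is applied.
    """
    idx = [rng.randrange(len(xs)) for _ in xs]
    bag_x = [xs[i] + rng.gauss(0.0, noise_sd) for i in idx]
    bag_y = [ys[i] for i in idx]
    return bag_x, bag_y


def bagged_ensemble(xs, ys, n_models=25, noise_sd=0.1, seed=0):
    """Fit one base learner per noisy bag and average their predictions."""
    rng = random.Random(seed)
    models = [fit_line(*noisy_bag(xs, ys, noise_sd, rng)) for _ in range(n_models)]
    return lambda x: mean(m(x) for m in models)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the thesis.
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 for x in xs]
    predict = bagged_ensemble(xs, ys, noise_sd=0.05)
    print(predict(5.0))
```

Averaging over bags reduces the variance of any single learner, which is the property that makes the approach attractive for small, noisy datasets.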
Model fit was evaluated using leave-one-out testing. Three datasets consisting of longitudinal training and racing power data obtained from three elite cyclists were used to validate the model (R² = 0.51, 0.64 and 0.78). These moderate to strong results are in a similar range to those reported in other modelling studies. ANN models on noisy, sparse datasets are prone to overfitting. A series of experiments was performed on synthetic data to investigate overfitting issues and validate the use of a bagged ensemble of ANNs on such datasets. Interestingly, deliberately overtrained individual ANNs resulted in improved ensemble performance for synthetic datasets with low to moderate data error rates. The novel modelling approach employed overcame the difficulties associated with using field-derived model inputs, which will allow the model to be implemented in the real-world environment of professional sport. The model is a practical tool for the planning of training to maximise competition performance.
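As an illustration of leave-one-out testing, the sketch below holds out each observation in turn, refits on the remainder, and scores the held-out predictions. It rests on several assumptions: R² is computed here as 1 − SS_res / SS_tot over the held-out predictions, which may differ from the definition used in the thesis; the least-squares line is a stand-in for the actual model; and the names `fit_line` and `leave_one_out_r2` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line. ASSUMPTION: a stand-in for the model being evaluated."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = 0.0 if sxx == 0 else sum(
        (x - mx) * (y - my) for x, y in zip(xs, ys)
    ) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def leave_one_out_r2(xs, ys, fit):
    """Predict each point from a model fitted on all the other points.

    ASSUMPTION: R^2 = 1 - SS_res / SS_tot on the held-out predictions.
    """
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]   # every input except the i-th
        train_y = ys[:i] + ys[i + 1:]   # every target except the i-th
        model = fit(train_x, train_y)
        preds.append(model(xs[i]))      # score only the held-out point
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the thesis.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]
    print(leave_one_out_r2(xs, ys, fit_line))
```

Because every prediction comes from a model that never saw the scored point, this estimate is less optimistic than an in-sample fit, which matters when only a small number of race performances is available.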
    Date of Award: 2014
    Original language: English
    Supervisor: Dharmendra Sharma AM PhD (Supervisor), Muthukumar Balachandran (Supervisor) & Graham Williams (Supervisor)
