Clinical name entity recognition based on machine learning

  • Thoai Man Luu

    Student thesis: Doctoral Thesis


    Background In this thesis, a novel machine learning framework for clinical named entity recognition is proposed. The clinical context is the nursing ‘handover’ scenario, which involves the transfer of information between shifts in care teams, and the sharing of professional responsibility and accountability for patient care from one clinical team to another, either temporarily or permanently. With changes in the working hours and shifts of clinical teams (doctors, nurses and registrars in the health care system), and an increasing demand for flexible work practices, mechanisms that support effective and efficient handover of information, responsibility and accountability while preserving patient safety have become recognized as increasingly important for the delivery of high quality health care. While significant attention has been given to the development of computer-aided decision support systems in other eHealth contexts, such as managing electronic medical records, disease diagnosis, monitoring, and treatment, computer-aided systems for clinical handover have received little emphasis, leading to compromises in quality of care and patient safety. Hence, this thesis investigates these poorly addressed problems and proposes a novel machine learning framework with a focus on two main tasks: the front-end clinical speech recognition task, and the robust clinical named entity recognition task.

    Methods The methods used in this work address the two main research questions identified in this thesis. A new machine learning framework with several new algorithms is proposed for the two tasks, with particular focus on the named entity recognition task because of the challenges of natural language processing and text mining in the unstructured, free-text clinical domain.
For the experimental evaluation of the algorithms proposed in this work, the publicly available benchmark datasets on nursing handover from Task 1 of the Conference and Labs of the Evaluation Forum (CLEF) challenge in 2015 and 2016 were used, and the performance achieved by each algorithm was compared with that of the other participating systems in the challenge.

Results The experiments designed to validate each stage of the proposed machine learning framework were promising: each algorithm performed better than the other participating systems in the challenge task on several performance metrics, including precision, recall and final score. For the clinical speech recognition task, a Hidden Markov Model (HMM) based speech recognition subsystem, built by adapting acoustic and language models with an extended dictionary, was proposed and implemented using the Carnegie Mellon University (CMU) Sphinx engine. For the clinical named entity recognition task, several machine learning algorithms were developed, with incrementally improving precision, recall and F-score. These increasingly powerful algorithms deal with the complexity of unstructured, free-text clinical notes and are based on regular expression patterns and natural language processing features in Chapter 4, Maximum Entropy and Hidden Markov Model algorithms in Chapter 5, a novel co-training approach in Chapter 6, and cutting-edge deep learning techniques in Chapter 7.
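
As an illustration of the precision, recall and F-score measures named above, the following is a minimal sketch that scores predicted entity spans against gold annotations by exact match. The function name, the span representation and the example labels are assumptions made for this sketch; they are not taken from the thesis or from the challenge's official scoring script, which may average scores differently.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match entity scores from collections of (start, end, label) spans.

    Hypothetical helper for illustration; not the challenge's official scorer.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)  # spans found with the right label
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical annotations: token offsets plus an illustrative label.
gold_spans = {(0, 2, "PATIENT_NAME"), (5, 6, "AGE"), (9, 11, "MEDICATION")}
predicted_spans = {
    (0, 2, "PATIENT_NAME"),
    (5, 6, "AGE"),
    (12, 13, "MEDICATION"),  # span not present in the gold set
    (14, 15, "AGE"),         # span not present in the gold set
}

p, r, f = precision_recall_f1(gold_spans, predicted_spans)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this example two of the four predicted spans are correct and two of the three gold spans are recovered, so precision is lower than recall and the F-score lies between them.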
Discussion/Conclusion The novel algorithms developed, implemented using Java-based software (LingPipe, Stanford, Co-training, Maximum Entropy, and Deep Learning for Java), provide a cost-effective, end-to-end, integrated, open source clinical decision support technology platform for reducing medical errors and improving patient safety and quality of care.
    Date of Award: 2018
    Original language: English
    Supervisor: Rachel Davey (Supervisor) & Girija Chetty (Supervisor)
