Evaluation data and benchmarks for cascaded speech recognition and entity extraction

Liyuan Zhou, Hanna Suominen, Leif Hanlen

Research output: A Conference proceeding or a Chapter in Book › Conference contribution › peer-review

5 Citations (Scopus)
1 Downloads (Pure)

Abstract

During clinical handover, clinicians exchange information about patients and the state of clinical management. To improve care safety and quality, both handover and its documentation have been standardized. Speech recognition and entity extraction can help health service providers follow these standards by implementing the handover process as a structured form, whose headings guide the handover narrative, and the documentation process as proofing and sign-off of the automatically filled-out form. In this paper, we evaluate such systems. The form covers the sections Handover nurse, Patient introduction, My shift, Medication, Appointments, and Future care, divided into 49 mutually exclusive headings to be filled out with speech-recognized and extracted entities. Our system correctly recognizes 10,244 out of 14,095 spoken words and, despite 6,692 erroneous words, its error percentage is significantly smaller than that of systems submitted to the CLEF eHealth Evaluation Lab 2015. In the extraction of the 35 entities with training data (i.e., 14 headings were not present in the 101 expert-annotated training documents with 8,487 words in total), the system correctly extracts 2,375 out of 3,793 words in 50 test documents after calibration on 3,937 words in 50 validation documents. This translates to over 90% F1 in extracting information for the patient's age, current bed, current room, and given name, and over 70% F1 for the patient's admission reason/diagnosis and last name. F1 for filtering out irrelevant information is 78%. We have made the data for 201 handover cases publicly available, together with processing results and code, and proposed the extraction task for CLEF eHealth 2016.
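The counts quoted in the abstract can be turned into percentages with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not define its "error percentage", so dividing the erroneous words by the number of spoken words is an assumption, as is the split of errors into missed and inserted words. All variable and function names are hypothetical.

```python
# Counts quoted in the abstract.
SPOKEN_WORDS = 14_095        # words in the spoken reference
CORRECT_WORDS = 10_244       # correctly recognized words
ERRONEOUS_WORDS = 6_692      # erroneous words reported for the system
EXTRACTION_CORRECT = 2_375   # correctly extracted words (test documents)
EXTRACTION_TOTAL = 3_793     # words in the 50 test documents
HEADINGS_WITH_TRAINING_DATA = 35
HEADINGS_WITHOUT_TRAINING_DATA = 14


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Share of spoken words that were recognized correctly.
word_accuracy = percentage(CORRECT_WORDS, SPOKEN_WORDS)

# ASSUMPTION: error percentage = erroneous words / spoken words.
# The abstract does not state the formula actually used.
error_percentage = percentage(ERRONEOUS_WORDS, SPOKEN_WORDS)

# ASSUMPTION: erroneous words = missed (substituted or deleted) + inserted.
# Under that reading, the errors beyond the missed words are insertions.
missed_words = SPOKEN_WORDS - CORRECT_WORDS
inserted_words = ERRONEOUS_WORDS - missed_words

# Share of test-document words that were extracted correctly.
extraction_accuracy = percentage(EXTRACTION_CORRECT, EXTRACTION_TOTAL)

# The 49 form headings split into those with and without training data.
total_headings = HEADINGS_WITH_TRAINING_DATA + HEADINGS_WITHOUT_TRAINING_DATA

if __name__ == "__main__":
    print(f"word accuracy:       {word_accuracy}%")
    print(f"error percentage:    {error_percentage}% (assumed definition)")
    print(f"missed / inserted:   {missed_words} / {inserted_words} (assumed split)")
    print(f"extraction accuracy: {extraction_accuracy}%")
    print(f"form headings:       {total_headings}")
```

Under these assumptions, the erroneous-word count exceeds the number of words that were not recognized correctly, which is consistent with the error count also including inserted words.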
Original language: English
Title of host publication: SLAM 2015 - Proceedings of the 2015 Workshop on Speech, Language and Audio in Multimedia, co-located with ACM MM 2015
Editors: Guillaume Gravier, Martha Larson, Gareth Jones, Roeland Ordelman
Publisher: Association for Computing Machinery (ACM)
Pages: 15-18
Number of pages: 4
ISBN (Electronic): 9781450337496
ISBN (Print): 9781450337496
Publication status: Published - 30 Oct 2015
Event: ACM Multimedia 2015: The third Edition Workshop on Speech, Language & Audio in Multimedia - Brisbane Exhibition & Convention Centre, Brisbane, Australia
Duration: 26 Oct 2015 to 30 Oct 2015
http://www.acmmm.org/2015/proceedings/

Publication series

Name: SLAM 2015 - Proceedings of the 2015 Workshop on Speech, Language and Audio in Multimedia, co-located with ACM MM 2015

Conference

Conference: ACM Multimedia 2015
Country/Territory: Australia
City: Brisbane
Period: 26/10/15 to 30/10/15
