Task 1a of the CLEF eHealth Evaluation Lab 2015

Hanna Suominen, Leif Hanlen, Lorraine Goeuriot, Liadh Kelly, Gareth J.F. Jones

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

1 Citation (Scopus)

Abstract

Best practice for clinical handover and its documentation recommends standardized, structured, and synchronous processes with patient involvement. Cascaded speech recognition (SR) and information extraction could support compliance with these recommendations and free clinicians' time from writing documents for patient interaction and education. However, the high requirements for processing correctness raise methodological challenges. First, multiple people speak clinical jargon in the presence of background noise, with limited possibilities for SR personalization. Second, errors multiply in the cascade; hence, SR correctness needs to be evaluated carefully against these requirements. This overview paper reports on how these issues were addressed in a shared task of the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum in 2015. The task released 100 synthetic handover documents for training and another 100 documents for testing, in both verbal and written formats. It attracted 48 team registrations, 21 email confirmations, and four method submissions by two teams. The submissions were compared against a leading commercial SR engine and a simple majority baseline. Although this engine performed significantly better than any submission [a test error percentage of 38.5 vs. 52.8 for the best submission, with a Wilcoxon signed-rank test statistic of 302.5 (p < 10⁻¹²)], the releases of data, tools, and evaluations contribute to the body of knowledge on task difficulty and method suitability.
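
The headline statistic above (a Wilcoxon signed-rank test on paired error percentages) can be sketched in code. The following is a minimal, hypothetical Python sketch with synthetic data, not the lab's evaluation pipeline: it assumes the metric is a per-document word-level error percentage (edit distance over word tokens as a percentage of reference length) and compares the two systems' paired per-document scores with scipy.stats.wilcoxon. The error_percentage helper and all numbers other than the quoted means (38.5 and 52.8) are illustrative assumptions.

```python
# Minimal sketch, NOT the lab's evaluation code: a word-level error
# percentage per document, plus a Wilcoxon signed-rank comparison of two
# systems' paired per-document scores. All data below are synthetic.
import numpy as np
from scipy.stats import wilcoxon

def error_percentage(reference: str, hypothesis: str) -> float:
    """Hypothetical metric: word-level edit distance as % of reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over word tokens.
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,         # deletion
                          d[i, j - 1] + 1,         # insertion
                          d[i - 1, j - 1] + cost)  # substitution
    return 100.0 * d[len(ref), len(hyp)] / max(len(ref), 1)

print(error_percentage("patient stable overnight on room air",
                       "patient table overnight in room air"))  # ~33.3 (2 of 6 words wrong)

# Paired per-document error percentages for the 100 test documents,
# synthesized around the figures quoted in the abstract (38.5 vs. 52.8):
rng = np.random.default_rng(0)
engine_err = rng.normal(38.5, 5.0, size=100)      # commercial SR engine
submission_err = rng.normal(52.8, 5.0, size=100)  # best submission

res = wilcoxon(engine_err, submission_err)
print(f"W = {res.statistic:.1f}, p = {res.pvalue:.1e}")
```

The signed-rank test is a natural fit here because both systems are scored on the same 100 test documents, giving paired samples without any normality assumption.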

Original language: English
Title of host publication: The eHealth Evaluation Lab of the Conference and Labs of the Evaluation Forum in 2015
Subtitle of host publication: 16th Conference and Labs of the Evaluation Forum, CLEF 2015
Editors: Linda Cappellato, Nicola Ferro, Gareth J.F. Jones, Eric San Juan
Place of Publication: Toulouse, France
Publisher: CEUR Workshop Proceedings
Pages: 1-18
Number of pages: 18
Volume: 1391
Publication status: Published - 8 Sep 2015
Event: 6th International Conference on Labs of the Evaluation Forum, CLEF 2015 - Toulouse, France
Duration: 8 Sep 2015 - 11 Sep 2015
http://clef2015.clef-initiative.eu/publications.php

Publication series

Name: CLEF2015 Working Notes
Publisher: CEUR Workshop Proceedings
Volume: 1391
ISSN (Print): 1613-0073

Conference

Conference: 6th International Conference on Labs of the Evaluation Forum, CLEF 2015
Abbreviated title: CLEF 2015
Country: France
City: Toulouse
Period: 8/09/15 - 11/09/15
Other: CLEF 2015 is the sixth CLEF conference, continuing the popular CLEF campaigns that have run since 2000 and contributed to the systematic evaluation of information access systems, primarily through experimentation on shared tasks.

Building on the format first introduced in 2010, CLEF 2015 consists of an independent peer-reviewed conference on a broad range of issues in the fields of multilingual and multimodal information access evaluation, and a set of labs and workshops designed to test different aspects of mono- and cross-language information retrieval systems. Together, the conference and the lab series will maintain and expand upon the CLEF tradition of community-based evaluation and discussion of evaluation issues.

Cite this

Suominen, H., Hanlen, L., Goeuriot, L., Kelly, L., & Jones, G. J. F. (2015). Task 1a of the CLEF eHealth Evaluation Lab 2015. In L. Cappellato, N. Ferro, G. J. F. Jones, & E. San Juan (Eds.), The eHealth Evaluation Lab of the Conference and Labs of the Evaluation Forum in 2015: 16th Conference and Labs of the Evaluation Forum, CLEF 2015 (Vol. 1391, pp. 1-18). (CLEF2015 Working Notes; Vol. 1391). Toulouse, France: CEUR Workshop Proceedings.
Suominen, Hanna ; Hanlen, Leif ; Goeuriot, Lorraine ; Kelly, Liadh ; Jones, Gareth J.F. / Task 1a of the CLEF eHealth Evaluation Lab 2015. The eHealth Evaluation Lab of the Conference and Labs of the Evaluation Forum in 2015: 16th Conference and Labs of the Evaluation Forum, CLEF 2015. editor / Linda Cappellato ; Nicola Ferro ; Gareth J.F. Jones ; Eric San Juan. Vol. 1391 Toulouse, France : CEUR Workshop Proceedings, 2015. pp. 1-18 (CLEF2015 Working Notes).
@inproceedings{42862de5737348c08899aa86f6ba4fe8,
title = "Task 1a of the CLEF eHealth Evaluation Lab 2015",
abstract = "Best practice for clinical handover and its documentation recommends standardized, structured, and synchronous processes with patient involvement. Cascaded speech recognition (SR) and information extraction could support their compliance and release clinicians' time from writing documents to patient interaction and education. However, high requirements for processing correctness evoke methodological challenges. First, multiple people speak clinical jargon in the presence of background noise with limited possibilities for SR personalization. Second, errors multiply in cascading and hence, SR correctness needs to be carefully evaluated as meeting the requirements. This overview paper reports on how these issues were addressed in a shared task of the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum in 2015. The task released 100 synthetic handover documents for training and another 100 documents for testing in both verbal and written formats. It attracted 48 team registrations, 21 email confirmations, and four method submissions by two teams. The submissions were compared against a leading commercial SR engine and simple majority baseline. Although this engine performed significantly better than any submission [i.e., 38.5 vs. 52.8 test error percentage of the best submission with the Wilcoxon signed-rank test value of 302.5 (p < 10-12)], the releases of data, tools, and evaluations contribute to the body of knowledge on the task difficulty and method suitability.",
keywords = "Computer systems evaluation, Data collection, Information extraction, Medical informatics, Nursing records, Patient Hand-over, Patient handoff, Records as topic, Software design, Speech recognition, Test-set generation",
author = "Hanna Suominen and Leif Hanlen and Lorraine Goeuriot and Liadh Kelly and Jones, {Gareth J.F.}",
year = "2015",
month = "9",
day = "8",
language = "English",
volume = "1391",
series = "CLEF2015 Working Notes",
publisher = "CEUR Workshop Proceedings",
pages = "1--18",
editor = "Linda Cappellato and Nicola Ferro and Jones, {Gareth J.F.} and {San Juan}, Eric",
booktitle = "the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum in 2015",

}

Suominen, H, Hanlen, L, Goeuriot, L, Kelly, L & Jones, GJF 2015, Task 1a of the CLEF eHealth Evaluation Lab 2015. in L Cappellato, N Ferro, GJF Jones & E San Juan (eds), The eHealth Evaluation Lab of the Conference and Labs of the Evaluation Forum in 2015: 16th Conference and Labs of the Evaluation Forum, CLEF 2015. vol. 1391, CLEF2015 Working Notes, vol. 1391, CEUR Workshop Proceedings, Toulouse, France, pp. 1-18, 6th International Conference on Labs of the Evaluation Forum, CLEF 2015, Toulouse, France, 8/09/15.

Task 1a of the CLEF eHealth Evaluation Lab 2015. / Suominen, Hanna; Hanlen, Leif; Goeuriot, Lorraine; Kelly, Liadh; Jones, Gareth J.F.

The eHealth Evaluation Lab of the Conference and Labs of the Evaluation Forum in 2015: 16th Conference and Labs of the Evaluation Forum, CLEF 2015. ed. / Linda Cappellato; Nicola Ferro; Gareth J.F. Jones; Eric San Juan. Vol. 1391 Toulouse, France : CEUR Workshop Proceedings, 2015. p. 1-18 (CLEF2015 Working Notes; Vol. 1391).

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

TY - GEN

T1 - Task 1a of the CLEF eHealth Evaluation Lab 2015

AU - Suominen, Hanna

AU - Hanlen, Leif

AU - Goeuriot, Lorraine

AU - Kelly, Liadh

AU - Jones, Gareth J.F.

PY - 2015/9/8

Y1 - 2015/9/8

N2 - Best practice for clinical handover and its documentation recommends standardized, structured, and synchronous processes with patient involvement. Cascaded speech recognition (SR) and information extraction could support compliance with these recommendations and free clinicians' time from writing documents for patient interaction and education. However, the high requirements for processing correctness raise methodological challenges. First, multiple people speak clinical jargon in the presence of background noise, with limited possibilities for SR personalization. Second, errors multiply in the cascade; hence, SR correctness needs to be evaluated carefully against these requirements. This overview paper reports on how these issues were addressed in a shared task of the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum in 2015. The task released 100 synthetic handover documents for training and another 100 documents for testing, in both verbal and written formats. It attracted 48 team registrations, 21 email confirmations, and four method submissions by two teams. The submissions were compared against a leading commercial SR engine and a simple majority baseline. Although this engine performed significantly better than any submission [a test error percentage of 38.5 vs. 52.8 for the best submission, with a Wilcoxon signed-rank test statistic of 302.5 (p < 10⁻¹²)], the releases of data, tools, and evaluations contribute to the body of knowledge on task difficulty and method suitability.

AB - Best practice for clinical handover and its documentation recommends standardized, structured, and synchronous processes with patient involvement. Cascaded speech recognition (SR) and information extraction could support compliance with these recommendations and free clinicians' time from writing documents for patient interaction and education. However, the high requirements for processing correctness raise methodological challenges. First, multiple people speak clinical jargon in the presence of background noise, with limited possibilities for SR personalization. Second, errors multiply in the cascade; hence, SR correctness needs to be evaluated carefully against these requirements. This overview paper reports on how these issues were addressed in a shared task of the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum in 2015. The task released 100 synthetic handover documents for training and another 100 documents for testing, in both verbal and written formats. It attracted 48 team registrations, 21 email confirmations, and four method submissions by two teams. The submissions were compared against a leading commercial SR engine and a simple majority baseline. Although this engine performed significantly better than any submission [a test error percentage of 38.5 vs. 52.8 for the best submission, with a Wilcoxon signed-rank test statistic of 302.5 (p < 10⁻¹²)], the releases of data, tools, and evaluations contribute to the body of knowledge on task difficulty and method suitability.

KW - Computer systems evaluation

KW - Data collection

KW - Information extraction

KW - Medical informatics

KW - Nursing records

KW - Patient Hand-over

KW - Patient handoff

KW - Records as topic

KW - Software design

KW - Speech recognition

KW - Test-set generation

UR - http://www.scopus.com/inward/record.url?scp=84982805922&partnerID=8YFLogxK

M3 - Conference contribution

VL - 1391

T3 - CLEF2015 Working Notes

SP - 1

EP - 18

BT - The eHealth Evaluation Lab of the Conference and Labs of the Evaluation Forum in 2015

A2 - Cappellato, Linda

A2 - Ferro, Nicola

A2 - Jones, Gareth J.F.

A2 - San Juan, Eric

PB - CEUR Workshop Proceedings

CY - Toulouse, France

ER -

Suominen H, Hanlen L, Goeuriot L, Kelly L, Jones GJF. Task 1a of the CLEF eHealth Evaluation Lab 2015. In Cappellato L, Ferro N, Jones GJF, San Juan E, editors, The eHealth Evaluation Lab of the Conference and Labs of the Evaluation Forum in 2015: 16th Conference and Labs of the Evaluation Forum, CLEF 2015. Vol. 1391. Toulouse, France: CEUR Workshop Proceedings. 2015. p. 1-18. (CLEF2015 Working Notes).