Task 1 of the CLEF eHealth evaluation lab 2016

Handover information extraction

Hanna Suominen, Liyuan Zhou, Lorraine Goeuriot, Liadh Kelly

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

3 Citations (Scopus)

Abstract

Cascaded speech recognition (SR) and information extraction (IE) could support the best practice for clinical handover and release clinicians' time from writing documents to patient interaction and education. However, high requirements for processing correctness evoke methodological challenges, and hence processing correctness needs to be carefully evaluated as meeting the requirements. This overview paper reports on how these issues were addressed in a shared task of the eHealth evaluation lab of the Conference and Labs of the Evaluation Forum (CLEF) in 2016. This IE task built on the 2015 CLEF eHealth Task on SR by using its 201 synthetic handover documents for training and validation (approx. 8,500 + 7,700 words) and releasing another 100 documents with over 6,500 expert-annotated words for testing. It attracted 25 team registrations and 3 team submissions with 2 methods each. When using the macro-averaged F1 over the 35 form headings present in the training documents for evaluation on the test documents, all participant methods outperformed all 4 baselines, including the organizers' method (F1 = 0.25), published in 2015 in a top-tier medical informatics journal and provided to the participants as an option to build on; a random classifier (F1 = 0.02); and majority classifiers for the two most common classes (i.e., NA to filter out text irrelevant to the form and the most common form heading, both with F1 > 0.00). The top-2 methods (F1 = 0.38 and 0.37) had statistically significantly (p < 0.05, Wilcoxon signed-rank test) better performance than the third-best method (F1 = 0.35). In comparison, the top-3 methods and the organizers' method (7th) had F1 of 0.81, 0.80, 0.81, and 0.75 in the NA class, respectively.
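As a concrete illustration of the evaluation protocol described in the abstract, the sketch below shows how a macro-averaged F1 over form-heading classes and a Wilcoxon signed-rank test between two methods could be computed. This is a minimal sketch, not the organizers' evaluation code: it assumes scikit-learn and SciPy are available, and all labels and per-document scores in it are invented for the example.

from scipy.stats import wilcoxon
from sklearn.metrics import f1_score

# Hypothetical word-level gold and predicted form headings for a handover
# snippet; "NA" marks text irrelevant to the form (illustrative data only).
gold        = ["NA", "Medication", "NA", "Diagnosis", "Medication", "NA"]
predicted_a = ["NA", "Medication", "NA", "NA",        "Medication", "NA"]
predicted_b = ["NA", "NA",         "NA", "Diagnosis", "Medication", "NA"]

# Macro-averaged F1 computes F1 per class (per form heading) and averages
# the class scores with equal weight, so rare headings count as much as NA.
macro_a = f1_score(gold, predicted_a, average="macro", zero_division=0)
macro_b = f1_score(gold, predicted_b, average="macro", zero_division=0)
print(f"method A macro-F1: {macro_a:.2f}, method B macro-F1: {macro_b:.2f}")

# For significance testing, pair the two methods' scores on the same test
# documents (hypothetical per-document F1 values here) and apply the
# Wilcoxon signed-rank test; a small p-value indicates a systematic gap.
per_doc_a = [0.41, 0.35, 0.39, 0.44, 0.32, 0.38, 0.40, 0.36]
per_doc_b = [0.37, 0.33, 0.36, 0.40, 0.30, 0.35, 0.38, 0.34]
statistic, p_value = wilcoxon(per_doc_a, per_doc_b)
print(f"Wilcoxon signed-rank: statistic={statistic}, p={p_value:.3f}")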

Original language: English
Title of host publication: 2016 Working Notes of Conference and Labs of the Evaluation Forum
Editors: Krisztian Balog, Linda Cappellato, Nicola Ferro, Craig Macdonald
Place of publication: Online
Publisher: CEUR Workshop Proceedings
Pages: 1-14
Number of pages: 14
Volume: 1609
Publication status: Published - 2016
Event: 7th International Conference of the CLEF Association, CLEF 2016 - Evora, Portugal
Duration: 5 Sep 2016 - 8 Sep 2016
Internet address: http://clef2016.clef-initiative.eu/ (conference website)

Publication series

Name: CEUR WS-1609 - CLEF2016 Working Notes
Publisher: CEUR Workshop Proceedings
Volume: 1609
ISSN (Print): 1613-0073

Conference

Conference: 7th International Conference of the CLEF Association, CLEF 2016
Abbreviated title: CLEF 2016
Country: Portugal
City: Evora
Period: 5/09/16 - 8/09/16
Other: CLEF 2016 is the seventh CLEF conference, continuing the popular CLEF campaigns that have run since 2000 and contributed to the systematic evaluation of information access systems, primarily through experimentation on shared tasks. Building on the format first introduced in 2010, CLEF 2016 consists of an independent peer-reviewed conference on a broad range of issues in the fields of multilingual and multimodal information access evaluation, and a set of labs and workshops designed to test different aspects of mono- and cross-language information retrieval systems. Together, the conference and the lab series will maintain and expand upon the CLEF tradition of community-based evaluation and discussion of evaluation issues.

Cite this

Suominen, H., Zhou, L., Goeuriot, L., & Kelly, L. (2016). Task 1 of the CLEF eHealth evaluation lab 2016: Handover information extraction. In K. Balog, L. Cappellato, N. Ferro, & C. Macdonald (Eds.), 2016 Working Notes of Conference and Labs of the Evaluation Forum (Vol. 1609, pp. 1-14). (CEUR WS-1609 - CLEF2016 Working Notes; Vol. 1609). Online: CEUR Workshop Proceedings.
Keywords

Computer Systems Evaluation; Data Collection; Information Extraction; Medical Informatics; Nursing Records; Patient Handoff; Patient Handover; Records as Topic; Software Design; Speech Recognition; Test-set Generation; Text Classification