TY - GEN
T1 - Segmentation of patent claims for improving their readability
AU - Ferraro, Gabriela
AU - Suominen, Hanna
AU - Nualart Vilaplana, Jaume
N1 - Funding Information:
NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program. We also express our gratitude to the TALN Research Group from Universitat Pompeu Fabra for their corpus development. Finally, we thank the anonymous reviewers of The 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR 2014), held in conjunction with the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014), for their comments and suggestions.
Publisher Copyright:
© European Chapter of the Association for Computational Linguistics.
PY - 2014
Y1 - 2014
N2 - Good readability of text is important to ensure efficiency in communication and eliminate risks of misunderstanding. Patent claims are an example of text whose readability is often poor. In this paper, we aim to improve claim readability by a clearer presentation of its content. Our approach consists of segmenting the original claim content at two levels. First, an entire claim is segmented into the components of preamble, transitional phrase, and body, using a rule-based approach. Second, a conditional random field is trained to segment the components into clauses. An alternative approach would have been to modify the claim content, which is, however, prone to also changing the meaning of this legal text. For both segmentation levels, we report results from statistical evaluation of segmentation performance. In addition, a qualitative error analysis was performed to understand the problems underlying the clause segmentation task. Our accuracy in detecting the beginning and end of preamble text is 1.00 and 0.97, respectively. For the transitional phrase, these numbers are 0.94 and 1.00 and for the body text, 1.00 and 1.00. Our precision and recall in the clause segmentation are 0.77 and 0.76, respectively. The results give evidence for the feasibility of automated claim and clause segmentation, which may help not only inventors, researchers, and other laypeople to understand patents but also patent experts to avoid future legal costs due to litigation.
AB - Good readability of text is important to ensure efficiency in communication and eliminate risks of misunderstanding. Patent claims are an example of text whose readability is often poor. In this paper, we aim to improve claim readability by a clearer presentation of its content. Our approach consists of segmenting the original claim content at two levels. First, an entire claim is segmented into the components of preamble, transitional phrase, and body, using a rule-based approach. Second, a conditional random field is trained to segment the components into clauses. An alternative approach would have been to modify the claim content, which is, however, prone to also changing the meaning of this legal text. For both segmentation levels, we report results from statistical evaluation of segmentation performance. In addition, a qualitative error analysis was performed to understand the problems underlying the clause segmentation task. Our accuracy in detecting the beginning and end of preamble text is 1.00 and 0.97, respectively. For the transitional phrase, these numbers are 0.94 and 1.00 and for the body text, 1.00 and 1.00. Our precision and recall in the clause segmentation are 0.77 and 0.76, respectively. The results give evidence for the feasibility of automated claim and clause segmentation, which may help not only inventors, researchers, and other laypeople to understand patents but also patent experts to avoid future legal costs due to litigation.
KW - Text Segmentation
KW - Readability
KW - Patent claims
UR - http://www.scopus.com/inward/record.url?scp=85105776758&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781937284916
SN - 9781632664075
T3 - Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations, PITR 2014 at the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014
SP - 66
EP - 73
BT - Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations, PITR 2014 at the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014
A2 - Williams, Sandra
A2 - Siddharthan, Advaith
A2 - Nenkova, Ani
PB - Curran Associates
CY - New York, USA
T2 - 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR 2014)
Y2 - 26 April 2014 through 30 April 2014
ER -