The need for annotated corpora from legal documents, and for (human) protocols for creating them: The attribution problem

Paper presented at the 2016 Dagstuhl Seminar on Natural Language Argumentation: Mining, Processing, and Reasoning over Textual Arguments, held at Schloss Dagstuhl in Wadern, Germany, April 17-22, 2016.


This paper argues that progress in automating argumentation mining from legal documents depends critically on two things. First, we need a sufficient supply of manually annotated corpora, together with theoretical and experimental evidence that the annotations are accurate. Second, we need protocols for effectively training people to perform the tasks and sub-tasks required to create those annotations. Such protocols are necessary not only for a team approach to annotation and for quality assurance of the finished annotations, but also for developing and testing software that assists humans in the annotation process. Drawing upon the latest work at Hofstra University’s Law, Logic and Technology Research Laboratory in New York, the paper offers an extended example from the problem of annotating attribution relations, illustrating why consistent and accurate annotations in law are extremely difficult to obtain and why protocols are necessary. Attribution is the problem of determining which actor believes, asserts, or relies upon the truth of a proposition as a premise or a conclusion of an argument. The paper shows that annotating attribution relations correctly is a critical task when applying argumentation mining to legal documents.

