Searching for Legal Clauses by Analogy. Few-shot Contract Discovery Shared Task
The aim of this task is to return substrings of a requested document that represent clauses analogous (semantically and formally equivalent) to the provided examples from other documents.
Subsets of Corporate Bond and Non-disclosure Agreement documents from US EDGAR and Charity Annual Reports from the UK Charity Register were annotated so that clauses of the same type are selected (e.g. clauses determining the governing law; the clause types depend on the type of legal act). A clause can consist of a single sentence, multiple sentences, or parts of sentences. The exact type of clause is not important during evaluation, since no full-featured training is allowed and one has to rely solely on a set of a few sample clauses during execution.
The input file consists of up to 6 tab-separated fields, e.g.:
| ID of document to search in | Entity considered | Example #1 | ... | Example #N |
|-----------------------------|-------------------|---------------------|-------------------|---------------------|
| NDA_057 | governing-law | NDA_059 15215-15453 | NDA_033 7890-8032 | NDA_009 12797-13364 |
Each example consists of a document ID (NDA_059, NDA_033, NDA_009) and a character range (15215-15453 and so on). Ranges can be discontinuous. In such a case their parts are separated by a comma, e.g. 4103-4882,12127-12971.
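The row layout above can be read with a few lines of Python. This is only a minimal sketch: the names `parse_ranges`, `parse_input_line` and `Example` are hypothetical, and it assumes that the document ID and the range inside an example field are separated by a single space, as in the sample row, and that the parts of a discontinuous range are comma-separated, as in the example `4103-4882,12127-12971`.

```python
from typing import List, NamedTuple, Tuple

Span = Tuple[int, int]


class Example(NamedTuple):
    """One example clause: source document ID plus its character spans."""
    doc_id: str
    spans: List[Span]


def parse_ranges(text: str) -> List[Span]:
    """Parse '4103-4882,12127-12971' into [(4103, 4882), (12127, 12971)]."""
    spans = []
    for part in text.split(","):
        start, end = part.split("-")
        spans.append((int(start), int(end)))
    return spans


def parse_input_line(line: str) -> Tuple[str, str, List[Example]]:
    """Split one input row into (document to search in, entity, examples)."""
    # Drop empty fields so rows with fewer examples are handled as well.
    fields = [f for f in line.rstrip("\n").split("\t") if f]
    target_doc, entity = fields[0], fields[1]
    examples = []
    for field in fields[2:]:
        # Assumption: "DOC_ID RANGES", separated by a single space.
        doc_id, ranges = field.split(" ", 1)
        examples.append(Example(doc_id, parse_ranges(ranges)))
    return target_doc, entity, examples
```

Calling `parse_input_line` on the sample row yields `NDA_057`, `governing-law`, and three `Example` entries, one per example column.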
The expected file contains one answer per line, consisting of the entity name (to be copied from the input) and a character range in the same format as described above.
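Serializing a predicted range back into that format could look like the sketch below. The helper name `format_ranges` is hypothetical, and sorting the parts by start offset is an assumption rather than a stated requirement. How the entity name and the range are joined on an answer line is not specified here, so the sketch covers only the range part.

```python
from typing import Iterable, Tuple


def format_ranges(spans: Iterable[Tuple[int, int]]) -> str:
    """Render spans as 'start-end' parts joined by commas."""
    # Assumption: parts are emitted in ascending order of start offset.
    return ",".join(f"{start}-{end}" for start, end in sorted(spans))
```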
The reference file contains 2 tab-separated fields: the document ID and its content.
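As a rough illustration, the reference file can be loaded into a dictionary and an example clause recovered by slicing the document content. The function names `load_reference` and `extract_clause` are hypothetical. The sketch assumes UTF-8 text, one document per line, and character offsets that are 0-based with an exclusive end; none of these are stated in this README, so check them against the data before relying on them.

```python
import lzma
from typing import Dict, List, Tuple


def load_reference(path: str) -> Dict[str, str]:
    """Read an xz-compressed TSV of (document id, content) into a dict."""
    docs = {}
    # newline="\n" avoids newline translation, which could shift offsets.
    with lzma.open(path, "rt", encoding="utf-8", newline="\n") as handle:
        for line in handle:
            doc_id, content = line.rstrip("\n").split("\t", 1)
            docs[doc_id] = content
    return docs


def extract_clause(content: str, spans: List[Tuple[int, int]]) -> List[str]:
    """Return the text of each span (assumed 0-based, end-exclusive)."""
    return [content[start:end] for start, end in spans]
```

For the sample row, `extract_clause(docs["NDA_059"], [(15215, 15453)])` would return the text of the first example clause under these assumptions.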
The metric used is Soft-F1.
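This README does not define Soft-F1. One common reading of a soft F1 over character spans, sketched below, takes precision and recall from the number of characters shared between the expected and the predicted ranges. The function names are hypothetical, the end-exclusive treatment of ranges is an assumption, and the official scorer may compute or aggregate the score differently.

```python
from typing import Iterable, Set, Tuple

Span = Tuple[int, int]


def covered(spans: Iterable[Span]) -> Set[int]:
    """Set of character positions covered by the spans (end-exclusive)."""
    chars: Set[int] = set()
    for start, end in spans:
        chars.update(range(start, end))
    return chars


def soft_f1(expected: Iterable[Span], predicted: Iterable[Span]) -> float:
    """Character-overlap F1 between expected and predicted spans."""
    exp, pred = covered(expected), covered(predicted)
    if not exp and not pred:
        return 1.0  # nothing expected, nothing predicted
    overlap = len(exp & pred)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(exp)
    return 2 * precision * recall / (precision + recall)
```

Under this reading, a prediction that covers half of the expected clause and an equal amount of unrelated text scores 0.5, while an exact match scores 1.0.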
Directory structure
- README.md: this file
- config.txt: configuration file
- dev-0/: directory with dev (test) data
- dev-0/in.tsv: input data for the dev set
- dev-0/expected.tsv: expected (reference) data for the dev set
- dev-0/reference.tsv.xz: file with documents considered in dev set
- test-A: directory with test data
- test-A/in.tsv: input data for the test set
- test-A/reference.tsv.xz: file with documents considered in test set