Diachronic normalisation of Polish texts
Transform old Polish texts into modern spelling. [ver. 1.0.0]
This is the full list of all submissions; to see only the best ones, see the leaderboard.
# | submitter | when | ver. | description | dev-0 CharMatch | dev-1 CharMatch | test-A CharMatch
---|---|---|---|---|---|---|---
26 | ked | 2023-10-12 10:01 | 1.0.0 | plt5-base_normalizer_test_pruned, no finetuning | N/A | N/A | 0.0021
14 | p/tlen | 2022-07-07 06:18 | 1.0.0 | Lucene Transducers ver. 0.25-SNAPSHOT extended=yes rule-based | 1.0000 | 0.5508 | 0.5968
7 | p/tlen | 2022-07-07 06:18 | 1.0.0 | Lucene Transducers ver. 0.25-SNAPSHOT extended=no rule-based | 1.0000 | 0.6633 | 0.6662
17 | p/tlen | 2022-07-06 18:45 | 1.0.0 | Lucene Transducers ver. 0.24 extended=yes rule-based | 1.0000 | 0.5375 | 0.5839
6 | p/tlen | 2022-07-06 18:45 | 1.0.0 | Lucene Transducers ver. 0.24 extended=no rule-based | 1.0000 | 0.6633 | 0.6662
16 | p/tlen | 2022-07-06 18:35 | 1.0.0 | Lucene Transducers ver. 0.24 extended=yes rule-based | 1.0000 | 0.5375 | 0.5839
5 | p/tlen | 2022-07-06 18:35 | 1.0.0 | Lucene Transducers ver. 0.24 extended=no rule-based | 1.0000 | 0.6633 | 0.6662
15 | p/tlen | 2022-07-06 18:23 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=yes rule-based | 1.0000 | 0.5375 | 0.5839
4 | p/tlen | 2022-07-06 18:23 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=no rule-based | 1.0000 | 0.6633 | 0.6662
22 | p/tlen | 2022-07-06 18:21 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=yes rule-based | 1.0000 | 0.3570 | 0.4012
3 | p/tlen | 2022-07-06 18:21 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT extended=no rule-based | 1.0000 | 0.6633 | 0.6662
19 | p/tlen | 2022-07-06 14:41 | 1.0.0 | Lucene Transducers ver. 0.24-SNAPSHOT rule-based | 1.0000 | 0.5328 | 0.5820
1 | p/tlen | 2022-02-24 19:47 | 1.0.0 | Lucene Transducers ver. 0.23-SNAPSHOT rule-based | 1.0000 | 0.6618 | 0.6732
2 | p/tlen | 2021-10-20 14:00 | 1.0.0 | Lucene Transducers ver. 0.23-SNAPSHOT rule-based | 1.0000 | 0.6628 | 0.6663
9 | p/tlen | 2021-10-20 11:02 | 1.0.0 | Lucene Transducers ver. 0.22-SNAPSHOT rule-based | 1.0000 | 0.6724 | 0.6580
8 | [anonymized] | 2021-08-15 13:19 | 1.0.0 | 0.22 use nosecondary option | 1.0000 | 0.6724 | 0.6580
21 | [anonymized] | 2021-08-03 20:03 | 1.0.0 | Lucene transducers 0.22 - move pairs to a separate file | 1.0000 | 0.4064 | 0.4101
23 | p/tlen | 2020-04-22 19:16 | 1.0.0 | PSI-Toolkit Diachroniser 2020 | 1.0000 | 0.2045 | 0.2934
10 | p/tlen | 2019-10-26 19:38 | 1.0.0 | Lucene Transducers 0.21 | 1.0000 | 0.6724 | 0.6580
11 | p/tlen | 2019-10-19 20:13 | 1.0.0 | Lucene Transducers 20 | 1.0000 | 0.6143 | 0.6189
24 | p/tlen | 2018-03-30 12:49 | 1.0.0 | PSI-Toolkit better-diachronizer | 1.0000 | 0.1375 | 0.1951
12 | p/tlen | 2018-03-17 11:07 | 1.0.0 | use Lucene token filter with sub-word variants (v. 0.15) | 1.0000 | 0.6110 | 0.6093
25 | [anonymized] | 2018-03-16 20:38 | 1.0.0 | Raw normalization | N/A | 0.0150 | 0.0284
13 | p/tlen | 2018-03-16 13:25 | 1.0.0 | use Lucene filter with words mined using word2vec (v. 0.14) | 1.0000 | 0.6061 | 0.6031
18 | p/tlen | 2018-03-16 11:16 | 1.0.0 | use Lucene filter without OCR fixes (v. 0.13) | 1.0000 | 0.6122 | 0.5833
20 | p/tlen | 2018-03-15 20:55 | 1.0.0 | use Lucene token filter (v. 0.12) | 1.0000 | 0.5181 | 0.4656
27 | p/tlen | 2018-03-15 20:47 | 1.0.0 | do nothing stupid | 0.0000 | 0.0000 | 0.0000