Diachronic OCR challenge
Perform OCR on a Polish historical text (or post-correct the Tesseract OCR output) [ver. 1.0.0]
This is the full list of all submissions; to see only the best ones, see the leaderboard.
# | submitter | when | ver. | description | tags | dev-0 CharMatch | test-A CharMatch
---|---|---|---|---|---|---|---
3 | s444415 | 2023-03-23 11:51 | 1.0.0 | Donut trained on wikisource | fine-tuned, donut, proto, ocr | 16.1 | 41.2
5 | s444415 | 2023-03-21 08:26 | 1.0.0 | Donut fine tuned wikisource yellow | fine-tuned, donut, proto, ocr | 20.6 | 39.6
2 | s444415 | 2023-01-05 14:59 | 1.0.0 | Donut fine tune with data from challenge | fine-tuned, donut, proto, ocr | 17.3 | 43.2
4 | s444415 | 2022-12-22 10:33 | 1.0.0 | Donut model fine tuned | fine-tuned, donut, proto | 21.0 | 40.4
6 | s444415 | 2022-12-22 10:29 | 1.0.0 | Donut proto model | donut, proto | 14.3 | 35.6
8 | s444415 | 2022-12-22 10:18 | 1.0.0 | Donut base model | base, donut | 12.6 | 26.2
1 | p/tlen | 2022-07-07 16:06 | 1.0.0 | Lucene Transducers ver. 0.27-SNAPSHOT | rule-based | 77.5 | 56.4
7 | p/tlen | 2022-07-07 15:30 | 1.0.0 | just copy the Tesseract output | tesseract | 39.0 | 33.7