Diachronic OCR challenge

Perform OCR on a Polish historical text (or post-correct Tesseract OCR output) [ver. 1.0.0]

pol diachronic

Git repo URL: git://gonito.net/diachronia-ocr / Branch: master
Run git clone --single-branch git://gonito.net/diachronia-ocr -b master to get the challenge data
Browse at https://gonito.net/gitlist/diachronia-ocr.git/master
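The clone step can also be scripted. The following is a minimal Python sketch, assuming git is installed and available on PATH. The helper names clone_command and clone_challenge are illustrative and not part of the challenge; only the repository URL, branch, and flags come from the command quoted on this page.

```python
import subprocess

# Repository URL and branch as given on the challenge page.
REPO_URL = "git://gonito.net/diachronia-ocr"
BRANCH = "master"


def clone_command(repo_url=REPO_URL, branch=BRANCH):
    """Build the argument list for the single-branch clone (hypothetical helper)."""
    return ["git", "clone", "--single-branch", repo_url, "-b", branch]


def clone_challenge(repo_url=REPO_URL, branch=BRANCH):
    """Run the clone; raises CalledProcessError if git exits with a non-zero status."""
    subprocess.run(clone_command(repo_url, branch), check=True)
```

Calling clone_challenge() would fetch the challenge data into a diachronia-ocr directory under the current working directory, which is git's default target for this URL.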

Leaderboard

#  submitter  when              ver.   description                                                          test-A CharMatch  ×
1  p/tlen     2022-07-07 16:06  1.0.0  Lucene Transducers ver. 0.27-SNAPSHOT rule-based                     56.4              2
2  s444415    2023-01-05 14:59  1.0.0  Donut fine tune with data from challenge fine-tuned donut proto ocr  43.2              6
3  p/tlen     2022-07-07 15:30  1.0.0  just copy the Tesseract output tesseract                             33.7              2
4  s444415    2022-12-22 10:18  1.0.0  Donut base model base donut                                          26.2              6