OCR challenge for index cards
The goal of this task is to post-process the output of the Tesseract OCR engine. Alternatively, it can be treated as an OCR task in its own right, since the images are also available. [ver. 1.0.2]
This is the full list of all submissions; to see only the best ones, see the leaderboard.
# | submitter | when | ver. | description | tags | dev-0 CER | dev-0 WER | dev-0 CharMatch | test-A CER | test-A WER | test-A CharMatch
---|---|---|---|---|---|---|---|---|---|---|---
4 | s444415 | 2023-03-23 11:49 | 1.0.2 | Donut trained on wikisource | fine-tuned donut ocr | 1.038 | 0.991 | 0.370 | 1.000 | 1.000 | 0.401
6 | s444415 | 2023-03-21 08:35 | 1.0.2 | Donut fine tuned wikisource yellow | fine-tuned donut ocr | 1.021 | 1.010 | 0.367 | 1.066 | 1.122 | 0.370
2 | s444415 | 2023-01-05 15:02 | 1.0.2 | Donut fine tune with data from challenge | fine-tuned donut ocr | 0.557 | 0.741 | 0.537 | 0.664 | 0.915 | 0.508
1 | s444415 | 2022-12-22 13:58 | 1.0.2 | Donut fine tuned | fine-tuned donut | 0.486 | 0.813 | 0.585 | 0.459 | 0.694 | 0.622
7 | s444415 | 2022-12-22 13:51 | 1.0.2 | Donut base model | base donut | 1.624 | 1.973 | 0.231 | 1.995 | 2.459 | 0.206
5 | s444415 | 2022-12-22 13:46 | 1.0.2 | Donut proto model | donut proto | 1.054 | 1.080 | 0.359 | 1.079 | 1.089 | 0.377
3 | p/tlen | 2021-04-09 16:01 | 1.0.1 | Baseline - just rewrite Tesseract output | baseline tesseract | 0.432 | 0.745 | 0.422 | 0.463 | 0.786 | 0.425
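
For readers unfamiliar with the CER and WER columns, here is a minimal Python sketch of how such scores are commonly computed. It assumes the textbook definitions (Levenshtein edit distance divided by the length of the reference, counted in characters for CER and in whitespace-separated words for WER). The challenge's own evaluator may normalize or aggregate differently, and CharMatch is not covered here. The function names are illustrative and do not come from the challenge.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate, assuming normalization by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate over whitespace-separated tokens."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)


# One wrong character in a ten-character, two-word reference.
print(cer("index card", "indcx card"))  # 0.1
print(wer("index card", "indcx card"))  # 0.5
# A hypothesis much longer than the reference pushes the rate above 1.
print(cer("abc", "abcxyzxyz"))          # 2.0
```

Under this normalization, lower is better and the value is not capped at 1: an output with many spurious insertions can need more edits than the reference has characters, which would be consistent with the scores above 1.0 in the table.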