Guess a word in a gap in historic texts

Give a probability distribution for the word missing from a gap in a corpus of Polish historical texts spanning 1814-2013. This is a challenge for (temporal) language models. [ver. 1.0.0]

Git repo URL: git://gonito.net/retro-gap.git / Branch: master
Run git clone --single-branch git://gonito.net/retro-gap.git -b master to get the challenge data
Browse at https://gonito.net/gitlist/retro-gap.git/master
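
For each gap, a submission outputs a single line with a probability distribution over candidate words. The sketch below is a minimal illustration, assuming the usual Gonito/GEval convention for the LogLossHashed metric shown on the leaderboard: space-separated word:logprob pairs, optionally followed by a bare :logprob entry for the probability mass assigned to all unlisted words. This convention is an assumption, not stated on this page.

    import math

    def format_distribution(word_probs, rest_mass=0.0):
        """Return one output line: space-separated word:logprob pairs,
        optionally followed by a bare :logprob entry covering the
        probability mass of all words not listed explicitly."""
        parts = [f"{word}:{math.log(prob):.4f}"
                 for word, prob in word_probs.items() if prob > 0]
        if rest_mass > 0:
            parts.append(f":{math.log(rest_mass):.4f}")
        return " ".join(parts)

    # Toy distribution over three candidate words, with 10% of the mass
    # reserved for every other word in the vocabulary.
    print(format_distribution({"się": 0.5, "nie": 0.3, "jest": 0.1}, rest_mass=0.1))

Listing only a limited number of best candidates, as in the "best 15" and "best 100" submissions below, keeps the output lines short while the trailing entry still accounts for the rest of the vocabulary.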

Leaderboard

# | submitter | when | ver. | description | tags | test-A LogLossHashed | ×
1 | kubapok | 2020-10-09 10:56 | 1.0.0 | RoBERTa with embedding, first token | | 4.2801 | 19
2 | p/tlen | 2019-04-12 04:40 | 1.0.0 | LM model used (bi-transformer.bin) aggregator=GEO | lm word-level transformer | 4.8570 | 49
3 | kaczla | 2017-12-12 20:35 | 1.0.0 | 3-gram with prune, best 15, best oov | ready-made kenlm lm | 5.7006 | 20
4 | [anonymized] | 2018-01-24 14:39 | 1.0.0 | simple neural network, context 2 words ahead, 2 words behind | neural-network | 5.7395 | 4
5 | [anonymized] | 2017-04-24 16:42 | 1.0.0 | unigrams, n=100, v3 | self-made lm | 6.0733 | 10
6 | [anonymized] | 2021-02-08 06:15 | 1.0.0 | solution | self-made lm | 6.0733 | 3
7 | [anonymized] | 2021-01-08 18:34 | 1.0.0 | pytorch neural ngram model (3 previous words) | lm pytorch-nn | 6.0819 | 8
8 | [anonymized] | 2018-01-15 18:11 | 1.0.0 | Bigrams model, 100 best words | stupid self-made lm bigram | 6.1097 | 3
9 | [anonymized] | 2020-12-04 00:27 | 1.0.0 | solution | self-made lm bigram | 6.1610 | 3
10 | [anonymized] | 2019-11-27 10:19 | 1.0.0 | Simple bigram model | lm | 6.1802 | 3
11 | [anonymized] | 2020-12-16 07:16 | 1.0.0 | python bigram | self-made lm bigram | 6.1837 | 1
12 | [anonymized] | 2017-06-29 15:12 | 1.0.0 | Update source code; kenlm order=3, tokenizer.perl from moses, best 100 results, text mode | ready-made kenlm lm | 6.1898 | 7
13 | [anonymized] | 2021-01-09 21:10 | 1.0.0 | 2 left, 2 right context | lm pytorch-nn | 6.2379 | 3
14 | [anonymized] | 2021-02-03 07:57 | 1.0.0 | updated bigram | self-made lm bigram | 6.2673 | 10
15 | [anonymized] | 2021-01-13 02:38 | 1.0.0 | v10 | lm temporal pytorch-nn | 6.3330 | 44
16 | [anonymized] | 2021-01-13 01:51 | 1.0.0 | following_words;x_size=100;epochs=5;lr=0.001 | lm pytorch-nn | 6.3331 | 8
17 | [anonymized] | 2021-01-27 10:23 | 1.0.0 | TAU22 | lm pytorch-nn | 6.4151 | 3
18 | [anonymized] | 2020-12-08 16:27 | 1.0.0 | solution | self-made lm bigram | 6.4201 | 4
19 | [anonymized] | 2021-01-12 17:36 | 1.0.0 | first solution, 1 epoch, 1000 texts, best 15 | lm pytorch-nn | 6.5711 | 1
20 | [anonymized] | 2020-12-02 13:04 | 1.0.0 | Trigram, self-made | self-made lm trigram | 6.7172 | 2
21 | [anonymized] | 2019-11-20 17:07 | 1.0.0 | better bigram solution, nananana | lm | 6.7249 | 2
22 | [anonymized] | 2019-11-13 12:29 | 1.0.0 | My bigram guess-a-word solution | lm | 6.7309 | 1
23 | [anonymized] | 2020-12-16 08:52 | 1.0.0 | tetragram fix | self-made lm tetragram | 6.7517 | 3
24 | [anonymized] | 2019-11-30 22:48 | 1.0.0 | 3gram outfile format fix | lm trigram | 6.8032 | 2
25 | [anonymized] | 2017-05-16 04:31 | 1.0.0 | task 16 | self-made lm | 6.8056 | 4
26 | [anonymized] | 2017-06-28 08:47 | 1.0.0 | test 2 | ready-made neural-network | 6.8956 | 2
27 | [anonymized] | 2021-01-12 17:48 | 1.0.0 | run.py update | lm pytorch-nn | 6.9054 | 10
28 | [anonymized] | 2021-02-04 20:29 | 1.0.0 | ngram | lm pytorch-nn | 6.9123 | 1
29 | [anonymized] | 2020-12-08 15:50 | 1.0.0 | finally | self-made lm bigram | 6.9236 | 5
30 | [anonymized] | 2020-01-16 18:12 | 1.0.0 | IRLSTM 3-gram | lm | 6.9314 | 25
31 | [anonymized] | 2020-12-13 14:17 | 1.0.0 | solution | self-made lm trigram | 7.5152 | 3
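
Many of the entries above are simple self-made bigram or trigram language models. The sketch below illustrates that family of baselines in general terms, not any particular submission: it assumes the gap word is predicted from the last word of the left context, and it leaves out reading the challenge's in.tsv, whose exact column layout is not described on this page.

    from collections import Counter, defaultdict

    class BigramGapModel:
        """Toy bigram baseline: predict the gap word from the last word of
        the left context, backing off to unigram counts when that word was
        never seen in training."""

        def __init__(self):
            self.bigrams = defaultdict(Counter)
            self.unigrams = Counter()

        def train(self, sentences):
            for tokens in sentences:
                self.unigrams.update(tokens)
                for prev, curr in zip(tokens, tokens[1:]):
                    self.bigrams[prev][curr] += 1

        def predict(self, left_context, top_k=100):
            """Return {word: probability} over the top_k most likely candidates."""
            prev = left_context[-1] if left_context else None
            counts = self.bigrams.get(prev) or self.unigrams
            best = counts.most_common(top_k)
            total = sum(count for _, count in best)
            return {word: count / total for word, count in best}

    # Tiny usage example on made-up sentences; a real run would train on the
    # challenge's training data instead.
    model = BigramGapModel()
    model.train([["to", "jest", "bardzo", "stary", "tekst"],
                 ["to", "nie", "jest", "nowy", "tekst"]])
    print(model.predict(["widać", "że", "to"], top_k=3))

The dictionary returned by predict can then be passed to a formatter such as format_distribution from the earlier sketch to produce an output line.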

Graphs by parameters

Interactive graphs are available on the challenge page for the following submission parameters: beam-search-depth, best-epoch, dropout, early-stopping, enc-dropout, enc-highways, epochs, epochs-done, execution-time, ffn-emb-size, hidden-size, layers, trainable-params, validation-perplexity, word-emb-size.