Guess a word in a gap in historical texts

Give a probability distribution for the word filling a gap in a corpus of Polish historical texts spanning 1814-2013. This is a challenge for (temporal) language models.
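To make the task concrete, here is a minimal sketch of a context-free baseline in the spirit of leaderboard entry 4 below: estimate unigram probabilities on the training corpus and emit the 100 most probable words for every gap. The file layout and the word:probability output format are assumptions made for illustration; they are not specified on this page.

```python
# A minimal unigram baseline sketch (cf. leaderboard entry 4, "unigrams, n=100").
# Assumptions not confirmed by this page: the training corpus is a plain-text
# file with one document per line, and a prediction line is a space-separated
# list of word:probability pairs, with a bare :probability entry covering the
# mass of all words left unlisted.
from collections import Counter

N_BEST = 100

def train_unigrams(corpus_path):
    """Count whitespace-separated tokens over the whole training corpus."""
    counts = Counter()
    with open(corpus_path, encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())
    return counts

def gap_distribution(counts, n_best=N_BEST):
    """Format the n_best most frequent words as a single prediction line."""
    total = sum(counts.values())
    top = counts.most_common(n_best)
    pairs = [f"{word}:{count / total}" for word, count in top]
    rest = max(0.0, 1.0 - sum(count / total for _, count in top))
    pairs.append(f":{rest}")  # catch-all entry for the remaining mass
    return " ".join(pairs)

if __name__ == "__main__":
    counts = train_unigrams("train/in.tsv")  # hypothetical path
    # A unigram model ignores context, so the same line is emitted
    # for every gap in the test set.
    print(gap_distribution(counts))
```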

Git repo URL: git://gonito.net/retro-gap.git / Branch: master

(Browse at https://gonito.net/gitlist/retro-gap.git/master)

Leaderboard

Columns: rank, submitter, submission time, description with tags, test-A LogLossHashed (lower is better), and number of submissions (×).

1. p/tlen, 2018-12-30 05:32: word-level BiLSTM LM trained on 20181230 (applica-lm-retro-gap-bilstm-preproc=minimalistic-bidirectional-lang=pl-5.1.8.0.bin). Hyperparameters: best-epoch=49, bptt=35, chunksize=10000, clip=0.25, dropout=0.5, enc-dropout=0.5, enc-highways=0, epochs=50, epochs-done=49, execution-time=192264.61, hidden-size=400, layers=2, lr=1.0e-3, lrs-min-lr=1.0e-6, lrs-multiplier=0.5, lrs-patience=5, lrs-step-frequency=eval_iteration, mini-batch-size=100, trainable-params=53192800, validation-perplexity=134.93, word-emb-size=400. Tags: lm, word-level, bilstm. Score: 5.0632 (9 submissions).
2. kaczla, 2017-12-12 20:35: 3-gram with pruning, best 15, best OOV; see the sketch after this table. Tags: ready-made, kenlm, lm. Score: 5.7006 (20 submissions).
3. siulkilulki, 2018-01-24 14:39: simple neural network with a context of 2 words before and 2 words after the gap. Tags: neural-network. Score: 5.7395 (4 submissions).
4. tamazaki, 2017-04-24 16:42: unigrams, n=100, v3. Tags: self-made, lm. Score: 6.0733 (10 submissions).
5. patrycja, 2018-01-15 18:11: bigram model, 100 best words. Tags: stupid, self-made, lm, bigram. Score: 6.1097 (3 submissions).
6. mmalisz, 2017-06-29 15:12: updated source code; KenLM order=3, tokenizer.perl from Moses, best 100 results, text mode. Tags: lm, kenlm, ready-made. Score: 6.1898 (7 submissions).
7. EmEm, 2017-05-16 04:31: assignment 16. Tags: self-made, lm. Score: 6.8056 (4 submissions).
8. Durson, 2017-06-28 08:47: test 2. Tags: ready-made, neural-network. Score: 6.8956 (2 submissions).
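Entry 2 combines a ready-made KenLM trigram with pruning to the 15 best candidates. The following sketch shows that general pattern, assuming the kenlm Python bindings and hypothetical names for the model file and the candidate vocabulary: score each candidate word in its surrounding context and renormalize the best scores into a distribution.

```python
# Sketch of a ready-made KenLM approach (cf. entry 2, "3-gram with pruning,
# best 15"). Requires the kenlm Python module; the model file and the
# candidate vocabulary passed in are hypothetical.
import kenlm

model = kenlm.Model("retro-gap.3gram.binary")  # hypothetical model file

def fill_gap(left_context, right_context, candidates, n_best=15):
    """Score each candidate word in context; renormalize the n_best scores."""
    scored = []
    for word in candidates:
        sentence = f"{left_context} {word} {right_context}"
        # kenlm's Model.score returns the log10 probability of the sentence
        scored.append((word, model.score(sentence, bos=True, eos=True)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    best = scored[:n_best]
    probs = [10.0 ** logp for _, logp in best]  # back from log10 to probability
    total = sum(probs)
    return [(word, p / total) for (word, _), p in zip(best, probs)]
```

Keeping only the 15 best candidates mirrors the pruning in the submission description; under the output format sketched above, the mass cut off this way would go into the catch-all :probability entry.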

Graphs by parameters

Graphs are available for the following submission parameters: best-epoch, bptt, char-emb-size, chunksize, clip, dropout, enc-dropout, enc-highways, epochs, epochs-done, execution-time, hidden-size, layers, lr, lrs-min-lr, lrs-multiplier, lrs-patience, mini-batch-size, trainable-params, validation-perplexity, word-emb-size, word-len.