Guess a word in a gap in historic texts

Give a probability distribution for the word missing from a gap in a corpus of Polish historic texts spanning 1814-2013. This is a challenge for (temporal) language models.
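
Submissions are scored with LogLossHashed: the mean negative log-probability assigned to the true word, with words hashed into a fixed number of buckets. The uniform-probability baseline at the bottom of the table scores 6.9315 ≈ ln(1024) on every set, which is consistent with the evaluator hashing into 2^10 buckets (an assumption about the geval configuration, not stated on this page). Below is a minimal sketch of what a submission looks like, in the spirit of the unigram baselines in the table; the Gonito file layout (train/train.tsv, test-A/in.tsv, test-A/out.tsv) and the word:probability output convention with an empty-word entry for the leftover mass are assumptions, not taken from this page.

```python
"""Minimal unigram-baseline sketch for the retro-gap challenge.

Assumptions (not taken from the leaderboard): the usual Gonito layout
(train/train.tsv, test-A/in.tsv, test-A/out.tsv) and the geval LogLossHashed
output convention of space-separated word:probability pairs, with a final
":p" entry holding the probability mass left for all other words.
"""
from collections import Counter

TOP_N = 100  # like the "100 best words" submissions in the table


def train_unigrams(train_path):
    counts = Counter()
    with open(train_path, encoding="utf-8") as f:
        for line in f:
            # assumed train.tsv layout: metadata column(s) first, then the text
            for field in line.rstrip("\n").split("\t")[1:]:
                counts.update(field.lower().split())
    return counts


def format_distribution(counts, top_n=TOP_N):
    total = sum(counts.values())
    best = counts.most_common(top_n)
    pairs = [f"{w}:{c / total:.6f}" for w, c in best]
    rest = 1.0 - sum(c / total for _, c in best)
    pairs.append(f":{max(rest, 1e-9):.6f}")  # mass reserved for all other words
    return " ".join(pairs)


def main():
    dist = format_distribution(train_unigrams("train/train.tsv"))
    with open("test-A/in.tsv", encoding="utf-8") as fin, \
            open("test-A/out.tsv", "w", encoding="utf-8") as fout:
        for _ in fin:
            # a unigram model ignores the gap's context, so every line gets the same distribution
            fout.write(dist + "\n")


if __name__ == "__main__":
    main()
```

A year-aware variant (compare the "year aware ... splits" entries below) would train one such distribution per time period and pick it by the document's date.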

# | submitter | when | ver. | description | dev-0 LogLossHashed | dev-1 LogLossHashed | test-A LogLossHashed
81 Paulina Lester 2019-11-30 22:48 1.0.0 3gram outfile format fix lm trigram N/A N/A 6.8032
63 [anonymised] 2019-11-27 10:19 1.0.0 Simple bigram model lm 6.1832 6.3083 6.1802
117 Jędrzej Furmann 2019-11-27 10:04 1.0.0 Simple trigram lm N/A N/A N/A
48 kubapok 2019-11-25 10:42 1.0.0 year aware 4 splits statistical 5.8219 N/A 5.7973
39 kubapok 2019-11-25 10:36 1.0.0 year aware 2 splits 5.7372 N/A 5.6667
79 [anonymised] 2019-11-20 17:07 1.0.0 better bigram solution, nananana lm 6.7205 6.7565 6.7249
95 [anonymised] 2019-11-18 16:30 1.0.0 bigram solution, sialala lm 7.3424 7.2223 7.2732
37 kubapok 2019-11-17 08:31 1.0.0 self made LM 3grams with fallback to 2grams and 1grams 5.6790 N/A 5.6063
104 [anonymised] 2019-11-13 16:32 1.0.0 Simple bigram model lm Infinity Infinity Infinity
80 452107 2019-11-13 12:29 1.0.0 My bigram guess a word solution lm 6.7279 6.7631 6.7309
116 [anonymised] 2019-11-13 09:33 1.0.0 Simple bigram model lm N/A N/A N/A
61 kubapok 2019-11-12 06:52 1.0.0 bigram model, equal distribution N/A N/A 6.1264
67 kubapok 2019-11-11 18:14 1.0.0 stupid solution N/A N/A 6.2078
91 kubapok 2019-11-11 11:45 1.0.0 very baseline N/A N/A 6.9315
20 p/tlen 2019-05-24 09:39 1.0.0 LM model used (applica-lm-retro-gap-transformer-bpe-bigger-preproc=minimalistic-left_to_right-lang=pl-5.3.0.0.bin) attention-dropout=0.1 attention-heads=8 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=15 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=100 ffn-emb-size=2048 layers=8 learned-positional=false mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer 5.2179 5.4184 5.1606
96 p/tlen 2019-05-19 02:53 1.0.0 LM model trained on 20190519 (applica-lm-retro-gap-transformer-bigger-preproc=minimalistic-left_to_right-lang=pl-5.3.0.0.bin) best-epoch=90 bptt=50 chunksize=10000 clip=0.25 early-stopping=15 early-stopping-type=iteration epochs=100 epochs-done=90 execution-time=63635.69 mini-batch-size=100 resume-adversarial-path=models/last/adversarial-state.pkl resume-best-current-iteration=80 resume-best-epoch=80 resume-current-iteration=80 resume-epoch=80 resume-line-number=0 resume-model=models/last/model.bin resume-optimizer-state=models/last/optimizer-state.pkl resume-ppl=384.0869976871132 resume-scheduler-path=models/last/scheduler-data.pkl trainable-params=56727040 validation-perplexity=383.37 lm word-level 7.3604 7.3470 7.3037
47 p/tlen 2019-05-17 19:14 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 rvt=true share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer 5.8551 5.9289 5.7924
19 p/tlen 2019-05-16 10:31 1.0.0 LM model trained on 20190516 (applica-lm-retro-gap-transformer-bpe-bigger-preproc=minimalistic-left_to_right-lang=pl-5.3.0.0.bin) attention-dropout=0.1 attention-heads=8 beam-search-depth=2 best-epoch=80 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 epochs-done=80 execution-time=411814.9 ffn-emb-size=2048 layers=8 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true trainable-params=56727040 validation-perplexity=384.09 vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer left-to-right bpe 5.2138 5.4159 5.1584
22 p/tlen 2019-05-14 04:53 1.0.0 LM model trained on 20190514 (applica-lm-retro-gap-transformer-bpe-preproc=minimalistic-left_to_right-lang=pl-5.2.0.0.bin) attention-dropout=0.1 attention-heads=8 beam-search-depth=2 best-epoch=75 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 epochs-done=75 execution-time=359738.83 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true trainable-params=50422272 validation-perplexity=419.37 vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer left-to-right bpe 5.2746 5.4622 5.2050
24 p/tlen 2019-05-11 00:49 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 beam-search-depth=2 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 rvt=true share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer left-to-right bpe 5.3047 5.4820 5.2355
31 p/tlen 2019-05-10 13:58 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 beam-search-depth=2 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 rvt=true share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer left-to-right bpe 5.3522 5.5277 5.2825
27 p/tlen 2019-05-10 01:27 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 rvt=true share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer 5.3083 5.4859 5.2453
26 p/tlen 2019-05-08 20:38 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer 5.3070 5.4849 5.2377
29 p/tlen 2019-05-08 11:24 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 beam-search-depth=0 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer left-to-right bpe 5.3200 5.4907 5.2505
25 p/tlen 2019-05-08 08:59 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 beam-search-depth=1 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true vocab=model/vocab.pkl word-emb-size=512 word-level transformer left-to-right bpe 5.3071 5.4841 5.2370
76 p/tlen 2019-04-17 11:25 1.0.0 LM model used (model.bin) attention-dropout=0.1 attention-heads=8 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 rvt=true share-embed=true vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer 6.5591 6.6602 6.5816
23 p/tlen 2019-04-12 04:40 1.0.0 LM model used (bi-transformer.bin) aggregator=MIN lm word-level transformer 5.3190 5.5181 5.2311
4 p/tlen 2019-04-12 04:40 1.0.0 LM model used (bi-transformer.bin) aggregator=MAX lm word-level transformer 4.9537 5.1992 4.9031
3 p/tlen 2019-04-12 04:40 1.0.0 LM model used (bi-transformer.bin) aggregator=RMS lm word-level transformer 4.9381 5.1868 4.8886
2 p/tlen 2019-04-12 04:40 1.0.0 LM model used (bi-transformer.bin) aggregator=MEAN lm word-level transformer 4.9128 5.1678 4.8653
1 p/tlen 2019-04-12 04:40 1.0.0 LM model used (bi-transformer.bin) aggregator=GEO lm word-level transformer 4.9048 5.1608 4.8570
86 p/tlen 2019-04-10 22:39 1.0.0 LM model used (transformer-sumo.bin) lm word-level transformer 6.8188 6.8351 6.8170
8 p/tlen 2019-04-10 12:41 1.0.0 LM model used (bi-partially-casemarker-transformer.bin) lm word-level transformer 5.0075 5.2557 4.9933
7 p/tlen 2019-04-10 09:54 1.0.0 LM model used (bi-transformer.bin) lm word-level transformer 4.9959 5.2376 4.9875
21 p/tlen 2019-04-10 00:09 1.0.0 LM model used (model.bin) lm word-level transformer right-to-left 5.2604 5.4692 5.2021
83 p/tlen 2019-04-06 11:23 1.0.0 LM model used (model.bin) lm word-level transformer 6.8933 6.7577 6.8104
85 p/tlen 2019-04-06 00:39 1.0.0 LM model used (model.bin) lm word-level transformer 6.8781 6.7650 6.8168
84 p/tlen 2019-04-05 15:29 1.0.0 LM model trained on 20190405 (applica-lm-retro-gap-transformer-frage-rvt-preproc=minimalistic-left_to_right-lang=pl-5.2.0.0.bin) attention-dropout=0.1 attention-heads=8 best-epoch=65 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 epochs-done=65 execution-time=250534.72 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 rvt=true share-embed=true trainable-params=50422272 validation-perplexity=117.85 vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer 6.8781 6.7650 6.8168
28 p/tlen 2019-04-01 13:14 1.0.0 LM model used (model.bin) lm word-level transformer left-to-right 5.3162 5.5091 5.2502
13 p/tlen 2019-04-01 10:16 1.0.0 LM model trained on 20190331 (applica-lm-retro-gap-bilstm-case-marker-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) lm word-level bilstm casemarker 5.0902 5.3020 5.0304
34 p/tlen 2019-03-31 23:09 1.0.0 LM model trained on 20190331 (applica-lm-retro-gap-bilstm-case-marker-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=97 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=20 early-stopping-type=iteration enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=97 execution-time=787661.3 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 tied=true trainable-params=47599200 validation-perplexity=94.7 word-emb-size=600 lm word-level bilstm casemarker 5.4263 5.6073 5.3378
71 p/tlen 2019-03-30 12:13 1.0.0 LM model used (model.bin) lm word-level transformer 6.4247 6.4674 6.2930
75 p/tlen 2019-03-30 05:29 1.0.0 LM model trained on 20190330 (applica-lm-retro-gap-transformer-frage-casemarker-preproc=minimalistic-left_to_right-lang=pl-5.2.0.0.bin) attention-dropout=0.1 attention-heads=8 best-epoch=20 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=20 epochs-done=20 execution-time=71436.37 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true trainable-params=50422272 validation-perplexity=93.12 word-emb-size=512 lm word-level transformer 6.6148 6.6728 6.4872
11 p/tlen 2019-03-29 02:33 1.0.0 LM model trained on 20190329 (applica-lm-retro-gap-bilstm-frage-fixed-vocab-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=103 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=20 early-stopping-type=iteration enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=103 execution-time=573809.85 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 tied=true trainable-params=47599200 validation-perplexity=126.87 word-emb-size=600 lm word-level bilstm 5.0710 5.2996 5.0022
5 p/tlen 2019-03-21 14:11 1.0.0 per-period models combined (100/50) bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=8 early-stopping-type=epoch enc-dropout=0.5 enc-highways=0 epochs=1000 execution-time=463551.03 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=epoch mini-batch-size=100 tied=true trainable-params=142797600 validation-perplexity=168.6133 word-emb-size=600 year-resolution=100 year-stride=50 lm word-level bilstm 5.0234 5.2742 4.9768
6 p/tlen 2019-03-21 09:04 1.0.0 per-period models combined (100/50) bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=8 early-stopping-type=epoch enc-dropout=0.5 enc-highways=0 epochs=1000 execution-time=463551.03 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=epoch mini-batch-size=100 tied=true trainable-params=142797600 validation-perplexity=168.6133 word-emb-size=600 year-resolution=100 year-stride=50 lm word-level bilstm 5.0241 5.2751 4.9769
9 p/tlen 2019-03-21 08:31 1.0.0 two BiLSTMs, one for each 100 years bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=8 early-stopping-type=epoch enc-dropout=0.5 enc-highways=0 epochs=1000 execution-time=339080.39 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=epoch mini-batch-size=100 tied=true trainable-params=95198400 validation-perplexity=176.045 word-emb-size=600 year-resolution=100 year-stride=100 lm word-level bilstm 5.0337 5.2924 4.9956
16 p/tlen 2019-03-20 04:14 1.0.0 LM model trained on 20190320 (applica-lm-retro-gap-train-1864-1963-bilstm-frage-1814-1913-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=35 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=8 early-stopping-type=epoch enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=35 execution-time=124470.64 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=epoch mini-batch-size=100 tied=true trainable-params=47599200 validation-perplexity=153.75 word-emb-size=600 lm word-level bilstm 5.1606 5.3497 5.0825
12 p/tlen 2019-03-19 21:11 1.0.0 LM model trained on 20190319 (applica-lm-retro-gap-train-1914-2013-bilstm-frage-1914-2013-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=63 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=20 early-stopping-type=iteration enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=63 execution-time=277183.16 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 tied=true trainable-params=47599200 validation-perplexity=161.74 word-emb-size=600 lm word-level bilstm 5.1546 5.3037 5.0229
30 p/tlen 2019-03-16 19:24 1.0.0 LM model trained on 20190316 (applica-lm-retro-gap-train-1814-1913-bilstm-frage-1814-1913-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=32 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=8 early-stopping-type=epoch enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=32 execution-time=61897.23 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=epoch mini-batch-size=100 tied=true trainable-params=47599200 validation-perplexity=190.35 word-emb-size=600 lm word-level bilstm 5.2199 5.4381 5.2651
100 p/tlen 2019-03-10 17:09 1.0.0 LM model trained on 20190310 (applica-lm-retro-gap-transformer-frage-preproc=minimalistic-left_to_right-lang=pl-5.2.0.0.bin) attention-dropout=0.1 attention-heads=8 best-epoch=74 bptt=50 chunksize=10000 clip=0.25 dropout=0.1 early-stopping=20 early-stopping-type=iteration enc-dropout=0.1 enc-highways=1 epochs=80 epochs-done=74 execution-time=222418.51 ffn-emb-size=2048 layers=6 learned-positional=false lr=2.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 no-positional-embeddings=false normalize-before=true relu-dropout=0.1 share-embed=true trainable-params=50422272 validation-perplexity=116.17 vocab=model/vocab.pkl word-emb-size=512 lm word-level transformer 9.5350 9.4320 9.2231
18 p/tlen 2019-02-22 22:52 1.0.0 LM model used (model.bin) lm word-level bilstm 5.1855 5.3806 5.1346
10 p/tlen 2019-02-18 15:14 1.0.0 LM model trained on 20190218 (applica-lm-retro-gap-retro-gap-frage-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=95 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=20 early-stopping-type=iteration enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=95 execution-time=761381.7 hidden-size=600 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 tied=true trainable-params=47599200 validation-perplexity=126.46 word-emb-size=600 lm word-level bilstm frage 5.0696 5.2951 5.0006
36 p/tlen 2019-02-08 23:03 1.0.0 LM model trained on 20190208 (applica-lm-train-tokenized-lowercased-shuffled-bilstm-all-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=1 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=40 enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=1 execution-time=28793.21 hidden-size=400 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=53192800 validation-perplexity=494.44 word-emb-size=400 lm word-level bilstm 5.4361 5.5855 5.3822
35 p/tlen 2019-02-07 07:58 1.0.0 LM model trained on 20190207 (applica-lm-train-tokenized-lowercased-shuffled-bilstm-all-preproc=minimalistic-bidirectional-lang=pl-5.2.0.0.bin) best-epoch=1 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 early-stopping=10 enc-dropout=0.5 enc-highways=0 epochs=1000 epochs-done=1 execution-time=28836.08 hidden-size=400 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=53192800 validation-perplexity=494.77 word-emb-size=400 lm word-level bilstm 5.4315 5.5867 5.3780
14 p/tlen 2019-02-02 09:49 1.0.0 LM model trained on 20190202 (applica-lm-retro-gap-bilstm-word-preproc=minimalistic-bidirectional-lang=pl-5.1.9.0.bin) best-epoch=79 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 enc-dropout=0.5 enc-highways=0 epochs=80 epochs-done=79 execution-time=429035.21 hidden-size=400 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=53192800 validation-perplexity=133.6 word-emb-size=400 lm word-level bilstm 5.1253 5.3486 5.0603
38 p/tlen 2019-01-30 19:56 1.0.0 LM model trained on 20190130 (applica-lm-retro-gap-transformer-word-preproc=minimalistic-left_to_right-lang=pl-5.1.9.0.bin) attention-heads=8 best-epoch=80 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 enc-dropout=0.5 enc-highways=0 epochs=100 epochs-done=80 execution-time=195031.97 ffn-emb-size=640 layers=3 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=40868160 validation-perplexity=172.55 word-emb-size=320 lm word-level transformer 5.7237 5.8370 5.6476
41 p/tlen 2019-01-28 05:16 1.0.0 LM model trained on 20190128 (applica-lm-retro-gap-transformer-word-preproc=minimalistic-left_to_right-lang=pl-5.1.9.0.bin) attention-heads=8 best-epoch=48 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 enc-dropout=0.5 enc-highways=0 epochs=50 epochs-done=48 execution-time=135357.24 ffn-emb-size=1024 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=50927648 validation-perplexity=183.11 word-emb-size=400 lm word-level transformer 5.7902 5.8972 5.7039
33 p/tlen 2019-01-09 16:30 1.0.0 LM model trained on 20190109 (applica-lm-retro-gap-bilstm-cnn-preproc=minimalistic-bidirectional-lang=pl-5.1.8.0.bin) best-epoch=50 bptt=35 char-emb-size=16 chunksize=10000 clip=0.25 dropout=0.3 enc-dropout=0.3 enc-highways=2 epochs=50 epochs-done=50 execution-time=800403.33 feature-maps=[50, 100, 100, 100, 200, 200] hidden-size=400 kernels=[3, 4, 5, 6, 7, 8] layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=75 trainable-params=33775626 validation-perplexity=196.74 word-len=28 lm char-n-grams bilstm 5.4438 5.5773 5.3202
15 p/tlen 2018-12-30 05:32 1.0.0 LM model trained on 20181230 (applica-lm-retro-gap-bilstm-preproc=minimalistic-bidirectional-lang=pl-5.1.8.0.bin) best-epoch=49 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 enc-dropout=0.5 enc-highways=0 epochs=50 epochs-done=49 execution-time=192264.61 hidden-size=400 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=53192800 validation-perplexity=134.93 word-emb-size=400 lm word-level bilstm 5.1365 5.3476 5.0632
32 p/tlen 2018-12-27 21:24 1.0.0 LM model trained on 20181227 (applica-lm-retro-gap-bilstm-preproc=minimalistic-bidirectional-lang=pl-5.1.8.0.bin) best-epoch=1 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 enc-dropout=0.5 enc-highways=0 epochs=1 epochs-done=1 execution-time=2657.21 hidden-size=300 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=38949600 validation-perplexity=195.74 word-emb-size=300 lm word-level bilstm 5.4143 5.5581 5.3177
17 p/tlen 2018-12-27 10:00 1.0.0 LM model trained on 20181225 (applica-lm-retro-gap-bilstm-preproc=minimalistic-bidirectional-lang=en-5.1.8.0.bin) best-epoch=39 bptt=35 chunksize=10000 clip=0.25 dropout=0.5 enc-dropout=0.5 enc-highways=0 epochs=40 epochs-done=39 execution-time=109120.05 hidden-size=300 layers=2 lr=1.0e-3 lrs-min-lr=1.0e-6 lrs-multiplier=0.5 lrs-patience=5 lrs-step-frequency=eval_iteration mini-batch-size=100 trainable-params=38949600 validation-perplexity=145.34 word-emb-size=300 lm 5.2085 N/A 5.1303
44 p/tlen 2018-09-02 20:20 1.0.0 simple 2-layer LSTM, left-to-right epochs=1 neural-network lm lstm left-to-right 5.8425 5.8765 5.7359
102 EmEm 2018-01-28 07:51 1.0.0 trigrams_fixed self-made lm trigram N/A N/A 19.1101
45 siulkilulki 2018-01-24 14:39 1.0.0 simple neural network, context 2 words ahead 2 words behind neural-network 5.8672 6.0007 5.7395
46 kaczla 2018-01-17 11:20 1.0.0 simple neural network - nb_of_epochs=3, batch_size=2048 neural-network 5.8751 5.9999 5.7839
50 kaczla 2018-01-16 18:52 1.0.0 simple neural network - nb_of_epochs=2 neural-network 5.9285 6.0385 5.8193
51 kaczla 2018-01-16 18:13 1.0.0 simple neural network - nb_of_epochs=4 neural-network 5.9463 6.0446 5.8514
56 kaczla 2018-01-16 17:17 1.0.0 simple neural network - decrease batch_size neural-network 6.1810 6.2569 6.0581
60 patrycja 2018-01-15 18:11 1.0.0 Bigrams model, 100 best words stupid self-made lm bigram N/A 6.3638 6.1097
115 patrycja 2018-01-09 18:26 1.0.0 ??? stupid self-made lm bigram N/A N/A N/A
114 patrycja 2018-01-09 18:08 1.0.0 Bigrams model, 100 best words stupid self-made lm bigram N/A N/A N/A
52 p/tlen 2018-01-03 06:07 1.0.0 a very simple (non-recurrent) neural network, looking one word behind and one word ahead (train on all data), dictionary size=40000 neural-network 5.9766 6.0881 5.8648
101 EmEm 2018-01-02 18:14 1.0.0 'trigrams' self-made lm trigram N/A N/A 14.5507
113 EmEm 2018-01-02 17:26 1.0.0 'trigrams' N/A N/A N/A
53 p/tlen 2018-01-02 16:23 1.0.0 a very simple (non-recurrent) neural network, looking one word behind and one word ahead neural-network 5.9794 6.0982 5.8990
58 siulkilulki 2017-12-13 14:54 1.0.0 unigram with temporal info, best 100, two periods (1813, 1913) (1913, 2014) self-made lm temporal unigram 6.1654 6.1828 6.0816
59 siulkilulki 2017-12-13 14:44 1.0.0 unigram with temporal info, best 100, 2 periods (1813, 1913) (1913, 2014) self-made lm temporal unigram 6.1717 6.2016 6.0893
62 siulkilulki 2017-12-13 14:41 1.0.0 unigram with temporal model, 25 best self-made 6.2397 6.2592 6.1729
64 kaczla 2017-12-12 20:45 1.0.0 3-gram with prune, best 1, best oov ready-made kenlm lm 6.1260 6.2991 6.1896
55 kaczla 2017-12-12 20:42 1.0.0 3-gram with prune, best 2, best oov ready-made kenlm lm 5.9662 6.1685 6.0105
54 kaczla 2017-12-12 20:41 1.0.0 3-gram with prune, best 3, best oov ready-made kenlm lm 5.8803 6.0738 5.9181
49 kaczla 2017-12-12 20:38 1.0.0 3-gram with prune, best 5, best oov ready-made kenlm lm 5.8022 5.9837 5.8182
43 kaczla 2017-12-12 20:37 1.0.0 3-gram with prune, best 10, best oov ready-made kenlm lm 5.7428 5.9032 5.7196
40 kaczla 2017-12-12 20:35 1.0.0 3-gram with prune, best 15, best oov ready-made kenlm lm 5.7367 5.8767 5.7006
42 kaczla 2017-12-12 20:32 1.0.0 3-gram with prune, best 25, best oov ready-made kenlm lm 5.7500 5.8788 5.7052
69 kaczla 2017-12-12 19:19 1.0.0 3-gram with prune, best 1 ready-made kenlm lm 6.1473 6.3361 6.2166
70 kaczla 2017-12-12 19:17 1.0.0 3-gram with prune, best 2 ready-made kenlm lm 6.1808 6.4349 6.2362
72 kaczla 2017-12-12 19:14 1.0.0 3-gram with prune, best 3 ready-made kenlm lm 6.2590 6.5174 6.3085
74 kaczla 2017-12-05 21:39 1.0.0 3-gram with prune, best 5 ready-made kenlm lm 6.4040 6.6586 6.4228
77 kaczla 2017-12-05 21:38 1.0.0 3-gram with prune, best 10 ready-made kenlm lm 6.6364 6.8789 6.5879
78 kaczla 2017-12-05 21:35 1.0.0 3-gram with prune, best 15 ready-made kenlm lm 6.7882 7.0033 6.7119
87 kaczla 2017-12-05 21:33 1.0.0 3-gram with prune, best 25 ready-made kenlm lm 6.9749 7.1766 6.8763
92 kaczla 2017-12-05 21:30 1.0.0 3-gram with prune, best 50 ready-made kenlm lm 7.2401 7.4038 7.1059
97 kaczla 2017-12-05 21:24 1.0.0 3-gram with prune, best 100 ready-made kenlm lm 7.4523 7.6464 7.3087
68 mmalisz 2017-06-29 22:47 1.0.0 Order 4 N/A N/A 6.2111
73 mmalisz 2017-06-29 18:38 1.0.0 order 2 N/A N/A 6.3262
66 mmalisz 2017-06-29 15:12 1.0.0 Update source code; kenlm order=3 tokenizer.perl from moses. best 100 results, text mode. ready-made kenlm lm N/A N/A 6.1898
65 mmalisz 2017-06-29 15:08 1.0.0 added wildcard N/A N/A 6.1898
103 mmalisz 2017-06-29 12:29 1.0.0 first 100 N/A N/A Infinity
112 mmalisz 2017-06-28 13:23 1.0.0 top 100 N/A N/A N/A
88 Durson 2017-06-28 08:47 1.0.0 test 2 ready-made neural-network N/A N/A 6.8956
98 Durson 2017-06-27 19:14 1.0.0 first test ready-made neural-network N/A N/A 7.5236
111 mmalisz 2017-06-15 23:29 1.0.0 First try N/A N/A N/A
82 EmEm 2017-05-16 04:31 1.0.0 assignment 16 self-made lm N/A N/A 6.8056
57 tamazaki 2017-04-24 16:42 1.0.0 unigrams, n=100, v3 self-made lm 6.1745 6.1841 6.0733
99 tamazaki 2017-04-24 16:32 1.0.0 unigrams, n=100, v2 self-made lm 8.0610 8.0714 7.8460
110 tamazaki 2017-04-24 16:29 1.0.0 unigrams, n=100 N/A N/A N/A
109 tamazaki 2017-04-24 16:24 1.0.0 unigrams, n=1000 7.6808 7.7246 N/A
94 tamazaki 2017-04-24 15:14 1.0.0 unigrams (correct encoding) v2 self-made lm 7.3661 7.3596 7.2467
108 tamazaki 2017-04-24 15:11 1.0.0 unigrams (correct encoding) N/A N/A N/A
93 tamazaki 2017-04-23 17:57 1.0.0 Unigram (encoding problem) 7.3661 7.3596 7.2467
107 tamazaki 2017-04-23 17:53 1.0.0 Unigram (encoding problem) 7.3661 N/A N/A
105 tamazaki 2017-04-23 17:46 1.0.0 Unigram (encoding problem) N/A N/A N/A
106 tamazaki 2017-04-23 17:43 1.0.0 Unigram (encoding problem) N/A N/A N/A
89 p/tlen 2017-04-10 06:22 1.0.0 uniform probability except for comma stupid 6.9116 6.9585 6.9169
90 p/tlen 2017-04-10 06:18 1.0.0 uniform probability stupid 6.9315 6.9315 6.9315
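
Several of the stronger non-neural entries are plain n-gram models with backoff, e.g. submission #37 above, "self made LM 3grams with fallback to 2grams and 1grams" (5.6063 on test-A). The sketch below is a generic reconstruction of that idea, not the submitter's code: count trigrams, bigrams and unigrams on the training texts, predict the gap from the two words to its left, and fall back to shorter contexts when the longer one is unseen. File paths and the output convention are the same assumptions as in the earlier sketch.

```python
"""Generic sketch of an n-gram model with backoff (3-grams -> 2-grams -> 1-grams)."""
from collections import Counter, defaultdict


def train(path):
    """Count unigrams, bigrams and trigrams over the training texts."""
    uni, bi, tri = Counter(), defaultdict(Counter), defaultdict(Counter)
    with open(path, encoding="utf-8") as f:
        for line in f:
            words = line.lower().split()
            uni.update(words)
            for a, b in zip(words, words[1:]):
                bi[a][b] += 1
            for a, b, c in zip(words, words[1:], words[2:]):
                tri[a, b][c] += 1
    return uni, bi, tri


def predict(left_context, uni, bi, tri, top_n=20):
    """Distribution for the gap: trigram on the two left words, else bigram, else unigram."""
    w1, w2 = (["<s>", "<s>"] + left_context)[-2:]
    for counts in (tri.get((w1, w2)), bi.get(w2), uni):
        if counts:
            total = sum(counts.values())
            return [(w, c / total) for w, c in counts.most_common(top_n)]
    return []


def format_line(pairs):
    """Assumed geval format: word:prob pairs plus an empty-word entry for the leftover mass."""
    rest = 1.0 - sum(p for _, p in pairs)
    out = " ".join(f"{w}:{p:.6f}" for w, p in pairs)
    return out + (f" :{rest:.6f}" if rest > 0 else "")


if __name__ == "__main__":
    uni, bi, tri = train("train/train.tsv")  # assumed Gonito path, as above
    left_context = "w imieniu rzeczypospolitej".split()  # toy Polish left context
    print(format_line(predict(left_context, uni, bi, tri)))
```

Note that every test-A score below 5.0 in the table comes from a bidirectional model (BiLSTM or bi-transformer), several of which are scored with different aggregators over the two directions (GEO, MEAN, RMS, MAX, MIN).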

Submission graph

[rendered on the challenge page]

Graphs by parameters

[score graphs, each with a direct link on the challenge page, for: attention-dropout, attention-heads, beam-search-depth, best-epoch, bptt, char-emb-size, chunksize, clip, dropout, early-stopping, enc-dropout, enc-highways, epochs, epochs-done, execution-time, ffn-emb-size, hidden-size, layers, lr, lrs-min-lr, lrs-multiplier, lrs-patience, mini-batch-size, relu-dropout, resume-best-current-iteration, resume-best-epoch, resume-current-iteration, resume-epoch, resume-line-number, resume-ppl, trainable-params, validation-perplexity, word-emb-size, word-len, year-resolution, year-stride]