RetroC2 temporal classification challenge

Guess the publication year of a Polish text. [ver. 1.0.0]

# submitter when ver. description dev-0 RMSE dev-1 RMSE test-A RMSE
23 s444383 2022-06-21 10:15 1.0.0 new prediction 21.295 21.579 23.268
136 Jakub Eichner 2022-06-07 21:46 1.0.0 s478874 linear-regression 66.030 42.681 52.826
28 [anonymized] 2022-05-29 20:41 1.0.0 Upload files to '' linear-regression 21.698 21.996 23.507
29 s444455 2022-05-26 22:04 1.0.0 regression self-made linear-regression 21.698 21.996 23.507
22 s444383 2022-05-25 06:55 1.0.0 new prediction linear-regression 21.295 21.579 23.268
21 s444383 2022-05-20 13:36 1.0.0 changes N/A N/A 23.268
53 s443930 2022-05-19 21:03 1.0.0 s443930 linear-regression 24.659 25.312 26.515
27 s444415 2022-05-19 15:06 1.0.0 444415 linear-regression 21.698 21.996 23.507
49 Martyna Druminska 2022-05-19 11:42 1.0.0 my brilliant solution self-made linear-regression 22.747 N/A 24.816
62 Adam Wojdyła 2022-05-18 00:10 1.0.0 4444507 self-made linear-regression 29.731 26.882 28.143
57 Adam Wojdyła 2022-05-18 00:07 1.0.0 4444507 self-made linear-regression 27.500 26.100 26.964
50 s478839 2022-05-17 23:12 1.0.0 s478839 self-made linear-regression 22.768 22.983 24.865
48 [anonymized] 2022-05-17 22:07 1.0.0 s478831 linear-regression 22.747 22.932 24.816
35 s444018 2022-05-17 21:32 1.0.0 s444018 self-made linear-regression 22.025 22.096 23.855
26 s444501 2022-05-17 21:30 1.0.0 s444501 linear-regression 21.675 21.941 23.461
41 Marcin Kostrzewski 2022-05-17 21:28 1.0.0 Mean publication year, stop words removed. Trained on 50000 examples linear-regression scikit-learn stop-words 22.477 22.581 24.204
52 s478815 2022-05-17 21:27 1.0.0 478815 linear-regression 24.287 24.843 26.113
181 s478815 2022-05-17 21:08 1.0.0 478815 self-made N/A N/A N/A
38 s444356 2022-05-17 20:52 1.0.0 s444356 linear-regression 24.614 22.163 24.032
180 s478815 2022-05-17 20:29 1.0.0 478815 self-made linear-regression 66.062 N/A N/A
179 s478815 2022-05-17 20:12 1.0.0 478815 self-made 63.895 N/A N/A
24 s444417 2022-05-17 19:00 1.0.0 linear regression self-made linear-regression 21.574 21.818 23.438
33 Kamil Guttmann 2022-05-17 18:42 1.0.0 s444380 linear regression tf-idf linear-regression 21.728 22.105 23.595
51 s478873 2022-05-17 11:14 1.0.0 s478873 23.342 23.852 25.479
56 [anonymized] 2022-05-17 09:00 1.0.0 444421 linear-regression 24.925 25.599 26.884
39 Mikołaj Pokrywka 2022-05-17 06:43 1.0.0 444463 linear-regression 24.613 22.162 24.033
55 s478873 2022-05-16 22:56 1.0.0 s478873 self-made linear-regression 24.673 25.366 26.566
47 s409771 2022-05-16 17:35 1.0.0 first solution linear-regression 24.077 22.447 24.782
68 s478840 2022-05-16 12:56 1.0.0 s478840 linear-regression 27.180 27.811 28.809
30 [anonymized] 2022-05-14 17:03 1.0.0 478841 self-made linear-regression 21.698 21.996 23.507
32 s444354 2022-05-14 01:44 1.0.0 s444354 self-made linear-regression 21.702 21.998 23.515
63 s478846 2022-05-11 13:27 1.0.0 First solution linear-regression 28.516 26.476 28.166
37 s444452 2022-05-09 19:43 1.0.0 s444452 self-made linear-regression 22.181 22.201 23.988
36 s444386 2022-05-09 13:38 1.0.0 linear regression 444386 linear-regression 22.072 22.194 23.951
25 s478855 2022-05-08 17:27 1.0.0 s478855 self-made linear-regression 21.668 21.944 23.456
31 s444476 2022-05-01 11:59 1.0.0 s444476 linear-regression 21.698 21.996 23.507
54 ked 2022-04-29 14:01 1.0.0 s449288 - simple linear regression with 10% of train dataset linear-regression 25.237 25.579 26.550
46 p/tlen 2021-05-05 09:07 1.0.0 linear regression with PyTorch batch-size=1 epochs=1 hash-bit-size=18 token-root-len=7 linear-regression pytorch-nn adam 23.183 22.192 24.733
79 p/tlen 2021-05-04 20:53 1.0.0 linear regression with PyTorch batch-size=1 epochs=1 hash-bit-size=18 learning-rate=3.2e-2 token-root-len=7 linear-regression pytorch-nn 38.639 33.722 36.429
92 p/tlen 2021-05-04 20:41 1.0.0 linear regression with PyTorch batch-size=1 epochs=1 hash-bit-size=18 learning-rate=3.2e-2 token-root-len=7 linear-regression pytorch-nn 42.860 37.115 39.378
82 p/tlen 2021-05-04 19:58 1.0.0 linear regression with PyTorch batch-size=1 epochs=1 hash-bit-size=18 learning-rate=3.2e-2 token-root-len=7 linear-regression pytorch-nn 45.891 33.055 37.160
15 kubapok 2020-06-30 22:45 1.0.0 fasttext as classification problem Mikolaj Bachorz experiment reproduction 50 epochs devs included 11.250 10.806 20.805
40 kubapok 2020-06-30 21:19 1.0.0 fasttext as classification problem Mikolaj Bachorz experiment reproduction 50 epochs 21.670 19.827 24.101
69 kubapok 2020-06-30 13:35 1.0.0 fasttext as classification problem Mikolaj Bachorz experiment reproduction 33.336 25.245 30.931
141 [anonymized] 2020-06-24 00:12 1.0.0 xgboost solution ready-made xgboost 54.560 52.444 54.317
161 [anonymized] 2020-06-18 11:49 1.0.0 xgboost script + out files ready-made xgboost 66.925 64.530 67.634
1 kubapok 2020-06-02 17:12 1.0.0 linear layer on top of polish roberta- Adam lr 1e-07 12.260 12.317 13.328
2 kubapok 2020-05-28 15:09 1.0.0 linear layer on top of polish roberta (both finetuned 2 epochs) 13.671 13.723 15.606
14 kubapok 2020-05-24 18:27 1.0.0 xgb on top of polish roberta mean token 17.685 17.286 20.458
8 [anonymized] 2020-05-18 08:34 1.0.0 v3 0.913 1.048 18.923
5 [anonymized] 2020-05-18 05:20 1.0.0 v2 fasttext 1.084 1.216 17.989
3 kubapok 2020-01-28 19:54 1.0.0 ensemble of 4, bilstm 15.326 15.060 17.262
7 kubapok 2020-01-28 19:52 1.0.0 ensemble of 4, bilstm 16.342 16.339 18.758
6 kubapok 2020-01-19 22:49 1.0.0 keras bilstm lstm 16.342 16.339 18.758
34 kubapok 2020-01-12 21:38 1.0.0 keras fasttext like 22.069 20.969 23.782
17 kubapok 2020-01-09 21:38 1.0.0 tfidf baseline maxminclipping linear-regression tf-idf 21.327 21.222 23.114
18 kubapok 2020-01-09 16:16 1.0.0 tfidf lr baseline tf-idf 21.673 21.317 23.167
150 [anonymized] 2019-07-21 14:25 1.0.0 CNN 30,31,32 72.250 52.565 57.806
154 [anonymized] 2019-07-16 18:50 1.0.0 CNN, lr 0.001, 100 filters, [4,5,6,7] filter sizes, dropout 0.5 76.269 N/A 59.799
153 [anonymized] 2019-07-16 18:17 1.0.0 CNN, lr 0.001, 100 filters, [4,5,6,7] filter sizes, dropout 0.5 N/A N/A 59.799
127 [anonymized] 2019-07-13 20:29 1.0.0 Feedforward, word embeddings NKJP + Wikipedia, model01 61.224 43.289 47.333
126 [anonymized] 2019-07-13 20:15 1.0.0 Feedforward, word embeddings NKJP + Wikipedia, model01 61.224 N/A 47.333
125 [anonymized] 2019-07-13 19:20 1.0.0 Feedforward, word embeddings NKJP + Wikipedia, model48 N/A N/A 47.333
119 [anonymized] 2019-07-13 19:06 1.0.0 Feedforward, word embeddings NKJP + Wikipedia N/A N/A 47.045
166 [anonymized] 2019-07-13 12:19 1.0.0 Char CNN 30e, 0.001lr 69.770 45.606 74.298
148 [anonymized] 2019-06-12 14:49 1.0.0 wordlist 4 63.600 55.405 56.295
130 [anonymized] 2019-06-12 14:40 1.0.0 wordlist 4 57.742 51.007 51.853
139 [anonymized] 2019-06-12 14:16 1.0.0 wordlist 4 64.766 50.656 53.026
142 [anonymized] 2019-06-12 14:12 1.0.0 wordlist 4 63.817 53.179 54.668
140 [anonymized] 2019-06-12 13:56 1.0.0 wordlist 4 64.448 51.292 53.421
157 [anonymized] 2019-06-12 12:20 1.0.0 wordlist 4 65.050 63.803 63.186
133 [anonymized] 2019-06-12 12:00 1.0.0 wordlist 4 65.818 49.338 52.340
131 [anonymized] 2019-06-12 11:27 1.0.0 wordlist 4 66.698 48.749 52.150
168 [anonymized] 2019-06-12 11:13 1.0.0 wordlist + random choice 2 83.138 78.207 79.643
91 [anonymized] 2019-06-10 15:23 1.0.0 graph self-made linear-regression graph 48.238 41.755 39.246
124 [anonymized] 2019-06-08 11:23 1.0.0 Bayes to predict some time range fix naive-bayes 41.561 50.584 47.205
58 [anonymized] 2019-06-04 15:41 1.0.0 Vowpal Wabbit quadratic model + graph v2 vowpal-wabbit graph 27.479 23.632 27.297
66 [anonymized] 2019-06-04 15:05 1.0.0 Vowpal Wabbit quadratic model + graph vowpal-wabbit graph 28.930 25.384 28.401
59 [anonymized] 2019-06-04 12:03 1.0.0 vw first encounter(loss function, -b 27, passes=20, quadratic model) vowpal-wabbit graph 26.939 23.654 27.384
42 [anonymized] 2019-06-01 15:50 1.0.0 tf ready-made linear-regression tf 23.149 21.791 24.291
118 [anonymized] 2019-05-29 06:10 1.0.0 Test feedforward 69.770 45.606 46.987
165 [anonymized] 2019-05-27 14:04 1.0.0 Bayes to predict some time range naive-bayes 65.137 82.283 71.060
167 [anonymized] 2019-05-27 13:06 1.0.0 naive bayes naive-bayes 69.913 89.059 77.466
129 [anonymized] 2019-05-22 08:38 1.0.0 BiLSTM 30epochs 22nd new tokenizer 69.770 45.606 50.381
178 [anonymized] 2019-05-22 08:36 1.0.0 BiLSTM 3- ep0 epochs 22nd new tokenizer 69.770 45.606 N/A
155 [anonymized] 2019-05-21 19:00 1.0.0 BiLSTM 3- epochs 22nd new tokenizer 69.770 45.606 62.489
61 [anonymized] 2019-05-17 18:45 1.0.0 Vowpal Wabbit - linear regression + graph vowpal-wabbit graph 29.092 24.445 27.808
45 [anonymized] 2019-05-11 22:40 1.0.0 BiLSTM, 30epochs, model 28th 69.770 45.606 24.605
73 [anonymized] 2019-05-09 23:19 1.0.0 ready-made tf-idf: a fix ready-made linear-regression tf-idf 29.817 26.292 33.388
44 [anonymized] 2019-05-09 17:51 1.0.0 BiLSTM, 30epochs, model 28th N/A 45.606 24.605
43 [anonymized] 2019-05-09 05:03 1.0.0 BiLSTM, 30epochs, model 28th N/A N/A 24.605
64 [anonymized] 2019-05-06 15:08 1.0.0 tfidf 3k words low range 59.446 N/A 28.277
60 [anonymized] 2019-05-06 15:05 1.0.0 tfidf 50k words low reduction range ready-made linear-regression tf-idf 59.446 N/A 27.708
107 [anonymized] 2019-05-06 15:01 1.0.0 My solution go.php rule-based 57.517 45.142 42.687
102 [anonymized] 2019-05-06 14:49 1.0.0 transfer files to VM 59.446 N/A 40.674
101 [anonymized] 2019-05-06 14:08 1.0.0 all documents to predict on vector 30k words TFIDF KNN ready-made knn tf-idf 59.446 N/A 40.674
74 [anonymized] 2019-05-06 13:57 1.0.0 all documents to predict on vector 10k words TFIDF Linear ready-made linear-regression tf-idf 59.446 N/A 34.444
114 [anonymized] 2019-05-06 13:53 1.0.0 300 documents to predict on vector 10k words TFIDF Linear self-made linear-regression tf-idf 59.446 N/A 43.571
20 [anonymized] 2019-05-03 19:49 1.0.0 BiLSTM w/o sorting N/A 43.734 23.225
70 [anonymized] 2019-05-03 18:37 1.0.0 3000 words tf-idf self-made linear-regression tf-idf 34.237 30.865 32.631
71 [anonymized] 2019-05-03 17:25 1.0.0 2500 words tf-idf 34.857 31.521 32.912
72 [anonymized] 2019-05-03 16:45 1.0.0 2000 words tf-idf self-made linear-regression tf-idf 35.672 32.114 33.384
78 [anonymized] 2019-05-03 16:27 1.0.0 1000 words tf-idf 38.925 35.224 36.004
19 [anonymized] 2019-05-03 07:17 1.0.0 BiLSTM w/o sorting N/A N/A 23.225
128 [anonymized] 2019-04-29 14:57 1.0.0 self made TFIDF 1000 documents 500 word vector KNN4 self-made linear-regression knn tf-idf 59.446 N/A 48.323
89 [anonymized] 2019-04-28 20:08 1.0.0 linear ready tf ready-made linear-regression tf 51.562 41.400 39.240
123 [anonymized] 2019-04-27 18:41 1.0.0 change encoding 59.446 N/A 47.094
122 [anonymized] 2019-04-27 17:54 1.0.0 tfidf - not ready 59.446 N/A 47.094
4 Artur Nowakowski 2019-04-19 07:52 1.0.0 optimized word2vec + nn neural-network word2vec 15.913 14.587 17.789
13 Artur Nowakowski 2019-04-17 16:02 1.0.0 wordvec + nn 5-fold validation 17.644 16.803 20.232
12 Artur Nowakowski 2019-04-17 09:20 1.0.0 word2vec + nn optimized for dev1 18.714 15.851 19.915
149 [anonymized] 2019-04-16 22:30 1.0.0 Now with CHARTS self-made linear-regression graph 62.496 57.478 57.119
77 [anonymized] 2019-04-16 16:36 1.0.0 simple lin reg self-made linear-regression graph N/A N/A 35.958
88 [anonymized] 2019-04-16 16:24 1.0.0 Figure add self-made linear-regression graph N/A 40.479 38.431
10 Artur Nowakowski 2019-04-16 16:13 1.0.0 basic word2vec + nn solution (optimized for dev0) 16.793 17.702 19.557
87 [anonymized] 2019-04-16 15:36 1.0.0 years in text self-made linear-regression graph 50.710 N/A 38.419
172 [anonymized] 2019-04-16 15:20 1.0.0 mean year found in text 1341469.078 N/A 1345919.117
94 [anonymized] 2019-04-16 13:24 1.0.0 Linear regression (USA |usa |stany zjednoczone) + years 48.599 42.924 39.382
76 [anonymized] 2019-04-16 13:14 1.0.0 simple linear regression self-made linear-regression graph N/A N/A 35.958
93 [anonymized] 2019-04-16 13:10 1.0.0 Linear regression (USA|usa|stany zjednoczone) plus years N/A 42.924 39.382
104 [anonymized] 2019-04-16 09:49 1.0.0 linear regression self-made linear-regression graph 61.474 40.228 41.220
121 [anonymized] 2019-04-16 01:12 1.0.0 Merge branch 'master' of ssh://gonito.net/huntekah/retroc2 self-made linear-regression graph 59.446 N/A 47.094
120 [anonymized] 2019-04-16 00:56 1.0.0 Honest one variable solution, without fancy, and thus easy methods self-made linear-regression graph 59.446 N/A 47.094
65 [anonymized] 2019-04-15 20:33 1.0.0 Basic ready-made solution with one column ready-made linear-regression tf-idf 30.212 26.677 28.298
109 [anonymized] 2019-04-15 18:40 1.0.0 excel plots :) self-made linear-regression graph 52.092 46.796 42.991
99 [anonymized] 2019-04-15 18:39 1.0.0 XXDDDD my solution self-made linear-regression ADD CHARTS selfMadeLinearRegres_Solver.py self-made linear-regression graph 48.699 42.928 39.904
100 [anonymized] 2019-04-15 16:35 1.0.0 one variable regression self-made linear-regression graph N/A N/A 40.219
108 [anonymized] 2019-04-15 12:46 1.0.0 hope that is final one self-made linear-regression 52.092 46.796 42.991
177 [anonymized] 2019-04-15 12:40 1.0.0 now more iterations 52.092 46.796 N/A
170 [anonymized] 2019-04-15 12:02 1.0.0 forgot to add out files xd 279.733 285.868 269.460
171 [anonymized] 2019-04-15 11:57 1.0.0 date detection and linear regression 271.215 296.533 271.270
98 [anonymized] 2019-04-15 11:54 1.0.0 XDDD my solution self-made linear-regression selfMadeLinearRegres_Solver.py self-made linear-regression 48.699 42.928 39.904
16 Artur Nowakowski 2019-04-15 11:40 1.0.0 Linear regression with TF-IDF weighing scheme ready-made linear-regression tf-idf 20.750 20.676 22.341
67 [anonymized] 2019-04-11 18:56 1.0.0 Basic ready-made solution ready-made linear-regression tf-idf 30.723 26.963 28.668
90 [anonymized] 2019-04-10 10:28 1.0.0 date detection top prio rule-based 48.238 41.755 39.246
176 [anonymized] 2019-04-09 19:11 1.0.0 not many rules rule-based N/A N/A N/A
145 [anonymized] 2019-04-09 16:55 1.0.0 post OCR signs 69.841 N/A 55.400
144 [anonymized] 2019-04-09 16:53 1.0.0 post OCR signs 73.236 N/A 55.400
164 [anonymized] 2019-04-09 16:51 1.0.0 post OCR signs 73.236 N/A 69.970
163 [anonymized] 2019-04-09 16:46 1.0.0 post OCR signs 351.237 N/A 69.970
147 [anonymized] 2019-04-09 16:43 1.0.0 solution with simple word list2 rule-based 63.523 55.172 56.124
160 [anonymized] 2019-04-09 16:28 1.0.0 post OCR signs N/A N/A 66.046
106 [anonymized] 2019-04-09 16:15 1.0.0 My solution basic rule-based solver.py rule-based 57.517 45.142 42.682
105 [anonymized] 2019-04-09 16:12 1.0.0 fourth solution rule-based N/A 43.947 41.235
83 [anonymized] 2019-04-09 15:28 1.0.0 bad solution 2 rule-based N/A N/A 37.197
152 [anonymized] 2019-04-09 13:07 1.0.0 Bad rule-based solution rule-based N/A N/A 59.748
86 [anonymized] 2019-04-08 21:02 1.0.0 better stupid solution rule-based 50.229 39.773 38.368
85 [anonymized] 2019-04-08 19:59 1.0.0 stupid solution rule-based 50.234 41.683 38.177
103 [anonymized] 2019-04-08 18:10 1.0.0 my very simple solution3 rule-based 50.607 44.078 40.838
96 [anonymized] 2019-04-08 15:24 1.0.0 rulebased rule-based N/A N/A 39.781
112 [anonymized] 2019-04-08 15:23 1.0.0 all rules rule-based 54.064 N/A 43.101
111 [anonymized] 2019-04-08 15:21 1.0.0 words 57.733 N/A 43.101
95 [anonymized] 2019-04-08 12:54 1.0.0 Based on a list with years rule-based 48.666 43.031 39.695
175 [anonymized] 2019-04-08 12:42 1.0.0 Improve guessing accuracy for dev0 dev1 48.666 43.031 N/A
146 [anonymized] 2019-04-08 11:32 1.0.0 my very simple solution2A 63.411 49.901 55.583
174 [anonymized] 2019-04-07 22:13 1.0.0 my very simple solution1 63.411 49.901 N/A
110 [anonymized] 2019-04-07 19:51 1.0.0 complicated rules make Good/Bad results 57.733 N/A 43.101
151 [anonymized] 2019-04-07 18:17 1.0.0 most popular words in 10-year periods java rule-based 51.117 54.080 57.981
132 [anonymized] 2019-04-07 09:44 1.0.0 simple rule based rule-based 57.733 N/A 52.306
159 [anonymized] 2019-04-07 08:42 1.0.0 my best solution rule-based 86.441 61.280 65.953
113 [anonymized] 2019-04-06 16:59 1.0.0 based on historical word list rule-based 57.220 46.287 43.424
97 [anonymized] 2019-04-05 19:22 1.0.0 simple set solution rule-based 53.933 37.245 39.885
81 Artur Nowakowski 2019-04-03 11:52 1.0.0 simple solution rule-based 50.255 37.596 36.563
143 [anonymized] 2019-04-02 16:50 1.0.0 LSTM EPOCHS=10 LR=0.001 DROPOUT=0.1 - input w/o filtering, adding missing line neural-network bilstm 66.348 N/A 55.115
158 [anonymized] 2019-03-30 19:01 1.0.0 LSTM EPOCHS=10 LR=0.001 DROPOUT=0.2 - input w/o filtering, adding missing line neural-network bilstm 73.227 N/A 63.913
169 [anonymized] 2019-03-27 17:36 1.0.0 LSTM EPOCHS=10 LR=0.001 DROPOUT=0.5 - input w/o filtering, adding missing line neural-network bilstm N/A N/A 97.902
138 [anonymized] 2019-02-19 09:08 1.0.0 Modification of the script to make it run 72.529 N/A 52.950
137 [anonymized] 2019-02-19 08:50 1.0.0 5 epochs; filtered input; feedforward network neural-network 72.529 N/A 52.950
156 p/tlen 2018-08-30 20:16 1.0.0 little test 5 stupid N/A N/A 62.502
162 p/tlen 2018-08-30 19:38 1.0.0 little test 3 stupid N/A N/A 69.597
116 p/tlen 2018-08-30 19:31 1.0.0 little test 2 stupid N/A N/A 46.513
117 p/tlen 2018-08-30 19:27 1.0.0 test stupid N/A N/A 46.591
135 [anonymized] 2018-08-14 18:29 1.0.0 dev0 first solution stupid neural-network 57.863 N/A 52.721
80 p/tlen 2018-05-29 07:19 1.0.0 try xgboost xgboost 39.907 37.228 36.484
11 p/tlen 2017-07-08 12:52 1.0.0 VW with yearly resolution 23.065 17.178 19.695
173 p/tlen 2017-07-08 12:50 1.0.0 VW with yearly resolution N/A N/A N/A
75 p/tlen 2017-07-07 10:23 1.0.0 year references combined with hand-crafted rules 42.302 42.654 35.807
115 p/tlen 2017-07-07 09:41 1.0.0 hand-crafted rules 48.602 50.646 44.167
84 p/tlen 2017-07-07 09:11 1.0.0 year references 46.447 39.809 37.652
9 p/tlen 2017-05-31 04:49 1.0.0 VW -nn 6 on up to 4-grams and [5-7] tokens stupid vowpal-wabbit neural-network 22.413 16.991 19.501
134 p/tlen 2017-05-26 21:48 1.0.0 null model stupid null-model 57.735 51.906 52.539
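
The three score columns in the table are RMSE values, where lower is better. As a reading aid, here is a minimal sketch of how such a score could be computed, assuming the standard root-mean-square error definition over predicted and expected publication years. The challenge's own evaluation code is not shown on this page, so this is an illustration only. The names `rmse` and `constant_baseline` are hypothetical, and treating a "null model" as a constant mean-year predictor is an assumption, not something the table states.

```python
import math


def rmse(expected, predicted):
    """Root-mean-square error between two equal-length sequences of years."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("cannot compute RMSE of an empty sequence")
    squared_errors = [(e - p) ** 2 for e, p in zip(expected, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def constant_baseline(train_years):
    """Hypothetical constant baseline: always predict the mean training year."""
    return sum(train_years) / len(train_years)


if __name__ == "__main__":
    # Made-up years for illustration only; not taken from the challenge data.
    train = [1900.0, 1950.0, 2000.0]
    dev_expected = [1910.0, 1990.0]
    guess = constant_baseline(train)
    print(rmse(dev_expected, [guess] * len(dev_expected)))
```

Because errors are squared before averaging, a few wildly wrong year guesses can dominate the score, which would be consistent with the very large values on some rows.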