"He Said She Said" classification challenge (2nd edition)

Give the probability that a text in Polish was written by a man. [ver. 2.0.2]
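
The leaderboard below scores each submission by Accuracy and Likelihood. The following is a minimal sketch of how such scores could be computed from predicted probabilities. It assumes that Likelihood is the geometric mean of the probability assigned to the correct class and that Accuracy thresholds the probability at 0.5. Neither definition is stated on this page, although both are consistent with the null model's score of 0.50000 on each metric. The function names and the sample data are hypothetical.

```python
import math


def accuracy(probs_male, labels_male, threshold=0.5):
    """Fraction of texts whose thresholded prediction matches the label."""
    hits = sum(
        (p >= threshold) == bool(y) for p, y in zip(probs_male, labels_male)
    )
    return hits / len(labels_male)


def likelihood(probs_male, labels_male):
    """Geometric mean of the probability assigned to the true class.

    Assumed definition, not confirmed by the challenge page.
    """
    log_sum = 0.0
    for p, y in zip(probs_male, labels_male):
        p_true = p if y else 1.0 - p
        if p_true == 0.0:
            # One zero-probability answer drives the whole product to zero.
            return 0.0
        log_sum += math.log(p_true)
    return math.exp(log_sum / len(labels_male))


# Hypothetical data: 1 = written by a man, 0 = not written by a man.
labels = [1, 0, 1, 0]
null_model = [0.5, 0.5, 0.5, 0.5]  # always answers "no idea"

null_accuracy = accuracy(null_model, labels)
null_likelihood = likelihood(null_model, labels)
```

Under these assumptions a model that always outputs 0.5 scores 0.5 on both metrics when the classes are balanced, which is why it serves as a natural baseline.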

Git repo URL: git://gonito.net/petite-difference-challenge2 / Branch: master
To get the challenge data, run:

    git clone --single-branch git://gonito.net/petite-difference-challenge2 -b master

Browse at https://gonito.net/gitlist/petite-difference-challenge2.git/master

Leaderboard

# | submitter | when | ver. | description and tags | test-A Accuracy | test-A Likelihood | ×
1 | kaczla | 2020-07-02 08:55 | 2.0.2 | Polish RoBERTa (base), epoch 5, seq_len 512, active dropout fairseq roberta-pl | 0.74332 | 0.62110 | 30
2 | kubapok | 2020-06-19 09:29 | 2.0.2 | pl roberta large active dropout avg 12 runs | 0.74406 | 0.61949 | 28
3 | [anonymized] | 2020-05-24 14:52 | 2.0.2 | self-made NB with probs ISI-2019-063 probabilities | 0.65612 | 0.52537 | 5
4 | [anonymized] | 2020-05-24 14:17 | 2.0.0 | v5 probabilities | 0.64427 | 0.52134 | 7
5 | [anonymized] | 2020-06-07 12:45 | 2.0.0 | XGBoost ready-made ready-made xgboost | 0.60039 | 0.52000 | 5
6 | [anonymized] | 2020-12-13 22:48 | 2.0.2 | model-size=100k voc-size=100k logistic-regression pytorch-nn | 0.60299 | 0.51313 | 2
7 | wirus wirus | 2020-05-23 21:29 | 2.0.0 | null model null-model | 0.50000 | 0.50000 | 24
8 | Adam Wojdyła | 2022-04-26 16:15 | 2.0.2 | 4444507 self-made | 0.50000 | 0.49749 | 2
9 | [anonymized] | 2020-06-03 17:39 | 2.0.0 | 2nd logistic-regression word2vec | 0.49299 | 0.49679 | 2
10 | [anonymized] | 2020-12-08 15:08 | 2.0.2 | pytorch logistic regression logistic-regression pytorch-nn | 0.54176 | 0.43930 | 4
11 | [anonymized] | 2021-02-03 17:42 | 2.0.2 | TAU15 logistic-regression pytorch-nn | 0.50074 | 0.00000 | 2
12 | [anonymized] | 2021-02-01 20:38 | 2.0.2 | logistic-regression logistic-regression pytorch-nn | 0.63924 | 0.00000 | 3
13 | [anonymized] | 2020-12-15 22:09 | 2.0.2 | added code with output for dev-0, dev-1, test-A logistic-regression pytorch-nn | 0.49317 | 0.00000 | 1
14 | [anonymized] | 2020-12-15 21:55 | 2.0.2 | added source file logistic-regression pytorch-nn | 0.49728 | 0.00000 | 3
15 | [anonymized] | 2021-01-27 02:20 | 2.0.2 | 1try logistic-regression pytorch-nn | 0.49914 | 0.00000 | 1
16 | [anonymized] | 2021-01-26 23:25 | 2.0.2 | try again logistic-regression pytorch-nn | 0.53999 | 0.00000 | 2
17 | [anonymized] | 2020-12-16 08:42 | 2.0.2 | 'final' logistic-regression pytorch-nn | 0.50878 | 0.00000 | 1
18 | [anonymized] | 2020-12-07 18:12 | 2.0.2 | add code and fixed test-A logistic-regression pytorch-nn | 0.57926 | 0.00000 | 2
19 | [anonymized] | 2020-12-09 07:50 | 2.0.2 | Simple Solution for Logistic Regresion logistic-regression pytorch-nn | 0.58355 | 0.00000 | 2
20 | [anonymized] | 2020-12-10 01:01 | 2.0.2 | solution logistic-regression pytorch-nn | 0.58344 | 0.00000 | 5
21 | [anonymized] | 2021-03-13 14:48 | 2.0.2 | my brilliant solution logistic-regression pytorch-nn | 0.53925 | 0.00000 | 1
22 | s478839 | 2022-04-25 22:13 | 2.0.2 | s478839 self-made | 0.51284 | 0.00000 | 1
23 | s444417 | 2022-04-23 08:45 | 2.0.2 | imbalance words self-made | 0.51237 | 0.00000 | 2
24 | s443930 | 2022-04-26 21:44 | 2.0.2 | s443930 self-made | 0.64618 | 0.00000 | 1
25 | Jakub Adamski | 2022-04-20 06:45 | 2.0.2 | s444341 zadanie self-made | 0.58960 | 0.00000 | 1
26 | s409771 | 2022-04-22 15:12 | 2.0.2 | multinomial naive bayes self-made | 0.65503 | 0.00000 | 1
27 | s444354 | 2022-04-26 23:26 | 2.0.2 | s444354 | 0.51333 | 0.00000 | 6
28 | s478815 | 2022-04-27 10:35 | 2.0.2 | 478815 self-made | 0.52032 | 0.00000 | 8
29 | ked | 2022-04-22 15:09 | 2.0.2 | s449288 - dumb wordlist lookup self-made | 0.51464 | 0.00000 | 2
30 | s478873 | 2022-04-25 22:18 | 2.0.2 | s478873 self-made | 0.51352 | 0.00000 | 1
31 | Cezary | 2022-04-26 22:57 | 2.0.2 | s470623 | 0.51036 | 0.00000 | 2
32 | [anonymized] | 2022-04-25 23:49 | 2.0.2 | 478841 self-made | 0.66702 | 0.00000 | 1
33 | Jakub | 2022-04-27 09:04 | 2.0.2 | s434624 self-made | 0.51005 | 0.00000 | 1
34 | [anonymized] | 2022-04-26 20:41 | 2.0.2 | s478831 | 0.51241 | 0.00000 | 1
35 | s444476 | 2022-04-22 15:59 | 2.0.2 | s444476 self-made | 0.51889 | 0.00000 | 2
36 | 444498 | 2023-11-06 15:39 | 2.0.2 | test old after key modify on new | 0.51307 | 0.00000 | 8
37 | Mikołaj Pokrywka | 2022-04-26 14:14 | 2.0.2 | 444463 self-made | 0.66439 | 0.00000 | 3
38 | s444517 | 2022-04-25 07:39 | 2.0.2 | s444517 - logistic regression self-made | 0.66705 | 0.00000 | 1
39 | s478855 | 2022-04-20 19:16 | 2.0.2 | 478855 - improvement self-made | 0.51230 | 0.00000 | 2
40 | s478840 | 2022-04-26 05:20 | 2.0.2 | s478840 self-made | 0.66531 | 0.00000 | 1
41 | s478846 | 2022-04-21 08:55 | 2.0.2 | First solution s478846 self-made | 0.51599 | 0.00000 | 1
42 | s444356 | 2022-04-20 11:04 | 2.0.2 | s444356 self-made | 0.51307 | 0.00000 | 3
43 | s444386 | 2022-04-25 15:56 | 2.0.2 | logistic regresion 444386 self-made | 0.66769 | 0.00000 | 1
44 | s444501 | 2022-04-22 17:31 | 2.0.2 | 444501 self-made | 0.51405 | 0.00000 | 3
45 | Kamil Guttmann | 2022-04-30 15:42 | 2.0.2 | s444380 logistic regression bigrams self-made | 0.67034 | 0.00000 | 2
46 | [anonymized] | 2022-04-25 17:13 | 2.0.2 | 444421 self-made | 0.66394 | 0.00000 | 1
47 | s444465 | 2022-04-25 20:36 | 2.0.2 | Solution 444465 self-made | 0.62985 | 0.00000 | 4
48 | s444452 | 2022-04-26 21:05 | 2.0.2 | 444452 self-made | 0.63615 | 0.00000 | 2
49 | s444018 | 2022-04-26 14:19 | 2.0.2 | s444018 self-made | 0.51307 | 0.00000 | 3

# | tags | test-A Accuracy -C | test-A Accuracy +C | test-A Accuracy +H | test-A Likelihood -C | test-A Likelihood +C | test-A Likelihood +H | test-A Accuracy | test-A Likelihood
1 | fairseq roberta-pl | 0.74244 | 0.77159 | 0.77125 | 0.62032 | 0.64656 | 0.64332 | 0.74332 | 0.62110
2 | (none) | 0.74349 | 0.76227 | 0.77125 | 0.61881 | 0.64193 | 0.63970 | 0.74406 | 0.61949
3 | roberta-xlm | 0.70042 | 0.72571 | 0.71500 | 0.58222 | 0.60181 | 0.59521 | 0.70118 | 0.58280
4 | fairseq roberta | N/A | N/A | N/A | N/A | N/A | N/A | 0.69153 | 0.57068
5 | probabilities | 0.65572 | 0.66879 | 0.64500 | 0.52524 | 0.52950 | 0.52176 | 0.65612 | 0.52537
6 | ready-made xgboost | N/A | N/A | N/A | N/A | N/A | N/A | 0.60039 | 0.52000
7 | logistic-regression pytorch-nn | 0.60208 | 0.63199 | 0.60500 | 0.51296 | 0.51866 | 0.52272 | 0.60299 | 0.51313
8 | null-model | N/A | N/A | N/A | N/A | N/A | N/A | 0.50000 | 0.50000
9 | self-made | 0.50162 | 0.44799 | 0.48250 | 0.49766 | 0.49233 | 0.49575 | 0.50000 | 0.49749
10 | logistic-regression word2vec | N/A | N/A | N/A | N/A | N/A | N/A | 0.49299 | 0.49679
11 | baseline | N/A | N/A | N/A | N/A | N/A | N/A | 0.50000 | 0.48990
12 | python | 0.51753 | 0.56109 | 0.53000 | 0.00000 | 0.00000 | 0.00000 | 0.51885 | 0.00000
13 | algo attention backoff | 0.63534 | 0.66192 | 0.64750 | 0.00000 | 0.00000 | 0.00000 | 0.63615 | 0.00000
14 | ready-made svm | N/A | N/A | N/A | N/A | N/A | N/A | 0.59623 | 0.00000
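
Many entries above report a Likelihood of 0.00000 even though their Accuracy is well above chance. One possible explanation, offered here as an assumption rather than something the page confirms, is that those submissions output hard 0/1 answers: if Likelihood is a geometric mean of the probability given to the true class, a single confident wrong answer zeroes the whole score. The sketch below illustrates the effect and a common workaround, clipping predictions away from 0 and 1. The helper names, the `eps` value, and the data are all hypothetical.

```python
import math


def clip_probs(probs, eps=0.05):
    """Keep every probability inside [eps, 1 - eps]."""
    return [min(max(p, eps), 1.0 - eps) for p in probs]


def geometric_mean_likelihood(probs_male, labels_male):
    """Geometric mean of the probability given to the true class (assumed metric)."""
    true_class_probs = [p if y else 1.0 - p for p, y in zip(probs_male, labels_male)]
    if min(true_class_probs) == 0.0:
        return 0.0  # avoids log(0); the product would be zero anyway
    mean_log = sum(math.log(p) for p in true_class_probs) / len(true_class_probs)
    return math.exp(mean_log)


# Hypothetical hard predictions: three correct answers and one confident mistake.
labels = [1, 0, 1, 0]
hard_predictions = [1.0, 0.0, 1.0, 1.0]

hard_score = geometric_mean_likelihood(hard_predictions, labels)
clipped_score = geometric_mean_likelihood(clip_probs(hard_predictions), labels)
```

With these inputs `hard_score` is exactly zero while `clipped_score` stays positive, because the clipped model still leaves some probability mass on the class it got wrong. Clipping does not change which class is predicted, so Accuracy under a 0.5 threshold is unaffected.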