"He Said She Said" classification challenge (2nd edition)

Guess whether a text in Polish was written by a man or woman.

submitter when description (tags) dev-0/Accuracy dev-1/Accuracy test-A/Accuracy
kaczla 2017-05-27 17:08 LSTM - remove one layer, simple lemmatizer neural-network 0.64777 0.64211 0.64444
kaczla 2017-05-25 19:55 LSTM - decrease batch_size, 5 RNNs neural-network 0.70343 0.69886 0.69348
kaczla 2017-05-24 18:04 LSTM - decrease batch_size, 3 RNNs neural-network 0.70125 0.69679 0.69214
kaczla 2017-05-23 05:32 LSTM - remove one layer, 3 RNNs neural-network 0.70082 0.69814 0.69063
kaczla 2017-05-19 04:25 LSTM - remove one layer, decrease batch_size, epoch = 2 neural-network 0.69495 0.69329 0.68734
kaczla 2017-05-18 17:54 LSTM - remove one layer, decrease batch_size, epoch = 3 neural-network 0.68841 0.68476 0.68000
kaczla 2017-05-16 10:23 LSTM - epoch = 3 neural-network 0.68501 0.68359 0.67617
kaczla 2017-05-15 04:27 LSTM - decrease batch_size neural-network 0.69484 0.69201 0.68766
kaczla 2017-05-15 04:25 LSTM - decrease batch_size 0.69364 0.69189 0.68599
kaczla 2017-05-15 04:21 LSTM - remove one layer neural-network 0.69364 0.69189 0.68599
mmalisz 2017-05-14 22:05 BPE smalltrain 0.56843 0.64794 N/A
mmalisz 2017-05-14 22:02 Keras smalltrain 0.56843 0.64794 0.49932
kaczla 2017-05-14 15:48 LSTM - remove one layer neural-network 0.69364 0.69189 0.68599
siulkilulki 2017-05-11 20:01 Trigram hard keywords that occurred at least 13 times; when it can't decide on hard keywords, "F" is assigned. Answers based on hard keywords percentage: dev-0 6%, dev-1 7%, test-A 6% self-made python 0.51779 0.51906 0.51526
siulkilulki 2017-05-11 19:54 Trigram hard keywords that occurred at least 13 times; when it can't decide on hard keywords, naive Bayes is used. Answers based on hard keywords percentage: dev-0 6%, dev-1 7%, test-A 6% self-made python 0.67116 0.65394 0.65709
siulkilulki 2017-05-11 16:56 Bigram hard keywords that occurred at least 17 times; when it can't decide on hard keywords, "F" is assigned. Based on hard keywords percentage: dev-0 12%, dev-1 13%, test-A 14% self-made python 0.53133 0.53295 0.53057
siulkilulki 2017-05-11 16:42 Bigram hard keywords that occurred at least 17 times; when it can't decide on hard keywords, naive Bayes is used. Based on hard keywords percentage: dev-0 12%, dev-1 13%, test-A 14% self-made python 0.67223 0.65568 0.65883
siulkilulki 2017-05-11 14:40 Bigram hard keywords that occurred at least 5 times; when it can't decide on hard keywords, naive Bayes is used. Based on hard keywords percentage: dev-0 59%, dev-1 57%, test-A 56% self-made python 0.64618 0.63862 0.63857
siulkilulki 2017-05-11 13:13 Bigram hard keywords that occurred at least 5 times; when it can't decide on hard keywords, "F" is assigned. Based on hard keywords percentage: dev-0 59%, dev-1 57%, test-A 56% self-made python 0.59141 0.58698 0.58315
EmEm 2017-05-04 19:08 1st try N/A N/A 0.59865
siulkilulki 2017-04-28 17:45 Hard keywords based solution ver 2. If it can't decide based on hard keywords, naive Bayes is used. Percentage of answers based on keywords: dev-0 10%, dev-1 9%, test-A 8%. Only words with count 3 or greater are considered in the hard-keyword approach. self-made python 0.66285 0.64877 0.65190
siulkilulki 2017-04-28 17:26 Hard keywords based solution ver 1. If it can't decide based on hard keywords, naive Bayes is used. Percentage of answers based on keywords: dev-0 22%, dev-1 19%, test-A 20% self-made python 0.65111 0.64067 0.64489
p/tlen 2017-04-25 19:49 5 RNNs combined 0.70079 0.69568 0.69044
p/tlen 2017-04-24 05:36 fasttext combined with KenLM 0.71653 0.70503 0.69295
p/tlen 2017-04-23 17:02 LSTM (by Nozdi) 0.69433 0.68978 0.68382
p/tlen 2017-04-23 10:35 fasttext word 2-ngrams, 10x buckets, character 3-6-ngrams 0.70222 0.69351 0.68632
p/tlen 2017-04-23 08:15 fasttext word 2-ngrams, 10x buckets, character 3-6-ngrams 0.70222 N/A N/A
p/tlen 2017-04-23 06:53 fasttext word 2-ngrams, 10x buckets, character 3-6-ngrams 0.69423 0.68672 0.67830
p/tlen 2017-04-22 20:26 fasttext with word 2-grams and 10x buckets fasttext ready-made 0.69322 0.68578 0.67851
p/tlen 2017-04-22 19:42 fasttext with word 2-grams fasttext ready-made 0.68593 0.67887 0.67183
p/tlen 2017-04-22 19:34 fasttext (baseline) fasttext ready-made 0.67711 0.66870 0.66623
kaczla 2017-04-15 16:18 Vowpal Wabbit vowpal-wabbit ready-made 0.67142 0.66639 0.66109
kaczla 2017-04-10 13:26 KenLM lm kenlm ready-made 0.67077 0.66102 0.65053
kaczla 2017-04-10 13:07 Vowpal Wabbit vowpal-wabbit ready-made 0.67013 0.66531 0.66036
zp30615 2017-04-04 15:19 bayes with simple stemming fix python self-made naive-bayes 0.65368 0.63479 0.64012
zp30615 2017-04-04 13:48 bayes with simple stemming 0.56540 0.56040 0.56282
zp30615 2017-04-03 21:08 bayes tf-idf (classic) python self-made naive-bayes 0.59090 0.58922 0.58420
zp30615 2017-04-03 20:54 dev-0 tf-idf test (big change) 0.54156 0.66063 0.65417
zp30615 2017-04-03 20:07 dev-0 tf-idf test (small change) 0.58224 0.66063 0.65417
zp30615 2017-04-01 17:45 logistic regression 40 epoch 0.66230 0.66063 0.65417
zp30615 2017-04-01 13:38 dev-0 tf-idf test 0.59090 0.66089 0.65494
kaczla 2017-03-31 21:52 Vowpal Wabbit vowpal-wabbit ready-made 0.65301 0.64660 0.64337
zp30615 2017-03-31 17:30 logistic regression 20 epoch logistic-regression python self-made 0.66397 0.66089 0.65494
kaczla 2017-03-27 20:29 Logistic regression logistic-regression self-made python 0.66180 0.65658 0.65381
EmEm 2017-03-27 20:11 logistic regression python self-made logistic-regression N/A N/A 0.59865
zp30615 2017-03-27 18:29 logistic regression 10 epoch logistic-regression self-made python 0.66355 0.66069 0.65399
zp30615 2017-03-27 16:03 logistic regression 1 epoch logistic-regression self-made python 0.65032 0.64632 0.63895
germek 2017-03-27 13:21 Regression logistic-regression python self-made N/A N/A 0.62472
germek 2017-03-27 13:20 Regression N/A N/A N/A
germek 2017-03-27 13:19 Regression N/A N/A N/A
germek 2017-03-27 13:14 Regression N/A N/A 0.63928
Mario 2017-03-27 13:07 logistic regression, 10 epochs - shuffle logistic-regression self-made 0.63823 0.63671 0.62985
siulkilulki 2017-03-27 11:08 without feature engineering, Adaptive Moment Estimation, 49 epoch. discriminative better than generative self-made python logistic-regression 0.67127 0.66687 0.66120
Mario 2017-03-27 10:32 logistic regression, 10 epochs logistic-regression self-made 0.62059 0.61890 0.61450
Mario 2017-03-26 23:22 logistic regression, 1 epoch logistic-regression self-made 0.59625 0.59012 0.58915
Mario 2017-03-26 23:17 logistic regression, 1 epoch, small training set v2 0.66669 0.64823 0.58915
Mario 2017-03-26 22:51 logistic regression, 1 epoch, small training set 0.66669 0.64823 0.50767
siulkilulki 2017-03-23 08:23 22 epoch, simple SGD with stupid annealing, need to make better SGD, without feature engineering logistic-regression self-made python 0.66878 0.66422 0.65814
zp30615 2017-03-20 19:43 Bernoulli Naive Bayes 1 naive-bayes bernoulli python self-made 0.65483 0.63717 0.64269
antystenes 2017-03-20 16:28 Logistic Haskell haskell self-made logistic-regression 0.61675 0.61432 0.61065
zp30615 2017-03-16 17:27 bayes + tf_idf 0.59461 0.59014 0.58846
zp30615 2017-03-16 12:37 corrected bayes naive-bayes python self-made multinomial 0.66665 0.64844 0.65369
siulkilulki 2017-03-15 14:05 scikit-learn naive Bayes ready-made python naive-bayes scikit-learn 0.66680 0.64842 0.65394
antystenes 2017-03-13 08:36 TurboHaskell 2010 v2 0.66435 0.70540 0.65029
antystenes 2017-03-11 15:54 TurboHaskell 2010 haskell self-made naive-bayes 0.66912 0.64996 0.65531
Durson 2017-03-11 03:25 Test 0.58665 0.58153 0.57822
Durson 2017-03-11 02:59 Test 0.59857 0.59280 0.58933
Durson 2017-03-11 02:15 Test 0.62323 0.61270 0.60889
Durson 2017-03-11 01:16 Test 0.59528 0.59049 0.58699
Durson 2017-03-11 00:44 Test 0.63650 0.62513 0.62066
Durson 2017-03-11 00:26 Test 0.63455 0.62364 0.61931
Durson 2017-03-11 00:19 Test 0.63425 0.62240 0.61862
Durson 2017-03-10 23:48 Test N/A 0.52997 N/A
Durson 2017-03-09 17:44 Test 0.66364 0.64468 0.64945
Durson 2017-03-09 17:38 Naive Bayes perl self-made naive-bayes multinomial 0.66521 0.64534 0.65043
Durson 2017-03-09 17:18 Test 0.64469 0.62934 0.62802
Durson 2017-03-09 17:03 Yolo 0.64314 0.62835 0.62709
Durson 2017-03-09 16:23 Test 0.63938 0.62525 0.62369
Durson 2017-03-09 16:16 Test 0.64379 0.62851 0.62740
Durson 2017-03-09 15:51 Test 0.64366 0.62845 0.62752
Durson 2017-03-09 15:24 Test 0.64358 0.62858 0.62751
Durson 2017-03-09 14:53 Yolo 0.64420 0.62867 0.62784
Durson 2017-03-09 14:33 Test 0.54233 0.53734 0.53638
antystenes 2017-03-07 02:31 Haskell na resorach 0.66344 0.64638 0.64971
mmalisz 2017-03-02 23:49 I can see that I'll have to teach you how to be villains! naive-bayes self-made lisp regexp 0.56843 0.64794 0.65479
mmalisz 2017-03-02 23:35 Throw it at him, not me! 0.56843 0.64794 0.65375
mmalisz 2017-03-02 23:16 Back to old corpora 0.56843 0.64794 0.65450
mmalisz 2017-03-02 23:00 Change of preprocessing 0.56843 0.64794 0.65031
mmalisz 2017-03-02 21:48 Testing one two three 0.56843 0.64794 0.64935
Durson 2017-03-02 13:01 Test N/A N/A 0.62362
Durson 2017-03-02 12:22 Yolo N/A N/A 0.50288
Durson 2017-03-02 12:11 Yolo N/A N/A 0.50381
Durson 2017-03-02 12:08 Yolo N/A N/A 0.00000
mmalisz 2017-03-02 11:15 Now look at this net that I just found; when I say go... 0.56843 0.64794 0.65331
mmalisz 2017-03-02 10:54 Now look at this net that I just found 0.56843 N/A 0.65331
mmalisz 2017-03-02 10:44 Now look at this net 0.56843 N/A 0.34669
Durson 2017-03-02 08:19 Yolo N/A N/A 0.50288
Durson 2017-03-02 08:10 Yolo N/A N/A 0.50374
zp30615 2017-03-01 11:40 bayes3 naive-bayes self-made python multinomial 0.50157 0.50408 0.49981
zp30615 2017-03-01 11:04 bayes2 0.49982 0.50048 0.49941
antystenes 2017-03-01 07:13 Haskell 0.63596 0.61912 0.62383
germek 2017-02-28 23:51 something is no yes :X naive-bayes python self-made N/A N/A 0.63928
germek 2017-02-28 22:42 test N/A N/A N/A
germek 2017-02-28 21:47 something is no yes :X N/A N/A N/A
zp30615 2017-02-28 21:37 bayes1 N/A N/A N/A
zp30615 2017-02-28 21:13 bayes solution1 0.50033 0.50155 0.50085
siulkilulki 2017-02-28 19:32 naive Bayes, changed equation 0.66582 0.64740 0.65173
siulkilulki 2017-02-28 19:19 naive Bayes naive-bayes self-made python 0.66600 0.64745 0.65224
kaczla 2017-02-28 18:33 Solution python naive-bayes self-made 0.66092 0.64342 0.65071
Mario 2017-02-28 17:06 Solution 3 naive-bayes self-made java 0.66669 0.64823 0.65482
Mario 2017-02-28 16:44 Solution 2 N/A N/A 0.50006
antystenes 2017-02-28 15:35 Swag 0.61095 0.59919 0.60005
Durson 2017-02-28 15:04 Yolo N/A N/A 0.62326
Durson 2017-02-28 10:44 Yolo N/A N/A 0.62268
Mario 2017-02-27 23:17 Solution 1 N/A N/A N/A
Durson 2017-02-27 17:57 First N/A N/A 0.53074
Durson 2017-02-27 17:44 First N/A N/A 0.53376
Durson 2017-02-27 17:31 First N/A N/A 0.52212
[anonymised] 2017-02-27 17:22 my solution 1 python self-made stupid 0.50123 N/A 0.50068
zp30615 2017-02-27 16:23 regexPro stupid regexp self-made python 0.50033 0.50155 0.50085
tamazaki 2017-02-27 16:21 test regexp self-made python stupid 0.50241 0.50147 0.50155
antystenes 2017-02-24 08:31 Simple regexp solution 0.52190 0.51948 0.51246
[anonymised] 2017-02-21 16:58 test simple solution 0.52869 0.53085 0.52200
p/tlen 2017-01-26 10:08 KenLM + Vowpal Wabbit vowpal-wabbit 0.71473 0.70513 0.69379
Domagalsky 2017-01-08 20:31 Punct split v2 0.66486 0.65639 0.64260
Domagalsky 2017-01-08 15:16 KenLM punctuation.split 0.64351 0.63973 0.62437
Mieszko 2016-12-27 14:04 Train LM 3 grams & tokenize 0.99425 0.63660 0.64909
Mieszko 2016-12-27 14:00 LM 4grams female 0.99425 0.63660 0.62213
Mieszko 2016-12-27 13:55 Train LM improvement 0.99425 0.63660 0.53150
Mieszko 2016-12-27 13:46 Train LM improvement 0.99425 0.63660 0.58043
Mieszko 2016-12-27 10:22 KenLM devs & train LM & remove punct 0.99425 0.63660 0.65591
Mieszko 2016-12-27 10:17 KenLM devs & train LM 0.99425 0.63660 0.65591
Mieszko 2016-12-27 01:21 2 AM -> that's enough 0.98007 0.97880 0.64758
Mieszko 2016-12-27 01:16 2 AM -> that's enough 0.98007 0.97880 0.53478
Mieszko 2016-12-27 01:09 KenLM & dict v2 0.98007 0.97880 0.63106
Mieszko 2016-12-27 00:57 KenLM & dict 0.98007 0.97880 0.59847
Mieszko 2016-12-27 00:43 KenLM train LM 0.98007 0.97880 0.64909
Mieszko 2016-12-27 00:39 KenLM v4 0.98007 0.97880 0.64758
Mieszko 2016-12-27 00:32 KenLM v3 0.98007 0.97880 0.64758
Mieszko 2016-12-27 00:19 KenLM v2 0.98007 0.97880 0.62256
Mieszko 2016-12-27 00:04 KenLM v2 0.98007 0.97880 N/A
Mieszko 2016-12-26 23:58 KenLM v2 0.98007 0.97880 N/A
Mieszko 2016-12-26 23:54 KenLM v2 0.98007 0.97880 N/A
Mieszko 2016-12-26 23:25 KenLM v1 0.98007 0.97880 0.62129
RafciX 2016-12-07 09:31 by itself 0.51523 N/A 0.50463
RafciX 2016-12-07 09:24 v2 0.51523 N/A 0.51408
PioBec 2016-12-05 22:38 extra rules, information about each rule accuracy 0.50095 N/A N/A
PioBec 2016-12-05 21:59 silly mistake in adding stuff twice to out 0.50091 N/A N/A
PioBec 2016-12-05 21:50 Deadline met? N/A N/A N/A
Dominik Ziętkowski 2016-12-05 00:26 Womendict ver.3 0.51991 N/A 0.51516
Dominik Ziętkowski 2016-12-05 00:03 Womendict ver.2 0.52001 N/A 0.51494
Dominik Ziętkowski 2016-12-04 23:43 Womendict ver.2 0.51547 N/A 0.51278
Dominik Ziętkowski 2016-12-03 23:50 First submission - Womendict 0.51460 N/A 0.51278
Dominik Ziętkowski 2016-12-03 23:35 First submission - Womendict 0.51460 N/A N/A
Dominik Ziętkowski 2016-12-03 23:32 First submission - Womendict 0.51460 N/A N/A
KamilTrabka 2016-12-03 23:06 simple solution N/A 0.51687 0.51754
Dominik Ziętkowski 2016-12-03 18:15 First submission - Womendict N/A N/A N/A
KamilTrabka 2016-12-01 12:40 p3 0.49753 N/A 0.50251
KamilTrabka 2016-12-01 12:36 2nd attempt 0.50351 N/A 0.50251
KamilTrabka 2016-12-01 12:29 pp N/A N/A N/A
Martin 2016-12-01 02:45 kenlm first attempt 0.99640 0.99542 0.65047
Domagalsky 2016-11-30 14:38 Fixes in ./runD.py 0.52735 0.52362 0.52521
Domagalsky 2016-11-30 13:49 Push with files - dictionary version 0.52735 0.52362 0.52521
Domagalsky 2016-11-30 13:46 File test 0.52735 0.52362 0.52521
Domagalsky 2016-11-30 10:32 KenLM from Train* 0.64377 0.52363 0.62182
Domagalsky 2016-11-30 10:30 KenLM from Train 0.64377 0.52363 0.52520
Mieszko 2016-11-30 10:28 merged v2 0.54357 N/A 0.53326
Mieszko 2016-11-30 10:27 merged v1 N/A N/A 0.53326
Mieszko 2016-11-30 10:26 merged Mieszko & Maciej solution N/A N/A 0.53326
RafciX 2016-11-30 09:28 dict v1 0.51523 N/A 0.51408
Domagalsky 2016-11-30 09:26 Dictionary on Train 0.52735 0.52363 0.52520
Maciej 2016-11-28 16:23 Women - punctuation 0.53752 N/A 0.52835
Domagalsky 2016-11-28 08:57 KenLM 3gram 0.98726 0.98495 0.58469
Domagalsky 2016-11-28 07:57 KenLM 1st Try 0.98843 0.98664 0.58520
Domagalsky 2016-11-26 14:53 Best On test-A** 0.61496 0.53644 0.53855
Domagalsky 2016-11-26 14:48 Best on test-A 0.77005 0.73899 0.53038
Domagalsky 2016-11-26 14:35 Best on devs 0.77007 0.73899 0.53038
Domagalsky 2016-11-26 14:19 _ 0.63512 0.61667 0.52541
Domagalsky 2016-11-26 13:25 El Dictioannte finallo 0.67131 0.64660 0.53131
Domagalsky 2016-11-26 12:40 Dic v4 cleaning + tr improve 0.77005 0.73899 0.53040
Domagalsky 2016-11-26 10:50 Dic v3 0.61498 0.53645 0.53853
Domagalsky 2016-11-25 19:33 Dictionary version over 9000 small cleaning 0.60219 0.53713 0.52968
Domagalsky 2016-11-25 19:02 Dictionary version over 9000 dev-1 0.59657 0.52953 0.52968
Domagalsky 2016-11-25 18:58 Dictionary version over 9000 0.59657 N/A 0.52968
Maciej 2016-11-23 20:08 Women dictionary v3 0.53190 N/A 0.52321
Maciej 2016-11-23 19:56 Women dictionary v2 0.52793 N/A 0.52035
Maciej 2016-11-23 19:47 Women dictionary 0.52408 N/A 0.51827
Maciej 2016-11-23 19:31 Only men v3 0.51677 N/A 0.50993
Maciej 2016-11-23 17:41 Only men - bigger dictionary 0.51156 N/A 0.50724
RafciX 2016-11-23 14:06 words v1 N/A N/A N/A
Maciej 2016-11-22 23:15 Dictionary - only women 0.50000 N/A 0.50000
Maciej 2016-11-22 23:07 "First attempt - dictionary" 0.50867 N/A 0.50562
Maciej 2016-11-22 21:10 test submission (all F) 0.50000 N/A 0.50000
Mieszko 2016-11-22 18:29 female + male dict 0.53915 N/A 0.53150
Mieszko 2016-11-22 18:14 male + female dict N/A N/A 0.53001
Mieszko 2016-11-20 17:14 add swears 0.53714 N/A 0.53001
Mieszko 2016-11-20 17:07 add swears N/A N/A 0.53001
Mieszko 2016-11-20 09:42 dict v4 0.54134 N/A 0.53208
Mieszko 2016-11-19 23:15 dict v3 0.53816 N/A 0.52971
Mieszko 2016-11-19 22:18 improve dict v2 0.53785 N/A 0.52971
Mieszko 2016-11-19 22:13 improve dict 0.53785 N/A 0.52399
Mieszko 2016-11-19 20:07 Dictionary approach 0.52699 N/A 0.52399
p/tlen 2016-11-15 09:29 trivial baseline (only female) 0.50000 0.50000 0.50000
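
Many of the stronger "self-made python naive-bayes multinomial" entries above are multinomial naive Bayes over bag-of-words counts. The sketch below is a minimal illustration of that technique, not any submitter's actual code; the four toy Polish snippets and the M/F labels are invented for demonstration (real entries trained on the challenge's train set).

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """examples: list of (label, text). Returns class log-priors and
    per-class word log-likelihoods with add-one smoothing."""
    docs = defaultdict(list)
    for label, text in examples:
        docs[label].extend(text.lower().split())
    vocab = {w for words in docs.values() for w in words}
    priors = {c: math.log(sum(1 for l, _ in examples if l == c) / len(examples))
              for c in docs}
    loglik = {}
    for c, words in docs.items():
        counts = Counter(words)
        total = len(words) + len(vocab)          # add-one smoothing denominator
        loglik[c] = {w: math.log((counts[w] + 1) / total) for w in vocab}
    return priors, loglik, vocab

def classify(text, priors, loglik, vocab):
    """Pick the class maximizing log P(c) + sum of log P(w|c)."""
    scores = {c: priors[c] + sum(loglik[c][w] for w in text.lower().split()
                                 if w in vocab)
              for c in priors}
    return max(scores, key=scores.get)

# Toy corpus: gendered Polish verb forms (bylam/bylem, kupilam/kupilem).
train = [("F", "bylam wczoraj w kinie"), ("F", "kupilam nowa sukienke"),
         ("M", "bylem wczoraj na meczu"), ("M", "kupilem nowy rower")]
priors, loglik, vocab = train_nb(train)
print(classify("bylam w kinie", priors, loglik, vocab))  # -> F
```

In this challenge the grammatical gender marked on Polish past-tense verbs (the -łam/-łem endings) is exactly the kind of cue such word-count models pick up, which is consistent with simple naive Bayes entries scoring around 0.65 above.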
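
Several other entries are self-made logistic regression trained with plain SGD ("simple SGD with stupid annealing", "logistic regression N epoch"). A minimal sketch of that approach over binary bag-of-words features follows; the learning rate, epoch count, and toy data are illustrative assumptions, not the submitters' settings.

```python
import math
import random
from collections import defaultdict

def tokens(text):
    return text.lower().split()

def train_lr(examples, epochs=20, lr=0.5, seed=0):
    """examples: list of (label, text) with label 1 = F, 0 = M.
    Plain SGD on logistic loss, one weight per word plus a bias."""
    w = defaultdict(float)
    b = 0.0
    rng = random.Random(seed)
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)                      # reshuffle each epoch
        for y, text in data:
            z = b + sum(w[t] for t in tokens(text))
            err = 1.0 / (1.0 + math.exp(-z)) - y   # p - y
            b -= lr * err
            for t in tokens(text):
                w[t] -= lr * err
    return w, b

def predict(text, w, b):
    z = b + sum(w[t] for t in tokens(text))
    return "F" if z >= 0.0 else "M"

train = [(1, "bylam wczoraj w kinie"), (1, "kupilam nowa sukienke"),
         (0, "bylem wczoraj na meczu"), (0, "kupilem nowy rower")]
w, b = train_lr(train)
print(predict("kupilam nowa sukienke", w, b))  # -> F
```

Being discriminative, this directly optimizes the decision boundary instead of modeling word counts per class, which matches the leaderboard note that the discriminative model edged out the generative one (0.661 vs 0.652 on test-A for the same submitter).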