tag description
2-dimensional classifier or regression on two variables
algo non-trivial algorithm implemented
analysis some extra analysis done, not just giving the test results
bagging bagging/bootstrapping used
baseline baseline solution
bernoulli Bernoulli (naive Bayes) model used
bert BERT used
better-than-no-model-baseline significantly better than stupid, no-model baseline (e.g. returning the majority class)
bidirectional bidirectional
bigram bigrams considered
bilstm BiLSTM model used
bpe Text segmented into BPE subword units
c++ written (partially or fully) in C++
casemarker Special handling of case
challenge-preparation prepare other challenge
character-level Character-level
char-n-grams character n-grams
chi-square chi-square test used
cnn Convolutional Neural Network
complement Complement variant
crm-114 CRM-114 used
data-exploration data exploration/visualization
deathmatch deathmatch
decision-tree decision tree used
dumz20-challenge UMZ 2019/2020 (full-time studies) - competition
existing some existing solution added
fairseq Fairseq used
fasttext fastText used
feature-engineering used more advanced pre-processing, feature engineering etc.
fine-tuned fine-tuned
frage FRAGE used
geval geval
glove GloVe used
gpt2 GPT-2 used
gpt2-large GPT-2 large used
gpt2-xlarge GPT-2 xlarge used
gradient-descent gradient-descent
graph extra graph
gru Use GRU network
hashing-trick Hashing trick used
haskell written (partially or fully) in Haskell
hyperparam some hyperparameter modified
improvement existing solution modified and improved as measured by the main metric
inverted inverted
irstlm IRSTLM used
java written (partially or fully) in Java
kenlm KenLM used
k-means k-means or its variant used
knn k nearest neighbors
knowledge-based some external source of knowledge used
language-tool LanguageTool used
left-to-right only left to right
lemmatization lemmatization used
linear-regression linear regression used
lisp written (partially or fully) in Lisp
lm a language model used
locally-weighted Locally weighted variant
logistic-regression logistic regression used
lstm LSTM network
marian Marian NMT used
mert MERT (or equivalent) for Moses
moses Moses MT
multidimensional classifier or regression on many variables
multinomial multinomial (naive Bayes) model used
multiple-outs generated multiple outputs for geval
naive-bayes Naive Bayes Classifier used
neural-network neural network used
new-leader significantly better than the current top result
n-grams n-grams used
no-model-baseline stupid, no-model baseline (e.g. returning the majority class)
non-zero non-zero value for the metric
null-model null model baseline
oddballness oddballness used
perl written (partially or fully) in Perl
pretrained pre-trained embeddings
probabilities return probabilities not just classes
probability probability rather than oddballness
python written (partially or fully) in Python 2/3
pytorch-nn PyTorch NN
r written (partially or fully) in R
random-forest Random Forest used
ready-made Machine Learning framework/library/toolkit used, algorithm was not implemented by the submitter
ready-made-model Ready-made ML model
regexp handcrafted regular expressions used
regularization some regularization used
right-to-left model working from right to left
rnn Recurrent Neural Network
roberta RoBERTa model
roberta-pl Polish RoBERTa
roberta-xlm Multilingual RoBERTa
ruby written (partially or fully) in Ruby
rule-based rule-based solution
scala written (partially or fully) in Scala
scikit-learn scikit-learn used
self-made algorithm implemented by the submitter, no framework used
sentence-piece sentence pieces used (unigram)
simple simple solution
standards-preparations How to standards
stemming stemming used
stop-words stop words handled in a special manner
stupid simple, stupid rule-based solution
subword-regularization Use subword-regularization
supervised supervised
svm Support Vector Machines
temporal temporal information taken into account
tf term frequency
tf-idf tf-idf used
timestamp timestamp considered
tokenization special tokenization used
torch (Py)Torch used
train train yourself an ML model
transformer Transformer model used
trigram trigrams considered
truecasing truecasing used
uedin Uedin corrector
umz-2019-challenge see https://eduwiki.wmi.amu.edu.pl/pms/19umz#Dodatkowe_punkty_za_wygranie_wyzwa.2BAUQ-
unigram only unigrams considered
vowpal-wabbit Vowpal Wabbit used
word2vec Word2Vec
word-level Word-level
wordnet some wordnet used
xgboost XGBoost used
zumz-2019-challenge zumz competition