# Tags

tag | description |
---|---|
2-dimensional | classifier or regression on two variables |
algo | non-trivial algorithm implemented |
analysis | some extra analysis done, not just giving the test results |
bagging | bagging/bootstrapping used |
baseline | baseline solution |
bernoulli | Bernoulli (naive Bayes) model used |
bert | BERT used |
better-than-no-model-baseline | significantly better than a stupid, no-model baseline (e.g. returning the majority class) |
bidirectional | bidirectional |
bigram | bigrams considered |
bilstm | BiLSTM model used |
bpe | Text segmented into BPE subword units |
c++ | written (partially or fully) in C++ |
casemarker | Special handling of case |
challenge-preparation | preparation of another challenge |
character-level | Character-level |
char-n-grams | character n-grams |
chi-square | chi-square test used |
cnn | Convolutional Neural Network |
complement | Complement variant |
crm-114 | CRM-114 used |
data-exploration | data exploration/visualization |
deathmatch | deathmatch |
decision-tree | decision tree used |
dumz20-challenge | UMZ 2019/2020 (full-time studies) - competition |
existing | some existing solution added |
fairseq | Fairseq used |
fasttext | fastText used |
feature-engineering | more advanced pre-processing, feature engineering, etc. used |
fine-tuned | fine-tuned |
frage | FRAGE used |
geval | geval |
glove | GloVe used |
gpt2 | GPT-2 used |
gpt2-large | GPT-2 large used |
gpt2-xlarge | GPT-2 xlarge used |
gradient-descent | gradient descent |
graph | extra graph |
gru | GRU network used |
hashing-trick | Hashing trick used |
haskell | written (partially or fully) in Haskell |
hyperparam | some hyperparameter modified |
improvement | existing solution modified and improved as measured by the main metric |
inverted | inverted |
irstlm | IRSTLM used |
java | written (partially or fully) in Java |
kenlm | KenLM used |
k-means | k-means or its variant used |
knn | k nearest neighbors |
knowledge-based | some external source of knowledge used |
language-tool | LanguageTool used |
left-to-right | only left to right |
lemmatization | lemmatization used |
linear-regression | linear regression used |
lisp | written (partially or fully) in Lisp |
lm | a language model used |
locally-weighted | Locally weighted variant |
logistic-regression | logistic regression used |
lstm | LSTM network |
marian | Marian NMT used |
mert | MERT (or equivalent) for Moses |
moses | Moses MT |
multidimensional | classifier or regression on many variables |
multinomial | multinomial (naive Bayes) model used |
multiple-outs | generated multiple outputs for geval |
naive-bayes | Naive Bayes Classifier used |
neural-network | neural network used |
new-leader | significantly better than the current top result |
n-grams | n-grams used |
no-model-baseline | significantly better than a stupid, no-model baseline (e.g. returning the majority class) |
non-zero | non-zero value for the metric |
null-model | null model baseline |
oddballness | oddballness used |
perl | written (partially or fully) in Perl |
pretrained | pre-trained embeddings |
probabilities | return probabilities, not just classes |
probability | probability rather than oddballness |
python | written (partially or fully) in Python 2/3 |
pytorch-nn | PyTorch NN |
r | written (partially or fully) in R |
random-forest | Random Forest used |
ready-made | Machine Learning framework/library/toolkit used, algorithm was not implemented by the submitter |
ready-made-model | Ready-made ML model |
regexp | handcrafted regular expressions used |
regularization | some regularization used |
right-to-left | model working from right to left |
rnn | Recurrent Neural Network |
roberta | RoBERTa model |
roberta-pl | Polish RoBERTa |
roberta-xlm | Multilingual RoBERTa |
ruby | written (partially or fully) in Ruby |
rule-based | rule-based solution |
scala | written (partially or fully) in Scala |
scikit-learn | scikit-learn used |
self-made | algorithm implemented by the submitter, no framework used |
sentence-piece | sentence pieces used (unigram) |
simple | simple solution |
standards-preparations | How to standards |
stemming | stemming used |
stop-words | stop words handled in a special manner |
stupid | simple, stupid rule-based solution |
subword-regularization | subword regularization used |
supervised | supervised |
svm | Support Vector Machines |
temporal | temporal information taken into account |
tf | term frequency |
tf-idf | tf-idf used |
timestamp | timestamp considered |
tokenization | special tokenization used |
torch | (Py)Torch used |
train | train yourself an ML model |
transformer | Transformer model used |
trigram | trigrams considered |
truecasing | truecasing used |
uedin | Uedin corrector |
umz-2019-challenge | see https://eduwiki.wmi.amu.edu.pl/pms/19umz#Dodatkowe_punkty_za_wygranie_wyzwa.2BAUQ- |
unigram | only unigrams considered |
vowpal-wabbit | Vowpal Wabbit used |
word2vec | Word2Vec |
word-level | Word-level |
wordnet | some wordnet used |
xgboost | XGBoost used |
zumz-2019-challenge | zumz competition |
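
The table above is a plain mapping from tag name to description. Below is a minimal sketch of loading such a pipe-delimited table into a dictionary for lookups. It assumes the `tag | description |` row layout used here. The names `parse_tag_table` and `SAMPLE` are hypothetical and are not part of any tool mentioned in this list.

```python
def parse_tag_table(text):
    """Parse 'tag | description |' rows into a {tag: description} dict."""
    tags = {}
    for raw in text.splitlines():
        line = raw.strip().strip("|")
        if "|" not in line:
            continue  # blank lines, headings, or other non-table text
        tag, description = (part.strip() for part in line.split("|", 1))
        if not tag or set(tag) <= set("-: "):
            continue  # the '---|---' separator row
        if tag.lower() == "tag" and description.lower() == "description":
            continue  # the header row
        tags[tag] = description
    return tags


# A few rows copied from the table above (assumed layout).
SAMPLE = """\
tag | description |
---|---|
bpe | Text segmented into BPE subword units |

kenlm | KenLM used |
tf-idf | tf-idf used |
"""

tags = parse_tag_table(SAMPLE)
for name in sorted(tags):
    print(f"{name}: {tags[name]}")
```

Blank lines between rows are skipped, so the sketch reads the table whether or not the rows are contiguous.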