2-dimensional |
classifier or regression on two variables |
|
5gram |
5-grams considered |
|
adam |
Adam optimizer |
|
algo |
non-trivial algorithm implemented |
|
alpha |
alpha phase |
orange |
analysis |
some extra analysis done, not just giving the test results |
|
attention |
attention used |
|
backoff |
backoff smoothing used |
|
bagging |
bagging/bootstrapping used |
|
base |
base model |
|
baseline |
baseline solution |
|
bernoulli |
Bernoulli (naive Bayes) model used |
|
bert |
BERT model used |
|
beta |
beta phase |
green |
better-than-no-model-baseline |
significantly better than stupid, no-model baseline (e.g. returning the majority class) |
|
bidirectional |
bidirectional |
|
bigram |
bigrams considered |
|
bilstm |
BiLSTM model used |
|
bow |
bag of words |
|
bpe |
Text segmented into BPE subword units |
|
c++ |
written (partially or fully) in C++ |
|
casemarker |
Special handling of case |
|
challenge-preparation |
another challenge prepared |
|
challenging-america |
Challenging America challenge |
|
char-n-grams |
character n-grams |
|
character-level |
Character-level |
|
chi-square |
chi-square test used |
|
classification |
classification challenge |
|
clm |
Causal Language Modeling |
|
cnn |
Convolutional Neural Network |
|
complement |
Complement variant |
|
computer-vision |
computer vision |
|
crf |
Conditional Random Fields used |
|
crm-114 |
CRM-114 used |
|
data-exploration |
data exploration/visualization |
|
deathmatch |
deathmatch |
|
deberta |
DeBERTa model used |
|
decision-tree |
decision tree used |
|
diachronic |
Diachronic/temporal challenge |
|
document-understanding |
challenge related to Document Understanding |
|
donut |
Donut model |
|
dumz20-challenge |
UMZ 2019/2020 (full-time studies) - competition |
|
eng |
data in English |
|
ensemble |
ensemble of models used |
|
existing |
some existing solution added |
|
fairseq |
Fairseq used |
|
fast-align |
Fast Align |
|
faster-r-cnn |
Faster R-CNN |
|
fasttext |
fasttext used |
|
feature-engineering |
used more advanced pre-processing, feature engineering etc. |
|
fine-tuned |
fine-tuned |
|
frage |
FRAGE used |
|
geval |
geval |
|
glove |
GloVe used |
|
golden-query |
"Golden" query used |
|
goodturing |
Good-Turing smoothing used |
|
gpt2 |
GPT-2 used |
|
gpt2-large |
GPT-2 large used |
|
gpt2-xlarge |
GPT-2 xlarge used |
|
gradient-descent |
gradient-descent |
|
graph |
extra graph |
|
gru |
Use GRU network |
|
hashing-trick |
Hashing trick used |
|
haskell |
written (partially or fully) in Haskell |
|
huggingface-transformers |
Huggingface Transformers |
|
hyperparam |
some hyperparameter modified |
|
improvement |
existing solution modified and improved as measured by the main metric |
|
interpolation |
interpolation |
|
inverted |
inverted |
|
irstlm |
irstlm |
|
java |
written (partially or fully) in Java |
|
just-inference |
Just test a model without training/fine-tuning |
|
k-means |
k-means or its variant used |
|
kenlm |
KenLM used |
|
kneser-ney |
Kneser-Ney smoothing used |
|
knn |
k nearest neighbors |
|
knowledge-based |
some external source of knowledge used |
|
language-tool |
LanguageTool used |
|
large |
large model |
|
left-to-right |
model working only left to right |
|
lemmatization |
lemmatization used |
|
linear-regression |
linear regression used |
|
lisp |
written (partially or fully) in Lisp |
|
lm |
a language model used |
|
lm-loss |
Loss of a language model used for predicting answers |
|
locally-weighted |
Locally weighted variant |
|
logistic-regression |
logistic regression used |
|
lstm |
LSTM network |
|
m2m-100 |
M2M-100 Facebook model |
|
mMiniLMv2 |
mMiniLMv2 |
|
marian |
Marian NMT used |
|
mbart |
MBart |
|
mbart-large-50 |
MBart-50 large |
|
medicine |
medicine-related challenge |
|
mert |
MERT (or equivalent) for Moses |
|
mlm |
Masked Language Modeling |
|
modernization |
diachronic modernization |
|
moses |
Moses MT |
|
ms-read-api |
Microsoft Read API |
|
ms-read-api-2021-04-12 |
Microsoft Read API model ver. 2021-04-12 |
|
ms-read-api-2021-09-30-preview |
Microsoft Read API model ver. 2021-09-30-preview |
|
multidimensional |
classifier or regression on many variables |
|
multinomial |
multinomial (naive Bayes) model used |
|
multiple-outs |
generated multiple outputs for geval |
|
n-grams |
n-grams used |
|
naive-bayes |
Naive Bayes Classifier used |
|
neural-network |
neural network used |
|
new-leader |
significantly better than the current top result |
|
no-fine-tuning |
no fine-tuning |
|
no-model-baseline |
significantly better than stupid, no-model baseline (e.g. returning the majority class) |
|
no-pretrained |
Trained from scratch |
|
no-temporal |
No temporal information |
|
non-zero |
non-zero value for the metric |
|
null-model |
null model baseline |
|
ocr |
OCR task |
|
oddballness |
oddballness used |
|
perl |
written (partially or fully) in Perl |
|
plusaplha |
plus-alpha (additive) smoothing used |
|
pol |
Polish |
|
postprocessing |
simple postprocessing |
|
pretrained |
pre-trained embeddings |
|
probabilities |
return probabilities not just classes |
|
probability |
probability rather than oddballness |
|
proto |
proto version (e.g. for Donut) |
|
python |
written (partially or fully) in Python 2/3 |
|
pytorch-nn |
PyTorch NN |
|
question-query |
Full question used as query |
|
r |
written (partially or fully) in R |
|
randlm |
RandLM |
|
random-forest |
Random Forest used |
|
ready-made |
Machine Learning framework/library/toolkit used; the algorithm was not implemented by the submitter |
|
ready-made-model |
Ready-made ML model |
|
regexp |
handcrafted regular expressions used |
|
regularization |
some regularization used |
|
right-to-left |
model working from right to left |
|
rnn |
Recurrent Neural Network |
|
roberta |
RoBERTa model |
|
roberta-base |
RoBERTa Base |
|
roberta-challam |
RoBERTa trained on Chronicling America |
|
roberta-pl |
Polish RoBERTa |
|
roberta-xlm |
Multilingual RoBERTa |
|
ruby |
written (partially or fully) in Ruby |
|
rule-based |
rule-based solution |
|
scala |
written (partially or fully) in Scala |
|
scikit-learn |
scikit-learn used |
|
self-made |
algorithm implemented by the submitter, no framework used |
|
sentence-piece |
SentencePiece used (unigram model) |
|
seq2seq |
Sequence To Sequence Modeling |
|
simple |
simple solution |
|
small |
small model |
|
solr |
Solr used |
|
standards-preparations |
how-to standards prepared |
|
stemming |
stemming used |
|
stop-words |
stop words handled in a special manner |
|
stupid |
simple, stupid rule-based solution |
|
subword-regularization |
subword regularization used |
|
supervised |
supervised |
|
svm |
Support Vector Machines |
|
t5 |
T5 models |
|
temporal |
temporal information taken into account |
|
tesseract |
Tesseract OCR |
|
tetragram |
tetragrams considered |
|
tf |
term frequency |
|
tf-idf |
tf-idf used |
|
timestamp |
timestamp considered |
|
tokenization |
special tokenization used |
|
torch |
(py)torch used |
|
train |
train an ML model yourself |
|
transformer |
Transformer model used |
|
transformer-decoder |
Transformer-decoder architecture used |
|
transformer-encoder |
Transformer-encoder architecture used |
|
transformer-encoder-decoder |
Transformer-encoder-decoder architecture used |
|
trigram |
trigrams considered |
|
truecasing |
truecasing used |
|
tuning |
tuning an existing model |
|
uedin |
Uedin corrector |
|
umz-2019-challenge |
see https://eduwiki.wmi.amu.edu.pl/pms/19umz#Dodatkowe_punkty_za_wygranie_wyzwa.2BAUQ- |
|
unigram |
only unigrams considered |
|
unks |
special handling of unknown words |
|
video |
video involved |
|
vowpal-wabbit |
Vowpal Wabbit used |
|
wikisource |
WikiSource used |
|
word-level |
Word-level |
|
word2vec |
Word2Vec |
|
wordnet |
some wordnet used |
|
xgboost |
xgboost used |
|
zumz-2019-challenge |
ZUMZ competition |
|