Tags

tag	description	color
2-dimensional	classifier or regression on two variables
5gram	5gram
adam	Adam optimizer
algo	non-trivial algorithm implemented
alpha	alpha phase	orange
analysis	some extra analysis done, not just giving the test results
attention	attention used
backoff
bagging	bagging/bootstrapping used
base	base model
baseline	baseline solution
bernoulli	Bernoulli (naive Bayes) model used
bert	bert
beta	beta phase	green
better-than-no-model-baseline	significantly better than stupid, no-model baseline (e.g. returning the majority class)
bidirectional	bidirectional
bigram	bigrams considered
bilstm	BiLSTM model used
bow	bag of words
bpe	Text segmented into BPE subword units
c++	written (partially or fully) in C++
casemarker	Special handling of case
challenge-preparation	preparing another challenge
challenging-america	Challenging America challenge
char-n-grams	character n-grams
character-level	Character-level
chi-square	chi-square test used
classification	classification challenge
clm	Causal Language Modeling
cnn	Convolutional Neural Network
complement	Complement variant
computer-vision	computer vision
crf
crm-114	CRM-114 used
data-exploration	data exploration/visualization
deathmatch	deathmatch
deberta	deberta
decision-tree	decision tree used
diachronic	Diachronic/temporal challenge
document-understanding	challenge related to Document Understanding
donut	Donut model
dumz20-challenge	UMZ 2019/2020 (full-time studies) competition
eng	data in English
ensemble
existing	some existing solution added
fairseq	Fairseq used
fast-align	Fast Align
faster-r-cnn	Faster R-CNN
fasttext	fasttext used
feature-engineering	used more advanced pre-processing, feature engineering etc.
fine-tuned	fine-tuned
frage	FRAGE used
geval	geval
glove	GloVe used
golden-query	"Golden" query used
goodturing
gpt2	GPT-2 used
gpt2-large	GPT-2 large used
gpt2-xlarge	GPT-2 xlarge used
gradient-descent	gradient-descent
graph	extra graph
gru	Use GRU network
hashing-trick	Hashing trick used
haskell	written (partially or fully) in Haskell
huggingface-transformers	Huggingface Transformers
hyperparam	some hyperparameter modified
improvement	existing solution modified and improved as measured by the main metric
interpolation	interpolation
inverted	inverted
irstlm	irstlm
java	written (partially or fully) in Java
just-inference	Just test a model without training/fine-tuning
k-means	k-means or its variant used
kenlm	KenLM used
kneser-ney	Use Kneser-Ney
knn	k nearest neighbors
knowledge-based	some external source of knowledge used
language-tool	LanguageTool used
large	large model
left-to-right	only left to right
lemmatization	lemmatization used
linear-regression	linear regression used
lisp	written (partially or fully) in Lisp
lm	a language model used
lm-loss	language model loss used for predicting answers
locally-weighted	Locally weighted variant
logistic-regression	logistic regression used
lstm	LSTM network
m2m-100	Facebook M2M-100 model
mMiniLMv2	mMiniLMv2
marian	Marian NMT used
mbart	MBart
mbart-large-50	MBart-50 large
medicine	medicine-related challenge
mert	MERT (or equivalent) for Moses
mlm	Masked Language Modeling
modernization	diachronic modernization
moses	Moses MT
ms-read-api	Microsoft Read API
ms-read-api-2021-04-12	Microsoft Read API model ver. 2021-04-12
ms-read-api-2021-09-30-preview	Microsoft Read API model ver. 2021-09-30-preview
multidimensional	classifier or regression on many variables
multinomial	multinomial (naive Bayes) model used
multiple-outs	generated multiple outputs for geval
n-grams	n-grams used
naive-bayes	Naive Bayes Classifier used
neural-network	neural network used
new-leader	significantly better than the current top result
no-fine-tuning	no fine-tuning
no-model-baseline	significantly better than stupid, no-model baseline (e.g. returning the majority class)
no-pretrained	Trained from scratch
no-temporal	No temporal information
non-zero	non-zero value for the metric
null-model	null model baseline
ocr	OCR task
oddballness	oddballness used
perl	written (partially or fully) in Perl
plusaplha
pol	Polish
postprocessing	simple postprocessing
pretrained	pre-trained embeddings
probabilities	return probabilities not just classes
probability	probability rather than oddballness
proto	proto version (e.g. for Donut)
python	written (partially or fully) in Python 2/3
pytorch-nn	PyTorch NN
question-query	Full question used as query
r	written (partially or fully) in R
randlm	RandLM
random-forest	Random Forest used
ready-made	Machine Learning framework/library/toolkit used, algorithm was not implemented by the submitter
ready-made-model	Ready-made ML model
regexp	handcrafted regular expressions used
regularization	some regularization used
right-to-left	model working from right to left
rnn	Recurrent Neural Network
roberta	RoBERTa model
roberta-base	RoBERTa Base
roberta-challam	RoBERTa trained on Chronicling America
roberta-pl	Polish RoBERTa
roberta-xlm	Multilingual RoBERTa
ruby	written (partially or fully) in Ruby
rule-based	rule-based solution
scala	written (partially or fully) in Scala
scikit-learn	scikit-learn used
self-made	algorithm implemented by the submitter, no framework used
sentence-piece	sentence pieces used (unigram)
seq2seq	Sequence To Sequence Modeling
simple	simple solution
small	small model
solr	Solr used
standards-preparations	How to standards
stemming	stemming used
stop-words	stop words handled in a special manner
stupid	simple, stupid rule-based solution
subword-regularization	Use subword-regularization
supervised	supervised
svm	Support Vector Machines
t5	T5 models
temporal	temporal information taken into account
tesseract	Tesseract OCR
tetragram	tetragrams
tf	term frequency
tf-idf	tf-idf used
timestamp	timestamp considered
tokenization	special tokenization used
torch	(py)torch used
train	ML model trained by the submitter
transformer	Transformer model used
transformer-decoder	Transformer-decoder architecture used
transformer-encoder	Transformer-encoder architecture used
transformer-encoder-decoder	Transformer-encoder-decoder architecture used
trigram	trigrams considered
truecasing	truecasing used
tuning	tuning an existing model
uedin	Uedin corrector
umz-2019-challenge	see https://eduwiki.wmi.amu.edu.pl/pms/19umz#Dodatkowe_punkty_za_wygranie_wyzwa.2BAUQ-
unigram	only unigrams considered
unks	special handling of unknown words
video	video involved
vowpal-wabbit	Vowpal Wabbit used
wikisource	WikiSource used
word-level	Word-level
word2vec	Word2Vec
wordnet	some wordnet used
xgboost	xgboost used
zumz-2019-challenge	zumz competition