The first column of in.tsv contains the beginning of a dialogue from a book. Dialogues may involve any number of speakers and contain no annotations other than the utterances themselves (e.g. no narrator comments). The individual utterances in the beginning of the dialogue are separated by the [SEP] separator. Each subsequent column is a proposed continuation of the dialogue. A continuation may come from the same book or from a different one. There is exactly one correct continuation: the one that actually occurs in the book. The task is to return the correct continuation of the dialogue. [MAP]

Deadline: 2022-05-23 16:00:01.819979 UTC
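
For the dialogue-continuation task above, a row of in.tsv could be split roughly as follows. This is only a minimal sketch: the helper name `parse_row` and the sample row are hypothetical, and it assumes tab-separated columns with optional whitespace around [SEP], which the description does not specify.

```python
def parse_row(line: str):
    """Split one in.tsv row into dialogue turns and candidate continuations."""
    columns = line.rstrip("\n").split("\t")
    # First column: the dialogue opening, with turns separated by [SEP].
    turns = [turn.strip() for turn in columns[0].split("[SEP]")]
    # Remaining columns: proposed continuations (exactly one is correct).
    candidates = columns[1:]
    return turns, candidates


# Hypothetical row, invented for illustration only.
row = "Hello. [SEP] Who is there?\tIt is me.\tGoodbye.\n"
turns, candidates = parse_row(row)
```

A system would then rank `candidates` for each row, since the listed metric is MAP.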

Rainfall was measured in the years 1980-2020. The unit is the monthly rainfall total in millimetres. The list of weather stations is in the file dataset_splits.tsv. The weather stations are divided into 3 sets: train, dev-0, test-A. [RMSE]

Deadline: 2022-05-23 16:00:01.819979 UTC

Guess publication location for a piece of text [Haversine]
Guess a word in a gap. [PerplexityHashed]
Guess the time when an excerpt was published [RMSE]
challenging-america diachronic
Translate from Polish to English. [BLEU]
The aim of this task is to provide substrings of the requested document that represent clauses analogous (semantically and formally equivalent) to the provided examples from other documents. [Soft-F1.0]
Guess the publication year of a Polish text. [RMSE]
Give the probability that a text in Polish was written by a man. [Likelihood]

Deadline: 2022-05-23 16:00:01.8199 UTC

Classify objects and determine their positions on scanned first pages of newspaper images. [F1]
pol diachronic computer-vision
Do OCR (or OCR post-correction) and modernize Polish text. [CharMatch]
pol modernization diachronic
Do OCR of a Polish historical text (or post-correction of Tesseract OCR) [CharMatch]
pol diachronic
Key information extraction for scientific tables. Guess the <mask> token in texts based on table images and context from the text. [Accuracy]
Handwriting Recognition for Polish index cards. [WER]
[PerplexityHashed]
Extract the information from NDAs (Non-Disclosure Agreements) about the involved parties, jurisdiction, contract term, etc. [F1(UC)]
Handwriting detection for Polish index cards [F1]
Guess the masked date in a Wikipedia article. [RMSE-Against-Interval]
Correct Polish grammatical errors. [CharMatch]
Detect iconography in digitized historical publications. [F1]
Multi-label classification of long EUR-Lex documents. Assign zero, one, or more labels to each document. [F1]
Dataset from paper "Twitter Sentiment Classification using Distant Supervision" [RMSE]
Dataset from paper "Twitter Sentiment Classification using Distant Supervision" [PerplexityHashed]
Dataset from paper "Twitter Sentiment Classification using Distant Supervision" [Accuracy]
Predict the masked word given text and year. [PerplexityHashed]
Matching OCR from bookspines with books. [Accuracy]
Predict the year. Start Date: 1996-01-01; End Date: 2019-12-31 [RMSE]
Predict the headline category given headline text and year. Start Date: 1996-01-01; End Date: 2019-12-31 [Accuracy]
Classify Polish urban legend texts the way folklorists do. [Accuracy]
Guess whether a search engine snippet contains possibly criminal content. [F1.0]
Handwriting Recognition for Polish index cards [WER]
NER challenge for CoNLL-2003 English. Annotations were taken from University of Antwerp. The English data is a collection of news wire articles from the Reuters Corpus, RCV1. [BIO-F1]
The goal of this task is to post-process the output from the Tesseract OCR engine. Alternatively, it could be treated as an OCR, as images are also available. [CharMatch]
Guess a reddit date based on its text. All reddits come from the liverpoolfc subreddit. [RMSE]
Predict the date of a headline. Start Date: 2003-02-19; End Date: 2019-12-31 [RMSE]
Guess the publication year of an English text from the Chronicling America collection (1836-1922). [RMSE]
Guess a reddit date based on its text. This is a larger version with more reddits and subreddits (topics) than in https://gonito.net/challenge/guess-reddit-date. [MSE]
Guess a reddit date based on its text. [MSE]
Classify a reddit as either from the Skeptic subreddit or from one of the "paranormal" subreddits (Paranormal, UFOs, TheTruthIsHere, Ghosts, Glitch-in-the-Matrix, conspiracytheories). [Likelihood]
Extract information from Wikipedia articles (WikiReading dataset repackaged). [Mean/MultiLabel-F1.0]
Detect errors in English text. [Mean F0.5]

Deadline: 2021-02-26 10:00:00 UTC

Guess the price of a flat/house. [MAE]
Cluster weird stories by their types. [NMI]
Open Challenge for Correcting Errors of Speech Recognition Systems [WER]
Guess the prices of flats in Poznan. Edition 2018 [RMSE]
Guess whether a Polish article is about a ball sport. Evaluation metrics: Accuracy, Likelihood. [Likelihood]
Guess the sentiment for texts in English. [Likelihood]
Translate Wikipedia entries from English to Polish character by character. [WER]
Guess the sport discipline for a Polish article. [LikelihoodHashed]
Translate news articles from Czech into English. [BLEU]
Determine nested Named Entities in NKJP-compatible way, that is provide a series of labels with corresponding token indexes. [MultiLabel-F1.0]
Give the probability of a positive sentiment for a short Polish text. [LogLoss]
Guess the prices of flats in Poznan. Edition 2018 [RMSE]
Transform old Polish texts into modern spelling. [CharMatch]
Guess who survived the disaster. [Accuracy]
For a given Polish word, as used in a given year, give a diachronic equivalent (a.k.a. temporal word analogy) for a given year. [MAP]
Predict whether the mushroom is edible (e) or poisonous (p). [Accuracy]
Predict the price of a car. [RMSE]
Predict the price of flats in Poznań. [RMSE]
Translate news articles from German into English. [BLEU]
Give a probability distribution for a word in a gap in a corpus of Polish historic texts spanning 1814-2013. This is a challenge for (temporal) language models. [LogLossHashed]
Cluster Polish urban legend texts the way folklorists do. [NMI]
Clip a death notice in a Polish newspaper. [F1]
Guess if a given word is a correct Polish word in a given domain. Additionally, you have the information on reported frequency of the word in source texts. [F2.0]
Predict the price of flats in Poznań. Each entry in training data set is described by: Price, Rooms, SqrMeters, Floor, Location, Description. Evaluation metric is RMSE. [RMSE]
Translate subtitles from Russian into Polish. [BLEU]
Clip an obituary in a Polish newspaper. (This is only a sample challenge!) [ClippEU]
Guess the publication year of a Vietnamese text. The metric is root mean square error. [RMSE]
Translate Europarl proceedings from English into Polish. [BLEU]
Guess whether a text in Polish was written by a man or woman. [Accuracy]
Guess the publication year of a Polish text. [RMSE]