
Overview



The "AERFAI Contest: Multilingual Handwritten Text Recognition" is organized in the framework of the AERFAI competition.

Nowadays, handwritten text recognition (HTR) in multiple languages represents a significant challenge in scientific research. This competition aims to encourage competitors to create a single system capable of recognizing text in five different languages: English, Spanish, German, French, and Portuguese. HTR approaches have traditionally been oriented toward a single language; however, following the example of machine translation, this initiative aims to develop truly multilingual systems. This effort not only expands the technological possibilities of HTR but also promises to help overcome linguistic and cultural barriers in written communication.

The proposed contest consists of two tracks, with the aim of evaluating the performance of HTR systems under different conditions:
  1. T1 Restricted track: participants may only use the data provided by the organisers; systems must be trained from scratch.
  2. T2 Unrestricted track: participants are allowed to use external training data, including pre-trained models.
The goal of the competition is to obtain the lowest word error rate (WER) on the provided test data.

Prize Awards



Each contest track has a prize of 3 500 euros, distributed among the winners as follows:
    - 2 000 euros for 1st place;
    - 1 000 euros for 2nd place;
    -    500 euros for 3rd place.
Each team can only win one prize award per track.


Dataset



The proposed Multilingual HTR dataset for this competition consists of 20 000 lines in five different languages (English, Spanish, German, French, and Portuguese), balanced at 4 000 lines per language. The split is done by assigning 90% of the samples to the training set and 10% to the test set. The main statistics of the dataset are presented in Table 1.

 
Table 1. Multilingual HTR dataset statistics.

As mentioned, the data samples prepared for this competition are partitioned into a training and a test set as follows:
  1. 90% (18 000) of samples are used for training (Tr);
  2. 10% (2 000) of samples are used for testing (Ts).
The provided training data (Tr), prepared for this competition, consist of:
  1. Images of rendered training text samples.
  2. A file containing the ground-truth transcriptions.
Table 2 shows sample images of the Tr set; a minimal data-loading sketch is given after the table.

[Sample line images for English, Spanish, German, French, and Portuguese not reproduced here.]

Table 2. Multilingual HTR dataset sample images and ground-truth transcriptions in English, Spanish, German, French, and Portuguese.
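
For orientation only, the sketch below shows how such a training set could be loaded, assuming the ground-truth file lists one sample per line as an image identifier followed by its transcription. The file names, directory layout, and line format used here are assumptions; the actual format is described with the released data.

from pathlib import Path
from PIL import Image

# Hypothetical loading sketch: pairs each rendered line image with its transcription.
# File names, directory layout, and ground-truth format are assumptions, not the
# official distribution format.
def load_training_set(data_dir="multilingual_htr/train",
                      gt_file="multilingual_htr/train/transcriptions.txt"):
    samples = []
    with open(gt_file, encoding="utf-8") as f:
        for line in f:
            # Assumed format per line: "<image_id> <transcription>"
            image_id, transcription = line.rstrip("\n").split(" ", 1)
            samples.append((Path(data_dir) / f"{image_id}.png", transcription))
    return samples

samples = load_training_set()
print(f"{len(samples)} training lines loaded")          # expected: 18 000 lines (Tr)
first_image = Image.open(samples[0][0]).convert("L")    # grayscale line image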

The goal of the competition is to obtain the lowest WER on the test data set. Both tracks, T1 and T2, will be evaluated with the same test set (Ts). This test set will consist only of rendered images, which will be made available according to the competition schedule. In addition, the test set will be merged with several thousand additional images, so participants will not be able to tell which images belong to the actual test set. The ground truth associated with the Ts set will be published once the competition officially concludes.


Evaluation Protocol



Evaluation will be performed using the WER, which measures the ratio of errors (insertions, substitutions, and deletions) in a transcript to the total number of words in the ground truth. Systems will be ranked based on this metric. In case of a tie between systems with the same WER, the tie will be resolved by the Character Error Rate (CER) and subsequently by the time of submission of the results: earlier submissions will be ranked higher. The contest organizers will evaluate the submitted hypotheses. The evaluation tool the organizers will use is presented on the Data page.
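
To make the metric concrete, the sketch below computes WER and CER from paired reference and hypothesis transcriptions using a standard edit distance. It is only an illustration of the definition above, not the official evaluation tool, and the example strings are invented.

# Illustrative WER/CER computation via Levenshtein edit distance.
# Not the official evaluation tool; the example strings are invented.
def edit_distance(ref, hyp):
    # Minimum number of insertions, deletions, and substitutions turning hyp into ref.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, start=1):
            prev, d[j] = d[j], min(d[j] + 1,          # deletion
                                   d[j - 1] + 1,      # insertion
                                   prev + (r != h))   # substitution (free if equal)
    return d[-1]

def error_rate(references, hypotheses, tokenize):
    # Total edit distance divided by the total length of the references.
    errors = total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_t, hyp_t = tokenize(ref), tokenize(hyp)
        errors += edit_distance(ref_t, hyp_t)
        total += len(ref_t)
    return errors / total

refs = ["el gato duerme", "the cat sleeps"]      # ground-truth line transcriptions
hyps = ["el gato duerme", "the cat slept"]       # system hypotheses
print(f"WER = {error_rate(refs, hyps, str.split):.3f}")   # word level
print(f"CER = {error_rate(refs, hyps, list):.3f}")        # character level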

The participants will not receive any feedback about their results on the test set, since providing evaluation results while the competition is open could allow participants to fit their systems to the test data. Several submissions per participant will be allowed, and results for several systems per participant will also be allowed. A participant who submits results for several systems has to describe the differences between them; the differences must be substantial in order to guarantee that the systems are really different.
In any case, only the last submitted results will be considered for ranking the participants.


Contest Schedule



The schedule for the competition is as follows:
  1. January 12, 2024: The contest officially starts, with participant registration and training materials available on the competition web page.
  2. February 28, 2024: Registration deadline, after which no more participants will be accepted.
  3. March 8, 2024: Participants will receive the test data and the means to submit their final results.
  4. March 15, 2024: Deadline for submitting system results.
  5. Winners and the final ranking of all teams will be publicly announced at CEDI 2024.



Registration and Access to Data



To sign up for the competition, complete our registration form, providing the following information:
  • Group name and acronym
  • Institution
  • Participants and e-mail addresses. One participant must be an AERFAI member; please indicate in brackets which participant holds this membership.
  • Contact person
A password will be given to each registered participant, which will grant access to download the data.


Tutorial


 
For those unfamiliar with text recognition, we've prepared a tutorial on GitHub with training data to design a model, train it, and perform inference using the PyLaia toolkit.

Tutorial: https://github.com/dparres/AERFAI-Contest-Multilingual-Handwritten-Text-Recognition/tree/main/tutorial
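
The tutorial covers the full PyLaia workflow, so the snippet below is only a rough, hypothetical orientation for readers new to the field: it sketches the kind of convolutional-recurrent model trained with a CTC loss that line-level HTR toolkits such as PyLaia implement. It is not the tutorial code and does not use PyLaia's API; all shapes and hyper-parameters are illustrative.

# Minimal, hypothetical sketch of a CRNN + CTC line recognizer (PyTorch).
# Not the PyLaia tutorial code; shapes and hyper-parameters are illustrative only.
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, num_classes, img_height=64):
        super().__init__()
        # Convolutional feature extractor over grayscale line images.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat_dim = 32 * (img_height // 4)
        # Bidirectional LSTM over the width (time) axis.
        self.rnn = nn.LSTM(feat_dim, 128, num_layers=2,
                           bidirectional=True, batch_first=True)
        # Per-frame class scores; index 0 is reserved for the CTC blank.
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):                               # x: (batch, 1, H, W)
        f = self.conv(x)                                # (batch, C, H/4, W/4)
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # (batch, time, features)
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(-1)             # (batch, time, classes)

# Toy training step with CTC loss (blank index 0) on synthetic data.
model = CRNN(num_classes=80)
images = torch.randn(4, 1, 64, 256)                 # 4 synthetic line images
log_probs = model(images).permute(1, 0, 2)          # CTC expects (time, batch, classes)
targets = torch.randint(1, 80, (4, 20))             # dummy character-index targets
input_lengths = torch.full((4,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((4,), 20, dtype=torch.long)
loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)
loss.backward()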

Acknowledgements



Organized by Daniel Parres Montoya, PhD student at the Pattern Recognition and Human Language Technologies research centre, in the framework of the AERFAI competition. This competition has been financed by AERFAI.


