Evaluation Protocol

Evaluation will be performed using the Word Error Rate (WER), defined as the ratio of errors (insertions, substitutions, and deletions) in a transcript to the total number of words in the ground truth. Systems will be ranked by this metric. Ties between systems with the same WER will be resolved by the Character Error Rate (CER), and subsequently by submission time, with earlier submissions ranked higher. The contest organizers will evaluate all hypotheses; the evaluation tool they will use is presented on the Data page.
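
Below is a minimal sketch, in Python, of how WER and CER can be computed from a word-level and character-level edit distance. It is illustrative only; the official evaluation tool presented on the Data page may differ in details such as tokenization and text normalization.

    def edit_distance(ref, hyp):
        """Levenshtein distance between two sequences, counting
        insertions, deletions, and substitutions at cost 1 each."""
        d = list(range(len(hyp) + 1))  # distances against the empty reference prefix
        for i, r in enumerate(ref, start=1):
            prev_diag, d[0] = d[0], i
            for j, h in enumerate(hyp, start=1):
                prev_diag, d[j] = d[j], min(
                    d[j] + 1,              # deletion
                    d[j - 1] + 1,          # insertion
                    prev_diag + (r != h),  # substitution (free on a match)
                )
        return d[len(hyp)]

    def wer(reference, hypothesis):
        """Word Error Rate: word-level edit distance divided by the
        number of words in the ground-truth reference."""
        ref_words = reference.split()
        return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

    def cer(reference, hypothesis):
        """Character Error Rate: the same distance computed over characters."""
        return edit_distance(list(reference), list(hypothesis)) / len(reference)

    # One deletion against a six-word reference: WER = 1/6.
    print(wer("the cat sat on the mat", "the cat sat on mat"))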

Participants will not receive any feedback about their results on the test sets, since providing evaluation results while the competition is open could help participants tune their systems to the test data. Several submissions per participant will be allowed, and results for several systems per participant will also be allowed. A participant who submits results for several systems must describe how the systems differ from one another; the differences must be substantial, in order to guarantee that the systems are really different.
In any case, only the most recently submitted results will be considered when ranking the participants.
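
As a sketch of the full ranking rule (lowest WER first, ties broken by CER, and remaining ties by earlier submission time), the ordering could be expressed as below. The record structure, field names, and values are hypothetical, not part of any official tool.

    from datetime import datetime

    # Hypothetical submission records for illustration only.
    submissions = [
        {"system": "A", "wer": 0.142, "cer": 0.061, "time": datetime(2024, 5, 1, 9, 30)},
        {"system": "B", "wer": 0.142, "cer": 0.061, "time": datetime(2024, 5, 1, 8, 15)},
        {"system": "C", "wer": 0.139, "cer": 0.070, "time": datetime(2024, 5, 2, 10, 0)},
    ]

    # Sort ascending on (WER, CER, submission time): lower error wins,
    # and on a full tie the earlier submission is ranked higher.
    ranking = sorted(submissions, key=lambda s: (s["wer"], s["cer"], s["time"]))
    for rank, s in enumerate(ranking, start=1):
        print(rank, s["system"])  # C first (lowest WER); B beats A on earlier submission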
