Results of the Quality Estimation Shared Task 2023
Note: because Codalab was unstable during the competition (submissions failing due to congested servers, server outages, etc.), the automatic scoring of predictions did not go as planned. As a result, the leaderboards of the "competition" phases are not representative and should not be considered. Instead:
- participants are listed only for the tasks and language pairs in which they officially declared to the organisers (via a form) their wish to participate; participants who did not fill in the form are therefore not considered for the official ranking of the shared task;
- for a given language pair, each participant was ranked by their highest-scoring submission (on the primary metric) for that language pair;
- only participants who officially entered and submitted to all language pairs of a given task were considered for the "Multilingual" ranking. In that case, we retained the highest macro-average score (as reported by our scoring programmes) over all of a participant's submissions that contain predictions for every language pair (a sketch of this procedure is shown below).
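For concreteness, the ranking rules above can be expressed as a short script. The following is a minimal sketch, not the organisers' actual scoring programme: the Submission layout, the function names, and the example scores are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Submission:
    participant: str
    scores: dict[str, float]  # language pair -> primary-metric score

def rank_language_pair(submissions: list[Submission], lp: str) -> list[tuple[str, float]]:
    """Rank participants by their best-scoring submission for one language pair."""
    best: dict[str, float] = {}
    for sub in submissions:
        if lp in sub.scores:
            best[sub.participant] = max(best.get(sub.participant, float("-inf")),
                                        sub.scores[lp])
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

def rank_multilingual(submissions: list[Submission], all_lps: set[str]) -> list[tuple[str, float]]:
    """Rank participants by their best macro-average over complete submissions,
    i.e. submissions covering every language pair of the task."""
    best: dict[str, float] = {}
    for sub in submissions:
        if all_lps <= sub.scores.keys():  # only complete submissions count
            macro = mean(sub.scores[lp] for lp in all_lps)
            best[sub.participant] = max(best.get(sub.participant, float("-inf")), macro)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

# Made-up scores, purely for illustration:
subs = [
    Submission("TeamA", {"en-de": 0.52, "zh-en": 0.41, "he-en": 0.37}),
    Submission("TeamB", {"en-de": 0.55}),
]
print(rank_language_pair(subs, "en-de"))                     # TeamB ranks first
print(rank_multilingual(subs, {"en-de", "zh-en", "he-en"}))  # only TeamA is eligible
```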
Task 1 -- Sentence-level
- Multilingual (Average over all LPs)
- English-German (MQM)
- Chinese-English (MQM)
- Hebrew-English (MQM)
- English-Marathi (DA)
- English-Hindi (DA)
- English-Tamil (DA)
- English-Telugu (DA)
- English-Gujarati (DA)
Task 1 -- Word-level
- Multilingual (Average over all LPs)
- English-German (MQM)
- Chinese-English (MQM)
- Hebrew-English (MQM)
- English-Marathi (PE)
- English-Farsi (PE)
Task 2 -- Error Span Detection
- Multilingual (Average over all LPs)
- English-German
- Chinese-English
- Hebrew-English