ANNOUNCEMENTS
| Please register your team to participate in the shared task: forms.gle/zd9AeZem32RMEa7A9 [REGISTER NOW!] |
📢 April 8, 2026 - Website released, and the task is announced!
| We are still updating this page. Please check back regularly! |
OVERVIEW AND TASK DESCRIPTION
Arabic (عربي) is one of the most widely spoken languages in the world, serving as a primary means of communication across multiple countries and playing a crucial role in global domains such as education, commerce, culture, and international relations. Developing robust and accurate Arabic machine translation (MT) systems is therefore essential for enabling cross-lingual communication, improving information accessibility, and fostering digital inclusion for millions of speakers.
In recent years, machine translation has achieved significant progress due to advances in multilingual modeling and transfer learning. These innovations have expanded MT capabilities beyond high-resource languages, encouraging broader coverage of linguistically and geographically diverse languages. However, many language pairs involving Arabic still suffer from limited availability of parallel corpora, which remains a key bottleneck in building high-quality translation systems. This challenge is particularly evident in low-resource scenarios, where the lack of sufficient training data restricts the performance of conventional MT approaches. Consequently, developing efficient MT systems that can operate effectively with relatively small datasets is of great importance.
To address this need, this shared task focuses on Arabic-centric low-resource translation, covering both directions (into and from Arabic) between Arabic and five languages: the Asian languages Bangla, Hindi, Indonesian, and Urdu, plus English.
The Low-Resource Arabic-Asian Language Translation task comprises two categories of sub-tasks:
Task 1: Translation into Arabic
- Sub-Task 1A: English → Arabic
- Sub-Task 1B: Hindi → Arabic
- Sub-Task 1C: Bangla → Arabic
- Sub-Task 1D: Indonesian → Arabic
- Sub-Task 1E: Urdu → Arabic
Task 2: Translation from Arabic
- Sub-Task 2A: Arabic → English
- Sub-Task 2B: Arabic → Hindi
- Sub-Task 2C: Arabic → Bangla
- Sub-Task 2D: Arabic → Indonesian
- Sub-Task 2E: Arabic → Urdu
GOAL
The primary objective of this task is to develop machine translation (MT) systems capable of delivering high-quality translations under limited data conditions. Participants are encouraged to explore and experiment with the following approaches:
- Transfer Learning: Adapting knowledge from models trained on high-resource language pairs to low-resource language pairs.
- Multilingual Approaches: Examining the impact of cross-lingual transfer on improving performance for low-resource language pairs.
- Monolingual Data Utilization: Leveraging monolingual corpora to enhance translation quality when parallel data is limited.
- Innovative Techniques: Designing and applying novel methods specifically suited to low-resource translation scenarios.
IMPORTANT DATES
| April 8, 2026 | Website released, and the task is announced! |
| April 8, 2026 | Team registration opens |
| May 20, 2026 | Team registration closes (extended) |
| May 16, 2026 | Release of training, development/validation, and DevTest data (registered participants only) |
| June 1, 2026 | Beginning of the evaluation cycle (Challenge Test Set release and run submission) |
| June 7, 2026 | End of the evaluation cycle |
| June 20, 2026 | Results declared to individual teams |
| In line with WMT26 | System paper submission |
| November 2026 | Under EMNLP Conference |
DATA
Releasing Soon: [DOWNLOAD 2026 TASK DATA]
TEST DATA (Challenge Test Set)
TBA
SYSTEM SUBMISSION GUIDELINES
📝 Test Set Output Submission Instructions
For each language pair (e.g., English → Arabic), you may submit up to:
- Primary system, which is either:
  - A constrained system (uses only official data), or
  - A system that uses additional monolingual resources, pretrained models, etc., depending on your setup (these resources should be publicly available).
- Contrastive systems (optional; up to 2 submissions allowed; may use external or additional data beyond the task set).
🚨 Maximum: 3 submissions per language pair.
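The limits above can be checked before anything is packaged. The following is a minimal sketch, not an official validator: the function name and the `(sub_task, submission_type)` input format are hypothetical, and the limit of one primary system per language pair is an assumption inferred from the maximum of 3 submissions minus the 2 allowed contrastive runs.

```python
from collections import Counter

# Limits as read from the guidelines above. MAX_PRIMARY = 1 is an inference
# (3 submissions per pair minus 2 contrastive runs), not a stated rule.
MAX_PRIMARY = 1
MAX_CONTRASTIVE = 2
MAX_TOTAL = 3


def check_submission_plan(plan):
    """Return a list of problems; an empty list means the plan fits the limits.

    `plan` is a hypothetical list of (sub_task, submission_type) tuples,
    for example ("1A", "primary").
    """
    problems = []
    per_pair = Counter()
    per_pair_type = Counter()

    for sub_task, kind in plan:
        if kind not in ("primary", "contrastive"):
            problems.append(f"{sub_task}: unknown submission type {kind!r}")
            continue
        per_pair[sub_task] += 1
        per_pair_type[(sub_task, kind)] += 1

    for sub_task, total in sorted(per_pair.items()):
        if total > MAX_TOTAL:
            problems.append(f"{sub_task}: {total} submissions exceed the maximum of {MAX_TOTAL}")
        if per_pair_type[(sub_task, "primary")] > MAX_PRIMARY:
            problems.append(f"{sub_task}: more than {MAX_PRIMARY} primary system")
        if per_pair_type[(sub_task, "contrastive")] > MAX_CONTRASTIVE:
            problems.append(f"{sub_task}: more than {MAX_CONTRASTIVE} contrastive systems")
    return problems
```

A plan with one primary and two contrastive runs for Sub-Task 1A yields an empty list. Any extra run for the same pair produces at least one message.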
📄 Submission File Naming Convention
Each submission file must be named using this format:
<TEAM_NAME>_<SUBMISSION_TYPE>_<SUB-Task No.>.txt
Where:
- TEAM_NAME: your registered team name (e.g., BAHASH-AI)
- SUBMISSION_TYPE: one of primary or contrastive
- SUB-Task No.: the sub-task identifier, formatted like Sub-Task-1A for English → Arabic
🔍 Example: a primary system submission from team BAHASH-AI for English → Arabic is named:
BAHASH-AI_primary_Sub-Task-1A.txt
Repeat this pattern for all your submissions.
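Generating file names programmatically helps avoid typos across ten sub-tasks. The following is a minimal sketch of the naming convention above; the function name and the `SUB_TASKS` lookup table are hypothetical helpers, not part of any official tooling. The convention as stated does not say how to distinguish two contrastive runs for the same sub-task, so the sketch does not attempt to.

```python
# Sub-task identifiers and directions, copied from the task description.
SUB_TASKS = {
    "1A": "English → Arabic",
    "1B": "Hindi → Arabic",
    "1C": "Bangla → Arabic",
    "1D": "Indonesian → Arabic",
    "1E": "Urdu → Arabic",
    "2A": "Arabic → English",
    "2B": "Arabic → Hindi",
    "2C": "Arabic → Bangla",
    "2D": "Arabic → Indonesian",
    "2E": "Arabic → Urdu",
}
SUBMISSION_TYPES = ("primary", "contrastive")


def submission_filename(team_name, submission_type, sub_task):
    """Build <TEAM_NAME>_<SUBMISSION_TYPE>_Sub-Task-<ID>.txt for one output file."""
    if not team_name:
        raise ValueError("team_name must not be empty")
    if submission_type not in SUBMISSION_TYPES:
        raise ValueError(f"submission_type must be one of {SUBMISSION_TYPES}")
    if sub_task not in SUB_TASKS:
        raise ValueError(f"unknown sub-task {sub_task!r}")
    return f"{team_name}_{submission_type}_Sub-Task-{sub_task}.txt"


if __name__ == "__main__":
    print(submission_filename("BAHASH-AI", "primary", "1A"))
```

Run as a script, this prints the file name given in the example above for team BAHASH-AI.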
📄 System Description File (Mandatory!)
You must include a brief abstract describing your system in one of the following formats:
- .pdf
- .doc/.docx
| This file is mandatory. Submissions without it will not be published in the final results. |
📦 Packaging Your Submission
Place the following files in a single .zip archive:
- All your system output files
- The abstract system description file
Name the zip file using your team name:
<TEAM_NAME>.zip
📌 Example: for team BAHASH-AI, name the file:
BAHASH-AI.zip
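The packaging step can be scripted with the Python standard library. The following is a minimal sketch, assuming all files are stored at the top level of the archive; the guidelines do not specify an internal folder layout, and the function name and parameters are hypothetical.

```python
import zipfile
from pathlib import Path

# Formats accepted for the system description, per the guidelines above.
DESCRIPTION_SUFFIXES = {".pdf", ".doc", ".docx"}


def build_submission_zip(team_name, output_files, description_file, dest_dir="."):
    """Write <TEAM_NAME>.zip containing the output files and the description file."""
    description = Path(description_file)
    if description.suffix.lower() not in DESCRIPTION_SUFFIXES:
        raise ValueError("the system description must be a .pdf, .doc, or .docx file")

    outputs = [Path(p) for p in output_files]
    if not outputs:
        raise ValueError("at least one system output file is required")

    archive_path = Path(dest_dir) / f"{team_name}.zip"
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in outputs + [description]:
            # Store each file flat, without its directory path (an assumption).
            archive.write(path, arcname=path.name)
    return archive_path
```

For team BAHASH-AI, calling `build_submission_zip("BAHASH-AI", [...], "description.pdf")` would produce BAHASH-AI.zip in the current directory.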
📄 Email Submission
Send your .zip file to the following address:
Email: asianmt.wmt@gmail.com
Subject Line:
<TEAM_NAME>: Submission File for Shared Task: Low-Resource Arabic-Asian Language Translation
Example Subject Line:
BAHASH-AI: Submission File for Shared Task: Low-Resource Arabic-Asian Language Translation
Thank you for participating in the shared task! We look forward to your submissions and system descriptions.
EVALUATION
Systems will be assessed using both automatic evaluation metrics (BLEU, ChrF, TER, COMET, etc.) and human evaluation by native speakers for a comprehensive assessment of translation quality.
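To give a feel for what a character-level metric such as ChrF measures, the following is a simplified, sentence-level sketch in pure Python. It is illustrative only: it is not the official scorer, it will not reproduce the numbers of standard evaluation tools, and the function names, the whitespace stripping, and the defaults (`max_n=6`, `beta=2.0`) are assumptions made for this example. The organizers have not specified which tools or settings will be used.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams after removing whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified character n-gram F-score in [0, 100]; not the official scorer."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip n-gram orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))

    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: with beta = 2, recall is weighted more heavily than precision.
    return 100.0 * (1 + beta**2) * p * r / (beta**2 * p + r)
```

A hypothesis identical to its reference scores 100, and one that shares no characters with it scores 0. Because the score is computed over characters rather than words, partial matches inside morphologically rich words still earn credit.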
CONTACT
PAPER SUBMISSION
In line with WMT26
Your system paper submission should be prepared according to the WMT instructions and uploaded to START before TBA, 2026 (WMT MAIN PAGE).
ORGANIZERS
- Sahinur Rahman Laskar, UPES, Dehradun, India
- Firoj Alam, Qatar Computing Research Institute, Doha, Qatar
- Bishwaraj Paul, Bahash-AI, Bahash Private Limited, Silchar, India
- Irfan Ahmad, King Fahd University of Petroleum and Minerals, Saudi Arabia
- Maya Silvi Lydia, Universitas Sumatera Utara, Medan, Indonesia
- Pankaj Dadure, UPES, Dehradun, India