This topic is created to initiate and organize the Afaan Oromoo Must Start Dataset effort. The goal is to prepare a high-quality foundational dataset that enables Afaan Oromoo to move forward in the Common Voice and Scripted Speech stages.
We have completed 100% localization of required files and are currently waiting for approval. This dataset will support speech recognition, TTS development, and broader AI applications in Afaan Oromoo.
We welcome collaboration, review, and guidance from the Mozilla community to officially launch and scale Afaan Oromoo contributions.
Hi @Eyob_Abebe, welcome to the community. We are so happy to see you aboard.
You’ve just been assigned as a translator in Pontoon, so you can submit translations directly (carefully) and review the contributions of others. Go review your own now.
Translations will be available on the site when we do a release (every 1-2 weeks, planning one tonight)…
Then you have to add 2000 sentences (we have already set that target), and two people should review them (one of the votes can come from you).
After that you will be able to record…
Join our Matrix channel for quick chats and Q&A if you run into problems:
If you know linguists or can work with them, we have Variant (dialect) and Predefined Accent support. Having these defined beforehand would be a plus.
We also have DataSheets, which are published when the datasets are released to give important information about the language. Whenever you are ready, you can fill them in for the next release. This is how they look for now (Kabardian as an example): Common Voice Spontaneous Speech 2.0 - Kabardian | Mozilla Data Collective
Thank you very much for the warm welcome. I’m really happy to join the community and contribute to the Afaan Oromoo localization effort.
I’ve seen that I’ve been assigned as a translator in Pontoon — thank you! I’ll carefully review my previous translations and continue contributing. I’m also looking forward to the upcoming release.
Regarding the 2000 sentences, I understand the requirement and will start adding them. I’ll also coordinate to ensure they are reviewed properly (including one vote from me as mentioned).
I’ll join the Matrix channel on chat.mozilla.org (Element) for quicker communication and support.
About variants and predefined accents, I can collaborate with linguists and Oromo language experts to help define dialects and accents in advance. I agree this would be a valuable addition.
Thanks again for the support. Excited to move forward!
I would like to inform you that several Oromo terminology entries have already been translated on Pontoon, but they are still pending approval. As you know, translations need to be approved in order to be included in releases and become active on the platform.
Could you please review and approve the pending terminology entries for Afaan Oromoo?
We are working hard to move forward with the localization progress, and your support in reviewing these items would be greatly appreciated.