Dear Common Voice Community,
Thank you so much for your contributions and support in the creation of Common Voice Dataset V.8. Your creativity and ingenuity have made this dataset possible.
On behalf of the Product Team, THANK YOU!
Dataset Stats
The dataset has grown by 30% and now covers 87 languages!
New languages in Common Voice 8 include Igbo, Marathi, Danish, Norwegian Nynorsk, Central Kurdish, Malayalam, Swahili, Erzya, Moksha, Macedonian and Santali (Ol Chiki).
You can download the Common Voice dataset here for free. The dataset metadata is now published.
Are you developing with the Common Voice Dataset?
- You can share case studies and tips for using the dataset with the community via Discourse.
- At MozFest there will be tutorial sessions on building speech recognition models, such as Hack the Planet with Coqui, and demo sessions for Building Speech Technology with NVIDIA. Make sure to secure your ticket!
Community Resources Update!
We recently created new graphics and content to support communities, including a draft onboarding slide deck. You can access the resources via the Google Drive. If you have any access issues, please let me know!
Here are a few examples of the new graphics!