Dataset Release Day V.8

heyhillary · January 27, 2022, 1:24pm

Dear Common Voice Community,

Thank you so much for your contribution and support in the creation of Common Voice Dataset V.8. Your creativity and ingenuity have made this dataset possible.

On behalf of the Product Team, THANK YOU!

Dataset Stats

The dataset has grown by 30% and now covers 87 languages!

New languages in Common Voice 8 include Igbo, Marathi, Danish, Norwegian Nynorsk, Central Kurdish, Malayalam, Swahili, Erzya, Moksha, Macedonian and Santali (Ol Chiki).

You can download the Common Voice dataset here for free. The dataset metadata is now published.

Are you developing with the Common Voice Dataset?

You can share case studies and tips for using the dataset with the community via Discourse.
At MozFest there will be tutorial sessions on building speech recognition models, such as Hack the Planet with Coqui, and demo sessions such as Building Speech Technology with NVIDIA. Make sure to secure your ticket!

Community Resources Update!

We recently created new graphics and content to support communities, including a draft onboarding slide deck. You can access the resources via the Google Drive. If you have any access issues, please let me know!

Here are a few examples of the new graphics!

Andrej · January 27, 2022, 6:34pm

The Belarusian dataset has decreased by more than 100 hours compared to the state on January 10. What is the reason for this?

heyhillary · January 28, 2022, 12:38pm

Hey Andrej,

Thanks for raising your questions.

Our team is currently investigating the issue, and we hope to respond as soon as possible. Please note that the leaderboard is an estimate of hours contributed. I will follow up with a more comprehensive report on your query.

Sorry for any inconvenience caused.
