Multi-language Dataset Beta Release

nukeador · June 12, 2019, 11:18pm

Today we released a new version of the dataset, and we keep improving the automation of the process.

MestafaKamal · April 4, 2020, 2:29pm

Hello,
I’ve tried to access the form link, but it no longer seems to be accepting responses. Can you please help me with this?

nukeador · April 6, 2020, 10:52am

Hi, this review is no longer needed. The final dataset was published at:

https://voice.mozilla.org/datasets

MestafaKamal · April 6, 2020, 11:04am

Thank you.
I wanted to use Corpora Creator with clips.tsv, but it seems that the audio files are named differently in the Common Voice dataset. How can I re-create the train, dev, and test TSV files?
