Good morning (or evening, or afternoon) to the delightful Common Voice community of contributors, dataset users and folks hanging out with us to learn more about language and technology.
It’s one of my favorite times of the year: time for another dataset release!
Live and ready to download at: https://commonvoice.mozilla.org/en/datasets
Mozilla Common Voice 14 is live, and we’re so excited to have 28,117 hours of speech data, of which 18,651 hours are validated.
I love to see new languages available, so join me in welcoming Pashto, Albanian, Amharic and Standard Moroccan Amazigh to the platform and dataset.
We now have a total of 112 languages live and we would be so excited to welcome more. Please shout if you have any questions or suggestions, or want to celebrate along!
So many thanks to the sentence, voice and technical contributors who made this possible.