I am wondering whether there will be support for other languages such as German, French, Italian and so on. For English there are already many hours (>1000) of free speech data available, e.g. from the LibriSpeech project (http://www.openslr.org/12/) or from VoxForge (http://www.voxforge.org). For all other languages, however, the data situation is much worse, so the need to collect data for them is probably much greater. For English, the best thing to do is probably to integrate an ASR system into Mozilla and collect real user data.
Agreed with everything you said. We definitely want to open Common Voice to more languages, but right now we are building out the v1 in English. We have a goal to start with a second language before the end of the year. Stay tuned!
Hi, sorry for reopening this topic, but it is the only one I found that discusses adding other major languages. I’m interested in helping with Italian, as I am a voice actor and a dubbing director.
I was guided here from mycroft.ai, because they’re counting on your project and plan to integrate your results into their work.
Hello @mhenretty,
I would also be happy to contribute, but I am not sure where to begin. There are a lot of topics saying that localisation will happen soon, but without any details about which languages are concerned or when it will be available.
On the GitHub repository I found this file, which covers the localisation of the website: https://github.com/mozilla/voice-web/blob/master/web/locales/en/messages.ftl. Should we begin translating it and propose a PR?
I suppose you will also need short sentences for people to read. Can we submit our sentences in other languages here: https://github.com/mozilla/voice-web/issues/341
or should we create a new issue per language?