Talk to us! How are you using Common Voice?

Hey everyone, I’m Em, the new Lead for Common Voice at Mozilla Foundation :slight_smile: I’ve met some of you already!

We want to hear stories of how all of you in the community are making use of Common Voice, or planning to! This will help to guide the roadmap, and help us to amplify your efforts to have more people contribute. We would love to hear what you’re building and working on - and how we can help.

Let us know, and as ever, feel free to ask us any questions.

Em, Hillary and the team :slight_smile:


I look forward to integrating a voice assistant into Home Assistant which uses only LAN resources.

I love home automation, but I do not want my house controlled by a cloud company :slight_smile:


Hey there! :slight_smile: Here are a few things I am working on:

  • Training speech recognition models that function on-device for languages that don’t already have them (downloadable at
  • Listening-based language learning, basically listening comprehension tasks (demo at
  • Pronunciation training software for second-language learners (in development)

This is amazing :smiley: Do pass on the question to other people you come across doing amazing things with Common Voice :purple_heart:


Hey there,

I love your work!
We use Common Voice for training a Spoken Language Identifier. There is still a lot to do, but we are getting there :slight_smile:
In the near future, we hope to use the metadata to measure and mitigate bias in our system.
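For anyone curious what that metadata work could look like: Common Voice releases ship per-clip metadata in TSV files (e.g. `validated.tsv`) with demographic columns. A minimal sketch of measuring coverage gaps might look like the following; the column names and sample values here are illustrative and may differ between dataset versions, so check the TSV header of the release you download.

```python
import csv
import io
from collections import Counter

# Illustrative stand-in for a Common Voice validated.tsv; real releases
# have more columns, and the exact column names can vary by version.
SAMPLE_TSV = """client_id\tpath\tsentence\tgender\tage\taccents
a1\tclip1.mp3\thello there\tfemale_feminine\ttwenties\t
a2\tclip2.mp3\tgood morning\tmale_masculine\tthirties\t
a3\tclip3.mp3\tsee you soon\t\t\t
"""

def demographic_counts(tsv_text, column):
    """Count clips per category of a metadata column, mapping missing
    values to 'unknown' so coverage gaps stay visible."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row.get(column) or "unknown" for row in reader)

print(demographic_counts(SAMPLE_TSV, "gender"))
```

Comparing these counts against your model's per-group error rates is one simple way to spot where the training data under-represents a group.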

Thanks for your work,


Hey, I am not involved in the website, but maybe it is interesting for you that the dictionary uses audio files from Common Voice as samples to show how words are pronounced.


I want to use it for non-cloud-based dictation.

And honestly, the reason I want that is so I can add captions/subtitles to my stream in a way that DOESN’T use Google or Microsoft’s servers. XD;

Speaking of which, is there any chance this kind of dictation could be built into Firefox? I understand the reason why basically all caption/subtitle systems use Google’s servers is because it’s based on the browser’s built in dictation… and right now, the only browser with built-in dictation is Chrome. ^_^;


Hi, our volunteers are going to try to generate some speech datasets to train humanitarian AI applications. We're starting with some simple datasets, then moving on to more complex ones, including different language content and text. The main aim is to generate datasets sufficient to train digital assistants to answer complex queries posed by humanitarian actors using highly structured open data published by aid organizations. The datasets will generally encapsulate different types of information published in aid activity files, read out loud by volunteers.


Hey @brentophillips, @infranscia, @stergro, @bytosaur, @ftyers, and @alfem

Thanks so much for sharing with us how you are using the Common Voice dataset.

Just in case you are not aware, we are hosting two community engagement sessions that you might be interested in taking part in:

Both these sessions are opportunities to get your questions and thoughts shared with us about Common Voice.


Hey everyone, would anyone be interested in giving a lightning talk explaining how they are using the Common Voice dataset at the Contribute-athon Global on 7th October or 14th October?

Please private message me if you would be interested!