Downloading raw audio data

(Martin) #1

Hello forum,

I just found Common Voice, and I think it’s an amazing thing!

The power you give to speech researchers is wonderful :)

My problem is that I just want raw audio data (I don’t care about validated transcriptions) in as many languages as possible for my research. Is it possible to download your audio data for all languages, not just the ones that have finished validation?


(Michael Henretty) #2

Hi Martin,

Yup, we make all audio available when we publish: validated, invalidated, and yet to be validated. You can find all of those in the currently published dataset for English.


(Martin) #3

That’s nice! I thought only validated audio was downloadable.

What I really want is all audio (all languages, all speakers), and I don’t need text at all, just the audio. Is it possible to download all samples, not just English?


(Michael Henretty) #4

Once we publish data in new languages (which we hope to do by the end of the year), you will be able to download all samples.