How to download the common_voice_9.0 dataset?

I want to download the common_voice_9.0 dataset, but I could not find it on the Common Voice and Datasets pages.

My tutorial runs tar xvzf cv-corpus-9.0-2022-04-27-en.tar.gz, so the file I need is cv-corpus-9.0-2022-04-27-en.tar.gz.

So where can I find this dataset and download it? Thank you.

Hi @vanhuy.tran, welcome…

As you may know, the datasets are distributed via MDC (Mozilla Data Collective), and only the latest version is available. Older datasets are taken out of circulation to respect users' data deletion requests, which are legally binding.

This is what you can currently do:

  • Use the latest datasets from MDC
  • Provided that your use case is educational/scientific, send an e-mail to commonvoice@mozilla.com explaining your request/use case to get a download link. You should NOT release any model with that data.

Yes. Thank you for your response.

I understand the situation. I will contact Mozilla again if needed.

For the latest version, is it Common Voice Scripted Speech 24.0 - English | Mozilla Data Collective?

Yes, the link you provided is correct for the latest version.
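For anyone following the same tutorial: the tar xvzf step works the same way on whichever archive you end up downloading. Below is a minimal Python sketch of that extraction using the standard tarfile module. The function name extract_archive and the destination folder are my own placeholders, not part of Common Voice or MDC tooling, and the archive name in the comment is simply the one from the tutorial; replace it with the file you actually receive.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tarball (like `tar xzf`) and return its member names.

    `extract_archive` is a hypothetical helper written for this example.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        # Prefer the safer "data" extraction filter when this Python version has it.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    return names


# Example call (assumes the archive sits in the current directory):
# extract_archive("cv-corpus-9.0-2022-04-27-en.tar.gz", "common_voice")
```

The function returns the list of extracted member names, which is a quick way to check that the clips and TSV files the tutorial expects are actually present.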
