Hi everyone,
I am currently looking for the French Scripted Speech dataset on the new Mozilla Data Collective platform: https://datacollective.mozillafoundation.org/datasets.
When I filter by language (French), the only available result is:
- Common Voice Spontaneous Speech 3.0 - French (which contains only 152 transcribed clips).
I am looking for the full scripted version (previously known as mvc-scripted-fr).
Could you please clarify:
- Is the historical scripted French dataset still being migrated to the MDC?
- Is it now bundled within a global multi-language “Common Voice Corpus” entry instead of a standalone French one?
- Where can I find the most recent stable version of the scripted French data on this new platform?
Thank you for your help!