I hope a good trained model for DeepSpeech will be publicly available soon. Until then, I’d like to know if it’s possible to download my voice data to train an acoustic model for CMU Sphinx. Thanks, and I hope you start work on other languages.
The answer to all 3 of your questions is yes:
- Public Voice Models: we are working on releasing some very soon!
- Download your voice data: we are working on having that soon!
- Multi-language support: Coming in early 2018!
Thanks for your answer, I’m glad to hear that.
Very soon you will be able to download your voice data, along with everyone else’s. Unfortunately, we won’t be able to know which data is yours (for privacy reasons).
It’s a bit unfortunate that we won’t be able to have the voices of individuals, even if we don’t know who they are. Being able to tell what multiple people are saying and to identify individual voices is an interesting use. I totally get the privacy concerns though.
Check out the Tatoeba dataset from our Download page, it has utterances grouped by speaker.
“we won’t be able to know which data is yours (for privacy reasons).”
What’s the privacy concern?
Just general ones, really. We believe our users don’t want to be identified, so we do our best to protect them.
How about indicating the speaker ID for each audio file in the public corpus, and privately indicating to each voice contributor their speaker ID?
@Franck_Dernoncourt what are your needs for that? I.e., tell me about your research, and why the Tatoeba dataset I mentioned above doesn’t solve that for you.
I would simply like to be able to download the audio files I contributed, in order to:
- Train an ASR customized to my voice (I use speech recognition daily for my work)
- Train a text-to-speech engine customized to my voice
- Analyze which words I mispronounce (and more generally analyze how I speak)
- Benchmark off-the-shelf ASR engine performance with my voice
As a principle, I think it’s good practice to allow users to download the data they generated (e.g., https://en.wikipedia.org/wiki/Google_Data_Liberation_Front).
Absolutely on point. I would love to be able to retrieve a dataset of my own voice to create my own TTS from it.
Since unfortunately it’s hard for one person to make their own database from scratch…