I want to open this topic because more and more languages are moving to the voice collection phase (thanks to reaching the minimum number of sentences!).
Now that your language is available for voice collection you might wonder, now what?
Keep in mind that in order to properly train the Deep Speech algorithm there are a few big challenges:
- Get at least 2000 hours of voice recorded and validated.
- Get at least 1000 different/diverse speakers contributing.
Important: For model training, the same sentence should not be recorded more than once. So please keep in mind that you will need to keep adding sentences to accommodate more voice recordings. The math assumes 4 seconds per clip on average:
- The initial 5000 sentences will give you a buffer for around 5.5 hours of voice.
- For 10 hours you would need 9000 sentences.
- For 100 hours you would need 90,000 sentences.
- For 2000 hours you would need 1,800,000 sentences.
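If you want to run the same math for your own language's targets, here is a minimal sketch in Python. It assumes the 4-second average clip length mentioned above and one recording per sentence; the function names `sentences_needed` and `hours_covered` are just illustrative and not part of any Common Voice tool.

```python
import math

# Average clip length in seconds, as stated in the post (an estimate, not a measured value).
SECONDS_PER_CLIP = 4


def sentences_needed(hours, seconds_per_clip=SECONDS_PER_CLIP):
    """Unique sentences needed to reach `hours` of audio, one clip per sentence."""
    return math.ceil(hours * 3600 / seconds_per_clip)


def hours_covered(sentences, seconds_per_clip=SECONDS_PER_CLIP):
    """Hours of audio that `sentences` unique sentences can provide."""
    return sentences * seconds_per_clip / 3600


if __name__ == "__main__":
    for target in (10, 100, 2000):
        print(f"{target} hours -> {sentences_needed(target)} sentences")
    print(f"5000 sentences -> {hours_covered(5000):.2f} hours")
```

If clips in your language tend to be longer or shorter, pass a different `seconds_per_clip` to see how the sentence target shifts.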
We are currently working on new ways to collect this volume of sentences easily.
As a community volunteer you should not be scared by these big goals, but rather think:
- How can I make the experience fun so that people donate a lot of clips?
- How can I engage big crowds of diverse people so that more people contribute?
Let’s use this topic to share ideas on how we are mobilizing our local communities, from events at universities to new ways of collecting voices outside the main app…
Let’s get free from the voice silos!