Firstly, a big hello! Or kia ora! Or habari ya asubuhi! Or selamat siang!
I’m Kathy Reid, a part timer with Mozilla and an open source voice specialist. I’m working with Mozilla Fellow @Joshua_Meyer to put together a Playbook for DeepSpeech and would warmly welcome your input and feedback.
Training a speech-to-text model with DeepSpeech can involve a steep learning curve. Like the Common Voice Playbook, we’d like to put together a DeepSpeech Playbook. This will serve partly as a quick-start guide, partly as a set of tutorials, and partly as an on-ramp, allowing folx new to DeepSpeech to begin training speech models.
We’re anticipating that the Playbook will have the following broad sections:
- Fundamentals of Speech-to-Text
- Data collection
- Data formatting
- Model training
- Model fine-tuning
- Model testing (quantitative)
- Model evaluation (qualitative)
As people using DeepSpeech - and wrangling some of its quirks and hurdles - every day, you will know best what content the Playbook needs to include. We’d love to hear from you about the specific challenges you face using DeepSpeech, particularly those encountered as you were training your first models.
What are the one or two things that you had to learn the hard way and wished there was a walkthrough for? Did you take notes that we might be able to use?
Please do let us know in the comments below. Be sure to specify:
- the problem or hurdle you encountered
- how you overcame it (or didn’t)
- what information or guidance would have been useful in overcoming the hurdle
- and please point us to an example, or additional documentation if it’s available
We will share the Playbook openly once it is more developed, and anticipate it being under the same license as the Common Voice Playbook (CC-BY-SA 3.0).
A huge thank you in advance for helping us to help the DeepSpeech community, and may you and your loved ones remain safe and well as we progress through the pandemic.
Kind regards,
Kathy