I just wanted to share details of the scripts we’ve developed at Bangor University. They bring together the various features of DeepSpeech with Common Voice data to provide a complete solution for producing models and scorers for Welsh language speech recognition. They may be of interest to other DeepSpeech users working with a language that, like Welsh, is lesser resourced.
The scripts:
- are based on DeepSpeech 0.7.4
- make use of DeepSpeech’s Dockerfiles (so setup and installation are easier)
- train with Common Voice data
- utilize transfer learning
- with some additional test sets and corpora, produce optimized scorers/language models for various applications
- export models with metadata
The initial README describes how to get started.
We’d also like to share the models produced by these scripts, which can be found at https://github.com/techiaith/docker-deepspeech-cy/releases/tag/20.06
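For anyone who wants to try the released models from Python, here is a rough sketch using the standard DeepSpeech 0.7.x Python package. It is not taken from our scripts: the helper names are made up for illustration, and the file names in the usage comment are placeholders, so substitute the actual model and scorer files from the release page. It assumes 16-bit mono WAV input at the sample rate the model expects.

```python
import wave


def read_pcm16_mono(wav_path):
    """Return (sample_rate, raw_bytes) for a 16-bit mono PCM WAV file."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(model_path, scorer_path, wav_path):
    """Transcribe one WAV file with a DeepSpeech acoustic model and scorer."""
    # Imported here so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model  # assumes: pip install deepspeech==0.7.4

    model = Model(model_path)
    model.enableExternalScorer(scorer_path)

    sample_rate, pcm = read_pcm16_mono(wav_path)
    if sample_rate != model.sampleRate():
        raise ValueError("resample the audio to %d Hz first" % model.sampleRate())

    # stt() expects a 1-D array of 16-bit samples.
    return model.stt(np.frombuffer(pcm, dtype=np.int16))


# Usage (placeholder file names, not the real release assets):
# print(transcribe("model.pbmm", "model.scorer", "recording.wav"))
```

The sample-rate check is there because feeding audio at the wrong rate tends to produce poor transcriptions rather than an obvious error.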
At the moment these models are used in two prototype applications that the Welsh-speaking community can install and try: a Windows/C#-based transcriber and an Android/iOS voice assistant app called Macsen. The source code for these applications, both of which use DeepSpeech, can also be found on GitHub.
We are immensely grateful to Mozilla for creating the Common Voice and DeepSpeech projects.