Hi,
I set up and used the web-mic example, which uses these models:
deepspeech-0.8.0-models.pbmm
deepspeech-0.8.0-models.scorer
These are about 1 GB.
The results were sub-optimal, and I’m wondering how to improve them.
I was wondering about the Common Voice datasets (50 GB), available for download here: https://commonvoice.mozilla.org/en/datasets. Would they help?
If so, how would I go about replacing the above files with them?
The model works fine for somewhat slow American English, as this is the data that is freely available for training. Depending on what you want to recognize, find some hundred hours of audio of that kind and fine-tune the model on it.
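
As a rough illustration of the "find data that matches what you want to recognize" step, here is a minimal sketch that picks clips with a given accent out of a Common Voice metadata file. It assumes the TSV has `path`, `sentence` and `accent` columns; check the header of the release you downloaded, since column names can differ between versions. `select_clips` is a hypothetical helper, not part of DeepSpeech, and it covers only the selection step, not audio conversion or the fine-tuning run itself.

```python
import csv
import io


def select_clips(tsv_text, accent, column="accent"):
    """Return (path, sentence) pairs whose accent column matches `accent`.

    Assumption: the metadata is tab-separated with `path` and `sentence`
    columns plus an accent column named by `column`.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    wanted = accent.strip().lower()
    selected = []
    for row in reader:
        # Rows with a missing or empty accent field are skipped.
        value = (row.get(column) or "").strip().lower()
        if value == wanted:
            selected.append((row["path"], row["sentence"]))
    return selected


if __name__ == "__main__":
    # Tiny in-memory sample standing in for a real metadata file.
    sample = (
        "client_id\tpath\tsentence\taccent\n"
        "a\tclip1.mp3\thello world\tindian\n"
        "b\tclip2.mp3\tgood morning\tus\n"
    )
    for path, sentence in select_clips(sample, "indian"):
        print(path, sentence)
```

For a real file, read its contents with `open(..., encoding="utf-8").read()` and pass the text in. The selected clips would still need to be converted to the audio format the training code expects before fine-tuning.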