I am brand new to DeepSpeech and tried out this example: Mic Vad Example. It is pretty much exactly what I want to do: convert a stream of audio from a microphone into text. I got the example working using this model and scorer:
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
This is as per the getting started guide: https://deepspeech.readthedocs.io/en/v0.7.1/?badge=latest
It does recognise simple English statements, but it fails to recognise short words such as “Oh”, “Hmm”, “Do”, “Ok” and “Hi”. As far as I can see, it also does not understand any swear words such as f**k, or words which are not very basic English. It also appears not to pick up the first spoken word in a sentence very accurately.
I also don’t see it understand letters of the alphabet. For example, when I say “A”, “B”, “C”, nothing is recognised. How can we make it recognise these?
Is there an alternative model or scorer that may be more deeply trained? Any advice on improving its ability to pick up “filler” sounds in a spoken sentence, such as “ugh” or “hffffff” (a huffing sound)? There may well be things within this example that I can tweak to improve the recognition, and I would love to hear any suggestions!
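One tweak I am considering, as far as I understand the 0.7 Python API, is loading the model without the external scorer, or with different scorer weights, to check whether the language model is what filters out the short words. Below is a minimal sketch under those assumptions. The helper name `load_model` is my own, and any alpha/beta values passed in are placeholders to experiment with, not recommended settings.

```python
def load_model(model_path, scorer_path=None, alpha=None, beta=None):
    """Load a DeepSpeech model, optionally with an external scorer.

    load_model is a hypothetical helper name, not part of DeepSpeech.
    alpha and beta are the scorer weights; only applied if both are given.
    """
    # Imported lazily so this file can be imported without the package.
    import deepspeech

    model = deepspeech.Model(model_path)
    if scorer_path is not None:
        # With a scorer, decoding is biased towards the scorer's vocabulary.
        model.enableExternalScorer(scorer_path)
        if alpha is not None and beta is not None:
            model.setScorerAlphaBeta(alpha, beta)
    # Without a scorer, only the acoustic model's raw output is decoded.
    return model


# Acoustic model only (no scorer), to compare against the default setup:
#   model = load_model("deepspeech-0.7.0-models.pbmm")
# Default setup with the released scorer:
#   model = load_model("deepspeech-0.7.0-models.pbmm",
#                      "deepspeech-0.7.0-models.scorer")
```

If the short words show up without the scorer but disappear with it, that would suggest the scorer rather than the acoustic model is the thing to adjust.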
Many thanks - immensely exciting technology!