DeepSpeech not recognizing Indian English accent

EsakkiSundar_Varatharajan · November 16, 2020, 11:58am

Hi,
I have an Indian accent and am trying the DeepSpeech pre-trained English model, v0.9.1. When I give DeepSpeech a recording of me saying “Hello I’m testing DeepSpeech thank you very much”, I get the output “hoooooo”. Kindly suggest how to get my voice recognized. I am attaching the snippet of code I used. Thanks for your help and suggestions.

```python
import deepspeech
import wave
import numpy as np

# Load the pre-trained acoustic model and attach the external scorer.
model_file_path = r"C:\deepspeech\0.91\deepspeech-0.9.1-models.pbmm"
model = deepspeech.Model(model_file_path)
scorer_file_path = r"C:\deepspeech\0.91\deepspeech-0.9.1-models.scorer"
model.enableExternalScorer(scorer_file_path)

# Read the whole recording into memory as raw bytes.
recordingfile = r"C:\testingdeepspeech.wav"
with wave.open(recordingfile, "rb") as w:
    rate = w.getframerate()
    frames = w.getnframes()
    buffer = w.readframes(frames)

# Print the file's sample rate next to the rate the model expects.
print(rate)
print(model.sampleRate())

# Interpret the bytes as 16-bit samples and run speech-to-text.
data16 = np.frombuffer(buffer, dtype=np.int16)
text = model.stt(data16)
print(text)
```

othiele · November 16, 2020, 12:37pm

Please format stuff.

The standard DeepSpeech model does not handle accents or fast speech well; try speaking slowly and clearly. And, as suggested, try fine-tuning or transfer learning with Indian-accent material.
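
Separately from the accent, it may be worth ruling out an audio-format mismatch first, since the released English model expects 16 kHz, 16-bit, mono PCM input. Below is a minimal sketch of such a check using only the standard-library `wave` module. The helper name `wav_format_problems` and the default of 16000 are illustrative assumptions, not part of DeepSpeech; in the script above, `model.sampleRate()` could be passed as `expected_rate` instead.

```python
import wave


def wav_format_problems(path, expected_rate=16000):
    """Return a list of format mismatches; an empty list means none were found.

    Hypothetical helper for illustration: it only inspects the WAV header
    and does not touch DeepSpeech itself.
    """
    problems = []
    with wave.open(path, "rb") as w:
        # Sample rate of the file versus the rate the model expects.
        if w.getframerate() != expected_rate:
            problems.append(
                f"sample rate is {w.getframerate()} Hz, expected {expected_rate} Hz"
            )
        # More than one channel would interleave samples in the buffer.
        if w.getnchannels() != 1:
            problems.append(f"{w.getnchannels()} channels, expected mono")
        # A sample width of 2 bytes corresponds to np.int16.
        if w.getsampwidth() != 2:
            problems.append(f"{8 * w.getsampwidth()}-bit samples, expected 16-bit")
    return problems
```

If the returned list is not empty, converting the recording to the expected format before calling `model.stt` is a reasonable first step, before concluding that the accent is the cause.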

lissyx · November 16, 2020, 12:53pm

Kindly read the release notes and understand that the model is trained on American-accented speech because we don’t have enough data for other accents.

EsakkiSundar_Varatharajan · November 16, 2020, 1:53pm

Thanks. I have a GPU-enabled laptop running Windows 10. I read that the prerequisite for training is either Linux or Mac.

A couple of questions:

Can I use VirtualBox to run Linux and train the model as advised in https://deepspeech.readthedocs.io/en/v0.9.1/TRAINING.html?
Once the model is trained with Indian-accent data, can it be used on a Windows machine?

lissyx · November 16, 2020, 1:55pm

The model is independent of the library.

No idea.

EsakkiSundar_Varatharajan · November 16, 2020, 3:51pm

A couple of questions:

Why can’t Windows be used to train the model? (The docs say only Linux/Mac.)
Is the pre-built English model trained on the English dataset available at https://commonvoice.mozilla.org/en/datasets? If yes, around 61,528 people have given their voices. The distribution below shows that 5% of them are from India and South Asia, which comes to 3076 people. When you say the model has to be trained with more Indian-accent data, how many more people would need to give their voices for it to accurately recognize Indian-accented English?

Accent
23% - United States English
8% - England English
5% - India and South Asia (India, Pakistan, Sri Lanka)
4% - Australian English
3% - Canadian English
2% - Scottish English
1% - Irish English
1% - Southern African (South Africa, Zimbabwe, Namibia)
1% - New Zealand English
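
The arithmetic behind the 3076 figure can be sketched as follows. This assumes, as the question does, that the accent percentages apply directly to the 61,528 contributor count; the variable names are illustrative only.

```python
# Figures copied from the accent list above; total_speakers is the
# contributor count quoted in the question.
total_speakers = 61528
accent_percent = {
    "United States English": 23,
    "England English": 8,
    "India and South Asia": 5,
    "Australian English": 4,
    "Canadian English": 3,
    "Scottish English": 2,
    "Irish English": 1,
    "Southern African": 1,
    "New Zealand English": 1,
}

# Estimated speakers per accent, rounded to the nearest whole person.
estimated_speakers = {
    accent: round(total_speakers * pct / 100)
    for accent, pct in accent_percent.items()
}

# Share of contributors not covered by any accent in the list.
unlisted_percent = 100 - sum(accent_percent.values())

for accent, count in estimated_speakers.items():
    print(f"{accent}: ~{count}")
print(f"Not covered by the list: {unlisted_percent}%")
```

Under that assumption the India and South Asia group comes to about 3076 speakers. The listed accents add up to 48%, so the remaining 52% of contributors are not accounted for in this breakdown.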

lissyx · November 16, 2020, 4:19pm

As said plenty of times: we don’t use Windows, and nobody has cared enough to verify the training pipeline on Windows, send patches to fix it, and add CI.

I’ve told you several times to go and read the release notes. I’m not here to do your homework, especially when I’m on a day off.

EsakkiSundar_Varatharajan · November 16, 2020, 4:22pm

Lissyx, sincere thanks for your prompt reply. I highly appreciate your effort and help.