DeepSpeech not recognizing Indian English accent

Hi,
I have an Indian accent and am trying the DeepSpeech pre-trained English model, v0.9.1. When I say “Hello, I’m testing DeepSpeech, thank you very much”, DeepSpeech outputs “hoooooo”. Kindly suggest how to get my voice recognized. I am attaching the code snippet I used. Thanks for your help and suggestions.

```python
import deepspeech
import wave
import numpy as np

# Load the pre-trained acoustic model and attach the external scorer.
model_file_path = r"C:\deepspeech\0.91\deepspeech-0.9.1-models.pbmm"
model = deepspeech.Model(model_file_path)
scorer_file_path = r"C:\deepspeech\0.91\deepspeech-0.9.1-models.scorer"
model.enableExternalScorer(scorer_file_path)

# Read the whole recording into memory.
recordingfile = r"C:\testingdeepspeech.wav"
w = wave.open(recordingfile, 'r')
rate = w.getframerate()
frames = w.getnframes()
buffer = w.readframes(frames)
w.close()

# Print the file's sample rate next to the rate the model expects.
print(rate)
print(model.sampleRate())

# Interpret the raw bytes as 16-bit samples and run inference.
data16 = np.frombuffer(buffer, dtype=np.int16)
text = model.stt(data16)
print(text)
```
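One thing that may be worth ruling out before blaming the accent: the snippet prints the file's sample rate and `model.sampleRate()`, but never checks that they match, nor that the recording is mono 16-bit PCM. The sketch below uses only the standard `wave` module to report mismatches. The helper name `describe_wav` and the default of 16000 Hz are assumptions for illustration; in practice the expected rate would come from `model.sampleRate()`. This is a possible cause to check, not a diagnosis.

```python
import wave


def describe_wav(path, expected_rate=16000):
    """Return the WAV file's format and a list of mismatches.

    expected_rate=16000 is an assumed default; pass model.sampleRate()
    from the loaded DeepSpeech model when it is available.
    """
    with wave.open(path, "rb") as w:
        info = {
            "rate": w.getframerate(),
            "channels": w.getnchannels(),
            "sample_width_bytes": w.getsampwidth(),
        }

    problems = []
    if info["rate"] != expected_rate:
        problems.append(
            f"sample rate is {info['rate']} Hz, expected {expected_rate} Hz"
        )
    if info["channels"] != 1:
        problems.append(f"{info['channels']} channels, expected mono")
    if info["sample_width_bytes"] != 2:
        problems.append(
            f"{info['sample_width_bytes'] * 8}-bit samples, expected 16-bit"
        )
    return info, problems
```

Calling `describe_wav(recordingfile, model.sampleRate())` before `model.stt(...)` and printing the returned problems would show whether the recording needs to be resampled or converted first.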


Please format stuff.

The standard DeepSpeech model does not handle accents or fast speech well, so try speaking slowly and clearly. As suggested, also try fine-tuning or transfer learning with Indian-accent material.
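For the fine-tuning route, the training documentation describes the input as CSV files with the columns `wav_filename`, `wav_filesize`, and `transcript`. A minimal sketch of building such a file from your own Indian-accent recordings could look like the following. The helper name `write_manifest` and the sample paths are hypothetical, and the sketch does not normalize transcripts to the model's alphabet, which you would still need to do.

```python
import csv
import os


def write_manifest(samples, csv_path):
    """Write (wav_path, transcript) pairs to a DeepSpeech-style CSV.

    `samples` is an iterable of (path to WAV file, transcript) tuples.
    The file size column is read from disk for each WAV file.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path, transcript in samples:
            writer.writerow([wav_path, os.path.getsize(wav_path), transcript])
```

For example, `write_manifest([("clips/hello.wav", "hello i am testing deepspeech")], "train.csv")` would produce a one-row training file, assuming `clips/hello.wav` exists.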

Kindly read the release notes and understand that the model is trained on American accents because we don’t have enough data for other accents.

Thanks. I have a GPU-enabled laptop running Windows 10. I read that the prerequisite is either Linux or Mac.

A couple of questions:

  1. Can I use VirtualBox to run Linux and train the model as advised in https://deepspeech.readthedocs.io/en/v0.9.1/TRAINING.html?

  2. Once the model is trained with Indian-accent data, can it be used on a Windows machine?

The model is independent of the library.

No idea.

A couple of questions:

  1. Why can’t Windows be used to train the model? (The docs say only Linux/Mac.)

  2. Is the pre-built English model trained on the English dataset available at https://commonvoice.mozilla.org/en/datasets? If yes, around 61,528 people have contributed their voices. The distribution below shows that 5% of them have an Indian or South Asian accent, which comes to about 3,076 people. When you say the model has to be trained with more Indian-accent data, how many more speakers would be needed to recognize Indian English accurately?

Accent
23% - United States English
8% - England English
5% - India and South Asia (India, Pakistan, Sri Lanka)
4% - Australian English
3% - Canadian English
2% - Scottish English
1% - Irish English
1% - Southern African (South Africa, Zimbabwe, Namibia)
1% - New Zealand English
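As a quick check of the arithmetic above, the listed percentages can be turned into approximate speaker counts. This is only a sketch: it assumes, as the question does, that each percentage applies to the 61,528 contributors rather than to clips or hours, and it says nothing about how much data fine-tuning would actually need.

```python
# Total contributors quoted in the question.
total_speakers = 61_528

# Accent shares in percent, copied from the list above.
accent_share = {
    "United States English": 23,
    "England English": 8,
    "India and South Asia": 5,
    "Australian English": 4,
    "Canadian English": 3,
    "Scottish English": 2,
    "Irish English": 1,
    "Southern African": 1,
    "New Zealand English": 1,
}

# Approximate number of speakers per accent, rounded to whole people.
estimated_speakers = {
    name: round(total_speakers * pct / 100)
    for name, pct in accent_share.items()
}

# Share of contributors covered by the listed accents.
listed_pct = sum(accent_share.values())

for name, count in estimated_speakers.items():
    print(f"{name}: ~{count}")
print(f"Listed accents cover {listed_pct}% of contributors")
```

The listed accents add up to 48%, so just over half of the contributors are not covered by this breakdown.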

As said plenty of times: we don’t use Windows, and nobody has cared enough to verify the training pipeline on Windows, send patches to fix it, and add CI.

I’ve told you several times to go and read the release notes. I’m not here to do your homework, especially when I’m on a day off.

Lissyx, Sincere thanks for your prompt reply. Highly appreciate your effort and help.