Creating DeepSpeech Model for Hindi

lissyx · October 15, 2019, 12:28pm

I would suspect your importer code.

cryptoaimdy · October 15, 2019, 12:30pm

i did not understand

lissyx · October 15, 2019, 12:32pm

There is likely a bug somewhere that is corrupting your data. Have you written any code of your own to import that data?

cryptoaimdy · October 15, 2019, 12:35pm

No, I am just using it with DeepSpeech’s version 0.5.1 code.

lissyx · October 15, 2019, 12:42pm

OK. First, it would be better to work on master. Apply https://gist.github.com/reuben/b68b9085f7b293580f8431156a33daa9 if you need to reload a 0.5.1 English checkpoint.

cryptoaimdy · October 16, 2019, 9:56am

No luck with this; I tried it.

I think the fault was in my binary, which I created with the wrong alphabet. I am trying again now.

After cloning DeepSpeech I ran
git checkout v0.5.1
but the version is still 0.6.0 alpha 9.

cryptoaimdy · October 16, 2019, 12:56pm

Test on data/test/test.csv - WER: 1.000000, CER: 0.911950, loss: 113.759827
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.666667, loss: 35.086880
 - src: "नाम"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.833333, loss: 39.348869
 - src: "सहायता"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.909091, loss: 67.961250
 - src: "सहायता2करिए"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.875000, loss: 79.533066
 - src: "आपका2नाम2क्या2है"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.866667, loss: 83.190369
 - src: "तमहरआ2कयआ2नाम2ह"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.900000, loss: 104.061623
 - src: "क्या2बोलना2चाहते2हैं"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.944444, loss: 104.943001
 - src: "हिंदी2में2बात2करिए"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.931034, loss: 133.641830
 - src: "मै2आपकी2कया2सहायता2कर2सकता2हू"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.937500, loss: 159.305466
 - src: "सर2मै2विमवीजयोर2से2बात2कर2रहा2हू"
 - res: "का"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 330.525848
 - src: "नई2दिल्ली"
 - res: "का"
--------------------------------------------------------------------------------
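
For readers wondering how the per-sample CER values in the log relate to the src/res pairs, here is a minimal sketch. It assumes CER is the Levenshtein distance between the two strings, counted in Unicode code points, divided by the length of src. The function names are illustrative and are not taken from the DeepSpeech code base.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, counted in Unicode code points."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def cer(src: str, res: str) -> float:
    """Character error rate: edit distance divided by the reference length."""
    return levenshtein(src, res) / len(src)


print(round(cer("नाम", "का"), 6))     # 0.666667
print(round(cer("सहायता", "का"), 6))  # 0.833333
```

Under that assumption, the first two samples reproduce the values shown in the log: "नाम" is three code points and needs two edits to become "का", and "सहायता" is six code points and needs five.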

SRC is correct now, but why does a numeral ‘2’ appear between words instead of a space? Any idea?

lissyx · October 16, 2019, 1:18pm

It does not look like an ASCII 2, more like some other UTF-8 character. Maybe something with your alphabet? It’s really important to ensure you use the same alphabet everywhere.
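
One way to check which character is actually showing up is to print the Unicode code point of each character in a suspicious string. This is only a sketch using Python's standard `unicodedata` module; the helper name `describe_chars` and the sample string are made up for illustration.

```python
import unicodedata


def describe_chars(text: str) -> list[tuple[str, str, str]]:
    """Return (character, code point, Unicode name) for each character."""
    return [
        (char, f"U+{ord(char):04X}", unicodedata.name(char, "<unnamed>"))
        for char in text
    ]


# Hypothetical sample: an ASCII digit, a Devanagari digit and a plain space.
for char, code_point, name in describe_chars("2\u0968 "):
    print(repr(char), code_point, name)
```

Pasting a line from the test output or from alphabet.txt into the call shows whether the separator is U+0032 (ASCII digit two), U+0020 (plain space) or something else entirely.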

cryptoaimdy · October 16, 2019, 1:23pm

I am using the same alphabet everywhere, but at training time it says that some characters present in the train, test or dev files are missing from alphabet.txt, even though those characters are already in alphabet.txt. When I delete the character and type the same one again, the error goes away and it works properly. But I don’t know what the problem is with the number 2 appearing instead of spaces. I created the alphabet in UTF-8 using Notepad.

lissyx · October 16, 2019, 1:44pm

It’s possible Windows line endings are playing a role here.

If it says it cannot find the character, you need to fix that in your alphabet file if it’s a legit character, or clean up your dataset if it is not.

I’m not sure I get your process here.
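
A rough way to find out which case applies is to list every transcript character that is not an alphabet entry, together with how often it occurs. The sketch below assumes each alphabet entry is a single character. `missing_characters` is a hypothetical helper, not part of DeepSpeech.

```python
from collections import Counter


def missing_characters(transcripts, alphabet):
    """Count transcript characters that are not entries of the alphabet."""
    allowed = set(alphabet)
    counts = Counter(
        char for line in transcripts for char in line if char not in allowed
    )
    return dict(counts)


# Hypothetical data: the second transcript uses a no-break space (U+00A0),
# which looks like a space but is a different character.
alphabet = [" ", "न", "ा", "म"]
transcripts = ["नाम नाम", "नाम\u00a0नाम"]
for char, count in missing_characters(transcripts, alphabet).items():
    print(f"U+{ord(char):04X} {char!r} occurs {count} time(s)")
```

A character that shows up only a handful of times is more likely to be dataset noise, while a frequent one probably belongs in the alphabet file.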

cryptoaimdy · October 17, 2019, 6:21am

It keeps saying the word
(' ')
is not present in your alphabet.
Do I have to add spaces after each character in alphabet.txt?

lissyx · October 17, 2019, 6:48am

No, but you need it at least once in your dataset. Make sure this is the proper UTF-8 code.
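
Files saved from Notepad can carry a byte-order mark and Windows line endings, so it may help to parse the alphabet defensively and confirm that a plain space (U+0020) is one of the entries. This is a sketch only: it assumes the alphabet format is one label per line, with lines starting with `#` treated as comments and `\#` standing for a literal `#`. `parse_alphabet` and `has_plain_space` are illustrative names, not DeepSpeech functions.

```python
def parse_alphabet(raw: bytes) -> list[str]:
    """Parse alphabet file bytes, tolerating a UTF-8 BOM and CRLF endings."""
    text = raw.decode("utf-8-sig")  # drops a leading byte-order mark if present
    labels = []
    for line in text.replace("\r\n", "\n").split("\n"):
        if line.startswith("\\#"):
            labels.append("#")  # escaped literal '#'
        elif line.startswith("#") or line == "":
            continue            # comment line or empty trailing line
        else:
            labels.append(line)
    return labels


def has_plain_space(labels: list[str]) -> bool:
    """True if the ASCII space U+0020 is one of the labels."""
    return " " in labels


# Hypothetical file content as Notepad might save it: BOM plus CRLF endings.
raw = "\ufeff# Hindi alphabet\r\n \r\nक\r\n".encode("utf-8")
labels = parse_alphabet(raw)
print(labels, has_plain_space(labels))
```

To check a real file, the bytes can be read with `open("alphabet.txt", "rb").read()` and passed to `parse_alphabet`.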

cryptoaimdy · October 17, 2019, 12:48pm

All done.

Hi, what maximum audio length would be best for training data?

Or, put differently, what audio length or word count per clip gives the best results in the model?

Can we use something like a 5-10 minute conversation per audio file for training?

lissyx · October 17, 2019, 12:52pm

This is mostly going to be limited by your batch size and your GPU memory. To give you a ballpark: with 11GB of RAM on a GPU, I cannot go above a batch size of 68 with clips up to 10-15 seconds. If I push more, then I run out of GPU memory.
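
To keep clips within a range like that, one option is to filter the training CSV by estimated duration before training. The sketch below assumes the usual `wav_filename,wav_filesize,transcript` columns and estimates duration from the file size, assuming 16 kHz, 16-bit mono PCM with a 44-byte WAV header. The file names, the 15-second limit and the helper names are illustrative.

```python
import csv
import io

BYTES_PER_SECOND = 16000 * 2  # assumed: 16 kHz sample rate, 16-bit mono PCM
WAV_HEADER_BYTES = 44         # assumed: canonical WAV header size


def estimate_seconds(wav_filesize: int) -> float:
    """Approximate clip duration from the WAV file size in bytes."""
    return max(wav_filesize - WAV_HEADER_BYTES, 0) / BYTES_PER_SECOND


def filter_by_duration(rows, max_seconds=15.0):
    """Keep only the rows whose estimated duration is within the limit."""
    return [
        row for row in rows
        if estimate_seconds(int(row["wav_filesize"])) <= max_seconds
    ]


# Hypothetical CSV content: a 10-second clip and a 5-minute clip.
sample_csv = io.StringIO(
    "wav_filename,wav_filesize,transcript\n"
    "short.wav,320044,नाम\n"
    "long.wav,9600044,सहायता\n"
)
kept = filter_by_duration(csv.DictReader(sample_csv))
print([row["wav_filename"] for row in kept])
```

Long recordings such as multi-minute conversations would need to be split into short segments with matching transcripts, rather than simply dropped, if that audio is to be used.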

cryptoaimdy · October 17, 2019, 1:09pm

Okay, thank you for the support. Great community with great people.

Sreyan_Ghosh · June 25, 2020, 2:01pm

@cryptoaimdy I am working on Hindi ASR for my thesis. Could you please help with the process or steps to build Hindi ASR using DeepSpeech?

cryptoaimdy · June 25, 2020, 2:55pm

Hi, the process is the same as for English, except for the alphabet file and the training data. You need to create an alphabet.txt file for Hindi containing all the characters of Hindi. You also need a Hindi vocabulary file and some audio clips with transcripts to train and test.

You can remind me at this email address to send you the Hindi vocab file I have:
mohammadali1ali@gmail.com
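
As a rough illustration of the alphabet step, the labels can be derived from the transcripts themselves so that no character used in the data is left out. This sketch assumes one character per line in alphabet.txt and puts the space first. `build_alphabet` and `alphabet_file_text` are hypothetical helpers, and the sample transcripts are made up.

```python
def build_alphabet(transcripts):
    """Collect the distinct characters used in the transcripts, space first."""
    chars = set()
    for line in transcripts:
        chars.update(line.strip("\r\n"))  # ignore line endings
    chars.discard(" ")
    return [" "] + sorted(chars)


def alphabet_file_text(labels):
    """Render the labels one per line with Unix line endings."""
    return "\n".join(labels) + "\n"


# Hypothetical transcripts taken from the kind of phrases used in this thread.
labels = build_alphabet(["नाम", "का नाम"])
print(labels)
```

When writing the result to disk, opening the file with `encoding="utf-8"` and `newline="\n"` avoids a byte-order mark and Windows line endings. Reviewing the generated list by hand is still worthwhile, since stray symbols in the transcripts would otherwise end up as labels.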

Sreyan_Ghosh · June 26, 2020, 1:41pm

@cryptoaimdy Thank you for your response. I have emailed you at the address you gave.

Parinda_Pranami · June 24, 2021, 5:52am

Hi @cryptoaimdy @lissyx ,

I am working on a similar scenario, i.e. using DeepSpeech with a Hindi dataset.

The parameters I am using right now are: LR = 0.00003, DR = 0.2, alpha = 0.75, beta = 1.85, n_hidden = 2048, and train, test and dev batch sizes = 16, 16, 16.

With these parameters, the last training run completed in 10 hours after 17 epochs, and the results were as follows:

Training loss: 223.638213,
validation loss: 236.106942,
Testing loss: 254.768326,
WER: 0.790118,
CER: 0.593037.

Earlier I was getting a loss in the range 300-400 with other parameter values, so I have been changing the values and retraining repeatedly to get the best result. The results I pasted above cannot be considered good. Can you suggest which values I should change to reach better results? Any help is appreciated.

Thanks!

Parinda_Pranami · July 9, 2021, 6:09am

Hi, can someone please help me out?
@Sreyan_Ghosh @lissyx
