Getting wrong output from the trained model

siddhant2000 · January 30, 2019, 2:14pm

Hi,

I am able to train a model on a language (Hindi), but I get wrong output from the model when I pass it a sample ‘wav’ file. I have tried many times with no luck.

Below is the command I am using to train the model:

python -u DeepSpeech.py --train_files data/flac_csv2.csv --dev_files data/flac_csv2.csv --test_files data/flac_csv2.csv --train_batch_size 10 --dev_batch_size 10 --test_batch_size 10 --n_hidden 1028 --export_dir model_export --epoch 75 --checkpoint_dir /home/sp/.local/share/deepspeech/flac_new2 > flac_new1.log

We are providing 4600 input wav files and their transcriptions in the csv file. The model trains successfully but returns wrong output.

Can you please guide me through the steps to train the model, as I might be missing a step?

Also, please let me know if we need to create our own ‘lm.library’ file for a specific language.

Any help would be appreciated.

Thanks in advance…

lissyx · January 30, 2019, 6:51pm

We would need more context. 4600 wav files might not be enough data; you changed n_hidden to 1028, which might also not be good; and you might be running too many epochs. Without more context and the training log, we can’t help. Is the wrong output caused by the model not learning enough? By overfitting? Can’t tell.
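For anyone trying to answer the "not learning enough vs. overfitting" question from a log, here is a minimal sketch of one way to look at it. It assumes the log contains lines of the form `I Training of Epoch N - loss: X`, as in the logs posted in this thread. The helper names, the `window` size, and the `tolerance` threshold are hypothetical choices for illustration, not DeepSpeech defaults.

```python
import re

# Matches lines such as "I Training of Epoch 3 - loss: 303.340517".
LOSS_RE = re.compile(r"Training of Epoch (\d+) - loss: (\d+(?:\.\d+)?)")


def epoch_losses(log_text):
    """Return a list of (epoch, training_loss) pairs found in the log text."""
    return [(int(m.group(1)), float(m.group(2))) for m in LOSS_RE.finditer(log_text)]


def has_plateaued(losses, window=5, tolerance=0.01):
    """Return True if the last `window` losses differ by less than
    `tolerance` (relative to the largest of them).

    A plateau at a high loss value suggests the model is not learning,
    rather than overfitting. Both parameters are illustrative assumptions.
    """
    if len(losses) < window:
        return False
    recent = [loss for _, loss in losses[-window:]]
    return (max(recent) - min(recent)) / max(recent) < tolerance


# A few lines copied from a training log of the kind discussed in this thread.
SAMPLE_LOG = """\
I Training of Epoch 10 - loss: 302.963002
I Training of Epoch 11 - loss: 302.966895
I Training of Epoch 12 - loss: 302.946233
I Training of Epoch 13 - loss: 302.952651
I Training of Epoch 14 - loss: 302.931431
"""

if __name__ == "__main__":
    losses = epoch_losses(SAMPLE_LOG)
    print("epochs parsed:", len(losses))
    print("plateaued:", has_plateaued(losses))
```

This only looks at training loss. Telling overfitting apart from underfitting also needs a validation loss, which the log will only contain if validation is enabled during training.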

siddhant2000 · January 31, 2019, 6:47am

Thanks lissyx for the reply.

Please find below the output of the logs:

Preprocessing [‘data/flac_csv.csv’]
Preprocessing done
Preprocessing [‘data/flac_csv.csv’]
Preprocessing done
W Parameter --validation_step needs to be >0 for early stopping to work
I STARTING Optimization
I Training epoch 0…
I Training of Epoch 0 - loss: 289.471059
I Training epoch 1…
I Training of Epoch 1 - loss: 303.551886
I Training epoch 2…
I Training of Epoch 2 - loss: 303.208348
I Training epoch 3…
I Training of Epoch 3 - loss: 303.340517
I Training epoch 4…
I Training of Epoch 4 - loss: 303.261908
I Training epoch 5…
I Training of Epoch 5 - loss: 303.110338
I Training epoch 6…
I Training of Epoch 6 - loss: 303.026394
I Training epoch 7…
I Training of Epoch 7 - loss: 303.048171
I Training epoch 8…
I Training of Epoch 8 - loss: 302.989858
I Training epoch 9…
I Training of Epoch 9 - loss: 303.014609
I Training epoch 10…
I Training of Epoch 10 - loss: 302.963002
I Training epoch 11…
I Training of Epoch 11 - loss: 302.966895
I Training epoch 12…
I Training of Epoch 12 - loss: 302.946233
I Training epoch 13…
I Training of Epoch 13 - loss: 302.952651
I Training epoch 14…
I Training of Epoch 14 - loss: 302.931431
I FINISHED Optimization - training time: 0:59:37
Preprocessing [‘data/flac_csv.csv’]
Preprocessing done
Computing acoustic model predictions…
Decoding predictions…
Test - WER: 0.970377, CER: 98.243197, loss: 302.404297

WER: 1.000000, CER: 6.000000, loss: 27.665541

src: “a story”
res: "i "

WER: 1.000000, CER: 6.000000, loss: 27.665541

src: “a story”
res: "i "

WER: 1.000000, CER: 6.000000, loss: 31.257378

src: “oh emil”
res: "i "

WER: 1.000000, CER: 4.000000, loss: 32.887798

src: “ay me”
res: "i "

WER: 1.000000, CER: 4.000000, loss: 32.887798

src: “ay me”
res: "i "

WER: 1.000000, CER: 8.000000, loss: 33.150005

src: “direction”
res: "i "

WER: 1.000000, CER: 5.000000, loss: 34.121330

src: “venice”
res: "i "

WER: 1.000000, CER: 8.000000, loss: 34.133373

src: “verse two”
res: "i "

WER: 1.000000, CER: 7.000000, loss: 34.345585

src: “indeed ah”
res: "i "

WER: 1.000000, CER: 7.000000, loss: 34.345585

src: “indeed ah”
res: "i "

I Exporting the model…
I Models exported at model_export

The output is the same in all cases…
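As an aid to reading the report above, the per-sample numbers can be reproduced by hand. This is a minimal sketch under an assumption inferred from the values in this log, not checked against the DeepSpeech source: the per-sample CER is the raw character edit distance (not a ratio), and the WER is the word edit distance divided by the number of reference words. The function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_distance(src, res):
    """Raw character-level edit distance (assumed to be the 'CER' column)."""
    return levenshtein(src, res)


def wer(src, res):
    """Word edit distance divided by the number of reference words."""
    ref_words = src.split()
    return levenshtein(ref_words, res.split()) / len(ref_words)


if __name__ == "__main__":
    print(char_distance("a story", "i "))    # 6, as in the report
    print(wer("a story", "i "))              # 1.0, as in the report
    print(char_distance("indeed ah", "i "))  # 7, as in the report
```

Read this way, a model that always emits "i " gets a WER of 1.0 on every sample, and each "CER" value is roughly the length of the reference transcript.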

I have been stuck on this for almost 5 days. Every day I try a different approach to training the model, based on related issues and the replies to them, but with no luck.
Can you please help me with the steps to train the model in my language? I might be missing something that I am not able to track down.

Thanks…

shaikh.zhas · January 31, 2019, 7:02am

I had the same output. My data was in 8 kHz format, and I thought that might be the main reason.
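A quick way to check the audio format before training is sketched below, using only Python's standard `wave` module. The target of 16 kHz, mono, 16-bit PCM is an assumption about what the DeepSpeech pipeline expects by default and should be verified against the version in use. The function names are hypothetical.

```python
import wave


def wav_format(path):
    """Return (sample_rate_hz, channels, sample_width_bytes) for a WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnchannels(), wav.getsampwidth()


def is_16khz_mono_16bit(path):
    """True if the file is 16 kHz, mono, 16-bit PCM (the assumed target format)."""
    return wav_format(path) == (16000, 1, 2)
```

For example, calling `wav_format` on each file listed in the training csv would show an 8 kHz recording as `(8000, 1, 2)`. Such files would need resampling before they match the assumed format.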

Hope we can solve this problem.

Here is my output:

Computing acoustic model predictions…
100% (285 of 285) |######################| Elapsed Time: 0:08:37 Time: 0:08:37
Decoding predictions…
100% (285 of 285) |######################| Elapsed Time: 0:07:52 Time: 0:07:52
Test - WER: 0.993192, CER: 66.797807, loss: 326.212860

WER: 1.500000, CER: 17.000000, loss: 172.890213

src: “оформили подскажите”
res: “в т п”

WER: 1.500000, CER: 14.000000, loss: 197.776184

src: “қанша переводите”
res: “в т п”

WER: 1.333333, CER: 16.000000, loss: 208.935623

src: “это родственники ваш”
res: “в т е е”

WER: 1.000000, CER: 20.000000, loss: 1.283422

src: “номер телефона подскажите”
res: “е т и”

WER: 1.000000, CER: 7.000000, loss: 4.558431

src: “да да да”
res: “а”

WER: 1.000000, CER: 15.000000, loss: 4.634631

src: “да ночи ежедневно”
res: “не”

WER: 1.000000, CER: 10.000000, loss: 7.905017

src: “да это тальго”
res: “эль”

WER: 1.000000, CER: 19.000000, loss: 7.950384

src: “на седьмое чего декабря”
res: “в т ер”

WER: 1.000000, CER: 18.000000, loss: 9.858935

src: “ну почему да причина”
res: "ни "

WER: 1.000000, CER: 21.000000, loss: 10.437287

src: “на шестнадцатое декабря”
res: “не”

I Exporting the model…
I Models exported at /home/dulan/models

pete · May 25, 2019, 7:20am

Are you still having this problem?
