We can’t tell you which parameters will work; it depends on your data.
So yes, with only 40 h it’s not surprising you are not getting interesting results.
After editing some of the parameters, my model converged.
(
learning_rate: 0.0003
batch: 16
n_hidden: 1024
)
But WER and CER are very bad on the test data.
No improvement can be seen in the test results.
WER: 1.000000, CER: 1.000000, loss: 213.787552
WER: 1.000000, CER: 0.666667, loss: 2.507114
WER: 1.000000, CER: 0.821429, loss: 2.494120
WER: 1.000000, CER: 0.774194, loss: 2.488260
Are you still working on 40h of data? If so and you are training from scratch, it’s not unexpected.
You might want to re-generate the Common Voice data to allow more duplicates, as done in https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/Dockerfile.train
https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/CONTRIBUTING.md#build-the-image
Especially the duplicate_sentence_count parameter. With 270 hours and those parameters, it should start to be better.
Can you share complete test output?
At least the plot of the loss for training/validation looks quite good.
What have you used as the external scorer?
No. At this time we use as much data as possible for training, about 270 h.
flags.py doesn’t have a duplicate_sentence_count flag.
I don’t use a scorer during the training phase, but data/lm/kenlm.scorer is used by default in flags.py.
Is it necessary to use an external scorer and train a new one?
In the Dockerfile.train linked earlier.
The default scorer is made for English data and the English alphabet; it’s not going to work with other data …
It looks like you are already using a custom scorer as you get some Persian output. A good scorer will significantly improve your WER.
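In case it helps others in this thread, the usual two-step recipe for building a custom scorer with DeepSpeech 0.9 looks roughly like the following. The corpus name, output paths, and the alpha/beta values are placeholders (alpha/beta are normally tuned, e.g. with lm_optimizer.py):

```shell
# 1) Build an ARPA LM and vocabulary from a cleaned Persian text corpus
#    (one sentence per line), using the KenLM binaries.
python3 data/lm/generate_lm.py \
  --input_txt persian_corpus.txt \
  --output_dir persian_lm \
  --top_k 500000 \
  --kenlm_bins /path/to/kenlm/build/bin \
  --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" \
  --binary_a_bits 255 --binary_q_bits 8 --binary_type trie

# 2) Package lm.binary plus the vocabulary into a .scorer file that can be
#    passed to training/inference via --scorer_path.
./generate_scorer_package \
  --alphabet persian_alphabet.txt \
  --lm persian_lm/lm.binary \
  --vocab persian_lm/vocab-500000.txt \
  --package persian.scorer \
  --default_alpha 0.931 --default_beta 1.183
```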
Hi, Olaf, and all of my friends on this page.
After training a Persian scorer, everything works very well.
Thank you for all the notes.
Test epoch | Steps: 169 | Elapsed Time: 0:01:13
Test on data/clips/test.csv - WER: 0.037417, CER: 0.021683, loss: 8.717824
WER: 0.000000, CER: 0.000000, loss: 46.499149
WER: 0.000000, CER: 0.000000, loss: 46.358776
WER: 0.000000, CER: 0.000000, loss: 40.054466
WER: 0.000000, CER: 0.000000, loss: 38.130241
WER: 0.000000, CER: 0.000000, loss: 37.861706
WER: 0.000000, CER: 0.000000, loss: 3.321916
WER: 0.000000, CER: 0.000000, loss: 3.310491
WER: 0.000000, CER: 0.000000, loss: 3.299302
WER: 0.000000, CER: 0.000000, loss: 3.288944
WER: 0.000000, CER: 0.000000, loss: 3.286442
WER: 1.000000, CER: 0.800000, loss: 15.691487
WER: 1.000000, CER: 0.800000, loss: 13.577458
WER: 1.000000, CER: 0.636364, loss: 12.150107
WER: 1.333333, CER: 0.142857, loss: 4.585015
WER: 1.666667, CER: 0.238095, loss: 16.671534
The median WERs are all 0; it looks like you might be overfitting your data.
Have you tried some fresh audio files on your trained model? Or what is the WER on the test set (which is not in train or dev)?
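As a side note on how to read these per-sample numbers: WER is the word-level edit distance divided by the number of reference words, which is also why values above 1.0 (as in some rows above) are possible when the hypothesis inserts extra words. A minimal sketch of the computation:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    dp = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(hyp) + 1):
            cur = dp[j]
            dp[j] = min(
                dp[j] + 1,                          # deletion
                dp[j - 1] + 1,                      # insertion
                prev + (ref[i - 1] != hyp[j - 1]),  # substitution / match
            )
            prev = cur
    return dp[-1]

def wer(reference, hypothesis):
    # Word error rate: word-level edit distance / reference word count.
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    # Character error rate: character-level edit distance / reference length.
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, a one-word hypothesis against a one-word reference with two extra inserted words gives a WER of 2.0.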
Test epoch | Steps: 169 | Elapsed Time: 0:01:13
Test on data/clips/test.csv - WER: 0.037417, CER: 0.021683, loss: 8.717824
But my test data was in the training files, so the test was not fair.
Just try your model with your own voice or some other new material and see how it is doing.
Did you just update alphabet.txt, or did you change other parts of the code for Persian training?
And about your language model: did you clean the text before building it (num2words, removing punctuation, etc.)? I am doing this preprocessing and want to know about your experience.
Yes; you don’t use UTF-8 mode, you just change the alphabet to the Persian one and train. You will need your own scorer for Persian; the English one does not work.
Preprocessing should be the same as for other NLP tasks in Persian; maybe look around or, as you did here, ask others.
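As a concrete illustration of that kind of cleaning, here is my own sketch (not the exact pipeline anyone in this thread used; the character mappings and the choice to keep ZWNJ are assumptions):

```python
import re

# Minimal text cleaning for a Persian LM corpus.
ARABIC_TO_PERSIAN = {
    "\u064a": "\u06cc",  # Arabic Yeh -> Persian Yeh
    "\u0643": "\u06a9",  # Arabic Kaf -> Persian Keheh
}
WESTERN_TO_PERSIAN_DIGITS = str.maketrans("0123456789", "۰۱۲۳۴۵۶۷۸۹")

def normalize(line: str) -> str:
    # Unify Arabic code points with their Persian equivalents.
    for src, dst in ARABIC_TO_PERSIAN.items():
        line = line.replace(src, dst)
    # Map ASCII digits to Persian digits; a number-to-words step
    # (e.g. a num2words-style tool) could replace this if preferred.
    line = line.translate(WESTERN_TO_PERSIAN_DIGITS)
    # Drop punctuation but keep letters, digits, whitespace and ZWNJ (U+200C).
    line = re.sub(r"[^\w\s\u200c]", " ", line)
    # Collapse runs of whitespace.
    return re.sub(r"\s+", " ", line).strip()
```

Whatever normalization you choose, the important part is to apply exactly the same cleaning to the LM corpus and to the training transcripts, so the scorer vocabulary matches the acoustic model’s output alphabet.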
Did you use transfer learning from Latin-alphabet languages to gain this WER on the test data?
I think that may help, because many people have used transfer learning from English to Spanish or German and got better results. But I am not sure this can help for Persian too.
Hi Mohammad
Would you please share the details of your training, like the values of your parameters and your loss logs?
I’m doing the same as you did and can’t get appropriate results.
thanks in advance
And what is the size of the Persian text file for training the scorer, and what kind of sentences do you use?
I think transfer learning is appropriate when the alphabets are the same (I mean the Latin alphabet), but for Persian, because the alphabet is different, you can’t do transfer learning.
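For completeness: the upstream DeepSpeech transfer-learning documentation does describe reusing an English checkpoint with a different alphabet, by dropping and re-initialising the topmost layer(s) with --drop_source_layers, so a mismatched alphabet is not necessarily a blocker. A rough, untested sketch (all paths and file names are placeholders):

```shell
# Hypothetical fine-tuning run reusing an English checkpoint for Persian.
# --drop_source_layers 1 re-initialises the output layer, so the new
# (Persian) alphabet size no longer has to match the source model's.
python3 DeepSpeech.py \
  --drop_source_layers 1 \
  --alphabet_config_path persian_alphabet.txt \
  --load_checkpoint_dir deepspeech-0.9.3-checkpoint \
  --save_checkpoint_dir persian_checkpoint \
  --train_files train.csv \
  --dev_files dev.csv \
  --test_files test.csv \
  --scorer_path persian.scorer
```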
Hey, can you share your steps to make your own scorer file?
I’m unable to get the scorer when I run the command below:
!./generate_scorer_package --alphabet /gdrive/My\ Drive/dataset/UrduAlphabet_newscrawl.txt --lm /gdrive/My\ Drive/urdu_lm/lm.binary --package /content/gdrive/My\ Drive/dataset/kenlm.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284 --vocab /content/gdrive/My\ Drive/urdu_lm/vocab-500
I got the error below when running it:
500000 unique words read from vocabulary file.
Doesn’t look like a character based (Bytes Are All You Need) model.
--force_bytes_output_mode was not specified, using value infered from vocabulary contents: false
Invalid label 0
Really? Reuben gave you the answer 2 days ago. Please don’t hijack older threads; read what we post:
Dear Mohammad,
Can you share the changes you made to the original DeepSpeech repo to make it work with the Arabic/Farsi script?
If there are fixes for training Arabic/Farsi, please send a PR …