Trained a model, but the actual predictions are too poor

As you said, I followed these steps:
1. Create vocab.txt (commands, one per line)
2. Clone and build KenLM
3. Create the ARPA file as well as the lm.binary file:

       bin/lmplz --text vocab.txt --arpa words.arpa --o 3
       bin/build_binary -T -s words.arpa lm.binary

After that, how do I use it?
I now have my dataset and the files above. I am confused about how to use those files.

Try changing --o 3 to --o 1, just a recommendation.
So, you already have lm.binary from your vocab.txt, right? I assume your vocab.txt is the file with your 300 commands and not your alphabet.

That lm.binary file is the one you provide as an argument as of version 0.6.1 of DeepSpeech. Something like this:

python DeepSpeech.py --test_files <your_test_file> --lm_binary_path <your_lm.binary_path> 

and other parameters like the path of your saved model.
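
For instance, together with the checkpoint of your saved model and the trie path (placeholders only, not paths from this thread), the full call could look roughly like:

    python DeepSpeech.py \
      --test_files <your_test_file> \
      --checkpoint_dir <your_checkpoint_dir> \
      --lm_binary_path <your_lm.binary_path> \
      --lm_trie_path <your_trie_path>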

The last thing you need to build in 0.6.1 is your trie. For that you have to run an executable called generate_trie, which you can compile yourself from the native client.
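
If I remember correctly, generate_trie takes the alphabet file, the LM binary, and the output path, roughly like this (file names are only examples):

    ./generate_trie alphabet.txt lm.binary trie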

Remember to check out the 0.6.1 tag to use language models like this.


I tried changing --o 3 to --o 1 but got an error, so I used --o 3.

Yes, right, my vocab contains 300 commands.

I am using DeepSpeech v0.5.1:

              python DeepSpeech.py \
                --lm_binary_path bi_file/lm.binary \
                --train_files all_augumented_samples/train.csv \
                --dev_files all_augumented_samples/dev.csv \
                --test_files all_augumented_samples/test.csv \
                --epoch 13 \
                --train_batch_size 10 \
                --dev_batch_size 10 \
                --test_batch_size 10 \
                --learning_rate 0.0001 \
                --export_dir export2/ \

Is this right, sir?

Is it possible using DeepSpeech v0.5.1?

Yes. As I said, you only need the trie; it is another file that you have to generate yourself from the lm.binary that you created. Then, given the size of your data, I would recommend training with something like --n_hidden 1024 --dropout_rate 0.3 --learning_rate 0.0001.

Regarding the order of your LM, I didn't know that you can't use order 1. Did you try --o 2?
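
Something like this might work (on such a small corpus, lmplz may also require the --discount_fallback flag; the file names reuse the ones from your earlier commands):

    bin/lmplz --text vocab.txt --arpa words.arpa --o 2 --discount_fallback
    bin/build_binary -T -s words.arpa lm.binary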

Remember that it is very important to create your own trie for your own language model binary.

Thanks so much, sir, for helping me.

I successfully generated the trie.

Thanks once again

@victornoriega7 After training with the LM, the word error rate still does not decrease.

Now I have trained the model using DeepSpeech v0.6.1.

Command:

python DeepSpeech.py \
--train_files 18_2_2020_CSV/Train/train.csv \
--dev_files 18_2_2020_CSV/Dev/dev.csv \
--test_files 18_2_2020_CSV/Test/test.csv \
--epochs 30 \
--learning_rate 0.00001 \
--export_dir lm_export/ \
--n_hidden 1024 \
--checkpoint_dir lm_checkpoint/ \
--lm_binary_path new_native_client/18_02_lm.binary \
--lm_trie_path new_native_client/trie \
--automatic_mixed_precision=True \

After training, the results are:
Test on 18_2_2020_CSV/Test/test.csv - WER: 0.993631, CER: 0.941896, loss: 31.854940

Please help me with this problem, please.

How much data do you have?

@victornoriega7 I have around 2600 samples.

And now I am using DeepSpeech v0.6.1.

Is the above command right or wrong for training the model with KenLM?

I know the dataset is small, but I only want to recognize a few commands.

2600 samples is very, very little…
How many sentences are in your vocabulary?

@elpimous_robot My vocabulary is around 200 sentences.

My sentence length is around 8 words.

How many different voices?

@elpimous_robot
6 different voices:
3 male and 3 female

Arghhhh…:dizzy_face:

My friend, it’s totally impossible!!!

Ex:
2600 samples from only 1 speaker,
for 200 sentences,
with nearly 500 different words,
and an alphabet of 26 to 30 characters…
You could reach 40 to 60% accuracy max.

For 200 different sentences and 4 people, it’s impossible without a bigger model.

If it were my problem, I’d work with at least
10,000 training samples for each person, and split them 70/20/10% into train/dev/test.
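
As a rough sketch only (the CSV name and the pandas approach are my own illustration, not from this thread), a 70/20/10 split can be done like this:

    import pandas as pd

    # Shuffle all samples, then split roughly 70/20/10 into the train/dev/test CSVs
    # DeepSpeech expects (columns: wav_filename, wav_filesize, transcript).
    df = pd.read_csv("all_samples.csv").sample(frac=1.0, random_state=42)
    train_end = int(0.7 * len(df))
    dev_end = int(0.9 * len(df))
    df.iloc[:train_end].to_csv("train.csv", index=False)
    df.iloc[train_end:dev_end].to_csv("dev.csv", index=False)
    df.iloc[dev_end:].to_csv("test.csv", index=False)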

Sorry for the bad news.

@elpimous_robot It’s ok. :slightly_smiling_face:

Can I increase the dataset using audio augmentation?

Will it help me or not? Actually, I am new to this, so…? :blush:

Yes, of course it will help, but only on top of a solid base…
It will not help here for now.
Not enough data.

Augmentation is helpful for adding noise, echoes, duration changes, and tone changes (see the noise example below).
But the most important part is good initial data, and ENOUGH of it.
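
For example, adding a bit of white noise to a WAV takes only a few lines (a sketch only; the file names and the 0.005 noise level are illustrative, not values from this thread):

    import numpy as np
    from scipy.io import wavfile

    # Read a 16-bit WAV, add low-level white noise, and write the result back out.
    rate, audio = wavfile.read("clean_command.wav")
    audio = audio.astype(np.float32)
    noise = np.random.randn(*audio.shape).astype(np.float32)
    augmented = audio + 0.005 * np.max(np.abs(audio)) * noise
    augmented = np.clip(augmented, -32768, 32767)  # keep within int16 range
    wavfile.write("augmented_command.wav", rate, augmented.astype(np.int16))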

@elpimous_robot thanks friend

@lissyx and @elpimous_robot
Inference is taking a long time; any idea how to speed that process up?
E.g.:

   Loaded model in 0.0259s.
   Loading language model from files KenLM-model/trie
   Loaded language model in 0.00017s.
   Running inference.
   hi how are you 
   Inference took 3.328s for 5.952s audio file.

Is there any solution? Please help me.

Hey, you can’t just ask people random questions without context. 3.3s for 5.9s is quite fast.


@lissyx thanks for the help.

@Sudarshan.gurav14 I won’t answer, since you don’t care enough about my answers to read them.


@lissyx I apologize.

Really, I had no idea, but I still got the answer I was expecting; that’s why I said thank you.

Next time I will make sure to give proper context before asking.