Please tell me how you expect anyone to answer "it's not fast enough, what can I do" when you haven't even told us what your hardware is and what your constraints are.
If you're using 0.6.1 you should also update your trie and lm.binary. Are you sure that you're generating your LM from the file with all of your possible commands?
Can I enlarge a dataset using audio augmentation?
No. From the source code, I inferred that audio augmentation doesn't create new files; it just transforms the current audio into something noisy on the fly. This is to make the trained model more robust to noisy input and able to generalize well.
In your case, you don't need that much generalization, because you already know that only a few people will be using it.
Try getting more data, as in the French robot topic.
Yes, and I could say: after more, more, more data…
Use Python or a terminal command to duplicate all your data and apply audio transformations that slightly change the audio characteristics.
You'll have 2x more data…
The more data, the better your accuracy.
Note: pay attention to the data augmentation values!! Use small changes, or you'll train on bad audio files and your accuracy will not increase.
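For example, something along these lines in Python (a sketch, not DeepSpeech's built-in augmentation; the folder names and the small random gain range are made up for illustration):

import os
import random

import numpy as np
import soundfile as sf

SRC, DST = "all_wav", "all_wav_augmented"  # assumed folder names
os.makedirs(DST, exist_ok=True)

for name in os.listdir(SRC):
    if not name.endswith(".wav"):
        continue
    audio, rate = sf.read(os.path.join(SRC, name))
    gain_db = random.uniform(-0.3, 0.3)    # small changes only (see the note above)
    audio = np.clip(audio * 10 ** (gain_db / 20), -1.0, 1.0)
    sf.write(os.path.join(DST, name[:-4] + "_aug.wav"), audio, rate)

Add the new files to your train.csv alongside the originals and you have 2x the data.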
@Sudarshan.gurav14,
Friend, deep learning teaches us patience!
You need to do like all of us: progress slowly; read, read… read; and test your own ideas.
And magic will appear!
I changed the recording speed, both slower and faster.
I want to change the gain of the audio using the voice corpus tool, as you suggested.
How much should I change the gain? Right now I am changing my gain by 0.5; is that OK?
There is one more arg, -times, that I don't know how to use. Can you please help me?
Also, I have now reduced my commands: I just want 70 commands out of 200.
One more question:
Suppose I have one wav, can I change its gain two times? I mean:
1.wav [original]
1_gain_05.wav [same file, gain 0.5]
1_gain_07.wav [same file, gain 0.7]
Is that OK?
Hello.
Reducing the number of commands is a good idea, but it doesn't change the fact that you need more samples per command.
You augmented the number of wavs, good, but use low values. I'd try 0.2 to 0.3 max.
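If you want exactly two low-value gain variants per file, a small sketch (file names are from the example above; treating the 0.2/0.3 values as dB offsets is my assumption, the voice corpus tool may define gain differently):

import numpy as np
import soundfile as sf

audio, rate = sf.read("1.wav")            # the original file
for gain_db in (0.2, 0.3):                # low values, as advised
    out = np.clip(audio * 10 ** (gain_db / 20), -1.0, 1.0)
    sf.write("1_gain_%02d.wav" % int(gain_db * 10), out, rate)

This writes 1_gain_02.wav and 1_gain_03.wav next to the original.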
Thanks for the quick reply @elpimous_robot.
OK, I will try with 0.2 to 0.3, as well as try to get more samples.
Hi @elpimous_robot,
Now I have 10,000 wav files and I have split them into 70:20:10.
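One way to do such a 70:20:10 split in Python (a sketch; the combined all.csv and the dataset/ folder are assumed names):

import csv
import random

with open("dataset/all.csv") as f:        # assumed combined CSV with a header row
    rows = list(csv.reader(f))
header, data = rows[0], rows[1:]
random.shuffle(data)

n = len(data)
parts = {
    "train.csv": data[:int(0.7 * n)],
    "dev.csv": data[int(0.7 * n):int(0.9 * n)],
    "test.csv": data[int(0.9 * n):],
}
for name, part in parts.items():
    with open("dataset/" + name, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)          # each split keeps the header
        writer.writerows(part)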
Then I train the model using the command below:
python DeepSpeech.py \
--train_files dataset/train.csv \
--dev_files dataset/dev.csv \
--test_files dataset/test.csv \
--epochs 50 \
--learning_rate 0.0001 \
--export_dir export1/ \
--checkpoint_dir cpoint1/
Can you please suggest changes to the command if required?
Use a batch size for training and CuDNN to make it faster, and include dropout to improve accuracy.
Thanks @othiele for the quick reply.
Then I train the model using the command below:
python DeepSpeech.py \
--train_files dataset/train.csv \
--dev_files dataset/dev.csv \
--test_files dataset/test.csv \
--epochs 50 \
--learning_rate 0.0001 \
--export_dir export1/ \
--train_batch_size 10 \
--dev_batch_size 10 \
--test_batch_size 5 \
--dropout_rate 0.15 \
--checkpoint_dir cpoint1/
Right?
I don't understand the CuDNN part, can you please explain?
--n_hidden
I used the default; is there no need to change it?
You can change n_hidden; it didn't make much of a difference for me, but I have 1000 hours. You could try 512.
Set
--use_cudnn_rnn=True
for speed, and set the train batch size as high as you can without getting an error. Anything higher than 1 is good, usually (4, 8, 16, …).
Hi @othiele,
I trained the model, but the WER is constant. Is there any way to decrease the WER?
Here is my training result:
Epoch 3 | Training | Elapsed Time: 0:00:21 | Steps: 49 | Loss: 1.624806
Epoch 3 | Validation | Elapsed Time: 0:00:05 | Steps: 13 | Loss: 3.249443 | Dataset: 10000_data_set/dev.csv
I Early stop triggered as (for last 4 steps) validation loss: 3.249443 with standard deviation: 0.094293 and mean: 3.145775
I FINISHED optimization in 0:01:52.091116
I Restored variables from best validation checkpoint at 10000_512_checkpoint/best_dev-21290, step 21290
Testing model on 10000_data_set/test.csv
Test epoch | Steps: 9 | Elapsed Time: 0:00:44
Test on 10000_data_set/test.csv - WER: 0.225941, CER: 0.049616, loss: 4.468176
The command is:
python3.6 DeepSpeech.py \
--train_files 10000_data_set/train.csv \
--checkpoint_dir 10000_512_checkpoint/ \
--epochs 60 \
--dev_files 10000_data_set/dev.csv \
--test_files 10000_data_set/test.csv \
--n_hidden 512 \
--learning_rate 0.0001 \
--export_dir 10000_512_export \
--early_stop False \
--use_seq_length False \
--earlystop_nsteps 3 \
--estop_mean_thresh 0.1 \
--estop_std_thresh 0.1 \
--dropout_rate 0.25 \
--train_batch_size 80 \
--dev_batch_size 80 \
--test_batch_size 45 \
--report_count 50 \
--use_cudnn_rnn True
@elpimous_robot
Is it required to give the full path of the wav files in the CSV?
Directory structure:
wav_file_folder:
-all_wav [10,000 wav files]
-train.csv
-test.csv
-dev.csv
all in the same directory.
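For reference, the training CSVs use the columns wav_filename, wav_filesize and transcript. As far as I know, DeepSpeech opens wav_filename as given, so a relative path resolves from wherever you run DeepSpeech.py and an absolute path is the safe choice. A hypothetical row (path and file size made up):

wav_filename,wav_filesize,transcript
/home/user/wav_file_folder/all_wav/rec.1.wav,65244,how are you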
A WER of 0.22 for 10,000 input files doesn't sound too bad. I would go for more data to get better results; I am not sure other methods yield a lower WER with so few files.
You can try nlpaug with what you have, but judging from your description, more material will give you a lot more.
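A minimal sketch of what nlpaug audio augmentation can look like, assuming nlpaug 1.x's nlpaug.augmenter.audio module and its LoudnessAug class (check the nlpaug docs for your version, as the API has changed over time):

import librosa
import nlpaug.augmenter.audio as naa

# Hypothetical file; sr=None keeps the original sampling rate.
audio, rate = librosa.load("all_wav/rec.1.wav", sr=None)
aug = naa.LoudnessAug(factor=(0.9, 1.1))  # keep the change small
augmented = aug.augment(audio)            # augmented copy (a list in newer versions)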
Does this model understand the sequence? When I say the exact command the result is much better, but if I change the sequence of the command the result is much poorer.
How does the model work? If I am not wrong, it gets features for every character and forms the word, right?
Actually, after getting the poor result I was tense.
One more thing: in all my commands one word is common; every command starts with a word like "demo".
Can we use the same word multiple times?
@othiele, can you please help with that?
Please give an example; I am not sure I understand what you want.
Hello @othiele, I have trained the model as we discussed.
In the dataset we have wave files, and in all of those wave files one word is common.
Ex:
rec.1.wav 1938 how are you
rec.2.wav 1938 how about you
rec.3.wav 1938 how much
When I trained the model, the word "how" was common in all the wave files.
So, my question is: do I need to avoid "how" or not?