Trained model but actual prediction is too poor

Hello.
Reducing the number of orders is a good idea, but it doesn't change the fact that you need more samples per command.
You augmented the number of wavs, good, but use low values. I'd try 0.2 to 0.3 max.
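
For illustration, a minimal numpy sketch of what low-intensity waveform augmentation could look like; the gain/noise scaling here is my own assumption, not a DeepSpeech flag:

    import numpy as np

    def augment_waveform(samples, factor=0.2):
        """Mild gain and noise perturbation; keep `factor` low (0.2-0.3 max)."""
        # Random gain in [1 - factor, 1 + factor]
        gain = 1.0 + np.random.uniform(-factor, factor)
        # Low-amplitude Gaussian noise relative to the signal's RMS
        rms = np.sqrt(np.mean(samples ** 2))
        noise = np.random.randn(len(samples)) * rms * factor * 0.1
        return np.clip(samples * gain + noise, -1.0, 1.0)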

Thanks for the quick reply @elpimous_robot
OK, I will try with 0.2 to 0.3 and also try to get more samples.

Hi @elpimous_robot,

Now I have 10,000 wav files, and I split them 70:20:10.
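
For reference, a rough sketch of such a 70:20:10 split, assuming one combined CSV in DeepSpeech's wav_filename/wav_filesize/transcript format (the file names here are hypothetical):

    import pandas as pd

    # Shuffle once so each split sees the same command distribution
    df = pd.read_csv("dataset/all.csv").sample(frac=1, random_state=42)

    n = len(df)
    train_end = int(n * 0.7)
    dev_end = train_end + int(n * 0.2)

    df.iloc[:train_end].to_csv("dataset/train.csv", index=False)
    df.iloc[train_end:dev_end].to_csv("dataset/dev.csv", index=False)
    df.iloc[dev_end:].to_csv("dataset/test.csv", index=False)  # remaining ~10%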

Then I train the model using the command below:

  python DeepSpeech.py \
     --train_files dataset/train.csv \
     --dev_files dataset/dev.csv \
     --test_files dataset/test.csv \
     --epochs 50 \
     --learning_rate 0.0001 \
     --export_dir export1/ \
     --checkpoint_dir cpoint1/

Can you please suggest changes to the command if required?

Use a batch size for training and CuDNN RNN to make it faster, and include dropout to improve accuracy.

Thanks @othiele for the quick reply.

Then I train the model using the command below:

  python DeepSpeech.py \
     --train_files dataset/train.csv \
     --dev_files dataset/dev.csv \
     --test_files dataset/test.csv \
     --epochs 50 \
     --learning_rate 0.0001 \
     --export_dir export1/ \
     --train_batch_size 10 \
     --dev_batch_size 10 \
     --test_batch_size 5 \
     --dropout_rate 0.15 \
     --checkpoint_dir cpoint1/

Right?

I don't get what CuDNN is, can you please explain?

For --n_hidden I used the default, is there no need to change it?

You can change n_hidden; it didn't make much of a difference for me, but I have 1000 hours. You could try 512.

Set

--use_cudnn_rnn=True

for speed, and set the train batch size as high as you can without getting an error. Anything higher than 1 is good, usually (4, 8, 16, …).
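
If you want to automate that probing, a rough sketch (the flags are the ones used in this thread; the retry loop itself is just my assumption about how to hunt for the largest batch that fits):

    import subprocess

    # Probe batch sizes from large to small; a one-epoch run is enough
    # to hit an out-of-memory error if the batch doesn't fit on the GPU.
    for batch in (64, 32, 16, 8, 4, 2, 1):
        cmd = ["python", "DeepSpeech.py",
               "--train_files", "dataset/train.csv",
               "--dev_files", "dataset/dev.csv",
               "--test_files", "dataset/test.csv",
               "--use_cudnn_rnn", "True",
               "--train_batch_size", str(batch),
               "--epochs", "1"]
        if subprocess.run(cmd).returncode == 0:
            print("largest working train batch size:", batch)
            break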

@othiele hi,
I trained the model, but the WER is constant. Is there any way to decrease the WER?

Here is my training result:

           Epoch 3 |   Training | Elapsed Time: 0:00:21 | Steps: 49 | Loss: 1.624806
           Epoch 3 | Validation | Elapsed Time: 0:00:05 | Steps: 13 | Loss: 3.249443 | Dataset: 10000_data_set/dev.csv
           I Early stop triggered as (for last 4 steps) validation loss: 3.249443 with standard deviation: 0.094293 and mean: 3.145775
           I FINISHED optimization in 0:01:52.091116
           I Restored variables from best validation checkpoint at 10000_512_checkpoint/best_dev-21290, step 21290
           Testing model on 10000_data_set/test.csv
           Test epoch | Steps: 9 | Elapsed Time: 0:00:44
           Test on 10000_data_set/test.csv - WER: 0.225941, CER: 0.049616, loss: 4.468176

The command is:

  python3.6 DeepSpeech.py \
     --train_files 10000_data_set/train.csv \
     --checkpoint_dir 10000_512_checkpoint/ \
     --epochs 60 \
     --dev_files 10000_data_set/dev.csv \
     --test_files 10000_data_set/test.csv \
     --n_hidden 512 \
     --learning_rate 0.0001 \
     --export_dir 10000_512_export \
     --early_stop False \
     --use_seq_length False \
     --earlystop_nsteps 3 \
     --estop_mean_thresh 0.1 \
     --estop_std_thresh 0.1 \
     --dropout_rate 0.25 \
     --train_batch_size 80 \
     --dev_batch_size 80 \
     --test_batch_size 45 \
     --report_count 50 \
     --use_cudnn_rnn True

@elpimous_robot
Is it required to give the full path of the wav files in the CSV?

Directory structure:

wav_file_folder:

          -all_wav [10,000 wav files]
          -train.csv
          -test.csv
          -dev.csv

All in the same directory.
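
(Side note: as far as I know, DeepSpeech opens whatever is in the wav_filename column, so relative paths only resolve if you start training from the matching directory; absolute paths are safer. A small pandas sketch to rewrite them:)

    import os
    import pandas as pd

    # Run this from the directory the relative paths are written against.
    for split in ("train", "dev", "test"):
        csv_path = f"wav_file_folder/{split}.csv"
        df = pd.read_csv(csv_path)
        df["wav_filename"] = df["wav_filename"].apply(os.path.abspath)
        df.to_csv(csv_path, index=False)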

A WER of 0.22 for 10,000 input files doesn't sound too bad. I would go for more data to get better results; I am not sure other methods yield a lower WER with so few files.
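
For context: WER is the word-level edit distance divided by the number of reference words, so 0.22 means roughly one word in five is wrong. A minimal sketch of the computation:

    def wer(ref, hyp):
        """Word error rate: word-level Levenshtein distance / reference length."""
        r, h = ref.split(), hyp.split()
        # Standard dynamic-programming edit distance over words
        d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
        for i in range(len(r) + 1):
            d[i][0] = i
        for j in range(len(h) + 1):
            d[0][j] = j
        for i in range(1, len(r) + 1):
            for j in range(1, len(h) + 1):
                sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
                d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
        return d[len(r)][len(h)] / len(r)

    print(wer("turn on the light", "turn of the light"))  # 0.25: 1 of 4 words wrong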

@othiele thank you.

Is more data the only way, or can I use augmented wavs?

You can try nlpaug with what you have, but judging from your description, more material will give you a lot more.
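
A rough sketch of what that could look like with nlpaug's audio augmenters (the file path is hypothetical; check the API of your installed version, since newer nlpaug releases return a list from augment()):

    import librosa
    import nlpaug.augmenter.audio as naa

    # Load one recording (path is hypothetical)
    samples, sr = librosa.load("all_wav/rec.1.wav", sr=16000)

    # Mild noise augmentation; keep intensity low, as suggested earlier
    aug = naa.NoiseAug()
    augmented = aug.augment(samples)  # may return a list in newer versions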

Does this model understand word order? When I say the exact command the result is much better, but if I change the word order of the command the result is much poorer.

How does the model work? If I'm not wrong, it gets features for every character and forms the word, right?
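
(From what I understand: the acoustic model emits a probability for each character at every time step, and CTC decoding, usually combined with a language model, collapses those into words, so word order is learned from the training data rather than matched per command. A toy sketch of the greedy CTC collapse step, with an alphabet made up for illustration:)

    import numpy as np

    ALPHABET = " abcdefghijklmnopqrstuvwxyz'"
    BLANK = len(ALPHABET)  # CTC blank index, by convention the last class

    def greedy_ctc_decode(logits):
        """logits: array of shape (time_steps, len(ALPHABET) + 1).
        Take the best character per frame, collapse repeats, drop blanks."""
        best = logits.argmax(axis=1)
        out, prev = [], BLANK
        for idx in best:
            if idx != prev and idx != BLANK:
                out.append(ALPHABET[idx])
            prev = idx
        return "".join(out)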

Actually, after getting the poor result I was tense.

One more thing: in all my commands one word is common. Every command starts with the same word, like a demo word.

Can we use the same word multiple times?

@othiele Can you please help with that?

Please give an example, I am not sure I understand what you want.

Hello @othiele, I have trained the model as we discussed.

In the dataset we have wav files, and one word is common to all of them.

Ex:
rec.1.wav 1938 how are you
rec.2.wav 1938 how about you
rec.3.wav 1938 how much

When I trained the model, the word 'how' was common to all wav files.

So, my question is: do I need to avoid 'how' or not?

Sorry, I have no idea what you are talking about or what the problem is. @lissyx, do you get what the problem might be?