Hello.
Reducing the number of commands is a good idea, but it doesn’t change the fact that you need more samples per command.
You augmented the number of wav files, good, but use low augmentation values. I would try 0.2 to 0.3 max.
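For example, something like this to add a little noise (just a sketch with numpy and soundfile, assuming mono wavs; the file names are hypothetical, use whatever wav tool you prefer, and keep the factor low):

import numpy as np
import soundfile as sf  # assumed wav I/O library; any equivalent works

# Sketch: white-noise augmentation with a low factor (0.2 to 0.3 max).
audio, sr = sf.read("rec.1.wav")    # hypothetical input file, mono
factor = 0.2                        # stay in the 0.2-0.3 range
rms = np.sqrt(np.mean(audio ** 2))  # scale noise relative to the signal
noisy = audio + factor * rms * np.random.randn(len(audio))
sf.write("rec.1_noise.wav", np.clip(noisy, -1.0, 1.0), sr)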
Thanks for the quick reply @elpimous_robot.
Ok, I will try 0.2 to 0.3 as well as try to get more samples.
Hi @elpimous_robot,
Now I have 10,000 wav files and I split them 70:20:10 into train, dev and test.
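In case it is useful, this is roughly how I made the split (a sketch with pandas, assuming a hypothetical dataset/all.csv in the usual DeepSpeech CSV format with wav_filename, wav_filesize and transcript columns):

import pandas as pd

# Shuffle once, then cut at 70% and 90% to get a 70:20:10 split.
df = pd.read_csv("dataset/all.csv").sample(frac=1, random_state=42)
n = len(df)
df[: int(0.7 * n)].to_csv("dataset/train.csv", index=False)
df[int(0.7 * n): int(0.9 * n)].to_csv("dataset/dev.csv", index=False)
df[int(0.9 * n):].to_csv("dataset/test.csv", index=False)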
Then I train the model using the command below:
python DeepSpeech.py \
--train_files dataset/train.csv \
--dev_files dataset/dev.csv \
--test_files dataset/test.csv \
--epochs 50 \
--learning_rate 0.0001 \
--export_dir export1/ \
--checkpoint_dir cpoint1/
Can you please suggest changes to the command if required?
Set a batch size for training and use the CuDNN RNN to make training faster, and include dropout to improve accuracy.
Thanks @othiele for the quick reply.
Then I train the model using the command below:
python DeepSpeech.py \
--train_files dataset/train.csv \
--dev_files dataset/dev.csv \
--test_files dataset/test.csv \
--epochs 50 \
--learning_rate 0.0001 \
--export_dir export1/ \
--train_batch_size 10 \
--dev_batch_size 10 \
--test_batch_size 5 \
--dropout_rate 0.15 \
--checkpoint_dir cpoint1/
Right?
I don’t get the CuDNN part, can you please explain?
For --n_hidden I used the default; is there no need to change it?
You can change n_hidden; it didn’t make much of a difference for me, but I have 1000 hours. You could try 512.
Set
--use_cudnn_rnn=True
for speed and set the train batch size as high as you can without getting an error. Anything higher than 1 is good, usually (4,8,16,…)
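For example, on top of the command you already have (the batch sizes below are only a starting point; lower them if you hit out-of-memory errors):

python DeepSpeech.py \
--train_files dataset/train.csv \
--dev_files dataset/dev.csv \
--test_files dataset/test.csv \
--epochs 50 \
--learning_rate 0.0001 \
--dropout_rate 0.15 \
--use_cudnn_rnn=True \
--train_batch_size 16 \
--dev_batch_size 16 \
--test_batch_size 8 \
--export_dir export1/ \
--checkpoint_dir cpoint1/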
Hi @othiele,
I trained the model but the WER is constant. Is there any way to decrease the WER?
Here is my training result:
Epoch 3 | Training | Elapsed Time: 0:00:21 | Steps: 49 | Loss: 1.624806
Epoch 3 | Validation | Elapsed Time: 0:00:05 | Steps: 13 | Loss: 3.249443 | Dataset: 10000_data_set/dev.csv
I Early stop triggered as (for last 4 steps) validation loss: 3.249443 with standard deviation: 0.094293 and mean: 3.145775
I FINISHED optimization in 0:01:52.091116
I Restored variables from best validation checkpoint at 10000_512_checkpoint/best_dev-21290, step 21290
Testing model on 10000_data_set/test.csv
Test epoch | Steps: 9 | Elapsed Time: 0:00:44
Test on 10000_data_set/test.csv - WER: 0.225941, CER: 0.049616, loss: 4.468176
The command is:
python3.6 DeepSpeech.py \
--train_files 10000_data_set/train.csv \
--checkpoint_dir 10000_512_checkpoint/ \
--epochs 60 \
--dev_files 10000_data_set/dev.csv \
--test_files 10000_data_set/test.csv \
--n_hidden 512 \
--learning_rate 0.0001 \
--export_dir 10000_512_export \
--early_stop False \
--use_seq_length False \
--earlystop_nsteps 3 \
--estop_mean_thresh 0.1 \
--estop_std_thresh 0.1 \
--dropout_rate 0.25 \
--train_batch_size 80 \
--dev_batch_size 80 \
--test_batch_size 45 \
--report_count 50 \
--use_cudnn_rnn True
@elpimous_robot
Is it required to give the full path of each wav file in the CSV?
My directory structure (everything in the same directory):
wav_file_folder:
- all_wav [10,000 wav files]
- train.csv
- test.csv
- dev.csv
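If full paths are needed, I guess I could rewrite the CSVs like this (a sketch with pandas, assuming the usual wav_filename column and my all_wav folder):

import os
import pandas as pd

# Sketch: turn bare wav names into absolute paths in each CSV.
for name in ("train.csv", "test.csv", "dev.csv"):
    df = pd.read_csv(name)
    df["wav_filename"] = df["wav_filename"].map(
        lambda p: os.path.abspath(os.path.join("all_wav", os.path.basename(p))))
    df.to_csv(name, index=False)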
A WER of 0.22 for 10,000 input files doesn’t sound too bad. I would go for more data to get better results; I am not sure other methods yield a lower WER with so few files.
You can try nlpaug with what you have, but judging from your description, more material will give you a lot more.
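For reference, WER is the word-level edit distance divided by the number of reference words, so 0.22 means roughly one word in five is wrong. A minimal sketch of the computation (my own toy version, not the DeepSpeech code):

def wer(ref, hyp):
    # Word-level Levenshtein distance over the reference length.
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / max(len(r), 1)

print(wer("how are you", "how am you"))  # 0.333...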
Does this model understand word order? When I say the exact command the result is much better, but if I change the order of the words in the command the result is much poorer.
How does the model work? If I am not wrong, it learns features for every character and forms the words from them, right?
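As far as I understand, DeepSpeech outputs a character prediction per time step and CTC decoding collapses repeats and blanks into the final text; a toy illustration of that collapse step (made-up frames, with '-' standing for the CTC blank):

def ctc_collapse(frames, blank="-"):
    # Greedy CTC collapse: drop repeated characters, then drop blanks.
    out, prev = [], None
    for ch in frames:
        if ch != prev and ch != blank:
            out.append(ch)
        prev = ch
    return "".join(out)

print(ctc_collapse("hh-ooww--  aa-rree"))  # -> "how are"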
Actually, after getting the poor result I got worried.
One more thing: in all my commands one word is common; every command starts with the same word, like “demo”.
Can we use the same word multiple times?
@othiele can you please help with that?
Please give an example, I am not sure I understand what you want.
Hello @othiele, I have trained the model as we discussed.
In the dataset we have wav files, and in all of the wav files one word is common.
Ex:
rec.1.wav 1938 how are you
rec.2.wav 1938 how about you
rec.3.wav 1938 how much
When I trained the model, the word “how” was common in all wav files.
So my question is: do I need to avoid “how” or not?
Sorry, I have no idea what you are talking about or what the problem is. @lissyx, do you get what the problem might be?