Please tell me how you expect anyone to answer "it's not fast enough, what can I do" when you haven't even told us what your hardware is and what your constraints are.
If you're using 0.6.1 you should also update your trie and lm.binary. Are you sure that you're generating your LM from the file with all of your possible commands?
Can I enlarge a dataset using audio augmentation?
No. From the source code, I inferred that audio augmentation doesn't create new files; it just transforms the current audio into something noisy on the fly. This is to make the trained model more robust to noisy input and able to generalize well.
In your case, you don't need that much generalization, because you already know that only a few people will be using it.
Try getting more data, as in the French robot topic.
Yes, and I could say: after more, more, more data…
Use Python or a terminal command to duplicate all your data and apply audio transformations that slightly change the audio characteristics.
You'll have 2x more data…
The more data, the better your accuracy.
Note: pay attention to the data augmentation values!! Use small changes, or you'll train on bad audio files and your accuracy will not increase.
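For example, something along these lines in Python (a sketch, not DeepSpeech's built-in augmentation; the folder names and the small random gain range are made up for illustration):

import os
import random

import numpy as np
import soundfile as sf

SRC, DST = "all_wav", "all_wav_augmented"  # assumed folder names
os.makedirs(DST, exist_ok=True)

for name in os.listdir(SRC):
    if not name.endswith(".wav"):
        continue
    audio, rate = sf.read(os.path.join(SRC, name))
    gain_db = random.uniform(-0.3, 0.3)    # small changes only (see the note above)
    audio = np.clip(audio * 10 ** (gain_db / 20), -1.0, 1.0)
    sf.write(os.path.join(DST, name[:-4] + "_aug.wav"), audio, rate)

Add the new files to your train.csv alongside the originals and you have 2x the data.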
@Sudarshan.gurav14,
Friend, deep learning teaches us patience!
You need to do like all of us: progress slowly; read, read… read; and test your own ideas.
And magic will appear!
I changed the recording speed, both slower and faster.
I want to change the gain of the audio using the voice corpus tool, as you suggested.
How much should I change the gain? Right now I am changing my gain by 0.5; is that OK?
There is one more arg, -times, that I don't know how to use. Can you please help me?
Also, I have now reduced my commands: I just want 70 commands out of 200.
One more question:
Suppose I have one wav, can I change its gain two times? I mean:
1.wav [original]
1_gain_05.wav [same file, gain 0.5]
1_gain_07.wav [same file, gain 0.7]
Is that OK?
Hello.
Reducing the number of commands is a good idea, but it doesn't change the fact that you need more samples per command.
You augmented the number of wavs, good, but use low values. I'd try 0.2 to 0.3 max.
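If you want exactly two low-value gain variants per file, a small sketch (file names are from the example above; treating the 0.2/0.3 values as dB offsets is my assumption, the voice corpus tool may define gain differently):

import numpy as np
import soundfile as sf

audio, rate = sf.read("1.wav")            # the original file
for gain_db in (0.2, 0.3):                # low values, as advised
    out = np.clip(audio * 10 ** (gain_db / 20), -1.0, 1.0)
    sf.write("1_gain_%02d.wav" % int(gain_db * 10), out, rate)

This writes 1_gain_02.wav and 1_gain_03.wav next to the original.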
Thanks for the quick reply @elpimous_robot.
OK, I will try with 0.2 to 0.3, as well as try to get more samples.
Hi @elpimous_robot,
Now I have 10,000 wav files and I have split them into 70:20:10.
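One way to do such a 70:20:10 split in Python (a sketch; the combined all.csv and the dataset/ folder are assumed names):

import csv
import random

with open("dataset/all.csv") as f:        # assumed combined CSV with a header row
    rows = list(csv.reader(f))
header, data = rows[0], rows[1:]
random.shuffle(data)

n = len(data)
parts = {
    "train.csv": data[:int(0.7 * n)],
    "dev.csv": data[int(0.7 * n):int(0.9 * n)],
    "test.csv": data[int(0.9 * n):],
}
for name, part in parts.items():
    with open("dataset/" + name, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)          # each split keeps the header
        writer.writerows(part)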
Then I train the model using the command below:
python DeepSpeech.py \
--train_files dataset/train.csv \
--dev_files dataset/dev.csv \
--test_files dataset/test.csv \
--epochs 50 \
--learning_rate 0.0001 \
--export_dir export1/ \
--checkpoint_dir cpoint1/
Can you please suggest changes to the command if required?
Use a batch size for training and CuDNN to make it faster, and include dropout to improve accuracy.
Thanks @othiele for the quick reply.
Then I train the model using the command below:
python DeepSpeech.py \
--train_files dataset/train.csv \
--dev_files dataset/dev.csv \
--test_files dataset/test.csv \
--epochs 50 \
--learning_rate 0.0001 \
--export_dir export1/ \
--train_batch_size 10 \
--dev_batch_size 10 \
--test_batch_size 5 \
--dropout_rate 0.15 \
--checkpoint_dir cpoint1/
Right?
I don't understand the CuDNN part, can you please explain?
--n_hidden
I used the default; is there no need to change it?
You can change n_hidden; it didn't make much of a difference for me, but I have 1000 hours. You could try 512.
Set
--use_cudnn_rnn=True
for speed, and set the train batch size as high as you can without getting an error. Anything higher than 1 is good, usually (4, 8, 16, …).
Hi @othiele,
I trained the model, but the WER is constant. Is there any way to decrease the WER?
Here is my training result:
Epoch 3 | Training | Elapsed Time: 0:00:21 | Steps: 49 | Loss: 1.624806
Epoch 3 | Validation | Elapsed Time: 0:00:05 | Steps: 13 | Loss: 3.249443 | Dataset: 10000_data_set/dev.csv
I Early stop triggered as (for last 4 steps) validation loss: 3.249443 with standard deviation: 0.094293 and mean: 3.145775
I FINISHED optimization in 0:01:52.091116
I Restored variables from best validation checkpoint at 10000_512_checkpoint/best_dev-21290, step 21290
Testing model on 10000_data_set/test.csv
Test epoch | Steps: 9 | Elapsed Time: 0:00:44
Test on 10000_data_set/test.csv - WER: 0.225941, CER: 0.049616, loss: 4.468176
The command is:
python3.6 DeepSpeech.py \
--train_files 10000_data_set/train.csv \
--checkpoint_dir 10000_512_checkpoint/ \
--epochs 60 \
--dev_files 10000_data_set/dev.csv \
--test_files 10000_data_set/test.csv \
--n_hidden 512 \
--learning_rate 0.0001 \
--export_dir 10000_512_export \
--early_stop False \
--use_seq_length False \
--earlystop_nsteps 3 \
--estop_mean_thresh 0.1 \
--estop_std_thresh 0.1 \
--dropout_rate 0.25 \
--train_batch_size 80 \
--dev_batch_size 80 \
--test_batch_size 45 \
--report_count 50 \
--use_cudnn_rnn True
@elpimous_robot
Is it required to give the full path of the wav files in the CSV?
Directory structure:
wav_file_folder:
-all_wav [10,000 wav files]
-train.csv
-test.csv
-dev.csv
all in the same directory.
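For reference, the training CSVs use the columns wav_filename, wav_filesize and transcript. As far as I know, DeepSpeech opens wav_filename as given, so a relative path resolves from wherever you run DeepSpeech.py and an absolute path is the safe choice. A hypothetical row (path and file size made up):

wav_filename,wav_filesize,transcript
/home/user/wav_file_folder/all_wav/rec.1.wav,65244,how are you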
A WER of 0.22 for 10,000 input files doesn't sound too bad. I would go for more data to get better results; I am not sure other methods yield a lower WER with so few files.
You can try nlpaug with what you have, but judging from your description, more material will give you a lot more.
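A minimal sketch of what nlpaug audio augmentation can look like, assuming nlpaug 1.x's nlpaug.augmenter.audio module and its LoudnessAug class (check the nlpaug docs for your version, as the API has changed over time):

import librosa
import nlpaug.augmenter.audio as naa

# Hypothetical file; sr=None keeps the original sampling rate.
audio, rate = librosa.load("all_wav/rec.1.wav", sr=None)
aug = naa.LoudnessAug(factor=(0.9, 1.1))  # keep the change small
augmented = aug.augment(audio)            # augmented copy (a list in newer versions)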
Does this model understand the sequence? When I say the exact command the result is much better, but if I change the sequence of the command the result is much poorer.
How does the model work? If I am not wrong, it gets features for every character and forms the word, right?
Actually, after getting the poor result I was tense.
One more thing: in all my commands one word is common; every command starts with a word like "demo".
Can we use the same word multiple times?
@othiele, can you please help with that?
Please give an example; I am not sure I understand what you want.
Hello @othiele, I have trained the model as we discussed.
In the dataset we have wave files, and in all of those wave files one word is common.
Ex:
rec.1.wav 1938 how are you
rec.2.wav 1938 how about you
rec.3.wav 1938 how much
When I trained the model, the word "how" was common in all the wave files.
So, my question is: do I need to avoid "how" or not?