Would you please show some major training parameters of the pre-trained model?

jackhuang · January 12, 2018, 3:17am

I want to know some major parameters of the pre-trained model since my model doesn’t perform well and I want some help.

kdavis · January 13, 2018, 6:33am

The parameters need to be tuned to the data you are training on.

So a question in return: what data set are you training with?

jackhuang · January 15, 2018, 11:34am

I use Common Voice, Librivox-360 clean, and TED as the training set. Would you please give me some advice on choosing the parameters?

kdavis · January 17, 2018, 8:29am

Here are some of the parameters we used to train the released model:

  ...
  --train_batch_size 12 \
  --dev_batch_size 8 \
  --test_batch_size 8 \
  --epoch 13 \
  --learning_rate 0.0001 \
  --display_step 0 \
  --validation_step 1 \
  --dropout_rate 0.2367 \
  --default_stddev 0.046875 \
  --checkpoint_step 1 \
  --log_level 0 \
  ...

The batch sizes depend upon the memory of your graphics card. You’ll want to make them as big as possible while not running out of memory.
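
One way to search for the largest batch size that fits is to double a candidate until it fails, then bisect between the last size that worked and the first that did not. The sketch below is only an illustration: `fits_in_memory` is a hypothetical probe you would write yourself (for example, by starting a short training run and checking for an out-of-memory error), it is not part of the training script, and the search assumes that if a size fails, every larger size fails too.

```python
from typing import Callable


def largest_fitting_batch_size(
    fits_in_memory: Callable[[int], bool],
    limit: int = 1024,
) -> int:
    """Return the largest batch size in [1, limit] accepted by the probe.

    `fits_in_memory` is a hypothetical, caller-supplied probe.
    Returns 0 if even a batch size of 1 does not fit.
    """
    lo, hi = 0, 1  # lo: largest size known to fit, hi: next candidate
    # Phase 1: double the candidate until it fails or exceeds the limit.
    while hi <= limit and fits_in_memory(hi):
        lo, hi = hi, hi * 2
    # hi is now treated as the smallest size that does not fit.
    hi = min(hi, limit + 1)
    # Phase 2: bisect between the last success and the first failure.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fits_in_memory(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Stand-in probe for illustration only: pretend sizes above 12 run out of memory.
def fake_probe(batch_size: int) -> bool:
    return batch_size <= 12


print(largest_fitting_batch_size(fake_probe))
```

In practice each probe is expensive, so you may prefer to stop after the doubling phase and simply use the last size that worked.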

For your data set I’d suggest using more epochs, maybe something around 20-30.

The only other parameter you may want to tune is dropout_rate. Try increasing it slightly, to somewhere around 0.25 to 0.27, but you will have to experiment to find the optimal value.
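
If you want to try several combinations from these ranges, a small script can generate the flag sets for you. The following is a minimal sketch, not the project's tooling: the flag names and released-model values are taken from the list above, while the specific grid points (three epoch counts, three dropout rates) and the helper names `to_flags` and `candidate_runs` are assumptions made for illustration. It only prints argument lists; prepend your own training command.

```python
from itertools import product

# Values listed for the released model in the post above.
RELEASED = {
    "learning_rate": 0.0001,
    "dropout_rate": 0.2367,
    "epoch": 13,
}

# Candidate values inside the suggested ranges; the exact grid points are an assumption.
EPOCHS = (20, 25, 30)
DROPOUT_RATES = (0.25, 0.26, 0.27)


def to_flags(params):
    """Render a parameter dict as a flat list of `--name value` arguments."""
    args = []
    for name, value in params.items():
        args += [f"--{name}", str(value)]
    return args


def candidate_runs():
    """Yield one argument list per (epoch, dropout_rate) combination."""
    for epoch, dropout in product(EPOCHS, DROPOUT_RATES):
        yield to_flags({**RELEASED, "epoch": epoch, "dropout_rate": dropout})


for args in candidate_runs():
    print(" ".join(args))
```

Nine full training runs is a lot of compute, so it may be more practical to vary one parameter at a time.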

jackhuang · January 17, 2018, 10:21am

Thank you very much!
And may I ask if the parameter “n_hidden” is 2048?

kdavis · January 17, 2018, 10:26am

Yes, the n_hidden parameter is 2048.

jackhuang · January 17, 2018, 10:32am

Thank you!

mark2 · February 2, 2018, 1:20pm

Are there any guidelines on which parameters affect speed and which affect model accuracy? I just want to get a rough model quickly to verify the whole concept, but I still want reasonable output…

jackhuang · February 2, 2018, 1:28pm

n_hidden and train_batch_size are the main parameters that affect speed (a smaller n_hidden and a larger train_batch_size make training faster). n_hidden and epoch affect model accuracy.
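
As a rough illustration of why these two parameters matter for speed: the number of optimizer steps per epoch is the number of training samples divided by the batch size, rounded up, and a fully connected layer that maps n_hidden inputs to n_hidden outputs has n_hidden² weights plus n_hidden biases. The sketch below computes both. The sample count is made up, and the single square layer is a simplification that says nothing about the actual model's architecture.

```python
def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to see every sample once (ceiling division)."""
    return -(-num_samples // batch_size)


def square_dense_params(n_hidden: int) -> int:
    """Weights plus biases of one dense layer with n_hidden inputs and outputs."""
    return n_hidden * n_hidden + n_hidden


NUM_SAMPLES = 100_000  # made-up training set size, for illustration only

for batch_size in (8, 12, 24):
    print("batch size", batch_size, "->", steps_per_epoch(NUM_SAMPLES, batch_size), "steps per epoch")

for n_hidden in (512, 1024, 2048):
    print("n_hidden", n_hidden, "->", square_dense_params(n_hidden), "parameters in one square layer")
```

Halving n_hidden cuts the size of such a layer to roughly a quarter, which is why a smaller n_hidden is a quick way to get a rough model. A larger batch means fewer steps per epoch, although each step does more work.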