Optimizing the WER result

I need some help making things clearer for me.
I have my own dataset of female speakers only; it is small (5665 records).
I get a WER of 0.608 with n_hidden = 2048.
I want to improve the WER further, so I tried changing only some of the hyperparameter values.
When I set n_hidden = 1024, I get a WER of 0.502.
I need to know why I get a lower WER with n_hidden = 1024.
Which is better: using 1024, or the default value of 2048?
Some clarification, please.
Thanks.

n_hidden makes the neural net bigger or smaller. If you have little data, you could even go with 512. You don't give us enough information to help more.
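To put rough numbers on "bigger or smaller": most of the model's weights sit in matrices whose size grows with the square of n_hidden, so halving it cuts the parameter count roughly fourfold. A minimal sketch, counting only a single LSTM layer as an illustration (this does not reproduce the exact DeepSpeech 0.7 layer layout):

```python
def lstm_params(n_hidden, n_input):
    # An LSTM layer has 4 gates, each with an input weight matrix,
    # a recurrent weight matrix, and a bias vector.
    return 4 * (n_hidden * n_input + n_hidden * n_hidden + n_hidden)

big = lstm_params(2048, 2048)    # hidden size 2048
small = lstm_params(1024, 1024)  # hidden size 1024
print(big / small)               # roughly 4x fewer parameters at 1024
```

Fewer parameters means less capacity to overfit 5665 records, which is one plausible reason 1024 beat 2048 here.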

Thanks for the help.

I have a 12 GB GPU.
I have 5665 records, female speakers only.
The longest record in the dataset is 10 seconds.

I used DeepSpeech 0.7.0 with the default parameters:
alpha = 0.75
beta = 1.85
learning rate = 0.0001
dropout rate = 0.15
epochs = 75
TensorFlow-GPU 1.15 installed and built with CUDA 10.1 and cuDNN 7.6.5.

I tried changing only alpha and beta to get a lower WER, and when I changed n_hidden from 2048 to 1024 I got a lower WER.
Is this enough information?
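For reference, the settings above map onto DeepSpeech 0.7 training flags roughly like this. This is a hedged sketch: the CSV and scorer paths are placeholders, and note that --lm_alpha/--lm_beta only affect decoding with the scorer, not the acoustic training itself:

```shell
python DeepSpeech.py \
  --train_files data/train.csv \
  --dev_files data/dev.csv \
  --test_files data/test.csv \
  --scorer_path my-arabic.scorer \
  --n_hidden 2048 \
  --learning_rate 0.0001 \
  --dropout_rate 0.15 \
  --epochs 75 \
  --lm_alpha 0.75 \
  --lm_beta 1.85
```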

Which language are you training on? Do you have a custom scorer, or what do you use? What do you want to use it for?

You could try a dropout of 0.3 or 0.4, but with so little data these might be reasonable results.

I'm training on Arabic.
There's a scorer.
I want to obtain an effective model to use in an application.
I tried alpha = 0.93, beta = 1.183, dropout = 0.5
with n_hidden = 2048,
and I got a higher WER than the first time,
the worst WER value I've obtained
(here WER = 0.70).

And when I tried alpha = 0.93, beta = 1.183, dropout = 0.5
with n_hidden = 1024,
I got a lower WER than the first time
(here WER = 0.50).

Is your suggestion to leave these as they are:
alpha = 0.75
beta = 1.85
learning rate = 0.0001
but set the dropout rate to 0.3 or 0.4?

Please read the docs on training and on how to determine alpha and beta. And why don't you use version 0.9? Dropout can help, maybe not. You don't have enough material for a good overall model.
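On determining alpha and beta: rather than guessing values by hand, the DeepSpeech training repo ships an lm_optimizer.py script that searches for good lm_alpha/lm_beta values against a held-out set. A hedged sketch, assuming a checkout of the training code that includes this script (the paths are placeholders):

```shell
python lm_optimizer.py \
  --test_files data/dev.csv \
  --checkpoint_dir checkpoints/ \
  --scorer_path my-arabic.scorer
```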

I read them already, but I can't get any further improvement.
OK, I will try version 0.9 later.
But for now, can I use the model with the data I have,
with n_hidden = 1024,
and settle for a WER of 0.5?
Is this good or not?

I have no idea what you want to use the model for. WER of 0.5 means it understands 50% of the words. Google has about 90%.
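To make the WER number concrete: word error rate is the word-level edit distance (substitutions + insertions + deletions) divided by the number of words in the reference, so 0.5 roughly means every second word comes out wrong. A small self-contained sketch (not DeepSpeech's own implementation):

```python
def wer(reference, hypothesis):
    ref = reference.split()
    hyp = hypothesis.split()
    # Word-level Levenshtein distance via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution (sat -> sit) and one deletion (the) over 6 reference words:
print(wer("the cat sat on the mat", "the cat sit on mat"))  # 2/6 = 0.333...
```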

Yes, this is with a small amount of data.
This is the first phase of the model, with a small amount of data;
it's like a preliminary model.

So it seems acceptable so far.
So is using n_hidden = 1024
good, or is there a comment?