I am working on Indian English. I took voice samples of 9 to 10 seconds each. The speakers in my audio samples speak very fast, so each 10-second file contains many words. Do you think that is too long for DeepSpeech training? If so, I will cut the voice samples from 10 seconds down to 5 seconds. Please review the transcripts and let me know whether each transcript line is too long for DeepSpeech training.
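To put numbers on "too many words per file", a rough sketch like the one below could measure each clip's duration and speaking rate. It assumes the clips are uncompressed PCM WAV files; the function names are hypothetical helpers for illustration, not part of DeepSpeech.

```python
import wave


def clip_duration_seconds(wav_path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(wav_path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def words_per_second(transcript, duration_seconds):
    """Return the speaking rate (whitespace-separated words per second)."""
    return len(transcript.split()) / duration_seconds
```

Running `words_per_second(transcript, clip_duration_seconds(path))` over every clip would show how dense the 10-second files really are before deciding whether to cut them to 5 seconds.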
I am using 10-second voice files with an NVIDIA GTX 1080 4-core GPU.
The training parameters are:
--train_batch_size 24
--test_batch_size 48
--n_hidden 2048
--epochs 3
--learning_rate 0.0001
--dropout_rate 0.2
--lm_alpha 0.75
--lm_beta 1.85
So can I assume that this GPU setup and these training parameters are fine for my 10-second voice transcripts? Or should I use 5-second voice samples instead, to reduce the high number of words per sample?
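One way to frame the "too long" question: CTC-based training needs at least as many audio time steps as transcript characters, plus one extra step for each pair of identical adjacent characters. The sketch below checks that for a single transcript. The 20 ms feature step is an assumption about the feature extraction settings, and the function names are hypothetical, so the numbers should be treated as an estimate rather than DeepSpeech's exact behaviour.

```python
def ctc_time_steps(duration_seconds, step_ms=20.0):
    """Approximate number of feature frames for a clip (step_ms is assumed)."""
    return int(duration_seconds * 1000 / step_ms)


def min_ctc_steps(transcript):
    """Minimum time steps CTC needs to emit this transcript.

    One step per character, plus one blank between each pair of
    identical adjacent characters (e.g. the "ll" in "hello").
    """
    repeats = sum(1 for a, b in zip(transcript, transcript[1:]) if a == b)
    return len(transcript) + repeats


def transcript_fits(transcript, duration_seconds, step_ms=20.0):
    """True if the clip has enough time steps for the transcript."""
    return min_ctc_steps(transcript) <= ctc_time_steps(duration_seconds, step_ms)
```

Under that assumption a 10-second clip gives about 500 time steps and a 5-second clip about 250, so a transcript only becomes a problem when its character count gets close to those figures.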