Arabic model gets stuck

Hey mates. We have an issue with the Arabic model: it gets stuck sometimes. I can start speaking and it produces some response, but then after a while it gets stuck. The behavior is erratic, so I can't describe the exact conditions. What could it be? Thanks in advance.

Without any knowledge of your setup, scripts etc., it is hard to tell.

Our corpus size is 900 hours.
We train for 39 epochs.
We achieve the best dev loss of 17.
Test achieves 10% WER.
We use augmentation for training.
Training loss is 12.
Learning rate is 0.0001.
python -u DeepSpeech.py \
--train_files "train.sdb" \
--epochs 39 \
--augment reverb[p=0.1,delay=50.0~30.0,decay=10.0:2.0~1.0] \
--augment resample[p=0.1,rate=12000:8000~4000] \
--augment codec[p=0.1,bitrate=48000:16000] \
--augment volume[p=0.1,dbfs=-10:-40] \
--augment pitch[p=0.1,pitch=1~0.2] \
--augment tempo[p=0.1,factor=1~0.5] \
--augment warp[p=0.1,nt=4,nf=1,wt=0.5:1.0,wf=0.1:0.2] \
--augment frequency_mask[p=0.1,n=1:3,size=1:5] \
--augment time_mask[p=0.1,domain=signal,n=3:10~2,size=50:100~40] \
--augment dropout[p=0.1,rate=0.05] \
--augment add[p=0.1,domain=signal,stddev=0~0.5] \
--augment multiply[p=0.1,domain=features,stddev=0~0.5] \
[...]

Hey. We have sent the setup. Do you have any ideas? Thanks in advance!

Sorry, somehow I missed that post.

As for training:

Results look OK, but 39 epochs for 900 hours might mean overfitting. Check how dev and train loss develop over time.

If you do a new run, use dropout of 0.3 or 0.4. You should have results after 15-20 epochs.
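
A sketch based on your command above (the early-stop values and the summary dir are assumptions you should tune; everything else stays as before):

python -u DeepSpeech.py \
--train_files "train.sdb" \
--epochs 20 \
--dropout_rate 0.3 \
--early_stop \
--es_epochs 5 \
--summary_dir summaries/ \
[...]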

As for the problems:

You still haven't shared training logs or said what the problem really is. Can you give examples? Does it happen during inference or training?

During inference the model gets stuck after one sentence or less (with non-native Arabic pronunciation). It seems like an overfitting problem. Let us check with new hyperparameters. Thanks for the advice.

I am not sure I understand what you mean by stuck. Did the output take the same time as the other good ones, but come back empty? Or what is stuck?

The input is voice, 2-5 sentences; the output is text for the first sentence only, for example. After some time the model doesn't produce any text at all.

Cut the sentences apart by hand and check whether the parts are recognized correctly. If they are, the VAD is not the best for that audio and you might have to find another way to split.
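
For example (assuming sox and the 0.9.x deepspeech CLI; file names and cut points are just placeholders):

sox recording.wav sent1.wav trim 0 4.0      # first sentence only
sox recording.wav sent2.wav trim 4.0 3.5    # second sentence only
deepspeech --model output_graph.pbmm --scorer kenlm.scorer --audio sent1.wav
deepspeech --model output_graph.pbmm --scorer kenlm.scorer --audio sent2.wav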

And is this reproducible? Does DS detect it correctly 10 times and then wrong? Or is it always the same problem with the same chunk?

As for reproducibility, it is hard to say, since I personally tested it with live audio from a microphone. But for me, even with live audio it failed every time. I guess we need to apply all your recommendations and check everything; once we finish we will post some results here. Thank you very much for your help.

And record some audio and test DS directly with it. Search this forum; there was some talk of microphone problems lately. Make sure this is a DS problem, not a problem with the recording microphone dropping frames or something.
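
For example (assuming ALSA's arecord; DS expects 16 kHz, mono, 16-bit audio, and the model/scorer paths are placeholders):

arecord -f S16_LE -r 16000 -c 1 -d 10 test.wav   # 10 seconds from the default mic
deepspeech --model output_graph.pbmm --scorer kenlm.scorer --audio test.wav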

Hi Othiele,
I have run the training with a dropout of 0.35 and the training and evaluation go in a similar way. The training stopped at epoch 13 (the early stop flag is not used). Then I exported the TFLite model and ran it in the Android example app.

When using the Android app with the English pretrained model, the app ran smoothly and the recognition was excellent. But when I use my trained Arabic model, the app recognizes only the first utterance and half of the second one perfectly, and after that nothing is shown in the app. Please note that we use the same app and only switch the model between English and Arabic, which means the error comes from our Arabic model, not from the VAD or the app itself.
  1. What about the language model and alphabet?

  2. Your material might not be that good. And maybe try a run without augmentation?

Which corpus do you use? Thank you.

I use augmentation in training.
The language model (scorer) consists of the training text only; it is a very narrow domain, about 50k unique words and 6,500 sentences in total.
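
(For context, the standard way to build such a scorer is with the tools from the repo's data/lm directory, roughly like this; paths are placeholders and the alpha/beta values are rounded repo defaults that lm_optimizer.py can tune:)

python generate_lm.py --input_txt training_text.txt --output_dir . \
--top_k 50000 --kenlm_bins /path/to/kenlm/build/bin \
--arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" \
--binary_a_bits 255 --binary_q_bits 8 --binary_type trie
./generate_scorer_package --alphabet alphabet.txt --lm lm.binary \
--vocab vocab-50000.txt --package kenlm.scorer \
--default_alpha 0.93 --default_beta 1.18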

What is the output without a scorer? This will tell you whether it is a scorer problem.
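
To check, just omit the --scorer flag on the CLI (assuming the 0.9.x deepspeech CLI; paths are placeholders):

deepspeech --model output_graph.pbmm --audio problem_chunk.wav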

How much audio material do you use for training? Early stop after 13 epochs doesn’t sound too bad for some material.

The output without a scorer is a list of connected characters, not very accurate.
The training material is about 600 hours.
See this video for how it stops recognizing.

Thanks for the video. Try the same sentences with longer pauses between the words to check whether it has to do with that. The general recognition looks OK, but I don't speak Arabic.

And try to record a problematic sequence and feed it to DS on a desktop/server, and maybe play it through a speaker to the phone, to check whether it is something on the device.

Dear Othiele,
thank you for your support. I just started a new training run yesterday, using transfer learning from this checkpoint:
/home/ubuntu/DeepSpeech_latest/EX-HD/deepspeech-0.9.3-checkpoint
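
Roughly, the run looks like this (a sketch; the save dir and the Arabic alphabet file name are placeholders, and --drop_source_layers 1 drops the output layer because the Arabic alphabet differs from the English one):

python -u DeepSpeech.py \
--train_files "train.sdb" \
--load_checkpoint_dir /home/ubuntu/DeepSpeech_latest/EX-HD/deepspeech-0.9.3-checkpoint \
--save_checkpoint_dir ar_checkpoints/ \
--drop_source_layers 1 \
--alphabet_config_path alphabet_ar.txt \
[...]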

Here is the log so far; I will keep you updated.


Given this amount of data, and while I'm no Arabic speaker, your description seems to be enough to conclude that you just don't have enough data.