Error at the test phase while training

OK. While building the language model, training, and validating, there is no error at any step. So is there a possibility to ignore this error?

You can apply the suggestion in the error message, but I’d urge you to find the offending file. Unfortunately, we don’t have the same code as in training that tells you which file it is.

OK. And this must be a file from the test dataset, which I can see in test.csv? Because the training and validation steps went fine.

Also, could you guide me on the minimum acceptable length for DeepSpeech? There are some files with only one word, but so far I haven’t seen any empty files.

Earlier there were some empty files, but those came from Common Voice; I removed them and also shared this info with the support and research team.

It’s not about a minimum length, it’s about a file with a corresponding transcription that is too long.

In this case, it’s a file with a duration between 980 ms and 1 s, and a transcription with 52 characters in it. The longest allowed transcription for a DeepSpeech training/validation/testing sample is duration_in_milliseconds // 20 characters, i.e. one character per 20 ms feature window.
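The limit above can be sketched as a quick check. This is a minimal illustration of the rule, not DeepSpeech’s actual validation code; max_transcript_len is a hypothetical helper name:

```python
def max_transcript_len(duration_ms: float, win_step_ms: float = 20.0) -> int:
    """Longest transcription (in characters) that can be aligned to a clip.

    CTC needs at least one feature window per output character, and a new
    window starts every `win_step_ms` milliseconds (20 ms by default).
    """
    return int(duration_ms // win_step_ms)

# A clip between 980 ms and 1 s allows at most 49-50 characters,
# so a 52-character transcription cannot be aligned.
print(max_transcript_len(980))   # 49
print(max_transcript_len(1000))  # 50
```

With the clip in question, 52 characters exceeds the 49–50 character budget, which is exactly why the test phase rejects it.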

Can I make an exception and include this word? In both literature and spoken language there are several words with 35–52 characters, like German number words.

For example, look at these:

Donaudampfschifffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft
(a German word)

pneumonoultramicroscopicsilicovolcanoconiosis
(an English word, as an example)

It’s not the length of the word, but the length for a given time. 1 letter for every 20 ms. Try saying “Donau…” in under 1 second :slight_smile:

Your data is corrupt and if you don’t change that, your training will be bad.

Thank you. Is there any way to increase the 20 ms limit for training/validation/testing? There are several words above this limit.

Please guide me if you can.

More than 50 letters per second? That must be a fast speaker.

@reuben as far as I know you would have to change a lot to change the 20 ms window, right?

Have you actually verified the transcription and the matching WAV file? Do you really have someone able to say those words correctly at that pace?

More likely, you have a broken WAV file for that transcription.

Of the 100 or so times I’ve had this error, it was always due to mismatched transcripts. Probably the same here.

The data seems OK. The speaker is fluent but very fast.

See the --feature_win_step training flag. Note that changing that parameter will likely affect continuing/fine tuning from previous checkpoints with a different value. So make sure you check for that and train from scratch if it does affect things.

Also note that the flag must be set at export time for the native client to see it.

Should I make this change in the flag.txt in the checkpoint folder? Or wherever the default values are given in the training routines?

And what value do you suggest, given that the stated value is 20 ms?

“and note that the flag must be set at export time for the native client to see it”

I didn’t get this point. Could you explain / guide me?

It’s a command line flag. So change your command line when calling the training script.

I suggest a lower value, so that you get more windows per second of audio. Try 10.

When you’re exporting the model from a checkpoint, make sure you pass the flag too. Not just when training.

And so, have you checked actual WAV length, transcription’s size, WAV metadata length? Maybe the file is corrupted, it would not be the first time we expose a bug in TensorFlow’s WAV reading that breaks like that.
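One way to hunt for the offending sample is to scan the CSV yourself and compare each WAV’s real duration against its transcription. This is a sketch, not official tooling; it assumes the standard DeepSpeech CSV columns (wav_filename, wav_filesize, transcript) and WAVs readable by Python’s stdlib wave module:

```python
import csv
import wave

def find_bad_samples(csv_path: str, win_step_ms: float = 20.0):
    """Yield (wav_filename, duration_ms, transcript) for rows whose
    transcription is too long for the audio, or whose WAV can't be read."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            try:
                with wave.open(row["wav_filename"], "rb") as w:
                    duration_ms = 1000.0 * w.getnframes() / w.getframerate()
            except (wave.Error, OSError):
                # Broken or unreadable WAV header
                yield (row["wav_filename"], None, row["transcript"])
                continue
            if len(row["transcript"]) > duration_ms // win_step_ms:
                yield (row["wav_filename"], duration_ms, row["transcript"])
```

Running this over test.csv should print either the too-long transcription or a WAV whose header length doesn’t match its actual audio, which covers both failure modes discussed above.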

Apparently it didn’t work! I mean the value of 10.

Thank you for pointing that out. Should I upgrade TensorFlow? At the moment it is 1.15_gpu.

No. Please check what was suggested.