Error at the test phase while training

OK. While building the language model, training, and validating, there is no error at any step. So is there a possibility to ignore this error?

You can apply the suggestion in the error message, but I’d urge you to find the offending file. Unfortunately, we don’t have the same code as in training that tells you which file it is.

OK. And this must be a file from the test dataset, which I can see in test.csv? Because the training and validation steps went fine.

Also, could you guide me on the minimum acceptable length for DeepSpeech? There are some files with only one word, but so far I haven’t seen any empty files.

Earlier there were some empty files, but those came from Common Voice; I removed them and also shared this info with the support and research team.

It’s not about a minimum length, it’s about a file with a corresponding transcription that is too long.

In this case, it’s a file with a duration between 980 ms and 1 s, and a transcription with 52 characters in it. The longest allowed transcription for a DeepSpeech training/validation/testing sample is duration_in_milliseconds // 20 characters, i.e. one character per 20 ms feature window.
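The limit above can be sketched as a quick check. This is a minimal illustration of the rule, not DeepSpeech’s actual validation code; max_transcript_len is a hypothetical helper name:

```python
def max_transcript_len(duration_ms: float, win_step_ms: float = 20.0) -> int:
    """Longest transcription (in characters) that can be aligned to a clip.

    CTC needs at least one feature window per output character, and a new
    window starts every `win_step_ms` milliseconds (20 ms by default).
    """
    return int(duration_ms // win_step_ms)

# A clip between 980 ms and 1 s allows at most 49-50 characters,
# so a 52-character transcription cannot be aligned.
print(max_transcript_len(980))   # 49
print(max_transcript_len(1000))  # 50
```

With the clip in question, 52 characters exceeds the 49–50 character budget, which is exactly why the test phase rejects it.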

Can I make an exception and include this word? In both literature and spoken language there are several words with 35–52 characters, like German number words.

For example, look at these:

Donaudampfschifffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft
(a German word)

pneumonoultramicroscopicsilicovolcanoconiosis
(an English word, as an example)

It’s not the length of the word, but the length for a given time. 1 letter for every 20 ms. Try saying “Donau…” in under 1 second :slight_smile:

Your data is corrupt and if you don’t change that, your training will be bad.

Thank you. Is there any way to increase the 20 ms limit for training/validation/testing? There are several words above this limit.

Please guide me if you can.

More than 50 letters per second? That must be a fast speaker.

@reuben as far as I know you would have to change a lot to change the 20 ms window, right?

Have you actually verified the transcription and the matching WAV file? Do you really have someone able to say those words correctly at that pace?

More likely, you have a broken WAV file for that transcription.

Of the 100 or so times I’ve had this error, it was always due to mismatched transcripts. Probably the same here.

The data seems OK. The speaker is fluent but very fast.

See the --feature_win_step training flag. Note that changing that parameter will likely affect continuing/fine tuning from previous checkpoints with a different value. So make sure you check for that and train from scratch if it does affect things.

Also note that the flag must be set at export time for the native client to see it.

Should I make this change in the flag.txt in the checkpoint folder? Or wherever the default values are given in the training routines?

And what value do you suggest, given that the stated value is 20 ms?

“and note that the flag must be set at export time for the native client to see it”

I didn’t get this point. Could you explain / guide me?

It’s a command line flag. So change your command line when calling the training script.

I suggest a lower value, so that you get more windows per second of audio. Try 10.

When you’re exporting the model from a checkpoint, make sure you pass the flag too. Not just when training.

And so, have you checked actual WAV length, transcription’s size, WAV metadata length? Maybe the file is corrupted, it would not be the first time we expose a bug in TensorFlow’s WAV reading that breaks like that.
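One way to hunt for the offending sample is to scan the CSV yourself and compare each WAV’s real duration against its transcription. This is a sketch, not official tooling; it assumes the standard DeepSpeech CSV columns (wav_filename, wav_filesize, transcript) and WAVs readable by Python’s stdlib wave module:

```python
import csv
import wave

def find_bad_samples(csv_path: str, win_step_ms: float = 20.0):
    """Yield (wav_filename, duration_ms, transcript) for rows whose
    transcription is too long for the audio, or whose WAV can't be read."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            try:
                with wave.open(row["wav_filename"], "rb") as w:
                    duration_ms = 1000.0 * w.getnframes() / w.getframerate()
            except (wave.Error, OSError):
                # Broken or unreadable WAV header
                yield (row["wav_filename"], None, row["transcript"])
                continue
            if len(row["transcript"]) > duration_ms // win_step_ms:
                yield (row["wav_filename"], duration_ms, row["transcript"])
```

Running this over test.csv should print either the too-long transcription or a WAV whose header length doesn’t match its actual audio, which covers both failure modes discussed above.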

Apparently it didn’t work! I mean the value of 10.

Thank you for pointing that out. Should I upgrade TensorFlow? At the moment it is 1.15_gpu.

No. Please check what was suggested.