Error at the test phase while training

Dear Support,

I have a training session and a validation session; afterwards comes the test phase, before the model export. I am facing this crash. Can you suggest how to resolve it?

  I Restored variables from best validation checkpoint at /home/user/data/checkpoints_test01/best_dev-20326, step 20326
Testing model on /home/user/data/test.csv
Test epoch | Steps: 0 | Elapsed Time: 0:00:00
Test epoch | Steps: 1 | Elapsed Time: 0:00:36
Test epoch | Steps: 2 | Elapsed Time: 0:00:51
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Not enough time for target transition sequence (required: 52, available: 49). You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
	 [[{{node CTCLoss}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DeepSpeech.py", line 976, in <module>
    absl.app.run(main)
  File "/home/user/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/user/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "DeepSpeech.py", line 953, in main
    test()
  File "DeepSpeech.py", line 686, in test
    samples = evaluate(FLAGS.test_files.split(','), create_model, try_loading)
  File "/home/user/soft/DeepSpeech/evaluate.py", line 155, in evaluate
    samples.extend(run_test(init_op, dataset=csv))
  File "/home/user/soft/DeepSpeech/evaluate.py", line 116, in run_test
    session.run([batch_wav_filename, transposed, loss, batch_x_len, batch_y])
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Not enough time for target transition sequence (required: 52, available: 49). You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
	 [[node CTCLoss (defined at /home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'CTCLoss':
  File "DeepSpeech.py", line 976, in <module>
    absl.app.run(main)
  File "/home/user/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/user/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "DeepSpeech.py", line 953, in main
    test()
  File "DeepSpeech.py", line 686, in test
    samples = evaluate(FLAGS.test_files.split(','), create_model, try_loading)
  File "/home/user/soft/DeepSpeech/evaluate.py", line 73, in evaluate
    sequence_length=batch_x_len)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/ops/ctc_ops.py", line 176, in ctc_loss
    ignore_longer_outputs_than_inputs=ignore_longer_outputs_than_inputs)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_ctc_ops.py", line 336, in ctc_loss
    name=name)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/user/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Your advice will help me decide how to proceed with this data.

You have some broken, too short data.

OK. While building the language model, training, and validating, there is no error at any step. So is it possible to ignore this error?

You can apply the suggestion in the error message, but I’d urge you to find the offending file. Unfortunately, we don’t have the same code as in training that tells you which file it is.

OK. And these must be files from the test dataset (test.csv), since the training and validation steps went well?

Also, could you tell me the minimum acceptable length for DeepSpeech? There are some files with only one word, but so far I have not seen any empty files.

Earlier there were some empty files, but those came from Common Voice; I removed them and also shared this information with the support and research team.

It’s not about a minimum length, it’s about a file with a corresponding transcription that is too long.

In this case, it’s a file with a duration between 980 ms and 1 s, and a transcription with 52 characters in it. The longest allowed transcription for a DeepSpeech training/validation/testing sample is duration_in_milliseconds // 20 characters.
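Based on that rule, here is a minimal sketch for locating the offending samples in test.csv. It assumes the standard DeepSpeech CSV columns (wav_filename, wav_filesize, transcript) and uncompressed PCM WAVs; find_offending_samples is a hypothetical helper name, and the one-character-per-window rule is only an approximation of the real CTC constraint (blanks make the true limit slightly stricter):

```python
import csv
import wave

def find_offending_samples(csv_path, win_step_ms=20):
    """Return (wav_filename, transcript_length, allowed_length) for every
    sample whose transcript exceeds the duration-based character limit."""
    offending = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Read the audio duration from the WAV header.
            with wave.open(row["wav_filename"], "rb") as w:
                duration_ms = w.getnframes() / w.getframerate() * 1000
            # At most one output character per feature window.
            max_chars = int(duration_ms // win_step_ms)
            if len(row["transcript"]) > max_chars:
                offending.append(
                    (row["wav_filename"], len(row["transcript"]), max_chars))
    return offending

# Example usage (path from this thread):
# for name, chars, limit in find_offending_samples("/home/user/data/test.csv"):
#     print(f"{name}: {chars} chars, limit {limit}")
```

Running this over the test CSV should name the file behind the "required: 52, available: 49" message.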

Can I make an exception and include this word? In literature and in speech there are several words with 35 to 52 characters, like German compound numbers.

For example, look at these:

Donaudampfschifffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft
a German word.

pneumonoultramicroscopicsilicovolcanoconiosis

an English word, as an example.

It’s not the length of the word, but the length for a given time. 1 letter for every 20 ms. Try saying “Donau…” in under 1 second :slight_smile:

Your data is corrupt and if you don’t change that, your training will be bad.
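The arithmetic above can be turned into a quick lower-bound check; min_duration_ms is a hypothetical helper, and CTC blanks make the true requirement slightly larger than this bound:

```python
def min_duration_ms(transcript, win_step_ms=20):
    # One CTC output step per feature window: N characters need at
    # least N windows of audio, so N * win_step_ms milliseconds.
    return len(transcript) * win_step_ms

word = "pneumonoultramicroscopicsilicovolcanoconiosis"
print(min_duration_ms(word))  # lower bound in milliseconds
```

Any sample whose audio is shorter than this bound for its transcript will trigger exactly this CTCLoss error.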

Thank you. Is there any way to change the 20 ms limit for training/validation/testing? There are several words that exceed it.

Please guide me if you can.

More than 50 letters per second? That must be a fast speaker.

@reuben as far as I know you would have to change a lot to change the 20 ms window, right?

Have you actually verified the transcription and the matching WAV file? Do you really have someone able to say those words correctly at that pace?

More likely, you have a broken WAV file for that transcription.

Out of the 100 times I have had this error, it was always due to mismatched transcripts. Probably the same here.

The data seems OK. The speaker is fluent but very fast.

See the --feature_win_step training flag. Note that changing that parameter will likely affect continuing/fine tuning from previous checkpoints with a different value. So make sure you check for that and train from scratch if it does affect things.

Also note that the flag must be set at export time for the native client to see it.

Should I make this change in the flags.txt in the checkpoint folder, or where the default values are defined in the training routines?

And what value do you suggest? The stated value is 20 ms.

Also, regarding “the flag must be set at export time for the native client to see it” — I didn’t get this point. Could you explain / guide me?

It’s a command line flag. So change your command line when calling the training script.

I suggest a lower value, so that you get more windows per second of audio. Try 10.

When you’re exporting the model from a checkpoint, make sure you pass the flag too. Not just when training.
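A hedged sketch of what that might look like on the command line; the paths and directory names here are assumptions, and the flag names follow the DeepSpeech.py training script:

```shell
# Training from scratch (the window size changed, so old checkpoints
# with the default 20 ms step should not be reused):
python3 DeepSpeech.py \
  --train_files /home/user/data/train.csv \
  --dev_files /home/user/data/dev.csv \
  --test_files /home/user/data/test.csv \
  --feature_win_step 10 \
  --checkpoint_dir /home/user/data/checkpoints_winstep10

# Export: the same flag must be repeated, otherwise the native client
# will assume the default 20 ms window.
python3 DeepSpeech.py \
  --feature_win_step 10 \
  --checkpoint_dir /home/user/data/checkpoints_winstep10 \
  --export_dir /home/user/data/export_winstep10
```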

And so, have you checked actual WAV length, transcription’s size, WAV metadata length? Maybe the file is corrupted, it would not be the first time we expose a bug in TensorFlow’s WAV reading that breaks like that.
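One way to run such a check with only the Python standard library; check_wav is a hypothetical helper, and a badly corrupt file might also fail to open at all rather than merely come up short:

```python
import wave

def check_wav(path):
    """Return (duration_seconds, intact) for a WAV file, where `intact`
    means the header's declared frame count matches the audio bytes
    actually present (False suggests a truncated/corrupt file)."""
    with wave.open(path, "rb") as w:
        declared_frames = w.getnframes()
        frame_size = w.getsampwidth() * w.getnchannels()
        duration_s = declared_frames / w.getframerate()
        audio = w.readframes(declared_frames)
    intact = len(audio) == declared_frames * frame_size
    return duration_s, intact
```

Comparing this duration against the transcript length, and the `intact` flag against the CSV's wav_filesize column, should separate a genuinely fast speaker from a broken file.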

Apparently the value of 10 didn’t work.