Adding ignore_longer_outputs_than_inputs to ctc_loss is not working with DeepSpeech v0.9.1

I'm working on fine-tuning the pre-trained DeepSpeech English model from release 0.9.1.
I have a dataset of about 31,080 audio files (corresponding to Bible verses).
When I run the following command:

python3 DeepSpeech.py --n_hidden 2048 --checkpoint_dir /home/ghada/deepspeech-0.9.1-checkpoint/ --epochs 20 --train_cudnn --train_files train/train.csv --dev_files dev/dev.csv --test_files test/test.csv --scorer /home/ghada/deepspeech-0.9.1-models.scorer --learning_rate 0.0001 --export_dir output/ --export_tflite

the training starts and runs for about 36 minutes:

I Loading best validating checkpoint from /home/ghada/deepspeech-0.9.1-checkpoint/best_dev-1466475
I Loading variable from checkpoint: beta1_power
I Loading variable from checkpoint: beta2_power
I Loading variable from checkpoint: cudnn_lstm/opaque_kernel
I Loading variable from checkpoint: cudnn_lstm/opaque_kernel/Adam
I Loading variable from checkpoint: cudnn_lstm/opaque_kernel/Adam_1
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/bias/Adam
I Loading variable from checkpoint: layer_1/bias/Adam_1
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_1/weights/Adam
I Loading variable from checkpoint: layer_1/weights/Adam_1
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/bias/Adam
I Loading variable from checkpoint: layer_2/bias/Adam_1
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_2/weights/Adam
I Loading variable from checkpoint: layer_2/weights/Adam_1
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/bias/Adam
I Loading variable from checkpoint: layer_3/bias/Adam_1
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_3/weights/Adam
I Loading variable from checkpoint: layer_3/weights/Adam_1
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/bias/Adam
I Loading variable from checkpoint: layer_5/bias/Adam_1
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_5/weights/Adam
I Loading variable from checkpoint: layer_5/weights/Adam_1
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/bias/Adam
I Loading variable from checkpoint: layer_6/bias/Adam_1
I Loading variable from checkpoint: layer_6/weights
I Loading variable from checkpoint: layer_6/weights/Adam
I Loading variable from checkpoint: layer_6/weights/Adam_1
I Loading variable from checkpoint: learning_rate
I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:36:23 | Steps: 2997 | Loss: 32.925175   

Then I get the following error:

Traceback (most recent call last):
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Not enough time for target transition sequence (required: 225, available: 183)0You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
	 [[{{node tower_0/CTCLoss}}]]
  (1) Invalid argument: Not enough time for target transition sequence (required: 225, available: 183)0You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
	 [[{{node tower_0/CTCLoss}}]]
	 [[tower_0/CTCLoss/_49]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 976, in run_script
    absl.app.run(main)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 948, in main
    train()
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 605, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 570, in run_set
    feed_dict=feed_dict)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Not enough time for target transition sequence (required: 225, available: 183)0You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
	 [[node tower_0/CTCLoss (defined at /home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Invalid argument: Not enough time for target transition sequence (required: 225, available: 183)0You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
	 [[node tower_0/CTCLoss (defined at /home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
	 [[tower_0/CTCLoss/_49]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'tower_0/CTCLoss':
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 976, in run_script
    absl.app.run(main)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 948, in main
    train()
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 483, in train
    gradients, loss, non_finite_files = get_tower_results(iterator, optimizer, dropout_rates)
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 316, in get_tower_results
    avg_loss, non_finite_files = calculate_mean_edit_distance_and_loss(iterator, dropout_rates, reuse=i > 0)
  File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 246, in calculate_mean_edit_distance_and_loss
    total_loss = tfv1.nn.ctc_loss(labels=batch_y, inputs=logits, sequence_length=batch_seq_len)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/ops/ctc_ops.py", line 176, in ctc_loss
    ignore_longer_outputs_than_inputs=ignore_longer_outputs_than_inputs)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_ctc_ops.py", line 336, in ctc_loss
    name=name)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/ghada/python-environments/venv/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Knowing that I already added ignore_longer_outputs_than_inputs=True to the ctc_loss call in train.py, what could the problem be?
I spot-checked some of the .wav files by listening to whether they match their transcripts and didn't notice any problems. I also checked line 2997 of train.csv (where the training stopped) but couldn't see anything wrong with it.
Can anyone tell me what I can do to continue the training?
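For reference, this is what the modified line in calculate_mean_edit_distance_and_loss looks like after my change; the original call (without the flag) is the one shown in the traceback above:

```python
# deepspeech_training/train.py, calculate_mean_edit_distance_and_loss():
# the only change is the extra keyword argument on the existing call.
total_loss = tfv1.nn.ctc_loss(labels=batch_y,
                              inputs=logits,
                              sequence_length=batch_seq_len,
                              ignore_longer_outputs_than_inputs=True)
```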

The exception message itself shows that, even if you added the parameter, your change was not picked up when you ran the code again: there is no ignore_longer_outputs_than_inputs argument in the ctc_loss call shown in the traceback. Make sure you install the training code in editable mode (pip install -e .) as documented, so that modifications are picked up without needing to reinstall.
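For example (assuming the usual layout, with setup.py at the root of the DeepSpeech checkout):

```shell
# Inside the virtualenv, from the root of the DeepSpeech checkout:
pip install -e .

# Sanity check: "Location:" should point at your checkout, not at
# site-packages, so edits to deepspeech_training/train.py take effect
# on the next run without reinstalling.
pip show deepspeech_training
```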


I had created a new environment for training on the whole dataset and didn't pay attention to this detail.
Now it's working! Thanks.

You may also need to change the ctc_loss call in evaluate.py so it does not raise the same error during testing.
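If you would rather drop the offending samples up front instead of silently skipping them during loss computation, you can pre-filter the CSVs. This is a hypothetical sketch, not DeepSpeech code: it assumes the default 20 ms feature step (--feature_win_step) and the standard wav_filename/wav_filesize/transcript columns, and uses the CTC lower bound of one time step per character plus one extra step per adjacent repeated character — which is what the "required: 225, available: 183" numbers in the error refer to.

```python
import csv
import wave

WIN_STEP_MS = 20  # assumed default --feature_win_step

def available_time_steps(wav_path):
    """Approximate number of feature frames a wav file yields."""
    with wave.open(wav_path, "rb") as w:
        duration_ms = 1000.0 * w.getnframes() / w.getframerate()
    return int(duration_ms / WIN_STEP_MS)

def min_required_steps(transcript):
    """CTC needs one step per label plus a blank between adjacent repeats."""
    repeats = sum(1 for a, b in zip(transcript, transcript[1:]) if a == b)
    return len(transcript) + repeats

def filter_csv(src_csv, dst_csv):
    """Copy src_csv to dst_csv, dropping rows that would trip CTCLoss.

    Returns the number of rows dropped.
    """
    dropped = 0
    with open(src_csv, newline="") as fin, open(dst_csv, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if min_required_steps(row["transcript"]) <= available_time_steps(row["wav_filename"]):
                writer.writerow(row)
            else:
                dropped += 1
    return dropped
```

Running this over train.csv, dev.csv, and test.csv before training removes the samples that trigger the "Not enough time for target transition sequence" error in both train.py and evaluate.py.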