Problem when training DeepSpeech v0.7.4 on a custom dataset

Hi! I’m trying to train DeepSpeech 0.7.4 on my own dataset. I generated the scorer file and then ran this command:

python -u Deepspeech.py --alphabet_config_path data/alphabet.txt --train_files train/train.csv --dev_files dev/dev.csv --test_files test/test.csv --scorer data/lm/kenlm.scorer --export_dir output --checkpoint_dir output

I get these errors:

I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:00:00 | Steps: 0 | Loss: 0.000000
Traceback (most recent call last):
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
         [[{{node tower_0/IteratorGetNext}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 560, in run_set
    feed_dict=feed_dict)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
         [[node tower_0/IteratorGetNext (defined at C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

Original stack trace for 'tower_0/IteratorGetNext':
  File "Deepspeech.py", line 12, in <module>
    ds_train.run_script()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 955, in run_script
    absl.app.run(main)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 927, in main
    train()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 473, in train
    gradients, loss, non_finite_files = get_tower_results(iterator, optimizer, dropout_rates)
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 312, in get_tower_results
    avg_loss, non_finite_files = calculate_mean_edit_distance_and_loss(iterator, dropout_rates, reuse=i > 0)
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 231, in calculate_mean_edit_distance_and_loss
    batch_filenames, (batch_x, batch_seq_len), batch_y = iterator.get_next()
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py", line 426, in get_next
    name=name)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\ops\gen_dataset_ops.py", line 2518, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Deepspeech.py", line 12, in <module>
    ds_train.run_script()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 955, in run_script
    absl.app.run(main)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 927, in main
    train()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 595, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 563, in run_set
    exception_box.raise_if_set()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\helpers.py", line 124, in raise_if_set
    raise exception  # pylint: disable = raising-bad-type
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\helpers.py", line 132, in do_iterate
    yield from iterable()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\feeding.py", line 110, in generate_values
    for sample_index, sample in enumerate(samples):
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\augmentations.py", line 221, in apply_sample_augmentations
    yield from pool.imap(_augment_sample, timed_samples())
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\helpers.py", line 103, in imap
    for obj in self.pool.imap(fun, self._limit(it)):
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\multiprocessing\pool.py", line 735, in next
    raise value
ValueError: Unknown audio type extension ""

Note that my data (about 70 hours, almost 29,000 audio files) is in the correct format (int16, 16 kHz, mono) and none of the audio files are corrupted.
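
For reference, a minimal sketch of how such a format check can be done with Python’s standard wave module (the audio directory below is just a placeholder, not a path from my setup):

import wave
from pathlib import Path

def check_wav(path):
    """Return a list of format problems for one WAV file (empty list means OK)."""
    problems = []
    with wave.open(str(path), 'rb') as w:
        if w.getframerate() != 16000:
            problems.append("sample rate %d != 16000" % w.getframerate())
        if w.getnchannels() != 1:
            problems.append("%d channels, expected mono" % w.getnchannels())
        if w.getsampwidth() != 2:
            problems.append("sample width %d bytes, expected 2 (int16)" % w.getsampwidth())
    return problems

# Placeholder directory; point it at the folder that holds the training clips.
for wav in Path("train/audio").glob("*.wav"):
    issues = check_wav(wav)
    if issues:
        print(wav, "; ".join(issues))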

I’m using Windows 10
CUDA 10.0
cuDNN 7.6.2
tensorflow-gpu 1.15.2
Python 3.6.5

Can anyone help me figure out why I am getting these errors, please?

Please understand we don’t support training on Windows.

There’s something obviously wrong somewhere, please check your dataset.

Please verify with ./bin/run-ldc93s1.sh.

Looks like maybe your training filenames don’t have the ‘.wav’ extension, so the feeding code can’t determine the audio type from the filename and ends up with an empty dataset. Check your CSVs: the training input files need to have the ‘.wav’ extension.
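
A quick way to spot offending rows, as a minimal sketch assuming the standard DeepSpeech CSV layout with a wav_filename column (the CSV path is just an example):

import csv

# Example path; run the same check over the train, dev and test CSVs.
with open("train/train.csv", newline='', encoding='utf-8') as f:
    for line_no, row in enumerate(csv.DictReader(f), start=2):
        if not row["wav_filename"].lower().endswith(".wav"):
            print("line %d: missing .wav extension -> %s" % (line_no, row["wav_filename"]))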

It gives the same error when run on Linux (Ubuntu 18.04 LTS).

That’s true, I will change it, thanks!

It works now. I also had to set ignore_longer_outputs_than_inputs=True in the tf.nn.ctc_loss call in train.py.
Thank you
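
For reference, the change amounts to passing one extra argument to the CTC loss call in calculate_mean_edit_distance_and_loss; a sketch of what the modified call looks like against the 0.7.4 train.py (tfv1 is tensorflow.compat.v1 as imported there; variable names may differ slightly in other versions):

# training/deepspeech_training/train.py, inside calculate_mean_edit_distance_and_loss()
total_loss = tfv1.nn.ctc_loss(labels=batch_y,
                              inputs=logits,
                              sequence_length=batch_seq_len,
                              # Skip samples whose transcript is longer than the
                              # available audio frames instead of raising an error.
                              ignore_longer_outputs_than_inputs=True)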

It means you have some data that needs to be cleaned. Please check your importer and take inspiration from the existing importers, which verify that each audio clip is long enough to match its transcript.
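
A minimal sketch of that kind of check, loosely modeled on what the bundled importers do (the 20 ms frame step matches DeepSpeech’s default feature_win_step, and the CSV path is just an example):

import csv
import wave

FRAME_STEP_MS = 20  # DeepSpeech's default feature_win_step; adjust if you override it

def long_enough(wav_path, transcript):
    """Rough heuristic: the clip must yield at least as many feature frames
    as there are characters in the transcript, or CTC cannot align them."""
    with wave.open(wav_path, 'rb') as w:
        duration_ms = w.getnframes() / w.getframerate() * 1000
    return int(duration_ms / FRAME_STEP_MS) >= len(transcript)

# Example path; drop (or fix) the rows this flags before training.
with open("train/train.csv", newline='', encoding='utf-8') as f:
    for row in csv.DictReader(f):
        if not long_enough(row["wav_filename"], row["transcript"]):
            print("transcript too long for clip:", row["wav_filename"])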

Will do, thanks!