Problem when training DeepSpeech v0.7.4 on a custom dataset

Hi! I’m trying to train DeepSpeech 0.7.4 on my own dataset. I generated the scorer file and then ran this command:

python -u Deepspeech.py --alphabet_config_path data/alphabet.txt --train_files train/train.csv --dev_files dev/dev.csv --test_files test/test.csv --scorer data/lm/kenlm.scorer --export_dir output --checkpoint_dir output

I get these errors:

I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:00:00 | Steps: 0 | Loss: 0.000000
Traceback (most recent call last):
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
         [[{{node tower_0/IteratorGetNext}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 560, in run_set
    feed_dict=feed_dict)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
         [[node tower_0/IteratorGetNext (defined at C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

Original stack trace for 'tower_0/IteratorGetNext':
  File "Deepspeech.py", line 12, in <module>
    ds_train.run_script()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 955, in run_script
    absl.app.run(main)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 927, in main
    train()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 473, in train
    gradients, loss, non_finite_files = get_tower_results(iterator, optimizer, dropout_rates)
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 312, in get_tower_results
    avg_loss, non_finite_files = calculate_mean_edit_distance_and_loss(iterator, dropout_rates, reuse=i > 0)
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 231, in calculate_mean_edit_distance_and_loss
    batch_filenames, (batch_x, batch_seq_len), batch_y = iterator.get_next()
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py", line 426, in get_next
    name=name)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\ops\gen_dataset_ops.py", line 2518, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Deepspeech.py", line 12, in <module>
    ds_train.run_script()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 955, in run_script
    absl.app.run(main)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 927, in main
    train()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 595, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\train.py", line 563, in run_set
    exception_box.raise_if_set()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\helpers.py", line 124, in raise_if_set
    raise exception  # pylint: disable = raising-bad-type
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\helpers.py", line 132, in do_iterate
    yield from iterable()
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\feeding.py", line 110, in generate_values
    for sample_index, sample in enumerate(samples):
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\augmentations.py", line 221, in apply_sample_augmentations
    yield from pool.imap(_augment_sample, timed_samples())
  File "C:\Users\ghada\Desktop\DeepSpeech-0.7.4\training\deepspeech_training\util\helpers.py", line 103, in imap
    for obj in self.pool.imap(fun, self._limit(it)):
  File "C:\Users\ghada\Anaconda3\envs\mhb\lib\multiprocessing\pool.py", line 735, in next
    raise value
ValueError: Unknown audio type extension ""

Note that my data (about 70 hours, almost 29,000 audio files) is in the correct format (int16, 16 kHz, mono) and none of the audio files are corrupted.
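
For reference, a minimal sketch of how such a format check can be done with Python’s standard wave module (the audio directory below is just a placeholder, not a path from my setup):

import wave
from pathlib import Path

def check_wav(path):
    """Return a list of format problems for one WAV file (empty list means OK)."""
    problems = []
    with wave.open(str(path), 'rb') as w:
        if w.getframerate() != 16000:
            problems.append("sample rate %d != 16000" % w.getframerate())
        if w.getnchannels() != 1:
            problems.append("%d channels, expected mono" % w.getnchannels())
        if w.getsampwidth() != 2:
            problems.append("sample width %d bytes, expected 2 (int16)" % w.getsampwidth())
    return problems

# Placeholder directory; point it at the folder that holds the training clips.
for wav in Path("train/audio").glob("*.wav"):
    issues = check_wav(wav)
    if issues:
        print(wav, "; ".join(issues))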

I’m using Windows 10
CUDA 10.0
cuDNN 7.6.2
tensorflow-gpu 1.15.2
Python 3.6.5

Can anyone help me figure out why I am getting these errors, please?

Please understand we don’t support training on Windows.

There’s something obviously wrong somewhere, please check your dataset.

Please verify with ./bin/run-ldc93s1.sh.

Looks like maybe your training filenames don’t have the ‘.wav’ extension, so the feeding code can’t determine the audio type from the filename and ends up with an empty dataset. Check your CSVs: the training input files need to have the ‘.wav’ extension.
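
A quick way to spot offending rows, as a minimal sketch assuming the standard DeepSpeech CSV layout with a wav_filename column (the CSV path is just an example):

import csv

# Example path; run the same check over the train, dev and test CSVs.
with open("train/train.csv", newline='', encoding='utf-8') as f:
    for line_no, row in enumerate(csv.DictReader(f), start=2):
        if not row["wav_filename"].lower().endswith(".wav"):
            print("line %d: missing .wav extension -> %s" % (line_no, row["wav_filename"]))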

It gives the same error when run on Linux (Ubuntu 18.04 LTS).

That’s true, I will change it, thanks!

It works now. I also had to set ignore_longer_outputs_than_inputs=True in the tf.nn.ctc_loss call in train.py.
Thank you
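
For reference, the change amounts to passing one extra argument to the CTC loss call in calculate_mean_edit_distance_and_loss; a sketch of what the modified call looks like against the 0.7.4 train.py (tfv1 is tensorflow.compat.v1 as imported there; variable names may differ slightly in other versions):

# training/deepspeech_training/train.py, inside calculate_mean_edit_distance_and_loss()
total_loss = tfv1.nn.ctc_loss(labels=batch_y,
                              inputs=logits,
                              sequence_length=batch_seq_len,
                              # Skip samples whose transcript is longer than the
                              # available audio frames instead of raising an error.
                              ignore_longer_outputs_than_inputs=True)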

It means you have some data that needs to be cleaned. Please check your importer and take inspiration from the existing importers, which verify that each audio clip is long enough to match its transcript.
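
A minimal sketch of that kind of check, loosely modeled on what the bundled importers do (the 20 ms frame step matches DeepSpeech’s default feature_win_step, and the CSV path is just an example):

import csv
import wave

FRAME_STEP_MS = 20  # DeepSpeech's default feature_win_step; adjust if you override it

def long_enough(wav_path, transcript):
    """Rough heuristic: the clip must yield at least as many feature frames
    as there are characters in the transcript, or CTC cannot align them."""
    with wave.open(wav_path, 'rb') as w:
        duration_ms = w.getnframes() / w.getframerate() * 1000
    return int(duration_ms / FRAME_STEP_MS) >= len(transcript)

# Example path; drop (or fix) the rows this flags before training.
with open("train/train.csv", newline='', encoding='utf-8') as f:
    for row in csv.DictReader(f):
        if not long_enough(row["wav_filename"], row["transcript"]):
            print("transcript too long for clip:", row["wav_filename"])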

Will do, thanks!