Unknown Error from ./bin/run-ldc93s1.sh (DeepSpeech 0.9.1)

I’m Japanese and not fluent in English, so please pardon any mistakes.

When I run the shell script ./bin/run-ldc93s1.sh, which the official website presents as an introduction to training with DeepSpeech, I get an error whose cause I cannot determine.

My current environment is as follows.

  • OS : macOS Catalina ver. 10.15.7
  • Processor : 2.3GHz quad core Intel Core i7
  • Memory : 16GB 3733MHz LPDDR4X
  • Graphics : Intel Iris Plus Graphics 1536MB
  • Anaconda virtual environment
  • Python 3.6.12 (from conda)
  • TensorFlow 1.15.0 (from conda)
  • DeepSpeech 0.9.1

I cloned the repository from GitHub in this environment. Because one of the library imports failed, I added a symbolic link to the deepspeech_training directory under the bin directory. Nothing else in the directory structure has changed. The output of running bin/run-ldc93s1.sh in this state, following the official website, is shown below.

% ./bin/run-ldc93s1.sh
+ '[' '!' -f DeepSpeech.py ']'
+ '[' '!' -f data/ldc93s1/ldc93s1.csv ']'
+ '[' -d '' ']'
++ python -c 'from xdg import BaseDirectory as xdg; print(xdg.save_data_path("deepspeech/ldc93s1"))'
+ checkpoint_dir=/Users/t-yamane/.local/share/deepspeech/ldc93s1
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ python -u DeepSpeech.py --noshow_progressbar --train_files data/ldc93s1/ldc93s1.csv --test_files data/ldc93s1/ldc93s1.csv --train_batch_size 1 --test_batch_size 1 --n_hidden 100 --epochs 200 --checkpoint_dir /Users/t-yamane/.local/share/deepspeech/ldc93s1
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
I Training epoch 0...
Traceback (most recent call last):
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
	 [[{{node tower_0/IteratorGetNext}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 570, in run_set
    feed_dict=feed_dict)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
	 [[node tower_0/IteratorGetNext (defined at /Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'tower_0/IteratorGetNext':
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 976, in run_script
    absl.app.run(main)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 948, in main
    train()
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 483, in train
    gradients, loss, non_finite_files = get_tower_results(iterator, optimizer, dropout_rates)
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 316, in get_tower_results
    avg_loss, non_finite_files = calculate_mean_edit_distance_and_loss(iterator, dropout_rates, reuse=i > 0)
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 235, in calculate_mean_edit_distance_and_loss
    batch_filenames, (batch_x, batch_seq_len), batch_y = iterator.get_next()
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
    name=name)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 2518, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 976, in run_script
    absl.app.run(main)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 948, in main
    train()
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 605, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/train.py", line 573, in run_set
    exception_box.raise_if_set()
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/util/helpers.py", line 123, in raise_if_set
    raise exception  # pylint: disable = raising-bad-type
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/util/helpers.py", line 131, in do_iterate
    yield from iterable()
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/util/feeding.py", line 114, in generate_values
    for sample_index, sample in enumerate(samples):
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/util/augmentations.py", line 221, in apply_sample_augmentations
    yield from pool.imap(_augment_sample, timed_samples())
  File "/Users/t-yamane/Documents/sotsuken/DeepSpeech/env2/DeepSpeech/training/deepspeech_training/util/helpers.py", line 102, in imap
    for obj in self.pool.imap(fun, self._limit(it)):
  File "/Users/t-yamane/opt/anaconda3/envs/sotsuken2/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
EOFError

This script is a very simple command that trains and tests on a single training sample. It is set to run 200 epochs, but I get this error immediately, in the first epoch. I am asking here because I could not find a similar case on this forum.
How can I get rid of this error and complete the training? I’m sorry if this question comes from my own lack of study.

So basically there’s an error you don’t share, you do something that we don’t document, and you don’t share the steps you took.

I can only assume you have not done the pip install step from the docs, considering you mention TensorFlow 1.15.0 while our requirement is 1.15.4.

There’s nothing we can do if you don’t follow the docs.

Please also note you won’t be able to use any GPU on macOS for training. Pure CPU will be super slow.

As you pointed out, the problem was the version of TensorFlow. After carefully correcting it, it works fine. Thank you very much.

@mtness_8810 Please let us know what you changed as this might help others in the future.
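
The thread does not record what was actually changed. For anyone hitting a similar failure, here is a minimal sketch of a check for the version mismatch discussed above. It assumes the required version is exactly TensorFlow 1.15.4, as stated in the reply; confirm that against the requirements of the release you cloned. The helper names (version_tuple, matches_required, installed_tensorflow_version) are hypothetical and not part of DeepSpeech.

```python
import re

# Assumption: taken from the reply in this thread, not read from the DeepSpeech sources.
REQUIRED_TF = "1.15.4"


def version_tuple(version):
    """Turn a string such as '1.15.4' into (1, 15, 4), ignoring suffixes like 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '2.0' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def matches_required(installed, required=REQUIRED_TF):
    """Return True when the installed version equals the required one."""
    return version_tuple(installed) == version_tuple(required)


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


if __name__ == "__main__":
    found = installed_tensorflow_version()
    if found is None:
        print("TensorFlow is not installed in this environment.")
    elif matches_required(found):
        print(f"TensorFlow {found} matches the expected {REQUIRED_TF}.")
    else:
        print(f"TensorFlow {found} found, but {REQUIRED_TF} is expected.")
```

Run it inside the same conda environment that runs DeepSpeech.py. With the environment described in the first post, matches_required("1.15.0") returns False, which is the mismatch the reply points to.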