When I run the command below to train on a Common Voice dataset:
python DeepSpeech.py --train_files ./20140421/scripts/Ib/clips/train.tsv --dev_files ./20140421/scripts/Ib/clips/dev.csv --test_files ./20140421/scripts/Ib/clips/test.csv
I get the following error:
STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:00:00 | Steps: 0 | Loss: 0.000000 Traceback (most recent call last):
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[{{node tower_0/IteratorGetNext}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 552, in run_set
feed_dict=feed_dict)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[node tower_0/IteratorGetNext (defined at /home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Original stack trace for 'tower_0/IteratorGetNext':
File "DeepSpeech.py", line 12, in <module>
ds_train.run_script()
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 942, in run_script
absl.app.run(main)
File "/home/ritish/.local/lib/python3.7/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/ritish/.local/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 914, in main
train()
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 474, in train
gradients, loss, non_finite_files = get_tower_results(iterator, optimizer, dropout_rates)
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 312, in get_tower_results
avg_loss, non_finite_files = calculate_mean_edit_distance_and_loss(iterator, dropout_rates, reuse=i > 0)
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 231, in calculate_mean_edit_distance_and_loss
batch_filenames, (batch_x, batch_seq_len), batch_y = iterator.get_next()
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
name=name)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 2518, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/home/ritish/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "DeepSpeech.py", line 12, in <module>
ds_train.run_script()
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 942, in run_script
absl.app.run(main)
File "/home/ritish/.local/lib/python3.7/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/ritish/.local/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 914, in main
train()
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 592, in train
train_loss, _ = run_set('train', epoch, train_init_op)
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 560, in run_set
exception_box.raise_if_set()
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 117, in raise_if_set
raise exception # pylint: disable = raising-bad-type
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 125, in do_iterate
yield from iterable()
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/feeding.py", line 119, in generate_values
samples = samples_from_files(sources, buffering=buffering, labeled=True)
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/sample_collections.py", line 363, in samples_from_files
return samples_from_file(filenames[0], buffering=buffering, labeled=labeled)
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/sample_collections.py", line 338, in samples_from_file
return CSV(filename, labeled=labeled)
File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/sample_collections.py", line 288, in __init__
raise RuntimeError('No transcript data (missing CSV column)')
I have placed all the CSV files in the same clips folder as the MP3 files. The files were converted from TSV to CSV with Python, and they have the following headers (shown with the first two rows):
|client_id|path|sentence|up_votes|down_votes|age|gender|accent|
|---|---|---|---|---|---|---|---|
|181b63f0202ba1fd0594b5a55c4a9bb53429c87b2d2bda74f0370273a94bcffeaf810c3b0838481915f9dfbe59011ea40c5f9281b6a858c7f60f4a644c148d7f|common_voice_ga-IE_18183675.mp3|Gura fada buan sibh agus go raibh míle maith agaibh go léir|2|0|twenties|male|connachta|
|181b63f0202ba1fd0594b5a55c4a9bb53429c87b2d2bda74f0370273a94bcffeaf810c3b0838481915f9dfbe59011ea40c5f9281b6a858c7f60f4a644c148d7f|common_voice_ga-IE_18183677.mp3|Is í ding di féin a scoileann an dair|2|1|twenties|male|connachta|
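Since the last error says "missing CSV column", one way to narrow this down is to compare the header against the column names the loader looks for. The snippet below is only a minimal sketch: it assumes the expected columns are `wav_filename`, `wav_filesize` and `transcript` (the names used in the DeepSpeech training documentation), and `missing_columns` is a hypothetical helper written for this post, not part of DeepSpeech.

```python
import csv
import io

# Assumption: column names taken from the DeepSpeech training docs,
# not read out of the installed version of the code.
EXPECTED_COLUMNS = ("wav_filename", "wav_filesize", "transcript")


def missing_columns(csv_text, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []  # fieldnames is None for empty input
    return [name for name in expected if name not in header]


# Header copied from the table above, in comma-separated form.
sample = "client_id,path,sentence,up_votes,down_votes,age,gender,accent\n"
print(missing_columns(sample))
```

For a real file, the same check can be run on the first line of train.csv, dev.csv and test.csv. If the assumption about the column names holds, the Common Voice header above would be reported as missing all three of them.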
I downloaded the data from the following link:
https://voice.mozilla.org/en/datasets
Please let me know where I am going wrong.