Testing for correctness of the samples

Tejas_Shah · September 3, 2020, 2:32pm

Hello,
I am trying to train the model on a set of WAV audio files. I created the CSV files from the WAV files and their corresponding transcripts.
However, as soon as I start training, I get the following error:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
         [[{{node tower_3/IteratorGetNext}}]]
         [[cond_1/Adam-wrapped/update/NoOp/_336]]
  (1) Out of range: End of sequence
         [[{{node tower_3/IteratorGetNext}}]]
0 successful operations.
3 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 575, in run_set
    feed_dict=feed_dict)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
         [[node tower_3/IteratorGetNext (defined at /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
         [[cond_1/Adam-wrapped/update/NoOp/_336]]
  (1) Out of range: End of sequence
         [[node tower_3/IteratorGetNext (defined at /home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
3 derived errors ignored.

Original stack trace for 'tower_3/IteratorGetNext':
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 970, in run_script
    absl.app.run(main)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 942, in main
    train()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 488, in train
    gradients, loss, non_finite_files = get_tower_results(iterator, optimizer, dropout_rates)
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 321, in get_tower_results
    avg_loss, non_finite_files = calculate_mean_edit_distance_and_loss(iterator, dropout_rates, reuse=i > 0)
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 240, in calculate_mean_edit_distance_and_loss
    batch_filenames, (batch_x, batch_seq_len), batch_y = iterator.get_next()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
    name=name)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 2518, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 970, in run_script
    absl.app.run(main)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 942, in main
    train()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 610, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 578, in run_set
    exception_box.raise_if_set()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 123, in raise_if_set
    raise exception  # pylint: disable = raising-bad-type
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 131, in do_iterate
    yield from iterable()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/feeding.py", line 114, in generate_values
    for sample_index, sample in enumerate(samples):
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/augmentations.py", line 221, in apply_sample_augmentations
    yield from pool.imap(_augment_sample, timed_samples())
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 102, in imap
    for obj in self.pool.imap(fun, self._limit(it)):
  File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
EOFError

I am not sure what the error could be, as it doesn't print the name of the offending file.

I tried running sox --i on some samples and it gave the following output.

Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:08.08 = 129280 samples ~ 606 CDDA sectors
File Size : 259k
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM

The file size in the CSV is the same as what ls -l reports.
Hence it's difficult to understand the reason behind the EOFError.
Can you please help me figure out this issue?

Thanks in advance.

lissyx · September 3, 2020, 3:51pm

Care to share its content? Asking for support without sharing the information needed to understand the problem is not going to be effective.

Maybe the CSV is not written properly?

Tejas_Shah · September 3, 2020, 4:30pm

The zip of the datasets is at https://drive.google.com/file/d/1y8EFkCa8luIf9G_8oOKLHp_vC1CWaPu7/view?usp=sharing
I saw that some WAV files have zero file size. Maybe that is causing the issue?
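
A minimal sketch for finding such files from the CSV itself, instead of inspecting them one by one. It assumes the usual DeepSpeech CSV columns `wav_filename` and `wav_filesize`, and it assumes that relative paths are resolved against the CSV's directory. The function name `find_size_problems` is hypothetical and not part of DeepSpeech.

```python
import csv
import os


def find_size_problems(csv_path):
    """Return (wav_filename, reason) pairs for rows whose audio file looks wrong on disk."""
    problems = []
    csv_dir = os.path.dirname(os.path.abspath(csv_path))
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            name = row["wav_filename"]
            # Assumption: relative paths are relative to the CSV's directory.
            path = name if os.path.isabs(name) else os.path.join(csv_dir, name)
            if not os.path.isfile(path):
                problems.append((name, "missing"))
                continue
            actual = os.path.getsize(path)
            recorded = (row.get("wav_filesize") or "").strip()
            if actual == 0:
                problems.append((name, "zero bytes"))
            elif str(actual) != recorded:
                problems.append((name, "size differs from CSV"))
    return problems
```

Calling `find_size_problems("train.csv")` returns a list that can be printed or used to drop rows. A matching size only shows that the CSV agrees with the disk, not that the audio itself is valid.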

lissyx · September 3, 2020, 4:35pm

Can you just share it as plain text here?

lissyx · September 3, 2020, 4:36pm

Have you verified the format of your CSV file?

Tejas_Shah · September 3, 2020, 6:19pm

I removed all samples that were zero-size or smaller than 50 KB. Now I am getting a new error:

Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 970, in run_script
    absl.app.run(main)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 942, in main
    train()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 610, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/train.py", line 578, in run_set
    exception_box.raise_if_set()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 123, in raise_if_set
    raise exception  # pylint: disable = raising-bad-type
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 131, in do_iterate
    yield from iterable()
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/feeding.py", line 114, in generate_values
    for sample_index, sample in enumerate(samples):
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/augmentations.py", line 221, in apply_sample_augmentations
    yield from pool.imap(_augment_sample, timed_samples())
  File "/home/ubuntu/deepspeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 102, in imap
    for obj in self.pool.imap(fun, self._limit(it)):
  File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
wave.Error: file does not start with RIFF id
Even debug logging does not help in finding the culprit file.
There are 700K samples. Is there a way to test for invalid samples from DeepSpeech's perspective?
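
For context, the `wave.Error` above comes from Python's standard `wave` module, which raises it when a file's first four bytes are not `RIFF`. A minimal sketch for spotting such files without starting a training run is to read the first 12 bytes of each file and check for the `RIFF` and `WAVE` markers. The helper names `looks_like_wav` and `find_non_wav` are hypothetical and not part of DeepSpeech.

```python
def looks_like_wav(path):
    """Return True if the file starts with a RIFF header whose form type is WAVE."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    # Bytes 0-3 hold the container id, bytes 8-11 hold the form type.
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def find_non_wav(paths):
    """Return the paths, in input order, that do not carry a RIFF/WAVE signature."""
    return [path for path in paths if not looks_like_wav(path)]
```

This only inspects the header, so it is cheap enough for a large dataset, but a file can pass this check and still be truncated or otherwise corrupt further in.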

Tejas_Shah · September 4, 2020, 5:13am

Yes @lissyx, the CSV format looks OK.

lissyx · September 4, 2020, 9:51am

So that is not a WAV file?

Please read the importers' code; those are the ones dealing with that.

abdullah.tayyab · December 14, 2020, 9:06pm

@Tejas_Shah Did you find the issue with your CSV files? I have spent hours trying to find out what is wrong with mine, but to no avail.

@lissyx Discourse does not allow new users to upload attachments, so I am not sure what to share here. If it makes more sense, I can open another issue but I was wondering maybe I have the same issue as Tejas; i.e. EOFError - not the RIFF error.

lissyx · December 14, 2020, 9:11pm

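The RIFF error means the file is not really WAV: a WAV file starts with a RIFF header, so the file is probably MP3 or some other format.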

abdullah.tayyab · December 14, 2020, 10:44pm

@lissyx I am not getting the RIFF error. I am getting the EOFError. My files are .wav

(deepspeech-venv) ubuntu@ip-172-31-9-108:~/DeepSpeech$ python3 DeepSpeech.py     --drop_source_layers 1     --alphabet_config_path /$HOME/Uploads/UrduAlphabet_newscrawl2.txt     --checkpoint_dir /$HOME/DeepSpeech/dataset/trained_load_checkpoint     --train_files  /$HOME/Uploads/train.csv     --dev_files   /$HOME/Uploads/dev.csv     --test_files  /$HOME/Uploads/test.csv     --epochs 2     --train_batch_size 32     --export_dir /$HOME/DeepSpeech/dataset/urdu_trained     --export_file_name urdu     --test_batch_size 12     --learning_rate 0.00001     --reduce_lr_on_plateau true     --scorer /$HOME/Uploads/kenlm.scorer
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:00:10 | Steps: 3 | Loss: 146.564260
Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 976, in run_script
    absl.app.run(main)
  File "/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 948, in main
    train()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 605, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 571, in run_set
    exception_box.raise_if_set()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 123, in raise_if_set
    raise exception  # pylint: disable = raising-bad-type
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 131, in do_iterate
    yield from iterable()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/feeding.py", line 114, in generate_values
    for sample_index, sample in enumerate(samples):
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/augmentations.py", line 221, in apply_sample_augmentations
    yield from pool.imap(_augment_sample, timed_samples())
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 102, in imap
    for obj in self.pool.imap(fun, self._limit(it)):
  File "/home/ubuntu/anaconda3/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
EOFError

I am now able to attach the CSV files. Can you please help?
Archive.zip (100.4 KB)

The command I am using is:

python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path /$HOME/Uploads/UrduAlphabet_newscrawl2.txt --checkpoint_dir /$HOME/DeepSpeech/dataset/trained_load_checkpoint --train_files /$HOME/Uploads/trainis.csv --dev_files /$HOME/Uploads/devis.csv --test_files /$HOME/Uploads/testis.csv --epochs 2 --train_batch_size 32 --export_dir /$HOME/DeepSpeech/dataset/urdu_trained --export_file_name urdu --test_batch_size 12 --learning_rate 0.00001 --reduce_lr_on_plateau true --scorer /$HOME/Uploads/kenlm.scorer

This is all set up on Ubuntu 16 with the default installation as on the web site. The setup is working fine for smaller files but doesn’t work when I try to combine transcripts from multiple sources into the same training file.

Tejas_Shah · December 15, 2020, 4:19am

Hi Abdullah,
For me, the problem was with the WAV files rather than the CSV. I wrote a Python script that opens each WAV file with wave.open() and closes it again. An EOFError is raised when the WAV file is corrupt. I discarded those corrupt files from the dataset.
Hope this helps.
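
The script itself was not posted, so the following is only a sketch of the approach described above, using the standard `wave` module. It goes one step further than open-and-close by also reading the frames and comparing the byte count with what the header promises. The function names `check_wav` and `find_bad_wavs` are hypothetical.

```python
import wave


def check_wav(path):
    """Return None if the file opens and reads cleanly, else a short error string."""
    try:
        with wave.open(path, "rb") as wav:
            expected = wav.getnframes() * wav.getsampwidth() * wav.getnchannels()
            data = wav.readframes(wav.getnframes())
            # A shorter read means the data chunk ends before the header says it should.
            if len(data) != expected:
                return "truncated: read %d of %d bytes" % (len(data), expected)
    except (EOFError, wave.Error, OSError) as err:
        return "%s: %s" % (type(err).__name__, err)
    return None


def find_bad_wavs(paths):
    """Map each failing path to the reason it failed; valid files are left out."""
    results = {}
    for path in paths:
        reason = check_wav(path)
        if reason is not None:
            results[path] = reason
    return results
```

Passing the `wav_filename` values from the training CSV to `find_bad_wavs` gives a dictionary of files to drop. An empty file shows up as an `EOFError` and a mislabeled non-WAV file as the RIFF `Error`, which matches the two failures seen in this thread.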
