Testing for correctness of the samples

abdullah.tayyab · December 14, 2020, 10:44pm

@lissyx I am not getting the RIFF error; I am getting an EOFError. My files are .wav files.

(deepspeech-venv) ubuntu@ip-172-31-9-108:~/DeepSpeech$ python3 DeepSpeech.py     --drop_source_layers 1     --alphabet_config_path /$HOME/Uploads/UrduAlphabet_newscrawl2.txt     --checkpoint_dir /$HOME/DeepSpeech/dataset/trained_load_checkpoint     --train_files  /$HOME/Uploads/train.csv     --dev_files   /$HOME/Uploads/dev.csv     --test_files  /$HOME/Uploads/test.csv     --epochs 2     --train_batch_size 32     --export_dir /$HOME/DeepSpeech/dataset/urdu_trained     --export_file_name urdu     --test_batch_size 12     --learning_rate 0.00001     --reduce_lr_on_plateau true     --scorer /$HOME/Uploads/kenlm.scorer
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:00:10 | Steps: 3 | Loss: 146.564260
Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 976, in run_script
    absl.app.run(main)
  File "/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 948, in main
    train()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 605, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 571, in run_set
    exception_box.raise_if_set()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 123, in raise_if_set
    raise exception  # pylint: disable = raising-bad-type
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 131, in do_iterate
    yield from iterable()
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/feeding.py", line 114, in generate_values
    for sample_index, sample in enumerate(samples):
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/augmentations.py", line 221, in apply_sample_augmentations
    yield from pool.imap(_augment_sample, timed_samples())
  File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 102, in imap
    for obj in self.pool.imap(fun, self._limit(it)):
  File "/home/ubuntu/anaconda3/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
EOFError

I am now able to attach the CSV files. Can you please help?
Archive.zip (100.4 KB)

The command I am using is:

python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path /$HOME/Uploads/UrduAlphabet_newscrawl2.txt --checkpoint_dir /$HOME/DeepSpeech/dataset/trained_load_checkpoint --train_files /$HOME/Uploads/trainis.csv --dev_files /$HOME/Uploads/devis.csv --test_files /$HOME/Uploads/testis.csv --epochs 2 --train_batch_size 32 --export_dir /$HOME/DeepSpeech/dataset/urdu_trained --export_file_name urdu --test_batch_size 12 --learning_rate 0.00001 --reduce_lr_on_plateau true --scorer /$HOME/Uploads/kenlm.scorer

This is all set up on Ubuntu 16 with the default installation described on the website. The setup works fine with smaller files, but fails when I combine transcripts from multiple sources into the same training file.
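
One way to test the samples before training is to open every file listed in the CSV with Python's standard wave module, which raises EOFError on an empty or header-truncated file and wave.Error on a file without a RIFF header. The sketch below assumes the usual DeepSpeech CSV layout with a wav_filename column, and assumes (this is not confirmed by the traceback) that the EOFError above comes from such a file. The function name find_unreadable_wavs is made up for illustration and is not part of DeepSpeech.

```python
import csv
import wave
from pathlib import Path


def find_unreadable_wavs(csv_path):
    """Return (wav path, reason) pairs for samples the wave module cannot read.

    Hypothetical helper: assumes a CSV with a `wav_filename` column.
    """
    csv_path = Path(csv_path)
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            wav_path = Path(row["wav_filename"])
            # Assumption: relative paths are relative to the CSV's directory.
            if not wav_path.is_absolute():
                wav_path = csv_path.parent / wav_path
            try:
                with wave.open(str(wav_path), "rb") as wav:
                    frame_count = wav.getnframes()
                    if frame_count == 0:
                        problems.append((str(wav_path), "no audio frames"))
                        continue
                    # Read the audio data so obviously broken files surface here.
                    wav.readframes(frame_count)
            except (EOFError, wave.Error, OSError) as exc:
                reason = f"{type(exc).__name__}: {exc}"
                problems.append((str(wav_path), reason))
    return problems
```

Calling find_unreadable_wavs on each of the train, dev and test CSVs and printing the result should list any file that cannot be opened, which helps narrow down which of the combined sources contributes the bad samples.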