@lissyx I am not getting the RIFF error; I am getting an EOFError. My files are .wav.
(deepspeech-venv) ubuntu@ip-172-31-9-108:~/DeepSpeech$ python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path /$HOME/Uploads/UrduAlphabet_newscrawl2.txt --checkpoint_dir /$HOME/DeepSpeech/dataset/trained_load_checkpoint --train_files /$HOME/Uploads/train.csv --dev_files /$HOME/Uploads/dev.csv --test_files /$HOME/Uploads/test.csv --epochs 2 --train_batch_size 32 --export_dir /$HOME/DeepSpeech/dataset/urdu_trained --export_file_name urdu --test_batch_size 12 --learning_rate 0.00001 --reduce_lr_on_plateau true --scorer /$HOME/Uploads/kenlm.scorer
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:00:10 | Steps: 3 | Loss: 146.564260
Traceback (most recent call last):
File "DeepSpeech.py", line 12, in <module>
ds_train.run_script()
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 976, in run_script
absl.app.run(main)
File "/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/home/ubuntu/tmp/deepspeech-venv/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 948, in main
train()
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 605, in train
train_loss, _ = run_set('train', epoch, train_init_op)
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/train.py", line 571, in run_set
exception_box.raise_if_set()
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 123, in raise_if_set
raise exception # pylint: disable = raising-bad-type
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 131, in do_iterate
yield from iterable()
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/feeding.py", line 114, in generate_values
for sample_index, sample in enumerate(samples):
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/augmentations.py", line 221, in apply_sample_augmentations
yield from pool.imap(_augment_sample, timed_samples())
File "/home/ubuntu/DeepSpeech/training/deepspeech_training/util/helpers.py", line 102, in imap
for obj in self.pool.imap(fun, self._limit(it)):
File "/home/ubuntu/anaconda3/lib/python3.7/multiprocessing/pool.py", line 748, in next
raise value
EOFError
I am now able to attach the CSV files. Can you please help?
Archive.zip (100.4 KB)
The command I am using is:
python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path /$HOME/Uploads/UrduAlphabet_newscrawl2.txt --checkpoint_dir /$HOME/DeepSpeech/dataset/trained_load_checkpoint --train_files /$HOME/Uploads/trainis.csv --dev_files /$HOME/Uploads/devis.csv --test_files /$HOME/Uploads/testis.csv --epochs 2 --train_batch_size 32 --export_dir /$HOME/DeepSpeech/dataset/urdu_trained --export_file_name urdu --test_batch_size 12 --learning_rate 0.00001 --reduce_lr_on_plateau true --scorer /$HOME/Uploads/kenlm.scorer
This is all set up on Ubuntu 16 with the default installation described on the website. The setup works fine for smaller files but fails when I combine transcripts from multiple sources into the same training file.
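Since the traceback ends in a bare EOFError raised while samples are being loaded, one thing that may be worth checking is whether any WAV file listed in the combined CSVs is empty or truncated. This is only a guess at the cause, not a confirmed diagnosis. Below is a minimal sketch of such a check. It is not part of DeepSpeech: `find_unreadable_wavs` is a hypothetical helper name, and the script assumes the CSVs have a `wav_filename` column and that relative paths resolve against the CSV's own folder.

```python
import csv
import wave
from pathlib import Path


def find_unreadable_wavs(csv_path):
    """Return (wav_filename, reason) pairs for rows whose audio cannot be read.

    Hypothetical helper: assumes a header row containing `wav_filename`.
    """
    problems = []
    csv_path = Path(csv_path)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            wav_path = Path(row["wav_filename"])
            if not wav_path.is_absolute():
                # Assumption: relative paths are relative to the CSV's directory.
                wav_path = csv_path.parent / wav_path
            try:
                with wave.open(str(wav_path), "rb") as wav:
                    # A valid header with zero frames is also suspicious.
                    if wav.getnframes() == 0:
                        problems.append((row["wav_filename"], "no audio frames"))
            except (EOFError, wave.Error, OSError) as err:
                # EOFError: file is empty or cut off before the header ends.
                # wave.Error: not a RIFF/WAVE file or unsupported format.
                # OSError: missing file or permission problem.
                problems.append(
                    (row["wav_filename"], f"{type(err).__name__}: {err}")
                )
    return problems
```

Calling `find_unreadable_wavs` on each of the train, dev, and test CSVs and printing the returned pairs would list the files to inspect or drop before rerunning training. If the list comes back empty, the problem is probably elsewhere.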