pvk444
(Pvk444)
July 25, 2019, 8:29pm
1
I’m using the latest DeepSpeech git clone on Ubuntu 18.04, and have downloaded Common Voice 2, as required.
When running
bin/import_cv2
the program correctly finds the *.tsv files and clip folders, but then reports
Final amount of imported audio: 0
and all files were skipped due to failing upon conversion. No other errors are reported.
Any help would be greatly appreciated.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
July 26, 2019, 6:20am
2
Can you share the log so we can have a look?
pvk444
(Pvk444)
July 26, 2019, 7:01am
3
Here is the terminal output. Is there a log file somewhere as well? I did not find any reference to one when looking through import_cv2.py.
(nlp) orchestrate@gpurig:~/projects/DeepSpeech$ bin/import_cv2.py --filter_alphabet /home/orchestrate/projects/DeepSpeech/data/alphabet.txt /home/orchestrate/projects/corpora/common_voice_2
Loading TSV file: /home/orchestrate/projects/corpora/common_voice_2/train.tsv
Saving new DeepSpeech-formatted CSV file to: /home/orchestrate/projects/corpora/common_voice_2/clips/train.csv
Importing mp3 files…
Progress |################################################################################################################################################################################## | 99% completedWriting CSV file for DeepSpeech.py as: /home/orchestrate/projects/corpora/common_voice_2/clips/train.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 63330 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Loading TSV file: /home/orchestrate/projects/corpora/common_voice_2/test.tsv
Saving new DeepSpeech-formatted CSV file to: /home/orchestrate/projects/corpora/common_voice_2/clips/test.csv
Importing mp3 files…
Progress |###################################################################################################################################################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/orchestrate/projects/corpora/common_voice_2/clips/test.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 13178 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Loading TSV file: /home/orchestrate/projects/corpora/common_voice_2/dev.tsv
Saving new DeepSpeech-formatted CSV file to: /home/orchestrate/projects/corpora/common_voice_2/clips/dev.csv
Importing mp3 files…
Progress |###################################################################################################################################################################################| 100% completedWriting CSV file for DeepSpeech.py as: /home/orchestrate/projects/corpora/common_voice_2/clips/dev.csv
Progress |# | 100% completed
Imported 0 samples.
Skipped 13178 samples that failed upon conversion.
Final amount of imported audio: 0:00:00.
Progress |###################################################################################################################################################################################| 100% completed
Progress |###################################################################################################################################################################################| 100% completed
Progress |###################################################################################################################################################################################| 100% completed
(nlp) orchestrate@gpurig:~/projects/DeepSpeech$
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
July 26, 2019, 7:09am
4
How fast does it complete?
pvk444
(Pvk444)
July 26, 2019, 7:56am
5
With an Intel Xeon E5-2650 v2 CPU @ 2.60GHz and 64 GB RAM, the entire import_cv2 run takes 825.22 seconds.
reuben
July 26, 2019, 8:54am
6
import_cv2.py
is masking the real error. Try removing the try block here so you can see the actual problem:
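To illustrate what "masking" means here, below is a minimal sketch of the general pattern, not the actual code from import_cv2.py. All names (`import_masked`, `import_reporting`, `failing_convert`) are hypothetical and exist only for this example. The first function counts failures but throws away the reason; the second keeps the exception text so the cause of each skipped file is visible.

```python
from typing import Callable, Iterable, List, Tuple


def import_masked(paths: Iterable[str], convert: Callable[[str], None]) -> Tuple[int, int]:
    """Count imported/skipped files, discarding the reason for each failure."""
    imported = skipped = 0
    for path in paths:
        try:
            convert(path)
            imported += 1
        except Exception:
            # The exception is swallowed: only the counter changes.
            skipped += 1
    return imported, skipped


def import_reporting(
    paths: Iterable[str], convert: Callable[[str], None]
) -> Tuple[int, int, List[Tuple[str, str]]]:
    """Same counting, but keep (path, error) pairs so failures can be diagnosed."""
    imported = skipped = 0
    errors: List[Tuple[str, str]] = []
    for path in paths:
        try:
            convert(path)
            imported += 1
        except Exception as exc:
            skipped += 1
            errors.append((path, repr(exc)))
    return imported, skipped, errors


def failing_convert(path: str) -> None:
    """Hypothetical stand-in for a conversion step that always fails."""
    raise RuntimeError(f"cannot convert {path}")


if __name__ == "__main__":
    print(import_masked(["a.mp3", "b.mp3"], failing_convert))
    print(import_reporting(["a.mp3"], failing_convert))
```

Removing the try block entirely, as suggested, goes one step further: the first failing file stops the run with a full traceback, which is usually the quickest way to see the underlying problem.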
pvk444
(Pvk444)
July 26, 2019, 11:16am
7
This is really odd: it works now. Because of some other challenges, I had to reinstall SWIG and rebuild ctcdecoder while updating and running import_cv2. This seems to have “unblocked” something (what, I can’t tell), and the import now runs as expected.
Thanks for the help reuben and lissyx.
eggonlea
(Eggonlea)
July 27, 2019, 5:36pm
8
Most likely you fixed the sox package at the same time.
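If the sox package is indeed the cause, one way to check is to see whether the installed SoX binary lists mp3 among its supported formats. The sketch below is an assumption-laden helper, not part of DeepSpeech: the function names are hypothetical, and it assumes that `sox --help` prints a line starting with `AUDIO FILE FORMATS:` followed by space-separated format names.

```python
import shutil
import subprocess
from typing import Optional, Set


def formats_from_help(help_text: str) -> Set[str]:
    """Extract format names from an 'AUDIO FILE FORMATS:' line (assumed layout)."""
    for line in help_text.splitlines():
        if line.startswith("AUDIO FILE FORMATS:"):
            return set(line.split(":", 1)[1].split())
    return set()


def sox_reads_mp3() -> Optional[bool]:
    """Return None if sox is not on PATH, else whether mp3 is listed as a format."""
    sox_path = shutil.which("sox")
    if sox_path is None:
        return None
    # The help text may go to stdout or stderr depending on the build; check both.
    result = subprocess.run([sox_path, "--help"], capture_output=True, text=True)
    return "mp3" in formats_from_help(result.stdout + result.stderr)


if __name__ == "__main__":
    print("sox mp3 support:", sox_reads_mp3())
```

On Ubuntu, mp3 support for SoX normally comes from a separate format package (libsox-fmt-mp3), so a result of `False` would point in that direction.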