Preprocessing data

swarajbadhei · July 13, 2020, 1:44pm

I collected an Indian-accent dataset, put all the .mp3 files into one directory, and prepared the csv files with two columns [path,sentence], following the Common Voice dataset structure. But it did not work with import_cv2.py: it ended up creating just one train-all.csv file in the clips folder. I really need help with this.

othiele · July 13, 2020, 3:07pm

Check the import python script, and you’ll find out. And please follow these guidelines for posting:

https://discourse.mozilla.org/t/what-and-how-to-report-if-you-need-support/62071/2

swarajbadhei · July 14, 2020, 6:07am

Thanks for the reply, but I am unable to understand where I can find that. Would you care to explain, sir?

othiele · July 14, 2020, 7:13am

You’ll need to check the import script yourself; it usually works. But use one from a release, not the latest master, as there may be some changes.

swarajbadhei · July 14, 2020, 7:26am

Okay sir, will do. Can you please confirm that the csv file attributes are sufficient, i.e. whether any other column is necessary apart from ‘path’ and ‘sentence’?

othiele · July 14, 2020, 9:00am

Please follow the guide I provided. You are not posting the actual csv, you are posting images. I can only guess that you did not read the guidelines … sorry, can’t help

swarajbadhei · July 14, 2020, 2:52pm

files.zip (212.6 KB)
I am sorry sir, the upload did not support tsv files. I have attached a zip file that contains the current error and the tsv file structure. My dataset contains .wav files and the tsv file contains 3 fields named client_id, path and sentence. Please have a look. Thanks for your kind attention.

DeepSpeech version 0.7.1
Ubuntu 18.04
Intel i5 8th gen
RAM 16GB
TensorFlow version 1.14.0

othiele · July 15, 2020, 7:34am

The format in general looks fine to me; you’ll have to debug the import script to find out what’s wrong with your script.

lissyx · July 15, 2020, 10:07am

So you re-used the script for something completely different that you built?

You don’t seem to have read the documentation carefully: training with 0.7 requires TensorFlow 1.15.
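A quick way to confirm the mismatch is to compare the installed TensorFlow version against the 1.15 requirement. This is only a sketch: `is_required_tf` is a made-up helper name, and it assumes that any 1.15.x patch release is acceptable.

```python
def is_required_tf(installed, required=(1, 15)):
    """Return True if the major.minor part of `installed` equals `required`.

    `is_required_tf` is a hypothetical helper, not part of DeepSpeech.
    Assumption: any 1.15.x patch release satisfies the requirement.
    """
    parts = installed.split(".")
    try:
        major_minor = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        # Not a recognisable "major.minor[.patch]" version string.
        return False
    return major_minor == required


print(is_required_tf("1.14.0"))  # version reported in this thread -> False
print(is_required_tf("1.15.2"))  # -> True

# With TensorFlow installed, the real version can be checked like this:
#   import tensorflow as tf
#   print(tf.__version__, is_required_tf(tf.__version__))
```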

You can’t expect us to be of any help if you don’t elaborate more on how you

lissyx · July 15, 2020, 10:09am

This is not actionable:

screenshots are not helpful
you only share one test.tsv
your error in the screenshot is completely different from what you report in this thread, and it is caused by a file of yours that does not exist.
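A missing file like the one in that error can be caught before running the importer at all. The sketch below reads one TSV and reports missing columns, empty sentences, and audio paths that do not exist on disk. It rests on assumptions: `check_tsv` is a hypothetical helper, the required columns are taken from this thread (‘path’ and ‘sentence’) rather than from the import script, and the `path` values are assumed to be relative to the clips directory. Check all three against the import script of the release you use.

```python
import csv
import os

# Assumption: column names taken from this thread, not from the importer.
REQUIRED_COLUMNS = ("path", "sentence")


def check_tsv(tsv_path, clips_dir):
    """Return a list of human-readable problems found in one TSV file.

    Hypothetical helper, not part of DeepSpeech. Assumes each `path`
    value is relative to `clips_dir`.
    """
    problems = []
    with open(tsv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        header = reader.fieldnames or []
        missing = [c for c in REQUIRED_COLUMNS if c not in header]
        if missing:
            # Without the required columns the rows cannot be checked.
            return ["missing columns: " + ", ".join(missing)]
        for row in reader:
            line_no = reader.line_num  # line in the file, header is line 1
            if not (row.get("sentence") or "").strip():
                problems.append(f"line {line_no}: empty sentence")
            audio = os.path.join(clips_dir, row.get("path") or "")
            if not os.path.isfile(audio):
                problems.append(f"line {line_no}: audio file not found: {audio}")
    return problems


# Example call (both paths are placeholders for your own dataset):
#   for problem in check_tsv("my_dataset/test.tsv", "my_dataset/clips"):
#       print(problem)
```

Posting the text output of a check like this, together with every TSV you feed to the importer, is far more actionable than a screenshot.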