Error while training the dataset

cv-invalid
cv-other-dev
cv-other-test
cv-other-train
cv-valid-dev
cv-valid-test
cv-valid-train
LICENSE.txt
README.txt
cv-invalid.csv
cv-other-dev.csv
cv-other-test.csv
cv-other-train.csv
cv-valid-dev.csv
cv-valid-test.csv
cv-valid-train.csv

My dataset is in the format shown above; I downloaded it from Kaggle.
Dataset link: https://www.kaggle.com/mozillaorg/common-voice

When I try to run the training command:
!./DeepSpeech.py --train_batch_size 128 --dev_batch_size 128 --test_batch_size 128 --drop_source_layers 2 --show_progressbar True --alphabet_config_path /content/DeepSpeech/data/alphabet.txt --train_files /content/cv-valid-train.csv --dev_files /content/cv-valid-dev.csv --test_files /content/cv-valid-test.csv --epochs 1 --export_dir /content/drive/MyDrive/common_voice_eng/Output --checkpoint_dir /content/drive/MyDrive/common_voice_eng/checkpoint --load_cudnn

it fails with this error:

raise RuntimeError('No transcript data (missing CSV column)')
RuntimeError: No transcript data (missing CSV column)

This is an old release of the dataset. Please use the current Common Voice release and import it as documented, using import_cv2.py.
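
The error message points at the CSV header: DeepSpeech training CSVs are expected to have the columns wav_filename, wav_filesize and transcript, and the importer writes files in that layout. As a rough way to check a file before training, here is a minimal sketch; the helper name missing_columns is made up for illustration, and it only inspects the header row, not the audio files or the transcripts themselves.

```python
import csv

# Columns that DeepSpeech training CSVs are expected to contain.
REQUIRED_COLUMNS = ("wav_filename", "wav_filesize", "transcript")


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header.

    Hypothetical helper for illustration; it reads only the first row.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return [name for name in required if name not in header]
```

Calling missing_columns("/content/cv-valid-train.csv") on the Kaggle file would list whichever of the three columns it lacks; an empty list means the header at least has the expected names.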

We can’t help with that, sorry.

Will the Common Voice Corpus 1 dataset work with DeepSpeech version 0.9.3?

Yes, the CV format is the same; there is just less material. But if you can’t get hold of a server that can hold this amount of data, it will be hard for you to do anything useful. Look at the importers; maybe another dataset is better suited for you?

I have a question: can I train on the Common Voice dataset from the Windows 10 command prompt (CMD)?
I tried running some of the commands from the DeepSpeech “train your own model” instructions, but they aren’t working.
Is there any way I can train on Windows?

Training on Windows is really hard. Try Google Colab if you don’t have a Linux server.

Yes, but how do I import the dataset into Google Colab? The Common Voice dataset is too big.

There are many Google hits for “working with big datasets in Google Colab”, e.g. upload it to your Google Drive.
If you have a specific problem running DeepSpeech you may find help here, but this is not general Colab support.
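
For the Google Drive route, one common pattern is to keep the dataset as a single archive on Drive and unpack it onto the Colab machine's local disk before importing. The following is only a sketch under assumptions: the helper name extract_dataset and the paths in the comments are hypothetical, and it assumes the archive is a tar file (plain or compressed).

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path, dest_dir):
    """Unpack a tar archive into dest_dir and return the top-level entry names.

    Hypothetical helper for illustration; assumes a tar archive
    (compression is auto-detected by the "r:*" mode).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:*") as archive:
        archive.extractall(dest)
    return sorted(entry.name for entry in dest.iterdir())


# In a Colab notebook, Drive is mounted first, for example:
#   from google.colab import drive
#   drive.mount("/content/drive")
# and the helper is then pointed at the archive (hypothetical paths):
#   extract_dataset("/content/drive/MyDrive/my_dataset.tar.gz", "/content/cv")
```

Unpacking to local disk rather than reading thousands of small audio files straight from the mounted Drive is usually faster, but local Colab storage is limited, so this only helps if the dataset fits.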
