Help creating a model from the Mozilla Common Voice corpus

javi.rahman · August 17, 2019, 11:50am

Hi,

I am trying to create a new model from the Common Voice corpus (English).

As per the documentation, I downloaded the English corpus tar file.

Then I extracted it and ran the import script:

bin/import_cv2.py --filter_alphabet path/to/some/alphabet.txt /path/to/extracted/language/archive

I am now at the training step, running this command:

./DeepSpeech.py --train_files CommonVoice-2.0/corpus3-en/clips/train.csv --dev_files CommonVoice-2.0/corpus3-en/clips/dev.csv --test_files CommonVoice-2.0/corpus3-en/clips/test.csv --train_batch_size 12 --dev_batch_size 12 --test_batch_size 12 --learning_rate 0.0001 --epoch 95 --validation_step 5 --dropout_rate 0.30 --default_stddev 0.046875 --export_dir /opt/deepspeech/exportmodels --checkpoint /opt/deepspeech/checkpoint

But it is taking too long.

Please confirm: once the command finishes, will it produce the same kind of files as those inside deepspeech-0.5.1-models?

I assume it will create new model files (lm.binary, output_graph.pb, output_graph.pbmm, output_graph.tflite).

lissyx · August 18, 2019, 11:11am

Please share details. We do document that training is an intensive process, so without knowing your hardware and what you consider “too long”, we can’t do anything.

Don’t assume, read the documentation and the help.

javi.rahman · August 18, 2019, 12:47pm

Hi @lissyx,

I am using a 7th-generation Core i7, 8 GB of RAM, and a 4 GB Nvidia graphics card.

I will check the documentation and let you know if I have any questions.

lissyx · August 18, 2019, 1:26pm

Then it is no surprise: 250 h of French audio takes ~4 h to train on 2x RTX 2080 Ti GPUs.
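To compare a dataset against a figure like that, it helps to know roughly how many hours of audio the training CSV holds. The sketch below is only an estimate under stated assumptions: it assumes the importer's CSV has a `wav_filesize` column (in bytes) and that the clips are 16 kHz, 16-bit, mono WAV files with a 44-byte header. The function name `estimate_hours` is made up for illustration.

```python
import csv
import sys

# Assumptions (not confirmed in this thread): 16 kHz, 16-bit, mono PCM WAV files.
BYTES_PER_SECOND = 16000 * 2
WAV_HEADER_BYTES = 44  # canonical WAV header size, also an assumption


def estimate_hours(csv_path):
    """Roughly estimate hours of audio from the wav_filesize column of a CSV."""
    total_bytes = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Subtract the header so only audio samples are counted.
            total_bytes += max(int(row["wav_filesize"]) - WAV_HEADER_BYTES, 0)
    return total_bytes / BYTES_PER_SECOND / 3600


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python estimate_hours.py CommonVoice-2.0/corpus3-en/clips/train.csv
    print("%.1f hours (estimate)" % estimate_hours(sys.argv[1]))
```

If the clips use a different sample rate or bit depth, adjust `BYTES_PER_SECOND` accordingly; the result is only meant for a rough comparison.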

lissyx · August 19, 2019, 7:23am

@javi.rahman Just for the sake of completeness, can you make sure you are running tensorflow-gpu? In your virtualenv, run: pip uninstall tensorflow && pip install --upgrade tensorflow-gpu==1.14.0
