Create a domain model with DeepSpeech

I am trying to create a domain template, I have about 700 audio files, they are all: mono / 16kHz / 16bit.
I created 3 csv files in which I inserted file_path, file_size, transcript, dividing the audio files into 70% train, 20% dev, 10% test. I created an alphabet that contains all the characters present in the transcriptions.
I trained the model using this command:

  python3 DeepSpeech.py  \
--train_files /test/csvFile/train.csv \
--dev_files /test/csvFile/dev.csv \
--test_files /test/csvFile/test.csv \
--export_dir /test/model/ \
--alphabet_config_path /test/alphabet.txt 

During the training phase there are no errors or problems, after finishing the training the model only returns empty strings like this

+

What should I do, has anyone already had this problem?

  1. Please no images, just text.

  2. 700 files of 10 secs is way too few to do anything. Maybe transfer, but even then.

  3. The output is totally normal for this few data.

  4. If you have more data, use dropout of 0.3 or higher.

Hard to say, but you should have thousands of samples to get ok results.

What do you mean by domain template?

it looks like you are working on italian, maybe you should work from italian’s trained model checkpoint? @Mte90

The italian model is at https://github.com/MozillaItalia/DeepSpeech-Italian-Model but you can reach in the community on telegram with @mozitabot in the developers group.

1 Like