Links to pretrained models

lissyx · July 3, 2020, 12:28pm

There is CONTRIBUTING.md as well and in English :[

MestafaKamal · July 10, 2020, 4:42pm

You can find a Kabyle model here

dan.bmh · July 18, 2020, 9:09am

Now you can also find models for Italian and Polish at DeepSpeech-Polyglot, and I retrained the French and Spanish models with the new CommonVoice release.

dan.bmh · September 16, 2020, 6:06pm

DeepSpeech-Polyglot got some updates in the last few weeks:

Improved models for German, French and Spanish
Experimental support for training with wav2letter (but I didn’t achieve good results in my first tests)
You can now extract manual subtitles from YouTube playlists to generate more training data (check a few videos first to make sure the text alignments are good)

double · September 17, 2020, 7:52pm

@dan.bmh
Thank you for sharing.
Could you provide the Python command line to use polyglot with, e.g., the Spanish model?

dan.bmh · September 17, 2020, 8:47pm

Not sure I’m understanding this right, but you can’t “use” the models with that code; it’s only for training new models. If you want to use Spanish instead of English, check out the examples in DeepSpeech’s repository and use the “.pbmm” and “.scorer” files from polyglot.
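
A minimal sketch of what that can look like with the `deepspeech` Python package, assuming a 0.7 or later release (where `enableExternalScorer` is available) and a 16-bit mono WAV file recorded at the model's sample rate. The file names in the usage comment are placeholders, not the actual names of the polyglot downloads.

```python
import wave


def read_wav(path):
    """Return (sample_rate, raw_pcm_bytes) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(model_path, scorer_path, wav_path):
    """Run DeepSpeech inference on one WAV file and return the transcript."""
    # Imported lazily so read_wav() works without these packages installed.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)                # the ".pbmm" acoustic model
    model.enableExternalScorer(scorer_path)  # the ".scorer" language model

    sample_rate, pcm = read_wav(wav_path)
    if sample_rate != model.sampleRate():
        raise ValueError("resample the audio to %d Hz first" % model.sampleRate())

    # DeepSpeech expects the samples as a 16-bit integer array.
    return model.stt(np.frombuffer(pcm, dtype=np.int16))


# Usage with placeholder file names:
# print(transcribe("spanish.pbmm", "spanish.scorer", "audio.wav"))
```

Swapping languages only changes the two model files; the calling code stays the same as in the English examples.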

double · September 18, 2020, 7:21am

@dan.bmh
I see. Thank you for your great work.

fhalamos · October 22, 2020, 9:00pm

@dan.bmh I could not find the “.pbmm” and “.scorer” files in the polyglot repo. From the Readme it seems that we have to do the training ourselves to create the models, right? Is there anywhere we can download the .pbmm and .scorer files directly? I particularly need Spanish.

sanjaesc · October 23, 2020, 7:54am

The links to the corresponding models are at the bottom.

Mte90 · December 28, 2020, 4:08pm

The Mozilla Italia community has released a model for Italian: https://github.com/MozillaItalia/DeepSpeech-Italian-Model

It comes with a new text corpus, and we are now working on another audio+text dataset that aggregates a lot of small datasets from around the web.

lissyx · February 2, 2021, 1:13pm

Please avoid hijacking unrelated threads, and look at the documentation covering rebuilding a scorer.

dan.bmh · April 6, 2021, 8:31am

The DeepSpeech-Polyglot project received a large update over the last few weeks. It was reimplemented in TensorFlow 2 and new networks have been added. The recognition performance was greatly improved. It also got a new name, Scribosermo, and can now be found here:

The new models can be trained very fast (~3 days on 2x1080Ti to reach SOTA in German) and with comparatively small datasets (~280h for competitive results in Spanish). Using a little more time and data, the following Word-Error-Rates on the CommonVoice test set were achieved:

German	English	Spanish	French
7.2 %	3.7 %	10.0 %	11.7 %
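
For readers unfamiliar with the metric: word error rate is the word-level edit distance between the reference transcript and the recognized text, divided by the number of reference words. The sketch below is a generic illustration of that definition, not Scribosermo's evaluation code, which may normalize the text differently before scoring.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" counts one substitution and one deletion over six reference words, so the rate is 2/6, roughly 33 %.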

Training custom models with Scribosermo is very simple; step-by-step instructions can be found in the readmes. Adding new languages is very easy, too. After training, the models can be exported to TFLite format for easier inference. They are able to run faster than real time on a Raspberry Pi 4.

The most important features are already implemented, but there is still some room left for optimizations. Feel free to improve it and send a merge request. And it would be great if you can publish your own models as well.

Note: Currently only inference with Python is supported; the new models are no longer compatible with the DeepSpeech bindings (the old models are still available). Technically, it should be possible to integrate them again. If someone is interested in doing this, some notes can be found in this thread: Integration of DeepSpeech-Polyglot's new networks

ftyers · April 24, 2021, 6:33pm

I have trained models for most of the Common Voice languages. They are available here: https://tepozcatl.omnilingo.cc/v0.1.0/manifest.html

ftyers · April 27, 2021, 2:38pm

Update on the Basque model, I managed to train it a bit longer:

ftyers · April 29, 2021, 10:52pm

Pretrained models for Swahili (sw), Wolof (wo), Yoruba (yo) and Amharic (am):

enavarro · May 14, 2021, 11:17am

Hello.

I’m having trouble getting the models to run. I’m on Windows, using Visual Studio with C#. I got the .NET Framework example running in English, but I can’t figure out how to plug in the Spanish model. I’ve spent hours reading up and down: the “getting-the-pre-trained-model” section on readthedocs explains perfectly how to use a .pbmm model, but it doesn’t mention .pb, which is the kind I find on the mediafire download site.
So clearly I’m missing some important point; please point me in the right direction.

ftyers · May 14, 2021, 2:08pm

Hi @enavarro, this isn’t the right topic for that. You can convert the .pb model to .pbmm using convert_graphdef_memmapped_format. But if you’d like more support, please open another topic or join us on Mozilla’s Matrix.
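
The conversion mentioned here is a single call to the `convert_graphdef_memmapped_format` binary with its `--in_graph` and `--out_graph` flags. A small Python wrapper could look like the sketch below; it assumes the binary is on your PATH and that the .pb file is a DeepSpeech-compatible graph. The file names in the usage comment are placeholders.

```python
import subprocess


def build_convert_command(pb_path, pbmm_path,
                          tool="convert_graphdef_memmapped_format"):
    """Build the argument list for the graph-to-mmap conversion tool."""
    return [tool, "--in_graph=" + pb_path, "--out_graph=" + pbmm_path]


def convert_to_pbmm(pb_path, pbmm_path):
    """Run the conversion; raises CalledProcessError if the tool fails."""
    subprocess.run(build_convert_command(pb_path, pbmm_path), check=True)


# Usage with placeholder file names:
# convert_to_pbmm("output_graph.pb", "output_graph.pbmm")
```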

dan.bmh · May 15, 2021, 7:12am

Hi, I think you are using the wrong models. The new ones aren’t compatible with DeepSpeech anymore; the older models for DeepSpeech are linked further down in the readme.

enavarro · May 17, 2021, 6:06am

Hello Daniel.
Could you please point me to where this change of model type is explained, and how I could use the new models? I guess there is some DeepSpeech 2 that uses them, in which case I don’t want to use an older version.
Thank you!

dan.bmh · May 17, 2021, 7:10am

As written in a post above, Scribosermo’s new models only support usage with Python; you will need to add an additional interface to use them in C# or .NET.

And, citing from the first paragraph of the Readme: "You can find a short and experimental inference example here"