Links to pretrained models

@dan.bmh
I see. Thank you for your great work.

@dan.bmh I could not find the “.pbmm” and “.scorer” files in the polyglot repo. From the Readme it seems that we have to run the training ourselves to create the models, right? Is there anywhere we can download the .pbmm and .scorer files directly? I particularly need Spanish.

The links to the corresponding models are at the bottom.


The Mozilla Italia community has released a model for Italian: https://github.com/MozillaItalia/DeepSpeech-Italian-Model

It comes with a new text corpus, and we are now working on another audio+text dataset that aggregates many small datasets from around the web.


Please avoid hijacking unrelated threads, and look at the documentation covering rebuilding a scorer.

The DeepSpeech-Polyglot project received a large update over the last few weeks. It was reimplemented in TensorFlow 2, new networks were added, and recognition performance improved greatly. It also got a new name, Scribosermo, and can now be found here:

The new models can be trained very fast (~3 days on 2x 1080 Ti to reach SOTA in German) and with comparatively small datasets (~280h for competitive results in Spanish). With a little more time and data, the following word error rates (WER) were achieved on the CommonVoice test set:

| German | English | Spanish | French |
|--------|---------|---------|--------|
| 7.2 %  | 3.7 %   | 10.0 %  | 11.7 % |
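
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between the reference transcript and the recognized text, divided by the number of reference words. This is the textbook definition, not necessarily the exact evaluation script used for the numbers above; the function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one of six reference words is missing.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {wer:.1%}")
```

Over a whole test set, the usual convention is to sum the edit distances of all utterances and divide by the total number of reference words, rather than averaging per-utterance rates.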

Training custom models with Scribosermo is very simple; step-by-step instructions can be found in the readmes. Adding new languages is very easy, too. After training, the models can be exported to tflite format for easier inference. They are able to run faster than real time on a Raspberry Pi 4.

The most important features are already implemented, but there is still some room left for optimizations. Feel free to improve it and send a merge request. And it would be great if you can publish your own models as well.

Note: Currently only inference with Python is supported. The new models are no longer compatible with the DeepSpeech bindings (the old models are still available), but technically it should be possible to integrate them again. If someone is interested in doing this, some notes can be found in this thread: Integration of DeepSpeech-Polyglot's new networks


I have trained models for most of the Common Voice languages. They are available here: https://tepozcatl.omnilingo.cc/v0.1.0/manifest.html


Update on the Basque model, I managed to train it a bit longer:

Pretrained models for Swahili (sw), Wolof (wo), Yoruba (yo) and Amharic (am):


Hello.

I’m having trouble getting the models to run. I’m using Windows with Visual Studio and C#. I got the .NET Framework example running in English, but I can’t figure out how to plug in the Spanish model. I’ve spent hours reading the docs: “getting-the-pre-trained-model” on readthedocs explains perfectly how to use a .pbmm model, but doesn’t mention .pb, which is the kind I find on the mediafire downloads site.
So clearly I’m missing some important point. Please point me in the right direction.

Hi @enavarro, this isn’t the right topic for that. You can convert the .pb model to .pbmm using convert_graphdef_memmapped_format. But if you’d like more support, please open another topic or join us on Mozilla’s Matrix.
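
As a rough sketch of that conversion step, the snippet below wraps the tool in a small Python helper. It assumes `convert_graphdef_memmapped_format` is on your PATH and takes the `--in_graph` and `--out_graph` flags as shown in the DeepSpeech documentation; the helper names and the file name `output_graph.pb` are placeholders. It only applies to graphs exported by DeepSpeech itself: as the thread goes on to say, Scribosermo's new .pb models are not compatible with DeepSpeech.

```python
import subprocess
from pathlib import Path


def build_convert_command(pb_path, tool="convert_graphdef_memmapped_format"):
    """Return the argument list that converts a .pb graph to .pbmm."""
    pb = Path(pb_path)
    if pb.suffix != ".pb":
        raise ValueError(f"expected a .pb file, got: {pb}")
    pbmm = pb.with_suffix(".pbmm")  # same name, memory-mapped extension
    return [tool, f"--in_graph={pb}", f"--out_graph={pbmm}"]


def convert_to_pbmm(pb_path, tool="convert_graphdef_memmapped_format"):
    """Run the conversion tool and return the path of the new .pbmm file."""
    cmd = build_convert_command(pb_path, tool)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(pb_path).with_suffix(".pbmm")


if __name__ == "__main__":
    # Only print the command here; call convert_to_pbmm() to actually run it.
    print(" ".join(build_convert_command("output_graph.pb")))
```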

Hi, I think you are using the wrong models. The new ones aren’t compatible with DeepSpeech anymore; the older models for DeepSpeech are linked further down in the readme.

Hello Daniel.
Could you please point me to where this change of model type is explained, and how I can use the new models? I guess there is some DeepSpeech 2 that uses them, in which case I don’t want to use an older version.
Thank you!

As written in a post above, Scribosermo’s new models only support usage with Python; you will need to add an additional interface to use them in C# or .NET.

And, quoting from the first paragraph of the Readme: “You can find a short and experimental inference example here”

Hi, what version of DeepSpeech is required to use the Spanish model?

They were trained with 0.9.X, but should work with any version between 0.7.X and the current 0.10.X, like the official English model.
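
A minimal sketch of how this could look with the DeepSpeech Python package: a small check against the 0.7.X to 0.10.X range quoted above, followed by loading a .pbmm model with its scorer. The file names `spanish.pbmm` and `spanish.scorer` and both helper functions are placeholders; `Model` and `enableExternalScorer` are the calls provided by the `deepspeech` package in these versions.

```python
def is_supported_version(version: str) -> bool:
    """True for DeepSpeech 0.7.x through 0.10.x, the range quoted above."""
    parts = version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    return major == 0 and 7 <= minor <= 10


def load_model(model_path, scorer_path=None):
    """Load a .pbmm model and, optionally, attach an external scorer."""
    # Imported lazily so the version check above works without the package.
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path is not None:
        model.enableExternalScorer(scorer_path)
    return model


if __name__ == "__main__":
    print(is_supported_version("0.9.3"))
    # Placeholder file names; requires `pip install deepspeech` and the
    # downloaded model files:
    # model = load_model("spanish.pbmm", "spanish.scorer")
    # text = model.stt(audio)  # audio: 16-bit mono samples as a NumPy int16 array
```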

I’ve been trying to use your pre-trained Spanish model, but when I run the DeepSpeech microphone example with your model, an error appears: “rebuild TensorFlow with the appropriate compiler flags”. Do I need to change how the microphone example loads the model? I have used the pre-trained DeepSpeech .pbmm model and it works.
(Image of my error -> https://www.dropbox.com/s/tkuog2gu1hq1lc4/Capture.PNG?dl=0)

You are using the wrong model. As written above and in the readme, the new model (ending in .pb) is no longer compatible with DeepSpeech. You have to use the old model ending in .pbmm.

Oh, thank you so much. However, if I want to use the latest Spanish model, Quartznet15x5, D8CV (WER: 10.0%), do I need to install Quartznet? When I search the Internet for how to install Quartznet, I only find how to install NeMo. NeMo is the library for using Quartznet, right?

No, you just need to install tflite+dsctcdecoder. See the inference example which is linked in the first paragraph of the usage chapter:

You can find a short and experimental inference example here