Share your trained model for Mozilla DeepSpeech?

I have recently started working with Mozilla DeepSpeech, and I have been very pleased with its performance. However, the pre-trained model shared by the folks at Mozilla has been a bit of a mixed bag: it works pretty well with crystal-clear, professional speakers' speech, but anything that deviates even slightly from that (e.g. conference recordings, 1:1 classes) produces a host of words that are often not even “real” (brands, names, rare loan words from other languages).

I would be very grateful if anyone had a more accurate trained model they were willing to share (ideally under 50 GB). I was struck by the TED-LIUM corpus in particular, but any submission will be greatly appreciated! Thanks!


They are working on better models, e.g. through augmentation and more, better data. I don’t think you’ll find anybody sharing a better model for free for now :slight_smile:

Depending on your use case, try playing with the language model for better transcriptions. And since it looks like you have more real-life data, consider contributing it for future training runs.
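For what it's worth, here is a minimal sketch of what "playing with the language model" could look like. It assumes the DeepSpeech 0.7 or later Python package, where the language model is loaded as an external scorer; the file paths, the helper names and the alpha/beta pairs are placeholders made up for illustration, not recommended settings.

```python
import wave


def transcribe_with_weights(model, audio, weights):
    """Transcribe the same audio once per (alpha, beta) pair and collect the results."""
    results = {}
    for alpha, beta in weights:
        # alpha is the language model weight, beta the word insertion weight
        model.setScorerAlphaBeta(alpha, beta)
        results[(alpha, beta)] = model.stt(audio)
    return results


def main(model_path, scorer_path, wav_path):
    """Hypothetical entry point; all three paths are supplied by the caller."""
    # Third-party imports live here so the helper above can be tried without them.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    model.enableExternalScorer(scorer_path)

    # Assumes a 16-bit mono WAV recorded at the model's sample rate.
    with wave.open(wav_path, "rb") as wav:
        audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

    # Illustrative values only; compare the outputs by ear or against a reference.
    candidates = [(0.75, 1.85), (0.93, 1.18), (1.5, 1.0)]
    for pair, text in transcribe_with_weights(model, audio, candidates).items():
        print(pair, text)
```

Calling `main()` with your own model, scorer and recording prints one transcription per weight pair, so you can see how much the language model settings alone change the output. Building a scorer from text closer to your domain (lecture notes, for instance) is a separate step that this sketch does not cover.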

Thank you very much for pitching in!
Unfortunately, I do not have a GPU powerful enough to train a model myself, which is why I was looking for a pre-trained model that performs better in real-life applications.

My use case is transcribing university recordings, and I am keen to find a suitable free and open-source alternative that would reduce my reliance on online services (e.g. otter.ai and similar ones).

Are you using specific software like eSup Pod or OpenCast?

I should have mentioned that I am a university student.
I would usually record classes (the recording part has now been superseded, as classes have moved online in my country: many are pre-recorded, and those that are live-streamed are easily captured) and use AI transcription tools to help me prepare revision notes based on what was taught in class.

Still, you might have been working as an intern etc.
I was asking because people working on those platforms are looking into DeepSpeech (you can see their questions here and here on Discourse) for this exact need.

Thank you @lissyx, I will have a look at them.
If you have a specific relevant conversation in mind that you could point me to, I would be grateful!