Future of DeepSpeech / STT after recent changes at Mozilla

Last week Mozilla announced a layoff of approximately 250 employees and a big restructuring of the company. I’m sure many of you are asking yourselves how this impacts DeepSpeech. Unfortunately, as of this moment we don’t have concrete answers to give. We’re working to find out if the project will have a new home in the restructured Mozilla, and what changes would be necessary for a successful transition.

As many of you know, we were gearing up for our first stable release, version 1.0, in conjunction with announcing our plans for the future of voice projects at Mozilla. Most of the technical changes have already landed, and we see no reason not to ship it. We’ll be releasing 1.0 soon and encourage everyone to update their applications. I would like to thank all of the people who invested their time and effort to make it possible for this project to get this far. It would not have been possible without you all.

In a world where more and more people prefer to interact with their devices using voice, it’s more important than ever to have open source, privacy preserving solutions that enable anyone to innovate in this space, outside of the control of the big cloud services players. DeepSpeech is a result of our effort to address this by building an easy-to-use, open source speech-to-text solution that can be easily integrated in many platforms, programming languages and types of applications. No other open source solution comes close to the accuracy, maturity and ease of use of our tools.

In this moment of uncertainty, we ask all of you to help us make DeepSpeech even better by continuing to open issue reports, make pull requests, help each other on Discourse, and build things using DeepSpeech. Until a decision is made regarding the future of the project, we will “keep the lights on”, trying to address existing issues and review your contributions as best we can within the scope of our new roles.


Thank you Reuben for this. You’re absolutely correct that nothing comes close to DeepSpeech. It’s been pivotal in our work developing speech recognition for a lesser-resourced language. We will continue to make full use of it and do what we can to increase its awesomeness. Thank you and all the team for everything.


Thank you Reuben and all the DS coders. I do hope that DeepSpeech will be able to continue with all speed. It is an amazing project and, as dewi.jones points out, a wonderful resource not only for languages with less reach than English, but also for those many speakers whose English accent is not male American or British RP (Received Pronunciation). I am personally very interested in how to adapt DeepSpeech not only for different accents, genders, and voices, but also for specialist vocabularies, perhaps using transfer learning or other training techniques.


I am disappointed that Mozilla has chosen to put projects that genuinely serve a need (e.g. Common Voice, DeepSpeech) on hold in favor of solving problems that have already been solved by other companies (e.g. VPN, password management). I understand that the financial situation ties Mozilla’s hands somewhat, but I am still disappointed that they are focusing on also-ran commercial products over innovation. The world probably doesn’t need yet another VPN service but it really does need better speech tools.

I believe that DeepSpeech is superior to Kaldi and I hope that the community will be able to keep it alive if Mozilla decides to cancel it. I would be happy to contribute general coding but I don’t really have the knowledge to maintain the deep learning aspects. But I know there was talk of finding an organization to “adopt” Common Voice, so hopefully the same could be done for DeepSpeech.


Thank you for this topic… I was searching for this and here I found it.


As someone from the outside, I feel it would be awesome if some businesses were interested in buying services from the DeepSpeech team, because that would make a great case for why Mozilla should still give that team a home of some sort.


I do hope Mozilla continues with this unique project. As a member of a team that provides a speech-recognition solution for Brazilian Portuguese (pt-BR), using DeepSpeech at our core, I have no doubt that no other open-source ASR project comes near to what you (Reuben), your team, and other collaborators have created.

I will continue to contribute what I can, and also continue to spread the word about how awesome this project is.


We use DeepSpeech at FOSSASIA. The Free and Open Source community can’t afford to lose this project. I will bring this up in the Open Source Initiative board and see how we can support you.


I see the big problem here: given Mozilla’s financial situation, selling VPN subscriptions to consumers generates revenue far more easily than selling DeepSpeech to other projects. But yes, we should find ways to generate revenue to support such a fundamental project of the open web.


Also ping Mycroft.ai


Mahalo nui loa (thank you very much) e @reuben and the rest of the team. DeepSpeech is critically important for our work in helping to revitalize te reo Māori and ensure our indigenous languages have a place on digital devices. We’ve been using DS since 2018 to speed up transcription of native speaker (L1) interviews by incorporating STT. The work we achieved using DS helped us secure a seven-year, NZD $13M grant from the NZ government for our Papa Reo project.

We’re keen to keep supporting however we can during this transition and hopefully into the future. I know there’s a decent-sized community out there who rely on Mozilla STT, so hopefully we can work together to keep a vibrant open source community going around this project. The most important value of this project for us is around democratizing language technologies. We believe indigenous communities should be able to access these sorts of tools to help them champion their own revitalization journeys. We don’t think relying on Big Tech to “save us” by providing our indigenous languages as a service back to us is the right approach. One case in point is this tweet by @mathematiguy, https://twitter.com/Caleb_Speak/status/1302744407611908097?s=20.

We’re hoping to launch a pronunciation app within the next 4-6 months. The back-end is built on DeepSpeech, relying in particular on the fact that we can get timings and character-level confidences out of the model. So again, this project and your work exemplify the importance of an inclusive, open internet and technologies.
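For anyone curious how those character-level timings get used, here is a minimal sketch of grouping them into per-word timings. The token tuples below are hypothetical stand-ins; in practice each character token comes from `model.sttWithMetadata(audio).transcripts[0].tokens`, where a token carries `.text` and a `.start_time` in seconds, with spaces delimiting words. This is an illustration of the idea, not our production code.

```python
from typing import List, Tuple

def words_with_timings(tokens: List[Tuple[str, float]]) -> List[Tuple[str, float, float]]:
    """Group per-character (char, start_time) tokens into (word, start, end) triples.

    Spaces act as word boundaries; each word's start/end come from the
    timestamps of its first and last character.
    """
    words: List[Tuple[str, float, float]] = []
    current: List[Tuple[str, float]] = []
    for char, t in tokens:
        if char == " ":
            if current:  # flush the word accumulated so far
                words.append(("".join(c for c, _ in current), current[0][1], current[-1][1]))
                current = []
        else:
            current.append((char, t))
    if current:  # flush the final word (no trailing space)
        words.append(("".join(c for c, _ in current), current[0][1], current[-1][1]))
    return words

# Hypothetical tokens shaped like DeepSpeech metadata output:
tokens = [("h", 0.00), ("i", 0.06), (" ", 0.12), ("t", 0.20), ("o", 0.26), ("m", 0.34)]
print(words_with_timings(tokens))
# → [('hi', 0.0, 0.06), ('tom', 0.2, 0.34)]
```

The same pass can aggregate a per-word confidence if you also carry each token's score along, since the metadata exposes confidence at the transcript level and timing at the character level.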

Nāku iti nei
nā Keoni.