Client-side offline speech recognition

gritter97 · March 19, 2021, 2:06am

Hi,

Is there client-side speech recognition? A workaround that I have attempted is to write everything in Node.js, bundle it with browserify, and include it in my client-side HTML. However, this has failed to work for reasons beyond my understanding.

I would appreciate it if anyone can offer me insight into this domain.
Also, my goal is to create an offline PWA with speech recognition. Is this possible?

Thank you!

lissyx · March 19, 2021, 9:11am

Have you had a look at our documentation? https://deepspeech.readthedocs.io/ There are APIs for many languages.

gritter97 · March 19, 2021, 10:29pm

Yes I have. What is your take on my workaround though? Is this the approach I should be employing? Or are there other approaches to developing an offline web app with speech recognition?

lissyx · March 19, 2021, 10:34pm

gritter97 (Ashwin selvakumar), March 19:
"Yes I have. What is your take on my workaround though? Is this the approach I should be employing? Or are there other approaches to developing an offline web app with speech recognition?"

I’m sorry but you don’t explain anything, so I don’t know what your workaround is.

gritter97 · March 19, 2021, 10:46pm

In my first post, I said that I wrote everything in Node.js and bundled it with browserify to include it in my client-side HTML. However, it did not work, and I attribute that to my lack of knowledge of DeepSpeech. Hence, I would like to clarify: am I headed down the right path? How do I include it in my client-side JS?

I understand that I can stream audio from my client to my nodejs server, but I do not want that. I want it to work completely through the browser.

lissyx · March 19, 2021, 10:50pm

I have no idea what that means or what it produced.

It’s non-specific: was it an error? Something else?

lissyx · March 19, 2021, 10:51pm

No, you would need TensorFlow.js somehow, but our model does not work with that, to the best of our knowledge.

gritter97 · March 19, 2021, 10:55pm

Browserify lets you require() Node modules in your client-side JS. I used it because the DeepSpeech documentation covers Node.js but not client-side JS.

This was the error that I got in my developer console: “Uncaught TypeError: Cannot read property ‘_handle’ of undefined”

I have searched for it and can’t seem to find resources that explain what this means.

gritter97 · March 19, 2021, 11:06pm

Okay, are you implying that I would have to train my own model to achieve this? And could I then use that model with DeepSpeech?

lissyx · March 19, 2021, 11:11pm

Please search the GitHub issues; there was already a thread about TensorFlow.js.

gritter97 · March 19, 2021, 11:11pm

Also, to go along with the questions that I have just asked: essentially, I am trying to get completely offline speech recognition (in the browser) for a very limited vocabulary. My goal is to create a PWA that does exactly that. My knowledge of ML is very basic. Could you point me in the right direction (i.e. resources, tech to research, etc.) as to how I can achieve this?

I really appreciate you taking the time to respond, by the way.

jancarius · March 22, 2021, 4:03am

@gritter97 This seems like a clear and useful question. It’s a shame there don’t appear to be people interested in helping. Good luck. I’m on a similar mission. I’m just starting down the road, and was wondering if I could use something like WebAssembly to accomplish it. I’d be surprised if it works, but a large “install” would be acceptable for my use case. I’ll follow up here if I make any headway.

jancarius · March 28, 2021, 3:29am

@gritter97 Looks like there is a Web API: SpeechRecognition! Unfortunately, it has limited browser support.

https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition
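
For reference, a minimal sketch of how this API is usually feature-detected and used. The helper names `getSpeechRecognition` and `listenOnce` are made up for illustration; `SpeechRecognition` and the prefixed `webkitSpeechRecognition` are the constructors described on the MDN page above. Note that MDN also warns that in some browsers, such as Chrome, recognition is handled by a server-based engine, so this route may not work offline.

```javascript
// Hypothetical helper: returns the SpeechRecognition constructor if the
// given global object exposes one (standard or webkit-prefixed), else null.
function getSpeechRecognition(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Hypothetical helper: starts one recognition session and passes the top
// transcript to onText. Returns false when the API is not available.
function listenOnce(globalObj, onText) {
  const Recognition = getSpeechRecognition(globalObj);
  if (!Recognition) return false;

  const recognition = new Recognition();
  recognition.lang = "en-US";          // assumed language; change as needed
  recognition.interimResults = false;  // only report final results
  recognition.onresult = (event) => {
    // First result, first (most likely) alternative.
    onText(event.results[0][0].transcript);
  };
  recognition.onerror = (event) => {
    console.error("Speech recognition error:", event.error);
  };
  recognition.start();
  return true;
}
```

In a page this would be called as `listenOnce(window, (text) => console.log(text))`, with a fallback message shown when it returns `false`.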

gritter97 · March 28, 2021, 6:06am

Hi @jancarius, I appreciate the resource. I wonder how it slipped from right under our noses! How is your progress? Did you ever get into WebAssembly? It seems like very interesting stuff, and I wonder what its capabilities are. I suggest you look into TensorFlow.js. I followed this tutorial: https://codelabs.developers.google.com/codelabs/tensorflowjs-audio-codelab/index.html#0 and am attempting to create a PWA that caches the included resources. The next step would be to use other models that have a larger vocabulary.

I will be sure to update this thread on my progress.
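
As a rough illustration of the caching part, here is a sketch of a cache-first service worker. The cache name, the asset list, and the `cacheFirst` helper are assumptions made for this example and are not taken from the tutorial; the `install` and `fetch` events and the Cache API calls are the standard service worker ones.

```javascript
// Hypothetical cache name and asset list; replace with the app's real files.
const CACHE_NAME = "offline-stt-v1";
const ASSETS = ["/", "/index.html", "/index.js"];

// Cache-first lookup: serve from the cache when possible, otherwise fetch
// from the network and keep a copy for later offline use.
// `cache` needs match/put (like the Cache API); `fetchFn` behaves like fetch.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchFn(request);
  if (response.ok) await cache.put(request, response.clone());
  return response;
}

// Wire-up; this part only runs inside a service worker.
if (typeof self !== "undefined" && typeof caches !== "undefined") {
  self.addEventListener("install", (event) => {
    // Pre-cache the app shell so the first offline load works.
    event.waitUntil(caches.open(CACHE_NAME).then((c) => c.addAll(ASSETS)));
  });
  self.addEventListener("fetch", (event) => {
    if (event.request.method !== "GET") return; // only GET responses are cached
    event.respondWith(
      caches.open(CACHE_NAME).then((c) => cacheFirst(event.request, c, fetch))
    );
  });
}
```

Model files that the page downloads at runtime would pass through the same `fetch` handler, so they end up cached after the first successful online load.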

gritter97 · April 2, 2021, 2:12am

Hey guys @jancarius @bozden, I have managed to get the TensorFlow tutorial working as a PWA. Here is the repo: https://github.com/Ashwin2397/Offline_STT (a proof of concept). I am looking to develop a similar version with full English speech recognition. The current implementation is limited to the small vocabulary provided by TensorFlow’s model.

Based on my limited understanding, I have gathered that I would be able to produce full speech recognition by using data provided by the Common Voice project and training my own model. I shall not go into this realm until I have fully explored DeepSpeech. Hence, my next course of action is to do the following:

Use deepspeech in nodejs
Create a bundle with webpack
Deploy it as a PWA
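
For the first step, a minimal Node.js sketch, assuming the `deepspeech` npm package (0.9.x API) is installed and a model file has been downloaded. The `transcribe` helper and its optional `ds` parameter are made up for this example. The package loads a compiled native library, which is worth keeping in mind for the bundling step, since a bundler cannot turn native code into something a browser can run.

```javascript
// Sketch only. Transcribes raw 16-bit mono PCM audio recorded at the
// model's sample rate. `ds` defaults to the real `deepspeech` package;
// passing a stand-in object is only meant for testing.
function transcribe(modelPath, pcmBuffer, ds = require("deepspeech")) {
  const model = new ds.Model(modelPath); // loads the acoustic model from disk
  try {
    return model.stt(pcmBuffer);         // returns the recognized text
  } finally {
    ds.FreeModel(model);                 // release native resources
  }
}

// Example call (hypothetical file names):
// const fs = require("fs");
// console.log(transcribe("deepspeech-0.9.3-models.pbmm", fs.readFileSync("audio.raw")));
```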

As mentioned, bundling the Node.js script with browserify did not work. Thus, I am unsure if using webpack will be any different. I am in the midst of learning webpack now.

To those who are more well-versed and experienced: I would appreciate any information or advice. Thank you!
