Is there client-side speech recognition? A workaround that I have attempted is to write everything in nodeJS, bundle it with browserify and include it in my client side HTML. However, this has failed to work for reasons beyond my understanding.
I would appreciate it if anyone can offer me insight into this domain.
Also, my goal is to create an offline PWA with speech recognition. Is this possible?
Thank you!
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
2
Yes I have. What is your take on my workaround though? Is this the approach I should be employing? Or are there other approaches to developing an offline web app with speech recognition?
lissyx
4
Yes I have. What is your take on my workaround though? Is this the
approach I should be employing? Or are there other approaches to
developing an offline web app with speech recognition?
I'm sorry but you don't explain anything, so I don't know what your
workaround is.
In my first post, I said that I wrote everything in Node.js and bundled it with browserify to include it in my client-side HTML. However, it did not work, and I attribute that to my lack of knowledge of DeepSpeech. Hence, I would like to clarify: am I headed down the right path? How do I include it in my client-side JS?
I understand that I can stream audio from my client to my nodejs server, but I do not want that. I want it to work completely through the browser.
lissyx
6
I have no idea what that means, or what this produced.
It's non-specific. Was it an error? Something else?
lissyx
7
No, you would need TensorFlow.js somehow, but to the best of our knowledge our model does not work with that.
Browserify lets you require() node modules in your client-side JS. I used it because the DeepSpeech API documentation covers Node.js but not client-side JS.
This was the error that I got in my developer console: "Uncaught TypeError: Cannot read property '_handle' of undefined"
I have searched for it and can't seem to find resources that explain what this means.
Also, to go along with the questions that I have just asked: essentially, I am trying to get complete offline speech recognition (in the browser) for a very limited vocabulary. My goal is to create a PWA that does exactly that. My knowledge of ML is very basic. Could you point me in the right direction (i.e. resources, tech to research, etc.) as to how I can achieve this?
I really appreciate you taking your time to respond by the way.
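One possible reading of the error above, which nobody in this thread confirms: the deepspeech npm package wraps a native binary addon, and a bundler such as browserify cannot turn that into browser code. Node-only objects such as `process.stdout` are also generally missing in the browser, so reading `_handle` from one of them throws. The sketch below only illustrates that class of failure; `detectRuntime` and `safeHandle` are hypothetical helper names, not part of deepspeech or browserify.

```javascript
// Hypothetical helper: reports whether the current runtime looks like
// Node.js (where native addons can be loaded) or a browser-style shim.
function detectRuntime(g = globalThis) {
  const proc = g.process;
  // Real Node.js exposes its version under process.versions.node.
  const isNode = Boolean(proc && proc.versions && proc.versions.node);
  // Assumption: a browser bundle may ship a partial `process` without stdout.
  const hasStdout = Boolean(proc && proc.stdout);
  return { isNode, hasStdout };
}

// Reading a property from a missing object throws a TypeError similar to the
// one reported above; optional chaining returns null instead of throwing.
function safeHandle(stream) {
  return stream?._handle ?? null;
}

console.log(detectRuntime());
console.log(safeHandle(undefined));
```

If `isNode` is false, code that depends on a native addon has no way to run there, whatever the bundler.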
@gritter97 This seems like a clear and useful question. It's a shame there don't appear to be people interested in helping. Good luck. I'm on a similar mission. I'm just starting down the road, and was wondering if I could use something like WebAssembly to accomplish it. I'd be surprised if it works, but a large "install" would be acceptable for my use case. I'll follow up here if I make any headway.
Hi @jancarius, I appreciate the resource. I wonder how it slipped from right under our noses! How is your progress? Did you ever get into WebAssembly? It seems like very interesting stuff, and I wonder what its capabilities are. I suggest you look into TensorFlow.js. I followed this tutorial: https://codelabs.developers.google.com/codelabs/tensorflowjs-audio-codelab/index.html#0 and am attempting to create a PWA that caches the included resources. The next step would be to use other models that have a larger vocabulary.
I will be sure to update this thread on my progress.
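For anyone following along, the resource caching mentioned above is usually done in a service worker. Here is a minimal cache-first sketch, not the code from any repo in this thread. The cache name `offline-stt-v1` and the asset list are made-up placeholders, and the lookup logic is split into a plain function so it can be tried outside a browser.

```javascript
// Hypothetical cache name and asset list; real values depend on the project.
const CACHE_NAME = 'offline-stt-v1';
const ASSETS = ['/', '/index.html', '/index.js'];

// Cache-first lookup: return the cached response if there is one, otherwise
// fetch from the network and store a copy for the next (possibly offline) visit.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchFn(request);
  await cache.put(request, response.clone());
  return response;
}

// Only register handlers when this file actually runs as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('install', (event) => {
    // Pre-cache the app shell so the first offline load works.
    event.waitUntil(caches.open(CACHE_NAME).then((cache) => cache.addAll(ASSETS)));
  });
  self.addEventListener('fetch', (event) => {
    // Only GET requests can be stored in the Cache API.
    if (event.request.method !== 'GET') return;
    event.respondWith(
      caches.open(CACHE_NAME).then((cache) => cacheFirst(event.request, cache, fetch))
    );
  });
}
```

Model files are ordinary HTTP resources, so in principle they can be cached the same way, although large files make the first load slow.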
Hey guys @jancarius @bozden, I have managed to get the TensorFlow tutorial working as a PWA. Here is a link to the repo: https://github.com/Ashwin2397/Offline_STT . I have made it as a proof-of-concept and am looking to develop a similar rendition with full English speech recognition. This implementation is based on the limited vocabulary provided by TensorFlow's model.
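To illustrate what "limited vocabulary" means in practice: a model of this kind typically returns one score per known word, and the app picks the highest-scoring word if it clears a threshold. A rough sketch of that selection step follows. `pickCommand`, the labels, and the 0.75 threshold are all illustrative assumptions, not taken from the repo or the tutorial.

```javascript
// Hypothetical helper: given one score per label, return the best-scoring
// label, or null when no score reaches the threshold.
function pickCommand(scores, labels, threshold = 0.75) {
  let best = -1;
  for (let i = 0; i < scores.length; i++) {
    if (best === -1 || scores[i] > scores[best]) best = i;
  }
  // Reject empty input and low-confidence results.
  if (best === -1 || scores[best] < threshold) return null;
  return labels[best];
}

// Made-up labels and scores, for illustration only.
const labels = ['left', 'right', 'stop'];
console.log(pickCommand([0.1, 0.85, 0.05], labels));
console.log(pickCommand([0.4, 0.35, 0.25], labels));
```

A full-vocabulary recognizer cannot work this way, because it has to decode arbitrary word sequences rather than pick from a fixed list.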
Based on my limited understanding, I have gathered that I would be able to produce full speech recognition by using data provided by the common voice project and training my own model. I shall not go into this realm until I have fully explored Deepspeech. Hence, my next course of action is to do the following:
1. Use deepspeech in nodejs
2. Create a bundle with webpack
3. Deploy it as a PWA
As mentioned earlier, bundling the Node.js script with browserify did not work, so I am unsure whether using webpack will be any different. I am in the middle of learning webpack now.
To those who are more well-versed and experienced, I would appreciate any information or advice. Thank you!