TensorFlow.js for DeepSpeech browser support

udaram · July 5, 2020, 8:20am

I want to run DeepSpeech in the browser using TensorFlow.js, but I don’t know how to proceed. Could you please give me some guidance or share links to relevant resources? My goal is to build browser support for DeepSpeech.
@reuben

reuben · July 5, 2020, 8:25am

I don’t know how to proceed either. Search these forums and the issue tracker for useful info from other people who tried it.

gritter97 · March 19, 2021, 1:50am

Hi @udaram, did you ever get DeepSpeech running client-side?

I am trying to run DeepSpeech entirely on the client. The approach I have in mind is:

1. Create a stream from the microphone.
2. Create a DeepSpeech stream.
3. Stream audio from the microphone to the DeepSpeech stream.
4. Receive the transcription and display it on the page.
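
For what it is worth, the steps above can be sketched roughly as follows. This is only a minimal sketch under assumptions: `TranscriptionStream` is a hypothetical interface standing in for whatever streaming recognizer ends up running in the browser (it is not an existing DeepSpeech or TensorFlow.js API), and the audio chunks are assumed to arrive as `Float32Array` buffers from the Web Audio API. The released DeepSpeech English models expect 16 kHz, 16-bit mono PCM, so the sketch downsamples and converts before feeding.

```typescript
// Hypothetical interface: a stand-in for a streaming recognizer.
// Not a real DeepSpeech or TensorFlow.js API.
interface TranscriptionStream {
  feed(samples: Int16Array): void;
  finish(): string;
}

// Sample rate assumed for the recognizer (16 kHz mono).
const TARGET_RATE = 16000;

// Convert Web Audio float samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i])); // clamp out-of-range values
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}

// Naive downsampling by averaging each window of input samples.
// A real implementation would low-pass filter first to limit aliasing.
function downsample(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (toRate >= fromRate) return input;
  const ratio = fromRate / toRate;
  const outLength = Math.floor(input.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(input.length, Math.floor((i + 1) * ratio));
    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j];
    out[i] = sum / (end - start);
  }
  return out;
}

// Feed microphone chunks to the recognizer and return the final transcript.
function transcribe(
  chunks: ReadonlyArray<Float32Array>,
  inputRate: number,
  stream: TranscriptionStream
): string {
  for (const chunk of chunks) {
    stream.feed(floatTo16BitPCM(downsample(chunk, inputRate, TARGET_RATE)));
  }
  return stream.finish();
}
```

In a page, the chunks would come from `navigator.mediaDevices.getUserMedia({ audio: true })` routed through an audio processing node, and the returned string would be written into the DOM. Those parts are left out here because they depend on the recognizer that is actually used.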

My goal is to eventually build an offline PWA with speech recognition.
Any tips as to how I can approach this?

bozden · March 21, 2021, 3:34pm

This is exactly what I plan to do. Before I jump into the project, I would also like to hear from experienced people how this is best done.

gritter97 · March 21, 2021, 8:55pm

Do you need full speech recognition, or only a limited vocabulary that you could extend later?

My knowledge of this area is fairly basic and high-level. However, this codelab looks promising: https://codelabs.developers.google.com/codelabs/tensorflowjs-audio-codelab/index.html

My project only requires a limited vocabulary. I intend to rework this tutorial into a PWA as a proof of concept. If I’m not mistaken, we could download the free pre-trained model from Mozilla’s DeepSpeech and use it in place of the model in the tutorial? What do you think?

bozden · March 23, 2021, 2:26am

Thanks for the pointers.
I plan to implement a simple media player with basic voice commands, for use in museums (given the post-COVID death of touch screens). It only needs single-word commands, but it should understand multiple languages, so existing pre-trained models are probably not much help here. I know that adds another level of complexity, but anyway…
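
As a rough illustration of the single-word, multi-language idea, here is a minimal sketch of mapping classifier output to player commands. Everything in it is an assumption for illustration: the label list, the word-to-command table, and the 0.75 threshold are hypothetical, and the scores are assumed to be per-label probabilities from a keyword model such as the one trained in the codelab.

```typescript
type Command = "play" | "pause" | "next";

// Hypothetical table: spoken words in several languages mapped to one command.
const COMMAND_WORDS = new Map<string, Command>([
  ["play", "play"],
  ["pause", "pause"],
  ["pausa", "pause"],
  ["next", "next"],
  ["weiter", "next"],
]);

// Pick the highest-scoring label and translate it to a command.
// Returns null when the best score is below the threshold or the
// label (for example background noise) has no command attached.
function pickCommand(
  labels: ReadonlyArray<string>,
  scores: ArrayLike<number>,
  threshold: number = 0.75
): Command | null {
  let best = -1;
  let bestScore = -Infinity;
  const n = Math.min(labels.length, scores.length);
  for (let i = 0; i < n; i++) {
    if (scores[i] > bestScore) {
      bestScore = scores[i];
      best = i;
    }
  }
  if (best < 0 || bestScore < threshold) return null;
  const command = COMMAND_WORDS.get(labels[best]);
  return command === undefined ? null : command;
}
```

Keeping the word table separate from the model means a new language only needs new labels in the training data plus new rows in the table, while the player logic stays unchanged.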