I want to use DeepSpeech in the browser using TensorFlow.js, but I don’t know how to proceed. Could you please guide me or share some resource links? My goal is to build browser support for DeepSpeech.
@reuben
I don’t know how to proceed either. Search these forums and the issue tracker for useful info from other people who tried it.
Hi @udaram, did you ever get DeepSpeech running client-side?
I am trying to get full client-side capabilities with DeepSpeech. The approach I am planning is as follows:
- Create a stream from the microphone
- Create a DeepSpeech stream
- Stream audio from the microphone to the DeepSpeech stream
- Receive the transcription and display it on the page
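The audio-handling part of the steps above could be sketched roughly like this. It is only a sketch under assumptions: `SpeechStream` is a hypothetical interface standing in for whatever streaming API a browser build would expose (I am not aware of an official one), and the 16 kHz, 16-bit mono input format is what I understand the released DeepSpeech models to expect. Browser microphones usually deliver 44.1 or 48 kHz float samples, so each chunk is downsampled and converted before being fed in.

```typescript
// Hypothetical interface: a placeholder for a browser-side streaming recognizer.
// The method names are assumptions for illustration, not a confirmed browser API.
interface SpeechStream {
  feedAudioContent(chunk: Int16Array): void;
  finishStream(): string;
}

// Assumed model input rate (16 kHz mono).
const TARGET_RATE = 16000;

// Downsample by averaging each group of input samples that maps to one output sample.
// A crude low-pass; a real implementation may want a proper resampling filter.
function downsample(
  input: Float32Array,
  inputRate: number,
  targetRate: number = TARGET_RATE
): Float32Array {
  if (targetRate > inputRate) {
    throw new Error("Upsampling is not supported in this sketch");
  }
  if (inputRate === targetRate) {
    return input;
  }
  const ratio = inputRate / targetRate;
  const outLength = Math.floor(input.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(input.length, Math.floor((i + 1) * ratio));
    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += input[j];
    }
    out[i] = sum / (end - start);
  }
  return out;
}

// Convert float samples in [-1, 1] to signed 16-bit PCM, clamping out-of-range values.
function floatToInt16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}

// Feed one microphone chunk (float samples at inputRate) into the recognizer stream.
function feedChunk(stream: SpeechStream, chunk: Float32Array, inputRate: number): void {
  stream.feedAudioContent(floatToInt16(downsample(chunk, inputRate)));
}
```

In the browser, the chunks would come from `navigator.mediaDevices.getUserMedia({ audio: true })` routed through a Web Audio node, with `inputRate` taken from the `AudioContext`'s `sampleRate`. The text returned by `finishStream()` would then be written to the page.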
My goal is to eventually build an offline PWA with speech recognition.
Any tips as to how I can approach this?
This is exactly what I plan to do. I also want to hear how it is best done from those experienced before I jump into the project.
Do you intend to have full speech recognition capabilities? Or are you only looking for a limited set of vocabulary that you could potentially add to?
My knowledge of this area is very basic and high-level. However, this link looks promising: https://codelabs.developers.google.com/codelabs/tensorflowjs-audio-codelab/index.html
My project only requires a limited vocabulary. I intend to refactor this tutorial and turn it into a PWA as a proof of concept. If I’m not mistaken, we can download the free pre-trained model from Mozilla’s DeepSpeech and use it in place of the model in the tutorial? What do you think?
Thanks for the pointers.
I plan to implement a basic media player with basic voice commands. It is intended for use in museums (given the post-COVID death of touch screens). Single-word commands only, but it should understand multiple languages, so already-trained models are probably not much help here. I know that this adds another level of complexity, but anyway…
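On the player side, the multilingual single-word idea could be as small as a lookup table. A minimal sketch, assuming the recognizer (whichever one ends up being used) returns a word plus a confidence score between 0 and 1; the vocabulary, the action names, and the 0.8 threshold are all made up for illustration:

```typescript
type PlayerAction = "play" | "pause" | "next" | "previous";

// Hypothetical vocabulary: English, German, and French words mapped to one action each.
const COMMANDS = new Map<string, PlayerAction>([
  ["play", "play"],
  ["abspielen", "play"],
  ["lecture", "play"],
  ["pause", "pause"],
  ["next", "next"],
  ["weiter", "next"],
  ["suivant", "next"],
  ["previous", "previous"],
  ["zurück", "previous"],
  ["précédent", "previous"],
]);

// Map a recognized word to a player action, or null if the word is unknown
// or the recognizer's confidence is below the (assumed) threshold.
function resolveCommand(
  word: string,
  score: number,
  minScore: number = 0.8
): PlayerAction | null {
  if (score < minScore) {
    return null;
  }
  // Normalize case, whitespace, and Unicode form so "Zurück " matches "zurück".
  const key = word.trim().toLowerCase().normalize("NFC");
  const action = COMMANDS.get(key);
  return action === undefined ? null : action;
}
```

Keeping the table separate from the recognizer means a new language only needs new entries, whatever model produces the words.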