DeepSpeech over Web Browser

(Agarwalaashish20) #1

Hi, I have created a project that makes DeepSpeech usable from the web browser, to make it easier to use. So far I haven’t found any relevant project in the examples section of DeepSpeech (https://github.com/mozilla/DeepSpeech/tree/master/examples) that covers accessing Mozilla DeepSpeech on the web.

I thought I would share it. Please have a look and let me know whether the project can be included in the DeepSpeech project. :slight_smile:

Any suggestions and improvements are welcome.

(Lissyx) #2

That’s nice! However, from a quick look at the code, it’s an API that uses deepspeech plus client code that calls it, right? And since the API is written in Python, why bother with inefficient subprocess calls, which cause the model to be reloaded from scratch each time, when you could use the Python module directly and write from deepspeech import Model?
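
To illustrate the difference, here is a minimal sketch of loading the model once per process and reusing it for every request, instead of spawning a subprocess each time. The names get_model, transcribe, default_loader and MODEL_PATH are hypothetical, and the single-argument Model constructor and single-argument stt call are assumptions: both signatures changed between DeepSpeech releases, so check the client shipped with your version.

```python
from functools import lru_cache

MODEL_PATH = "output_graph.pbmm"  # hypothetical path to the exported model


def default_loader(path):
    """Build a DeepSpeech model from a file path."""
    # Imported lazily so this module can be loaded without deepspeech installed.
    from deepspeech import Model
    # Assumption: single-argument constructor; older releases need more arguments.
    return Model(path)


@lru_cache(maxsize=None)
def get_model(path=MODEL_PATH, loader=default_loader):
    """Load the model on first use and return the cached instance afterwards."""
    return loader(path)


def transcribe(audio, path=MODEL_PATH, loader=default_loader):
    """Run inference with the shared model instead of reloading it per request.

    `audio` is whatever the installed release's stt() expects
    (16-bit mono PCM samples in the official clients).
    """
    return get_model(path, loader).stt(audio)
```

With this layout the expensive load happens on the first call to transcribe only; later calls go straight to inference.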

(Agarwalaashish20) #3

Thank you for the input @lissyx.

As suggested, I have made the necessary changes in the backend code. Strangely, though, the results are not as accurate as before. I believe it has to do with parameters such as BEAM_WIDTH and LM_WEIGHT that are needed to set up the DeepSpeech model. Please advise.

Here is the link: DeepSpeech-API

Also, my intent for this discussion is to have the above repository forked as a part of Mozilla DeepSpeech examples folder. DeepSpeech Examples

Reason: We want to introduce the Mozilla DeepSpeech model to students at the university. Since we don’t want the students to go through the entire setup, we want to run DeepSpeech on a standalone server that students can use from the browser. Later, if they find it interesting, they can get involved themselves. Having the model available in the browser makes things really easy.

Looking forward to your answer.

(Agarwalaashish20) #4

Any comments on the above request? Looking forward to your answer.

(Carlos Fonseca) #5

I don’t think there is a viable client-server setup yet. As mentioned here, the model can’t process more than one audio input at a time. @agarwalaashish20, multiple clients feeding the same deepspeech server instance can cause the problems you described.

You can read about batching here
https://github.com/tensorflow/serving/tree/master/tensorflow_serving/batching
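
As a rough illustration of the constraint above, here is a minimal sketch that serializes access to a single model instance with a lock, so concurrent requests queue up instead of running inference at the same time. This is plain serialization, not batching as described in the TensorFlow Serving link. The class name SerializedRecognizer is hypothetical, and the only assumption about the model is that it exposes an stt(audio) method.

```python
import threading


class SerializedRecognizer:
    """Wrap a model so that only one thread runs inference at a time."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def stt(self, audio):
        # Other callers block here until the current inference has finished.
        with self._lock:
            return self._model.stt(audio)
```

Each request handler would call stt on the shared wrapper rather than on the model itself. Throughput stays at one audio input at a time, but requests no longer interfere with each other.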

(Lissyx) #6

Please refer to the release notes and the other clients to see the proper values. It’s all there, no magic.

(Agarwalaashish20) #7

@lissyx @carlfm01: Please have a look. I made some fixes. Are these changes viable for a fork?

Here is the link: DeepSpeech-API

(Lissyx) #8

That looks better, but I’m really really not convinced this should land in the main repo.

(Agarwalaashish20) #9

@lissyx: Please advise if you have some ideas to improve it. This implementation serves the purpose of accessing DeepSpeech from the web browser, but I am open to enhancing it. I just need your guidance so that we end up with an implementation that is sufficient to help users quickly start using DeepSpeech in the browser.

(Lissyx) #10

I’m not sure exactly what you mean. That’s not a DOM implementation of DeepSpeech, that’s an API exposed over HTTP, and there are others like https://pypi.org/project/deepspeech-server/ or, in Rust, https://gitlab.com/deepspeech/ds-srv/tree/master/
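
For readers unfamiliar with the distinction, here is a minimal sketch of what "an API exposed over HTTP" amounts to: the browser only uploads audio, and the recognition runs on the server. It uses only the Python standard library and is not how deepspeech-server or ds-srv are implemented. The names make_handler and serve are hypothetical, and transcribe stands for any callable that turns the uploaded bytes into text.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(transcribe):
    """Build a request handler that forwards POSTed audio bytes to `transcribe`."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read exactly the number of bytes the client announced.
            length = int(self.headers.get("Content-Length", 0))
            audio = self.rfile.read(length)
            body = json.dumps({"text": transcribe(audio)}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # suppress per-request logging to keep the sketch quiet

    return Handler


def serve(transcribe, host="127.0.0.1", port=8000):
    """Run a blocking HTTP server until the process is interrupted."""
    HTTPServer((host, port), make_handler(transcribe)).serve_forever()
```

A page in the browser would POST the recorded audio to this endpoint and display the returned JSON. Nothing from DeepSpeech itself runs in the browser.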
