Near real-time inferencing on Android devices

As of now, in the Android demo app provided, we need to supply the location of an audio WAV file; when we click Infer, it gives us the result. I modified the code so that it now takes microphone input: on tapping the Start button, we can speak and the app continuously writes the audio buffer to a file. Once we tap the Stop button, inference is run on the generated WAV file.
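For reference, the capture loop I added looks roughly like this. It is a minimal sketch using Android's AudioRecord at 16 kHz mono 16-bit PCM; the class and method names here are my own placeholders, not code from the demo app, and the app needs the RECORD_AUDIO permission:

```java
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MicToFileRecorder {
    // DeepSpeech models expect 16 kHz, mono, 16-bit PCM audio.
    private static final int SAMPLE_RATE = 16000;
    private volatile boolean recording = false;

    // Called from the Start button: capture PCM from the microphone and append it to a raw file.
    public void start(final String pcmPath) {
        recording = true;
        new Thread(() -> {
            int minBuf = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                    AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
            AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                    SAMPLE_RATE, AudioFormat.CHANNEL_IN_MONO,
                    AudioFormat.ENCODING_PCM_16BIT, minBuf);
            short[] buffer = new short[minBuf / 2];
            try (FileOutputStream out = new FileOutputStream(pcmPath)) {
                recorder.startRecording();
                while (recording) {
                    int read = recorder.read(buffer, 0, buffer.length);
                    // Convert the samples to little-endian bytes and append them to the file.
                    ByteBuffer bytes = ByteBuffer.allocate(read * 2).order(ByteOrder.LITTLE_ENDIAN);
                    for (int i = 0; i < read; i++) {
                        bytes.putShort(buffer[i]);
                    }
                    out.write(bytes.array());
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                recorder.stop();
                recorder.release();
            }
        }).start();
    }

    // Called from the Stop button: stop capture. The existing demo code then runs
    // inference on the recorded audio (after wrapping the raw PCM in a WAV header).
    public void stop() {
        recording = false;
    }
}
```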

I now want to perform speech-to-text on the fly, similar to Google Assistant, where a person speaks continuously and the speech is converted to text simultaneously.

Can anyone point me to the code I can modify to change the input from a WAV file to an audio buffer for the stt method of the DeepSpeechModel class?

Please have a look at the API and the streaming part, as well as https://github.com/mozilla/androidspeech/
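To give an idea of what the streaming part looks like, here is a rough sketch of feeding microphone buffers into the model instead of a WAV file. The method names (createStream, feedAudioContent, intermediateDecode, finishStream) follow the libdeepspeech Java bindings, but exact names and signatures vary between releases, so check the API docs for the version you ship:

```java
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel;
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechStreamingState;

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public class StreamingTranscriber {
    private static final int SAMPLE_RATE = 16000;
    private volatile boolean running = false;

    public interface TextCallback {
        void onPartialResult(String text);
        void onFinalResult(String text);
    }

    // Feed microphone audio straight into an open stream instead of writing a WAV file first.
    public void transcribe(DeepSpeechModel model, TextCallback callback) {
        running = true;
        new Thread(() -> {
            int minBuf = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                    AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
            AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                    SAMPLE_RATE, AudioFormat.CHANNEL_IN_MONO,
                    AudioFormat.ENCODING_PCM_16BIT, minBuf);
            short[] buffer = new short[minBuf / 2];

            DeepSpeechStreamingState stream = model.createStream();
            recorder.startRecording();
            while (running) {
                int read = recorder.read(buffer, 0, buffer.length);
                // Push the raw 16-bit PCM samples into the open stream.
                model.feedAudioContent(stream, buffer, read);
                // Ask for a partial transcript so the UI can update while the user speaks.
                // In practice you may want to throttle this, as decoding is not free.
                callback.onPartialResult(model.intermediateDecode(stream));
            }
            recorder.stop();
            recorder.release();
            // Close the stream and get the final transcript.
            callback.onFinalResult(model.finishStream(stream));
        }).start();
    }

    public void stop() {
        running = false;
    }
}
```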

@dhanesh Please be aware that this androidspeech library with DeepSpeech support is still very experimental, that running on Android is still quite new, and that we only support / know it runs well on a few SoCs (namely Snapdragon 820 and 835).

@lissyx I tried the app as-is. I see that the default value is “eng” and, if it doesn’t find the model file, it downloads it from the following link: eng.zip
This is, however, a 1.5 GB file, which is not good for an Android phone with limited RAM. I have a small 45 MB TFLite model for Android with a custom LM and trie. I tried that, though the accuracy wasn’t good.
Is there any way I can contribute to androidspeech development?

Yeah, but that’s what I warned you about: the links are not correct now.

The zip file is 1.5 GB, but it’s not going to hurt your RAM: it contains a 48 MB TFLite model and a 1.7 GB KenLM language model.

There can be a lot of factors explaining that … I can tell you that during our tests the accuracy was good.

There’s a lot … But if you start by sharing more details about your environment, maybe we can help you …

@dhanesh
Google open-sourced a live transcribe engine two days ago …

Though it uses a cloud API in the backend, the documentation says it can be extended to offline models. You might get some inspiration from it …

@lissyx
Your thoughts on it?

@lissyx found the issue with why my custom LM was not getting picked up: the androidspeech code requires a .useDecoder file to be present in the model root directory, and if it’s not present, the app doesn’t use the local decoder. After creating a dummy file, it worked well.
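For anyone else hitting this, the workaround is just an empty marker file next to the model files, something like the sketch below (the model root path is whatever your app uses; the .useDecoder name is what the androidspeech code checks for):

```java
import java.io.File;
import java.io.IOException;

public class DecoderMarker {
    // Create the empty ".useDecoder" marker so androidspeech picks up the local LM/trie.
    public static void enableLocalDecoder(File modelRoot) throws IOException {
        File marker = new File(modelRoot, ".useDecoder");
        if (!marker.exists() && !marker.createNewFile()) {
            throw new IOException("Could not create " + marker.getAbsolutePath());
        }
    }
}
```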

Yeah, as I told you, it’s still alpha :slight_smile: