Offline speech recognition on mobile

Allenlee · December 4, 2017, 5:44pm

Can DeepSpeech let me implement local, offline speech recognition on mobile?

reuben · December 6, 2017, 1:11pm

Right now, you could do it on a high-end phone, but it would be slow. We haven’t yet created models optimized for inference on mobile devices, but it’s on the roadmap.

madbilly · March 25, 2018, 9:13pm

I was just wondering how to use Mozilla DeepSpeech on Android instead of the Google Voice service. I guess it’s not possible yet? What’s the roadmap, roughly? How can someone with little coding experience help?
Cheers

tonytopper · December 5, 2019, 10:32pm

Any update on getting DeepSpeech onto an iOS device?

reuben · December 5, 2019, 11:33pm

There’s been progress, in that the model is actually convertible to CoreML now: https://github.com/mozilla/DeepSpeech/issues/642 and https://github.com/tf-coreml/tf-coreml/issues/309

Next steps would be:

1. Adding a class that implements the ModelState API using CoreML, similar to how we currently have TFModelState and TFLiteModelState implementations.
2. Figuring out how to compute features, as I’ve had to remove the feature computation sub-graph to get the CoreML conversion to finish. I don’t think the AudioSpectrogram/MFCC ops are supported in CoreML. I’d start by simply vendoring TensorFlow’s kernels and building those into libdeepspeech.so. We could even use this work in all model types, to reduce overhead.
3. Figuring out packaging for iOS. Nobody on our team has iOS experience so I don’t have any suggestions for this. Basically, make it possible to build a DeepSpeech package with the format used for iOS dependency management.
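
As a rough illustration of the first step in the list above, the idea is a single inference interface with interchangeable backends. The sketch below is written in Python for brevity, not in the C++ of the native client, and every name in it (the abstract ModelState class as written here, its infer method, DummyCoreMLModelState, run_inference) is hypothetical rather than the actual DeepSpeech API. The dummy backend returns uniform scores so the example runs without a model.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class ModelState(ABC):
    """Hypothetical stand-in for a backend-agnostic acoustic model interface."""

    def __init__(self, n_classes: int) -> None:
        self.n_classes = n_classes

    @abstractmethod
    def infer(self, features: Sequence[Sequence[float]]) -> List[List[float]]:
        """Return one row of per-class scores for each input feature frame."""


class DummyCoreMLModelState(ModelState):
    """Placeholder backend: a real one would call into CoreML here."""

    def infer(self, features: Sequence[Sequence[float]]) -> List[List[float]]:
        # Emit a uniform distribution per frame so the sketch runs without a model.
        uniform = 1.0 / self.n_classes
        return [[uniform] * self.n_classes for _ in features]


def run_inference(model: ModelState, features: Sequence[Sequence[float]]) -> List[List[float]]:
    # Callers depend only on the interface, so backends can be swapped freely.
    return model.infer(features)
```

The point of the design is that streaming and decoding code would only talk to the interface, so adding a CoreML backend would not require changes elsewhere.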

reuben · December 5, 2019, 11:33pm

Step 3 would possibly also involve adding Swift bindings to the C API.

kezakool · February 7, 2020, 5:31pm

Sorry if I missed something, but does this mean that, for now, the best way to run streaming inference with a DeepSpeech model on iOS is:

Recode the preprocessing part in Metal
Convert the CNN/RNN subgraph to CoreML
Convert the language model to a format iOS can interpret
Recode the ds_decode part in Metal
Wrap all of this in a custom Swift pipeline

Thanks
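
For orientation on the decoding step in the list above: DeepSpeech's acoustic model is trained with CTC, and its decoder is a beam search that can use a language model. The sketch below is not that decoder. It swaps in plain greedy CTC decoding, in Python, only to show what a port of that step has to produce from per-frame character scores. The function name, the alphabet, and the position of the blank index are assumptions made for illustration.

```python
from typing import List, Sequence


def greedy_ctc_decode(probs: Sequence[Sequence[float]], alphabet: str, blank: int) -> str:
    """Pick the best class per frame, merge repeats, then drop blanks."""
    # Index of the highest-scoring class in each frame.
    best_path: List[int] = [
        max(range(len(frame)), key=frame.__getitem__) for frame in probs
    ]
    decoded = []
    previous = None
    for index in best_path:
        # Emit a character only when it differs from the previous frame's
        # class and is not the blank symbol.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)
```

A beam search with a language model keeps several candidate paths instead of a single best one per frame, which is why the language model format in the list above also matters for an iOS port.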

imart · August 19, 2020, 10:10am

@reuben could you update this thread? Is the DeepSpeech library available on iOS? If it isn’t yet, could you share any plans for this work?

lissyx · August 19, 2020, 11:08am

It is available as documented in the 0.8 release notes, as a preview for feedback.

imart · August 19, 2020, 11:21am

Great, thanks for the update!
Is there a demo or example of how to use the library in an iOS app?

lissyx · August 19, 2020, 11:24am

Please: https://github.com/mozilla/STT/tree/master/native_client/swift

Web_Solutions · January 27, 2023, 9:33am

I have trained a Mozilla DeepSpeech model. Can I use it offline?

The model only covers a few words, and I want to use it offline. Is there any way to do that using JS and localStorage?