Any Good Architecture for Continuous Inference

(Hnipun) #1


Are there any good resources for learning how to implement a continuous speech-to-text system: from capturing sound at the microphone, through segmenting the continuous signal and any preprocessing, to feeding the DeepSpeech model?

Thanks in advance.

(Lissyx) #2

You can probably have a look at the Voice Fill WebExtension; it doesn't rely on DeepSpeech, but some people have provided DeepSpeech "server" implementations in Python and NodeJS.
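The "server" pattern those projects follow can be sketched with the standard library alone: an endpoint that accepts raw audio bytes and returns a transcript. The `transcribe()` stub below is a placeholder, not DeepSpeech's API; a real server would feed the audio to a loaded DeepSpeech model at that point.

```python
# Minimal sketch of a speech-to-text HTTP server. The transcribe() stub
# stands in for a real recognizer; everything else is the plumbing a
# DeepSpeech-backed server would share.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def transcribe(audio_bytes):
    # Placeholder: a real server would run the model on audio_bytes here.
    return "hello world" if audio_bytes else ""


class SttHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw audio payload and answer with a JSON transcript.
        length = int(self.headers.get("Content-Length", 0))
        audio = self.rfile.read(length)
        body = json.dumps({"text": transcribe(audio)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), SttHandler).serve_forever()
```

A client would then POST captured audio to the server and read the transcript from the JSON response; the port and URL here are illustrative.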

(Mike Sheldon) #3

You might like to take a look at the GStreamer DeepSpeech plugin I wrote, which does some basic segmentation (just based on silence thresholds):

(Hnipun) #4

Thanks a lot, that's a really nice idea. Do you have any documentation so I can learn about your architecture?