Any Good Architecture for Continuous Inference

(Hnipun) #1


Are there any good resources for learning how to implement a continuous speech-to-text system: from capturing sound at the microphone, through segmenting the continuous signal and any preprocessing, to feeding the DeepSpeech model?

Thanks in advance.

(Lissyx) #2

You can probably have a look at the Voice Fill WebExtension; it doesn't rely on DeepSpeech, but some people have provided DeepSpeech "server" implementations in Python and NodeJS.
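The "server" pattern those projects follow can be sketched with the standard library alone: an endpoint that accepts raw audio bytes and returns a transcript. The `transcribe()` stub below is a placeholder, not DeepSpeech's API; a real server would feed the audio to a loaded DeepSpeech model at that point.

```python
# Minimal sketch of a speech-to-text HTTP server. The transcribe() stub
# stands in for a real recognizer; everything else is the plumbing a
# DeepSpeech-backed server would share.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def transcribe(audio_bytes):
    # Placeholder: a real server would run the model on audio_bytes here.
    return "hello world" if audio_bytes else ""


class SttHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw audio payload and answer with a JSON transcript.
        length = int(self.headers.get("Content-Length", 0))
        audio = self.rfile.read(length)
        body = json.dumps({"text": transcribe(audio)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), SttHandler).serve_forever()
```

A client would then POST captured audio to the server and read the transcript from the JSON response; the port and URL here are illustrative.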

(Mike Sheldon) #3

You might like to take a look at the GStreamer DeepSpeech plugin I wrote, which does some basic segmentation (just based on silence thresholds):

(Hnipun) #4

Thanks a lot, that's a really nice idea. Do you have any documentation so I can learn about your architecture?