Deep Speech v0.4.1 Released

What about packaging rnnoise into the current C++ client and adding an option to enable denoising on the fly? Do you think it would be worth trying to make them work together? For my use case, using ffmpeg with a band filter is not practical, at least when using the streaming feature from C#. I think it would be great to have the same noise filter for all the clients.

Here’s the GitHub repository: https://github.com/xiph/rnnoise

It’s certainly possible.

However, there are a few reasons we have not added in rnnoise:

  • We’ve created, but have yet to use, a tool called voice-corpus-tool that supplements our audio with noise so that the model itself learns to handle the noise. In that case rnnoise is not needed.
  • Adding rnnoise in front of the current model would systematically modify the audio in ways not seen at training time, which could increase the word error rate (WER).
  • Adding rnnoise could require retraining the model with rnnoise in the pipeline to counter the previous issue.
  • We try to avoid adding another dependency where it may not be needed.
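
To make the first point concrete, here is a minimal sketch of what "supplementing audio with noise" can look like: mixing a noise clip into a speech clip at a chosen signal-to-noise ratio. This is not how voice-corpus-tool is implemented. The function names (`rms`, `mix_at_snr`) are hypothetical, and the sketch assumes both clips are equal-length lists of float samples.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise so the mix has the requested SNR (dB).

    Assumption of this sketch: both inputs are equal-length lists of floats.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_rms = rms(speech)
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        # Silent noise clip: nothing to mix in.
        return list(speech)
    # Solve snr_db = 20 * log10(speech_rms / (gain * noise_rms)) for gain.
    gain = speech_rms / (noise_rms * 10 ** (snr_db / 20.0))
    return [s + gain * n for s, n in zip(speech, noise)]
```

A model trained on clips augmented this way, over a range of SNR values, sees noisy audio at training time, which is why a separate denoiser might become unnecessary.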

This is our current take on rnnoise. I’d be curious as to your opinion/experience with it in the pipeline.

In the end, I think only testing will tell us whether the model does better handling the noise itself or handling the artifacts from rnnoise. I’ll add this to my roadmap. Thanks for sharing your opinion.
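
For anyone who wants to run that comparison, a rough sketch of the metric follows: word error rate computed as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` is hypothetical and this is not DeepSpeech's own evaluation code. You would feed it the reference transcript plus the transcripts obtained with and without rnnoise in front of the model.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Averaging this over the same noisy test clips, once on the raw audio and once on the rnnoise-filtered audio, would show whether the filter helps or whether its artifacts hurt more than the noise does.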