I need help/ideas on what I can use to denoise audio so that I can improve model accuracy. I have already tried RNNoise: the model output gets better for some audio files and worse for others, so overall it is not helping. Can you suggest anything else I could try? It would be really helpful.
Here are the details of what I'm working with:
Acoustic Model : 0.7.1 released by Mozilla
Scorer : Custom
Data : Conversational data. Customer support.
I am splitting each call recording into smaller chunks using VAD, and those chunks are then fed to the model. What I’ve observed is that the model does well when the audio is short. So I am hoping that denoising will help more with longer audio files, but I am not sure where to start. Thanks!
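To make the chunking step concrete, here is a rough sketch of that kind of pipeline. It is not my exact code. It assumes 16 kHz, 16-bit mono samples and uses a plain energy threshold as a stand-in for a real VAD. The names `split_on_silence`, `frame_ms`, `threshold`, `min_silence_frames` and `max_chunk_s` are made up for illustration, and the default values are guesses that would need tuning.

```python
import math


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def split_on_silence(samples, sample_rate=16000, frame_ms=30,
                     threshold=500.0, min_silence_frames=10,
                     max_chunk_s=10.0):
    """Return (start, end) sample indices of speech chunks.

    A frame counts as speech when its RMS is at or above `threshold`
    (an assumed stand-in for a real VAD decision). A chunk is closed
    after `min_silence_frames` silent frames in a row, or once it
    reaches roughly `max_chunk_s` seconds, so no chunk gets very long.
    """
    frame_len = sample_rate * frame_ms // 1000
    max_len = int(max_chunk_s * sample_rate)
    chunks = []
    start = None            # start index of the chunk being built
    last_voiced_end = None  # end index of the last speech frame seen
    silence = 0             # consecutive silent frames inside a chunk

    # A trailing partial frame (shorter than frame_len) is ignored.
    for pos in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[pos:pos + frame_len]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = pos
            silence = 0
            last_voiced_end = pos + frame_len
        elif start is not None:
            silence += 1

        if start is not None:
            too_long = (pos + frame_len - start) >= max_len
            if silence >= min_silence_frames or too_long:
                # Trim trailing silence by ending at the last speech frame.
                chunks.append((start, last_voiced_end))
                start, silence = None, 0

    if start is not None:
        chunks.append((start, last_voiced_end))
    return chunks
```

Each `(start, end)` pair is then sliced out of the sample buffer and fed to the model on its own. Capping the chunk length with something like `max_chunk_s` is one way to keep every chunk in the short range where the model seems to do well, independent of any denoising.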