Preprocessing steps for the Common Voice dataset

Hello everyone. I’ve trained a DeepSpeech model for the Persian language. It works really well on the Common Voice dataset, but when it comes to transcribing a voice recorded in the wild, it performs really poorly. Do you know what preprocessing steps should be applied to the recording to make it more like the Common Voice data?

I also trained a DeepSpeech model for Persian; you can find it here. If you can give more details about your setup, maybe we can help better. You could also consider joining us on Matrix.
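
In case it helps while you put together those details: the first thing I would check is the audio format. DeepSpeech expects 16 kHz, mono, 16-bit PCM WAV input, which (as far as I remember) is also what the Common Voice importer produces from the original MP3 clips, so wild recordings at other sample rates or with multiple channels often transcribe poorly. Below is a rough sketch of how one could convert a recording with sox; the file names are placeholders, and the peak-normalisation step is just an optional extra, not something from the Common Voice pipeline.

```python
# Rough sketch: convert a "wild" recording to the 16 kHz / mono / 16-bit PCM WAV
# format DeepSpeech expects. Requires sox on the PATH (with MP3 support if the
# input is an MP3). File names are placeholders.
import subprocess

def to_deepspeech_wav(src_path: str, dst_path: str) -> None:
    subprocess.run(
        [
            "sox", src_path,
            "-r", "16000",  # resample to 16 kHz
            "-c", "1",      # downmix to mono
            "-b", "16",     # 16-bit signed PCM
            dst_path,
            "norm", "-3",   # optional: peak-normalise to -3 dBFS
        ],
        check=True,
    )

if __name__ == "__main__":
    to_deepspeech_wav("wild_recording.mp3", "wild_recording_16k.wav")
```

If the format already matches, the remaining gap is usually acoustic (background noise, microphone, reverb) rather than something a simple conversion step can fix, which is why more details about your recordings would help.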