Preprocessing steps for the Common Voice dataset

Hello everyone. I’ve trained a DeepSpeech model for the Persian language. It works really well on the Common Voice dataset, but when it comes to transcribing a voice recorded in the wild, it performs really poorly. Do you know what preprocessing steps should be applied to the recording to make it more like the Common Voice data?

I also trained a DeepSpeech model for Persian; you can find it here. If you can give more details about your setup, maybe we can help better. You could also consider joining us on Matrix.
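
In case it helps while you put together those details: the first thing I would check is the audio format. DeepSpeech expects 16 kHz, mono, 16-bit PCM WAV input, which (as far as I remember) is also what the Common Voice importer produces from the original MP3 clips, so wild recordings at other sample rates or with multiple channels often transcribe poorly. Below is a rough sketch of how one could convert a recording with sox; the file names are placeholders, and the peak-normalisation step is just an optional extra, not something from the Common Voice pipeline.

```python
# Rough sketch: convert a "wild" recording to the 16 kHz / mono / 16-bit PCM WAV
# format DeepSpeech expects. Requires sox on the PATH (with MP3 support if the
# input is an MP3). File names are placeholders.
import subprocess

def to_deepspeech_wav(src_path: str, dst_path: str) -> None:
    subprocess.run(
        [
            "sox", src_path,
            "-r", "16000",  # resample to 16 kHz
            "-c", "1",      # downmix to mono
            "-b", "16",     # 16-bit signed PCM
            dst_path,
            "norm", "-3",   # optional: peak-normalise to -3 dBFS
        ],
        check=True,
    )

if __name__ == "__main__":
    to_deepspeech_wav("wild_recording.mp3", "wild_recording_16k.wav")
```

If the format already matches, the remaining gap is usually acoustic (background noise, microphone, reverb) rather than something a simple conversion step can fix, which is why more details about your recordings would help.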