What dataset error rate is acceptable?

comodoro · January 26, 2021, 4:01pm

Is there any information on how low the error rate of a corpus needs to be for the model to improve? I assume the model's CER can never be lower than the CER of the corpus, although that is almost never the limiting factor in practice. Is there any other limitation? By dataset errors I mean things like typos or abbreviations in the transcription and mispronunciations in the audio.
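
To make "CER of the corpus" concrete: CER is usually taken as the character-level edit distance between a reference transcript and the text being scored, divided by the reference length. Below is a minimal Python sketch of that definition. The function names `edit_distance` and `cer` and the sample strings are illustrative assumptions, not part of DeepSpeech or any particular toolkit.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of hyp measured against the reference ref."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: a hand-checked transcript vs. the one in the dataset.
checked = "hello world"
in_dataset = "helo world"
print(cer(checked, in_dataset))
```

Under this definition, one way to estimate the error rate of a corpus would be to hand-correct a random sample of transcripts and score the dataset's versions against them.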
