Training DeepSpeech on (near) silence?

When recording silence (or near silence), DeepSpeech starts to produce gibberish instead of an empty transcript. Would it be possible to add silence (or noise-only) samples to the dataset to prevent this?

In my simplified view, the algorithm will always try to attribute some audio to some letter. Since silence is usually just a very low-level acoustic signal, it would be hard to train it to map to any particular letter.

I would rather check the input with VAD and noise/signal levels to decide whether it should be sent to the recognizer at all.
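
Something like this would be my starting point (an untested sketch, assuming 16 kHz, 16-bit mono PCM and the py-webrtcvad package; the `energy_floor` and `min_ratio` thresholds are made up and would need tuning):

```python
# Sketch: gate audio with WebRTC VAD plus a crude energy check before
# handing it to DeepSpeech. Assumes 16 kHz, 16-bit mono PCM; WebRTC VAD
# only accepts frames of 10, 20 or 30 ms.
import numpy as np
import webrtcvad

SAMPLE_RATE = 16000
FRAME_MS = 30
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 2 bytes per 16-bit sample

def contains_speech(pcm_bytes, aggressiveness=2, energy_floor=100.0, min_ratio=0.1):
    """Return True if enough frames are voiced and above the energy floor."""
    vad = webrtcvad.Vad(aggressiveness)
    frames = [pcm_bytes[i:i + FRAME_BYTES]
              for i in range(0, len(pcm_bytes) - FRAME_BYTES + 1, FRAME_BYTES)]
    if not frames:
        return False
    voiced = 0
    for frame in frames:
        samples = np.frombuffer(frame, dtype=np.int16).astype(np.float32)
        rms = np.sqrt(np.mean(samples ** 2))  # rough noise/signal level
        if rms > energy_floor and vad.is_speech(frame, SAMPLE_RATE):
            voiced += 1
    return voiced / len(frames) >= min_ratio
```

Only audio that passes a gate like this would then be passed on to DeepSpeech.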

And play around with the confidence value in the metadata, which should flag bad transcriptions.
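
With the 0.9.x Python bindings that could look roughly like this (the confidence is roughly a sum of per-character scores, so it is negative and depends on the audio length; the `-20.0` threshold below is a placeholder you would have to tune on your own recordings):

```python
# Rough sketch: drop low-confidence results using DeepSpeech metadata.
# Assumes the 0.9.x Python bindings and a 16 kHz, 16-bit mono WAV file.
import wave
import numpy as np
from deepspeech import Model

model = Model("deepspeech-0.9.3-models.pbmm")

def transcribe_or_reject(wav_path, min_confidence=-20.0):
    with wave.open(wav_path, "rb") as w:
        audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    metadata = model.sttWithMetadata(audio, 1)  # 1 = only the best transcript
    best = metadata.transcripts[0]
    if best.confidence < min_confidence:
        return None  # likely silence or noise, discard
    return "".join(token.text for token in best.tokens)
```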

Thanks. Is there a way to get per-token metadata to filter out this sort of thing?

No, as far as I remember the confidence is currently only reported for the transcript as a whole. But there was somebody who wanted to do a PR to change that recently; search the forum. It is a bit harder to implement because it resides on the C++ side of things.