Benchmark results with v0.3.0?

It’s reported that the WER on LibriSpeech’s test-clean set was 6.5%. I believe this was with DeepSpeech v0.1.1. Do we have any results on the same benchmark data with the newer versions, 0.2.0 and/or 0.3.0?

When I tested on my little proprietary test set, 0.3.0 was slightly worse than 0.2.0. I was wondering if this was an anomaly or a trend.
Thanks!

6.5% was with v0.1, but that number turned out to be inflated by the test data accidentally being included in the data used to build the language model. v0.2 then saw an increase in the error rate, due both to the reduction in model size for streaming and to the removal of the offending data from the language model’s construction. v0.3 fixed some non-deterministic behavior bugs that were introduced in v0.2, but the acoustic model released with it is the same as the one in v0.2, so the v0.3 error rate is probably the more accurate one. We’re working on different features that will hopefully bring the WER back down to under 10%.
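For anyone comparing numbers across releases, it may help to be precise about what WER measures. Here’s a minimal sketch (not DeepSpeech’s own evaluation code, just an illustration) that computes word error rate as word-level edit distance divided by the number of reference words:

```python
# Minimal word error rate (WER) sketch: substitutions + insertions +
# deletions (word-level Levenshtein distance) divided by the number of
# reference words. Illustration only; assumes a non-empty reference.
def wer(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution/match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat mat"))  # 0.333...
```

Note that corpus-level WER is usually computed by summing edits and reference lengths over all utterances, rather than averaging per-utterance rates.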

Awesome. Thanks for the information!

Can you please expand on this? I don’t understand it clearly. Does it mean that the transcripts of the test data set were included in the language corpus when building the language model?

Yes. This was corrected in the LM released with v0.2 and v0.3.
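
To make the fix concrete: the leak was that test-set transcripts ended up in the text corpus used to build the language model, which inflates the benchmark. Here’s a hedged sketch of the kind of filtering involved; the file names and paths are hypothetical, not the actual DeepSpeech pipeline:

```python
# Hypothetical sketch of removing test-set transcripts from an LM
# training corpus before building the language model. File paths are
# made up for illustration; this is not the actual DeepSpeech tooling.
def normalize(line: str) -> str:
    # Match on lowercased, whitespace-collapsed text so trivial
    # formatting differences don't hide a leak.
    return " ".join(line.lower().split())

with open("test_clean_transcripts.txt") as f:
    test_sentences = {normalize(line) for line in f if line.strip()}

kept, dropped = 0, 0
with open("lm_corpus.txt") as src, open("lm_corpus_clean.txt", "w") as dst:
    for line in src:
        if normalize(line) in test_sentences:
            dropped += 1          # exact overlap with the test set: leak
        else:
            dst.write(line)
            kept += 1

print(f"kept {kept} lines, dropped {dropped} overlapping lines")
```

The cleaned corpus would then be fed to the LM builder (e.g. KenLM) so the language model can’t simply memorize the test sentences.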