What training data sets will the upcoming pre-trained model releases (0.2 or 0.3) use?

tan_oscar · June 23, 2018, 4:26pm

Hi,

I’m planning to fine-tune the upcoming 0.2 or 0.3 pre-trained model, because we can’t add new vocabulary words to the 0.1 model.

Do we know yet which training data sets the upcoming 0.2 or 0.3 pre-trained model will use?

Thanks.

lissyx · June 24, 2018, 6:28pm

There should be no change to the model for the 0.2.0 release; we will only ship optimizations to libdeepspeech.so and on the language model side, as well as proper sources for rebuilding the language model. For 0.3.0, it’s likely too early to tell: @reuben is still working on the changes for the streaming model.