Hi, I am developing an Android app that does on-device speech recognition. I got this working with the pretrained model (0.5.1), but the results weren’t great.
I wanted to do some further training for my use case. I read this thread, so I checked out DeepSpeech 0.5.1 and installed TensorFlow 1.13.1 via pip. The additional training completed successfully, and the model seems to work quite well when tested via the command line.
Now, I am unable to export my checkpoint to TFLite. I get the error “Exception: TensorFlow Lite currently doesn’t support control flow ops: Merge, Switch.”
I found some threads stating that this has been fixed in DeepSpeech 0.6, but I also understand from the thread linked above that my 0.5.1 model is not compatible with DeepSpeech 0.6.
I wondered if using Mozilla’s TensorFlow fork would help, so I tried to build it, but I ran into build issues and wanted to stop and ask before struggling with it further.
So, is there a way for me to continue training a pretrained model, and then export it as a TFLite model?
Will Mozilla’s fork of TensorFlow help? Should I continue trying to build it?
I think this sounds like a good plan; I will try to continue training the 0.6a model.
One question - when I originally trained from the 0.5.1 model, I downloaded the checkpoint and used that in addition to the released model files. Is the checkpoint available for 0.6a4?
Sorry, I think I must be misunderstanding something.
How is it possible for me to follow the suggestion from @kdavis without the checkpoint? Is it possible for me to use the 0.6a4 output_graph.pb and the 0.6a4 codebase, but train from the 0.5.1 checkpoint? I assumed those would be incompatible.
To continue training you’ll need to patch the tf.train.Saver to be able to match the 0.5.1 variable names with the names on master. I don’t know how it’ll interact when saving the new checkpoints, though; maybe you’ll want two separate savers, one for loading the 0.5.1 checkpoint initially and one for saving the checkpoints from your fine-tuning run. In any case, it’ll require some light modification of the code.

I’m attaching the patch with the logic applied to the export function, which is how we got the 0.6a4 exported models for testing; for your use case you’ll want to apply the same logic to the Saver used during training.
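Roughly, the two-saver idea looks like this in TF 1.x. Note this is just a sketch: the NAME_MAP entries below are placeholders for illustration (the real 0.5.1-to-master renames are the ones in the attached patch), and the checkpoint paths are examples.

```python
import tensorflow as tf

# Placeholder renames for illustration only -- the real 0.5.1 -> master
# variable name mapping is the one applied in the attached patch.
NAME_MAP = {
    'old_scope/weights': 'new_scope/weights',
    'old_scope/bias': 'new_scope/bias',
}

# (Assumes the DeepSpeech graph has already been built in this process.)
# tf.train.Saver accepts a dict of {name_in_checkpoint: variable}, so key
# the restore list by the *old* names to find them in the 0.5.1 checkpoint.
inverse_map = {new: old for old, new in NAME_MAP.items()}
restore_vars = {inverse_map.get(v.op.name, v.op.name): v
                for v in tf.global_variables()}

# One saver restores the 0.5.1 checkpoint under the old names...
loader = tf.train.Saver(var_list=restore_vars)
# ...and a second, default saver writes the fine-tuning checkpoints
# under the current (master) names.
saver = tf.train.Saver()

with tf.Session() as session:
    loader.restore(session, tf.train.latest_checkpoint('deepspeech-0.5.1-checkpoint'))
    # ... run the fine-tuning steps here ...
    saver.save(session, 'fine_tuned/best_dev_checkpoint')
```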
I gave this a try and was able to export the TFLite model successfully (meaning there were no command-line errors during the export).
However, the TFLite model isn’t working on the Android device (I’m using the Android mozillaspeechlibrary). It looks like the constructor in DeepSpeechModel.java is failing: at the end of the constructor, this._msp is still null, which causes a NullPointerException later.
I simply replaced the old .tflite file with the new one in the device storage - perhaps I need to do something else to make the new model work?
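One sanity check I can do before digging further into Android is to load the exported file with the TFLite Python interpreter; a minimal sketch, assuming the same TensorFlow 1.13.1 I used for the export, and that my exported file is named output_graph.tflite:

```python
import tensorflow as tf  # 1.13.1, the same version used for the export

# If the flatbuffer is malformed or uses unsupported ops, the load or
# allocation raises here, which would point at the export itself rather
# than at the Android side.
interpreter = tf.lite.Interpreter(model_path='output_graph.tflite')
interpreter.allocate_tensors()

# Dump the I/O tensors so they can be compared against the old model's.
for detail in interpreter.get_input_details():
    print('input:', detail['name'], detail['shape'], detail['dtype'])
for detail in interpreter.get_output_details():
    print('output:', detail['name'], detail['shape'], detail['dtype'])
```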
output_graph.pb results, for comparison:
WER: 0.019960, CER: 0.008225, loss: 1.638512
Seems like the .tflite model is working great, so I guess it’s something in Android. I haven’t actually changed the Android code at all from the demo, so I’m not sure what the issue would be there. It seems like the call to impl.CreateModel() in the DeepSpeechModel constructor is running into a problem, which causes _msp to be null.

I’m honestly not familiar with JNI. I traced the CreateModel() function back as far as I could, but it looks like it’s a native function declared in libdeepspeech.jar. Can I somehow view the native CreateModel function so I can try to debug it?
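One thing I could try in the meantime is exercising the same native code path from the Python bindings, where any native error message lands on stderr instead of being swallowed on the device. A rough sketch, assuming a TFLite-enabled build of the bindings (the stock desktop package of this era may expect a .pb model instead) and the 0.6-style constructor:

```python
from deepspeech import Model

# Assumption: 0.6-style constructor (model path + beam width); the 0.5.x
# bindings took (model, n_cep, n_context, alphabet, beam_width) instead.
# Check native_client/python/__init__.py for the version actually built.
BEAM_WIDTH = 500

# The Python binding wraps the same native CreateModel that the Java
# impl.CreateModel calls, so a failure here should print the native
# error somewhere easier to read than Android logcat.
ds = Model('output_graph.tflite', BEAM_WIDTH)
print('model created successfully')
```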