Creation of language model (lm.binary, output_graph.tflite and trie file)

Rohith_Krishnan · May 5, 2020, 12:15pm

I am trying to fine-tune the deepspeech-0.7.0-models using some voice data I have collected for one of my Android applications (which was created by referring to https://github.com/mozilla/androidspeech). I trained the model using the steps mentioned in link, and exported the tflite model using the --export_tflite flag.

For the Android application to work, I need the two files lm.binary and trie. Can I reuse the ones from the release model, even though the model has been fine-tuned with other voice data? If not, how do I create these files so that they work with my Android application?

othiele · May 5, 2020, 12:22pm

I am not the expert on the Android version, but think of it working like this: the neural net outputs some characters, and then DeepSpeech checks the trie and binary for matching words. So if you are still doing English, the released files should be just fine. You would only change them if you want to transcribe a specific domain (medicine, aviation, …); otherwise go with the standard files.
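
To make the "checks the trie for matching words" idea a bit more concrete, here is a toy sketch in Python. It is not DeepSpeech's actual decoder (the real one combines the acoustic output with a language model during beam search); the names build_trie, is_prefix and is_word and the tiny vocabulary are made up for illustration only.

```python
# Toy illustration only: this is NOT how DeepSpeech is implemented.
# It just shows how a trie lets a decoder ask "can these characters
# still become a known word?" while reading characters one by one.

END = "<end>"  # hypothetical end-of-word marker; cannot clash with a single character


def build_trie(words):
    """Build a nested-dict trie from an iterable of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def _walk(trie, chars):
    """Follow chars through the trie; return the final node or None."""
    node = trie
    for ch in chars:
        if ch not in node:
            return None
        node = node[ch]
    return node


def is_prefix(trie, chars):
    """True if chars is the start of at least one vocabulary word."""
    return _walk(trie, chars) is not None


def is_word(trie, chars):
    """True if chars is exactly a vocabulary word."""
    node = _walk(trie, chars)
    return node is not None and END in node


VOCAB = ["hello", "help", "world"]  # made-up vocabulary
TRIE = build_trie(VOCAB)

if __name__ == "__main__":
    for candidate in ["hel", "help", "hex"]:
        print(candidate, is_prefix(TRIE, candidate), is_word(TRIE, candidate))
```

If your fine-tuning data uses the same language and vocabulary as the release model, the set of valid words does not change, which is why the released files can be reused.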

othiele · May 5, 2020, 12:29pm

Version 0.7 switched from trie and binary to a combined scorer. I don't know whether it's the same for the TFLite model on Android, @lissyx?

Rohith_Krishnan · May 5, 2020, 12:35pm

So if I want to create lm.binary and trie, how do I do that?

lissyx · May 5, 2020, 2:52pm

Read what you are doing when copy/pasting the code: androidspeech depends on 0.6.1, not 0.7.
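
For anyone landing here with the same question on 0.6.x: as far as I recall, the usual route is KenLM's lmplz and build_binary to produce lm.binary, followed by the generate_trie tool from the DeepSpeech native client to produce trie. Below is a small Python sketch that only assembles those three command lines. The file names (vocabulary.txt, alphabet.txt) and the helper lm_commands are assumptions for illustration, and the flag values are the ones I remember from the 0.6.x data/lm documentation, so please check them against the version you actually use.

```python
import shlex
from pathlib import Path


def lm_commands(corpus, alphabet, out_dir):
    """Return the three command lines (as argument lists) for a 0.6.x-style
    lm.binary + trie build. Nothing is executed here.

    corpus   -- text file with one sentence per line (assumed name/format)
    alphabet -- the alphabet.txt used for training the acoustic model
    out_dir  -- directory where lm.arpa, lm.binary and trie would be written
    """
    out = Path(out_dir)
    arpa = out / "lm.arpa"
    binary = out / "lm.binary"
    trie = out / "trie"
    return [
        # 1. KenLM: estimate a 5-gram model and write it in ARPA format.
        ["lmplz", "--order", "5", "--text", str(corpus), "--arpa", str(arpa)],
        # 2. KenLM: convert the ARPA file to a quantized binary trie model.
        ["build_binary", "-a", "255", "-q", "8", "trie", str(arpa), str(binary)],
        # 3. DeepSpeech 0.6.x native client: build the trie from alphabet + LM.
        ["generate_trie", str(alphabet), str(binary), str(trie)],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    for cmd in lm_commands("vocabulary.txt", "alphabet.txt", "."):
        print(shlex.join(cmd))
```

The script prints the commands instead of running them, so you can adjust paths and options first. On a very small text corpus, lmplz may refuse to estimate the model unless you add --discount_fallback. The alphabet.txt must be the same one the acoustic model was trained with.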

Rohith_Krishnan · May 6, 2020, 1:56pm

@lissyx @othiele Thanks for the update.

I have retrained the v0.6.1 model using some of my voice files (about 50+ files), following the steps in https://deepspeech.readthedocs.io/en/v0.6.1/TRAINING.html#continuing-training-from-a-release-model . I also generated the tflite file after training, following the steps in https://deepspeech.readthedocs.io/en/v0.6.1/TRAINING.html#exporting-a-model-for-tflite . But when I check the newly created output_graph.tflite and output_graph.pb, the file sizes are the same as those of the files in the release models: https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/deepspeech-0.6.1-models.tar.gz

Is this expected?

othiele · May 6, 2020, 3:25pm

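In most cases the model size stays the same after further training, because the architecture (and therefore the number of weights) does not change. Only the values of the weights differ.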

You should get different results for different training material.
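
A quick way to convince yourself: the sketch below serializes two sets of made-up "weights" with the same shape. It has nothing to do with the real DeepSpeech file format; the 1000 random floats and the small perturbation standing in for fine-tuning are pure assumptions for illustration.

```python
import hashlib
import random
import struct


def serialize(weights):
    """Pack a list of floats as little-endian 32-bit values (toy format)."""
    return struct.pack(f"<{len(weights)}f", *weights)


rng = random.Random(0)  # fixed seed so the example is repeatable

# Stand-in for the released model: 1000 made-up weights.
release_weights = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

# Stand-in for fine-tuning: same number of weights, slightly changed values.
fine_tuned_weights = [w + rng.uniform(-0.01, 0.01) for w in release_weights]

release_blob = serialize(release_weights)
fine_tuned_blob = serialize(fine_tuned_weights)

# Same number of weights -> same number of bytes on disk.
same_size = len(release_blob) == len(fine_tuned_blob)

# Different values -> different checksum.
same_content = (
    hashlib.sha256(release_blob).hexdigest()
    == hashlib.sha256(fine_tuned_blob).hexdigest()
)

if __name__ == "__main__":
    print("same size:", same_size)
    print("same content:", same_content)
```

For your real files, comparing a checksum (for example SHA-256) of your output_graph.tflite against the released one should tell you whether the content actually changed, even though the size is identical.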
