Segmentation Fault (Core Dumped) during inference using deepspeech-tflite

Hello All,

I have trained the DeepSpeech model on my custom dataset with version 0.9.1. After training, I generated checkpoints and exported .tflite and .pb files using the provided script.

However, when I run inference using deepspeech-tflite==0.9.1, I get the following error:

(venv) rahul@rahul-rs:~$ deepspeech --model /home/rahul/DS1/output_graph.tflite --audio /home/rahul/x.wav
Loading model from file /home/rahul/DS1/output_graph.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.1-0-gab8bd3e
Segmentation fault (core dumped)
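
For what it's worth, here is the equivalent call through the Python bindings (a minimal sketch; the paths are illustrative and the audio is assumed to be 16 kHz, 16-bit mono). The CLI segfaults right after the "Loading model from file" line, so the crash presumably happens in the Model constructor:

import wave
import numpy as np
from deepspeech import Model

MODEL_PATH = "/home/rahul/DS1/output_graph.tflite"   # illustrative path
AUDIO_PATH = "/home/rahul/x.wav"                      # illustrative path

ds = Model(MODEL_PATH)   # the segfault seems to happen at this point

with wave.open(AUDIO_PATH, "rb") as wav:
    # 16 kHz, 16-bit mono WAV assumed, as expected by the acoustic model
    audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

print(ds.stt(audio))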

Inference from the .pb file using pip3 install deepspeech-gpu works fine, but it crashes with the tflite model. Likewise, when I run the same model in the native client in Android Studio, the application crashes.

The pre-trained model works absolutely fine. Any suggestions, @lissyx?
Thanks

Do you reproduce this only with your own tflite file, or also with our released ones? We have already had reports of "someone's exported tflite" segfaulting, but we failed to get more information. In one case the user re-exported the TFLite model from the checkpoints, and the segfault was gone …

Thank you for the timely reply.

No, even when I export from the checkpoints provided here: https://github.com/mozilla/DeepSpeech/releases/tag/v0.9.1

I get the same error.

Here is what I did:

(venv) rahul@rahul-rs:~$ cd DeepSpeech/
(venv) rahul@rahul-rs:~/DeepSpeech$ python3 DeepSpeech.py --export_tflite --export_dir /home/rahul/ --checkpoint_dir /media/rahul/Personal/deepspeech-0.9.1-checkpoint
I Exporting the model…
I Loading best validating checkpoint from /media/rahul/Personal/deepspeech-0.9.1-checkpoint/best_dev-1466475
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
I Models exported at /home/rahul/
I Model metadata file saved to /home/rahul/author_model_0.0.1.md. Before submitting the exported model for publishing make sure all information in the metadata file is correct, and complete the URL fields.
(venv) rahul@rahul-rs:~/DeepSpeech$ cd ../
(venv) rahul@rahul-rs:~$ deepspeech --model /home/rahul/output_graph.tflite --audio /home/rahul/sample_libri_61.wav
Loading model from file /home/rahul/output_graph.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.1-0-gab8bd3e
Segmentation fault (core dumped)

Yes, I am aware of this. I went through it and saw that the error there was related to the 0.6 alpha version the user had installed. Here I have double-checked and followed the step-by-step instructions given in the docs for training, exporting, and inference.

I trained on my dataset with TensorFlow 1.15.4 (as documented), without changing any parameters other than reducing the number of epochs and the batch size.

When I pip install deepspeech-tflite==0.9.1 and run inference, why does it report that the TensorFlow version is 2.3, even though it's running in the same virtual environment and no other TensorFlow version is installed on the computer?

In DS 0.9, training is still on TF 1.15 while inference runs on TF 2.3.
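
The TensorFlow reported by the CLI is the one statically linked into the inference library, not the one installed in your virtualenv. A quick sketch to see both side by side (assuming the deepspeech package exposes version()):

# Run inside the training virtualenv: the TF you installed for training is
# independent of the TF runtime bundled into the deepspeech-tflite wheel.
import tensorflow as tf
import deepspeech

print("TensorFlow installed for training:", tf.__version__)   # e.g. 1.15.4
print("deepspeech bindings:", deepspeech.version())           # e.g. 0.9.1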

Good point, as this might be the problem. Could you please pip install 0.7.4 in a different environment and try the tflite there? 0.7 models should be compatible, and inference there is still on TF 1.15, I believe, so maybe the problem lies there. And if you have the time, also check 0.8.2 on TF 2.2.

But as it works with the released tflite, maybe this is not the problem …

I am encountering the same error with 0.7.4 for the tflite generated from the pre-trained checkpoints themselves:

rahul@rahul-rs:~$ python3 -m venv venv_0.7
rahul@rahul-rs:~$ source ./venv_0.7/bin/activate
(venv_0.7) rahul@rahul-rs:~$ pip install --upgrade pip
Collecting pip
Downloading https://files.pythonhosted.org/packages/54/eb/4a3642e971f404d69d4f6fa3885559d67562801b99d7592487f1ecc4e017/pip-20.3.3-py2.py3-none-any.whl (1.5MB)
100% |████████████████████████████████| 1.5MB 1.1MB/s
Installing collected packages: pip
Found existing installation: pip 8.1.1
Uninstalling pip-8.1.1:
Successfully uninstalled pip-8.1.1
Successfully installed pip-20.3.3
(venv_0.7) rahul@rahul-rs:~$ pip3 install deepspeech-tflite==0.7.4
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.
Collecting deepspeech-tflite==0.7.4
Downloading deepspeech_tflite-0.7.4-cp35-cp35m-manylinux1_x86_64.whl (1.2 MB)
|████████████████████████████████| 1.2 MB 2.3 MB/s
Collecting numpy
Downloading numpy-1.18.5-cp35-cp35m-manylinux1_x86_64.whl (19.9 MB)
|████████████████████████████████| 19.9 MB 5.6 MB/s
Installing collected packages: numpy, deepspeech-tflite
Successfully installed deepspeech-tflite-0.7.4 numpy-1.18.5
(venv_0.7) rahul@rahul-rs:~$ deepspeech --model /home/rahul/output_graph.tflite --audio /home/rahul/sample_libri_61.wav
Loading model from file /home/rahul/output_graph.tflite
TensorFlow: v1.15.0-24-gceb46aa
DeepSpeech: v0.7.4-0-gfcd9563
Segmentation fault (core dumped)

No change with 0.8.2 either:

(venv_0.8) rahul@rahul-rs:~$ pip3 install deepspeech-tflite==0.8.2
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.
Collecting deepspeech-tflite==0.8.2
Downloading deepspeech_tflite-0.8.2-cp35-cp35m-manylinux1_x86_64.whl (1.6 MB)
|████████████████████████████████| 1.6 MB 2.5 MB/s
Collecting numpy
Using cached numpy-1.18.5-cp35-cp35m-manylinux1_x86_64.whl (19.9 MB)
Installing collected packages: numpy, deepspeech-tflite
Successfully installed deepspeech-tflite-0.8.2 numpy-1.18.5
(venv_0.8) rahul@rahul-rs:~$ clear
(venv_0.8) rahul@rahul-rs:~$ deepspeech --model /home/rahul/output_graph.tflite --audio /home/rahul/sample_libri_61.wav
Loading model from file /home/rahul/output_graph.tflite
TensorFlow: v2.2.0-24-g1c1b2b9
DeepSpeech: v0.8.2-0-g02e4c76
Segmentation fault (core dumped)

Thanks, this is exactly what I meant. I wanted to make sure this is not related to TF versions.

As stated in the docs, did you change the alphabet file?

No, I haven't changed anything except the dataset. I used the same code for training and only changed the number of epochs and the batch size.

I am training with the 960-hour LibriSpeech dataset.

Hm, you could try building the tflite from your checkpoint on Colab or a different system, just to make sure the segmentation fault isn't caused by something on your system.

But as @lissyx said, if the .pb file works and the supplied tflite works, it is strange that you can't build a working tflite of your own. You could start debugging it yourself, but TFLite is probably a bigger project to learn in itself.
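
One first step that does not require digging into TFLite internals would be to check whether the exported flatbuffer even loads in the plain TensorFlow Lite interpreter, and whether its tensors match the released model. A sketch (file names and paths are illustrative):

import tensorflow as tf

# Compare the released tflite with the re-exported one; if the re-exported
# file fails to load here, or its tensor shapes differ, the export itself is
# at fault rather than the deepspeech runtime.
for path in ("/home/rahul/deepspeech-0.9.1-models.tflite",   # released model (illustrative)
             "/home/rahul/output_graph.tflite"):             # re-exported model
    interp = tf.lite.Interpreter(model_path=path)
    interp.allocate_tensors()
    print(path)
    for d in interp.get_input_details():
        print("  input :", d["name"], d["shape"], d["dtype"])
    for d in interp.get_output_details():
        print("  output:", d["name"], d["shape"], d["dtype"])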

@Rahul_Badarinath I have already replied to you: if our released .pb and .tflite work, then it's a problem with your tflite export. Some people have reported the same, but we have not been able to get more information, so we don't know what is wrong. And since I'm still on PTO, you will have to wait.

Try using debug builds to investigate the stack?
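
Before building a debug libdeepspeech, a cheap check (a sketch, path illustrative) is to enable Python's faulthandler so the crash at least reports which binding call it happens in:

import faulthandler
faulthandler.enable()   # dumps the Python-level traceback if a SIGSEGV occurs

from deepspeech import Model

ds = Model("/home/rahul/output_graph.tflite")   # illustrative path
print("Model constructed without crashing; the fault must be later in the pipeline")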

Yes, I'm already on it; I was just hoping you could point me in a direction to solve the error, that's all. Never mind, I'll look into it and see if I can debug anything.

We have already tried on 3 systems, running Ubuntu 16, Ubuntu 18, and Windows, and encountered the same issue on all 3. After using DeepSpeech.py to export the tflite from the generated checkpoints, running inference with deepspeech-tflite 0.9.1 (the version we trained with) crashes in the native client and on the desktop with the same segfault error. :frowning:

Sure, will do.

Already on it, nothing so far, but will dig deeper and see if I can come up with something. Thank you.
