Failed using my own model

Hello,

I’m here following the conversation I had with “lissyx”: https://github.com/mozilla/DeepSpeech/issues/1609

What I did since that conversation:

  • git checkout v0.2.0

M bin/run-ldc93s1.sh
Note: checking out 'v0.2.0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

git checkout -b <new-branch-name>

HEAD is now at 009f9b6… Retrigger v0.2.0

  • pip3 install tensorflow
  • pip3 install deepspeech==0.2.0
  • install the native_client from the 0.2.0 branch
  • deepspeech --version: TensorFlow v1.6.0 / DeepSpeech v0.2.0

Then I retrained, used the resulting model, and got the same log as before:

deepspeech --model …/exportModels/output_graph.pb --alphabet ./data/alphabet.txt --lm ./data/lm/lm.binary --trie ./data/lm/trie --audio ./data/ldc93s1/LDC93S1.wav
Loading model from file …/exportModels/output_graph.pb
TensorFlow: v1.6.0-18-g5021473
DeepSpeech: v0.2.0-0-g009f9b6
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
Invalid argument: No OpKernel was registered to support Op 'Pack' with these attrs. Registered devices: [CPU], Registered kernels:

 [[Node: lstm_fused_cell/stack_1 = Pack[N=2, T=DT_INT32, axis=1](input_lengths, lstm_fused_cell/range_1)]]

Traceback (most recent call last):
File "/home/xa/tmp/deepspeech-venv/bin/deepspeech", line 11, in <module>
sys.exit(main())
File "/home/xa/tmp/deepspeech-venv/lib/python3.6/site-packages/deepspeech/client.py", line 81, in main
ds = Model(args.model, N_FEATURES, N_CONTEXT, args.alphabet, BEAM_WIDTH)
File "/home/xa/tmp/deepspeech-venv/lib/python3.6/site-packages/deepspeech/__init__.py", line 14, in __init__
raise RuntimeError("CreateModel failed with error code {}".format(status))
RuntimeError: CreateModel failed with error code 3

Any ideas on how to solve this? Please let me know if you need more information.

Many thanks

Why did you do this? Please use the documented pip3 install -r requirements.txt. I’d bet pip3 list shows a TensorFlow version different from 1.6.0, which would explain why your trained model is incompatible.
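For reference, a quick way to check what the training venv actually contains (a minimal sketch):

```
# List the installed TensorFlow packages inside the training virtualenv
pip3 list | grep -i tensorflow
# For DeepSpeech v0.2.0 this should report tensorflow 1.6.0; a model
# trained against a newer TensorFlow will not load in the v0.2.0 client.
```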

OK, I’m going to reset my VM, follow what the documentation says step by step, add the three things you asked me to do (git checkout v0.2.0; pip3 install -r requirements.txt; pip3 install deepspeech==0.2.0), and post my feedback.

And just for the sake of avoiding piling up risks, please use two different virtualenvs for training and running :slight_smile:

So both with the same configuration, but one to train and create the model, and the other to run the created model?

One venv to git checkout v0.2.0 && pip install -r requirements.txt and then run training; a second one to pip install deepspeech==0.2.0.
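A minimal sketch of that layout (the directory names are just examples):

```
# Venv 1: training, with the repo's pinned TensorFlow
virtualenv -p python3 ~/tmp/ds-train
source ~/tmp/ds-train/bin/activate
cd DeepSpeech && git checkout v0.2.0
pip3 install -r requirements.txt
# ... run DeepSpeech.py training here ...
deactivate

# Venv 2: inference, with only the published client
virtualenv -p python3 ~/tmp/ds-infer
source ~/tmp/ds-infer/bin/activate
pip3 install deepspeech==0.2.0
```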

@GoBa sir, I had the same issue, but it is resolved now.

The version of the DeepSpeech repo you git clone on your PC must match the version of the deepspeech package you install with pip. If they are not in sync, it throws this issue. This is my perspective only.
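One way to check that the two are in sync (a sketch, run from inside the cloned repo with the venv active):

```
# Version of the checked-out source tree
git describe --tags
# Version of the installed Python package
pip3 show deepspeech | grep Version
# The two should match (e.g. both 0.2.0); a mismatch leads to the
# "No OpKernel was registered" error above.
```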

I did this:

pip3 install deepspeech-gpu

deepspeech --model models/output_graph.pb --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio 6.wav

virtualenv -p python3 $HOME/tmp/DeepSpeech_v0.2.0/

source /home/dell/tmp/DeepSpeech_v0.2.0/bin/activate

cd git-lfs-linux-amd64-v2.5.2/
sudo ./install.sh

git clone https://github.com/mozilla/DeepSpeech

(the clone was at DeepSpeech-0.2.1-alpha.1)
cd DeepSpeech

pip3 install -r requirements.txt

python3 util/taskcluster.py --branch "v0.2.1-alpha.1" --target new_native_client/

Change requirements.txt to use tensorflow-gpu==1.11.0.
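For example, something like this (a sketch; the exact pinned line in requirements.txt may differ):

```
# Swap the pinned CPU TensorFlow for the GPU build, then reinstall
sed -i 's/^tensorflow==.*/tensorflow-gpu==1.11.0/' requirements.txt
pip3 install -r requirements.txt
```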

build checkpoint:

python3 DeepSpeech.py --n_hidden 2048 --initialize_from_frozen_model …/models/output_graph.pb --checkpoint_dir fine_tuning_checkpoints --epoch 3 --train_files audio_folder/audio_file_train.csv --dev_files audio_folder/audio_file_dev.csv --test_files audio_folder/audio_file_test.csv --learning_rate 0.0001 --decoder_library_path new_native_client/libctc_decoder_with_kenlm.so --alphabet_config_path data/alphabet.txt --lm_binary_path data/lm/lm.binary --lm_trie_path data/lm/trie

export .pb model:

python3 DeepSpeech.py --n_hidden 2048 --initialize_from_frozen_model …/models/output_graph.pb --checkpoint_dir fine_tuning_checkpoints --epoch 3 --train_files audio_folder/audio_file_train.csv --dev_files audio_folder/audio_file_dev.csv --test_files audio_folder/audio_file_test.csv --learning_rate 0.0001 --decoder_library_path new_native_client/libctc_decoder_with_kenlm.so --alphabet_config_path data/alphabet.txt --lm_binary_path data/lm/lm.binary --lm_trie_path data/lm/trie --export_dir funetune_export/

deepspeech --model funetune_export/output_graph.pb --alphabet …/models/alphabet.txt --lm …/models/lm.binary --trie …/models/trie --audio …/6.wav

(DeepSpeech_v0.2.0) dell@dell-OptiPlex-7050:~/Documents/DeepSpeech$ deepspeech --model funetune_export/output_graph.pb --alphabet …/models/alphabet.txt --lm …/models/lm.binary --trie …/models/trie --audio …/6.wav
Loading model from file funetune_export/output_graph.pb
TensorFlow: v1.6.0-18-g5021473
DeepSpeech: v0.2.0-0-g009f9b6
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2018-10-01 23:42:40.476190: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-10-01 23:42:40.558563: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-10-01 23:42:40.558907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.43
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 3.57GiB
2018-10-01 23:42:40.558918: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-10-01 23:42:40.684696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3311 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Invalid argument: No OpKernel was registered to support Op 'Pack' with these attrs. Registered devices: [CPU,GPU], Registered kernels:

 [[Node: lstm_fused_cell/stack_1 = Pack[N=2, T=DT_INT32, axis=1](input_lengths, lstm_fused_cell/range_1)]]

Traceback (most recent call last):
File "/home/dell/tmp/DeepSpeech_v0.2.0/bin/deepspeech", line 11, in <module>
sys.exit(main())
File "/home/dell/tmp/DeepSpeech_v0.2.0/lib/python3.5/site-packages/deepspeech/client.py", line 81, in main
ds = Model(args.model, N_FEATURES, N_CONTEXT, args.alphabet, BEAM_WIDTH)
File "/home/dell/tmp/DeepSpeech_v0.2.0/lib/python3.5/site-packages/deepspeech/__init__.py", line 14, in __init__
raise RuntimeError("CreateModel failed with error code {}".format(status))
RuntimeError: CreateModel failed with error code 3

solution:
pip3 install deepspeech-gpu==0.2.1-alpha.1

(DeepSpeech_v0.2.0) dell@dell-OptiPlex-7050:~/Documents/DeepSpeech$ deepspeech --model funetune_export/output_graph.pb --alphabet data/alphabet.txt --lm data/lm/lm.binary --trie data/lm/trie --audio …/6.wav
Loading model from file funetune_export/output_graph.pb
TensorFlow: v1.11.0-rc2-4-g77b7b17
DeepSpeech: v0.2.1-alpha.1-0-gae2cfe0
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2018-10-02 00:16:42.960099: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-10-02 00:16:43.035727: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-10-02 00:16:43.036300: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.43
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 3.50GiB
2018-10-02 00:16:43.036311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2018-10-02 00:16:43.238818: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-02 00:16:43.238844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0
2018-10-02 00:16:43.238849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N
2018-10-02 00:16:43.239010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3234 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Loaded model in 0.382s.
Loading language model from files data/lm/lm.binary data/lm/trie
Loaded language model in 3.47s.
Running inference.

Inference took 2.352s for 4.362s audio file.
:slightly_smiling_face::slightly_smiling_face::slightly_smiling_face:


Sir,

I will let you know how everything goes following what lissyx told me. The problem with your solution is that I don’t have an Nvidia GPU. I have an AMD one, so I can’t use the GPU package. However, I appreciate your help and I thank you :slight_smile: !

This time I can’t even train.

So what I did:

  • sudo apt-get install python3.6
  • alias python='/usr/bin/python3.6'
  • . ~/.bashrc
  • sudo apt install git
  • tar xzvf git-lfs-linux-amd64-v2.5.2.tar.gz
  • ./install.sh
  • sudo ./install.sh
  • git lfs install
  • git clone https://github.com/mozilla/DeepSpeech
  • wget -O - https://github.com/mozilla/DeepSpeech/releases/download/v0.2.0/deepspeech-0.2.0-models.tar.gz | tar xvfz -
  • sudo apt install virtualenv
  • virtualenv -p python3.6 $HOME/tmp/deepspeech-venv/
  • source $HOME/tmp/deepspeech-venv/bin/activate
  • pip3 install deepspeech
  • cd DeepSpeech/
  • pip3 install -r requirements.txt
  • git checkout v0.2.0
  • pip3 install -r requirements.txt
  • python3 util/taskcluster.py --target .
  • ./bin/run-ldc93s1.sh

What I get:

  • [ ! -f DeepSpeech.py ]
  • [ ! -f data/ldc93s1/ldc93s1.csv ]
  • [ -d ]
  • python -c 'from xdg import BaseDirectory as xdg; print(xdg.save_data_path("deepspeech/ldc93s1"))'
  • checkpoint_dir=/home/xa/.local/share/deepspeech/ldc93s1
  • python -u DeepSpeech.py --train_files data/ldc93s1/ldc93s1.csv --dev_files data/ldc93s1/ldc93s1.csv --test_files data/ldc93s1/ldc93s1.csv --train_batch_size 1 --dev_batch_size 1 --test_batch_size 1 --n_hidden 494 --epoch 75 --checkpoint_dir /home/xa/Bureau/checkpoint/ldc93s1 --decoder_library_path ./libctc_decoder_with_kenlm.so --export_dir /home/xa/Bureau/exportModel
    Traceback (most recent call last):
    File "DeepSpeech.py", line 1976, in <module>
    tf.app.run(main)
    File "/home/xa/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
    File "DeepSpeech.py", line 1927, in main
    initialize_globals()
    File "DeepSpeech.py", line 336, in initialize_globals
    custom_op_module = tf.load_op_library(FLAGS.decoder_library_path)
    File "/home/xa/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
    File "/home/xa/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
    tensorflow.python.framework.errors_impl.NotFoundError: ./libctc_decoder_with_kenlm.so: undefined symbol: _ZN10tensorflow6StatusC1ENS_5error4CodeEN4absl11string_viewE

Check the documentation; this downloaded the latest master, which is now based on TensorFlow r1.11, while 0.2.0 is r1.6. That explains the symbol not found.
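One rough way to tell which TensorFlow a downloaded library was built against (a sketch; the absl check is only a heuristic based on the missing symbol above):

```
# The v0.2.0 decoder library is built against TensorFlow r1.6 and should
# not reference absl::string_view, which appears in r1.11 builds.
nm -D libctc_decoder_with_kenlm.so | grep -c absl
# A non-zero count suggests the library came from master (r1.11),
# not from the v0.2.0 branch (r1.6).
```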

So I need to do:

  • python3 util/taskcluster.py --branch v0.2.0 --target .

in order to get the right native client? I’m sorry for asking so much, but I’ve been blocked for a while.

Yes. Also, please try to use proper code formatting; otherwise it’s painful to read and we can miss information that gets interpreted as message formatting.

It seems to be working, but it takes a lot of memory to run inference. I’m going to run some tests over the next two weeks and I will keep in touch with you.

In any case, I would like to thank you all for your help!

Can you be more precise?

Well, I gave it about 11 GB of RAM, and when I ran inference the VM froze and I had to shut it down. I gave it 2 more GB and it works, but that seems like a lot.

Strange. I just verified, and valgrind --tool=massif reports a heap allocation of ~650 MB.
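For anyone who wants to reproduce that kind of measurement, a minimal sketch (the model and audio paths are just examples from earlier in the thread):

```
# Measure peak heap usage of the inference client under massif
valgrind --tool=massif deepspeech --model exportModels/output_graph.pb \
    --alphabet data/alphabet.txt --audio data/ldc93s1/LDC93S1.wav
# valgrind writes massif.out.<pid>; summarize the snapshots with:
ms_print massif.out.*
```

The “reading entire model file into memory” warning in the logs above also suggests that converting the graph to an mmapped format would reduce heap usage further.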

Not in my case; it went up really fast. But I will give you feedback from my future tests.

Without language model: [screenshot of inference output]

With language model: [screenshot of inference output]

How do I make two different virtualenvs?