Error during trie creation

I have created all the files necessary for training an Indian English model, and now I am stuck at trie creation.

I ran this command:

/Users/naveen/Downloads/DeepSpeech/generate_trie /Users/naveen/Downloads/DeepSpeech/alphabet.txt /Users/naveen/Downloads/DeepSpeech/lm.binary /Users/naveen/Downloads/DeepSpeech/vocabulary.txt /Users/naveen/Downloads/DeepSpeech/trie

I get this error:

Illegal Instruction (core dumped)

Can anyone please help me?

@reuben @kdavis @elpimous_robot @lissyx

Your system lacks some of the instructions in the binaries. What’s your CPU?

Thanks for the quick response. This is my Linux configuration:

-cpu
product: Intel® Pentium® CPU N3540 @ 2.16GHz
vendor: Intel Corp.
physical id: 1
bus info: cpu@0
size: 2640MHz
capacity: 2665MHz
width: 64 bits

Earlier I was working on macOS, where I failed; then here on Linux I got the above error again; now I have switched back to macOS and successfully generated the trie.

Also, I would like to bring to your notice that this command:

“python3 util/taskcluster.py --arch osx --target .” (macOS specific command)
is not mentioned on that page in the “training” section:

Instead, it is mentioned earlier somewhere else, which is confusing and made me believe that “python3 util/taskcluster.py --target .” is the default command to download pre-built binaries for any OS.

I mention this because I got stuck with this error earlier while working on macOS: “cannot execute binary file”, since I had downloaded using “python3 util/taskcluster.py --target .”.

I hope this will be helpful to someone in the future.

Two things:

  1. Your CPU seems to lack AVX, according to Intel’s website. So Linux or macOS, same deal.
  2. python util/taskcluster.py --help should document the usage and thus tell you about osx.

Okay @lissyx

Thank you.

I finally started the training on my datasets.

I got this error, though training is still running. Do you know why this is happening? It seems to have something to do with DeepSpeech’s built-in files.


Prafful’s MacBook Pro:DeepSpeech naveen$ sh /Users/naveen/Downloads/DeepSpeech/DeepSpeech/run_file.sh

  • '[' '!' -f DeepSpeech.py ']'
  • python -u DeepSpeech.py --train_files /Users/naveen/Downloads/DeepSpeech/train/train.csv --dev_files /Users/naveen/Downloads/DeepSpeech/dev/dev.csv --test_files /Users/naveen/Downloads/DeepSpeech/test/test.csv --train_batch_size 80 --dev_batch_size 80 --test_batch_size 40 --n_hidden 375 --epoch 33 --validation_step 1 --early_stop True --earlystop_nsteps 6 --estop_mean_thresh 0.1 --estop_std_thresh 0.1 --dropout_rate 0.22 --learning_rate 0.00095 --report_count 100 --use_seq_length False --export_dir /Users/naveen/Downloads/DeepSpeech/results/model_export/ --checkpoint_dir /Users/naveen/Downloads/DeepSpeech/results/checkout/ --decoder_library_path /Users/naveen/Downloads/DeepSpeech/DeepSpeech/libctc_decoder_with_kenlm.so --alphabet_config_path /Users/naveen/Downloads/DeepSpeech/alphabet.txt --lm_binary_path /Users/naveen/Downloads/DeepSpeech/lm.binary --lm_trie_path /Users/naveen/Downloads/DeepSpeech/trie
    /anaconda3/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
    from ._conv import register_converters as _register_converters
    I STARTING Optimization
    I Training of Epoch 0 - loss: 195.601520
    Exception in thread Thread-6:
    Traceback (most recent call last):
    File "/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
    File "/anaconda3/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
    File "/Users/naveen/Downloads/DeepSpeech/DeepSpeech/util/feeding.py", line 148, in _populate_batch_queue
    target = text_to_char_array(transcript, self._alphabet)
    File "/Users/naveen/Downloads/DeepSpeech/DeepSpeech/util/text.py", line 40, in text_to_char_array
    return np.asarray([alphabet.label_from_string(c) for c in original])
    File "/Users/naveen/Downloads/DeepSpeech/DeepSpeech/util/text.py", line 40, in <listcomp>
    return np.asarray([alphabet.label_from_string(c) for c in original])
    File "/Users/naveen/Downloads/DeepSpeech/DeepSpeech/util/text.py", line 30, in label_from_string
    return self._str_to_label[string]
    KeyError: '3'

I Validation of Epoch 0 - loss: 159.338902


It seems more like an issue where you have a mismatch between your text transcriptions and your alphabet.
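For context, the failing call in the stack trace is just a dictionary lookup from character to integer label. Here is a minimal sketch of that mechanism (the names mirror the trace, but the bodies are simplified assumptions, not DeepSpeech’s actual util/text.py code):

```python
# Illustrative sketch of the label lookup that fails in util/text.py.
# The class and method names mirror the stack trace; the bodies are
# simplified assumptions, not DeepSpeech's exact implementation.
class Alphabet:
    def __init__(self, chars):
        # Map each alphabet character to an integer label.
        self._str_to_label = {c: i for i, c in enumerate(chars)}

    def label_from_string(self, string):
        # Raises KeyError when a transcript character is not in the alphabet.
        return self._str_to_label[string]

def text_to_char_array(original, alphabet):
    # One integer label per transcript character.
    return [alphabet.label_from_string(c) for c in original]

alphabet = Alphabet("abcdefghijklmnopqrstuvwxyz ")
text_to_char_array("hit by the stone", alphabet)   # fine: every char is known
# text_to_char_array("engtext 3", alphabet)        # KeyError: '3'
```

So a single digit anywhere in a transcript is enough to kill the feeding thread, exactly as in the traceback above.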

Hey @lissyx, I tried to check for a mismatch between the text transcriptions and the alphabet, but I was not able to resolve this.

I am using the alphabet.txt file used for the DeepSpeech pre-trained model, as my text transcripts only contain English letters (a to z).

Further, regarding my text transcriptions, this is a snippet of how my CSV files look (a header row followed by three data rows):

wav_filename,wav_filesize,transcript
/Users/naveen/Downloads/DeepSpeech/TEST/engtext_3488.wav,253470,hit by the stone the kite released its prey and the mouse at once ran to the sage asking him for protection
/Users/naveen/Downloads/DeepSpeech/TEST/engtext_3489.wav,202702,the kite addressed sage and said sage you have hit me with a stone which is not proper
/Users/naveen/Downloads/DeepSpeech/TEST/engtext_3490.wav,167212,are you not afraid of god surrender that mouse to me or you will go to hell

@elpimous_robot, do you have any suggestions on this?

Hi.
It seems correct.

Just a snippet is not useful. Try instrumenting util/text.py to provide more context on the source of the error. According to the stack trace, you lack ‘3’ in the alphabet, and we have no numbers at all in data/alphabet.txt.
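As a starting point for such instrumentation, a small standalone check can scan the training CSVs for characters missing from alphabet.txt. This is a hypothetical helper, not part of DeepSpeech; it assumes the wav_filename,wav_filesize,transcript CSV layout used in this thread and the one-character-per-line alphabet format (with # comment lines):

```python
# Hypothetical helper: report CSV rows whose transcript contains
# characters that are not in alphabet.txt. Paths and the CSV layout
# (wav_filename,wav_filesize,transcript) follow the examples in this thread.
import csv

def load_alphabet(path):
    # One character per line; lines starting with '#' are treated as comments,
    # matching the data/alphabet.txt format.
    chars = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.startswith("#"):
                continue
            chars.add(line.rstrip("\n"))
    return chars

def find_bad_rows(csv_path, alphabet):
    # Returns (wav_filename, sorted list of offending characters) per bad row.
    bad = []
    with open(csv_path, encoding="utf-8") as f:
        for row in csv.DictReader(f):
            missing = {c for c in row["transcript"] if c not in alphabet}
            if missing:
                bad.append((row["wav_filename"], sorted(missing)))
    return bad
```

Running find_bad_rows over train.csv, dev.csv and test.csv before training would surface the ‘3’ (and any other stray digits) up front instead of mid-epoch.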

Does it imply that any of my input files (text transcriptions, binary, trie file) might contain ‘3’ or other numbers along with letters?

That’s my guess: you have some transcription used during training that has a ‘3’ somewhere. We have an issue open to help mitigate that, but nobody has picked it up so far :frowning: https://github.com/mozilla/DeepSpeech/issues/1107

Okay, I will be glad to do it. Just suggest a flow/process/format for the tooling that will be easy to follow.

Well, whatever you can do is already much better than what we have: nothing :). I have to admit I have not really thought it through, so I don’t have a strong opinion on it.

Sure @lissyx, I will do it today.

Also, my Indian-accent dataset model finally got created, but this is the error I am getting while trying to run inference. I am not able to resolve it.

MacBook Pro:DeepSpeech naveen$ deepspeech /Users/naveen/Downloads/DeepSpeech/results/model_export/output_graph.pb /Users/naveen/Downloads/DeepSpeech/TEST/engtext_3488.wav /Users/naveen/Downloads/DeepSpeech/alphabet.txt

Loading model from file : /Users/naveen/Downloads/DeepSpeech/results/model_export/output_graph.pb
Loaded model in 0.089s.

Running inference.
2018-04-25 12:58:24.039593: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3[clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name="bidirectional_rnn/bw/bw/dynamic_rnn/input_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)
2018-04-25 12:58:24.039664: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3[clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name="bidirectional_rnn/bw/bw/dynamic_rnn/input_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)
[[Node: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3[clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name="bidirectional_rnn/bw/bw/dynamic_rnn/input_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]
Error running session: Invalid argument: NodeDef mentions attr 'identical_element_shapes' not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3[clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name="bidirectional_rnn/bw/bw/dynamic_rnn/input_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)
[[Node: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3[clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name="bidirectional_rnn/bw/bw/dynamic_rnn/input_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]
None
Inference took 0.256s for 7.920s audio file.

Check on the forum, you’re not the first one: this is because you trained with TensorFlow > r1.4 and you are using binaries that are r1.4 (like deepspeech v0.1.1).

Yes. I tried to resolve this after checking the forum, especially this:

I realize that I followed the steps on the GitHub page of DeepSpeech v0.1.1, and the binaries there are r1.4.

I also understand the solution to this is installing the C++ client binaries from here:

https://tools.taskcluster.net/groups/PrzjPY-ITSK6cn9x4yr3yg/tasks/U4tyZPPEQS6bZ5RHRyORFA/runs/0/artifacts

especially this one, for macOS, as I am using a Mac:
public/native_client.tar.xz

Let me know if I am correct up to here.

Further, what I don’t understand is how I will install it after downloading, because when I use this command:

python util/taskcluster.py --arch osx --target .

it downloads and installs some other one. Also, if I want to use my downloaded binaries with this command, how do I specify that?

Also, after installing the new binaries, do I need to recreate the trie and binary file and then train the model again?

No, this taskcluster URL is an old merge. If you want the latest, up-to-date binaries, use util/taskcluster.py.

I don’t understand your question:

if I want to use my downloaded binaries in this command then how do I specify it?

It will download native_client.tar.xz and extract it, so you can use that directly 


I also don’t understand:

Also, after installing the new binaries, do I recreate the trie and binary file and then train the model again?

Why would there be any need to do so?

I don’t understand this: “If you want the latest, up-to-date binaries, use util/taskcluster.py.”

About the second part:

I have downloaded “public/native_client.tar.xz” into my Downloads folder from here:
https://tools.taskcluster.net/groups/PrzjPY-ITSK6cn9x4yr3yg/tasks/U4tyZPPEQS6bZ5RHRyORFA/runs/0/artifacts

My question is: how can I now install this?

To simplify:
if I use “util/taskcluster.py”, where exactly do I specify the path to my Downloads folder?
I just cannot understand which command to use.

About the third part:

I think that for trie file creation, “generate_trie” (located inside the native_client folder) is used as the first argument. Also, native_client has the kenlm decoder; that’s why I think the model might have to be trained again.

Also, can I simply unzip the “public/native_client.tar.xz” file instead of installing it?

Please make an effort and read what I am telling you.

This is exactly what I told you above: util/taskcluster.py will take care of downloading binaries (by default from the latest master) and extracting the content of native_client.tar.xz.
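Mechanically, what that amounts to is roughly the sketch below: fetch native_client.tar.xz and unpack it into the --target directory. Only the extraction half is shown here (the download and --arch selection are abstracted away, and the function name extract_native_client is made up for illustration, not taskcluster.py’s actual API):

```python
# Hypothetical sketch of the extraction step util/taskcluster.py performs:
# unpack a downloaded native_client.tar.xz into the --target directory.
import tarfile

def extract_native_client(archive_path, target_dir):
    # .tar.xz archives are handled by the stdlib via the "r:xz" mode.
    with tarfile.open(archive_path, mode="r:xz") as tar:
        tar.extractall(path=target_dir)
```

After extraction, binaries such as generate_trie sit directly in the target directory; there is no separate “install” step beyond unpacking.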


Sorry. I am really trying hard to understand what you told me, and I appreciate your help a lot.

My doubt is that while setting up everything, I had already used “util/taskcluster.py” as specified on the DeepSpeech GitHub page, with the command:

python3 util/taskcluster.py --arch osx --target .

and then I successfully trained my own model, but while running inference I got an error due to a mismatch between the binaries and the TensorFlow version, as I told you.

Now, what exactly do I do? What I understand is that if I have to use “util/taskcluster.py” again, then I must change this command (python3 util/taskcluster.py --arch osx --target .) somehow so that it downloads and extracts the latest binaries.

Also, do I even need to retrain my model?