Error during trie creation

lissyx · April 23, 2018, 5:51pm

My guess is that one of the transcriptions used during training contains a ‘3’ somewhere. We have an issue open to help mitigate that, but nobody has picked it up so far: https://github.com/mozilla/DeepSpeech/issues/1107
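
In the meantime, a rough way to find the offending transcripts could look like the sketch below. It is not part of DeepSpeech: the function names are hypothetical, and it assumes the training CSV has a `transcript` column and that `alphabet.txt` lists one character per line, with lines starting with `#` treated as comments.

```python
import csv


def load_alphabet(path):
    """Read one character per line; '#' lines are treated as comments (assumed format)."""
    chars = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if line == "" or line.startswith("#"):
                continue
            chars.add(line)
    return chars


def find_unknown_chars(csv_path, alphabet):
    """Yield (line_number, transcript, unknown_chars) for rows using characters outside the alphabet."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # start=2 because line 1 of the CSV is the header row
        for line_number, row in enumerate(csv.DictReader(f), start=2):
            unknown = sorted(set(row["transcript"]) - alphabet)
            if unknown:
                yield line_number, row["transcript"], unknown
```

Running `find_unknown_chars` over each of the train, dev and test CSVs before building the trie should point at the rows that contain the stray ‘3’.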

pra978 · April 23, 2018, 6:17pm

Okay. I will be glad to do it. Just suggest a flow, process, or format for the tooling that would be easy to follow.

lissyx · April 24, 2018, 7:39am

Well, whatever you can do is already much better than what we have: nothing :). I have to admit I have not really thought it through, so I don’t have a hard opinion on that.

pra978 · April 25, 2018, 9:41am

Sure @lissyx, I will do it today.

Also, the model for my Indian-accent dataset finally got created, but this is the error I get when I try to run inference. I am not able to resolve it.

MacBook Pro:DeepSpeech naveen$ deepspeech /Users/naveen/Downloads/DeepSpeech/results/model_export/output_graph.pb /Users/naveen/Downloads/DeepSpeech/TEST/engtext_3488.wav /Users/naveen/Downloads/DeepSpeech/alphabet.txt

Loading model from file : /Users/naveen/Downloads/DeepSpeech/results/model_export/output_graph.pb
Loaded model in 0.089s.

Running inference.
2018-04-25 12:58:24.039593: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: NodeDef mentions attr ‘identical_element_shapes’ not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
2018-04-25 12:58:24.039664: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr ‘identical_element_shapes’ not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Error running session: Invalid argument: NodeDef mentions attr ‘identical_element_shapes’ not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
None
Inference took 0.256s for 7.920s audio file.

lissyx · April 25, 2018, 12:03pm

Check the forum, you’re not the first one. This happens because you trained with TensorFlow > r1.4 and you are using binaries built against r1.4 (like deepspeech v0.1.1).
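
As a crude sanity check on the exported graph itself, one can search the raw bytes of the file for the attribute named in the error. This is only a heuristic sketch: it assumes attribute names are stored as plain bytes in the serialized GraphDef, and the function name `graph_mentions_attr` is hypothetical.

```python
def graph_mentions_attr(pb_path, attr_name="identical_element_shapes", chunk_size=1 << 20):
    """Return True if the raw bytes of the file contain attr_name."""
    needle = attr_name.encode("ascii")
    overlap = len(needle) - 1
    tail = b""
    with open(pb_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False
            # Keep the end of the previous chunk so a match that straddles
            # a chunk boundary is not missed.
            window = tail + chunk
            if needle in window:
                return True
            tail = window[-overlap:] if overlap else b""
```

If this returns True for output_graph.pb, the graph carries an attribute that r1.4 binaries reject, so the binaries used for inference need to be at least as recent as the TensorFlow used for training.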

pra978 · April 26, 2018, 8:37am

Yes. I tried to resolve this after checking the forum, especially this :

I realize that I followed the steps on the GitHub page of DeepSpeech v0.1.1, and the binaries there are r1.4.

I also understand that the solution is to install the C++ client binaries from here:

https://tools.taskcluster.net/groups/PrzjPY-ITSK6cn9x4yr3yg/tasks/U4tyZPPEQS6bZ5RHRyORFA/runs/0/artifacts

especially this one for macOS, as I am using a Mac:
public/native_client.tar.xz

Let me know if I am correct so far.

Further, what I don’t understand is how to install it after downloading, because when I use this command:

python util/taskcluster.py --arch osx --target .

it is downloading and installing some other one. Also, if I want to use my downloaded binaries in this command, then how do I specify them?

Also, after installing the new binaries, do I recreate the trie and binary file and then train the model again?

lissyx · April 26, 2018, 9:02am

No, this taskcluster URL is an old merge. If you want the latest up-to-date binaries, use util/taskcluster.py.

I don’t understand your question:

if I want to use my downloaded binaries in this command, then how do I specify them?

It will download native_client.tar.xz and extract it, so you can use that directly …

I also don’t understand:

Also, after installing the new binaries, do I recreate the trie and binary file and then train the model again?

Why would there be any need to do so?

pra978 · April 26, 2018, 9:40am

I don’t understand this: “If you want the latest up-to-date binaries, use util/taskcluster.py.”

About the second part:

I have downloaded “public/native_client.tar.xz” into my Downloads folder from here:
https://tools.taskcluster.net/groups/PrzjPY-ITSK6cn9x4yr3yg/tasks/U4tyZPPEQS6bZ5RHRyORFA/runs/0/artifacts

My question is: how can I install this now?

To simplify:
if I use “util/taskcluster.py”, where exactly do I specify the path to my Downloads folder?
I just don’t understand which command to use.

About the third part:

I think that for trie file creation, “generate_trie” is used as the first argument, and it is located inside the native_client folder. Also, native_client has the KenLM decoder, which is why I think the model might have to be trained again.

Also, can I simply unzip the “public/native_client.tar.xz” file instead of installing it?

lissyx · April 26, 2018, 9:46am

Please make an effort and read what I am telling you.

This is exactly what I told you above: util/taskcluster.py will take care of downloading the binaries (by default from the latest master) and extracting the content of native_client.tar.xz.
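
For an archive that has already been downloaded by hand, the extraction step on its own can be sketched as below. This is not how util/taskcluster.py is implemented, only a rough manual equivalent using Python’s standard `tarfile` module; the function name `extract_native_client` is hypothetical.

```python
import tarfile


def extract_native_client(archive_path, target_dir="."):
    """Extract a .tar.xz archive into target_dir and return the member names."""
    with tarfile.open(archive_path, "r:xz") as archive:
        names = archive.getnames()
        archive.extractall(target_dir)
    return names
```

There is no separate install step in this sketch: the extracted files (deepspeech, libdeepspeech.so, generate_trie and so on) are used directly from the target directory.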

pra978 · April 26, 2018, 9:57am

Sorry. I am really trying hard to understand what you told me, and I appreciate your help a lot.

My doubt is that while setting everything up, I had already used “util/taskcluster.py” as specified on the DeepSpeech GitHub page, with this command:

python3 util/taskcluster.py --arch osx --target .

and then I successfully trained my own model, but while running inference I got an error due to the mismatch between the binaries and the TensorFlow version, as I told you.

So what exactly do I do now? What I understand is that if I have to use “util/taskcluster.py” again, then I must somehow change this command (python3 util/taskcluster.py --arch osx --target .) so that it downloads and extracts the latest binaries.

Also, do I even need to retrain my model?

lissyx · April 26, 2018, 11:04am

No, re-running will download the latest binaries from the master branch, as I said earlier. To retrain, re-run the training?

pra978 · April 26, 2018, 11:44am

Okay, so you are saying the latest binaries from the master branch are > r1.4.

I am trying to confirm because my TensorFlow is 1.6, I trained the new model only 3 days ago, and I still got the error.

lissyx · April 26, 2018, 3:04pm

It’s easy to confirm: the DeepSpeech and TensorFlow versions are now printed on standard error, so you should see v1.6.0-… If you don’t see it, then you are not downloading or running the proper binaries.
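
To make that check mechanical, the captured standard error can be scanned for anything that looks like a version tag. The sketch below assumes only that the version appears as a token of the form `v<major>.<minor>.<patch>` possibly followed by a suffix; the exact banner layout is not assumed, and `find_versions` is a hypothetical helper.

```python
import re

# Matches tokens such as "v1.6.0" or "v1.6.0-<suffix>"; the banner layout is an assumption.
VERSION_RE = re.compile(r"v\d+\.\d+\.\d+\S*")


def find_versions(stderr_text):
    """Return every token in the text that looks like a version tag."""
    return VERSION_RE.findall(stderr_text)
```

The text can be captured with `subprocess.run(cmd, stderr=subprocess.PIPE, universal_newlines=True).stderr`, where `cmd` is the same deepspeech command line used for inference. An empty result means no version was printed at all.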

pra978 · April 27, 2018, 8:18am

To download the latest binaries from the master branch, I re-ran the command:

python3 util/taskcluster.py --arch osx --target .

and it showed something like this while downloading:

Downloading https://index.taskcluster.net/v1/task/project.deepspeech.deepspeech.native_client.master.osx/artifacts/public/native_client.tar.xz …

Then I re-ran the training.

Then I got a model and ran inference, but it threw the exact same error: ‘identical_element_shapes’.

Did I miss anything here?

My TensorFlow version is still v1.6.0.

lissyx · April 27, 2018, 8:21am

I cannot help you if you provide only partial information. Please provide the full output of your whole process, and please verify that the downloaded files are the ones you are calling.
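
One way to verify which file is actually being called: compare what a bare `deepspeech` resolves to on PATH with the binary that was just downloaded. This is a minimal sketch using only the standard library; `resolve_binary` is a hypothetical helper, and `download_dir` stands for whatever directory was passed as `--target`.

```python
import os
import shutil


def resolve_binary(name="deepspeech", download_dir="."):
    """Return (path found on PATH, path in download_dir, whether both are the same file)."""
    on_path = shutil.which(name)
    downloaded = os.path.join(download_dir, name)
    same = (
        on_path is not None
        and os.path.exists(downloaded)
        and os.path.samefile(on_path, downloaded)
    )
    return on_path, downloaded, same
```

If the third value is False, typing `deepspeech` runs some other copy (possibly an older install), and the downloaded one has to be called explicitly, for example as `./deepspeech` from the download directory.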

pra978 · April 27, 2018, 8:24am

okay.

This is my output while running inference.

Loading model from file /Users/naveen/Downloads/DeepSpeech/results/model_export/output_graph.pb
Loaded model in 0.090s.
Running inference.
2018-04-27 13:38:04.067366: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: NodeDef mentions attr ‘identical_element_shapes’ not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
2018-04-27 13:38:04.067432: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr ‘identical_element_shapes’ not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Error running session: Invalid argument: NodeDef mentions attr ‘identical_element_shapes’ not in Op<name=TensorArrayV3; signature=size:int32 -> handle:resource, flow:float; attr=dtype:type; attr=element_shape:shape,default=; attr=dynamic_size:bool,default=false; attr=clear_after_read:bool,default=true; attr=tensor_array_name:string,default=""; is_stateful=true>; NodeDef: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0". (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: bidirectional_rnn/bw/bw/TensorArray_1 = TensorArrayV3clear_after_read=true, dtype=DT_FLOAT, dynamic_size=false, element_shape=[?,750], identical_element_shapes=true, tensor_array_name=“bidirectional_rnn/bw/bw/dynamic_rnn/input_0”, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
None
Inference took 0.269s for 7.920s audio file.

lissyx · April 27, 2018, 8:26am

There is no version number in this, so you are not running the latest binaries.

lissyx · April 27, 2018, 8:26am

$ python ../util/taskcluster.py --arch osx --target . 
Downloading https://index.taskcluster.net/v1/task/project.deepspeech.deepspeech.native_client.master.osx/artifacts/public/native_client.tar.xz ...
Downloading: 100%

generate_trie
libctc_decoder_with_kenlm.so
libdeepspeech.so
libdeepspeech_utils.so
LICENSE
deepspeech
README.mozilla
$ LC_ALL=C ls -hal libdeepspeech.so deepspeech 
-rwxr-xr-x 1 alex alex 26K Apr 20 00:31 deepspeech
-r-xr-xr-x 1 alex alex 75M Apr 20 00:31 libdeepspeech.so

pra978 · April 27, 2018, 8:26am

This is what I got while downloading/installing the binaries inside the DeepSpeech folder:

Prafful’s MacBook Pro:DeepSpeech naveen$ python3 util/taskcluster.py --arch osx --target .
Downloading https://index.taskcluster.net/v1/task/project.deepspeech.deepspeech.native_client.master.osx/artifacts/public/native_client.tar.xz …
Downloading: 100%

x generate_trie
x libctc_decoder_with_kenlm.so
x libdeepspeech.so
x libdeepspeech_utils.so
x LICENSE
x deepspeech
x README.mozilla

lissyx · April 27, 2018, 8:27am

$ sha1sum deepspeech libdeepspeech.so 
b0455dd60674ad49d858a161902b37d7c87f4282  deepspeech
0c3f7496930eaed4759e8a78d05b99f965ef6615  libdeepspeech.so
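
On a machine where `sha1sum` is not available, the same checksums can be computed with Python’s standard `hashlib`. This is a minimal sketch; `sha1_of_file` is a hypothetical helper name.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling it on the local deepspeech and libdeepspeech.so and comparing the results with the two hashes above shows whether the same build was downloaded.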