Inference error: Shapes of all inputs must match

Hello!

I’m facing some problems during inference. I trained my model with

python -u DeepSpeech.py \
  --train_files my_resources/train.csv \
  --dev_files my_resources/dev.csv \
  --test_files my_resources/test.csv \
  --train_batch_size 64 \
  --dev_batch_size 64 \
  --test_batch_size 64 \
  --export_batch_size 1 \
  --n_hidden 1024 \
  --epochs 0 \
  --learning_rate 0.0001 \
  --alphabet_config_path my_resources/alphabet.txt \
  --lm_binary_path my_resources/spanish-models/new_lm.binary \
  --lm_trie_path my_resources/spanish-models/new_trie \
  --automatic_mixed_precision=True \
  --use_cudnn_rnn=True \
  --checkpoint_dir my_resources/checkpoints \
  --export_dir my_resources/models \
  --export_language es \
  --report_count 20 \
  --summary_dir my_resources/summaries \
  --test_output_file my_resources/models/model_test_output.txt \
  --noearly_stop \
  --load best \
  --dropout_rate 0.20

with the following graph (image attachment not reproduced here):

And the results are not as horrendous as I expected:

                    src                                 res
0              ayudenme                         ayuda en me
1       no regularmente  ahora si estoy tomando me lamentos
2          posiblemente                       posible mente
3                  copo                            con poco
4  capitulo veintiseis                   la futuro vinci se

However, when I run

python native_client/python/client.py \
              --model my_resources/models/output_graph.pbmm \
              --lm my_resources/spanish-models/new_lm.binary \
              --trie my_resources/spanish_models/new_trie \
              --audio my_resources/common_voice_es_18933587.wav \
              --lm_alpha 0.75 \
              --lm_beta 1.85 \
              --beam_width 1024

It returns:

+ python native_client/python/client.py --model my_resources/models/output_graph.pbmm --lm my_resources/spanish-models/new_lm.binary --trie my_resources/spanish_models/new_trie --audio my_resources/common_voice_es_18933587.wav --lm_alpha 0.75 --lm_beta 1.85 --beam_width 1024
Loading model from file my_resources/models/output_graph.pbmm
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.1-0-g3df20fe
2020-01-22 14:39:41.522375: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-01-22 14:39:41.523505: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-01-22 14:39:41.545293: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 14:39:41.545575: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.845
pciBusID: 0000:02:00.0
2020-01-22 14:39:41.545583: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-01-22 14:39:41.545634: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 14:39:41.545850: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 14:39:41.546251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-01-22 14:39:41.788283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-22 14:39:41.788301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2020-01-22 14:39:41.788305: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2020-01-22 14:39:41.788375: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 14:39:41.788594: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 14:39:41.788794: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 14:39:41.788980: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:40] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2020-01-22 14:39:41.788994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6933 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:02:00.0, compute capability: 7.5)
Loaded model in 0.274s.
Loading language model from files my_resources/spanish-models/new_lm.binary my_resources/spanish_models/new_trie
Loaded language model in 0.242s.
Running inference.
Error running session: Invalid argument: 2 root error(s) found.
  (0) Invalid argument: Shapes of all inputs must match: values[0].shape = [1] != values[1].shape = [64]
	 [[{{node cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/stack_1}}]]
	 [[logits/_87]]
  (1) Invalid argument: Shapes of all inputs must match: values[0].shape = [1] != values[1].shape = [64]
	 [[{{node cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/stack_1}}]]
0 successful operations.
0 derived errors ignored.
Error running session: Invalid argument: 2 root error(s) found.
  (0) Invalid argument: Shapes of all inputs must match: values[0].shape = [1] != values[1].shape = [64]
	 [[{{node cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/stack_1}}]]
	 [[logits/_87]]
  (1) Invalid argument: Shapes of all inputs must match: values[0].shape = [1] != values[1].shape = [64]
	 [[{{node cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/stack_1}}]]
0 successful operations.
0 derived errors ignored.
Error running session: Invalid argument: 2 root error(s) found.
  (0) Invalid argument: Shapes of all inputs must match: values[0].shape = [1] != values[1].shape = [64]
	 [[{{node cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/stack_1}}]]
	 [[new_state_h/_91]]
  (1) Invalid argument: Shapes of all inputs must match: values[0].shape = [1] != values[1].shape = [64]
	 [[{{node cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/stack_1}}]]
0 successful operations.
0 derived errors ignored.
[... the same two errors, alternating between logits/_87 and new_state_h/_91, repeat several more times ...]

Inference took 0.325s for 2.904s audio file.

What could I have done wrong? The audio is one from the training set. It looks as if the model architecture were not compatible with the weights. I trained mine with --n_hidden 1024, but native_client/python/client.py has no n_hidden flag.
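
One way to double-check which batch size got baked into the export is to read the placeholder shapes out of the graph itself. A minimal sketch, assuming TensorFlow 1.14 and the non-memory-mapped output_graph.pb that gets exported next to the .pbmm:

import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with open('my_resources/models/output_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# A 64 in the batch dimension of any placeholder would explain the
# [1] != [64] mismatch in the errors above.
for node in graph_def.node:
    if node.op == 'Placeholder':
        dims = [d.size for d in node.attr['shape'].shape.dim]
        print(node.name, dims)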

Thanks in advance!

You are not supposed to run that directly; please use the deepspeech command from the installed deepspeech Python wheel instead.

Please also make sure you use matching versions at training time and at inference time.
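
For example, assuming the --version flag that the 0.6 client ships with, you can compare them with something like:

pip show deepspeech
deepspeech --version
git describe --tags    # inside the DeepSpeech training checkout

Both should report the same v0.6.x release.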

I think this is more related to your batch size arguments.

How do you get any training when you specify 0 epochs?

I used 0 epochs, starting from the checkpoint, in order to change export_batch_size from 64 to 1 after having trained for 75 epochs with export_batch_size 64, and to check whether that was the problem. It didn’t work with 0 epochs, but it worked once I trained for 1 more epoch.

+ deepspeech --model my_resources/models/output_graph.pbmm --lm my_resources/spanish-models/new_lm.binary --trie my_resources/spanish_models/new_trie --audio my_resources/common_voice_es_18933587.wav --lm_alpha 0.75 --lm_beta 1.85 --beam_width 1024
Loading model from file my_resources/models/output_graph.pbmm
TensorFlow: v1.14.0-21-ge77504a
DeepSpeech: v0.6.1-0-g3df20fe
2020-01-22 16:01:55.377395: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-01-22 16:01:55.386260: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-01-22 16:01:55.408923: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 16:01:55.409163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.845
pciBusID: 0000:02:00.0
2020-01-22 16:01:55.409177: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-01-22 16:01:55.409200: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 16:01:55.409422: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 16:01:55.409632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-01-22 16:01:55.652865: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-22 16:01:55.652882: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2020-01-22 16:01:55.652886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2020-01-22 16:01:55.652954: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 16:01:55.653171: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 16:01:55.653367: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-01-22 16:01:55.653548: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:40] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2020-01-22 16:01:55.653562: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6869 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:02:00.0, compute capability: 7.5)
Loaded model in 0.288s.
Loading language model from files my_resources/spanish-models/new_lm.binary my_resources/spanish_models/new_trie
Loaded language model in 0.234s.
Running inference.
no quedar cuenta pero te utilizar
Inference took 0.594s for 2.904s audio file.

Sorry for the confusion and thanks for the help :smiley:

So if I train a model with export_batch_size 1, won’t I be able to run inference on batches? Is it possible to pad the batches with empty audio?

I think you’re mixing things up a bit. You don’t have to train in order to re-export with different options. Just don’t specify --train_files and it will skip the training phase (same for dev and test). You only need to specify the flags relevant for exporting: n_hidden, checkpoint_dir and export_dir. Then it goes straight to the export; see the sketch below.
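
For example, a re-export-only run could look like this (a sketch reusing the paths and values from the training command above; only n_hidden, checkpoint_dir and export_dir are strictly required):

python -u DeepSpeech.py \
  --n_hidden 1024 \
  --alphabet_config_path my_resources/alphabet.txt \
  --export_batch_size 1 \
  --export_language es \
  --checkpoint_dir my_resources/checkpoints \
  --export_dir my_resources/models

With no --train_files, --dev_files or --test_files, it loads the checkpoint and goes straight to the export.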
