Testing fails with evaluate.py

Hi, I’m trying to test one of my models, but evaluate.py fails.

(tensorflow_p36) ubuntu@ip-172-31-30-60:~/OwnLanguage/DeepSpeech$ python evaluate.py --checkpoint_dir Spanska2 --test_files Spanska/clips/test.csv
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/__init__.py:1467: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.

I Loading best validating checkpoint from Spanska2/best_dev-1159
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
Traceback (most recent call last):
  File "evaluate.py", line 156, in <module>
    absl.app.run(main)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "evaluate.py", line 147, in main
    samples = evaluate(FLAGS.test_files.split(','), create_model)
  File "evaluate.py", line 89, in evaluate
    load_or_init_graph(session, method_order)
  File "/home/ubuntu/OwnLanguage/DeepSpeech/util/checkpoints.py", line 103, in load_or_init_graph
    return _load_checkpoint(session, ckpt_path)
  File "/home/ubuntu/OwnLanguage/DeepSpeech/util/checkpoints.py", line 70, in _load_checkpoint
    v.load(ckpt.get_tensor(v.op.name), session=session)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/variables.py", line 1033, in load
    session.run(self.initializer, {self.initializer.inputs[1]: value})
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1156, in _run
    (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (4096,) for Tensor 'cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Initializer/Const:0', which has shape '(8192,)'

It works if I use the DeepSpeech command like this:
python DeepSpeech.py --test_files Spanska/clips/test.csv --n_hidden 1024 --test_batch_size 512 --dropout_rate 0.1 --checkpoint_dir Spanska2 --audio_sample_rate 16000 --export_dir Spanska2
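
A note on the shapes in the error, offered as a possible reading rather than a confirmed diagnosis: an LSTM cell has four gates, so its bias vector normally has 4 * n_hidden entries. Under that assumption, the checkpoint's bias of 4096 matches the --n_hidden 1024 used in the working command, while the 8192 expected by the graph would correspond to 2048 units, which may be what evaluate.py falls back to when --n_hidden is not passed. The helper name below is hypothetical and only illustrates the arithmetic.

```python
# Sketch: infer the LSTM width implied by the length of a bias vector.
# Assumption: the usual four-gate LSTM layout, so bias length = 4 * n_hidden.
LSTM_GATES = 4


def hidden_units_from_bias(bias_len: int) -> int:
    """Return the number of hidden units implied by an LSTM bias length."""
    if bias_len % LSTM_GATES != 0:
        raise ValueError(f"bias length {bias_len} is not a multiple of {LSTM_GATES}")
    return bias_len // LSTM_GATES


# Shapes taken from the ValueError above.
checkpoint_units = hidden_units_from_bias(4096)  # value stored in the checkpoint
graph_units = hidden_units_from_bias(8192)       # shape the evaluate.py graph expects

print(checkpoint_units, graph_units)  # 1024 2048
```

If that reading is right, passing the same --n_hidden value to evaluate.py that was used for training would be the first thing to try.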

  1. Is there a way to get a graph of the loss and WER for each epoch?

My current command just gives me the following when it is done:

(tensorflow_p36) ubuntu@ip-172-31-30-60:~/OwnLanguage/DeepSpeech$ python DeepSpeech.py --test_files Spanska/clips/test.csv --n_hidden 1024 --test_batch_size 512 --dropout_rate 0.1 --checkpoint_dir Spanska2 --audio_sample_rate 16000 --export_dir Spanska2
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/__init__.py:1467: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.

I Loading best validating checkpoint from Spanska2/best_dev-1159
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on Spanska/clips/test.csv
Test epoch | Steps: 6 | Elapsed Time: 0:09:54                                                                               
Test on Spanska/clips/test.csv - WER: 1.000000, CER: 0.643806, loss: 63.413624
--------------------------------------------------------------------------------
Best WER: 
--------------------------------------------------------------------------------
WER: 0.500000, CER: 0.538462, loss: 11.808677
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19655988.wav
 - src: "es inevitable"
 - res: "es esta"
--------------------------------------------------------------------------------
WER: 0.500000, CER: 0.428571, loss: 11.046744
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19743024.wav
 - src: "hijo de"
 - res: "j de"
--------------------------------------------------------------------------------
WER: 0.500000, CER: 0.357143, loss: 10.889674
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19105155.wav
 - src: "ella no te ama"
 - res: "la te ama"
--------------------------------------------------------------------------------
WER: 0.500000, CER: 0.625000, loss: 6.412522
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19606260.wav
 - src: "es árido"
 - res: "es a"
--------------------------------------------------------------------------------
WER: 0.500000, CER: 0.444444, loss: 1.378866
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19658843.wav
 - src: "la perche"
 - res: "la ce"
--------------------------------------------------------------------------------
Median WER: 
--------------------------------------------------------------------------------
WER: 1.230769, CER: 0.650000, loss: 76.885765
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19978943.wav
 - src: "su padre lucio portales natural de huánuco fue violinista y director de orquesta"
 - res: "su ad l u c r as a ra e a u c u l d stal"
--------------------------------------------------------------------------------
WER: 1.230769, CER: 0.567568, loss: 67.122025
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19678121.wav
 - src: "además fue el descubridor de michael landon el cual actuaba en la película"
 - res: "a d m u e l s cue mc ad l clac caa la te le cul"
--------------------------------------------------------------------------------
WER: 1.230769, CER: 0.594203, loss: 57.308815
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19487873.wav
 - src: "el escritor thomas de quincey fue uno de muchos que acudieron a verla"
 - res: "l e ta ra ta as ecce fu u a e us caa d r la"
--------------------------------------------------------------------------------
WER: 1.230769, CER: 0.552239, loss: 54.113281
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19694610.wav
 - src: "por lo tanto nos movemos de un pasado definido a un futuro incierto"
 - res: "ta ra ta ta la me m s e u a sad ed a u t set"
--------------------------------------------------------------------------------
WER: 1.230769, CER: 0.579710, loss: 45.881577
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_20061372.wav
 - src: "ha sido decano de la escuela de medicina de la universidad de alberta"
 - res: "a c d e a e l e s c a e mc e l esddsea"
--------------------------------------------------------------------------------
Worst WER: 
--------------------------------------------------------------------------------
WER: 4.000000, CER: 0.695652, loss: 38.546658
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19657102.wav
 - src: "títulos internacionales"
 - res: "e t a s t e a saal"
--------------------------------------------------------------------------------
WER: 4.000000, CER: 0.520000, loss: 7.751944
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19653875.wav
 - src: "complicaciones comentadas"
 - res: "m le cas es ca me ta as"
--------------------------------------------------------------------------------
WER: 4.250000, CER: 1.000000, loss: 95.440102
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_19945106.wav
 - src: "actualmente vive en atlanta"
 - res: "d a a a a a r le te d e a t a a s e "
--------------------------------------------------------------------------------
WER: 4.333333, CER: 1.428571, loss: 80.769714
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_18430819.wav
 - src: "pero se supera"
 - res: "r e u te ra te u a u t r a e"
--------------------------------------------------------------------------------
WER: 9.000000, CER: 3.000000, loss: 282.579132
 - wav: file:///home/ubuntu/OwnLanguage/DeepSpeech/Spanska/clips/common_voice_es_18309522.wav
 - src: "basta  por favor"
 - res: "a s t r a e a a d a e a e a l e d s a s a u d a a a esar"
--------------------------------------------------------------------------------
I Exporting the model...
I Loading best validating checkpoint from Spanska2/best_dev-1159
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
I Models exported at Spanska2

Computing WER on each epoch will kill your performance.

You should be able to produce a chart (not a graph) of the loss using TensorBoard, I think.

Are you mixing CUDNN / non CUDNN checkpoints here? Without details on how you produce the model, we can’t help.

Also, why do you want to use evaluate.py? It is not going to give you anything different from the test run at the end of training that you shared.

Is there a way to get a graph of the loss and WER for each epoch?

I was also looking for this a while ago (a graph of the loss at each epoch) and ended up adding a few lines to the source code that write an (epoch, loss) tuple to a .txt file at the end of each epoch. Later on you can go back to that file and use any package or language you like to make a graph from it.
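
A minimal sketch of that approach, assuming a tab-separated text file with one line per epoch. The function names, the file name, and the loss values in the demo are hypothetical placeholders and not part of DeepSpeech. In practice the append call would go wherever the training loop finishes an epoch.

```python
import csv
import os
import tempfile


def append_epoch_loss(path, epoch, loss):
    """Append one (epoch, loss) row to a tab-separated text file."""
    with open(path, "a", newline="") as f:
        csv.writer(f, delimiter="\t").writerow([epoch, f"{loss:.6f}"])


def read_epoch_losses(path):
    """Read the file back as a list of (epoch, loss) tuples."""
    with open(path, newline="") as f:
        return [(int(epoch), float(loss)) for epoch, loss in csv.reader(f, delimiter="\t")]


# Demo with placeholder values; a real run would log the loss reported
# by the training loop at the end of each epoch.
log_path = os.path.join(tempfile.mkdtemp(), "epoch_loss.txt")
for epoch, loss in enumerate([3.0, 2.0, 1.5]):
    append_epoch_loss(log_path, epoch, loss)

history = read_epoch_losses(log_path)
print(history)
```

The resulting list of tuples can be passed to any plotting package, for example by unpacking it into separate epoch and loss sequences.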

Why don’t you just use TensorBoard?