Save the frozen model after every epoch


(Rpratesh) #1

Any suggestions on what modifications are needed in DeepSpeech.py to save a frozen model (.pb) after every epoch?
The current script saves a frozen model only once, after all the specified epochs have finished.


(Lissyx) #2

You need to change where export() is called and move it to the end of each epoch. But why do you want to do that?


(Rpratesh) #3

I wanted to save frozen models after each epoch so that I can later use them for evaluation and comparison.

I tried calling the export() function right after

# Gathering job results
job.loss = total_loss / job.steps
export()

in DeepSpeech.py.

But I am getting the following error:

E Do not use tf.reset_default_graph() to clear nested graphs. If you need a cleared graph, exit the nesting and create a new graph.
Traceback (most recent call last):
  File "DeepSpeech.py", line 652, in train
    export()
  File "DeepSpeech.py", line 807, in export
    tf.reset_default_graph()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 5539, in reset_default_graph
    raise AssertionError("Do not use tf.reset_default_graph() to clear "
AssertionError: Do not use tf.reset_default_graph() to clear nested graphs. If you need a cleared graph, exit the nesting and create a new graph.
Traceback (most recent call last):
  File "DeepSpeech.py", line 1006, in <module>
    tf.app.run(main)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "DeepSpeech.py", line 956, in main
    train()
  File "DeepSpeech.py", line 688, in train
    hook.end(session)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/basic_session_run_hooks.py", line 588, in end
    self._save(session, last_step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/basic_session_run_hooks.py", line 599, in _save
    self._get_saver().save(session, self._save_path, global_step=step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1429, in save
    raise TypeError("'sess' must be a Session; %s" % sess)
TypeError: 'sess' must be a Session; <tensorflow.python.training.monitored_session.MonitoredSession object at 0x7f6a7851d690>

(Lissyx) #4

I guess that’s not the right place. If you take a look at the context, it’s in the middle of the TensorFlow session, so the tf.reset_default_graph() call inside export() fails because the training graph is still active. Maybe you should move that down inside the train() function, after the coord.stop()?


(Reuben Morais) #5

I think you’ll have an easier time saving a checkpoint rather than a frozen model, due to the way TF sessions work. You can then convert the checkpoints to frozen models afterwards.
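
As a rough sketch of the first half of that workflow, the snippet below lists the checkpoints found in a directory, ordered by global step, so that each one can later be restored and exported in its own fresh session. It assumes TensorFlow's usual checkpoint file naming (`<prefix>-<step>.index`). The helper name `list_checkpoints` is made up for illustration and is not part of DeepSpeech or TensorFlow. The export step itself is left out because it depends on the DeepSpeech version in use.

```python
import os
import re

# Matches the index file TensorFlow writes alongside each checkpoint,
# e.g. "model.ckpt-1234.index" (assumed naming; adjust if yours differs).
_INDEX_RE = re.compile(r"^(?P<prefix>.+-(?P<step>\d+))\.index$")


def list_checkpoints(checkpoint_dir):
    """Return (step, checkpoint_prefix) pairs sorted by global step.

    Hypothetical helper for illustration only.
    """
    found = []
    for name in os.listdir(checkpoint_dir):
        match = _INDEX_RE.match(name)
        if match:
            step = int(match.group("step"))
            # The prefix (path without ".index") identifies the checkpoint.
            prefix = os.path.join(checkpoint_dir, match.group("prefix"))
            found.append((step, prefix))
    return sorted(found)
```

Each returned prefix is the kind of path that `tf.train.Saver.restore()` expects. Keep in mind that a TensorFlow `Saver` retains only the five most recent checkpoints by default (`max_to_keep`), so that limit would probably need to be raised to keep one checkpoint per epoch.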