Training one epoch takes too much time

Each training epoch takes about 7 hours, so I have a question about what the checkpoints store. For example: while training epoch 1, I get to about 30% and then kill the process by quitting the Terminal (on macOS). When I run again with the same parameters as the previous training, can DeepSpeech resume from the 30% checkpoint? I tested this once, and the second run started counting from 0%.

As far as I know, no.
DeepSpeech has a flag to save the model weights to a checkpoint every n seconds.
When you restart training, it loads the latest available checkpoint (in your case, roughly "epoch 0.3"), but the epoch itself starts over from the beginning of the data.
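To illustrate, here is a rough sketch of how resuming works in practice, assuming the standard `DeepSpeech.py` training flags (`--checkpoint_dir`, `--checkpoint_secs`); the file paths are placeholders:

```shell
# Save checkpoints to a fixed directory, every 10 minutes.
python3 DeepSpeech.py \
  --train_files data/train.csv \
  --dev_files data/dev.csv \
  --checkpoint_dir /path/to/checkpoints \
  --checkpoint_secs 600

# Re-running the same command with the same --checkpoint_dir
# restores the latest saved weights, but the interrupted epoch
# itself restarts from the beginning of the dataset.
```

So the learned weights survive the interruption; only the progress counter within the epoch is lost.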