Yep, I’m interested in this, but I’m relatively new to this area. What would I need to do?
First, familiarize yourself with the TensorFlow 1.x checkpoint saving and loading logic. Then get a bit familiar with the TensorFlow tf.data.Dataset APIs. After that, read and understand our feeding code: start from the create_dataset function in feeding.py and go from there.
You’d have to add, as a start:
- Code to the CSV and SDB loading classes to skip to an index in the input file when loading
- Code to differentiate the first epoch from subsequent epochs, since you only want to skip on the first epoch when resuming
- Code to save and load the last sample index that was loaded during training, so that it can be used for resuming
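
To make the three pieces above concrete, here is a rough sketch in plain Python. It is not the actual feeding code: the helper names (load_resume_index, save_resume_index, iter_csv_samples, train), the JSON state file, and the "next_sample_index" key are all made up for illustration, and the real change would have to live in the CSV/SDB loaders and the tf.data pipeline (where Dataset.skip might be a natural fit) instead.

```python
import csv
import itertools
import json
import os


def load_resume_index(state_path):
    """Return the saved sample index, or 0 if there is no state file yet."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as f:
        return json.load(f)["next_sample_index"]


def save_resume_index(state_path, index):
    """Persist the index of the next sample that should be loaded."""
    with open(state_path, "w") as f:
        json.dump({"next_sample_index": index}, f)


def iter_csv_samples(csv_path, start_index=0):
    """Yield (index, row) pairs, skipping the first start_index data rows."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        yield from itertools.islice(enumerate(reader), start_index, None)


def train(csv_path, state_path, epochs, stop_after=None):
    """Toy loop that only skips ahead on the first epoch after a resume."""
    resume_index = load_resume_index(state_path)
    seen = []
    for epoch in range(epochs):
        # Only the first epoch of this run starts mid-file.
        start = resume_index if epoch == 0 else 0
        for index, row in iter_csv_samples(csv_path, start):
            seen.append((epoch, index))  # stand-in for a training step
            # Saved per sample here for simplicity; real code would
            # more likely save this alongside each checkpoint.
            save_resume_index(state_path, index + 1)
            if stop_after is not None and len(seen) == stop_after:
                return seen  # simulate an interrupted run
        # Epoch finished, so the next one starts from the top.
        save_resume_index(state_path, 0)
    return seen
```

Note that this sketch does not save the epoch number, so a resumed run starts counting epochs from zero again, and it assumes the samples come in the same order on every run. Any shuffling or sorting would need to be reproducible for a saved index to mean anything.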
OK, thanks, will check it out!