Sorry about that.
Environment:
Ubuntu 20.04 with an RTX 3090, running in an NVIDIA container built from the NVIDIA image 20.11-tf1-py3.
The latest command line:
python3 -u DeepSpeech.py \
--train_files /workspace/de/clips/train.csv \
--test_files /workspace/de/clips/test.csv \
--dev_files /workspace/de/clips/dev.csv \
--export_dir /workspace/DeepSpeech/data/model \
--train_batch_size=100 \
--dev_batch_size=100 \
--test_batch_size=100 \
--epochs=1 \
--n_hidden=512 \
--learning_rate=0.0001 \
--dropout_rate=0.2 \
--export_file_name output_615 \
--export_author_id sun \
--export_model_name 615 \
--export_model_version 1 \
--summary_dir /workspace/DeepSpeech/data/model
The latest logs:
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:03:53 | Steps: 2371 | Loss: 118.941052
Epoch 0 | Validation | Elapsed Time: 0:00:11 | Steps: 151 | Loss: 94.769755 | Dataset: /workspace/de/clips/dev.csv
I Saved new best validating model with loss 94.769755 to: /root/.local/share/deepspeech/checkpoints/best_dev-2371
I FINISHED optimization in 0:04:04.418811
I Loading best validating checkpoint from /root/.local/share/deepspeech/checkpoints/best_dev-2371
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /workspace/de/clips/test.csv
Test epoch | Steps: 151 | Elapsed Time: 1:21:49
Test on /workspace/de/clips/test.csv - WER: 0.949567, CER: 0.449052, loss: 94.426300
…
I Models exported at /workspace/DeepSpeech/data/model
I Model metadata file saved to /workspace/DeepSpeech/data/model/sun_615_1.md. Before submitting the exported model for publishing make sure all information in the metadata file is correct, and complete the URL fields.
The logs show no errors or warnings.
Problems:
- The .pb and .md files do not appear in the export_dir, even though the log says they were exported there.
- The summaries are not saved to the new path given by the --summary_dir flag.
- No new summaries are saved in the default path either. Every time I open TensorBoard, I get the same graph, generated from the oldest summaries.
What I want:
- New summaries after training in the path provided by the flag, or at least in the default path.
- The new model after training in the path provided by the flag.
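To rule out that I am simply looking in the wrong place, a small check like the one below could be run inside the container right after training. It is only a sketch: `list_recent_files` and its pattern list are my own hypothetical helper, not part of DeepSpeech. It assumes the exported files end in `.pb` or `.md` and that the summaries use TensorFlow's usual `events.out.tfevents.*` file names. It lists matching files newest first, so stale summaries are easy to spot.

```python
import time
from pathlib import Path

# Hypothetical helper, not part of DeepSpeech.
# Assumption: exports end in .pb/.md and summaries use TensorFlow's
# standard "events.out.tfevents.*" naming.
DEFAULT_PATTERNS = ("*.pb", "*.md", "events.out.tfevents.*")


def list_recent_files(directory, patterns=DEFAULT_PATTERNS):
    """Return (path, mtime) pairs for matching files, newest first.

    Subdirectories are searched too. A missing directory yields [].
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    found = set()
    for pattern in patterns:
        for path in root.rglob(pattern):
            if path.is_file():
                found.add((str(path), path.stat().st_mtime))
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Path taken from the --export_dir / --summary_dir flags above.
    # Add the default summary directory here to compare both locations.
    for directory in ("/workspace/DeepSpeech/data/model",):
        print(directory)
        files = list_recent_files(directory)
        if not files:
            print("  no matching files (or directory missing)")
        for path, mtime in files:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
            print(f"  {stamp}  {path}")
```

If the files show up with fresh timestamps inside the container but not from outside it, the problem would be where I am looking rather than the export itself.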