How much time is required to train?

Hi there, I'm running on Google Colab, and I wonder how much time training needs there. It seems it will take some hours, maybe more than 12?

I'm just getting started with TTS. I want to build an MVP that can speak the text I write, and also do the reverse, so I would love any suggestions for tackling the inverse problem (STT).

I'm trying out various things I have found on the internet, but so far no luck in building that MVP :slight_smile:.

And currently

I don't think it will finish the 1000 epochs within 12 hours.
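
A rough back-of-the-envelope check supports that. This is only a sketch: it assumes 195 steps per epoch and about 1.03 s per step (step_time plus loader_time, figures taken from the local training log further down in this thread). A Colab GPU may be faster or slower, and the estimate ignores evaluation and checkpointing time.

```python
def estimate_training_hours(steps_per_epoch, epochs, seconds_per_step):
    """Rough wall-clock estimate in hours; ignores evaluation and checkpointing."""
    total_seconds = steps_per_epoch * epochs * seconds_per_step
    return total_seconds / 3600.0


# Assumed figures (from the local log in this thread, not measured on Colab):
# 195 steps per epoch, 1000 epochs, about 1.03 s per step.
hours = estimate_training_hours(steps_per_epoch=195, epochs=1000, seconds_per_step=1.03)
print(f"{hours:.1f} hours")  # prints "55.8 hours"
```

With those numbers a full 1000-epoch run would need several Colab sessions, so resuming from a saved checkpoint matters.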

You could save the model to gdrive and continue training with a new session.

python --continue_path ‘path to model’
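
One possible way to get the run folder onto Drive is sketched below. The helper name `backup_run` and both paths are hypothetical. It assumes Drive has already been mounted in Colab (for example with `drive.mount("/content/drive")` from `google.colab`) and that Python is 3.8 or newer, since `dirs_exist_ok` needs that.

```python
import shutil
from pathlib import Path


def backup_run(run_dir, backup_root):
    """Copy a training output folder into backup_root.

    Files already present in the backup are overwritten, so the function
    can be called repeatedly during a session.
    """
    run_dir = Path(run_dir)
    target = Path(backup_root) / run_dir.name
    shutil.copytree(run_dir, target, dirs_exist_ok=True)
    return target


# Hypothetical usage in Colab, after mounting Drive:
# backup_run("path/to/your/run_folder", "/content/drive/MyDrive/tts_runs")
```

In a new session, the copied folder would then be the one passed to --continue_path.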

Also for STT see here

Thanks! I will look into that… Can I ask things here even if they are not about the models used at Mozilla?

Anyway, I tried to run it on my computer (I have a 2080 with 8 GB of RAM) and got this during the first of the 1000 epochs. It is an OOM error; is there a parameter I can set so that training fits in memory?

  --> STEP: 149/195 -- GLOBAL_STEP: 150
     | > decoder_loss: 1.54959  (2.75099)
     | > postnet_loss: 1.65185  (3.31417)
     | > stopnet_loss: 0.33840  (0.53356)
     | > ga_loss: 0.02369  (0.04075)
     | > loss: 3.22514 
     | > align_error: 0.99233  (0.99067)
     | > avg_spec_len: 705.203125
     | > avg_text_len: 126.171875
     | > step_time: 1.02
     | > loader_time: 0.01
     | > lr: 0.00010
 ! Run is removed from ../ljspeech-July-03-2020_03+56PM-3366328
Traceback (most recent call last):
  File "", line 676, in 
  File "", line 591, in main
    global_step, epoch)
  File "", line 191, in train
  File "/home/tyoc213/miniconda3/envs/fastai2/lib/python3.7/site-packages/torch/", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/tyoc213/miniconda3/envs/fastai2/lib/python3.7/site-packages/torch/autograd/", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 110.00 MiB (GPU 0; 7.79 GiB total capacity; 4.37 GiB already allocated; 116.88 MiB free; 4.64 GiB reserved in total by PyTorch) (malloc at /opt/conda/conda-bld/pytorch_1587428398394/work/c10/cuda/CUDACachingAllocator.cpp:289)
If you mean asking about DeepSpeech here: it's better to use the forum for that topic → DeepSpeech - Mozilla Discourse

You can reduce the batch_size. Check the “gradual_training” setting in config.json.
If it's something like this: [[0, 7, 64], [1, 5, 64], [50000, 3, 64], [130000, 2, 32], [290000, 1, 32]], then the last value of each entry (64/64/64/32/32) is the batch_size → reduce those.
Try something like this → [[0, 7, 32], [1, 5, 32], [50000, 3, 32], [130000, 2, 32], [290000, 1, 16]]
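
If you'd rather not edit each entry by hand, here is a minimal sketch that caps every batch size in such a schedule. The helper name `shrink_batch_sizes` is hypothetical, and it assumes each entry has the form [start_step, r, batch_size]. It works on a plain Python list; you would still paste the result back into config.json yourself.

```python
def shrink_batch_sizes(schedule, max_batch_size):
    """Return a copy of a gradual_training schedule with batch sizes capped.

    Each entry is assumed to be [start_step, r, batch_size]; only the
    third value is changed.
    """
    return [[step, r, min(batch, max_batch_size)] for step, r, batch in schedule]


original = [[0, 7, 64], [1, 5, 64], [50000, 3, 64], [130000, 2, 32], [290000, 1, 32]]
reduced = shrink_batch_sizes(original, max_batch_size=32)
print(reduced)
# [[0, 7, 32], [1, 5, 32], [50000, 3, 32], [130000, 2, 32], [290000, 1, 32]]
```

If the OOM still shows up with a cap of 32, try a lower cap such as 16.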

