How do I create a preprocess script for a custom dataset?

Hi. I am working on Windows 10 with two NVIDIA GeForce RTX 2080 Ti (rev. A) GPUs. I have installed CUDA 10.1 and cuDNN for CUDA 10.1.

I installed PyTorch this way:

conda install pytorch=01.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch

With this setup, I couldn’t train the model because of this error:

Traceback (most recent call last):
  File "TTS/bin/train_tacotron.py", line 721, in <module>
    main(args)
  File "TTS/bin/train_tacotron.py", line 505, in main
    init_distributed(args.rank, num_gpus, args.group_id,
  File "c:\users\voice-trainner\proyecto\tts\TTS\utils\distribute.py", line 69, in init_distributed
    dist.init_process_group(
AttributeError: module 'torch.distributed' has no attribute 'init_process_group'

As far as I could find, this error occurs because PyTorch doesn’t support multi-GPU training on Windows with torch.nn.parallel.DistributedDataParallel. How should I modify the scripts in Mozilla TTS to use both GPUs on Windows, so that I don’t have to remove one of the GPUs?
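For anyone hitting the same AttributeError, a quick way to see whether the distributed backend is usable at all is to probe the module before calling into it. The sketch below is only an illustration: distributed_usable is a hypothetical helper (not part of Mozilla TTS), and the two stub objects merely stand in for torch.distributed so the snippet runs without PyTorch installed.

```python
import types


def distributed_usable(dist_module):
    """Return True only if a torch.distributed-like module can start a process group.

    Hypothetical helper: it checks is_available() and the presence of
    init_process_group, the attribute missing in the traceback above.
    """
    is_available = getattr(dist_module, "is_available", None)
    if not callable(is_available) or not is_available():
        return False
    return hasattr(dist_module, "init_process_group")


# Stand-ins for torch.distributed (assumptions for illustration only).
stub_with_backend = types.SimpleNamespace(
    is_available=lambda: True,
    init_process_group=lambda **kwargs: None,
)
stub_without_backend = types.SimpleNamespace(is_available=lambda: False)

print(distributed_usable(stub_with_backend))
print(distributed_usable(stub_without_backend))

# With PyTorch installed, the real module can be passed instead:
#   import torch.distributed as dist
#   print(distributed_usable(dist))
```

If the check comes back False for the real module, the distributed launch path cannot work in that environment, and single-GPU training is the fallback.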

Well, I removed one of the GPUs, but then I got this message:

python TTS/bin/train_tacotron.py --config_path TTS/tts/configs/config.json
2021-02-17 16:53:06.621290: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll

Using CUDA: True
Number of GPUs: 1
Mixed precision mode is ON
Git Hash: e9e0784
Experiment folder: Models/LJSpeech/ljspeech-ddc-February-17-2021_04+53PM-e9e0784
Setting up Audio Processor…
| > sample_rate:22050
| > resample:False
| > num_mels:80
| > min_level_db:-100
| > frame_shift_ms:None
| > frame_length_ms:None
| > ref_level_db:20
| > fft_size:1024
| > power:1.5
| > preemphasis:0.0
| > griffin_lim_iters:60
| > signal_norm:True
| > symmetric_norm:True
| > mel_fmin:50.0
| > mel_fmax:7600.0
| > spec_gain:1.0
| > stft_pad_mode:reflect
| > max_norm:4.0
| > clip_norm:True
| > do_trim_silence:True
| > trim_db:60
| > do_sound_norm:False
| > stats_path:scale_stats.npy
| > hop_length:256
| > win_length:1024
| > Found 13100 files in C:\Users\Voice-trainner\MozillaTTS\TTS\LJSpeech-1.1
Using model: Tacotron2

Model has 47914548 parameters

DataLoader initialization
| > Use phonemes: True
| > phoneme language: en-us
| > Number of instances : 12969
| > Max length sequence: 187
| > Min length sequence: 5
| > Avg length sequence: 98.3403500655409
| > Num. instances discarded by max-min (max=153, min=6) seq limits: 476
| > Batch group size: 16.

EPOCH: 0/1000

Number of output frames: 7
TRAINING (2021-02-17 16:53:17)
2021-02-17 16:53:18.451386: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
Using CUDA: True
Number of GPUs: 1
2021-02-17 16:53:21.205552: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
Using CUDA: True
Number of GPUs: 1
2021-02-17 16:53:23.927322: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
Using CUDA: True
Number of GPUs: 1
2021-02-17 16:53:26.653252: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
Using CUDA: True
Number of GPUs: 1
! Run is removed from Models/LJSpeech/ljspeech-ddc-February-17-2021_04+53PM-e9e0784
Traceback (most recent call last):
  File "TTS/bin/train_tacotron.py", line 721, in <module>
    main(args)
  File "TTS/bin/train_tacotron.py", line 619, in main
    train_avg_loss_dict, global_step = train(train_loader, model,
  File "TTS/bin/train_tacotron.py", line 165, in train
    decoder_output, postnet_output, alignments, stop_tokens, decoder_backward_output, alignments_backward = model(
  File "C:\Users\Voice-trainner\anaconda3\envs\TF2\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "c:\users\voice-trainner\proyecto\tts\TTS\tts\models\tacotron2.py", line 148, in forward
    encoder_outputs = self.encoder(embedded_inputs, text_lengths)
  File "C:\Users\Voice-trainner\anaconda3\envs\TF2\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "c:\users\voice-trainner\proyecto\tts\TTS\tts\layers\tacotron2.py", line 109, in forward
    o, _ = self.lstm(o)
  File "C:\Users\Voice-trainner\anaconda3\envs\TF2\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\Voice-trainner\anaconda3\envs\TF2\lib\site-packages\torch\nn\modules\rnn.py", line 579, in forward
    result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias,
RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM

I think it would help if you confirmed what you’ve got installed within your environment in case you’ve accidentally messed something up.
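One way to report what is installed without retyping anything is to let Python print the versions. This is just a minimal sketch using the standard library: installed_versions is a hypothetical helper, and the package names in the list are assumptions about what matters in this environment, so adjust them as needed.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if it is absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed package names; edit the list to match your environment.
    report = installed_versions(["torch", "torchvision", "tensorflow", "numpy"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated conda environment and paste the output as-is.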

I notice there appears to be a typo in the command you said you used to install PyTorch. Could you copy and paste it again so we can be sure you’ve used 1.6 rather than 0.1.6 (which is also a version, but far too old)? It’s always good to copy and paste what you can, as it cuts down on transcription errors (which will just confuse everyone further!).

Thanks!

Sorry. It was a typing error. The command was:

conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch

After a quick Google search on that error message, it looks to me like an error that occurs only on setups with multiple GPUs, e.g. here.
To my understanding, multi-GPU support is deprecated for Mozilla-TTS.

I think it’s because you are using torch 1.6 together with mixed precision; there were some issues with that before.

Either try disabling mixed_precision in the config or update PyTorch to the latest 1.7 build.
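If you prefer to flip the flag with a script rather than by hand, something like the following could work. It is a rough sketch under assumptions: disable_mixed_precision is a hypothetical helper, the sample text only imitates a config file, and a regex is used instead of the json module on the assumption that the config may contain // comments that json.load would reject.

```python
import re


def disable_mixed_precision(config_text):
    """Set "mixed_precision" to false in the given config text.

    Returns the new text and the number of replacements made. Everything
    else in the text, including comments, is left untouched.
    """
    pattern = r'("mixed_precision"\s*:\s*)true\b'
    return re.subn(pattern, r"\g<1>false", config_text)


# Made-up config fragment for illustration only.
sample = '''{
    "mixed_precision": true, // hypothetical comment
    "batch_size": 32
}'''

updated, replaced = disable_mixed_precision(sample)
print(updated)
print("replacements:", replaced)
```

To apply it to a real file, read the config as text, pass it through the function, and write the result back after keeping a backup copy.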

Don’t double-post your issues!

Thanks a lot. It’s training without problems now. I only disabled mixed_precision, because installing the latest version of PyTorch conflicted with other libraries.