Is it possible to install the NVIDIA toolkit and use GPU-accelerated TTS training without using Windows Insider builds?

cesm23 · February 16, 2021, 10:45am

So far I have been forced to use CPU-only Tacotron2 training (which is obviously very slow), because even though I DO have an NVIDIA card (NVIDIA GEFORCE GTX 1060 SUPER 6GB), I am unable to set up the NVIDIA toolkit, simply because it requires me to join the Windows Insider program AND install Insider builds! I really didn’t want to do this, because I have heard of so many bugs with Windows updates, and I have been lucky so far on non-Insider builds. Now this means I am FORCED to use a Windows Insider build to take advantage of my GPU in training, which might be quite unstable and full of bugs, with the risk of having to reinstall Windows if I unknowingly get a bad build. Is CPU training really the ONLY alternative?

I really wonder what most people here who use GPU acceleration for TTS training do. Is everyone here running Insider builds? Has no one had problems with this, at least no more than those that already exist with updates for normal builds of Windows? That’s quite unusual, since so far I haven’t seen a single person complaining about this!

Sure, I am aware that we could use dual boot, but this is really not practical for me, since I DO want to keep using the computer with the normal non-Insider build I currently run as the main OS WHILE doing the training. It would be a huge hassle to install everything I need on a daily basis on the second boot with the Insider build OS…

Also, using a virtual machine for this is out of the question too, I think, since we have to use WSL2 and Ubuntu to run the text-to-speech engines and training, so it isn’t viable either.

Regarding Google Colab, I have never used it before, and I am not sure I could use it extensively enough, because I need to train quite a few new custom datasets for new voices from scratch, so I would need to use it a LOT. (I am also afraid my graphics card isn’t powerful enough, although I have seen people online getting average training speed with this card, so it might be okay for me.)

Unless… there is a way to run GPU-accelerated training for TTS without installing Insider builds that I haven’t found anywhere on Google… So tell me, guys, what do you advise? Or perhaps there is a “safe” specific build number that is stable to use? But won’t those builds expire?

erogol · February 16, 2021, 11:39am

To my knowledge, you can either use WSL2 on Windows 10 or install a separate Linux distro.

rdh · February 18, 2021, 2:26pm

I’m not sure what forces you to use Insider builds, as I have had no such issues. Could you clarify this? Tacotron2 trains fine for me on Windows 10 without using WSL2. The only fixes I ever had to apply were the occasional explicit UTF-8 file encodings when opening files.

cesm23 · February 19, 2021, 8:14am

OK, in the meantime I was able to get another repository working with CUDA (DeepLearningExamples/Fastpitch), namely the FastPitch model, which in fact I prefer over Tacotron2 itself, and all inside the Windows command prompt, with no Ubuntu, no Insider builds, no NVIDIA developer drivers and no Docker!! That’s amazing! As for this repository (Mozilla TTS), it now finally recognizes CUDA as well, even when run entirely inside a Windows command prompt without Ubuntu.

Yeah!! Exactly, I finally figured out how to do it without WSL too, thanks to a person from the other repository. And yeah, I did indeed run into the UTF-8 file encoding problems, but I just had to open the \TTS\TTS\utils\io.py file and replace all instances of:
"r"
with:
"r", encoding="utf-8"

I did the same with "w", and now it works. Is that how you did it too? Or is there an easier way?
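For anyone following along, here is a minimal sketch of what that change amounts to. The helper names and the sample file are hypothetical and are not taken from the Mozilla TTS code; the point is only that passing encoding="utf-8" to open() stops Python from falling back to the Windows default code page. As a possibly easier route, Python 3.7 and later also have a UTF-8 mode (setting the PYTHONUTF8=1 environment variable, or running with -X utf8) that changes the default without editing files, though I have not verified it against this repository.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a text file as UTF-8, ignoring the platform default encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def write_text_utf8(path, text):
    """Write a text file as UTF-8, ignoring the platform default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


if __name__ == "__main__":
    # Hypothetical sample file, used only for this demonstration.
    sample_path = os.path.join(tempfile.gettempdir(), "utf8_demo.txt")
    write_text_utf8(sample_path, "Olá, não há problema")
    print(read_text_utf8(sample_path))
```

Without the explicit encoding, the same open() calls on Windows typically use a legacy code page such as cp1252, which is where the decode errors on non-ASCII transcripts tend to come from.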

But wait a minute, how the heck were you able to solve the problem of the espeak dependency?? It’s the only thing I haven’t been able to fix yet. What is the Windows command prompt command to do this (using pip)? Just using pip install espeak doesn’t work, and installing espeak for Windows has no effect either. Unless it has a different name in pip, please tell me how you installed espeak in a way that this repository recognizes it.

nmstoker · February 19, 2021, 3:09pm

Good to hear you’re making progress @cesm23

FYI pip won’t help you with espeak as it’s for installing python packages and what you need for espeak is the local executable. You probably want to look at espeak-ng rather than straight espeak as the latter hasn’t been updated in a long while.
I haven’t installed either on Windows before but I believe espeak-ng have assets (eg an .MSI installer) on their GitHub release page (their releases are very sporadic too but it’s in a better state than espeak!)
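Since what matters is the local executable rather than a Python package, a quick way to see whether Python can find one is to look it up on PATH. This is only a sketch: the function name is made up for illustration, and I am assuming the executables are called espeak-ng and espeak, which may not match exactly how this repository looks for them.

```python
import shutil


def find_espeak(candidates=("espeak-ng", "espeak")):
    """Return (name, full_path) for the first candidate found on PATH, else None."""
    for name in candidates:
        # shutil.which searches PATH (and PATHEXT on Windows) like the shell does.
        path = shutil.which(name)
        if path is not None:
            return name, path
    return None


if __name__ == "__main__":
    found = find_espeak()
    if found is None:
        print("No espeak executable found on PATH")
    else:
        print("Found %s at %s" % found)
```

If this reports nothing after installing, the install directory is probably missing from PATH, or the command prompt was opened before PATH was updated.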

BTW, when you’re in a good position, how would you feel about writing up some of the windows specific stuff? It might help to add to the documentation / questions sections and save others from struggling too, sharing your hard won expertise back with the community!

rdh · February 19, 2021, 5:01pm

It’s been several months since I did this, so I don’t remember it very well. I can give you these two pointers:

Make sure to add the following to your PATH:

C:\Program Files (x86)\eSpeak\command_line

And reboot your PC after installing eSpeak.
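To confirm the PATH entry actually took effect, something like the following could be run from the same command prompt used for training. It is a rough sketch with hypothetical helper names; the directory is the one above and may differ depending on where eSpeak was installed. Note that prepend_to_path only changes PATH for the current Python process, not system-wide.

```python
import os


def dir_on_path(directory, path_value=None):
    """Return True if directory is one of the entries in PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # normcase makes the comparison case-insensitive on Windows.
    target = os.path.normcase(os.path.normpath(directory))
    entries = [e for e in path_value.split(os.pathsep) if e]
    return any(os.path.normcase(os.path.normpath(e)) == target for e in entries)


def prepend_to_path(directory):
    """Add directory to PATH for the current process only, if it is missing."""
    if not dir_on_path(directory):
        os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")


if __name__ == "__main__":
    # Install location mentioned above; adjust it if eSpeak lives elsewhere.
    espeak_dir = r"C:\Program Files (x86)\eSpeak\command_line"
    print("eSpeak directory on PATH:", dir_on_path(espeak_dir))
```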

cesm23 · February 21, 2021, 3:53pm

Just to note that I am not using this repository for the moment (I always wanted the FastPitch model more than Tacotron2 itself, and I was using Mozilla TTS only because it worked without a GPU), but later I can still provide information here about how I set up CUDA without using WSL and Ubuntu.

And that’s nothing… this morning was the final milestone: I finally started training with FastPitch (getting apex installed with CUDA extensions and being able to use the LAMB optimizer was incredibly difficult). I guess there isn’t much difference between the two repositories’ workflows; it’s mostly the commands and the dataset setup that change.

Oh, I see… anyway, I am not using this repository for now, but I will try espeak-ng later when I have more time.

YES, of course I will write up the Windows-specific steps! I was strongly considering this, because you have no idea how many hours of intense googling and frustration I spent. I hit some 10-20 different errors in the whole process of setting up CUDA, and each error would take about 5-20 minutes to solve (some of them 1-2 hours!), since, as you know, it can take a long time on Google to find the right keywords and the right answer. I don’t wish this whole frustrating process on anyone, so it’s guaranteed that I am going to do the tutorial.

One of the hardest was yesterday: the FastPitch repository requires apex to be installed with CUDA extensions. How was I supposed to guess that on Windows I had to use a specific fork of apex instead of the official one? The official one has a specific install command for Windows, but it does NOT install the CUDA extensions, and if I try to build the CUDA extensions it gives compile errors.

AND on top of that, I had to install version 11.0 of the CUDA toolkit, NOT the most recent one… And another one (I am not sure now, but I think it was PyTorch): the packages won’t install correctly with the latest Python version (3.9); it has to be the latest 3.8 release! It took ages to “guess” that this was the problem… These are things I could only find by googling the error messages and reading issues created by people in repositories like this one, plus a LOT, and I mean a lot, of trial and error…
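A small sanity check along these lines might save some of that guessing. It is only a sketch: the function names are hypothetical, and the 3.8 requirement is simply what worked on my setup at the time, not a documented rule. The PyTorch part is skipped if torch is not installed.

```python
import sys


def python_version_ok(version_info=None, required=(3, 8)):
    """Return True if the interpreter's major.minor version equals `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


def describe_torch_cuda():
    """Summarize the installed PyTorch build and whether it can see a GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    return "PyTorch %s, built for CUDA %s, GPU available: %s" % (
        torch.__version__,
        torch.version.cuda,
        torch.cuda.is_available(),
    )


if __name__ == "__main__":
    print("Python is 3.8.x:", python_version_ok())
    print(describe_torch_cuda())
```

If the CUDA version reported by PyTorch does not match the installed toolkit (11.0 in my case), that mismatch is one plausible reason for extensions such as apex failing to compile.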

As soon as I finish learning about the training process I will prepare the tutorial, but I wonder… where should I post it, and addressed to whom? Here on this forum as a new thread? I presume posting it in this thread won’t be as visible… Can any mod here advise?

Thanks, I will take note of the eSpeak PATH tip when I write the tutorial and try using this again.