Training with a 2GB GPU

I am not convinced that this will be worthwhile for training scenarios as the CPU contribution will be minimal and it would likely add substantial complexity.

4 Likes

Floppy disk?

Summary of both + dkreutz:
I have no idea how or whether that works, but I think it will not perform well, based on my lack of experience with it, and therefore I need to inform everybody here that I feel that way. And I will libel every approach that comes closer to it, based on that prejudice.

Which makes me wonder: what if the creators of tacotron had thought that way?

Presumably everyone has a lack of experience with things that have never been done before!

I’m not saying no one should try it; I’m merely of the opinion that it’s not worth it. We can already infer that the CPU’s contribution to training performance would be minimal from what we observe when training solely on the CPU.

I can’t comment definitively on the complexity, but it seems a fair assumption that it would add to it.

If you find evidence that it would make a larger contribution to performance, and evidence that it could be implemented simply, then I suggest you make the case that it should be looked at. So far I’m not seeing anything that stands up to scrutiny.

1 Like

The best proof won’t make you get your act together.

That’s not right - then I’d gladly look at it for you, but you would “need to compensate me and we need to have a business relationship”

1 Like

Nothing new here.

Besides, you are using my statement out of context in a libelous way.

I get the impression that nobody here is aware of the inner workings of Tacotron and that everyone is mainly a user, i.e. the blind leading the blind.

Ole, you use words with a clear lack of understanding of their meaning. I was amused to see you throw in “ad hominem” in the other thread, again clearly without knowing how it applies, because the case you highlighted was not an ad hominem.

Just as I wrote to baconator: you are making a fool of yourself, nmstoker. Don’t make it worse for yourself. Leave the thread.

Best of luck making progress bringing people together with your charm and persuasive ways!

1 Like

Now that is rich… please enlighten us with your deeper understanding of how Tacotron works. At the very least you should know by now that 2GB of VRAM forces you to use a small training batch size, which means longer training to get results comparable to a larger batch size (if the results are comparable at all).
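(If you do end up stuck on a 2GB card anyway, gradient accumulation is the usual workaround for a small physical batch size: you step the optimizer only every N mini-batches, so the gradients approximate a larger batch. A minimal PyTorch sketch with a hypothetical toy model standing in for Tacotron - it trades wall-clock time for memory, it does not remove the cost:)

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real model and data loader.
model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 80))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = [(torch.randn(4, 80), torch.randn(4, 80)) for _ in range(32)]

accum_steps = 8  # physical batch of 4 x 8 accumulation steps = effective batch of 32

optimizer.zero_grad()
for step, (x, target) in enumerate(loader):
    loss = nn.functional.mse_loss(model(x), target)
    (loss / accum_steps).backward()   # gradients accumulate across mini-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()              # one weight update per effective batch
        optimizer.zero_grad()
```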

People here have shared their experience with training on the CPU - it is awfully slow and therefore impractical, given that a Taco model needs at least 100k+ steps for usable results. Nobody is stopping you from using the CPU, but you will be on your own here.

And distributed training with multiple GPUs has already been attempted here - look into the code. The additional complexity of maintaining that code did not justify the results.
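(To give a sense of the plumbing involved - this is not the repo’s code, just a generic PyTorch DistributedDataParallel sketch with a hypothetical toy model - even the minimal version needs process spawning, a rendezvous address and per-rank device handling:)

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn

def run(rank, world_size):
    # One process per GPU; each process trains on its own shard of every batch.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    # Hypothetical toy model standing in for Tacotron.
    model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 80)).cuda(rank)
    model = nn.parallel.DistributedDataParallel(model, device_ids=[rank])
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    x = torch.randn(8, 80, device=f"cuda:{rank}")  # this rank's slice of the batch
    loss = model(x).pow(2).mean()
    loss.backward()                                # gradients are all-reduced across GPUs
    optimizer.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(run, args=(world_size,), nprocs=world_size)
```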

tl;dr: You are asking for things here that other people already tried and abandoned for a reason.

4 Likes

You clearly have no clue about what you’re attempting to do here. On top of that, you’ve misidentified the sort of person @nmstoker is.

You ask for help, then wave it away; you respond with belligerence and complain. A smart person would accept the help and use that knowledge to improve their situation.

You, on the other hand, refuse to do so. Like I said above, when you want to help yourself out, do so and let us know. But you’re just a troll until then.

@Ole_Klett I’d like to remind you of our Community Participation Guidelines. Please be mindful of them.

2 Likes

Have you tried a reasonable GPU yet?

Coming back to the actual subject of the post: I think it’s actually a reasonable question, and more important than ever. I don’t want to start an NVIDIA rant, but it’s fairly obvious that their consumer policies lately are more than questionable. Or is there a reasonable explanation for cards in the lower mid price range (about 200€) now coming with only 2GB of RAM instead of the 4GB they had not that long ago?
At a time when prices for GPUs with more than 2GB of RAM have skyrocketed (justified or not), Ole_Klett’s question seems perfectly reasonable to me. Not everybody has the budget for (or is willing to spend) 400€ or more on what baconator calls a reasonable GPU.
With that in mind, I would like to know whether somebody has experience with multiple 2GB cards. Say two GTX 750 Tis - would they be usable for training?
@baconator: no offense, I just quote you as an example.

If you want to attempt this journey, good luck. There are other docs and comments about multiple cards on the same system that you can draw on.
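For a quick experiment on a single box, the simplest thing to try is PyTorch’s built-in nn.DataParallel wrapper - a rough sketch with a hypothetical toy model, not something I’ve verified on two 750 Tis. Note that each 2GB card still has to hold the full model plus its share of the batch, so this mostly buys throughput rather than a larger usable batch:

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for the real network.
model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 80))

if torch.cuda.device_count() > 1:
    # Each forward pass is split across the visible GPUs and gathered on GPU 0.
    model = nn.DataParallel(model)

model = model.cuda()

x = torch.randn(32, 80).cuda()   # with two GPUs, each card sees a batch of 16
out = model(x)
print(out.shape, "computed on", torch.cuda.device_count(), "GPU(s)")
```

Whether two 750 Tis actually beat one larger card is a separate question - model replication and PCIe transfers can eat the gains on small cards.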

Training a viable model is a relatively massive undertaking. A good dataset means tens to hundreds of hours of wav files, and you will need commensurate computational hardware to handle it. This has been the case since Tacotron (an 8GB+ GPU was recommended on Keith Ito’s repo four or five years ago). While it’s possible to use a smaller GPU, it hasn’t been found to produce models of the same quality, or as quickly. Perhaps this is changing, but if you’re serious about training a quality model in, say, a month or less, you’re going to have a difficult time without a higher-end GPU.

Also, you don’t need to buy a big GPU. There’s Google Colab for testing and getting started with modelling, and cloud GPUs can be rented fairly cheaply these days; on preemptible instances the price can be brought down quite a bit. I’ve even used one, despite having 8GB GPUs locally, for batch-sizing and performance reasons. It’s a matter of balancing your wants and needs.
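If you go the Colab route, it’s worth checking what you were actually assigned before committing to a long run - a quick sanity check along these lines, assuming a PyTorch runtime:

```python
import torch

# Report the GPU that Colab (or any cloud instance) assigned to this session.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("GPU:", props.name)
    print("VRAM: %.1f GB" % (props.total_memory / 1024**3))
    print("Compute capability: %d.%d" % (props.major, props.minor))
else:
    print("No CUDA device visible - check the runtime/accelerator settings.")
```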

1 Like

Thanks for the answer. I started that journey/adventure of training my own model on a 2GB GT 1030 (first try, first steps, first fail). The result after 10 days of training was not good at all. I recently got myself a used GTX 750 and am planning to get a second one to test that multi-GPU idea. If there is some interest, I can document the resulting failure or success here.
The idea of using Colab for basic training might be an alternative; I have never used Colab before and will give it a try.
My wants and needs in terms of TTS are tiny to none. To me it’s just the most fascinating part of the whole ongoing machine learning progress. There is no final or productive use case I am trying to realize - just a huge interest in the subject and its possibilities :slight_smile:

Hi @Escobar523.
As far as I know, multi-GPU training is going to be deprecated, as it’s more work for code maintenance than benefit. So keep this in mind before buying an additional GPU.

1 Like

Thanks, that’s something I didn’t have on my mind.
You’re right: if even shipping builds for more than the latest CUDA compute capability is too much to ask (for TensorFlow or Torch), why walk an extra mile on top of that and implement multi-GPU? I know I don’t pay the makers of that software, so I have no reason to complain. But it’s somewhat frustrating to see that not even basic things, like a wider compute-capability range, are maintained properly in fundamental parts (frameworks?) like Torch and TensorFlow. Sure, it’s possible for everyone (in theory at least) to compile for the compute capability they need; you just have to find your way across the “dependency hell” of countless subversions of libraries and more (before I got the GTX 750 I was trying to get a Quadro K5000 working with Torch or TensorFlow, and failed miserably).
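If it helps anyone else in the same spot: before fighting the dependency hell, you can at least check whether a given Torch binary even ships kernels for your card’s compute capability. A small sketch, assuming a recent PyTorch where torch.cuda.get_arch_list() is available:

```python
import torch

# Compare the card's compute capability with the architectures
# the installed PyTorch binary was compiled for.
major, minor = torch.cuda.get_device_capability(0)  # e.g. (3, 0) for a Quadro K5000
built_for = torch.cuda.get_arch_list()               # e.g. ['sm_50', 'sm_60', ...]

print("Card is sm_%d%d, binary supports: %s" % (major, minor, built_for))
if "sm_%d%d" % (major, minor) not in built_for:
    print("No kernels for this card in the binary - a source build would be needed.")
```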
Long story short, for now it looks like Google Colab, or buying some GPU time from a hosting provider, is the only way on a budget to get a trained voice that doesn’t sound like “Admiral Chainsmoker” or worse.

By the way, @mrthorstenm: huge thanks for the voice you created. It’s the best German non-Microsoft TTS voice by far. Maybe I am wrong, but I think I hear a tiny bit of “Saarbrigger Platt” (Saarbrücken dialect) in that voice - correct me if I am wrong :slight_smile:

1 Like

Just another logbook entry :slight_smile:
Has anyone tried https://gpu.land/ for training?
At first glance they offer a great deal at a reasonable price.
I will give them a try.

To me the biggest thing is preparing the data the right way, so the waiting is not in vain.

Sometimes I am gone for 5 days, and I can let my CPU compute in the meantime.

But as you experienced, coming back to it and seeing garbage is very discouraging.

Or when it crashes right after you leave and the machine sits idle for 5+ days.