Same response time on a 1-GPU and an 8-GPU AWS instance

Hi all,
I am running Mozilla TTS on an 8-GPU AWS instance, and the response time is the same as on a 1-GPU instance.

I start the server with:

```
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
source tts-venv/bin/activate
python -m TTS.server.server
```

Am I missing a configuration setting?

No. To my understanding, inference/synthesis cannot use multiple GPUs, because there is no batch processing the way there is when training a model.

Yes, using one GPU is consistent with Eren’s response here:

Thanks for the reply. I understand that an 8-GPU instance will not improve response time. Is there a way to improve TTS response time on a single GPU? My current TTS response time is 4-5 seconds, and I want to bring it down to below 1-2 seconds. Is that doable on a single GPU through configuration changes or settings?

The number of GPUs does not help the response time of a single utterance. You can only increase the number of utterances processed in parallel.
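To illustrate that last point, here is a minimal sketch of sharding many utterances across the 8 GPUs, one worker process per GPU. The `synthesize` worker is a placeholder (an assumption, not part of Mozilla TTS): in a real setup it would load the synthesizer once and run inference, while `CUDA_VISIBLE_DEVICES` pins each worker to its own GPU. Throughput goes up roughly 8x, but each individual utterance still takes the same 4-5 seconds.

```python
import os
from concurrent.futures import ProcessPoolExecutor

NUM_GPUS = 8  # assumption: one worker per visible GPU

def synthesize(args):
    """Hypothetical worker: pins one GPU, then synthesizes one utterance."""
    gpu_id, text = args
    # Must be set before any CUDA initialisation in this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    # In a real worker you would load the TTS synthesizer here and run
    # inference on `text`; this sketch just reports the GPU assignment.
    return gpu_id, text

def synthesize_batch(texts, num_gpus=NUM_GPUS):
    # Round-robin: utterance i goes to GPU i % num_gpus.
    jobs = [(i % num_gpus, t) for i, t in enumerate(texts)]
    with ProcessPoolExecutor(max_workers=num_gpus) as pool:
        return list(pool.map(synthesize, jobs))

if __name__ == "__main__":
    print(synthesize_batch(["Hello.", "How are you?", "Goodbye."]))
```

The same idea can be had with less code by simply running 8 copies of `TTS.server.server`, each started with a different single-GPU `CUDA_VISIBLE_DEVICES` value, behind a load balancer.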