Hello,
I have a dual-socket server (two CPUs), and I have noticed that inference takes a long time (I use the GPU both for training and for transcription). Could the dual-socket configuration affect inference speed? I haven't found much information about it.
The GPU is a Quadro RTX 6000. Inference takes 1.1 seconds per second of audio. I have checked the GPU load during inference, and the GPU is in use.
I have also tried an RTX 2070 in a machine with a single processor (one socket), and the inference times there are very low.
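To compare the two machines on equal terms, the time per second of audio can be measured the same way on both. Below is a minimal sketch of such a measurement, not the exact script used here. `transcribe` is a hypothetical placeholder for whatever inference call is being timed (it is not an API from any particular library), and the warm-up run is an assumption meant to exclude one-time setup cost from the average.

```python
import time
from typing import Any, Callable


def real_time_factor(
    transcribe: Callable[[Any], Any],  # hypothetical: the inference call under test
    audio: Any,                        # whatever input that call expects
    audio_seconds: float,              # duration of the audio clip in seconds
    warmup: int = 1,
    runs: int = 3,
) -> float:
    """Return average processing time divided by audio duration.

    A value of 1.1 means 1.1 seconds of compute per second of audio.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if runs <= 0:
        raise ValueError("runs must be positive")

    # Warm-up calls are not timed (assumption: first calls include setup cost).
    for _ in range(warmup):
        transcribe(audio)

    start = time.perf_counter()
    for _ in range(runs):
        transcribe(audio)
    average_elapsed = (time.perf_counter() - start) / runs

    return average_elapsed / audio_seconds
```

Running this with the same audio file on both machines would give two directly comparable numbers.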
Do you have any information about this? Thank you.