Processing multiple audio files at a time using multithreading

Ganofins · July 30, 2019, 2:25pm

Hello

I tried running speech-to-text (STT) on 10 audio files, each 10 seconds long, at the same time using multithreading. But it took more than 2 minutes to get the transcribed text for those files.

Shouldn’t it give me the transcribed text for those files in less than 15 seconds?

Can multithreading be used with DeepSpeech?

Ganofins · July 31, 2019, 1:33pm

Hey @eggonlea can you help me with this?

lissyx · July 31, 2019, 3:17pm

@Ganofins Can you share more details? Code? Hardware? Versions?

eggonlea · July 31, 2019, 4:42pm

@Ganofins as lissyx asked, you should provide more details about your environment.

E.g. if you’re running decoding on CPU, multithreading only helps if you have more than one CPU core. If you’re using a GPU, you need batch mode instead of multithreading. The default nativeclient doesn’t support that, so you’d have to implement it yourself.

Ganofins · July 31, 2019, 5:14pm

@eggonlea @lissyx

Manjaro KDE

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
RAM: 4 GB
CPU MHz: 812.696
CPU max MHz: 3000.0000
CPU min MHz: 800.0000
Model name: Intel® Core™ i5-2430M CPU @ 2.40GHz

lissyx · July 31, 2019, 5:42pm

That’s a mid-range, 8-year-old CPU with 2 cores / 4 threads. It’s not really surprising that you don’t get much of a speedup. How much time does it take to decode one of those files?

Ganofins · August 1, 2019, 6:14am

Yeah, I know, but that’s the best I can have right now.

It takes about 19 seconds to convert a single audio file (10 seconds long) into text.
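
For a rough sense of scale, the numbers in this thread can be turned into a back-of-the-envelope estimate. The sketch below assumes each inference keeps exactly one physical core busy and that files are processed in equal waves. The helper names are hypothetical, and real timings will differ, for example if a single inference already spreads across several cores.

```python
import math

def real_time_factor(processing_secs, audio_secs):
    """Seconds of compute needed per second of audio (above 1 is slower than real time)."""
    return processing_secs / audio_secs

def estimate_wall_time(n_files, secs_per_file, workers):
    """Idealised wall time: files run in waves of `workers`, each wave taking secs_per_file."""
    waves = math.ceil(n_files / workers)
    return waves * secs_per_file

print(real_time_factor(19, 10))       # 1.9
print(estimate_wall_time(10, 19, 1))  # 190 (one file at a time)
print(estimate_wall_time(10, 19, 2))  # 95  (two physical cores, ideal case)
```

Under these assumptions, 10 files at 19 seconds each need roughly 190 seconds one after another and about 95 seconds on two cores, so a total of under 15 seconds is out of reach on this hardware.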

Ganofins · August 1, 2019, 8:22am

I can buy a VPS, put DeepSpeech on it, and then process the audio through it, like Google Cloud Speech, right?

Can you tell me the minimum VPS spec required to process multiple audio files smoothly?

lissyx · August 1, 2019, 8:28am

And let me guess: you have all cores running at 100%?

VPSes are complicated, because you share the hardware with other users, so it’s hard to give you any guarantees.

What exactly do you want to build?

Ganofins · August 1, 2019, 9:03am

Yes

I just need it to convert audio to text, and the audio files will be more than 5 minutes long.

dabinat · August 2, 2019, 2:57pm

DeepSpeech inference takes up significant system resources, so it may actually end up being faster to just process one file at a time on your system.

Ganofins · August 2, 2019, 4:51pm

But is there any way to process multiple files at the same time?

KevinNotable · December 19, 2019, 1:23am

You can either:

Rebuild the Python client to use multi-processing (see the thread “Running multiple inferences in parallel on a GPU”).
Run the DeepSpeech inference in a Flask app. Flask will handle running predictions across multiple threads. Be sure to run 0.6.0, since 0.5.1 is not thread-safe.
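
A minimal sketch of the multi-processing option, using only the Python standard library. `load_model` and `transcribe` are hypothetical placeholders, not DeepSpeech API calls, so swap in the real model loading and inference code for your DeepSpeech version. Each worker loads its own model once, so no model object is passed between workers.

```python
import os
from concurrent.futures import ProcessPoolExecutor

_MODEL = None  # one model per worker, set by _init_worker

def load_model():
    """Hypothetical placeholder: load and return the speech model."""
    return object()

def transcribe(model, path):
    """Hypothetical placeholder: run inference on one audio file."""
    return f"transcript of {path}"

def _init_worker():
    # Runs once in each worker, so the model is loaded once per worker
    # instead of once per file.
    global _MODEL
    _MODEL = load_model()

def _transcribe_one(path):
    return transcribe(_MODEL, path)

def transcribe_all(paths, workers=None, executor_cls=ProcessPoolExecutor):
    """Transcribe all paths with a pool of workers; results keep input order."""
    workers = workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers, initializer=_init_worker) as pool:
        return list(pool.map(_transcribe_one, paths))
```

Call `transcribe_all(["a.wav", "b.wav"], workers=2)` from under an `if __name__ == "__main__":` guard so worker processes can start cleanly on every platform. Note that `os.cpu_count()` reports logical CPUs, so on a 2-core / 4-thread machine like the one above, passing `workers=2` explicitly is probably the more realistic choice. Every worker also holds its own copy of the model, which matters on a machine with 4 GB of RAM.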