Processing multiple audio files at a time using multithreading

Hello

I tried processing (STT) 10 audio files, each 10 seconds long, at the same time using multithreading. But it took more than 2 minutes to get the transcribed text of those audio files.

Shouldn’t it give me the transcribed text of those files in less than 15 seconds?

Can multithreading be used with DeepSpeech?

Hey @eggonlea, can you help me with this?

@Ganofins Can you share more details? Code? Hardware? Versions?

@Ganofins As lissyx asked, you should provide more details about your environment.

E.g. if you’re running decoding on a CPU, multithreading only helps if you have more than one core. If you’re using a GPU, you need batch mode instead of multithreading. And the default native client doesn’t support that, so you’d have to do it yourself.
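To illustrate the CPU case: a thread pool only buys you anything when the native `stt` call releases the GIL and there are idle cores to run on. This is a minimal sketch; `transcribe` here is a hypothetical stand-in for loading a WAV file and calling DeepSpeech’s `model.stt(audio)`.

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe(path):
    # Hypothetical placeholder: real code would read the WAV into a
    # 16-bit PCM buffer and call model.stt(audio) on it.
    return f"transcript of {path}"

def transcribe_many(paths, max_workers=4):
    # All threads share one Python process; this only speeds things up
    # if the native inference call releases the GIL and spare cores exist.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transcribe, paths))
```

On a 2-core / 4-thread CPU like the one above, anything beyond ~2 workers mostly just adds contention.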

@eggonlea @lissyx

Manjaro KDE

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
RAM: 4 GB
CPU MHz: 812.696
CPU max MHz: 3000.0000
CPU min MHz: 800.0000
Model name: Intel® Core™ i5-2430M CPU @ 2.40GHz

That’s a mid-range, 8-year-old, 2-core / 4-thread CPU. It’s not really surprising that you don’t get much of a speedup. How long does it take to decode one of those files?

Yeah, I know, but that’s the best I have right now.

It takes about 19 seconds to convert a single audio file (of 10 seconds duration) into text.

I could buy a VPS, put DeepSpeech on it, and then process the audio through it like Google Cloud Speech, right?

Can you tell me what minimum specs a VPS would require to process multiple audio files smoothly?

And let me guess: you have all cores running at 100%?

VPSes are complicated, because you will be sharing hardware, so it’s hard to give you guarantees.

What exactly do you want to build?

Yes :slightly_frowning_face:

I just need it to convert audio to text, and the duration of the audio will be more than 5 minutes.

DeepSpeech inference takes up significant system resources, so it may actually end up being faster to just process one file at a time on your system.

But is there any way to process multiple files at the same time?

You can either:

  1. Rebuild the Python client to use multiprocessing (see the thread “Running multiple inferences in parallel on a GPU”).
  2. Run the DeepSpeech inference in a Flask app; Flask will handle the multithreaded predictions. Be sure to run 0.6.0, since 0.5.1 is not thread-safe.
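For option 1, a rough sketch of the multiprocessing pattern: one worker process per core, each loading its own model once in an initializer. The model loading and `_transcribe` body here are hypothetical placeholders; with DeepSpeech you would create `deepspeech.Model(...)` in `_init_worker` and call `_model.stt(audio)` in `_transcribe`.

```python
import multiprocessing as mp

_model = None  # one model instance per worker process

def _init_worker():
    # Runs once in each worker. Placeholder: with DeepSpeech this would be
    # something like: _model = deepspeech.Model("output_graph.pbmm")
    global _model
    _model = object()

def _transcribe(path):
    # Placeholder: real code would read the WAV and return _model.stt(audio)
    return f"transcript of {path}"

def transcribe_all(paths, workers=2):
    # Results come back in the same order as the input paths.
    with mp.Pool(workers, initializer=_init_worker) as pool:
        return pool.map(_transcribe, paths)

if __name__ == "__main__":
    print(transcribe_all(["a.wav", "b.wav"]))
```

Note that each worker holds a full model in memory, so on a 4 GB machine two workers is probably the practical ceiling.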