When I use the pre-trained model, it outputs the following:
Loaded model in 1.678s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 5.425s.
Running inference.
Inference took 21.084s for 3.990s audio file.
Is there a way to load the model, and ideally the language model as well, only once, before submitting audio files? For example, when 10 users use the service one after the other, the model currently gets loaded 10 times.
I am building a transcription service on top of the DeepSpeech library, and loading everything only once could be a significant performance improvement.
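At the moment the service calls the deepspeech client in a subprocess for each request, roughly like this hypothetical sketch (paths and argument order are illustrative, not my actual code):

```python
# Hypothetical sketch of the current per-request pattern: every request
# spawns the deepspeech CLI in a subprocess, so the acoustic model and
# the language model are reloaded on each call. Paths are illustrative.
import subprocess

def transcribe(wav_path):
    return subprocess.check_output([
        'deepspeech', 'models/output_graph.pb', wav_path,
        'models/alphabet.txt', 'models/lm.binary', 'models/trie',
    ]).decode('utf-8')
```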
Regards,
Niklas
lissyx
Share your code with us, because that’s trivial to do, and it is already what we do in native_client/client.cc for the multi-WAV use case (multiple WAV files in one directory).
Just for clarification: you mean I should use the command-line client instead of the Python package and change the native-client “client.py” script so that it loads the model only once?
lissyx
No, I’m saying you should not call subprocess; instead, import deepspeech directly in your codebase. That way you can control loading of the model separately from inference. You can look at client.py to see the code to use.
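A minimal sketch of that load-once pattern, loosely following native_client/python/client.py from the 0.2.x releases (the constructor arguments, constants, and enableDecoderWithLM() signature below match that era’s example client and may differ in other versions; model and alphabet paths are illustrative):

```python
# Minimal load-once sketch, loosely following client.py from the
# DeepSpeech 0.2.x releases. Constants and signatures may differ in
# other versions; the model/alphabet paths are illustrative.
import scipy.io.wavfile as wav
from deepspeech import Model

N_FEATURES = 26   # MFCC features per frame (as in the example client)
N_CONTEXT = 9     # context frames on each side
BEAM_WIDTH = 500
LM_WEIGHT = 1.75
WORD_COUNT_WEIGHT = 1.00
VALID_WORD_COUNT_WEIGHT = 1.00

# Load the acoustic model and the language model once, at service startup.
ds = Model('models/output_graph.pb', N_FEATURES, N_CONTEXT,
           'models/alphabet.txt', BEAM_WIDTH)
ds.enableDecoderWithLM('models/alphabet.txt', 'models/lm.binary',
                       'models/trie', LM_WEIGHT, WORD_COUNT_WEIGHT,
                       VALID_WORD_COUNT_WEIGHT)

def transcribe(wav_path):
    # Only this part runs per request; the models stay resident in memory.
    fs, audio = wav.read(wav_path)  # expects 16-bit mono PCM at 16 kHz
    return ds.stt(audio, fs)

# e.g. ten users submitting files one after the other:
for path in ('user01.wav', 'user02.wav'):
    print(transcribe(path))
```

With this structure, the roughly 7 s of model and language-model loading from the log above is paid once per process instead of once per request.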
DeepSpeech is installed on my machine, and I also upgraded it today. For example, when I open a Python shell, I can run “import deepspeech” without any issues.
Is that what you mean?
lissyx
Yes, but it’s still very blurry… Like, which version of DeepSpeech is installed? And how did you install it?
Thanks for your patience. I have version 0.2.0 and I installed it via pip.
lissyx
Can you ensure it’s available when you try the client.py code? The error suggests it’s not available at that moment. Again, a proper description of your STR (steps to reproduce) would save everyone a lot of time.
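One quick, hypothetical check: run the following with the exact same interpreter that executes client.py; it prints where the package is loaded from, or raises ImportError if it is missing in that environment:

```python
# Run with the same Python interpreter that executes client.py.
import deepspeech
print(deepspeech.__file__)  # shows which installation gets picked up
```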