I do not understand how to use DeepSpeech on Windows (if it is possible at all)

(Jfenigan) #1

This might have been mentioned elsewhere, but I cannot seem to find an answer, and since I am new to this, I need to ask: how does one use deepspeech on Windows? Assuming I do not want to use virtual machines running Linux, is there any way I can use the Python bindings? I read somewhere that I need to compile it myself, but I would like to avoid that. I also read something about NuGet packages, which I have downloaded but have no idea how to use. Is it possible to run a command like: deepspeech --model models\output_graph.pbmm --alphabet models\alphabet.txt --lm models\lm.binary --trie models\trie --audio speech.wav

None of the relevant material I have stumbled upon has helped me. Could someone please clarify how one can get deepspeech to work on Windows without compiling and/or retraining? Thank you!

(20richardh) #2

Hi,

Unfortunately, you can’t use this on Windows. I have a Windows machine, so I had to go with a virtual machine (the alternative is installing Linux on a separate partition). It ultimately works out fine, although you can’t use a GPU in a virtual machine; for GPU support you’d have to partition. I’m only about two weeks into this myself, so I haven’t used the Python bindings.

Either way, the people who mainly respond on these forums have said that Mozilla DeepSpeech only supports Mac and Linux systems, so I’m guessing that Windows doesn’t have much of a workaround…

(Carlos Fonseca) #3

Hello @jfenigan

Well, there’s a middle ground here. Windows compilation support was introduced late, around the time 0.4.1 was released, and most of the work on the bindings came after that release. Later changes to the model and to the bindings made the bindings incompatible with the 0.4.1 model, and note that the model for 0.5 has not been released yet. At the time 0.4.1 was released, I think the only client that worked with it was the .NET client.

The NuGet package is intended to be used as a dependency of an existing project; the most common way to use it is through the NuGet package manager in Visual Studio.

You can use my very old builds for 0.4.1 model: https://github.com/carlfm01/deepspeech-tempwinbuilds/releases/

With the build I linked, yes, you can run it from the command line:

DeepSpeechConsole.exe --model output_graph.pbmm --alphabet alphabet.txt --lm lm.binary --trie trie --audio arctic_a0024.wav

For now I think this is the only way.

(Jfenigan) #4

Thank you for this info! However, being a complete newbie, I do not know how to use your repo. I downloaded libdeepspeech-avx-avx2.zip and libdeepspeech-no-avx.zip, and unzipping them gave me a libdeepspeech.so file. I think I read somewhere that this file is just a DLL, but I have no clue what to do with it.
It’s the same file I got when I extracted a NuGet package. I reckon that the "DeepSpeechConsole.exe --model output_graph.pbmm --alphabet alphabet.txt --lm lm.binary --trie trie --audio arctic_a0024.wav" command needs to be executed from within a folder that contains those files, right? I wish there was a step-by-step guide, because things that others consider self-explanatory are not that obvious to me.

(Carlos Fonseca) #5

Yes, that’s the native client (C++). You also need to download DeepSpeech-Console.zip and put the .so file in the same folder as the .exe, side by side.

The console project can be found at the first release:

(Carlos Fonseca) #6

avx and avx2 mean that the code was compiled to use AVX and AVX2 instructions (it will run faster).

(Jfenigan) #7

Well, I downloaded the DeepSpeech-Console.zip and placed the libdeepspeech.so file in the root directory where the DeepSpeechConsole.exe is. I ran the command “DeepSpeechConsole.exe --model output_graph.pbmm --alphabet alphabet.txt --lm lm.binary --trie trie --audio arctic_a0024.wav” but I got the following:
Loading model…
Error loading lm.
Cannot find the alphabet file: alphabet.txt
Error loding the model.

What am I doing wrong?

(Jfenigan) #8

Update: I managed to make it work by placing all files (including the trained model) in the same folder. I just find inference to be slower than it was when I tried it on a Linux system. Now the question is how I get it to work from inside Python (if it is possible).
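
For anyone hitting the same "Cannot find the alphabet file" error, here is a minimal Python sketch of the layout that worked above: it checks that everything sits in one folder and builds the same command line quoted in this thread. The helper names (missing_files, build_command) are made up for illustration, and the file list is only what this thread mentions, so treat it as an assumption rather than an official requirement. Note that this only launches the console program; it is not the Python bindings.

```python
import os

# File names taken from the command line quoted in this thread.
REQUIRED_FILES = [
    "DeepSpeechConsole.exe",
    "libdeepspeech.so",
    "output_graph.pbmm",
    "alphabet.txt",
    "lm.binary",
    "trie",
]


def missing_files(folder):
    """Return the required file names that are not present in `folder`."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(folder, name))]


def build_command(folder, audio_name):
    """Build the argument list for DeepSpeechConsole.exe (hypothetical helper).

    The model files are passed as bare names, so the program should be
    started with `folder` as its working directory.
    """
    missing = missing_files(folder)
    if missing:
        raise FileNotFoundError(
            "missing in %s: %s" % (folder, ", ".join(missing)))
    return [
        os.path.join(folder, "DeepSpeechConsole.exe"),
        "--model", "output_graph.pbmm",
        "--alphabet", "alphabet.txt",
        "--lm", "lm.binary",
        "--trie", "trie",
        "--audio", audio_name,
    ]
```

To actually run it, something like subprocess.run(build_command(folder, "arctic_a0024.wav"), cwd=folder) should start the program from that folder so the relative file names resolve.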

(Lissyx) #9

pip install deepspeech==0.5.0a8

(Jfenigan) #10

Just to make sure we are on the same page: I have successfully installed the 0.5 alpha version of the Python deepspeech package, but I have some old binaries for the 0.4 version. Can I still use the 0.4 binaries programmatically through Python? Are there any tutorials?

(Lissyx) #11

No, we did the work for bindings post-0.4.1. There’s no alternative.
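
As a rough illustration of that cutoff, a small Python sketch can check whether a given version string is newer than 0.4.1. The function names are hypothetical, and the rule (anything after 0.4.1 has the Windows bindings work) is only an assumption drawn from this thread, not an official compatibility table.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from a string such as '0.5.0a8'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version: %r" % text)
    return tuple(int(part) for part in match.groups())


def has_windows_bindings(version_text):
    # Assumption from this thread: the bindings work for Windows was
    # done after 0.4.1, so only later versions qualify.
    return parse_version(version_text) > (0, 4, 1)
```

On Python 3.8 or later, the installed version string can be read with importlib.metadata.version("deepspeech") and passed to has_windows_bindings.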