Using DeepSpeech in a Windows environment

Bazellete · March 13, 2019, 3:41am

Thank you, that was exactly it. After quite a while, it worked!
I fixed some config issues in the Visual Studio projects and it’s now recognizing speech, although it takes up 2 GB of RAM and fails to recognize a lot of words.
I tested by recording audio through an input device from the show The Office (English, American accent); it easily fails on about 20% to 40% of the words.

carlfm01 · March 13, 2019, 3:48am

Do you mean from the speaker using the microphone, or from the Windows output? WPF or console? Test with LibriVox recordings. The Windows solution has been shown to score a WER of 8.87%.
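
For reference, WER (word error rate) is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. Below is a minimal Python sketch for scoring your own recordings. It assumes plain lowercase, whitespace tokenization, the function name `word_error_rate` is only illustrative, and it is not necessarily how the 8.87% figure above was computed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are compared case-insensitively after splitting on
    whitespace; punctuation is not stripped.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                 # deletion (reference word missed)
                curr[j - 1] + 1,             # insertion (extra hypothesis word)
                prev[j - 1] + substitution,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many extra words, so it is not strictly a percentage of "words missed".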

Bazellete · March 13, 2019, 4:10am

Using WPF, I tried from the Windows output; I’m still trying to fix the sound from the microphone.
When I play arctic_a0024.wav it transcribes it perfectly, but that is a perfect recording, which is not possible in my application since it will be recording from a microphone.
Also, it takes quite a while to transcribe live audio. Is this normal behavior?

carlfm01 · March 13, 2019, 4:12am

Remember that the model is not good at handling noise yet. Maybe the audio contains laughs or claps?

Yes, slow transcription is normal if you disabled AVX2.

Bazellete · March 13, 2019, 4:22am

Yes, it does have laughs, claps, and random background noise, which really hurts the recognition. On clean audio it’s pretty good.
Unfortunately, my CPU doesn’t have AVX2.
Would the dataset from Mozilla Common Voice be better at handling background noise?

carlfm01 · March 13, 2019, 4:29am

I know it is trained using Common Voice data, but I’m not sure whether the data used contains noise.
Questions about speech corpora for pre-trained model

Bazellete · March 13, 2019, 5:09am

Interesting. I’m going to keep at it, but I don’t think this is a viable solution for me at the moment because it picks up a lot of background noise, although it is very promising.
I will dedicate this week to trying to fine-tune it, and I’m trying noise gates tomorrow.
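
As a rough illustration of the noise-gate idea, here is a minimal Python sketch that zeroes out any frame whose RMS level falls below a threshold. It assumes 16-bit PCM samples held in a plain list, and the default threshold of 500 and frame size of 320 samples (20 ms at an assumed 16 kHz sample rate) are arbitrary starting points to tune by ear, not recommended values. The function name `noise_gate` is illustrative only.

```python
import math
from typing import List, Sequence


def noise_gate(
    samples: Sequence[int],
    threshold: float = 500.0,  # assumed RMS cutoff for 16-bit PCM; tune per setup
    frame_size: int = 320,     # assumed 20 ms frames at 16 kHz
) -> List[int]:
    """Silence every frame whose RMS amplitude is below `threshold`."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")

    gated: List[int] = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold:
            gated.extend(frame)               # loud enough: keep unchanged
        else:
            gated.extend([0] * len(frame))    # too quiet: replace with silence
    return gated


if __name__ == "__main__":
    quiet = [10, -10] * 160      # one frame of low-level hiss
    loud = [4000, -4000] * 160   # one frame of speech-level signal
    out = noise_gate(quiet + loud)
    print(out[:4], out[320:324])
```

A hard gate like this only silences the gaps between speech. It will not remove laughs or claps that overlap the speech itself, and the abrupt on/off switching can clip word beginnings, so real gates usually add attack and release smoothing.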

Thanks a lot Carlos Fonseca, um abraço.

lissyx · March 13, 2019, 9:53am

FYI, Windows builds on TaskCluster got merged this morning. I’ll add upload to the NuGet Gallery soon …

lissyx · March 13, 2019, 2:29pm

@Bazellete You should now be able to use prebuilt binaries from nuget.org. Please test, it’s still brand new.

lissyx · March 13, 2019, 2:30pm

Please define “it takes quite a while”; it’s not very clear …

dabinat · March 13, 2019, 8:51pm

For what it’s worth, I’ve found DeepSpeech performs decently on interviews, news broadcasts, documentaries, etc., where speech tends to be natural, but very poorly in situations with “acted speech” such as dramas or sitcoms. I think it’s because most people are recording in a neutral, non-emotional tone.

Bazellete · March 14, 2019, 1:19am

lissyx, they worked after a few issues. Much, much easier than compiling the whole thing, thanks.
My bad about it taking a while: for some reason I thought the WPF example would transcribe in real time, but you need to press stop to actually transcribe.
Yes dabinat, it does work quite well in those scenarios. I’m trying to use Voicemeeter Banana and Audacity to minimize background noise. I’m having some success, but it’s still not good enough.

carlfm01 · March 14, 2019, 3:43am

The native client doesn’t support streaming decoding yet. I think I read somewhere that @reuben was hitting issues trying to make a streaming decoder.

lissyx · March 14, 2019, 7:40am

Could you give more information?

Bazellete · March 14, 2019, 11:49am

I had to install .NET 4.6.2 (not really an issue).
References were not added automatically.
There was something else I don’t remember; I will edit the post if I do.

lissyx · March 14, 2019, 12:43pm

Please send patches. cc @carlfm01

carlfm01 · March 14, 2019, 5:45pm

I don’t know what you mean. Do you mean that when you install the NuGet package, DeepSpeechClient.dll is not added to the references, or something else? Please explain a little more.

Bazellete · March 14, 2019, 8:39pm

Exactly that. I installed the NuGet package and it wasn’t added to the references, so I had to add the reference manually.

carlfm01 · March 15, 2019, 2:20am

Working on it. I’ll add 4.5, 4.6 and 4.7.

lissyx · March 15, 2019, 10:01am

It’s landed but it broke