Tune Mozilla DeepSpeech to recognize specific sentences

Hi

Thanks to Mozilla for its wonderful DeepSpeech project.

I have some problems with accuracy.

In the Windows speech recognition library you can limit the vocabulary to a fixed set of sentences or words, which makes the results very accurate.

Is there any option for this in Mozilla Deep Speech?

You can train your own language model on only the desired sentences. See here for how we trained ours on a larger set of text.

Thanks for your answer.
I didn’t understand what I should do.

Is there any tutorial or video (explaining step by step)?

  • put your sentences in a text file, one on each line
  • make the text lowercase
  • remove all characters other than a-z, spaces and the apostrophe ’
  • build language model using the link from the previous comment
  • build trie using the command from the link
  • when running the inference, use your custom language model and trie
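
The first three steps above can be sketched in Python. This is a minimal illustration, not part of DeepSpeech or KenLM: the function names and the file name in the usage note are hypothetical, and two choices are assumptions of this sketch. Disallowed characters are replaced by a space rather than deleted, so neighbouring words are not glued together, and a typographic apostrophe is mapped to the ASCII one.

```python
import re

# Anything that is not a-z, an ASCII apostrophe or a space is treated as disallowed.
_DISALLOWED = re.compile(r"[^a-z' ]+")


def normalize_sentence(line: str) -> str:
    """Lowercase one sentence and keep only a-z, apostrophe and single spaces."""
    text = line.lower().replace("\u2019", "'")  # typographic apostrophe -> ASCII
    text = _DISALLOWED.sub(" ", text)           # assumption: replace, do not glue words
    return " ".join(text.split())               # collapse repeated whitespace


def normalize_corpus(lines):
    """Return the cleaned sentences, dropping lines that end up empty."""
    cleaned = (normalize_sentence(line) for line in lines)
    return [sentence for sentence in cleaned if sentence]


def write_corpus(sentences, path):
    """Write one sentence per line as UTF-8 with Unix (\\n) line endings."""
    # newline="\n" prevents Python from writing \r\n on Windows.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for sentence in sentences:
            handle.write(sentence + "\n")
```

For example, `write_corpus(normalize_corpus(["Could you maximize the form?"]), "vocabulary.txt")` would produce a one-line file containing `could you maximize the form`; that file is then the input for the language model and trie steps.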

Where can I find generate_trie?

It is in the native client package on the releases page. Choose the one for your system, e.g. for Linux, the latest native client (0.5.0) is here

For the words that DeepSpeech predicts, can I get a collection of alternatives?

Yes, although it’s worth experimenting to see the impact.

Might be worth reading a bit about Language Models in general so you have an idea of what they’re for and why they’re being used here.

I can’t find any example of that. Would you send me an example of getting a collection of alternative words from Mozilla DeepSpeech?

Thanks to your help, I successfully generated lm.binary and trie,
but deepspeech crashes when using them without reporting any error.

This is my text file:

one
two
three
how
what
could
where
report
maximize
minimize
could you maximize the form
would you minimize the form

I also had to add --discount_fallback to lmplz,
and I am using deepspeech 0.5.0.

What is my mistake?

@kdavis @yv001 @nmstoker

I haven’t migrated to 0.5.0 from 0.4.1 yet, so that might be specific to the new release. Custom lm models work just fine for me on 0.4.1.

OK, I will test it on 0.4.1.
Did you pass --discount_fallback to lmplz?
Is the version of kenlm important?

@yv001

No, I did not use that option.

I think I need an example of a text file which works with kenlm.
I didn’t find any on the web.
Would you send me an example?
@yv001

One is referenced in the original link above and used in the script; you can also read about the LM in this post.

A simple UTF-8 encoded file containing phrases made of lowercase a-z and apostrophes, one phrase per line, should work. Also make sure that you have compatible line endings (\n on Linux).
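
A quick way to check a corpus file against those requirements is sketched below. It is a hypothetical helper, not a DeepSpeech or KenLM tool, and the thread does not establish that any of these problems is what causes the crash described above. Flagging a byte order mark is an extra assumption of this sketch.

```python
import re

# A valid line consists only of lowercase a-z, ASCII apostrophes and spaces.
_VALID_LINE = re.compile(r"[a-z' ]+")


def check_corpus_bytes(data: bytes):
    """Return a list of human-readable problems found in the raw file content."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]

    problems = []
    if text.startswith("\ufeff"):
        problems.append("file starts with a byte order mark (BOM)")
        text = text[1:]
    if "\r" in text:
        problems.append("file contains carriage returns; use \\n line endings")

    # Normalize line endings only for the per-line character check.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    for number, line in enumerate(lines, start=1):
        if line and not _VALID_LINE.fullmatch(line):
            problems.append(f"line {number}: unexpected characters in {line!r}")
    return problems


def check_corpus_file(path):
    """Read the file in binary mode so encoding and line endings are seen as-is."""
    with open(path, "rb") as handle:
        return check_corpus_bytes(handle.read())
```

An empty list from `check_corpus_file("vocabulary.txt")` (file name assumed) means the file matches the format described; a file saved on Windows with \r\n line endings would be reported.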

Do I have to train the model again after creating a new LM and trie?
My sentences are in English.
@yv001

For standard English, the acoustic model can stay the same. Fine tuning of the acoustic model would be needed if you were planning on transcribing atypical words, e.g. names of products or companies etc.

Thanks a lot.

So I don’t have to train the model, because I don’t need any new words.

I still could not find a solution to my problem with making the LM and trie.

This is my text file:

one
two
three
how
what
could
where
report
maximize
minimize
could you maximize the form
would you minimize the form

I know this is a lot to ask, but would you test it (making the LM and trie and running deepspeech)?

@yv001

Sorry, I don’t have time to set up 0.5.0 right now.
