Has anyone used the Streaming API with deepspeech==0.5.1?

Sudarshan.gurav14 · May 22, 2020, 9:38am

Has anyone used the CreateStream function with deepspeech==0.5.0? It is available from deepspeech==0.6.0 onward.

My model is trained with a WER of 0.013 and a CER of 0.011. Thank you, everyone, for your help.

The model was trained on deepspeech==0.5.0, so I am unable to use the CreateStream() method.

Please help, or suggest any ideas on how to use it.

lissyx · May 22, 2020, 9:48am

Retrain on 0.7, that’s the best you can do. Everything else will be headaches and more time consumed than just upgrading to a newer version.

Sudarshan.gurav14 · May 22, 2020, 9:54am

@lissyx Thank you, I will try upgrading.

Sudarshan.gurav14 · May 22, 2020, 11:53am

Is there any package or GitHub repo similar to deepspeech for printed character recognition, mainly CNN- and RNN-based?

Sudarshan.gurav14 · May 23, 2020, 11:38am

@lissyx All packages installed successfully, but I am unable to download the generate_trie file.

I upgraded to deepspeech==0.7.0.

Can you please share the commands for downloading the generate_trie file?

othiele · May 23, 2020, 11:52am

A model trained for 0.5 can’t be used with 0.7 code. As @lissyx said, retrain your model or try to backport streaming, but I would strongly advise against that option.

The 0.7 code uses a combined binary and trie called the scorer; check data/lm to see how it is built.

Sudarshan.gurav14 · May 23, 2020, 5:43pm

@othiele Thanks

    python generate_lm.py --input_txt Vocabulary20052020.txt --output_dir . \
      --top_k 500000 --kenlm_bins /home/ec2-user/LM/kenlm/build/bin/ \
      --arpa_order 3 --max_arpa_memory "25%" --arpa_prune "1" \
      --binary_a_bits 255 --binary_q_bits 8 --binary_type trie

This command ran successfully.

It created vocab-500000.txt and the lm.binary file.

Then,

    python generate_package.py --alphabet ../alphabet.txt --lm lm.binary --vocab vocab-500000.txt \
      --package kenlm.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284

After that, I got the results below:

334 unique words read from a vocabulary file.
Doesn't look like a character-based model.
Using detected UTF-8 mode: False
Package created in kenlm.scorer

Thanks @othiele for the help.
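For anyone following along, here is a rough sketch of how the kenlm.scorer package might be used with the streaming calls in the 0.7 Python package (`createStream`, `feedAudioContent`, `finishStream`). It has not been run against a real model. The function names `iter_chunks` and `transcribe_streaming` and the 2048-byte chunk size are only illustrative choices, and the audio is assumed to be mono 16-bit PCM at the model's sample rate.

```python
import wave


def iter_chunks(pcm_bytes, chunk_bytes=2048):
    """Yield successive slices of raw 16-bit PCM, chunk_bytes at a time."""
    # Each sample is 2 bytes, so the chunk size must be a positive even number.
    if chunk_bytes <= 0 or chunk_bytes % 2 != 0:
        raise ValueError("chunk_bytes must be a positive even number")
    for start in range(0, len(pcm_bytes), chunk_bytes):
        yield pcm_bytes[start:start + chunk_bytes]


def transcribe_streaming(model_path, scorer_path, wav_path):
    """Feed a WAV file to a DeepSpeech 0.7 stream chunk by chunk."""
    # Imported here so the chunking helper can be used without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    # The scorer replaces the separate lm.binary + trie files used before 0.7.
    model.enableExternalScorer(scorer_path)

    with wave.open(wav_path, "rb") as wav:
        if wav.getframerate() != model.sampleRate():
            raise ValueError("WAV sample rate does not match the model")
        pcm = wav.readframes(wav.getnframes())

    stream = model.createStream()
    for chunk in iter_chunks(pcm):
        # feedAudioContent expects a NumPy array of 16-bit samples.
        stream.feedAudioContent(np.frombuffer(chunk, dtype=np.int16))
    # finishStream runs the final decode and returns the transcript text.
    return stream.finishStream()
```

A call would look something like `transcribe_streaming("output_graph.pbmm", "kenlm.scorer", "test.wav")`, where the model and WAV file names are placeholders for whatever the retrained 0.7 model and test audio are called.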

othiele · May 23, 2020, 6:48pm

Why do you think you have a problem?

lissyx · May 25, 2020, 2:22pm

Can you please read the documentation?