I have trained an Arabic model (domain language vocabulary of only 50k words) using the following configuration:
```
--train_files myfile.csv
--checkpoint_dir '/checkpoint-4-new'
--alphabet_config_path 'alphabet.txt'
--dev_files dev.csv
--test_files test.csv
--summary_dir 'summaries-4-new'
--train_batch_size 32
--dropout_rate 0.35
--dev_batch_size 50
--test_batch_size 30
--test_output_file '/test/test.txt'
--scorer_path '/full.kenlm.scorer'
--n_hidden 1024
--export_dir '/export'
--export_tflite true
--learning_rate 0.0001
--epochs 25
--max_to_keep 2
--use_allow_growth "true"
--stderrthreshold debug
--noearly_stop
--automatic_mixed_precision
--augment reverb[p=0.1,delay=50.0~30.0,decay=10.0:2.0~1.0]
```
…
I built the scorer using the following commands:
` python3 generate_lm.py --input_txt 1000.Lines.LM.txt --output_dir --top_k 50000 --kenlm_bins /home/ubuntu/DeepSpeech_latest/EX-HD/kenlm/bin/ --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie --discount_fallback`
`./generate_scorer_package --alphabet alphabet.txt --lm /lm.binary --vocab /vocab-50000.txt --package /new.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284`
On batch testing I got WER: 0.055143, CER: 0.017157, loss: 12.519335, and good results with the web_microphone_websocket demo.
My problem is with the android_mic_streaming demo.
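For context, the model and scorer are loaded in that demo roughly like this (a minimal sketch, assuming the 0.9.x `libdeepspeech` Java bindings; the package name differs in older releases, and the paths and the `loadArabicModel` helper are placeholders of mine, not the demo's actual code):

```kotlin
import org.deepspeech.libdeepspeech.DeepSpeechModel

// Paths are placeholders for wherever the app copies the files on the device.
fun loadArabicModel(): DeepSpeechModel {
    val model = DeepSpeechModel("/sdcard/deepspeech/output_graph.tflite")

    // Attach the external KenLM scorer produced by generate_scorer_package.
    model.enableExternalScorer("/sdcard/deepspeech/new.scorer")

    // Optional: override alpha/beta (the values I passed when packaging the scorer).
    model.setScorerAlphaBeta(0.931289f, 1.1834137f)

    return model
}
```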
The problem is that it only recognizes one utterance and then gets stuck. This happens with short sentences; with long sentences, it recognizes more than one utterance.
For example, if my language model contains the following sentences:

1. *As he crossed toward the pharmacy*
2. *As he crossed toward the pharmacy at the corner he involuntarily turned his head because of a burst of light that had ricocheted from his temple*
3. *the man who then stole his car*
4. *a blindingly white parallelogram of the sky being unloaded from the van — a dresser with mirror, across which, as across a cinema screen, passed a flawlessly clear reflection of boughs, sliding and swaying not aboreally, but with a human vacillation, produced by the nature of those who were carrying this sky, these boughs, this sliding facade*
If I say 'As he crossed toward the pharmacy', it recognizes it and then gets stuck.
If I say 'at the corner he involuntarily turned his head', it recognizes it.
If I then stop and say 'because of a burst of light that', it continues to recognize it,
and it keeps recognizing until the end of the long sentence.
If I speak any part of sentence number 4, it never gets stuck and keeps recognizing until the end of the sentence.
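For reference, my understanding of the demo's recognition loop is roughly the following (a minimal sketch, assuming the 0.9.x `libdeepspeech` Java bindings; `recorder` is the demo's `AudioRecord` and `transcribeWhileRecording` is just a name I am using here, not the demo's actual function). The stream is only finished after recording stops, so the stuck-after-one-short-utterance behavior shows up while this loop is still running:

```kotlin
import android.media.AudioRecord
import android.util.Log
import org.deepspeech.libdeepspeech.DeepSpeechModel

// Sketch of the recognition loop; threading and buffer sizing are simplified.
fun transcribeWhileRecording(
    model: DeepSpeechModel,
    recorder: AudioRecord,
    isRecording: () -> Boolean
): String {
    val streamContext = model.createStream()
    val audioBuffer = ShortArray(2048)

    while (isRecording()) {
        val read = recorder.read(audioBuffer, 0, audioBuffer.size)
        if (read > 0) {
            // Feed raw 16-bit PCM into the open stream.
            model.feedAudioContent(streamContext, audioBuffer, read)
            // Partial hypothesis; this is the text shown while I am speaking.
            val partial = model.intermediateDecode(streamContext)
            Log.d("DeepSpeech", "partial: $partial")
        }
    }

    // The stream is only finished (and a new one created) after recording stops.
    return model.finishStream(streamContext)
}
```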
Where is my mistake?
Note that the Android demo works fine with the English model.
My training data is about 3000 hours.