Why doesn't the TFLite model return any words?

Hung_Phung · April 22, 2021, 10:13am

I am trying to create my own model from a small amount of data, training on Colab (TensorFlow 1.15, CUDA 11.2); this is the notebook.
First, I created vocab-500.txt and lm.binary:
python3 generate_lm.py --input_txt vocabulary.txt --output_dir . \
  --top_k 500 --kenlm_bins ~/DeepSpeech/kenlm/build/bin/ \
  --arpa_order 3 --max_arpa_memory "85%" --arpa_prune "0|0|1" \
  --binary_a_bits 255 --binary_q_bits 8 --binary_type trie \
  --discount_fallback
Second, I created the scorer with alpha = 0.75 and beta = 1.85:
./generate_scorer_package --alphabet alphabet.txt \
  --lm lm.binary \
  --vocab vocab-500.txt \
  --package kenlm.scorer \
  --default_alpha 0.75 \
  --default_beta 1.85 \
  --force_bytes_output_mode True
Third, I trained the model:
! python3 /content/DeepSpeech/DeepSpeech.py --n_hidden 2048 \
  --early_stop True \
  --es_epochs 30 \
  --test_batch_size 1 \
  --dev_batch_size 10 \
  --train_batch_size 16 \
  --feature_win_step 10 \
  --train_cudnn True \
  --checkpoint_dir /content/ \
  --epochs 100 \
  --train_files /content/vivos/train.csv \
  --dev_files /content/vivos/dev.csv \
  --test_files /content/vivos/test.csv \
  --learning_rate 0.000095 \
  --export_tflite \
  --export_dir /content/DeepSpeech/output_models/ \
  --automatic_mixed_precision True \
  --dropout_rate 0.05 \
  --alphabet_config_path /content/vivos/alphabet.txt \
  --scorer_path /content/vivos/kenlm.scorer
This gives the following results:
Epoch x | Training | Elapsed Time: 0:05:50 | Steps: 728 | Loss: 0.669057
Epoch x | Validation | Elapsed Time: 0:00:08 | Steps: 52 | Loss: 96.212590 | Dataset: /content/vivos/dev.csv
and the test transcriptions contain no words at all:
Best WER:

WER: 1.000000, CER: 1.000000, loss: 318.883636

wav: file:///content/vivos/test/waves/VIVOSDEV02/VIVOSDEV02_R154.wav
src: “BỌN BUÔN DỰ ÁN CHẠY CHỌT BÀY RA CÁC DỰ ÁN ĐỂ KIẾM CHÁC”
res: “”

WER: 1.000000, CER: 1.000000, loss: 221.555115

wav: file:///content/vivos/test/waves/VIVOSDEV02/VIVOSDEV02_R132.wav
src: “ĐIỆN THOẠI RENG NHƯNG TA CŨNG PHẢI NHẤC MÁY ĐÚNG KHÔNG”
res: “”

WER: 1.000000, CER: 1.000000, loss: 204.180481

wav: file:///content/vivos/test/waves/VIVOSDEV02/VIVOSDEV02_T022.wav
src: “THÁNG TRƯỚC CHỒNG VỀ MẤY NGÀY MÌNH NẰM NGỦ THẤY ẤM KINH KHỦNG”
res: “”

WER: 1.000000, CER: 1.000000, loss: 198.045197

wav: file:///content/vivos/test/waves/VIVOSDEV02/VIVOSDEV02_R071.wav
src: “MỖI KHI XE RÁC ĐI NGANG QUA ĐÁNH KẺNG MỚI ĐEM RA ĐỔ”
res: “”

WER: 1.000000, CER: 1.000000, loss: 195.038269

wav: file:///content/vivos/test/waves/VIVOSDEV02/VIVOSDEV02_R045.wav
src: “CÔ CỨ NHÈ NGAY CÁI NỌNG CÁ TRÊ”
res: “”

Fourth, I searched for alpha and beta with this model:
! python3 /content/DeepSpeech/lm_optimizer.py \
  --test_files /content/vivos/test.csv \
  --alphabet_config_path /content/vivos/alphabet.txt \
  --checkpoint_dir /content/fine_tuning_checkpoints \
  --n_hidden 2048
Results: [I 2021-04-22 09:34:16,592] Trial 0 finished with value: 0.9127290260366442 and parameters: {'lm_alpha': 2.6143084230162623, 'lm_beta': 1.998640895456838}. Best is trial 0 with value: 0.9127290260366442.
Even in the worst cases, the output still contains words:
Worst WER:

WER: 1.000000, CER: 0.666667, loss: 48.078129

wav: file:///content/vivos/test/waves/VIVOSDEV04/VIVOSDEV04_R049.wav
src: “VÌ TỐI ĐÓ”
res: “VIÌ TÍ MỐC”

WER: 1.000000, CER: 0.625000, loss: 46.857883

wav: file:///content/vivos/test/waves/VIVOSDEV05/VIVOSDEV05_091.wav
src: “TRÁN GIÔ”
res: “TẢN DIỘC”

WER: 1.000000, CER: 0.333333, loss: 35.851494

wav: file:///content/vivos/test/waves/VIVOSDEV06/VIVOSDEV06_222.wav
src: “NĂM MƯƠI NĂM MƯƠI MỐT”
res: “ĐĂM MƯỜI ĐĂM MÙY MÓT”

WER: 1.000000, CER: 0.500000, loss: 29.178698

wav: file:///content/vivos/test/waves/VIVOSDEV04/VIVOSDEV04_R100.wav
src: “BÀ NÓI”
res: “BẢ ÓC”

WER: 1.000000, CER: 0.263158, loss: 25.467527

wav: file:///content/vivos/test/waves/VIVOSDEV06/VIVOSDEV06_212.wav
src: “BA MƯƠI BA MƯƠI MỐT”
res: “BAN MƯỜI BAN MƯAI MÓT”
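
For readers wondering how an empty transcription ends up with WER and CER of exactly 1.0, while a garbled one can have WER 1.0 and a lower CER, here is a minimal Python sketch. It assumes both metrics are the Levenshtein distance divided by the reference length (counted in words for WER and in characters for CER), which is the usual definition. The helper names are illustrative and are not taken from the DeepSpeech code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def cer(ref, hyp):
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(ref, hyp) / len(ref)


def wer(ref, hyp):
    """Word error rate; assumes a non-empty reference."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


reference = "BÀ NÓI"
for hypothesis in ("", "BẢ ÓC"):
    print(repr(hypothesis), wer(reference, hypothesis), cer(reference, hypothesis))
```

With an empty hypothesis every reference word and character counts as a deletion, so both rates are 1.0. In the second case both words are wrong, but half of the characters still line up, provided the strings are in precomposed (NFC) form.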

Then I created a new scorer with these alpha and beta values:
./generate_scorer_package --alphabet alphabet.txt \
  --lm lm.binary \
  --vocab vocab-500.txt \
  --package kenlm.scorer \
  --default_alpha 2.6143084230162623 \
  --default_beta 1.998640895456838 \
  --force_bytes_output_mode True

I use this TFLite model and scorer in the android_mic_streaming app, but I don't get any words in the output. Please help me.

Hung_Phung · April 22, 2021, 6:01pm

Perhaps the problem is with my scorer. I think I created the scorer file correctly; this is my alphabet.txt. Has anyone run into a similar situation? Please help me!

ftyers · April 23, 2021, 12:28am

This looks about right for the amount of data. You might try using transfer learning, and for Vietnamese you might try reducing the vocabulary size by using NFKD.
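
The NFKD suggestion can be tried with Python's standard `unicodedata` module. This is a minimal sketch, assuming "vocabulary size" here means the number of distinct output characters; the function name and the sample transcripts (taken from the test output earlier in this thread) are only for illustration.

```python
import unicodedata


def character_inventory(transcripts, form):
    """Collect the distinct characters in the transcripts after Unicode normalization."""
    inventory = set()
    for text in transcripts:
        inventory.update(unicodedata.normalize(form, text))
    return sorted(inventory)


# Placeholder transcripts; in practice, read the transcript column of train.csv.
transcripts = ["VÌ TỐI ĐÓ", "BÀ NÓI", "NĂM MƯƠI NĂM MƯƠI MỐT"]

for form in ("NFC", "NFKD"):
    print(form, len(character_inventory(transcripts, form)))

# A precomposed letter splits into a base letter plus combining marks.
print([unicodedata.name(c) for c in unicodedata.normalize("NFKD", "Ố")])
```

On a handful of short sentences the decomposed inventory is not necessarily smaller. The saving should show up on a full corpus, where many precomposed Vietnamese letters share a small set of base letters and tone marks. If you go this way, the transcripts, the language model text, and alphabet.txt would presumably all need the same normalization.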

Hung_Phung · April 23, 2021, 3:26am

As I understand it, transfer learning means reusing the checkpoints from a previous training run, and I am already doing that. The odd part is that when I search for the right alpha and beta, the test results contain words, but when I build a scorer with those values, no words come out.

ftyers · April 24, 2021, 2:39pm

I'm sorry, I didn't understand. Also, your alphabet has a space after every character. Could you join us on Matrix?
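
A quick way to check for stray whitespace in alphabet.txt is sketched below. It assumes the usual DeepSpeech alphabet format (one label per line, lines starting with `#` are comments, `\#` stands for a literal hash); the function name is made up for this example.

```python
def suspicious_alphabet_entries(lines):
    """Return (line_number, entry) pairs whose label is not exactly one character."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        entry = raw.rstrip("\n")
        if entry.startswith("\\#"):
            entry = entry[1:]   # escaped hash: drop the backslash, keep the rest
        elif entry.startswith("#"):
            continue            # comment line
        if len(entry) != 1:
            problems.append((number, entry))
    return problems


# In-memory demo; for a real file use: open("alphabet.txt", encoding="utf-8").readlines()
sample = ["# comment\n", "a\n", " \n", "b \n"]
for number, entry in suspicious_alphabet_entries(sample):
    print(f"line {number}: {entry!r}")
```

A line holding a single space is the legitimate space label, so it passes. An entry such as `'b '` is flagged, as are empty lines, Windows line endings, and letters stored as a base character plus combining marks on one line.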

ftyers · April 24, 2021, 3:54pm

So I just tried training with Vietnamese and the Common Voice data. You can find the model here.

The results without an LM, and the results with an LM.

I used:

drop_source_layers: 2
learning_rate: 0.00001
dropout: 0.2
no SpecAugment
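
As a rough sketch, these settings could be turned into a DeepSpeech.py command line like this. The mapping of "dropout" to `--dropout_rate` is an assumption, "no SpecAugment" is taken to mean that no `--augment` options are passed, and every path below is a hypothetical placeholder rather than the one actually used.

```python
import shlex

# Values quoted in the post above.
settings = {
    "drop_source_layers": 2,
    "learning_rate": 0.00001,
    "dropout_rate": 0.2,  # assumed flag name for "dropout"
}

# Hypothetical paths, for illustration only.
paths = {
    "load_checkpoint_dir": "/path/to/pretrained_checkpoints",
    "save_checkpoint_dir": "/path/to/finetuned_checkpoints",
    "alphabet_config_path": "/path/to/alphabet.txt",
    "train_files": "/path/to/train.csv",
}


def build_command(script, *option_groups):
    """Flatten dictionaries of options into an argument list."""
    args = ["python3", script]
    for group in option_groups:
        for flag, value in group.items():
            args += [f"--{flag}", str(value)]
    return args


command = build_command("DeepSpeech.py", settings, paths)
print(shlex.join(command))  # requires Python 3.8+
```

The script only prints the command, so it can be reviewed before being pasted into a notebook cell.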

Hung_Phung · April 24, 2021, 4:40pm

Can you share the code you used to create the scorer? I think this is where I am going wrong.

Hung_Phung · April 24, 2021, 4:41pm

Thank you for your enthusiasm, love you!

ftyers · April 24, 2021, 4:48pm

Sure, the code is here, it’s not very readable, I’m sorry -_-;; … I’m happy to explain specific parts though.

Hung_Phung · April 24, 2021, 4:56pm

Thank you. I used force bytes mode, and maybe that is not right. I am adding data to the model and will try this as soon as I am done. Thank you so much.
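
On the force-bytes point: as far as the DeepSpeech documentation describes it, bytes output mode means the model emits UTF-8 byte values directly instead of labels from alphabet.txt. A scorer forced into that mode may therefore not match a model trained with `--alphabet_config_path`. The sketch below only shows how differently the two schemes label the same text; the function names are illustrative.

```python
def alphabet_labels(text):
    """One label per Unicode character, as with an alphabet.txt mapping."""
    return list(text)


def utf8_byte_labels(text):
    """One label per UTF-8 byte, as in bytes output mode."""
    return list(text.encode("utf-8"))


text = "ĐÓ"
print(alphabet_labels(text))
print(utf8_byte_labels(text))
```

Each accented Vietnamese letter becomes two or three byte labels, so the byte-based and alphabet-based label sequences differ in both length and content.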
