Doesn't look like a character based (Bytes Are All You Need) model

Vlad_Hornai · March 19, 2021, 5:50pm

Hi,
I have been following the DeepSpeech documentation in order to build my own scorer. After running these blocks of commands:

cd data/lm
python3 generate_lm.py --input_txt vocabulary.txt --output_dir . \
  --top_k 1500 --kenlm_bins path/to/kenlm/build/bin/ \
  --arpa_order 3 --max_arpa_memory "50%" --arpa_prune "0|0|1" \
  --binary_a_bits 255 --binary_q_bits 8 --binary_type trie

cd data/lm
# Download and extract appropriate native_client package:
curl -LO http://github.com/mozilla/DeepSpeech/releases/…
tar xvf native_client.*.tar.xz
./generate_scorer_package --alphabet …/alphabet.txt --lm lm.binary --vocab vocab-1500.txt \
  --package kenlm.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284

I get the following errors:

Doesn't look like a character based (Bytes Are All You Need) model.
--force_bytes_output_mode was not specified, using value infered from vocabulary contents: false
Error: Can't parse scorer file, invalid header. Try updating your scorer file.
Error loading language model file: Invalid magic in trie header.
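
For context, the first two lines read like informational output from the packaging step rather than failures. Below is a minimal sketch of the kind of check that could produce them. It assumes (nothing in this thread confirms it) that a vocabulary counts as character based only when every whitespace-separated entry is a single character; `looks_char_based` is a hypothetical name, not a DeepSpeech function.

```python
def looks_char_based(vocab_lines):
    """Return True only if every whitespace-separated entry is one character.

    Assumed heuristic for illustration; not taken from the DeepSpeech sources.
    """
    return all(len(word) == 1 for line in vocab_lines for word in line.split())


# Hypothetical vocabularies: one word-level, one character-level.
word_vocab = ["hello world", "speech"]
char_vocab = ["a b c", "d"]

print(looks_char_based(word_vocab))  # False
print(looks_char_based(char_vocab))  # True
```

Under that assumption, a word-level file such as vocab-1500.txt would give `false`, so those two lines on their own are probably not the problem; the two `Error:` lines are the actual failure.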

I should mention that I use a custom alphabet that also contains characters other than the English ones.
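
With a custom alphabet, one thing that can be checked is whether every character in the vocabulary text is actually listed in alphabet.txt. Below is a rough sketch of such a check, assuming alphabet.txt holds one symbol per line, that lines starting with `#` are comments, and that `\#` stands for a literal hash. The function names and sample data are hypothetical and not part of DeepSpeech.

```python
def load_alphabet(lines):
    """Collect one symbol per line, skipping comments and empty lines.

    Assumed format: '#' starts a comment line, '\\#' is a literal hash.
    """
    symbols = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("\\#"):
            symbols.add("#")
        elif line.startswith("#") or line == "":
            continue
        else:
            symbols.add(line)
    return symbols


def missing_characters(alphabet, text_lines):
    """Return the set of characters in the text that the alphabet lacks."""
    missing = set()
    for line in text_lines:
        for ch in line.rstrip("\n"):
            if ch not in alphabet:
                missing.add(ch)
    return missing


# Hypothetical sample data standing in for alphabet.txt and vocabulary.txt.
alphabet_lines = ["# sample alphabet\n", " \n", "a\n", "b\n", "é\n"]
vocabulary_lines = ["ba éb\n", "ça\n"]

alphabet = load_alphabet(alphabet_lines)
print(sorted(missing_characters(alphabet, vocabulary_lines)))  # ['ç']
```

To run it on real files, pass `open("alphabet.txt", encoding="utf-8")` and `open("vocabulary.txt", encoding="utf-8")` in place of the sample lists. An empty result only rules out a mismatch between the two files; it does not by itself explain the header error.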

lissyx · March 19, 2021, 8:03pm

Please share console output using proper formatting. Right now it’s unreadable, and it might hide useful information.

Please provide output of all commands as well.

lissyx · March 19, 2021, 8:04pm

Please follow [READ FIRST] What and how to report if you need support
