Can we use DeepSpeech for Vietnamese Speech To Text?

(Lissyx) #21

You need to find where it comes from: you have some transcription that has characters not in the alphabet :). There’s nothing more we can do for now.

(Phanthanhlong7695) #22

Yeah, thank you for your help.
For example, on Windows: mà
and on Linux: ma`
Maybe that is the error :smiley:
I’m trying to fix this. Thanks for your support.
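
One possible explanation for the Windows/Linux difference above, offered only as an assumption since the thread does not confirm it, is that the same letter is stored in two different Unicode forms: a single precomposed code point, or a base letter followed by a combining accent. If alphabet.txt contains one form and the transcripts contain the other, the character would look identical but fail the lookup. A minimal Python sketch of normalizing text to NFC (the helper name `to_nfc` is made up for illustration):

```python
import unicodedata

def to_nfc(text: str) -> str:
    """Return text with base letter + combining mark pairs composed (NFC)."""
    return unicodedata.normalize("NFC", text)

composed = "m\u00e0"     # 'a with grave' stored as one code point
decomposed = "ma\u0300"  # 'a' followed by COMBINING GRAVE ACCENT

# The two strings render the same but are not equal as stored.
print(composed == decomposed)
# After normalization they compare equal.
print(to_nfc(decomposed) == composed)
# Listing the code point names shows what is really in the string.
print([unicodedata.name(ch) for ch in decomposed])
```

If this is the cause, applying the same normalization to the alphabet, the transcripts and the vocabulary before building anything should make them consistent.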

(Lissyx) #23

The best I can suggest for this is simply binary search: open your train CSV (if it happens during training), remove the first half of the lines, try to re-run. If it works, then the first half contains the offending character. If not, it’s in the second half, and you restart the process by removing half of the second half, until you get ONE line :slight_smile:
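
The halving procedure described above can be sketched in Python. This is only an illustration: `find_offending_line` and `fails` are hypothetical names, and in practice `fails` would mean "re-run the training or trie step on just these lines and see whether the error appears". Here it is replaced by a simple stand-in that checks characters against a tiny made-up alphabet. The sketch assumes the input as a whole fails and that the CSV header row has been set aside.

```python
from typing import Callable, Sequence

def find_offending_line(lines: Sequence[str],
                        fails: Callable[[Sequence[str]], bool]) -> int:
    """Return the index of one line that makes `fails` return True.

    Assumes fails(lines) is True for the full input.
    """
    lo, hi = 0, len(lines)  # invariant: lines[lo:hi] contains a bad line
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(lines[lo:mid]):
            hi = mid        # the problem is in the first half
        else:
            lo = mid        # otherwise it must be in the second half
    return lo

# Stand-in for "re-run and look for the invalid label error".
alphabet = set("abc ")

def fails(chunk: Sequence[str]) -> bool:
    return any(ch not in alphabet for line in chunk for ch in line)

lines = ["abc", "a b", "cab", "b\u00e0c", "ca"]
idx = find_offending_line(lines, fails)
print(idx, repr(lines[idx]))
```

If several lines are bad, this finds one of them per run, so the search has to be repeated after each fix.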

(Phanthanhlong7695) #24

I know, but I cannot create the trie file, so how can I start training? I am still following the instructions. :slight_smile:

(Lissyx) #25

Well, you don’t need the trie file for training. Worst case, you can just apply the same process with generate_trie.

(Phanthanhlong7695) #26

Does that mean I can delete this:
--lm_trie_path /home/nvidia/DeepSpeech/data/alfred/trie
Is that right?

(Lissyx) #27

What is this ? Where is this coming from ?

(Phanthanhlong7695) #28

It comes from there.

(Lissyx) #29

I’m a bit lost now about the status of your system. When do you get the invalid label error? At training or during trie creation? Why are you trying to use a trie made for French on Vietnamese data?

(Phanthanhlong7695) #30

The invalid label error happens during trie creation.

(Phanthanhlong7695) #31

I did not use a trie for French. I am trying to find out how to create a trie for Vietnamese.

(Lissyx) #32

We are circling here. You need to create it. Vincent documented it in his thread. If you are hitting the invalid label during its creation, you need to find what is missing in your alphabet.

(Phanthanhlong7695) #33

I am trying :frowning:

(Vincent Foucault) #34

Hi, @phanthanhlong7695,

To create the trie file, you need a few parts:

  • alphabet.txt
  • lm.binary
  • vocabulary.txt

Invalid label during trie creation: it seems that you have unknown characters in your vocabulary.
A “label” is a character (a letter or a punctuation mark).
Check that all characters in your vocabulary are present in the alphabet.

If not, correct it and restart the whole process.
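
The check described above can be done directly instead of by trial and error. The Python sketch below is a rough illustration, not part of DeepSpeech: the function names are made up, and it assumes alphabet.txt holds one label per line, with lines starting with `#` treated as comments and `\#` standing for a literal `#`. The data is inlined so the example runs on its own.

```python
import unicodedata
from collections import Counter
from typing import Iterable, Set

def load_alphabet(lines: Iterable[str]) -> Set[str]:
    """Collect labels, one per line; '#' lines are assumed to be comments."""
    labels = set()
    for line in lines:
        line = line.rstrip("\r\n")  # keep spaces: a space can be a label
        if line.startswith("\\#"):
            labels.add("#")
        elif line.startswith("#") or not line:
            continue
        else:
            labels.add(line)
    return labels

def unknown_characters(text_lines: Iterable[str], labels: Set[str]) -> Counter:
    """Count every character in the text that is not a known label."""
    unknown = Counter()
    for line in text_lines:
        for ch in line.rstrip("\r\n"):
            if ch not in labels:
                unknown[ch] += 1
    return unknown

# Inlined stand-ins for alphabet.txt and vocabulary.txt.
alphabet_lines = ["# comment\n", " \n", "a\n", "b\n", "m\n", "\u00e0\n"]
vocabulary_lines = ["ma ba\n", "m\u00e0 b\u00e1\n"]

labels = load_alphabet(alphabet_lines)
unknown = unknown_characters(vocabulary_lines, labels)
for ch, count in unknown.items():
    print(repr(ch), unicodedata.name(ch, "UNKNOWN"), count)
```

With real files, the two lists would be replaced by open file handles, for example `open("alphabet.txt", encoding="utf-8")`. Any character it reports either needs to be added to the alphabet or cleaned out of the text.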

(Lissyx) #35

Well, I cannot do it for you, and I have much other work to perform. I gave you a process to find what is broken. Apply it.