Can anyone help? I am stuck on this error: Fatal Python error: Segmentation fault.
I am using a virtual environment and ran DeepSpeech using the .sh file below.
This is my error log:
This is my .sh file:
I'm using some online books in Spanish, which contain accented letters (áéíóú). Should I include them in alphabet.txt, or must I convert them to unaccented letters in the text? Also, should I lowercase all the text and strip all the extra symbols (-_¡!¿? …)? Thanks in advance.
FYI: Here’s my alphabet for spanish:
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
ü
á
é
í
ó
ú
ñ
alphabet.zip (221 Bytes)
Notice that I removed the #; using the # fails with the trie.
From the public domain? I'm also training for Spanish.
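A quick way to check an alphabet like the one above against your transcripts is to list every character the transcripts use that the alphabet lacks. This is only a minimal sketch: `find_unknown_chars` is a hypothetical helper, not part of DeepSpeech, and it assumes the space character is also an entry in the alphabet.

```python
def find_unknown_chars(alphabet, transcripts):
    """Return the sorted characters used in transcripts but absent from alphabet."""
    used = set()
    for line in transcripts:
        used.update(line)  # add every character of the line
    return sorted(used - set(alphabet))

# Alphabet from the post above, plus a space (the space entry is an assumption).
alphabet = list("abcdefghijklmnopqrstuvwxyzüáéíóúñ") + [" "]

# Any character reported here must be added to the alphabet or cleaned from the text.
print(find_unknown_chars(alphabet, ["buenos días", "¿qué tal?"]))
```

An empty list means every transcript character is covered by the alphabet.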
Hi @carlfm01
Yes, you need to keep accented letters.
And you need to preprocess your text to make it lowercase.
M and n are not the same letter to the DeepSpeech system, and it would be nearly impossible for it to produce correct inferences…
Have a nice day
Vincent
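The preprocessing described in this reply (keep accented letters, lowercase everything, drop extra symbols) could look roughly like the sketch below. `normalize_line` and `ALLOWED` are hypothetical names, and the allowed set (the Spanish letters from the alphabet earlier in the thread, plus a space) is an assumption to adapt to your own alphabet.txt.

```python
import re
import unicodedata

# Characters to keep. This set is an assumption: adjust it to your alphabet.txt.
ALLOWED = set("abcdefghijklmnopqrstuvwxyzüáéíóúñ ")

def normalize_line(text):
    """Lowercase the text, drop anything outside ALLOWED, collapse spaces."""
    # Compose accents into single code points so "á" is one character.
    text = unicodedata.normalize("NFC", text).lower()
    # Replace every disallowed character with a space so words stay separated.
    text = "".join(ch if ch in ALLOWED else " " for ch in text)
    return re.sub(r" +", " ", text).strip()

print(normalize_line("¡Hola! ¿Cómo estás, señor Güemes?"))
# prints: hola cómo estás señor güemes
```

Replacing symbols with a space rather than deleting them keeps hyphenated or punctuated words from being glued together.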
Thanks for all the tips.
Your alphabet file is missing that instruction.
The forum removes it, check the one inside the zip file. Maybe this is useful for you: https://www.kaggle.com/carlfm01/120h-spanish-speech/
There's a lot of new Spanish data on OpenSLR: http://www.openslr.org/resources.php
Hello
I'm doing this one, based on the following tutorial and some Python scripts.
Please share and contribute those
Hello,
I also have limited audio data, with 10 commands in Korean. I would like to know the hyperparameters used to train your model. Could you provide the details of your model parameters? And how can I create a language model with just 10 commands?
Hello.
For a very small model, I tested with a small n_hidden, and the results were better!
Try n_hidden = 464.
Thanks. What about the other parameter values? Did you use the same ones as the DeepSpeech models? Could you provide more details, so that it is helpful for starting training?
@elpimous_robot, as you said, for
TRIE CREATION:
you use
alphabet.txt
lm.binary
vocab.txt
and you create the trie, right?
Is vocab.txt required for trie creation?
One more question:
when creating vocab.txt, suppose sentences are repeated in my wav transcripts,
e.g.:
1000.wav 23093 hello good morning [male voice]
2000.wav 32424 hello good morning [female voice]
Should I copy both sentences or not?
Hello
Yes, and no!
For trie creation, you need textual sentences, to work with probabilities, accuracy…
vocab.txt doesn't need the same sentence multiple times.
Hope this helps.
Vincent
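Since vocab.txt does not need repeated sentences, the transcripts can be deduplicated before writing it. A minimal sketch, assuming the transcripts are already loaded as a list of strings; `unique_sentences` is a hypothetical helper, not a DeepSpeech function.

```python
def unique_sentences(transcripts):
    """Return each non-empty sentence once, in order of first appearance."""
    seen = set()
    result = []
    for sentence in transcripts:
        sentence = sentence.strip()
        if sentence and sentence not in seen:
            seen.add(sentence)
            result.append(sentence)
    return result

# The two recordings from the question share one transcript.
transcripts = ["hello good morning", "hello good morning", "good night"]
vocab_lines = unique_sentences(transcripts)

# Each line printed here would become one line of vocab.txt.
print("\n".join(vocab_lines))
```

The male and female recordings both stay in the training data; only the text used for vocab.txt is reduced to one copy.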
Ready to help again my friend