Hi @caucheteux
Yes, you're right.
And yes, this model will only work for your 5 speakers!
To generalize, you would need to record many, many more speakers…
Well, I'm no expert, but I think the data needs grow exponentially with the number of users.
It's like the number of possible digit codes: the possibilities for 3 digits are not proportional to those for 4 or 5…
If you add a new user, of course you'll have to record their voice, but you'll also need to grow your global dataset.
More possibilities, more data, more learning.
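The combinatorics behind that analogy can be sketched in a couple of lines (plain arithmetic, nothing DeepSpeech-specific):

```python
# Number of possible n-digit codes: 10 ** n.
# Going from 3 to 4 digits multiplies the possibilities by 10,
# so the count grows exponentially in n, not proportionally.
for n in (3, 4, 5):
    print(n, "digits ->", 10 ** n, "possible codes")
```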
Thank you for your response @elpimous_robot
That’s my concern, as it might be problematic to need 1h of recordings for each new speaker and re-train the model.
My idea is that if I get enough variety of speakers, the model will be versatile enough that I don't need to re-train it every time I have a new user.
Yes! The issue is to find a compromise between the time (and resources) needed to create the dataset and the quality of the model trained on it…
I'm thinking of reducing the number of recordings per speaker per word/command but increasing the number of speakers. Does that seem like a good way to reduce the time needed per speaker without significantly decreasing the quality of my model?
Maybe @lissyx or @reuben can help if they have any ideas?
Thanks again, every message is really helpful
You need an English model, right?
If so, why not use the whole model and create an LM containing only the words you need?
@reuben, I think it could work, no?
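A minimal sketch of that idea, assuming the usual KenLM workflow. The file name `commands.txt` and the example phrases are mine; the `lmplz`/`build_binary` steps are shown only as comments, since KenLM is built separately:

```python
# Write a tiny corpus with one target phrase per line; a language model
# trained only on this corpus will strongly favor these words at decode time.
commands = [
    "turn on the light",
    "turn off the light",
    "stop",
]
with open("commands.txt", "w") as f:
    f.write("\n".join(commands) + "\n")

# Then, with KenLM installed, something like:
#   lmplz --order 2 --discount_fallback < commands.txt > lm.arpa
#   build_binary lm.arpa lm.binary
```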
No, I'm building a French model unfortunately…
In parallel, I'm exploring the possibility of using the Mozilla model for an English version, so your idea is welcome.
Dear all, thanks for these amazing instructions. I have prepared my own voice files to train on domain-specific data for the Bangla language. I created all the files according to the instructions, but when I run the .sh file, after 1-2 hours of training it produces an error:
Fatal Python error: Segmentation fault
Thread 0x00007f3a569bd700 (most recent call first):
File "/usr/lib64/python3.6/threading.py", line 295 in wait
File "/usr/lib64/python3.6/queue.py", line 164 in get
File "/home/venvs/projectSTT/lib/python3.6/site-packages/tensorflow/python/summary/writer/event_file_writer.py", line 159 in run
File "/usr/lib64/python3.6/threading.py", line 916 in _bootstrap_inner
File "/usr/lib64/python3.6/threading.py", line 884 in _bootstrap
Can anyone help me train my model successfully?
Can anyone enlighten me? I am stuck at this fatal Python error: segmentation fault.
I am using a virtual environment and have run DeepSpeech using the .sh file below.
This is my error log,
This is my .sh file
I'm taking advantage of some online books in the Spanish language, which contain accented letters (áéíóú). Should I include them in alphabet.txt, or must I transform them into unaccented letters in the text? Finally, should I leave all letters lowercase (in the text) and remove all the complementary symbols (-_¡!¿?) in the process? Thanks in advance.
FYI: Here’s my alphabet for spanish:
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
ü
á
é
í
ó
ú
ñ
alphabet.zip (221 Bytes)
Note that I removed the # character; including # fails with the trie.
From the public domain? I'm also training for Spanish.
Hi @carlfm01
Yes, you need to keep accented letters.
And you need to preprocess your text, to make it lowercase.
"M" and "m" are not the same letter to the DeepSpeech system, and mixing cases would make correct inference nearly impossible…
Have a nice day
Vincent
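A small sketch of that preprocessing (the `ALLOWED` set and `normalize` helper are my own, not part of DeepSpeech): lowercase everything, then keep only the characters listed in the Spanish alphabet file plus space.

```python
# Characters we keep: the Spanish alphabet (with accented vowels, ü, ñ) plus space.
ALLOWED = set("abcdefghijklmnopqrstuvwxyzáéíóúüñ ")

def normalize(text):
    """Lowercase and drop anything outside the alphabet (¡!¿?-_ etc.)."""
    text = text.lower()
    return "".join(c for c in text if c in ALLOWED)

print(normalize("¡Qué día más BONITO!"))  # -> qué día más bonito
```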
Thanks for all the tips.
Your alphabet file is missing that character.
The forum removes it, check the one inside the zip file. Maybe this is useful for you: https://www.kaggle.com/carlfm01/120h-spanish-speech/
There's a lot of new Spanish data on OpenSLR: http://www.openslr.org/resources.php
Hello
I'm doing this one, based on the following tutorial and some Python scripts.
Please share and contribute those
Hello,
I also have limited audio data, with 10 commands in Korean. I want to know the hyperparameters you used to train your model. Could you provide the details of your model's parameters? And how can I create a language model with just 10 commands?