If we split the whole dataset into test, train and dev sets, where should I put the vocabulary.txt file?
What is an ARPA file, and why do we need it to build the language model?
I have DeepSpeech installed inside a Linux virtual machine on my PC, and I do not have GPU support on my device. Will DeepSpeech training work for my small dataset?
I have many more questions like these.
Basically, as with Kaldi, I need a “DeepSpeech for Dummies” tutorial.
((slow to reply) [NOT PROVIDING SUPPORT])
Wherever you want, since you can pass --train_files and the other arguments.
Please look at the contents of data/lm.
This question makes no sense to me; there’s no intersection between the dataset and the vocabulary file.
This is KenLM-level: you just have to build it in the process, but you won’t need it afterwards.
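To illustrate why the location does not matter, here is a minimal sketch in Python, not an official recipe. It assumes the CSV column layout wav_filename, wav_filesize, transcript and the companion flags --dev_files and --test_files; the split ratios and helper names are hypothetical, so check them against the importer and DeepSpeech version you actually use.

```python
import csv
import random
from pathlib import Path

# Assumed column layout; verify it against the importer you use.
HEADER = ["wav_filename", "wav_filesize", "transcript"]


def split_rows(rows, dev_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle rows deterministically and cut them into train/dev/test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_dev = int(len(rows) * dev_ratio)
    n_test = int(len(rows) * test_ratio)
    return {
        "dev": rows[:n_dev],
        "test": rows[n_dev:n_dev + n_test],
        "train": rows[n_dev + n_test:],
    }


def write_splits(splits, out_dir):
    """Write one CSV per split and return the path of each file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, rows in splits.items():
        path = out_dir / f"{name}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(HEADER)
            writer.writerows(rows)
        paths[name] = path
    return paths


def training_command(paths):
    """Build the argument list; the CSVs can live in any directory."""
    return [
        "python", "DeepSpeech.py",
        "--train_files", str(paths["train"]),
        "--dev_files", str(paths["dev"]),
        "--test_files", str(paths["test"]),
    ]
```

The command only receives paths, so the three CSV files can sit wherever is convenient. The vocabulary text is not part of this command at all; it is only used when building the language model.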
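For context: an ARPA file is a plain-text listing of n-grams with their log10 probabilities and backoff weights. KenLM's lmplz tool writes it from your text, and build_binary converts it into the binary file that is actually loaded, which is why the ARPA file is only an intermediate artifact. A minimal sketch of those two steps, assuming both KenLM tools are on your PATH; the file names, the n-gram order and the helper names are placeholders, not what data/lm prescribes.

```python
import shutil
import subprocess


def kenlm_commands(text_path, arpa_path, binary_path, order=3):
    """Return the two KenLM invocations: text -> ARPA, then ARPA -> binary."""
    lmplz = ["lmplz", "-o", str(order), "--text", text_path, "--arpa", arpa_path]
    build_binary = ["build_binary", arpa_path, binary_path]
    return [lmplz, build_binary]


def build_language_model(text_path, arpa_path="lm.arpa", binary_path="lm.binary"):
    """Run both steps; the ARPA file can be deleted once the binary exists."""
    commands = kenlm_commands(text_path, arpa_path, binary_path)
    # Check for both tools first so nothing runs if one is missing.
    for command in commands:
        if shutil.which(command[0]) is None:
            raise FileNotFoundError(f"{command[0]} not found on PATH")
    for command in commands:
        subprocess.run(command, check=True)
    return binary_path
```

On a very small corpus lmplz may refuse to estimate its smoothing discounts; its --discount_fallback option exists for that case. Depending on the DeepSpeech version, a further packaging step may be needed after the binary is built, so treat data/lm as the reference.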
Not sure I get the point here:
Do you want help getting the GPU working in the VM?
Do you want help getting it working on your base system, where the GPU is available?
Define your hardware, define your dataset. We can’t tell you without more context …
Hard to write when you don’t know what “dummies” might be. Training a model is non-trivial. Which dummy do you target? People who know nothing about machine learning? People who are keen on machine learning but just new to DeepSpeech?
I agree with raghupathyv4 here; I also use Linux in VMs only. I’m guessing he’s asking for the same reason I am: since we have a limited budget and no dedicated Linux computers, we use virtual machines instead.
So I guess both his question and mine is: how do we make it work in a Linux VM, where there is no GPU?
Also, what does “KenLM-level” mean?
I 100% agree that this is the worst-documented tech I have seen in many years. I’m also trying to:
make it work;
create my own recognizer in a different language (I can easily make voice files from different voices);
include new words (local street names etc.).
((slow to reply) [NOT PROVIDING SUPPORT])
Thanks for taking the time to share your feelings. Writing documentation is hard, especially when the topic is complex; however, when we are given actionable feedback on what to improve, we can act on it.