I’m already using the pre-trained DeepSpeech English model from Mozilla, and it works fine. Now I want to make my own simple model, just to get an idea of how it works.
I read the docs here: https://deepspeech.readthedocs.io/en/v0.9.3/TRAINING.html#
but they were a bit unclear to me. I have installed the DeepSpeech training code and its dependencies, but I’m stuck at this part: https://deepspeech.readthedocs.io/en/v0.9.3/TRAINING.html#common-voice-training-data I want to use my own dataset to create a simple model that recognizes numbers and some words in German. I have WAV files of me saying the words and the numbers from 0 to 9. What should I do now to turn this into a model ready to be trained?
SW:
ubuntu 18.04 bionic
HW:
4gb ram
cpu i5 10gen
no GPU
- Put all audio files in one directory.
- Create 3 files for train/dev/test, split 85/10/5, named like xxx.csv, each with this header line:
wav_filename,wav_filesize,transcript
- Fill in the splits for the 3 files. The filename includes the path relative to where you are calling DS from.
- Size is the file size in bytes, as shown by
ls -la
Post if you have any other questions.
You will likely have difficulty training anything.
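The steps above can be sketched in Python. This is only a minimal illustration, not part of DeepSpeech: the helper names `build_rows` and `write_csv`, the `clips` directory, and the example file names are all hypothetical. It assumes you already know the transcript for each WAV file.

```python
import csv
import os

# Column order taken from the header line given in the steps above.
FIELDNAMES = ["wav_filename", "wav_filesize", "transcript"]


def build_rows(clips_dir, transcripts):
    """Return one row per WAV file named in `transcripts`.

    `transcripts` maps a file name (hypothetical example: "eins.wav")
    to the text spoken in that file.
    """
    rows = []
    for name in sorted(transcripts):
        path = os.path.join(clips_dir, name)
        rows.append({
            "wav_filename": path,
            # Size in bytes, the same number `ls -la` prints for the file.
            "wav_filesize": os.path.getsize(path),
            "transcript": transcripts[name],
        })
    return rows


def write_csv(csv_path, rows):
    """Write `rows` to `csv_path` with the expected header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

A call such as `write_csv("train.csv", build_rows("clips", {"eins.wav": "eins"}))` would then produce one of the three files. Run it from the directory you start training from, so that the relative paths stay valid.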
I have created the files in xx.tsv format. Is that okay?
Next, I do not know what you mean by 85/10/5. Please be more specific, I’m a total noob at DeepSpeech.
I have 11 WAV files, so as you can see it’s a very small dataset consisting of the numbers from 0 to 9 and the word "hallo". Can my CPU do the work?
This is unrelated to DeepSpeech: you need to split your data 85% / 10% / 5% into train, dev and test sets.
Yes, but then such a small amount of data will not allow you to train anything useful.
And by that you mean splitting my 11 WAV files, for example 6 in train.tsv, 3 in dev.tsv, and 2 in test.tsv?
Can you please help me understand why I should do that, and how it would help my training?
You are right, I’m only trying to understand how the system works by using a small dataset and recognizing the same WAV files I used in training, to get accurate results.
This is out of the scope of DeepSpeech; you need to read about machine learning and how training works.
If you don’t care about the model really being efficient, then that’s fine: this volume of data will allow you to train on CPU. Just don’t have expectations about the model quality.
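For reference, the 85/10/5 split can be sketched as below. The general idea is that the train set is what the model learns from, the dev set is used to monitor progress during training, and the test set is held back to evaluate the result on data the model has not seen. The function name `split_dataset`, its parameters, and the rule that dev and test get at least one file each are choices made for this illustration, not something DeepSpeech prescribes.

```python
import random


def split_dataset(items, train=0.85, dev=0.10, seed=0):
    """Shuffle `items` and cut them into (train, dev, test) lists.

    The test share is whatever remains after train and dev.
    Dev and test each receive at least one item (an assumption of this sketch).
    """
    items = list(items)
    if len(items) < 3:
        raise ValueError("need at least 3 items to fill all three splits")
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    n = len(items)
    n_dev = max(1, round(n * dev))
    n_test = max(1, round(n * (1 - train - dev)))
    n_train = n - n_dev - n_test
    return (
        items[:n_train],
        items[n_train:n_train + n_dev],
        items[n_train + n_dev:],
    )
```

With 11 files this yields 9 for train, 1 for dev, and 1 for test, which shows how little data is left for checking the model.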
Thank you for your help. I’m new to machine learning, I love it, and I will learn more about it.
I do care about the model being efficient, but I’m still in the learning stage. Once I know how it works on a small dataset, I will start a medium project with a decent dataset.
[UPDATE]
I have finished preparing the data; all files are ready for training.
I’m following this HTML playbook from the beginning.
My dataset directory:
root@localhost:~/DeepSpeech/data# ls
alphabet.txt dev.tsv lm README.rst ted train.tsv
clips invalidated.tsv other.tsv smoke_test test.tsv validated.tsv
root@localhost:~/DeepSpeech/data#
When I run import_cv2.py I get this error:
root@localhost:~/DeepSpeech# bin/import_cv2.py DeepSpeech/data
Traceback (most recent call last):
  File "bin/import_cv2.py", line 15, in <module>
    import progressbar
ImportError: No module named progressbar
root@localhost:~/DeepSpeech#
I tried installing progressbar, but the error remains:
root@localhost:~/DeepSpeech/data# apt install progressbar
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package progressbar
root@localhost:~/DeepSpeech/data#
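A note on the error above: `progressbar` is a Python module, not an apt package of that name, which is why `apt install progressbar` finds nothing. Python modules are normally installed with pip into the interpreter that runs the script. The sketch below checks which modules a given interpreter can import. The helper name `missing_modules` is hypothetical, and the assumption that the module comes from the `progressbar2` distribution should be verified against the DeepSpeech training requirements.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    Only top-level names (no dots) are expected here.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # An empty list means the interpreter running this file can import progressbar.
    print(missing_modules(["progressbar"]))
```

Run it with the same interpreter that runs `bin/import_cv2.py`. The wording of the traceback may also indicate that the script was started by a different Python than the one the training code was installed into.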
It looks like you have not properly followed the playbook requirements on Docker.
cc @kathyreid
Yes, I’m not using Docker. Do I have to use Docker? Is it required to get this working?
Yes, please follow the playbook in a consistent manner.
OK, I will do so. Thank you for being so patient with me, I really appreciate it. I will follow the Docker way and give updates.
@Freegs_Box the PlayBook walks you through building a Docker image, and running a Docker container to do the training. Let me know if you get stuck!
Honestly, I’m new to Docker and I really need help. I’m in the PlayBook trying to set up Docker, but I didn’t get the meaning of this sentence: [Copy or extract them to the directory you specified in your bind mount.]
How do I set up Docker for the DeepSpeech project?
OK, first step, which directory did you use for your bind mount?
That’s the problem, I don’t know what is meant by bind mount. Is it the DeepSpeech directory?
OK, so this command for spinning up a Docker container from an image:
$ docker run -it \
--entrypoint /bin/bash \
--name deepspeech-training \
--gpus all \
--mount type=bind,source="$(pwd)"/deepspeech-data,target=/DeepSpeech/deepspeech-data \
7cdc0bb1fe2a
specifies a bind mount. This is a directory on your system that Docker can read from, and write to.
Did you create a directory called deepspeech-data on your system? If so, this is where you need to store the files for training so that DeepSpeech can access them from within the Docker container.
I went to the root directory, created a new folder, and renamed it to deepspeech-data.
Right, so the deepspeech-data directory that you created is what is specified in the --mount type=bind parameter when spinning up a Docker container. It should be available within the Docker container at /DeepSpeech/deepspeech-data.
Are you receiving an error message at all? Did you install Docker OK? And did you pull down the mozilla/deepspeech-train:latest Docker image?
Yes, I have Docker installed:
docker 19.03.13 from Canonical✓ installed
As for DeepSpeech, I have installed it directly on my Ubuntu system. It was for testing the pre-trained English model, and it was working well. Is the Docker version different?
Do I need to install DeepSpeech inside the Docker container?
When I run the above command, this is what I get. I think I’m not meeting all the requirements, and that is why the command doesn’t work.
root@localhost:~# docker run -it \
> --entrypoint /bin/bash \
> --name deepspeech-training \
> --gpus all \
> --mount type=bind,source="$(pwd)"/deepspeech-data,target=/DeepSpeech/deepspeech-data \
> 7cdc0bb1fe2a
Unable to find image '7cdc0bb1fe2a:latest' locally
docker: Error response from daemon: pull access denied for 7cdc0bb1fe2a, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
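A likely reading of this error: `7cdc0bb1fe2a` looks like the ID of an image that exists only on the machine where the example command was written. Docker does not find it locally, treats it as a repository name, and fails to pull it. The sketch below builds the same command with the image name mentioned earlier in the thread, `mozilla/deepspeech-train:latest`, and leaves out `--gpus all` by default because this machine has no GPU. The function name `docker_run_command` and its parameters are hypothetical, and it is an assumption that this image is the one the playbook intends.

```python
import os
import shlex


def docker_run_command(image, host_dir, use_gpu=False):
    """Build the argument list for `docker run` with a bind mount of `host_dir`."""
    cmd = [
        "docker", "run", "-it",
        "--entrypoint", "/bin/bash",
        "--name", "deepspeech-training",
    ]
    if use_gpu:
        # Only useful on a machine with a GPU that Docker can access.
        cmd += ["--gpus", "all"]
    # The host directory that will appear inside the container.
    source = os.path.abspath(host_dir)
    cmd += [
        "--mount",
        f"type=bind,source={source},target=/DeepSpeech/deepspeech-data",
        image,
    ]
    return cmd


if __name__ == "__main__":
    argv = docker_run_command("mozilla/deepspeech-train:latest", "deepspeech-data")
    # Print a copy-pasteable shell command instead of running it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The printed command can be pasted into the shell from the directory that contains `deepspeech-data`. Alternatively, `docker images` lists the IDs of the images that are actually present locally.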