Hello!
I trained my DeepSpeech model for 6900 epochs on 5 GB of Mozilla datasets, but I did not achieve good accuracy.
Is the size of the dataset the problem?
You need to share more information so we can help: how you trained, exactly which data, how many hours of audio that amounts to, your training parameters, your language model, etc.
Thanks for the reply.
I used the Common Voice French dataset: https://voice.mozilla.org/en/datasets
Here are the training parameters I used:
batch_size = 200
n_hidden = 2048
epoch = 6900
validation_step = 1
earlystop_nsteps = 6
dropout_rate = 0.25
learning_rate = 0.00095
estop_mean_thresh = 0.1
estop_std_thresh = 0.1
report_count = 100
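For reference, those values would be passed on the command line roughly like this. This is only a sketch: the file paths are placeholders, and flag names changed between DeepSpeech releases, so check `./DeepSpeech.py --helpfull` for your version.

```shell
# Sketch only: paths are placeholders and flag names vary by release;
# verify against ./DeepSpeech.py --helpfull before running.
python3 DeepSpeech.py \
  --train_files data/fr/train.csv \
  --dev_files data/fr/dev.csv \
  --test_files data/fr/test.csv \
  --train_batch_size 200 \
  --n_hidden 2048 \
  --epoch 6900 \
  --validation_step 1 \
  --earlystop_nsteps 6 \
  --estop_mean_thresh 0.1 \
  --estop_std_thresh 0.1 \
  --dropout_rate 0.25 \
  --learning_rate 0.00095 \
  --report_count 100
```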
I used the same configuration as described here: https://github.com/mozilla/DeepSpeech
I also followed this tutorial to create the language model: “TUTORIAL : How I trained a specific french model to control my robot”
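For anyone following along, the language-model part of that tutorial boils down to two KenLM steps, roughly as below. The corpus and output filenames are placeholders, and this assumes KenLM's `lmplz` and `build_binary` are built and on your PATH.

```shell
# Placeholder filenames; assumes KenLM is installed.
# 1. Train a 5-gram ARPA language model on a cleaned French text corpus.
lmplz -o 5 < corpus_fr.txt > lm.arpa
# 2. Compile the ARPA file to KenLM's binary format for faster loading.
build_binary lm.arpa lm.binary
```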
Why don’t you try building on top of what I shared, which documents a basic French model?
That seems huge. If you only used French Common Voice, it means you have at best 70 hours … Don’t expect anything usable.
Your Dockerfile contains the same steps I used before; there are only some changes in the training parameters.
I did not know how to do this.
Could you please provide more information?
What’s unclear? https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/CONTRIBUTING.md
As you can see if you read the docs and the Dockerfile, it also contains other data sources to help produce a more accurate model.
How to do what?
I cannot install NVIDIA Docker. Whatever I try, I always get the same error:
Using default tag: latest
Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io: no such host
That’s something to take up with your network administrator, I fear. Nevertheless, you have the shell scripts, so you should be able to train using the same sources. Could you try that?
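If it helps to confirm before escalating: the “no such host” error means the machine cannot resolve Docker Hub’s registry hostname at all, which you can verify with standard tools (output will depend on your network):

```shell
# If this lookup also fails, the problem is host-level DNS, not Docker.
nslookup registry-1.docker.io
# Docker on Linux uses the host's resolver configuration:
cat /etc/resolv.conf
```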
Yeah, I did that, but it finished with errors:
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic/InRelease Could not resolve 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-updates/InRelease Could not resolve 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-backports/InRelease Could not resolve 'archive.ubuntu.com'
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/bionic-security/InRelease Could not resolve 'security.ubuntu.com'
W: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/InRelease Could not resolve 'developer.download.nvidia.com'
W: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/InRelease Could not resolve 'developer.download.nvidia.com'
W: Some index files failed to download. They have been ignored, or old ones used instead.
E: Unable to locate package build-essential
E: Unable to locate package curl
E: Unable to locate package wget
E: Unable to locate package git
E: Unable to locate package python3
E: Unable to locate package python3-pip
E: Unable to locate package cmake
E: Unable to locate package libboost-all-dev
E: Unable to locate package zlib1g-dev
E: Unable to locate package libbz2-dev
E: Unable to locate package liblzma-dev
E: Unable to locate package pkg-config
E: Unable to locate package virtualenv
E: Unable to locate package unzip
E: Unable to locate package pixz
E: Unable to locate package sox
E: Unable to locate package libsox-fmt-all
E: Package 'locales' has no installation candidate
E: Package 'locales-all' has no installation candidate
E: Package 'xz-utils' has no installation candidate
@yasine.nifa I don’t know what is being done on your network, but it seems heavily filtered. If your network administrators do not give you the proper tools to do your work, there’s nothing we can really do to help you.
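If they do give you an HTTP proxy, you could try passing it through to the build. This is only a sketch: `proxy.example.com:3128` is a placeholder for whatever address your admin provides.

```shell
# Placeholder proxy address; replace with the one your admin provides.
export http_proxy=http://proxy.example.com:3128
export https_proxy=http://proxy.example.com:3128

# apt inside the image can then pick the proxy up via build args:
docker build \
  --build-arg http_proxy=$http_proxy \
  --build-arg https_proxy=$https_proxy \
  -t deepspeech-fr .
```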
OK, thank you, I will check this out.
My best suggestion in the meantime is at least to reproduce the process by looking at the content of the Dockerfile and the shell scripts. It should be easy to re-use. Feel free to report any issue on the GitHub repo.