Make your own simple model

That’s the problem: I don’t know what is meant by bind mount. Is it the deepspeech directory?

OK, so this command for spinning up a Docker container from an image:

$ docker run  -it \
  --entrypoint /bin/bash \
  --name deepspeech-training \
  --gpus all \
  --mount type=bind,source="$(pwd)"/deepspeech-data,target=/DeepSpeech/deepspeech-data \
  7cdc0bb1fe2a

specifies a bind mount. This is a directory on your system that Docker can read from, and write to.

Did you create a directory called deepspeech-data on your system? If so, this is where you need to store the files for training so that DeepSpeech can access them from within the Docker container.
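To make the pieces of that command concrete, here is a minimal Python sketch that assembles the same docker run arguments. The function name build_docker_run and its parameters are made up for illustration; only the flags themselves come from the command above. It assumes the host directory is given relative to where you run it, which is why it is converted to an absolute path first (a bind mount source has to be absolute, which is what "$(pwd)" does in the shell version).

```python
import os
import shlex


def build_docker_run(image, host_dir, use_gpu=True,
                     target="/DeepSpeech/deepspeech-data",
                     name="deepspeech-training"):
    """Return the argv list for the `docker run` command shown in this thread.

    Hypothetical helper: the flags mirror the thread, the function is a sketch.
    """
    # The bind mount source must be an absolute path on the host.
    source = os.path.abspath(host_dir)
    argv = ["docker", "run", "-it",
            "--entrypoint", "/bin/bash",
            "--name", name]
    if use_gpu:
        # Leave this out on machines without an NVIDIA GPU.
        argv += ["--gpus", "all"]
    argv += ["--mount", "type=bind,source={},target={}".format(source, target)]
    argv.append(image)
    return argv


if __name__ == "__main__":
    cmd = build_docker_run("mozilla/deepspeech-train:latest",
                           "deepspeech-data", use_gpu=False)
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Whatever you put in the host directory shows up under the target path inside the container, and files written there from the container appear on the host.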


I went to the root directory, created a new folder, and named it deepspeech-data.

Right, so the deepspeech-data directory that you created is what is specified as the source in the --mount type=bind parameter when spinning up a Docker container. It should be available within the Docker container at /DeepSpeech/deepspeech-data.

Are you receiving an error message at all? Did you install Docker OK? And did you pull down the mozilla/deepspeech-train:latest Docker image?

Yes, I have Docker installed:
docker 19.03.13 from Canonical✓ installed

As for DeepSpeech, I installed it directly on my Ubuntu system to test the pre-trained English models, and it was working well. Is the Docker version different?
Do I need to install DeepSpeech inside the Docker container?
When I run the above command, this is what I get. I think I’m not meeting all the requirements, and that is why the command doesn’t work.

root@localhost:~# docker run  -it \
>   --entrypoint /bin/bash \
>   --name deepspeech-training \
>   --gpus all \
>   --mount type=bind,source="$(pwd)"/deepspeech-data,target=/DeepSpeech/deepspeech-data \
>   7cdc0bb1fe2a
Unable to find image '7cdc0bb1fe2a:latest' locally
docker: Error response from daemon: pull access denied for 7cdc0bb1fe2a, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.

Should I pull the image from here? https://hub.docker.com/r/mozilla/deepspeech-train/tags?page=1&ordering=last_updated

So taking things one step at a time,

  • The DeepSpeech version that is in Docker is the same, but it’s packaged with other dependencies required for training.
  • In the PlayBook you are instructed to run the command $ docker pull mozilla/deepspeech-train:latest. This pulls down a Docker image.
  • In the PlayBook, there is a line that states:

You will now see the mozilla/deepspeech-train image when you run the command docker image ls:

$ docker image ls
REPOSITORY                             TAG              IMAGE ID       CREATED         SIZE
mozilla/deepspeech-train               latest           7cdc0bb1fe2a   7 days ago      4.77GB
  • In the PlayBook, this output shows that the IMAGE ID of the image is 7cdc0bb1fe2a. Your image has a different IMAGE ID, and that is why you’re receiving the error message:

Unable to find image '7cdc0bb1fe2a:latest' locally

You need to run the command docker image ls to find the IMAGE ID of your image and substitute it. For example, if the IMAGE ID of your image is 501979231a7b then you would run:

$ docker run  -it \
  --entrypoint /bin/bash \
  --name deepspeech-training \
  --gpus all \
  --mount type=bind,source="$(pwd)"/deepspeech-data,target=/DeepSpeech/deepspeech-data \
  501979231a7b
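If you would rather not read the ID off the table by eye, a small sketch like the one below can pick it out of the docker image ls text. The function find_image_id is hypothetical, and the sample table is illustrative only (it reuses the example ID from this post). Note that docker run also accepts the repository and tag, mozilla/deepspeech-train:latest, in place of the ID, which avoids the lookup altogether.

```python
def find_image_id(image_ls_output, repository, tag="latest"):
    """Return the IMAGE ID for repository:tag from `docker image ls` text, or None.

    Hypothetical helper; it only splits the table on whitespace.
    """
    for line in image_ls_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[0] == repository and fields[1] == tag:
            return fields[2]  # third column is IMAGE ID
    return None


# Illustrative table in the same layout as `docker image ls` prints.
sample = """\
REPOSITORY                 TAG       IMAGE ID       CREATED       SIZE
mozilla/deepspeech-train   latest    501979231a7b   7 days ago    4.77GB
"""

if __name__ == "__main__":
    print(find_image_id(sample, "mozilla/deepspeech-train"))
```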

I have pulled the Docker image.

root@localhost:~# docker image ls
REPOSITORY                 TAG                 IMAGE ID            CREATED             SIZE
mozilla/deepspeech-train   latest              8cdc37f75f1d        11 days ago         4.77GB

That is my image ID: 8cdc37f75f1d

$ docker run  -it \
      --entrypoint /bin/bash \
      --name deepspeech-training \
      --mount type=bind,source="$(pwd)"/deepspeech-data,target=/DeepSpeech/deepspeech-data \
      8cdc37f75f1d

As you noticed, I have removed --gpus all because I don’t have a GPU on this server.

root@localhost:~# docker run  -it \
>   --entrypoint /bin/bash \
>   --name deepspeech-training \
>   --mount type=bind,source="$(pwd)"/deepspeech-data,target=/DeepSpeech/deepspeech-data \
>   8cdc37f75f1d

________                               _______________
___  __/__________________________________  ____/__  /________      __
__  /  _  _ \_  __ \_  ___/  __ \_  ___/_  /_   __  /_  __ \_ | /| / /
_  /   /  __/  / / /(__  )/ /_/ /  /   _  __/   _  / / /_/ /_ |/ |/ /
/_/    \___//_/ /_//____/ \____//_/    /_/      /_/  \____/____/|__/


WARNING: You are running this container as root, which can cause new files in
mounted volumes to be created as the root user on your host machine.

To avoid this, run the container by specifying your user's userid:

$ docker run -u $(id -u):$(id -g) args...

root@6accbb76ff97:/DeepSpeech#

What should I do now?

Follow the next steps in the PlayBook.

You are up to here


I have downloaded the German language dataset (about 4 GB) and unzipped it into the deepspeech-data directory, and it is visible in the Docker container:

root@294bb9e108fd:/DeepSpeech# ls -la deepspeech-data/
total 72144
drwxr-xr-x 3 root root    36864 Feb 24 15:31 .
drwxr-xr-x 1 root root     4096 Feb 24 15:04 ..
drwxrwxr-x 2 1000 1000 27156480 Feb 24 14:38 clips
-rw-rw-r-- 1 1000 1000   735847 Feb 25  2019 dev.tsv
-rw-rw-r-- 1 1000 1000  1837461 Feb 25  2019 invalidated.tsv
-rw-rw-r-- 1 1000 1000       62 Feb 25  2019 other.tsv
-rw-rw-r-- 1 1000 1000   725953 Feb 25  2019 test.tsv
-rw-rw-r-- 1 1000 1000   868206 Feb 25  2019 train.tsv
-rw-rw-r-- 1 1000 1000 42490887 Feb 25  2019 validated.tsv
root@294bb9e108fd:/DeepSpeech#

When I run the importer (import_cv2.py), I run into this issue:

root@294bb9e108fd:/DeepSpeech# bin/import_cv2.py deepspeech-data/
/bin/sh: 1: sox: not found
SoX could not be found!

    If you do not have SoX, proceed here:
     - - - http://sox.sourceforge.net/ - - -

    If you do (or think that you should) have SoX, double-check your
    path variables.

Loading TSV file:  /DeepSpeech/deepspeech-data/test.tsv
Importing mp3 files...
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
This install of SoX cannot process .mp3 files.
This install of SoX cannot process .mp3 files.
This install of SoX cannot process .mp3 files.
This install of SoX cannot process .mp3 files.
This install of SoX cannot process .mp3 files.
This install of SoX cannot process .mp3 files.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "bin/import_cv2.py", line 65, in one_sample
    _maybe_convert_wav(mp3_filename, wav_filename)
  File "bin/import_cv2.py", line 185, in _maybe_convert_wav
    transformer.build(mp3_filename, wav_filename)
  File "/usr/local/lib/python3.6/dist-packages/sox/transform.py", line 594, in build
    input_filepath, input_array, sample_rate_in
  File "/usr/local/lib/python3.6/dist-packages/sox/transform.py", line 496, in _parse_inputs
    input_format['channels'] = file_info.channels(input_filepath)
  File "/usr/local/lib/python3.6/dist-packages/sox/file_info.py", line 82, in channels
    output = soxi(input_filepath, 'c')
  File "/usr/local/lib/python3.6/dist-packages/sox/core.py", line 149, in soxi
    stderr=subprocess.PIPE
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'sox': 'sox'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "bin/import_cv2.py", line 221, in <module>
    main()
  File "bin/import_cv2.py", line 216, in main
    _preprocess_data(PARAMS.tsv_dir, audio_dir, PARAMS.space_after_every_character)
  File "bin/import_cv2.py", line 172, in _preprocess_data
    set_samples = _maybe_convert_set(dataset, tsv_dir, audio_dir, space_after_every_character)
  File "bin/import_cv2.py", line 127, in _maybe_convert_set
    for i, processed in enumerate(pool.imap_unordered(one_sample, samples), start=1):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
FileNotFoundError: [Errno 2] No such file or directory: 'sox': 'sox'
root@294bb9e108fd:/DeepSpeech#

This container is really a bare version; you need to add the dependencies for import_cv2.py, like sox.
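A quick way to confirm both messages in that log (sox missing, and sox unable to read .mp3) before starting a long import is a check like the sketch below. It assumes that sox --help prints an "AUDIO FILE FORMATS:" line listing the supported formats, which is what current SoX builds do; the function names are made up for this example. On a Debian or Ubuntu based image the packages are most likely sox and libsox-fmt-mp3, but treat those package names as an assumption and check your base image.

```python
import shutil
import subprocess


def parse_sox_formats(help_text):
    """Extract format names from the 'AUDIO FILE FORMATS:' line of `sox --help`."""
    for line in help_text.splitlines():
        if line.startswith("AUDIO FILE FORMATS:"):
            return set(line.split(":", 1)[1].split())
    return set()


def check_sox():
    """Return (installed, mp3_supported) for the sox binary found on PATH."""
    if shutil.which("sox") is None:
        return False, False
    result = subprocess.run(
        ["sox", "--help"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # works on the Python 3.6 shown in the traceback
    )
    return True, "mp3" in parse_sox_formats(result.stdout)


if __name__ == "__main__":
    installed, mp3_ok = check_sox()
    print("sox installed:", installed)
    print("mp3 supported:", mp3_ok)
```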


Great, it works after I installed sox.
But I see this warning message:

WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.

Should I abort or keep running?

root@294bb9e108fd:/DeepSpeech# bin/import_cv2.py deepspeech-data/
Loading TSV file:  /DeepSpeech/deepspeech-data/test.tsv
Importing mp3 files...
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
Progress |################################################################################################################################################################# |  99% completed
Imported 2267 samples.
Skipped 2 samples that were longer than 10 seconds.
Final amount of imported audio: 2:51:31 from 2:51:51.
Saving new DeepSpeech-formatted CSV file to:  /DeepSpeech/deepspeech-data/clips/test.csv
Writing CSV file for DeepSpeech.py as:  /DeepSpeech/deepspeech-data/clips/test.csv
Progress |##################################################################################################################################################################| 100% completed
Loading TSV file:  /DeepSpeech/deepspeech-data/dev.tsv
Importing mp3 files...
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
Progress |################################################################################################################################################################# |  99% completed
Imported 2269 samples.
Final amount of imported audio: 2:41:06 from 2:41:06.
Saving new DeepSpeech-formatted CSV file to:  /DeepSpeech/deepspeech-data/clips/dev.csv
Writing CSV file for DeepSpeech.py as:  /DeepSpeech/deepspeech-data/clips/dev.csv
Progress |##################################################################################################################################################################| 100% completed
Loading TSV file:  /DeepSpeech/deepspeech-data/train.tsv
Importing mp3 files...
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.

You should read the doc and understand what you are doing.

it stopped by itself!!!

root@294bb9e108fd:/DeepSpeech# bin/import_cv2.py deepspeech-data/
Loading TSV file:  /DeepSpeech/deepspeech-data/test.tsv
Importing mp3 files...
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
Progress |################################################################################################################################################################# |  99% completed
Imported 2267 samples.
Skipped 2 samples that were longer than 10 seconds.
Final amount of imported audio: 2:51:31 from 2:51:51.
Saving new DeepSpeech-formatted CSV file to:  /DeepSpeech/deepspeech-data/clips/test.csv
Writing CSV file for DeepSpeech.py as:  /DeepSpeech/deepspeech-data/clips/test.csv
Progress |##################################################################################################################################################################| 100% completed
Loading TSV file:  /DeepSpeech/deepspeech-data/dev.tsv
Importing mp3 files...
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
Progress |################################################################################################################################################################# |  99% completed
Imported 2269 samples.
Final amount of imported audio: 2:41:06 from 2:41:06.
Saving new DeepSpeech-formatted CSV file to:  /DeepSpeech/deepspeech-data/clips/dev.csv
Writing CSV file for DeepSpeech.py as:  /DeepSpeech/deepspeech-data/clips/dev.csv
Progress |##################################################################################################################################################################| 100% completed
Loading TSV file:  /DeepSpeech/deepspeech-data/train.tsv
Importing mp3 files...
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
WARNING: No --validate_label_locale specified, your might end with inconsistent dataset.
Progress |####################################################################################################################################################             |  92% completed
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "bin/import_cv2.py", line 65, in one_sample
    _maybe_convert_wav(mp3_filename, wav_filename)
  File "bin/import_cv2.py", line 185, in _maybe_convert_wav
    transformer.build(mp3_filename, wav_filename)
  File "/usr/local/lib/python3.6/dist-packages/sox/transform.py", line 594, in build
    input_filepath, input_array, sample_rate_in
  File "/usr/local/lib/python3.6/dist-packages/sox/transform.py", line 493, in _parse_inputs
    file_info.validate_input_file(input_filepath)
  File "/usr/local/lib/python3.6/dist-packages/sox/file_info.py", line 249, in validate_input_file
    "input_filepath {} does not exist.".format(input_filepath)
OSError: input_filepath deepspeech-data/clips/8d31fe1fe37219527930900426bf0614eb87342f3443ab33540b47900fa6f993f2a78e3699ef73714272fa93b9d90854c010f9d1ad7977ec97c7f2aa2619e64d.mp3 does not exist.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "bin/import_cv2.py", line 221, in <module>
    main()
  File "bin/import_cv2.py", line 216, in main
    _preprocess_data(PARAMS.tsv_dir, audio_dir, PARAMS.space_after_every_character)
  File "bin/import_cv2.py", line 172, in _preprocess_data
    set_samples = _maybe_convert_set(dataset, tsv_dir, audio_dir, space_after_every_character)
  File "bin/import_cv2.py", line 127, in _maybe_convert_set
    for i, processed in enumerate(pool.imap_unordered(one_sample, samples), start=1):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
OSError: input_filepath deepspeech-data/clips/8d31fe1fe37219527930900426bf0614eb87342f3443ab33540b47900fa6f993f2a78e3699ef73714272fa93b9d90854c010f9d1ad7977ec97c7f2aa2619e64d.mp3 does not exist.
Progress |##################################################################################################################################################################| 100% completed
Progress |##################################################################################################################################################################| 100% completed
Progress |#################################################################################################################################################################| 100% completed
root@294bb9e108fd:/DeepSpeech#

Read your error messages, please.

And search on our docs: https://deepspeech.readthedocs.io/en/v0.9.3/search.html?q=validate_label_locale&check_keywords=yes&area=default


I did read the docs, and I’m testing things to get a better understanding of how this works. Your help with that would be great.

So explain to us what is not clear in the docs about validate_label_locale. If you don’t share feedback, we can’t know what you found but failed to understand, and we can’t improve. This is not good for either you or the project.

@lissyx hello
I am working on Bengali speech-to-text. I have a dataset that differs from the Common Voice dataset: it contains FLAC files and the corresponding text in a CSV. I took some (2518) of the FLAC files, converted them to MP3, and created three separate TSV files: train (85%), test (5%), and dev (10%). The TSV files contain path (filenames), client_id (some random numbers), and sentence (the actual text), as in the Common Voice TSV files. The directory layout matches the Common Voice layout.
Then, when I try to create the corresponding train, test, and dev files using import_cv2.py (which is also responsible for converting the MP3 files to WAV), it says:
OSError: input_filepath dataset/clips/(a_random_filename).mp3 does not exist.
I checked the directory, and the file is there.
So I removed the file from both the directory and the TSV. Then it shows the same problem for another file that is also present in the directory.
Would you please give me a hand?
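For anyone hitting the same "does not exist" error, one way to narrow it down before re-running the importer is to list every TSV entry that has no matching file in clips. The sketch below is not part of DeepSpeech; find_missing_clips is a hypothetical helper, and it assumes the TSV has a header row with a path column, as described in the post above. It does not diagnose why a file that looks present is reported missing; it only shows exactly which names fail the lookup.

```python
import csv
import os


def find_missing_clips(tsv_path, clips_dir, column="path"):
    """Return the values of `column` that have no file in clips_dir.

    Hypothetical helper: assumes a tab-separated file with a header row.
    """
    missing = []
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quote characters inside sentences from confusing the parser.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            name = row[column]
            if not os.path.isfile(os.path.join(clips_dir, name)):
                missing.append(name)
    return missing


if __name__ == "__main__":
    data_dir = "deepspeech-data"  # adjust to your dataset directory
    for split in ("train", "dev", "test"):
        tsv = os.path.join(data_dir, split + ".tsv")
        if os.path.isfile(tsv):
            for name in find_missing_clips(tsv, os.path.join(data_dir, "clips")):
                # repr() makes stray spaces or other invisible characters visible.
                print(split, repr(name))
```

Printing with repr is a deliberate choice: a file name in the TSV that carries a trailing space or a different extension looks identical in a directory listing but still fails the existence check.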

This is a problem in your dataset. I can’t help and I don’t have time (I’m not working on this anymore).

If you want help, check out the successor to this project: coqui.

Check this post.