Problem with DeepSpeech training (latest version): deepspeech_training not recognised as a module when creating CSV/WAV files with bin/import

arpi.aszalos · August 28, 2020, 9:48am

Hey, I’m using Google Colab to train the latest version of DeepSpeech on a Spanish training set, and I have a problem in the data preprocessing part. Here is the GitHub gist of the full code I have so far: https://gist.github.com/aszi09/96bad06114310f2cf5b517df37352e16

It’s basically an error in the last part saying that there is no deepspeech_training module (included in the gist).

However, I have run into other errors, and I have no idea whether they are important or mean anything for training and future inference. I can provide other info as well; it’s just that I’m new to this.

Also, I have another training run going on another dataset with an earlier version (0.7.4), which works, but after 50 minutes it still hasn’t completed an epoch. I didn’t change any parameters when running it, so I’m unsure what the issue may be (an unsupported CUDA version may be the problem, as Google Colab uses 10.1).

Thanks in advance

lissyx · August 28, 2020, 9:57am

This is not clear. It depends on when and how you perform things.

So you are using master; please don’t if you are not ready to experience some failures.

Read the docs: 0.9.0a7 was published under the mvs_ctcdecoder package; we reverted that but have not yet published a new package under the new old name. Please either stick to the v0.8.2 tag for a stable release or v0.9.0-alpha.7 for master-that-works.

Without more information, we can’t help.

arpi.aszalos · August 28, 2020, 11:05am

I will try the v0.8.2 release, thanks. As for the other problem, I will write back if I see any errors with the v0.8.2 release. Thanks.

lissyx · August 28, 2020, 12:17pm

FTR, other community members have had success using https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/Dockerfile.train but I can’t tell if it works on Google Colab.

arpi.aszalos · August 28, 2020, 12:30pm

What is supposed to be the output after the “make Dockerfile.train” command? In the latest version I got “No rules … stopping” or something along those lines; it’s inside the gist somewhere. However, with the 0.7.4 version of DeepSpeech I did manage to start training and had a different output for “make Dockerfile.train”. I will try the version you suggested and will get back to you if something misbehaves. Thanks in advance.

lissyx · August 28, 2020, 12:30pm

Have you read and understood CONTRIBUTING.md? There’s no such requirement.

Please stick to our documentation and not some “gist somewhere”. We can’t provide support on that.

Again, you are mixing things.

arpi.aszalos · August 31, 2020, 11:42am

Hey, I tried setting up training with the version you suggested, v0.8.2. The first problem I had, which I’m guessing is related to pip, is this error message after !pip3 install --upgrade -e . :

Obtaining file:///content/DeepSpeech
Requirement already satisfied, skipping upgrade: numpy in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (1.18.5)
Requirement already satisfied, skipping upgrade: progressbar2 in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (3.38.0)
Requirement already satisfied, skipping upgrade: six in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (1.15.0)
Collecting pyxdg
  Using cached pyxdg-0.26-py2.py3-none-any.whl (40 kB)
Collecting attrdict
  Using cached attrdict-2.0.1-py2.py3-none-any.whl (9.9 kB)
Requirement already satisfied, skipping upgrade: absl-py in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (0.8.1)
Collecting semver
  Using cached semver-2.10.2-py2.py3-none-any.whl (12 kB)
Collecting opuslib==2.0.0
  Using cached opuslib-2.0.0.tar.gz (7.3 kB)
Collecting optuna
  Using cached optuna-2.0.0.tar.gz (226 kB)
Collecting sox
  Using cached sox-1.4.0-py2.py3-none-any.whl (39 kB)
Requirement already satisfied, skipping upgrade: bs4 in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (0.0.1)
Requirement already satisfied, skipping upgrade: pandas in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (1.0.5)
Requirement already satisfied, skipping upgrade: requests in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (2.23.0)
Collecting numba==0.47.0
  Using cached numba-0.47.0-cp36-cp36m-manylinux1_x86_64.whl (3.7 MB)
Requirement already satisfied, skipping upgrade: llvmlite==0.31.0 in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (0.31.0)
Requirement already satisfied, skipping upgrade: librosa in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.8.2) (0.6.3)
Collecting soundfile
  Using cached SoundFile-0.10.3.post1-py2.py3-none-any.whl (21 kB)
Collecting ds_ctcdecoder==0.8.2
  Using cached ds_ctcdecoder-0.8.2-cp36-cp36m-manylinux1_x86_64.whl (2.0 MB)
Collecting tensorflow==1.15.2
  Using cached tensorflow-1.15.2-cp36-cp36m-manylinux2010_x86_64.whl (110.5 MB)
Requirement already satisfied, skipping upgrade: python-utils>=2.3.0 in /usr/local/lib/python3.6/dist-packages (from progressbar2->deepspeech-training==0.8.2) (2.4.0)
Collecting alembic
  Using cached alembic-1.4.2.tar.gz (1.1 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
ERROR: Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/base_command.py", line 186, in _main
    status = self.run(options, args)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/commands/install.py", line 331, in run
    resolver.resolve(requirement_set)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/legacy_resolve.py", line 177, in resolve
    discovered_reqs.extend(self._resolve_one(requirement_set, req))
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/legacy_resolve.py", line 333, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/legacy_resolve.py", line 282, in _get_abstract_dist_for
    abstract_dist = self.preparer.prepare_linked_requirement(req)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 516, in prepare_linked_requirement
    req, self.req_tracker, self.finder, self.build_isolation,
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 95, in _get_prepared_distribution
    abstract_dist.prepare_distribution_metadata(finder, build_isolation)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/distributions/sdist.py", line 38, in prepare_distribution_metadata
    self._setup_isolation(finder)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/distributions/sdist.py", line 96, in _setup_isolation
    reqs = backend.get_requires_for_build_wheel()
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/wrappers.py", line 152, in get_requires_for_build_wheel
    'config_settings': config_settings
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/wrappers.py", line 255, in _call_hook
    raise BackendUnavailable(data.get('traceback', ''))
pip._vendor.pep517.wrappers.BackendUnavailable: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py", line 63, in _build_backend
    obj = import_module(mod_path)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.6/dist-packages/setuptools/__init__.py", line 5, in <module>
    import distutils.core
  File "/tmp/pip-build-env-mmrfxsqo/overlay/lib/python3.6/site-packages/_distutils_hack/__init__.py", line 82, in create_module
    return importlib.import_module('._distutils', 'setuptools')
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'setuptools._distutils'

After this, I tried continuing the process described in the documentation of the v0.8.2 release, but when I ran !bin/import_cv2.py I got this:

/bin/sh: 1: sox: not found
SoX could not be found!

    If you do not have SoX, proceed here:
     - - - http://sox.sourceforge.net/ - - -

    If you do (or think that you should) have SoX, double-check your
    path variables.
    
Traceback (most recent call last):
  File "bin/import_cv2.py", line 18, in <module>
    from deepspeech_training.util.downloader import SIMPLE_BAR
ModuleNotFoundError: No module named 'deepspeech_training'

The whole process I followed is in this gist:

https://gist.github.com/aszi09/41df4f9b232b4e2cd42be77fafb2de74

Thanks in advance for your response.

lissyx · August 31, 2020, 11:43am

no idea why, but this is obviously not a bug on our side

install the required deps?
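
Both failures in the output above (the sox binary not being on the PATH, and deepspeech_training not being importable) can be checked before running the importer. A minimal sketch, assuming a plain Python cell; the function name missing_prerequisites is hypothetical and not part of DeepSpeech:

```python
import importlib.util
import shutil


def missing_prerequisites(modules=("deepspeech_training",), binaries=("sox",)):
    """Return descriptions of required modules/binaries that cannot be found.

    Hypothetical helper for illustration; the default names come from the
    error messages quoted in this thread.
    """
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(name) is None:
            missing.append(f"python module: {name}")
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(name) is None:
            missing.append(f"binary: {name}")
    return missing
```

Calling print(missing_prerequisites()) before !bin/import_cv2.py should list anything that still needs installing; an empty list means both checks passed.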

lissyx · August 31, 2020, 12:00pm

This is weird: it also breaks on the Docker train CI (https://github.com/mozilla/DeepSpeech/issues/3295), but not in other places. I have no idea what is going on …

arpi.aszalos · August 31, 2020, 12:24pm

I’m quite new, so I don’t know exactly what that means, but if you need me to run any lines that would help resolve the issue, tell me.

lissyx · August 31, 2020, 12:28pm

As I said, it looks like it also happens on our CI, but I don’t have time to look at that right now …

lissyx · August 31, 2020, 2:02pm

@arpi.aszalos Looks like it’s a Python-level issue: https://github.com/pypa/setuptools/issues/2353

lissyx · August 31, 2020, 2:20pm

This error also seems to be a side effect of the linked pypa/setuptools issue; if you run pip install --upgrade . instead of pip install --upgrade -e . it should work. You might also need setuptools==50.0.0?

arpi.aszalos · August 31, 2020, 2:22pm

Okay, I will try with DeepSpeech v0.8.2. Thank you very much for helping.

arpi.aszalos · August 31, 2020, 3:02pm

I tried with the v0.8.2 version of DeepSpeech and used !pip3 install --upgrade .
The error doesn’t occur anymore; however, this message was shown:

Processing /content/DeepSpeech
Requirement already satisfied, skipping upgrade: numpy in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (1.18.5)
Requirement already satisfied, skipping upgrade: progressbar2 in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (3.38.0)
Requirement already satisfied, skipping upgrade: six in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (1.15.0)
Collecting pyxdg
  Downloading pyxdg-0.26-py2.py3-none-any.whl (40 kB)
     |████████████████████████████████| 40 kB 3.4 MB/s 
Collecting attrdict
  Downloading attrdict-2.0.1-py2.py3-none-any.whl (9.9 kB)
Requirement already satisfied, skipping upgrade: absl-py in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (0.8.1)
Collecting semver
  Downloading semver-2.10.2-py2.py3-none-any.whl (12 kB)
Collecting opuslib==2.0.0
  Downloading opuslib-2.0.0.tar.gz (7.3 kB)
Collecting optuna
  Downloading optuna-2.0.0.tar.gz (226 kB)
     |████████████████████████████████| 226 kB 12.9 MB/s 
Collecting sox
  Downloading sox-1.4.0-py2.py3-none-any.whl (39 kB)
Requirement already satisfied, skipping upgrade: bs4 in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (0.0.1)
Requirement already satisfied, skipping upgrade: pandas in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (1.0.5)
Requirement already satisfied, skipping upgrade: requests in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (2.23.0)
Collecting numba==0.47.0
  Downloading numba-0.47.0-cp36-cp36m-manylinux1_x86_64.whl (3.7 MB)
     |████████████████████████████████| 3.7 MB 12.0 MB/s 
Requirement already satisfied, skipping upgrade: llvmlite==0.31.0 in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (0.31.0)
Requirement already satisfied, skipping upgrade: librosa in /usr/local/lib/python3.6/dist-packages (from deepspeech-training==0.9.0a7) (0.6.3)
Collecting soundfile
  Downloading SoundFile-0.10.3.post1-py2.py3-none-any.whl (21 kB)
ERROR: Could not find a version that satisfies the requirement ds_ctcdecoder==0.9.0-alpha.7 (from deepspeech-training==0.9.0a7) (from versions: 0.6.1, 0.7.0, 0.7.1, 0.7.3, 0.7.4, 0.8.0a3, 0.8.0a4, 0.8.0a5, 0.8.0a6, 0.8.0a7, 0.8.0a8, 0.8.0, 0.8.1, 0.8.2, 0.9.0a0, 0.9.0a1, 0.9.0a2, 0.9.0a3, 0.9.0a4, 0.9.0a5)
ERROR: No matching distribution found for ds_ctcdecoder==0.9.0-alpha.7 (from deepspeech-training==0.9.0a7)

Furthermore, when I try to run !bin/import_cv2.py I still get this:

Traceback (most recent call last):
  File "bin/import_cv2.py", line 18, in <module>
    from deepspeech_training.util.downloader import SIMPLE_BAR
ModuleNotFoundError: No module named 'deepspeech_training'

Here is the gist of the process I followed:

https://gist.github.com/aszi09/7230a188d3bd97db04fdc684cedba912

Thanks in advance!

lissyx · August 31, 2020, 3:03pm

You can’t have that error on v0.8.2 …

Please understand it’s upstream tooling that is broken, I can’t help you.
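
The pip output quoted above names deepspeech-training==0.9.0a7 on every line, which is the sign that the checkout being installed was not the v0.8.2 tag. A minimal sketch of that check on captured pip output; the function names are hypothetical, and the regular expression only assumes the "deepspeech-training==<version>" fragments that appear in the logs in this thread:

```python
import re

# Matches the "deepspeech-training==<version>" fragments that pip prints.
_VERSION_RE = re.compile(r"deepspeech-training==([0-9A-Za-z.+!-]+)")


def versions_in_pip_log(log_text):
    """Return the set of deepspeech-training versions mentioned in pip output."""
    return set(_VERSION_RE.findall(log_text))


def checkout_matches(log_text, expected):
    """True when pip reported exactly the expected version and nothing else."""
    return versions_in_pip_log(log_text) == {expected}
```

With the log above, versions_in_pip_log would report 0.9.0a7, so checkout_matches(log, "0.8.2") is False and the clone step is the place to look.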

lissyx · August 31, 2020, 3:04pm

Which is meaningless since the previous install failed so deepspeech_training is not installed.

arpi.aszalos · August 31, 2020, 3:05pm

Yes, sorry, I didn’t include --branch v0.8.2 when cloning. Thanks for helping. I will see if it works now.

arpi.aszalos · August 31, 2020, 3:25pm

I got it to import, thank you so much.
I did have to pin setuptools==50.0.0 and use !pip3 install --upgrade .
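
For reference, the steps that ended up working can be sketched as follows. This is a minimal sketch, not the exact notebook cells: the function names, the DeepSpeech directory name, and the order of the two pip calls are assumptions, and the repository URL is taken from the issue link earlier in the thread.

```python
import subprocess

# Repository URL inferred from the issue link in this thread.
REPO_URL = "https://github.com/mozilla/DeepSpeech"


def install_commands(tag="v0.8.2", setuptools_pin="50.0.0", workdir="DeepSpeech"):
    """Build (argv, cwd) pairs for cloning a tag and doing a non-editable install.

    Hypothetical helper; cwd is None when the command runs in the current directory.
    """
    return [
        # Clone the tagged release rather than master.
        (["git", "clone", "--branch", tag, REPO_URL, workdir], None),
        # Pin setuptools to the version reported as working in this thread.
        (["pip3", "install", f"setuptools=={setuptools_pin}"], None),
        # Non-editable install ("." instead of "-e ."), run inside the checkout.
        (["pip3", "install", "--upgrade", "."], workdir),
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for argv, cwd in commands:
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_all(install_commands()) performs the clone and both installs; building the list separately makes it easy to print and inspect the commands first.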

lissyx · August 31, 2020, 3:26pm

Thanks, so it confirms there’s something bad with this release, given the number of people reporting issues against it in the past hours.