Dockerfile build issue

Hello. I tried to set up my own training model by building the Dockerfile in Docker for Windows, but I got the error log below, and I couldn't find any existing post about this error.

[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 158s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 353s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 571s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 839s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 1136s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 1471s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 1856s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 2299s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 2809s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 3395s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 4069s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 4843s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 5736s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 6761s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 7940s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 9296s local … (2 actions running)
[3,614 / 4,117] Compiling tensorflow/core/kernels/cwise_op_minimum.cc; 10855s local … (2 actions running)
ERROR: /tensorflow/tensorflow/core/kernels/BUILD:7985:1: C++ compilation of rule ‘//tensorflow/core/kernels:deepspeech_cwise_ops’ failed (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command
(cd /root/.cache/bazel/bazel_root/68a62076e91007a7908bc42a32e4cff9/execroot/org_tensorflow &&
exec env -
CUDA_TOOLKIT_PATH=/usr/local/cuda
GCC_HOST_COMPILER_PATH=/usr/bin/gcc
LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu/:/usr/local/cuda/lib64/stubs/
PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/proc/self/cwd
PYTHON_BIN_PATH=/usr/bin/python3.6
PYTHON_LIB_PATH=/usr/lib/python3.6/dist-packages
TF_CONFIGURE_IOS=0
TF_CUDA_COMPUTE_CAPABILITIES=6.0
TF_CUDA_PATHS=/usr/local/cuda,/usr/lib/x86_64-linux-gnu/
TF_CUDA_VERSION=10.0
TF_CUDNN_VERSION=7
TF_NCCL_VERSION=2.3
TF_NEED_CUDA=1
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/tensorflow/core/kernels/objs/deepspeech_cwise_ops/cwise_op_mul_1.pic.d '-frandom-seed=bazel-out/k8-opt/bin/tensorflow/core/kernels/objs/deepspeech_cwise_ops/cwise_op_mul_1.pic.o’ -D__CLANG_SUPPORT_DYN_ANNOTATION -DEIGEN_MPL2_ONLY ‘-DEIGEN_MAX_ALIGN_BYTES=64’ ‘-DEIGEN_HAS_TYPE_TRAITS=0’ -DTF_USE_SNAPPY -iquote . -iquote bazel-out/k8-opt/genfiles -iquote bazel-out/k8-opt/bin -iquote external/com_google_absl -iquote bazel-out/k8-opt/genfiles/external/com_google_absl -iquote bazel-out/k8-opt/bin/external/com_google_absl -iquote external/eigen_archive -iquote bazel-out/k8-opt/genfiles/external/eigen_archive -iquote bazel-out/k8-opt/bin/external/eigen_archive -iquote external/local_config_sycl -iquote bazel-out/k8-opt/genfiles/external/local_config_sycl -iquote bazel-out/k8-opt/bin/external/local_config_sycl -iquote external/nsync -iquote bazel-out/k8-opt/genfiles/external/nsync -iquote bazel-out/k8-opt/bin/external/nsync -iquote external/gif_archive -iquote bazel-out/k8-opt/genfiles/external/gif_archive -iquote bazel-out/k8-opt/bin/external/gif_archive -iquote external/jpeg -iquote bazel-out/k8-opt/genfiles/external/jpeg -iquote bazel-out/k8-opt/bin/external/jpeg -iquote external/com_google_protobuf -iquote bazel-out/k8-opt/genfiles/external/com_google_protobuf -iquote bazel-out/k8-opt/bin/external/com_google_protobuf -iquote external/zlib_archive -iquote bazel-out/k8-opt/genfiles/external/zlib_archive -iquote bazel-out/k8-opt/bin/external/zlib_archive -iquote external/com_googlesource_code_re2 -iquote bazel-out/k8-opt/genfiles/external/com_googlesource_code_re2 -iquote bazel-out/k8-opt/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/k8-opt/genfiles/external/farmhash_archive -iquote bazel-out/k8-opt/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/k8-opt/genfiles/external/fft2d -iquote 
bazel-out/k8-opt/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/k8-opt/genfiles/external/highwayhash -iquote bazel-out/k8-opt/bin/external/highwayhash -iquote external/double_conversion -iquote bazel-out/k8-opt/genfiles/external/double_conversion -iquote bazel-out/k8-opt/bin/external/double_conversion -iquote external/snappy -iquote bazel-out/k8-opt/genfiles/external/snappy -iquote bazel-out/k8-opt/bin/external/snappy -iquote external/hwloc -iquote bazel-out/k8-opt/genfiles/external/hwloc -iquote bazel-out/k8-opt/bin/external/hwloc -iquote external/local_config_cuda -iquote bazel-out/k8-opt/genfiles/external/local_config_cuda -iquote bazel-out/k8-opt/bin/external/local_config_cuda -iquote external/local_config_tensorrt -iquote bazel-out/k8-opt/genfiles/external/local_config_tensorrt -iquote bazel-out/k8-opt/bin/external/local_config_tensorrt -iquote external/gemmlowp -iquote bazel-out/k8-opt/genfiles/external/gemmlowp -iquote bazel-out/k8-opt/bin/external/gemmlowp -Ibazel-out/k8-opt/bin/external/local_config_cuda/cuda/virtual_includes/cuda_headers_virtual -Ibazel-out/k8-opt/bin/external/local_config_tensorrt/virtual_includes/tensorrt_headers -Ibazel-out/k8-opt/bin/external/local_config_cuda/cuda/virtual_includes/cublas_headers_virtual -Ibazel-out/k8-opt/bin/external/local_config_cuda/cuda/virtual_includes/cudnn_header -isystem external/eigen_archive -isystem bazel-out/k8-opt/genfiles/external/eigen_archive -isystem bazel-out/k8-opt/bin/external/eigen_archive -isystem external/nsync/public -isystem bazel-out/k8-opt/genfiles/external/nsync/public -isystem bazel-out/k8-opt/bin/external/nsync/public -isystem external/gif_archive -isystem bazel-out/k8-opt/genfiles/external/gif_archive -isystem bazel-out/k8-opt/bin/external/gif_archive -isystem external/com_google_protobuf/src -isystem bazel-out/k8-opt/genfiles/external/com_google_protobuf/src -isystem bazel-out/k8-opt/bin/external/com_google_protobuf/src -isystem external/zlib_archive -isystem 
bazel-out/k8-opt/genfiles/external/zlib_archive -isystem bazel-out/k8-opt/bin/external/zlib_archive -isystem external/farmhash_archive/src -isystem bazel-out/k8-opt/genfiles/external/farmhash_archive/src -isystem bazel-out/k8-opt/bin/external/farmhash_archive/src -isystem external/double_conversion -isystem bazel-out/k8-opt/genfiles/external/double_conversion -isystem bazel-out/k8-opt/bin/external/double_conversion -isystem external/hwloc/hwloc -isystem bazel-out/k8-opt/genfiles/external/hwloc/hwloc -isystem bazel-out/k8-opt/bin/external/hwloc/hwloc -isystem external/hwloc/include -isystem bazel-out/k8-opt/genfiles/external/hwloc/include -isystem bazel-out/k8-opt/bin/external/hwloc/include -isystem external/local_config_cuda/cuda -isystem bazel-out/k8-opt/genfiles/external/local_config_cuda/cuda -isystem bazel-out/k8-opt/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-out/k8-opt/genfiles/external/local_config_cuda/cuda/cuda/include -isystem bazel-out/k8-opt/bin/external/local_config_cuda/cuda/cuda/include -isystem external/local_config_cuda/cuda/cublas/include -isystem bazel-out/k8-opt/genfiles/external/local_config_cuda/cuda/cublas/include -isystem bazel-out/k8-opt/bin/external/local_config_cuda/cuda/cublas/include ‘-std=c++11’ -Wno-builtin-macro-redefined '-D__DATE=“redacted”’ '-D__TIMESTAMP=“redacted”’ '-D__TIME
_=“redacted”’ -fPIC -U_FORTIFY_SOURCE ‘-D_FORTIFY_SOURCE=1’ -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -O3 ‘-D_GLIBCXX_USE_CXX11_ABI=0’ ‘-mtune=generic’ ‘-march=x86-64’ -msse -msse2 -msse3 -msse4.1 -msse4.2 -mavx ‘-fvisibility=hidden’ -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare ‘-ftemplate-depth=900’ -fno-exceptions ‘-DGOOGLE_CUDA=1’ -msse3 -DTENSORFLOW_MONOLITHIC_BUILD -pthread ‘-DGOOGLE_CUDA=1’ -c tensorflow/core/kernels/cwise_op_mul_1.cc -o bazel-out/k8-opt/bin/tensorflow/core/kernels/_objs/deepspeech_cwise_ops/cwise_op_mul_1.pic.o)
Execution platform: @bazel_tools//platforms:host_platform
gcc: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
Target //native_client:libdeepspeech.so failed to build
INFO: Elapsed time: 12337.005s, Critical Path: 11254.53s
INFO: 2178 processes: 2178 local.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully
The command ‘/bin/sh -c bazel build --workspace_status_command=“bash native_client/bazel_workspace_status_cmd.sh” --config=monolithic --config=cuda -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-mtune=generic --copt=-march=x86-64 --copt=-msse --copt=-msse2 --copt=-msse3 --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-fvisibility=hidden //native_client:libdeepspeech.so --verbose_failures --action_env=LD_LIBRARY_PATH=${LD_LIBRARY_PATH}’ returned a non-zero code: 1

If I have made a mistake, I hope somebody can correct me. Thank you.

How much memory do you have? The "Killed (program cc1plus)" line feels like an OOM (out of memory).

I had allocated 2GB of RAM to Docker; my PC has 32GB of RAM.

So should I set it to 8GB of RAM and build again?

I have built it several times, but the error is not the same each time.

I will increase the memory and try again first. If I hit the error again, I will update this post.

Thank you for your reply.

CUDA builds with Bazel are really memory-intensive; 8GB might be barely enough. Please explore the Bazel build docs to see how to limit resource usage: https://docs.bazel.build/versions/master/user-manual.html
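As a rough illustration of limiting resources, one option is to cap the number of parallel compile actions with Bazel's `--jobs` flag. The sketch below is a hypothetical helper: the function names and the 4GB-per-job figure are assumptions for illustration, not measured values and not something taken from the DeepSpeech docs. It picks a conservative `--jobs` value from the memory given to the container.

```python
def suggest_bazel_jobs(container_ram_gb, cpu_count, gb_per_job=4.0):
    """Return a conservative value for Bazel's --jobs flag.

    gb_per_job is a guess at the peak memory of one compiler process
    (an assumption; heavy TensorFlow kernels may need more or less).
    """
    if container_ram_gb <= 0 or cpu_count < 1 or gb_per_job <= 0:
        raise ValueError("RAM, CPU count and per-job memory must be positive")
    # How many compile jobs fit in memory at the assumed per-job cost.
    by_memory = int(container_ram_gb // gb_per_job)
    # Never exceed the CPU count, and always allow at least one job.
    return max(1, min(cpu_count, by_memory))


def bazel_resource_flags(container_ram_gb, cpu_count, gb_per_job=4.0):
    """Build the extra command-line flags as a list of strings."""
    jobs = suggest_bazel_jobs(container_ram_gb, cpu_count, gb_per_job)
    return ["--jobs={}".format(jobs)]


if __name__ == "__main__":
    # Example: a container with 16GB of RAM on an 8-core machine.
    print(" ".join(bazel_resource_flags(16, 8)))
```

With 16GB and 8 cores this prints `--jobs=4`, which could be appended to the `bazel build` command in the Dockerfile. A lower job count trades build time for a smaller peak memory footprint.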

Thank you lissyx,

I increased the memory to 16GB and the build succeeded in 1.5 hours.

Thank you very much.

When I execute the following command:

./DeepSpeech.py --train_files …/data/CV/zh-HK/clips/train.csv --dev_files …/data/CV/zh-HK/clips/dev.csv --test_files …/data/CV/zh-HK/clips/test.csv

I get the error message below.

Traceback (most recent call last):
  File "./DeepSpeech.py", line 7, in <module>
    from deepspeech_training import train as ds_train
  File "/usr/local/lib/python3.6/dist-packages/deepspeech_training/train.py", line 30, in <module>
    from .evaluate import evaluate
  File "/usr/local/lib/python3.6/dist-packages/deepspeech_training/evaluate.py", line 26, in <module>
    check_ctcdecoder_version()
  File "/usr/local/lib/python3.6/dist-packages/deepspeech_training/util/helpers.py", line 53, in check_ctcdecoder_version
    rv = semver.compare(ds_version_s, decoder_version_s)
  File "/usr/local/lib/python3.6/dist-packages/semver.py", line 452, in compare
    v1, v2 = parse(ver1), parse(ver2)
  File "/usr/local/lib/python3.6/dist-packages/semver.py", line 74, in parse
    raise ValueError("%s is not valid SemVer string" % version)
ValueError: …/…/VERSION is not valid SemVer string

I found a post related to this error message, but it does not seem to be the same situation.

Should I clone the source code from the repository and rebuild it?

Hi @axcn, were you able to solve the issue?
If yes, could you please share the steps that you took?

I am getting the same issue.

Can you elaborate, please? What makes your situation different? We can't spend our time digging through other posts and guessing at your situation …

@axcn @Vibhav_Anand please:

  • make sure you git cloned the proper tree
  • make sure you have a VERSION file at the root, and that it is readable
  • make sure you have set up the virtualenv properly
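The VERSION checks above can be sketched as a small script. This is only an illustration: `check_version_file` is a hypothetical helper, the regular expression is a simplified stand-in for the real `semver` parser, and the path to pass in depends on where your checkout or installed package lives.

```python
import os
import re

# Simplified MAJOR.MINOR.PATCH pattern with optional pre-release and build
# parts. This approximates SemVer; it is not the semver package's own parser.
SEMVER_RE = re.compile(
    r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$"
)


def check_version_file(path):
    """Report whether a VERSION file exists, is readable, and looks like SemVer."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    with open(path, encoding="utf-8") as handle:
        content = handle.read().strip()
    if not SEMVER_RE.match(content):
        # Show the raw content so it is clear what the parser received.
        return "invalid: {!r}".format(content)
    return "ok: {}".format(content)


if __name__ == "__main__":
    # Hypothetical location; adjust to your own tree.
    print(check_version_file("VERSION"))
```

An "invalid" result that shows a path instead of a version number would match the `ValueError` in the traceback, where the string handed to `semver` was a relative path rather than a version.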

I edited the VERSION file in /usr/local/lib/python3.6/dist-packages/deepspeech_training/, changing it
from

…/…/VERSION

to (the version number of DeepSpeech)

0.7.0

to work around the error.

Of course, this is not the correct way to solve the issue.

Again, can you reply to the questions I am asking?
What does your statement mean?

You don't have the file?
It is not readable?

Why do you have a /usr/local path? That would imply you are not using a virtualenv?
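A quick way to check the virtualenv question from inside Python is to compare `sys.prefix` with `sys.base_prefix`. A minimal sketch, where `is_virtualenv` and `in_virtualenv` are hypothetical helper names:

```python
import sys


def is_virtualenv(prefix, base_prefix, real_prefix=None):
    """Decide whether the given interpreter prefixes indicate a virtualenv.

    real_prefix is only set by older releases of the virtualenv tool;
    the venv module makes prefix differ from base_prefix instead.
    """
    return real_prefix is not None or prefix != base_prefix


def in_virtualenv():
    """Apply the check to the interpreter that is currently running."""
    return is_virtualenv(
        sys.prefix,
        getattr(sys, "base_prefix", sys.prefix),
        getattr(sys, "real_prefix", None),
    )


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    print("inside a virtualenv:", in_virtualenv())
```

If this reports `False` and packages resolve under a system-wide directory such as the `/usr/local/lib/python3.6/dist-packages` path in the traceback, the training package was most likely installed into the system Python rather than into a virtualenv.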

I built DeepSpeech with the Dockerfile; I didn't know I still needed to run the command

python3 -m venv $HOME/tmp/deepspeech-train-venv/

Well, it might not be needed, but I insist once again: this Dockerfile is provided as-is and is only tested to build, not to train; it's not mentioned in any doc.

Could you please be precise about how you "run training": have you modified the Dockerfile, do you connect to the container, etc.?

I also asked you three questions, and I don't see any answer.

ls -halR /usr/local/lib/python3.6/dist-packages/deepspeech_training/

As I am a newbie, sorry for the inconvenience.

After I changed VERSION to 0.7.0, I could run training and testing. I have not trained on English, so I cannot say whether the results are accurate or not.

When I generated my own lm.binary and vocab-500000.txt, training ran normally.
But there was an error when testing started.

/usr/local/lib/python3.6/dist-packages/deepspeech_training/:
total 84K
drwxr-sr-x 1 root staff 4.0K 4月  27 14:46 .
drwxrwsr-x 1 root staff 4.0K 4月  26 20:05 ..
-rw-r--r-- 1 root staff 5.7K 4月  26 19:22 evaluate.py
-rwxr-xr-x 1 root staff   19 4月  26 19:22 GRAPH_VERSION
-rw-r--r-- 1 root staff    0 4月  26 19:22 __init__.py
drwxr-sr-x 2 root staff 4.0K 4月  26 19:22 __pycache__
-rw-r--r-- 1 root staff  40K 4月  26 19:22 train.py
drwxr-sr-x 1 root staff 4.0K 4月  27 03:25 util
-rwxr-xr-x 1 root staff   14 4月  27 14:46 VERSION

/usr/local/lib/python3.6/dist-packages/deepspeech_training/__pycache__:
total 48K
drwxr-sr-x 2 root staff 4.0K 4月  26 19:22 .
drwxr-sr-x 1 root staff 4.0K 4月  27 14:46 ..
-rw-r--r-- 1 root staff 5.3K 4月  26 19:22 evaluate.cpython-36.pyc
-rw-r--r-- 1 root staff  147 4月  26 19:22 __init__.cpython-36.pyc
-rw-r--r-- 1 root staff  24K 4月  26 19:22 train.cpython-36.pyc

/usr/local/lib/python3.6/dist-packages/deepspeech_training/util:
total 168K
drwxr-sr-x 1 root staff 4.0K 4月  27 03:25 .
drwxr-sr-x 1 root staff 4.0K 4月  27 14:46 ..
-rw-r--r-- 1 root staff  15K 4月  26 19:22 audio.py
-rw-r--r-- 1 root staff 2.7K 4月  26 19:22 check_characters.py
-rw-r--r-- 1 root staff 6.1K 4月  26 19:22 checkpoints.py
-rw-r--r-- 1 root staff 5.7K 4月  26 19:22 config.py
-rw-r--r-- 1 root staff 1.1K 4月  26 19:22 downloader.py
-rw-r--r-- 1 root staff 4.0K 4月  26 19:22 evaluate_tools.py
-rw-r--r-- 1 root staff 9.5K 4月  26 19:22 feeding.py
-rw-r--r-- 1 root staff  16K 4月  27 03:25 flags.py
-rw-r--r-- 1 root staff  310 4月  26 19:22 gpu.py
-rw-r--r-- 1 root staff 4.2K 4月  26 19:22 helpers.py
-rw-r--r-- 1 root staff 3.3K 4月  26 19:22 importers.py
-rw-r--r-- 1 root staff    0 4月  26 19:22 __init__.py
-rw-r--r-- 1 root staff  969 4月  26 19:22 logging.py
drwxr-sr-x 1 root staff 4.0K 4月  27 03:38 __pycache__
-rw-r--r-- 1 root staff  15K 4月  26 19:22 sample_collections.py
-rw-r--r-- 1 root staff 9.6K 4月  26 19:22 sparse_image_warp.py
-rw-r--r-- 1 root staff 6.7K 4月  26 19:22 spectrogram_augmentations.py
-rw-r--r-- 1 root staff 1.9K 4月  26 19:22 stm.py
-rw-r--r-- 1 root staff 5.3K 4月  26 19:22 taskcluster.py
-rw-r--r-- 1 root staff 5.8K 4月  26 19:22 text.py

/usr/local/lib/python3.6/dist-packages/deepspeech_training/util/__pycache__:
total 152K
drwxr-sr-x 1 root staff 4.0K 4月  27 03:38 .
drwxr-sr-x 1 root staff 4.0K 4月  27 03:25 ..
-rw-r--r-- 1 root staff  13K 4月  26 19:22 audio.cpython-36.pyc
-rw-r--r-- 1 root staff 2.6K 4月  26 19:22 check_characters.cpython-36.pyc
-rw-r--r-- 1 root staff 4.8K 4月  26 19:22 checkpoints.cpython-36.pyc
-rw-r--r-- 1 root staff 3.7K 4月  26 19:22 config.cpython-36.pyc
-rw-r--r-- 1 root staff 1.1K 4月  26 19:22 downloader.cpython-36.pyc
-rw-r--r-- 1 root staff 4.0K 4月  26 19:22 evaluate_tools.cpython-36.pyc
-rw-r--r-- 1 root staff 7.5K 4月  26 19:22 feeding.cpython-36.pyc
-rw-r--r-- 1 root staff  15K 4月  27 03:38 flags.cpython-36.pyc
-rw-r--r-- 1 root staff  637 4月  26 19:22 gpu.cpython-36.pyc
-rw-r--r-- 1 root staff 5.3K 4月  26 19:22 helpers.cpython-36.pyc
-rw-r--r-- 1 root staff 3.1K 4月  26 19:22 importers.cpython-36.pyc
-rw-r--r-- 1 root staff  152 4月  26 19:22 __init__.cpython-36.pyc
-rw-r--r-- 1 root staff 1.4K 4月  26 19:22 logging.cpython-36.pyc
-rw-r--r-- 1 root staff  14K 4月  26 19:22 sample_collections.cpython-36.pyc
-rw-r--r-- 1 root staff 6.8K 4月  26 19:22 sparse_image_warp.cpython-36.pyc
-rw-r--r-- 1 root staff 4.6K 4月  26 19:22 spectrogram_augmentations.cpython-36.pyc
-rw-r--r-- 1 root staff 2.2K 4月  26 19:22 stm.cpython-36.pyc
-rw-r--r-- 1 root staff 4.6K 4月  26 19:22 taskcluster.cpython-36.pyc
-rw-r--r-- 1 root staff 5.2K 4月  26 19:22 text.cpython-36.pyc

So, again, what was the content of that file (before you changed it)?

Being a newbie does not excuse you from answering the simple questions I am asking.