Hi there!
I am trying to build a backend API that can transcribe audio files using a model I trained myself on Spanish data. I have successfully trained and exported the model, but I am running into problems when building a Docker image from the Dockerfile.build file for inference.
Currently using:
- Ubuntu 18.04
- DeepSpeech code v0.8.0
- CUDA 10.0
It seems that a file the build depends on is no longer available.
Command used: docker build -t ds-gpu-inference-image .
Sending build context to Docker daemon 2.112MB
Step 1/78 : FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
---> b4879c167fc1
Step 2/78 : ENV DEEPSPEECH_REPO=https://github.com/mozilla/DeepSpeech.git
---> Using cache
---> 444156e926a9
Step 3/78 : ENV DEEPSPEECH_SHA=f56b07dab4542eecfb72e059079db6c2603cc0ee
---> Using cache
---> 384b8c501aea
Step 4/78 : RUN apt-get update && apt-get install -y --no-install-recommends apt-utils bash-completion build-essential ca-certificates cmake curl g++ gcc git libbz2-dev libboost-all-dev libgsm1-dev libltdl-dev liblzma-dev libmagic-dev libpng-dev libsox-fmt-mp3 libsox-dev locales openjdk-8-jdk pkg-config python3 python3-dev python3-pip python3-wheel python3-numpy sox unzip wget zlib1g-dev
---> Using cache
---> efc6c3b960f5
Step 5/78 : RUN update-alternatives --install /usr/bin/pip pip /usr/bin/pip3 1
---> Using cache
---> 5ba47c76f517
Step 6/78 : RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
---> Using cache
---> 644df0f52ab4
Step 7/78 : RUN curl -LO "https://github.com/bazelbuild/bazel/releases/download/2.0.0/bazel_2.0.0-linux-x86_64.deb"
---> Using cache
---> 0507b6601591
Step 8/78 : RUN dpkg -i bazel_*.deb
---> Using cache
---> 2b1af20cd1e8
Step 9/78 : RUN rm -rf /var/lib/apt/lists/*
---> Using cache
---> 4cacf777f3a7
Step 10/78 : ENV TF_NEED_ROCM 0
---> Using cache
---> 3f27c2e14ead
Step 11/78 : ENV TF_NEED_OPENCL_SYCL 0
---> Using cache
---> b4b2ee280043
Step 12/78 : ENV TF_NEED_OPENCL 0
---> Using cache
---> 22256f4c31a2
Step 13/78 : ENV TF_NEED_CUDA 1
---> Using cache
---> 087a9749ff65
Step 14/78 : ENV TF_CUDA_PATHS "/usr,/usr/local/cuda-10.1,/usr/lib/x86_64-linux-gnu/"
---> Using cache
---> efba441d0240
Step 15/78 : ENV TF_CUDA_VERSION 10.1
---> Using cache
---> 2b7766e5eae0
Step 16/78 : ENV TF_CUDNN_VERSION 7.6
---> Using cache
---> db6e969af19d
Step 17/78 : ENV TF_CUDA_COMPUTE_CAPABILITIES 6.0
---> Using cache
---> 6f2da0577550
Step 18/78 : ENV TF_NCCL_VERSION 2.4
---> Using cache
---> e83383f6370f
Step 19/78 : ENV TF_BUILD_CONTAINER_TYPE GPU
---> Using cache
---> 38400da19ffb
Step 20/78 : ENV TF_BUILD_OPTIONS OPT
---> Using cache
---> a4eefdf7f939
Step 21/78 : ENV TF_BUILD_DISABLE_GCP 1
---> Using cache
---> 1d3000fa789d
Step 22/78 : ENV TF_BUILD_ENABLE_XLA 0
---> Using cache
---> 1cdcdc3c900b
Step 23/78 : ENV TF_BUILD_PYTHON_VERSION PYTHON3
---> Using cache
---> c14e53f797a6
Step 24/78 : ENV TF_BUILD_IS_OPT OPT
---> Using cache
---> 49fe25a28bed
Step 25/78 : ENV TF_BUILD_IS_PIP PIP
---> Using cache
---> 529142550289
Step 26/78 : ENV CC_OPT_FLAGS -mavx -mavx2 -msse4.1 -msse4.2 -mfma
---> Using cache
---> 5dfe84271b8e
Step 27/78 : ENV TF_NEED_GCP 0
---> Using cache
---> bdca3e85d066
Step 28/78 : ENV TF_NEED_HDFS 0
---> Using cache
---> 55e4cd2b64a8
Step 29/78 : ENV TF_NEED_JEMALLOC 1
---> Using cache
---> 733cbf159b70
Step 30/78 : ENV TF_NEED_OPENCL 0
---> Using cache
---> 02baafd3ab56
Step 31/78 : ENV TF_CUDA_CLANG 0
---> Using cache
---> 6a38cdd39d12
Step 32/78 : ENV TF_NEED_MKL 0
---> Using cache
---> 6cda864189a3
Step 33/78 : ENV TF_ENABLE_XLA 0
---> Using cache
---> 9ab772a5589e
Step 34/78 : ENV TF_NEED_AWS 0
---> Using cache
---> 61efb8c69886
Step 35/78 : ENV TF_NEED_KAFKA 0
---> Using cache
---> 497d1e296270
Step 36/78 : ENV TF_NEED_NGRAPH 0
---> Using cache
---> 58b78a2e6207
Step 37/78 : ENV TF_DOWNLOAD_CLANG 0
---> Using cache
---> d5d0932e3951
Step 38/78 : ENV TF_NEED_TENSORRT 0
---> Using cache
---> 03c1b52e2f3c
Step 39/78 : ENV TF_NEED_GDR 0
---> Using cache
---> de520a921cd7
Step 40/78 : ENV TF_NEED_VERBS 0
---> Using cache
---> ba51095102bb
Step 41/78 : ENV TF_NEED_OPENCL_SYCL 0
---> Using cache
---> 13b82cb7bc44
Step 42/78 : ENV PYTHON_BIN_PATH /usr/bin/python3.6
---> Using cache
---> 7986e3530984
Step 43/78 : ENV PYTHON_LIB_PATH /usr/local/lib/python3.6/dist-packages
---> Using cache
---> b235a61c40a9
Step 44/78 : RUN echo "startup --batch" >>/etc/bazel.bazelrc
---> Using cache
---> 12a852e2387b
Step 45/78 : RUN echo "build --spawn_strategy=standalone --genrule_strategy=standalone" >>/etc/bazel.bazelrc
---> Using cache
---> e9a7d6054fb0
Step 46/78 : WORKDIR /
---> Using cache
---> e0edc519b068
Step 47/78 : RUN git clone --recursive $DEEPSPEECH_REPO
---> Using cache
---> 1cd739787180
Step 48/78 : WORKDIR /DeepSpeech
---> Using cache
---> 6ad8a9936a60
Step 49/78 : RUN git checkout $DEEPSPEECH_SHA
---> Using cache
---> e66941d9666d
Step 50/78 : RUN git submodule sync tensorflow/
---> Using cache
---> 51f3cdf5dfdf
Step 51/78 : RUN git submodule update --init tensorflow/
---> Using cache
---> 51a4ab54deac
Step 52/78 : WORKDIR /DeepSpeech/tensorflow
---> Using cache
---> 75ec19c7a3d2
Step 53/78 : RUN ./configure
---> Using cache
---> 4ac002464f1f
Step 54/78 : RUN bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=cuda -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-mtune=generic --copt=-march=x86-64 --copt=-msse --copt=-msse2 --copt=-msse3 --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-fvisibility=hidden //native_client:libdeepspeech.so --verbose_failures --action_env=LD_LIBRARY_PATH=${LD_LIBRARY_PATH}
---> Using cache
---> 4e0cdffccee8
Step 55/78 : RUN cp bazel-bin/native_client/libdeepspeech.so /DeepSpeech/native_client/
---> Using cache
---> ca0ebf61759c
Step 56/78 : ENV TFDIR /DeepSpeech/tensorflow
---> Using cache
---> 3cc353bad69b
Step 57/78 : RUN nproc
---> Using cache
---> 1bb00577d8cc
Step 58/78 : WORKDIR /DeepSpeech/native_client
---> Using cache
---> 823fb6066949
Step 59/78 : RUN make NUM_PROCESSES=$(nproc) deepspeech
---> Using cache
---> 8d3a17d399b0
Step 60/78 : WORKDIR /DeepSpeech
---> Using cache
---> ef58c905daaa
Step 61/78 : RUN cd native_client/python && make NUM_PROCESSES=$(nproc) bindings
---> Running in 594bdd4e8e7e
mkdir -p /DeepSpeech/native_client/ds-swig
wget -O - ""https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.swig.linux.amd64.b5fea54d39832d1d132d7dd921b69c0c2c9d5118/artifacts/public/ds-swig.tar.gz"" | tar -C /DeepSpeech/native_client/ds-swig -zxf -
--2020-08-17 09:35:21-- https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.swig.linux.amd64.b5fea54d39832d1d132d7dd921b69c0c2c9d5118/artifacts/public/ds-swig.tar.gz
Resolving community-tc.services.mozilla.com (community-tc.services.mozilla.com)... 34.102.144.36
Connecting to community-tc.services.mozilla.com (community-tc.services.mozilla.com)|34.102.144.36|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2020-08-17 09:35:22 ERROR 404: Not Found.
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
make: *** [/DeepSpeech/native_client/ds-swig/bin/swig] Error 2
../definitions.mk:226: recipe for target '/DeepSpeech/native_client/ds-swig/bin/swig' failed
The command '/bin/sh -c cd native_client/python && make NUM_PROCESSES=$(nproc) bindings' returned a non-zero code: 2
From the output I can see that the file https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.swig.linux.amd64.b5fea54d39832d1d132d7dd921b69c0c2c9d5118/artifacts/public/ds-swig.tar.gz is no longer available.
I also tried fetching it by hand in Chrome and got the same result:
{
  "code": "ResourceNotFound",
  "message": "Indexed task not found\n\n---\n\n* method: findArtifactFromTask\n* errorCode: ResourceNotFound\n* statusCode: 404\n* time: 2020-08-17T08:40:34.063Z",
  "requestInfo": {
    "method": "findArtifactFromTask",
    "params": {
      "0": "public/ds-swig.tar.gz",
      "indexPath": "project.deepspeech.swig.linux.amd64.b5fea54d39832d1d132d7dd921b69c0c2c9d5118",
      "name": "public/ds-swig.tar.gz"
    },
    "payload": {},
    "time": "2020-08-17T08:40:34.063Z"
  }
}
I have modified the Dockerfile to meet my needs, but only by adding to it. Here are the modified contents in case they help; I have omitted most of the file to keep the post from getting too long, and everything not shown is untouched:
# Build KenLM in /DeepSpeech/native_client/kenlm folder
WORKDIR /DeepSpeech/native_client
RUN rm -rf kenlm && \
    git clone https://github.com/kpu/kenlm && \
    cd kenlm && \
    git checkout 87e85e66c99ceff1fab2500a7c60c01da7315eec && \
    mkdir -p build && \
    cd build && \
    cmake .. && \
    make -j $(nproc)
# START >> My modifications
EXPOSE 8064
RUN mkdir /node_backend
WORKDIR /node_backend
COPY package.json ./
COPY app.js ./
COPY src/ ./
# The build already runs as root and the base image has no sudo, so the setup script is piped straight to bash
RUN curl -sL https://deb.nodesource.com/setup_12.x | bash -
RUN apt-get install -y nodejs
RUN npm install
# Start the server when the container runs, not at build time (a RUN here would block the build forever)
CMD ["node", "app.js"]
# END << My modifications
# Done
WORKDIR /DeepSpeech
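For completeness, once the image builds I plan to run it like this (a sketch assuming the host has Docker 19.03+ and nvidia-container-toolkit installed; the port simply matches the EXPOSE above):

# Run the inference backend with GPU access, publishing the Node port
docker run --gpus all -p 8064:8064 ds-gpu-inference-image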
Does anyone know what I might have done wrong or misconfigured? Any advice would be much appreciated.
Thanks in advance