Building libdeepspeech.so: multiple definition zlib_archive vs zlib errors?

I’m trying to build libdeepspeech. For various reasons I am not using the DeepSpeech Dockerfile or the exact code from mozilla/DeepSpeech, but I was hoping someone might be familiar with this error and be able to help?

I’ll attach the full log, but the gist of it is a long run of ‘multiple definition’ errors during the linking stage, as follows.

ERROR: /code/tensorflow/native_client/BUILD:91:1: Linking of rule '//native_client:libdeepspeech.so' failed (Exit 1)

gzlib.c:(.text.gzopen+0x0): multiple definition of `gzopen'
bazel-out/k8-opt/bin/external/zlib_archive/libzlib.pic.a(gzlib.pic.o):gzlib.c:(.text.gzopen+0x0): first defined here
bazel-out/k8-opt/bin/external/zlib/libzlib.pic.a(gzlib.pic.o): In function `gzopen64':

This is then repeated 101 times. Essentially every zlib symbol is defined in both k8-opt/bin/external/zlib_archive and k8-opt/bin/external/zlib/.

Has anybody seen this, or have any suggestions for how one would go about fixing it?

I can’t seem to attach files, so… snippets from the build log are below.

root@832a3d4a38ef:/code/tensorflow# bazel --output_user_root=/code/build/.bazel_cache \
>                     build \
>                         --config=monolithic \
>                         --jobs 7 \
>                         --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" \
>                         -c opt \
>                         --copt=-O3 \
>                         --copt="-D_GLIBCXX_USE_CXX11_ABI=0" \
>                         --copt=-fvisibility=hidden \
>                         //native_client:libdeepspeech.so
Starting local Bazel server and connecting to it...
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=157
INFO: Reading rc options for 'build' from /code/tensorflow/.bazelrc:
  'build' options: --apple_platform_type=macos --define framework_shared_object=true --define open_source_build=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone --strategy=Genrule=standalone -c opt --announce_rc --define=grpc_no_ares=true --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include
INFO: Reading rc options for 'build' from /code/tensorflow/.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=/usr/bin/python --action_env PYTHON_LIB_PATH=/usr/local/lib/python3.7/dist-packages --python_path=/usr/bin/python --config=xla --action_env CUDA_TOOLKIT_PATH=/usr/local/cuda --action_env TF_CUDA_COMPUTE_CAPABILITIES=3.5,7.0 --action_env LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 --action_env GCC_HOST_COMPILER_PATH=/usr/bin/gcc --config=cuda --action_env TF_CONFIGURE_IOS=0
INFO: Found applicable config definition build:xla in file /code/tensorflow/.tf_configure.bazelrc: --define with_xla_support=true
INFO: Found applicable config definition build:cuda in file /code/tensorflow/.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file /code/tensorflow/.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
INFO: Found applicable config definition build:monolithic in file /code/tensorflow/.bazelrc: --define framework_shared_object=false
WARNING: /code/tensorflow/tensorflow/core/BUILD:647:12: in srcs attribute of cc_library rule //tensorflow/core:lib_proto_parsing: please do not import '//tensorflow/core/platform:protobuf.cc' directly. You should either move the file to this package or depend on an appropriate rule there
WARNING: /code/tensorflow/tensorflow/core/BUILD:2451:12: in srcs attribute of cc_library rule //tensorflow/core:lib_internal_impl: please do not import '//tensorflow/core/platform:abi.h' directly. You should either move the file to this package or depend on an appropriate rule there
WARNING: /code/tensorflow/tensorflow/core/BUILD:2451:12: in srcs attribute of cc_library rule //tensorflow/core:lib_internal_impl: please do not import '//tensorflow/core/platform:byte_order.h' directly. You should either move the file to this package or depend on an appropriate rule there
<snip>
WARNING: /code/tensorflow/native_client/BUILD:91:1: in cc_binary rule //native_client:libdeepspeech.so: target '//native_client:libdeepspeech.so' depends on deprecated target '//tensorflow/contrib/rnn:lstm_ops_op_lib': contrib/rnn kernels and ops are now part of core TensorFlow
INFO: Analysed target //native_client:libdeepspeech.so (108 packages loaded, 6350 targets configured).
INFO: Found 1 target...
INFO: From Executing genrule //native_client:workspace_status:
++ cut '-d ' -f2
++ grep STABLE_TF_GIT_VERSION bazel-out/stable-status.txt
+ tf_git_version=v1.15.0-24-gceb46aae58
++ cut '-d ' -f2
++ grep STABLE_DS_VERSION bazel-out/stable-status.txt
+ ds_version=0.7.1
++ grep STABLE_DS_GIT_VERSION bazel-out/stable-status.txt
++ cut '-d ' -f2
+ ds_git_version=v0.7.1-7-ga6c6dc21
++ grep STABLE_DS_GRAPH_VERSION bazel-out/stable-status.txt
++ cut '-d ' -f2
+ ds_graph_version=6
+ cat
ERROR: /code/tensorflow/native_client/BUILD:91:1: Linking of rule '//native_client:libdeepspeech.so' failed (Exit 1)
bazel-out/k8-opt/bin/external/zlib/libzlib.pic.a(adler32.pic.o): In function `adler32_z':
adler32.c:(.text.adler32_z+0x0): multiple definition of `adler32_z'
bazel-out/k8-opt/bin/external/zlib_archive/libzlib.pic.a(adler32.pic.o):adler32.c:(.text.adler32_z+0x0): first defined here
bazel-out/k8-opt/bin/external/zlib/libzlib.pic.a(adler32.pic.o): In function `adler32':
adler32.c:(.text.adler32+0x0): multiple definition of `adler32'
<snip>
zutil.c:(.text.zError+0x0): multiple definition of `zError'
bazel-out/k8-opt/bin/external/zlib_archive/libzlib.pic.a(zutil.pic.o):zutil.c:(.text.zError+0x0): first defined here
bazel-out/k8-opt/bin/external/zlib/libzlib.pic.a(zutil.pic.o):(.data.rel.ro.local.z_errmsg+0x0): multiple definition of `z_errmsg'
bazel-out/k8-opt/bin/external/zlib_archive/libzlib.pic.a(zutil.pic.o):(.data.rel.ro.local.z_errmsg+0x0): first defined here
bazel-out/k8-opt/bin/external/zlib/libzlib.pic.a(zutil.pic.o): In function `zcalloc':
zutil.c:(.text.zcalloc+0x0): multiple definition of `zcalloc'
bazel-out/k8-opt/bin/external/zlib_archive/libzlib.pic.a(zutil.pic.o):zutil.c:(.text.zcalloc+0x0): first defined here
bazel-out/k8-opt/bin/external/zlib/libzlib.pic.a(zutil.pic.o): In function `zcfree':
zutil.c:(.text.zcfree+0x0): multiple definition of `zcfree'
bazel-out/k8-opt/bin/external/zlib_archive/libzlib.pic.a(zutil.pic.o):zutil.c:(.text.zcfree+0x0): first defined here
collect2: error: ld returned 1 exit status
Target //native_client:libdeepspeech.so failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 22.922s, Critical Path: 11.30s
INFO: 1 process: 1 local.
FAILED: Build did NOT complete successfully

This appears to be related

FYI, the version of DeepSpeech we branched off of was:

commit 2e9c281d06ea8da97f7e4eebd3e4476350e7776a (tag: v0.7.1)
Merge: e23390eb d1b4ea85
Author: Reuben Morais <reuben.morais@gmail.com>
Date:   Tue May 12 17:29:44 2020 +0200

    Merge pull request #2990 from mozilla/release-071

    Bump VERSION to 0.7.1

And the version of tensorflow is:

commit ceb46aae5836a0f648a2c3da5942af2b7d1b98bf (HEAD -> r1.15, upstream/r1.15, origin/r1.15)
Merge: bd115ee104 917d341c6c
Author: lissyx <1645737+lissyx@users.noreply.github.com>
Date:   Fri Jan 17 19:49:35 2020 +0100

    Merge pull request #115 from lissyx/arm64-tflite

    Switch ARM64 builds to TFLite

PS: I find it a little tricky to work out exactly which version/commit of tensorflow goes with which version of DeepSpeech (and where to get it from). As you can see, this is pulling from the mozilla branch of tensorflow, but I’m not sure exactly which commit is the right one.

OK, so it seems that this patch should, in theory, prevent this error from happening.

But how can I be sure that the patch is actually being applied?

If the patch is needed and it fails to apply, Bazel would tell you.

It would still help if you shared more info, like the bazel build command and what exactly you changed.

I’m not sure what you mean here. We document using the mozilla rX.Y branch (r1.15 here), which is exactly the same naming as upstream tensorflow; the head of that branch is all you need.

99.9999% of the time, those duplicated definitions come from a missing --config=monolithic in the build command.
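For reference, per the --announce_rc output earlier in the log, the monolithic config is just a single define in /code/tensorflow/.bazelrc:

build:monolithic --define framework_shared_object=false

With framework_shared_object=false, everything is linked into the single libdeepspeech.so rather than against a separate libtensorflow_framework.so, so dependencies like zlib should only be pulled in once.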

Hey @lissyx, thanks for the reply. The command I used is:

root@13f49df5ac06:/code/tensorflow# bazel --output_user_root=/code/build/.bazel_cache build -c opt --config=monolithic --jobs 8 --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --verbose_failures --copt=-fvisibility=hidden //native_client:libdeepspeech.so

In the log above you’ll see the same command but with each option on a separate line.

I’m pretty sure I have tensorflow at mozilla/tensorflow (r1.15 tag)

I don’t quite know what you mean here. In some sense Bazel is in fact telling me, because it’s saying it doesn’t build… but perhaps you mean this in some other sense.

Thanks again for the reply!

Right at the beginning, it applies the patches; you would get an error clearly stating it’s unable to apply the patches if that step failed.

BTW, what are you building? We just merged the move to tensorflow r2.2 for inference; you might benefit from that.

It’s not a tag, it’s a branch.

Right, I missed that when reading; you do seem to have the monolithic config enabled. Just make sure you are not getting tricked by weird UTF-8 / copy-pasting that would mangle --, for example.
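As a quick sketch, one way to spot smart dashes that sneak in via copy/paste:

printf '%s' '--config=monolithic' | hexdump -C   # genuine ASCII dashes show up as 2d 2d

If you see any byte other than 2d where a dash should be, the flag is not what it looks like.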

So maybe you changed something related?

Also, have you tried completely eradicating Bazel’s cache? Sometimes it can go crazy. And which version of Bazel are you using?

Thanks for the replies, @lissyx. I’ll try to respond to each one as needed.

FWIW, I’m just trying to build v0.7.1 of DeepSpeech with these tiny changes to expose letter-by-letter confidences through the ds.sttWithMetadata(audio) method.

I’d totally think about upgrading to TensorFlow 2.2 etc., but we just spent ages upgrading everything to 0.7 and retraining the model, so I’m kinda keen to get this going at this version… :wink:

This letter-by-letter confidence stuff was working great in v0.5.1, BTW; I’m just trying to bring it over into 0.7.


Sorry, yes, I meant branch.

Just to eliminate all other confusion, I’m going to roll DeepSpeech back to the v0.7.1 tag, leave tensorflow at r1.15, rm -rf the .bazel_cache, and rebuild. It takes ages to do that, though, so it will be a while before I report back.
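Concretely, the plan is something like this (paths per my setup above; the DeepSpeech checkout path is just illustrative):

cd /code/DeepSpeech && git checkout v0.7.1    # roll DeepSpeech back to the tagged release
cd /code/tensorflow && git checkout r1.15     # stay on the mozilla r1.15 branch
bazel clean --expunge                         # drop all of Bazel's build state
rm -rf /code/build/.bazel_cache               # the --output_user_root used earlier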

bazel 0.24.1

More specifically… (this is from within our Docker container, BTW):

root@13f49df5ac06:/code/tensorflow# bazel info
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=157
INFO: Reading rc options for 'info' from /code/tensorflow/.bazelrc:
  Inherited 'build' options: --apple_platform_type=macos --define framework_shared_object=true --define open_source_build=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone --strategy=Genrule=standalone -c opt --announce_rc --define=grpc_no_ares=true --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include
INFO: Reading rc options for 'info' from /code/tensorflow/.tf_configure.bazelrc:
  Inherited 'build' options: --action_env PYTHON_BIN_PATH=/usr/bin/python --action_env PYTHON_LIB_PATH=/usr/local/lib/python3.7/dist-packages --python_path=/usr/bin/python --config=xla --action_env CUDA_TOOLKIT_PATH=/usr/local/cuda --action_env TF_CUDA_COMPUTE_CAPABILITIES=7.0 --action_env LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 --action_env GCC_HOST_COMPILER_PATH=/usr/bin/gcc --config=cuda --action_env TF_CONFIGURE_IOS=0
INFO: Found applicable config definition build:xla in file /code/tensorflow/.tf_configure.bazelrc: --define with_xla_support=true
INFO: Found applicable config definition build:cuda in file /code/tensorflow/.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file /code/tensorflow/.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
DEBUG: Rule 'io_bazel_rules_docker' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1556410077 -0400"
bazel-bin: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f/execroot/org_tensorflow/bazel-out/k8-opt/bin
bazel-genfiles: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f/execroot/org_tensorflow/bazel-out/k8-opt/genfiles
bazel-testlogs: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f/execroot/org_tensorflow/bazel-out/k8-opt/testlogs
character-encoding: file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1
command_log: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f/command.log
committed-heap-size: 964MB
execution_root: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f/execroot/org_tensorflow
gc-count: 5
gc-time: 119ms
install_base: /root/.cache/bazel/_bazel_root/install/7da6a92c096ada842b8d48c251312343
java-home: /root/.cache/bazel/_bazel_root/install/7da6a92c096ada842b8d48c251312343/_embedded_binaries/embedded_tools/jdk
java-runtime: OpenJDK Runtime Environment (build 11.0.2+7-LTS) by Azul Systems, Inc.
java-vm: OpenJDK 64-Bit Server VM (build 11.0.2+7-LTS, mixed mode) by Azul Systems, Inc.
max-heap-size: 14309MB
output_base: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f
output_path: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f/execroot/org_tensorflow/bazel-out
package_path: %workspace%
release: release 0.24.1
repository_cache: /root/.cache/bazel/_bazel_root/cache/repos/v1
server_log: /root/.cache/bazel/_bazel_root/dc321da52ae17570621748eb04acb03f/java.log.13f49df5ac06.root.log.java.20200625-103403.278
server_pid: 278
used-heap-size: 228MB
workspace: /code/tensorflow

As I mentioned, I’m currently rebuilding per the above.

That said, I really just wish I could figure out which bit of code inside the many bazel configs actually applies the patches. This thing about linking in multiple definitions of zlib because of protobuf really does seem to be a known problem with the tensorflow build, and the reason that patch exists in the first place… it just seems like, I dunno, it’s not being applied for some reason?
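For anyone else digging into this, my working assumption (not something I’ve verified in this tree) is that the patches are declared on the external repository rules themselves, so grepping the workspace definitions should show where they get wired up:

cd /code/tensorflow
grep -rn "patch_file" tensorflow/workspace.bzl third_party/   # attribute name assumed from TF's repo rules

If the zlib-related patch is listed there, Bazel applies it when it fetches that external repo.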

Right, so nothing that should have an impact.

Training is still on r1.15 and the models are compatible, so you can just do that safely.

OK, some people reported weird issues when building with Bazel 0.26.0.

Again, if the patch is referenced in TensorFlow’s build configs, it is applied. If it failed to apply, you would be blocked on it.

Anyway, you can still verify the files manually if you are unsure. Likely find -L . -type f -name "zlib*"?
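Or, since Bazel checks external repos out under its output_base (visible in the bazel info output above), something like:

ls "$(bazel info output_base)/external" | grep -i zlib   # shows whether both zlib and zlib_archive were fetched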

@utunga Also, are you able to rebuild from a clean tree? Can you try with our supplied Dockerfile.build? You can change the repo used when issuing make Dockerfile.build.

Thanks @lissyx. It’s late here so I may not get to this till tomorrow, but FYI I finished a build after cleaning the bazel_cache and rolling DeepSpeech back to v0.7.1, and I’m still getting the same ‘multiple definition’ errors relating to zlib.

I will also try with the supplied Dockerfile.build as you ask (more from a ‘clean bug report’ perspective, TBH, because this Dockerfile was able to build earlier things OK).

Actually, just to clarify: when you say Dockerfile.build, do you mean the HEAD Dockerfile.build.tmpl or the v0.7.1 Dockerfile?

PS: I am not quite sure what you mean by…

FWIW, I’m doing git clean -f -d and git reset --hard in both the tensorflow and DeepSpeech dirs before building, if that’s what you mean?

No, just building mozilla/tensorflow@r1.15 plain, after cleaning up any bazel cache.

The HEAD one.

Because I appreciate your help, @lissyx, I thought I’d give you an update on this…

I used the HEAD Dockerfile.build.tmpl and, with a few changes (listed below), was able to build both libdeepspeech.so and the Python wheel.

The first thing I did was verify, as you suggested, that I could build HEAD from mozilla/DeepSpeech against tensorflow 2.2… then I was able to point it at our custom DeepSpeech repo by altering the DEEPSPEECH_REPO and DEEPSPEECH_SHA params at the top of the Dockerfile.
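For example, rather than editing the file by hand, they can be passed through the make target @lissyx mentioned (the repo URL here is just a placeholder for ours):

make Dockerfile.build DEEPSPEECH_REPO=https://github.com/yourorg/DeepSpeech.git DEEPSPEECH_SHA=v0.7.1
docker build -t deepspeech-build -f Dockerfile.build .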

In case someone else is wrestling with this, I also had to do the following in order to build DeepSpeech at v0.7.1_gpu specifically:

The only problem I have now is that the Dockerfile builds against Python 3.6, but our code relies on async / await, so I have to upgrade to Python 3.7 and do it again.

Ubuntu 18.04 has python 3.7 packages, so you should be able to do so.
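Something like this inside the image should do it (package names assumed for Ubuntu 18.04):

apt-get update && apt-get install -y python3.7 python3.7-dev python3.7-venv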

Was that against our pristine repo or your code? Anyway, it looks like you’re making progress, so you should be able to work out whether it’s an issue with your changes or with your build environment.

Both, actually. I was able to generate the py3.6 version on the pristine version of DeepSpeech, and both py3.6 and py3.7 versions of the wheel on our own branch. That said, I’m unfortunately still having a few problems getting exactly the right combination of CUDA library dependencies and such to work.

Where? The dockerfile has them clearly stated. But TensorFlow r1.15 needs CUDA 10.0 + CUDNN v7.6