DeepSpeech native client compilation for the Asus Tinker Board

(Mar Martinez) #1


I am trying to run the native client on an Asus Tinker Board, a card with an architecture similar to the Raspberry Pi 3 (armv7l, 32-bit).
But I am a bit stuck now.

The steps I followed are:

  1. Create a clean OS SD card with TinkerOS (Debian), install Miniconda3 (because some Python packages are available there without compilation), and create a conda environment deep-speech with Python 2.7.

  2. Install DeepSpeech following the project instructions, except for tensorflow, which has to be compiled because no package is available in either pip or conda, and I need the compilation for the native client anyway.
    Obviously the native_client download from taskcluster does not work, because it is the Linux 64-bit one.

  3. Compile bazel and tensorflow from scratch with these instructions:
    WARNING: the tensorflow code is retrieved from mozilla/tensorflow, not from the upstream tensorflow site

  4. Compile the DeepSpeech native_client with the instructions here (no language bindings, just the custom decoder):
    NOTE: these steps were done on both cards, Tinker Board and Raspberry Pi 3

  5. Finally, try to run a pretrained toy Spanish model (one that I have used successfully on my Mac before) with the native client and some test wav files; the exact invocation is sketched after this list.
    The same error appears on both cards, RPi3 and Tinker Board:
    Invalid argument: No OpKernel was registered to support Op 'SparseToDense' with these attrs. Registered devices: [CPU], Registered kernels:
    device='CPU'; T in [DT_STRING]; Tindices in [DT_INT64]
    device='CPU'; T in [DT_STRING]; Tindices in [DT_INT32]
    device='CPU'; T in [DT_BOOL]; Tindices in [DT_INT64]
    device='CPU'; T in [DT_BOOL]; Tindices in [DT_INT32]
    device='CPU'; T in [DT_FLOAT]; Tindices in [DT_INT64]
    device='CPU'; T in [DT_FLOAT]; Tindices in [DT_INT32]
    device='CPU'; T in [DT_INT32]; Tindices in [DT_INT64]
    device='CPU'; T in [DT_INT32]; Tindices in [DT_INT32]
    [[Node: SparseToDense = SparseToDense[T=DT_INT64, Tindices=DT_INT64, validate_indices=true](CTCBeamSearchDecoder, CTCBeamSearchDecoder:2, CTCBeamSearchDecoder:1, SparseToDense/default_value)]]

  6. I found this post: Error with sample model on Raspbian Jessie
    and downloaded the precompiled Raspberry Pi libraries linked from it.
    Those libraries do not include one of the files I need, so I kept the compiled one I had.

  7. With the Raspberry Pi libraries, the model works FINE on the Raspberry Pi card :-), but the Tinker Board throws a new error:
    Thread 1 "deepspeech" received signal SIGILL, Illegal instruction.
    0xb692de84 in tensorflow::(anonymous namespace)::GraphConstructor::TryImport() () from /home/ftx/fonotexto/herramientas/DeepSpeech/

  8. I have run out of ideas, so I am posting this question to get any new hint that can unblock me.
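
For reference, the invocation in step 5 looks roughly like this (the file names here are placeholders, not my actual Spanish model files):

./deepspeech output_graph.pb test.wav alphabet.txt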

This is the overview of the story; if you need additional details, let me know.
Thanks a lot for your help,

(Lissyx) #2

Just use our tooling; I know nothing about those instructions, but obviously they’re not good.

From our mozilla/tensorflow checkout, use the r1.5 branch together with master for mozilla/DeepSpeech. To build for RPi3, just add --config=rpi3 to the bazel build command line.

If you don’t use --config=rpi3, the build will not pick up Bazel’s RPi3 toolchain definition, which includes the -DRASPBERRY_PI flag needed for SparseToDense and others to behave properly.
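
Roughly, something like this (a sketch; adjust paths, and have your mozilla/DeepSpeech master checkout next to the tensorflow one so the native_client symlink resolves):

git clone https://github.com/mozilla/tensorflow.git
cd tensorflow
git checkout r1.5
ln -s ../DeepSpeech/native_client ./
./configure
bazel build --config=monolithic --config=rpi3 -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:libctc_decoder_with_kenlm.so //native_client:generate_trie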

(Mar Martinez) #3

Hi again,

The tensorflow sources I am using are these:
Should I use these ones instead?:

I have tried to follow your hint, without success :-(.

  1. First of all, I tried to compile on the Tinker Board card itself with the --config=rpi3 param:

bazel build --config=monolithic --config=rpi3 -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:libctc_decoder_with_kenlm.so //native_client:generate_trie

and end up with this error:

tools/arm_compiler/gcc_arm_rpi/arm-linux-gnueabihf-gcc: line 3: /proc/self/cwd/external/GccArmRpi/arm-bcm2708/arm-rpi-4.9.3-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc: cannot execute binary file: Exec format error

Because the tensorflow toolchain downloads a 64-bit compiler to do the job, I assume this configuration is designed for cross-compiling.

  2. Then I tried to cross-compile from the Mac using --config=rpi3:

bazel build --config=monolithic -c opt --copt=-O3 --config=rpi3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:libctc_decoder_with_kenlm.so //native_client:generate_trie

The process finished OK, but the generated binaries are 64-bit as well, so they cannot be used.
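
(I checked with file, e.g.:

file bazel-bin/native_client/generate_trie

and it reports a 64-bit binary instead of the 32-bit ARM one I need.)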

So, any other hint? Maybe I should set something special in the tensorflow ./configure, at the prompt “Please specify optimization flags to use during compilation when bazel option ‘--config=opt’ is specified [Default is -march=native]”?

I would like to compile DeepSpeech on the Tinker Board itself, because your precompiled Raspberry Pi libraries throw the “Illegal instruction” error.
Is there any alternative for compiling without the bazel toolchain, just using make?


(Lissyx) #4

If you compile on the board itself, it’s going to be slow, but it should work: don’t use --config=rpi3 if you want to do that.

You are right that this is cross-compilation, but it is designed and tested only from a Linux host, not from a Mac: the downloaded toolchain consists of Linux binaries (the official RPi Foundation toolchain), so I have no idea what you might get outside of that.

But cross-compiling with --config=rpi3 will (hopefully) get you the same binaries, so it might fail the same way. Ideally you should add a cross-compilation target for your board in tools/arm_compiler/CROSSTOOL.

Looking at the specs, it’s not that close to the RPi3: that one is an ARM Cortex-A53, while the Asus is an RK3288 Cortex-A17. So I don’t really understand how you might expect RPi3 binaries to run by default :).

As a quick hack, you can change the compiler_flag sections in the above CROSSTOOL file, in the toolchain section identified by gcc_rpi_linux_armhf, and specifically adapt -mtune and -mfpu to your system. I’m unsure whether the GCC provided by the RPi3 toolchain will be good enough in your case.
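
For example, something along these lines in that toolchain section (an untested guess for the RK3288; the Cortex-A17 has NEON and VFPv4):

compiler_flag: "-march=armv7-a"
compiler_flag: "-mtune=cortex-a17"
compiler_flag: "-mfpu=neon-vfpv4"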

If that’s not working, then your best (but sloooooow) option is in-situ compilation, but just DON’T add --config=rpi3 there.
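
That is, on the board, just run the plain build:

bazel build --config=monolithic -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:libctc_decoder_with_kenlm.so //native_client:generate_trie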

(Mar Martinez) #5

Yes, I was very naive to think your precompiled Raspberry Pi libraries could work on this architecture :frowning:
I will try to follow these new instructions. Thanks a lot.

(Lissyx) #6

Since we explicitly target the Raspbian distro and the Cortex-A53 architecture of the RPi3, it would in fact be really surprising if it worked.

There might be a third option: using the TensorFlow cross-compilation bits that landed after I did my work, but I have not explored precisely how they work, so I cannot recommend or guide you on that for now.

I’d really suggest sticking to cross-compilation, though, since it’s much, much faster; even with some trial and error, it’s likely to be the quicker way to get something working in the end. In-situ compilation will strain your memory and (might) require extra setup for swap, etc.

(Mar Martinez) #7


I finally compiled on the card itself, using make instead of bazel.

  1. Set up a swap area on a pendrive (3 GB) to speed things up (see the sketch after step 2).

  2. Compile tensorflow using these directives for RPi3, but add the extra parameter ANDROID_TYPES=-D__ANDROID_TYPES_FULL__ to the make line:

Yes, this takes a long time :confounded: (about 1-2 hours with -j4).
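
For reference, this is roughly what I ran (flags as in the RPi makefile directives, written down from memory; adapt the pendrive mount point and swap size to your setup):

sudo dd if=/dev/zero of=/mnt/pendrive/swapfile bs=1M count=3072
sudo mkswap /mnt/pendrive/swapfile
sudo swapon /mnt/pendrive/swapfile

tensorflow/contrib/makefile/download_dependencies.sh
make -j4 -f tensorflow/contrib/makefile/Makefile HOST_OS=PI TARGET=PI ANDROID_TYPES=-D__ANDROID_TYPES_FULL__ OPTFLAGS="-Os -mfpu=neon-vfpv4 -funsafe-math-optimizations -ftree-vectorize"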

  3. Compile deepspeech (libraries and binary) with a custom Makefile for the libraries, following the bazel instructions and the information in your BUILD file, and including the objects from the tensorflow compilation; the link step is sketched below.
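
The link step of that custom Makefile ends up looking roughly like this (object list and paths simplified from my actual file; adjust the protobuf path to wherever the makefile built it; the --whole-archive is needed so all the op kernels stay registered):

g++ -shared -o libdeepspeech.so native_client/*.o \
  -Wl,--whole-archive tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a -Wl,--no-whole-archive \
  tensorflow/contrib/makefile/gen/protobuf/lib/libprotobuf.a \
  -lpthread -ldl -lm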

And it works fine :slight_smile: .

Thanks a lot for your help!