We could, but I’ll have to be in the office to do that, so not before tomorrow at best. Besides, can you run sha1sum to verify everything? Models, audio, code, libs?
There you go:
1f20d309fb3a07d5dad01f934adcaa8ee5656154  ./bin/README.mozilla
26d885f7a8557c293f42ae72c51f5340e4454bab  ./bin/deepspeech
27e1ac471eead95e8389e67fef8abf5446475b0b  ./bin/libtensorflow_framework.so
71ee4a4524235a94d463853b381ca0dc7c81171b  ./bin/libdeepspeech.so
029ecf9d7b5a8dfdd5bdf7146a7c128111abab45  ./bin/libdeepspeech_utils.so
2be65d4ff1d982f78d4f45a900c5ee9627ad59da  ./bin/libtensorflow_cc.so
6023b6a9d16e1febe8bf5d2e5ac1269e9f0fc116  ./bin/libctc_decoder_with_kenlm.so
8e6627932560a0b5a33e3a1ee8ebbc0da5e4fa29  ./bin/generate_trie
d22157abc0fc0b4ae96380c09528e23cf77290a9  ./bin/LICENSE
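As an aside, comparing sums by eye is error-prone; saving the expected sums to a file and letting sha1sum -c do the comparison is less fiddly. A minimal sketch (the file and content below are placeholders, not the actual release files):

```shell
# Create a sample file and record its SHA-1 (placeholder content, for illustration).
echo "hello deepspeech" > sample.bin
sha1sum sample.bin > checksums.sha1

# Later, or on another machine, verify the file against the recorded sums.
# sha1sum -c prints "<file>: OK" per file and exits non-zero on any mismatch.
sha1sum -c checksums.sha1
```

The same pattern works for a whole release directory: `sha1sum ./bin/* > checksums.sha1` on the known-good machine, then `sha1sum -c checksums.sha1` on the device.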
I have realised a few things:
- The very first error that I got was because of lm.binary, which is around 1.5 GB. It does not fit into the RPi’s memory, and that is why it crashes.
- The second error that I got (trying without lm.binary) was resolved by using your “LDC93S1” output_graph.pb. Of course, recognition was totally off in that case.
So the questions arise:
- What is the lm.binary file and is it really needed?
- How can accuracy be achieved without the lm.binary file?
- Is there a proper output_graph.pb that I can use?
lm.binary is the language model, used to improve recognition when the acoustic model’s output was phonetically right but the word was wrong.
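For reference, with that era’s native client the language model and its trie were extra positional arguments after the alphabet (compare the timing command later in this thread, which omits them). The paths here are placeholders, and the exact argument order may differ between versions, so this is only a sketch:

```shell
# Hedged sketch of a native-client invocation WITH the language model.
# Without lm.binary/trie, decoding falls back to the acoustic model alone,
# which is where phonetically plausible but wrong words show up.
#
#   ./deepspeech output_graph.pb audio.wav alphabet.txt lm.binary trie
#
# Echoed here so the snippet runs even without the binary installed:
echo "./deepspeech output_graph.pb audio.wav alphabet.txt lm.binary trie"
```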
output_graph.pb: I think the one I used for “good” decoding was the test overfit on LDC93S1.
I just checked, @fotiDim, and in my tests above, perfect decoding was achieved on the overfitted LDC93S1 model:
pi@raspberrypi:~/tc/rpi3-cpu $ time ./deepspeech ~/models-release/models/output_graph.pb ~/models-release/audio/2830-3980-0043.wav ~/models-release/models/alphabet.txt -t
experience proves les
cpu_time_overall=80.82691 cpu_time_mfcc=0.02943 cpu_time_infer=80.79748

real	1m39.378s
user	1m17.210s
sys	0m6.850s
The SHA1 for the release’s alphabet.txt matches yours, but as you can see in the post above, I have proper decoding.
That is weird then. Can I ask one last favor: could you share the .img from your SD card?
Send me an email in private.
I am trying to run inference with a custom-trained model using the native client on an Asus Tinker Board, a board similar to the Raspberry Pi with a 32-bit armv7l architecture.
I tried to compile from scratch, including TensorFlow, but ended up with the same error this issue reports (the SparseToDense one).
Then I tried your raspberry-ready compiled libraries downloaded from:
But unfortunately, this new “Illegal instruction” error showed up:
Thread 1 “deepspeech” received signal SIGILL, Illegal instruction.
0xb692de84 in tensorflow::(anonymous namespace)::GraphConstructor::TryImport() () from /home/ftx/fonotexto/herramientas/DeepSpeech/nativelibs_jmd/libdeepspeech.so
(NOTE: on my Raspberry Pi 3 these libraries work fine.)
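An “Illegal instruction” (SIGILL) from a prebuilt library usually means it was compiled for a CPU feature set the target board does not have (different ARM FPU/NEON variants, for instance), which would explain why the same binaries run on the Pi 3 but not here. Before picking build flags, it can help to check what the board’s CPU actually reports (Linux-only; the field name varies by architecture and kernel):

```shell
# Print the machine architecture (e.g. armv7l, aarch64, x86_64).
uname -m

# Print the CPU feature line the kernel exposes.
# On ARM kernels the line is "Features:", on x86 it is "flags:".
grep -m1 -iE '^(Features|flags)' /proc/cpuinfo || echo "no feature line found"
```

The features listed there (e.g. neon, vfpv4 on ARM) are what the compiler’s `-mfpu`/`-march` options must not exceed.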
So I would like to do a specific compilation for the Tinker Board; could you give me any hints on doing that while avoiding the SparseToDense error?
Thanks in advance,
Heads up, I have been working on moving to a newer GCC toolchain (from Linaro) and building armv7-a binaries. This URL should point to that: https://index.taskcluster.net/v1/task/project.deepspeech.deepspeech.native_client.master.arm/artifacts/public/native_client.tar.xz
Switching to this newer toolchain also made it cleaner to build for other targets. You can copy some of the build flags we have defined here: https://github.com/mozilla/tensorflow/blob/7455bfae5cc3b5eafbc9a5d8faee33e7772a4d12/tools/bazel.rc#L52-L56
Make sure you pass all those you need in --copt=, and adapt things like
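To make the shape of this concrete: Bazel lets you group such --copt flags under a named config in a .bazelrc. The values below are hypothetical, illustrative GCC ARM options, not the actual ones from the linked bazel.rc; read the real flags there and substitute your board’s tuning (cross-check against the CPU features the board reports first):

```
# Hypothetical .bazelrc fragment -- flag values are placeholders, adapt to your CPU.
build:tinkerboard --copt=-march=armv7-a
build:tinkerboard --copt=-mfpu=neon-vfpv4
build:tinkerboard --copt=-mfloat-abi=hard
```

A config defined this way would then be selected with `bazel build --config=tinkerboard <target>`.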