First, I download DeepSpeech from this link:
Then I download native_client from this one:
https://index.taskcluster.net/v1/task/project.deepspeech.deepspeech.native_client.v0.2.0.cpu/artifacts/public/native_client.tar.xz
I extract both archives and move the native_client folder into the folder extracted from the DeepSpeech archive, so that everything is combined in one folder.
I enter the native_client folder inside DeepSpeech and try to compile the kenlm copy bundled with it, as follows:
mkdir build
cd build/
cmake ..
CMake prints the following:
`-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Boost version: 1.65.1
-- Found the following Boost libraries:
--   program_options
--   system
--   thread
--   unit_test_framework
--   chrono
--   date_time
--   atomic
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11")
-- Found BZip2: /usr/lib/x86_64-linux-gnu/libbz2.so (found version "1.0.6")
-- Looking for BZ2_bzCompressInit
-- Looking for BZ2_bzCompressInit - found
-- Looking for lzma_auto_decoder in /usr/lib/x86_64-linux-gnu/liblzma.so
-- Looking for lzma_auto_decoder in /usr/lib/x86_64-linux-gnu/liblzma.so - found
-- Looking for lzma_easy_encoder in /usr/lib/x86_64-linux-gnu/liblzma.so
-- Looking for lzma_easy_encoder in /usr/lib/x86_64-linux-gnu/liblzma.so - found
-- Looking for lzma_lzma_preset in /usr/lib/x86_64-linux-gnu/liblzma.so
-- Looking for lzma_lzma_preset in /usr/lib/x86_64-linux-gnu/liblzma.so - found
-- Found LibLZMA: /usr/include (found version "5.2.2")
CMake Error at util/CMakeLists.txt:58 (add_subdirectory):
  add_subdirectory given source "stream" which is not an existing directory.
CMake Error at lm/CMakeLists.txt:44 (add_subdirectory):
  add_subdirectory given source "builder" which is not an existing directory.
CMake Error at lm/CMakeLists.txt:45 (add_subdirectory):
  add_subdirectory given source "filter" which is not an existing directory.
-- Could NOT find Eigen3 (missing: EIGEN3_INCLUDE_DIR EIGEN3_VERSION_OK) (Required is at least version "2.91.0")
CMake Warning at lm/interpolate/CMakeLists.txt:65 (message):
  Not building interpolation. Eigen3 was not found.
-- To install Eigen3 in your home directory, copy paste this:
export EIGEN3_ROOT=$HOME/eigen-eigen-07105f7124f9
(cd $HOME; wget -O - https://bitbucket.org/eigen/eigen/get/3.2.8.tar.bz2 |tar xj)
rm CMakeCache.txt
-- Configuring incomplete, errors occurred!
See also "/home/manuel/Descargas/version2/DeepSpeech-0.2.0/native_client/kenlm/build/CMakeFiles/CMakeOutput.log".
See also "/home/manuel/Descargas/version2/DeepSpeech-0.2.0/native_client/kenlm/build/CMakeFiles/CMakeError.log".
`
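The three `add_subdirectory` errors name directories (`util/stream`, `lm/builder`, `lm/filter`) that CMake expects but cannot find, which suggests that the kenlm copy shipped inside native_client is a reduced subset of the full repository. A minimal sketch to confirm which of them are missing; the function name `check_kenlm_dirs` and the default path are assumptions made for illustration only:

```shell
#!/bin/sh
# Hypothetical helper: report whether the subdirectories named in the
# CMake errors exist under a given kenlm source tree.
check_kenlm_dirs() {
  kenlm_dir=$1
  for sub in util/stream lm/builder lm/filter; do
    if [ -d "$kenlm_dir/$sub" ]; then
      echo "present: $sub"
    else
      echo "missing: $sub"
    fi
  done
}

# Assumed location, relative to the DeepSpeech folder; adjust as needed.
check_kenlm_dirs "${KENLM_DIR:-native_client/kenlm}"
```

If all three are reported as missing, the bundled copy probably cannot be configured with its own CMakeLists.txt as-is.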
Then I delete the kenlm folder that ships by default inside DeepSpeech's native_client, download the kenlm repository from https://github.com/kpu/kenlm.git in its place, and repeat the same steps:
mkdir build
cd build/
cmake ..
This time the output is:
`-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Boost version: 1.65.1
-- Found the following Boost libraries:
--   program_options
--   system
--   thread
--   unit_test_framework
--   chrono
--   date_time
--   atomic
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11")
-- Found BZip2: /usr/lib/x86_64-linux-gnu/libbz2.so (found version "1.0.6")
-- Looking for BZ2_bzCompressInit
-- Looking for BZ2_bzCompressInit - found
-- Looking for lzma_auto_decoder in /usr/lib/x86_64-linux-gnu/liblzma.so
-- Looking for lzma_auto_decoder in /usr/lib/x86_64-linux-gnu/liblzma.so - found
-- Looking for lzma_easy_encoder in /usr/lib/x86_64-linux-gnu/liblzma.so
-- Looking for lzma_easy_encoder in /usr/lib/x86_64-linux-gnu/liblzma.so - found
-- Looking for lzma_lzma_preset in /usr/lib/x86_64-linux-gnu/liblzma.so
-- Looking for lzma_lzma_preset in /usr/lib/x86_64-linux-gnu/liblzma.so - found
-- Found LibLZMA: /usr/include (found version "5.2.2")
-- Could NOT find Eigen3 (missing: EIGEN3_INCLUDE_DIR EIGEN3_VERSION_OK) (Required is at least version "2.91.0")
CMake Warning at lm/interpolate/CMakeLists.txt:65 (message):
  Not building interpolation. Eigen3 was not found.
-- To install Eigen3 in your home directory, copy paste this:
export EIGEN3_ROOT=$HOME/eigen-eigen-07105f7124f9
(cd $HOME; wget -O - https://bitbucket.org/eigen/eigen/get/3.2.8.tar.bz2 |tar xj)
rm CMakeCache.txt
-- Configuring done
-- Generating done
-- Build files have been written to: /home/manuel/Descargas/version2/DeepSpeech-0.2.0/native_client/kenlm/build`
Then I build it:
make -j 4
Scanning dependencies of target kenlm_filter
Scanning dependencies of target kenlm_util
[ 0%] Building CXX object lm/filter/CMakeFiles/kenlm_filter.dir/arpa_io.cc.o
[ 1%] Building CXX object lm/filter/CMakeFiles/kenlm_filter.dir/phrase.cc.o
[ 2%] Building CXX object lm/filter/CMakeFiles/kenlm_filter.dir/vocab.cc.o
[ 3%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/bignum-dtoa.cc.o
[ 3%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/bignum.cc.o
[ 4%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/cached-powers.cc.o
[ 5%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/diy-fp.cc.o
[ 6%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/double-conversion.cc.o
[ 7%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/fast-dtoa.cc.o
[ 8%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/fixed-dtoa.cc.o
[ 8%] Building CXX object util/CMakeFiles/kenlm_util.dir/double-conversion/strtod.cc.o
[ 9%] Building CXX object util/CMakeFiles/kenlm_util.dir/stream/chain.cc.o
[ 10%] Building CXX object util/CMakeFiles/kenlm_util.dir/stream/count_records.cc.o
[ 11%] Building CXX object util/CMakeFiles/kenlm_util.dir/stream/io.cc.o
[ 12%] Linking CXX static library ../../lib/libkenlm_filter.a
[ 12%] Built target kenlm_filter
[ 13%] Building CXX object util/CMakeFiles/kenlm_util.dir/stream/line_input.cc.o
[ 13%] Building CXX object util/CMakeFiles/kenlm_util.dir/stream/multi_progress.cc.o
[ 14%] Building CXX object util/CMakeFiles/kenlm_util.dir/stream/rewindable_stream.cc.o
[ 15%] Building CXX object util/CMakeFiles/kenlm_util.dir/bit_packing.cc.o
[ 16%] Building CXX object util/CMakeFiles/kenlm_util.dir/ersatz_progress.cc.o
[ 17%] Building CXX object util/CMakeFiles/kenlm_util.dir/exception.cc.o
[ 17%] Building CXX object util/CMakeFiles/kenlm_util.dir/file.cc.o
[ 18%] Building CXX object util/CMakeFiles/kenlm_util.dir/file_piece.cc.o
[ 19%] Building CXX object util/CMakeFiles/kenlm_util.dir/float_to_string.cc.o
[ 20%] Building CXX object util/CMakeFiles/kenlm_util.dir/integer_to_string.cc.o
[ 21%] Building CXX object util/CMakeFiles/kenlm_util.dir/mmap.cc.o
[ 21%] Building CXX object util/CMakeFiles/kenlm_util.dir/murmur_hash.cc.o
[ 22%] Building CXX object util/CMakeFiles/kenlm_util.dir/parallel_read.cc.o
[ 23%] Building CXX object util/CMakeFiles/kenlm_util.dir/pool.cc.o
[ 24%] Building CXX object util/CMakeFiles/kenlm_util.dir/read_compressed.cc.o
[ 25%] Building CXX object util/CMakeFiles/kenlm_util.dir/scoped.cc.o
[ 25%] Building CXX object util/CMakeFiles/kenlm_util.dir/spaces.cc.o
[ 26%] Building CXX object util/CMakeFiles/kenlm_util.dir/string_piece.cc.o
[ 27%] Building CXX object util/CMakeFiles/kenlm_util.dir/usage.cc.o
[ 28%] Linking CXX static library ../lib/libkenlm_util.a
[ 28%] Built target kenlm_util
Scanning dependencies of target sized_iterator_test
Scanning dependencies of target bit_packing_test
Scanning dependencies of target string_stream_test
Scanning dependencies of target joint_sort_test
[ 29%] Building CXX object util/CMakeFiles/sized_iterator_test.dir/sized_iterator_test.cc.o
[ 30%] Building CXX object util/CMakeFiles/joint_sort_test.dir/joint_sort_test.cc.o
[ 30%] Building CXX object util/CMakeFiles/string_stream_test.dir/string_stream_test.cc.o
[ 31%] Building CXX object util/CMakeFiles/bit_packing_test.dir/bit_packing_test.cc.o
[ 31%] Linking CXX executable ../tests/sized_iterator_test
[ 31%] Built target sized_iterator_test
Scanning dependencies of target file_piece_test
[ 31%] Building CXX object util/CMakeFiles/file_piece_test.dir/file_piece_test.cc.o
[ 32%] Linking CXX executable ../tests/bit_packing_test
[ 32%] Built target bit_packing_test
Scanning dependencies of target sorted_uniform_test
[ 33%] Building CXX object util/CMakeFiles/sorted_uniform_test.dir/sorted_uniform_test.cc.o
[ 34%] Linking CXX executable ../tests/joint_sort_test
[ 34%] Built target joint_sort_test
Scanning dependencies of target probing_hash_table_benchmark
[ 35%] Building CXX object util/CMakeFiles/probing_hash_table_benchmark.dir/probing_hash_table_benchmark_main.cc.o
[ 36%] Linking CXX executable ../tests/string_stream_test
[ 36%] Built target string_stream_test
Scanning dependencies of target pcqueue_test
[ 36%] Building CXX object util/CMakeFiles/pcqueue_test.dir/pcqueue_test.cc.o
[ 37%] Linking CXX executable ../tests/file_piece_test
[ 38%] Linking CXX executable ../tests/pcqueue_test
[ 38%] Built target file_piece_test
Scanning dependencies of target tokenize_piece_test
[ 39%] Building CXX object util/CMakeFiles/tokenize_piece_test.dir/tokenize_piece_test.cc.o
[ 39%] Built target pcqueue_test
Scanning dependencies of target probing_hash_table_test
[ 40%] Building CXX object util/CMakeFiles/probing_hash_table_test.dir/probing_hash_table_test.cc.o
[ 41%] Linking CXX executable ../tests/sorted_uniform_test
[ 41%] Built target sorted_uniform_test
Scanning dependencies of target read_compressed_test
[ 41%] Building CXX object util/CMakeFiles/read_compressed_test.dir/read_compressed_test.cc.o
[ 41%] Linking CXX executable ../bin/probing_hash_table_benchmark
[ 41%] Built target probing_hash_table_benchmark
Scanning dependencies of target multi_intersection_test
[ 42%] Building CXX object util/CMakeFiles/multi_intersection_test.dir/multi_intersection_test.cc.o
[ 43%] Linking CXX executable ../tests/probing_hash_table_test
[ 43%] Built target probing_hash_table_test
Scanning dependencies of target integer_to_string_test
[ 44%] Building CXX object util/CMakeFiles/integer_to_string_test.dir/integer_to_string_test.cc.o
[ 45%] Linking CXX executable ../tests/tokenize_piece_test
[ 45%] Built target tokenize_piece_test
Scanning dependencies of target io_test
[ 46%] Building CXX object util/stream/CMakeFiles/io_test.dir/io_test.cc.o
[ 47%] Linking CXX executable ../tests/read_compressed_test
[ 47%] Built target read_compressed_test
Scanning dependencies of target sort_test
[ 48%] Building CXX object util/stream/CMakeFiles/sort_test.dir/sort_test.cc.o
[ 49%] Linking CXX executable ../tests/multi_intersection_test
[ 49%] Built target multi_intersection_test
Scanning dependencies of target stream_test
[ 49%] Building CXX object util/stream/CMakeFiles/stream_test.dir/stream_test.cc.o
[ 50%] Linking CXX executable ../../tests/io_test
[ 50%] Built target io_test
Scanning dependencies of target rewindable_stream_test
[ 51%] Building CXX object util/stream/CMakeFiles/rewindable_stream_test.dir/rewindable_stream_test.cc.o
[ 52%] Linking CXX executable ../tests/integer_to_string_test
[ 52%] Built target integer_to_string_test
Scanning dependencies of target kenlm
[ 53%] Building CXX object lm/CMakeFiles/kenlm.dir/bhiksha.cc.o
[ 54%] Building CXX object lm/CMakeFiles/kenlm.dir/binary_format.cc.o
[ 55%] Building CXX object lm/CMakeFiles/kenlm.dir/config.cc.o
[ 55%] Building CXX object lm/CMakeFiles/kenlm.dir/lm_exception.cc.o
[ 56%] Building CXX object lm/CMakeFiles/kenlm.dir/model.cc.o
[ 57%] Linking CXX executable ../../tests/stream_test
[ 57%] Built target stream_test
[ 58%] Building CXX object lm/CMakeFiles/kenlm.dir/quantize.cc.o
[ 59%] Linking CXX executable ../../tests/rewindable_stream_test
[ 59%] Built target rewindable_stream_test
[ 60%] Building CXX object lm/CMakeFiles/kenlm.dir/read_arpa.cc.o
[ 61%] Building CXX object lm/CMakeFiles/kenlm.dir/search_hashed.cc.o
[ 62%] Linking CXX executable ../../tests/sort_test
[ 62%] Built target sort_test
[ 63%] Building CXX object lm/CMakeFiles/kenlm.dir/search_trie.cc.o
[ 63%] Building CXX object lm/CMakeFiles/kenlm.dir/sizes.cc.o
[ 64%] Building CXX object lm/CMakeFiles/kenlm.dir/trie.cc.o
[ 65%] Building CXX object lm/CMakeFiles/kenlm.dir/trie_sort.cc.o
[ 66%] Building CXX object lm/CMakeFiles/kenlm.dir/value_build.cc.o
[ 67%] Building CXX object lm/CMakeFiles/kenlm.dir/virtual_interface.cc.o
[ 67%] Building CXX object lm/CMakeFiles/kenlm.dir/vocab.cc.o
[ 68%] Building CXX object lm/CMakeFiles/kenlm.dir/common/model_buffer.cc.o
[ 69%] Building CXX object lm/CMakeFiles/kenlm.dir/common/print.cc.o
[ 70%] Building CXX object lm/CMakeFiles/kenlm.dir/common/renumber.cc.o
[ 71%] Building CXX object lm/CMakeFiles/kenlm.dir/common/size_option.cc.o
[ 71%] Linking CXX static library ../lib/libkenlm.a
[ 71%] Built target kenlm
Scanning dependencies of target fragment
Scanning dependencies of target query
Scanning dependencies of target partial_test
Scanning dependencies of target model_test
[ 73%] Building CXX object lm/CMakeFiles/query.dir/query_main.cc.o
[ 73%] Building CXX object lm/CMakeFiles/fragment.dir/fragment_main.cc.o
[ 74%] Building CXX object lm/CMakeFiles/model_test.dir/model_test.cc.o
[ 75%] Building CXX object lm/CMakeFiles/partial_test.dir/partial_test.cc.o
[ 75%] Linking CXX executable ../bin/fragment
[ 75%] Built target fragment
Scanning dependencies of target left_test
[ 76%] Building CXX object lm/CMakeFiles/left_test.dir/left_test.cc.o
[ 77%] Linking CXX executable ../bin/query
[ 77%] Built target query
Scanning dependencies of target build_binary
[ 78%] Building CXX object lm/CMakeFiles/build_binary.dir/build_binary_main.cc.o
[ 78%] Linking CXX executable ../bin/build_binary
[ 78%] Built target build_binary
Scanning dependencies of target kenlm_benchmark
[ 79%] Building CXX object lm/CMakeFiles/kenlm_benchmark.dir/kenlm_benchmark_main.cc.o
[ 80%] Linking CXX executable ../tests/partial_test
[ 80%] Built target partial_test
Scanning dependencies of target model_buffer_test
[ 81%] Building CXX object lm/common/CMakeFiles/model_buffer_test.dir/model_buffer_test.cc.o
[ 82%] Linking CXX executable ../../tests/model_buffer_test
[ 82%] Built target model_buffer_test
Scanning dependencies of target kenlm_builder
[ 83%] Building CXX object lm/builder/CMakeFiles/kenlm_builder.dir/adjust_counts.cc.o
[ 83%] Linking CXX executable ../tests/left_test
[ 83%] Built target left_test
Scanning dependencies of target filter
[ 84%] Building CXX object lm/filter/CMakeFiles/filter.dir/filter_main.cc.o
[ 85%] Building CXX object lm/builder/CMakeFiles/kenlm_builder.dir/corpus_count.cc.o
[ 85%] Building CXX object lm/builder/CMakeFiles/kenlm_builder.dir/initial_probabilities.cc.o
[ 86%] Linking CXX executable ../bin/kenlm_benchmark
[ 86%] Built target kenlm_benchmark
Scanning dependencies of target phrase_table_vocab
[ 87%] Building CXX object lm/filter/CMakeFiles/phrase_table_vocab.dir/phrase_table_vocab_main.cc.o
[ 88%] Building CXX object lm/builder/CMakeFiles/kenlm_builder.dir/interpolate.cc.o
[ 88%] Linking CXX executable ../tests/model_test
[ 88%] Built target model_test
[ 89%] Building CXX object lm/builder/CMakeFiles/kenlm_builder.dir/output.cc.o
[ 90%] Linking CXX executable ../../bin/phrase_table_vocab
[ 90%] Built target phrase_table_vocab
[ 91%] Building CXX object lm/builder/CMakeFiles/kenlm_builder.dir/pipeline.cc.o
[ 92%] Linking CXX executable ../../bin/filter
[ 92%] Built target filter
[ 93%] Linking CXX static library ../../lib/libkenlm_builder.a
[ 93%] Built target kenlm_builder
Scanning dependencies of target lmplz
Scanning dependencies of target corpus_count_test
Scanning dependencies of target count_ngrams
Scanning dependencies of target adjust_counts_test
[ 93%] Building CXX object lm/builder/CMakeFiles/adjust_counts_test.dir/adjust_counts_test.cc.o
[ 94%] Building CXX object lm/builder/CMakeFiles/count_ngrams.dir/count_ngrams_main.cc.o
[ 95%] Building CXX object lm/builder/CMakeFiles/lmplz.dir/lmplz_main.cc.o
[ 96%] Building CXX object lm/builder/CMakeFiles/corpus_count_test.dir/corpus_count_test.cc.o
[ 97%] Linking CXX executable ../../tests/adjust_counts_test
[ 97%] Built target adjust_counts_test
[ 98%] Linking CXX executable ../../bin/lmplz
[ 99%] Linking CXX executable ../../tests/corpus_count_test
[ 99%] Built target corpus_count_test
[ 99%] Built target lmplz
[100%] Linking CXX executable ../../bin/count_ngrams
[100%] Built target count_ngrams
As far as I can tell, the build finishes successfully.
I copy my corpus into place (it is very small: only 10 .wav audio files with their accompanying CSV files).
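For reference, the replace-and-rebuild steps above can be collected into one script. This is only a sketch under stated assumptions: the `run` and `rebuild_kenlm` helpers are hypothetical names, and the `native_client` argument is assumed to be the folder that contains the bundled kenlm copy. With `DRY_RUN=1` it only prints the commands.

```shell
#!/bin/sh
# run: print a command, and execute it unless DRY_RUN=1 is set.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Hypothetical wrapper around the steps described in this post.
# $1 = native_client folder (assumed to contain the bundled kenlm copy).
rebuild_kenlm() {
  native_client=${1:?usage: rebuild_kenlm NATIVE_CLIENT_DIR}
  run rm -rf "$native_client/kenlm" &&
  run git clone https://github.com/kpu/kenlm.git "$native_client/kenlm" &&
  run mkdir -p "$native_client/kenlm/build" &&
  run cd "$native_client/kenlm/build" &&
  run cmake .. &&
  run make -j 4
}

# Preview only; drop DRY_RUN=1 to actually run the steps.
DRY_RUN=1 rebuild_kenlm native_client
```

Chaining with `&&` stops at the first failing step, so a failed clone is not followed by a build attempt in an empty folder.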
files.zip (621,3 KB)
Then, from the data folder of DeepSpeech, I do the following.
First I create the .arpa file:
/home/manuel/Descargas/version2/DeepSpeech-0.2.0/native_client/kenlm/build/bin/./lmplz --text vocabulary.txt --arpa words.arpa --o 3
and I get this:
`=== 1/5 Counting and sorting n-grams ===
Reading /home/manuel/Descargas/version2/DeepSpeech-0.2.0/data/vocabulary.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Unigram tokens 309281 types 24737
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:296844 2:1710177792 3:3206583552
Statistics:
1 24737 D1=0.629506 D2=1.08051 D3+=1.51425
2 147409 D1=0.801094 D2=1.1342 D3+=1.40326
3 255902 D1=0.896885 D2=1.21438 D3+=1.40798
Memory estimate for binary LM:
type kB
probing 8581 assuming -p 1.5
probing 9541 assuming -r models -p 1.5
trie 3744 without quantization
trie 2183 assuming -q 8 -b 8 quantization
trie 3559 assuming -a 22 array pointer compression
trie 1998 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
=== 3/5 Calculating and sorting initial probabilities ===
Chain sizes: 1:296844 2:2358544 3:5118040
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 4/5 Calculating and writing order-interpolated probabilities ===
Chain sizes: 1:296844 2:2358544 3:5118040
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 5/5 Writing ARPA model ===
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Name:lmplz VmPeak:4976972 kB VmRSS:14504 kB RSSMax:1137172 kB user:0.567799 sys:0.508239 CPU:1.07618 real:1.22807`
Then I build the .binary file:
/home/manuel/Descargas/version2/DeepSpeech-0.2.0/native_client/kenlm/build/bin/build_binary -T -s words.arpa lm.binary
and I get this:
`Reading words.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
SUCCESS`
Finally, I try to generate the trie file:
/home/manuel/Descargas/version2/DeepSpeech-0.2.0/native_client/generate_trie alphabet.txt lm.binary vocabulary.txt trie
but this error appears:
`terminate called after throwing an instance of 'lm::FormatLoadException'
what(): native_client/kenlm/lm/binary_format.cc:131 in void lm::ngram::MatchCheck(lm::ngram::ModelType, unsigned int, const lm::ngram::Parameters&) threw FormatLoadException.
The binary file was built for probing hash tables but the inference code is trying to load trie with quantization and array-compressed pointers
Aborted (core dumped)`
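The error message points at a format mismatch: lm.binary was written as a probing hash table (the build_binary default when no type is given), while generate_trie expects a trie with quantization and array-compressed pointers. In the KenLM usage text, `-T` takes an argument (a temporary-file prefix), so in `-T -s` the `-s` is most likely consumed as that prefix instead of acting as a flag. A possible way forward is to rebuild the binary with the `trie` type plus the `-q` and `-a` options that the lmplz memory estimate above already mentions. The sketch below assumes the values `-a 255 -q 8`, recalled from the DeepSpeech language-model notes and not verified against this exact version; the helper names are hypothetical.

```shell
#!/bin/sh
# run: print a command, and execute it unless DRY_RUN=1 is set.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Hypothetical helper.
# $1 = folder holding the KenLM binaries, $2 = folder holding generate_trie.
rebuild_lm_as_trie() {
  kenlm_bin=${1:?usage: rebuild_lm_as_trie KENLM_BIN_DIR NATIVE_CLIENT_DIR}
  native_client=${2:?usage: rebuild_lm_as_trie KENLM_BIN_DIR NATIVE_CLIENT_DIR}
  # "trie" selects the trie data structure, -q 8 turns on quantization and
  # -a 255 turns on array-compressed pointers (the values are an assumption).
  run "$kenlm_bin/build_binary" -a 255 -q 8 trie words.arpa lm.binary &&
  run "$native_client/generate_trie" alphabet.txt lm.binary vocabulary.txt trie
}

# Preview only; drop DRY_RUN=1 and pass the real folders to run the commands.
DRY_RUN=1 rebuild_lm_as_trie kenlm/build/bin native_client
```

Run from the data folder, as with the original commands, so that words.arpa, alphabet.txt and vocabulary.txt resolve.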