Error after following installation steps

Can you also share more details about your system? Are you using a VM from some provider? Please also post the output of `cat /proc/cpuinfo` so others can find it. Thanks :slight_smile:

My machine has 32 cores, so the output is a bit of a flood; I’ll post a link to a pastebin:
Link

Thanks, it confirms:

```
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
```

There’s `avx` but no `avx2`. So far, the only solution is to rebuild the Python TensorFlow package yourself; it takes some time, but with that kind of machine, not that much.
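Before installing, you can check whether your CPU has the two instruction sets the prebuilt wheels assume. A minimal sketch (the `check_flags` helper is hypothetical; it just greps the `flags` line of `/proc/cpuinfo` for `avx2` and `fma`):

```shell
# Report whether the CPU flags include both avx2 and fma,
# which the prebuilt TensorFlow packages require.
check_flags() {
  # $1: path to a cpuinfo-style file (normally /proc/cpuinfo)
  if grep -m1 '^flags' "$1" | grep -qw avx2 && grep -m1 '^flags' "$1" | grep -qw fma; then
    echo "ok: avx2 and fma present"
  else
    echo "missing avx2 or fma: rebuild TensorFlow from source"
  fi
}
check_flags /proc/cpuinfo
```

If it reports missing flags, the stock wheel will abort with an illegal-instruction or AVX2 error, and a source build is the way to go.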

Thanks a lot!

What about requirements.txt? It lists “tensorflow” with no version pinned, so it will download 1.4.0, won’t it? I understand it’s not strictly necessary when compiling from source, but still, should it say `tensorflow == 1.3.0`?

P.S. The information about processor requirements might be very helpful to others who want to try it; in my opinion it would be a good thing to add to the README.

It is already documented in the README that we require a CPU with at least AVX2 and FMA. Regarding requirements.txt, we should probably fix that in a more proper way. The current issue is that since TensorFlow has no API stability guarantee besides Python, even `tensorflow==1.3.0` might run into trouble; people have reported problems with upstream 1.3.0 and our `libctc_decoder_with_kenlm.so`.

You can probably open an issue and make a PR to change that. We could use the TaskCluster link: https://index.taskcluster.net/v1/task/project.deepspeech.tensorflow.pip.master.gpu/artifacts/public/tensorflow_gpu_warpctc-1.3.0rc0-cp27-cp27mu-linux_x86_64.whl
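For instance, a requirements.txt can point pip at an exact wheel URL instead of a bare package name. A sketch of what such a pin could look like (using the TaskCluster artifact linked above; whether pip accepts a bare URL line depends on the pip version in use):

```
# requirements.txt — pin the exact TensorFlow build instead of a bare "tensorflow"
https://index.taskcluster.net/v1/task/project.deepspeech.tensorflow.pip.master.gpu/artifacts/public/tensorflow_gpu_warpctc-1.3.0rc0-cp27-cp27mu-linux_x86_64.whl
```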

This would assume that people wanting to train have a CUDA-enabled setup (which is probably fine). Happy to review your issue and PR on that :slight_smile:

Yep, thanks! I got the CUDA-enabled setup working, though it fails at decoding:

```
Loading the LM will be faster if you build a binary file.
Reading data/lm/lm.binary
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
terminate called after throwing an instance of 'lm::FormatLoadException'
  what():  native_client/kenlm/lm/read_arpa.cc:65 in void lm::ReadARPACounts(util::FilePiece&, std::vector<long unsigned int>&) threw FormatLoadException.
first non-empty line was "version https://git-lfs.github.com/spec/v1" not \data\. Byte: 43
Aborted (core dumped)
```

You have not set up git-lfs, so there is no language model; please read the README on this.

Oh, it’s in the FAQ section and in the README; I’m so sorry for not paying full attention to it.

I’ll try to make PR asap, thanks.

FYI, I’m covering similar changes for when we switch to r1.4: we will have a task running on TaskCluster that tests training of the model against upstream TensorFlow with our `libctc_decoder_with_kenlm.so` :slight_smile:

Yeah, I saw it in Issues :slight_smile:

I’ve also run into the AVX2 error, and I do not see avx2 in my /proc/cpuinfo flags. I assume I also need to rebuild the Python TensorFlow package.

My setup:

  • OS: Debian 9
  • Python: 3.6, built from source

BTW, DeepSpeech is Python 3 compatible, right?

Yes, it should all run properly with Python 3.

1 Like

I installed TensorFlow from source, but it appears that I still haven’t successfully worked around the AVX2 issue. It would be helpful if someone could provide the `./configure` and `bazel build` commands that will surely work.

If you follow TensorFlow’s build instructions, there’s no way you end up with a package that depends on AVX2 if your system does not support it. Which configure and bazel build commands did you use?
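For reference, a typical CPU-only source build looks like the sketch below (flags as commonly recommended for TensorFlow builds of that era; `-march=native` makes the compiler target only the instruction sets your own CPU actually has, so a non-AVX2 machine gets a non-AVX2 package):

```shell
# From inside the tensorflow source checkout; answer the prompts (CUDA: no for CPU-only)
./configure

# Build the pip package; -march=native restricts code generation to the host CPU
bazel build -c opt --copt=-march=native //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

# Install the freshly built wheel (exact filename varies with version and Python)
pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl
```

The crucial part is installing the wheel you just built rather than pulling a prebuilt `tensorflow` from PyPI afterwards, which would silently reintroduce the AVX2 requirement.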

Going back to the start.

I pip3-installed deepspeech. Then, when I tested it at the command line, the error I got was:

“The TensorFlow library was compiled to use AVX2 instructions, but these aren’t available on your machine.”

So, I pip3-uninstalled deepspeech and tensorflow. Then, I configured and bazel-built TensorFlow. Then, I pip3-installed deepspeech, tried again, and got the same AVX2 error.

My guess is that I’m not ./configuring or bazel-building correctly, so I’m asking what the recommended lines are for a no-GPU configuration.

Does this make sense?

It makes sense, but you are doing something different from what started this topic :slight_smile:, where the first poster was having issues during training. Basically, any package you pip install is one we built with AVX2 support.

In your case, you need to follow native_client/README.md to build the Python bindings (and also to build TensorFlow; it should all be documented there). When doing so, no optimization is forced on you, so it should run properly.

So, pip uninstall anything you installed earlier, and follow that.

Besides, out of curiosity, what is your CPU / system ?

Apologies for the confusion! I will try as recommended.

CPU: Intel® Core™ i7-3770 CPU @ 3.40GHz
OS: Debian 9

1 Like

My confusion was due to the use of multiple README files in the DeepSpeech repository. You are referencing instructions in DeepSpeech/native_client/README.md, whereas I was referencing the installation instructions README in the root directory. I just noticed the root README tells me to check out the other README if installation fails. Oh well.

Hello

Have the same issue:

```
Reading data/lm/lm.binary
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
terminate called after throwing an instance of 'lm::FormatLoadException'
what():  native_client/kenlm/lm/read_arpa.cc:65 in void lm::ReadARPACounts(util::FilePiece&, std::vector<long unsigned int>&) threw FormatLoadException.
first non-empty line was "version https://git-lfs.github.com/spec/v1" not \data\. Byte: 43
Aborted
```

I tried to fix it, but probably did it wrong. What I did:

```
git clone https://github.com/git-lfs/git-lfs
/git-lfs$ pip install git-lfs
```

Result:

```
https://files.pythonhosted.org/packages/a0/4e/6fc59e52b2178a1990cdeb54a6370f9e1c152f764e5f4e2d4b5f41edfa9b/git_lfs-1.5-py2.py3-none-any.whl
Installing collected packages: git-lfs
Successfully installed git-lfs-1.5
```

In spite of reading the README, I’m not sure I understand it well.

So could someone please help me?

Thanks!

That’s not what you are supposed to do. In the documentation (https://github.com/mozilla/DeepSpeech/blob/master/README.md#prerequisites) we link to Git-LFS’s website, so you should follow their instructions to set it up.

I don’t know where you got that `pip install git-lfs`, but it’s not what the documentation says, so you still don’t have git-lfs properly installed, and thus no language model file.
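A sketch of the usual Git-LFS setup on a Debian-style system (the package name and exact steps come from Git-LFS’s own instructions, which you should follow for your distribution; the key step is `git lfs pull`, which replaces the LFS pointer file with the real `lm.binary`):

```shell
# Install git-lfs from the system package manager (or from git-lfs.github.com)
sudo apt-get install git-lfs

# Enable the LFS filters for your user (once per machine)
git lfs install

# Inside the DeepSpeech checkout, fetch the real large files
git lfs pull
```

Afterwards, `data/lm/lm.binary` should be a multi-hundred-megabyte binary file rather than a small text pointer starting with `version https://git-lfs.github.com/spec/v1`.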