Hey all,
I have been using my own trained model for inference on Google Colab for a while now, but I now need to move everything into a SageMaker notebook. I created a fresh persistent conda (Miniconda) environment with Python 3.7.
Here are the steps I took:
- installed tensorflow 2.3.0 into the conda env
- installed deepspeech into the conda env with %pip install deepspeech
- when I ran the script (vad_transcriber) it failed with "No module named 'deepspeech'"
- so I installed deepspeech outside of the env with !pip install deepspeech
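(In case it helps anyone diagnose this: the fact that the import only worked after installing outside the env makes me suspect the notebook kernel is not actually running the conda env's interpreter. A quick sanity check I ran in a notebook cell, just a sketch:)

```python
import importlib.util
import sys

# %pip installs into whatever interpreter the kernel is running, so
# first confirm which Python the notebook is actually using.
print("kernel interpreter:", sys.executable)

# Then check whether this interpreter can see the deepspeech package.
spec = importlib.util.find_spec("deepspeech")
print("deepspeech importable:", spec is not None)
```

If sys.executable points at the base interpreter rather than the env's, that would explain why the in-env install was invisible.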
Now when I run it, I get the error pasted below:
DEBUG:root:Transcribing audio file @ first_6.wav
DEBUG:root:Found Model: speech_model/tedlium_checkpoint.pbmm
DEBUG:root:Found scorer: speech_model/tedlium_model.scorer
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
2020-12-16 23:18:57.797780: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
DEBUG:root:Loaded model in 0.015s.
terminate called after throwing an instance of 'lm::FormatLoadException'
what(): native_client/kenlm/lm/binary_format.cc:160 in void* lm::ngram::BinaryFormat::LoadBinary(std::size_t) threw FormatLoadException because `file_size != util::kBadSize && file_size < total_map'.
Binary file has size 14680064 but the headers say it should be at least 941209108
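(From reading the error, the scorer file on disk is far smaller than its header claims: 14680064 bytes vs at least 941209108. My guess is a truncated or partial download, or a Git LFS pointer file instead of the real binary. Here is the small check I used to confirm; the path is from my setup, the expected size is taken from the error message above, and check_scorer is just a helper name I made up:)

```python
import os

def check_scorer(path, expected_min):
    """Return True if the file exists and is at least expected_min bytes."""
    if not os.path.exists(path):
        return False
    return os.path.getsize(path) >= expected_min

# Size the KenLM header declared in the error message above.
ok = check_scorer("speech_model/tedlium_model.scorer", 941209108)
print("scorer size OK:", ok)
```

For me this printed False, so I am re-downloading the scorer before trying anything else.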
I am quite new to DeepSpeech, so if anyone has any insight, that would be amazing.