Thanks, I found this Openfst Win port; it looks up to date and actually compiles with Visual Studio, but I think the generated library is too big: 450 MB. I'll try to compile it with Bazel alone and then include it in the DeepSpeech Bazel BUILD.
Is it possible it's a debug, non-optimized build you did?
Yes, it was. Reading the build configuration files further, I found:
All settings are disabled by default. Enable any of them by removing the 'Condition="false"' part.
I enabled the use of AVX and AVX2 and set the target platform to 10.0.17134.0 (Windows 10); now the size of the compiled library is just 39 MB.
The command I used:
msbuild openfst.sln -v:m -m -t:Build ^
-p:Platform=x64 ^
-p:Configuration=Release ^
-p:PlatformToolset=v141 ^
-p:WindowsTargetPlatformVersion=10.0.17134.0 ^
-p:EnableEnhancedInstructionSet=AdvancedVectorExtensions
Now I'll see how to compile it with Bazel and carry those settings over.
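Probably something along these lines to start with (just a sketch; /arch:AVX and /arch:AVX2 are the MSVC equivalents of the AVX switches above):

bazel build -c opt --copt=/arch:AVX --copt=/arch:AVX2 //native_client:libdeepspeech.so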
I fixed a few errors related to imports, but I can't fix this one using Bazel options:
native_client/ctcdecode/third_party/openfst-1.6.7/src/include\fst/vector-fst.h(697): error C2872: 'uint32': ambiguous symbol
native_client/ctcdecode/third_party/openfst-1.6.7/src/include\fst/types.h(32): note: could be 'uint32_t uint32'
.\tensorflow/core/platform/default/integral_types.h(31): note: or 'tensorflow::uint32'
I tried to use Bazel visibility, but I'm not familiar with Bazel. I fixed a similar issue by changing uint32 to uint32_t, but I feel that is not the correct approach.
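Here is a minimal sketch of what I think is colliding (stand-in typedefs, not the real headers, and the using-directive is an assumption about how both names end up visible in the same translation unit):

#include <cstdint>

typedef uint32_t uint32;                               // stand-in for the global typedef in fst/types.h
namespace tensorflow { typedef unsigned int uint32; }  // stand-in for tensorflow's integral_types.h

using namespace tensorflow;                            // assumed: something makes tensorflow:: names visible unqualified

int main() {
  // uint32 x = 0;        // error C2872: 'uint32': ambiguous symbol (the error above)
  ::uint32 a = 0;         // OK: qualifies the global (OpenFst) typedef explicitly
  uint32_t b = 0;         // OK: the underlying fixed-width type, which is the change I made
  return static_cast<int>(a + b);
}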
I think I fixed the errors related to Windows compatibility. Now it says that the rule is missing a dependency declaration, but the TensorFlow file exists, so I'm not sure where the problem is:
ERROR: C:/mo/mozilla-tensorflow/native_client/BUILD:42:1: undeclared inclusion(s) in rule '//native_client:libdeepspeech.so':
this rule is missing dependency declarations for the following files included by 'native_client/deepspeech.cc':
'tensorflow/core/util/memmapped_file_system.h'
This one is strange; this is something we have had for a long time. You should share your changes to native_client/BUILD with us.
Here’s more information:
WARNING: C:/mo/mozilla-tensorflow/tensorflow/core/BUILD:2463:1: in includes attribute of cc_library rule //tensorflow/core:framework_internal_headers_lib: '../../external/com_google_absl' resolves to 'external/com_google_absl' not below the relative path of its package 'tensorflow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in C:/mo/mozilla-tensorflow/tensorflow/tensorflow.bzl:1379:20
INFO: Analysed 2 targets (0 packages loaded).
INFO: Found 2 targets…
INFO: From Compiling native_client/ctcdecode/third_party/openfst-1.6.7/src/lib/util.cc:
cl : Command line warning D9002 : ignoring unknown option '-O3'
cl : Command line warning D9002 : ignoring unknown option '-std=c++11'
INFO: From Compiling native_client/ctcdecode/third_party/openfst-1.6.7/src/lib/symbol-table-ops.cc:
cl : Command line warning D9002 : ignoring unknown option '-O3'
cl : Command line warning D9002 : ignoring unknown option '-std=c++11'
INFO: From Compiling native_client/generate_trie.cpp:
cl : Command line warning D9002 : ignoring unknown option '-O3'
cl : Command line warning D9002 : ignoring unknown option '-std=c++11'
ERROR: C:/mo/mozilla-tensorflow/native_client/BUILD:42:1: undeclared inclusion(s) in rule '//native_client:libdeepspeech.so':
this rule is missing dependency declarations for the following files included by 'native_client/deepspeech.cc':
'tensorflow/core/util/memmapped_file_system.h'
cl : Command line warning D9002 : ignoring unknown option '-O3'
INFO: Elapsed time: 6,555s, Critical Path: 5,23s
INFO: 3 processes: 3 local.
FAILED: Build did NOT complete successfully
Here's the BUILD file (Build file). I think the error is caused by the symbolic link.
Oh wait. We have code to support mmap() for protobuf; you should check the TensorFlow internals on that. Maybe it's not supported on Windows, and thus the TensorFlow Bazel deps we have just don't include this header.
Yes, that's it: https://github.com/mozilla/tensorflow/blob/1c93ca24c99d7011ad639eea4cd96e4fe45e1a95/tensorflow/core/BUILD#L909-L913 Windows gets specific handling there, and the headers are not included in that case.
For now, maybe just use an #ifdef to exclude that part of the code from Windows builds?
Nice, I'll try with an #ifdef for Windows.
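Something roughly like this is what I have in mind (just a sketch, not a tested diff; mmap_env is the MemmappedEnv member used in deepspeech.cc, the rest is simplified):

struct ModelState {
#ifndef _MSC_VER
  tensorflow::MemmappedEnv* mmap_env;  // memmapped_file_system is not part of the Windows TF build
#endif
  // ... other members unchanged ...
};

// and wherever the memory-mapped model path is taken:
#ifndef _MSC_VER
  model->mmap_env = new tensorflow::MemmappedEnv(tensorflow::Env::Default());
#endif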
Or look at this: mmap windows. Maybe this is what we are looking for?
Not working; the file is required:
native_client/deepspeech.cc(150): error C2614: 'ModelState': illegal member initialization: 'mmap_env' is not a base or member
native_client/deepspeech.cc(163): error C2065: 'mmap_env': undeclared identifier
I will investigate more and see what I find, but I'm afraid there may be no Windows implementation for that file.
Maybe make memmapped_file_system build on Windows and delete the select() exclude for Windows in the BUILD file that you mention.
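If I read that BUILD file correctly, the exclusion is roughly of this shape (illustrative sketch only, with a hypothetical rule name, not the exact upstream content):

cc_library(
    name = "framework_internal",  # hypothetical name for the sketch
    srcs = select({
        "//tensorflow:windows": [],  # Windows branch: memmapped sources left out
        "//conditions:default": [
            "util/memmapped_file_system.cc",
            "util/memmapped_file_system_writer.cc",
        ],
    }),
)

So the idea would be to move those two files out of the default branch (or drop the select() entirely) so they also build on Windows.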
I found a few commits on GitHub related to making memmapped work on Windows.
I changed the BUILD file of tensorflow/core to use the memmapped files on Windows too, but:
deepspeech.obj : error LNK2019: unresolved external symbol "public: __cdecl tensorflow::MemmappedEnv::MemmappedEnv(class tensorflow::Env *)" (??0MemmappedEnv@tensorflow@@QEAA@PEAVEnv@1@@Z) referenced in function "int __cdecl DS_CreateModel(char const *,unsigned int,unsigned int,char const *,unsigned int,struct ModelState * *)" (?DS_CreateModel@@YAHPEBDII0IPEAPEAUModelState@@@Z)
deepspeech.obj : error LNK2019: unresolved external symbol "public: class tensorflow::Status __cdecl tensorflow::MemmappedEnv::InitializeFromFile(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)" (?InitializeFromFile@MemmappedEnv@tensorflow@@QEAA?AVStatus@2@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z) referenced in function "int __cdecl DS_CreateModel(char const *,unsigned int,unsigned int,char const *,unsigned int,struct ModelState * *)" (?DS_CreateModel@@YAHPEBDII0IPEAPEAUModelState@@@Z)
bazel-out/x64_windows-opt/bin/native_client/libdeepspeech.so : fatal error LNK1120: 2 unresolved externals
The file memmapped_file_system seems to support Windows; I was going to change the code for Windows support, but it already includes conditional compilation for Windows (see _MSC_VER). The unresolved externals suggest that memmapped_file_system.cc still isn't being compiled into the library that libdeepspeech links against, so the MemmappedEnv definitions are never found at link time. Any ideas?
You should first try to build without that, instead of fixing the whole world...
What do you mean by "without that"? The memmapped_file_system?
I think I have almost compiled it; I guess a linking option for Windows is just missing somewhere. I used verbose output, and this list of options was ignored:
"-ldl", "-pthread", "-Wl,-Bsymbolic", "-Wl,-Bsymbolic-functions", "-Wl,-export-dynamic"
I'm reading about each option.
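They are all GNU ld flags, so link.exe just ignores them; I guess the BUILD file needs to guard them with something like this (a sketch, assuming the //tensorflow:windows config_setting is usable from native_client):

linkopts = select({
    "//tensorflow:windows": [],  # MSVC's link.exe does not understand GNU ld options
    "//conditions:default": [
        "-ldl",
        "-pthread",
        "-Wl,-Bsymbolic",
        "-Wl,-Bsymbolic-functions",
        "-Wl,-export-dynamic",
    ],
}),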
Disable everything related to that in deepspeech.cc if it's not officially supported. Try to get something working first, before starting to pull in unsupported code.
Thanks for the advice. I removed the code related to that file and it compiled. I'm almost sure the problem is related to a bad configuration in the TensorFlow BUILD file (in the Windows sections); the error triggers in the final linking step.
Build completed successfully, 730 total actions.
UPDATE
If I add the memmapped files to the BUILD file, it still compiles successfully. I'm also able to link the generated library and call DS_PrintVersions from C#; it prints "unknown", but I think that is a Bazel issue, because it generates a hard-coded version file containing "unknown".
Loading output_graph.pb seems to work, though I'm not sure yet. I also have to check why AVX was ignored. I still have to figure out why it fails with the memmapped_file_system.h reference in deepspeech.cc.
Console:
TensorFlow: b'unknown'
DeepSpeech: unknown
TensorFlow: b'unknown'
DeepSpeech: unknown
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2018-11-03 21:13:02.357125: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX
TF result code : 0
About memory consumption: is 570 MB of RAM the expected usage for output_graph.pb?
Update:
--copt=/arch:AVX --copt=/arch:AVX2 for compiling on Windows using Bazel with AVX.
I’m also getting 0 from DS_CreateModel
Yes, please read the documentation... like this warning you have:
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
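The conversion is done with TensorFlow's convert_graphdef_memmapped_format tool, roughly like this (paths are examples; I may be misremembering the exact flag names):

bazel build -c opt //tensorflow/contrib/util:convert_graphdef_memmapped_format
bazel-bin/tensorflow/contrib/util/convert_graphdef_memmapped_format --in_graph=output_graph.pb --out_graph=output_graph.pbmm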
Hi, good news: I successfully compiled without removing the memmapped_file_system parts, using master, and now I can load output_graph.pbmm. If I call DS_EnableDecoderWithLM, it throws this error: "Error occurred: vector too long" (fixed).
I'm also trying to build the code for streaming with NAudio and WindowsSpeech. @lissyx, thanks for all your help.
Update:
Finally getting output: "捣c". The "vector too long" error came from a misleading cast from C# to C++, and I think the same thing is happening with the audio buffer from C#: getting a CJK character back suggests the returned char* is being read as a UTF-16 string on the C# side, so pairs of ASCII bytes become single wide characters. I'll update.