Am I missing something obvious? Clearly I am. Perhaps I'm making some fundamental mistake, but I expected the GPU version to run inference far faster than the CPU version. Instead it takes roughly 3 times as long as the CPU version did for the same inference. Perhaps it's not really engaging the GPU? Is there a way to verify?
mail_reknew@deepdictation-1-gpu ~]$ deepspeech models/output_graph.pb data/recording2.wav models/alphabet.txt models/lm.binary models/trie
Loading model from file models/output_graph.pb
2018-02-22 23:01:22.906902: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-02-22 23:01:23.602419: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-02-22 23:01:23.602852: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:04.0
totalMemory: 15.90GiB freeMemory: 15.61GiB
2018-02-22 23:01:23.602891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
Loaded model in 4.769s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 11.281s.
Running inference.
my mom this is bread i am speaking as clearly as possible and of slowly as possible i hope you get this
Inference took 17.003s for 10.000s audio file.
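For what it's worth, the log above does show TensorFlow registering the P100 (the "Creating TensorFlow device (/device:GPU:0)" line), so the GPU is at least visible to the runtime. A minimal sketch of checking a startup log for that line programmatically, assuming the log text is available as a string (the function name here is just an illustration, not part of DeepSpeech):

```python
def gpu_device_created(log_text):
    """Return True if TensorFlow reported creating a /device:GPU:N device."""
    return any("Creating TensorFlow device (/device:GPU" in line
               for line in log_text.splitlines())

# Sample lines copied from the run above.
gpu_log = ("2018-02-22 23:01:23.602891: I tensorflow/core/common_runtime/gpu/"
           "gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> "
           "(device: 0, name: Tesla P100-PCIE-16GB)")
cpu_log = ("2018-02-22 23:01:22.906902: I tensorflow/core/platform/"
           "cpu_feature_guard.cc:137] Your CPU supports instructions that this "
           "TensorFlow binary was not compiled to use: AVX2 FMA")

print(gpu_device_created(gpu_log))  # True  - GPU device was registered
print(gpu_device_created(cpu_log))  # False - no GPU device line present
```

Note that the device being registered does not prove the compute-heavy ops were actually placed on it; watching `nvidia-smi` for utilization while the inference runs, or (in a TensorFlow session) enabling `log_device_placement=True`, would give stronger confirmation.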