The output_graph.pb model file generated in the above step will be loaded into memory when running inference. This results in extra loading time and memory consumption. One way to avoid this is to read the data directly from disk.
I have 2 questions about this:
Why is loading from disk (HDD or SSD) faster than loading from memory (RAM)?
I have tested on 5 different audio files. Why does the .pb file return the same text for all of them, while the .pbmm file returns different text for each?
It’s not comparing “loading from disk” against “loading from memory”. It’s about an extra step that has to be taken with .pb.
Both files are read from disk into RAM. With .pbmm, the file is memory-mapped, so only the necessary blocks of data are loaded into RAM, and they are usable immediately. With .pb, the whole file is loaded into RAM, and then a post-processing step has to run before the data becomes usable.
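In case it helps to see the difference in code: here is a minimal, hypothetical Python sketch of the two loading strategies, using a plain whole-file read for the eager “load everything, then post-process” path and the standard `mmap` module for the memory-mapped path. It is not the actual DeepSpeech/TensorFlow loader, and the file names are placeholders.

```python
import mmap

def load_whole_file(path):
    """Like .pb: copy the entire file into RAM up front.
    A separate parsing/post-processing step (e.g. protobuf decode)
    would still have to run before the data is usable."""
    with open(path, "rb") as f:
        return f.read()  # whole file now lives in process memory

def map_file(path):
    """Like .pbmm: map the file into the address space.
    The OS pages in only the blocks that are actually touched,
    and they are usable as-is, with no extra decode pass."""
    f = open(path, "rb")
    return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

# Usage sketch (placeholder file names):
# blob = load_whole_file("model.pb")   # full copy, then a decode step
# view = map_file("model.pbmm")        # lazy, page-by-page access
# first_bytes = view[:16]              # only the touched pages get loaded
```

With the mapped version the operating system pages data in on demand, which is why the .pbmm file starts up faster and keeps resident memory lower.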
As for your 5 different audio files, I don’t know what happened. Maybe the wrong WAV file was used with the .pb file?