Meaning of n_steps in Deepspeech

chenda0413 · March 27, 2019, 2:29pm

I have been trying to get the raw logits from the output layer. I loaded the pre-trained model and its input layer, whose shape is fixed to [1, 16, 19, 26]. However, the input does not always have this shape: performing the same operations as do_single_file_inference on the input file produces features of shape [1, num_strides, 19, 26]. I noticed that the second dimension corresponds to a variable called n_steps in Deepspeech.BiRNN, but I could not understand what this variable means or why it is fixed in the pre-trained model. I apologize if this question is trivial. I would be really grateful if anyone could explain this.

ena.1994 · July 22, 2019, 3:34pm

--n_steps: how many timesteps to process at once by the export graph, higher values mean more latency
  (default: '16')
  (an integer)
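
In other words, the exported graph expects exactly n_steps feature windows per call, so a longer input of shape [1, num_strides, 19, 26] has to be fed in pieces. Below is a minimal sketch of that chunking in plain Python, assuming the last piece is padded with zero windows. The helper name split_into_chunks is hypothetical and not part of DeepSpeech, and the sketch does not run the model. As far as I understand, the recurrent state would also have to be carried from one chunk to the next to get the same logits as a single pass.

```python
def split_into_chunks(windows, n_steps=16):
    """Split feature windows into chunks of exactly n_steps windows.

    `windows` is a list of 2-D windows (e.g. 19 x 26 nested lists).
    The last chunk is padded with all-zero windows. Returns the chunks
    and, for each chunk, how many of its windows are real (not padding).
    """
    if not windows:
        return [], []
    rows = len(windows[0])
    cols = len(windows[0][0])
    chunks, valid_counts = [], []
    for start in range(0, len(windows), n_steps):
        chunk = list(windows[start:start + n_steps])
        valid_counts.append(len(chunk))
        # Pad the final chunk so it matches the fixed input shape.
        while len(chunk) < n_steps:
            chunk.append([[0.0] * cols for _ in range(rows)])
        chunks.append(chunk)
    return chunks, valid_counts


# Example: num_strides = 40 windows of shape [19, 26]
windows = [[[float(i)] * 26 for _ in range(19)] for i in range(40)]
chunks, valid = split_into_chunks(windows, n_steps=16)
print(len(chunks), valid)  # 3 [16, 16, 8]
```

Each chunk then has shape [16, 19, 26], so with a leading batch dimension of 1 it matches the fixed input shape. The valid counts tell you how many output timesteps of the last chunk to keep.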

reuben · July 23, 2019, 5:36pm

Does this help? https://hacks.mozilla.org/2018/09/speech-recognition-deepspeech/
