Word/letter timestamps with DeepSpeech

I’m trying to get timing information on the transcribed speech, i.e. when words were spoken. I was looking over the repo on GitHub and I saw this:

This commit and the discussion around it seem to indicate that this feature has been implemented. It’s my first time looking at DeepSpeech, though, and I’m not sure how to invoke this feature if it actually exists.

Any help would be much appreciated.


Timings are produced by the new CTC algorithm, but we have not exposed them yet in the API.

Great. Thanks for getting back to me. Can you give me some pointers to the code I can take a look at? I’m interested in learning about this area and possibly contributing some patches to expose this info.

@amir.pavlo as this part of the code is @reuben’s expertise, I think he’d be best to advise.

Take a look at the decoder sources in native_client/ctcdecode, the get_beam_search_result function returns an Output structure that contains the predicted characters as well as timesteps for each character. Exposing this in the API requires experimentation to figure out if/how this data needs to be transformed before being shared with users, how accurate it is, etc.

I’m also interested in extracting timing info and I spent some time looking into it today. Timesteps are in the range 0–n_frames. You should therefore be able to compute (total_duration_secs/n_frames)*timestep to get the position in seconds.

So in my case n_frames was 1000 and the duration of the file was 20 seconds. One of my timesteps was 18 so (20/1000)*18 = 0.36 secs, which was pretty close to the position of the word according to Adobe Audition.

Of course, by splitting the file into 1000 units, you can only be accurate to the nearest 1/1000th of its duration. So this will likely produce more accurate results on shorter files.
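To make that concrete, here is a minimal Python sketch of the conversion; the even spacing of timesteps across the file is an assumption carried over from the description above:

```python
def timestep_to_seconds(timestep, n_frames, total_duration_secs):
    """Map a CTC timestep index to an approximate position in seconds.

    Assumes timesteps are spaced evenly over the whole file, as
    described above, so the result is only accurate to one frame.
    """
    return (total_duration_secs / n_frames) * timestep

# The worked example from the post: timestep 18 of 1000 frames
# over a 20-second file lands near 0.36 s.
position = timestep_to_seconds(18, 1000, 20)
```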


Hi, thanks for this investigation! Have you experimented more with the accuracy of timesteps? Does it work well on a whole audio file? I’m going to try testing it on my own, but for now your data would be valuable to me!

I only looked into it briefly and haven’t done a lot of work with it. I’m not a good enough C++ coder to submit a PR for this so I’m hoping my investigation at least helped to save some time for someone who could submit a PR, or even the dev team themselves.

The file gets split into 1000 “buckets” of time and this method is accurate to the start of each bucket. The DeepSpeech team recommends that you use short files of 5-8 seconds, so this is a level of accuracy that may be sufficient for such files, depending on your use-case (it is certainly sufficient for my own purposes). But if you need more accuracy there may be a way of determining where in the bucket the word starts, although I’m not experienced enough with Tensorflow to know where that information would be exposed.


You might also want to look at pocketsphinx: https://github.com/cmusphinx/pocketsphinx

Their API readily exports the timing information. I wrote a small program to get the timing info. I didn’t try with super long audio files. The accuracy seems ok for my application:
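For anyone curious, here is a rough Python sketch of that kind of program. The `Decoder` API and the 100-frames-per-second segment times are my reading of the PocketSphinx docs, and the no-argument `Decoder()` constructor (which loads the bundled US-English model) may differ between package versions, so treat this as a starting point rather than working code:

```python
FRAME_RATE = 100  # PocketSphinx reports segment times as frame indices at 100 fps

def frames_to_seconds(frame):
    """Convert a PocketSphinx frame index to seconds."""
    return frame / FRAME_RATE

def transcribe_with_times(raw_path):
    """Decode a 16 kHz, 16-bit mono PCM file; yield (word, start_s, end_s)."""
    from pocketsphinx import Decoder  # assumes the pocketsphinx package

    decoder = Decoder()  # default US-English model, if the package bundles one
    decoder.start_utt()
    with open(raw_path, "rb") as f:
        decoder.process_raw(f.read(), False, True)
    decoder.end_utt()
    for seg in decoder.seg():
        yield (seg.word,
               frames_to_seconds(seg.start_frame),
               frames_to_seconds(seg.end_frame))
```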



I have some code in a fork to do this now. Feedback appreciated :slight_smile:


Is it possible to output the start timestamp of a word along with its duration - something like the CTM format?

I looked into that briefly but the gaps seem to not actually line up with the space characters, so when I calculated the word duration it also included the pause between it and the next word.

It might be solvable but I didn’t have time to troubleshoot it in detail.
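For what it’s worth, here’s an untested sketch of how the grouping could work, using per-character timesteps and the timestep-to-seconds approximation from earlier in the thread. As noted, a word’s duration computed this way runs up to the following space’s timestep, so it absorbs the pause before the next word:

```python
def words_with_times(chars, timesteps, n_frames, total_duration_secs):
    """Group per-character CTC output into (word, start_s, duration_s).

    `chars` and `timesteps` are parallel sequences of decoded characters
    and their timestep indices. Caveat from the thread: the duration runs
    up to the next space's timestep, so it includes the trailing pause.
    """
    secs_per_frame = total_duration_secs / n_frames
    words, current, start = [], [], None
    for ch, ts in zip(chars, timesteps):
        if ch == " ":
            if current:
                words.append(("".join(current), start * secs_per_frame,
                              (ts - start) * secs_per_frame))
                current, start = [], None
        else:
            if start is None:
                start = ts
            current.append(ch)
    if current:  # final word, no trailing space
        words.append(("".join(current), start * secs_per_frame,
                      (timesteps[-1] - start) * secs_per_frame))
    return words
```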

Hey Reuben, I can’t find this function get_beam_search_result, is it still there? I’m very interested in character timestamps as well.

In the 0.5.0 API you can get timestamps using, in C, the call DS_SpeechToTextWithMetadata().

There are analogous calls in Java, nodejs… called “sttWithMetadata()”
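Here is a hedged Python sketch of calling it via the 0.5.0 bindings. The Model constructor arguments (n_cep=26, n_context=9, beam_width=500) and the Metadata fields (`items`, `character`, `start_time`) are from memory of that release and may need adjusting for other versions:

```python
import wave

def items_to_pairs(items):
    """Flatten metadata items into (character, start_time_secs) tuples."""
    return [(item.character, item.start_time) for item in items]

def char_times(model_path, alphabet_path, wav_path):
    """Run a 0.5.0-style DeepSpeech model; return per-character start times.

    Assumes the deepspeech 0.5 Python package; the constructor values
    below are the ones the 0.5 docs used and may need adjusting.
    """
    import numpy as np
    from deepspeech import Model

    model = Model(model_path, 26, 9, alphabet_path, 500)
    with wave.open(wav_path, "rb") as w:
        rate = w.getframerate()
        audio = np.frombuffer(w.readframes(w.getnframes()), np.int16)
    metadata = model.sttWithMetadata(audio, rate)
    return items_to_pairs(metadata.items)
```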
