Word/letter timestamp with deep speech

Take a look at the decoder sources in native_client/ctcdecode, the get_beam_search_result function returns an Output structure that contains the predicted characters as well as timesteps for each character. Exposing this in the API requires experimentation to figure out if/how this data needs to be transformed before being shared with users, how accurate it is, etc.