Audio files for DeepSpeech

Can we train DeepSpeech on audio files longer than 10 seconds (but shorter than one minute)?

In theory you can, but it may be trickier to get things working well. Longer audio files take more memory per sample, which forces smaller batch sizes, and that can affect convergence. TensorFlow also has some reported numerical instability issues with CTC and long sentences (https://github.com/tensorflow/tensorflow/issues/4193), although I don’t know whether files shorter than one minute will trigger that. So it may be possible, but expect it to take some experimentation to get there.
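
To get a rough feel for the batch size trade-off, here is a minimal back-of-the-envelope sketch. It is not DeepSpeech code: the 20 ms feature step and the `frame_budget` value (a stand-in for however many feature frames fit in your GPU memory per batch) are assumptions for illustration, so plug in numbers from your own setup.

```python
import math


def num_frames(duration_s, step_ms=20.0):
    """Approximate number of feature frames for a clip.

    step_ms is an assumed feature window step, not a value taken
    from any particular DeepSpeech configuration.
    """
    return int(math.floor(duration_s * 1000.0 / step_ms))


def max_batch_size(longest_clip_s, frame_budget, step_ms=20.0):
    """Largest batch that fits in frame_budget if every sample is
    padded to the longest clip in the batch.

    frame_budget is a hypothetical per-batch frame limit standing in
    for available memory.
    """
    frames = max(1, num_frames(longest_clip_s, step_ms))
    return max(1, frame_budget // frames)


if __name__ == "__main__":
    budget = 32000  # assumed: frames that fit in memory per batch
    for seconds in (10, 30, 60):
        print(seconds, "s ->", max_batch_size(seconds, budget), "samples per batch")
```

Under these assumptions, going from 10 second clips to 60 second clips cuts the batch from 64 samples to 10, which is the kind of drop that can change how training converges.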