Can DeepSpeech process longer audio files?

Well, the thread I pointed you at contains an answer from Kelly, who explicitly documents that, because of the network architecture and the current training dataset, processing very long audio will likely not work as expected :).
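If you need to handle long recordings anyway, the usual workaround is to split the audio into shorter chunks and transcribe them one by one. Here is a minimal Python sketch of that idea, assuming the 0.7+ Python bindings (`deepspeech` package) and a 16 kHz mono 16-bit WAV file; the file names and the fixed 10-second window are purely illustrative, and a VAD-based splitter (like the examples in the repo) that cuts on silence instead of mid-word will give better results:

```python
import wave

import numpy as np
from deepspeech import Model

# Illustrative paths; point these at your own files.
MODEL_PATH = "deepspeech-0.9.3-models.pbmm"
CHUNK_SECONDS = 10  # naive fixed window; splitting on silence is preferable

model = Model(MODEL_PATH)

with wave.open("long_recording.wav", "rb") as wav:
    assert wav.getframerate() == model.sampleRate(), "expects 16 kHz input"
    assert wav.getnchannels() == 1 and wav.getsampwidth() == 2, "expects mono 16-bit PCM"
    audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

# Transcribe each chunk independently and stitch the text back together.
chunk_len = CHUNK_SECONDS * model.sampleRate()
pieces = [model.stt(audio[i:i + chunk_len]) for i in range(0, len(audio), chunk_len)]
print(" ".join(p for p in pieces if p))
```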

Regarding accuracy, a number of other factors come into play. You say 46% on 19 seconds of audio, which is not what I would expect, but it can depend on a lot of things: given the dataset we have, the model behaves erratically if the input is not clean American English audio. It could also be microphone interference …
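As a side note, when comparing numbers it helps to agree on the metric. The standard one is word error rate (WER): the edit distance between reference and hypothesis, counted in words. A small self-contained sketch, nothing DeepSpeech-specific, just the textbook dynamic-programming formulation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic edit-distance DP table, over words instead of characters.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution ("word") and one deletion ("a") over 6 reference words: 2/6
print(wer("hello world this is a test", "hello word this is test"))  # 0.333...
```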

Besides, sorry, but DeepSpeech does not “damage” your computer. It is computationally intensive, but we know that, and again, we are working on it. All of that just takes time to do properly. If you are training for a specific speaker, I would suggest taking a look at TUTORIAL : How I trained a specific french model to control my robot, where Vincent produced a model dedicated to himself, smaller and running well on an NVIDIA GPU for his robot. It’s not magic: he was able to produce enough audio data to train seriously, but he also reduced the model size, making it much smaller and thus much less computationally intensive. There is a balance between the generalization capabilities of the model and its complexity.
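For the record, the knob for model width is `--n_hidden` on the training script. A hedged sketch of such an invocation, with purely illustrative values and file names (check the available flags against the release you actually run):

```
python3 DeepSpeech.py \
  --train_files data/train.csv \
  --dev_files data/dev.csv \
  --test_files data/test.csv \
  --n_hidden 512 \
  --epochs 30 \
  --export_dir models/my_speaker
```

A smaller `--n_hidden` shrinks every layer, so inference gets cheaper, at the cost of the model generalizing less well beyond the voices it was trained on.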

For CTRL+C, I guess you are referring to the Python bindings? It’s likely the usual mess of Python and threads; I am not even sure there is anything we can do about it.
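If it bites you in practice, one common workaround is to keep the main thread free to receive KeyboardInterrupt and run the blocking inference call in a worker thread. A sketch along those lines (the model path is a placeholder, and the in-flight native call itself still cannot be interrupted mid-run):

```python
import threading

import numpy as np
from deepspeech import Model

def transcribe(model, audio, result):
    # The native stt() call blocks and cannot be interrupted; it runs here
    # so the main thread stays responsive to CTRL+C.
    result["text"] = model.stt(audio)

model = Model("deepspeech-0.9.3-models.pbmm")  # placeholder path
audio = np.zeros(model.sampleRate(), dtype=np.int16)  # stand-in for real audio

result = {}
worker = threading.Thread(target=transcribe, args=(model, audio, result), daemon=True)
worker.start()
try:
    while worker.is_alive():
        worker.join(timeout=0.1)  # short timeouts keep SIGINT deliverable
    print(result.get("text", ""))
except KeyboardInterrupt:
    print("interrupted; the in-flight chunk still finishes in the background")
```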