I’d like to train a model on the German-language CommonVoice data (and publish it for free, too). I have access to a supercomputer with many GPUs, but I do not have root rights. I already created a Singularity container, https://github.com/NormanTUD/TrainDeepSpeechGerman , with which I can train DeepSpeech. But training takes a really long time.
The supercomputer consists of many nodes, each equipped with good GPUs.
Is there any way to take advantage of this? Can I easily train DeepSpeech in parallel, so that results from one node affect the results on the other nodes?
We use Slurm as the batch management system, if that is important. If any more information would help, just ask and I’ll try to provide it. I am using DeepSpeech 0.6.1 (the latest release on GitHub).
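For reference, here is the kind of Slurm job script I am currently using, in case it helps. This is only a sketch: the partition name, GPU count, container file name (deepspeech.sif), and data paths are placeholders specific to my setup, not anything standard.

```shell
#!/bin/bash
#SBATCH --job-name=deepspeech-de
#SBATCH --nodes=1            # single node so far; multi-node is what I'm asking about
#SBATCH --gres=gpu:4         # assumption: 4 GPUs per node
#SBATCH --time=48:00:00
#SBATCH --partition=gpu      # assumption: partition name on my cluster

# Run DeepSpeech 0.6.1 inside the Singularity container with GPU support (--nv).
# deepspeech.sif and the CSV paths are placeholders for my actual files.
singularity exec --nv deepspeech.sif \
  python3 DeepSpeech.py \
    --train_files clips/train.csv \
    --dev_files clips/dev.csv \
    --test_files clips/test.csv \
    --train_batch_size 24
```

As far as I understand, DeepSpeech uses all GPUs visible on a single node, but I don’t know how (or whether) this can be extended across nodes.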
It would be great to make this work and publish a good German language model. If it works out, I could also train and release models for other languages.