Thread: Discussion on how to manage repositories on GitHub for DeepSpeech work in other languages

Hi @kreid,
in general, I think this is a great idea.

I just wanted to let you know that I have already done some work in this area with my DeepSpeech-Polyglot project:

Pros:

  • Currently supports 5 different languages (German, Spanish, French, Italian, Polish)
  • Covers the whole training process: data preprocessing, language model building, training, and exporting.
  • Adding support for new languages is also very easy: you just have to add a new alphabet_xx.txt file and extend the file containing the special words and character replacements (langdicts.json).
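
To give a rough idea of what adding a language involves, here is a minimal Python sketch. It is not the project's actual code: the function names (`write_alphabet`, `add_language`, `normalize`), the `"replacements"` key, the layout assumed for langdicts.json, and the one-character-per-line alphabet format are all assumptions made for illustration, with Dutch ("nl") as a hypothetical new language.

```python
import json
from pathlib import Path


def write_alphabet(path, characters):
    """Write an alphabet file with one character per line (assumed format)."""
    unique = sorted(set(characters))
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique


def add_language(langdicts_path, lang, replacements):
    """Add a replacement table for `lang` to a JSON file.

    The {"<lang>": {"replacements": {...}}} layout is hypothetical and
    may differ from the real langdicts.json.
    """
    path = Path(langdicts_path)
    data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    data[lang] = {"replacements": replacements}
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return data


def normalize(text, replacements, alphabet):
    """Lowercase, apply character replacements, drop out-of-alphabet characters."""
    text = text.lower()
    for source, target in replacements.items():
        text = text.replace(source, target)
    return "".join(ch for ch in text if ch in alphabet)
```

With an alphabet of a-z, space and apostrophe, and replacements mapping "é" and "ë" to "e", `normalize("Eén Café!", ...)` would yield "een cafe". The real preprocessing also has to handle the special words mentioned above, which this sketch leaves out.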

Cons:

  • I will soon drop support for direct integration into DeepSpeech, because I’m trying to replace its network with an improved architecture (I haven’t finished it yet).
    I’m open to fully integrating the exported networks into DeepSpeech again, but this requires some effort, mainly in the native client code, and I currently don’t have the time for it (I already started a discussion about this here: Integration of DeepSpeech-Polyglot's new networks - #10 by dan.bmh).