DeepSpeech PlayBook v0.1 Alpha is available for feedback

Hi everyone,

Firstly, a huge thanks to everyone for sending through ideas on what should be in a DeepSpeech PlayBook - your feedback was invaluable.

We’re pleased to announce that the first alpha release of the DeepSpeech PlayBook is now available for feedback and testing.

https://mozilla.github.io/deepspeech-playbook/

The PlayBook is written in Markdown, and we welcome Issues and PRs to the GitHub repository.

In particular, we are seeking the following feedback:

  • Please try these instructions, particularly for building a Docker image and running a Docker container, on multiple distributions of Linux so that we can identify corner cases.

  • Please contribute your tacit knowledge - such as:

    • common errors encountered in data formatting, environment setup, training and validation
    • techniques or approaches for improving the scorer or alphabet file, or for lowering Word Error Rate (WER) and Character Error Rate (CER)
    • case studies of the work you or your organisation have been doing, showing your approaches to data validation, training or evaluation.
  • Please identify errors in the text - with many eyes, bugs are shallow 🙂


The PlayBook focuses on training models in DeepSpeech. It does not seek to replace the existing documentation; instead, it provides initial guidance for overcoming the common hurdles people hit when first training models in DeepSpeech.


I’d like to give a huge shoutout here to @ftyers, @Joshua_Meyer and XXX for all their expertise, feedback and patience as we built this out.

Do you know whether one could easily add a search function? Wanted to find “test splits” in a hurry.

Great question @othiele. At the moment, the only way to search is to search the GitHub repo itself. I know that test splits are not covered yet, and that information about test splits should live in TESTING.md:

Best, Kathy