Alternative to print()

print() in Python 2.7 handles Unicode poorly. When stdout is redirected to a file, Python 2 cannot detect an output encoding and falls back to ASCII, so you can get a UnicodeEncodeError if alphabet.txt contains non-ASCII characters and you redirect the output of DeepSpeech.py to a file:

python -u DeepSpeech.py \
  ...
  > log/somewhere

There are a couple of ways to avoid the exception, for instance running export PYTHONIOENCODING=utf-8 in the shell before starting the script.
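Another way to avoid the exception is to encode the text explicitly instead of relying on the stream's default encoding. The following is only a minimal sketch: safe_print is a hypothetical helper name, not something that exists in DeepSpeech.py, and it assumes UTF-8 is the encoding you want in the log file.

```python
import sys


def safe_print(text, stream=None):
    """Hypothetical print() replacement that always writes UTF-8 bytes."""
    if stream is None:
        stream = sys.stdout
    # Accept both text and already-encoded byte strings.
    if isinstance(text, bytes):
        data = text + b"\n"
    else:
        data = (text + u"\n").encode("utf-8")
    # Python 3 text streams expose their byte stream as .buffer;
    # Python 2's sys.stdout accepts byte strings directly.
    target = getattr(stream, "buffer", stream)
    target.write(data)
    target.flush()
```

Calling safe_print(u"\u00e9") behaves the same whether stdout is a terminal or redirected to a file, because the encoding step no longer depends on what Python detects for the stream.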

How about using built-in logging like tf_logging instead of the current prefix_print()/print()?
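To illustrate the idea, here is a rough sketch using Python's standard logging module rather than tf_logging, so it runs without TensorFlow. The make_logger helper, the logger name and the message format are all assumptions made up for this example; nothing here is taken from the DeepSpeech code base.

```python
import logging


def make_logger(path, name="deepspeech_sketch"):
    """Hypothetical helper: return a logger that writes UTF-8 to `path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # The handler owns the file and its encoding, so shell redirection
    # and the stdout encoding no longer matter.
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With this approach the script would be started without the > log/somewhere redirect, and the log file path would be passed to the script instead.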

Could you file this as a bug against the current code base on GitHub?

As a side note, Python 3 has no such problem. I think at some point we need to upgrade things to Python 3, since some of the libraries are also dropping 2.7 (e.g. NumPy, SciPy, PyTorch, and probably TensorFlow soon).


We already have builds for 2.7, 3.4, 3.5 and 3.6. On TaskCluster we still use 2.7 to train the model, but switching is easy.
