Sample Colab Notebook for faster than real-time speech synthesis on a CPU

erogol · June 26, 2020, 3:36pm

https://colab.research.google.com/drive/1u_16ZzHjKYFn1HNVuA4Qf_i2MMFB9olY?usp=sharing

florian_4_sepia · June 28, 2020, 4:22pm

Hi erogol,

Florian here from SEPIA Open Assistant. I thought its better to continue our Twitter discussion here .

I’ve been playing around a bit more with the Raspberry Pi 4 adaptation of your Colab code. Especially with the threading.

The Pi4 has 4 cores. First I tried to set the used cores via code (torch.set_num_threads and torch.set_num_interop_threads). This does not seem to work as the Pi was always at 400% CPU usage no matter where I put the code.
I searched the web for it and found other users with the same problem. What worked for me was to use OMP_NUM_THREADS=1 python3 at start. After that I saw CPU usage at 100%/200%/etc. depending on the OMP_NUM.

Now for the results. If done each test 3 times and took the middle result (errors are given as very rough estimation).

Test sentence: "Hello this is a test"

1 Core:

Step 1: 3.37 (+/- 0.1)
Run-time: 5.54
Real-time factor: 3.702
Time per step: 0.0001678

2 Cores:

Step 1: 2.92 (+/- 0.1)
Run-time: 4.41
Real-time factor: 2.945
Time per step: 0.0001335

3 Cores:

Step 1: 2.86 (+/- 0.15)
Run-time: 4.24
Real-time factor: 2.83
Time per step: 0.0001284

4 Cores:

Step 1: 2.86 (+/- 0.2)
Run-time: 4.16
Real-time factor: 2.77
Time per step: 0.0001259

“Step 1” is right after the synthesis(...) function, “Run-time” same as in your code.
It looks like there is a small effect but not much. At 3 and 4 cores synth. times tend to be much more random as well.

Now something very strange I noticed. When I use the sentence
"Hello this is a longer test"
synth. time gets very very long (2 cores):

| > Decoder stopped with 'max_decoder_steps

Step 1: 54.14
Run-time: 86.44
Real-time factor: 2.483
Time per step: 0.0001126

[EDIT] I just checked the audio and its 3MB (compared to 600KB of the even longer sentence below). It starts perfectly normal then is followed by a lot of silence then some strange artifact that sounds like “lo loooo looooo” followed by even more silence ^^.

An even longer sentence like
"Hello this is a longer test to see if threading actually changes anything at all. Ok, let’s go."
looks ok again (2 cores):

Step 1: 12.58
Run-time: 19.22
Real-time factor: 2.732
Time per step: 0.0001239

Do you have any idea whats going on here?

Regards,
Florian

erogol · June 28, 2020, 4:43pm

Hey Florian

Welcome to our forum. Happy to carry on our conversation here.

The problem with the longer runtime is that the model fails to stop at the right time and it stops by a threshold value. In general if you don’t use a right set of punctuations in a sentence it is prone to happen. For instance in your sample you are missing the stop sign at the end. It might also happen in complex sentences. But for your sample please try it with the stop at the end.

I think we can try a couple of more optimization tricks to improve the runtime speed like exporting model using pytorch script or using tensorflow backend. I am on a vacation for a week after that I can help on that.

Best

florian_4_sepia · June 28, 2020, 6:02pm

You are right, it works. Actually I’ve seen similar errors in older TTS systems (I think it was in Mary-TTS) that’s why I’m doing a check in my SEPIA code that adds the “.” at the end if its not there ^^.

Looking forward to it Enjoy your vacation.

cu

erogol · July 7, 2020, 4:07pm

Just check the models with TF and TF introduces almost 10% boost in speed. Soon I’ll share a notebook running TF models. It’d be nice if you get to try these models too.

florian_4_sepia · July 14, 2020, 9:33pm

Hi @erogol

I’ve tried to get the new TF-Lite demo running on my RPi4. Gladly I had most of the required libraries already installed from last time and after a lot of Googling I found a way to install TF-Lite 2.3.0 runtime … I guess … but now I have the following error:

from TTS.tf.utils.tflite import load_tflite_model
ModuleNotFoundError: No module named 'TTS.tf'

My guess was that this should be installed during the ‘setup.py’ step but I can’t see anything in the logs (if its supposed to appear there). Do you have an idea?

cu,
Florian

erogol · July 14, 2020, 9:59pm

This looks like a setup problem but the module should be available as you setup.py Mozilla TTS as in the sample notebook. i don’t know the reason unfortunately.

florian_4_sepia · July 15, 2020, 8:43am

Here is a complete dump of what gets installed with branch “c7296b3”. TTS/tf is nowhere to be found

/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/synthesize.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/setup.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/compute_statistics.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/distribute.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/version.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/train.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/server/server.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/server/synthesizer.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/server/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/server/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/server/__pycache__/server.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/server/__pycache__/synthesizer.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/server/templates/index.html
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__pycache__/compute_statistics.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__pycache__/distribute.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__pycache__/setup.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__pycache__/synthesize.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__pycache__/train.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/__pycache__/version.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/losses.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/gst_layers.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/tacotron.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/tacotron2.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/common_layers.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/__pycache__/gst_layers.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/__pycache__/tacotron.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/__pycache__/losses.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/__pycache__/tacotron2.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/layers/__pycache__/common_layers.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/tacotron_abstract.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/tacotron.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/tacotron2.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/__pycache__/tacotron.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/__pycache__/tacotron_abstract.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/models/__pycache__/tacotron2.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/loss.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/compute_embeddings.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/tests.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/model.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/dataset.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/generic_utils.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/visual.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/train.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/visual.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/tests.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/generic_utils.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/dataset.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/model.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/train.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/loss.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/speaker_encoder/__pycache__/compute_embeddings.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/synthesis.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/speakers.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/tensorboard_logger.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/measures.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/training.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/data.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/generic_utils.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/visual.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/audio.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/io.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/radam.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/console_logger.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/audio.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/measures.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/speakers.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/radam.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/visual.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/generic_utils.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/console_logger.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/training.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/tensorboard_logger.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/data.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/io.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/__pycache__/synthesis.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/cleaners.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/symbols.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/number_norm.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/cmudict.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/__pycache__/cmudict.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/__pycache__/number_norm.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/__pycache__/cleaners.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/utils/text/__pycache__/symbols.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/compute_tts_features.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/train.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/__pycache__/compute_tts_features.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/__pycache__/train.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/losses.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/melgan.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/pqmf.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/__pycache__/melgan.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/__pycache__/losses.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/layers/__pycache__/pqmf.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/multiband_melgan_generator.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/melgan_discriminator.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/melgan_multiscale_discriminator.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/random_window_discriminator.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/melgan_generator.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/__pycache__/random_window_discriminator.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/__pycache__/melgan_discriminator.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/__pycache__/melgan_multiscale_discriminator.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/__pycache__/melgan_generator.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/models/__pycache__/multiband_melgan_generator.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/generic_utils.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/io.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/console_logger.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/__pycache__/generic_utils.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/__pycache__/console_logger.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/utils/__pycache__/io.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/test_datasets.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/test_rwd.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/test_pqmf.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/test_losses.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/test_melgan_generator.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/test_melgan_discriminator.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__pycache__/test_melgan_discriminator.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__pycache__/test_pqmf.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__pycache__/test_melgan_generator.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__pycache__/test_datasets.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__pycache__/test_losses.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/tests/__pycache__/test_rwd.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/datasets/preprocess.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/datasets/gan_dataset.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/datasets/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/datasets/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/datasets/__pycache__/preprocess.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/vocoder/datasets/__pycache__/gan_dataset.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_text_processing.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/generic_utils_text.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_tacotron_model.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_tacotron2_model.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/symbols_tests.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_loader.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_layers.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_audio.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_preprocessors.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/test_demo_server.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_demo_server.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_audio.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_tacotron_model.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/generic_utils_text.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_preprocessors.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_tacotron2_model.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_loader.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_layers.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/symbols_tests.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/tests/__pycache__/test_text_processing.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/datasets/TTSDataset.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/datasets/preprocess.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/datasets/__init__.py
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/datasets/__pycache__/TTSDataset.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/datasets/__pycache__/__init__.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/TTS/datasets/__pycache__/preprocess.cpython-37.pyc
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/EGG-INFO/dependency_links.txt
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/EGG-INFO/entry_points.txt
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/EGG-INFO/requires.txt
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/EGG-INFO/SOURCES.txt
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/EGG-INFO/top_level.txt
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/EGG-INFO/not-zip-safe
/usr/local/lib/python3.7/dist-packages/TTS-0.0.3+c7296b3-py3.7.egg/EGG-INFO/PKG-INFO
/usr/local/bin/tts-server
/usr/local/bin/phonemize
/usr/local/bin/bokeh
/usr/local/bin/tqdm
/usr/local/bin/flask
/usr/local/bin/unidecode
/usr/local/bin/f2py
/usr/local/bin/f2py3
/usr/local/bin/f2py3.7
/usr/local/bin/convert-caffe2-to-onnx
/usr/local/bin/convert-onnx-to-caffe2
/usr/local/bin/segments
/usr/local/bin/futurize
/usr/local/bin/pasteurize
/usr/local/bin/easy_install
/usr/local/bin/easy_install-3.8
/usr/local/bin/tabulate

I will try to copy it manually.

florian_4_sepia · July 15, 2020, 10:51am

I’ve managed to get it running and … its about 2 times faster

Run-time: 2.36s
Real-time factor: 1.47

https://b07z.net/downloads/tts.html

I’ll try to get more detailed results soon checking the threading as well.

erogol · July 16, 2020, 12:42pm

It think there is even more possibilty for speed up if we train the model using characters. This way, we can skip the phoneme computation.

@florian_4_sepia do you have a chance to try running espeak (our phonemization backend) on raspi and get its runtime?

Or does anyone have any values (@nmstoker maybe?)?

florian_4_sepia · July 17, 2020, 8:54am

yes, eSpeak is a core part of SEPIA.
I’ll get you some numbers, it should be around 0.1 rtf.

florian_4_sepia · July 17, 2020, 9:19am

— eSpeak results (RPi4):

“Hello this is a test”
length: 1.3s
runtime: 0.08s
rtf: 0.062

“Bill got in the habit of asking himself ‘Is that thought true?’ and if he wasn’t absolutely certain it was, he just let it go.”
length: 7s
runtime: 0.11s
rtf: 0.016

— picoTTS results (RPi4):

“Hello this is a test”
length: 1.8s
runtime: 0.13s
rtf: 0.07

“Bill got in the habit of asking himself ‘Is that thought true?’ and if he wasn’t absolutely certain it was, he just let it go.”
length: 8.5s
runtime: 0.24s
rtf: 0.03

erogol · July 17, 2020, 9:23am

then it looks like a not a big bottleneck. Thx for testing.

florian_4_sepia · July 17, 2020, 10:45am

looks like

I’ve updated my evaluation overview page with eSpeak, pico and Mary-TTS examples:

For a good voice-assistant experience I think it is necessary to push the RTF at least down to 0.5 or it will be irritating for Users.

erogol · July 17, 2020, 11:43am

this is the most we can get with the current model in Mozilla TTS. I’ll start on a new model soon to enable even faster runtime but it is months away.

nmstoker · July 17, 2020, 2:59pm

@florian_4_sepia beat me to it!

As I’d tested purely the espeak and espeak-ng parts alone (ie not as part of TTS), maybe it’s still worth me posting this.

My simple test script is attached. It runs espeak (or espeak-ng) with test phrases, repeating it within timeit 1,000 times for each, so we can simply take the times printed (which are in seconds) and treat them as milliseconds for a single run.

It’s just showing the time to output the IPA for the input samples (first two match @florian_4_sepia 's sentences and then there’s a longer quote of just over 600 characters)

As you’ll see there’s very little in it, but espeak is marginally faster than espeak-ng.

The timings for both are so low, it does look like it won’t make a meaningful difference cutting them.

With espeak

Installed simply with sudo apt install espeak

pi@rpi4B3:~/Projects/espeak_tests $ python3 time_espeak_calls.py
Running timing test
“Hello this i” (full length: 20)
8.488519589000134
Running timing test
"Bill got in " (full length: 126)
13.674001921000126
Running timing test
“It was the b” (full length: 612)
39.1383117169998

With espeak-ng

Installed simply with sudo apt install espeak-ng

Install was done after running sudo apt remove espeak and rebooting to ensure there’s no interference between the two.

pi@rpi4B3:~/Projects/espeak_tests $ python3 time_espeak-ng_calls.py
Running timing test
“Hello this i” (full length: 20)
10.240827402
Running timing test
"Bill got in " (full length: 126)
15.530353007999999
Running timing test
“It was the b” (full length: 612)
42.789440690999996

Script is attached. Please note that unlike on Arch, installing espeak-ng on the Pi doesn’t seem to add a symlink to make the espeak command point to espeak-ng so you simply need to update the command within the script (leaving the parameters as they are).

time_espeak_calls.zip (854 Bytes)

florian_4_sepia · July 17, 2020, 3:37pm

Thanks for the info. I think it might be possible to get 20-30% improvement on the RPi4 when the OS and packages are set up optimally. I read somewhere that the Ubuntu 64bit version for the Pi can be considerably faster. Nevertheless it will probably not be the desired factor of 3 faster. I’ll keep an eye on it