Slightly “off topic”, but I hope that won’t be a concern, as there is a (tenuous) connection.
Just thought I’d share a trick for turning TTS audio (or any audio) into a nice little spectrogram video using FFmpeg. That makes it easy to share on Twitter (which doesn’t otherwise accept straight audio uploads) or for other similar uses.
The result can be seen here: https://twitter.com/nmstoker/status/1276569419267952643
There are plenty of other ways to create the video “image” part besides a spectrogram (it’s probably best to Google around if you want to explore options like using a static image etc.; I’m not an FFmpeg guru!)
This technique generates the spectrogram from the audio, then merges it with that same audio.
So that it works on Twitter, the output audio needs to be transcoded to AAC. That happens by default here, but to have some control over the quality you can adjust the 128k parameter below. (I haven’t yet figured out whether Twitter keeps the audio as uploaded or re-encodes it.)
The command is all one line:
ffmpeg -i your_input_audio.wav -filter_complex "[0:a]showspectrum=s=320x240:color=magma,format=yuv420p[v]" -map "[v]" -map 0:a -c:a aac -b:a 128k output_video_file.mp4
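If it helps, here’s a small sketch that builds the same command with the tweakable parts (size, colour scheme, bitrate) pulled out as variables, and prints it so you can eyeball it before running. The filenames and defaults are just placeholders:

```shell
#!/bin/sh
# Sketch: build the spectrogram command with the adjustable parts as
# variables, then print it for checking. Filenames are placeholders.
input="your_input_audio.wav"
output="output_video_file.mp4"
size="320x240"    # spectrogram video dimensions (the s= option)
color="magma"     # showspectrum colour scheme
bitrate="128k"    # AAC bitrate for the merged audio

filter="[0:a]showspectrum=s=${size}:color=${color},format=yuv420p[v]"
cmd="ffmpeg -i \"$input\" -filter_complex \"$filter\" -map \"[v]\" -map 0:a -c:a aac -b:a $bitrate \"$output\""
echo "$cmd"       # copy/paste (or pipe to sh) to actually run it
```

Printing the command first is just a dry-run habit; drop the echo and run ffmpeg directly once you’re happy with the options.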
If you want a smaller file with just a blank video you can use something like this:
ffmpeg -f lavfi -i color=c=0x0a84ff:s=320x240 -i your_input_audio.wav -c:v libx264 -tune stillimage -pix_fmt yuv420p -shortest -c:a aac -b:a 128k output_video_file.mp4
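If you have a folder of clips to convert, the blank-video command can be wrapped in a little loop. This is just a sketch (the `make_clip` function name is mine, not anything standard), converting every .wav in the current directory to a matching .mp4:

```shell
#!/bin/sh
# Sketch: wrap the blank-video command in a function and call it for
# every .wav in the current directory. make_clip is a made-up name.
make_clip() {
  in="$1"
  out="${in%.wav}.mp4"             # e.g. speech1.wav -> speech1.mp4
  ffmpeg -f lavfi -i color=c=0x0a84ff:s=320x240 -i "$in" \
    -c:v libx264 -tune stillimage -pix_fmt yuv420p -shortest \
    -c:a aac -b:a 128k "$out"
}

for f in *.wav; do
  [ -e "$f" ] && make_clip "$f"    # skip if no .wav files match
done
```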
Details on the showspectrum options here:
Details on requirements / limitations for uploading video to Twitter here: https://help.twitter.com/en/using-twitter/twitter-videos