Text-to-speech Behavior

History

Since the beginning, all rovers have had some form of text-to-speech to speak with other rovers and people in the room (citation needed). At first it used the flite engine, because it is fairly easy to integrate and works on practically anything (the target hardware is a Pi with 512 MB of RAM), but eventually espeak and the proprietary Google TTS engine were added.

You can hear some examples of the TTS at https://files.otter.land/rover_theater/fake.mp4 and https://files.otter.land/rover_theater/chake.mp4 and https://files.otter.land/rover_theater/roombanomics.mp4

Google TTS

Google TTS runs on the Pi's CPU with the help of the proprietary ChromeOS Linux libraries, using https://github.com/biemster/gtts and https://storage.googleapis.com/chromeos-localmirror/distfiles/googletts-26.5.tar.xz (though it still seems to use the same voices used since Android 4; perhaps a less verbose voice was used there?). It is very cool, but the latest Google TTS version has not been worked out yet (they tweaked it a bit in 2021). This engine is notably very understandable and varied in its output, and can run on anything from x86-64 to low-power ARM devices.

In the chatbox you can select from the different voices (sfg, iob, iog, iol, iom, tpc, tpd, tpf). Voices usually have a name attached to them, but these are the internal IDs the TTS uses for its English voices.
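If a script needs to accept a voice name from a user, the IDs listed above can be checked before anything is sent to the TTS. This is only a rough Python sketch: the names ENGLISH_VOICE_IDS and normalize_voice are made up for illustration and are not part of the chatbox or of gtts.

```python
# Internal English voice IDs, copied from the list above.
ENGLISH_VOICE_IDS = ("sfg", "iob", "iog", "iol", "iom", "tpc", "tpd", "tpf")


def normalize_voice(name: str) -> str:
    """Return the lowercase voice ID, or raise ValueError if it is unknown.

    Hypothetical helper: it only checks against the IDs listed on this page.
    """
    voice = name.strip().lower()
    if voice not in ENGLISH_VOICE_IDS:
        raise ValueError(f"unknown voice id: {name!r}")
    return voice
```

For example, normalize_voice(" TPF ") gives back "tpf", while a name that is not in the list raises ValueError.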

Behavior Notes

In gtts, using lots of , or ; in your sentences sometimes varies the intonation (probably to make it sound more human), but this ends up sounding funny at high speeds and with fewer words.

Since the TTS is in English, using characters from another language gets rid of the pitch change, and the text will likely not be pronounced correctly, which also sounds funny.

It also likes to infer dates and other information from text: trying @_2-2-2-22_2-2-2- will say something like "the second to the second of february twenty second".

espeak

Adding [[ before a character or a bit of text puts espeak into phoneme mode, which makes it pronounce the text differently, but it's not very useful beyond that. In theory it would be useful for singing, but there is no speed or pitch control.
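For trying phoneme mode outside the chatbox, here is a rough Python sketch that wraps phoneme mnemonics in [[ and ]] (the espeak documentation encloses phoneme input in double square brackets) and hands the text to the espeak command line. The helper names are made up for illustration, the mnemonic string h@l'oU (roughly "hello") is just an assumed example, and speak() does nothing if no espeak binary is on the PATH.

```python
import shutil
import subprocess


def phoneme_text(mnemonics: str) -> str:
    """Wrap espeak phoneme mnemonics in [[ ]] so they are read as phonemes."""
    return f"[[{mnemonics}]]"


def espeak_argv(text: str) -> list:
    """Build the argument list for a plain `espeak <text>` call."""
    return ["espeak", text]


def speak(text: str) -> bool:
    """Speak the text with espeak; return False if espeak is not installed."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(espeak_argv(text), check=True)
    return True
```

Calling speak(phoneme_text("h@l'oU")) would then run espeak with the bracketed text as its only argument.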

Awesome TTS Schemes to Use

Each entry lists the engine and settings, followed by text to say that is hilarious or historically significant.

gtts - tpf - pitch 2 speed .75
@-swows@-@-swoows@-@-swows@-@-swooows@-@-swows@-@-swoows@-@-swows@-@-swooows@-@-swows@-@-swoows@-@-swows@-@-swooows@-@-swows@-@-swoows@-@-swows@-@-swooows@-
@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-@-
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
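
A scheme header like "gtts - tpf - pitch 2 speed .75" can be split into its parts by a script, and the counting text can be generated instead of typed out. This is a minimal Python sketch, assuming the header always has the shape "engine - voice - pitch N speed N". The names Scheme, parse_scheme_header and counting_text are made up for illustration and are not part of any rover code.

```python
import re
from dataclasses import dataclass


@dataclass
class Scheme:
    engine: str
    voice: str
    pitch: float
    speed: float


# Assumed header shape: "<engine> - <voice> - pitch <number> speed <number>"
_HEADER = re.compile(
    r"^\s*(?P<engine>\w+)\s*-\s*(?P<voice>\w+)\s*-\s*"
    r"pitch\s+(?P<pitch>\d*\.?\d+)\s+speed\s+(?P<speed>\d*\.?\d+)\s*$"
)


def parse_scheme_header(line: str) -> Scheme:
    """Parse one scheme header line into its engine, voice, pitch and speed."""
    match = _HEADER.match(line)
    if match is None:
        raise ValueError(f"not a scheme header: {line!r}")
    return Scheme(
        engine=match["engine"],
        voice=match["voice"],
        pitch=float(match["pitch"]),
        speed=float(match["speed"]),
    )


def counting_text(cycles: int) -> str:
    """Build the digits 0 to 9, space separated, repeated `cycles` times."""
    return " ".join(str(i % 10) for i in range(10 * cycles))
```

Here parse_scheme_header("gtts - tpf - pitch 2 speed .75") yields engine "gtts", voice "tpf", pitch 2.0 and speed 0.75, and counting_text(4) rebuilds the four runs of 0 to 9 shown above.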