$tts updates and ElevenAI

Hey!

I caught wind of an emerging service, ElevenAI, that allows you to submit clean voice samples and then produce an eerily accurate TTS from them.

Once you’ve trained a “voice” with some samples, you’re able to forward-synthesize arbitrary text in a web UI and download the resulting MP3. There is also a fairly extensive API for automating these operations.

I saw a promo for a free month of the “starter tier”, so I played with it on Saturday morning and ultimately updated the goontube chat bot’s $tts command to use this API with a Dagoth Ur voice.

Users seemed to love Dagoth’s narrations, and the “token quota” drained quickly over the course of the day; it was completely exhausted by the time I went to bed. :smith:

ElevenAI’s pricing tiers are based on “tokens”, where one token is one character submitted in a TTS request. Here is a summary in USD:

  • free: 10,000 characters per month
  • $5 / mo: starter (what we had/have a trial of) / 40,000 characters per month
  • $22 / mo: creator / 100,000 characters per month
  • $99 / mo: independent publisher / 500,000 characters per month
  • $330 / mo: growing business / 2,000,000 characters per month (they say this works out to about 40 hours of audio)
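For a rough sense of what each paid tier costs per character, here is a quick back-of-the-envelope calculation using only the prices and quotas listed above. The characters-per-hour figure is derived from their "about 40 hours" claim for the top tier, so treat it as an estimate, not a measured number.

```python
# Monthly price (USD) and character quota per paid tier, as listed above.
TIERS = {
    "starter": (5, 40_000),
    "creator": (22, 100_000),
    "independent publisher": (99, 500_000),
    "growing business": (330, 2_000_000),
}

# Derived from the claim that 2,000,000 characters is about 40 hours of audio.
CHARS_PER_HOUR = 2_000_000 / 40


def usd_per_1k_chars(price_usd, chars):
    """Cost in USD for every 1,000 characters at a given tier."""
    return price_usd * 1000 / chars


def approx_hours(chars):
    """Rough hours of audio for a quota, using the derived ratio above."""
    return chars / CHARS_PER_HOUR


for name, (price, chars) in TIERS.items():
    print(
        f"{name}: ${usd_per_1k_chars(price, chars):.3f} per 1,000 characters, "
        f"roughly {approx_hours(chars):.1f} hours of audio"
    )
```

By that estimate the creator tier is the most expensive per character of the paid tiers, but its 100,000 characters would still cover around two hours of audio a month.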

While I am happy to continue eating the costs of the platform & hosting regardless of donations, going overboard on auxiliary content features like this one on an optional chat bot is getting pretty far away from that core.

Here is what I am going to do with the bot so that we can keep enjoying this feature without suddenly paying more for funny voices than for the hosting itself:

  • I will get us the “creator” tier for now to get a new quota of 100K characters; anything above that feels excessive IMO.
  • Update the bot to report the percentage of the TTS quota used and to post an error message in chat when the quota has been exceeded, rather than writing and posting an invalid, empty MP3 file. We saw that happen last night when the quota first ran out.
  • The $tts use will be rate-limited like $dream, but not for quite so long.
  • The same username exception list used for $dream will apply to $tts. We will be on the honour system here, but obviously you and chat will know if you’re ripping through 100% of the quota, and then everyone loses it, so just play nice and enjoy the occasional Dagoth, ok?
  • As a workaround, and to be fair to myself, if somebody wants a few extra tokens and we’re still blowing through the creator tier, then they are more than welcome to register an ElevenAI account, get a service tier, and privately send me an API key for the bot to use. :slightly_smiling_face: But we’re good to start.
  • A $voice command will be added to the bot that sets a per-user choice of a $tts voice. The default will remain Dagoth Ur, but $voice will take the arguments: dagoth or denton to begin with, and we can add more fairly easily.
  • General quotas can be revisited/relaxed as the novelty and usage die down.
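The rules in the list above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the class name `TtsGate`, the cooldown length, and the message wording are all hypothetical, and the real call to the TTS API is left out entirely.

```python
import math
import time

MONTHLY_QUOTA = 100_000      # creator tier: characters per month
COOLDOWN_SECONDS = 30 * 60   # placeholder cooldown for normal users (assumption)
VOICES = {"dagoth", "denton"}
DEFAULT_VOICE = "dagoth"


class TtsGate:
    """Hypothetical helper: tracks quota use, per-user cooldowns, and voice choices."""

    def __init__(self, exempt_users=()):
        self.used_chars = 0
        self.exempt_users = set(exempt_users)  # same list that exempts users for $dream
        self.last_use = {}                     # username -> time of last accepted request
        self.voice_choice = {}                 # username -> voice picked with $voice

    def set_voice(self, user, voice):
        """Handle `$voice <name>`; returns the chat message to post."""
        if voice not in VOICES:
            return f"Unknown voice '{voice}'. Options: {', '.join(sorted(VOICES))}"
        self.voice_choice[user] = voice
        return f"{user}'s $tts voice is now {voice}."

    def request(self, user, text, now=None):
        """Handle `$tts <text>`; returns (voice, message), with voice=None on rejection."""
        now = time.time() if now is None else now

        # Reject up front instead of producing an empty MP3 once the quota is gone.
        if self.used_chars + len(text) > MONTHLY_QUOTA:
            return None, "TTS quota exhausted for this month; request rejected."

        # Cooldown applies to everyone except the exempt usernames.
        if user not in self.exempt_users and user in self.last_use:
            wait = COOLDOWN_SECONDS - (now - self.last_use[user])
            if wait > 0:
                minutes = math.ceil(wait / 60)
                return None, f"{user}: wait about {minutes} more min before using $tts."

        # Accept: one character costs one token, and usage is reported every time.
        self.last_use[user] = now
        self.used_chars += len(text)
        pct = 100 * self.used_chars / MONTHLY_QUOTA
        return self.voice_choice.get(user, DEFAULT_VOICE), f"TTS quota used: {pct:.1f}%"
```

Checking the quota before anything else means a request that would overshoot the month's characters is rejected with a chat message rather than being sent off and coming back as a broken file.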

Anyways, the last parts of the above will take effect within a few hours and the improved TTS will be available again. :toot:

OK, the changes described above have been made. We have more tokens, rate limiting, and a per-user voice option with $voice. The exception list for $dream also applies to $tts, and normal users get a half-hour wait. The current monthly usage is reported with each generation, so we won’t blow the 100K by surprise. The bot posts a rejection notice instead of writing a bad MP3 if the quota is blown.

Here is the usage of the new $voice:

Enjoy!

I added snake as a voice too, since I did the legwork to chop up a bunch of YouTube vids for clean quotes already. It’s not bad.

damn, I was thinking of doing this too but you beat me to it :haw:

This one could use some work; it needs more samples to get that “gravel” right every time.

Which game did you use for the base model? Snake sounded pretty young in MGS1, but he definitely became more gravelly during MGS3.

and then Old Snake is just a pure gravel pit

Hi, this is definitely Josh from Let’s Game It Out. And not a computer.

Okay, I’ve got a request for a $tts voice:

John de Lancie’s Q

:boom: