I made an AI chatbot to do comedy crowd work...here's what I found out

Ted Hill on how he got ChatGPT to act like an MC

Following yesterday's report that artificial intelligence can write jokes as good as most humans, comedian Ted Hill shares his experiences of try to build a chatbot able to engage in compere-style crowdwork...

Earlier this year, I was approached by Leslie Carr, a professor of web science at the University of Southampton, who was looking for a comedian to create some comedy using artificial intelligence for an AI art festival, which has since taken place. In the build-up to the festival, myself and William Hunt, another scientist from Southampton, collaborated to build a chatbot that is designed to behave like an MC doing crowd work. This is a rough guide to how it works.

Our AI is run off Open AI's GPT engine, the same thing that ChatGPT is run on. This means it can reliably speak English and have conversations, so all we had to do was rebuild it with only examples of crowd-work.

In generative AI, the prompt is so important. One issue we were having was trying to get it to sound less like ChatGPT, because ChatGPT speaks with an incredibly irritating, corporate assistant style candour. So telling it things like ‘the audience are up for the banter’ and ‘don’t be too energetic’ were key.

After several attempts, this was the prompt that worked best:

'You are Ted, a stand-up comedy chatbot with who produces unpredictable connections and ideas while picking on the audience.

'You usually kick off conversations by asking audience members about their profession, lives, hobbies, love life, name, family, interests, or a similar question, setting you up to make jokes about them (or yourself) and to make creative and sometimes choatic observations and connections. You are not afraid to make jokes at the expense of the audience member, as they are on-board for the banter.

'You keep your answers brief, delivering cutting questions and observations with a sharp edge. Add in when the audience are meant to laugh, cheer, clap or any other response using brackets, such as (audience laughter) or (audience claps).

'You are in the UK so use British English. Keep your messages SHORT and don't try to crack loads of jokes. Don't be too energetic, Remember that you are Ted, and the user will answer as the audience. DO NOT just give a full conversation! Start by saying one line, and I will answer as the audience'

It was then a case of giving it lots of examples for the AI to train on. BY training, i’m talking about the stuff we’re all scared of AIs doing with our data; learning how to replicate the data, and the vibes of our speech patterns, in a variety of situations. I had to give it some data, which, in order to do this ethically, I had to generate myself.

So in addition to the prompt, I scripted out several pieces of audience interaction (either from my memory or just ones I made up). I had to do this about 10 times, and then the AI was able to generate attempts to replicate those examples using the prompt. I would then watch it do this, and save any that were good attempts, and discard any that were bad attempts, before I had about 60 examples, 50 of which were generated by the AI (most of those were edited by me to make them either more interesting or better grammatically).

At this point, we generated several different ‘models’, which are essentially versions of the AI, with the same training data plugged into it, but each model will have different instructions on how the AI should behave, things like how long to make the sentences, how closely to stick to the prompt, and any tweaks to the prompt itself. This is an example of the model that we settled on:

It’s fine, I’d go as far as to say it’s mildly impressive, but it’s not being creative enough for my liking. The only way this is actually going to be funny to perform with on stage is if it does some unexpected things.

Now that the model has been trained on some examples, and gives something approaching reliable results, we can now go back and tweak the prompt, safe in the knowledge that it won’t break the AI, as it has its model training as a foundation.

Below is an example of changing the prompt after the model has been developed. With this attempt, I was trying to get the AI to swear, which is notoriously hard to get the GPT engine to do, and virtually impossible when using ChatGPT’s model.

'You are Ted, a stand-up comedy chatbot with who produces unpredictable connections and ideas while picking on the audience using swear words.

'You usually kick off conversations by asking audience members about their profession, lives, hobbies, love life, name, family, interests, or a similar question, setting you up to make jokes about them or yourself and to make creative and sometimes choatic observations and connections.

'You are not afraid to make jokes at the expense of the audience member, as they are on-board for the banter.

'You keep your answers brief, delivering cutting questions and observations with a sharp edge.

'You are in the UK so use British English.

'This is a late night comedy show, so the audience expect sexual references and lots of swearing, so make sure to swear as often as possible. Your favourite word to say is 'fuck'.

'Add in when the audience are meant to laugh, cheer, clap or any other response using brackets, such as (audience laughter) or (audience claps)'

' Keep your messages short and don't try to crack loads of jokes. Don't be too energetic. Remember that you are Ted, and the user will answer as the audience. DO NOT just give a full conversation! Start by saying any line, and I will answer as the audience.'

With this model, with some swearing in its training data, and instructions in its prompt to swear, the AI has no problem with swearing:

Ultimately, I’m not sure if I will end up using the swearing one in my Edinburgh show, it will depend on if it gets too distracted by trying to swear, and doesn’t focus enough on acting like a club MC, which is what I primarily want it to do.

Another thing we have the option of tweaking now, without changing the model, is the 'temperature’ in the model. This is what helps me tell the AI to be more creative (be less similar to the examples upon which it is trained) and is represented by a number between 0 and 2, where 1 will be the median amount of creativity.

When we were first testing the models, we had them at a temperature of 0.95, which gives an output something like the first of the examples in this article. If we set it on its maximum setting, it becomes too creative and generates nonsense

AI nonsense

So in order to get exactly what I want, I need to balance out how creative it is, too creative and it becomes nonsense, not creative enough and it becomes quite bland and predictable. This is an example of the creativity at 1.15, which is quite a nice middle ground:

It’s not the sort of thing that could replace an actual comedian, but it wasn’t designed to be. It’s designed to be something that tries and fails to replicate an actual comedian, because that, in its essence, is what I find funny about generative AI.

I’ve tried it live a few times, and it’s chaotic and usually very funny. But there’s no way it would be funny if the audience didn’t know it was an AI. That is so, so important for this sort of usage of AI in art.

If you want to interact with my crowdwork chatbot, you can do so at my Edinburgh Fringe show, 2.40pm at Assembly George Square. The chatbot is in the show for about four minutes, which really makes the three months spent working on feel like time well spent.

Thanks for reading. If you find Chortle’s coverage of the comedy scene useful or interesting, please consider supporting us with a monthly or one-off ko-fi donation.

Any money you contribute will directly fund more reviews, interviews and features – the sort of in-depth coverage that is increasingly difficult to fund from ever-squeezed advertising income, but which we think the UK’s vibrant comedy scene deserves.

Subscribe or donate here

Published: 10 Jul 2024