From Co-pilot to Auto-pilot: Can Automated AI Writing Take Flight?

Jun 21, 2023

Gabrielle, the AI-powered autopilot, sits in the cockpit

Welcome aboard, dear readers! This is your pilot speaking, and I'd like to welcome you all to today's flight, destination: automated AI writing. In our last post, we explored the AI writing assistant, and today we're taking it a step further by handing over the controls to our AI writer co-pilot, testing the limits of automated creativity at 30,000 feet.

The idea to embark on fully automated writing was inspired by this impressive case study of content automation from Byword.ai. Byword developed a groundbreaking AI writing generator that produces high-quality articles for SEO (Search Engine Optimization), propelling websites to new heights of engagement. For instance, Causal, a financial modeling startup, saw its website traffic soar after publishing thousands of AI-generated finance-related posts, such as "How to Make a Pie Chart in Excel," reaching an impressive 750k monthly visits!

Could we do the same for Gabrielle's blog, where she shares insightful life advice for navigating the AI age? If we let our AI auto-pilot write thousands of blog posts, we could reach a bigger audience to raise awareness about the profound impact of artificial intelligence on our daily lives.

Please fasten your seatbelts as we reveal our progress so far. We've made good speed with a remarkable "one-button" prompt that generates end-to-end blog posts from a single topic. But the auto-pilot experience still has some turbulence. Although our test flights on the fully-automated route landed successfully, there are still some quality challenges along the flight path.

Sit back, relax, and enjoy the in-flight entertainment as we share valuable techniques for automated AI writing, exploring what's already possible and charting a course for future advancements!

A) Useful techniques in automating AI writing

Like a pilot relies on detailed instructions and pre-flight preparations, we've discovered crucial steps to guide our AI auto-pilot and ensure smooth writing experiences. Follow along as we fly through upfront instructions, pre-work, the power of examples, and the automated feedback-rewrite loops that keep our flight on the right trajectory.

1. Detailed upfront instructions

In collaborating with AI on our blog-writing process, I noticed some patterns that came up often. For example, we started with a naive approach to writing the 'reader's letter' to Gabrielle. I would simply give the topic to ChatGPT and ask it to write a letter to Gabrielle based on it.

As we wrote a few blog posts together, I saw a few things that I didn't like - for example, ChatGPT would start the letter as an official email, "Dear Gabrielle, I hope this finds you well," or end with too many pleasantries like "I am a huge fan of your column and am looking forward to hearing your sagacious advice." Other times, ChatGPT would write a long, elaborate letter with complicated language - even if it was allegedly from a teenager. In the collaborative process, I'd give ChatGPT that feedback and ask it to rewrite. For example, I'd ask: "Make it shorter and write it like a teenager would."

Over time, I took the most common feedback points and put them into a set of upfront instructions. Now, I ask the AI to write the reader's letter following these guidelines - and it usually eliminates the need to give further feedback after the fact. These instructions cover things to do and not to do, including points about content, style, and some examples.

Example: my upfront writing instructions for the reader’s letter.
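For readers who prefer to see this in code, here is a minimal sketch of how recurring feedback can be folded into reusable upfront instructions. The function name, the topic, and the exact wording of the rules are illustrative assumptions based on the feedback points above, not the actual instructions used for the blog.

```python
from typing import Sequence


def build_letter_prompt(topic: str, dos: Sequence[str], donts: Sequence[str]) -> str:
    """Combine a topic with reusable do/don't guidelines into one prompt."""
    lines = [f"Write a reader's letter to Gabrielle about: {topic}", "", "Do:"]
    lines += [f"- {rule}" for rule in dos]
    lines += ["", "Don't:"]
    lines += [f"- {rule}" for rule in donts]
    return "\n".join(lines)


# Hypothetical rules, paraphrasing the feedback described in the text.
DOS = [
    "Keep the letter short.",
    "Match the voice of the stated reader (a teenager should sound like one).",
]
DONTS = [
    'Open like an official email ("I hope this finds you well").',
    "End with a string of pleasantries.",
]

prompt = build_letter_prompt("a teen using AI for homework", DOS, DONTS)
print(prompt)
```

Keeping the rules in plain lists makes it easy to add a new line each time the same piece of feedback comes up again.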

2. Pre-work

Gabrielle's column mixes life advice with a deliberate look at the impact of AI on society. This is a delicate balance that took some work in our collaboration. Sometimes, ChatGPT would give life advice but forget to talk about AI entirely. Other times, it spent an entire response pondering AI's impact on society without helping the reader with their situation.

To solve this, I started doing human pre-work before asking ChatGPT to write the letter. For each blog post, I would write down a few dilemmas about AI that relate to the story. Then I'd give ChatGPT the reader's letter, Gabrielle's personality instructions, and the AI dilemmas and ask it to incorporate them into the response. With the human pre-work, Gabrielle's answers hit a good balance of personal life advice with a touch of AI ethics dilemmas.

This idea of pre-work is related to a prompting concept of 'working step by step,' or chain-of-thought (CoT) prompting. Researchers found that when faced with a complex task like math or complicated reasoning, Large Language Models (LLMs) perform better when asked to show their work in steps (read more about it here).

In our case, I was doing the pre-work steps myself. Then, inspired by the CoT technique, I asked ChatGPT to do that pre-work and incorporate it into Gabrielle's responses. It worked! So now there are several places where I tell ChatGPT to do some pre-work first and then incorporate that into its ultimate answer.

Example: asking ChatGPT to do pre-work, writing AI dilemmas for the story.
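The pre-work idea can be sketched as two chained model calls: one to list the dilemmas, one to write the answer using them. This is only an illustration under stated assumptions: `generate` stands in for whatever function sends a prompt to a model and returns text, and `fake_model` is a stub so the example runs without any API.

```python
from typing import Callable, List


def answer_with_prework(letter: str, persona: str,
                        generate: Callable[[str], str]) -> str:
    """Ask the model for AI dilemmas first, then feed them into the answer."""
    # Stage 1: pre-work. The model lists dilemmas related to the letter.
    dilemmas = generate(
        "List a few dilemmas about AI raised by this letter:\n" + letter
    )
    # Stage 2: the final answer sees the letter, the persona, and the pre-work.
    return generate(
        f"{persona}\n\nReader's letter:\n{letter}\n\n"
        f"AI dilemmas to weave in:\n{dilemmas}\n\n"
        "Write Gabrielle's response, balancing life advice with these dilemmas."
    )


# Stub model (hypothetical): records each prompt and returns a numbered marker.
calls: List[str] = []


def fake_model(prompt: str) -> str:
    calls.append(prompt)
    return f"<output {len(calls)}>"


reply = answer_with_prework("Dear Gabrielle, ...", "You are Gabrielle.", fake_model)
print(len(calls))
```

The point of the structure is that the second prompt always contains the first call's output, so the model cannot skip the AI angle.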

3. Examples

Large language models are inherently trained to continue writing based on their initial input. Write the beginning of a sentence, and the AI text generator will ___ (complete it for you). If the beginning input follows some pattern in style or content, the AI will try to pick up that pattern and continue it.

One way to build on this is to use examples, a concept often referred to as few-shot prompting (read more here). With this approach, you include several example answers along with your question. That helps the model learn from the context of the samples and steer its response to mimic them.

When I initially asked ChatGPT to write AI dilemmas, they were too generic and obvious (e.g., "Can we trust AI not to be biased?"). To make its responses more hard-hitting, challenging, and punchy, I added a few examples of what I was looking for - and it now does a better job.

Example: ChatGPT improves its answers when given examples.
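A few-shot prompt is just the task followed by sample answers and then the new request. Here is a minimal sketch; the function name and the two sample dilemmas are invented for illustration and are not the examples used for the blog.

```python
from typing import Sequence


def build_few_shot_prompt(task: str, examples: Sequence[str], query: str) -> str:
    """Place example answers before the new request so the model mimics them."""
    shots = "\n".join(
        f"Example {i}: {example}" for i, example in enumerate(examples, start=1)
    )
    return f"{task}\n\n{shots}\n\nNow write one for: {query}"


# Hypothetical sample dilemmas, made up for this sketch.
EXAMPLES = [
    "If an AI writes your apology, who is actually sorry?",
    "Is it cheating if the AI only fixed the parts you got wrong?",
]

few_shot = build_few_shot_prompt(
    "Write a punchy, specific dilemma about AI for the story below.",
    EXAMPLES,
    "a teen using AI for homework",
)
print(few_shot)
```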

4. Automated feedback-rewrite loops

A typical pattern in our human+AI collaboration is feedback - I give ChatGPT feedback on its work and ask it for feedback on mine. A lot of great ideas have come out of this iterative process.

But is there a way to automate it? One approach that works well is asking ChatGPT to give itself feedback and then rewrite.

Specifically, I applied this to improving Gabrielle's answers. One challenge we ran into was that Gabrielle's answers were funny but not always thoughtful. To improve this, I asked ChatGPT to adopt different personalities of experts and have each one provide 'feedback' about Gabrielle's answers. ChatGPT simulated decent feedback from these experts that helped Gabrielle see a broader set of valuable viewpoints.

Besides the experts, I asked ChatGPT to evaluate Gabrielle's answer on several stylistic aspects like humor and clarity and make suggestions for improvement.

Armed with these suggestions, ChatGPT's second run at Gabrielle's answer becomes more thoughtful and has another chance to hit the right style.

Example: ChatGPT takes on the personas of experts to generate automated feedback for Gabrielle.
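The feedback-rewrite loop can be sketched as follows: collect feedback from several personas, then ask for a rewrite that uses it. As before, `generate` is a stand-in for a real model call, `fake_model` is a stub so the code runs on its own, and the reviewer personas are hypothetical.

```python
from typing import Callable, List, Sequence


def feedback_rewrite(draft: str, reviewers: Sequence[str],
                     generate: Callable[[str], str], rounds: int = 1) -> str:
    """Gather persona feedback on a draft, then rewrite it; repeat per round."""
    for _ in range(rounds):
        feedback = [
            generate(f"You are {reviewer}. Give brief feedback on this answer:\n{draft}")
            for reviewer in reviewers
        ]
        notes = "\n".join(f"- {item}" for item in feedback)
        draft = generate(
            f"Rewrite the answer below using this feedback:\n{notes}\n\nAnswer:\n{draft}"
        )
    return draft


# Stub model (hypothetical): canned replies depending on the kind of prompt.
prompts_seen: List[str] = []


def fake_model(prompt: str) -> str:
    prompts_seen.append(prompt)
    return "rewritten draft" if prompt.startswith("Rewrite") else "be more specific"


REVIEWERS = ["a therapist", "an AI ethicist", "an editor judging humor and clarity"]
final = feedback_rewrite("first draft", REVIEWERS, fake_model)
print(final)
```

Each round costs one call per reviewer plus one rewrite call, so the number of rounds is worth keeping small.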

B) Putting it all together - step-by-step instructions

With all the steps above, I arrived at a repeatable process for writing a Dear Gabrielle post in under an hour. I had a set of instructions for each step saved in a document. I would copy and paste these instructions into ChatGPT to do each step - write the reader's letter, come up with dilemmas, write Gabrielle's response, get feedback, rewrite it, etc.

My role in this 'collaboration' was reduced to copy-pasting. Occasionally, I saw an opportunity to improve a reader's letter or Gabrielle's response. But mostly, it was Ctrl-C, then Ctrl-V.

My initial thought was to write some code to automate this with a series of calls to the ChatGPT API. And by "write code," I mean I asked ChatGPT to write that code for me!

But then, I saw several projects where ChatGPT gets a detailed "specification" of instructions and can follow the instructions pretty well. That inspired me to take my process and turn it into step-by-step instructions.

I collected the separate instructions from my document into a single prompt with 11 steps - from creating a reader persona to writing the reader's letter, feedback and even writing the titles for the blog post. Each step has clear instructions and all the necessary inputs - examples, detailed style guidance, etc. To run the instructions, I give ChatGPT the topic for the blog and ask it to run through all the steps to write out the full blog post.

An illustration of the step-by-step process described in the full prompt to ChatGPT.
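Assembling separate instructions into one numbered prompt might look like the sketch below. The step names follow the ones mentioned in this post, but the list is shortened and the instruction wording is an assumption; the real prompt has 11 steps with much more detail.

```python
from typing import List, Sequence, Tuple

# Shortened, hypothetical step list; the wording is invented for illustration.
STEPS: List[Tuple[str, str]] = [
    ("Reader persona", "Invent a reader who fits the topic."),
    ("Reader's letter", "Write the reader's letter in that persona's voice."),
    ("AI dilemmas", "List a few dilemmas about AI raised by the letter."),
    ("Gabrielle's response", "Answer the letter and weave in the dilemmas."),
    ("Feedback", "Give expert and style feedback on the response."),
    ("Rewrite", "Rewrite the response using the feedback."),
    ("Titles", "Suggest titles for the blog post."),
]


def build_master_prompt(topic: str, steps: Sequence[Tuple[str, str]]) -> str:
    """Number the steps in order and join them into a single prompt."""
    parts = [
        f"Topic: {topic}",
        "Work through every step in order and show the output of each.",
    ]
    for number, (name, instruction) in enumerate(steps, start=1):
        parts.append(f"Step {number}: {name}\n{instruction}")
    return "\n\n".join(parts)


master_prompt = build_master_prompt("queer retellings of fairy tales", STEPS)
print(master_prompt)
```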

And the best part? It works! ChatGPT can now write end-to-end posts for the Dear Gabrielle blog with a single prompt. To see the complete prompt and an example output article, see this shared conversation (link). It led to the following post from Gabrielle:

Gabrielle.Day’s Substack

A Fairy Godparent’s Quest: Queering Up Fairy Tales with AI Magic

Dear Gabrielle, I'm Jamie, and I recently cooked up this super cool website called "FairyQueer." I used this wicked AI tool, MagicPotion, to inject rainbows and unicorns into classic fairy tales, making them way more relatable to LGBTQIA+ teens. The magic beans of AI turned Cinderella into a transgender princess, and Snow White had seven fabulous queer f…

2 years ago · Gabrielle.Day

With this one-button solution, I was excited to retire! Let ChatGPT write Gabrielle's blog, and I will fly to Hawaii to sip Mai-Tais on the beach. But it turns out it's a bit too early to give ChatGPT its wings...

C) Challenges:

When I got the automated process to work and ChatGPT wrote an entire blog post with a single click, my jaw was practically on the floor. It wasn't just that it ran through the steps, but it also generally followed the style and nature of the blog.

But this automation still has its challenges...

Context length:

I've mentioned how ChatGPT has a limited short-term memory. When answering questions, it can look at up to ~3 pages of history in the conversation, forgetting any earlier content.

The end-to-end instructions that I put together took about two pages. And I am asking ChatGPT to write about two pages of text between the reader's letter, the AI dilemma pre-work, and the versions and feedback of Gabrielle's responses. So eventually, as ChatGPT progresses in writing its answer, it can forget the beginning of the instructions! For example, when my instructions were too long, ChatGPT started hallucinating new steps such as "Step 10: summarize the post" or "Step 11: send the final draft for editor approval."

Models are getting longer context lengths, so this problem will disappear with time. But I've gotten around it for now by condensing the instructions (I asked ChatGPT to write them more concisely without losing the details!) and arranging them in the order the steps run. This way, it's ok if ChatGPT forgets the instructions for step 1 when it's already working on step 9.
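The arithmetic behind the forgetting is simple enough to sketch. Using the rough page counts from this section (about two pages of instructions, about two pages of output, and a window of about three pages), the conversation outgrows the window by about a page. The function name is hypothetical and "pages" is only a loose unit here.

```python
def overflow_pages(instruction_pages: float, output_pages: float,
                   window_pages: float) -> float:
    """Return how many pages of the conversation fall outside the window."""
    return max(0.0, instruction_pages + output_pages - window_pages)


# Rough figures from the text: 2 pages in, 2 pages out, ~3-page window.
lost = overflow_pages(2.0, 2.0, 3.0)
print(lost)  # 1.0

# Condensing the instructions to one page (an assumed figure) removes the overflow.
print(overflow_pages(1.0, 2.0, 3.0))  # 0.0
```

Since the oldest text drops out first, the page that gets lost is the start of the instructions, which is why ordering the steps chronologically helps.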

Model capability:

Even with the most detailed instructions, some models are limited in their ability to follow them. For example, no matter how much I ask GPT-3.5 to be funny and make up specific examples - it often sticks to boring generalities. GPT-4 does a much better job creating specific examples and being funny. But it is only available to people with a paid ChatGPT Plus subscription or through the paid API.

Bing has a different limitation - it is limited to 4,000 input characters in its requests. My instructions are around 12,000 characters, so it can't handle them in one prompt.
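One possible workaround, which was not tested for this post, is to pack whole steps into chunks that each stay under the character limit. The sketch below is an assumption-laden illustration: the function name is hypothetical, the 11 equal-length placeholder steps only mimic a roughly 12,000-character prompt, and whether a model follows instructions split across several messages is a separate question.

```python
from typing import List, Sequence


def pack_steps(steps: Sequence[str], limit: int) -> List[str]:
    """Greedily group whole steps into chunks of at most `limit` characters."""
    chunks: List[str] = []
    current = ""
    for step in steps:
        if len(step) > limit:
            raise ValueError("a single step exceeds the character limit")
        candidate = step if not current else current + "\n\n" + step
        if len(candidate) <= limit:
            current = candidate
        else:
            # The step does not fit, so close this chunk and start a new one.
            chunks.append(current)
            current = step
    if current:
        chunks.append(current)
    return chunks


# Placeholder steps: 11 steps of 1,090 characters, about 12,000 in total.
placeholder_steps = ["x" * 1090 for _ in range(11)]
chunks = pack_steps(placeholder_steps, limit=4000)
print(len(chunks))  # 4
```

Three steps fit in each chunk (3,274 characters with separators), so 11 steps need four messages rather than the three that a plain 12,000 / 4,000 division suggests.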

With the difference in model capabilities, I've come to rely exclusively on GPT-4 for Gabrielle's blog. None of the others I've tried have come close yet.

The human touch:

Despite all my efforts, ChatGPT's autonomous writing is not on par with our human+AI collaboration process. It writes decent stories and responses, but they sometimes miss the X-factor we reach when working collaboratively.

I still need to put my finger on what exactly is missing. It's usually an idea that comes up in the collaboration - one that improves the story, gives it an unexpected twist, makes it more emotional, or makes Gabrielle's answer more insightful. By closely examining our collaborative process, we may find more instructions that steer the automated process toward even better quality.

Example: While AI-generated stories were decent, the human+AI collaboration made them more compelling.

Conclusion:

So, did our AI succeed in flying solo? Yes, indeed! ChatGPT has already written several entire blog posts for Dear Gabrielle. And while you may suspect that automated AI writing is bland and low-quality, this auto-pilot might surprise you!

Our flight path to automation town included giving clear step-by-step instructions, guiding the AI through pre-work, giving examples, and a few rounds of automated feedback. We ended up with a successful flight, from take-off to landing, at the press of a single button!

Was it perfect? No. The human+AI collaboration still has an edge over the automated process. For now, at least. So while Gabrielle may not achieve her dream of publishing 8,000 blog posts this month, the good news from this human co-pilot is that I'm not entirely useless on this team (yet!).

What's on the horizon? The sky is the limit! AI's automation capabilities will improve with longer context lengths, more advanced prompts, and better models. And while AI could fly more routes solo, this human+AI co-pilot team is excited to see all the new destinations we could reach together.

Note: Some of the references in this article link to the excellent prompting guide from DAIR.AI. It is an advanced collection of techniques with links to the scientific papers where they were introduced. Highly recommended for diving deeper and getting some ideas for working with advanced prompts.

Another note: This post explored whether and how we could use AI to automate writing. Another interesting question to explore is whether we should do it. There is a whole box of issues to study there, from how to maintain proper oversight over AI, to whether and how creating a lot of content automatically might make each piece of content far less valuable. We'll save some of those thoughts for another upcoming post about AI ethics topics that came up during the project.

Over to you, dear passengers! Have you tried using AI to write content? What techniques have you found worked, or what disappointed you? Leave a comment below - even if it’s AI-generated!