Frustrated with content production bottlenecks for SEO? It’s a common problem, and technologies like ChatGPT can help us to solve that problem. Welcome to the world of scalable AI content production.
Common SEO Bottlenecks for Content Production
Working in SEO, it’s common to encounter content production bottlenecks. A client comes to you, and it turns out that their product content is a performance detractor. Perhaps they have thousands, or tens of thousands, of products. Maybe the products are using thin boilerplate descriptions or pulling through titles as descriptions, or maybe there is no product-level descriptive content at all.
In these scenarios you might advise the creation of content for thousands of products on your client’s site. You want the product descriptions to be unique. Unique and helpful content has been more highly prized since Google’s Helpful Content update. You want the product content to be accurate, informative and useful to end-users.
Creating thousands of snippets of content can take a lot of human time. Each snippet has to be carefully authored, tweaked and edited. For these reasons, some agencies will support clients with their own in-house content teams, or might instead export the project to a platform like Textbroker where it can more easily be managed. In either case, there are associated fees for the content production, and they are not small. If you want to write 250-word snippets for 2,000 product URLs, that’s a total word count of 500,000 (half a million!) words.
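The arithmetic behind that figure is easy to check, and the same few lines can be reused to size your own project. This is only a quick sketch; the function name is illustrative and it assumes every product gets one snippet of the same length.

```python
def total_words(words_per_snippet, product_count):
    """Total words needed if each product gets one snippet of equal length."""
    return words_per_snippet * product_count


# 250-word snippets for 2,000 product URLs, as in the example above
print(f"{total_words(250, 2000):,} words")
```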
You know that your client needs more content for their site, and it needs to be authored to a set standard. But the costs of producing all of this are high, maybe too high for your client to afford. What can you do? Enter ChatGPT, GPT-3, text-davinci-003 and AI content production.
A Brief Introduction to ChatGPT
If you visit the OpenAI site and create a free account, you’ll see this button:
Once you click it, you’re taken to an interface. From here you can ask ChatGPT to perform various tasks. Let’s imagine that we’re working on content for the Angling Direct website (fishing gear and supplies). We’ll imagine that we’re working on a content snippet for this page: https://www.anglingdirect.co.uk/mainline-cell-freezer-boilies
The more detailed the prompt, the better the response. In this case we have included some background info on the company (taken from their “About” page), we have told ChatGPT about their website and the product page. We have specified the product in question. We have given ChatGPT some of the product’s key features (listed on the product page). We have given ChatGPT some advice in terms of how to render those key features, and we have suggested a writing style. We have also suggested a rough word count.
Here is ChatGPT’s response:
This response came out slightly over our suggested word count at 297 words. It’s easier to cut content down than it is to produce new content.
ChatGPT and AI Content Drawbacks
The first thing to note is that this technique works better for more established products, which have a higher presence within ChatGPT’s data model. The data model is the source from which ChatGPT fetches, combines and rewrites all of this information.
This means that new and emerging news stories, or emerging products, may not be part of ChatGPT’s data model. In fact, you can put this question to ChatGPT directly:
Obviously, this means that getting ChatGPT to write about emerging products or services can be a bit tricky. In such a scenario, ChatGPT will search for ‘similar’ information and will start generating content on a more assumptive basis. The bottom line here is that ChatGPT can’t replace content writers and authors (who can do more thorough fact-checking). What ChatGPT can do is produce the bulk of the content for editorial review and tweaking. This is still beneficial, as it transforms ‘from-scratch’ mass-content creation tasks into content review and editing tasks.
Keep in mind though, a human review process is strongly advised when utilising technologies like OpenAI’s ChatGPT in this way. This is especially true if you are using AI to generate content for law firms and service pages, as a strict content adherence policy will likely be in place.
How to Produce AI SEO Content at Scale: OpenAI’s Davinci
If you’ve followed along thus far, you’re probably wondering how to scale this. There are a few problems with utilising ChatGPT’s native interface for scalable content production:
- Firstly, there’s the prompt-crafting. You saw the level of detail we put into our earlier ChatGPT prompt / response example. How can you scale the production of detailed, crafted prompts?
- Secondly, the ChatGPT interface needs a human to sit there, copy prompts in, copy the responses down and log them somewhere. That’s a slow, expensive and inefficient process.
- Thirdly, ChatGPT is currently API-inaccessible. You can sign up for ChatGPT Plus here. ChatGPT Plus still uses the same human-accessible interface as ChatGPT free. There’s a separate ChatGPT API waiting list Google Form here. But currently, you can’t access ChatGPT via API.
Due to these issues, we have to defer to ChatGPT’s little brother, text-davinci-003. Davinci 003 is of ‘similar’ power to ChatGPT, but it’s not quite as advanced. In fact, ChatGPT confirms that it utilises both GPT-3 (model) and text-davinci-003 (sub-model / engine):
From this perspective, ChatGPT is a superior interpretation and UI sleeve, that slides over the top of text-davinci-003, which in turn is a sub-model of GPT-3.
Although text-davinci-003 is less powerful than ChatGPT (for example, it struggles a bit to keep word counts higher without sufficient input), it’s still really powerful. It’s still a technology which is produced by OpenAI (so moving your project from text-davinci-003 to ChatGPT later should be relatively easy). Davinci is still capable of interpreting direct instructions and writing something, and it still has a very large data model.
The main benefit of Davinci is that it is already API-accessible, so you can put it to scaled work immediately.
Using Excel to Scale the Creation of Rich Prompts
If you’ve followed us this far, you know ChatGPT’s current limitations and API inaccessibility. You also know that we’ll be using text-davinci-003 instead to scale the work. This solves the issues of mass-feeding OpenAI with prompts to generate content and responses. What it doesn’t solve, is the necessity of creating thousands of very detailed prompts for Davinci to process. To do this, we’ll just use Excel and some complex, logic-based formulae.
Usually, you would require some input data from your client to create prompts from. This might come in the flavour of CSV / spreadsheet output from their PIM (Product Information Management) system.
In our example, we have no access to such data. Instead, we will collect some from the front-end of the site using Screaming Frog and XPath (more info on that combination of technologies here).
After our crawl, we have a couple of hundred products listed in a spreadsheet with some associated data:
We have each product’s URL, the product name (product title), the SKU and MPN for each product, each product’s key features (transformed from an HTML list to a comma-separated list) and also the old product description (we may wish to recycle elements of this).
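The post doesn’t say how the key features were flattened from an HTML list into a comma-separated list, so the following is just one possible way to do it, sketched with Python’s standard `html.parser` module. The class name, function name and sample markup are all hypothetical.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collects the text found inside each <li> element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self.items[-1] += data


def features_to_comma_list(html_list):
    """Turn '<ul><li>..</li></ul>' markup into 'item, item' text."""
    parser = ListItemCollector()
    parser.feed(html_list)
    parser.close()
    # Collapse stray whitespace inside each item and drop empty items
    cleaned = [" ".join(item.split()) for item in parser.items]
    return ", ".join(item for item in cleaned if item)


# Hypothetical markup, not taken from the real product page
sample = "<ul><li>Feature one</li><li> Feature  two </li></ul>"
print(features_to_comma_list(sample))
```

A flat list like this is easier to drop into a single spreadsheet cell, and later into a prompt, than raw markup.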
Now it’s time to write a formula to combine these bits of information into a prompt for OpenAI’s text-davinci-003 engine. First, we need to create some new columns to contain some text, which we will wrap around our data:
Instead of creating new columns, you could quote these text strings inline, within the end-result (prompt generating) formula. However, text strings within formulae are limited to 255 characters, which is insufficient for our purposes.
Once that step is done, you simply construct the prompt-generating formula and hide the redundant (text-string containing) columns:
In the example above, not all products had a listed MPN, so we had to include an IF statement, which would remove that part of the prompt if no MPN existed. Keep in mind that you can’t simply copy and paste our formula, as your input data will likely be in a different shape and state, and may have completely different fields to work with (hence we are not supplying a template download, as this isn’t really a template-friendly task).
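For readers who prefer code to spreadsheet formulae, the same idea (wrap fixed text around each product’s data, and leave out the MPN sentence when no MPN exists) can be sketched in Python. This is not the formula used above; the field names, wording and sample product are assumptions made purely for illustration.

```python
def build_prompt(product, company_intro, style_notes):
    """Assemble one prompt from fixed text plus a product's data.

    `product` is assumed to be a dict with 'name', 'sku', 'features'
    and an optional 'mpn' key (hypothetical field names).
    """
    parts = [
        company_intro,
        f"Write a product description for '{product['name']}' (SKU: {product['sku']}).",
    ]
    # Equivalent of the IF statement: only mention the MPN when one exists
    mpn = (product.get("mpn") or "").strip()
    if mpn:
        parts.append(f"The manufacturer part number is {mpn}.")
    parts.append(f"Key features: {product['features']}.")
    parts.append(style_notes)
    return " ".join(parts)


# Hypothetical product row, not real data from the crawl
example = {"name": "Example Boilies 1kg", "sku": "EX-001", "mpn": "", "features": "feature one, feature two"}
print(build_prompt(example, "Background about the retailer goes here.", "Aim for roughly 250 words."))
```

As with the Excel version, the shape of your own input data will dictate which fields and conditions you actually need.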
Now that we have our formula working, we have a full, rich, detailed prompt for each of these products:
Our prompts are looking pretty good. Very rich and detailed. We have a prompt for around 250 product URLs, and this equates to 91,000 words of input prompts. It’s time to process all of these prompts and reap our data.
Using OpenAI’s API and Text-Davinci-003 to Fetch our Prompt Responses
It’s finally time to generate 250 product descriptions from our 250 input prompts, which we generated using some basic input data and Microsoft Excel.
To do this, you will need to work with a web developer or a programmer. In this instance, I (the author of this post, James Allen) dabble in API / AI programming – in addition to my SEO and Analytics roles. As such, I produced a Python script for Anicca Digital. The script loads the prompts from a spreadsheet, sends them to OpenAI’s text-davinci-003 engine for processing, harvests the responses and then logs them in a separate data dump.
There’s a lot more to it than this. For example, I needed to import libraries to create a bridge between Python, Excel and the API. I needed to create a sensible loop to process all rows from the spreadsheet. I needed to code custom error handling into the script, in the event of an OpenAI API response error (this usually happens when the data model is being overloaded). There are also parameters to be set for each request (e.g. temperature, max tokens, frequency penalty, presence penalty).
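The script itself isn’t reproduced in this post, but the overall shape described above (load prompts, loop over rows, retry on errors, log the responses) can be sketched as follows. Treat it as a rough outline under several assumptions: the prompts are exported to CSV with `url` and `prompt` columns rather than read from Excel directly, the parameter values are placeholders, and `send_request` is a hypothetical function you would write yourself to call the OpenAI API and return the generated text.

```python
import csv
import io
import time

# Placeholder settings; the post does not state the values it used
REQUEST_PARAMS = {
    "model": "text-davinci-003",
    "temperature": 0.7,
    "max_tokens": 600,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
}


def build_request(prompt):
    """Combine the shared parameters with one prompt."""
    return {**REQUEST_PARAMS, "prompt": prompt}


def process_prompts(rows, send_request, max_retries=3, wait_seconds=5.0):
    """Send each row's prompt, retrying on errors, and collect the responses."""
    results = []
    for row in rows:
        response_text = ""  # stays empty if every attempt fails
        for attempt in range(1, max_retries + 1):
            try:
                response_text = send_request(build_request(row["prompt"]))
                break
            except Exception as error:  # deliberately broad for this sketch
                print(f"{row['url']}: attempt {attempt} failed ({error})")
                if attempt < max_retries:
                    time.sleep(wait_seconds * attempt)  # wait longer each time
        results.append({"url": row["url"], "prompt": row["prompt"], "response": response_text})
    return results


def run(prompt_csv, output_csv, send_request, **options):
    """Read url/prompt rows from one CSV file object, write results to another."""
    rows = [r for r in csv.DictReader(prompt_csv) if (r.get("prompt") or "").strip()]
    results = process_prompts(rows, send_request, **options)
    writer = csv.DictWriter(output_csv, fieldnames=["url", "prompt", "response"])
    writer.writeheader()
    writer.writerows(results)
    return results


def fake_send(request):
    """Stand-in for a real API call, so the sketch runs offline."""
    return f"Generated copy for: {request['prompt']}"


source = io.StringIO("url,prompt\nhttps://example.com/p1,Describe product one\n")
dump = io.StringIO()
run(source, dump, fake_send, wait_seconds=0)
print(dump.getvalue())
```

Keeping the API call behind a single function makes it easier to test the loop and error handling without spending credits, and to swap in a different model later.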
Off goes our Python script:
Above you cannot see the coding of the script, but you can see the request data and the response. Command Prompt is also printing the data in a way which escapes apostrophes with a backslash.
Once all of the data is processed, we can then explore the data dump:
In this instance, we generated content for 190 products (rather than the full 250). Since this exercise was for demonstration purposes, we wanted to save some of our OpenAI API credits. We managed to generate 190 product descriptions, with a total word count of 44,788 words.
The content only cost us around $3 to produce, in terms of API fees from OpenAI:
For $3.10, we were able to produce over 44,000 words of copy across 190 generated product descriptions. Here’s an example of one of the product descriptions which was written:
It’s fairly good. Some of the wording could be altered, and some of the features could be placed into a bulleted list. A little tone of voice work may also be needed. Weaknesses aside, this process has managed to generate a large quantity of content which is now ready for editorial review and editing.
Producing all of this content from scratch would have taken days. Hopefully, the editing time will be much shorter.
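To put the $3.10 figure in context, you can work out a rough cost per thousand generated words and project it onto a larger job, such as the 500,000-word example from earlier. This is only a back-of-the-envelope sketch: it assumes cost scales linearly with output word count and that prompts stay a similar length, whereas OpenAI actually bills by tokens (prompts included) and pricing can change. The function names are illustrative.

```python
def cost_per_thousand_words(total_cost_usd, total_words):
    """Average API cost for every 1,000 words generated."""
    return total_cost_usd / total_words * 1000


def projected_cost(total_cost_usd, total_words, target_words):
    """Scale the observed cost linearly to a larger word count."""
    return total_cost_usd / total_words * target_words


# Figures from this exercise: $3.10 for 44,788 words
print(f"${cost_per_thousand_words(3.10, 44_788):.4f} per 1,000 words")
print(f"${projected_cost(3.10, 44_788, 500_000):.2f} projected for 500,000 words")
```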
Google’s Stance on AI Content
The waters are murky in this area. If you’re considering building a cookie-cutter site which just uses ChatGPT to comment on all the high-volume, recent trends – and then you’re just going to slap some ads on it, that would be firmly against Google’s guidelines.
Google have stated (in the past) that AI-generated content goes against their guidelines for this reason. However, this assumes that the content would be uploaded as-is, with no editorial review process and no fact-checking. We’re not suggesting that you simply generate a load of stuff and then upload it. We’re crafting very detailed prompts very carefully, and then sending the resulting content (generated by AI) through a human review and editorial process. The content which ends up on the website won’t be exactly the same content that the AI spat out.
Google have actually admitted that currently, they cannot fingerprint and identify AI content (even OpenAI, the creators of the technology, are finding this difficult). Google then also later shifted their stance on AI content, with Google’s Search Liaison (Danny Sullivan) stating:
“We haven’t said AI content is bad. We’ve said, pretty clearly, content written primarily for search engines rather than humans is the issue. That’s what we’re focused on. If someone fires up 100 humans to write content just to rank, or fires up a spinner, or a AI, same issue…”
So, what Google really mean is that if you spin up a load of generated content to underpin spam pages, that’s against their guidelines. If, however, you already have useful pages on your site (such as product pages where users can buy things), and you’re using the power of AI to supplement content growth, that’s okay. Here at Anicca we’d also put all of the generated content through a full editorial review and tweaking process to humanise the content.
If you’re really interested, you can read Google’s AI-generated content guidance here:
https://developers.google.com/search/blog/2023/02/google-search-and-ai-content
Also note that Google have recently unveiled ‘Bard’, their own ChatGPT rival. Google aren’t fundamentally against this technology, though their current revenue model (Pay Per Click advertising) isn’t fully compatible with it (if no one clicks, none of Google’s clients will be spending to serve ads). OpenAI’s ChatGPT has forced Google to take a more receptive stance towards this technology, which will accelerate the ways in which we use and integrate AI-generated content for SEO.
We’re not aiming to get rid of copywriters. Instead, we want to use the same resource to create more unique and useful content (for our clients) than we could before. We want to retain our copywriting staff, profit from new technology, and also give our clients a competitive edge in terms of content production (more content generated in less time).
In a nutshell: using AI-generated content should be okay, so long as you are using AI to produce content for humans (rather than ‘spun’ content for search engines). In any case, neither Google nor OpenAI can properly fingerprint the content, and you should also be sending it through a human editorial review process.
Here at Anicca, we stay up-to-date on all SEO, search and algorithmic developments and apply them to our clients’ campaigns. You can check out our SEO services here. Impressed with our knowledge and coverage of AI in search? Give us a call on 0116 254 7224 – or message us directly.