Comment
To see where this dumptruck is heading, let’s first follow the trail of debris.
It’s difficult to track back, but the impacts of internet content mills, which thrived until around 2010, are still readily visible.
The net effect of content generated at the rate of ten to thirty pieces daily on specialty topics, all at the hands of non-specialists via the guiding hand of Google Trends, was an internet that, by 2010, was saturated with fluffy (if not nonsensical), keyword-stuffed articles that offered little in the way of usable info and, in many cases, plenty of advice and information that was simply incorrect.
And because content mills naturally begat more content mills (and why not, when worst-offender Associated Content sold to Yahoo! for $100 million), what happened next was inevitable. Those new content companies simply parroted what they found on bigger content mills, using the internet of the time as a training set, so to speak. The cycle of bad articles with little detail or, worse, inaccurate detail was repeated again and again until it became difficult to distinguish one article from the next unless it was found on one of a handful of edited, reputable sites.
The name of the game for these early content companies was sheer volume. Ad network (Google Adsense, etc.) revenues were already falling by 2005, but with thousands, if not millions, of articles, each generating perhaps three cents per day, the money wasn’t bad. For a content mill with 200,000 articles, that was a tidy business of roughly $2 million a year (200,000 articles at three cents per day for 365 days) with ultra-low overhead. Hosting wasn’t hugely expensive, web design was easy with open source CMS tools like WordPress, Drupal, and others, and, most important (and ultimately most disastrous), the bulk content could be bought for mere cents per article from offshore shops.
This model meant the internet was quickly flooded with badly written nonsense, much of which is still searchable in the original form or even more badly rehashed. Google had to start stepping up its game to filter around this and learn how to deliver quality in content versus the magical keyword blend that content mills could exploit.
The problems with content mills are clear, especially all these years later, but it was all at a human scale with the limitations of “slow” writers and keyword-stuffers. The future presents us with a new matter—one that could shatter how we use the internet for good.
Let’s do a little math
Pretend it’s 2006 and you’re in the content mill business. You’re at the top of your game. You have a team of 100 writers in India making the equivalent of $10 per day to write and post twenty 400-word pieces (topics dictated by Google keyword trend data versus expertise, etc.).
Your daily costs for salary are around $1,000. Every day, 365 days per year, your content mill posts 2,000 pieces of “unique” content, and each of those articles, assuming good search engine ranking (which could be easily gamed with keyword tricks back then), will generate three cents per day.
And while we’re using nice, rounded numbers for ease, consider these annual numbers (annual because you only need to run this business for a year; the Adsense money comes in no matter what, at least for a while):
Salary for writers who create, post, and tag 20 articles per day costs you $365,000 per year. They generate 730,000 pieces of content valued at $10.95 per piece over the course of the year (assuming three cents per day for 365 days). And all of that, which is pretty hands-off for you, Western Content Lord, means you have a business generating around $8 million per year.
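For anyone who wants to check the sums, here is a minimal sketch of that arithmetic in Python. The figures are the column's own rounded assumptions (100 writers, 20 articles each per day, $10 per day in wages, three cents per article per day); the function and variable names are purely illustrative. Money is tracked in whole cents to keep the arithmetic exact.

```python
# Back-of-the-envelope model of the hypothetical 2006 content mill.
# All inputs are the column's rounded assumptions, not measured data.

CENTS_PER_ARTICLE_PER_DAY = 3   # assumed ad revenue per article per day
DAYS_PER_YEAR = 365


def annual_ad_revenue_cents(articles: int,
                            cents_per_day: int = CENTS_PER_ARTICLE_PER_DAY,
                            days: int = DAYS_PER_YEAR) -> int:
    """Revenue in cents if every article earns for the full period."""
    return articles * cents_per_day * days


writers = 100
articles_per_writer_per_day = 20
wage_dollars_per_day = 10

articles_per_year = writers * articles_per_writer_per_day * DAYS_PER_YEAR
salary_dollars = writers * wage_dollars_per_day * DAYS_PER_YEAR
revenue_dollars = annual_ad_revenue_cents(articles_per_year) / 100

print(f"Articles per year: {articles_per_year:,}")
print(f"Writer salaries:   ${salary_dollars:,}")
print(f"Ad revenue:        ${revenue_dollars:,.2f}")
```

Like the column, this credits every article with a full 365 days of earnings. The same function reproduces the earlier 200,000-article example: `annual_ad_revenue_cents(200_000)` comes to 219,000,000 cents, or about $2.19 million.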
Oh. But you have to subtract hosting and such. Let’s call that five grand. The big ugly cost? All those “expensive” writers. And you think to yourself, who needs ’em?
Well, you don’t.
Because oh boy is there a new business model for content mills. And while their early 2000s predecessors made the internet annoying and full of junk articles that hit the keyword and word count targets without saying anything at all, this one is disruptive enough to turn the internet into complete trash. And not just trash from a content perspective, but from a whole how-the-business-of-the-internet-works one too.
Putting the S in IoS
This new business model is already unfurling. You’ve likely read plenty of articles that were generated by GPT or similar AI models. The reason you probably didn’t notice is that they aren’t bad. Well, you think they’re not bad, but that’s because you’ve been weaned on the Internet of Shit (IoS) the content mills brought about, which trained us to lower our expectations when it came to information consumption.
The problem is that these AI-generated articles have to get their information from somewhere, and in enough volume to churn out new info clones cloaked in slightly more eloquent language. And where do AI training algorithms get all of this? From the IoS, of course.
If we do more math, let’s assume that 10 percent of IoS-derived training data contains factual errors. As AI trains, then retrains, and retrains, those errors mount. And mount. And multiply. Within a decade of retraining on bad, weird, oddly worded, and increasingly incomprehensible data, we are left with a true IoS.
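To make the "errors mount" idea concrete, here is a toy sketch in Python. Only the 10 percent starting figure comes from the text above. Everything else is an assumption made for illustration: the function name `error_share`, the idea that each retraining cycle corrupts a fixed fraction of whatever is still correct, and the 10 percent `corruption_rate` used in the loop. It is a thought experiment, not a model of how any real AI system degrades.

```python
def error_share(initial_error: float, corruption_rate: float, cycles: int) -> float:
    """Share of the corpus that is wrong after a number of retraining cycles.

    Hypothetical rule: in each cycle, a fixed fraction (corruption_rate)
    of the still-correct content picks up an error, and nothing is ever
    corrected.
    """
    share = initial_error
    for _ in range(cycles):
        share = share + (1.0 - share) * corruption_rate
    return share


# Start from the column's 10 percent; the 10 percent per-cycle
# corruption rate is an invented, illustrative number.
for cycle in range(0, 11, 2):
    print(cycle, round(error_share(0.10, 0.10, cycle), 3))
```

Under this rule the correct share shrinks by the same factor every cycle, so it decays geometrically. How fast depends entirely on the assumed corruption rate, which nobody actually knows.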
And math is super important again—and so is volume.
A single content mill operator at the scale of Western Content Lord, for instance, can use free tools to generate content as fast as human operators can plug it in with a simple prompting sentence. That same team of 100 workers can enter 300 pieces per day.
They don’t write it, they just ask ChatGPT. They can ask it to keyword stuff it like a mofo and generate keywords too, for that matter. Eventually, ChatGPT (as one of many examples) will have API hooks to publish output directly to WordPress or whatever CMS Content Lord chooses.
When that AI-to-CMS platform unification is complete, so is the circle: the Internet is just talking to itself.
The race to the bottom
What Western Content Lord and competitors don’t realize is how fast that race to the bottom will commence—and soon.
Google Adsense and every other ad network on the planet will recognize the flood and reduce what they pay for a click or view to almost nothing. And then it will be nothing, but not before Google and its ilk scramble to blacklist known AI content mills. But there will be too many of them popping up too quickly. It will just be easier for a Google, for example, to create a safelist of known publishers backed by plodding humans.
Great, you think, balance has been restored! Not so much.
Keeping up with all the innovations in search that push those IoS results to the bottom for you will cost Google money: AI training at billion-dollar scales and considerable, frequent retraining on the corpus of the internet. That corpus will be infected fast and furiously. And how do the search giants pay for all that search innovation? Via ad revenues.
The search advertising giants like Google might appear to hold their noses and accept content mill results into the queue because it’s in their economic interest to do so. But what if the pool of “acceptable” content shrinks by 95 percent?
The exponential rate of internet shittification
We go back again to the theme of math and volume and such to address the most important point: the information danger is an exponential problem. One series of mistakes generated, then repeated by content mills for a decade, means those problems get trained into the core AI language model from the corpus of the internet and reinforced.
It’s one thing to live in an era of fake news, in part because for most thinking people, it’s obviously fake. But when the internet repeats a mistake often enough, it becomes truth, and that is the most insidious accidental outcome of all of this.
It would make me, personally, feel better to end this piece with some kind of “fight the power” message but honestly, at this point, the cat’s out of the bag. The content mills can be satisfied with per-article revenues that are measured on a value-over-five-years plan and might only amount to .05 cents over the term. But who cares, right? It’s free money. Hosting is cheap, a CMS is free, and as long as there is ad money, it’s worth the passive income effort.
This is the internet you deserve, apparently. ®