I wish generative AI (genAI) tools were truly useful. They’re not. I keep tinkering with the programs — ChatGPT, Meta AI, Gemini, etc., etc. Mind you, they look useful if you don’t know any better. Their answers sound plausible. But if you look closer, even if you forgive them for their hallucinations — that is, lies — you’ll see all too often that the answers they give are wrong.
If you’re operating at, say, a high-school-grade report level, genAI answers are fine. (Sorry, teachers.) But if you’re digging deep into a subject, which is where I live, it’s another story.
I know more than the average large language model (LLM) about subjects such as Linux and open-source software. What genAI can tell you about those subjects might sound right, but the deeper you dive into the details, the poorer the information.
Indeed, all too often, the end result is annoyingly, obnoxiously wrong. Worse still, it’s erratically wrong. If I could count on its answers being mediocre but reasonably accurate, I could work around it. I can’t.
This isn’t just my take. As BusinessWeek reported, many serious GPT-4 users are finding it increasingly “lazier” and “dumber.” Others agree. As one person put it on Reddit recently, “ChatGPT 3 and 4 today are much less effective and helpful than when I signed up for pro 1 year ago.” That sounds about right.
In my book, even the best genAI chatbot, Perplexity, seems to be going downhill. One of the things I liked about Perplexity was that it would give me sources for its observations. I found it useful when I could think of it as a turbo-powered search engine. In the last few days, though, it’s been getting flaky, and I can no longer count on it pointing out where it found the information it repackaged for me.
Why is this happening? I’m not an AI developer, but I pay close attention to the field, and I see at least two major reasons why these tools are beginning to fail.
The first is that the quality of the content used to train the major LLMs has never been that good. Many include material from such “quality” websites as Twitter, Reddit, and 4chan.
As Google’s AI Overviews showed earlier this year, the results can be dreadful. As MIT Technology Review noted, it came up with such poor-quality answers as “users [should] add glue to pizza or eat at least one small rock a day, and that former US president Andrew Johnson earned university degrees between 1947 and 2012, despite dying in 1875.”
Unless you glue rocks into your pizza, those are silly, harmless examples, but if you need the right answer, it’s another matter entirely. Take, for example, the lawyer whose legal filings cited cases an AI had made up. The judges were not amused.
If you want to use genAI tools for sex chat, which appears to be one of the most popular uses for ChatGPT, accuracy probably doesn’t matter much to you. Getting the right answers, though, is what matters to me, and it should matter to anyone who wants to use AI for business.
As the genAI giants continue searching for more data, this problem will only worsen. An Epoch AI study found that we could run out of high-quality data as early as 2026.
That leads to the second problem. Today, more than ever, genAI-produced content is replacing expert human content. The result isn’t just that the bad data pushes out the good; it’s more insidious than that.
A recent Nature paper found that “indiscriminately learning from data produced by other models causes ‘model collapse’ — a degenerative process where, over time, models forget the true underlying data distribution, even in the absence of a shift in the distribution over time.” I just like to call this Garbage In, Garbage Out (GIGO).
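To see why training on machine-made data degrades things, here’s a minimal, hypothetical sketch — not the Nature paper’s actual experiment, and the sample size and generation count are arbitrary illustrative parameters. It fits a toy Gaussian “model” to data, samples fresh “training data” from that model, refits, and repeats:

```python
# Toy "model collapse" demo: repeatedly fit a one-dimensional Gaussian to data
# sampled from the previous generation's fitted Gaussian. Each generation trains
# only on machine-made data, so estimation errors compound and the learned
# distribution drifts away from the original human data.
import numpy as np

rng = np.random.default_rng(0)

n_samples = 100       # illustrative: size of each generation's training set
n_generations = 100   # illustrative: rounds of training on model output

# Generation 0 trains on "human" data drawn from the true distribution N(0, 1).
data = rng.normal(loc=0.0, scale=1.0, size=n_samples)

for gen in range(n_generations + 1):
    mu, sigma = data.mean(), data.std()   # "train" the model (fit a Gaussian)
    if gen % 20 == 0:
        print(f"generation {gen:3d}: mean={mu:+.3f}  std={sigma:.3f}")
    # The next generation never sees the original data -- only model output.
    data = rng.normal(loc=mu, scale=sigma, size=n_samples)
```

Run it, and the fitted standard deviation tends to shrink and the mean to wander as the generations pile up: the toy model slowly forgets what the original data looked like, which is the same forgetting the paper describes in far more rigorous terms.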
Since I see no chance that companies will stop using genAI to produce documents for a fraction of the cost of real subject matter experts and experienced writers, quality will only continue downhill.
We might, believe it or not, already be at peak AI as far as quality goes. Isn’t that a scary thought?
Now, I am not an AI Luddite. Special-purpose genAI chatbots based on LLMs dedicated to a particular topic, such as troubleshooting a program or spotting a potential cancer, are already very useful. Other uses, such as AI-driven non-player characters in online games, should improve gameplay, and AI companions for lonely people, such as ElliQ, could help many folks.
But as a way to replace knowledge workers, which all too many companies seem intent upon, forget about it. Except for CEOs — they could be replaced by AI chatbots. I doubt many people would notice much of a difference — except it would save companies a ton of money.