How to detect AI-generated text

Contents

How to detect AI-generated text
AI text detectors: Why they’re not reliable
Testing AI text detectors’ accuracy
Why is detecting AI-generated text so difficult?

As AI-generated content gets more ubiquitous in our everyday lives, you may be wondering, “How do I identify AI text?”

It’s no surprise that AI-generated content gets more difficult to detect as the technology evolves. For now, the good news is that content such as images and video isn’t that hard to parse with the human eye.

If you’re a teacher or just a seasoned internet traveler, what’s the secret to spotting AI-generated text? Well, it’s simpler than you might think: use your eyes. There are actually ways to train the human eye to discern AI statements. Experts like MIT Technology Review’s Melissa Heikkilä write that the “magic” of these machines “lies in the illusion of correctness.”

No two people write in the same way, but there are common patterns. If you’ve ever worked a corporate job, you know how everyone uses the same generic phrasing when drafting memos to their boss. That’s why AI text detectors often flag content as “likely AI-generated” — because distinguishing between a bland human writing style and a generic AI-generated voice is nearly impossible.

So here are some tips and tricks to spot potential AI-generated text:

• Look for frequent use of words like “the,” “it,” and “its.”

• Absence of typos—AI text is often too perfect.

• Conclusionary statements that neatly sum up paragraphs.

• Overly verbose or padded writing.

• False or fabricated information and sources.

• A tone more advanced than the writer’s usual submissions.

• Repetitive phrasing or oddly polished grammar.
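Two of the tips above, the frequent use of words like “the,” “it,” and “its” and repetitive phrasing, are easy to count by hand or in a few lines of code. The sketch below is only an illustration under loose assumptions: the word list comes straight from the first tip, the function names and the sample text are made up for this example, and nothing here reflects how any commercial detector actually works. A high count is a hint to read more closely, not a verdict.

```python
import re
from collections import Counter

# Hypothetical word list taken from the first tip above; not a validated feature set.
FUNCTION_WORDS = {"the", "it", "its"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def function_word_rate(text):
    """Share of tokens that appear in FUNCTION_WORDS (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(t in FUNCTION_WORDS for t in tokens) / len(tokens)

def repeated_phrases(text, n=3):
    """Return every n-word phrase that occurs more than once, with its count."""
    tokens = tokenize(text)
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {" ".join(g): c for g, c in grams.items() if c > 1}

# Made-up sample text, used only to exercise the two functions.
sample = (
    "It is important to note that the model is fast. "
    "It is important to note that the model is cheap."
)
print(function_word_rate(sample))
print(repeated_phrases(sample))
```

Plenty of human writers repeat themselves too, so counts like these are best used to decide which passages deserve a second read.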

There are also AI text detectors on the market that you can use, but here’s why, in my experience, they’re likely less reliable than your own eyes.

AI text detectors: Why they’re not reliable

It’s not all doom and gloom, as some solutions to our machine overlords exist. The launch of models like ChatGPT, along with competitors like Gemini and Claude, spurred the growth of a cottage industry focused on AI text detection. Platforms like ZeroGPT popped up in response to OpenAI’s language model, while tools such as Grammarly and Copyleaks, originally designed to catch plagiarism, have pivoted to tackle AI-generated content as well.

Depending on who you ask, AI text detection is, at the moment, either the best way to spot AI-generated content or digital snake oil. In reality, the latter might be closer to the truth. No AI detector is 100% accurate (or even 99%, as many claim). Even in ideal conditions, the reliability of these tools is often hit-or-miss.

“The problem here is the models are becoming more and more fluent, [as a result], the older detectors, they stop working,” says Junfeng Yang, a professor and co-director of the Software Systems Lab at Columbia University. He explains that as AI-generated text becomes increasingly sophisticated, it “starts to use vocabulary and sentence structures that closely mimic human writing, making it harder to spot even with advanced detectors.”

Testing AI text detectors’ accuracy

Despite big promises from tools like GPTZero or Hive Moderation, tricking an AI detector into labeling human-written content as machine-made is surprisingly simple. These systems typically analyze lines of text that explain or summarize ideas, which makes them vulnerable to false positives. For instance, I tested a basic summary of Game of Thrones I had hastily written from memory across several of these tools, and in almost every case, it was flagged as “likely AI-generated.”

I then took that same Game of Thrones summary, fed it into ChatGPT, and asked it to punch up the writing — making it longer and less straightforward.

Here’s what I wrote:

The plot of Game of Thrones follows several families in the fictional continent of Westeros. First, there are the Starks who rule the North. Second are the Lannisters, who rule in the Westerlands. The eldest daughter of the Wastelands is Cersei, who is also the Queen of Westeros and is married to Robert Baratheon. Robert is from the Stormlands and usurped the throne of the Targarayns — a group of dragon riders who took over the land over 1000 years ago but slowly lost their grip on Westeros due to infighting. The series was created by George RR Martin and is currently chronicled in five books — A Game of Thrones, A Clash of Kings, A Storm of Swords, A Feast for Crows, and a Dance with Dragons. There are two more books planned for the series with the Winds of Winter set to release sometime in 2025.

Here’s the “improved” ChatGPT version:

Game of Thrones throws viewers into the brutal, blood-soaked chessboard of Westeros, where power is the prize and morality is negotiable. Leading the pack are the Starks, who hold dominion over the icy North. Their stoic honor feels like a relic in a land where backstabbing is a form of currency. In sharp contrast are the Lannisters of the opulent Westerlands, the family everyone loves to hate — with Cersei Lannister perched atop the throne as Queen of Westeros. Her marriage to Robert Baratheon, the hard-drinking, boar-hunting king from the Stormlands, is a political match made in hell, bound together by lies and mutual loathing. But the throne itself carries a bloody history. Robert took it after a rebellion that ousted the Targaryens, the long-ruling dynasty of dragon riders who, once unstoppable, eventually imploded in a whirlwind of paranoia and civil war. Their fire-breathing beasts may have once scorched the skies, but family feuds brought their empire crashing down, leaving only whispers of their former glory. Meanwhile, George R. R. Martin’s sprawling saga continues in five books that are equal parts battlefield carnage, political intrigue, and heartbreaking betrayals. While fans devour every page, they remain on edge, waiting for the Winds of Winter to blow, teased for what feels like eons, now with a tentative promise of landing in 2025. But in Westeros, promises — like allegiances — are made to be broken.

My testing produced mixed results. One of the first tools I tested was Scribbr, which pegged my original summary as “100% likely AI-generated.” But when I swapped it for the AI-enhanced version, Scribbr surprisingly dropped the likelihood to just 48%.

The original summary nabbed a “100% likely AI-generated” rating.
Credit: Scribbr / Mashable screenshot

The ChatGPT summary garnered only a 48% likelihood of being AI-generated.
Credit: Scribbr / Mashable screenshot

Another tool I tried out was Winston AI, which flagged the original text as 39% likely to be generated by GPT-4. Ironically, when it came to the actual GPT-4 summary, it only gave it a meager 1% likelihood.

Original summary in Winston AI
Credit: Winston AI / Mashable screenshot

AI-edited summary in Winston AI
Credit: Winston AI / Mashable screenshot

Hive Moderation totally missed the mark when analyzing my work, failing to flag either of the summaries I submitted. The system confidently labeled both as human-written content.

Credit: Hive Moderation / Mashable screenshot

Now, if I just ask ChatGPT for a random paragraph on any topic and copy-paste that into various text detectors, it’ll almost always get flagged as AI-generated right away. But that actually reinforces the issue: without specific instructions, ChatGPT’s default writing style is often bland, formulaic, and straightforwardly objective.

The predictably dull tone is what triggers these false positives, not some advanced in-house tech that these websites claim can distinguish AI content from human writing. Even when tools like Originality.ai correctly flagged both instances of AI writing, a bit of sentence tweaking could completely change the outcome. With just a little rephrasing, text previously flagged with “100% confidence” as AI-generated can suddenly be labeled “Likely original.”

All that to say, here’s the list of freely available AI text detection tools I tested using the above method. To mix things up, I also used some literature reviews from academic papers I wrote in grad school to see if they’d flag me for using flowery writing to beef up my word counts. Here they are:

GPTZero
ZeroGPT
Hive Moderation
Scribbr
Copyleaks
Originality.ai
Grammarly
GPT-2 Output Detector
Writefull X
Winston AI

If your writing sounds like a tonally flat 8th-grade book report, AI detectors will likely peg you as a bot in need of a Turing test ASAP. This testing shows that simply avoiding certain structural patterns can easily fool AI detectors. And that’s a major headache for the companies behind these tools, especially since many offer subscription services and aim to sell their APIs to schools and businesses as a B2B solution.

While these tools can be pretty effective for plagiarism detection, it’s obvious their ability to spot AI-generated text still needs serious refinement. The inconsistency is hard to overlook — submit the same text to multiple detectors, and you’ll get wildly different results. What gets flagged as AI-generated by one tool might slip through unnoticed by another. Given that lack of reliability, it’s tough to recommend any of these tools with confidence right now.
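To make that inconsistency concrete, here is a small sketch that tabulates the numeric scores reported earlier in this piece for the same two texts. The 50% cutoff for calling something “flagged” is an assumption made for illustration, as are the function and key names; the detectors themselves may draw that line elsewhere. Hive Moderation is left out because only its label (human-written), not a number, was reported.

```python
# Scores reported in this article, read as "percent likely AI-generated".
# Hive Moderation is omitted: only its label was reported, not a number.
reported = {
    "Scribbr": {"human_original": 100, "chatgpt_rewrite": 48},
    "Winston AI": {"human_original": 39, "chatgpt_rewrite": 1},
}

def is_flagged(score, threshold=50):
    """Treat a score at or above the threshold as an 'AI' verdict.

    The 50% threshold is an assumption for this sketch, not a vendor setting.
    """
    return score >= threshold

def summarize(scores):
    """Describe, per tool, how its two scores compare with the known authorship."""
    rows = []
    for tool, s in scores.items():
        rows.append({
            "tool": tool,
            # Human-written text that was flagged as AI.
            "false_positive": is_flagged(s["human_original"]),
            # ChatGPT-rewritten text that was not flagged.
            "missed_ai": not is_flagged(s["chatgpt_rewrite"]),
            # The human text scored as more AI-like than the AI text.
            "ranked_backwards": s["human_original"] > s["chatgpt_rewrite"],
        })
    return rows

for row in summarize(reported):
    print(row)
```

Under that assumed cutoff, both tools rank the human-written summary as more AI-like than the ChatGPT rewrite, which is the opposite of what a reliable detector should do.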

Why is detecting AI-generated text so difficult?

Human language is incredibly fickle and complex — one of the main reasons AI-generated text is so tricky to detect.

Bamshad Mobasher, an IEEE member and chair of the AI program at DePaul University, elaborates that “text is what these models are trained on. So, it’s easier for them to mimic human conversations.”

“Detection tools look for patterns — repetitive phrases, grammatical structures that are too regular, things like that,” Mobasher said. “Sometimes, it’s easier for a human to spot, like when the text is ‘too perfect,’ but to be certain it’s AI-generated is challenging.”
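One way to picture “too regular” is to measure how much sentence length varies within a passage. The sketch below is a toy illustration of that single idea, not a description of what any detector does: the naive sentence splitter, the function names, and both sample passages are assumptions made up for this example.

```python
import re
import statistics

def sentence_lengths(text):
    """Split on ., !, or ? and return the word count of each non-empty sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def length_variation(text):
    """Coefficient of variation of sentence length; lower means more uniform."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)

# Made-up passages: one with identical sentence lengths, one with a wide spread.
uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The dog ran off across the wet field before anyone could react. Why?"

print(length_variation(uniform))
print(length_variation(varied))
```

A low score only says the sentences are similar in length. As the corporate-memo example earlier suggests, plenty of human writing is just as uniform, which is part of why certainty is so hard to come by.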

Unlike image generators, which can produce telltale signs like extra fingers or distorted facial features, Mobasher explained LLMs rely on statistical probabilities to generate text — making their output feel more seamless. As a result, spotting errors in AI-generated text — like nuanced phrasing or subtle grammatical irregularities — is far more challenging for both detectors and human readers.

This is what makes AI-generated text so dangerous as well. Mobasher warns that “it becomes easier to produce and generate misinformation at scale.” With LLMs generating fluent, polished text that can mimic authoritative voices, it becomes much harder for the average person to discern between fact and fiction.

“With AI, it’s actually much easier to launch these attacks,” says Yang. “You can make the email very fluent, conveying the message you want, and even include personalized information about the target’s role or mission at a company.”

On top of its potential misuse, AI-generated text makes for a shittier internet. LLMs from companies like OpenAI and Anthropic scrape publicly available data to train their models. Then, the AI-generated articles that result from this process are published online, only to be scraped again in an endless loop.

This cycle of recycling content lowers the overall quality of information on the web, creating a feedback loop of increasingly generic, regurgitated material that makes it difficult to find authentic, well-written content.

There’s not much we can do about the lightning-fast acceleration of AI and its detrimental effects on internet content, but you can, at the very least, tap into your media literacy to help you discern what’s human-made and what’s generated by a bot.

“If you see an article or report, don’t just blindly believe it — look for corroborating sources, especially if something seems off,” Yang says.
