The era of innocence for AI-generated text is coming to a close. While Large Language Models (LLMs) like ChatGPT have become frighteningly adept at mimicking human syntax, they leave behind subtle yet distinct traces. As recently highlighted by the French newspaper Le Monde, it is not always the content that betrays the machine, but the very structure and punctuation—with the primary culprit being the humble dash.

The Micro-Stylometrics of Large Language Models

To the average user, a dash is just a line. However, in the world of typography and coding, there is a significant difference between the hyphen (-), the en-dash (–), and the em-dash (—). ChatGPT exhibits a remarkable fondness for the em-dash (—) when inserting parenthetical asides or setting off a clause. In everyday, hurried human writing, most users stick to the simple hyphen, which is the only one of the three with its own key on a standard keyboard.

This "typographical perfection" of ChatGPT paradoxically acts as a red flag. The consistency with which the model uses specific Unicode characters reveals its origin. It's not just the dashes; it's the overall "cleanliness" of the text. Artificial intelligence tends to produce sentences with similar length and rhythm, avoiding the idioms or syntactic quirks that give human writing its character. In the French language, where stylistic elegance is a form of cultural capital, these minor deviations are immediately noticed by seasoned readers and detection algorithms alike.
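The three characters differ at the Unicode level, so their frequency is easy to measure. The following is a minimal sketch rather than a real detector: the function name `count_dashes` and the sample sentence are illustrative assumptions, and a high em-dash count on its own proves nothing about authorship.

```python
from collections import Counter

# Code points for the three dash-like characters discussed above.
DASHES = {
    "\u002d": "hyphen-minus",  # the key found on every keyboard
    "\u2013": "en dash",
    "\u2014": "em dash",
}


def count_dashes(text: str) -> dict[str, int]:
    """Count how often each dash-like character occurs in text."""
    counts = Counter(ch for ch in text if ch in DASHES)
    return {name: counts[ch] for ch, name in DASHES.items()}


# Hypothetical sample; escapes are used so the characters are unambiguous.
sample = "A well-known tell \u2014 or so it is claimed \u2014 spans pages 3\u20135."
print(count_dashes(sample))
# {'hyphen-minus': 1, 'en dash': 1, 'em dash': 2}
```

Comparing these counts against an author's earlier writing is more informative than any absolute threshold, since some people do type typographic dashes by habit.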

Why Punctuation Matters: The Em-Dash Mystery

The explanation lies in how these models are trained. ChatGPT was trained on massive datasets including digitized books, academic papers, and high-quality web content. These texts have undergone professional editing, where the typographically correct dash is standard. As the model attempts to predict the most likely next token, it gravitates toward that correct form, which the average person rarely uses in an email or a draft.

Punctuation is only one signal. Readers and detection tools also look at:

  • Burstiness: Human writing mixes short and long sentences. AI text tends toward sentences of monotonously uniform length.
  • Perplexity: A measure of how predictable each word is given the words before it. AI text tends to score low because the model picks statistically expected words, while humans often make less predictable lexical choices.
  • Specific Key Phrases: Expressions like "In the rapidly evolving landscape" or "It is important to note" have become digital fingerprints for AI usage.
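Two of these signals can be approximated in a few lines. The sketch below assumes one simple proxy for burstiness, the standard deviation of sentence lengths divided by their mean; this is an illustrative definition, not a standard one. The sentence splitter is deliberately naive, and the phrase list contains only the two examples quoted above. Perplexity is left out because it requires a language model to score each word.

```python
import re
import statistics

# Only the two phrases quoted in the text; a real list would be longer.
STOCK_PHRASES = (
    "in the rapidly evolving landscape",
    "it is important to note",
)


def sentence_lengths(text: str) -> list[int]:
    """Word count per sentence, splitting naively on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Assumed proxy: population std dev of sentence lengths over their mean."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)


def stock_phrase_hits(text: str) -> dict[str, int]:
    """Case-insensitive count of each stock phrase."""
    lowered = text.lower()
    return {phrase: lowered.count(phrase) for phrase in STOCK_PHRASES}


uniform = "One two three. Four five six. Seven eight nine."
varied = "No. This sentence is quite a lot longer than the first one."
print(burstiness(uniform), burstiness(varied))
print(stock_phrase_hits("It is important to note that styles differ."))
```

A score near zero means the sentences are all roughly the same length; higher values indicate the uneven rhythm the first bullet associates with human writing. Short texts give unreliable scores either way.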

The Economic and Academic Fallout of AI "Tells"

The exposure of AI usage through punctuation has serious implications for education and journalism. In universities, professors no longer rely solely on detection software; they are developing an "instinct" for the AI style. The sudden appearance of a perfectly placed em-dash in an essay by a student who usually writes in fragments is a strong hint of a copy-paste job. In the professional world, using AI without meticulous editing is increasingly seen as a sign of laziness or a lack of authenticity.

"AI is the mirror of average human knowledge, but it lacks the angular nature of individual genius," analysts noted in Le Monde.

This "angular nature" is what content creators are now fighting to preserve. The battle between those wanting to hide AI usage and those seeking to uncover it resembles the arms race between virus creators and antivirus companies. Tools are already emerging that promise to "humanize" AI text by intentionally introducing typos or changing dashes to more "human" forms.
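How fragile the dash signal is becomes clear from how little code it takes to erase it. The sketch below is a hypothetical illustration, not the method of any actual "humanizer" tool: the name `flatten_dashes` and the choice to map both typographic dashes to a plain hyphen are assumptions.

```python
# Map en dash and em dash to the plain keyboard hyphen.
DASH_TO_HYPHEN = str.maketrans({"\u2013": "-", "\u2014": "-"})


def flatten_dashes(text: str) -> str:
    """Replace typographic dashes with hyphens; everything else is unchanged."""
    return text.translate(DASH_TO_HYPHEN)


print(flatten_dashes("The tell \u2014 if it is one \u2014 covers pages 3\u20135."))
# The tell - if it is one - covers pages 3-5.
```

Because a one-line substitution removes the signal, punctuation can only ever be one weak clue among several, which is why the contest keeps moving to subtler features.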

The Paradox of Perfect Prose

As models evolve, they will inevitably learn to mimic our imperfections. OpenAI and Google are aware that their models' styles are recognizable and are working on watermarking techniques. However, the discussion sparked by Le Monde highlights something deeper: our inherent need for true human connection. When we read, we look for the voice of another person—with their obsessions, their passions, and yes, their incorrect punctuation.

In conclusion, the dash is not just a punctuation mark. It is the last bastion of a style of writing that has not yet been fully standardized by algorithms. Recognizing these patterns reminds us that, for now, human thought remains more chaotic, more unpredictable, and ultimately more interesting than any statistical prediction. The machine may be perfect, but it is our mistakes that make us real.