Fake news: do the tools meant to protect us actually mask their flaws?

AI tools designed to combat misinformation have major blind spots, warns an Université de Montréal doctoral student who advocates for fairer, more transparent and citizen-focused solutions.

A dubious link from a friend. A headline too sensational to be true. A video that seems fake but you can’t be sure. As online misinformation grows harder to detect, new artificial-intelligence tools promise to help us separate fact from fiction. But do they actually work?

Not really, according to Dorsaf Sallami. For her doctoral research at Université de Montréal’s Department of Computer Science and Operations Research, she examined the limitations of AI systems designed to detect fake news.

Her conclusion: these tools have significant flaws that their technical performance often masks.

She detailed her findings in a paper published last fall in the proceedings of an international conference on AI, ethics and society, co-authored with her supervisor Esma Aïmeur and Professor Gilles Brassard.

A mirror, not a fact-checker

“Current AI systems for detecting fake news are built on a fundamental misconception,” Sallami said. “When AI flags content as false, it doesn’t fact-check as a journalist would. It calculates probabilities based on its training data.”

In other words, these systems don’t check the facts against reality. They only reflect what they’ve been shown, like a mirror, complete with all the biases and gaps in their training data.
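
To make the mirror analogy concrete, here is a minimal sketch (in Python, using scikit-learn) of the kind of classifier she describes: it learns word patterns from a handful of hand-labeled examples and returns a probability, never checking a single fact against reality. The headlines and labels below are invented for illustration; this is not Sallami’s system.

```python
# Minimal sketch of a probability-based fake-news classifier.
# The headlines and labels are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Scientists confirm vaccine passed all safety trials",        # labeled real
    "Government report details new infrastructure budget",        # labeled real
    "SHOCKING miracle cure doctors don't want you to see",        # labeled fake
    "Secret lab leak covered up by world leaders, insiders say",  # labeled fake
]
train_labels = [0, 0, 1, 1]  # 0 = real, 1 = fake, as decided by the annotators

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

headline = "New study finds coffee cures cancer overnight"
prob_fake = model.predict_proba([headline])[0][1]
# The score only reflects similarity to the training data, like a mirror:
# if the annotators' labels were biased or outdated, so is the prediction.
print(f"P(fake) = {prob_fake:.2f}")
```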

Sallami finds it paradoxical that tech giants are pouring resources into these tools. Meta labels content reviewed by existing fact-checkers, Google has launched a Gemini-based prototype, and X is using Grok to analyze information on its platform in real time.

“The arsenal is impressive, but what good is a system that boasts 95 per cent accuracy in the lab but fails under real-life conditions, especially if it violates users’ privacy, is biased against some media outlets, and can be weaponized to censor political opposition?” Sallami asked.

Effectiveness is typically measured against technical benchmarks under controlled conditions. It’s a bit like judging a car by its top speed, without considering safety, affordability or emissions, she said.

Who decides what’s true?

Sallami points to another critical issue: the lack of consensus over what constitutes misinformation.

“To train a system to distinguish fact from fabrication, you have to feed it thousands of examples labeled true or false,” she explained. “For simple tasks, like telling a cat from a dog, the labels aren’t controversial. But when it comes to fake news, even experts disagree.”

Sallami calls this the “ground truth problem.”

“AI systems are trained using labels provided by fact-checking organizations, but their methods often lack transparency,” she said. “Some are for-profit businesses, making the process even more opaque. The technological edifice is built on foundations that are shakier than they appear.” 

The rise of large language models—the technology behind ChatGPT and Gemini—also helps the creators of fake news mimic credible sources more easily than ever before. As a result, systems trained on misinformation strategies just a few months ago may be unable to detect the latest ruses. 

Built-in bias

The biases embedded in AI fake-news detection systems are another major flaw, according to Sallami.

She found that, when gendered language appears in texts, some models are more likely to consider women to be purveyors of disinformation. Others are prejudiced against non-Western sources or reproduce political and geographic biases.

Sallami considers these biases particularly pernicious because they go largely unnoticed.

“While the industry fixates on improving accuracy, few researchers are examining the discrimination these systems can propagate,” she said. “Equity shouldn’t be an afterthought, secondary to performance; it must be an integral part of performance.”
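
One common way to surface this kind of discrimination is to compare a detector’s error rates across groups, as in the small audit sketched below. The records are invented for illustration, and this is not the specific method proposed in Sallami’s thesis.

```python
# Sketch of a basic bias audit for a fake-news detector, assuming we already
# have its predictions plus simple group metadata. Records are invented.
from collections import defaultdict

# Each record: (actual_label, predicted_label, group), where 1 = fake, 0 = genuine.
records = [
    (0, 1, "feminine-coded text"), (0, 1, "feminine-coded text"), (0, 0, "feminine-coded text"),
    (0, 0, "masculine-coded text"), (0, 0, "masculine-coded text"), (0, 1, "masculine-coded text"),
]

flagged = defaultdict(int)   # genuine items wrongly flagged as fake, per group
genuine = defaultdict(int)   # genuine items seen, per group
for actual, predicted, group in records:
    if actual == 0:
        genuine[group] += 1
        if predicted == 1:
            flagged[group] += 1

for group in genuine:
    rate = flagged[group] / genuine[group]
    print(f"{group}: false-positive rate = {rate:.0%}")
# A persistent gap between groups means genuine content from one group is
# disproportionately mislabeled as disinformation, even if overall accuracy is high.
```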

Her thesis proposes concrete methods for measuring and correcting bias, including CoALFake, a framework she developed that helps a detector trained in one area adapt to new domains—such as scientific or commercial disinformation—rather than starting from scratch.

To address all these issues, Sallami argues for a socially responsible evaluation framework.

“Instead of judging systems solely on accuracy, we must also consider equity, transparency, privacy and real-world usefulness for citizens,” she said.

She also argues for giving user feedback greater weight, collaborating with journalists, social scientists and legal experts, and rejecting the false dichotomy between accuracy and social responsibility.

Aletheia: a new tool

In another paper based on her doctoral dissertation, Sallami noted that research has focused on developing AI detection models, many of which are designed for people with technical expertise.

While these models are necessary, they aren’t enough, she argues: we also need tools that are accessible to end users.

Sallami wasn’t content to simply point out the problem; she set out to solve it by designing Aletheia, a browser extension that lets users check online content themselves.

With a few clicks, users can verify the credibility of a news item, view fact-checks from trusted organizations and discuss it with other users.

According to Sallami, what makes Aletheia different is its philosophy: instead of just labeling content “true” or “false,” it explains why, presents evidence from available online sources, and lets users judge for themselves rather than blindly trusting the underlying model.

“The extension has three modules,” Sallami explained. “VerifyIt, the core of the system, automatically consults external sources and delivers a verdict accompanied by plain-language explanations. Users can see the reasons why an item may be suspect and the sources on which the system is based.”
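
In practice, the output of such a module might look something like the structure sketched below: a verdict accompanied by plain-language reasons and the sources consulted. The field names and values are illustrative assumptions, not Aletheia’s actual code.

```python
# Illustrative shape of a verdict-with-evidence response, based on the
# description above. Fields and values are assumptions, not Aletheia's API.
from dataclasses import dataclass, field

@dataclass
class Verdict:
    claim: str
    label: str                   # e.g. "likely false"
    confidence: float            # model probability, not a guarantee of truth
    reasons: list[str] = field(default_factory=list)   # plain-language explanations
    sources: list[str] = field(default_factory=list)   # URLs the system consulted

example = Verdict(
    claim="New study finds coffee cures cancer overnight",
    label="likely false",
    confidence=0.91,
    reasons=[
        "No peer-reviewed study found matching the claim",
        "Contradicted by two fact-check articles",
    ],
    sources=["https://example.org/fact-check-1", "https://example.org/fact-check-2"],
)
print(example.label, "-", "; ".join(example.reasons))
```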

In tests using claims verified by PolitiFact, an American non-profit operated by the Poynter Institute, VerifyIt achieved about 85 per cent reliability, outperforming many existing tools.

Aletheia also offers a live feed of recent fact checks and a forum where users can share their analyses and comment on those contributed by others.

“What we have presented here is only the tip of the iceberg,” Sallami concluded. “AI must earn public trust, not just ace technical tests. Future efforts should resist the lure of fully automated fact-checking and instead develop systems that work with and for human judgment.”

