AI Hallucinations: Detection, Mitigation and EU AI Act Compliance
In February 2024, a Canadian tribunal ruled that Air Canada was liable for incorrect information its chatbot had given a passenger about bereavement fares. The chatbot had confidently described a refund policy that didn't exist. Air Canada's defence — that the chatbot was "a separate legal entity responsible for its own actions" — was rejected. The company paid the claim and the legal costs.
A year earlier, a US attorney was sanctioned and fined after submitting a legal brief to a federal court citing six cases that ChatGPT had fabricated. The citations looked real. The formatting was correct. The cases didn't exist.
These aren't edge cases. They're early examples of a pattern that will become more common as AI moves from experimentation into operational workflows. The question for any business deploying LLMs isn't whether hallucinations will occur — they will — but whether you have the architecture to catch them before they cause damage.
What hallucinations actually are — and why they're not going away
The term "hallucination" is slightly misleading because it implies the model is malfunctioning. It isn't. An LLM generates text by predicting the most probable next token given its training data and the current context. When asked about something outside its training distribution, or asked to recall a specific fact accurately, it continues generating plausible-sounding text — because that's what it's designed to do. There's no internal truth-checking mechanism. There's no flag that says "I'm not sure about this."
This produces several distinct failure modes worth distinguishing:
Factual errors: The model states something incorrect with full confidence. Dates, numbers, names, technical specifications — all can be wrong and presented without qualification.
Fabricated citations: When asked to find sources, the model generates citations in the correct bibliographic format for papers, cases, or reports that don't exist. The formatting is indistinguishable from real citations.
Knowledge cutoff drift: The model's training data has a cutoff date. It may present outdated information as current — particularly relevant for regulatory content, pricing, product specifications, and legal requirements that change frequently.
Sycophantic confirmation: When a user's prompt contains an incorrect assumption, many LLMs will confirm it rather than correct it. This is especially dangerous in decision-support contexts where a user's prior belief might be wrong.
None of these failure modes will be eliminated by larger models or better training. They are reduced, but not resolved. GPT-4 hallucinates less than GPT-3.5. Claude hallucinates less than earlier Anthropic models. But no current LLM has a hallucination rate of zero, and the rate varies significantly by task type — factual recall is much riskier than summarisation of a provided document.
The EU AI Act dimension
The EU AI Act's obligations for high-risk AI systems apply from August 2026. For businesses deploying AI in the categories listed in Annex III — which includes systems used in employment decisions, essential private and public services, education and vocational training, access to credit, and administration of justice — the Act imposes specific technical requirements that directly address the hallucination problem.
Article 9 requires a risk management system that "identifies and analyses the known and reasonably foreseeable risks" associated with the AI system and implements "appropriate risk management measures." An LLM producing factually incorrect output in a high-risk context is an identified, foreseeable risk. Not having a mitigation architecture is a compliance gap, not just a technical one.
Article 14 requires that high-risk AI systems be designed and developed such that natural persons can "effectively oversee" the system during its use. This creates a concrete requirement for human review mechanisms on high-stakes outputs — not just a general disclaimer that "AI can make mistakes."
Article 50 creates transparency obligations for AI-generated content more broadly: users must be informed when content is AI-generated in certain contexts. This interacts with the hallucination problem because transparency about AI origin is different from transparency about AI accuracy.
For businesses not in Annex III high-risk categories, the obligations are lighter — but the liability exposure from incorrect AI output remains a real risk under existing civil law, as the Air Canada case illustrates.
The Italian context: D.Lgs. 231/2001 and emerging AI liability
Italian corporate liability law adds another layer. D.Lgs. 231/2001 establishes that legal entities can be held criminally and administratively liable for certain offences committed by their personnel in the company's interest. As AI systems increasingly make or influence decisions that touch on the offence categories covered by 231 — fraud, market manipulation, environmental offences, data protection violations — the question of whether inadequate AI governance constitutes a failure of the organisational models required by 231 is being actively debated by Italian legal scholars.
The Garante's enforcement track record on AI (ChatGPT, Replika) shows a regulator willing to act on inadequate safeguards. Combining AI Act obligations with 231 exposure creates a compliance landscape where "we used an AI tool and it got it wrong" is unlikely to be a sufficient defence for a company that had no hallucination mitigation architecture in place.
Mitigation architectures that actually work
There is no single solution that eliminates hallucinations. What works is a layered approach, with the appropriate layers selected based on the risk profile of the specific use case.
Retrieval-Augmented Generation (RAG) with source attribution is the most effective intervention for factual accuracy. Instead of relying on the model's parametric memory, every response is grounded in documents retrieved from a curated knowledge base. The model is instructed to answer only from the retrieved context and to cite the source. When the answer isn't in the retrieved documents, the model should say so rather than infer. This doesn't eliminate hallucinations entirely — models can still misread retrieved content — but it dramatically reduces the scope of what the model needs to "know" from training data.
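As a rough illustration of that grounding pattern, the sketch below wires a hypothetical `retrieve` function and a generic `call_llm` client into a prompt that forbids answers outside the retrieved passages and forces an explicit refusal token when nothing matches. The names are placeholders, not any specific library's API.

```python
# Sketch of a grounded Q&A call: the model may only answer from retrieved
# passages and must cite them. `retrieve` and `call_llm` are placeholders
# for whatever vector store and LLM client are actually in use.

GROUNDED_PROMPT = """Answer the question using ONLY the numbered context passages below.
Cite the passage number(s) you relied on, e.g. [2].
If the answer is not contained in the passages, reply exactly: NOT_IN_SOURCES.

Context:
{context}

Question: {question}"""


def answer_with_sources(question: str, retrieve, call_llm, top_k: int = 5) -> dict:
    passages = retrieve(question, top_k=top_k)  # expected shape: [(doc_id, text), ...]
    context = "\n".join(f"[{i + 1}] {text}" for i, (_, text) in enumerate(passages))
    reply = call_llm(GROUNDED_PROMPT.format(context=context, question=question))

    if "NOT_IN_SOURCES" in reply:
        # Refusing is the desired behaviour when the knowledge base has no answer.
        return {"answer": None, "sources": [], "status": "not_in_knowledge_base"}

    cited = [passages[i][0] for i in range(len(passages)) if f"[{i + 1}]" in reply]
    return {"answer": reply, "sources": cited, "status": "ok"}
```

The refusal path matters as much as the happy path: a response with no citable source should surface as "no answer found", not as a fluent guess.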
Structured output with schema validation reduces hallucination risk in extraction and classification tasks. If you define a strict JSON schema for the model's output and validate every response against it, the model has less freedom to generate plausible-but-wrong content. Combined with confidence thresholds, you can route low-confidence outputs to human review rather than passing them downstream.
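A minimal sketch of that pattern, assuming pydantic v2 for validation and the same generic `call_llm` placeholder; the self-reported confidence field stands in for whatever confidence signal your pipeline actually trusts, and the threshold is illustrative.

```python
# Sketch of schema-validated extraction with a confidence floor. Anything that
# fails validation or falls below the threshold is routed to human review
# instead of flowing downstream. Assumes pydantic v2; `call_llm` is a placeholder.
from pydantic import BaseModel, Field, ValidationError

class InvoiceExtraction(BaseModel):
    supplier_name: str
    invoice_number: str
    total_amount_eur: float = Field(ge=0)
    confidence: float = Field(ge=0.0, le=1.0)  # whatever confidence signal you trust

CONFIDENCE_FLOOR = 0.8  # illustrative threshold, tune per task type

def extract_invoice(document_text: str, call_llm) -> dict:
    raw = call_llm(
        "Extract supplier_name, invoice_number, total_amount_eur and a confidence "
        "score between 0 and 1 from the invoice below. Respond with JSON only.\n\n"
        + document_text
    )
    try:
        parsed = InvoiceExtraction.model_validate_json(raw)
    except ValidationError as exc:
        # Output that does not match the schema never reaches downstream systems.
        return {"route": "human_review", "reason": "schema_error", "detail": str(exc)}

    if parsed.confidence < CONFIDENCE_FLOOR:
        return {"route": "human_review", "reason": "low_confidence", "data": parsed.model_dump()}
    return {"route": "automated", "data": parsed.model_dump()}
```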
Secondary model verification uses a second LLM (which can be smaller and cheaper) to evaluate specific claims in the primary model's output. This is sometimes called an "LLM-as-judge" architecture. It's effective for catching certain error types but has a well-documented limitation: the judge model can have the same wrong prior as the generating model, so correlated errors can pass through. It works better as a consistency check than as a ground-truth verifier.
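The sketch below shows one way to frame the judge call; `call_judge_llm` is a placeholder for a second (typically cheaper) model, and a pass should be read as "no inconsistency found", not as ground truth.

```python
# Sketch of a secondary-model consistency check: the judge receives the source
# material and the generated answer and lists any claims it cannot support.

JUDGE_PROMPT = """You are verifying a draft answer against source material.
List every claim in the ANSWER that is NOT supported by the SOURCES.
If every claim is supported, reply exactly: SUPPORTED.

SOURCES:
{sources}

ANSWER:
{answer}"""


def verify_answer(answer: str, sources: str, call_judge_llm) -> dict:
    verdict = call_judge_llm(JUDGE_PROMPT.format(sources=sources, answer=answer))
    if verdict.strip() == "SUPPORTED":
        return {"passed": True, "unsupported_claims": ""}
    # A failed check routes to human review. It does not prove the answer is wrong,
    # and a passed check does not prove it is right: correlated errors pass through.
    return {"passed": False, "unsupported_claims": verdict.strip()}
```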
Human-in-the-loop review tiers are appropriate for outputs that carry regulatory or liability risk. This isn't a technology solution — it's a process design choice. The AI handles volume; humans review edge cases and high-stakes decisions. The challenge is defining the routing logic clearly enough that the right things escalate.
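One way to make that routing logic explicit is to encode it as a small, auditable function rather than leaving it to individual judgement. The attributes and the 0.8 threshold below are illustrative, not recommendations.

```python
# Sketch of explicit escalation rules for a human review queue. The point is
# that the routing criteria are written down and versioned, not implicit.
from dataclasses import dataclass

@dataclass
class TaskContext:
    affects_individual: bool   # credit, employment, medical or legal decisions
    external_audience: bool    # output leaves the organisation
    verifier_passed: bool      # result of any secondary verification step
    confidence: float          # whatever confidence signal the pipeline produces

def requires_human_review(task: TaskContext) -> bool:
    """Conservative routing: anything high-stakes or doubtful escalates."""
    if task.affects_individual:
        return True                                  # always reviewed
    if task.external_audience and not task.verifier_passed:
        return True
    return task.confidence < 0.8                     # illustrative threshold
```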
Output monitoring and sampling doesn't prevent individual hallucinations but creates visibility into your error rate over time. Sampling a percentage of AI outputs for human review, logging anomalies, and tracking accuracy metrics by task type gives you the data to tune your mitigation strategy and demonstrate to regulators that you have an active risk management process in place.
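A minimal implementation, as a sketch, is append-only JSONL logging with per-task-type sampling rates; the rates below are placeholders to be tuned against your own observed error data.

```python
# Sketch of output logging and sampling: every response is logged, a fixed
# percentage per task type is flagged for human spot-checks, and reviewer
# verdicts on the sampled records become the accuracy metric tracked over time.
import json
import random
import time

SAMPLE_RATES = {"summarisation": 0.02, "factual_qa": 0.10, "extraction": 0.05}

def log_output(task_type: str, prompt: str, response: str,
               log_path: str = "ai_outputs.jsonl") -> None:
    record = {
        "ts": time.time(),
        "task_type": task_type,
        "prompt": prompt,
        "response": response,
        "sampled_for_review": random.random() < SAMPLE_RATES.get(task_type, 0.05),
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```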
Matching mitigation to risk level
Not every AI use case carries the same hallucination risk, and over-engineering low-risk workflows creates unnecessary friction. A useful way to think about it:
Low risk — internal drafting, summarisation of provided documents, brainstorming: Basic prompt engineering (instruct the model to express uncertainty), output review by the user before acting on the content. No dedicated mitigation architecture required.
Medium risk — customer-facing content, knowledge base Q&A, first-draft legal or financial documents: RAG with source attribution, structured output validation, logging for audit trail. Human review before publication or delivery to external parties.
High risk — decisions affecting individuals (credit, employment, medical, legal), regulatory submissions, content with liability exposure: Full RAG grounding, secondary model verification, mandatory human review with documented sign-off, output monitoring and sampling. AI Act Article 9/14 compliance architecture required if Annex III applies.
What doesn't work
A few approaches that are commonly used but don't actually address the problem:
Disclaimers alone. Adding "this AI can make mistakes — please verify" to your interface meets a transparency obligation but does nothing to prevent incorrect outputs from being acted on. The Air Canada case turned on what the company's system actually told the customer, not on whether there was a disclaimer elsewhere on the site.
Temperature reduction. Setting the model's temperature to zero makes outputs more deterministic, not more accurate. A model that consistently gives the same wrong answer is not safer than one that varies.
Assuming newer models are safe enough. Each new model generation reduces hallucination rates on benchmark tasks. But benchmark performance doesn't translate directly to your specific use case, your specific document types, or your specific user query patterns. Evaluation on your own data remains necessary.
The practical starting point
If you're currently running LLM workflows without a formal hallucination mitigation layer, the highest-leverage first step is usually an audit: map your existing AI use cases, classify them by risk tier, and identify which ones lack adequate grounding or review mechanisms. That gives you a prioritised list of where to invest in mitigation architecture — rather than trying to solve the problem everywhere at once.
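As an illustration, the audit output can be as simple as a mapping from risk tier to required controls plus a gap report per use case; the tier and control names below are shorthand for the mitigations described earlier, not a formal taxonomy.

```python
# Sketch of the audit output: each use case gets a risk tier, and its gap is
# whatever controls that tier requires but the current pipeline lacks.

REQUIRED_CONTROLS = {
    "low":    {"uncertainty_prompting", "user_review"},
    "medium": {"rag_grounding", "schema_validation", "audit_logging",
               "pre_publication_review"},
    "high":   {"rag_grounding", "secondary_verification", "mandatory_signoff",
               "output_sampling", "audit_logging"},
}

def control_gaps(use_cases: list[dict]) -> list[dict]:
    """use_cases: [{"name": ..., "tier": ..., "implemented": {...}}, ...]"""
    report = []
    for case in use_cases:
        missing = REQUIRED_CONTROLS[case["tier"]] - set(case["implemented"])
        report.append({"name": case["name"], "tier": case["tier"],
                       "missing_controls": sorted(missing)})
    # Highest-risk, least-covered use cases first: this is the investment order.
    order = {"high": 0, "medium": 1, "low": 2}
    return sorted(report, key=lambda r: (order[r["tier"]], -len(r["missing_controls"])))
```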
For businesses approaching AI Act compliance, this audit also provides the documented risk assessment that Article 9 requires as a starting point for the risk management system.
Running AI workflows without a mitigation layer?
We help European businesses audit their current AI use cases for hallucination risk, design the right mitigation architecture for each risk tier, and build the governance documentation required for EU AI Act compliance. If you're approaching the August 2026 deadline for high-risk AI obligations, the audit is the right place to start.
Let's scope your AI risk audit →