The Psychology of AI Trust: Why We Over-Trust and Under-Trust
Tags: AI, Psychology, Trust, Human Behaviour, AI Adoption, Cognitive Bias


16-04-2026 · 9 min read · Nirmal Nambiar

When Air Canada's AI chatbot told a customer that bereavement fares could be applied retroactively, a policy that did not exist, the customer trusted the chatbot and made travel plans based on that information. When a radiologist reviewing an AI-assisted cancer screening flags a result the AI marked as low-risk, they sometimes override the AI not because they have specific evidence of an error but because the case feels unusual to them. Both are trust calibration failures, one toward over-trust and one toward under-trust, and both have consequences. The customer lost money because he trusted a confident-sounding wrong answer. The patient experienced a delayed diagnosis because the radiologist overrode an accurate AI assessment. Understanding why humans trust AI incorrectly in both directions is not an academic exercise. It is the psychological infrastructure on which every AI deployment either succeeds or fails.

People trust AI outputs they cannot evaluate and distrust AI outputs they could verify in seconds. Both errors are costly. The psychology of AI trust is the hidden variable in almost every AI deployment's success or failure, and it follows predictable patterns that are worth understanding.

Why We Over-Trust AI: The Confidence Effect

AI systems, particularly large language models, produce outputs formatted with the confidence of an expert regardless of the accuracy of the content. There are no hedging phrases, no visible uncertainty, no formatting differences between a response that is thoroughly grounded in training data and one that is a fluent confabulation. When a human expert is uncertain, they typically signal that uncertainty through hedging language, acknowledgment of what they do not know, or visible hesitation. When a language model is uncertain, it produces the same fluent, confident prose as when it is certain. The recipient has no formatting cue to calibrate their trust level.

This creates a systematic over-trust dynamic in domains where the human cannot independently evaluate the AI's output. A user asking a legal question in a jurisdiction where they have no legal training cannot tell a correct legal answer from an incorrect one. A patient reading an AI health information response cannot tell accurate medical information from a plausible-sounding error. In both cases, the user calibrates their trust to the confidence of the output rather than to any assessment of accuracy, which is exactly the wrong calibration.

The Workday hiring AI case makes this visible at scale: thousands of companies trusted the AI's resume screening recommendations without any basis for evaluating whether those recommendations were accurate or fair. The AI produced confident accept/reject signals. Companies treated those signals as expert assessments. The result, systematic discrimination against older applicants that led to a class-action lawsuit, was the consequence of institutionalised over-trust in an AI output nobody was actually verifying.

Why We Under-Trust AI: The Algorithm Aversion Effect

Paradoxically, people who observe an AI system make a single error often reduce their trust in that system below the level warranted by its actual accuracy. This is called algorithm aversion, and it has been documented consistently in experimental research. A decision-maker who sees a human expert make an error tends to assume the error was a one-off; they update their trust slightly and continue relying on the expert. A decision-maker who sees an AI system make an error of the same magnitude tends to reduce their trust dramatically, often to the point of reverting entirely to human judgment, even when the AI's overall accuracy is higher than the available human alternative.

This asymmetry produces systematic under-trust in high-stakes domains where AI accuracy is genuinely superior to human accuracy on average. Radiologists override AI-assisted cancer screening recommendations at rates that reduce overall detection accuracy compared to following the AI alone. Judges discount algorithmic risk assessments in sentencing decisions, sometimes in ways that produce less consistent outcomes than the algorithm would have. Financial analysts manually override AI-generated forecasts at rates that reduce forecast accuracy compared to using the model output directly. In each case, the under-trust is driven by loss aversion: the psychological cost of a wrong decision is higher when an unusual AI recommendation was followed than when a conventional human judgment was exercised, regardless of the statistical outcome.

The Calibration Problem in Enterprise Deployments

Enterprise AI deployments are trust calibration systems as much as they are technology deployments. The organisations that get the most value from AI tools are the ones that have figured out how to calibrate their teams' trust appropriately: neither deferring to AI outputs they should verify nor overriding AI outputs they should act on. This calibration is not automatic. It requires deliberate investment in two things: feedback loops that show users where the AI is accurate and where it is not, and explicit guidance about which output types require verification and which can be acted on directly.

The Faros AI finding that only 24% of developers fully trust AI-generated code is a calibration data point: developers have enough experience with specific categories of AI coding failure (security vulnerabilities, edge case errors, architectural inconsistencies) that their trust level reflects a genuine accuracy assessment. This is appropriate calibration: not algorithm aversion, but calibrated scepticism based on documented failure modes. The goal of AI deployment trust architecture is to produce this kind of calibrated trust across all user groups and all output types, rather than the over-trust and under-trust patterns that characterise naive AI adoption.
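The feedback-loop idea can be made concrete. The sketch below is a minimal illustration, with hypothetical data and function names, of how a verification log could surface miscalibration: it compares how often users accept an AI's outputs in each category against how often those outputs were actually correct. A positive gap suggests over-trust, a negative gap under-trust.

```python
from collections import defaultdict

def calibration_report(events):
    """Summarise trust calibration per output category.

    Each event is a (category, user_accepted, ai_was_correct) tuple
    drawn from a hypothetical verification log. The gap between the
    acceptance rate and the measured accuracy indicates over-trust
    (positive gap) or under-trust (negative gap).
    """
    stats = defaultdict(lambda: {"n": 0, "accepted": 0, "correct": 0})
    for category, accepted, correct in events:
        s = stats[category]
        s["n"] += 1
        s["accepted"] += accepted
        s["correct"] += correct

    report = {}
    for category, s in stats.items():
        acceptance = s["accepted"] / s["n"]
        accuracy = s["correct"] / s["n"]
        report[category] = {
            "acceptance_rate": acceptance,
            "measured_accuracy": accuracy,
            # > 0: users accept more often than accuracy warrants (over-trust)
            # < 0: users reject outputs that were in fact correct (under-trust)
            "trust_gap": acceptance - accuracy,
        }
    return report

# Illustrative log: code suggestions accepted uncritically despite errors,
# legal answers rejected despite being correct.
events = [
    ("code", True, True), ("code", True, False), ("code", True, False),
    ("legal", False, True), ("legal", False, True), ("legal", True, True),
]
report = calibration_report(events)
```

In this invented log, the "code" category shows a positive trust gap (over-trust) and the "legal" category a negative one (under-trust), which is exactly the kind of per-category signal a deployment team would need before writing verification guidance.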

The Anthropomorphism Trap

A specific and increasingly well-documented form of AI over-trust is anthropomorphism: attributing human-like understanding, intent, and reliability to AI systems because they communicate in natural language. When an AI system says 'I understand your situation' or 'I recommend that you consider,' the language carries implicit social signals that activate the same trust mechanisms humans use in human-to-human relationships. These mechanisms were calibrated over millennia of human social interaction, where language use implied a certain level of intent, understanding, and accountability. They are not calibrated for systems that produce language without any of these properties.

Research published in 2025 found that people who interacted with AI systems that used first-person language and expressed apparent empathy made significantly riskier decisions based on AI recommendations than people who interacted with the same systems presenting the same information in an impersonal, data-report format. The language framing changed the trust level without changing the accuracy of the underlying information. This finding has direct implications for how AI systems should be designed, and for why the conversational interface that makes AI tools feel most accessible may be precisely the interface that makes them most prone to miscalibrated trust.

Building Appropriate Trust: What Actually Works

  • Transparency about uncertainty: AI systems that communicate their confidence level alongside their outputs, in calibrated and honest terms rather than as false precision, allow users to modulate their verification effort in proportion to the actual risk
  • Documented failure modes: users who know specifically where an AI system tends to fail (AI coding tools are prone to security vulnerabilities; AI medical information tools are unreliable on rare conditions; AI legal tools are unreliable on jurisdictional specifics) calibrate their trust more accurately than users who have only general knowledge that AI can be wrong
  • Feedback loops that reveal accuracy: systems that allow users to verify AI outputs against ground truth and see the results of that verification over time build calibrated trust much faster than systems where the accuracy of AI outputs is never formally assessed
  • Role clarity in high-stakes decisions: explicit guidance about which decisions the AI should inform and which it should not make autonomously prevents both the over-trust of uncritical adoption and the under-trust of complete dismissal
  • Removing anthropomorphism from high-stakes interfaces: AI systems in clinical, legal, and financial contexts should be designed to communicate as data tools rather than as apparent reasoning agents, to prevent the anthropomorphism-driven trust escalation that leads to uncritical adoption