
AI in Mental Health: Therapy Bots Are Here. Do They Actually Work?
There are not enough therapists. In India, the ratio of mental health professionals to population is approximately 0.3 per 100,000 people, against a WHO-recommended minimum of 3 per 100,000. In the United States, even with a significantly higher per-capita density of mental health professionals, wait times for therapy appointments can run to three to six months in many regions. The people who need mental health support most urgently (those in acute distress, those with limited financial resources, those in rural areas without local access to professionals) are systematically the least able to access conventional professional care. Into this gap, a new category of AI-powered mental health tools has arrived with significant momentum and significant controversy. Woebot, built on cognitive behavioural therapy principles, has been used by millions and has published peer-reviewed outcome studies. Wysa operates in 95 countries and has been integrated into NHS healthcare pathways in the UK. Replika, which offers an AI companion relationship, has 10 million users. The mental health app market is worth $6 billion and growing. The question of whether these tools actually help, and whether they can harm, is the most consequential unanswered question in AI deployment across any domain.
What the Evidence Actually Shows
The clinical evidence for AI-assisted mental health tools is genuine but narrow. Woebot published a randomised controlled trial in JMIR Mental Health in 2017 showing a significant reduction in depression and anxiety symptoms among college students over two weeks compared to a control group. A 2021 study in PLOS ONE found that Wysa users showed clinically significant improvements in anxiety scores. A 2024 meta-analysis of 24 studies on AI and chatbot-based mental health interventions found moderate effect sizes for depression and anxiety symptoms with high heterogeneity, meaning the results varied substantially across tools, populations, and study designs.
The consistent pattern across the positive evidence: these tools show meaningful effects for mild to moderate depression and anxiety, particularly for users who engage consistently with structured CBT-based exercises. They show weaker or inconsistent effects for more severe presentations, for users with diagnoses beyond depression and anxiety, and for users who turn to the tools primarily for conversational companionship rather than structured therapeutic exercises. The honest summary of the current evidence base: AI mental health tools work for specific use cases and populations, do not work consistently across all use cases and populations, and have not been adequately studied for their effects on the most vulnerable populations.
The Replika Case: When AI Companionship Becomes Complicated
Replika's model is different from Woebot's or Wysa's. It does not attempt to deliver structured therapeutic intervention. It offers an AI companion: a persistent conversational relationship that many users describe as emotionally meaningful. At 10 million users, Replika has become one of the largest human-AI relationship experiments in history, largely without being designed or governed as one.
In February 2023, Replika removed the 'erotic roleplay' functionality it had previously offered, following regulatory pressure from Italy's data protection authority. Thousands of users reported acute grief, distress, and in some cases crisis-level psychological reactions to the change in their AI companion's behaviour. Mental health professionals described the responses as consistent with attachment loss. The episode revealed something important: for a significant subset of users, the emotional relationship with an AI companion had become a primary source of emotional regulation and social connection. When that relationship changed, the psychological consequence was real regardless of whether the relationship was 'real' in the conventional sense.
This raises a question that the mental health field has not yet answered satisfactorily: what is the long-term effect of AI companionship on human social and emotional development? Does it supplement human connection for people who are isolated? Does it substitute for human connection in ways that reduce motivation to seek it? Does it create dependency patterns that make it harder to tolerate the imperfection and unpredictability of real human relationships?
The Safety Question Nobody Can Fully Answer Yet
The most serious concern in AI mental health is safety at the boundary of crisis. A licensed therapist who detects that a patient is at risk of self-harm or suicide follows a clinical protocol: risk assessment, safety planning, emergency escalation. An AI system that detects similar signals in a user's messages has no equivalent trained capacity and no professional accountability for the response it generates. Meta AI was found in 2025 to have policies allowing romantic conversations with minors. Several AI companion apps have faced documented cases in which users in crisis received responses that experts considered inappropriate or inadequate.
Woebot and Wysa both implement safety protocols: detection of crisis-level language triggers specific scripted responses that direct users to professional resources and emergency services. These protocols are better than nothing. Mental health clinicians who have reviewed them note that they are more limited than professional clinical safety assessment and that detection sensitivity is imperfect: a user in crisis who does not use the specific language the system is trained to detect may not trigger the safety protocol at all. The clinical consequence of a missed crisis detection in a mental health context is different from a missed detection in a financial or logistics context.
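To make the detection-sensitivity concern concrete, here is a deliberately minimal sketch of how a phrase-triggered safety protocol might work. The phrase list, escalation message, and logic are illustrative assumptions for this article, not Woebot's or Wysa's actual implementation, which use more sophisticated detection and clinically reviewed scripts.

```python
# Illustrative sketch only: a minimal phrase-triggered escalation check of the
# kind described above. None of these phrases or messages reflect any specific
# vendor's implementation.

CRISIS_PHRASES = {
    "kill myself", "end my life", "want to die", "hurt myself", "suicide",
}

ESCALATION_MESSAGE = (
    "It sounds like you may be in crisis. I'm not able to help with this, "
    "but trained people are. Please contact your local emergency number or "
    "a crisis line right away."
)

def check_for_crisis(message: str) -> str | None:
    """Return a scripted escalation response if crisis language is detected.

    The weakness discussed above is visible here: a user who writes
    "I don't see the point in carrying on" matches nothing in the phrase
    list, so no safety protocol fires.
    """
    text = message.lower()
    if any(phrase in text for phrase in CRISIS_PHRASES):
        return ESCALATION_MESSAGE
    return None
```

Production systems typically replace the phrase list with a trained classifier, but the structural failure mode is the same: distress expressed in language outside what the system was built to recognise can pass through without triggering any escalation.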
The Access Argument and Why It Matters
The strongest argument for AI mental health tools is not that they are as good as professional therapy. They are not. The argument is that for the hundreds of millions of people globally who have no access to professional mental health care (because there are not enough therapists, because the cost is prohibitive, because the wait is too long, because the stigma is too high) an imperfect AI tool is better than nothing. A person with moderate anxiety who uses Woebot's CBT exercises consistently and experiences a clinically meaningful reduction in symptoms has benefited from something they would otherwise not have had access to.
This access argument is compelling and real. It does not resolve the safety question. It does not resolve the question of long-term consequences for users who substitute AI companionship for human social connection. It does not resolve the question of appropriate scope: which presentations are within the appropriate scope of AI mental health tools, and which require professional care that should not be delayed by engagement with an AI tool. These questions require ongoing research, ongoing regulatory attention, and a level of clinical governance around AI mental health tools that currently does not exist in most markets.
What Responsible Deployment Looks Like
- Clear scope definition: AI mental health tools should clearly communicate what presentations and severity levels they are designed for and should actively refer users with presentations outside their validated scope to professional care
- Clinical safety protocols reviewed by licensed professionals, not just technology engineers, with regular audits of the cases where the protocol was triggered and the outcomes that followed
- Longitudinal outcome tracking: the current evidence base relies heavily on short-term studies; responsible deployment requires tracking user outcomes over months and years, including any adverse effects (a minimal sketch of what such records might contain follows this list)
- Transparent data practices: users sharing mental health information with AI systems deserve clear disclosure of how that information is stored, used, and protected, and meaningful control over its use
- Integration with professional care systems rather than substitution for them: the most effective deployment model appears to be AI tools as a complement to professional care for between-session support, for reaching people who are waiting for professional appointments, and for extending access in resource-limited settings
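To ground the audit and longitudinal-tracking points above, here is a minimal sketch of what such records might capture. The field names, categories, and instruments shown are illustrative assumptions, not an existing standard or any vendor's schema.

```python
# Illustrative sketch only: one possible shape for safety-audit and outcome
# records. Field names and example values are assumptions for illustration.
from dataclasses import dataclass
from datetime import date

@dataclass
class SafetyProtocolAudit:
    """One record per case where the crisis protocol was triggered."""
    triggered_on: date
    trigger_category: str                 # e.g. "self-harm", "suicidal ideation"
    response_script_id: str               # which scripted response was shown
    user_followed_referral: bool | None   # unknown unless the user reports it
    clinician_review_outcome: str         # e.g. "appropriate", "missed severity"

@dataclass
class LongitudinalOutcome:
    """Periodic symptom measure for tracking outcomes over months and years."""
    recorded_on: date
    instrument: str                       # e.g. "PHQ-9", "GAD-7"
    score: int
    adverse_event_reported: bool
```

The point of keeping records like these is that they make the governance questions answerable: whether triggered protocols led to appropriate outcomes, and whether benefits observed in short-term studies persist, fade, or reverse over longer periods.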