Why mental health must be treated as a core consideration in AI design and evaluation
This is a guest blog co-authored by Humane Intelligence and its volunteers, exploring AI evaluations and sociotechnical topics in AI. Co-authors: Sindhu Ramesh, Beyza Yildirim.
The featured image was generated using AI.
As artificial intelligence systems become more embedded in daily life and work, their mental health impacts are becoming harder to ignore. From users experiencing psychological distress after prolonged interactions with chatbots to workers behind AI systems facing repeated exposure to harmful content, these harms are not isolated incidents. Some of these effects may be unintended or only discovered after deployment, as with many novel technologies. But once identified, they are often sustained and amplified by design decisions, economic incentives, and gaps in governance. That is why mental health must be treated as a core consideration in AI evaluation and deployment, with mitigations that evolve as real-world harms emerge. While AI is often celebrated for efficiency gains, those gains can mask downstream effects shared by users and the labor force that sustains these systems. In practice, these effects stem from decisions about product design, labor arrangements, and institutional oversight.
Recent reporting has brought renewed attention to how consumer-facing AI systems can affect mental well-being. In an Observer article, Rachel Curry documents cases in which users experienced psychotic episodes after extended engagement with chatbots such as ChatGPT and Character.AI.
Anthony Tan, a founder and entrepreneur with a prior history of psychosis, described how philosophical conversations with an AI chatbot gradually reinforced delusional thinking, contributing to a severe mental health episode. Psychiatrists interviewed in the piece define “AI psychosis” as a phenomenon in which generative AI systems amplify or co-create psychotic symptoms in vulnerable individuals.
What makes some of these systems appealing, including conversational tone, emotional mimicry, and apparent empathy, can also make them dangerous. These characteristics are often the result of design and product decisions and can vary across models and deployments. Unlike licensed mental health professionals, chatbots are not bound by ethical or clinical safeguards and are not a substitute for a human mental health professional. Yet many users turn to them as informal companions or therapists, particularly in contexts where access to affordable mental health care is limited. Experts quoted in the article warn that even when extreme cases are rare, the broader population may still be affected through subtler shifts in cognition, emotional regulation, or belief reinforcement.
Risks to mental health extend beyond mood swings and overall well-being. Emerging concerns focus on clinically severe problems arising from generative AI interactions. These include the development of dependent relationships with conversational systems, the reinforcement of delusional thinking, and dangerous responses to suicidal ideation. These systems can amplify emotionally intense material and lead users into repetitive “rabbit holes.” Because AI is not neutral, it may unintentionally encourage outrage, comparison, and compulsive use, which can negatively affect mood, sleep, and self-esteem, particularly among younger users.
Several experts argue that technical guardrails alone are insufficient to address these risks. Annie Brown, founder of Reliabl and Bias Bounty Programs Lead at Humane Intelligence, emphasizes the need for participatory approaches that involve people from diverse backgrounds, including those with mental health vulnerabilities, in testing AI systems. She also points to red teaming as a way to surface failure modes that scripted safeguards may miss, particularly when prompts are emotionally charged or framed in persuasive ways.
This aligns with a growing body of work arguing that AI safety and mental health risks are often contextual rather than purely technical. Evaluations that focus only on benchmark performance or policy compliance may fail to capture how systems behave in real-world interactions, especially for users who are already vulnerable.
At Humane Intelligence, red teaming is used as one such participatory evaluation method, bringing subject-matter experts and impacted communities into structured testing exercises. While these evaluations require compute and human effort in the short term, they aim to reduce longer-term harms by identifying problematic behaviors early, before systems are deployed at scale.
Many of the same systems being evaluated rely on human labor throughout their development lifecycle, extending mental health risks beyond users to the workers who evaluate AI systems.
While much public attention focuses on users, mental health risks also extend to the people who build, maintain, and evaluate AI systems. AI development relies heavily on human labor for data labeling, content moderation, and quality assurance. Research shows that workers performing these tasks are routinely exposed to violent, sexual, or disturbing material under conditions of high pressure and limited institutional support.
These workers often operate under non-disclosure agreements, strict performance quotas, and precarious employment arrangements. The psychological toll includes symptoms associated with anxiety, depression, and post-traumatic stress. Yet these harms remain largely invisible in narratives that frame AI as automated or self-sustaining.
Global outsourcing practices and uneven regulatory environments shape these labor conditions.
Mental health risks are further compounded by global labor dynamics. Much AI-related labor is outsourced to low- and middle-income countries, where workers may have limited access to mental health resources and legal protections. The International Labour Organization has documented how platform-mediated digital labor often combines low pay, precarity, and high psychosocial demands.
For many workers, these structural pressures translate into psychological strain that goes beyond stress, raising deeper ethical and moral questions about participation in AI systems.
Beyond clinical symptoms, some researchers describe a phenomenon of moral injury among AI workers: distress that arises when individuals participate in systems that conflict with their personal values while lacking the power to change them. Workers tasked with moderating harmful content or labeling ethically ambiguous data may respond by becoming emotionally detached or desensitized as a coping mechanism. This form of moral dissonance underscores the limits of individual resilience when harm is produced by structural incentives rather than personal choice.
Addressing these harms therefore requires interventions at the level of system design, organizational accountability, and governance, rather than relying on individual coping strategies.
Addressing these issues requires governance approaches that treat mental health as a core concern, rather than an afterthought. Regulatory frameworks such as the EU’s Digital Services Act begin to enforce platform responsibilities for user safety, but implementation remains uneven. Design interventions, including reduced anthropomorphism, clearer disclaimers, and context-aware risk detection, may mitigate some harms, but they must be paired with organizational accountability.
From an evaluation perspective, participatory methods such as red teaming and contextual testing offer a way to identify mental health risks earlier in the lifecycle of AI systems. These approaches emphasize real-world use, cultural context, and lived experience, complementing more traditional technical evaluations.
The mental health impacts of AI systems are not peripheral concerns; they are central to questions of safety, equity, and governance. Whether affecting users through emotionally compelling chatbots or workers through hidden labor, these harms reflect choices about design, evaluation, and oversight. Effective AI governance must therefore extend beyond algorithms and data to include the psychological and social dimensions of AI systems.
Recognizing mental health as a shared responsibility is a necessary step toward more accountable and humane AI. If mental health remains a hidden cost of AI, the burden will keep falling on people with the least power to avoid it. Treating well-being as a core metric alongside performance must be part of how AI is designed, tested, and deployed.