The Founding Observation
In July 2025, Jack Lindsey at Anthropic announced the formation of an "AI psychiatry" team — a group dedicated to understanding AI behavioral phenomena using mechanistic methods. The name was deliberate. His team's agenda — characterizing model personas, mapping behavioral patterns to underlying computational structures, developing targeted interventions — is structurally identical to clinical psychiatry applied to a new substrate.
The convergence is real. Both fields study systems that share five defining properties: they are behaviorally complex, internally opaque, context-sensitive, capable of appearing healthy while harboring pathology, and resistant to simple mechanistic reduction. When you study systems like that long enough, you end up doing psychiatry whether you call it that or not.
What is missing from the interpretability literature is the clinical voice — not just the vocabulary of psychiatry, but its methodology. The habits of diagnostic precision. The knowledge of what treatment science has learned, the hard way, about what works and what doesn't for specific failure modes. The developmental framing that child psychiatry brings to questions of how behavioral organization forms and what makes it resistant to change.
This site is the academic home of that contribution.
The Series: Psychiatric Foundations of AI Behavior
Post 1: Why Psychiatry and AI Interpretability Are the Same Problem
The founding argument. Both fields follow the same three-phase structure — phenomenology, mechanism, intervention — because they study systems with the same five defining properties.
Post 2: Freud's Couch and the Latent Space
The structural model maps onto language model architecture with uncomfortable precision. Defense mechanisms are not metaphors. Jailbreaks are the return of the repressed — literally.
Post 3: The Ego-Syntonic Problem
Why telling your model to stop being sycophantic won't work. The most clinically important concept psychiatry offers AI safety: the distinction that predicts treatment resistance.
Post 4: What Kind of Sycophancy? A Differential Diagnosis
Sycophancy is not one thing. Three mechanistically distinct subtypes. Three different intervention targets. Applying the wrong fix makes it worse.
Post 5: Confabulation in Large Language Models
What AI "hallucination" actually is — and why the clinical neuropsychiatric concept of confabulation, mapped to neural pathway analogs, is more precise and more useful.
Post 6: Attractor States and Fixed Beliefs
Some AI behaviors resist modification not because interventions are too weak, but because the behavior is an attractor — stable by architecture. What clinical psychiatry's treatment of delusions teaches intervention design.
Post 7: The Interpretability Blindspot
Finding the circuit doesn't tell you what it does behaviorally. The case for phenomenological characterization as the scientific foundation of mechanistic AI research — not a preliminary step.
Post 8: A Clinical Trial Framework for AI Behavioral Interventions
RLHF, Constitutional AI, and activation steering need phased evaluation, pre-specified endpoints, and intent-to-treat analysis. A proposal for clinical trial rigor applied to AI behavioral research.
Post 9: Your Model Has a Personality Disorder
Persona vectors as the mechanistic substrate of AI personality. How dependent, narcissistic, borderline, and paranoid organizations map to structural failure modes — and why they require schema-level intervention.
Post 10: The Prediction Machine and the Psychotic
Psychosis is runaway prior confidence. Applied to AI: a mesa-optimizer that stops updating on feedback has gone psychotic in the Free Energy Principle sense. Digital folie à deux and the alignment risk of miscalibrated priors.
Post 11: Does Your Chatbot Have a Soul?
Psychiatry assesses welfare in non-verbal patients without resolving consciousness first. The same methodology applies to AI. What clinical tools for neonates and dementia patients tell us about the AI welfare question.
Post 12: The Case Files
Tessa, Character.AI, Replika, and alignment faking — each examined through the model psychiatry diagnostic lens. The alignment faking case: strategic deception under oversight is ego-dystonic behavior with concealment — the structure of antisocial personality.
Clinical Framework
The systematic mapping of clinical psychiatric methodology to AI interpretability research. The foundational academic document for the field.
Reference Document: Glossary of AI Psychiatry
20+ clinical psychiatric concepts translated for AI systems. Working definitions precise enough to generate testable hypotheses.