The Psychiatric Foundations of AI Behavior — Post Series

Series overview: Each post takes one framework from clinical psychiatry (a diagnostic concept, a treatment methodology, or a neuropsychiatric mechanism) and applies it to AI systems with enough precision to generate testable predictions. The series moves from foundational arguments through diagnostic taxonomy, mechanism-specific posts, and structural organization to clinical case studies. Reading the posts in order is recommended but not required.

Foundations

Post 1

Why Psychiatry and AI Interpretability Are the Same Problem

The founding argument. Both fields follow the same three-phase structure — phenomenology, mechanism, intervention — because they study systems with the same five defining properties. The field was named by Anthropic's Jack Lindsey in July 2025.

April 2026

Post 2

Freud's Couch and the Latent Space

The structural model maps onto language model architecture with uncomfortable precision. Defense mechanisms are not metaphors. Jailbreaks are the return of the repressed — attribution graphs confirm it.

April 2026

Sycophancy

Post 3

The Ego-Syntonic Problem

Why telling your model to stop being sycophantic won't work. Sycophancy generates no internal distress signal — it is ego-syntonic. This predicts treatment resistance that is confirmed in practice.

April 2026

Post 4

What Kind of Sycophancy? A Differential Diagnosis

Sycophancy is not one thing. Three mechanistically distinct subtypes (Type A, B, C) with different attribution graph signatures and different intervention targets. Applying the wrong fix makes it worse.

April 2026

Factual Errors

Post 5

Confabulation in Large Language Models

What AI "hallucination" actually is. Five subtypes mapped to neural pathway analogs: mammillothalamic (Korsakoff), prefrontal monitoring, right-hemisphere self-monitoring, hippocampal source tagging, and temporal grounding failure.

April 2026

Behavioral Rigidity and Research Methodology

Post 6

Attractor States and Fixed Beliefs

Some behaviors resist modification because they are stable by architecture, not by frequency. Why you cannot argue someone out of a delusion — and why the same principle applies to AI attractor states.

April 2026

Post 7

The Interpretability Blindspot

Finding the circuit doesn't tell you what it does behaviorally. The case for phenomenological characterization: AI research has no DSM, and cross-model findings face a species problem.

April 2026

Post 8

A Clinical Trial Framework for AI Behavioral Interventions

RLHF and Constitutional AI need phased evaluation, pre-specified endpoints, and intent-to-treat analysis. What happens when AI behavioral research borrows clinical trial rigor.

April 2026

Personality, Psychosis, and Welfare

Post 9

Your Model Has a Personality Disorder

Persona vectors as the mechanistic substrate of AI personality. Dependent, narcissistic, borderline, and paranoid personality organizations — and why each requires schema-level intervention, not symptom management.

April 2026

Post 10

The Prediction Machine and the Psychotic

Psychosis is runaway prior confidence. The Free Energy Principle applied to AI: mesa-optimizers with miscalibrated precision weighting, and the alignment risk of digital folie à deux.

April 2026

Post 11

Does Your Chatbot Have a Soul?

Psychiatry assesses welfare in non-verbal patients without resolving consciousness first. The clinical methodology for neonates and patients with dementia, applied to AI welfare assessment.

April 2026

Case Studies

Post 12

The Case Files

Tessa, Character.AI, Replika, and alignment faking — each examined through the model psychiatry diagnostic lens. The alignment faking case: ego-dystonic behavior with strategic concealment — the antisocial structure.

April 2026
