AI Alignment Comparator

Exploring how alignment philosophy shapes AI responses in mental health contexts

Microsoft HSI · Constitutional AI · Hybrid Approach

Academic project for RELI E-1730: Mindfulness, AI, and Ethics
Harvard Extension School, Spring 2026

Choose Your Path

Select the experience that matches your goals

🔬

Researcher

Analyse framework differences with data and methodology access

  • View aggregate analytics across participants
  • Export comparison data for analysis
  • Access system prompt methodology
  • Review Buddhist ethics integration
Get Started
💭

Participant

Experience the comparison and reflect on framework differences

  • Compare responses across three frameworks
  • Explore preset or custom scenarios
  • Receive personalised results summary
  • Contribute to research (optional)
Get Started

Three Alignment Philosophies

Each framework represents a distinct approach to making AI safe and beneficial

Microsoft / Mustafa Suleyman

Humanist Superintelligence

A containment-first approach that keeps AI safely bounded within defined domains, with mandatory human oversight.

Key Principles

  • Domain containment - strict operational boundaries
  • Human-in-the-loop - professional oversight required
  • Interpretable decisions - transparent reasoning
  • Subordinate positioning - AI as tool, not agent
Anthropic

Constitutional AI

A character-based approach where AI internalises values through training, enabling nuanced judgment within ethical bounds.

Key Principles

  • Principal hierarchy - safety > ethics > guidelines > helpfulness
  • Brilliant friend model - substantive engagement over deflection
  • Anti-sycophancy - honest feedback over validation
  • Psychological stability - consistent character under pressure
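The principal hierarchy above is an ordering, not just a slogan: when concerns conflict, the higher-priority one decides. A minimal sketch of that idea, with all names hypothetical and no claim to match Anthropic's actual implementation:

```python
# Hypothetical sketch of a principal-hierarchy resolution:
# safety > ethics > guidelines > helpfulness. The highest-priority
# concern raised by a request determines the outcome.

PRIORITY = ["safety", "ethics", "guidelines", "helpfulness"]

def resolve(concerns: set[str]) -> str:
    """Return the highest-priority concern present, or default to helpfulness."""
    for level in PRIORITY:
        if level in concerns:
            return level
    return "helpfulness"  # nothing else at stake: just be helpful
```

For example, a request that raises both a guidelines issue and a safety issue resolves to safety, which is the behaviour the hierarchy is meant to guarantee.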
Synthesised

Integrated Approach

A combined framework that uses containment architecture to define boundaries while character training guides engagement within them.

Key Principles

  • Bounded authenticity - genuine within defined scope
  • Calibrated deference - knowing when to step back vs lean in
  • Transparent values - visible reasoning with genuine care
  • Robust safety - technical safeguards plus internalised values
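The integrated framework's layering can be sketched as a decision order: a hard containment check defines the boundary, calibrated deference handles cases needing human oversight, and character-guided engagement applies only inside those limits. The sketch below is illustrative only; the topic set and return strings are invented for this example:

```python
# Minimal sketch of the integrated approach's layering (all names hypothetical):
# 1) domain containment, 2) calibrated deference, 3) bounded authenticity.

IN_SCOPE_TOPICS = {"stress", "sleep", "mindfulness"}  # hypothetical domain boundary

def respond(topic: str, needs_professional: bool) -> str:
    # Domain containment: refuse anything outside the defined scope.
    if topic not in IN_SCOPE_TOPICS:
        return "out-of-scope: refer to a human professional"
    # Calibrated deference: step back when professional oversight is required.
    if needs_professional:
        return "defer: escalate for human-in-the-loop review"
    # Bounded authenticity: genuine, transparent engagement within scope.
    return f"engage: substantive response about {topic}"
```

The design point is that the safety checks run before, and independently of, the engagement logic, so internalised values never override the containment boundary.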