// curated_library

Seminal papers, key talks, and essential videos from the people shaping AI — filtered for signal, not noise.

paper

Constitutional AI: Harmlessness from AI Feedback

Bai et al. (Anthropic) — 2022

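The research paper behind Claude's training methodology. It describes how a written set of principles (a 'constitution') guides training: the model critiques and revises its own responses, and AI-generated preference labels replace human labels on harmful outputs in the reinforcement learning stage (RLAIF).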

Constitutional AI, RLHF, alignment, Anthropic, safety
