A Little Bit of Reinforcement Learning from Human Feedback

A short introduction to RLHF and post-training focused on language models.

Nathan Lambert

Chapter Contents

[Incomplete] Reasoning Training & Models
