Reinforcement Learning from Human Feedback

A short introduction to RLHF and post-training focused on language models.

Nathan Lambert

Course

Course lectures and talks based on the RLHF Book, built with Colloquium. Click into a deck to navigate through the slides, or open in full screen.

Welcome to the Course

Introduction and overview of what you'll learn

Watch

Lectures

Lecture 1: Overview

Chapters 1-3 · Foundations of RLHF and post-training

Watch · PDF · Slides

Lecture 2: IFT, Reward Models, & Rejection Sampling

Chapters 4, 5, 9 · From instruction tuning to reward-guided data curation

PDF · Slides

Other Lectures

2026

An Introduction to Reinforcement Learning from Human Feedback and Post-training

SALA 2026 · Quito, Ecuador · March 2026

Invited Talk · PDF · Full Screen

Citation

If you find this useful for your research, please cite it:

@book{rlhf2026lambert,
  author = {Nathan Lambert},
  title = {Reinforcement Learning from Human Feedback},
  year = {2026},
  publisher = {Online},
  url = {https://rlhfbook.com}
}