A short introduction to RLHF and post-training focused on language models.
Course lectures and talks based on the RLHF Book, built with Colloquium. Click into a deck to navigate through its slides, or open it in full screen.
If you found this useful for your research, please cite it!
@book{rlhf2026lambert,
  author = {Nathan Lambert},
  title = {Reinforcement Learning from Human Feedback},
  year = {2026},
  publisher = {Online},
  url = {https://rlhfbook.com}
}