Notebooks for Reinforcement Learning: An Introduction (second edition)

Chapter 1: Introduction
Chapter 2: Multi-armed Bandits
Chapter 3: Finite Markov Decision Processes
Chapter 4: Dynamic Programming
Chapter 5: Monte Carlo Methods
Chapter 6: Temporal-Difference Learning
Chapter 7: n-step Bootstrapping