Course

Sample-based Learning Methods

Alberta Machine Intelligence Institute & University of Alberta

In this course on sample-based learning methods in reinforcement learning, you will explore algorithms that learn near-optimal policies through trial-and-error interaction with the environment. The course covers intuitive yet powerful Monte Carlo methods, temporal difference learning methods such as Q-learning, and strategies for combining model-based planning with temporal difference updates to accelerate learning.

Topics include the importance of exploration, the connections between Monte Carlo and dynamic programming, and the differences between on-policy and off-policy control. You will gain practical experience by implementing and applying algorithms such as TD learning, Expected Sarsa, and Q-learning. You will also learn about planning with simulated experience and explore Dyna, a model-based approach to reinforcement learning.

By the end of the course, you will be able to understand, implement, and apply sample-based learning methods in reinforcement learning, and to conduct empirical studies that measure improvements in sample efficiency.

Certificate Available ✔

Course Modules

This course covers a range of topics, including Monte Carlo methods for prediction and control, temporal difference learning methods for prediction and control, and planning, learning, and acting with a focus on model-based reinforcement learning using the Dyna architecture.

Welcome to the Course!

In the introductory module, you will meet the instructors, get an overview of the reinforcement learning textbook, and review the course prerequisites and learning objectives.

Monte Carlo Methods for Prediction & Control

Module 2 covers Monte Carlo methods for prediction and control, including off-policy learning, importance sampling, and the use of Monte Carlo for generalized policy iteration.
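As a rough illustration of two quantities this module relies on, the sketch below computes the discounted return for every step of one episode and the importance sampling ratio of a trajectory. The function names and the example numbers are hypothetical and are not taken from the course materials.

```python
def discounted_returns(rewards, gamma):
    """Return G_t for every step t, using G_t = R_{t+1} + gamma * G_{t+1}."""
    returns = [0.0] * len(rewards)
    g = 0.0
    # Work backward from the end of the episode so each return reuses the next one.
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


def importance_ratio(target_probs, behavior_probs):
    """Product of pi(a|s) / b(a|s) over the actions taken in one trajectory."""
    rho = 1.0
    for pi_prob, b_prob in zip(target_probs, behavior_probs):
        rho *= pi_prob / b_prob
    return rho


# Hypothetical three-step episode with a single reward at the end.
returns = discounted_returns([0.0, 0.0, 1.0], gamma=0.5)

# Hypothetical action probabilities under the target and behavior policies.
rho = importance_ratio([1.0, 0.5], [0.5, 0.5])
```

In ordinary importance sampling, a return generated by the behavior policy is multiplied by this ratio before it is averaged into the value estimate for the target policy.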

Temporal Difference Learning Methods for Prediction

Module 3 focuses on temporal difference learning methods for prediction. It explores the advantages of TD learning, compares TD with Monte Carlo, and covers policy evaluation with temporal difference learning.
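A minimal sketch of the one-step TD(0) prediction update may make the contrast with Monte Carlo concrete: the estimate is adjusted after a single transition by bootstrapping from the next state's value, rather than waiting for the full return. The function name, the dictionary representation of values, and the example numbers are assumptions made for illustration.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0, terminal=False):
    """Apply one TD(0) update to values[state] in place and return the TD error."""
    # The value of a terminal state is defined to be zero.
    bootstrap = 0.0 if terminal else values[next_state]
    td_error = reward + gamma * bootstrap - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical two-state example: one transition from A to B with reward 1.
values = {"A": 0.0, "B": 0.5}
delta = td0_update(values, "A", reward=1.0, next_state="B", alpha=0.1, gamma=0.9)
# delta is 1 + 0.9 * 0.5 - 0 = 1.45, so values["A"] moves to about 0.145.
```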

Temporal Difference Learning Methods for Control

Temporal difference learning methods for control are covered in Module 4, including Sarsa, Q-learning, and Expected Sarsa, with a focus on off-policy learning and learning multiple goals.
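The three control methods named here differ mainly in the target they bootstrap from. The sketch below computes each target for one transition. It assumes action values stored as nested dictionaries and an epsilon-greedy policy; all names and numbers are hypothetical.

```python
def epsilon_greedy_probs(action_values, epsilon):
    """Action probabilities of an epsilon-greedy policy (ties go to the first best action)."""
    n = len(action_values)
    best = max(action_values, key=action_values.get)
    probs = {a: epsilon / n for a in action_values}
    probs[best] += 1.0 - epsilon
    return probs


def sarsa_target(q, reward, next_state, next_action, gamma):
    # On-policy: uses the action actually selected in the next state.
    return reward + gamma * q[next_state][next_action]


def q_learning_target(q, reward, next_state, gamma):
    # Off-policy: uses the greedy (maximum) action value in the next state.
    return reward + gamma * max(q[next_state].values())


def expected_sarsa_target(q, reward, next_state, policy_probs, gamma):
    # Uses the expectation of the next action value under the given policy.
    expected = sum(policy_probs[a] * q[next_state][a] for a in q[next_state])
    return reward + gamma * expected


# Hypothetical action values for a single next state.
q = {"s1": {"left": 1.0, "right": 3.0}}
probs = epsilon_greedy_probs(q["s1"], epsilon=0.2)

t_sarsa = sarsa_target(q, 1.0, "s1", "left", gamma=0.5)
t_qlearn = q_learning_target(q, 1.0, "s1", gamma=0.5)
t_expected = expected_sarsa_target(q, 1.0, "s1", probs, gamma=0.5)
```

With a fully greedy policy (epsilon of 0), the Expected Sarsa target coincides with the Q-learning target.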

Planning, Learning & Acting

The final module, Module 5, delves into planning, learning, and acting, exploring the concept of a model, comparing sample and distribution models, and implementing model-based reinforcement learning using the Dyna architecture.
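To give a feel for how Dyna interleaves learning from real experience with planning from a learned model, here is a minimal tabular Dyna-Q sketch. The five-state corridor environment, the hyperparameters, and every function name are assumptions chosen for illustration, not the course's own assignment code.

```python
import random


def step(state, action, goal=4):
    """Hypothetical corridor: move left (-1) or right (+1); reaching `goal` pays 1 and ends the episode."""
    next_state = min(max(state + action, 0), goal)
    reward = 1.0 if next_state == goal else 0.0
    return reward, next_state, next_state == goal


def dyna_q(episodes=30, planning_steps=10, alpha=0.5, gamma=0.9,
           epsilon=0.1, goal=4, seed=0):
    rng = random.Random(seed)
    actions = (-1, 1)
    q = {(s, a): 0.0 for s in range(goal + 1) for a in actions}
    model = {}  # deterministic model: (state, action) -> (reward, next_state, done)

    def greedy(s):
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning update, shared by real and simulated experience.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a, goal)
            update(s, a, r, s2, done)        # direct RL from real experience
            model[(s, a)] = (r, s2, done)    # model learning
            for _ in range(planning_steps):  # planning with simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps2, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q


q = dyna_q()
# Greedy action in each non-terminal state after training.
policy = {s: max((-1, 1), key=lambda a: q[(s, a)]) for s in range(4)}
```

Setting `planning_steps=0` reduces this sketch to plain one-step Q-learning, which is one way to compare sample efficiency with and without planning.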
