Theory and Methods for Reinforcement Learning

EE-618

Lecture 2: Dynamic Programming

This page is part of the content downloaded from Lecture 2: Dynamic Programming on Wednesday, 25 December 2024, 15:49.

Description

Markov decision processes (MDPs); value functions and Q-functions; value iteration and policy iteration; operator perspectives. Model-free policy-based and value-based methods; Monte Carlo (MC) methods and temporal-difference (TD) learning.


Files and subfolders