In reinforcement learning, the default MDP assumes an infinite horizon. To overcome that, we introduce the concept of ________ rewards: the reward at time step t is multiplied by gamma raised to the power t, where gamma's limits are ___ < gamma
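To see why weighting rewards by gamma raised to t helps with an infinite horizon, here is a minimal sketch in Python. The function name `gamma_weighted_return`, the constant reward of 1.0, and the value gamma = 0.9 are all assumptions chosen for illustration, not part of the original question.

```python
def gamma_weighted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a reward sequence, with t starting at 0."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Hypothetical setup: a reward of 1.0 at every step and gamma = 0.9.
gamma = 0.9
short_return = gamma_weighted_return([1.0] * 10, gamma)
long_return = gamma_weighted_return([1.0] * 1000, gamma)

# For a constant reward of 1.0, the geometric series converges to 1 / (1 - gamma).
limit = 1.0 / (1.0 - gamma)

print(short_return, long_return, limit)
```

With these assumed values, the unweighted sum of rewards would grow without bound as the horizon lengthens, while the gamma-weighted sum stays finite and approaches `limit`.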