Mastering the Game of Go without Human Knowledge

Mar 5 12:51 2018    Author :  David Silver

A translation of the AlphaGo Zero paper.

AlphaGo, AlphaGo Zero, and AlphaZero

Mar 4 19:35 2018    Author :  He Liu

Every time you specialize something you hurt your generalization ability. -- David Silver

RL - Integrating Learning and Planning

Jan 9 21:11 2018    Author :  He Liu

In this lecture, we will learn a model directly from experience and use planning to construct a value function or policy.

RL - Policy Gradient

Jan 7 13:42 2018    Author :  He Liu

This lecture covers methods that optimise the policy directly. Instead of working with value functions, as we have so far, we gather experience and use it to update the policy in the direction that improves it.

RL - Value Function Approximation

Jan 3 18:09 2018    Author :  He Liu

This lecture introduces value function approximation as a way to scale our algorithms up to practical RL problems.
