 RL-pong-dqn.mp4
Learning Pong with an implementation of DeepMind's deep Q-learning (DQN). My model's paddle is on the right. The cool thing about this is that it learns directly from pixels! The input is the last four video frames of gameplay after preprocessing: max-pooling between adjacent frames, converting to grayscale, cropping, rescaling, and stacking, so a 210x160x3x4 input becomes 80x80x4.
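To make the preprocessing steps concrete, here is a minimal NumPy sketch of one way they could be chained. It is not the code behind the video: the function names `preprocess_frame` and `stack_frames`, the crop rows (34 to 194), the 2x2 block averaging used for rescaling, and the use of five raw frames to produce four pooled ones are all assumptions chosen so that the shapes work out to 80x80x4.

```python
import numpy as np

# Standard luminance weights for RGB -> grayscale conversion.
LUMA = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def preprocess_frame(prev_frame, frame, top=34, bottom=194):
    """Turn two adjacent 210x160x3 RGB frames into one 80x80 grayscale frame.

    The crop bounds `top` and `bottom` are assumed values, not taken from
    the original project.
    """
    # Max-pool over the two adjacent frames to reduce sprite flicker.
    pooled = np.maximum(prev_frame, frame).astype(np.float32)
    # Convert to grayscale: (210, 160, 3) @ (3,) -> (210, 160).
    gray = pooled @ LUMA
    # Crop to a 160x160 region (drops the score bar and bottom border).
    cropped = gray[top:bottom, :]
    # Rescale 160x160 -> 80x80 by averaging each 2x2 block.
    h, w = cropped.shape
    small = cropped.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    # Scale pixel values to roughly [0, 1].
    return small / 255.0


def stack_frames(raw_frames):
    """Build an 80x80x4 state from five consecutive raw frames.

    Each of the four processed frames is the max-pool of a raw frame
    with its predecessor, which is why five raw frames are needed here.
    """
    processed = [
        preprocess_frame(raw_frames[i], raw_frames[i + 1]) for i in range(4)
    ]
    # Stack along a new last axis: four (80, 80) frames -> (80, 80, 4).
    return np.stack(processed, axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for emulator frames of shape (210, 160, 3).
    frames = [
        rng.integers(0, 256, size=(210, 160, 3), dtype=np.uint8)
        for _ in range(5)
    ]
    state = stack_frames(frames)
    print(state.shape)
```

Stacking several frames is what lets a feed-forward network infer the ball's direction and speed, which a single still frame cannot convey.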
 RL-HalfCheetah-no-baseline.mp4
Learns locomotion in the most unlikely way.
 RL-CartPole-baseline.mp4
Learns a balancing act by applying horizontal forces to the cart.
 RL-InvertedPendulum-baseline.mp4
Similar to the last one, but powered by the MuJoCo physics simulator.