PyData Delhi 2017 - Presentation: Proximal Policy Optimization : The new kid in the RL Jungle

Proximal Policy Optimization : The new kid in the RL Jungle

Audience level:

Intermediate

Description

My talk introduces Proximal Policy Optimization (PPO), a recently introduced class of reinforcement learning algorithms. These algorithms were released by OpenAI and have been reported to match or outperform existing state-of-the-art methods while being simpler to implement and tune. Interested in RL, or even in training a beast of an Atari player? This is the talk.

Abstract

The Reinforcement Learning problem

A basic overview of the reinforcement learning problem
Current challenges and goals in reinforcement learning (RL)
Quick overview of Applications

Basic time-tested strategies: a shallow dive

Q-learning
Temporal difference method
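
As a rough illustration of how these two topics connect, the sketch below shows one tabular Q-learning step, which is a temporal-difference update toward the bootstrapped target reward + gamma * max Q(next_state, a). The function name, the dictionary-based table, and the default values of `alpha` and `gamma` are assumptions made for this example, not material taken from the talk.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99, done=False):
    """Apply one tabular Q-learning update in place and return the TD error.

    q is a mapping from (state, action) pairs to estimated values.
    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    # Value of the best action in the next state; zero if the episode ended.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    # Temporal-difference target and error.
    td_target = reward + gamma * best_next
    td_error = td_target - q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-action example, starting from an all-zero table.
q = defaultdict(float)
td = q_learning_update(q, state=0, action=1, reward=1.0,
                       next_state=1, actions=[0, 1])
```

Starting from an all-zero table, the only learning signal in this first step is the reward itself, so the estimate for the chosen action moves a fraction `alpha` of the way toward it.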

Deep Q-learning

The deep learning difference in RL
Improvements in results compared to those of previous techniques.

Enter Proximal Policy Optimization Techniques

Explaining the algorithm
Results and improvements, especially in Atari Games
Implementation and tuning
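
For readers who want a preview of the core idea, here is a minimal sketch of the clipped surrogate objective used by the clipping variant of PPO, written in plain Python over lists of per-sample values. The function name and argument layout are assumptions for this example; `epsilon=0.2` is a commonly used clipping range, not a setting confirmed by the talk. A real implementation would compute this on batched tensors and maximize it with gradient ascent.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, epsilon=0.2):
    """Return the mean clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(nl - ol)
        # Keep the ratio inside [1 - epsilon, 1 + epsilon].
        clipped = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Hypothetical single-sample cases: the ratio is 1.5 with a positive
# advantage, then 0.5 with a negative advantage.
gain = ppo_clip_objective([math.log(1.5)], [0.0], [1.0])
loss = ppo_clip_objective([math.log(0.5)], [0.0], [-1.0])
```

In both cases the clip removes the incentive to move the policy further than `epsilon` away from the old one in a single update, which is the "proximal" part of the name.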

Comparison of these Algorithms

This part will mainly present graphs comparing RL algorithms, including PPO, on the basis of: (i) implementation time, (ii) training time, (iii) tuning time, and (iv) 'effort'.

Minimum takeaway

This will be a conclusion with simplified takeaway points, so that audience members at all levels gain something from the talk.

Saturday 10:30 AM–11:00 AM in C11