Hierarchical Reinforcement Learning, Ersin Basaran, 19/03/2005


  • Slide 1
  • Hierarchical Reinforcement Learning. Ersin Basaran, 19/03/2005
  • Slide 2
  • Outline: Reinforcement Learning; RL Agent; Policy; Hierarchical Reinforcement Learning; The Need; Sub-Goal Detection; State Clusters; Border States; Continuous State and/or Action Spaces; Options; Macro Q-Learning with Parallel Option Discovery; Experimental Results
  • Slide 3
  • Reinforcement Learning. The agent observes the state and takes an action according to its policy. The policy is a function from the state space to the action space, and it can be deterministic or non-deterministic. State and action spaces can be discrete, continuous, or hybrid.
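The slides contain no code; as a minimal sketch of the two policy types (all state and action names below are made up for illustration), a deterministic policy over discrete spaces can be a plain lookup table, while a non-deterministic one maps each state to a distribution over actions:

```python
import random

# Deterministic policy: each state maps to exactly one action.
deterministic_policy = {"s0": "left", "s1": "right", "s2": "left"}

# Non-deterministic policy: each state maps to a distribution over
# actions; the action is sampled at decision time.
stochastic_policy = {
    "s0": {"left": 0.8, "right": 0.2},
    "s1": {"left": 0.5, "right": 0.5},
}

def act(policy, state, stochastic=False):
    if not stochastic:
        return policy[state]
    actions, probs = zip(*policy[state].items())
    return random.choices(actions, weights=probs, k=1)[0]
```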
  • Slide 4
  • RL Agent. The agent has no model of the environment. It observes state s, takes action a, and moves into state s′, observing reward r. The agent tries to maximize the total expected reward (the return). Finite-state-machine view of the interaction: s → s′ under action a, yielding reward r.
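One standard model-free way to maximize the return without a model of the environment is temporal-difference learning; a tabular Q-learning update is sketched below (the learning rate and discount factor are assumed hyperparameters, not values from the slides):

```python
from collections import defaultdict

alpha, gamma = 0.1, 0.99    # assumed hyperparameters
Q = defaultdict(float)      # Q[(state, action)] -> estimated return

def q_update(s, a, r, s_next, actions):
    # Move Q(s, a) toward the observed reward plus the discounted
    # value of the best action available in the next state s'.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
```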
  • Slide 5
  • Policy. In a flat RL model, the policy maps each state to a primitive action. Under the optimal policy, the action taken by the agent yields the highest expected return at each step. The policy can be kept in tabular form for small state and action spaces; function approximators can be used for large or continuous ones.
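To illustrate the function-approximation alternative to a table, a linear value estimator over state features is a common minimal choice (the feature extractor below is a hypothetical placeholder):

```python
import numpy as np

n_features = 4
weights = np.zeros(n_features)   # parameters shared across all states

def features(state):
    # Hypothetical feature extractor: encode a state as a vector
    # of length n_features.
    return np.asarray(state, dtype=float)

def value(state):
    # One weight vector generalizes across states, so no table entry
    # per state is required; this is what makes large or continuous
    # state spaces tractable.
    return float(weights @ features(state))
```

With a table, memory grows with the number of states; here it grows only with the number of features.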
  • Slide 6
  • The Need for Hierarchical RL. It increases performance and makes applying RL to problems with large action and/or state spaces feasible. Detecting sub-goals lets the agent build abstract actions on top of the primitive actions. Sub-goals and abstract actions can be reused across different tasks in the same domain, so knowledge is transferred between tasks. The agent's policy can also be translated into natural language.
  • Slide 7
  • Sub-goal Detection. A sub-goal can be a single state, a subset of the state space, or a constraint on the state space. Reaching a sub-goal should help the agent reach the main goal (obtain the highest return). Sub-goals must be discovered by the agent autonomously.
  • Slide 8
  • State Clusters. The states in a cluster are strongly connected to each other, while the number of state transitions between clusters is small. The states at the two ends of a transition between two different clusters are sub-goal candidates. Clusters can be hierarchical: different clusters can belong to the same cluster at a higher level.
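Given a set of observed transitions and some clustering of the states (the slides do not fix a particular clustering algorithm; any graph-partitioning method could stand in), the sub-goal candidates are the endpoints of cluster-crossing transitions. A minimal sketch with made-up states:

```python
# Observed (state, next_state) transitions and a given clustering.
transitions = [("a1", "a2"), ("a2", "b1"), ("b1", "b2")]
cluster_of = {"a1": 0, "a2": 0, "b1": 1, "b2": 1}

def subgoal_candidates(transitions, cluster_of):
    # States at either end of a transition that crosses a cluster
    # boundary are candidate sub-goals ("doorway" states).
    candidates = set()
    for s, s_next in transitions:
        if cluster_of[s] != cluster_of[s_next]:
            candidates.update((s, s_next))
    return candidates

print(subgoal_candidates(transitions, cluster_of))  # {'a2', 'b1'}
```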
  • Slide 9
  • Border States. In some states, certain actions cannot be applied; such states are defined as border states. Border states are assumed to form a transition sequence: the agent can travel along the border states by taking some actions. Each end of this transition sequence is a candidate sub-goal, assuming the agent has sufficiently explored the environment.
  • Slide 10
  • Border State Detection (for discrete action and state spaces). F(s): the set of states that can be reached from state s in one time unit. G(s): the set of actions that, when applied at state s, cause no state transition. H(s): the set of actions that, when applied at state s, move the agent to a different state.
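Assuming a deterministic transition model step(s, a) is available (or has been estimated from experience), the three sets can be computed directly, and a border state is then one with a nonempty G(s). A toy sketch on a 1-D corridor (the corridor itself is an assumption, not from the slides):

```python
ACTIONS = ["left", "right"]
N = 4  # corridor states 0..3

def step(s, a):
    # Toy deterministic model: moving past either end of the corridor
    # is not possible, so the state stays unchanged there.
    return max(s - 1, 0) if a == "left" else min(s + 1, N - 1)

def F(s):
    # States reachable from s in one time unit (whether s itself
    # belongs to F(s) is left unspecified by the slides; excluded here).
    return {step(s, a) for a in ACTIONS} - {s}

def G(s):
    # Actions that cause no state transition when applied at s.
    return {a for a in ACTIONS if step(s, a) == s}

def H(s):
    # Actions that move the agent to a different state.
    return {a for a in ACTIONS if step(s, a) != s}

print([s for s in range(N) if G(s)])  # [0, 3]: the corridor ends
```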
  • Slide 11
  • Border State Detection. Detect the longest state sequence s_0, s_1, s_2, …, s_{k-1}, s_k that satisfies the following constraints: s_i ∈ F(s_{i+1}) or s_{i+1} ∈ F(s_i) for 0 ≤ i < k …
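The transcript cuts the constraint list off after the adjacency condition. Under just that condition, plus the extra assumption (consistent with the previous slides) that every state in the sequence is a border state, a greedy chain-growing sketch could look as follows, reusing F and G from the sketch above:

```python
def longest_border_chain(states):
    # Greedy sketch: from each border state, repeatedly append an
    # adjacent, unvisited border state. Adjacency is the slide's
    # condition: s_i in F(s_{i+1}) or s_{i+1} in F(s_i).
    border = [s for s in states if G(s)]
    best = []
    for start in border:
        chain, seen = [start], {start}
        while True:
            nxt = next((s for s in border if s not in seen
                        and (chain[-1] in F(s) or s in F(chain[-1]))), None)
            if nxt is None:
                break
            chain.append(nxt)
            seen.add(nxt)
        if len(chain) > len(best):
            best = chain
    return best
```

A greedy pass is only a heuristic; finding the true longest such sequence is a path problem and may require search.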
