Offline Reinforcement Learning:


Hierarchical Adversarial Inverse Reinforcement Learning:

Multi-task Imitation Learning (MIL) aims to train a policy capable of performing a distribution of tasks from multi-task expert demonstrations, which is essential for general-purpose robots. Existing MIL algorithms suffer from low data efficiency and poor performance on complex long-horizon tasks. We develop Multi-task Hierarchical Adversarial Inverse Reinforcement Learning (MH-AIRL) to learn hierarchically-structured multi-task policies, which are better suited to compositional tasks with long horizons and achieve higher expert data efficiency by identifying and transferring reusable basic skills across tasks.
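
To illustrate the kind of hierarchical structure such a policy has, the sketch below shows a two-level policy in which a high-level policy picks a discrete skill from the state and a task context, and a skill-conditioned low-level policy outputs the primitive action. All names, dimensions, and architectures here are illustrative assumptions, not the paper's implementation:

import torch
import torch.nn as nn

class HighLevelPolicy(nn.Module):
    """Picks a discrete skill (option) from the state and task context."""
    def __init__(self, state_dim, context_dim, num_skills, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + context_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, num_skills),
        )

    def forward(self, state, context):
        logits = self.net(torch.cat([state, context], dim=-1))
        return torch.distributions.Categorical(logits=logits)

class LowLevelPolicy(nn.Module):
    """Outputs a primitive action conditioned on the active skill."""
    def __init__(self, state_dim, num_skills, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + num_skills, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, skill_onehot):
        mean = self.net(torch.cat([state, skill_onehot], dim=-1))
        return torch.distributions.Normal(mean, torch.ones_like(mean))

# One decision step: sample a skill, then an action under that skill.
state = torch.randn(1, 10)
context = torch.randn(1, 4)  # task identifier / embedding (assumed)
high = HighLevelPolicy(state_dim=10, context_dim=4, num_skills=8)
low = LowLevelPolicy(state_dim=10, num_skills=8, action_dim=2)
skill = high(state, context).sample()
action = low(state, nn.functional.one_hot(skill, 8).float()).sample()

Because the low-level skills are shared across task contexts, expert data collected on one task can inform the others, which is the source of the data-efficiency gain described above.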


Variance Reduction in Offline Reinforcement Learning:

Recent work has shown that offline reinforcement learning can be formulated as a sequence modeling problem and solved via supervised learning with approaches such as the Decision Transformer. While these sequence-based methods achieve competitive results against return-to-go methods, especially on tasks with long episodes or sparse rewards, they do not use importance sampling to correct the policy bias that arises from off-policy data, mainly because no explicit behavior policy is available and the evaluation policies are deterministic. To this end, we propose Double Policy Estimation (DPE): an RL algorithm that blends offline sequence modeling and offline reinforcement learning in a unified framework with statistically proven variance-reduction properties.
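
For concreteness, the sketch below shows per-decision importance sampling, the standard off-policy correction that motivates the bias and variance issues DPE addresses; the function and toy data are illustrative assumptions, not the DPE estimator itself:

def per_decision_is_return(rewards, pi_probs, beh_probs, gamma=0.99):
    """Estimate the evaluation policy's return from one behavior-policy
    trajectory by reweighting each reward with the cumulative
    likelihood ratio of the actions taken so far."""
    ratio, total = 1.0, 0.0
    for t, (r, p_pi, p_b) in enumerate(zip(rewards, pi_probs, beh_probs)):
        ratio *= p_pi / p_b          # cumulative importance weight
        total += (gamma ** t) * ratio * r
    return total

# Toy trajectory: rewards plus action probabilities under both policies.
rewards   = [0.0, 0.0, 1.0]
pi_probs  = [0.9, 0.8, 0.7]          # evaluation policy
beh_probs = [0.5, 0.5, 0.5]          # (estimated) behavior policy
print(per_decision_is_return(rewards, pi_probs, beh_probs))

The cumulative ratio is exactly what becomes ill-defined when the behavior policy is unknown and the evaluation policy is deterministic (so p_pi is 0 or 1), which is the gap the double policy estimation framework is designed to close.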
