Guoxi Zhang, Hisashi Kashima. Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning. In Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI), 2023.
A Paper Accepted for AAAI 2023
Our paper, which proposes a method for offline deep reinforcement learning from mixed-behavior episode data collected under several different policies, was accepted to AAAI 2023, a top conference in the field of AI.