Ben Van Roy, Wanqiao Xu receive Outstanding Paper Award on the Theory of Reinforcement Learning

Their paper develops a new exploration method for the understudied continuing RL problem.

Congratulations to Professor Ben Van Roy and PhD student Wanqiao Xu on receiving the Outstanding Paper Award on the Theory of Reinforcement Learning!

The award, presented at the first-ever Reinforcement Learning Conference, was given for their paper, "Posterior Sampling for Continuing Environments."

Professor Ben Van Roy (left) and Wanqiao Xu

Accolades from the Reinforcement Learning Conference site: "This paper develops a new exploration method for the understudied continuing RL problem, showing how one can extend the posterior sampling algorithm originally designed for the episodic setting to the continuing setting. This is achieved by showing how one can reinterpret existing methods to resample a new policy at every time step instead of doing so at the beginning of each episode, an approach which is also effective in high-dimensional state spaces. The resampling probability can be used by the agent to dynamically adjust its planning horizon, thus better handling infinite-horizon problems. This paper stood out through its theoretical rigour in adapting algorithms from the episodic to the continuing setting, being one of the first papers to do so, potentially serving as a catalyst for more research on this topic."
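For readers curious what resampling without episodes might look like in practice, here is a minimal illustrative sketch in Python. It is not the authors' implementation. It assumes a small tabular environment with known rewards, a Dirichlet posterior over transitions, and a resampling probability of 1 - gamma tied to the discount factor gamma; the function names (`value_iteration`, `sample_model`, `run_continuing_ps`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma, iters=200):
    """Return a greedy policy for transitions P[s, a, s'] and rewards R[s, a]."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)  # Q has shape (states, actions)
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

def sample_model(counts, rng):
    """Draw one transition model from independent Dirichlet posteriors."""
    n_states, n_actions, _ = counts.shape
    P = np.empty_like(counts, dtype=float)
    for s in range(n_states):
        for a in range(n_actions):
            P[s, a] = rng.dirichlet(counts[s, a])
    return P

def run_continuing_ps(true_P, R, gamma=0.9, steps=500, seed=0):
    """Run one continuing stream of interaction with no episode boundaries."""
    rng = np.random.default_rng(seed)
    n_states = true_P.shape[0]
    counts = np.ones_like(true_P, dtype=float)  # assumed uniform Dirichlet prior
    policy = value_iteration(sample_model(counts, rng), R, gamma)
    state, total_reward, resamples = 0, 0.0, 0
    for _ in range(steps):
        # Assumption: resample with probability 1 - gamma at each step,
        # so a larger gamma means longer commitment to one sampled model.
        if rng.random() < 1.0 - gamma:
            policy = value_iteration(sample_model(counts, rng), R, gamma)
            resamples += 1
        action = policy[state]
        next_state = rng.choice(n_states, p=true_P[state, action])
        total_reward += R[state, action]
        counts[state, action, next_state] += 1  # posterior update
        state = next_state
    return total_reward, resamples, counts

# Hypothetical two-state, two-action environment for illustration only.
true_P = np.array([[[0.9, 0.1], [0.1, 0.9]],
                   [[0.8, 0.2], [0.2, 0.8]]])
R = np.array([[0.0, 0.0], [0.0, 1.0]])
total_reward, resamples, counts = run_continuing_ps(true_P, R)
```

In this sketch, the expected time between resamples is 1 / (1 - gamma), which is one simple way to read the citation's remark that the resampling probability relates to the agent's planning horizon.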
