Time-Varying Constraint-Aware Reinforcement Learning for Energy Storage Control (Papers Track) Spotlight

Jaeik Jeong (Electronics and Telecommunications Research Institute (ETRI)); Tai-Yeon Ku (Electronics and Telecommunications Research Institute (ETRI)); Wan-Ki Park (Electronics and Telecommunications Research Institute (ETRI))

Power & Energy Reinforcement Learning Time-series Analysis

Abstract

Energy storage devices, such as batteries, thermal energy storage, and hydrogen systems, can help mitigate climate change by ensuring a more stable and sustainable power supply. To maximize the effectiveness of such energy storage, it is crucial to determine the appropriate charging and discharging amounts for each time period. Reinforcement learning is preferred over traditional optimization for energy storage control because it can adapt to dynamic and complex environments. However, charging and discharging levels are continuous, which limits discrete reinforcement learning, and the feasible charge-discharge range varies over time with the state of charge (SoC), which limits conventional continuous reinforcement learning. In this paper, we propose a continuous reinforcement learning approach that takes into account the time-varying feasible charge-discharge range. Alongside the objectives for training the actor (policy learning) and the critic (value learning), we introduce an additional objective function for learning the feasible action range for each time period. Enforcing the charging and discharging levels into the feasible action range actively promotes the utilization of energy storage by preventing it from getting stuck in suboptimal states, such as remaining fully charged or fully discharged. The experimental results demonstrate that the proposed method further maximizes the effectiveness of energy storage by actively enhancing its utilization.
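
To make the notion of a time-varying feasible charge-discharge range concrete, the following is a minimal sketch, not the authors' method. It assumes a normalized storage device in which a positive action charges and a negative action discharges, and it computes the feasible range analytically from the SoC and clips the action into it. The paper instead learns this range through an additional objective function; the names `feasible_action_range`, `project_action`, and `step`, and the parameters `capacity` and `max_rate`, are hypothetical.

```python
def feasible_action_range(soc, capacity=1.0, max_rate=0.5):
    """Return (low, high) bounds on the action for the current SoC.

    Assumed convention: positive action = charge, negative = discharge.
    The bounds change every step because they depend on the SoC.
    """
    max_charge = min(max_rate, capacity - soc)  # cannot exceed remaining headroom
    max_discharge = min(max_rate, soc)          # cannot draw more than is stored
    return -max_discharge, max_charge


def project_action(action, soc, capacity=1.0, max_rate=0.5):
    """Clip a raw continuous action into the feasible range for this SoC."""
    low, high = feasible_action_range(soc, capacity, max_rate)
    return max(low, min(high, action))


def step(soc, action, capacity=1.0, max_rate=0.5):
    """Apply the projected action and return (next_soc, applied_action)."""
    applied = project_action(action, soc, capacity, max_rate)
    return soc + applied, applied


if __name__ == "__main__":
    soc = 0.75
    # The raw action asks for 0.5 of charge, but only 0.25 of headroom remains.
    next_soc, applied = step(soc, 0.5)
    print(next_soc, applied)
```

In this sketch, a fully charged device has a feasible range whose upper bound is zero, so any raw action that keeps requesting charge is clipped to no action at all. This is the kind of stuck state the abstract describes, and it suggests why making the agent aware of the feasible range during training can matter more than clipping after the fact.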