Abstract: Coordinated charging of electric vehicles (EVs) is becoming an important topic in smart demand management. Traditional model-driven methods depend heavily on the accuracy of models of charging behavior, yet the strong stochasticity of the relevant parameters means that such models cannot fully capture the underlying uncertainties. Because data-driven, model-free reinforcement learning does not rely on pre-built models and can adapt to data samples with strongly nonlinear relationships, it is applied here to optimize the charging load of EV charging stations. In a Markov decision process tailored to the satisfaction of EV charging needs, a charging completion degree index and a penalty term for user charging satisfaction are both introduced to improve the policy evaluation function. To maintain computational speed under the large volume of charging data, a temporal difference learning algorithm with incremental updates is used for training. Simulations are carried out with real-world data from one charging station. The results show that the proposed algorithm can accurately and quickly compute coordinated charging schedules without pre-modeling the EV charging behavior parameters.
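To make the incremental temporal-difference idea mentioned above concrete, the following is a minimal sketch of a tabular TD(0)-style value update combined with a reward that mixes a completion-degree term and a satisfaction penalty. The reward weights, state encoding, function names, and hyperparameters are illustrative assumptions and do not reproduce the paper's exact formulation.

```python
# Minimal sketch of an incremental TD(0) update with a reward shaped by a
# charging completion-degree term and a user-satisfaction penalty.
# All names and numbers below are illustrative assumptions.


def reward(completion_degree: float, unmet_demand_kwh: float,
           w_complete: float = 1.0, w_penalty: float = 0.5) -> float:
    """Completion-degree term minus a penalty for unmet charging demand.

    completion_degree: delivered energy / requested energy, in [0, 1] (assumed index).
    unmet_demand_kwh: energy still missing at departure (assumed penalty driver).
    """
    return w_complete * completion_degree - w_penalty * unmet_demand_kwh


def td0_update(V: dict, s, r: float, s_next,
               alpha: float = 0.1, gamma: float = 0.95) -> None:
    """Incremental TD(0) update: V(s) <- V(s) + alpha * (r + gamma*V(s') - V(s))."""
    td_error = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * td_error


if __name__ == "__main__":
    V = {}  # tabular state-value estimates, keyed by a discretized station state
    # One illustrative transition: state = (hour of day, number of EVs connected).
    s, s_next = (18, 5), (19, 4)
    r = reward(completion_degree=0.8, unmet_demand_kwh=2.0)
    td0_update(V, s, r, s_next)
    print(V)
```

Because each transition updates the value table in place, the computational cost per charging record stays constant, which is consistent with the abstract's emphasis on keeping training fast over large volumes of charging data.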