Abstract: Reactive voltage regulation is an important control measure for maintaining voltage stability within a regional power grid. Existing single-agent designs based on reinforcement learning suffer from several problems: strong coupling between states and actions, a large number of possible combinations of reactive power compensation devices, and reward functions built on a target-deviation model that are poorly suited to the task. To address these problems, this paper proposes a description of the integrated operating state of the grid that accounts for both node voltage amplitudes and capacitor switching status, and designs a reinforcement learning agent group architecture for grid reactive voltage regulation. The group selects its active agent members according to the current integrated operating state of the grid, and the selected members issue the corresponding reactive power regulation actions. Each agent member is rewarded according to the degree to which the grid state improves between adjacent time periods. A case study shows that the method is applicable to the grid reactive voltage regulation environment. Compared with the single-agent design, it effectively reduces the size of the action set and handles a wider variety of reactive voltage regulation conditions.
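As a minimal illustration of the ideas summarized above, the following Python sketch shows one way a grid state combining node voltage amplitudes with capacitor switching status could be represented, how an improvement-based reward between adjacent time periods could be computed, and how a group might pick an agent member from the current state. Everything in it is an assumption made for illustration rather than the paper's formulation: the 0.95 to 1.05 p.u. voltage band, the `violation` measure, the names `GridState`, `improvement_reward`, and `select_member`, and the agent labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed per-unit voltage band; the source does not state the limits.
V_MIN, V_MAX = 0.95, 1.05


@dataclass(frozen=True)
class GridState:
    """Hypothetical integrated operating state of the grid."""
    voltages: Tuple[float, ...]      # node voltage amplitudes (p.u.)
    capacitors_on: Tuple[bool, ...]  # switching status of each capacitor bank


def violation(state: GridState) -> float:
    """Total out-of-band voltage deviation; 0.0 means all nodes are in band."""
    return sum(
        max(V_MIN - v, 0.0) + max(v - V_MAX, 0.0) for v in state.voltages
    )


def improvement_reward(prev: GridState, curr: GridState) -> float:
    """Reward as the reduction in violation between adjacent time periods.

    Positive when the grid state improved, negative when it got worse.
    """
    return violation(prev) - violation(curr)


def select_member(state: GridState) -> str:
    """Hypothetical rule for choosing which agent member should act."""
    low = any(v < V_MIN for v in state.voltages)
    high = any(v > V_MAX for v in state.voltages)
    if low and high:
        return "mixed_agent"
    if low:
        return "raise_voltage_agent"
    if high:
        return "lower_voltage_agent"
    return "idle"


if __name__ == "__main__":
    before = GridState(voltages=(0.92, 1.00), capacitors_on=(False, False))
    after = GridState(voltages=(0.96, 1.01), capacitors_on=(True, False))
    print(select_member(before))
    print(improvement_reward(before, after))
```

Because the reward depends only on the change between two consecutive states, an action that moves voltages toward the band is credited even when the grid has not yet fully recovered, which is the intuition behind replacing a pure target-deviation reward. Splitting the decision across members also means each member only needs the actions relevant to its own condition, which is one plausible reading of how the action set becomes smaller.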