TY - GEN
T1 - Regret-Optimal Controller for the Full-Information Problem
AU - Sabag, Oron
AU - Goel, Gautam
AU - Lale, Sahin
AU - Hassibi, Babak
N1 - Publisher Copyright:
© 2021 American Automatic Control Council.
PY - 2021/5/25
Y1 - 2021/5/25
N2 - We consider the infinite-horizon, discrete-time full-information control problem. Motivated by learning theory, as a criterion for controller design we focus on regret, defined as the difference between the linear quadratic regulator (LQR) cost of a causal controller (that has access only to past and current disturbances) and the LQR cost of a clairvoyant one (that also has access to future disturbances). In the full-information setting, there is a unique optimal non-causal controller that dominates all other controllers in terms of LQR cost, and we focus on the regret compared to this particular controller. Since the regret itself is a function of the disturbances, we consider the worst-case regret over all possible bounded-energy disturbances, and propose to find a causal controller that minimizes this worst-case regret. The resulting controller has the interpretation of guaranteeing the smallest possible regret compared to the best non-causal controller that can see the future, no matter what the disturbances are. We show that the regret-optimal control problem can be reduced to a Nehari extension problem, i.e., to approximating an anticausal operator with a causal one in the operator norm. In the state-space setting we obtain explicit formulas for the optimal regret and for the regret-optimal controller. The regret-optimal controller is the sum of the classical H2 control law and an n-th order controller (where n is the state dimension of the plant) obtained from the Nehari problem. The controller construction simply requires the solution to the standard LQR Riccati equation, in addition to two Lyapunov equations. Simulations over a range of plants demonstrate that the regret-optimal controller interpolates nicely between the H2 and the H∞ optimal controllers, and generally has H2 and H∞ costs that are simultaneously close to their optimal values. The regret-optimal controller thus presents itself as a viable option for control system design.
AB - We consider the infinite-horizon, discrete-time full-information control problem. Motivated by learning theory, as a criterion for controller design we focus on regret, defined as the difference between the linear quadratic regulator (LQR) cost of a causal controller (that has access only to past and current disturbances) and the LQR cost of a clairvoyant one (that also has access to future disturbances). In the full-information setting, there is a unique optimal non-causal controller that dominates all other controllers in terms of LQR cost, and we focus on the regret compared to this particular controller. Since the regret itself is a function of the disturbances, we consider the worst-case regret over all possible bounded-energy disturbances, and propose to find a causal controller that minimizes this worst-case regret. The resulting controller has the interpretation of guaranteeing the smallest possible regret compared to the best non-causal controller that can see the future, no matter what the disturbances are. We show that the regret-optimal control problem can be reduced to a Nehari extension problem, i.e., to approximating an anticausal operator with a causal one in the operator norm. In the state-space setting we obtain explicit formulas for the optimal regret and for the regret-optimal controller. The regret-optimal controller is the sum of the classical H2 control law and an n-th order controller (where n is the state dimension of the plant) obtained from the Nehari problem. The controller construction simply requires the solution to the standard LQR Riccati equation, in addition to two Lyapunov equations. Simulations over a range of plants demonstrate that the regret-optimal controller interpolates nicely between the H2 and the H∞ optimal controllers, and generally has H2 and H∞ costs that are simultaneously close to their optimal values. The regret-optimal controller thus presents itself as a viable option for control system design.
UR - http://www.scopus.com/inward/record.url?scp=85107617594&partnerID=8YFLogxK
U2 - 10.23919/ACC50511.2021.9483023
DO - 10.23919/ACC50511.2021.9483023
M3 - Conference contribution
AN - SCOPUS:85107617594
T3 - Proceedings of the American Control Conference
SP - 4777
EP - 4782
BT - 2021 American Control Conference, ACC 2021
PB - Institute of Electrical and Electronics Engineers
T2 - 2021 American Control Conference, ACC 2021
Y2 - 25 May 2021 through 28 May 2021
ER -