Every day, almost every minute, we make a choice. Right now you have chosen to read this text instead of scrolling further. Choices can be insignificant: to go by tram or by bus, to take an umbrella or not. Sometimes they can be very significant and even crucial: the choice of a university or of a life partner. However, the importance of a choice may not be realized at first. Sometimes a decision "not to take an umbrella" radically changes everything.
The choice may affect a small group of people or entire countries. In game theory, we call it the choice of strategy. Constantly interacting with society and adopting certain strategies, many of us wonder: why can't everyone exist peacefully and cooperate with each other? Why do those who have agreed to cooperate suddenly break the agreement? What if one is cooperative and the other is not? How profitable should the interaction be for the opponent to change his opinion? When are long-term stable prospects better than short-term benefits, and when not?
You will find the answers to these and other questions in our course.
This course will be useful for those who want to make choices based on mathematical calculations rather than relying on fate, and for those who are interested in world politics and have heard about the "Prisoner's Dilemma" at least once.
The course is basic and does not require any special knowledge. In several sections, definitions and theorems from mathematical analysis and elements of probability theory will be used.

Taught by

Петросян Ованес Леонович

Смирнова Надежда Владимировна

Ли Инь

Mariia Bulgakova

Тайницкий Владислав Александрович

Панкратова Ярославна Борисовна

Transcript

Hello. The topic of today's lecture is differential games. The first part is devoted to some preliminary information, namely the approaches used to solve differential games. The second part is devoted to non-cooperative differential games of n players, where the main question is how to model the behavior of players in processes where they have individual preferences, that is, where each player has his own payoff function. The third part is devoted to cooperative differential games, where the question is how to allocate the maximum joint payoff of the players so that cooperation is beneficial for all participants. Let's start with an example called optimization of advertising costs, and consider a market. On the market, there is a company that tries to maximize its revenue. Its revenue mainly depends on its market share. The only tool that can be used to increase the market share is advertising, so the company can control its advertising costs. Let's suppose that the company wants to make an advertising plan for one year. Then the question is how it should allocate the advertising costs: when the company needs to spend more money on advertising and when it does not. In order to construct a mathematical model for this process, the first thing we need to do is to define the optimal control problem. Let's suppose that we have a dynamical system. In our case, it is a company. The dynamics of the system is defined by the system of differential equations, or motion equations, (2). The solution of this motion equation is the function x(t), which defines the state of the game. In our case, the state of the game can be understood as the market share of the company. The right-hand side of the system of differential equations also depends on the function u(t), which is a control function or, in our case, the volume of advertising. For the control function u(t), we will consider a class of functions u(t,x).
That is, functions that depend on t, the time instant, and x, the state of the game. We will also suppose that the conditions of existence, uniqueness and prolongability of the solution of the system of differential equations (2) hold for each such function. We already defined the control function, or the strategy of the players, in a similar way in one of the previous sections. When we considered a multi-stage game with perfect information, the strategy of player i was a mapping that assigns the next vertex on the graph to each vertex from the set of personal moves of player i. It is important that we defined the strategy as a function of any vertex. We do the same here: we define a strategy, or a control function u(t,x), for any time instant t and for any state x(t). So, for any function u(t,x), that is, for any advertising expenses, the right-hand side of the system of differential equations is different. As a result, the trajectory of the system, the function x(t), is different. For different functions x(t) and different control functions, we obtain different values of the functional (1). The optimal control problem is to find the control function u(t,x) that maximizes the value of the functional (1). In our case, the functional (1) could be the profit or the revenue of the company. Here we also suppose that the functions f(t), g(t) and q(t) are differentiable. Let's construct an optimal control problem for the advertising costs model. On the slide, formula (3) defines the functional that we need to maximize, which is the revenue of the company on the interval [0,T]. It depends on the state of the game, the state function x(t), on this period and on the advertising expenses. So, if we fix the market share of the company and change the advertising costs, then of course the higher the advertising costs, the lower the value of the functional.
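The slides with formulas (1) and (2) are not reproduced in this transcript. As a sketch only, a common way to write the general problem uses the lecture's symbols f, g and q as follows. The roles given to them here (g as the instantaneous payoff, q as the terminal payoff, f as the right-hand side of the motion equation) are an assumption and may differ from the slide.

```latex
% Assumed general form of the optimal control problem (1),(2)
\max_{u}\;\Bigl\{ \int_{t_0}^{T} g\bigl(s, x(s), u(s)\bigr)\,ds + q\bigl(x(T)\bigr) \Bigr\} \tag{1}

\dot{x}(s) = f\bigl(s, x(s), u(s)\bigr), \qquad x(t_0) = x_0 \tag{2}
```

In this reading, the advertising functional (3) and the motion equation (4) would be one particular choice of g, q and f with t_0 = 0.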
Formula (4) defines the differential equation, or the motion equation, for this dynamical system. The right-hand side of this differential equation depends on the market share at the current time instant and also on the marketing expenses. But the question is how to find the optimal control, that is, a function u(t,x) that maximizes the functional (3). To do that, we can use several classical approaches. The first one is the dynamic programming principle, or the Bellman equation. The second one is the maximum principle, or Pontryagin's maximum principle. We will use the first one, because in differential games it is the approach that is more widely used. Why? Because the Bellman equation is a sufficient condition for the optimal control: if there is a solution of the Bellman equation, then we can say that our solution is optimal. With the maximum principle, we can find a solution for a much wider class of problems, but it is only a necessary condition, so we would need to check the solution once again and prove that it is indeed optimal. So, in general, in differential games people use the dynamic programming principle. It has some disadvantages, and we will talk about them later. So, what is the dynamic programming principle? Suppose that we know the optimal control in the problem defined on the interval [t0,T]. We can also define the corresponding trajectory. Let's denote the optimal control by u*(t,x) and the corresponding trajectory by x*(t). Then the truncation of the optimal control u*(t,x) to the subproblem defined on the interval [t',T] is also optimal in the problem starting at time instant t' in the position x*(t'), that is, in a position on the optimal trajectory. This is true for any truncated interval. According to this statement, we can define a procedure for finding the optimal solution of the control problem.
To do that, we need to define the notion of the Bellman function. What is the Bellman function? It is the optimal value of the functional (3) in the subproblem starting at time instant t in the state x(t), that is, when the initial condition for the motion equation is given not at zero but at t and x(t). This is what you see on the slide. The Bellman function is a function V(t,x), so it depends only on the initial time instant and state of the subproblem. As you can see, it does not depend on the optimal control or on any control, because the value of the Bellman function is already optimal. How can we use that to find the optimal control in the problem (1),(2)? For that, we can use the so-called Bellman equation, which is presented lower on the slide. We say that if there exists a continuously differentiable function V(t,x) satisfying the Bellman equation, which is a partial differential equation, then the control u(t,x) that maximizes the right-hand side of the Bellman equation is an optimal control in the problem (1),(2). If we can solve the Bellman equation, then the corresponding control is optimal. It's important to know that this is a sufficient condition for the optimal control. But there is a problem: we cannot solve the Bellman equation for a general class of problems. Why? Because the Bellman function can be any continuously differentiable function. But there is an approach that we can use, and let's demonstrate it on the advertising costs example. On the slide, you can see the Bellman equation corresponding to the advertising costs problem, and the question is how to solve it. Let's suppose that the Bellman function has the form presented on the slide, or let's try to find it in this particular form: V(t,x) is equal to e^(-rt) multiplied by the sum of A(t)*x and B(t). The functions A(t) and B(t) are not known.
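The slide formulas are not part of this transcript, so the following is only a sketch. It writes the Bellman equation in its standard form for problem (1),(2), assuming that g denotes the instantaneous payoff, q the terminal payoff and f the right-hand side of the motion equation (these roles are an assumption), and then lists the partial derivatives of the trial form of V(t,x) described above.

```latex
% Bellman equation for problem (1),(2) in the assumed notation f, g, q
-V_t(t,x) = \max_{u}\bigl\{ g(t,x,u) + V_x(t,x)\, f(t,x,u) \bigr\},
\qquad V(T,x) = q(x)

% Trial form of the Bellman function and its partial derivatives
V(t,x)   = e^{-rt}\bigl[A(t)\,x + B(t)\bigr]
V_t(t,x) = e^{-rt}\bigl[\dot{A}(t)\,x + \dot{B}(t) - r\bigl(A(t)\,x + B(t)\bigr)\bigr]
V_x(t,x) = e^{-rt} A(t)
```

Because V_x does not depend on x for this trial form, the maximizing control depends on the unknown functions only through A(t).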
Also, I did not mention it before, but e^(-rt) defines the discount factor, that is, how people discount the payoffs that they are going to obtain in the future. To get more information about that, you can look at the list of references. How can we solve the partial differential equation? We need to substitute this form of the Bellman function into the Bellman equation. Then the partial derivatives are expressed through the functions A(t) and B(t) and their derivatives. Then, after finding the control that maximizes the right-hand side, we can derive the optimal control, which is presented on the slide. But we do not know the function A(t), so we do not yet know the optimal control. In order to find the functions A(t) and B(t), we need to transform the Bellman equation into a system of differential equations. How can we do that? We need to simplify the left-hand and right-hand sides of the Bellman equation so that on the left we have the derivative of A(t) multiplied by x plus the derivative of B(t), and on the right we have a term multiplied by x plus some other term. Then we can say that the derivative of A(t) is equal to the term that is multiplied by x, and the derivative of B(t) is equal to the remaining term on the right-hand side of the Bellman equation. Of course, we cannot do that for a general class of problems. But, for example, for linear-quadratic games the explicit solution is known. If the game is not linear-quadratic, then in the first step we need to try to find a form for the Bellman function, and then we need to try to solve the system of differential equations. In any case, once we have solved the system of differential equations, we substitute the functions A(t) and B(t) into the optimal control and into the Bellman function, and then we substitute the optimal control as a function of (t,x) into the motion equation.
Then we can solve it and find the trajectory x*(t) along which the system, or the company, moves. But in this particular model, the system of differential equations for the functions A(t) and B(t) cannot be solved analytically. What we can do is use numerical methods. As a result, the control function and the Bellman function are also calculated numerically. On the right-hand side of the slide, you can see the optimal control along the corresponding optimal trajectory, when x at time instant t is equal to x*(t), and on the left-hand side you can see the corresponding optimal trajectory. On this slide, you can see a list of references where you can find more information on how to use the dynamic programming principle, information about the maximum principle, and more examples.
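The numerical procedure described above can be sketched in Python. Since the lecture's formulas (3) and (4) are not in the transcript, the sketch uses a hypothetical stand-in model: revenue is the integral over [0,T] of e^(-rt)(margin*x - u^2) and the motion equation is dx/dt = rho*u*sqrt(1 - x) - delta*x. All parameter names and values are assumptions. With the trial form V = e^(-rt)[A(t)x + B(t)], this stand-in gives u*(t,x) = rho*A(t)*sqrt(1 - x)/2, A' = (r + delta)A + rho^2 A^2/4 - margin, B' = rB - rho^2 A^2/4, with A(T) = B(T) = 0. The stand-in is integrated numerically only to mirror the steps of the lecture.

```python
import math

# Hypothetical parameters (not from the lecture): horizon, discount rate,
# revenue per unit of market share, advertising effectiveness, decay rate.
T, r, margin, rho, delta = 1.0, 0.05, 1.0, 1.0, 0.1
N = 1000          # number of time steps
h = T / N         # step size


def rhs(A, B):
    """Right-hand sides of the assumed ODE system for A(t) and B(t)."""
    dA = (r + delta) * A + 0.25 * rho**2 * A**2 - margin
    dB = r * B - 0.25 * rho**2 * A**2
    return dA, dB


def solve_backward():
    """Integrate A and B from t = T down to t = 0 with classical RK4."""
    A = [0.0] * (N + 1)   # terminal conditions A(T) = B(T) = 0
    B = [0.0] * (N + 1)
    for k in range(N, 0, -1):
        a, b = A[k], B[k]
        k1 = rhs(a, b)
        k2 = rhs(a - 0.5 * h * k1[0], b - 0.5 * h * k1[1])
        k3 = rhs(a - 0.5 * h * k2[0], b - 0.5 * h * k2[1])
        k4 = rhs(a - h * k3[0], b - h * k3[1])
        A[k - 1] = a - h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        B[k - 1] = b - h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return A, B


def simulate(A, x0=0.2):
    """Plug the feedback control u*(t, x) into the motion equation (Euler)."""
    xs, us = [x0], []
    for k in range(N):
        x = xs[-1]
        root = math.sqrt(max(0.0, 1.0 - x))
        u = 0.5 * rho * A[k] * root          # maximizer of the Bellman RHS
        us.append(u)
        xs.append(x + h * (rho * u * root - delta * x))
    return xs, us


A, B = solve_backward()
xs, us = simulate(A)
print(f"A(0) = {A[0]:.4f}, V(0, x0) = {A[0] * xs[0] + B[0]:.4f}")
print(f"x*(T) = {xs[-1]:.4f}, u*(0) = {us[0]:.4f}")
```

The order of the steps follows the lecture: first solve for A(t) and B(t) backward from the terminal time, then substitute them into the control, and finally substitute the control into the motion equation to obtain x*(t). A fixed-step RK4 and Euler scheme are used here only to keep the sketch dependency-free; a library ODE solver would serve equally well.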