Step 1: Write Down the Values of Interest

The first step to solving any problem is writing it down. Since this section's focus is optimal control problems, I will formalize the process I follow when facing one. The goal is to write down a set of necessary conditions that any locally optimal solution must satisfy, so that whether we eventually solve the problem analytically or numerically, we can verify that the solution satisfies them. The first step is to write down the values of interest.

  1. The states that describe the system, such as position, velocity, temperature, angle, etc. These quantities are characterized by being continuous functions of time. For our purposes here, let us write them as $x_j(t)$, $j = 1,\ldots,n_x$, but note that alternative notation may be more useful depending on the problem.
  2. The inputs that we may design, such as voltage, force, etc. These quantities do not necessarily have to be continuous. If you find that an input must be continuous, then it is probably a state variable and the real input is something else. For this discussion, the inputs are labeled $u_j(t)$, $j = 1,\ldots,n_u$.
  3. The time interval of interest: an initial time $t_0$ and a final time $t_f$ such that $t_0 < t_f$. These times may either be fixed values set by the problem statement, or they may be free variables for which we must solve.
  4. Additional parameters for the problem. These are typically static values such as a friction coefficient, a thermal expansion coefficient, a drag coefficient, reference values, etc. These parameters are important as they often provide valuable information as to whether an optimal control problem is solvable or not. They are labeled $p_j$, $j = 1,\ldots,n_p$, but again, other notation may be more useful depending on the problem. Each of these may be either fixed or free.
Before moving further, it is good practice to make a list of the states, the inputs, and the parameters, as well as the number of states $n_x$, the number of inputs $n_u$, and the number of parameters $n_p$.
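
As a running example (a toy problem of my own choosing, not one of the problems referenced elsewhere in these notes), consider a cart sliding along a line: the states are its position $x_1(t)$ and velocity $x_2(t)$, so $n_x = 2$; the single input is the commanded acceleration $u(t)$, so $n_u = 1$; the time interval is fixed, $t_0 = 0$ and $t_f = 1$; and there are no parameters, $n_p = 0$.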

Step 2: Write Down the Dynamical Equations, the Costs, and the Constraints

With the values now written down, we move to the functions that relate these quantities.

  1. For each state, $x_j(t)$, we write down the first-order differential equation that describes its evolution, $\dot{x}_j(t) = f_j(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{p},t)$, $j = 1,\ldots,n_x$.
  2. There may exist end-point constraints, $e_j^L \leq e_j(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) \leq e_j^U$, $j = 1,\ldots,n_e$. Often these consist of initial conditions for the system of ODEs, or final conditions if we are trying to reach a particular value for the states at the final time.
  3. There may be path constraints, $h_j^L \leq h_j(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{p},t) \leq h_j^U$, $j = 1,\ldots,n_h$, which must hold at every time $t \in [t_0,t_f]$. These may consist of expressions that ensure some quantities remain positive, or bounds that an input must satisfy.
  4. There may be an end-point cost, denoted as $E(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f)$, which returns a scalar value associated with the initial and final values of the states and times. This cost function should reward more beneficial values of the states at the initial and final times. A common use of this function is to attempt to move the system towards a desirable state without setting it as a constraint in the $e_j$ functions.
  5. The running cost, denoted $F(\boldsymbol{x}(t),\boldsymbol{u}(t),t)$, is integrated over the complete time horizon. This function can be used to measure total fuel consumption, total energy, total deviation from a desired trajectory, etc.
Writing down the dynamical functions, the end-point constraints, the path constraints, the end-point cost, and the running cost makes the following steps much easier to follow. The complete optimal control problem can be stated as, $$ \begin{array}{ll} \min & J = E(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) + \int_{t_0}^{t_f} F(\boldsymbol{x}(t), \boldsymbol{u}(t),t)\, dt\\ \text{s.t.} & \dot{x}_j(t) = f_j(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{p},t), \quad j = 1,\ldots,n_x\\ & h_j^L \leq h_j(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{p},t) \leq h_j^U, \quad j = 1,\ldots,n_h\\ & e_j^L \leq e_j(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) \leq e_j^U, \quad j = 1,\ldots,n_e \end{array} $$
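
For the cart example from Step 1 (again, a toy problem of my own choosing), suppose we want to move the cart from rest at the origin to rest at position one while minimizing the control effort. There are no path constraints, no end-point cost, and the end-point constraints are all equalities, so the template above specializes to, $$ \begin{array}{ll} \min & J = \int_{0}^{1} \tfrac{1}{2} u^2(t)\, dt\\ \text{s.t.} & \dot{x}_1(t) = x_2(t), \quad \dot{x}_2(t) = u(t)\\ & x_1(0) = 0, \quad x_2(0) = 0, \quad x_1(1) = 1, \quad x_2(1) = 0 \end{array} $$ This is a double-integrator version of a minimum-energy problem, and it will serve as a running example in the remaining steps.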

Step 3: Write the Hamiltonian

We are now ready to take the first step towards solving the optimal control problem. We write two Lagrangians: the end-point Lagrangian and the Lagrangian of the Hamiltonian. First, the Lagrangian of the Hamiltonian is written as, $$ \bar{H}(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{\lambda}(t),\boldsymbol{\mu}(t),t) = H(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{\lambda}(t),t) + \boldsymbol{\mu}^T(t) \boldsymbol{h}(\boldsymbol{x}(t),\boldsymbol{u}(t),t) $$ where the Hamiltonian is, $$ H(\boldsymbol{x}(t),\boldsymbol{u}(t),\boldsymbol{\lambda}(t),t) = F(\boldsymbol{x}(t),\boldsymbol{u}(t),t) + \boldsymbol{\lambda}^T(t) \boldsymbol{f}(\boldsymbol{x}(t),\boldsymbol{u}(t),t) $$

We have introduced the co-states, denoted $\lambda_j(t)$, $j=1,\ldots,n_x$, where each co-state is associated with the corresponding state variable $x_j(t)$. They evolve according to the following differential equations, $$ \dot{\lambda}_j(t) = -\frac{\partial \bar{H}}{\partial x_j} = -\frac{\partial F}{\partial x_j} - \left(\frac{\partial \boldsymbol{f}}{\partial x_j} \right)^T \boldsymbol{\lambda}(t) - \left( \frac{\partial \boldsymbol{h}}{\partial x_j} \right)^T \boldsymbol{\mu}(t) $$

Also introduced are the co-path values, denoted $\mu_j(t)$, $j = 1,\ldots,n_h$, one associated with each path constraint. They follow the complementarity condition, $$ \mu_j(t) \left\{ \begin{array}{lll} \leq 0 & \text{if} & h_j(\boldsymbol{x}(t),\boldsymbol{u}(t),t) = h_j^L\\ = 0 & \text{if} & h_j^L < h_j(\boldsymbol{x}(t),\boldsymbol{u}(t),t) < h_j^U\\ \geq 0 & \text{if} & h_j(\boldsymbol{x}(t), \boldsymbol{u}(t),t) = h_j^U\\ \text{unrestricted} & \text{if} & h_j^L = h_j^U \end{array} \right. $$

Finally, the Lagrangian of the Hamiltonian provides the condition for a control to be optimal, called the stationarity condition, $$ \frac{\partial \bar{H}}{\partial u_j} = 0, \quad j = 1,\ldots,n_u $$
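
As a quick illustration with the double-integrator example (which has no path constraints, so $\bar{H} = H$), the Hamiltonian is $$ H = \tfrac{1}{2} u^2(t) + \lambda_1(t) x_2(t) + \lambda_2(t) u(t) $$ The co-state equations are $\dot{\lambda}_1(t) = -\partial H/\partial x_1 = 0$ and $\dot{\lambda}_2(t) = -\partial H/\partial x_2 = -\lambda_1(t)$, and the stationarity condition $\partial H/\partial u = u(t) + \lambda_2(t) = 0$ gives $u(t) = -\lambda_2(t)$. Since $\lambda_1$ is constant and $\lambda_2$ is therefore affine in $t$, the optimal control must be affine in time.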

The end-point Lagrangian is written as, $$ \bar{E} = E(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) + \boldsymbol{\nu}^T \boldsymbol{e}(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) $$ The end-point Lagrangian provides the two transversality conditions, $$ \boldsymbol{\lambda}(t_0) = -\frac{\partial \bar{E}}{\partial \boldsymbol{x}(t_0)} \quad \text{and} \quad \boldsymbol{\lambda}(t_f) = \frac{\partial \bar{E}}{\partial \boldsymbol{x}(t_f)} $$

The end-point Lagrange multipliers, $\nu_j$, $j = 1,\ldots,n_e$, satisfy a complementarity condition similar to the one for $\mu_j(t)$, $$ \nu_j \left\{ \begin{array}{lll} \leq 0 & \text{if} & e_j(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) = e_j^L\\ = 0 & \text{if} & e_j^L < e_j(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) < e_j^U\\ \geq 0 & \text{if} & e_j(\boldsymbol{x}(t_0),\boldsymbol{x}(t_f),t_0,t_f) = e_j^U\\ \text{unrestricted} & \text{if} & e_j^L = e_j^U \end{array} \right. $$

Finally, we can write the end-point Hamiltonian conditions, $$ H|_{t=t_0} = \frac{\partial \bar{E}}{\partial t_0} \quad \text{and} \quad H|_{t=t_f} = - \frac{\partial \bar{E}}{\partial t_f} $$ Writing down each of these equations provides the necessary conditions which any locally optimal solution of the original optimal control problem must satisfy.
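
Returning to the double-integrator example as a sanity check: all four end-point constraints are equalities ($e_j^L = e_j^U$), so the multipliers $\nu_j$ are unrestricted and the transversality conditions place no restriction on $\boldsymbol{\lambda}(t_0)$ or $\boldsymbol{\lambda}(t_f)$. The boundary values of the co-states are instead determined indirectly by the four boundary conditions on the states. Since $t_0$ and $t_f$ are fixed, the end-point Hamiltonian conditions are not needed here.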

Step 4: Make Hypotheses About the Solution

After the required bookkeeping in Step 1, writing down the problem in Step 2, and deriving the Hamiltonian and the various necessary conditions in Step 3, it is time to stop and think about what the optimal solution might look like. There is no strict list of steps to take at this point; only experience can guide the process, but the following general framework can be useful.

  1. Determine when the various constraints will be active or inactive. This step can provide a potential decomposition of the time domain into segments. The Breakwell problem is a good example of this step.
  2. If a control input does not appear explicitly in the stationarity condition, then it is possible that we should expect a "bang-bang" type of control, that is, one that only ever takes its maximum or minimum value. The bang-bang optimal control problem presented is an example of how to handle this case.
  3. Determine whether the system of differential equations can be solved explicitly. If it can be, an analytical solution may be possible! If not, you will probably have to resort to numerical methods.
At this point, after making hypotheses about the form of the solution, a numerical optimal control solver can be used to check whether the solution actually takes the proposed form. Using numerical solvers well is a skill that should be developed in parallel with learning optimal control, as the two are mutually reinforcing.
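
As a minimal sketch of such a numerical check, the following Python snippet transcribes the running double-integrator example into a finite-dimensional program: the control is discretized on a grid, the dynamics are propagated with forward Euler, and a general-purpose solver handles the end-point constraints. The discretization choices here (grid size, Euler integration, SLSQP) are illustrative assumptions, not the only or best options.

```python
import numpy as np
from scipy.optimize import minimize

N = 50                        # number of control intervals (an arbitrary choice)
t0, tf = 0.0, 1.0
dt = (tf - t0) / N

def rollout(u):
    """Propagate x1' = x2, x2' = u from x(0) = (0, 0) with forward Euler."""
    x = np.zeros((N + 1, 2))
    for k in range(N):
        x[k + 1, 0] = x[k, 0] + dt * x[k, 1]
        x[k + 1, 1] = x[k, 1] + dt * u[k]
    return x

def cost(u):
    # Discretized running cost: the integral of u^2/2 dt
    return 0.5 * dt * np.sum(u ** 2)

def endpoint(u):
    # Equality end-point constraints: x(tf) - (1, 0) = 0
    return rollout(u)[-1] - np.array([1.0, 0.0])

res = minimize(cost, np.zeros(N), method="SLSQP",
               constraints={"type": "eq", "fun": endpoint})
u_opt = res.x  # should be approximately affine in t, as the stationarity
               # condition of Step 3 predicts (u = -lambda_2)
```

If the computed control is indeed approximately affine in time, that is evidence in favor of the hypothesis from Step 3; if it is not, either the hypothesis or the transcription deserves a second look.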

Step 5: Solve the Equations for the States and Co-States if Possible

For most real problems, the differential equations that describe the evolution of the states and the co-states, $\boldsymbol{x}(t)$ and $\boldsymbol{\lambda}(t)$ respectively, cannot be solved analytically. A notable exception is when the dynamics are linear time-invariant (LTI), that is, $$ \dot{\boldsymbol{x}}(t) = A \boldsymbol{x}(t) + B \boldsymbol{u}(t) $$ The section on Minimum Energy Control describes this type of problem in general terms.
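
To see why the LTI case is special, suppose additionally (an assumption for this sketch) that there are no path constraints and the running cost is the quadratic $F = \tfrac{1}{2}\boldsymbol{u}^T(t) R \boldsymbol{u}(t)$ with $R$ invertible. The stationarity condition then gives $$ R\boldsymbol{u}(t) + B^T \boldsymbol{\lambda}(t) = 0 \quad \Longrightarrow \quad \boldsymbol{u}(t) = -R^{-1} B^T \boldsymbol{\lambda}(t) $$ while the co-state equation decouples from the states entirely, $$ \dot{\boldsymbol{\lambda}}(t) = -A^T \boldsymbol{\lambda}(t) \quad \Longrightarrow \quad \boldsymbol{\lambda}(t) = e^{-A^T (t - t_0)} \boldsymbol{\lambda}(t_0) $$ Substituting $\boldsymbol{u}(t)$ back into the dynamics leaves a linear ODE in $\boldsymbol{x}(t)$ alone, which can be integrated in closed form.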

Sometimes, it can be useful to draw a graph that describes the relations between the states and the inputs. The graph has $n_x+n_u$ nodes, where each node is assigned either a state or an input. Add a directed edge from node $x_j$ to node $x_k$ if $x_j$ appears in $f_k$; similarly, add a directed edge from node $u_j$ to node $x_k$ if $u_j$ appears in $f_k$. This exercise can reveal whether some of the states (or co-states) are solvable even if the whole system of equations cannot be solved.
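
Here is a small sketch of this bookkeeping in code for the running double-integrator example (the adjacency representation and helper below are hypothetical, not from any particular library):

```python
# An edge s -> x_k means that s appears on the right-hand side of f_k.
edges = {
    "x2": ["x1"],   # x1' = x2 depends on x2
    "u":  ["x2"],   # x2' = u  depends on u
}

def reachable(source, edges):
    """Collect every node influenced, directly or indirectly, by `source`."""
    seen, stack = set(), [source]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# x2 influences x1, but nothing feeds back into x2 except u, so once u is
# known, x2 can be integrated on its own and x1 integrated afterwards.
print(reachable("u", edges))   # {'x2', 'x1'}
```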

Step 6: Analyze the Solution (Numerical or Analytical) to Revise the Problem Formulation