Newton-Krylov-hookstep

The following is a detailed guide to Divakar Viswanath's Newton-Krylov-hookstep solution algorithm for equilibria, traveling waves, and periodic orbits of the Navier-Stokes equations. See Viswanath, “Recurrent motions within plane Couette turbulence”, J. Fluid Mech. 580 (2007), http://arxiv.org/abs/physics/0604062 for a higher-level description.

In this exposition, *f* stands for the finite-time map of the
Navier-Stokes equations, not the residual being minimized (as it
does in Dennis & Schnabel's and Trefethen's discussions of constrained
minimization and Krylov methods). Also, I use *t* and *dt* for
the orbit period and its Newton-step increment, although in my code
these variables are named *T* and *dT*, and *dt* stands for the
integration time step.

Let

  f^t(u) be the time-*t* map of the Navier-Stokes equations, u(t) = f^t(u(0)),
  σ be a symmetry of the flow, and
  (u, σ, t) be the velocity field, symmetry, and period of the desired solution.

We seek solutions of the equation

  G(u, σ, t) = σ f^t(u) - u = 0

*t* can be a fixed or a free variable, as can σ. For plane
Couette the possible continuously varying variables in σ are
*a_x,a_z* in the phase shifts *x → x + a_x L_x* and *z → z + a_z L_z*.
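
For a Fourier discretization these shifts act diagonally on the spectral
coefficients, so σ is cheap to apply. As a sketch, assuming an expansion
*u(x,y,z) = Σ \hat{u}_{k_x k_z}(y) e^{2πi(k_x x/L_x + k_z z/L_z)}* (channelflow's
sign and normalization conventions may differ), the shift acts as

  \hat{u}_{k_x k_z} → e^{2πi(k_x a_x + k_z a_z)} \hat{u}_{k_x k_z}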

Let *x* be the vector of unknown real numbers being sought.
For concreteness in this exposition I will assume that
we're looking for a relative periodic orbit, so that the
period *t* of the orbit and the phase shifts *a_x* and *a_z* are
unknowns. Then *x = (u, σ, t)*, where by *u* I really
mean the finite set of real-valued variables that
parameterize the discrete representation of the velocity field
(e.g. the real and imaginary parts of the spectral coefficients).
To be perfectly pedantic (and because it will come in handy later),
let there be *M* independent real numbers *{u_m}* in the
discrete representation of u, and let *P* represent the map
between continuous fields u and the discrete representation {u_m}.

Then *x = (u, σ, t)* is an *M+3* dimensional vector

  x = ({u_m}, a_x, a_z, t)

and the discrete equation to be solved is

  G(x) = P(σ f^t(u) - u) = 0

This is *M* equations in *M+3* unknowns. The indeterminacy follows from the
equivariance of *f* and *G* under time and spatial phase shifts. We will
get three more equations by constraining the Newton step to be orthogonal to
time and space shifts.
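
As a purely illustrative sketch of this packing (the type and function
names below are hypothetical, not channelflow's), the unknown vector
might be assembled as:

  #include <vector>

  // Hypothetical packing of x = ({u_m}, a_x, a_z, t) into one flat vector.
  struct Unknowns {
      std::vector<double> u;  // M real numbers {u_m}: discrete velocity field
      double ax, az;          // phase shifts a_x, a_z parameterizing sigma
      double t;               // orbit period
  };

  // Flatten into the (M+3)-vector handed to the Krylov solver.
  std::vector<double> pack(const Unknowns& x) {
      std::vector<double> v(x.u);
      v.push_back(x.ax);
      v.push_back(x.az);
      v.push_back(x.t);
      return v;
  }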

The Newton algorithm for solving *G(x) = 0* is as follows.
Let x* = (u*, σ*, t*) be the solution, i.e. G(x*) = G(u*, σ*, t*) = 0.
Suppose we start with an initial guess *x* near x*. Let

  x* = x + dx_N
  u* = u + du_N
  σ* = σ + dσ_N
  t* = t + dt_N

In these equations the *N* subscript signifies the Newton step. Then

  0 = G(x*) = G(x + dx_N) = G(x) + DG(x) dx_N + O(|dx_N|^2)

Drop the higher-order terms and solve the Newton equation

  DG(x) dx_N = -G(x)

Because *DG* is an *M x (M+3)* matrix, we have to supplement this
system with three constraint equations to have a unique solution
*dx_N*:

  (du_N, du/dt) = 0
  (du_N, du/dx) = 0
  (du_N, du/dz) = 0

*( , )* here signifies an inner product, so these are orthogonality
constraints preventing the Newton step from going in the directions
of the time, *x*, and *z* equivariance. That forms an *(M+3) x (M+3)*
dimensional system

  A dx_N = b

whose first *M* rows are the Newton equation and whose last three rows are
the constraints, so that *b = (-G(x), 0, 0, 0)*.

Since *M* is very large, we will use the iterative GMRES Krylov-subspace algorithm
to find an approximate solution *dx_N* of the *(M+3)x(M+3)* system *A dx_N = b*.
GMRES requires multiple evaluations of the product *A dx* for test values of *dx*.
*A dx* is an *M+3* dimensional vector whose first *M* entries can be approximated
with finite differencing and time integration:

  DG(x) dx ≈ (G(x + ε dx) - G(x)) / ε

for a suitably small *ε*; each evaluation of *G* costs an integration of the
Navier-Stokes equations over the time *t*.

The last three entries of *A dx* are simple evaluations of the inner products *(du, du/dt),
(du, du/dx)*, and *(du, du/dz)*. For details of the GMRES algorithm, see Trefethen and Bau.
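
In code, the product *A dx* is never formed from an explicit matrix. A minimal
C++ sketch of the matrix-free product (evalG stands in for the DNS evaluation
of *G*, the derivative fields are assumed precomputed, and the choice and
scaling of *ε* are simplified):

  #include <vector>

  using Vec = std::vector<double>;

  // Placeholder for the channelflow computation G(x) = P(sigma f^t(u) - u);
  // each call integrates Navier-Stokes for time t and returns M numbers.
  Vec evalG(const Vec& x);

  double dot(const Vec& a, const Vec& b) {
      double s = 0.0;
      for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
      return s;
  }

  // Matrix-free product A*dx for GMRES, with dx = (du, da_x, da_z, dt).
  // First M entries:  DG(x) dx ~= (G(x + eps dx) - G(x)) / eps.
  // Last three:       inner products of du with du/dt, du/dx, du/dz.
  Vec applyA(const Vec& x, const Vec& dx, const Vec& Gx,
             const Vec& dudt, const Vec& dudx, const Vec& dudz,
             double eps = 1e-7) {
      Vec xp(x.size());
      for (size_t i = 0; i < x.size(); ++i) xp[i] = x[i] + eps * dx[i];
      Vec Gp = evalG(xp);

      const size_t M = x.size() - 3;
      Vec Adx(M + 3);
      for (size_t i = 0; i < M; ++i) Adx[i] = (Gp[i] - Gx[i]) / eps;

      Vec du(dx.begin(), dx.begin() + M);
      Adx[M]     = dot(du, dudt);
      Adx[M + 1] = dot(du, dudx);
      Adx[M + 2] = dot(du, dudz);
      return Adx;
  }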

That gives us the Newton step *dx_N*. However, if we are too far from
the solution *x**, the linearization inherent in the Newton algorithm
will not be accurate. At this point we switch from the pure Newton
algorithm to a constrained minimization algorithm called the
“hookstep”, specially adapted to work with GMRES.

The classic hookstep algorithm is this: if the Newton step *dx_N* does not decrease
the residual *|| G(x+dx_N) ||* sufficiently, we obtain a smaller step
*dx_H* by minimizing *|| A dx_H - b ||^2* subject to *||dx_H||^2 ≤ δ^2*,
rather than using the Newton step, which solves *A dx = b* exactly.

In the current case *A* is very high-dimensional, so we further constrain the
hookstep *dx_H* to lie within the Krylov subspace obtained from the solution of
the Newton step equations. I.e. we minimize the norm of the projection of
*A dx_H - b* onto the Krylov subspace. The *n*th GMRES iterate gives
matrices *Q_n*, *Q_{n+1}*, and *H_n* such that

  A Q_n = Q_{n+1} H_n

where *Q_n* is *(M+3) x n*, *Q_{n+1}* is *(M+3) x (n+1)*, and *H_n* is *(n+1) x n*.
The projection of *A dx - b* onto the *(n+1)*th Krylov subspace is
*Q_{n+1}^T (A dx - b)*. Confine *dx_H* to the *n*th Krylov subspace by
letting *dx_H = Q_n s*, where *s = {s_n}* is an *n*-dimensional vector to be
determined by constrained optimization. Namely, we minimize

  || Q_{n+1}^T (A dx_H - b) || = || H_n s - Q_{n+1}^T b ||

subject to *||dx_H||^2 = || s ||^2 ≤ δ^2*. Note that the quantity in
the norm on the right-hand side is only *(n+1)*-dimensional, and *n* is
small (order 10 or 100) compared to *M* (10^5 or 10^6).
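
This reduction follows in one line from the GMRES relation above and the
orthonormality of the columns of *Q_{n+1}* (so *Q_{n+1}^T Q_{n+1} = I*):

  Q_{n+1}^T (A Q_n s - b) = Q_{n+1}^T Q_{n+1} H_n s - Q_{n+1}^T b = H_n s - Q_{n+1}^T b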

The constrained minimization problem simplifies if we decompose *H_n* with SVD: *H_n = U D V^T*.
Then the RHS of the previous equation becomes

  || H_n s - Q_{n+1}^T b || = || D \hat{s} - \hat{b} ||

where *\hat{s} = V^T s* and *\hat{b} = U^T Q_{n+1}^T b* (multiplying inside
the norm by the orthogonal matrix *U^T* changes nothing). Now we need to
minimize *|| D \hat{s} - \hat{b} ||^2* subject to *||\hat{s}||^2 ≤ δ^2*.

From Divakar Viswanath, personal communication: Since *D* is diagonal *D_{i,i} = d_i*,
the constrained minimization problem can be solved easily with Lagrange multipliers.
The solution is

  \hat{s}_i = \hat{b}_i d_i / (d_i^2 + μ)

(no Einstein summation) for the *μ* such that *||\hat{s}||^2 = δ^2*.
This value of *μ* can be found with a 1d Newton search for the zero of

  Φ(μ) = ||\hat{s}(μ)||^2 - δ^2

A straight Newton search à la *μ → μ - Φ(μ)/Φ'(μ)* is suboptimal
since *Φ'(μ) → 0* as *μ → ∞* with *Φ''(μ) > 0* everywhere, so
we use a slightly modified update rule for the Newton search over *μ*.
Please refer to Dennis and Schnabel regarding that. Then, given the
solution *\hat{s}(μ)*, we compute the hookstep solution *s* and the step
*dx_H* from

  s = V \hat{s},    dx_H = Q_n s
How do we know a good value for δ? Essentially, by comparing
the reduction in residual obtained by actually taking the hookstep
*dx_H* to the reduction predicted by the linearized model *G(x+dx_H) ≈
G(x) + DG(x) dx_H*. If the reduction is accurate but small, we increase
δ by small steps until the reduction is marginally accurate but larger.
If the reduction is poor, we reduce δ.

(Note: the trust-region optimization is actually performed in terms of the squared residual, so the comments in the code refer to a quadratic rather than linear model of the residual.)

The heuristics associated with adjusting the trust region are fairly complex. Dennis and Schnabel has a decent description, but it took quite a bit of effort to translate that description into an algorithm. Rather than attempt to translate the algorithm back into prose, I recommend you read Dennis and Schnabel and then refer to the code and comments. This portion of findorbit.cpp is very liberally commented.
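
For orientation only, here is a heavily simplified C++ schematic of how the
pieces above fit together. All function names are placeholders, and the
trust-region constants (0.75, 0.1, the factors of 2) are illustrative, not
the heuristics findorbit.cpp actually uses:

  #include <vector>
  #include <cmath>

  using Vec = std::vector<double>;

  // Stand-ins for the pieces described above (DNS residual, GMRES,
  // Krylov-subspace hookstep, linear-model prediction, vector addition).
  Vec evalG(const Vec& x);
  Vec gmresNewtonStep(const Vec& x, const Vec& Gx);
  Vec hookstepInKrylov(const Vec& x, const Vec& Gx, double delta);
  double predictedResidual(const Vec& x, const Vec& Gx, const Vec& dx);
  Vec add(const Vec& a, const Vec& b);

  double norm(const Vec& v) {
      double s = 0.0;
      for (double vi : v) s += vi * vi;
      return std::sqrt(s);
  }

  // Grossly simplified schematic of the outer Newton-Krylov-hookstep loop.
  void newtonHookstepSearch(Vec& x, double delta, int maxNewtonSteps) {
      for (int n = 0; n < maxNewtonSteps; ++n) {
          Vec Gx = evalG(x);
          double r = norm(Gx);
          if (r < 1e-13) return;  // converged to a (relative) periodic orbit

          Vec dxN = gmresNewtonStep(x, Gx);
          Vec dx = (norm(dxN) <= delta)
                       ? dxN                              // trust the Newton step
                       : hookstepInKrylov(x, Gx, delta);  // constrained minimizer

          double rNew  = norm(evalG(add(x, dx)));         // actual residual
          double rPred = predictedResidual(x, Gx, dx);    // linear-model residual

          // Crude trust-region update: grow delta when the model predicts
          // the actual reduction well, shrink it when the prediction is poor.
          double ratio = (r - rNew) / (r - rPred + 1e-30);
          if (ratio > 0.75) delta *= 2.0;
          else if (ratio < 0.1) delta *= 0.5;

          if (rNew < r) x = add(x, dx);  // accept the step only if it helps
      }
  }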
