====== The Newton-Krylov-hookstep algorithm ======

The following is a detailed guide to Divakar Viswanath's Newton-Krylov-hookstep solution algorithm for equilibria, traveling waves, and periodic orbits of the Navier-Stokes equations. See Viswanath, "Recurrent motions within plane Couette turbulence", J. Fluid Mech. 580 (2007), http://arxiv.org/abs/physics/0604062 for a higher-level description. In this exposition //f// stands for the finite-time map of the Navier-Stokes equations, not the residual being minimized (as it does in Dennis & Schnabel's and Trefethen's discussions of constrained minimization and Krylov methods). Also, I use //t// and //dt// for the orbit period and its Newton-step increment, although in my code these variables are named //T// and //dT//, and //dt// stands for the integration time step.

===== Problem definition =====

Let

<latex>
$
\begin{align*}
f^t(u) &= \text{the time-}t\text{ map of Navier-Stokes computed by DNS,} \\
\sigma &= \text{a symmetry of the flow.}
\end{align*}
$
</latex>

We seek solutions of the equation

<latex>
G(u,\sigma,t) = \sigma f^t(u) - u = 0
</latex>

//t// can be a fixed or a free variable, as can σ. For plane Couette flow the continuously varying variables in σ are the phase shifts //a_x// and //a_z// in //x → x + a_x L_x// and //z → z + a_z L_z//.

===== Discretization =====

Let //x// be the vector of unknown real numbers being sought. For concreteness I will assume in this exposition that we are looking for a relative periodic orbit, so that the period //t// of the orbit and the phase shifts //a_x// and //a_z// are unknowns. Then //x = (u, σ, t)//, where by //u// I really mean the finite set of real-valued variables that parameterize the discrete representation of the velocity field (e.g. the real and imaginary parts of the spectral coefficients). To be perfectly pedantic (and because it will come in handy later), let there be //M// independent real numbers //{u_m}// in the discrete representation of //u//, and let //P// represent the map between continuous fields //u// and the discrete representation //{u_m}//:

<latex>
$
\begin{align*}
\{u_m\} &= P(u) \\
u_m &= P_m(u), \;\;\; 1 \leq m \leq M
\end{align*}
$
</latex>

Then //x = (u, σ, t)// is an //M+3// dimensional vector

<latex>
x = (u_1, u_2, \ldots, u_M, a_x, a_z, t)
</latex>

and the discrete equation to be solved is

<latex>
$
\begin{align*}
G(x) &= 0 \\
G_m(x) &= P_m(\sigma f^t(u) - u), \;\;\; 1 \leq m \leq M
\end{align*}
$
</latex>

This is //M// equations in //M+3// unknowns. The indeterminacy follows from the equivariance of //f// and //G// under time and spatial phase shifts. We will get three more equations by constraining the Newton step to be orthogonal to the time and space shifts.

===== The Newton algorithm =====

The Newton algorithm for solving //G(x) = 0// is as follows. Let //x* = (u*, σ*, t*)// be the solution, i.e. //G(x*) = G(u*, σ*, t*) = 0//. Suppose we start with an initial guess //x// near //x*//. Let

<latex>
$
\begin{align*}
x^* &= x + dx_N \\
u^* &= u + du_N \\
\sigma^* &= \sigma + d\sigma_N \\
t^* &= t + dt_N
\end{align*}
$
</latex>

In these equations the //N// subscript signifies the Newton step. Then

<latex>
$
\begin{align*}
G(x^*) &= G(x + dx_N) \\
0 &= G(x) + DG(x)\, dx_N + O(|dx_N|^2)
\end{align*}
$
</latex>

Drop the higher-order terms and solve the Newton equation

<latex>
DG(x) \, dx_N = -G(x)
</latex>

Because //DG// is an //M x (M+3)// matrix, we have to supplement this system with three constraint equations to have a unique solution //dx_N//:

<latex>
$
\begin{align*}
(du_N, du/dt) &= 0 \\
(du_N, du/dx) &= 0 \\
(du_N, du/dz) &= 0
\end{align*}
$
</latex>

Here //( , )// signifies an inner product, so these are orthogonality constraints preventing the Newton step from going in the directions of the time, //x//, and //z// equivariance. Together these form an //(M+3) x (M+3)// dimensional system

<latex>
A \, dx_N = b
</latex>

===== Krylov subspace solution of the Newton equation =====

Since //M// is very large, we use the iterative GMRES Krylov-subspace algorithm to find an approximate solution //dx_N// of the //(M+3) x (M+3)// system //A dx_N = b//. GMRES requires multiple evaluations of the product //A dx// for test values of //dx//. //A dx// is an //M+3// dimensional vector whose first //M// entries can be approximated with finite differencing and time integration:

<latex>
$
\begin{align*}
DG(x)\, dx &= \frac{1}{\epsilon} \left( G(x + \epsilon \, dx) - G(x) \right) \text{ where } \epsilon \|dx\| \ll 1 \\
&= \frac{1}{\epsilon} P[(\sigma + \epsilon\, d\sigma) f^{t + \epsilon\, dt}(u + \epsilon\, du) - (u + \epsilon\, du) - (\sigma f^{t}(u) - u)] \\
&= \frac{1}{\epsilon} P[(\sigma + \epsilon\, d\sigma) f^{t + \epsilon\, dt}(u + \epsilon\, du) - \sigma f^{t}(u) - \epsilon\, du]
\end{align*}
$
</latex>

The last three entries of //A dx// are simple evaluations of the inner products //(du, du/dt)//, //(du, du/dx)//, and //(du, du/dz)//. For details of the GMRES algorithm, see Trefethen & Bau.
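The following is a minimal NumPy sketch of this matrix-free product, which is all GMRES needs from the Navier-Stokes side. Everything here is a hypothetical stand-in for the corresponding channelflow operation: //f_t// for the DNS time-//t// map, //apply_sigma// for the phase shifts, and //du_dt//, //du_dx//, //du_dz// for the equivariance directions. The real implementation is the C++ code in findorbit.cpp.

<code python>
import numpy as np

def A_times_dx(x, dx, G_x, f_t, apply_sigma, du_dt, du_dx, du_dz, eps0=1e-7):
    """Approximate A dx for GMRES. x and dx are (M+3)-vectors ordered
    (u_1, ..., u_M, a_x, a_z, t); G_x is the precomputed M-vector G(x).
    f_t, apply_sigma, du_dt, du_dx, du_dz are stand-ins for the DNS
    time-t map, the symmetry sigma, and the equivariance tangent vectors."""
    eps = eps0 / np.linalg.norm(dx)          # enforce eps*|dx| << 1
    u,  ax,  az,  t  = x[:-3],  x[-3],  x[-2],  x[-1]
    du, dax, daz, dt = dx[:-3], dx[-3], dx[-2], dx[-1]
    # G(x + eps dx) = (sigma + eps dsigma) f^{t + eps dt}(u + eps du) - (u + eps du)
    u_adv  = f_t(u + eps * du, t + eps * dt)
    G_pert = apply_sigma(ax + eps * dax, az + eps * daz, u_adv) - (u + eps * du)
    DG_dx  = (G_pert - G_x) / eps            # first M entries of A dx
    # last three entries: the orthogonality constraints on the step
    return np.concatenate([DG_dx, [du @ du_dt, du @ du_dx, du @ du_dz]])
</code>

A routine like this can be wrapped in a scipy.sparse.linalg.LinearOperator and handed to scipy.sparse.linalg.gmres in place of an explicit matrix; the essential point is that //A// is never formed.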
That gives us the Newton step //dx_N//. However, if we are too far from the solution //x*//, the linearization inherent in the Newton algorithm will not be accurate, and at this point we switch from the pure Newton algorithm to a constrained minimization algorithm called the "hookstep", specially adapted to work with GMRES.

===== The hookstep algorithm =====

The classic hookstep algorithm is this: if the Newton step //dx_N// does not decrease the residual //|| G(x+dx_N) ||// sufficiently, we obtain a smaller step //dx_H// by minimizing //|| A dx_H - b ||^2// subject to //||dx_H||^2 ≤ δ^2//, rather than using the Newton step, which solves //A dx = b// exactly.

In the present case //A// is very high-dimensional, so we further constrain the hookstep //dx_H// to lie within the Krylov subspace obtained from the solution of the Newton equation; that is, we minimize the norm of the projection of //A dx_H - b// onto the Krylov subspace. The //n//th GMRES iterate gives matrices //Q_n//, //Q_{n+1}//, and //H_n// such that

<latex>
A Q_n = Q_{n+1} H_n
</latex>

where //Q_n// is //M x n//, //Q_{n+1}// is //M x (n+1)//, and //H_n// is //(n+1) x n//. The projection of //A dx - b// onto the //(n+1)//th Krylov subspace is //Q_{n+1}^T (A dx - b)//. Confine //dx_H// to the //n//th Krylov subspace by letting //dx_H = Q_n s//, where //s = {s_n}// is an //n//-dimensional vector to be determined by constrained optimization. Namely, we minimize

<latex>
$
\begin{align*}
|| Q_{n+1}^T (A \, dx_H - b) || &= || Q_{n+1}^T A Q_n s - Q_{n+1}^T b || \\
&= || Q_{n+1}^T Q_{n+1} H_n s - Q_{n+1}^T b || \\
&= || H_n s - Q_{n+1}^T b ||
\end{align*}
$
</latex>

subject to //||dx_H||^2 = ||s||^2 ≤ δ^2//. Note that the quantity inside the norm on the right-hand side is only //(n+1)//-dimensional, and //n// is small (order 10 or 100) compared to //M// (10^5 or 10^6).

The constrained minimization problem simplifies if we decompose //H_n// with an SVD, //H_n = U D V^T//. Then the right-hand side of the previous equation becomes

<latex>
$
\begin{align*}
|| H_n s - Q_{n+1}^T b || &= || U D V^T s - Q_{n+1}^T b || \\
&= || D V^T s - U^T Q_{n+1}^T b || \\
&= || D \hat{s} - \hat{b} ||
\end{align*}
$
</latex>

where //\hat{s} = V^T s// and //\hat{b} = U^T Q_{n+1}^T b//. Now we need to minimize //|| D \hat{s} - \hat{b} ||^2// subject to //||\hat{s}||^2 ≤ δ^2//.

===== Solving for the Krylov-hookstep =====

From Divakar Viswanath, personal communication: since //D// is diagonal, //D_{i,i} = d_i//, the constrained minimization problem can be solved easily with Lagrange multipliers. The solution is

<latex>
\hat{s}_i = \frac{\hat{b}_i d_i}{d^2_i + \mu}
</latex>

(no summation over //i//) for the value of //μ// such that //||\hat{s}||^2 = δ^2//. This value of //μ// can be found with a 1d Newton search for the zero of

<latex>
\Phi(\mu) = ||\hat{s}(\mu)||^2 - \delta^2
</latex>

A straight Newton search à la //μ → μ - Φ(μ)/Φ'(μ)// is suboptimal, since //Φ'(μ) → 0// as //μ → ∞// with //Φ''(μ) > 0// everywhere, so we use a slightly modified update rule for the Newton search over //μ//; please refer to Dennis & Schnabel for the details. Then, given the solution //\hat{s}(μ)//, we compute the hookstep //dx_H// from

<latex>
dx_H = Q_n s = Q_n V \hat{s}
</latex>
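Assuming GMRES has produced //H_n// and //Q_{n+1}// (named //Hn// and //Qn1// below), the entire hookstep computation is small and dense. The following sketch implements the formulas above; in place of the Dennis & Schnabel update rule it uses another standard reformulation of the 1d search, Newton iteration on //1/δ - 1/||ŝ(μ)||//, which is close to linear in //μ// and avoids the same flattening of //Φ//. Names and tolerances are illustrative, and //H_n// is assumed to have no zero singular values.

<code python>
import numpy as np

def hookstep(Hn, QT_b, Qn, delta, tol=1e-10, maxiter=100):
    """Minimize ||H_n s - Q_{n+1}^T b|| subject to ||s|| <= delta, then map
    back to dx_H = Q_n s. Hn is (n+1) x n; QT_b is the vector Q_{n+1}^T b."""
    U, d, Vt = np.linalg.svd(Hn, full_matrices=False)  # thin SVD: H_n = U D V^T
    b_hat = U.T @ QT_b                # the dropped (n+1)th component of b_hat
    s_hat = b_hat / d                 # only adds a constant to the residual
    if np.linalg.norm(s_hat) > delta:             # Newton step outside trust region
        mu = 0.0
        for _ in range(maxiter):
            s_hat = b_hat * d / (d**2 + mu)       # Lagrange-multiplier solution
            ns = np.linalg.norm(s_hat)
            if abs(ns - delta) <= tol * delta:    # ||s_hat(mu)|| = delta reached
                break
            dns2 = -2.0 * np.sum((b_hat * d)**2 / (d**2 + mu)**3)  # d||s_hat||^2/dmu
            dphi = dns2 / (2.0 * ns**3)           # derivative of 1/delta - 1/ns
            mu = max(0.0, mu - (1.0 / delta - 1.0 / ns) / dphi)
    s = Vt.T @ s_hat                  # s = V s_hat
    model_res2 = np.linalg.norm(Hn @ s - QT_b)**2   # model residual, used below
    return Qn @ s, model_res2                       # dx_H = Q_n s
</code>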
===== Determining the trust-region radius δ =====

How do we know a good value for δ? Essentially, by comparing the reduction in residual obtained by actually taking the hookstep //dx_H// to the reduction predicted by the linearized model //G(x+dx_H) = G(x) + DG(x) dx_H//. If the actual reduction agrees well with the prediction but is small, we increase δ in small steps until the prediction is only marginally accurate but the reduction is larger; if the actual reduction is poor compared to the prediction, we reduce δ. (Note: the trust-region optimization is actually performed in terms of the squared residual, so the comments in the code refer to a quadratic rather than a linear model of the residual.) The heuristics associated with adjusting the trust region are fairly complex. Dennis & Schnabel give a decent description, but it took quite a bit of effort to translate that description into an algorithm. Rather than attempt to translate the algorithm back into prose, I recommend that you read Dennis & Schnabel and then refer to the code and comments; this portion of findorbit.cpp is very liberally commented.
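As a rough orientation only, the bare skeleton of such a ratio-based trust-region update looks like the following sketch; the thresholds are conventional textbook values, not the ones used in findorbit.cpp.

<code python>
def adjust_delta(delta, actual, predicted, delta_max=10.0):
    """Simplified trust-region update. rho compares the actual decrease of
    ||G||^2 over the step with the decrease predicted by the quadratic model."""
    rho = actual / predicted
    if rho < 0.25:                        # model predicted poorly: shrink
        return 0.5 * delta
    elif rho > 0.75:                      # model accurate: cautiously expand
        return min(2.0 * delta, delta_max)
    return delta                          # marginally accurate: leave delta alone
</code>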

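Putting the pieces together, the outer iteration has roughly the following shape. This is a schematic of the method described above, not a transcription of findorbit.cpp: in particular, //gmres_solve// is a stand-in for a GMRES iteration built on the matrix-free product sketched earlier, the fixed acceptance test is illustrative, and the real code retries hooksteps with adjusted δ within a single Newton iteration.

<code python>
import numpy as np

def newton_krylov_hookstep(x, evaluate_G, gmres_solve, delta=0.1, tol=1e-13,
                           max_newton=20):
    """Schematic outer loop. evaluate_G returns the M-vector G(x);
    gmres_solve returns the Newton step plus the Krylov matrices and b."""
    for _ in range(max_newton):
        G = evaluate_G(x)                       # G(x) = P(sigma f^t(u) - u)
        r2 = G @ G
        if np.sqrt(r2) < tol:
            return x                            # converged: G(x) = 0
        dx, Qn, Qn1, Hn, b = gmres_solve(x, G)  # approximate Newton step
        G_new = evaluate_G(x + dx)
        if G_new @ G_new > 0.5 * r2:            # insufficient decrease of ||G||^2:
            dx, model_res2 = hookstep(Hn, Qn1.T @ b, Qn, delta)   # fall back
            actual = r2 - np.linalg.norm(evaluate_G(x + dx))**2
            predicted = r2 - model_res2         # since b = -G(x), ||b||^2 = ||G||^2
            delta = adjust_delta(delta, actual, predicted)
        x = x + dx
    return x
</code>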