9) Fourier and von Neumann stability analyses#
Last time:
Hyperbolic PDEs
FD schemes for hyperbolic equations
Stability of hyperbolic PDEs
The Courant-Friedrichs-Lewy Condition
Today:
Fourier analysis
1.1 Discrete Fourier transform
1.2 Parseval identities
1.3 Fourier analysis for PDEsNumerical computation of continuous symbols
Von Neumann Stability Analysis
3.1 The heat equation
3.2 Discrete Fourier transform properties
3.3 ExamplesUpwind methods
Godunov’s Theorem
1. Fourier analysis#
What we have learned so far:
Note
Courant-Friedrichs-Lewy (CFL) condition \(|a r|\le 1\) (for explicit one-step schemes applied to \(u_t+au_x+bu=f\)) is necessary (but not sufficient) for stability. It expresses the need for the numerical speed of propagation \(r^{-1}\) to match or exceed the physical speed of propagation \(a\).
Lax Equivalence Theorem: A consistent finite difference scheme for a partial differential equation for which the initial value problem is well-posed is convergent if and only if it is stable.
The Fourier transform of a function \(u(x)\), \(x\in\mathbb{R}\) is defined by
The Fourier inversion formula
Essentially, the Fourier transform representation expresses \(u(x)\) as an infinite superposition of (complex) waves \(e^{i\omega x}=\cos(\omega x)+i\sin(\omega x)\), with amplitudes \(\hat{u}(\omega)\).
\(u(x)\) and \(\hat{u}(\omega)\) must satisfy certain criteria for the integrals (above) to be well-defined. We sweep those details under the rug, and refer to MATH 668: Applied Fourier Analysis_.
Example 9.1:#
Consider the function
The Fourier transform of \(u(x)\) is given by
The tools needed for evaluation of such integrals can be found in MATH 532: Functions of a Complex Variable.
Tables of Fourier transforms can be found online.
A warning:
There are several ways of defining the Fourier transform. The normalization constants for the forward and inverse transforms are chosen from one of the following three set
and the factors in the integrals can be chosen to be
Of course, mathematicians and engineers have agreed to disagree on the definition of the “One True Fourier Transform”.
Example 9.2:#
For the IVP for the heat equation:
We have
and, taking the transform of the whole PDE:
The second to last step followed by integrating by parts twice (under the assumption that the solution to the PDE is sufficiently nice at \(\pm \infty\) so that the integrals exist and the terms evaluated at \(\pm \infty\) are zero).
Note
We see that the Fourier transform reduces the PDE to an ordinary differential equation (ODE) in transform space (the space of transform functions).
We also note that on the RHS, where we only have spatial terms, we have found that
That is:
The Fourier mode is an eigenfunction.
And the eigenvalue is \(-\omega^2\).
1.1 Discrete Fourier transform#
We want to extend the defintion of the Fourier transform to functions defined on discrete sets of points, such as the grid points \(x_k = k \Delta x, \quad k = 0, \ldots, M .\)
For a grid function \(u_m\) defined for all integers coordinates \(m\), the Fourier transform is given by
for \(\xi\in[-\pi,\pi]\), and \(\hat{u}(\pi)=\hat{u}(-\pi)\).
The inversion formula is given by
For a grid function \(u_m\) defined for all coordinates \(x_m=h\cdot m\), the Fourier transform is given by
where \(\xi\in[-\pi/h,\pi/h]\), and \(\hat{u}(\pi/h)=\hat{u}(-\pi/h)\).
The inversion formula is given by
1.2 Parseval identities#
Recall that the
Euclidean norm is also called the quadratic norm, \(L^2\) norm. This is a special case (\(p=2\)) of the \(L^p\) norm for \(L^p\)-spaces, which are finite spaces.
For vectors that have an infinite number of components (like sequences), we use the notation \(ℓ^2\) norm.
With this defintion, we say that a (real- or complex-valued) measurable function is a square-integrable (or quadratically integrable) function or \(L^2\) function if the integral of the square of the absolute value is finite. That is:
Note that the same definition is valud for a function defined on a bounded intervale \([a,b]\).
When we take transforms, we get the inifinite dimensional linear space of complex-valued, Lebesgue square-integrable functions, \(L^2(\mathbb{R})\),
with norm
For vectors of infinite dimensions (sequences), \(\mathbf{x} = (\ldots, x_{-1}, x_0, x_1, \ldots)^T\), we have the \(\ell_2\) norm:
Proposition
If \(v \in L^2\) and \(\hat{v}\) is the discrete Fourier transform of \(v\), then
and
These relations are key to our stability analysis, and are also a big reason why measuring quantities in the \(L^2\) (and/or \(\ell_2\)) norm is usually a good thing.
Many times the norm expresses a natural physical energy,and that energy is preserved under the Fourier transform.
1.3 Fourier analysis for PDEs#
First-derivative in space#
Given the Fourier inversion formula
we formally compute the derivative with respect to \(x\):
This leads us to the stunningly simple, and extremely useful conclusion that
That is:
Note
Differentiation on the physical space (where \(x\) is the indipendent variable), corresponds to multiplication by \(i\omega\) on the Fourier transform space (where \(\omega\) is the indipendent variable).
In general, \(\partial_x^{(k)} \longleftrightarrow (i \omega)^k\)
Higher-order derivatives in space#
It now follows that the squared \(L^2\)-norm of the \(k\)-th derivative is given by
These quantities exist (i.e., \(u\) has \(L^2\) integrable derivatives of order through \(k\)), if and only if
From this we can define the function space called Sobolev space, a vector space of functions equipped with a norm that is a combination of \(L^p\)-norms of the function together with its derivatives (the derivatives in a Sobolov space are understood in a suitable weak sense to make the space complete - we will see more of this later in the course) up to a given order, denoted by \(W_{2}^r(\mathbb{R})\), or \(W^{k,2}(\mathbb{R})\)) \(H^k(\mathbb{R})\) (\(k>0\)), as the set of functions \(u \in L^2(\mathbb{R})\), for which (note \(H^0(\mathbb{R})\equiv L^2(\mathbb{R})\))
We can now introduce the notation
2. Numerical computation of continuous symbols#
Example 9.3: Symbol of the spatial second-derivative FD operator#
Recall that for the Poisson problem \(-u_{xx} = f\), we need the second-derivative \(u_{xx}\). We have used the centered-scheme stencil:
In the Fourier analysis, we consider the plane waves \(\phi(x, \omega) = e^{-i\omega x}\).
We sample \(\phi\) on a discrete grid, where \(m \in \mathbb Z\), and obtain \(\phi_m = e^{-im \xi}\). We then apply the stencil
We obtain
And we call \(\hat{\rho}(\xi)\) the symbol of the spatial differential operator.
Let’s plot it:
using Plots
using LinearAlgebra
using LaTeXStrings
default(linewidth=3)
default(legendfontsize=12)
# Check stencil's symbol against symbol of continuous (second) differentiation
plot([ξ -> 2 - 2*cos(ξ),
ξ -> ξ^2],
xlims=(-pi, pi), ylim=(0, 5),
label = [L"2 - 2*\cos(\xi)" L"\xi^2"],
xlabel = L"\xi", ylabel = L"\hat{\rho}(\xi)")
Numerically computing continuous symbols of spatial derivatices#
function symbol(S, ξ)
if length(S) % 2 != 1
error("Length of stencil must be odd")
end
w = length(S) ÷ 2
phi = exp.(1im * (-w:w) * ξ')
vec(S * phi) # not! (S * phi)'
end
ξ = LinRange(-pi, pi, 10)
symbol([1 -2 1], ξ)
10-element Vector{ComplexF64}:
-4.0 + 0.0im
-3.532088886237956 + 0.0im
-2.34729635533386 + 0.0im
-1.0000000000000004 + 0.0im
-0.12061475842818326 + 0.0im
-0.12061475842818326 + 0.0im
-0.9999999999999992 + 0.0im
-2.34729635533386 + 0.0im
-3.532088886237956 + 0.0im
-4.0 + 0.0im
function plot_symbol(S, deriv, n_ξ=30)
ξ = LinRange(-pi, pi, n_ξ)
sym = symbol(S, ξ)
rsym = real.((-1im)^deriv * sym) # taking advantage of the shift property
fig = plot(ξ, rsym, marker=:circle, label="stencil")
plot!(fig, th -> th^deriv, label="exact", xlabel = L"\xi", ylabel = L"\hat{\rho}(\xi)")
fig
end
plot_symbol([1 -2 1], 2) # second derivative
Stencils of high-order spatial operators#
# Recall our utility/convenience `vander` and `fdstencil` from Lecture 7
function vander(x, k=nothing)
if k === nothing
k = length(x)
end
V = ones(length(x), k)
for j = 2:k
V[:, j] = V[:, j-1] .* x
end
V
end
function fdstencil(source, target, k)
"kth derivative stencil from source to target"
x = source .- target
V = vander(x)
rhs = zero(x)'
rhs[k+1] = factorial(k)
rhs / V
end
x = -5:5
plot_symbol(fdstencil(x, 0, 1), 1) # 1st derivative, it's ω^(1) = linear
x = -5:5
plot_symbol(fdstencil(x, 0, 3), 3) # 3rd derivative
Outlook on Fourier methods#
The Fourier modes \(e^{-i m \xi}\) and their multi-dimensional extensions are eigenfunctions of all stencil-type operations
“High frequencies” \([-\pi, \pi) \setminus [-\pi/2, \pi/2)\) are generally poorly resolved so we need to use a grid fine enough that important features are at low frequencies \([-\pi/2, \pi/2)\)
Same technique can be used to study the inverse (and approximations thereof), as with multigrid and multilevel domain decomposition methods
These methods can also be framed within the theory of (block) Toeplitz matrices
3. Von Neumann Stability analysis#
The application of Fourier analysis which presently is of interest to us is the application to the stability analysis of finite difference schemes for PDEs is known as the von Neumann stability analysis.
Let’s see it with a worked example.
3.1 The heat equation#
Let’s start with an example. We have seen the heat equation already:
Discretized by a F.T.C.S. scheme:
We can recast this grouping the like terms:
where \(r = \nu \Delta t/ \Delta x^2\) and \(-\infty < k < \infty\).
We begin by taking the discrete Fourier transform of both sides of (9.4). We get:
Recognizing that by making the change of variables \(m = k \pm 1\), we get,
Thus, using this expression above, leads us to
Definition
We call \(\rho(\xi)\) the symbol, growth/amplification factor, or recurrence relation of the difference scheme.
We see that, just as it was the case with the continuous analog, taking the Fourier transform of the difference scheme gets rid of the spatial derivatives and it reduces the euqation to an algebraic equation.
Applying this recursively, \(n+1\) times, we get
We note that our numerical solution \(\hat{u}^{n+1}\) stays bounded only if restrict \(r\) so that
so that our numerical scheme is stable.
Let’s analyze this condition:
The right inequality
is always true.
The left inequality
or
which holds when
We have that \(r \leq 1/2\) is a sufficient condition for stability (and along with the consistency, for convergence).
What if \(r > 1/2\)? You can try it as an exercise and notice that for at least some \(\xi\) (specifically, for at least \(\xi = \pi\)) we have that the symbol is such that \(|\rho(\xi)|>1\). Hence, for \(r > 1/2\), the method is unstable.
Therefore, we have that \(r \leq 1/2\) is a sufficient and necessary condition for stability (and along with the consistency, for convergence).
Remark:
We assume that the sumbol \(\rho\) is a continuous function of \(\xi\). Often, we will assume that it has at least one or two derivatives.
Recall
For the calculation of the symbol, it is useful to review some of the trigononetric identities derived with Euler’s formula:
\(e^{i\theta} = \cos\theta + i\sin\theta\)
\(e^{-i\theta} = \cos\theta - i\sin\theta\)
\(\cos \theta = \textrm{Re}(e^{i \theta}) = (e^{i\theta} + e^{-i\theta})/2\)
\(\sin \theta = \textrm{Im}(e^{i \theta}) = (e^{i\theta} - e^{-i\theta})/2i\)
\(1 - \cos\theta = 2\sin^2\left(\frac{\theta}{2}\right)\)
\(\sin\theta = 2 \sin\left(\frac{\theta}{2}\right) \cos\left(\frac{\theta}{2}\right)\)
3.2 Discrete Fourier transform properties#
To shorten the notation so that we can take the discrete Fourier transform without writing all of the summations, we have given the definition of discrete Fourier transform above:
This operator \( \mathcal{F}\) has many properties, but the two most important ones are that \(\mathcal{F}\) is linear and preseves the norm.
Shifts#
We can then define the shift operators as
Proposition
3.3 Examples#
Example 9.4#
The one-way wave/transport/advection equation:
Remember that this equation expresses transport in a “wind” with constant propagation speed \(a\) in this case.
In general, we can find the following representation for the unknown scalar field \(u(t, x)\), whih represents the local density of the material being transported
Generally, in 3D, the bulk motion of the fluid that transports the material is represented by the vector field \(w = (w_x, w_y, w_z)\) (which might be time- and position-dependent)
The general time-dependent model is then
whose analytical soluton is \(u(t,x) = u(0, x - tw)\).
u0(x) = @. -tanh(2*x) # a smooth initial condition
usolution(t,x; a=-1) = @. u0(x - t*a)
x = LinRange(-1, 1, 50)
plot(x, [usolution(t, x) for t in 0:.2:2], legend=:none)
flux: mass of material transported through unit area in unit time (dimensions: \(ML^{-2}T^{-1}\))
\(\to\) advective flux \(= wu\) (check dimensions: \( M L^{-3} \cdot L T^{-1}\))
W.l.o.g. we can consider this problem in 1D, and noticing that the total change in mass in the segment \([x_1, x_2]\) is due to incoming and outgoing flux on the left and right, we have:
For smooth functions, rewrite right-hand-side
Rearrange so that
\[ \int_{x_1}^{x_2} dx (u_t + (wu)_x) = 0 \]Since it needs to be true for all \(x_1\), \(x_2\), the integrand has to be identically zero
Now, let’s go back to the constant speed of propagation case,
We want to analyze the stability of the following FTFS difference scheme
which we can rewrite it as:
where we have introduced \(R = a\frac{\Delta t}{\Delta x}\).
Solution:
We start by taking the discrete Fourier transform of (9.5) to get
We see that in this case, the symbol or amplification/growth factor is complex.
When we apply the difference scheme iteratively, we find the recursive relation:
For stability, we need
In this case, the magnitude of \(\rho\) is:
Hence, it is more convenient to analyze \(| \rho(\xi)|^2\).
Here
Though we could analyze this via the trig identities as we have done in the previous example, we want to introduce here another simple method.
To determine the maximun of \(| \rho(\xi)|^2\) on the interval \([-\pi, \pi]\), we analyze its extrema (by looking at points where \(\rho'(\xi) = 0\)). By doing so, we find that \(\rho(\xi)\) has potential points of maximum/minimum at \(\xi = 0, \pm \pi\).
Let’s evaluate these points of maximum/minimum and bound them by \(1\):
For \(| \rho(\pm \pi)| \leq 1\) we have \(-1 \leq 1 + 2R \leq 1\).
Recall that for this IVP \(a<0\), then \(R<0\), which gives us that the difference (9.5) is conditionally stable for \(R \geq - 1\).
Once we have stability via the symbol of the PDE, we translate this notion back to the PDE via the energy norm (Parseval identity).
s(R, ξ) = (1+R) .- R .* exp.(im .* ξ)
R = -0.8 # try -0.9, -1.1
ξ = range(0, 2π; length=256)
ρξ = s(R, ξ)
# plot in the ξ-complex plane
plot(real.(ρξ), imag.(ρξ); linecolor=:blue, linestyle=:dash, label = L"ρ")
plot!(cos.(ξ), sin.(ξ); linecolor=:red, linestyle=:solid, aspect_ratio=:equal, xlabel = L"\textrm{Re}(ξ)", ylabel = L"\textrm{Im}(ξ)", label = L"|ρ|\leq 1")
Diffusion#
Imagine a dye droplet being added to a glass of water
Spreads, even though water is still (\(w = 0\))
Initial spread is quick, then it slows down; steady state is when density (concentration) is uniform
Question: what does that tell you about the flux (rate of spread) as a function of the density (concentration)? In other words, if \(f_{\mathrm{diff}} \propto g(u)\), what is \(g\)?
Solution: \(f_{\mathrm{diff}} \propto \nabla u\), specifically \(f_{\mathrm{diff}} =- \kappa\nabla u\) (Fick’s first law of diffusion)
\(\kappa\) is the (scalar) diffusivity coefficient, set to 1 for now
Using the same continuity argument as earlier and adding a potential source of flux \(f\), we reach the nonhomegenous advection-diffusion equation
Advection-diffusion example with conservative FD#
In general, if a PDE is derived from a conserved quantity, a conservative form is the one that expresses the PDE in terms of said quantity. (We will learn more about conservation laws next lecture).
function advdiff_conservative(n, kappa, wind, forcing)
x = LinRange(-1, 1, n)
xstag = (x[1:end-1] + x[2:end]) / 2 # edge/face values
L = zeros(n, n)
rhs = forcing.(x)
kappa_stag = kappa.(xstag)
for i in 2:n-1
# Recall: fdstencil(source, target, k)
flux_L = -kappa_stag[i-1] * fdstencil(x[i-1:i], xstag[i-1], 1) + wind * [.5 .5] # average of u_{i-1} and u_i
flux_R = -kappa_stag[i] * fdstencil(x[i:i+1], xstag[i], 1) + wind * [.5 .5] # average of u_{i+1} and u_i
js = i-1:i+1
weights = fdstencil(xstag[i-1:i], x[i], 1)
L[i, i-1:i+1] = weights[1] * [flux_L..., 0] + weights[2] * [0, flux_R...] # Hence, we have a centered (second-order) interpolation of u
end
L[1, 1] = 1
rhs[1] = 0
L[2:end, 1] .= 0
L[end,end] = 1
rhs[end] = 0
L[1:end-1,end] .= 0
x, L, rhs
end
advdiff_conservative (generic function with 1 method)
4. Upwind methods#
In an upwinded discretization the main idea is that:
Idea: incoming advective flux should depend only on upwind value, outgoing should depend only on “my” current value
i.e. derivatives computed using a set of nodes biased to be more “upwind” of the query (target) point
function advdiff_upwind(n, kappa, wind, forcing)
x = LinRange(-1, 1, n) # cell values
xstag = (x[1:end-1] + x[2:end]) / 2 # edge/face values
L = zeros(n, n)
rhs = forcing.(x)
kappa_stag = kappa.(xstag)
for i in 2:n-1 # for the interior nodes
# first-order upwinding for the advective flux: use the value from the upstream side of the face
flux_L = -kappa_stag[i-1] * fdstencil(x[i-1:i], xstag[i-1], 1) +
wind * (wind > 0 ? [1 0] : [0 1]) # for w>0, wu ≈ w u_{i-1}, for w<0, wu ≈ w u_{i+1}
flux_R = -kappa_stag[i] * fdstencil(x[i:i+1], xstag[i], 1) +
wind * (wind > 0 ? [1 0] : [0 1]) # for w>0, wu ≈ w u_{i+1}, for w<0, wu ≈ w u_{i-1}
js = i-1:i+1
weights = fdstencil(xstag[i-1:i], x[i], 1)
L[i, i-1:i+1] = weights[1] * [flux_L..., 0] + weights[2] * [0, flux_R...] # hence the flux is computed using F_{i-1/2} and F_{i+1/2}
end
L[1, 1] = 1
rhs[1] = 0
L[2:end, 1] .= 0
L[end,end] = 1
rhs[end] = 0
L[1:end-1,end] .= 0
x, L, rhs
end
advdiff_upwind (generic function with 1 method)
Let’s test if it is robust
n = 30;
h = 2/(n-1) # our Δx
kappa = 1 # diffusivity coefficient
wind = 100000
x, L, rhs = advdiff_upwind(n, x -> kappa, wind, one)
@show minimum(diff(x))
@show Pe = minimum(diff(x)) * abs(wind) / kappa
plot(x, L \ rhs, marker=:auto, legend=:none, title="\$Pe_h = $(wind*h/kappa)\$", xlabel = "x", ylabel = "u")
minimum(diff(x)) = 0.06896551724137923
Pe = (minimum(diff(x)) * abs(wind)) / kappa = 6896.551724137923
What is the order of accuracy of the upwinded discretization?
5. Godunov’s Theorem (1954)#
Linear numerical methods
For our purposes, monotonicity is equivalent to positivity preservation,
Discontinuities#
A numerical method for representing a discontinuous function on a stationary grid (const in time) can be no better than first order accurate in the \(L^1\) norm,
If we merely sample a discontinuous function, say
In light of these two observations, we may ask ourselves if we are doomed to only have first-order accurate methods.
No, we may still ask for numerical methods that are more than first order accurate for smooth solutions, but those methods must be nonlinear.