6) Introduction to Numerical Analysis#
Last Time:
Introduction to Finite Difference schemes
Evaluating derivatives
Taylor series and truncation error
Today:
Well-posedness
Introduction to Consistency
The leapfrog scheme
Introduction to Accuracy
Introduction to Stability
Introduction to Conditioning
Lax equivalence theorem
Higher-order schemes: the Method of Undetermined Coefficients
1. Well-posedness#
Definition
The initial value problem (IVP) for the first order PDE \(\mathcal{P} u = 0\) is well-posed if for any time \(T \geq 0\), there is a constant \(C_T\) such that any solution \(u(t,x)\) satisfies
\[\| u(t, \cdot) \| \leq C_T \, \| u(0, \cdot) \|\]
for \(0 \leq t \leq T\).
In other words, an IVP is well-posed if its solution depends continuously on the initial data.
2. Introduction to Consistency#
Last time, we found one possible approximation to the heat equation:
where \(f(0)=a(0)\), and \(f(1)=b(0)\).
We derived an F.T.C.S. scheme:
With the Initial Condition (I.C.) and Boundary Conditions (B.C.) approximated by:
and
Of course, this is not the only possible approximation. We could have chosen different schemes. We will see more in the future. For now, let’s see how to assess the goodness of this numerical approximation.
Let’s see how well the difference equation (1.2.6) approximates the partial differential equation (1.2.1).
We will use again the Taylor series expansion to do this
Recall that to approximate the temporal derivative, we have used a simple forward finite difference:
\[v_t(k \Delta x, n \Delta t) \approx \frac{u^{n+1}_k - u^{n}_k}{\Delta t}.\]
For the Taylor expansion:
\[v(k \Delta x, (n+1) \Delta t) = v(k \Delta x, n \Delta t) + \Delta t \, v_t(k \Delta x, n \Delta t) + \frac{\Delta t^2}{2} v_{tt}(k \Delta x, n \Delta t) + \ldots\]
(where we use the letter \(v\) for the solution to the analytic partial differential equation and \(u\) for the discrete approximated solution).
So,
\[\frac{v(k \Delta x, (n+1) \Delta t) - v(k \Delta x, n \Delta t)}{\Delta t} = v_t(k \Delta x, n \Delta t) + \frac{\Delta t}{2} v_{tt}(k \Delta x, n \Delta t) + \ldots\]
Written with the big-\(O\) notation (and writing \(v^n_k = v(k \Delta x, n \Delta t)\) for short), we see that the order of what is left is:
\[\frac{v^{n+1}_k - v^{n}_k}{\Delta t} = v_t(k \Delta x, n \Delta t) + O(\Delta t).\]
This expression assumes that the higher-order derivatives of \(v\) at \((k \Delta x,n \Delta t) \) are bounded.
This Taylor expansion of \(v(k \Delta x, (n+1) \Delta t)\) shows what we are throwing away when we replace \(v_t\) in the PDE with \(\frac{ u^{n+1}_k - u^{n}_k}{\Delta t}\).
We note that, in big-\(O\) notation, a function \(f(s) = O(\phi(s))\) for \(s \in S\) if there exists a constant \(C\) such that \(|f(s)| \leq C |\phi(s)|\), \(\forall s \in S\).
The constant \(C\) can be very large. It will be large for problems in which the solution has sudden changes with respect to time. But generally, for \(\Delta t\) sufficiently small, we are throwing away things of the order of \(\Delta t\), so that the finite difference scheme gives us a decent approximation of the \(v_t\) derivative.
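To see this concretely, here is a quick numerical check (a sketch of ours, not from the lecture code): for the illustrative choice \(f = \sin\) at \(t = 1\), the forward-difference error should shrink like \(\Delta t\), with \(\text{error}/\Delta t \to |f''(1)|/2 \approx 0.42\).

```julia
# Sketch: measure the truncation error of the forward difference (f(t+Δt) - f(t))/Δt.
# The test function f = sin and the point t = 1 are illustrative choices.
f(t) = sin(t)
fp(t) = cos(t)                  # exact derivative
t = 1.0
for Δt in [0.1, 0.05, 0.025, 0.0125]
    err = abs((f(t + Δt) - f(t)) / Δt - fp(t))
    println("Δt = $Δt   error = $err   error/Δt = $(err / Δt)")
end
```

The ratio \(\text{error}/\Delta t\) settles near \(|\sin(1)|/2 \approx 0.42\), confirming that the leading term we threw away is \(\frac{\Delta t}{2} v_{tt}\).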
The same approach can be used to show that
\[v_x(k \Delta x, n \Delta t) = \frac{v^n_{k+1} - v^n_k}{\Delta x} + O(\Delta x) = \frac{v^n_k - v^n_{k-1}}{\Delta x} + O(\Delta x)\]
and
\[v_x(k \Delta x, n \Delta t) = \frac{v^n_{k+1} - v^n_{k-1}}{2 \Delta x} + O(\Delta x^2).\]
We note that the centered difference approximates the first derivative with respect to \(x\) more accurately than either of the one-sided differences. In fact, the size of what is left is \(O(\Delta x^2)\) versus \(O(\Delta x)\).
For the second-order derivative, we have seen:
\[v_{xx}(k \Delta x, n \Delta t) = \frac{v^n_{k+1} - 2 v^n_k + v^n_{k-1}}{\Delta x^2} + O(\Delta x^2).\]
Putting it all together, the PDE (1.2.1) is approximated by
We say that the difference equation (1.2.10) approximates the PDE (1.2.1) to the first order in \(\Delta t\) and second order in \(\Delta x\).
Another way to view (1.2.10) is that if \(v=v(x,t)\) is a solution to the PDE (so that the LHS of (1.2.10) is zero), then \(v\) satisfies the difference equation to the first order in \(\Delta t\) and second order in \(\Delta x\).
Either way, we see that eq. (1.2.10) shows how well the difference equation approximates the PDE.
Note
We should note that this in no way tells us how well the solution to the difference equation approximates the solution to the PDE (which is a notion of convergence).
This is a very important topic that we will investigate. We will claim at this time that the solution to the difference scheme will generally approximate the solution to the PDE to the same order that the difference scheme approximates the partial differential equation.
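To make the FTCS scheme concrete before we analyze it further, here is a minimal implementation sketch. The unit diffusivity (\(v_t = v_{xx}\)), the initial condition \(f(x) = \sin(\pi x)\), the homogeneous boundary data \(a(t) = b(t) = 0\), and the helper name `ftcs_heat` are illustrative assumptions of ours, not necessarily the exact data of (1.2.1):

```julia
# Minimal FTCS sketch for v_t = v_xx on (0, 1) with u(0, x) = sin(πx) and
# homogeneous Dirichlet B.C.s — all illustrative choices.
function ftcs_heat(n, nsteps)
    Δx = 1 / (n - 1)
    Δt = 0.4 * Δx^2               # we will see that FTCS needs Δt/Δx² ≤ 1/2 for stability
    x = LinRange(0, 1, n)
    u = sin.(π .* x)              # u_k^0 = f(k Δx), with f(0) = a(0) and f(1) = b(0)
    for step in 1:nsteps
        unew = similar(u)
        for k in 2:n-1            # interior: forward in time, centered in space
            unew[k] = u[k] + Δt / Δx^2 * (u[k+1] - 2u[k] + u[k-1])
        end
        unew[1] = 0.0             # B.C. u(t, 0) = a(t)
        unew[end] = 0.0           # B.C. u(t, 1) = b(t)
        u = unew
    end
    x, u
end

x, u = ftcs_heat(51, 200)
extrema(u)                        # the profile decays toward zero, as expected for heat flow
```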
3. The leapfrog scheme#
Can we do better than approximation (1.2.10)?
So far I have said that this is just a potential approximation to the PDE. Of course, it is not the only one.
What if we used a centered difference for the temporal derivative?
We can use
\[\frac{u^{n+1}_k - u^{n-1}_k}{2 \Delta t}\]
as an approximation of \(v_t\). The difference equation then becomes:
which is second order in both space and time.
This difference scheme is called the leapfrog scheme and is referred to as a three-step (or three-level) method.
For which values of \(n\) is the difference equation (1.2.11) valid?
How do we start this method? We note that we now need more than one initial condition, not just \(u_k^0\). How many do we need?
Before we move on, let us introduce some notation that might be handy.
We define
\[\delta_+ u_k = \frac{u_{k+1} - u_k}{\Delta x}, \qquad \delta_- u_k = \frac{u_k - u_{k-1}}{\Delta x},\]
and
\[\delta_0 u_k = \frac{u_{k+1} - u_{k-1}}{2 \Delta x}, \qquad \delta^2 u_k = \frac{u_{k+1} - 2 u_k + u_{k-1}}{\Delta x^2},\]
to be the forward, backward, centered, and centered second-order difference operators, respectively.
With this notation, we can rewrite (1.2.6) as
which is referred to as the forward in time, centered in space (FTCS) scheme for solving the heat equation (1.2.1).
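These operators map directly onto code. A small sketch (the helper names are ours, mirroring the math notation via Julia's Unicode identifiers):

```julia
# Difference operators on a vector u with uniform spacing Δx. Each returns values at
# the points where the formula is defined (hence the shortened index ranges).
δ₊(u, Δx) = (u[2:end] .- u[1:end-1]) ./ Δx                       # forward difference δ₊
δ₋(u, Δx) = (u[2:end] .- u[1:end-1]) ./ Δx                       # backward difference δ₋ (same numbers, attached to the right endpoint)
δ₀(u, Δx) = (u[3:end] .- u[1:end-2]) ./ (2Δx)                    # centered difference δ₀
δ²(u, Δx) = (u[3:end] .- 2 .* u[2:end-1] .+ u[1:end-2]) ./ Δx^2  # centered second difference δ²
```

For instance, `δ²(u, Δx)` returns the centered second difference at all interior points, exactly the quantity appearing in the FTCS scheme.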
4. Introduction to Accuracy#
Consider the boundary value problem (Poisson’s problem): find \(u\) such that
\[- u''(x) = f(x) \quad \text{for } x \in (0, 1), \qquad u(0) = a, \qquad \frac{\partial u}{\partial x}(1) = b.\]
We say
\(f(x)\) is the “forcing”
the left boundary condition is a non-homogeneous Dirichlet boundary condition (it imposes a condition on the unknown \(u\))
the right boundary condition is a non-homogeneous Neumann boundary condition (it imposes a condition on the first-order derivative of the unknown, \(\partial u / \partial x\))
We need to choose
how to represent \(u(x)\), including evaluating it on the boundary,
how to compute derivatives of \(u\),
in what sense to ask for the differential equation to be satisfied,
where to evaluate \(f(x)\) or integrals thereof,
how to enforce boundary conditions.
4.1 Finite Differences (FD)/collocation approach#
We have seen that we can represent the function \(u(x)\) by its values \(u_i = u(x_i)\) at a discrete set of points
(where I am changing the indices from \(0, 1, \ldots M\) to \(1, 2, \ldots n\) for simplicity, to match the code below, since Julia arrays are 1-indexed).
The FD framework does not uniquely specify the solution values at other points
Compute derivatives at \(x_i\) via differencing formulas involving a finite number of neighbor points (independent of the total number of points \(n\)).
FD methods ask for the differential equation to be satisfied pointwise at each \(x_i\) in the interior of the domain.
Evaluate the forcing term \(f\) pointwise at \(x_i\).
Approximate derivatives at discrete boundary points (\(x_1 =0\) and \(x_n = 1\) above), typically using one-sided differencing formulas.
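Putting these five choices together, here is a minimal sketch of the full FD/collocation recipe for the Poisson problem above. The manufactured solution \(u(x) = \sin(x)\) (so \(f(x) = \sin(x)\), \(u(0) = 0\), \(u'(1) = \cos(1)\)) is our own test choice, and `poisson_fd` is a hypothetical helper name:

```julia
using LinearAlgebra

# Sketch of the FD/collocation recipe for -u''(x) = f(x) on (0, 1) with
# u(0) = 0 (Dirichlet) and u'(1) = cos(1) (Neumann), manufactured so that
# the exact solution is u(x) = sin(x).
function poisson_fd(n)
    x = LinRange(0, 1, n)
    h = x[2] - x[1]
    A = zeros(n, n)
    rhs = zeros(n)
    A[1, 1] = 1.0; rhs[1] = 0.0              # Dirichlet row: u₁ = 0
    for i in 2:n-1                           # interior rows: -(uᵢ₋₁ - 2uᵢ + uᵢ₊₁)/h² = f(xᵢ)
        A[i, i-1] = -1 / h^2
        A[i, i]   =  2 / h^2
        A[i, i+1] = -1 / h^2
        rhs[i] = sin(x[i])
    end
    A[n, n-1] = -1 / h; A[n, n] = 1 / h      # one-sided Neumann row: (uₙ - uₙ₋₁)/h = cos(1)
    rhs[n] = cos(1)
    x, A \ rhs
end

x, u = poisson_fd(41)
maximum(abs.(u - sin.(x)))   # discretization error, dominated by the first-order Neumann row
```

Note that the one-sided Neumann row is only first-order accurate, which can limit the overall convergence rate.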
Computing a derivative#
using Plots
using LaTeXStrings
default(linewidth=3)
default(legendfontsize=12)
n = 41
h = 6 / (n - 1)
x = LinRange(-3, 3, n)
u = sin.(x) # Note: dot operator allows functions in Julia to accept arrays as arguments and return arrays (it applies the function point-wise on all entries of the array).
plot(x, u, marker=:circle, label = L"\sin(x)", xlabel = "x", ylabel = "y")
u_x = cos.(x) # analytic first-order derivative
fd_u_x = (u[2:end] - u[1:end-1]) / h # What does this do?
plot(x, u_x, label = L"\cos(x)")
plot!(x[1:end-1], fd_u_x, marker=:circle, label= "FD", xlabel = "x", ylabel = "y") # Note: the "!" overlays a curve to an existing plot
How accurate is it?#
Without loss of generality, we’ll approximate \(u'(x_i = 0)\), taking \(h = x_{i+1} - x_i\). Remember Taylor’s expansion:
\[u(h) = u(0) + u'(0) h + u''(0) \frac{h^2}{2} + O(h^3),\]
and substitute into the differencing formula:
\[\frac{u(h) - u(0)}{h} = u'(0) + u''(0) \frac{h}{2} + O(h^2).\]
Evidently the error in this approximation is \(u''(0) h / 2 + O(h^2)\).
We say this method is first order accurate.
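We can verify this numerically (a quick sketch of ours): for \(u = \exp\) we have \(u''(0) = 1\), so the error divided by \(h/2\) should approach \(1\).

```julia
# Sketch: the forward-difference error at x = 0 for u = exp should behave like u''(0)h/2 = h/2.
for h in [0.1, 0.01, 0.001]
    err = (exp(h) - exp(0)) / h - 1          # forward difference minus the exact u'(0) = 1
    println("h = $h   error = $err   error / (h/2) = $(err / (h / 2))")
end
```

The last column tends to \(1\), confirming first-order accuracy with the predicted constant.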
5. Introduction to Stability#
x = big(1e-15) # Represent this number in extended (BigFloat) precision (much higher than double)
@show big(1+x)
@show log(1 + x)
big(1 + x) = 1.000000000000001000000000000000077705399876661079238307185601195015145492561714
log(1 + x) = 9.999999999999995777053998766613348662406422732737965069319591605458493216045864e-16
9.999999999999995777053998766613348662406422732737965069319591605458493216045864e-16
plot([h -> log(1+h), log1p], xlim=(-1e-15, 1e-15), label = ["y = log(1 + x)" "y = log1p(x)"]) # What's going on?
We can see that a more stable implementation (the Julia built-in log1p function) produces a more accurate, essentially error-free result.
Other Finite Difference (FD) method examples#
diff1l(x, u) = x[2:end], (u[2:end] - u[1:end-1]) ./ (x[2:end] - x[1:end-1]) # first-order accurate one-sided left FD
diff1r(x, u) = x[1:end-1], (u[2:end] - u[1:end-1]) ./ (x[2:end] - x[1:end-1]) # first-order accurate one-sided right FD
diff1c(x, u) = x[2:end-1], (u[3:end] - u[1:end-2]) ./ (x[3:end] - x[1:end-2]) # second-order accurate centered FD
difflist = [diff1l, diff1r, diff1c]
n = 40
h = 6 / (n - 1) # note: the domain [-3, 3] has length 6
x = LinRange(-3, 3, n)
u = sin.(x)
fig = plot(cos, xlims=(-3, 3), label = L"\cos(x)")
for d in difflist
xx, yy = d(x, u)
plot!(fig, xx, yy, marker=:circle, label=d)
end
fig
Where is the blue (analytical) curve?
Accuracy: Measuring errors#
Absolute and Relative Errors#
Absolute Error#
\[e_{\text{abs}} = \left| \tilde f(x) - f(x) \right|,\]
where \(\tilde f(x)\) is the computed, approximate solution and \(f(x)\) is the analytical one.
Relative Error#
\[e_{\text{rel}} = \frac{\left| \tilde f(x) - f(x) \right|}{\left| f(x) \right|}.\]
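A tiny illustration (the numbers are our own): the absolute and relative error of a single-precision approximation of \(\pi\).

```julia
# Sketch: absolute vs. relative error of Float32(π) against a double-precision reference.
approx = Float32(π)                 # computed, approximate value f̃(x)
exact = Float64(π)                  # higher-precision reference f(x)
abs_err = abs(approx - exact)       # absolute error |f̃ - f|
rel_err = abs_err / abs(exact)      # relative error |f̃ - f| / |f|
@show abs_err rel_err
```

Both come out close to single-precision machine epsilon (\(\approx 10^{-7}\)), since \(\pi\) is \(O(1)\).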
using Pkg
Pkg.add("LinearAlgebra") # adds the LinearAlgebra.jl package to your environment
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
using LinearAlgebra
grids = 2 .^ (2:10)
hs = 1 ./ grids
function refinement_error(f, fprime, d)
error = []
for n in grids
x = LinRange(-3, 3, n)
xx, yy = d(x, f.(x))
push!(error, norm(yy - fprime.(xx), 2)/sqrt(n))
end
error
end
refinement_error (generic function with 1 method)
fig = plot(xscale=:log10, yscale=:log10)
for d in difflist
error = refinement_error(sin, cos, d)
plot!(fig, hs, error, marker=:circle, label=d)
end
plot!(fig, hs, [hs hs .^ 2], label=["\$h\$" "\$h^2\$"], xlabel = "h", ylabel = "error")
Interpretation: We can see that:
The one-sided left and right FD schemes are only first order accurate (their error behaves like the \(h\) reference line)
This means that the absolute error is halved if the grid spacing is halved. So the method converges linearly
The centered FD scheme is second order accurate (its error behaves like the \(h^2\) reference line)
This means that the absolute error is divided by 4 each time the grid spacing is halved, i.e., we gain roughly \(\log_{10} 4 \approx 0.6\) correct digits per refinement (see the quick check below).
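We can make these statements quantitative by estimating the observed order \(p\) from successive errors: \(e(h) \approx C h^p\) implies \(p \approx \log(e(h_1)/e(h_2)) / \log(h_1/h_2)\). A sketch of ours, reusing `difflist`, `refinement_error`, and `hs` from above:

```julia
# Sketch: estimate the observed convergence order p from errors on successively refined grids.
for d in difflist
    err = refinement_error(sin, cos, d)
    rates = [log(err[i] / err[i+1]) / log(hs[i] / hs[i+1]) for i in 1:length(err)-1]
    println(d, ": observed order ≈ ", round(rates[end], digits = 2))
end
```

The one-sided schemes report an order near \(1\), the centered scheme near \(2\), matching the reference lines in the plot.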
6. Introduction to Conditioning#
Keeping in mind roundoff errors and floating point arithmetic (refer to Lecture 4), to how many digits can you trust a computer’s output of \(\sin x\) for a very large argument \(x\), if it uses double precision?
This problem (evaluating \(\sin\) with a large argument) is ill-conditioned. Even the best algorithm cannot be expected to produce a more accurate answer.
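One way to see the ill-conditioning (a sketch of ours): perturb the argument by a single ulp, a relative change of about \(10^{-16}\), and watch the output move around the sixth decimal place.

```julia
# Sketch: sin at a huge argument is ill-conditioned — one ulp of input perturbation
# (relative size ~1e-16) changes the output by roughly cos(x) · ulp(x) ≈ 1e-6.
x = 1e10
@show sin(x)
@show sin(nextfloat(x))          # x perturbed by one ulp
@show nextfloat(x) - x           # the absolute size of that perturbation (≈ 2e-6)
```

The two inputs agree to sixteen digits, yet the two outputs typically agree to only about six.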
Example 6.1:#
Computing \(e^x\)
function myexp(x)
sum = 1
for k in 1:100
sum += x^k / factorial(big(k))
end
return sum
end
myexp(1) - exp(1)
1.445646891729250136554224997792468345749669676277240766303534163549513721620773e-16
function myexp(x)
sum = 0
term = 1
n = 1
while sum + term != sum
sum += term
term *= x / n
n += 1
end
sum
end
myexp(1) - exp(1)
4.440892098500626e-16
How accurate is it?
plot(myexp, xlims=(-2, 2), label = "myexp")
plot(exp, xlims=(-1e-15, 1e-15), linestyle=:solid, label = "exp")
plot!(myexp, xlims=(-1e-15, 1e-15), linestyle=:dash, label="myexp") # the bang operator overlays the second plot in the same figure
What’s happening?
We’re computing \(f(x) = e^x\) for values of \(x\) near zero.
This function is well approximated by \(1 + x\).
Floating point values of \(y\) near \(1\) are spaced about \(\epsilon_{\text{machine}}\) apart, so they cannot represent every value.
After rounding, the error in our computed output \(\tilde f(x)\) is of order \(\epsilon_{\text{machine}}\).
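The spacing claim is easy to verify (a quick sketch):

```julia
# Sketch: Float64 values near 1 are spaced eps() ≈ 2.2e-16 apart, so adding anything
# smaller than half that spacing to 1 is rounded away.
@show eps(1.0)                   # spacing of Float64 numbers at 1.0
@show nextfloat(1.0) - 1         # smallest representable increment above 1
@show 1 + 1e-16 == 1             # true: 1e-16 is below half the spacing, so it is lost
```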
Suppose I want to compute \(e^x - 1\)
plot([x -> myexp(x) - 1 , x -> x],
xlims=(-1e-15, 1e-15), label=["myexp" "y=x"])
What now?
We’re capable of representing outputs with 16 digits of accuracy
Yet our algorithm `myexp(x) - 1` can’t find them
We can’t recover without modifying our code
Let’s modify the code
function myexpm1(x)
sum = 0
term = x
n = 2
while sum + term != sum
sum += term
term *= x / n
n += 1
end
sum
end
myexpm1 (generic function with 1 method)
plot(myexpm1, xlims=(-1e-15, 1e-15), label="myexpm1")
Plot relative error#
function relerror(x, f, f_ref)
fx = f(x)
fx_ref = f_ref(x)
max(abs(fx - fx_ref) / abs(fx_ref), 1e-17)
end
badexpm1(t) = exp(t) - 1
plot(x -> relerror(x, badexpm1, expm1), yscale=:log10, xrange=(-1e-15, 1e-15), label = "rel error")
plot(x -> (1 + x) - 1, xlims=(-1e-15, 1e-15), label = "y = 1 + x - 1")
plot!(x -> x, label="y = x")
6.1 Condition number#
What sort of functions cause small errors to become big?
Consider a function \(f: X \to Y\) and define the absolute condition number
\[\hat\kappa = \lim_{\delta \to 0} \sup_{|\delta x| \leq \delta} \frac{|f(x + \delta x) - f(x)|}{|\delta x|} = |f'(x)|\]
(the last equality holding when \(f\) is differentiable). Floating point arithmetic offers relative accuracy, so it’s more useful to discuss the relative condition number,
\[\kappa = \frac{|f'(x)| \, |x|}{|f(x)|}.\]
f(x) = x - 1; fprime(x) = 1
plot(x -> abs(fprime(x)) * abs(x) / abs(f(x)), xlims=(0, 2), label = "kappa for x-1")
Example:#
Back to \(f(x) = e^x - 1\)
f(x) = exp(x) - 1
fprime(x) = exp(x)
plot(x -> abs(fprime(x)) * abs(x) / abs(f(x)), label = "kappa for exp(x)")
What does it mean?
The function \(f(x) = e^x - 1\) is well-conditioned
The function \(f_1(x) = e^x\) is well-conditioned
The function \(f_2(x) = x - 1\) is ill-conditioned for \(x \approx 1\)
The algorithm is unstable#
`f(x) = exp(x) - 1` is unstable
Algorithms are made from elementary operations
Unstable algorithms do something ill-conditioned
A stable algorithm#
We used the series expansion previously.
accurate for small \(x\)
less accurate for negative \(x\) (see activity)
we could use symmetry to fix this
inefficient because we have to sum lots of terms
Standard math libraries define a more efficient stable variant, \(\texttt{expm1}(x) = e^x - 1\).
expm1(-40) + 1 # this way, using the built-in, stable variant, it is exactly 0
0.0
exp(-40) # almost zero
4.248354255291589e-18
plot([x -> exp(x) - 1,
x -> expm1(x)],
xlims = (-1e-15, 1e-15), label = ["y = exp(x) - 1" "y = expm1(x)"])
Reliable = well-conditioned and stable algorithm#
Mathematical functions \(f(x)\) can be ill-conditioned (big \(\kappa\))#
Modeling is how we turn an abstract question into a mathematical function
We want well-conditioned models (small \(\kappa\))
Some systems are intrinsically sensitive (chaotic): fracture, chaotic systems, combustion
Algorithms f(x) can be unstable#
Unreliable, though sometimes practical
Unstable algorithms are constructed from ill-conditioned parts
7. Lax equivalence theorem#
In summary,
Consistency: describes how well the numerical scheme approximates the PDE (if the Finite Difference (FD) discretization is at least of order 1, it is consistent: the residual reduces under grid refinement).
Stability: Numerical stability concerns how errors introduced during the execution of an algorithm affect the result. It is a property of an algorithm rather than the problem being solved (check Higham’s blog). This gets subtle for problems like incompressible materials or contact mechanics.
Convergence: when the solution of the approximated equation approaches the actual solution of the continuous equation as the grid is refined.
A consistent finite difference scheme for a partial differential equation for which the initial value problem is well-posed is convergent if and only if it is stable.
In practice, we have the two one-way implications, and the Lax equivalence theorem puts them together to form a necessary and sufficient condition:
Consistency + Convergence \(\implies\) Stability
Consistency + Stability \(\implies\) Convergence
Hence,
Lax equivalence theorem: for a consistent scheme approximating a well-posed IVP, Stability \(\iff\) Convergence
This theorem is extremely useful:
Checking consistency is straightforward (Taylor expansions).
We are going to introduce tools (based on Fourier transforms) which make checking stability quite easy.
Thus, if the problem is well-posed, we get the more difficult (and desirable) result, convergence, by checking two (relatively) easy things: consistency and stability.
The equivalence goes both ways, hence only consistent and stable schemes need to be considered.
8. Higher-order schemes: the Method of Undetermined Coefficients#
So far, for the heat equation (1.2.1) we have derived two numerical schemes. The first one, FTCS, (1.2.6),
And the second one, CTCS, (1.2.11),
It is possible to approximate derivatives to any arbitrary order of accuracy.
Note
Order of accuracy \(\neq\) order of the derivative
To derive a Finite Difference (FD) scheme of any desired order of accuracy, we can use a method called the Method of Undetermined Coefficients.
We first start by noting that for the derivatives we have seen so far (first-order time derivative, and first-order and second-order spatial derivatives) we always needed one more grid point than the desired order of accuracy. For instance, to obtain a first-order accurate scheme of a first order derivative, we needed two grid points. And, for a second-order accurate scheme of a second order derivative, we needed three grid points.
This builds up intuition that, for instance, we could not approximate \(v_{xx}\) to the fourth order of accuracy by using only three grid points: \(x = k \Delta x\), and \(x = (k \pm 1)\Delta x\). We shall see that to have a fourth-order approximation of \(v_{xx}\), we need a \(5\)-point stencil touching the points \(x = k \Delta x\), \(x = (k \pm 1)\Delta x\), and \(x = (k \pm 2)\Delta x\).
To have a \(k\)-order approximation in 1D, we need \((k+1)\) grid points.
Let’s re-derive the second-order approximation of the second derivative \(u_{xx}\). We know that we need three coefficients.
We can write the ansatz
\[v_{xx} \approx A u_k + B u_{k+1} + C u_{k+2}. \tag{1.2.12}\]
Or, we could have equivalently used the \(x = k \Delta x\), and \(x = (k \pm 1)\Delta x\) grid points (symmetric w.r.t. the reference point \(x_k\))
but then we need to be careful about the negative sign in the Taylor expansion!
Let’s stick with (1.2.12) for now and use Taylor expansions around the point \(u_k\):
\[u_{k+1} = u_k + \Delta x \, u'_k + \frac{\Delta x^2}{2} u''_k + \frac{\Delta x^3}{6} u'''_k + O(\Delta x^4),\]
\[u_{k+2} = u_k + 2 \Delta x \, u'_k + 2 \Delta x^2 \, u''_k + \frac{4 \Delta x^3}{3} u'''_k + O(\Delta x^4).\]
Let’s plug them into (1.2.12):
\[A u_k + B u_{k+1} + C u_{k+2} = (A + B + C) u_k + (B + 2C) \Delta x \, u'_k + \left( \frac{B}{2} + 2C \right) \Delta x^2 \, u''_k + O(\Delta x^3).\]
Equating the like-term coefficients on both sides of the equation gives us the following constraints:
\[\text{(i)} \quad A + B + C = 0, \qquad \text{(ii)} \quad B + 2C = 0, \qquad \text{(iii)} \quad \left( \frac{B}{2} + 2C \right) \Delta x^2 = 1.\]
We solve (ii) and substitute in (i) and (iii):
\[B = -2C \implies A = C, \qquad C \Delta x^2 = 1.\]
Therefore,
\[A = C = \frac{1}{\Delta x^2}, \qquad B = -\frac{2}{\Delta x^2}.\]
Hence, we find
\[v_{xx} \approx \frac{u_k - 2 u_{k+1} + u_{k+2}}{\Delta x^2}.\]
Or, shifting indices (\(k+1 \to k\)), we see the familiar second-order centered scheme:
\[v_{xx} \approx \frac{u_{k-1} - 2 u_k + u_{k+1}}{\Delta x^2}.\]
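The whole procedure can be automated: for any stencil and derivative order, equating Taylor coefficients is just a linear solve. Here is a sketch (the helper `fd_coefficients` is our own name; we use exact rational arithmetic so the coefficients come out clean):

```julia
using LinearAlgebra

# Sketch of the method of undetermined coefficients, automated. Given stencil offsets
# (in units of Δx) and a derivative order m, match Taylor coefficients
#     Σⱼ cⱼ oⱼᵏ / k! = δ_{k,m},   k = 0, …, p-1   (p = number of stencil points),
# then scale the resulting cⱼ by 1/Δxᵐ to obtain the scheme.
function fd_coefficients(offsets, m)
    p = length(offsets)
    V = [Rational{BigInt}(o)^k // factorial(big(k)) for k in 0:p-1, o in offsets]
    rhs = [k == m ? big(1)//1 : big(0)//1 for k in 0:p-1]
    V \ rhs
end

fd_coefficients([-1, 0, 1], 2)          # → [1, -2, 1]: the centered second difference above
fd_coefficients([-2, -1, 0, 1, 2], 2)   # → [-1/12, 4/3, -5/2, 4/3, -1/12]: fourth-order v_xx
```

The five-point result confirms the claim above: a fourth-order approximation of \(v_{xx}\) requires the stencil \(x = k \Delta x\), \(x = (k \pm 1)\Delta x\), \(x = (k \pm 2)\Delta x\).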