Estimating the area under a curve

In the lecture on numpy I showed an example of computing the area under a curve via the trapezoid rule. Today we are going to develop an improved method for computing the area under a curve, Simpson's rule.

All of the methods we will study for estimating areas have the same basic setup. We seek to estimate the area underneath a function f(x) over the interval [a,b]. We do this by dividing the interval into N subintervals of width h = (b-a)/N using division points xi where x0 = a, xN = b, and xi+1 = xi + h for all i.

The trapezoid rule estimates the area under the curve over the subinterval [xi , xi+1] by constructing a straight line that passes through the points (xi , f(xi)) and (xi+1 , f(xi+1)) and using the area under that line over the subinterval as an approximation for the area under the curve.

Simpson's rule begins by constructing a second degree polynomial that passes throught the points (xi , f(xi)), (xi + h/2 , f(xi + h/2)), and (xi + h , f(xi + h)) = (xi+1 , f(xi+1)) and using the area underneath that polynomial to estimate the curve area. The algebraic calculations involved in constructing the polynomial and computing the area under the polynomial are somewhat involved, so I will be using the sympy package to perform the necessary algebra calculations.

I did those calculations in a Jupyter notebook. That notebook is available in the archive provided at the bottom of these notes. Those calculations prove that the area under the polynomial is

Adding up all of these area estimates over all the subintervals produces an estimate for the area under the curve:

The alternating pattern of 2 and 4 coefficients coupled with the coefficients of 1 at both ends makes this sum slightly tricky to compute. Here is the strategy I used to compute this sum in the Python program below.

  1. Start by using the numpy linspace function to construct the set of points x0 = a to xN-1 = b - h in steps of size h.
  2. Use the numpy vector function evaluation method to compute f(x) evaluated at each of these points. We multiply all of these values by 2.
  3. Next, we shift all of these sample points to the right by h/2 and evaluate f(x) at each of the shifted points. We multiply all of these values by 4.
  4. If we sum up all of the values computed in steps 2 and 3 we will almost get the sum that we want. The only adjustment we need to make is for the first and last entries in the summation formula. We will end up with a term that looks like 2 f(x0) instead of the f(x0) we want at the start of the sum, and we will end up with no term for f(xN). We can fix these problems by adding f(b) - f(a) to our sum.
  5. We then take our final sum and multiply it by h/6 to get the area estimate.

Here now is the Python program that fully implements Simpson's rule:

import numpy as np
import math

# This is the function we will work with for this example
def f(x):
    return np.sqrt(4.0 - x*x)

# Compute an approximation to the integral
# of f(x) from a to b by using Simpson's rule
# with steps of size h = (b-a)/N
def SimpsonArea(a,b,N):
    h = (b-a)/N
    x = np.linspace(a,b-h,N)

    return h/3*(f(x).sum()) + 2*h/3*(f(x+h/2).sum()) - h*f(a)/6 + h*f(b)/6

print("      N  error");
for j in range(3,20,2):
    estimate = SimpsonArea(0.0,2.0,2**j)
    error = math.fabs(math.pi - estimate)
    print("{:>7d}  {:g}".format(2**j,error))

To illustrate the behavior of Simpson's rule as a function of N this program computes Simpson's rule estimates for a range of different values of N.

Code for today's examples

Here is an archive containing the notebook and the python program.