Numerics/Evaluation of Functions

Evaluation of Functions covers numerical integration (quadrature), the method of moments for parameter estimation, and techniques for computing functions when closed-form evaluation is impractical. The unifying theme: you have a function \(f(x)\) — maybe it's expensive, maybe it's only known at sampled points, maybe it's defined by an integral — and you need numbers from it.

The One-Sentence Opinion

Use Gaussian quadrature when you can evaluate \(f\) anywhere; use adaptive quadrature (scipy.integrate.quad) when you don't know how nasty your function is; use method of moments when you need a quick, consistent parameter estimate and don't want to mess with MLE optimizers; and never use high-order Newton-Cotes rules — they're a numerical trap.

Numerical Integration (Quadrature)

Quadrature approximates the definite integral

\[ I = \int_a^b f(x)\,dx \approx \sum_{i=1}^n w_i f(x_i) \]

The art is choosing the nodes \(x_i\) and weights \(w_i\). Get them right and you extract 15 digits with 10 evaluations. Get them wrong and you're summing noise.

Newton-Cotes: The Obvious (and Usually Wrong) Approach

Newton-Cotes rules use equally-spaced points and fit an interpolating polynomial. Simple, intuitive, and mostly a bad idea beyond the lowest orders.

Trapezoidal Rule

Fit a line between \((a, f(a))\) and \((b, f(b))\). Integrate the line:

\[ \int_a^b f(x)\,dx \approx \frac{b-a}{2}\big[f(a) + f(b)\big] \]

Error: \(O(h^3 f(\xi))\) on a single panel. Composite version: chop \([a,b]\) into \(n\) panels of width \(h\):

\[ \int_a^b f(x)\,dx \approx \frac{h}{2}\left[f(a) + 2\sum_{i=1}^{n-1} f(a+ih) + f(b)\right] \]

Error drops to \(O(h^2)\). Surprisingly, for periodic functions integrated over a full period, the trapezoidal rule converges exponentially fast. This is the one situation where it beats Gaussian quadrature.

import numpy as np

def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule with n panels."""
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    return h * (0.5 * f(x[0]) + np.sum(f(x[1:-1])) + 0.5 * f(x[-1]))

# Or just use SciPy:
from scipy.integrate import trapezoid
result = trapezoid(f(np.linspace(a, b, 1000)), np.linspace(a, b, 1000))

Simpson's Rule

Fit a quadratic through three equally-spaced points. Exact for cubics (degree of precision = 3):

\[ \int_a^b f(x)\,dx \approx \frac{h}{3}\big[f(a) + 4f(a+h) + f(b)\big], \quad h = \frac{b-a}{2} \]

Composite version (\(n\) even):

\[ \int_a^b f(x)\,dx \approx \frac{h}{3}\left[f(a) + 4\sum_{\text{odd }i} f(a+ih) + 2\sum_{\text{even }i} f(a+ih) + f(b)\right] \]

def simpson(f, a, b, n):
    """Composite Simpson's rule. n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even for Simpson's rule")
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    result = f(x[0]) + f(x[-1])
    result += 4.0 * np.sum(f(x[1:-1:2]))   # odd indices
    result += 2.0 * np.sum(f(x[2:-2:2]))   # even indices
    return h * result / 3.0

# SciPy equivalent:
from scipy.integrate import simpson as simps
result = simps(f(np.linspace(a, b, 101)), x=np.linspace(a, b, 101))

Why Higher-Order Newton-Cotes Is a Trap

Simpson's 3/8 rule (cubic fit), Boole's rule (quartic fit), and beyond all exist. They all use equally-spaced points with alternating sign weights for orders 9 and above. Those sign alternations produce catastrophic cancellation — you subtract nearly-equal large numbers, losing precision. Meanwhile, the interpolating polynomial suffers from Runge's phenomenon at the edges (see Numerics/Interpolation Extrapolation).

Moral: Trapezoidal is fine for dense data or periodic functions. Simpson's is okay for a quick estimate. Beyond Simpson's, stop and use Gaussian quadrature.

Gaussian Quadrature: The Gold Standard

Gaussian quadrature answers the question: if I can evaluate \(f\) at \(n\) points of my choosing, where should those points go and how should I weight them to maximize accuracy?

The answer: choose the nodes \(x_i\) as the roots of an orthogonal polynomial from a family matched to the integration interval and weight function, and compute weights \(w_i\) from the same polynomials. An \(n\)-point Gauss rule is exact for polynomials of degree \(2n-1\) — double the degree of any Newton-Cotes rule with the same number of evaluations.

Why It Works (Intuition)

An \(n\)-point rule has \(2n\) degrees of freedom (\(n\) nodes + \(n\) weights). A polynomial of degree \(2n-1\) has \(2n\) coefficients. Gaussian quadrature solves for nodes and weights simultaneously so that all \(2n\) monomials \(1, x, x^2, \ldots, x^{2n-1}\) are integrated exactly. This is optimal — no rule with \(n\) points can do better.

Gauss-Legendre (The Default)

For \(\int_{-1}^1 f(x)\,dx\) with weight \(\omega(x) = 1\). The orthogonal polynomials are Legendre polynomials \(P_n(x)\). Nodes are the roots of \(P_n\); weights are:

\[ w_i = \frac{2}{(1-x_i^2)[P_n'(x_i)]^2} \]

To integrate over arbitrary \([a, b]\), map to \([-1, 1]\):

\[ \int_a^b f(x)\,dx = \frac{b-a}{2} \int_{-1}^1 f\!\left(\frac{b-a}{2}\xi + \frac{a+b}{2}\right) d\xi \approx \frac{b-a}{2} \sum_{i=1}^n w_i \, f\!\left(\frac{b-a}{2}x_i + \frac{a+b}{2}\right) \]

import numpy as np
from numpy.polynomial.legendre import leggauss

def gauss_legendre(f, a, b, n):
    """n-point Gauss-Legendre quadrature over [a, b]."""
    x_ref, w_ref = leggauss(n)
    # Map from [-1, 1] to [a, b]
    x_mapped = 0.5 * (b - a) * x_ref + 0.5 * (a + b)
    return 0.5 * (b - a) * np.dot(w_ref, f(x_mapped))

# SciPy fixed-order Gaussian quadrature:
from scipy.integrate import fixed_quad
result, _ = fixed_quad(f, a, b, n=32)

Gauss-Hermite: Integrals Over \((-\infty, \infty)\)

For \(\int_{-\infty}^\infty f(x)\,e^{-x^2}\,dx\). Nodes are roots of Hermite polynomials. The exponential weight is baked in:

from numpy.polynomial.hermite import hermgauss

def gauss_hermite(f, n):
    """Integrate f(x) * exp(-x^2) over (-∞, ∞)."""
    x, w = hermgauss(n)
    return np.dot(w, f(x))

# For integrals WITHOUT the exp(-x^2) weight, absorb it:
# ∫ g(x) dx = ∫ [g(x) * exp(x^2)] * exp(-x^2) dx
def gauss_hermite_unweighted(g, n):
    x, w = hermgauss(n)
    return np.dot(w, g(x) * np.exp(x**2))

Opinion: Gauss-Hermite with the unweighted trick is fragile — the \(\exp(x^2)\) factor amplifies errors in the tails. Use adaptive quadrature (quad) with infinite limits unless you have a strong structural reason.

Gauss-Laguerre: Integrals Over \([0, \infty)\)

For \(\int_0^\infty f(x)\,e^{-x}\,dx\). Nodes are roots of Laguerre polynomials:

from numpy.polynomial.laguerre import laggauss

def gauss_laguerre(f, n):
    """Integrate f(x) * exp(-x) over [0, ∞)."""
    x, w = laggauss(n)
    return np.dot(w, f(x))

The generalized Gauss-Laguerre quadrature handles weight \(x^\alpha e^{-x}\):

from scipy.special import roots_laguerre
x, w = roots_laguerre(n, alpha=0.5)  # weight x^0.5 * exp(-x)

Gauss-Chebyshev: Integrals with Endpoint Singularities

For \(\int_{-1}^1 \frac{f(x)}{\sqrt{1-x^2}}\,dx\). Nodes and weights have closed forms:

\[ x_i = \cos\left(\frac{2i-1}{2n}\pi\right), \quad w_i = \frac{\pi}{n} \]

def gauss_chebyshev(f, n):
    """Integrate f(x) / sqrt(1-x^2) over [-1, 1]."""
    i = np.arange(1, n + 1)
    x = np.cos((2*i - 1) * np.pi / (2*n))
    w = np.full(n, np.pi / n)
    return np.dot(w, f(x))

Summary of Gaussian Quadrature Families

Interval	Weight \(\omega(x)\)	Orthogonal Polynomials	NumPy/SciPy Function
\([-1, 1]\)	\(1\)	Legendre	`leggauss(n)`
\((-\infty, \infty)\)	\(e^{-x^2}\)	Hermite	`hermgauss(n)`
\([0, \infty)\)	\(e^{-x}\)	Laguerre	`laggauss(n)`
\([-1, 1]\)	\((1-x)^\alpha(1+x)^\beta\)	Jacobi	`roots_jacobi(n, alpha, beta)`
\([-1, 1]\)	\(1/\sqrt{1-x^2}\)	Chebyshev (1st kind)	`roots_chebyt(n)`

Computing Nodes and Weights: The Golub-Welsch Algorithm

Computing Gaussian quadrature nodes from polynomial root-finding is numerically unstable for large \(n\). The Golub-Welsch algorithm solves this elegantly via eigenvalues of a symmetric tridiagonal matrix built from the three-term recurrence coefficients of the orthogonal polynomial family. The nodes are the eigenvalues; the weights are proportional to the squared first components of the eigenvectors.

import numpy as np
from scipy.linalg import eigh_tridiagonal

def golub_welsch(alpha, beta, n):
    """
    Compute Gauss quadrature nodes and weights from recurrence coefficients.
    alpha: diagonal entries (shape n)
    beta:  off-diagonal entries (shape n-1), squared for the off-diagonals
    """
    # Build symmetric tridiagonal matrix
    d = alpha[:n]
    e = np.sqrt(beta[:n-1])
    eigenvalues, eigenvectors = eigh_tridiagonal(d, e)
    # Weights from squared first components
    weights = beta[0] * eigenvectors[0, :]**2
    return eigenvalues, weights

This is what SciPy and NumPy use internally. For standard families, just call the library functions.

Clenshaw-Curtis: The Pragmatic Alternative

Clenshaw-Curtis quadrature uses Chebyshev polynomial expansion and samples at the Chebyshev nodes \(x_k = \cos(k\pi/n)\). It integrates polynomials of degree \(n\) exactly (vs. \(2n-1\) for Gauss), but:

Weights can be computed in \(O(n \log n)\) via FFT (vs. \(O(n^2)\) for Golub-Welsch)
Nodes are nested — an \((n+1)\)-point rule reuses all points from the \(n\)-point rule, which enables efficient adaptive schemes
In practice, Clenshaw-Curtis often matches Gauss accuracy for smooth functions

Trefethen's verdict (2008): "The supposed factor-of-2 advantage of Gauss quadrature is rarely realized in practice." Gauss is theoretically twice as accurate for the same number of points, but the difference is often negligible for well-behaved integrands.

def clenshaw_curtis(f, a, b, n):
    """n-point Clenshaw-Curtis quadrature over [a, b]."""
    import numpy as np
    k = np.arange(n)
    theta = k * np.pi / (n - 1)
    x_cheb = np.cos(theta)          # nodes on [-1, 1]
    # Compute weights via FFT (simplified; SciPy does this better internally)
    w = np.zeros(n)
    for j in range(n):
        c = np.ones(n)
        c[0] *= 0.5
        c[-1] *= 0.5
        w[j] = (2.0 / (n - 1)) * np.dot(c, np.cos(j * theta))
        if j == 0 or j == n - 1:
            w[j] *= 0.5
    x_mapped = 0.5 * (b - a) * x_cheb + 0.5 * (a + b)
    return 0.5 * (b - a) * np.dot(w, f(x_mapped))

Opinion: Use Gauss-Legendre as your default. Use Clenshaw-Curtis when you need nested sampling for adaptive refinement, or when \(n\) is so large that the \(O(n \log n)\) weight computation matters.

Adaptive Quadrature: When You Don't Know How Bad \(f\) Is

The workhorse of practical integration: scipy.integrate.quad. It uses an adaptive Gauss-Kronrod scheme — a Gauss rule paired with a higher-order Kronrod extension that reuses the Gauss nodes. The difference between the two estimates gives an error estimate; the algorithm subdivides the interval where the error is largest.

from scipy.integrate import quad
import numpy as np

# Basic usage
result, error_estimate = quad(lambda x: np.exp(-x**2), -np.inf, np.inf)
print(f"{result:.14f} ± {error_estimate:.2e}")  # sqrt(pi) ≈ 1.77245...

# With extra arguments
def damped_oscillator(t, zeta, omega):
    return np.exp(-zeta * t) * np.cos(omega * t)

result, err = quad(damped_oscillator, 0, np.inf, args=(0.1, 2.0))

# Handling singularities: specify points where integrand misbehaves
result, err = quad(lambda x: np.log(x) / np.sqrt(x), 0, 1,
                   points=[0])  # alerts quad to the singularity at x=0

# Double/triple/n-fold integration
from scipy.integrate import dblquad, tplquad, nquad

# Double: ∫₀^π ∫₀^sin(y) x²y dx dy
area, err = dblquad(lambda x, y: x**2 * y, 0, np.pi,
                    lambda y: 0, lambda y: np.sin(y))

# n-fold: arbitrary dimension
def integrand(x, y, z):
    return x * y * z
result, err = nquad(integrand, [[0, 1], [0, 1], [0, 1]])

The "Gaussian Integral Trap"

Adaptive quadrature samples the integrand at a finite set of points. If your function is narrow and you set limits too wide, the sampler misses it entirely:

# Wrong: limits too wide, sampler misses the peak
quad(lambda x: np.exp(-x**2), -10000, 10000)
# → (1.975e-203, 0.0)  ← completely wrong!

# Right: tight limits around the action
quad(lambda x: np.exp(-x**2), -15, 15)
# → (1.77245..., 8.47e-11)  ← correct

Moral: know your integrand. If it's localized, don't make the integrator hunt for it.

Romberg Integration: Richardson Extrapolation for Quadrature

Romberg integration applies Richardson extrapolation to the trapezoidal rule. Compute trapezoidal approximations with step sizes \(h, h/2, h/4, \ldots\) and extrapolate to \(h \to 0\), eliminating successive powers of \(h^2\) from the error:

from scipy.integrate import romb

# Requires 2^k + 1 equally-spaced samples
k = 8
x = np.linspace(0, 1, 2**k + 1)
y = np.exp(x) * np.sin(3 * x)
result = romb(y, dx=(x[1] - x[0]))

Romberg is elegant but inflexible — you must commit to \(2^k+1\) function evaluations up front. Adaptive quadrature (quad) is usually better. Use Romberg when you already have a table of equally-spaced data and want a high-accuracy integral from it.

Monte Carlo Integration: When Dimension Kills

For high-dimensional integrals, all the rules above fail — the curse of dimensionality makes tensor-product grids exponentially expensive. Monte Carlo integration converges at \(O(1/\sqrt{N})\) regardless of dimension:

\[ \int_\Omega f(\mathbf{x})\,d\mathbf{x} \approx \frac{V}{N} \sum_{i=1}^N f(\mathbf{x}_i), \quad \mathbf{x}_i \sim \text{Uniform}(\Omega) \]

import numpy as np

def monte_carlo_integrate(f, bounds, N=100000):
    """
    Monte Carlo integration over a hyper-rectangle.
    bounds: list of (low, high) tuples for each dimension.
    """
    dim = len(bounds)
    volume = np.prod([hi - lo for lo, hi in bounds])
    samples = np.random.uniform(
        low=[b[0] for b in bounds],
        high=[b[1] for b in bounds],
        size=(N, dim)
    )
    values = np.apply_along_axis(lambda x: f(*x), 1, samples)
    estimate = volume * np.mean(values)
    error = volume * np.std(values) / np.sqrt(N)
    return estimate, error

# Example: area of a quarter-circle in 2D
def indicator_quarter_circle(x, y):
    return 1.0 if x**2 + y**2 <= 1.0 else 0.0

est, err = monte_carlo_integrate(indicator_quarter_circle,
                                  [(0, 1), (0, 1)], N=200000)
# est ≈ π/4 ≈ 0.785, err ≈ 0.001

For low dimensions (1D–3D), quadrature is always better. Monte Carlo starts winning around dimension 5–8 depending on the smoothness of \(f\). For very high dimensions with smooth integrands, quasi-Monte Carlo using low-discrepancy sequences (Sobol', Halton) converges as \(O((\log N)^d / N)\):

from scipy.stats import qmc

sampler = qmc.Sobol(d=5, scramble=True)
samples = sampler.random(n=4096)
# Transform to your integration domain and average

Method of Moments (Parameter Estimation)

The method of moments (MoM) is a technique for estimating parameters of a probability distribution by equating sample moments to theoretical moments. It was introduced by Karl Pearson in 1894 and predates maximum likelihood estimation.

The Core Idea

If a distribution has \(k\) unknown parameters \(\theta_1, \ldots, \theta_k\), you:

Write the first \(k\) theoretical moments \(\mu_j = \mathbb{E}[X^j]\) as functions of the parameters
Compute the first \(k\) sample moments \(\hat{\mu}_j = \frac{1}{n}\sum_{i=1}^n X_i^j\)
Solve \(\mu_j(\theta_1, \ldots, \theta_k) = \hat{\mu}_j\) for \(j = 1, \ldots, k\)

The resulting estimators are consistent (by the law of large numbers) and asymptotically normal (by the central limit theorem). They are generally not efficient compared to MLE, but they're simple, closed-form for many distributions, and don't require numerical optimization.

Opinion

MoM is your first-pass estimator. If the closed-form solution is trivial (Normal, Poisson, Gamma, Beta), use it. If you need efficiency (minimum variance), switch to MLE. If you have censored or truncated data, MoM becomes messier and MLE is usually the right call.

Example: Normal Distribution

Parameters: \(\mu, \sigma^2\). Moments: \(\mathbb{E}[X] = \mu\), \(\mathbb{E}[X^2] = \mu^2 + \sigma^2\).

\[ \hat{\mu} = \frac{1}{n}\sum_{i=1}^n X_i = \bar{X}, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2 \]

Note: The MoM variance estimator divides by \(n\), not \(n-1\). It's biased but consistent. The \(n-1\) correction (Bessel's correction) gives an unbiased estimator, which is not the MoM estimator. Use MoM when consistency matters more than unbiasedness.

import numpy as np

def mom_normal(data):
    """Method of moments estimators for Normal(μ, σ²)."""
    mu_hat = np.mean(data)
    var_hat = np.mean((data - mu_hat)**2)  # divide by n, NOT n-1
    return mu_hat, var_hat

Example: Gamma Distribution

Parameters: shape \(\alpha\), scale \(\beta\) (or rate \(\theta = 1/\beta\)). The PDF is:

\[ f(x; \alpha, \beta) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^\alpha \Gamma(\alpha)}, \quad x > 0 \]

Moments: \(\mathbb{E}[X] = \alpha\beta\), \(\text{Var}(X) = \alpha\beta^2\).

Solving:

\[ \hat{\alpha} = \frac{\bar{X}^2}{s^2}, \qquad \hat{\beta} = \frac{s^2}{\bar{X}} \]

where \(s^2 = \frac{1}{n}\sum (X_i - \bar{X})^2\) (divide by \(n\)).

def mom_gamma(data):
    """MoM estimators for Gamma(α, β). Returns (alpha_hat, scale_hat)."""
    x_bar = np.mean(data)
    s2 = np.mean((data - x_bar)**2)  # MoM variance (divide by n)
    alpha_hat = x_bar**2 / s2
    scale_hat = s2 / x_bar           # scale β (NOT rate θ)
    return alpha_hat, scale_hat

# Verify with SciPy
from scipy import stats
data = stats.gamma.rvs(a=6, scale=2, size=5000)
alpha_hat, scale_hat = mom_gamma(data)
print(f"True: α=6, β=2  |  MoM: α={alpha_hat:.3f}, β={scale_hat:.3f}")

Example: Poisson Distribution

Parameter: \(\lambda\). Moment: \(\mathbb{E}[X] = \lambda\).

\[ \hat{\lambda} = \bar{X} \]

Trivial, but it works.

Example: Beta Distribution

Parameters: \(\alpha, \beta\). PDF on \([0,1]\):

\[ f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)} \]

Moments: \(\mathbb{E}[X] = \frac{\alpha}{\alpha+\beta}\), \(\text{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\).

def mom_beta(data):
    """MoM estimators for Beta(α, β)."""
    x_bar = np.mean(data)
    s2 = np.mean((data - x_bar)**2)  # MoM variance
    # Solving the moment equations
    common = x_bar * (1 - x_bar) / s2 - 1
    alpha_hat = x_bar * common
    beta_hat = (1 - x_bar) * common
    return alpha_hat, beta_hat

When MoM Fails

No closed-form moments: Some distributions (e.g., the log-normal's moments exist but solving for parameters requires numerical root-finding, negating MoM's simplicity).
Moments don't exist: Cauchy distribution — \(\mathbb{E}[X]\) is undefined. MoM is impossible.
Sample moments are unstable: For heavy-tailed distributions, high-order sample moments can have huge variance. MoM with higher moments can give absurd estimates.
Parameters out of bounds: MoM can produce \(\hat{\sigma}^2 < 0\) or \(\hat{\alpha} < 0\) for some data realizations. MLE respects parameter constraints by construction.

MoM vs. MLE: Quick Comparison

Property	Method of Moments	Maximum Likelihood
Consistency	Yes (under mild conditions)	Yes (under mild conditions)
Efficiency	Usually not efficient	Asymptotically efficient (attains Cramér-Rao bound)
Computation	Often closed-form	Often requires numerical optimization
Bias	Can be biased (small samples)	Can be biased (small samples); often less so
Robustness to model misspecification	Decent	Can be fragile
Handles censoring/truncation	Messy	Straightforward (modify likelihood)
Needs starting values	No	Yes (ironically, MoM estimates make great starting values for MLE!)

Opinionated workflow: Compute MoM estimates first. They take 3 lines of Python. If you need statistical efficiency, use them as starting values for an MLE optimizer (scipy.optimize.minimize on the negative log-likelihood).

# MoM → MLE pipeline for Gamma distribution
from scipy.optimize import minimize
from scipy.stats import gamma

def neg_log_likelihood_gamma(params, data):
    alpha, scale = params
    if alpha <= 0 or scale <= 0:
        return np.inf
    return -np.sum(gamma.logpdf(data, a=alpha, scale=scale))

# Start from MoM
alpha0, scale0 = mom_gamma(data)
result = minimize(neg_log_likelihood_gamma, x0=[alpha0, scale0],
                  args=(data,), method='L-BFGS-B',
                  bounds=[(1e-6, None), (1e-6, None)])
alpha_mle, scale_mle = result.x

Evaluating Functions Defined by Integrals

Sometimes the function you need is an integral. For example:

The error function: \(\text{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt\)
The exponential integral: \(E_n(x) = \int_1^\infty \frac{e^{-xt}}{t^n}\,dt\)
Cumulative distribution functions: \(\Phi(x) = \int_{-\infty}^x \phi(t)\,dt\)

When a Library Already Has It, Use It

from scipy.special import erf, erfc, expn, gammainc, betainc
# These are hyper-optimized. Do not reimplement them.

When You Must Compute It Yourself

from scipy.integrate import quad
import numpy as np

def my_special_function(x):
    """Compute ∫₀ˣ sin(t²) dt (Fresnel-type integral)."""
    result, _ = quad(lambda t: np.sin(t**2), 0, x)
    return result

# Vectorize for array input
vec_my_func = np.vectorize(my_special_function)
y = vec_my_func(np.linspace(0, 5, 100))

For repeated evaluation, precompute on a grid and use a cubic spline:

from scipy.interpolate import CubicSpline

x_grid = np.linspace(0, 5, 200)
y_grid = np.array([quad(lambda t: np.sin(t**2), 0, xi)[0] for xi in x_grid])
spline = CubicSpline(x_grid, y_grid)
# Now evaluation is O(1)
y_fast = spline(2.5)

Error Function and Related Functions: A Case Study

The error function \(\text{erf}(x)\) and complementary error function \(\text{erfc}(x) = 1 - \text{erf}(x)\) illustrate why numerical evaluation needs care.

For large \(x\), \(\text{erfc}(x) \approx \frac{e^{-x^2}}{x\sqrt{\pi}}\). Computing \(\text{erfc}(x) = 1 - \text{erf}(x)\) by subtracting \(\text{erf}(x)\) from 1 is catastrophic cancellation — for \(x = 10\), \(\text{erf}(10) \approx 1 - 2 \times 10^{-45}\), and in double precision you get exactly 1, giving \(\text{erfc}(10) = 0\) — wrong by 44 orders of magnitude.

from scipy.special import erf, erfc

x = 10.0
print(1 - erf(x))   # → 0.0  (WRONG)
print(erfc(x))      # → 2.088487583762545e-45  (correct)

Moral: Use purpose-built library functions. SciPy's special module implements asymptotic expansions, continued fractions, and Chebyshev approximations tuned for each regime.

Opinionated Decision Flowchart

What are you trying to do?
│
├─ Integrate f(x) numerically
│   ├─ Is f smooth and you can evaluate it anywhere?
│   │   ├─ 1D, need high accuracy? → scipy.integrate.quad (adaptive)
│   │   ├─ 1D, known smoothness, want speed? → fixed_quad (Gauss-Legendre, n=32 or 64)
│   │   ├─ Periodic over full period? → trapezoidal rule (exponentially convergent!)
│   │   └─ High dimension (d ≥ 5)? → Monte Carlo or quasi-Monte Carlo
│   │
│   ├─ Do you have equally-spaced sampled data?
│   │   ├─ Dense data (h small)? → scipy.integrate.simpson
│   │   ├─ Need high accuracy from samples? → romb (if 2^k+1 points)
│   │   └─ Sparse data? → Interpolate (CubicSpline) then quad
│   │
│   └─ Do you have scattered data?
│       └─ Interpolate (RBF or spline) then integrate
│
├─ Estimate distribution parameters from data
│   ├─ Quick and dirty, closed-form? → Method of Moments
│   ├─ Need efficiency, have priors, censored data? → Maximum Likelihood (use MoM as starting point)
│   └─ MoM gave nonsense (negative variance, etc.)? → MLE with bounds
│
└─ Evaluate a special function
    ├─ Is it in scipy.special? → Use it. Done.
    ├─ Is it defined by an integral? → quad, then spline if repeated evaluation
    └─ Is it defined by an infinite series? → Truncate, estimate remainder, validate

Summary Table

Method	Best For	Precision	Speed	Recommendation
Trapezoidal (composite)	Periodic functions, dense samples	\(O(h^2)\)	Fast	Special cases only
Simpson's (composite)	Sampled data, quick estimate	\(O(h^4)\)	Fast	Fine for smooth data
Gauss-Legendre	Smooth f, any interval	\(O(n^{2n-1})\) exact for polys	Fast	Default for known f
Adaptive (quad)	Unknown smoothness, singularities	Near machine precision	Moderate	Default for general use
Gauss-Hermite/Laguerre	Infinite/half-infinite intervals with weight	Excellent	Fast	When the weight matches
Clenshaw-Curtis	Nested adaptive refinement	Near Gauss	Fast	When nested nodes matter
Romberg	Pre-sampled \(2^k+1\) data	Extrapolated to high order	Fast	Niche but elegant
Monte Carlo	d ≥ 5	\(O(1/\sqrt{N})\)	Varies	High dimensions only
Quasi-Monte Carlo	d ≥ 5, smooth f	\(O((\log N)^d/N)\)	Varies	Better than MC for smooth f
Method of Moments	Parameter estimation	Consistent	Instant (closed-form)	First pass estimator
MLE	Parameter estimation, efficiency	Asymptotically efficient	Optimization needed	Use MoM as starting point