A mathematics and physics 'scrapbook', with notes on miscellaneous things that catch my interest in a range of areas including: mathematical methods; number theory; probability theory; stochastic processes; mathematical economics; econometrics; quantum theory; relativity theory; cosmology; cloud physics; statistical mechanics; nonlinear dynamics; electronic engineering; graph and network theory; mathematics in Latin; computational mathematics using Python and other maths software.
The Legendre transform is a mechanism for converting a relationship like

$$y = \frac{dF(x)}{dx} \tag{1}$$

into a relationship like

$$x = \frac{dG(y)}{dy} \tag{2}$$

In other words, by going from a function to its Legendre transform and vice versa, the roles of $x$ and $y$ in (1) and (2) can be reversed.

In moving from (1) to (2), the required Legendre transform is

$$G(y) = xy - F(x) \tag{3}$$

because differentiating both sides of (3) with respect to $y$ gives

$$\frac{dG}{dy} = x + y\frac{dx}{dy} - \frac{dF}{dx}\frac{dx}{dy} \tag{4}$$

Using (1), we see that the last two terms in (4) cancel, so we get (2). Conversely, the Legendre transform of $G(y)$ in (2) can be written

$$F(x) = xy - G(y) \tag{5}$$

because differentiating both sides with respect to $x$ gives (1).
Mainly for teaching purposes, I want to consider a simple example here of applying the Legendre transform to the conversion of a Lagrangian mechanics problem into an equivalent Hamiltonian one. We might want to do this for a number of reasons, not least because it converts the problem of solving second-order ODEs in the case of the Lagrangian formulation into a potentially simpler problem of solving first-order ODEs in the case of the Hamiltonian formulation.
Consider a simple harmonic oscillator consisting of a mass $m$ attached to a spring with spring constant $k$, moving backwards and forwards horizontally on a frictionless table. There is a restoring force $F = -kx$ that wants to bring the mass back towards its equilibrium position whenever it is displaced either by compressing the spring or stretching it. Newton's second law, $F = m\ddot{x}$, then gives us the second-order ODE

$$m\frac{d^2x}{dt^2} + kx = 0 \tag{6}$$

with general solution

$$x(t) = A\cos(\omega t) + B\sin(\omega t) \tag{7}$$

where $\omega = \sqrt{k/m}$ is the natural frequency of this system's oscillations. We could obtain a particular solution (i.e., provide specific values for $A$ and $B$) given a pair of initial conditions, e.g., the initial conditions $x(0) = x_0$ and $\dot{x}(0) = 0$ would give us the particular solution

$$x(t) = x_0\cos(\omega t) \tag{8}$$

The dynamics of this system are now completely determined and we can specify the values of $x$, $\dot{x}$ and $\ddot{x}$ at any point in time $t$.
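As a quick numerical cross-check in Python (with illustrative values $m = 1$, $k = 4$, $x_0 = 1$ of my own choosing, not taken from the text above), we can integrate the ODE in (6) directly and compare with the cosine solution in (8):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameters (my choice): m = 1, k = 4, so omega = 2.
m, k = 1.0, 4.0
omega = np.sqrt(k / m)

# Newton's second law as a first-order system: y = (x, v), y' = (v, -(k/m) x).
def rhs(t, y):
    x, v = y
    return [v, -(k / m) * x]

# Initial conditions x(0) = 1, v(0) = 0 should give x(t) = cos(omega t).
t = np.linspace(0.0, 10.0, 201)
sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0], t_eval=t, rtol=1e-10, atol=1e-12)

# Maximum deviation of the numerical trajectory from the closed-form solution.
max_err = np.max(np.abs(sol.y[0] - np.cos(omega * t)))
```

With the tight tolerances above, `max_err` comes out far below visual plotting accuracy, confirming the closed-form solution.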
As is well known, we can arrive at exactly the same result using Lagrangian mechanics, which involves writing the Lagrangian for the problem as

$$L = T - V \tag{9}$$

where $T$ is kinetic energy and $V$ is potential energy. For the simple harmonic oscillator we have

$$T = \frac{1}{2}m\dot{x}^2 \tag{10}$$

and

$$V = \frac{1}{2}kx^2 \tag{11}$$

Therefore, the Lagrangian for the harmonic oscillator problem is

$$L = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2 \tag{12}$$
We define the action as the functional

$$S = \int_{t_1}^{t_2} L(x, \dot{x})\,dt \tag{13}$$

and seek the extremal function at which the action is stationary. The extremal is obtained by solving the first-order condition for this optimisation problem, known as the Euler-Lagrange equation:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0 \tag{14}$$

Applying the Euler-Lagrange equation to the Lagrangian for the harmonic oscillator problem in (12) we find

$$m\frac{d^2x}{dt^2} + kx = 0$$

which is the same as the ODE in (6) obtained by Newtonian mechanics.
[To briefly explain how the Euler-Lagrange equation is derived, we begin by constructing a function $x(t) + \varepsilon\eta(t)$, where $x(t)$ is the desired extremal, $\varepsilon$ is a parameter, and $\eta(t)$ denotes a function which goes through the same two endpoints $t_1$ and $t_2$ as the extremal, but has zero value at these endpoints. The new function then represents the set of all possible varied curves that can join the two endpoints. Out of all of these curves, we want the one that makes the action stationary, where the action is now treated as the function of $\varepsilon$ given by $S(\varepsilon) = \int_{t_1}^{t_2} L(x + \varepsilon\eta, \dot{x} + \varepsilon\dot{\eta})\,dt$. That is, we want $dS/d\varepsilon = 0$ when $\varepsilon = 0$. Noting that $\partial(x + \varepsilon\eta)/\partial\varepsilon = \eta$ and $\partial(\dot{x} + \varepsilon\dot{\eta})/\partial\varepsilon = \dot{\eta}$ and differentiating under the integral sign we get

$$\frac{dS}{d\varepsilon} = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial x}\eta + \frac{\partial L}{\partial \dot{x}}\dot{\eta}\right)dt$$

Setting $\varepsilon = 0$ and $dS/d\varepsilon = 0$ at this point, which is what is required, and remembering that $\varepsilon = 0$ means the varied curve reduces to the extremal $x(t)$, we obtain

$$\int_{t_1}^{t_2}\left(\frac{\partial L}{\partial x}\eta + \frac{\partial L}{\partial \dot{x}}\dot{\eta}\right)dt = 0$$

To convert the second term in this integrand into something multiplying $\eta$ instead of $\dot{\eta}$, we can use integration by parts to get

$$\int_{t_1}^{t_2}\frac{\partial L}{\partial \dot{x}}\dot{\eta}\,dt = \left[\frac{\partial L}{\partial \dot{x}}\eta\right]_{t_1}^{t_2} - \int_{t_1}^{t_2}\frac{d}{dt}\frac{\partial L}{\partial \dot{x}}\,\eta\,dt = -\int_{t_1}^{t_2}\frac{d}{dt}\frac{\partial L}{\partial \dot{x}}\,\eta\,dt$$

where the boundary term vanishes because $\eta$ is zero at the endpoints. Putting this result into the previous integral we get

$$\int_{t_1}^{t_2}\left(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}\right)\eta\,dt = 0$$

from which the Euler-Lagrange equation in (14) follows since $\eta$ is arbitrary.]
We now want to see how this can be transformed into a Hamiltonian mechanics problem using the Legendre transform outlined earlier. We begin by defining a new variable $p$ to replace the partial $\partial L/\partial \dot{x}$, which is usually referred to as the generalised momentum in the mechanics literature:

$$p = \frac{\partial L}{\partial \dot{x}} \tag{15}$$

(In the case of the simple harmonic oscillator problem, this will turn out to be the actual linear momentum. The other partial in the Euler-Lagrange equation, $\partial L/\partial x$, is usually referred to as the generalised force in the mechanics literature, and again will turn out to be the actual force in the case of the simple harmonic oscillator). We now regard (15) as the analogue of equation (1). It follows that the analogue of equation (2) must then be

$$\dot{x} = \frac{\partial H}{\partial p} \tag{16}$$

where $H$ is the Legendre transform that takes us from (15) to (16), namely

$$H = p\dot{x} - L \tag{17}$$

This is the analogue of (3). This Legendre transform now enables us to write down the Hamiltonian for the harmonic oscillator problem, starting from the Lagrangian in (12). We begin by using (15) to obtain

$$p = m\dot{x} \tag{18}$$

which allows us to replace $\dot{x}$ by $p/m$ wherever it occurs. Then using (12) in (17) with $p/m$ instead of $\dot{x}$ we get

$$H = \frac{p^2}{m} - \frac{1}{2}m\left(\frac{p}{m}\right)^2 + \frac{1}{2}kx^2 = \frac{p^2}{2m} + \frac{1}{2}kx^2 \tag{19}$$
Therefore, we see that whereas $L = T - V$, the Legendre transform has instead given us $H = T + V$. We can also get an equation in terms of $H$ for the Euler-Lagrange equation in (14) above by noting from (17) that

$$\frac{\partial H}{\partial x} = -\frac{\partial L}{\partial x} \tag{20}$$

and from (14) and (15) that

$$\frac{\partial L}{\partial x} = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = \frac{dp}{dt} \tag{21}$$

Therefore, the Euler-Lagrange condition for making the action stationary can be written in terms of $H$ as the pair of equations

$$\frac{dx}{dt} = \frac{\partial H}{\partial p}$$

which is just (16) and represents the act of carrying out the Legendre transform, and

$$\frac{dp}{dt} = -\frac{\partial H}{\partial x}$$

which from (20) and (21) represents the Euler-Lagrange condition expressed in terms of the Hamiltonian. These two first-order ODEs are usually referred to as Hamilton's equations, and are often more convenient to solve than the second-order ODEs arising in Lagrangian mechanics. The fact that the Hamiltonian is expressed in terms of $x$ and $p$ and represents the total energy of the system also allows geometric insights to be conveniently obtained, e.g., by plotting constant-energy contours for the system in $x$-$p$ phase space. All of this means that, when faced with a mechanics problem, we can usually simply state the Hamiltonian as $H = T + V$, then use Hamilton's equations as the first-order conditions for making the action stationary, as long as we know the kinetic energy $T$ and the potential energy $V$ for the problem in terms of $x$ and $p$.
In the case of the simple harmonic oscillator, Hamilton's equations give us

$$\frac{dx}{dt} = \frac{\partial H}{\partial p} = \frac{p}{m} \tag{22}$$

and

$$\frac{dp}{dt} = -\frac{\partial H}{\partial x} = -kx \tag{23}$$

From (22) we get

$$\frac{dp}{dt} = m\frac{d^2x}{dt^2}$$

and equating this with (23) we get

$$m\frac{d^2x}{dt^2} + kx = 0$$

which again is the same as the ODE in (6) obtained from Newtonian mechanics.
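Hamilton's equations can also be integrated directly as a first-order system. The Python sketch below (with illustrative parameter values of my own choosing) does this and additionally checks that the Hamiltonian, i.e., the total energy, is conserved along the computed trajectory:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hamilton's equations for the oscillator, H = p^2/(2m) + k x^2/2:
#   dx/dt = dH/dp = p/m,   dp/dt = -dH/dx = -k x.
# Illustrative parameters (my choice): m = 2, k = 8.
m, k = 2.0, 8.0

def hamilton(t, y):
    x, p = y
    return [p / m, -k * x]

sol = solve_ivp(hamilton, (0.0, 5.0), [1.0, 0.0],
                t_eval=np.linspace(0.0, 5.0, 101), rtol=1e-10, atol=1e-12)
x, p = sol.y

# The Hamiltonian (total energy) should be constant along the trajectory.
H = p**2 / (2 * m) + k * x**2 / 2
energy_drift = np.max(np.abs(H - H[0]))
```

The negligible `energy_drift` illustrates the geometric point made above: the motion stays on a single constant-energy contour in phase space.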
Finally, it is obvious from the form of the Hamiltonian in (19) that the level sets of $H$ for the simple harmonic oscillator, i.e., the constant-energy contours $\frac{p^2}{2m} + \frac{1}{2}kx^2 = E$ in $x$-$p$ phase space, will be closed elliptical orbits (circles when the axes are scaled so that $m = k = 1$), as shown in the diagram.
A heat flow problem led to the following general series solution:
An initial condition at required , so it was necessary to find the coefficients in the following sine series:
The sine functions here involve coefficients of of the form rather than the usual . They have period and by looking at graphs it is easy to convince oneself that
so
Also, orthogonality can be demonstrated in the usual ways (e.g., using the exponential form of the sine function):
for . In fact, the functions for form a complete, orthogonal set of basis functions and we can expand in this basis. To this end, we need to create an odd function involving of the form
The coefficients of the Fourier sine series are then obtained as
The full solution of the heat flow problem is thus
Bessel’s equation arises in countless physics applications and has the form

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - \nu^2)y = 0 \tag{1}$$

where the constant $\nu$ is known as the order of the Bessel function which solves (1). The method of Frobenius can be used to find series solutions for $y$ of the form

$$y = \sum_{n=0}^{\infty} a_n x^{n+s} \tag{2}$$

where $s$ is a number to be found by substituting (2) and its relevant derivatives into (1). We assume that $a_0$ is not zero, so the first term of the series will be $a_0 x^s$. Successive differentiation of (2) and substitution into (1) produces for each power of $x$ in (2) an equation equal to zero involving combinations of several coefficients $a_n$. (The total coefficient of each power of $x$ must be zero to ensure the right-hand side of (1) ends up as zero). The equation for the coefficient of $x^s$ is used to find the possible values of $s$ and is known as the indicial equation. For the Bessel equation in (1), this procedure results in the indicial equation

$$s^2 - \nu^2 = 0$$

so $s = \pm\nu$ and we need to find two solutions, one for $s = \nu$ and another for $s = -\nu$. A linear combination of these two solutions can then be used to construct a general solution of (1). The Frobenius procedure also leads to $a_1 = 0$ and the following recursion formula for the remaining coefficients:

$$a_n = -\frac{a_{n-2}}{(n+s)^2 - \nu^2} \tag{3}$$

We use this to find coefficients for $s = \nu$ first, and can then simply replace $\nu$ by $-\nu$ to get the coefficients for $s = -\nu$. Here, we focus only on the case $s = \nu$ which upon substitution in (3) gives

$$a_n = -\frac{a_{n-2}}{n(n + 2\nu)} \tag{4}$$

Evaluating these coefficients with a starting value of $a_0$ results in solutions called Bessel functions of the first kind and order $\nu$, usually denoted by $J_{\nu}(x)$. In the present note, I am concerned only with the Bessel function of the first kind and order zero, $J_0(x)$, obtained by setting $\nu = 0$. We then have a starting value $a_0 = 1$ and we obtain the following coefficients in the power series for $J_0(x)$:

$$a_2 = -\frac{1}{2^2}, \qquad a_4 = \frac{1}{2^2 4^2}, \qquad a_6 = -\frac{1}{2^2 4^2 6^2}$$

and so on. So, $a_1 = 0$ leads to all odd-numbered coefficients being zero and we end up with the following generalised series expression for the Bessel function of the first kind and order zero:

$$J_0(x) = \sum_{m=0}^{\infty}\frac{(-1)^m x^{2m}}{2^{2m}(m!)^2} \tag{5}$$
We can easily plot using MAPLE and it looks like a kind of damped cosine function. Here is a quick plot for positive values of :
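For readers without MAPLE, the same check is easy in Python: partial sums of the series in (5) can be compared against SciPy's built-in `j0`:

```python
import numpy as np
from math import factorial
from scipy.special import j0

def j0_series(x, terms=40):
    # Partial sum of J0(x) = sum_{m>=0} (-1)^m (x/2)^(2m) / (m!)^2.
    return sum((-1)**m * (x / 2)**(2 * m) / factorial(m)**2
               for m in range(terms))

# Compare the truncated series against SciPy's J0 over [0, 10].
xs = np.linspace(0.0, 10.0, 21)
max_err = max(abs(j0_series(x) - j0(x)) for x in xs)
```

Forty terms are ample on this range; the damped-cosine behaviour mentioned above is clearly visible if the values are plotted.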
The series in (5) above is given, albeit in a more general complex variable setting, as equation (3) on page 16 in the classic work by G. N. Watson, 1922, A Treatise On The Theory of Bessel Functions, Cambridge University Press. (At the time of writing, this monumental book is freely downloadable from several online sites). In the present note, I am interested in certain integral formulae which give rise to the same series as (5). These are discussed in Chapter Two of Watson’s treatise, and I want to unpick some things from there. In particular, I am intrigued by the following passage on pages 19 and 20 of Watson’s book:
In the remainder of this note, I want to obtain the series for in (5) above directly from the integral in equation (1) in this extract from Watson’s book, adapted to the real-variable case with . I also want to use Watson’s technique of bisecting the range of integration to obtain a clear intuitive understanding of other equivalent integral formulae for .
Begin by setting $n = 0$ and $z = x$ in the integral in (1) in the extract to get

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi}\cos(x\sin\theta)\,d\theta \tag{6}$$

To confirm the validity of this, we can substitute $u = x\sin\theta$ into the Taylor series expansion for $\cos u$, namely

$$\cos u = \sum_{m=0}^{\infty}\frac{(-1)^m u^{2m}}{(2m)!}$$

to get

$$\cos(x\sin\theta) = \sum_{m=0}^{\infty}\frac{(-1)^m x^{2m}\sin^{2m}\theta}{(2m)!} \tag{7}$$

Now integrate both sides of (7) from $0$ to $2\pi$, using a formula I explored in a previous note for integrating even powers of the sine function, namely

$$\int_0^{2\pi}\sin^{2m}\theta\,d\theta = 2\pi\,\frac{(2m-1)!!}{(2m)!!} \tag{8}$$

We obtain

$$\int_0^{2\pi}\cos(x\sin\theta)\,d\theta = 2\pi\sum_{m=0}^{\infty}\frac{(-1)^m x^{2m}}{(2m)!}\cdot\frac{(2m-1)!!}{(2m)!!}$$

so

$$\frac{1}{2\pi}\int_0^{2\pi}\cos(x\sin\theta)\,d\theta = \sum_{m=0}^{\infty}\frac{(-1)^m x^{2m}}{2^{2m}(m!)^2} = J_0(x)$$

Thus, (6) holds. Notice that the integral in (6) is the average value of the integrand over the interval from $0$ to $2\pi$. Looking at graphs of the sine and cosine functions, we can easily imagine that we can switch between the two and that this average also remains unchanged if we restrict the range of integration to the interval from $0$ to $\pi$, and divide the integral by $\pi$ instead of $2\pi$.

Thus, it seems intuitively obvious that the following are equivalent integral formulae for $J_0(x)$:

$$J_0(x) = \frac{1}{\pi}\int_0^{\pi}\cos(x\sin\theta)\,d\theta = \frac{1}{\pi}\int_0^{\pi}\cos(x\cos\theta)\,d\theta \tag{9}$$
In addition, if we substitute $u = ix\sin\theta$ into the Taylor series expansion for $e^u$, namely

$$e^u = \sum_{n=0}^{\infty}\frac{u^n}{n!}$$

we get

$$e^{ix\sin\theta} = \sum_{n=0}^{\infty}\frac{(ix\sin\theta)^n}{n!}$$

But when we integrate both sides of this from $0$ to $2\pi$, the odd terms on the right-hand side will vanish and the even terms will be the same as they were in the derivation of (9) above using the Taylor series for cos. We thus obtain another equivalent integral formula for $J_0(x)$:

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi}e^{ix\sin\theta}\,d\theta \tag{12}$$
Furthermore, we can again restrict the range of integration in (12) and switch between sine and cosine as indicated in the following graphs:
Thus, we obtain the following equivalent integral formulae for :
We can imagine easily obtaining many other equivalent integral formulae for in this way.
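The bisected-range formula $J_0(x) = \frac{1}{\pi}\int_0^{\pi}\cos(x\sin\theta)\,d\theta$ from this family is easy to verify numerically; a quick Python sketch using adaptive quadrature:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

def j0_integral(x):
    # J0(x) = (1/pi) * integral from 0 to pi of cos(x sin(theta)) d(theta)
    val, _ = quad(lambda th: np.cos(x * np.sin(th)), 0.0, np.pi)
    return val / np.pi

# Compare against SciPy's built-in J0 at a few sample points.
max_err = max(abs(j0_integral(x) - j0(x))
              for x in [0.0, 0.5, 1.0, 2.5, 5.0, 10.0])
```

The agreement is to quadrature tolerance, which supports the intuitive range-bisection argument above.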
I needed to integrate increasing odd and even integer powers of the sine function repeatedly in a power series, with limits from $0$ to $2\pi$. Looking at the graph of the sine function, it is obvious that since sine is itself an odd function its odd powers must integrate to zero over the interval from $0$ to $2\pi$ (intuitively, negatively signed areas will cancel positively signed ones).

For the even integer powers, I found the following definite integral formula with limits $0$ to $\pi/2$ in the standard reference book by I. S. Gradshteyn and I. M. Ryzhik, 1994, Table of Integrals, Series and Products, page 412:

$$\int_0^{\pi/2}\sin^{2n}x\,dx = \frac{(2n-1)!!}{(2n)!!}\cdot\frac{\pi}{2} \tag{1}$$

The double exclamation marks mean that we subtract 2 to get each subsequent term in the factorial, so

$$(2n)!! = 2n(2n-2)(2n-4)\cdots 4\cdot 2$$

and

$$(2n-1)!! = (2n-1)(2n-3)(2n-5)\cdots 3\cdot 1$$

The symmetry of the graph and the fact that even powers of sine are even functions suggest that the right-hand side of (1) only needs to be multiplied by 4 to get the correct definite integral formula for the interval $0$ to $2\pi$. Thus, we obtain the following useful formula for integrating even integer powers of sine over a full period:

$$\int_0^{2\pi}\sin^{2n}x\,dx = 2\pi\,\frac{(2n-1)!!}{(2n)!!} \tag{2}$$
For peace of mind, I also wanted to prove the formula in (2) by direct integration and I found a way of doing this using integration by parts as follows. We have

$$\int_0^{2\pi}\sin^n x\,dx = \int_0^{2\pi}\sin^{n-1}x\,\sin x\,dx = \Big[-\sin^{n-1}x\cos x\Big]_0^{2\pi} + (n-1)\int_0^{2\pi}\sin^{n-2}x\,\cos^2 x\,dx$$

The boundary term vanishes, and writing $\cos^2 x = 1 - \sin^2 x$ gives

$$\int_0^{2\pi}\sin^n x\,dx = (n-1)\int_0^{2\pi}\sin^{n-2}x\,dx - (n-1)\int_0^{2\pi}\sin^n x\,dx$$

so

$$\int_0^{2\pi}\sin^n x\,dx = \frac{n-1}{n}\int_0^{2\pi}\sin^{n-2}x\,dx$$

Continuing this pattern recursively with the integral on the right-hand side we get

$$\int_0^{2\pi}\sin^n x\,dx = \frac{(n-1)(n-3)(n-5)\cdots}{n(n-2)(n-4)\cdots}\int_0^{2\pi}\sin^{n-2\lfloor n/2\rfloor} x\,dx$$

where the final integral is $\int_0^{2\pi}dx = 2\pi$ if $n$ is even and $\int_0^{2\pi}\sin x\,dx = 0$ if $n$ is odd. From the pattern of the numerators we see that this evaluates to zero if $n$ is odd and to

$$2\pi\,\frac{(n-1)!!}{n!!}$$

if $n$ is even, in agreement with (2).
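A quick numerical confirmation of the full-period formula (2), and of the vanishing of odd powers, can be written in Python:

```python
import numpy as np
from scipy.integrate import quad

def double_factorial(n):
    # n!! = n (n-2) (n-4) ... down to 1 or 2; 0!! = (-1)!! = 1 by convention.
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def sine_power_integral(m):
    # Closed form (2): integral from 0 to 2 pi of sin^(2m)(x) dx.
    return 2 * np.pi * double_factorial(2 * m - 1) / double_factorial(2 * m)

# Compare the closed form against adaptive quadrature for m = 1..5.
ok = True
for m in range(1, 6):
    numeric, _ = quad(lambda x: np.sin(x)**(2 * m), 0.0, 2 * np.pi)
    ok = ok and abs(numeric - sine_power_integral(m)) < 1e-8

# Odd powers should integrate to zero over a full period.
odd, _ = quad(lambda x: np.sin(x)**5, 0.0, 2 * np.pi)
```

For example, $m = 1$ gives $2\pi\cdot\tfrac{1}{2} = \pi$ and $m = 2$ gives $2\pi\cdot\tfrac{3}{8} = \tfrac{3\pi}{4}$, both matching the quadrature values.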
A basic result from applying Gauss' Law of electromagnetism to a conductor is that any excess charge always ends up distributed entirely on the surface of the conductor. Furthermore, the excess charge will always distribute itself there so as to produce an equipotential surface. The resulting surface charge density distribution per unit area is usually difficult to obtain as a formula in closed form but it happens to be known for a conducting ellipsoid with general equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{1}$$

If the total excess charge on the surface of this conducting ellipsoid is $Q$, the distribution of surface charge density per unit area is given by

$$\sigma = \frac{Q}{4\pi abc}\left(\frac{x^2}{a^4} + \frac{y^2}{b^4} + \frac{z^2}{c^4}\right)^{-1/2} \tag{2}$$

(see, e.g., D. Griffiths, 1999, Introduction to Electrodynamics, Problem 2.52, page 109). Interestingly, it is possible to deduce from this the surface charge density distribution of other conducting solids that can be obtained as special cases of a conducting ellipsoid by suitable choices for the parameters $a$, $b$ and $c$, and I wanted to make a note of this here.
For example, we can immediately obtain the surface charge density distribution of a conducting sphere of radius $R$ by setting $a = b = c = R$ in (1) and (2) above to get the sphere

$$x^2 + y^2 + z^2 = R^2 \tag{3}$$

and its corresponding surface charge density

$$\sigma = \frac{Q}{4\pi R^3}\left(\frac{x^2 + y^2 + z^2}{R^4}\right)^{-1/2} = \frac{Q}{4\pi R^2} \tag{4}$$

where the last equality follows by (3).
Note that a unit sphere

$$x^2 + y^2 + z^2 = 1 \tag{5}$$

can be transformed into the ellipsoid in (1) above via a scaling of each of the variables, i.e., the change of variables $x \to x/a$, $y \to y/b$, $z \to z/c$. Thus, we can obtain the surface charge density of a circular conducting disk of radius $R$ coplanar with the $xy$-plane and with centre at the origin by setting $a = b = R$ and letting $c \to 0$ in (2), and using the fact that on the surface of the ellipsoid we have

$$\frac{z^2}{c^4} = \frac{1}{c^2}\left(1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}\right) \tag{6}$$

so that (2) can be rearranged to read

$$\sigma = \frac{Q}{4\pi ab}\left(\frac{c^2x^2}{a^4} + \frac{c^2y^2}{b^4} + 1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}\right)^{-1/2} \tag{7}$$

Thus, setting $a = b = R$ and $c = 0$ in (7), we get the following formula for the surface charge density per unit area of a circular conducting disk of radius $R$ and total excess charge $Q$ (spread over both sides) at a point a radial distance $r = \sqrt{x^2 + y^2}$ from its centre:

$$\sigma(r) = \frac{Q}{4\pi R^2}\left(1 - \frac{r^2}{R^2}\right)^{-1/2} = \frac{Q}{4\pi R\sqrt{R^2 - r^2}} \tag{8}$$
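As a sanity check on the disk formula $\sigma(r) = \frac{Q}{4\pi R\sqrt{R^2 - r^2}}$, integrating it over both faces of the disk should recover the total charge $Q$. A Python sketch (with illustrative values $Q = 1$, $R = 2$ of my own choosing):

```python
import numpy as np
from scipy.integrate import quad

# Disk of radius R carrying total charge Q over both faces:
# sigma(r) = Q / (4 pi R sqrt(R^2 - r^2)) on each face.
Q, R = 1.0, 2.0

def sigma(r):
    return Q / (4 * np.pi * R * np.sqrt(R**2 - r**2))

# Integrate sigma over one face in polar coordinates, then double it.
# The integrand has an integrable 1/sqrt singularity at r = R, which
# adaptive quadrature handles.
per_face, _ = quad(lambda r: sigma(r) * 2 * np.pi * r, 0.0, R)
total_charge = 2 * per_face
```

The total comes out equal to $Q$, confirming that the density is normalised correctly over both sides of the disk.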
I recently needed to integrate the fifth power of a sine function from zero to and did this via a hypergeometric function, which I found quite interesting. I will make a quick note of it here. I came across the following entry in a table of integrals:
The function is a special function known as the ordinary hypergeometric function. The required definite integral for , and limits to is then
What interested me is that there is a summation theorem (Gauss's theorem) for the case $z = 1$ which yields a simple ratio of products of the gamma function when $\mathrm{Re}(c - a - b) > 0$:

$${}_2F_1(a, b; c; 1) = \frac{\Gamma(c)\,\Gamma(c - a - b)}{\Gamma(c - a)\,\Gamma(c - b)}$$
Since this condition happens to be satisfied in our case, we thus have
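Gauss's summation theorem is easy to check numerically with mpmath. Since the specific parameter values for this integral are not reproduced above, the sketch below uses illustrative values $a, b, c$ of my own choosing that satisfy the convergence condition:

```python
from mpmath import mp, mpf, hyp2f1, gamma

mp.dps = 30  # work to 30 significant digits

# Illustrative parameters (my choice); note c - a - b = 4 > 0, so
# Gauss's theorem applies.
a, b, c = mpf('0.5'), mpf('-2.5'), mpf('2')

# Left side: the ordinary hypergeometric function evaluated at z = 1.
lhs = hyp2f1(a, b, c, 1)

# Right side: Gauss's ratio of gamma functions.
rhs = gamma(c) * gamma(c - a - b) / (gamma(c - a) * gamma(c - b))

err = abs(lhs - rhs)
```

The two sides agree to working precision, illustrating how the hypergeometric value at $z = 1$ collapses to a ratio of gamma functions.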
When confronted with a first-order differential equation, try to convert it to one of these five basic types:
1). Separable: This is when $\frac{dy}{dx} = f(x, y)$ can be written $\frac{dy}{dx} = g(x)h(y)$. If there is an initial condition $y(x_0) = y_0$, we can rearrange to get an equation involving two integrals:

$$\int_{y_0}^{y}\frac{ds}{h(s)} = \int_{x_0}^{x}g(t)\,dt$$

This incorporates the initial condition. It can sometimes be solved to give a solution in the form $y = y(x)$. Note the following integration tricks which are sometimes useful here:
2). Homogeneous: These equations are those for which $f(x, y)$ depends only on the ratio $y/x$ rather than on $x$ and $y$ separately. So, they are of the form $\frac{dy}{dx} = g(y/x)$ with initial condition $y(x_0) = y_0$. To transform this into a separable equation, make the change of variable $v = y/x$. Then $y = vx$, so we have $\frac{dy}{dx} = v + x\frac{dv}{dx}$ and therefore $x\frac{dv}{dx} = g(v) - v$. The initial condition becomes $v(x_0) = v_0$, so $v_0 = y_0/x_0$. The method of separation of variables can now be applied to the transformed equation.
3). Non-separable linear first-order equations: These have the general form $\frac{dy}{dx} + p(x)y = q(x)$, with boundary condition $y(a) = b$. Here, we need to find an integrating factor $R(x)$ that allows us to write the differential equation as

$$\frac{d}{dx}\big(R(x)y\big) = R(x)q(x)$$

Expanding this form, we find that

$$R\frac{dy}{dx} + \frac{dR}{dx}y = Rq$$

and so

$$\frac{dy}{dx} + \frac{1}{R}\frac{dR}{dx}y = q$$

This shows that in the original equation we must have $p(x) = \frac{1}{R}\frac{dR}{dx}$ and therefore the required integrating factor is $R(x) = e^{\int p(x)\,dx}$. So what we do is find the integrating factor, then write the equation in the form $\frac{d}{dx}(Ry) = Rq$. This can then be solved by integration to get $Ry = \int Rq\,dx + C$ where $C$ is a constant of integration. If the constant of integration cannot be evaluated directly, it is best to give the solution as

$$y(x) = \frac{1}{R(x)}\left(\int_a^x R(t)q(t)\,dt + R(a)\,b\right)$$

because this expression automatically satisfies the boundary condition (when $x = a$, the integral vanishes and so the equation reduces to $y(a) = b$).
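The integrating-factor recipe can be checked symbolically with SymPy. The sketch below uses an illustrative equation of my own choosing, $y' + \frac{2}{x}y = x$ with $y(1) = 0$, and compares the definite-integral solution formula against SymPy's `dsolve`:

```python
import sympy as sp

# Illustrative linear first-order ODE (my choice): p(x) = 2/x, q(x) = x,
# with boundary condition y(1) = 0.
x = sp.symbols('x', positive=True)
t = sp.symbols('t', positive=True)
y = sp.Function('y')

p, q = 2 / x, x
mu = sp.simplify(sp.exp(sp.integrate(p, x)))   # integrating factor R(x) = x**2

# Definite-integral form: y = (1/R) * (int_a^x R q dt + R(a) y(a)),
# with a = 1 and y(1) = 0 the second term vanishes.
y_formula = sp.integrate((mu * q).subs(x, t), (t, 1, x)) / mu

# Cross-check against SymPy's dsolve with the same boundary condition.
ode = sp.Eq(y(x).diff(x) + p * y(x), q)
y_dsolve = sp.dsolve(ode, y(x), ics={y(1): 0}).rhs

diff = sp.simplify(y_formula - y_dsolve)
```

Both routes give $y = \frac{x^2}{4} - \frac{1}{4x^2}$, and substituting $x = 1$ recovers the boundary condition.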
4). Bernoulli’s equation: This is a nonlinear first-order equation of the form $\frac{dy}{dx} + p(x)y = q(x)y^n$. To solve this, make the change of variable $v = y^{1-n}$. Then $\frac{dv}{dx} = (1-n)y^{-n}\frac{dy}{dx}$, so the original equation can be rewritten as

$$\frac{dv}{dx} + (1-n)p(x)v = (1-n)q(x)$$

This is now a linear first-order equation of the type 3) above.
5). Riccati’s equation: This is a nonlinear first-order equation of the form $\frac{dy}{dx} = p(x) + q(x)y + r(x)y^2$. It reduces to a linear equation if $r(x) = 0$ and to Bernoulli’s equation if $p(x) = 0$. Otherwise, it can be reduced to a linear second-order equation by defining a new dependent variable $u$ with the equation $y = -\frac{u'}{r(x)u}$. This converts the original Riccati equation into

$$u'' - \left(q(x) + \frac{r'(x)}{r(x)}\right)u' + p(x)r(x)u = 0$$

which can be solved using known methods depending on the specific details of each case.
My first academic job was as a lecturer in economics at Royal Holloway, University of London, where I was responsible for the second-year mathematical methods and econometrics course EC2203 Quantitative Methods in Economics II. I have lost all the original manuscripts of the lecture notes but still have scanned copies of most of the printouts (only Lecture 18, Estimation of structural equations, is lost forever, unfortunately!). I am posting them here in the order in which I delivered them.
The present note uses the setup outlined in a previous note about a 2016 paper in Physical Review Letters on the large deviation analysis of rapid-onset rain showers. This paper will be referred to as MW2016 herein. In order to apply large deviation theory to a collector-drop framework for rapid-onset rain formation, equation [14] in MW2016 presents the cumulant generating function for the random variable , which I will write here as
The Laplace transform of the required probability density is , so equation [15] in MW2016 sets up a relevant Bromwich integral equation, which I will write here as
At first sight, this seems intractable directly, and a saddle-point approximation for the density in (1) is presented in equation [17] in MW2016. This saddle-point approximation involves the large deviation entropy function for the random variable , and an intricate asymptotic analysis is then embarked upon to obtain an explicit expression for this entropy function. The result is the expression given in equation [24] in MW2016.
However, instead of approximating (1), it is possible to work out the Bromwich integral exactly. The result is the exact version of the probability density which is approximated in equation [17] in MW2016. This is a known distribution called the generalised Erlang distribution, but I have not been able to find any derivations of it using the Bromwich integral approach. Typically, derivations are carried out using characteristic functions (see, e.g., Bibinger, M., 2013, Notes on the sum and maximum of independent exponentially distributed random variables with different scale parameters, arXiv:1307.3945 [math.PR]), or phase-type distribution methods (see, e.g., O’Cinneide, C., 1999, Phase-type distributions: open problems and a few properties, Communications in Statistics - Stochastic Models, 15(4), pp. 731-757).
To derive this distribution directly from the Bromwich integral in (1), begin by replacing the term by the product it gives rise to:
Then, starting from an elementary bivariate partial fraction expansion, and extending this to variables by induction, we can obtain the following expression for the product term in (2):
The detailed derivation of (3) is set out in an Appendix below. Putting (3) back into (2), and moving the integral sign 'through' the summation and product in (3), we obtain the following form for the Bromwich integral:
Each integrand in (4) has a simple pole at , so the residue at each such singularity is given by
We can construct a contour in the complex plane which is appropriate for each by creating a semicircle with the straight vertical part located at a point to the right of , and with the arc of the semicircle large enough to enclose the point on the real axis (diagram below). Since is the smallest collision rate, and is the largest, this ensures that the contour will enclose each point.
It then follows by Cauchy’s residue theorem, and the fact that the integral along the circular arc component of the contour vanishes as the radius of the semicircle goes to infinity, that
Putting (6) in (4), we then get the exact form of the required density:
This is the density function of the generalised Erlang distribution, and is given in an alternative (but equivalent) form in equation [SB1.1] on page 241 of a paper by Kostinski and Shaw (Kostinski, A., Shaw, R., 2005, Fluctuations and Luck in Droplet Growth by Coalescence, Bull. Am. Meteorol. Soc. 86, 235). Integrating (7) with respect to , we obtain the cumulative distribution function
Again this is given in an alternative (but equivalent) form on page 241 of Kostinski and Shaw’s paper, as equation [SB1.2].
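The generalised Erlang (hypoexponential) results can be checked by simulation. The sketch below assumes the standard form of the cdf, $F(t) = 1 - \sum_i \big[\prod_{j\neq i}\lambda_j/(\lambda_j-\lambda_i)\big]e^{-\lambda_i t}$, with illustrative distinct rates of my own choosing, and compares it against the empirical cdf of a simulated sum of independent exponentials:

```python
import numpy as np
from math import prod

rng = np.random.default_rng(0)

# Illustrative distinct rates (my choice); the sum T = T_1 + ... + T_n of
# independent exponentials with these rates is hypoexponential
# ("generalised Erlang") distributed.
rates = [1.0, 2.0, 3.5, 5.0]

def cdf(t):
    # Standard hypoexponential cdf:
    # F(t) = 1 - sum_i [prod_{j != i} l_j/(l_j - l_i)] * exp(-l_i t)
    total = 0.0
    for i, li in enumerate(rates):
        w = prod(lj / (lj - li) for j, lj in enumerate(rates) if j != i)
        total += w * np.exp(-li * t)
    return 1.0 - total

# Monte Carlo check: empirical cdf of the simulated sum at a test point.
samples = sum(rng.exponential(1.0 / l, size=200_000) for l in rates)
t0 = 1.5
mc = np.mean(samples <= t0)
err = abs(cdf(t0) - mc)
```

The analytic and simulated values agree to Monte Carlo accuracy, which is reassuring given how easily the alternating-sign weights in the formula can be mistyped.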
Although we now have the exact pdf and cdf corresponding to the Bromwich integral in MW2016, these are not practical for calculations in the context of runaway raindrop growth. This runaway process can involve large numbers of water particle collisions , e.g., , so the expressions in (7) and (8) would involve sums of thousands of product terms, each of these products in turn involving thousands of terms. Therefore, we are still interested in obtaining a formula for the entropy function to enable large deviation approximations for (7) and (8) for practical calculations. It may be possible to obtain an approximation for the entropy function directly from the cdf in (8), but this would require further investigation.
An initial problem that might need to be overcome is that the forms for the cdf in (8) above and also in Kostinski and Shaw (which is similar) are inconvenient. However, they can be manipulated as follows. First, expand (8) to obtain
A recent paper by Yanev (Yanev, G., 2020, Exponential and Hypoexponential Distributions: Some Characterizations, Mathematics, 8(12), 2207) pointed out that the product terms appearing in (9) are related to the terms appearing in the Lagrange polynomial interpolation formula. This formula and its properties are well known in the field of approximation theory (see, e.g., pages 33-35 in Powell, M., 1981, Approximation Theory and Methods, Cambridge University Press). This fact was used in a paper by Sen and Balakrishnan (Sen, A., Balakrishnan, N., 1999, Convolution of geometrics and a reliability problem, Statistics & Probability Letters, 43, pp. 421-426) to give a simple proof that the sum/product which appears as the first term on the right-hand side of (9) is identically equal to 1:
(See my previous note about this). Using (10) in (9), we then obtain the cdf in a potentially more convenient form:
Appendix: Proof of the partial fraction expansion in (3)
The following arguments use the Lagrange polynomial interpolation formula to prove the partial fraction expansion in (3) above. We will need formula (6) in my previous note about this, which is repeated here for convenience:
To prove (3) above, we begin by using Lagrange’s polynomial interpolation formula to prove the following auxiliary result:
Taking apart (12) above (which equals 1) and using (10) above, we obtain
Therefore we conclude
We will need this result shortly. Next, we expand the bracketed terms on the left-hand side of (13) by partial fractions. We write
where and are to be determined. Therefore
Upon setting , (17) yields
Putting (18) in (16), we see that the left-hand side of (13) becomes
Now, the term in curly brackets in (19) is
Therefore (19) becomes
The last equality follows because the first term in the penultimate step is zero by (15) above, and the sum in the square brackets in the second term in the penultimate step is equal to 1 by (10) above. Therefore the auxiliary result in (13) is now proved.
Now we use this result to prove (3) by mathematical induction. As the basis for the induction, we can use the case which is easy to prove by repeated bivariate partial fraction expansions, and clearly exhibits the pattern represented by (3):
Next, suppose (3) is true. Then adding another factor to the product on the left-hand side of (3) would produce
where the expressions in square brackets in the two steps before the last are equal by the auxiliary result (13) proved earlier. Therefore, by the principle of mathematical induction, (3) must hold for all , so (3) is proved.
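The induction result can also be checked numerically. The sketch below assumes the expansion has the standard form $\prod_i \frac{\lambda_i}{s+\lambda_i} = \sum_i \big[\prod_{j\neq i}\frac{\lambda_j}{\lambda_j-\lambda_i}\big]\frac{\lambda_i}{s+\lambda_i}$ and evaluates both sides at several values of $s$ for illustrative distinct rates of my own choosing; it also checks the sum-to-one identity used in the proof:

```python
from math import prod

# Illustrative distinct rates (my choice).
rates = [1.0, 2.0, 3.5, 5.0, 7.25]

def lhs(s):
    # Product form: prod_i l_i / (s + l_i)
    return prod(l / (s + l) for l in rates)

def weight(i):
    # prod_{j != i} l_j / (l_j - l_i)
    li = rates[i]
    return prod(lj / (lj - li) for j, lj in enumerate(rates) if j != i)

def rhs(s):
    # Partial fraction form: sum_i weight(i) * l_i / (s + l_i)
    return sum(weight(i) * rates[i] / (s + rates[i]) for i in range(len(rates)))

max_err = max(abs(lhs(s) - rhs(s)) for s in [0.1, 0.7, 1.3, 4.0, 10.0])

# The Lagrange-interpolation identity: the weights sum to 1.
weight_sum = sum(weight(i) for i in range(len(rates)))
```

Both checks pass to floating-point accuracy, which is a useful guard against index slips in the inductive argument.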
In this note, I develop and implement a spline function approximation scheme that can quickly generate accurate benchmark solutions for large deviation theory problems. To my knowledge, spline function approximation methods have not been applied before in the context of large deviation theory. This note shows that such methods can be efficient and effective even when it is computationally difficult or impossible to obtain the required numerical results in other ways.
In fact, the effort in this note was initially motivated by computational difficulties encountered while trying to obtain fine-grained sets of numerical values for the large deviation functions in a 2016 paper in Physical Review Letters (outlined in a previous note) about the large deviation analysis of rapid-onset rain showers. This paper will be referred to as MW2016 herein. It is desirable to carry out such computations using sophisticated mathematical software systems such as MAPLE, which come with many built-in features that can make the necessary programming convenient and transparent. However, attempts to compute fine-grained exact function values for the cumulant generating function and its derivatives proved to be difficult and time consuming in MAPLE. Carrying out these computations for the case took hours and it was even more time consuming to obtain the necessary fine-grained exact function values for the case . In view of these difficulties, it seemed desirable to attempt to develop a new numerical solution system for large deviation problems that would enable accurate numerical results to be produced in MAPLE, using only the simplest mathematical functions. This note reports my experiments with one way of achieving this using cubic spline functions.
A closer look at the cumulant generating function
For what follows, assume the same simple power law collision rates as in MW2016, with the normalisations and . Also assume a given collision rate parameter , and that . The cumulant generating function in equation [14] in MW2016 is then a smooth, strictly concave function in both and . To see this, write it as
The first two partial derivatives with respect to are
and
which confirms strict concavity with respect to . To find the partials with respect to , we can use the Euler-Maclaurin formula to express (1) as
where , are the even-indexed Bernoulli numbers, and is a remainder term. Numerical experiments with MAPLE indicate that only the first two terms on the right-hand side of the Euler-Maclaurin formula are needed to obtain accurate approximations of our cumulant generating function over the entire range of relevant values of , and . Thus, to a close approximation we have
Using Leibniz’s Rule, we can now obtain the first two partials with respect to as
and
which confirms strict concavity with respect to . The cross-partial derivative is
For completeness, the first two partials with respect to the collision rate parameter , using (1), are obtained as
and
Thus, the cumulant generating function is a strictly convex function of .
The following is a three-dimensional MAPLE plot of the cumulant generating function for , , and :
A simple idea from differential geometry
It is known in classical differential geometry that a smooth and strictly concave curve has an osculating parabola at each of its points. (See, e.g., Har’el, Z., 1995, Curvature of Curves and Surfaces – A Parabolic Approach, URL: http://harel.org.il/zvi/docs/parabola.pdf, and references therein). In particular, the curve at each point should match a Taylor expansion of the function around that point at least up to second-order. Some experiments in MAPLE indicated that remarkably accurate fits to the strictly concave -curves of the cumulant generating function can indeed be obtained using second-order Taylor expansions with respect to , for given values of and . For example, the following are some 3D MAPLE plots showing closely fitting quadratic Taylor polynomials to different parts of the -curve for and :
In these plots, the osculating quadratic Taylor expansions of with respect to are shown centered at different points along the -curve. When just fifteen such quadratic pieces are spliced together, they form a quadratic spline providing an essentially perfect approximation of the cumulant generating function from to , as shown below:
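The splicing idea is easy to reproduce outside MAPLE. Since the MW2016 cumulant generating function is not reproduced in this note, the Python sketch below uses a stand-in smooth, strictly concave function of my own choosing, splices together fifteen osculating quadratic Taylor pieces centred at equally spaced points, and measures the worst-case error:

```python
import numpy as np

# Stand-in smooth, strictly concave function (my choice) with its first
# two derivatives, playing the role of a k-curve of the CGF.
f   = lambda k: np.log1p(k) - 0.1 * k**2
df  = lambda k: 1.0 / (1.0 + k) - 0.2 * k
d2f = lambda k: -1.0 / (1.0 + k)**2 - 0.2

# Sixteen Taylor centres give fifteen interior cells; each point k is
# approximated by the quadratic Taylor piece at its nearest centre.
centers = np.linspace(0.0, 5.0, 16)
edges = (centers[:-1] + centers[1:]) / 2

def spline_approx(k):
    c = centers[np.searchsorted(edges, k)]
    return f(c) + df(c) * (k - c) + 0.5 * d2f(c) * (k - c)**2

ks = np.linspace(0.0, 5.0, 501)
max_err = max(abs(spline_approx(k) - f(k)) for k in ks)
```

The worst-case error of the spliced quadratic pieces is tiny relative to the scale of the function, mirroring the "essentially perfect" fit seen in the MAPLE plots.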
This suggests that numerical approximation schemes involving spline functions might be highly effective for the cumulant generating function in MW2016, and this indeed proved to be the case as discussed below.
Within each of the parabolic segments in the above plots, the cumulant generating function behaves to a very close approximation like a simple quadratic in . In fact, this applies at every point of the cumulant generating function, i.e., for every relevant value of and , and indeed for every relevant value of . Before embarking on the development of spline function approximation techniques, it therefore seems desirable to try to understand better how the large deviation theory in MW2016 manifests itself in this simple quadratic setup. To this end, suppose that and are given, and consider a point on the corresponding -curve of the cumulant generating function. There will be a parabolic segment around this point within which the cumulant generating function behaves exactly like the following quadratic in :
The coefficients here are just (1), (2) and (3) above, evaluated at the given values of and , and at . To simplify notation in what follows, we will write the partial derivatives in (11) as
and
The entropy function in MW2016 is
Differentiating the expression in curly brackets with respect to and setting equal to zero in order to find , using (11) above, we obtain
This expression is interesting for two reasons. First, recall that for any given there is a unique . Suppose we are in fact within the parabolic segment containing for such a given value of . If the centre of the Taylor expansion for that particular parabolic segment is slightly away from , then the required correction for the given is represented by the second term in (15) above, which involves . There is no correction needed if , and this happens when in (15) above. But this is exactly the saddlepoint condition in equation [16] in MW2016, so one of the key equations in the large deviation theory in MW2016 has now appeared naturally in this simple quadratic setup.
The second reason that (15) seems interesting is that it resembles the well-known Newton-Raphson formula used for solving equations iteratively. This suggests that (15) might actually provide an efficient way of solving in order to find exact values of for different values of . Experiments in MAPLE do indeed confirm that (15) is a useful formula for quickly obtaining values of , particularly in the case with large values of , say . Using the Maximize command in MAPLE’s Optimization package, with the right-hand side of (15) as the objective function and given values of , one very quickly obtains values of as the maximised values of the objective function. This approach is much faster in the case than using MAPLE’s fsolve command directly with formula [16] in MW2016, though fsolve does work well in the case .
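Since the exact MW2016 formulas are not reproduced in this extract, the Newton-Raphson idea can be sketched in Python with a stand-in cumulant generating function. Here I assume, purely for illustration, the Poisson form K(k) = μ(e^k − 1), whose saddlepoint condition K′(k) = x has the closed-form solution k* = ln(x/μ) against which the iteration can be checked; the entropy is then the Legendre transform evaluated at k*:

```python
# Hedged sketch: the Poisson K(k) = mu*(exp(k) - 1) is a stand-in,
# not MW2016's function. Newton's iteration solves the saddlepoint
# condition K'(k) = x via k <- k - (K'(k) - x)/K''(k).
import math

def newton_saddlepoint(Kp, Kpp, x, k0=0.0, tol=1e-12, max_iter=50):
    """Solve K'(k) = x by Newton-Raphson iteration."""
    k = k0
    for _ in range(max_iter):
        step = (Kp(k) - x) / Kpp(k)
        k -= step
        if abs(step) < tol:
            break
    return k

mu = 1.0
Kp  = lambda k: mu * math.exp(k)   # K'(k)
Kpp = lambda k: mu * math.exp(k)   # K''(k)

x = 1.7
k_star = newton_saddlepoint(Kp, Kpp, x)          # should equal ln(x/mu)
entropy = k_star * x - mu * (math.exp(k_star) - 1.0)   # k*x - K(k*)
```

Because K is convex, K″ > 0 everywhere and the iteration converges quadratically from a wide range of starting points, which is consistent with the observation that (15) resolves quickly even in numerically awkward parameter regimes.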
Putting (15) into (11) and simplifying, we get
Again, we can regard the second term here as the correction needed when the centre of the Taylor expansion is not quite equal to the optimal value . Putting (16) into (14), we get the entropy function as
where again the last term is a correction, and, finally, the saddlepoint density in [17] in MW2016 is obtained here as
This analysis indicates how the large deviation theory in MW2016 can be linked to a simple quadratic spline approximation of the cumulant generating function. The question now is how to implement these ideas computationally as a spline function approximation scheme that yields rapid and accurate quantitative results for any desired values of , , and . A simple scheme immediately suggests itself, as described in the next section.
A spline approximation approach for MAPLE
The following spline function approximation approach for MAPLE implements the insights from the previous sections, and proves to work extremely well in the sense that it produces highly accurate results very quickly, using only the simplest polynomials, for any desired values of , , and :
a). First, we obtain a small sample of exact values of corresponding to some given values of between 0.01 and 2.00, using equation [16] in MW2016 or equation (15) above. Experiments show that 40 sample values are ample for highly accurate spline approximations. These sample values can be obtained very quickly using MAPLE’s fsolve or Maximize commands, as indicated in the previous section.
b). Second, we use this small sample of exact values to create spline function representations of and , analogous to the quadratic spline created manually for the cumulant generating function in the previous section. That quadratic spline worked very well because the cumulant generating function is smooth and well behaved, but even more accurate results can be obtained with cubic splines. MAPLE routines can be written to generate such cubic spline approximations very quickly.
c). Next, we obtain the first and second derivatives of the spline function representation of . The component pieces of this spline will all be simple cubics, so this differentiation exercise is handled very quickly by MAPLE.
d). Finally, using the cubic spline function representation of , we can now generate fine-grained numerical data by evaluating for as many values of as desired, and we can also obtain the corresponding values of the spline representation of and its derivatives. These can then be used to compute sets of fine-grained numerical values for the entropy function and the log-saddlepoint-density function . (The term ‘fine-grained’ is used here to mean approximately 200 or more values of the relevant functions, corresponding to values of in the range 0.0 to 2.0.) Note that an immediate check of accuracy can be performed at this stage by comparing the values of the first derivative of the spline , evaluated at the fine-grained values of the spline , with the corresponding values of . They should agree closely if the splines are accurately interpolating the corresponding exact function values.
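The four steps above can be sketched in Python, using SciPy's CubicSpline in place of MAPLE's spline routines. The "exact" functions below are my own stand-in model, not MW2016's: I take K(k) = cosh(k) − 1, so that K′(k) = sinh(k) and the saddlepoint condition K′(k*) = x gives k*(x) = arcsinh(x) in closed form, with K(k*(x)) = √(1 + x²) − 1. The final check exploits the standard Legendre-transform identity that the derivative of the entropy S(x) = k*(x)·x − K(k*(x)) equals k*(x):

```python
# Python analogue of steps a)-d); the 'exact' functions here are a
# stand-in model (K(k) = cosh(k) - 1), NOT the actual MW2016 functions.
import numpy as np
from scipy.interpolate import CubicSpline

# a) small coarse sample of "exact" saddlepoint values
x_coarse = np.linspace(0.01, 2.00, 40)
k_coarse = np.arcsinh(x_coarse)               # exact k*(x) for the stand-in
K_coarse = np.sqrt(1.0 + x_coarse**2) - 1.0   # exact K(k*(x)) = cosh(k*) - 1

# b) cubic spline representations built from the coarse sample
k_spline = CubicSpline(x_coarse, k_coarse)
K_spline = CubicSpline(x_coarse, K_coarse)

# c) derivatives of the splines (piecewise cubics, cheap to differentiate)
k_spline_d1 = k_spline.derivative(1)

# d) fine-grained evaluation of the entropy S(x) = k*(x)*x - K(k*(x))
x_fine = np.linspace(0.01, 2.00, 200)
S_fine = k_spline(x_fine) * x_fine - K_spline(x_fine)

# accuracy checks: the spline should reproduce the exact k*(x), and
# dS/dx should equal k*(x) (the Legendre-transform identity)
spline_err = np.max(np.abs(k_spline(x_fine) - np.arcsinh(x_fine)))
S_spline = CubicSpline(x_fine, S_fine)
x_mid = np.linspace(0.1, 1.9, 50)
check_err = np.max(np.abs(S_spline.derivative(1)(x_mid) - k_spline(x_mid)))
```

With 40 coarse sample points the cubic spline reproduces the stand-in saddlepoint curve to within roughly 10⁻⁶ on this interval, which is in line with the accuracy reported for the MAPLE implementation.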
As a first excursion with this new approximation approach, an attempt was made to replicate the results in MW2016 for the cases and , with . Spline function representations of and consisting of 40 cubic pieces were quickly obtained with initial samples of 40 exact values of . These are displayed below as piecewise functions for the case , together with plots of the corresponding splines. The derivatives of were then calculated, and these were used to calculate the entropy functions and the log-saddlepoint-density functions, finally obtaining plots of these which are also shown below for the cases and .
Note that, despite the computational difficulties encountered when trying to compute exact function values for the case in particular, obtaining the corresponding cubic spline function approximations proved to be just as quick and easy as for the case .
Checking accuracy of splines and asymptotic formulae
In the previous section, a solution system was developed based on spline function approximations that can quickly and accurately solve the large deviation problems we are interested in, for any values of , , and , even when these solutions are computationally difficult or impossible to obtain in a fine-grained way using exact functions. This gives us a machine for quickly generating accurate benchmark solutions which, for example, can be compared with analytical work.
I decided to put both the spline function approach and the asymptotic formulae in MW2016 to the test by producing fine-grained exact values for the relevant functions (i.e., approximately 200 function values for values of in the range 0.0 to 2.0), and then comparing the approximations with these. I was able to obtain fine-grained exact values for the case and in MAPLE, and this took a significant amount of computation time using the relevant exact functions. In contrast, it took only a few minutes to produce corresponding spline function approximations to an equally fine-grained degree, i.e., approximately 200 values, with a discretization based on 40 values of , giving a sample of 40 exact values of . I also used formulas [19] and [21] in the Supplement to MW2016 to generate corresponding data based on MW’s asymptotic formulae. I then produced superimposed plots of the exact data, the spline function approximations, and the asymptotic formulae in MW2016 to assess accuracy visually.
As shown in the following plots, the spline function approximations produced results in a few minutes which are essentially indistinguishable from the exact values obtained after numerous hours of computation in MAPLE. However, the asymptotic formulae in MW2016 (plots shown in blue) produced rather inaccurate approximations for and .
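The general procedure of benchmarking an asymptotic formula against exact values can be illustrated in Python by analogy. The example below is not MW2016's formulae [19] or [21]; it uses Stirling's approximation to ln(n!) as the asymptotic formula and math.lgamma as the source of exact benchmark values, simply to show the kind of discrepancy table that underlies such plots:

```python
# Illustrative analogy only: Stirling's approximation plays the role of
# an asymptotic formula, math.lgamma supplies the exact benchmarks.
# The actual MW2016 asymptotic formulae are not reproduced here.
import math

def stirling(n):
    """Stirling's asymptotic approximation to ln(n!)."""
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)

def exact(n):
    return math.lgamma(n + 1.0)   # ln(n!) computed exactly

# the absolute discrepancy shrinks as n grows, roughly like 1/(12n),
# as expected for the leading neglected term of the asymptotic series
errors = {n: abs(exact(n) - stirling(n)) for n in (2, 5, 10, 50)}
```

The point of tabulating (or plotting) such discrepancies over the whole parameter range, rather than at one value, is exactly the point made above: an asymptotic formula can look acceptable in one regime while being visibly inaccurate in another.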
In the case , obtaining fine-grained exact data is computationally very time consuming in MAPLE. Exact formula computations take a lot longer with than with , and producing a fine-grained set of around 200 exact values for all the required functions with and takes a large amount of computing time. However, the spline approximation approach produced such fine-grained results again in only a few minutes. In lieu of fine-grained exact function values, I used MAPLE to generate six exact function values for the case and , and I plotted these as discrete points together with the spline functions to provide some degree of comparison with exact data. The results again indicate that the spline approximations are highly accurate representations of what fine-grained exact results would have looked like.
I also decided to compare the asymptotic formulae in MW2016 with the spline approximations and exact function values. I used formulae [19] and [24] in the Supplement to MW2016 for this. (MW asserts there that formula [24] produces more accurate asymptotic approximations than formula [21] in the case ). In the following plots, the spline function approximations are shown in green and MW’s asymptotic formulae in blue. The six exact function values are plotted as discrete red points. As in the case , the spline functions seem to be accurately representing the exact results, while the asymptotic formulae in MW2016 are clearly producing inaccurate results here. The discrepancies between the asymptotic formulae and the spline approximations/exact values seem a lot worse for than for , with .
On the basis of these accuracy checks, there is therefore little question that the asymptotic formulae in MW2016 are producing rather inaccurate results in the case for both and when compared with both exact function values and my spline approximations, notwithstanding the favourable-looking numerical simulation results reported in MW2016 for log-densities. (I have communicated these results to Michael Wilkinson, the author of MW2016).