Monomer injections leading to oscillatory behaviour in a mathematical model of rain formation

In a previous note, I outlined a mathematical model of rapid-onset rain formation published by Michael Wilkinson in Physical Review Letters, which will be referred to as MW2016 herein. (Full reference: Wilkinson, M., 2016, Large Deviation Analysis of Rapid Onset of Rain Showers, Phys. Rev. Lett. 116, 018501). In the present note, I want to briefly record an interesting experiment in which I augmented the framework in MW2016 by allowing monomer injections to oppose the reduction in the volume fraction of liquid water content due to runaway collisions in equation [11] in MW2016. I show that this immediately leads to the possibility of oscillatory behaviour in Wilkinson’s model.

The simplest possible assumption is a monomer injection which increases the volume fraction at a constant rate \alpha, changing the differential equation presented as equation [11] in MW2016 to

\frac{d \Phi_l}{d t} = \alpha - \mathcal{N} \Phi_l P(t) \qquad \qquad \qquad \qquad \qquad (1)

For convenience we will write

\bar{\mathcal{N}} \equiv \mathcal{N} P(t) \qquad \qquad \qquad \qquad \qquad (2)

so that (1) becomes

\frac{d \Phi_l}{d t} = \alpha - \bar{\mathcal{N}} \Phi_l \qquad \qquad \qquad \qquad \qquad (3)

This addition of monomer injections to equation [11] in MW2016 immediately gives rise to the possibility of oscillatory behaviour in the collector drop model because \bar{\mathcal{N}} also depends on the volume fraction \Phi_l, so there is a feedback loop between them. Suppose we capture this dependence by postulating that the fractional rate of change of the number of collisions \bar{\mathcal{N}} is related to the liquid content volume fraction \Phi_l via some unknown function g, so that

\frac{1}{\bar{\mathcal{N}}} \frac{d \bar{\mathcal{N}}}{d t} = g(\Phi_l) \qquad \qquad \qquad \qquad \qquad (4)

A Maclaurin series approximation of the unknown function g to second order gives

g(\Phi_l) = g(0) + g^{\prime}(0) \Phi_l + \frac{1}{2}g^{\prime \prime}(0) \Phi_l^2 \qquad \qquad \qquad \qquad \qquad (5)

and putting this into (4) gives

\frac{d \bar{\mathcal{N}}}{d t} = g(0)\bar{\mathcal{N}} + g^{\prime}(0) \bar{\mathcal{N}} \Phi_l + \frac{1}{2}g^{\prime \prime}(0) \bar{\mathcal{N}} \Phi_l^2 \qquad \qquad \qquad \qquad \qquad (6)

Equations (3) and (6) now form a coupled ODE system reminiscent of a Lotka-Volterra system, and this is capable of yielding oscillatory behaviour in both the number of collisions \bar{\mathcal{N}} and the volume fraction \Phi_l, depending on the parameters \alpha, g(0), g^{\prime}(0) and g^{\prime \prime}(0). In fact, numerical experiments with (3) and (6) show that the number of collisions \bar{\mathcal{N}} in particular can oscillate violently in the initial stages of the interaction, before adjustment to equilibrium eventually takes place. In all numerical experiments in which oscillations are observed, the oscillations in \bar{\mathcal{N}} have their largest amplitudes at early times and become damped as time increases.

The above figures show classic damped oscillation patterns obtained in one such computation, carried out using MAPLE, with scaled variables representing the number of collisions divided by 10^3, and the volume fraction multiplied by 10^8 (due to the very large difference in magnitude between \bar{\mathcal{N}} and \Phi_l), with \alpha = 0.1, g(0) = -0.00004, g^{\prime}(0) = 4 and g^{\prime \prime}(0) = 400,000,000. (As a check, identical results were obtained using MATLAB and FORTRAN. See the screenshots below).
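For anyone wanting to reproduce this without MAPLE, MATLAB or FORTRAN, the following is a minimal Python sketch of mine of the system (3) and (6), using the parameter values quoted above. The initial conditions and time span are illustrative guesses on my part (they are not recorded here), so they may need adjusting to reproduce the figures.

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

# Parameter values quoted in the text
alpha = 0.1
g0, g1, g2 = -0.00004, 4.0, 4.0e8          # g(0), g'(0), g''(0)

def rhs(t, y):
    """Right-hand sides of the coupled system (3) and (6); y = [Phi_l, Nbar]."""
    phi, nbar = y
    dphi = alpha - nbar * phi                              # equation (3)
    dnbar = nbar * (g0 + g1 * phi + 0.5 * g2 * phi ** 2)   # equation (6)
    return [dphi, dnbar]

y0 = [1.0e-6, 1.0e3]       # illustrative initial values for Phi_l and Nbar
sol = solve_ivp(rhs, (0.0, 10.0), y0, method='LSODA',
                rtol=1e-10, atol=1e-14, dense_output=True)

t = np.linspace(0.0, 10.0, 5000)
phi, nbar = sol.sol(t)
plt.plot(t, nbar / 1e3, label='number of collisions / 10^3')
plt.plot(t, phi * 1e8, label='volume fraction x 10^8')
plt.xlabel('t')
plt.legend()
plt.show()
```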

The observation that oscillatory behaviour arises so easily after incorporation of monomer injections into the model, due to the resulting Lotka-Volterra type of interaction between \bar{\mathcal{N}} and \Phi_l, might have some connection to the oscillatory behaviour reported in previous work on coagulation systems driven by monomer inputs and large cluster outputs (e.g., Ball, R., Connaughton, C., Jones, P., Rajesh, R., Zaboronski, O., 2012, Collective Oscillations in Irreversible Coagulation Driven by Monomer Inputs and Large-Cluster Outputs, Phys. Rev. Lett. 109, 168304). Further investigation would be needed to clarify this.

The following are screenshots of the MAPLE, MATLAB and FORTRAN programs I used to carry out these experiments.

MAPLE:

MATLAB:

FORTRAN:

A note on mathematics in Latin

Latin has played a significant role in the history and development of mathematics. For example, an interesting article by Richard Oosterhoff on Neo-Latin Mathematics indicates the extent to which Latin has helped to promote the prestige and circulation of mathematical work since medieval times.

There is a huge amount of mathematics written in Latin freely available online. To give one example of a particularly rich source, the Euler Archive provides all 850+ of Euler’s written works in digital form, most of which are written in Latin. The Latin edition of Isaac Newton’s Principia Mathematica is also in the public domain and freely downloadable online, as is a famous prize submission to the Paris Academy in 1861 by Bernhard Riemann, and Carl Friedrich Gauss’s famous Disquisitiones Arithmeticae, along with countless other works by numerous mathematicians in repositories such as wilbourhall.org, the French national library’s Gallica website, archive.org, and others.

Excerpt from Disquisitiones Arithmeticae by Carl Friedrich Gauss

I have compiled some Latin ‘crib notes’ here to provide a quick reminder of basic grammar rules. A useful vocabulary list which is worth learning is DCC’s Latin Core Vocabulary (the 1000 most frequently occurring words in Latin).

Accurate analytical approximation formulae for large-deviation analysis of rain formation

A 2016 paper by M Wilkinson in Physical Review Letters suggests that large-deviation theory is a suitable framework for studying the phenomenon of unexpectedly rapid rain formation in collector-drop collision processes. Wilkinson derives asymptotic approximation formulae for a set of exact large-deviation functions in the collector-drop model, such as the cumulant generating function and the entropy function. The asymptotic approach assumes a large number of water droplet collisions and is motivated by the fact that the exact large-deviation functions are prohibitively difficult to interpret and deal with directly.

Wilkinson uses his asymptotic formulae to obtain further results and also provides some numerical work which suggests that a certain log-density function for the collector-drop model (which is a function of his asymptotic approximation formulae) is itself satisfactorily approximated. However, the numerical work does not test the accuracy of the individual asymptotic approximation formulae directly against their exact large-deviation theory counterparts. When these direct checks are carried out, they reveal that the asymptotic formulae are, in fact, rather inaccurate, even for very large numbers of collisions. Their individual inaccuracy is masked by their incorporation into log-density functions in Wilkinson’s numerical work. Their inaccuracy, as well as some assumptions underlying their derivation, severely limits their applicability.

The purpose of the present note is to point out that it is quite possible to develop accurate analytical (i.e., non-asymptotic) approximation formulae for the large-deviation theory functions in the collector-drop model which also preserve the forms of the leading order power terms in Wilkinson’s asymptotic formulae. An analytical approximation approach can be developed based on an Euler-Maclaurin formula. The resulting analytical formulae are extremely accurate and valid for all relevant numbers of collisions and time scales, producing numerical results which are essentially indistinguishable from the exact function values of their large-deviation theory counterparts.

My full paper can be found on arXiv.org.

A Phys. Rev. Letter by M Wilkinson on large deviation analysis of rapid-onset rainfall

The following is a record of some notes I made on the main mathematical developments in Michael Wilkinson’s 2016 Physical Review Letter on the large deviation analysis of rapid-onset rain showers (reference: Wilkinson, M., 2016, Large Deviation Analysis of Rapid Onset of Rain Showers, Phys. Rev. Lett. 116, 018501). This publication will be referred to as MW2016 herein. The published Letter has an accompanying Supplement which will also be referred to here. The Supplement provides additional details of some intricate derivations underlying key results presented in the main Letter.

One of the main contributions of MW2016 is that it shows that large deviation theory is the correct framework for analysing the implications of a collector-drop model of rapid rain formation which was discussed in a previous paper by Kostinski and Shaw (reference: Kostinski, A., Shaw, R., 2005, Fluctuations and Luck in Droplet Growth by Coalescence, Bull. Am. Meteorol. Soc. 86, 235). In this collector-drop model, a runaway droplet falls and coalesces at a rapidly increasing rate with many smaller water particles. The collision rates are assumed to increase according to a simple power law function of the number of collisions. In addition to formulating a volume fraction cut-off mechanism for raindrop size, MW2016 also successfully uses large deviation analysis in conjunction with asymptotic approximations to study the link between the time scale for rapid onset of rain showers and the number of water particle collisions needed for raindrop formation. In the present article, I provide a mathematical review of the key points in MW2016 and also detail some errata in the published Letter which were identified in the review process.

Mathematical review

In the collector-drop model, a runaway droplet falls and collides with a large number \mathcal{N} of small droplets of radius a_0 according to an inhomogeneous Poisson process. If the random variable t_n represents the inter-collision time leading up to the n-th collision, the time for a droplet to experience runaway growth is then the random variable

T = \sum_{n=1}^{\mathcal{N}} t_n \quad \quad \quad \quad \quad (1)

The inter-collision times t_n are independent exponentially distributed random variables with

P_n(t_n) = R_n \exp(-R_n t_n) \quad \quad \quad \quad \quad (2)

where R_n is the collision rate (the number of collisions per second) in the interval leading up to the n-th collision. The collision rate R_n is modelled as

R_n = R_1 f(n) \quad \quad \quad \quad \quad (3)

where

R_1 = \epsilon N_0 \pi(a_0 + a_1)^2 \alpha (a_1^2 - a_0^2) \quad \quad \quad \quad \quad (4)

is the collision rate of a drop of radius a_1 with droplets of radius a_0. Here, \epsilon is the coalescence efficiency, N_0 is the number density of microscopic water droplets per m^3, \pi(a_0 + a_1)^2 is the effective cross-section area and \alpha (a_1^2 - a_0^2) is the relative velocity, \alpha being the constant of proportionality between the terminal velocity of a water particle and the square of its radius.

The function f(n) in (3) characterises the dependence of R_n on the number of collisions. In the Kostinski and Shaw 2005 paper referenced above, f(n) = n^2, but we can have f(n) = n^{4/3} when \epsilon \approx 1 and a_n \gg a_1. MW2016 treats f(n) as a simple power law, so that

R_n = R_1 n^{\gamma} \quad \quad \quad \quad \quad (5)

The average inter-collision time leading to the n-th collision is then given by

\langle t_n \rangle \equiv \tau_n = \frac{1}{R_n} = \frac{1}{R_1} n^{-\gamma} \quad \quad \quad \quad \quad (6)

and the mean time for explosive growth is obtained as

\langle T \rangle = \sum_{n=1}^{\mathcal{N}} \tau_n = \frac{1}{R_1} \sum_{n=1}^{\mathcal{N}} n^{-\gamma} \quad \quad \quad \quad \quad (7)

When \gamma > 1, the mean time for explosive growth converges as \mathcal{N} \rightarrow \infty, giving

\lim_{\mathcal{N} \rightarrow \infty} \langle T \rangle = \frac{1}{R_1} \zeta(\gamma) \quad \quad \quad \quad \quad (8)

where \zeta(\gamma) is Riemann’s zeta function.
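As a quick numerical illustration of (7) and (8) (a sketch of my own, not from MW2016): with R_1 = 1 and \gamma = 2 the partial sums approach \zeta(2) = \pi^2/6 \approx 1.6449.

```python
import numpy as np
from scipy.special import zeta

gamma_exp = 2.0                       # power-law exponent, with R_1 = 1
for N in (10, 1_000, 100_000):
    mean_T = np.sum(np.arange(1, N + 1) ** (-gamma_exp))   # partial sum in (7)
    print(N, mean_T)
print('zeta(gamma) =', zeta(gamma_exp))                    # limiting value in (8)
```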

A cut-off mechanism for raindrop size is obtained by considering the liquid water content of a cloud, expressed as a volume fraction \Phi_l. Since N_0 is the number of water particles of radius a_0 per m^3, and each particle is a sphere with volume \frac{4}{3} \pi a_0^3, the volume fraction of liquid water per m^3 of cloud is given by

\Phi_l \approx N_0 \cdot \frac{4}{3} \pi a_0^3 \quad \quad \quad \quad \quad (9)

For example, if each droplet is initially of radius a_0 = 10 \mu m = 10^{-5}m, and there are N_0 \approx 10^9 such droplets per m^3 of cloud, then \Phi_l \approx 10^{-6}.

Using (2), the fraction of droplets undergoing runaway growth between times t and t + \delta t is given by

P(t) (t + \delta t - t) = P(t) \delta t \quad \quad \quad \quad \quad (10)

For example, if P(t) \delta t \approx 10^{-6}, then ‘one in a million’ water droplets will undergo runaway growth between times t and t + \delta t.

If the volume of each runaway droplet increases by a factor of \mathcal{N}, the volume fraction of liquid water removed from the cloud per m^3 over the time interval of length \delta t will then be

\mathcal{N} \Phi_l P(t) \delta t \quad \quad \quad \quad \quad (11)

For example, if \mathcal{N} = 10^3, \Phi_l \approx 10^{-6} and P(t) \delta t \approx 10^{-6}, then the volume fraction of liquid water content per m^3 is reduced by 10^3 \cdot 10^{-6} \cdot 10^{-6} = 10^{-9} over the time interval of length \delta t. This represents a \frac{10^{-9}}{10^{-6}}\cdot 10^2 = 0.1 per cent reduction in \Phi_l.

As \delta t \rightarrow 0, the instantaneous change of the liquid water content of a cloud per m^3 due to runaway growth of droplets is thus

\frac{d \Phi_l}{d t} = -\mathcal{N} \Phi_l P(t) \quad \quad \quad \quad \quad (12)

Therefore, the instantaneous fractional rate of decline of the volume fraction is

-\frac{d \ln \Phi_l}{d t} = \mathcal{N} P(t) \quad \quad \quad \quad \quad (13)

Integrating (13) over a time interval then gives the fractional loss \mu in that interval:

\mu = \mathcal{N}\int_0^t dt^{\prime} P(t^{\prime}) \quad \quad \quad \quad \quad (14)

The onset of a shower is determined by the criterion that a significant fraction \mu of the liquid water content of the cloud be removed by raindrop formation as a result of \mathcal{N} particle collisions. We are now interested in finding the time scale t^{*} over which this significant reduction in \Phi_l occurs. In accordance with (14), this t^{*} satisfies

\mathcal{N^{*}}\int_0^{t^{*}} dt^{\prime} P(t^{\prime}) = 1 \quad \quad \quad \quad \quad (15)

where \mathcal{N^{*}} \equiv \mathcal{N}/\mu.

To make progress from here, MW2016 approximates P(t) in (15) using the large deviation principle, which is explained in detail in, e.g., an article by Touchette (reference: Touchette, H., 2009, The large deviation approach to statistical mechanics, Phys. Rep. 478, 1). With T defined as in (1) and \langle T \rangle given by (7), we implement the large deviation principle here by approximating the probability P(T)dT at some value \bar{T} in the small-value tail of the distribution of T as

P(\bar{T})d\bar{T} = P(T \in [\bar{T}, \bar{T} + d\bar{T}])

where P(\bar{T}) is given by

P(\bar{T}) = \frac{1}{\langle T \rangle} \exp[-J(\tau)] \quad \quad \quad \quad \quad (16)

with \tau \equiv \bar{T}/\langle T \rangle. The function J(\tau) in (16) is referred to as an entropy function or rate function in large deviation theory, and is defined below. Using (16) in (15) we get the condition for the onset of rainfall as

\mathcal{N^{*}} \int_0^{t^{*}} \frac{d \bar{T}}{\langle T \rangle} \exp \bigg[-J\bigg(\frac{\bar{T}}{\langle T \rangle}\bigg)\bigg] = \mathcal{N^{*}} \int_0^{t^{*}/\langle T \rangle} d \tau^{\prime} \exp [-J(\tau^{\prime})] = 1 \quad \quad \quad \quad \quad (17)

with \tau^{\prime} \equiv \bar{T}/\langle T \rangle. A saddle-point approximation for the integral here at the maximum value of the integrand, \exp[-J(\tau^{*})], with \tau^{*} \equiv \bar{T}^{\ast}/\langle T \rangle, then gives the condition for the onset of rainfall as

\mathcal{N}^{*} \exp[-J(\tau^{*})] = 1 \quad \quad \quad \quad \quad (18)

Taking logs, we can write the condition alternatively as

\ln \mathcal{N}^{*} = J(\tau^{*}) \quad \quad \quad \quad \quad (19)

What is required now is to find an expression for \tau^{*} in terms of \ln \mathcal{N}^{*}, using (19). To obtain this, it is necessary to find an expression for the rate function J(\tau) for the random sum defined by (1) and (2).

Consider a single inter-collision time t_n with density (2). Following the approach in the article by Touchette cited above for the purposes of the large deviation analysis here, we obtain the log of the Laplace transform of the random variable t_n, which serves as its cumulant generating function here (the sign conventions are discussed below), as

\lambda_n (k) = -\ln \langle \text{e}^{-kt_n} \rangle = -\ln\bigg(\int_0^{\infty} d t_n \text{e}^{-kt_n} R_n \text{e}^{-R_n t_n}\bigg) = \ln\bigg(1 + \frac{k}{R_n}\bigg) \quad \quad \quad \quad \quad (20)

To ensure the logarithm on the right-hand side is always defined, we need to restrict the admissible values of k to the interval (-R_1, \infty). Inter-collision times are assumed to be independent, so for a sum T of inter-collision times such as (1), we get the log of the Laplace transform of T as

\lambda(k) = -\ln \langle \text{e}^{-kT} \rangle

= -\ln \bigg \langle \prod_{n=1}^{\mathcal{N}} \text{e}^{-kt_n} \bigg \rangle = -\ln \prod_{n=1}^{\mathcal{N}}\langle \text{e}^{-kt_n} \rangle = \sum_{n=1}^{\mathcal{N}} \ln\bigg(1 + \frac{k}{R_n}\bigg) \quad \quad \quad \quad \quad (21)

This is differentiable with respect to k for all k \in (-R_1, \infty), so by the Gärtner-Ellis Theorem of large deviation analysis, discussed in Touchette, the rate function in (16) is given by

J(\tau) = \sup_k \{\lambda(k) - k\tau\} \quad \quad \quad \quad \quad (22)

The expression on the right-hand side of (22) is referred to as the Fenchel-Legendre transform of \lambda(k).

Note that the expression in (22) is the negative of the expression usually given in large deviation theory. This is because the outer function in the formulation of the log of the Laplace transform in (20) above is the negative of the expression used, e.g., in Touchette, and this formulation also has the parameter k negated. The alternative formulation would be

\lambda_n(k) = \ln \langle \text{e}^{kt_n} \rangle = \ln\bigg(\int_0^{\infty} d t_n \text{e}^{kt_n} R_n \text{e}^{-R_n t_n}\bigg) = -\ln\bigg(1 - \frac{k}{R_n}\bigg)

and the associated entropy function for this would then be the more conventional

J(\tau) = \sup_k \{k\tau - \lambda(k)\}

In this case, the admissible values of k would be those in the interval (-\infty, R_1). Essentially, the signs of \lambda(k) and k have been negated in the formulation in MW2016, as this is more convenient for the analysis.

Differentiating the bracketed expression in (22) with respect to k and setting the derivative equal to zero, we obtain

\lambda^{\prime}(k) - \tau = 0 \quad \quad \quad \quad \quad (23)

But

\lambda^{\prime}(k) = \sum_{n=1}^{\mathcal{N}} \frac{\big(\frac{1}{R_n}\big)}{1 + \frac{k}{R_n}} = \sum_{n=1}^{\mathcal{N}} \frac{1}{R_n + k} \quad \quad \quad \quad \quad (24)

Putting this in (23) we get

\tau = \sum_{n=1}^{\mathcal{N}} \frac{1}{R_n + k} \quad \quad \quad \quad \quad (25)

This implicitly gives an optimal value of k as a function of \tau, say k^{*}(\tau). If this could be obtained explicitly, inserting it into the right-hand side of (22) would give the required rate function J(\tau):

J(\tau) = \lambda(k^{*}(\tau)) - k^{*}(\tau) \cdot \tau \quad \quad \quad \quad \quad (26)
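Although a closed form for k^{*}(\tau) is unavailable, (25) and (26) are straightforward to evaluate numerically for a finite number of collisions. Here is a minimal Python sketch of mine, taking R_1 = 1 and solving (25) for k on the admissible interval k > -R_1 by root finding:

```python
import numpy as np
from scipy.optimize import brentq

gamma_exp, N = 2.0, 1000
rates = np.arange(1, N + 1, dtype=float) ** gamma_exp    # R_n = n^gamma, R_1 = 1

def J(tau):
    """Rate function via (22): solve (25) for k*, then evaluate (26)."""
    k_star = brentq(lambda k: np.sum(1.0 / (rates + k)) - tau,
                    -1.0 + 1e-12, 1e15)                   # admissible k > -R_1
    lam = np.sum(np.log1p(k_star / rates))                # lambda(k*) from (21)
    return lam - k_star * tau                             # equation (26)

mean_T = np.sum(1.0 / rates)                              # <T> from (7)
print(J(mean_T))          # ~0: no large-deviation cost at the mean
print(J(0.5 * mean_T))    # positive cost for an unusually fast runaway time
```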

However, in practice it is not possible to obtain an exact expression for J(\tau) in this way. Instead, MW2016 obtains asymptotic expressions for J(\tau) which are valid for small \tau. This is achieved by reparameterising J(\tau) using a scaled variable \kappa, defined as

\kappa = \frac{k^{*}}{R_1} \quad \quad \quad \quad \quad (27)

From (7) and (8) we get, as \mathcal{N} \rightarrow \infty,

\tau = \frac{\bar{T}}{\langle T \rangle} = \frac{R_1 \bar{T}}{\zeta(\gamma)} \quad \quad \quad \quad \quad (28)

Using (5), (27) and (28) in (25) we get

\tau(\kappa) = \frac{R_1}{\zeta(\gamma)} \sum_{n=1}^{\infty} \frac{1}{R_1 n^{\gamma} + R_1 \kappa} = \frac{1}{\zeta(\gamma)} \sum_{n=1}^{\infty} \frac{1}{\kappa + n^{\gamma}} \quad \quad \quad \quad \quad (29)

The expression on the right-hand side of (29) appears in equation (18) in MW2016.

We now write (21) in reparameterised form as \mathcal{N} \rightarrow \infty as

S(\kappa) = \sum_{n=1}^{\infty} \ln \bigg(1 + \frac{R_1 \kappa}{R_1 n^{\gamma}}\bigg) = \sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma}) \quad \quad \quad \quad \quad (30)

We also write

k^{*}(\tau) \cdot \tau = \frac{R_1 \kappa \zeta(\gamma) \tau}{R_1} = \zeta(\gamma) \kappa \tau \quad \quad \quad \quad \quad (31)

Then we re-express (26) in the form

J(\kappa)= S(\kappa) - \zeta(\gamma) \kappa \tau(\kappa) \quad \quad \quad \quad \quad (32)

This appears as equation (19) in MW2016.
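The reparameterised quantities (29), (30) and (32) can likewise be evaluated directly by truncating the infinite sums. A small sketch of mine (the truncation point is arbitrary and controls the accuracy):

```python
import numpy as np
from scipy.special import zeta

gamma_exp = 2.0
n = np.arange(1, 500_000, dtype=float)    # truncation of the infinite sums
z = zeta(gamma_exp)

for kappa in (1.0, 10.0, 100.0):
    tau = np.sum(1.0 / (kappa + n ** gamma_exp)) / z     # equation (29)
    S = np.sum(np.log1p(kappa * n ** (-gamma_exp)))      # equation (30)
    J = S - z * kappa * tau                              # equation (32)
    print(kappa, tau, J)
```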

Following the approach in the Supplement to MW2016, the next stage in the development is to obtain an asymptotic expression for the sum in (30) as \kappa \rightarrow \infty. (It is assumed from now on that R_1 = 1, to simplify the expressions). Upon differentiating this, we will then obtain a corresponding asymptotic expression for \tau in accordance with (29), which can be inverted to obtain a small \tau approximation for \kappa, and subsequently a small \tau approximation for J(\tau). Finally, this will be used in (19) to obtain an expression for \tau^{*} in terms of \ln\mathcal{N}^{*}.

Begin by approximating the sum in (30) as an integral. Observe that

\int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})

= \int_0^1 dn \ln(1 + \kappa n^{-\gamma}) + \int_1^2 dn \ln(1 + \kappa n^{-\gamma}) + \int_2^3 dn \ln(1 + \kappa n^{-\gamma}) + \cdots

= \int_1^0 -dx \ln(1 + \kappa (1-x)^{-\gamma})

+ \int_1^0 -dx \ln(1 + \kappa (2-x)^{-\gamma}) + \int_1^0 -dx \ln(1 + \kappa (3-x)^{-\gamma}) + \cdots

= \sum_{n=1}^{\infty} \int_0^1 dx \ln(1 + \kappa (n-x)^{-\gamma})

(The third line is obtained by making the change of variable n = \overline{m} - x in each integral in the second line, with \overline{m} representing the fixed positive integer in the upper limit, so dn = -dx, and x = 0 when n = \overline{m} whereas x = 1 when n = \overline{m} - 1). Therefore

\sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma}) \equiv \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})

+ \bigg\{\sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma}) - \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})\bigg\}

= \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma}) + \sum_{n=1}^{\infty} \int_0^1 dx \bigg \{ \ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})\bigg\}

In MW2016 this is written as

S = S_0 + \sum_{n=1}^{\infty} \triangle S_n \quad \quad \quad \quad \quad (33)

where

S = \sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma})

S_0 = \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})

\triangle S_n = \int_0^1 dx \bigg \{\ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})\bigg\} \quad \quad \quad \quad \quad (34)

Write m = n - x. Then as x \rightarrow 0 we have

\frac{\ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})}{x}

= \frac{\ln(1 + \kappa (m+x)^{-\gamma}) - \ln(1 + \kappa m^{-\gamma})}{x}

\rightarrow \frac{\partial \ln(1 + \kappa n^{-\gamma})}{\partial n} = \frac{-\gamma \kappa}{n(\kappa + n^{\gamma})}

Therefore, when n \gg 1, we can approximate each \triangle S_n by using

\triangle S_n = \int_0^1 dx x \bigg\{\frac{\ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})}{x}\bigg\}

\sim \frac{\partial \ln(1 + \kappa n^{-\gamma})}{\partial n} \int_0^1 dx x

= \frac{-\gamma \kappa}{2n(\kappa + n^{\gamma})} \quad \quad \quad \quad \quad (35)

When \kappa n^{-\gamma} \gg 1 but n is not necessarily large, we can write

\triangle S_n \sim \int_0^1 dx \ln[\kappa n^{-\gamma}(1 + n^{\gamma}/\kappa)] - \int_0^1 dx \ln[\kappa (n-x)^{-\gamma}(1 + (n - x)^{\gamma}/\kappa)]

= -\gamma \int_0^1 dx \{ \ln(n) - \ln(n-x)\} + O(\kappa^{-1}) \quad \quad \quad \quad \quad (36)

and we can ignore the O(\kappa^{-1}) component as \kappa \rightarrow \infty. Next, make the change of variable y = n - x in the integral in (36). Then when x = 0, y = n, when x = 1, y = n-1, and dy = -dx. We get

\triangle S_n \sim \gamma \int_n^{n-1} dy \{\ln(n) - \ln(y)\}

= -\gamma \int_{n-1}^n dy \{\ln(n) - \ln(y)\}

= -\gamma\bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] \quad \quad \quad \quad \quad (37)

In order to evaluate S using (33), MW2016 splits the summation over \triangle S_n into two parts: choose an integer M such that \kappa M^{-\gamma} \gg 1 and let M \rightarrow \infty as \kappa \rightarrow \infty. Use (35) for \triangle S_n when n > M, and (37) when n \leq M. We get

S \sim S_0 - \gamma \sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] - \gamma \kappa \sum_{n = M+1}^{\infty} \frac{1}{2n(\kappa + n^{\gamma})} \quad \quad \quad \quad \quad (38)

We now approximate the second summation using an integral:

- \gamma \kappa \sum_{n = M+1}^{\infty} \frac{1}{2n(\kappa + n^{\gamma})} \sim \frac{-\gamma \kappa}{2} \int_M^{\infty} dn \frac{1}{n(\kappa + n^{\gamma})} = \frac{-\gamma}{2} \int_M^{\infty} dn \frac{1}{n\bigg(1 + \frac{n^{\gamma}}{\kappa}\bigg)} \quad \quad \quad \quad \quad (39)

Make the change of variable y = n^{\gamma}/\kappa. Then dy = \frac{1}{\kappa} \gamma n^{\gamma - 1} dn, so dn = \frac{1}{y} \frac{n}{\gamma} dy, and when n = M, y = M^{\gamma}/\kappa, while when n = \infty, y = \infty. The integral in (39) becomes

\frac{-\gamma}{2} \int_{M^{\gamma}/\kappa}^{\infty} \frac{1}{y} \frac{n}{\gamma} dy \frac{1}{n(1+y)}

= -\frac{1}{2} \int_{M^{\gamma}/\kappa}^{\infty} dy \frac{1}{y(1+y)}

= -\frac{1}{2} \int_{M^{\gamma}/\kappa}^{\infty} dy \bigg \{ \frac{1}{y} - \frac{1}{y+1} \bigg \}

= \frac{1}{2} \ln\bigg(\frac{M^{\gamma}}{\kappa}\bigg) - \frac{1}{2} \ln\bigg(\frac{M^{\gamma}}{\kappa} + 1\bigg)

= \frac{1}{2} \ln \bigg(\frac{\frac{M^{\gamma}}{\kappa}}{\frac{M^{\gamma}}{\kappa}+ 1}\bigg)

= -\frac{1}{2} \ln(1 + \kappa M^{-\gamma})

= -\frac{1}{2} \ln \bigg[\kappa M^{-\gamma} \bigg(1 + \frac{1}{\kappa M^{-\gamma}}\bigg)\bigg]

= -\frac{1}{2} \ln(\kappa) - \frac{1}{2} \ln(M^{-\gamma}) - \frac{1}{2} \ln \bigg(1 + \frac{1}{\kappa M^{-\gamma}}\bigg)

= -\frac{1}{2} \ln(\kappa) + \frac{\gamma}{2} \ln(M) + O(\kappa^{-1})

Putting this in (38) we get

S \sim S_0 - \gamma \sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] -\frac{1}{2} \ln(\kappa) + \frac{\gamma}{2} \ln(M) + O(\kappa^{-1}) \quad \quad \quad \quad \quad (40)

We can write this as

S \sim S_0 - \gamma \bigg\{\sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] - \frac{1}{2} \ln(M)\bigg\}

-\frac{1}{2} \ln(\kappa) + O(\kappa^{-1}) \quad \quad \quad \quad \quad (41)

Taking the limit as M \rightarrow \infty, the term in curly brackets in (41) converges to a constant C, and we get

S \sim S_0 - \frac{1}{2} \ln(\kappa) - \gamma C + O(\kappa^{-1}) \quad \quad \quad \quad \quad (42)

This is equation [10] in the Supplement to MW2016.

From (29), we see that differentiating S(\kappa) in (30) gives a time T(\kappa) = \zeta(\gamma) \tau(\kappa). The corresponding asymptotic expression for T(\kappa) is obtained by differentiating (42). We get

\frac{\partial S}{\partial \kappa} \sim \int_0^{\infty} dn \frac{1}{\kappa} \bigg(\frac{1}{1 + \frac{n^{\gamma}}{\kappa}}\bigg) - \frac{1}{2\kappa} \quad \quad \quad \quad \quad (43)

In the integral in (43), make the change of variable x = \frac{n^{\gamma}}{\kappa}, so that (\kappa x)^{1/\gamma} = n. Then

dn = \frac{1}{\gamma} (\kappa x)^{\frac{1}{\gamma} -1} \cdot \kappa dx = \frac{1}{\gamma} \kappa^{-\frac{\gamma -1}{\gamma}} \cdot x^{-\frac{\gamma -1}{\gamma}} \cdot \kappa dx

The integral in (43) then becomes

\kappa^{-\frac{\gamma -1}{\gamma}} \cdot \frac{1}{\gamma} \int_0^{\infty} dx \frac{x^{-\frac{\gamma -1}{\gamma}}}{1+x}

so we get

\frac{\partial S}{\partial \kappa} = T(\kappa) \sim A(\gamma) \cdot \kappa^{-\frac{\gamma -1}{\gamma}} - \frac{1}{2\kappa} \quad \quad \quad \quad \quad (44)

with

A(\gamma) \equiv \frac{1}{\gamma} \int_0^{\infty} dx \frac{x^{-\frac{\gamma -1}{\gamma}}}{1+x} \quad \quad \quad \quad \quad (45)

(MW2016 notes that the integral in (45) is actually the beta function B\big(1-\frac{1}{\gamma}, \frac{1}{\gamma}\big)). Integrating the first term on the right-hand side of (44), which is \frac{\partial S_0}{\partial \kappa}, gives

S_0 = \gamma A(\gamma) \kappa^{\frac{1}{\gamma}} \quad \quad \quad \quad \quad (46)

Substituting for S_0 in (42) with the expression in (46) gives

S \sim \gamma A(\gamma) \kappa^{\frac{1}{\gamma}} - \frac{1}{2} \ln(\kappa) - \gamma C + O(\kappa^{-1}) \quad \quad \quad \quad \quad (47)

This is equation (21) in MW2016.
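As a check of my own on (47), it can be compared with the exact (truncated) sum (30) for moderately large \kappa, using the beta-function value noted below (45), B(1-1/\gamma, 1/\gamma) = \pi/\sin(\pi/\gamma), so that A(\gamma) = \pi/(\gamma \sin(\pi/\gamma)), and computing the constant C from the curly-bracketed term in (41):

```python
import numpy as np

gamma_exp = 2.0
A = np.pi / (gamma_exp * np.sin(np.pi / gamma_exp))    # A(gamma) from (45)

M = 1_000_000                                          # constant C from (41)
m = np.arange(2, M + 1, dtype=float)                   # the n = 1 term equals 1
C = 1.0 + np.sum((m - 1.0) * np.log((m - 1.0) / m) + 1.0) - 0.5 * np.log(M)
# numerically C is about 0.9189 for this definition

n = np.arange(1, 5_000_000, dtype=float)
for kappa in (1e2, 1e3, 1e4):
    exact = np.sum(np.log1p(kappa * n ** (-gamma_exp)))            # (30)
    asym = (gamma_exp * A * kappa ** (1.0 / gamma_exp)
            - 0.5 * np.log(kappa) - gamma_exp * C)                 # (47)
    print(kappa, exact, asym)
```

The two columns agree to a few decimal places, with the agreement improving as \kappa grows.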

To obtain an explicit expression for the rate function J(\tau), we must now invert (44) to express the parameter \kappa in terms of the time T, and hence in terms of the dimensionless time given by \tau = \frac{T}{\zeta(\gamma)}. To this end, observe that from (44) we get

T \sim A \kappa^{\frac{1}{\gamma}} \kappa^{-1} - \frac{1}{2} \kappa^{-1} = \kappa^{-1} \bigg(A \kappa^{\frac{1}{\gamma}} - \frac{1}{2}\bigg) \quad \quad \quad \quad \quad (48)

Therefore

\kappa T \sim A \kappa^{\frac{1}{\gamma}}

so

\kappa \sim \bigg(\frac{T}{A}\bigg)^{-\frac{\gamma}{\gamma-1}} = \bigg(\frac{A}{\zeta(\gamma)}\bigg)^{\frac{\gamma}{\gamma-1}}\bigg(\frac{T}{\zeta(\gamma)}\bigg)^{-\frac{\gamma}{\gamma-1}} = b\tau^{-\frac{\gamma}{\gamma-1}} \quad \quad \quad \quad \quad (49)

with b \equiv \big(\frac{A}{\zeta(\gamma)}\big)^{\frac{\gamma}{\gamma-1}}. Now, using (47) and (48), the rate function is given by

J(\tau) = S(\kappa^{*}) - T(\kappa^{*})\kappa^{*}

= \gamma A(\gamma) \kappa^{\frac{1}{\gamma}} - \frac{1}{2} \ln(\kappa) - \gamma C -A \kappa^{\frac{1}{\gamma}} + \frac{1}{2}

= (\gamma - 1) A\kappa^{\frac{1}{\gamma}} - \frac{1}{2} \ln(\kappa) - \bigg(\gamma C - \frac{1}{2}\bigg) \quad \quad \quad \quad \quad (50)

Using (49) in (50) we get

J(\tau) = (\gamma - 1) A b^{\frac{1}{\gamma}} \tau^{-\frac{1}{\gamma-1}} + \frac{1}{2} \frac{\gamma}{\gamma - 1} \ln(\tau) + D \quad \quad \quad \quad \quad (51)

where D is another constant.

We are now finally in a position to obtain the result given in (27) in MW2016. From (51) we see that to leading order we have

J(\tau^{*}) \propto [\tau^{*}]^{-\frac{1}{\gamma-1}}

so from (19) we obtain

\tau^{*} \propto [\ln \mathcal{N}^{*}]^{-(\gamma - 1)} \quad \quad \quad \quad \quad (52)

Errata in MW2016

In the course of working with MW2016, some errata have come to light which I will record here.

In a private written communication with Michael Wilkinson, I pointed out that an error was made in the Supplement to MW2016 which details the asymptotic analysis. The issue arises in the Supplement when combining equations (17) and (19) there to get equation (20), which is then presented as equation (24) in the main MW2016 text. Equation (17) for \kappa in the Supplement has a leading order term involving the dimensionless time variable \tau with an exponent of the form -\gamma/(\gamma-1). Using (17) in equation (19) for the entropy, we should get the leading order \tau term appearing with exponent -1/(\gamma-1), because \kappa in (19) appears raised to the power 1/\gamma. However, the exponent is incorrectly given as -\gamma/(\gamma-1). This error affected the time scale relationship reported in (27) in the main MW2016 text. Combining the expression for the entropy with the condition (12) in the main MW2016 text for the onset of a shower, \tau should be related to \ln \mathcal{N}^{\ast} with an exponent -(\gamma-1) on the latter, i.e., the reciprocal of -1/(\gamma-1). However, the exponent that is actually reported in equation (27) of MW2016 is -(\gamma-1)/\gamma.

In an online meeting with Michael Wilkinson, I also pointed out that the formula for the constant C given as equation (11) in the Supplement to MW2016 cannot be correct, and the formula is incompatible with the numerical value claimed for C in equation (12) in the Supplement. For example, the expression involving natural logarithm terms in (11) must be negative, but the numerical value given in (12) is positive. Using numerical calculations in MAPLE, I pointed out that the following variant of the formula in (11) produces a numerical value close to the one reported in (12), as M becomes large:

C = 1 + \bigg\{\sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] - \frac{1}{2} \ln(M)\bigg\}

The expression in curly brackets appears in equation (41) in my review above.
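A few lines of Python (my own check) confirm that this variant settles down quickly as M grows, converging to approximately 1.9189:

```python
import numpy as np

for M in (100, 10_000, 1_000_000):
    m = np.arange(2, M + 1, dtype=float)     # the n = 1 term of the sum equals 1
    braces = 1.0 + np.sum((m - 1.0) * np.log((m - 1.0) / m) + 1.0) - 0.5 * np.log(M)
    print(M, 1.0 + braces)                   # the variant formula for C above
```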

Finally, in another private written communication with Michael Wilkinson I pointed out that the symbol T is being used inconsistently in MW2016. In equation (4) in MW2016, T is defined as a random variable, i.e., a sum of random inter-collision times for a runaway raindrop. However, the same symbol T is used in the entropy function in equation (7) and in the related equation (16) in MW2016, where T is no longer a random variable, but rather an exogenous bound on values of the random variable T. The distinction between the two becomes a delicate matter in some discussions, so it would have been better to use something like an overbar to distinguish between the random variable T and the related symbol \bar{T}, for example in expressing the large deviation principle P(T \in [\bar{T}, \bar{T}+d\bar{T}]) \approx \exp(-J(\bar{T}))d\bar{T}.

Using Lagrange’s interpolation formula to prove complicated sum/product identities

A paper by Sen and Balakrishnan pointed out that Lagrange’s polynomial interpolation formula can be used to construct relatively simple proofs of some complicated-looking sum/product identities. (Sen, A., Balakrishnan, N., 1999, Convolution of geometrics and a reliability problem, Statistics and Probability Letters, 43, pp. 421-426). In this note I want to make a quick record of how this works.

Let (R_1, f(R_1)), (R_2, f(R_2)), \ldots , (R_{\mathcal{N}}, f(R_{\mathcal{N}})) be a set of \mathcal{N} distinct points in [0, \infty) and let f be a continuous function defined on [0, \infty). Then there is a unique polynomial p of degree \mathcal{N}-1 or less which satisfies the interpolation conditions

p(R_i) = f(R_i) \qquad \qquad \qquad (1)

for i = 1, \ldots, \mathcal{N}. This unique polynomial can be constructed by defining for n = 1, \ldots, \mathcal{N} the functions

\mathcal{P}_n(R) = \prod_{\substack{j=1 \\ j \neq n}}^{\mathcal{N}} \frac{(R_j-R)}{(R_j-R_n)} \qquad \qquad \qquad (2)

These are polynomials of degree less than \mathcal{N}, and since the product on the right-hand side of (2) contains all the R_j except R_n, the functions \mathcal{P}_n(R) satisfy

\mathcal{P}_n(R_i) = \delta_{ni} \qquad \qquad \qquad (3)

for i = 1, \ldots, \mathcal{N}, where \delta_{ni} is Kronecker’s delta. In other words, for each n we have \mathcal{P}_n(R_n) = 1, and \mathcal{P}_n(R_i) = 0 for i \neq n. Lagrange’s polynomial interpolation formula is then written as

p(R) = \sum_{n=1}^{\mathcal{N}} \mathcal{P}_n(R) \cdot f(R_n) \qquad \qquad \qquad (4)

This satisfies the interpolation conditions in (1) above because setting R = R_i in (4) causes all the terms to vanish except \mathcal{P}_i(R_i)f(R_i) = f(R_i), so (1) is obtained. Crucially for the proofs below, the formula (4) represents a unique polynomial of degree \mathcal{N}-1 or less whose values are f(R_1), f(R_2), \dots, f(R_{\mathcal{N}}) at the interpolation points R_1, R_2, \ldots , R_{\mathcal{N}} respectively.

Now, a brief indication of how to prove

\sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j}{R_j - R_n}\bigg) \bigg) \equiv 1 \qquad \qquad \qquad (5)

using Lagrange’s polynomial interpolation formula was provided in Sen and Balakrishnan’s paper (cited above), and we elaborate on this idea here to enable similar approaches for other formulas. The sum/product on the left-hand side of (5) looks like Lagrange’s formula for interpolating the points (R_1, 1), (R_2, 1), \dots, (R_{\mathcal{N}}, 1). In the style of (4), this formula would be

p(R) = \sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j - R}{R_j - R_n}\bigg) \bigg) \cdot 1 \qquad \qquad \qquad (6)

But the constant function that takes the value 1 for all its arguments is a polynomial of degree 0, and the interpolating polynomial p(R) in (6) is unique. Therefore we can immediately conclude that p(R) = 1. But (5) is simply p(0) = 1, so it is immediately proved.

We now see how to adapt this procedure to prove another sum/product formula. Consider the formula

\sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j}{R_j - R_n}\bigg)R_n^m \bigg) \equiv 0 \qquad \qquad \qquad (7)

for m = 1, \ldots, \mathcal{N}-1. The left-hand side of (7) looks like Lagrange’s formula for interpolating the points (R_1, R_1^m), (R_2, R_2^m), \dots, (R_{\mathcal{N}}, R_{\mathcal{N}}^m). In the style of (4), this formula would be

p(R) = \sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j - R}{R_j - R_n}\bigg) \bigg) \cdot R_n^m \qquad \qquad \qquad (8)

But the function f(R_n) = R_n^m in (8) is a polynomial of degree m, and the interpolating polynomial p(R) in (8) is unique. Therefore we can immediately conclude that p(R) = R^m. But (7) is then simply p(0) = 0, so it is immediately proved.

Other sum/product formulas of this kind can be proved in similar ways.
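Both identities are easy to spot-check numerically for random nodes; a small Python sketch of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
R = rng.uniform(0.5, 5.0, size=N)     # N distinct positive nodes R_1, ..., R_N

def weight(n):
    """The product over j != n of R_j / (R_j - R_n), i.e. P_n evaluated at R = 0."""
    mask = np.arange(N) != n
    return np.prod(R[mask] / (R[mask] - R[n]))

print(sum(weight(n) for n in range(N)))                       # identity (5): ~1
for m in range(1, N):
    print(m, sum(weight(n) * R[n] ** m for n in range(N)))    # identity (7): ~0
```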

Using an asymptotic approximation to obtain a closed form solution

I needed to find an extremum of a function of the form

f(\bar{T}_1) = \big[\mathcal{N}_1!\big]^{\gamma - 1}(\bar{T}_1)^{\mathcal{N}_1} + \big[(\mathcal{N} - \mathcal{N}_1)!\big]^{- 1}\bigg(\frac{\mathcal{N}!}{\mathcal{N}_1!}\bigg)^{\gamma}(\bar{T} - \bar{T}_1)^{\mathcal{N}-\mathcal{N}_1}

by choice of \bar{T}_1, ideally obtaining a closed form solution for the critical value \bar{T}_1^{\ast}. The precise context is not relevant here – I just want to record an asymptotic approximation trick I was able to use to solve this problem. Differentiating and setting equal to zero we obtain the first-order condition

\mathcal{N}_1 \big[\mathcal{N}_1!\big]^{\gamma - 1}(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}

- (\mathcal{N} - \mathcal{N}_1)\big[(\mathcal{N} - \mathcal{N}_1)!\big]^{- 1}\bigg(\frac{\mathcal{N}!}{\mathcal{N}_1!}\bigg)^{\gamma}(\bar{T} - \bar{T}_1^{\ast})^{\mathcal{N}-\mathcal{N}_1 - 1} = 0

which implies

\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{(\bar{T} - \bar{T}_1^{\ast})^{\mathcal{N}-\mathcal{N}_1 - 1}} = \frac{(\mathcal{N} - \mathcal{N}_1)\big[(\mathcal{N} - \mathcal{N}_1)!\big]^{- 1}\bigg(\frac{\mathcal{N}!}{\mathcal{N}_1!}\bigg)^{\gamma}}{\mathcal{N}_1 \big[\mathcal{N}_1!\big]^{\gamma - 1}} = \frac{\big[\mathcal{N}!\big]^{\gamma}(\mathcal{N}_1-1)!}{\big[\mathcal{N}_1!\big]^{2\gamma}(\mathcal{N}-\mathcal{N}_1-1)!} \qquad \qquad \qquad (1)

This does not immediately yield a closed form solution for \bar{T}_1^{\ast}, but I realised that in the present context, with \mathcal{N} \gg \mathcal{N}_1 \gg 1, and \bar{T} \gg \bar{T}_1^{\ast}, we can in fact satisfactorily approximate the left-hand side as

\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{(\bar{T} - \bar{T}_1^{\ast})^{\mathcal{N}-\mathcal{N}_1 - 1}} = \bigg(\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1 - 1}}\bigg) \cdot \bigg \{ \bigg(1 - \frac{\bar{T}_1^{\ast}}{\bar{T}}\bigg)^{-(\mathcal{N}-\mathcal{N}_1 - 1)}\bigg \}

\approx \bigg(\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1 - 1}}\bigg) \cdot \bigg(1 + (\mathcal{N}-\mathcal{N}_1 - 1)\frac{\bar{T}_1^{\ast}}{\bar{T}}\bigg)

\approx \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1 - 1}} + (\mathcal{N}-\mathcal{N}_1 - 1) \cdot \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1}}

\approx (\mathcal{N}-\mathcal{N}_1) \cdot \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1}}

Then we can write (1) using the gamma function n! = \Gamma(n+1) as

(\mathcal{N}-\mathcal{N}_1) \cdot \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1}} = \frac{[\Gamma(\mathcal{N}+1)]^{\gamma}\cdot [\Gamma(\mathcal{N}_1)]}{[\Gamma(\mathcal{N}_1+1)]^{2\gamma} \cdot [\Gamma(\mathcal{N}-\mathcal{N}_1)]}

and thus

\bar{T}_1^{\ast} = \frac{ [\Gamma(\mathcal{N}+1)]^{\frac{\gamma}{\mathcal{N}_1}} \cdot [\Gamma(\mathcal{N}_1)]^{\frac{1}{\mathcal{N}_1}} \cdot \bar{T}^{\ \frac{(\mathcal{N}-\mathcal{N}_1)}{\mathcal{N}_1}}}{(\mathcal{N} - \mathcal{N}_1)^{\frac{1}{\mathcal{N}_1}}\cdot [\Gamma(\mathcal{N}_1+1)]^{\frac{2\gamma}{\mathcal{N}_1}}\cdot [\Gamma(\mathcal{N}-\mathcal{N}_1)]^{\frac{1}{\mathcal{N}_1}}}

which is the kind of closed form solution I wanted.
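One practical point: for large \mathcal{N} the factorials in this expression overflow floating-point arithmetic, so the closed form is best evaluated in log space via the log-gamma function. A sketch of mine (the parameter values are purely illustrative):

```python
from math import lgamma, log, exp

def t1_star(N, N1, Tbar, gamma):
    """The closed form above, evaluated in log space to avoid overflow."""
    log_t1 = (gamma * lgamma(N + 1)            # [Gamma(N+1)]^gamma
              + lgamma(N1)                     # Gamma(N1)
              + (N - N1) * log(Tbar)           # Tbar^(N - N1)
              - log(N - N1)
              - 2.0 * gamma * lgamma(N1 + 1)   # [Gamma(N1+1)]^(2 gamma)
              - lgamma(N - N1)) / N1           # overall 1/N1-th root
    return exp(log_t1)

print(t1_star(N=200, N1=10, Tbar=0.01, gamma=1.5))
```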

Applying the Bromwich integral to a cumulant generating function

In this note I want to quickly go through the motions of applying the Bromwich integral (i.e., inverse Laplace transform) to the cumulant generating function \lambda(k) of an exponential variate X with probability density function

f(x) = \mu e^{-\mu x} \qquad \qquad \qquad (1)

for x \geq 0. The cumulant generating function is the log-Laplace transform of f(x), obtained as

\lambda(k) = \ln \langle e^{kx} \rangle

= \ln\bigg(\int_0^{\infty} e^{kx} \mu e^{-\mu x} dx\bigg) = \ln\bigg(\frac{\mu}{\mu - k}\bigg) \qquad \qquad \qquad (2)

Note that we need to assume k < \mu for the integral to converge in this case.

We need to apply the Bromwich integral to the exponential of \lambda(k), i.e.,

e^{\lambda(k)} = \frac{\mu}{\mu - k} \qquad \qquad \qquad (3)

Thus, taking the Laplace transform of the density function gives

\mathcal{L}(\mu e^{-\mu x}) = \frac{\mu}{\mu - k} \qquad \qquad \qquad (4)

How do we now apply the Bromwich integral to the right-hand side of (4) to deduce \mu e^{-\mu x}? One thing to notice is that this result is immediately obtainable from standard tables of Laplace transforms. For example, in most such tables one will find something like

\mathcal{L}(e^{-at}) = \frac{1}{p+a} \qquad \qquad \qquad (5)

where p is the Laplace transform parameter such that

F(p) = \mathcal{L}(f(t)) = \int_0^{\infty} f(t) e^{-pt} dt \qquad \qquad \qquad (6)

Putting a = \mu, p = -k and t = x in (5), we can then immediately deduce that

\mathcal{L}^{-1} \bigg(\frac{\mu}{\mu-k}\bigg) = \mu \mathcal{L}^{-1} \bigg(\frac{1}{\mu-k}\bigg) = \mu e^{-\mu x} \qquad \qquad \qquad (7)

Therefore, we will try to deduce our result by applying the Bromwich integral to the right-hand side of (5), where the Bromwich integral is given by

f(t) = \frac{1}{2 \pi i} \int_{c - i \infty}^{c + i \infty} F(z) e^{zt} dz \qquad \qquad \qquad (8)

for t > 0. The basic procedure is simple. First, we convert p on the right-hand side of (5) into a complex variable, say p = x + iy \equiv z, so the function F(z) in (8) is then

F(z) = \frac{1}{z + a} \qquad \qquad \qquad (9)

Then as long as F(z) is of the form F(z) = P(z)/Q(z), where P(z) and Q(z) are polynomials such that the degree of Q(z) is at least one greater than the degree of P(z), we can immediately conclude

f(t) = \text{the sum of residues of } F(z) e^{zt} \text{ at all poles}

This condition is certainly satisfied by (9), so all we need to do is work out the relevant residues. The function \frac{e^{zt}}{z+a} has a pole of order 1 at z = -a, so there is a single residue which can be obtained as

\lim_{z \rightarrow -a} \bigg((z+a) \cdot \frac{e^{zt}}{(z+a)}\bigg) = e^{-at} \qquad \qquad \qquad (10)

This then immediately confirms the result in (5), and hence our result as explained below (5). We could just finish here, but it is instructive to actually prove that this result is correct using contour integration.

We first construct a contour bearing in mind that we need the restriction k < \mu noted after (2), which in the notation of (9) translates into the condition Re z > -a. So, picking an arbitrary c > -a, we have the following:

The contour \Gamma is designed to enclose the singularity of the integrand in the contour integral

\frac{1}{2 \pi i} \oint_{\Gamma} \frac{e^{zt}}{z+a} dz \qquad \qquad \qquad (11)

We now split this contour integral into a straight part and an arc:

\frac{1}{2 \pi i} \int_{c - iy}^{c + iy} \frac{e^{zt}}{z+a} dz + \frac{1}{2 \pi i} \int_{arc} \frac{e^{zt}}{z+a} dz = e^{-at} \qquad \qquad \qquad (12)

We want to show that as y \rightarrow \infty, the integral along the arc vanishes, and so

\frac{1}{2 \pi i} \int_{c - i\infty}^{c + i\infty} \frac{e^{zt}}{z+a} dz = e^{-at} \qquad \qquad \qquad (13)

as per the Bromwich integral. We use estimations, beginning by writing

\bigg|\int_{arc} \frac{e^{zt}}{z+a} dz\bigg| \leq \pi y \sup_{arc} \bigg| \frac{e^{zt}}{z+a} \bigg| \qquad \qquad \qquad (14)

We have along the arc that |z| \geq y and also

|z+a| = \sqrt{(z+a)(\bar{z}+a)}

= \sqrt{((x+a)+iy)((x+a)-iy)}

= \sqrt{(x+a)^2 + y^2}

\geq y

In addition, for any z on the arc, with arg(z) = \phi, we must have \cos(\phi) < 0 since we are operating in the second and third quadrants, so

|e^{zt}| = |e^{|z|(\cos(\phi) + i \sin(\phi))t}|

= e^{|z|\cos(\phi)t}

\leq e^{-y |\cos(\phi)|t}

\leq \frac{1}{y |\cos(\phi)|t}

Therefore (14) gives

\bigg|\int_{arc} \frac{e^{zt}}{z+a} dz\bigg| \leq \frac{\pi y}{y^2 |\cos(\phi)|t} \rightarrow 0

as y \rightarrow \infty, so (12) reduces to (13) as required.
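The limit (13) can also be checked by brute-force numerical integration along the vertical line; here is a quick Python sketch of mine (the truncation height and grid spacing are arbitrary):

```python
import numpy as np

a, t, c = 1.0, 1.0, 0.5             # illustrative values, with c > -a
y = np.linspace(-2000.0, 2000.0, 2_000_001)
z = c + 1j * y
integrand = np.exp(z * t) / (z + a)
# dz = i dy, and the factor i cancels against the i in 1/(2 pi i)
approx = (integrand.sum() * (y[1] - y[0])).real / (2.0 * np.pi)
print(approx, np.exp(-a * t))       # agree to a few decimal places
```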

Inverse Fourier transform of a characteristic function

Let the Fourier transform of a function f(x) be

g(t) = \int_{-\infty}^{\infty} f(x) e^{itx} dx \qquad \qquad \qquad (1)

The corresponding inverse Fourier transform would then be

f(x) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} g(t) e^{-itx} dt \qquad \qquad \qquad (2)

As an exercise, what I want to do in this note is derive the characteristic function \phi_X(t) = E[e^{itX}] of an exponential random variate X with probability density function

f(x) = \lambda e^{-\lambda x} \qquad \qquad \qquad (3)

for x \geq 0. The required characteristic function is the Fourier transform of this density function. I then want to apply the inverse Fourier transform to \phi_X(t) and thereby recover the density f(x).

Simply applying (1) above to (3) we easily get the characteristic function:

\phi_X(t) = \int_0^{\infty} \lambda e^{-\lambda x} e^{itx} dx

= \lambda \int_0^{\infty} e^{-(\lambda-it)x} dx

= \bigg[-\frac{1}{(\lambda - it)} e^{-(\lambda - it)x}\bigg]_0^{\infty}

= \frac{\lambda}{\lambda - it}

= \bigg(1 - \frac{it}{\lambda}\bigg)^{-1} \qquad \qquad \qquad (4)

(Note that e^{-(\lambda - it)x} = e^{-\lambda x} e^{itx} \rightarrow 0 as x \rightarrow \infty if \lambda > 0, since the modulus is e^{-\lambda x}).
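The forward calculation (4) is simple to verify numerically; a quick sketch of mine with illustrative values:

```python
import numpy as np
from scipy.integrate import quad

lam, t = 2.0, 3.0
re = quad(lambda x: lam * np.exp(-lam * x) * np.cos(t * x), 0.0, np.inf)[0]
im = quad(lambda x: lam * np.exp(-lam * x) * np.sin(t * x), 0.0, np.inf)[0]
print(re + 1j * im)                 # numerical phi_X(t)
print(lam / (lam - 1j * t))         # closed form (4)
```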

Let us now reverse the process, and apply the inverse Fourier transform to (4). Using (2) above, we get

f(x) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} \bigg(1 - \frac{it}{\lambda}\bigg)^{-1} e^{-itx} dt

= \frac{i\lambda}{2 \pi} \int_{-\infty}^{\infty} \frac{e^{-itx}}{(t + i\lambda)} dt \qquad \qquad \qquad (5)

(Note that \lambda/(\lambda - it) = i\lambda/(t + i\lambda), since \lambda - it = -i(t + i\lambda), and that t is unrestricted, so the integration limits are from -\infty to \infty). It quickly transpires that elementary techniques will not work easily with (5), so we try contour integration using the residue theorem. We turn the integration variable t into a complex variable z = t + iy, and formulate the integral in (5) as a contour integral

\frac{i\lambda}{2 \pi} \oint_C \frac{e^{-izx}}{(z + i\lambda)} dz \qquad \qquad \qquad (6)

The trick will be to design the contour C in such a way that part of the contour includes the integral in the form (5), while the rest of the contour vanishes, in a limiting process. We can then apply the residue theorem to solve (5).

The integrand in (6) has a pole of order 1 at z = -i\lambda. To find the residue of the integrand at this pole, we multiply the integrand by (z + i \lambda), obtaining e^{-izx}, and then evaluate this result at z = -i \lambda, yielding e^{-\lambda x}. Thus, e^{-\lambda x} is the residue of the integrand at its pole. For x > 0 the factor e^{-izx} decays when Im z < 0, so we design the contour C to enclose the singularity at -i \lambda by closing it in the lower half-plane. The contour is then traversed clockwise, so Cauchy’s residue theorem tells us that the integral in (6) equals -2 \pi i times the residue e^{-\lambda x}, and we will get the density function we are looking for:

\frac{i\lambda}{2 \pi} \oint_C \frac{e^{-izx}}{(z + i\lambda)} dz = \frac{i\lambda}{2 \pi} (-2 \pi i) e^{-\lambda x} = \lambda e^{-\lambda x} \qquad \qquad \qquad (7)

Clearly then, when we use contour integration we get the required result straight away. But what contour do we use for the integral in (6)? The contour must enclose the point -i \lambda, so letting a > \lambda be some initial starting value, we can try the following:

Then the contour C may be divided into a straight part along the real axis (which only involves the real part of z, namely t), and a curved arc. Therefore the contour integral in (6) can be decomposed as

\frac{i\lambda}{2 \pi} \int_{-a}^{a} \frac{e^{-itx}}{(t + i\lambda)} dt + \frac{i\lambda}{2 \pi} \int_{arc} \frac{e^{-izx}}{(z + i\lambda)} dz = \lambda e^{- \lambda x} \qquad \qquad \qquad (8)

We can now use estimations to show that the second integral on the left-hand side of (8) vanishes as a \rightarrow \infty. First, noting that \pi a is half the circumference of a circle of radius a, we have

\bigg|\int_{arc} \frac{e^{-izx}}{(z + i\lambda)} dz\bigg| \leq \pi a \sup_{arc} \bigg|\frac{e^{-izx}}{(z+i \lambda)}\bigg| \qquad \qquad \qquad (9)

Next, we observe that, for z on the arc in the lower half-plane, where \sin(\arg(z)) < 0,

|e^{-izx}| = e^{\text{Re}(-izx)} = e^{x |z| \sin(\arg(z))} = \frac{1}{e^{x |z| |\sin(\arg(z))|}} \leq \frac{1}{x |z| |\sin(\arg(z))|} \qquad \qquad \qquad (10)

and

|z+i \lambda| \geq |z| - \lambda \qquad \qquad \qquad (11)

and also, along the arc (a semicircle of radius a closed in the lower half-plane), we have

|z| = a \qquad \qquad \qquad (12)

Using these and (9) we then have

\bigg|\int_{arc} \frac{e^{-izx}}{(z + i\lambda)} dz\bigg| \leq \frac{\pi a}{(a - \lambda)\, a\, x |\sin(\arg(z))|} = \frac{\pi}{(a - \lambda) x |\sin(\arg(z))|} \rightarrow 0

as a \rightarrow \infty, so in the limit we are left with

\frac{i\lambda}{2 \pi} \int_{-\infty}^{\infty} \frac{e^{-itx}}{(t + i\lambda)} dt = \lambda e^{- \lambda x}

which is the required result.
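As a numerical cross-check of my own: the real part of the integrand in (5) is even in t and splits into cosine and sine pieces, so the inversion can be carried out with SciPy's Fourier-integral weights and compared against the density (3):

```python
import numpy as np
from scipy.integrate import quad

lam, x = 2.0, 1.5     # illustrative values

# Real part of (1 - it/lam)^(-1) e^(-itx), split into its even cos and sin pieces
cos_part = quad(lambda t: lam ** 2 / (lam ** 2 + t ** 2),
                0.0, np.inf, weight='cos', wvar=x)[0]
sin_part = quad(lambda t: lam * t / (lam ** 2 + t ** 2),
                0.0, np.inf, weight='sin', wvar=x)[0]
print((cos_part + sin_part) / np.pi)    # recovered f(x)
print(lam * np.exp(-lam * x))           # target density (3)
```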

Two versions of the exponential cumulant generating function

In this note I want to record an interesting issue I have noticed with regard to how the form of the large deviation entropy function can ‘flip’ depending on the way the underlying cumulant generating function is formulated.

The cumulant generating function of a random variable is the natural log of its moment generating function. In the case of an exponential random variable with mean \mu and probability density function

f(x) = \frac{1}{\mu} e^{-x/\mu}

the moment generating function is the same as the Laplace transform of the random variable. Thus, one way to compute the cumulant generating function of an exponential random variable is as follows:

\lambda(k) = \ln \langle e^{kx} \rangle

= \ln \bigg(\int_0^\infty e^{kx} \frac{1}{\mu} e^{-x/\mu} dx \bigg)

= \ln \bigg(\frac{1}{\mu} \int_0^{\infty} e^{-(1-\mu k)x/\mu} dx \bigg)

= \ln \bigg(\frac{1}{\mu} \bigg[ -\frac{\mu}{1-\mu k}e^{-(1-\mu k)x/\mu} \bigg]_0^{\infty} \bigg)

= \ln \bigg( \frac{1}{1-\mu k}\bigg)

= -\ln(1 - \mu k) \qquad \qquad \qquad (1)

Note that it is clear from (1) that there is a singularity at k = \frac{1}{\mu} and that the natural logarithm is defined only if the Laplace transform parameter k is restricted to the region k < \frac{1}{\mu}. Since \lambda(k) is everywhere differentiable in this region, it follows from the Gärtner-Ellis Theorem of large deviation theory that the probability density function of values of the random variable in the tails of the exponential distribution can be approximated using the large deviation principle, with entropy function

I(s) = \sup_k (ks - \lambda(k)) \qquad \qquad \qquad (2)

All well and good. However, what I want to point out here is that there is a mathematically equivalent way to formulate the cumulant generating function, as follows:

\lambda(k) = -\ln \langle e^{-kx} \rangle

= -\ln \bigg(\int_0^\infty e^{-kx} \frac{1}{\mu} e^{-x/\mu} dx \bigg)

= -\ln \bigg(\frac{1}{\mu} \int_0^{\infty} e^{-(1+\mu k)x/\mu} dx \bigg)

= -\ln \bigg( \frac{1}{1+\mu k}\bigg)

= \ln(1+\mu k) \qquad \qquad \qquad (3)

It is less obvious from (3) that there are singularities, etc., but they are still there. It’s just that everything is negated when compared to the first formulation. There is now a singularity at k = -\frac{1}{\mu}, and the logarithm in (3) is now only defined for values of k in the region k > -\frac{1}{\mu}. What interested me is that it is not only the admissible region of values of k that gets reversed, but also the form of the entropy function in (2) above. Since in the second formulation of the cumulant generating function we have negated both the outer function and the parameter k, the signs of the terms on the right-hand side of (2) have also become reversed, so the entropy function as a whole now gets flipped around. It should therefore read as follows in this case:

I(s) = \sup_k (\lambda(k) - ks)
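A quick numerical confirmation of the flip (my own sketch): computing the supremum in both formulations gives the same entropy, which for the exponential distribution works out to s/\mu - 1 - \ln(s/\mu).

```python
import numpy as np
from scipy.optimize import minimize_scalar

mu, s = 2.0, 3.0      # mean mu and tail value s (illustrative)

lam1 = lambda k: -np.log(1.0 - mu * k)     # formulation (1), needs k < 1/mu
lam2 = lambda k: np.log(1.0 + mu * k)      # formulation (3), needs k > -1/mu

I1 = -minimize_scalar(lambda k: -(k * s - lam1(k)),
                      bounds=(-50.0, 1.0 / mu - 1e-9), method='bounded').fun
I2 = -minimize_scalar(lambda k: -(lam2(k) - k * s),
                      bounds=(-1.0 / mu + 1e-9, 50.0), method='bounded').fun
print(I1, I2, s / mu - 1.0 - np.log(s / mu))   # all three agree
```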

The role of the Legendre transform in large deviation theory

The Legendre transform is a mechanism for converting a relationship like

t(k) = \frac{d \lambda(k)}{dk} \qquad \qquad \qquad (1)

into a relationship like

k(t) = \frac{dJ(t)}{dt} \qquad \qquad \qquad (2)

In other words, by going from a function \lambda(k) to its Legendre transform J(t) and vice versa, the roles of t and k in (1) and (2) can be reversed.

In moving from (1) to (2), the required Legendre transform is

J(t) = t \cdot k(t) - \lambda(k(t)) \qquad \qquad \qquad (3)

because differentiating both sides of (3) with respect to t gives

\frac{dJ(t)}{dt} = k(t) + t \cdot \frac{d k(t)}{dt} - \frac{d \lambda(k(t))}{dk(t)} \cdot \frac{dk(t)}{dt} \qquad \qquad \qquad (4)

= k(t)

Using (1), we see that the last two terms in (4) cancel, so we get (2). Conversely, the Legendre transform of J(t) in (2) can be written

\lambda(k) = k \cdot t(k) - J(t(k)) \qquad \qquad \qquad (5)

because differentiating both sides with respect to k gives (1).

The Fenchel-Legendre transform in large deviation theory achieves this same kind of relationship between an entropy function J(\bar{T}) and the log of the Laplace transform of a corresponding random variable, which is denoted \lambda(k). The Fenchel-Legendre transform is given by

J(\bar{T}) = \sup_{k \in \mathbb{R}} (k \cdot \bar{T} - \lambda(k)) \qquad \qquad \qquad (6)

We take \lambda(k) to be the log of the Laplace transform of a random variable T, and then approximate the density of T at some value \bar{T} far out in the tail of its distribution as

f(\bar{T}) = \exp(-J(\bar{T})) \qquad \qquad \qquad (7)

Random variables for which we can use (7) as an approximation of the density at tail values \bar{T} in the distribution are said to satisfy the large deviation principle.

One way to understand how (6) arises is in terms of Chernoff’s inequality, which says

P(T \geq \bar{T}) \leq \exp(-J(\bar{T})) \qquad \qquad \qquad (8)

Chernoff’s inequality is a modified form of Markov’s inequality. Minimising the upper bound in this inequality requires J(\bar{T}) to be the supremum defined in (6) above, as I explained in a previous post.

Another way to understand where (6) comes from is through the use of the approximation in (7) to obtain \lambda(k) in terms of J(\bar{T}), followed by an inversion to obtain J(\bar{T}) in terms of \lambda(k) using the fact that \lambda(k) and J(\bar{T}) are Legendre transforms of each other. For a random variable T, the Laplace transform is

E[e^{kT}] = \int_{\mathbb{R}} e^{kT} f(T) dT \qquad \qquad \qquad (9)

(Note that, by the inherent symmetry, the right-hand side can equally be regarded as the Laplace transform of the density). We now approximate f(T)dT by

f(\bar{T})d \bar{T} = P(T \in [\bar{T}, \bar{T}+d\bar{T}]) \qquad \qquad \qquad (10)

with f(\bar{T}) as given in (7) above. Then (9) becomes

E[e^{kT}] \approx \int_{\mathbb{R}} e^{k \bar{T}} e^{-J(\bar{T})} d\bar{T} = \int_{\mathbb{R}} e^{k\bar{T} - J(\bar{T})}d \bar{T} \qquad \qquad \qquad (11)

We can approximate the integral in (11) using the saddle-point approximation, i.e., as the maximum value of the integrand, which gives

E[e^{kT}] \approx \exp(\sup_{\bar{T} \in \mathbb{R}} (k\bar{T} - J(\bar{T}))) \qquad \qquad \qquad (12)

Now taking the log of both sides of this we get

\lambda(k) = \sup_{\bar{T} \in \mathbb{R}} (k\bar{T} - J(\bar{T})) \qquad \qquad \qquad (13)

We can now get from (13) to (6) by invoking the Legendre transform relationship. To do this, we need to carefully understand the role that taking the supremum is playing in (13). For any given value of k, we are finding the value of \bar{T} that maximises the expression k\bar{T} - J(\bar{T}). This then makes the optimised value of \bar{T} a function of k. Thus we can write (13) as

\lambda(k) = k \cdot \bar{T}^{\ast}(k)  - J(\bar{T}^{\ast}(k)) \qquad \qquad \qquad (14)

where \bar{T}^{\ast} denotes the optimal value of \bar{T}. The supremum in (6) is playing a similar role, so we can likewise rewrite (6) as

J(\bar{T}) = \bar{T} \cdot k^{\ast}(\bar{T}) - \lambda(k^{\ast}(\bar{T})) \qquad \qquad \qquad (15)

where k^{\ast} denotes the optimal value of k. Comparing these with (3) and (5) above, we now immediately recognise (14) and (15) as Legendre transforms of each other whenever \lambda(k) is everywhere differentiable. Thus, we can invert (14) to obtain (15) using the Legendre transform relationship between them. We can then use (15) to calculate the entropy function for particular random variables, given their Laplace transforms.
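To close, a small numerical illustration of mine of this inversion: for an exponential variate with \lambda(k) = -\ln(1 - \mu k), the numerically computed transform (6) satisfies the derivative relation (2), with k(t) obtained by inverting t = \lambda^{\prime}(k) = \mu/(1 - \mu k).

```python
import numpy as np
from scipy.optimize import minimize_scalar

mu = 2.0
lam = lambda k: -np.log(1.0 - mu * k)   # lambda(k) for an exponential variate

def J(t):
    """Fenchel-Legendre transform (6), computed numerically."""
    res = minimize_scalar(lambda k: -(k * t - lam(k)),
                          bounds=(-50.0, 1.0 / mu - 1e-9), method='bounded')
    return -res.fun

t, h = 3.0, 1e-5
print((J(t + h) - J(t - h)) / (2.0 * h))   # numerical dJ/dt, cf. equation (2)
print(1.0 / mu - 1.0 / t)                  # k(t) from inverting t = lambda'(k)
```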