Study of a Lagrangian function approach to analysing raindrop growth

In this note, I study a constrained maximisation approach intended to clarify the growth history of microscopic water droplets which undergo runaway aggregation to become raindrops. Using this approach, and the setup outlined in a previous note about a 2016 paper in Physical Review Letters on the large deviation analysis of rapid-onset rain showers (referred to as MW2016 herein), an expression for the total time for runaway growth is easily obtained in terms of the mean inter-collision times \tau_n. This expression appears to be related to the expression for \tau in MW2016, which was derived from large deviation analysis. The purpose of the present note is to clarify further the links with some of the results in MW2016. To achieve this, I employ a Lagrangian function approach to the constrained maximisation problem as follows.

A Lagrangian function approach

Let the random variable T, representing the total time required for a droplet to undergo runaway growth, be defined as in the previous note mentioned above, that is,

T = \sum_{n=1}^{\mathcal{N}} t_n

where the inter-collision times t_n are independent exponentially distributed random variables with mean \tau_n = 1/R_n. The parameter \mathcal{N} is a given total number of water particle collisions required for runaway growth of a raindrop.

Consider also a fixed time interval [0, \tau], where \tau \equiv \bar{T}/\langle T \rangle, and \bar{T} is a number in the small-value tail of the distribution of T, to be treated as an exogenous parameter in the forthcoming analysis. We partition the interval [0, \tau] into \mathcal{N} subintervals, each of length \hat{t}_n, so that

\tau = \sum_{n=1}^{\mathcal{N}} \hat{t}_n \qquad \qquad \qquad \qquad \qquad (1)

Then on the basis of the exponential distributions of the inter-collision times t_n, we have the following joint cumulative distribution function for the inter-collision times:

P(t_1 \leq \hat{t}_1, t_2 \leq \hat{t}_2, \ldots, t_{\mathcal{N}} \leq \hat{t}_{\mathcal{N}}) = \prod_{n=1}^{\mathcal{N}} (1 - \exp(-\hat{t}_n/\tau_n)) \qquad \qquad \qquad \qquad \qquad (2)

In the constrained maximisation approach, we seek to choose the inter-collision time parameters \hat{t}_n to maximise the joint cumulative distribution function in (2), subject to the constraint represented by (1). Since logarithms are monotone increasing functions, we can obtain the same results by maximising the natural logarithm of (2) subject to (1). The Lagrangian function for this constrained maximisation problem can then be written as

\mathcal{L} = \sum_{n=1}^{\mathcal{N}} \ln(1 - \exp(-\hat{t}_n/\tau_n)) + k \bigg(\tau - \sum_{n=1}^{\mathcal{N}} \hat{t}_n \bigg) \qquad \qquad \qquad \qquad \qquad (3)

where I have used the symbol k to denote the Lagrange multiplier, since this will turn out to be closely linked to the Laplace transform parameter k in the large deviation analysis in MW2016. First-order conditions for a maximum are

\frac{\partial \mathcal{L}}{\partial \hat{t}_n} = \frac{\frac{1}{\tau_n}\exp(-\hat{t}_n/\tau_n)}{1 - \exp(-\hat{t}_n/\tau_n)} - k = 0 \qquad \qquad \qquad \qquad \qquad (4)

for n = 1, \ldots, \mathcal{N}, and

\frac{\partial \mathcal{L}}{\partial k} = \tau - \sum_{n=1}^{\mathcal{N}} \hat{t}_n = 0 \qquad \qquad \qquad \qquad \qquad (5)

Solving (4) gives

\hat{t}_n^{\ast} = \tau_n \ln\bigg(1 + \frac{1}{k^{\ast} \tau_n}\bigg) \qquad \qquad \qquad \qquad \qquad (6)

and putting this into (5) gives

\tau = \sum_{n=1}^{\mathcal{N}} \tau_n \ln\bigg(1 + \frac{1}{k^{\ast} \tau_n}\bigg) \qquad \qquad \qquad \qquad \qquad (7)

where \hat{t}_n^{\ast} and k^{\ast} denote the optimised values of \hat{t}_n and k respectively, given the value of the parameter \tau. We can now put this structure to work and clarify its links with runaway growth of raindrops and with the large deviation analysis in MW2016.
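As a sanity check on (4)-(7), the following short Python sketch (using my own illustrative values \tau_n = 1/n^2 and k^{\ast} = 2, which are not taken from MW2016) confirms that the \hat{t}_n^{\ast} in (6) satisfy the first-order conditions (4), with \tau then fixed by (7):

```python
import math

# Illustrative parameter choices (not from MW2016): tau_n = 1/n^2, k* = 2
N = 5
k_star = 2.0
tau_n = [1.0 / n**2 for n in range(1, N + 1)]

# Optimised subinterval lengths from equation (6)
t_hat = [t * math.log(1.0 + 1.0 / (k_star * t)) for t in tau_n]

# Check the first-order condition (4): each residual should vanish
residuals = [
    (math.exp(-th / t) / t) / (1.0 - math.exp(-th / t)) - k_star
    for th, t in zip(t_hat, tau_n)
]

# The constraint (5) then determines tau as in equation (7)
tau = sum(t_hat)
```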

First, we obtain an intuitively appealing interpretation of the optimised Lagrange multiplier k^{\ast} by solving (6) for k^{\ast} in the case n=1. Noting that \tau_1 = 1/R_1 = 1 in time units chosen so that R_1 = 1 (consistent with the choice \tau_n = 1/n^2 used in the numerical work later in this note), this gives

k^{\ast} = \frac{1}{\exp(\hat{t}_1^{\ast}) - 1} \qquad \qquad \qquad \qquad \qquad (8)

Therefore, for small \tau and \hat{t}_1^{\ast} \ll 1, we have k^{\ast} \approx 1/\hat{t}_1^{\ast}, so the optimised Lagrange multiplier can be interpreted as the reciprocal of the first monomer collision time in the runaway growth process. Thus, k^{\ast} can be expected to be very large in scenarios in which \hat{t}_1^{\ast} is very small.
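The approximation k^{\ast} \approx 1/\hat{t}_1^{\ast} can be checked directly from (8); the following sketch uses an illustrative value \hat{t}_1^{\ast} = 0.01 (my own choice, purely for demonstration) and shows agreement to better than one per cent:

```python
import math

# Equation (8): k* in terms of the first collision time (tau_1 = 1 assumed)
t1_hat = 0.01                      # illustrative small first collision time
k_star = 1.0 / (math.exp(t1_hat) - 1.0)

# Small-t1 interpretation: k* is approximately the reciprocal of t1
approx = 1.0 / t1_hat
rel_err = abs(k_star - approx) / approx
```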

Next, put the expressions for \hat{t}_n^{\ast} in (6) into the Lagrangian function in (3), to obtain the following expression for the maximised value of the Lagrangian as a function only of k^{\ast}:

\mathcal{L}^{\ast} = - \bigg \{\sum_{n=1}^{\mathcal{N}} \ln(1 + k^{\ast} \tau_n) - k^{\ast} \tau \bigg \} - \sum_{n=1}^{\mathcal{N}} k^{\ast} \tau_n \ln\bigg(1 + \frac{1}{k^{\ast} \tau_n}\bigg) \qquad \qquad \qquad \qquad \qquad (9)

In the regime of a small \tau, \hat{t}_1^{\ast} \ll 1, and k^{\ast} \gg 1, the second term on the right-hand side of (9) converges to the constant \mathcal{N}, since each summand k^{\ast} \tau_n \ln(1 + 1/(k^{\ast} \tau_n)) \rightarrow 1 as k^{\ast} \tau_n \rightarrow \infty, and we recognise the first bracketed term on the right-hand side of (9) as being of the same form as the entropy function in MW2016. Therefore, as \tau \rightarrow 0, and k^{\ast} \rightarrow \infty, we obtain the result

\mathcal{L}^{\ast \prime}(k^{\ast}) \sim -J^{\prime}(k^{\ast}) = 0 \qquad \qquad \qquad \qquad \qquad (10)

which then leads to the same function \tau(k^{\ast}) as the one derived from the large deviation analysis in MW2016. However, in the regime of larger values of \tau for which k^{\ast} is not large, the correct expression for \tau(k^{\ast}) is (7) above, because the second term on the right-hand side of (9) is then no longer negligible when calculating \mathcal{L}^{\ast \prime}(k^{\ast}).
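This regime dependence can be illustrated numerically. The sketch below (my own illustrative choices: \tau_n = 1/n^2 with \mathcal{N} = 50) compares (7) with the large deviation expression \tau(k^{\ast}) = \sum_n 1/(R_n + k^{\ast}) from MW2016: the two agree closely when k^{\ast} greatly exceeds every collision rate R_n, and separate when k^{\ast} is of order one:

```python
import math

N = 50
R = [n**2 for n in range(1, N + 1)]   # R_n = n^2, so tau_n = 1/n^2

def tau_lagrangian(k):
    # Equation (7): tau = sum_n tau_n * ln(1 + 1/(k tau_n))
    return sum((1.0 / r) * math.log(1.0 + r / k) for r in R)

def tau_large_deviation(k):
    # MW2016 form: tau = sum_n 1/(R_n + k)
    return sum(1.0 / (r + k) for r in R)

# Large k* (much bigger than R_N = 2500): the two expressions nearly coincide
k_big = 1.0e6
rel_diff_big = abs(tau_lagrangian(k_big) - tau_large_deviation(k_big)) / tau_large_deviation(k_big)

# Moderate k*: the logarithmic correction matters and they separate
k_small = 1.0
rel_diff_small = abs(tau_lagrangian(k_small) - tau_large_deviation(k_small)) / tau_large_deviation(k_small)
```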

A geometric understanding of the relationship between the two approaches can also be obtained by recognising a probabilistic connection between them. We can extend to our \mathcal{N}-dimensional case an elementary result in probability theory relating the cumulative distribution function of a sum of random variables to the joint density function of those same random variables by writing

P\bigg(\sum_{n=1}^{\mathcal{N}} t_n \leq \tau\bigg)

= \int_0^{\tau} \cdots \int_0^{\tau - \sum_{n=2}^{\mathcal{N}} t_n} \{f_1(t_1)f_2(t_2) \cdots f_{\mathcal{N}}(t_{\mathcal{N}})\} \ dt_1 \cdots dt_{\mathcal{N}} \qquad \qquad \qquad \qquad \qquad (11)

Given that the inter-collision times are independent, the integrand on the right-hand side of (11) is the joint probability density function of the inter-collision times t_1, t_2, \ldots, t_{\mathcal{N}}, which also appear in the sum in the cumulative distribution function on the left-hand side of (11). To clarify the situation geometrically, consider the two-dimensional case depicted in the following diagram.

In two dimensions, the probability represented by (11) is the probability mass in the shaded region bounded by the coordinate axes and the diagonal line.

The link between this framework and the exponentiated entropy function \exp(-J(\tau)) that appears in the large deviation analysis in MW2016 is that the latter has the probabilistic interpretation of being the smallest possible upper bound, obtained from the form of Markov’s inequality used in the Cramér-Chernoff bounding method, on the probability that the random sum of inter-collision times is greater than \tau (see my previous note about this). Therefore, negating this relationship, we have

P\bigg(\sum_{n=1}^{\mathcal{N}} t_n \leq \tau\bigg) \geq 1 - \exp(-J(\tau)) \qquad \qquad \qquad \qquad \qquad (12)

So, by using the Fenchel-Legendre transform J(\tau), the approach in MW2016 is implicitly maximising the lower bound of the probability that the sum of inter-collision times is no greater than \tau in (12), and 1 - \exp(-J(\tau)) then constitutes an approximation of the probability mass represented in two dimensions by the shaded region in the diagram above.

The link with the constrained maximisation approach is that here we are approximating the probability as represented by the right-hand side of (11) by choosing the parameters \hat{t}_n to obtain the maximum of the joint cumulative distribution function of the inter-collision times. In the two-dimensional case, this corresponds to choosing a point on the diagonal line in the diagram below in order to maximise the probability mass in the rectangular region.

The upshot, then, is that the multi-dimensional analogue of maximising the probability mass in the rectangular region in the second diagram above by choosing the \hat{t}_n optimally, and the multi-dimensional analogue of approximating the probability mass in the entire region below the diagonal line using the entropy function J(\tau) in 1 - \exp(-J(\tau)), both converge to the same result in the small \tau and small \hat{t}_1^{\ast} regime for which k^{\ast} \gg 1. The two approaches deviate from each other in the large \tau/small k^{\ast} regime due to the logarithmic correction term in (9).

Finally in this section, I briefly record some numerical results from using MAPLE to implement equations (6) and (7). Substituting equation (8) into (6) and (7), and using \tau_n = 1/n^2, we obtain

\hat{t}_n^{\ast} = \frac{1}{n^2} \ln\big(1 + n^2 (\exp(\hat{t}_1^{\ast})-1)\big) \qquad \qquad \qquad \qquad \qquad (13)

for n = 1, \ldots, \mathcal{N}, and

\tau = \sum_{n=1}^{\mathcal{N}} \frac{1}{n^2} \ln\big(1 + n^2 (\exp(\hat{t}_1^{\ast})-1)\big) \qquad \qquad \qquad \qquad \qquad (14)

Thus, given a value for \tau, the maximum-likelihood time \hat{t}_1^{\ast} for the first monomer collision can be computed from (14), and then this value can be used to determine \hat{t}_n^{\ast} for each n = 2, \dots, \mathcal{N} via (13). A plot of the number of collisions versus time, i.e., the growth history of a runaway drop, can then be obtained by plotting the points with coordinates \big(\big \{\sum_{n=1}^{m}\hat{t}_n^{\ast}\big \}, m \big) for m = 1, 2, \dots. For example, a MAPLE computation with values \tau = 0.4136 and \mathcal{N} = 10,000, inspired by some FORTRAN simulations, produced a first monomer collision time \hat{t}_1^{\ast} = 0.018 and a growth history plot like the one shown in the following diagram. The shape of this plot is typical of the runaway growth plots also produced by some physical FORTRAN simulations (not shown here).
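The MAPLE computation just described is straightforward to reproduce in other systems. The following Python sketch (my own re-implementation, not the original MAPLE program) solves (14) for \hat{t}_1^{\ast} by bisection with \tau = 0.4136 and \mathcal{N} = 10,000, and then builds the growth history points from (13):

```python
import math

N = 10_000
TAU = 0.4136   # target value of tau, as in the MAPLE computation above

def tau_of_t1(t1):
    # Right-hand side of equation (14), with tau_n = 1/n^2
    c = math.exp(t1) - 1.0
    return sum(math.log(1.0 + n * n * c) / (n * n) for n in range(1, N + 1))

# Solve (14) for the first monomer collision time by bisection
# (tau_of_t1 is increasing in t1, so bisection is safe)
lo, hi = 1e-6, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if tau_of_t1(mid) < TAU:
        lo = mid
    else:
        hi = mid
t1_star = 0.5 * (lo + hi)

# Equation (13): the remaining optimised inter-collision times
c = math.exp(t1_star) - 1.0
t_hat = [math.log(1.0 + n * n * c) / (n * n) for n in range(1, N + 1)]

# Growth history: cumulative time after m collisions, paired with m
history = []
total = 0.0
for m, t in enumerate(t_hat, start=1):
    total += t
    history.append((total, m))
```

Plotting the pairs in `history` reproduces the characteristic runaway growth curve described above.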

Interpretation as constrained maximum likelihood

At first sight, the constrained maximisation approach in this section looks different from a traditional maximum likelihood calculation because the objective function (2) involves the cumulative distribution functions of the inter-collision times rather than probability densities. However, the approach can in fact be interpreted as a constrained maximum likelihood problem with left-censoring of the inter-collision times. Censored data arise frequently in econometrics and other statistical disciplines when some values of a variable of interest fall within a certain range that renders them unobservable. Such observations are then all reported in a collected data set as having a value equal to the boundary of the unobservable range. The objective functions used for maximum likelihood estimation in these settings necessarily involve the cumulative distribution functions of the variables of interest, because the exact values of censored observations are unobserved, so their probability densities cannot be evaluated. See, for example, Gentleman, R., Geyer, C., 1994, Maximum likelihood for interval censored data: consistency and computation, Biometrika, Vol. 81, No. 3, pp. 618-623, and also pages 761-768 in Greene, W., 2003, Econometric Analysis, Fifth Edition, Prentice-Hall International (UK) Limited, London.

In our constrained maximisation calculation in this section, what we are essentially saying is this: we cannot observe the actual inter-collision times t_n, so assume that all we know is that the inter-collision times t_n are below some threshold values \hat{t}_n. (In econometrics, this would be referred to as left-censoring). Then given the exponential distributions for the inter-collision times, what is the most likely sequence of threshold values \hat{t}_n for a runaway raindrop? It is important to understand, however, that we are not addressing this question here in a statistical way with observed data. Maximum likelihood estimation has some well known disadvantages in statistical contexts, particularly when the number of parameters to be estimated is large. Distortions can arise in maximum likelihood estimation involving the gamma distribution, for example, due to unwanted correlations between parameter estimates. See, e.g., Bowman, K., Shenton, L., 2002, Problems with maximum likelihood estimation and the 3 parameter gamma distribution, Journal of Statistical Computation and Simulation, 72:5 pp. 391-401. However, we are not doing statistical maximum likelihood here, in the sense that we are not using observed data to estimate parameters. We are doing mathematical maximum likelihood, just using the functional forms of the probability distributions of the runaway drops. In this context, the statistical problems with the maximum likelihood method are irrelevant, so the approach here actually represents a potentially powerful mathematical method for analysing raindrop histories that can also be applied in the context of other problems. It also has direct links to the entropy functions of large deviation theory, as was demonstrated using the Lagrangian function approach above.

Monomer injections leading to oscillatory behaviour in a mathematical model of rain formation

In a previous note, I outlined a mathematical model of rapid-onset rain formation published by Michael Wilkinson in Physical Review Letters, which will be referred to as MW2016 herein. (Full reference: Wilkinson, M., 2016, Large Deviation Analysis of Rapid Onset of Rain Showers, Phys. Rev. Lett. 116, 018501). In the present note, I want to briefly record an interesting experiment in which I augmented the framework in MW2016 by allowing monomer injections to oppose the reduction in the volume fraction of liquid water content due to runaway collisions in equation [11] in MW2016. I show that this immediately leads to the possibility of oscillatory behaviour in Wilkinson’s model.

The simplest possible assumption is that of a constant monomer injection rate which increases the volume fraction at a constant rate \alpha, thus changing the differential equation presented as equation [11] in MW2016 to

\frac{d \Phi_l}{d t} = \alpha - \mathcal{N} \Phi_l P(t) \qquad \qquad \qquad \qquad \qquad (1)

For convenience we will write

\bar{\mathcal{N}} \equiv \mathcal{N} P(t) \qquad \qquad \qquad \qquad \qquad (2)

so that (1) becomes

\frac{d \Phi_l}{d t} = \alpha - \bar{\mathcal{N}} \Phi_l \qquad \qquad \qquad \qquad \qquad (3)

This addition of monomer injections to equation [11] in MW2016 immediately gives rise to the possibility of oscillatory behaviour in the collector drop model because \bar{\mathcal{N}} also depends on the volume fraction \Phi_l, so there is a feedback loop between them. Suppose we capture this dependence by postulating that the fractional rate of change of the number of collisions \bar{\mathcal{N}} is related to the liquid content volume fraction \Phi_l via some unknown function g, so that

\frac{1}{\bar{\mathcal{N}}} \frac{d \bar{\mathcal{N}}}{d t} = g(\Phi_l) \qquad \qquad \qquad \qquad \qquad (4)

A Maclaurin series approximation of the unknown function g to second order gives

g(\Phi_l) = g(0) + g^{\prime}(0) \Phi_l + \frac{1}{2}g^{\prime \prime}(0) \Phi_l^2 \qquad \qquad \qquad \qquad \qquad (5)

and putting this into (4) gives

\frac{d \bar{\mathcal{N}}}{d t} = g(0)\bar{\mathcal{N}} + g^{\prime}(0) \bar{\mathcal{N}} \Phi_l + \frac{1}{2}g^{\prime \prime}(0) \bar{\mathcal{N}} \Phi_l^2 \qquad \qquad \qquad \qquad \qquad (6)

Equations (3) and (6) now form a coupled ODE system reminiscent of a Lotka-Volterra system, and this is capable of yielding oscillatory behaviour in both the number of collisions \bar{\mathcal{N}} and the volume fraction \Phi_l, depending on the parameters \alpha, g(0), g^{\prime}(0) and g^{\prime \prime}(0). In fact, numerical experiments with (3) and (6) show that the number of collisions \bar{\mathcal{N}} in particular can oscillate violently in the initial stages of the interaction, before adjustment to equilibrium eventually takes place. In all numerical experiments in which oscillations are observed, the amplitudes of the oscillations in \bar{\mathcal{N}} are highest at early times, becoming damped as time increases.

The above figures show classic damped oscillation patterns obtained in one such computation, carried out using MAPLE, with scaled variables representing the number of collisions divided by 10^3, and the volume fraction multiplied by 10^8 (due to the very large difference in magnitude between \bar{\mathcal{N}} and \Phi_l), with \alpha = 0.1, g(0) = -0.00004, g^{\prime}(0) = 4 and g^{\prime \prime}(0) = 400,000,000. (As a check, identical results were obtained using MATLAB and FORTRAN. See the screenshots below).
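A minimal sketch of the coupled system (3) and (6) is given below, using classical fourth-order Runge-Kutta integration and my own illustrative parameter values (\alpha = 0.1, g(0) = -1, g^{\prime}(0) = 4, with the quadratic coefficient g^{\prime \prime}(0) set to zero; these are not the scaled MAPLE values quoted above). With these choices the system has equilibrium \Phi_l = 0.25, \bar{\mathcal{N}} = 0.4, and the trajectory spirals into it through damped oscillations:

```python
# Illustrative parameters (my own choices, not the MAPLE values in the text)
alpha, g0, g1, g2 = 0.1, -1.0, 4.0, 0.0

def rhs(phi, nbar):
    # Equations (3) and (6): dPhi_l/dt and dNbar/dt
    dphi = alpha - nbar * phi
    dnbar = nbar * (g0 + g1 * phi + 0.5 * g2 * phi * phi)
    return dphi, dnbar

# Classical fourth-order Runge-Kutta integration
phi, nbar = 0.2, 0.3          # initial state near (but not at) equilibrium
dt, steps = 0.01, 5000        # integrate to t = 50
phi_path = [phi]
for _ in range(steps):
    k1 = rhs(phi, nbar)
    k2 = rhs(phi + 0.5 * dt * k1[0], nbar + 0.5 * dt * k1[1])
    k3 = rhs(phi + 0.5 * dt * k2[0], nbar + 0.5 * dt * k2[1])
    k4 = rhs(phi + dt * k3[0], nbar + dt * k3[1])
    phi += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    nbar += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    phi_path.append(phi)

# Count equilibrium crossings of Phi_l as evidence of oscillation
crossings = sum(
    1 for a, b in zip(phi_path, phi_path[1:]) if (a - 0.25) * (b - 0.25) < 0
)
```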

The observation that oscillatory behaviour arises so easily after incorporation of monomer injections into the model, due to the resulting Lotka-Volterra type of interaction between \bar{\mathcal{N}} and \Phi_l, might have some connection to the oscillatory behaviour reported in previous work on coagulation systems driven by monomer inputs and large cluster outputs (e.g., Ball, R., Connaughton, C., Jones, P., Rajesh, R., Zaboronski, O., 2012, Collective Oscillations in Irreversible Coagulation Driven by Monomer Inputs and Large-Cluster Outputs, Phys. Rev. Lett. 109, 168304). Further investigation would be needed to clarify this.

The following are screenshots of the MAPLE, MATLAB and FORTRAN programs I used to carry out these experiments.

MAPLE:

MATLAB:

FORTRAN:

A note on mathematics in Latin

Latin has played a significant role in the history and development of mathematics. For example, an interesting article by Richard Oosterhoff on Neo-Latin Mathematics indicates the extent to which Latin has helped to promote the prestige and circulation of mathematical work since medieval times.

There is a huge amount of mathematics written in Latin freely available online. To give one example of a particularly rich source, the Euler Archive provides all 850+ of Euler’s written works in digital form, most of which are written in Latin. The Latin edition of Isaac Newton’s Principia Mathematica is also in the public domain and freely downloadable online, as is a famous prize submission to the Paris Academy in 1861 by Bernhard Riemann, and Carl Friedrich Gauss’s famous Disquisitiones Arithmeticae, along with countless other works by numerous mathematicians in repositories such as wilbourhall.org, the French national library’s Gallica website, archive.org, and others.

Excerpt from Disquisitiones Arithmeticae by Carl Friedrich Gauss

I have compiled some Latin ‘crib notes’ here to provide a quick reminder of basic grammar rules. A useful vocabulary list which is worth learning is DCC’s Latin Core Vocabulary (the 1000 most frequently occurring words in Latin).

Accurate analytical approximation formulae for large-deviation analysis of rain formation

A 2016 paper by M Wilkinson in Physical Review Letters suggests that large-deviation theory is a suitable framework for studying the phenomenon of unexpectedly rapid rain formation in collector-drop collision processes. Wilkinson derives asymptotic approximation formulae for a set of exact large-deviation functions in the collector-drop model, such as the cumulant generating function and the entropy function. The asymptotic approach assumes a large number of water droplet collisions and is motivated by the fact that the exact large-deviation functions are prohibitively difficult to interpret and deal with directly.

Wilkinson uses his asymptotic formulae to obtain further results and also provides some numerical work suggesting that a certain log-density function for the collector-drop model (constructed from his asymptotic approximation formulae) approximates its exact counterpart satisfactorily. However, the numerical work does not test the accuracy of the individual asymptotic approximation formulae directly against their exact large-deviation theory counterparts. When these direct checks are carried out, they reveal that the asymptotic formulae are in fact rather inaccurate, even for very large numbers of collisions; their individual inaccuracy is masked by their incorporation into log-density functions in Wilkinson’s numerical work. This inaccuracy, together with some assumptions underlying their derivation, severely limits their applicability.

The purpose of the present note is to point out that it is quite possible to develop accurate analytical (i.e., non-asymptotic) approximation formulae for the large-deviation theory functions in the collector-drop model which also preserve the forms of the leading order power terms in Wilkinson’s asymptotic formulae. An analytical approximation approach can be developed based on an Euler-Maclaurin formula. The resulting analytical formulae are extremely accurate and valid for all relevant numbers of collisions and time scales, producing numerical results which are essentially indistinguishable from the exact function values of their large-deviation theory counterparts.

My full paper can be found on arXiv.org.

A Phys. Rev. Letter by M Wilkinson on large deviation analysis of rapid-onset rainfall

The following is a record of some notes I made on the main mathematical developments in Michael Wilkinson’s 2016 Physical Review Letter on the large deviation analysis of rapid-onset rain showers (reference: Wilkinson, M., 2016, Large Deviation Analysis of Rapid Onset of Rain Showers, Phys. Rev. Lett. 116, 018501). This publication will be referred to as MW2016 herein. The published Letter has an accompanying Supplement which will also be referred to here. The Supplement provides additional details of some intricate derivations underlying key results presented in the main Letter.

One of the main contributions of MW2016 is that it shows that large deviation theory is the correct framework for analysing the implications of a collector-drop model of rapid rain formation which was discussed in a previous paper by Kostinski and Shaw (reference: Kostinski, A., Shaw, R., 2005, Fluctuations and Luck in Droplet Growth by Coalescence, Bull. Am. Meteorol. Soc. 86, 235). In this collector-drop model, a runaway droplet falls and coalesces at a rapidly increasing rate with many smaller water particles. The collision rates are assumed to increase according to a simple power law function of the number of collisions. In addition to formulating a volume fraction cut-off mechanism for raindrop size, MW2016 also successfully uses large deviation analysis in conjunction with asymptotic approximations to study the link between the time scale for rapid onset of rain showers and the number of water particle collisions needed for raindrop formation. In the present article, I provide a mathematical review of the key points in MW2016 and also detail some errata in the published Letter which were identified in the review process.

Mathematical review

In the collector-drop model, a runaway droplet falls and collides with a large number \mathcal{N} of small droplets of radius a_0 according to an inhomogeneous Poisson process. If the random variable t_n represents the inter-collision time leading up to the n-th collision, the time for a droplet to experience runaway growth is then the random variable

T = \sum_{n=1}^{\mathcal{N}} t_n \quad \quad \quad \quad \quad (1)

The inter-collision times t_n are independent exponentially distributed random variables with

P_n(t_n) = R_n \exp(-R_n t_n) \quad \quad \quad \quad \quad (2)

where R_n is the number of collisions per second, i.e., the collision rate, by the time of the n-th collision. The collision rate R_n is modelled as

R_n = R_1 f(n) \quad \quad \quad \quad \quad (3)

where

R_1 = \epsilon N_0 \pi(a_0 + a_1)^2 \alpha (a_1^2 - a_0^2) \quad \quad \quad \quad \quad (4)

is the collision rate of a drop of radius a_1 with droplets of radius a_0. Here, \epsilon is the coalescence efficiency, N_0 is the number density of microscopic water droplets per m^3, \pi(a_0 + a_1)^2 is the effective cross-section area, and \alpha (a_1^2 - a_0^2) is the relative velocity, where \alpha is the constant of proportionality between the terminal velocity of a water particle and the square of its radius.

The function f(n) in (3) characterises the dependence of R_n on the number of collisions. In the Kostinski and Shaw 2005 paper referenced above, f(n) = n^2, but we can have f(n) = n^{4/3} when \epsilon \approx 1 and a_n \gg a_1. MW2016 treats f(n) as a simple power law, so that

R_n = R_1 n^{\gamma} \quad \quad \quad \quad \quad (5)

The average inter-collision time leading to the n-th collision is then given by

\langle t_n \rangle \equiv \tau_n = \frac{1}{R_n} = \frac{1}{R_1} n^{-\gamma} \quad \quad \quad \quad \quad (6)

and the mean time for explosive growth is obtained as

\langle T \rangle = \sum_{n=1}^{\mathcal{N}} \tau_n = \frac{1}{R_1} \sum_{n=1}^{\mathcal{N}} n^{-\gamma} \quad \quad \quad \quad \quad (7)

When \gamma > 1, the mean time for explosive growth converges as \mathcal{N} \rightarrow \infty, giving

\lim_{\mathcal{N} \rightarrow \infty} \langle T \rangle = \frac{1}{R_1} \zeta(\gamma) \quad \quad \quad \quad \quad (8)

where \zeta(\gamma) is Riemann’s zeta function.
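The convergence in (8) is easy to verify numerically; the sketch below uses \gamma = 2, so that \zeta(2) = \pi^2/6, and time units with R_1 = 1 (my own choices for illustration):

```python
import math

gamma = 2            # power-law exponent, so zeta(2) = pi^2 / 6
R1 = 1.0             # time units chosen so that R_1 = 1
zeta_2 = math.pi**2 / 6.0

# Partial sums of equation (7): mean runaway time for finite N
def mean_T(N):
    return (1.0 / R1) * sum(n**-gamma for n in range(1, N + 1))

approx = mean_T(1_000_000)
error = zeta_2 - approx     # remainder is roughly 1/N
```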

A cut-off mechanism for raindrop size is obtained by considering the liquid water content of a cloud, expressed as a volume fraction \Phi_l. Since N_0 is the number of water particles of radius a_0 per m^3, and each particle is a sphere with volume \frac{4}{3} \pi a_0^3, the volume fraction of liquid water per m^3 of cloud is given by

\Phi_l \approx N_0 \cdot \frac{4}{3} \pi a_0^3 \quad \quad \quad \quad \quad (9)

For example, if each droplet is initially of radius a_0 = 10 \mu m = 10^{-5}m, and there are N_0 \approx 10^9 such droplets per m^3 of cloud, then \Phi_l \approx 10^{-6}.

In terms of the probability density function P(t) of the runaway growth time T defined in (1), the fraction of droplets undergoing runaway growth between times t and t + \delta t is given by

P(t) \delta t \quad \quad \quad \quad \quad (10)

For example, if P(t) \delta t \approx 10^{-6}, then ‘one in a million’ water droplets will undergo runaway growth between times t and t + \delta t.

If the volume of each runaway droplet increases by a factor of \mathcal{N}, the volume fraction of liquid water removed from the cloud per m^3 over the time interval of length \delta t will then be

\mathcal{N} \Phi_l P(t) \delta t \quad \quad \quad \quad \quad (11)

For example, if \mathcal{N} = 10^3, \Phi_l \approx 10^{-6} and P(t) \delta t \approx 10^{-6}, then the volume fraction of liquid water content per m^3 is reduced by 10^3 \cdot 10^{-6} \cdot 10^{-6} = 10^{-9} over the time interval of length \delta t. This represents a \frac{10^{-9}}{10^{-6}}\cdot 10^2 = 0.1 per cent reduction in \Phi_l.

As \delta t \rightarrow 0, the instantaneous change of the liquid water content of a cloud per m^3 due to runaway growth of droplets is thus

\frac{d \Phi_l}{d t} = -\mathcal{N} \Phi_l P(t) \quad \quad \quad \quad \quad (12)

Therefore, the instantaneous percentage rate of decline of the volume fraction is

-\frac{d \ln \Phi_l}{d t} = \mathcal{N} P(t) \quad \quad \quad \quad \quad (13)

Integrating (13) over a time interval then gives a percentage loss \mu in that interval:

\mu = \mathcal{N}\int_0^t dt^{\prime} P(t^{\prime}) \quad \quad \quad \quad \quad (14)

The onset of a shower is determined by the criterion that \mu, the fraction of the liquid water content of a cloud removed by raindrop formation as a result of \mathcal{N} particle collisions, be significant. We are now interested in finding the time scale t^{*} over which this significant reduction in \Phi_l occurs. In accordance with (14), this t^{*} satisfies

\mathcal{N^{*}}\int_0^{t^{*}} dt^{\prime} P(t^{\prime}) = 1 \quad \quad \quad \quad \quad (15)

where \mathcal{N^{*}} \equiv \mathcal{N}/\mu.

To make progress from here, MW2016 approximates P(t) in (15) using the large deviation principle, which is explained in detail in, e.g., an article by Touchette (reference: Touchette, H., 2009, The large deviation approach to statistical mechanics, Phys. Rep. 478, 1). With T defined as in (1) and \langle T \rangle given by (7), we implement the large deviation principle here by approximating the probability P(T)dT at some value \bar{T} in the small-value tail of the distribution of T as

P(\bar{T})d\bar{T} = P(T \in [\bar{T}, \bar{T} + d\bar{T}])

where P(\bar{T}) is given by

P(\bar{T}) = \frac{1}{\langle T \rangle} \exp[-J(\tau)] \quad \quad \quad \quad \quad (16)

with \tau \equiv \bar{T}/\langle T \rangle. The function J(\tau) in (16) is referred to as an entropy function or rate function in large deviation theory, and is defined below. Using (16) in (15) we get the condition for the onset of rainfall as

\mathcal{N^{*}} \int_0^{t^{*}} \frac{d \bar{T}}{\langle T \rangle} \exp \bigg[-J\bigg(\frac{\bar{T}}{\langle T \rangle}\bigg)\bigg] = \mathcal{N^{*}} \int_0^{t^{*}} d \tau^{\prime} \exp [-J(\tau^{\prime})] = 1 \quad \quad \quad \quad \quad (17)

with \tau^{\prime} \equiv \bar{T}/\langle T \rangle. A saddle-point approximation for the integral here at the maximum value of the integrand, \exp[-J(\tau^{*})], with \tau^{*} \equiv \bar{T}^{\ast}/\langle T \rangle, then gives the condition for the onset of rainfall as

\mathcal{N}^{*} \exp[-J(\tau^{*})] = 1 \quad \quad \quad \quad \quad (18)

Taking logs, we can write the condition alternatively as

\ln \mathcal{N}^{*} = J(\tau^{*}) \quad \quad \quad \quad \quad (19)

What is required now is to find an expression for \tau^{*} in terms of \ln \mathcal{N}^{*}, using (19). To obtain this, it is necessary to find an expression for the rate function J(\tau) for the random sum defined by (1) and (2).

Consider a single inter-collision time t_n with density (2). Following the approach in the article by Touchette cited above for the purposes of the large deviation analysis here, we obtain the log of the Laplace transform of the random variable t_n, often referred to as its scaled cumulant generating function, as

\lambda_n (k) = -\ln \langle \text{e}^{-kt_n} \rangle = -\ln\bigg(\int_0^{\infty} d t_n \text{e}^{-kt_n} R_n \text{e}^{-R_n t_n}\bigg) = \ln\bigg(1 + \frac{k}{R_n}\bigg) \quad \quad \quad \quad \quad (20)

To ensure the logarithm on the right-hand side is always defined, we need to restrict the admissible values of k to the interval (-R_1, \infty). Inter-collision times are assumed to be independent, so for a sum T of inter-collision times such as (1), we get the log of the Laplace transform of T as

\lambda(k) = -\ln \langle \text{e}^{-kT} \rangle

= -\ln \bigg \langle \prod_{n=1}^{\mathcal{N}} \text{e}^{-kt_n} \bigg \rangle = -\ln \prod_{n=1}^{\mathcal{N}}\langle \text{e}^{-kt_n} \rangle = \sum_{n=1}^{\mathcal{N}} \ln\bigg(1 + \frac{k}{R_n}\bigg) \quad \quad \quad \quad \quad (21)

This is differentiable with respect to k for all k \in (-R_1, \infty), so by the Gärtner-Ellis Theorem of large deviation analysis, discussed in Touchette, the rate function in (16) is given by

J(\tau) = \sup_k \{\lambda(k) - k\tau\} \quad \quad \quad \quad \quad (22)

The expression on the right-hand side of (22) is referred to as the Fenchel-Legendre transform of \lambda(k).
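Equation (21) can also be checked by direct Monte Carlo simulation of the random sum (1). The sketch below (my own check, with illustrative values R_n = n^2, \mathcal{N} = 5 and k = 1) compares the sample estimate of -\ln \langle e^{-kT} \rangle with the closed form in (21):

```python
import math
import random

random.seed(42)
N, k = 5, 1.0
R = [n**2 for n in range(1, N + 1)]    # collision rates R_n = n^2

# Closed form (21): lambda(k) = sum_n ln(1 + k/R_n)
lam_exact = sum(math.log(1.0 + k / r) for r in R)

# Monte Carlo estimate of -ln< e^{-kT} >, with T a sum of exponentials
samples = 100_000
acc = 0.0
for _ in range(samples):
    T = sum(random.expovariate(r) for r in R)   # t_n ~ Exp(rate R_n)
    acc += math.exp(-k * T)
lam_mc = -math.log(acc / samples)
```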

Note that the expression in (22) is the negative of the expression usually given in large deviation theory. This is because the outer function in the formulation of the log of the Laplace transform in (20) above is the negative of the expression used, e.g., in Touchette, and this formulation also has the parameter k negated. The alternative formulation would be

\lambda_n(k) = \ln \langle \text{e}^{kt_n} \rangle = \ln\bigg(\int_0^{\infty} d t_n \text{e}^{kt_n} R_n \text{e}^{-R_n t_n}\bigg) = -\ln\bigg(1 - \frac{k}{R_n}\bigg)

and the associated entropy function for this would then be the more conventional

J(\tau) = \sup_k \{k\tau - \lambda(k)\}

In this case, the admissible values of k would be those in the interval (-\infty, R_1). Essentially, the signs of \lambda(k) and k have been negated in the formulation in MW2016, as this is more convenient for the analysis.
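As a quick numerical sanity check on (21), the following sketch compares a Monte Carlo estimate of -\ln \langle e^{-kT} \rangle with the sum \sum_n \ln(1 + k/R_n). The rates R_n and the value of k below are arbitrary illustrative choices of my own, not values from MW2016.

```python
import math
import random

# Monte Carlo sanity check of (21): for independent exponential inter-collision
# times t_n with rates R_n, the quantity -ln<e^{-kT}> should equal
# sum_n ln(1 + k/R_n).  The rates and k below are arbitrary illustrative values.
random.seseed = None  # (no-op placeholder removed below)
random.seed(1)

R = [1.0, 2.0, 4.0, 8.0, 16.0]
k = 0.5                      # admissible, since k > -min(R_n)

# Exact value from (21)
lam_exact = sum(math.log(1.0 + k / Rn) for Rn in R)

# Monte Carlo estimate of -ln <e^{-kT}>, with T = sum of the t_n
samples = 200_000
acc = 0.0
for _ in range(samples):
    T = sum(random.expovariate(Rn) for Rn in R)   # one realisation of T
    acc += math.exp(-k * T)
lam_mc = -math.log(acc / samples)

print(lam_exact, lam_mc)
```

The two values agree to within Monte Carlo error, which is consistent with the factorisation of the expectation used in (21).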

Differentiating the bracketed expression in (22) with respect to k and setting equal to zero we obtain

\lambda^{\prime}(k) - \tau = 0 \quad \quad \quad \quad \quad (23)

But

\lambda^{\prime}(k) = \sum_{n=1}^{\mathcal{N}} \frac{\big(\frac{1}{R_n}\big)}{1 + \frac{k}{R_n}} = \sum_{n=1}^{\mathcal{N}} \frac{1}{R_n + k} \quad \quad \quad \quad \quad (24)

Putting this in (23) we get

\tau = \sum_{n=1}^{\mathcal{N}} \frac{1}{R_n + k} \quad \quad \quad \quad \quad (25)

This implicitly gives an optimal value of k as a function of \tau, say k^{*}(\tau). If this could be obtained explicitly, inserting it into the right-hand side of (22) would give the required rate function J(\tau):

J(\tau) = \lambda(k^{*}(\tau)) - k^{*}(\tau) \cdot \tau \quad \quad \quad \quad \quad (26)
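As a numerical illustration, the stationarity condition (25) can be solved for k^{*} by bisection for a finite \mathcal{N}, and the result inserted into (26). The sketch below assumes the rate model R_n = R_1 n^{\gamma} from (5) with R_1 = 1; the values of \gamma and \mathcal{N} are illustrative choices of my own.

```python
import math

# Solve the stationarity condition (25) for k*(tau) by bisection and insert
# it into (26) to get J(tau).  Rate model R_n = n^gamma (R_1 = 1) as in (5);
# N and gamma are illustrative choices.
gamma = 2.0
N = 50
R = [n ** gamma for n in range(1, N + 1)]

def lam(k):
    # lambda(k) as in (21)
    return sum(math.log(1.0 + k / Rn) for Rn in R)

def lam_prime(k):
    # lambda'(k) as in (24); strictly decreasing in k
    return sum(1.0 / (Rn + k) for Rn in R)

def k_star(tau):
    # bisection on lam_prime(k) = tau over k in (-R_1, infinity)
    lo, hi = -R[0] + 1e-12, 1e12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lam_prime(mid) > tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def J(tau):
    ks = k_star(tau)
    return lam(ks) - ks * tau   # equation (26)

mean_T = lam_prime(0.0)   # <T> = sum_n 1/R_n, where J should vanish
print(J(mean_T), J(0.5 * mean_T))
```

At \tau = \langle T \rangle the optimal k^{*} is zero and J vanishes, while for \tau below the mean J is strictly positive, as expected for a lower-tail large deviation.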

However, in practice it is not possible to obtain an exact expression for J(\tau) in this way. Instead, MW2016 obtains asymptotic expressions for J(\tau) which are valid for small \tau. This is achieved by reparameterising J(\tau) using a scaled variable \kappa, defined as

\kappa = \frac{k^{*}}{R_1} \quad \quad \quad \quad \quad (27)

From (7) and (8) we get, as \mathcal{N} \rightarrow \infty,

\tau = \frac{\bar{T}}{\langle T \rangle} = \frac{R_1 \bar{T}}{\zeta(\gamma)} \quad \quad \quad \quad \quad (28)

Using (5), (27) and (28) in (25) we get

\tau(\kappa) = \frac{R_1}{\zeta(\gamma)} \sum_{n=1}^{\infty} \frac{1}{R_1 n^{\gamma} + R_1 \kappa} = \frac{1}{\zeta(\gamma)} \sum_{n=1}^{\infty} \frac{1}{\kappa + n^{\gamma}} \quad \quad \quad \quad \quad (29)

The expression on the right-hand side of (29) appears in equation (18) in MW2016.
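The series in (29) is easy to evaluate numerically. For the illustrative case \gamma = 2, where \zeta(2) = \pi^2/6, the sum also has the known closed form \sum_{n \geq 1} 1/(n^2 + \kappa) = (\pi\sqrt{\kappa}\coth(\pi\sqrt{\kappa}) - 1)/(2\kappa), which the truncated sum can be checked against:

```python
import math

# Check of the series in (29) for the illustrative case gamma = 2, where
# zeta(2) = pi^2/6 and the sum has the known closed form
#   sum_{n>=1} 1/(n^2 + kappa) = (pi*sqrt(kappa)*coth(pi*sqrt(kappa)) - 1)/(2*kappa).
gamma = 2.0
zeta_gamma = math.pi ** 2 / 6.0

def tau_series(kappa, terms=1_000_000):
    # direct truncation of the right-hand side of (29)
    return sum(1.0 / (kappa + n ** gamma) for n in range(1, terms + 1)) / zeta_gamma

def tau_closed(kappa):
    s = math.pi * math.sqrt(kappa)
    coth = math.cosh(s) / math.sinh(s)
    return (s * coth - 1.0) / (2.0 * kappa) / zeta_gamma

print(tau_series(1.0), tau_closed(1.0))
```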

We now write (21) in reparameterised form as \mathcal{N} \rightarrow \infty as

S(\kappa) = \sum_{n=1}^{\infty} \ln \bigg(1 + \frac{R_1 \kappa}{R_1 n^{\gamma}}\bigg) = \sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma}) \quad \quad \quad \quad \quad (30)

We also write the product term in (26), using k^{*} = R_1 \kappa from (27) together with the time T = \zeta(\gamma)\tau/R_1 corresponding to the dimensionless \tau in (28):

k^{*} \cdot T = R_1 \kappa \cdot \frac{\zeta(\gamma) \tau}{R_1} = \zeta(\gamma) \kappa \tau \quad \quad \quad \quad \quad (31)

Then we re-express (26) in the form

J(\kappa)= S(\kappa) - \zeta(\gamma) \kappa \tau(\kappa) \quad \quad \quad \quad \quad (32)

This appears as equation (19) in MW2016.

Following the approach in the Supplement to MW2016, the next stage in the development is to obtain an asymptotic expression for the sum in (30) as \kappa \rightarrow \infty. (It is assumed from now on that R_1 = 1, to simplify the expressions.) Upon differentiating this, we will then obtain a corresponding asymptotic expression for \tau in accordance with (29), which can be inverted to obtain a small-\tau approximation for \kappa, and subsequently a small-\tau approximation for J(\tau). Finally, this will be used in (19) to obtain an expression for \tau^{*} in terms of \ln\mathcal{N}^{*}.

Begin by approximating the sum in (30) as an integral. Observe that

\int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})

= \int_0^1 dn \ln(1 + \kappa n^{-\gamma}) + \int_1^2 dn \ln(1 + \kappa n^{-\gamma}) + \int_2^3 dn \ln(1 + \kappa n^{-\gamma}) + \cdots

= \int_1^0 -dx \ln(1 + \kappa (1-x)^{-\gamma})

+ \int_1^0 -dx \ln(1 + \kappa (2-x)^{-\gamma}) + \int_1^0 -dx \ln(1 + \kappa (3-x)^{-\gamma}) + \cdots

= \sum_{n=1}^{\infty} \int_0^1 dx \ln(1 + \kappa (n-x)^{-\gamma})

(The third line is obtained by making the change of variable n = \overline{m} - x in each integral in the second line, with \overline{m} representing the fixed positive integer in the upper limit, so dn = -dx, and x = 0 when n = \overline{m} whereas x = 1 when n = \overline{m} - 1). Therefore

\sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma}) \equiv \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})

+ \bigg\{\sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma}) - \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})\bigg\}

= \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma}) + \sum_{n=1}^{\infty} \int_0^1 dx \bigg \{ \ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})\bigg\}

In MW2016 this is written as

S = S_0 + \sum_{n=1}^{\infty} \triangle S_n \quad \quad \quad \quad \quad (33)

where

S = \sum_{n=1}^{\infty} \ln(1 + \kappa n^{-\gamma})

S_0 = \int_0^{\infty} dn \ln(1 + \kappa n^{-\gamma})

\triangle S_n = \int_0^1 dx \bigg \{\ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})\bigg\} \quad \quad \quad \quad \quad (34)

Write m = n - x. Then as x \rightarrow 0 we have

\frac{\ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})}{x}

= \frac{\ln(1 + \kappa (m+x)^{-\gamma}) - \ln(1 + \kappa m^{-\gamma})}{x}

\rightarrow \frac{\partial \ln(1 + \kappa n^{-\gamma})}{\partial n} = \frac{-\gamma \kappa}{n(\kappa + n^{\gamma})}

Therefore, when n \gg 1, we can approximate each \triangle S_n by using

\triangle S_n = \int_0^1 dx x \bigg\{\frac{\ln(1 + \kappa n^{-\gamma}) - \ln(1 + \kappa (n-x)^{-\gamma})}{x}\bigg\}

\sim \frac{\partial \ln(1 + \kappa n^{-\gamma})}{\partial n} \int_0^1 dx x

= \frac{-\gamma \kappa}{2n(\kappa + n^{\gamma})} \quad \quad \quad \quad \quad (35)

When \kappa n^{-\gamma} \gg 1 but n is not necessarily large, we can write

\triangle S_n \sim \int_0^1 dx \ln[\kappa n^{-\gamma}(1 + n^{\gamma}/\kappa)] - \int_0^1 dx \ln[\kappa (n-x)^{-\gamma}(1 + (n - x)^{\gamma}/\kappa)]

= -\gamma \int_0^1 dx \{ \ln(n) - \ln(n-x)\} + O(\kappa^{-1}) \quad \quad \quad \quad \quad (36)

and we can ignore the O(\kappa^{-1}) component as \kappa \rightarrow \infty. Next, make the change of variable y = n - x in the integral in (36). Then when x = 0, y = n, when x = 1, y = n-1, and dy = -dx. We get

\triangle S_n \sim \gamma \int_n^{n-1} dy \{\ln(n) - \ln(y)\}

= -\gamma \int_{n-1}^n dy \{\ln(n) - \ln(y)\}

= -\gamma\bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] \quad \quad \quad \quad \quad (37)
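The approximation (37) can be spot-checked against the exact integral in (34), evaluated by simple midpoint quadrature. The n = 1 term needs the limit 0 \cdot \ln 0 = 0, so the check below starts at n = 2; the values of \gamma and \kappa are illustrative choices.

```python
import math

# Spot-check of the approximation (37) for Delta S_n in the regime
# kappa * n^{-gamma} >> 1, against the exact integral (34) evaluated by
# midpoint quadrature.  gamma and kappa are illustrative choices.
gamma = 2.0
kappa = 1.0e8

def delta_S_exact(n, steps=20_000):
    # midpoint rule for the integral in (34)
    h = 1.0 / steps
    top = math.log(1.0 + kappa * n ** (-gamma))
    acc = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        acc += top - math.log(1.0 + kappa * (n - x) ** (-gamma))
    return acc * h

def delta_S_approx(n):
    # the asymptotic form (37)
    return -gamma * ((n - 1) * math.log((n - 1) / n) + 1.0)

print(delta_S_exact(2), delta_S_approx(2))
```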

In order to evaluate S using (33), MW2016 splits the summation over \triangle S_n into two parts: choose an integer M such that \kappa M^{-\gamma} \gg 1 and let M \rightarrow \infty as \kappa \rightarrow \infty. Use (35) for \triangle S_n when n > M, and (37) when n \leq M. We get

S \sim S_0 - \gamma \sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] - \gamma \kappa \sum_{n = M+1}^{\infty} \frac{1}{2n(\kappa + n^{\gamma})} \quad \quad \quad \quad \quad (38)

We now approximate the second summation using an integral:

- \gamma \kappa \sum_{n = M+1}^{\infty} \frac{1}{2n(\kappa + n^{\gamma})} \sim \frac{-\gamma \kappa}{2} \int_M^{\infty} dn \frac{1}{n(\kappa + n^{\gamma})} = \frac{-\gamma}{2} \int_M^{\infty} dn \frac{1}{n\bigg(1 + \frac{n^{\gamma}}{\kappa}\bigg)} \quad \quad \quad \quad \quad (39)

Make the change of variable y = n^{\gamma}/\kappa. Then dy = \frac{1}{\kappa} \gamma n^{\gamma - 1} dn, so dn = \frac{1}{y} \frac{n}{\gamma} dy, and when n = M, y = M^{\gamma}/\kappa, while when n = \infty, y = \infty. The integral in (39) becomes

\frac{-\gamma}{2} \int_{M^{\gamma}/\kappa}^{\infty} \frac{1}{y} \frac{n}{\gamma} dy \frac{1}{n(1+y)}

= -\frac{1}{2} \int_{M^{\gamma}/\kappa}^{\infty} dy \frac{1}{y(1+y)}

= -\frac{1}{2} \int_{M^{\gamma}/\kappa}^{\infty} dy \bigg \{ \frac{1}{y} - \frac{1}{y+1} \bigg \}

= \frac{1}{2} \ln\bigg(\frac{M^{\gamma}}{\kappa}\bigg) - \frac{1}{2} \ln\bigg(\frac{M^{\gamma}}{\kappa} + 1\bigg)

= \frac{1}{2} \ln \bigg(\frac{\frac{M^{\gamma}}{\kappa}}{\frac{M^{\gamma}}{\kappa}+ 1}\bigg)

= -\frac{1}{2} \ln(1 + \kappa M^{-\gamma})

= -\frac{1}{2} \ln \bigg[\kappa M^{-\gamma} \bigg(1 + \frac{1}{\kappa M^{-\gamma}}\bigg)\bigg]

= -\frac{1}{2} \ln(\kappa) - \frac{1}{2} \ln(M^{-\gamma}) - \frac{1}{2} \ln \bigg(1 + \frac{1}{\kappa M^{-\gamma}}\bigg)

= -\frac{1}{2} \ln(\kappa) + \frac{\gamma}{2} \ln(M) + O(\kappa^{-1})

Putting this in (38) we get

S \sim S_0 - \gamma \sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] -\frac{1}{2} \ln(\kappa) + \frac{\gamma}{2} \ln(M) + O(\kappa^{-1}) \quad \quad \quad \quad \quad (40)

We can write this as

S \sim S_0 - \gamma \bigg\{\sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] - \frac{1}{2} \ln(M)\bigg\}

-\frac{1}{2} \ln(\kappa) + O(\kappa^{-1}) \quad \quad \quad \quad \quad (41)

Taking the limit as M \rightarrow \infty, the term in curly brackets in (41) converges to a constant C, and we get

S \sim S_0 - \frac{1}{2} \ln(\kappa) - \gamma C + O(\kappa^{-1}) \quad \quad \quad \quad \quad (42)

This is equation (10) in the Supplement to MW2016.
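For \gamma = 2 the asymptotic (42) can be checked against closed forms: the classical product identity \prod_{n \geq 1}(1 + \kappa/n^2) = \sinh(\pi\sqrt{\kappa})/(\pi\sqrt{\kappa}) gives S(\kappa) exactly, and S_0 = \int_0^{\infty} dn \ln(1 + \kappa/n^2) = \pi\sqrt{\kappa}. Stirling's formula suggests the curly-bracket constant in (41) is \ln\sqrt{2\pi}, so that \gamma C = \ln(2\pi) in this case; this is my own cross-check, not a value taken from MW2016.

```python
import math

# gamma = 2 check of (42) using the product identity
#   prod_{n>=1}(1 + kappa/n^2) = sinh(pi*sqrt(kappa))/(pi*sqrt(kappa)),
# with S_0 = pi*sqrt(kappa).  Stirling suggests the bracket constant in (41)
# is ln(sqrt(2*pi)), i.e. gamma*C = ln(2*pi) here (my own cross-check).

def S_exact(kappa):
    # ln(sinh(s)/s) with s = pi*sqrt(kappa), written to avoid overflow at large s
    s = math.pi * math.sqrt(kappa)
    return s + math.log1p(-math.exp(-2.0 * s)) - math.log(2.0 * s)

# sanity check of the product identity against the truncated sum in (30)
S_trunc = sum(math.log(1.0 + 10.0 / n ** 2) for n in range(1, 1_000_001))

def remainder(kappa):
    # S - S_0 + (1/2) ln(kappa), which by (42) should tend to -gamma*C
    return S_exact(kappa) - math.pi * math.sqrt(kappa) + 0.5 * math.log(kappa)

print(S_trunc, S_exact(10.0), remainder(1.0e6), -math.log(2.0 * math.pi))
```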

From (29), we see that differentiating S(\kappa) in (30) gives a time T(\kappa) = \zeta(\gamma) \tau(\kappa). The corresponding asymptotic expression for T(\kappa) is obtained by differentiating (42). We get

\frac{\partial S}{\partial \kappa} \sim \int_0^{\infty} dn \frac{1}{\kappa} \bigg(\frac{1}{1 + \frac{n^{\gamma}}{\kappa}}\bigg) - \frac{1}{2\kappa} \quad \quad \quad \quad \quad (43)

In the integral in (43), make the change of variable x = \frac{n^{\gamma}}{\kappa}, so that (\kappa x)^{1/\gamma} = n. Then

dn = \frac{1}{\gamma} (\kappa x)^{\frac{1}{\gamma} -1} \cdot \kappa dx = \frac{1}{\gamma} \kappa^{-\frac{\gamma -1}{\gamma}} \cdot x^{-\frac{\gamma -1}{\gamma}} \cdot \kappa dx

The integral in (43) then becomes

\kappa^{-\frac{\gamma -1}{\gamma}} \cdot \frac{1}{\gamma} \int_0^{\infty} dx \frac{x^{-\frac{\gamma -1}{\gamma}}}{1+x}

so we get

\frac{\partial S}{\partial \kappa} = T(\kappa) \sim A(\gamma) \cdot \kappa^{-\frac{\gamma -1}{\gamma}} - \frac{1}{2\kappa} \quad \quad \quad \quad \quad (44)

with

A(\gamma) \equiv \frac{1}{\gamma} \int_0^{\infty} dx \frac{x^{-\frac{\gamma -1}{\gamma}}}{1+x} \quad \quad \quad \quad \quad (45)

(MW2016 notes that the integral in (45) is actually the beta function B\big(1-\frac{1}{\gamma}, \frac{1}{\gamma}\big)). Integrating the first term on the right-hand side of (44), which is \frac{\partial S_0}{\partial \kappa}, gives

S_0 = \gamma A(\gamma) \kappa^{\frac{1}{\gamma}} \quad \quad \quad \quad \quad (46)
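A quick check of (44)-(46) for the illustrative case \gamma = 2: the beta function gives A(2) = \frac{1}{2}B(\frac{1}{2}, \frac{1}{2}) = \pi/2, and T(\kappa) = \sum_n 1/(n^2 + \kappa) has the known closed form (\pi\sqrt{\kappa}\coth(\pi\sqrt{\kappa}) - 1)/(2\kappa), so the asymptotic (44) can be compared against it directly.

```python
import math

# Check of (44)-(45): A(gamma) = (1/gamma) * B(1 - 1/gamma, 1/gamma); for
# gamma = 2 this is pi/2.  T(kappa) for gamma = 2 has the known closed form
# (pi*sqrt(kappa)*coth(pi*sqrt(kappa)) - 1)/(2*kappa), so (44) can be tested.

def A(gamma):
    a, b = 1.0 - 1.0 / gamma, 1.0 / gamma
    # beta function via gamma functions
    return math.gamma(a) * math.gamma(b) / (gamma * math.gamma(a + b))

def T_closed(kappa):
    # exact T(kappa) for gamma = 2
    s = math.pi * math.sqrt(kappa)
    return (s * (math.cosh(s) / math.sinh(s)) - 1.0) / (2.0 * kappa)

def T_asym(kappa):
    # right-hand side of (44) for gamma = 2
    return A(2.0) * kappa ** -0.5 - 1.0 / (2.0 * kappa)

print(A(2.0), T_closed(100.0), T_asym(100.0))
```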

Substituting for S_0 in (42) with the expression in (46) gives

S \sim \gamma A(\gamma) \kappa^{\frac{1}{\gamma}} - \frac{1}{2} \ln(\kappa) - \gamma C + O(\kappa^{-1}) \quad \quad \quad \quad \quad (47)

This is equation (21) in MW2016.

To obtain an explicit expression for the rate function J(\tau), we must now invert (44) to express the parameter \kappa in terms of the time T, and hence in terms of the dimensionless time given by \tau = \frac{T}{\zeta(\gamma)}. To this end, observe that from (44) we get

T \sim A \kappa^{\frac{1}{\gamma}} \kappa^{-1} - \frac{1}{2} \kappa^{-1} = \kappa^{-1} \bigg(A \kappa^{\frac{1}{\gamma}} - \frac{1}{2}\bigg) \quad \quad \quad \quad \quad (48)

Therefore

\kappa T \sim A \kappa^{\frac{1}{\gamma}}

so

\kappa \sim \bigg(\frac{T}{A}\bigg)^{-\frac{\gamma}{\gamma-1}} = \bigg(\frac{A}{\zeta(\gamma)}\bigg)^{\frac{\gamma}{\gamma-1}}\bigg(\frac{T}{\zeta(\gamma)}\bigg)^{-\frac{\gamma}{\gamma-1}} = b\tau^{-\frac{\gamma}{\gamma-1}} \quad \quad \quad \quad \quad (49)

with b \equiv \big(\frac{A}{\zeta(\gamma)}\big)^{\frac{\gamma}{\gamma-1}}. Now, using (47) and (48), the rate function is given by

J(\tau) = S(\kappa^{*}) - T(\kappa^{*})\kappa^{*}

= \gamma A(\gamma) \kappa^{\frac{1}{\gamma}} - \frac{1}{2} \ln(\kappa) - \gamma C -A \kappa^{\frac{1}{\gamma}} + \frac{1}{2}

= (\gamma - 1) A\kappa^{\frac{1}{\gamma}} - \frac{1}{2} \ln(\kappa) - \bigg(\gamma C - \frac{1}{2}\bigg) \quad \quad \quad \quad \quad (50)

Using (49) in (50) we get

J(\tau) = (\gamma - 1) A b^{\frac{1}{\gamma}} \tau^{-\frac{1}{\gamma-1}} + \frac{1}{2} \frac{\gamma}{\gamma - 1} \ln(\tau) + D \quad \quad \quad \quad \quad (51)

where D is another constant.

We are now finally in a position to obtain the result given in (27) in MW2016. From (51) we see that to leading order we have

J(\tau^{*}) \propto [\tau^{*}]^{-\frac{1}{\gamma-1}}

so from (19) we obtain

\tau^{*} \propto [\ln \mathcal{N}^{*}]^{-(\gamma - 1)} \quad \quad \quad \quad \quad (52)

Errata in MW2016

In the course of working with MW2016, some errata have come to light which I will record here.

In a private written communication with Michael Wilkinson, I pointed out that an error was made in the Supplement to MW2016 which details the asymptotic analysis. The issue arises in the Supplement when combining equations (17) and (19) there to get equation (20), which is then presented as equation (24) in the main MW2016 text. Equation (17) for \kappa in the Supplement has a leading order term involving the dimensionless time variable \tau with an exponent of the form -\gamma/(\gamma-1). Using (17) in equation (19) for the entropy, we should get the leading order \tau term appearing with exponent -1/(\gamma-1), because \kappa in (19) appears raised to the power 1/\gamma. However, the exponent is incorrectly given as -\gamma/(\gamma-1). This error affected the time scale relationship reported in (27) in the main MW2016 text. Combining the expression for the entropy with the condition (12) in the main MW2016 text for the onset of a shower, \tau should be related to \ln \mathcal{N}^{\ast} with an exponent -(\gamma-1) on the latter, i.e., the reciprocal of -1/(\gamma-1). However, the exponent that is actually reported in equation (27) of MW2016 is -(\gamma-1)/\gamma.

In an online meeting with Michael Wilkinson, I also pointed out that the formula for the constant C given as equation (11) in the Supplement to MW2016 cannot be correct, being incompatible with the numerical value claimed for C in equation (12) in the Supplement. For example, the expression involving natural logarithm terms in (11) must be negative, but the numerical value given in (12) is positive. Using numerical calculations in Maple, I pointed out that the following variant of the formula in (11) produces a numerical value close to the one reported in (12) as M becomes large:

C = 1 + \bigg\{\sum_{n=1}^M \bigg[(n-1) \ln\bigg(\frac{n-1}{n}\bigg)+ 1\bigg] - \frac{1}{2} \ln(M)\bigg\}

The expression in curly brackets appears in equation (41) in my review above.
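The convergence of that curly-bracketed expression is easy to demonstrate numerically. The n = 1 term is interpreted via the limit 0 \cdot \ln 0 = 0, so it contributes exactly 1; moreover, the partial sum telescopes to \ln M! - M\ln M + M - \frac{1}{2}\ln M, which by Stirling's formula tends to \ln\sqrt{2\pi}. This limiting value is my own cross-check, not a value taken from the Supplement.

```python
import math

# Convergence check for the curly-bracketed expression in (41), which also
# appears in the proposed corrected formula for C.  The n = 1 term is
# interpreted via the limit 0*ln(0) = 0, i.e. it contributes exactly 1.
# Stirling suggests the limit is ln(sqrt(2*pi)) (my own cross-check).

def bracket(M):
    s = 1.0  # n = 1 term
    for n in range(2, M + 1):
        s += (n - 1) * math.log((n - 1) / n) + 1.0
    return s - 0.5 * math.log(M)

target = 0.5 * math.log(2.0 * math.pi)
print(bracket(100), bracket(10_000), target)
```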

Finally, in another private written communication with Michael Wilkinson I pointed out that the symbol T is being used inconsistently in MW2016. In equation (4) in MW2016, T is defined as a random variable, i.e., a sum of random inter-collision times for a runaway raindrop. However, the same symbol T is used in the entropy function in equation (7) and in the related equation (16) in MW2016, where T is no longer a random variable, but rather an exogenous bound on values of the random variable T. The distinction between the two becomes a delicate matter in some discussions, so it would have been better to use something like an overbar to distinguish between the random variable T and the related symbol \bar{T}, for example in expressing the large deviation principle P(T \in [\bar{T}, \bar{T}+d\bar{T}]) \approx \exp(-J(\bar{T}))d\bar{T}.

Using Lagrange’s interpolation formula to prove complicated sum/product identities

A paper by Sen and Balakrishnan pointed out that Lagrange’s polynomial interpolation formula can be used to construct relatively simple proofs of some complicated-looking sum/product identities. (Sen, A., Balakrishnan, N., 1999, Convolution of geometrics and a reliability problem, Statistics and Probability Letters, 43, pp. 421-426). In this note I want to make a quick record of how this works.

Let R_1, R_2, \ldots , R_{\mathcal{N}} be \mathcal{N} distinct points in [0, \infty), let f be a continuous function defined on [0, \infty), and consider the interpolation data (R_1, f(R_1)), (R_2, f(R_2)), \ldots , (R_{\mathcal{N}}, f(R_{\mathcal{N}})). Then there is a unique polynomial p of degree \mathcal{N}-1 or less which satisfies the interpolation conditions

p(R_i) = f(R_i) \qquad \qquad \qquad (1)

for i = 1, \ldots, \mathcal{N}. This unique polynomial can be constructed by defining for n = 1, \ldots, \mathcal{N} the functions

\mathcal{P}_n(R) = \prod_{\substack{j=1 \\ j \neq n}}^{\mathcal{N}} \frac{(R_j-R)}{(R_j-R_n)} \qquad \qquad \qquad (2)

These are polynomials of degree less than \mathcal{N}, and since the product on the right-hand side of (2) contains all the R_j except R_n, the functions \mathcal{P}_n(R) satisfy

\mathcal{P}_n(R_i) = \delta_{ni} \qquad \qquad \qquad (3)

for i = 1, \ldots, \mathcal{N}, where \delta_{ni} is Kronecker’s delta. In other words, for each n we have \mathcal{P}_n(R_n) = 1, and \mathcal{P}_n(R_i) = 0 for i \neq n. Lagrange’s polynomial interpolation formula is then written as

p(R) = \sum_{n=1}^{\mathcal{N}} \mathcal{P}_n(R) \cdot f(R_n) \qquad \qquad \qquad (4)

This satisfies the interpolation conditions in (1) above because setting R = R_i in (4) causes all the terms to vanish except \mathcal{P}_i(R_i)f(R_i) = f(R_i), so (1) is obtained. Crucially for the proofs below, the formula (4) represents a unique polynomial of degree \mathcal{N}-1 or less whose values are f(R_1), f(R_2), \dots, f(R_{\mathcal{N}}) at the interpolation points R_1, R_2, \ldots , R_{\mathcal{N}} respectively.

Now, a brief indication of how to prove

\sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j}{R_j - R_n}\bigg) \bigg) \equiv 1 \qquad \qquad \qquad (5)

using Lagrange’s polynomial interpolation formula was provided in Sen and Balakrishnan’s paper (cited above), and we elaborate on this idea here to enable similar approaches for other formulas. The sum/product on the left-hand side of (5) looks like Lagrange’s formula for interpolating the points (R_1, 1), (R_2, 1), \dots, (R_{\mathcal{N}}, 1). In the style of (4), this formula would be

p(R) = \sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j - R}{R_j - R_n}\bigg) \bigg) \cdot 1 \qquad \qquad \qquad (6)

But the constant function that takes the value 1 for all its arguments is a polynomial of degree 0, and the interpolating polynomial p(R) in (6) is unique. Therefore we can immediately conclude that p(R) = 1. But (5) is simply p(0) = 1, so it is immediately proved.

We now see how to adapt this procedure to prove another sum/product formula. Consider the formula

\sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j}{R_j - R_n}\bigg)R_n^m \bigg) \equiv 0 \qquad \qquad \qquad (7)

for m = 1, \ldots, \mathcal{N}-1. The left-hand side of (7) looks like Lagrange’s formula for interpolating the points (R_1, R_1^m), (R_2, R_2^m), \dots, (R_{\mathcal{N}}, R_{\mathcal{N}}^m). In the style of (4), this formula would be

p(R) = \sum_{n=1}^{\mathcal{N}} \bigg(\prod_{\substack{j=1 \\ j\neq n}}^{\mathcal{N}} \bigg(\frac{R_j - R}{R_j - R_n}\bigg) \bigg) \cdot R_n^m \qquad \qquad \qquad (8)

But the function f(R_n) = R_n^m in (8) is a polynomial of degree m, and the interpolating polynomial p(R) in (8) is unique. Therefore we can immediately conclude that p(R) = R^m. But (7) is then simply p(0) = 0, so it is immediately proved.

Other sum/product formulas of this kind can be proved in similar ways.
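Both identities are easy to spot-check numerically for arbitrary distinct nodes; a minimal sketch, with hand-picked node values of my own:

```python
# Numerical spot-check of the identities (5) and (7) for a set of distinct,
# hand-picked nodes R_1, ..., R_N (the values are arbitrary).

R = [1.0, 2.0, 3.5, 5.0, 7.25, 9.0]
N = len(R)

def p_at_zero(values):
    # Lagrange interpolant of the points (R_n, values[n]), as in (4),
    # evaluated at R = 0
    total = 0.0
    for n in range(N):
        prod = 1.0
        for j in range(N):
            if j != n:
                prod *= R[j] / (R[j] - R[n])
        total += prod * values[n]
    return total

identity5 = p_at_zero([1.0] * N)                                   # should equal 1
identity7 = [p_at_zero([r ** m for r in R]) for m in range(1, N)]  # should all vanish
print(identity5, identity7)
```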

Using an asymptotic approximation to obtain a closed form solution

I needed to find an extremum of a function of the form

f(\bar{T}_1) = \big[\mathcal{N}_1!\big]^{\gamma - 1}(\bar{T}_1)^{\mathcal{N}_1} + \big[(\mathcal{N} - \mathcal{N}_1)!\big]^{- 1}\bigg(\frac{\mathcal{N}!}{\mathcal{N}_1!}\bigg)^{\gamma}(\bar{T} - \bar{T}_1)^{\mathcal{N}-\mathcal{N}_1}

by choice of \bar{T}_1, ideally obtaining a closed form solution for the critical value \bar{T}_1^{\ast}. The precise context is not relevant here – I just want to record an asymptotic approximation trick I was able to use to solve this problem. Differentiating and setting equal to zero we obtain the first-order condition

\mathcal{N}_1 \big[\mathcal{N}_1!\big]^{\gamma - 1}(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}

- (\mathcal{N} - \mathcal{N}_1)\big[(\mathcal{N} - \mathcal{N}_1)!\big]^{- 1}\bigg(\frac{\mathcal{N}!}{\mathcal{N}_1!}\bigg)^{\gamma}(\bar{T} - \bar{T}_1^{\ast})^{\mathcal{N}-\mathcal{N}_1 - 1} = 0

which implies

\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{(\bar{T} - \bar{T}_1^{\ast})^{\mathcal{N}-\mathcal{N}_1 - 1}} = \frac{(\mathcal{N} - \mathcal{N}_1)\big[(\mathcal{N} - \mathcal{N}_1)!\big]^{- 1}\bigg(\frac{\mathcal{N}!}{\mathcal{N}_1!}\bigg)^{\gamma}}{\mathcal{N}_1 \big[\mathcal{N}_1!\big]^{\gamma - 1}} = \frac{\big[\mathcal{N}!\big]^{\gamma}(\mathcal{N}_1-1)!}{\big[\mathcal{N}_1!\big]^{2\gamma}(\mathcal{N}-\mathcal{N}_1-1)!} \qquad \qquad \qquad (1)

This does not immediately yield a closed form solution for \bar{T}_1^{\ast}, but I realised that in the present context, with \mathcal{N} \gg \mathcal{N}_1 \gg 1, and \bar{T} \gg \bar{T}_1^{\ast}, we can in fact satisfactorily approximate the left-hand side as

\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{(\bar{T} - \bar{T}_1^{\ast})^{\mathcal{N}-\mathcal{N}_1 - 1}} = \bigg(\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1 - 1}}\bigg) \cdot \bigg \{ \bigg(1 - \frac{\bar{T}_1^{\ast}}{\bar{T}}\bigg)^{-(\mathcal{N}-\mathcal{N}_1 - 1)}\bigg \}

\approx \bigg(\frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1 - 1}}\bigg) \cdot \bigg(1 + (\mathcal{N}-\mathcal{N}_1 - 1)\frac{\bar{T}_1^{\ast}}{\bar{T}}\bigg)

\approx \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1-1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1 - 1}} + (\mathcal{N}-\mathcal{N}_1 - 1) \cdot \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1}}

\approx (\mathcal{N}-\mathcal{N}_1) \cdot \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1}}

where the last step drops the first term and replaces \mathcal{N}-\mathcal{N}_1 - 1 by \mathcal{N}-\mathcal{N}_1, which is justified when (\mathcal{N}-\mathcal{N}_1)\bar{T}_1^{\ast}/\bar{T} \gg 1.

Then we can write (1) using the gamma function n! = \Gamma(n+1) as

(\mathcal{N}-\mathcal{N}_1) \cdot \frac{(\bar{T}_1^{\ast})^{\mathcal{N}_1}}{\bar{T}^{\mathcal{N}-\mathcal{N}_1}} = \frac{[\Gamma(\mathcal{N}+1)]^{\gamma}\cdot [\Gamma(\mathcal{N}_1)]}{[\Gamma(\mathcal{N}_1+1)]^{2\gamma} \cdot [\Gamma(\mathcal{N}-\mathcal{N}_1)]}

and thus

\bar{T}_1^{\ast} = \frac{ [\Gamma(\mathcal{N}+1)]^{\frac{\gamma}{\mathcal{N}_1}} \cdot [\Gamma(\mathcal{N}_1)]^{\frac{1}{\mathcal{N}_1}} \cdot \bar{T}^{\ \frac{(\mathcal{N}-\mathcal{N}_1)}{\mathcal{N}_1}}}{(\mathcal{N} - \mathcal{N}_1)^{\frac{1}{\mathcal{N}_1}}\cdot [\Gamma(\mathcal{N}_1+1)]^{\frac{2\gamma}{\mathcal{N}_1}}\cdot [\Gamma(\mathcal{N}-\mathcal{N}_1)]^{\frac{1}{\mathcal{N}_1}}}

which is the kind of closed form solution I wanted.
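The algebraic simplification in (1) can be verified numerically, using log-gamma functions so that the factorials do not overflow; the parameter values below are arbitrary.

```python
import math

# Check of the simplification in (1): the middle and right-hand expressions
# should agree for any N > N_1 >= 1 and any gamma.  Log-gamma avoids
# factorial overflow; parameter values are arbitrary.

def lhs_log(N, N1, gamma):
    # ln of (N-N1)*[(N-N1)!]^{-1} * (N!/N1!)^gamma / (N1*[N1!]^{gamma-1})
    return (math.log(N - N1) - math.lgamma(N - N1 + 1)
            + gamma * (math.lgamma(N + 1) - math.lgamma(N1 + 1))
            - math.log(N1) - (gamma - 1.0) * math.lgamma(N1 + 1))

def rhs_log(N, N1, gamma):
    # ln of [N!]^gamma * (N1-1)! / ([N1!]^{2*gamma} * (N-N1-1)!)
    return (gamma * math.lgamma(N + 1) + math.lgamma(N1)
            - 2.0 * gamma * math.lgamma(N1 + 1) - math.lgamma(N - N1))

print(lhs_log(200, 10, 1.7), rhs_log(200, 10, 1.7))
```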

Applying the Bromwich integral to a cumulant generating function

In this note I want to quickly go through the motions of applying the Bromwich integral (i.e., inverse Laplace transform) to the cumulant generating function \lambda(k) of an exponential variate X with probability density function

f(x) = \mu e^{-\mu x} \qquad \qquad \qquad (1)

for x \geq 0. The cumulant generating function is the log-Laplace transform of f(x), obtained as

\lambda(k) = \ln \langle e^{kx} \rangle

= \ln\bigg(\int_0^{\infty} e^{kx} \mu e^{-\mu x} dx\bigg) = \ln\bigg(\frac{\mu}{\mu - k}\bigg) \qquad \qquad \qquad (2)

Note that we need to assume k < \mu for the integral to converge in this case.

We need to apply the Bromwich integral to the exponential of \lambda(k), i.e.,

e^{\lambda(k)} = \frac{\mu}{\mu - k} \qquad \qquad \qquad (3)

So, we have that taking the Laplace transform of the density function gives

\mathcal{L}(\mu e^{-\mu x}) = \frac{\mu}{\mu - k} \qquad \qquad \qquad (4)

How do we now apply the Bromwich integral to the right-hand side of (4) to deduce \mu e^{-\mu x}? One thing to notice is that this result is immediately obtainable from standard tables of Laplace transforms. For example, in most such tables one will find something like

\mathcal{L}(e^{-at}) = \frac{1}{p+a} \qquad \qquad \qquad (5)

where p is the Laplace transform parameter such that

F(p) = \mathcal{L}(f(t)) = \int_0^{\infty} f(t) e^{-pt} dt \qquad \qquad \qquad (6)

Putting a = \mu, p = -k and t = x in (5), we can then immediately deduce from (5) that

\mathcal{L}^{-1} \bigg(\frac{\mu}{\mu-k}\bigg) = \mu \mathcal{L}^{-1} \bigg(\frac{1}{\mu-k}\bigg) = \mu e^{-\mu x} \qquad \qquad \qquad (7)

Therefore, we will try to deduce our result by applying the Bromwich integral to the right-hand side of (5), where the Bromwich integral is given by

f(t) = \frac{1}{2 \pi i} \int_{c - i \infty}^{c + i \infty} F(z) e^{zt} dz \qquad \qquad \qquad (8)

for t > 0. The basic procedure is simple. First, we convert p on the right-hand side of (5) into a complex variable, say p = x + iy \equiv z, so the function F(z) in (8) is then

F(z) = \frac{1}{z + a} \qquad \qquad \qquad (9)

Then as long as F(z) is of the form F(z) = P(z)/Q(z), where P(z) and Q(z) are polynomials such that the degree of Q(z) is at least one greater than the degree of P(z), we can immediately conclude

f(t) = \text{the sum of residues of } F(z) e^{zt} \text{ at all poles}

This condition is certainly satisfied by (9), so all we need to do is work out the relevant residues. The function \frac{e^{zt}}{z+a} has a pole of order 1 at z = -a, so there is a single residue which can be obtained as

\lim_{z \rightarrow -a} \bigg((z+a) \cdot \frac{e^{zt}}{(z+a)}\bigg) = e^{-at} \qquad \qquad \qquad (10)

This then immediately confirms the result in (5), and hence our result as explained below (5). We could just finish here, but it is instructive to actually prove that this result is correct using contour integration.

We first construct a contour bearing in mind that we need the restriction k < \mu (noted below (2) above), which in the notation of (9) translates into the condition Re z > -a. So, picking an arbitrary c > -a, we construct a closed contour \Gamma consisting of the vertical segment from c - iy to c + iy together with a semicircular arc to its left.

The contour \Gamma is designed to enclose the singularity of the integrand in the contour integral

\frac{1}{2 \pi i} \oint_{\Gamma} \frac{e^{zt}}{z+a} dz \qquad \qquad \qquad (11)

We now split this contour integral into a straight part and an arc:

\frac{1}{2 \pi i} \int_{c - iy}^{c + iy} \frac{e^{zt}}{z+a} dz + \frac{1}{2 \pi i} \int_{arc} \frac{e^{zt}}{z+a} dz = e^{-at} \qquad \qquad \qquad (12)

We want to show that as y \rightarrow \infty, the integral along the arc vanishes, and so

\frac{1}{2 \pi i} \int_{c - i\infty}^{c + i\infty} \frac{e^{zt}}{z+a} dz = e^{-at} \qquad \qquad \qquad (13)

as per the Bromwich integral. We use estimations, beginning by writing

\bigg|\int_{arc} \frac{e^{zt}}{z+a} dz\bigg| \leq \pi y \sup_{arc} \bigg| \frac{e^{zt}}{z+a} \bigg| \qquad \qquad \qquad (14)

We have along the arc that |z| \geq y and also

|z+a| = \sqrt{(z+a)(\bar{z}+a)}

= \sqrt{((x+a)+iy)((x+a)-iy)}

= \sqrt{(x+a)^2 + y^2}

\geq y

In addition, for any z on the arc, with arg(z) = \phi, we must have \cos(\phi) < 0 since we are operating in the second and third quadrants, so

|e^{zt}| = |e^{|z|(\cos(\phi) + i \sin(\phi))t}|

= e^{|z|\cos(\phi)t}

\leq e^{-y |\cos(\phi)|t}

\leq \frac{1}{y |\cos(\phi)|t}

Therefore (14) gives

\bigg|\int_{arc} \frac{e^{zt}}{z+a} dz\bigg| \leq \frac{\pi y}{y^2 |\cos(\phi)|t} \rightarrow 0

as y \rightarrow \infty, so (12) reduces to (13) as required.
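The Bromwich integral (13) can also be evaluated directly by numerical quadrature along the vertical line, as a sanity check. With z = c + iy we have dz = i\,dy, so the prefactor 1/(2\pi i) becomes 1/(2\pi); the values of a, c, t and the truncation below are illustrative choices of my own.

```python
import cmath
import math

# Direct numerical check of the Bromwich integral (13): evaluating
# (1/(2*pi*i)) * int_{c-iY}^{c+iY} e^{zt}/(z+a) dz by midpoint quadrature
# should approach e^{-at} as the truncation Y grows.  With z = c + i*y we
# have dz = i*dy, so the 1/(2*pi*i) prefactor becomes 1/(2*pi).
a, c, t = 1.0, 0.5, 1.0
Y, steps = 2000.0, 400_000

h = 2.0 * Y / steps
acc = 0.0 + 0.0j
for i in range(steps):
    y = -Y + (i + 0.5) * h          # midpoint rule along the vertical line
    z = complex(c, y)
    acc += cmath.exp(z * t) / (z + a)
result = (acc * h / (2.0 * math.pi)).real

print(result, math.exp(-a * t))
```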

Inverse Fourier transform of a characteristic function

Let the Fourier transform of a function f(x) be

g(t) = \int_{-\infty}^{\infty} f(x) e^{itx} dx \qquad \qquad \qquad (1)

The corresponding inverse Fourier transform would then be

f(x) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} g(t) e^{-itx} dt \qquad \qquad \qquad (2)

As an exercise, what I want to do in this note is derive the characteristic function \phi_X(t) = E[e^{itX}] of an exponential random variate X with probability density function

f(x) = \lambda e^{-\lambda x} \qquad \qquad \qquad (3)

for x \geq 0. The required characteristic function is the Fourier transform of this density function. I then want to apply the inverse Fourier transform to \phi_X(t) and thereby recover the density f(x).

Simply applying (1) above to (3) we easily get the characteristic function:

\phi_X(t) = \int_0^{\infty} \lambda e^{-\lambda x} e^{itx} dx

= \lambda \int_0^{\infty} e^{-(\lambda-it)x} dx

= \bigg[-\frac{1}{(\lambda - it)} e^{-(\lambda - it)x}\bigg]_0^{\infty}

= \frac{\lambda}{\lambda - it}

= \bigg(1 - \frac{it}{\lambda}\bigg)^{-1} \qquad \qquad \qquad (4)

(Note that e^{-(\lambda - it)x} = e^{-\lambda x} e^{itx} \rightarrow 0 as x \rightarrow \infty if \lambda > 0).
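A quick Monte Carlo spot-check of (4), with illustrative parameter values of my own:

```python
import cmath
import random

# Monte Carlo check of (4): the characteristic function of an exponential
# variate with rate lam should match (1 - i*t/lam)^{-1}.  The values of
# lam and t are illustrative choices.
random.seed(3)
lam, t = 2.0, 1.5
samples = 200_000

acc = 0.0 + 0.0j
for _ in range(samples):
    x = random.expovariate(lam)
    acc += cmath.exp(1j * t * x)   # sample of e^{itX}
phi_mc = acc / samples
phi_exact = 1.0 / (1.0 - 1j * t / lam)

print(phi_mc, phi_exact)
```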

Let us now reverse the process, and apply the inverse Fourier transform to (4). Using (2) above, we get

f(x) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} \bigg(1 - \frac{it}{\lambda}\bigg)^{-1} e^{-itx} dt

= \frac{i\lambda}{2 \pi} \int_{-\infty}^{\infty} \frac{e^{-itx}}{(t + i\lambda)} dt \qquad \qquad \qquad (5)

(Note that t is unrestricted, so the integration limits are from -\infty to \infty, and that \lambda - it = -i(t + i\lambda), which produces the prefactor i\lambda). It quickly transpires that elementary techniques will not work easily with (5), so we try contour integration using the residue theorem. We turn the integration variable t into a complex variable z = t + iy, and formulate the integral in (5) as a contour integral

\frac{i\lambda}{2 \pi} \oint_C \frac{e^{-izx}}{(z + i\lambda)} dz \qquad \qquad \qquad (6)

The trick will be to design the contour C in such a way that part of the contour includes the integral in the form (5), while the rest of the contour vanishes, in a limiting process. We can then apply the residue theorem to solve (5).

The integrand in (6) has a pole of order 1 at z = -i\lambda. To find the residue of the integrand at this pole, we multiply the integrand by (z + i \lambda), obtaining e^{-izx}, and then evaluate this result at z = -i \lambda, yielding e^{-\lambda x}. Thus, e^{-\lambda x} is the residue of the integrand at its pole. Since x > 0, the factor e^{-izx} decays only for \text{Im}\, z < 0, so the contour C must be closed in the lower half-plane, where it encloses the singularity at -i \lambda and is traversed clockwise (i.e., with negative orientation). Cauchy’s residue theorem then tells us that the integral in (6) equals -2 \pi i times the residue e^{-\lambda x}, so we will get the density function we are looking for:

\frac{i\lambda}{2 \pi} \oint_C \frac{e^{-izx}}{(z + i\lambda)} dz = \frac{i\lambda}{2 \pi} (-2 \pi i) e^{-\lambda x} = \lambda e^{-\lambda x} \qquad \qquad \qquad (7)

Clearly then, when we use contour integration we get the required result straight away. But what contour do we use for the integral in (6)? The contour must enclose the point -i \lambda, so letting a > \lambda be some initial starting value, we take C to be the segment [-a, a] of the real axis followed by a semicircular arc of radius a in the lower half-plane.

Then the contour C may be divided into a straight part along the real axis (which only involves the real part of z, namely t), and a curved arc. Therefore the contour integral in (6) can be decomposed as

\frac{i\lambda}{2 \pi} \int_{-a}^{a} \frac{e^{-itx}}{(t + i\lambda)} dt + \frac{i\lambda}{2 \pi} \int_{arc} \frac{e^{-izx}}{(z + i\lambda)} dz = \lambda e^{- \lambda x} \qquad \qquad \qquad (8)

We can now use estimations to show that the second integral on the left-hand side of (8) vanishes as a \rightarrow \infty. First, noting that \pi a is half the circumference of a circle of radius a, we have

\bigg|\int_{arc} \frac{e^{-izx}}{(z + i\lambda)} dz\bigg| \leq \pi a \sup_{arc} \bigg|\frac{e^{-izx}}{(z+i \lambda)}\bigg| \qquad \qquad \qquad (9)

Next, we observe that, writing \arg(z) = \phi, any z on the lower arc has \sin(\phi) < 0, so

|e^{-izx}| = |e^{-ix|z|(\cos(\phi) + i \sin(\phi))}| = e^{x |z| \sin(\phi)} = \frac{1}{e^{x |z| |\sin(\phi)|}} \leq \frac{1}{x |z| |\sin(\phi)|} \qquad \qquad \qquad (10)

and

|z+i \lambda| \geq |z| \qquad \qquad \qquad (11)

and also, along the arc, we must have

|z| \geq a - \lambda \qquad \qquad \qquad (12)

Using these and (9) we then have

\bigg|\int_{arc} \frac{e^{-izx}}{(z + i\lambda)} dz\bigg| \leq \frac{\pi a}{(a - \lambda)^2 x |\sin(\phi)|} \rightarrow 0

as a \rightarrow \infty, so in the limit as a \rightarrow \infty we are left with

\frac{i\lambda}{2 \pi} \int_{-\infty}^{\infty} \frac{e^{-itx}}{(t + i\lambda)} dt = \lambda e^{- \lambda x}

which is the required result.

Two versions of the exponential cumulant generating function

In this note I want to record an interesting issue I have noticed with regard to how the form of the large deviation entropy function can ‘flip’ depending on the way the underlying cumulant generating function is formulated.

The cumulant generating function of a random variable is the natural log of its moment generating function. In the case of an exponential random variable with mean \mu and probability density function

f(x) = \frac{1}{\mu} e^{-x/\mu}

the moment generating function \langle e^{kx} \rangle coincides with the Laplace transform of the density evaluated at -k. Thus, one way to compute the cumulant generating function of an exponential random variable is as follows:

\lambda(k) = \ln \langle e^{kx} \rangle

= \ln \bigg(\int_0^\infty e^{kx} \frac{1}{\mu} e^{-x/\mu} dx \bigg)

= \ln \bigg(\frac{1}{\mu} \int_0^{\infty} e^{-(1-\mu k)x/\mu} dx \bigg)

= \ln \bigg(\frac{1}{\mu} \bigg[ -\frac{\mu}{1-\mu k}e^{-(1-\mu k)x/\mu} \bigg]_0^{\infty} \bigg)

= \ln \bigg( \frac{1}{1-\mu k}\bigg)

= -\ln(1 - \mu k) \qquad \qquad \qquad (1)
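Result (1) can be checked by Monte Carlo: the log of the sample mean of e^{kx} over exponential draws should approach -\ln(1 - \mu k) whenever k < \frac{1}{\mu}. A minimal sketch (numpy assumed; \mu, k, the seed and the sample size are arbitrary choices):

```python
import numpy as np

# Monte Carlo estimate of the cumulant generating function ln <exp(k x)>
# for x ~ Exponential(mean mu), compared with the closed form -ln(1 - mu*k).
rng = np.random.default_rng(seed=1)
mu, k = 2.0, 0.2                      # need k < 1/mu for <exp(k x)> to be finite

x = rng.exponential(scale=mu, size=1_000_000)
cgf_mc = np.log(np.mean(np.exp(k * x)))
cgf_exact = -np.log(1.0 - mu * k)
print(cgf_mc, cgf_exact)
```

Pushing k towards \frac{1}{\mu} makes the estimator extremely noisy (the variance of e^{kx} already diverges once k > \frac{1}{2\mu}), consistent with the singularity in (1).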

Note that it is clear from (1) that there is a singularity at k = \frac{1}{\mu} and that the natural logarithm is defined only if the parameter k is restricted to the region k < \frac{1}{\mu}. Since \lambda(k) is everywhere differentiable in this region, it follows from the Gärtner-Ellis Theorem of large deviation theory that the probability density function of values of the random variable in the tails of the exponential distribution can be approximated using the large deviation principle, with entropy function

I(s) = \sup_k (ks - \lambda(k)) \qquad \qquad \qquad (2)

All well and good. However, what I want to point out here is that there is a mathematically equivalent way to formulate the cumulant generating function, as follows:

\lambda(k) = -\ln \langle e^{-kx} \rangle

= -\ln \bigg(\int_0^\infty e^{-kx} \frac{1}{\mu} e^{-x/\mu} dx \bigg)

= -\ln \bigg(\frac{1}{\mu} \int_0^{\infty} e^{-(1+\mu k)x/\mu} dx \bigg)

= -\ln \bigg( \frac{1}{1+\mu k}\bigg)

= \ln(1+\mu k) \qquad \qquad \qquad (3)
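As with (1), result (3) can be checked by Monte Carlo, this time with the negated exponent (numpy assumed; \mu, k, the seed and the sample size are arbitrary choices):

```python
import numpy as np

# Monte Carlo check of lambda(k) = -ln <exp(-k x)> = ln(1 + mu*k)
# for x ~ Exponential(mean mu).
rng = np.random.default_rng(seed=2)
mu, k = 2.0, 1.0                      # any k > -1/mu is admissible here

x = rng.exponential(scale=mu, size=1_000_000)
cgf_mc = -np.log(np.mean(np.exp(-k * x)))
cgf_exact = np.log(1.0 + mu * k)
print(cgf_mc, cgf_exact)
```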

It is less obvious from (3) that there are singularities, etc., but they are still there; everything is simply negated when compared to the first formulation. There is now a singularity at k = -\frac{1}{\mu}, and the logarithm in (3) is now only defined for values of k in the region k > -\frac{1}{\mu}. What interested me is that it is not only the admissible region of values of k that gets reversed, but also the form of the entropy function in (2) above. Writing the second version as \tilde{\lambda}(k) = -\lambda(-k), where \lambda(k) is the first version in (1), the substitution k \rightarrow -k inside the supremum in (2) gives \sup_k (ks - \lambda(k)) = \sup_k (\tilde{\lambda}(k) - ks), so the entropy function as a whole gets flipped around. It should therefore read as follows in this case:

I(s) = \sup_k (\lambda(k) - ks)
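Both formulations can be checked numerically by brute-force maximisation over a grid of k values: each version of the entropy function should reproduce the closed form I(s) = \frac{s}{\mu} - 1 - \ln\frac{s}{\mu}, obtained by setting the k-derivative inside the supremum to zero. A minimal sketch (numpy assumed; \mu, s and the grid ranges are arbitrary choices):

```python
import numpy as np

# Both Legendre-type transforms should give the exponential entropy function
# I(s) = s/mu - 1 - ln(s/mu), here evaluated by brute-force grid search over k.
mu, s = 1.5, 0.4

k1 = np.linspace(-50.0, 1.0 / mu - 1e-6, 2_000_001)   # admissible region k < 1/mu
I1 = np.max(k1 * s + np.log(1.0 - mu * k1))           # sup_k (ks - lambda(k)), first version

k2 = np.linspace(-1.0 / mu + 1e-6, 50.0, 2_000_001)   # admissible region k > -1/mu
I2 = np.max(np.log(1.0 + mu * k2) - k2 * s)           # sup_k (lambda(k) - ks), second version

I_exact = s / mu - 1.0 - np.log(s / mu)
print(I1, I2, I_exact)
```

Note that each grid is confined to the admissible region of its respective formulation, outside of which the logarithm is undefined.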