## Answer to Puzzle #??: Arithmetic with normal random deviates

Puzzle:
Let X and Y be two independent "normal" random deviates with means and standard deviations μX and σX, and μY and σY, respectively.

1. If you add the two (X+Y) the result is also an exactly-normal deviate with mean μX+μY and standard deviation [(σX)² + (σY)²]^(1/2).
2. If you subtract the two (X-Y) the result is also an exactly-normal deviate with mean μX-μY and standard deviation [(σX)² + (σY)²]^(1/2).
3. If you multiply the two (X·Y) then you do not get an exactly-normal deviate unless σXσY=0. However, if σX<<μX and σY<<μY then you do get an approximately normal deviate. Show its mean is exactly μXμY while its standard deviation is exactly
[(σXμY)² + (σYμX)² + (σXσY)²]^(1/2).
4. If you divide the two (X/Y) then you do not get an exactly-normal deviate unless σY=0 and μY≠0. However, if σX<<μX and σY<<μY then you do get an approximately normal deviate. Show its mean is approximately [1+(σY/μY)²]·μX/μY and indeed it is given asymptotically by any truncation of the infinite series
[1 + δ² + 3·δ⁴ + 3·5·δ⁶ + 3·5·7·δ⁸ + 3·5·7·9·δ¹⁰ + ...]·μX/μY,    δ = σY/μY → 0
(although this series diverges if δ≠0). Meanwhile its standard deviation is approximately
(1/μY)·[(σX)² + (σYμX/μY)² + (σXσY/μY)²]^(1/2)
and indeed its square (i.e. the variance) is given asymptotically by any truncation of the also-divergent series
(σX/μY)² + (μY)⁻²·[(σX)² + (μX)²]·[δ² + 9δ⁴ + 75δ⁶ + 735δ⁸ + 8505δ¹⁰ + ...]
5. How many terms of these asymptotic series should be used, and approximately what |error| will you then get?
6. If f(z) is some generic analytic function, then argue f(X) will (in the limit when σ=σX is made small while μ=μX is held fixed) be approximately normal with mean
μf = f(μ) + [2⁻¹/1!] f''(μ) σ² + [2⁻²/2!] f⁽⁴⁾(μ) σ⁴ + [2⁻³/3!] f⁽⁶⁾(μ) σ⁶ + ...
This series can converge, for nice-enough f(z), in which case it yields an exact result. But it also can diverge for all σ≠0, in which case it is only an asymptotic series. The mean square is
μ(f²) = f(μ)² + [2⁻¹/1!] (f²)''(μ) σ² + [2⁻²/2!] (f²)⁽⁴⁾(μ) σ⁴ + [2⁻³/3!] (f²)⁽⁶⁾(μ) σ⁶ + ...
and then the variance of f(X) is μ(f²) − (μf)².
7. What is (10±1)/(5±1)?

These formulas are extremely useful for working with statistics (meaning "numbers equipped with 1σ error bars"). Surprisingly, the multiplication, division, and f(X) formulas do not seem to be given in statistics books.

By Warren D. Smith Sept 2009

Parts 1 and 2 are well known. The formulas for the means arise from linearity of expectation. The formulas for the variances then also arise from linearity of expectation combined with independence. The underlying reason X+Y is exactly normally distributed (ditto X-Y) is that (i) the Fourier transform of a Gaussian is a Gaussian and (ii) the product of two Gaussians is a Gaussian.
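As a quick numerical sanity check (not part of the original argument), here is a short Monte Carlo sketch in Python; the particular parameter values are arbitrary illustrative choices:

```python
# Monte Carlo check that X+Y has mean muX+muY and std sqrt(sigX²+sigY²).
# Parameter values are arbitrary illustrative choices.
import math
import random

random.seed(1)
muX, sigX = 3.0, 0.7
muY, sigY = -1.0, 0.4
N = 200_000

samples = [random.gauss(muX, sigX) + random.gauss(muY, sigY) for _ in range(N)]
mean = sum(samples) / N
std = math.sqrt(sum((v - mean) ** 2 for v in samples) / N)
# mean ≈ muX + muY = 2.0, std ≈ sqrt(0.49 + 0.16) ≈ 0.806
```

Changing the "+" to "-" checks part 2 the same way (the predicted standard deviation is unchanged).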

For part 3, the formula for the mean arises from

μXY = ∫∫ F((z-μX)/σX)/σX · F((q-μY)/σY)/σY · z·q dz dq = μXμY

[with integration over the whole of the zq plane, and with F(z)=(2π)^(-1/2)·exp(-z²/2)] whereupon the formula for the standard deviation may be verified by computing

XY)2 = ∫∫ F((z-μX)/σX)/σX F((q-μY)/σY)/σY (zq - μAμB)2 dz dq = (σXμY)2 + (σYμX)2 + (σXσY)2.

For part 4, regard μY/Y as approximately normal (with mean ≈ 1 and standard deviation ≈ δ = σY/μY), write X/Y = (1/μY)·X·(μY/Y), and then apply part 3's multiplication formula.

The fact that 1/(1+δ) = 1 - δ + δ² - δ³ + ... is used to deal with small relative deviations δ off the mean (and we pretend the "lower tail of Y is cut off," i.e. that the chance of a deviation of the same order as Y's mean, or larger, is negligible – otherwise the standard deviation of X/Y would, in fact, be infinite).

This technique, plus the fact that ∫-∞<s<∞ F(s) s²ⁿ ds = 1·3·5···(2n-1) if n≥1, allows us to see that the mean of 1/Z, where Z is a normal deviate with mean=1 and standard deviation δ<<1, is given asymptotically when δ→0+ by any truncation of

1 + δ² + 3·δ⁴ + 3·5·δ⁶ + 3·5·7·δ⁸ + 3·5·7·9·δ¹⁰ + ...
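This can be verified numerically: with the far tail cut off as described, a brute-force quadrature of E[1/Z] matches a truncation of the series. A sketch, with δ=0.05 chosen small so that a few terms suffice:

```python
# Compare a midpoint-rule quadrature of E[1/Z], Z ~ Normal(1, delta), with the
# tail cut at |s| <= 10, against the series truncation 1 + δ² + 3δ⁴ + 15δ⁶.
import math

delta = 0.05
a, h = 10.0, 1e-3                  # integration half-width and step size
n = int(2 * a / h)
total = 0.0
for i in range(n):
    s = -a + (i + 0.5) * h         # standard-normal variable; Z = 1 + delta*s
    total += math.exp(-s * s / 2) / math.sqrt(2 * math.pi) / (1 + delta * s) * h

series = 1 + delta ** 2 + 3 * delta ** 4 + 15 * delta ** 6
# total and series agree to roughly 1e-6
```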

Meanwhile, using the fact that

[δ + δ² + δ³ + δ⁴ + ...]² = [δ/(1-δ)]² = δ² + 2δ³ + 3δ⁴ + 4δ⁵ + ...

we find that the variance of 1/Z is given asymptotically by any truncation of

δ² + 9δ⁴ + 75δ⁶ + 735δ⁸ + 8505δ¹⁰ + 114345δ¹² + ... + 3·5·7···(2n-3)·(2n-1)²·δ²ⁿ + ...
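The general-term formula can be checked mechanically (a small helper sketch, not from the original):

```python
# Regenerate the variance-series coefficients from the closed form
# 3·5···(2n-3)·(2n-1)².
def coeff(n):
    prod = 1
    for k in range(1, n - 1):      # odd factors 3, 5, ..., 2n-3 (empty for n <= 2)
        prod *= 2 * k + 1
    return prod * (2 * n - 1) ** 2

print([coeff(n) for n in range(1, 7)])  # → [1, 9, 75, 735, 8505, 114345]
```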

5. I recommend (following a common recommendation) summing these series up to the term of least absolute value, then stopping (or stopping earlier, if the term is small enough that you're satisfied). The first neglected |term| will then presumably be about the same as the additive error. For each of the δ-power series above we'd keep about δ⁻²/2 terms. This (as a result of "Stirling's formula," essentially) would yield least |term| roughly (δ/r)^(c+1/δ²) where r=√e=1.6487... and c=O(1) is a constant you can choose to optimize performance of the estimate. (Different c's for different series.)
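The rule above is easy to mechanize; here is a sketch applying it to the mean-of-1/Z series with δ=0.2 (the helper names are my own):

```python
# Sum an asymptotic series up to its term of least |value|, per the rule above.
import math

def sum_to_least_term(terms):
    total, prev = 0.0, math.inf
    for t in terms:
        if abs(t) >= prev:         # terms have started growing: stop
            return total, abs(t)   # first neglected |term| ~ additive error
        total += t
        prev = abs(t)
    return total, prev

def mean_series_terms(delta, kmax=60):
    """Terms 1, δ², 3δ⁴, 15δ⁶, ... of the mean-of-1/Z series."""
    t = 1.0
    for k in range(kmax):
        yield t
        t *= (2 * k + 1) * delta ** 2

partial, neglected = sum_to_least_term(mean_series_terms(0.2))
# For δ=0.2 about 13 terms get kept; 2*partial ≈ 2.0925 (compare part 7),
# with |error| ~ neglected ≈ 5e-6.
```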

6. Same technique as in part 4 but employing the Taylor series of f(z) about z=μ. The odd-power terms in the series integrate to zero due to odd symmetry. When you multiply an even term f⁽²ⁿ⁾(μ)/(2n)! times the normal-moment factor 1·3·5···(2n-1)·σ²ⁿ you get 2⁻ⁿ/n! · f⁽²ⁿ⁾(μ) σ²ⁿ. That explains the first series. The second series is the same series but based on f(z)² not f(z). If f(z)=exp(z) or cos(z) then both series converge for all σ; but as we saw in part 4, if f(z)=1/z then we get divergence for all σ≠0. [And the fact that mean² + variance = mean-square for a probability distribution is just Pythagoras's theorem, integrated.]
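For f(z)=exp(z) both series can be summed in closed form (each collapses to an exponential series), which gives a concrete convergent check of the part-6 formulas against the exact moments E[e^X]=e^(μ+σ²/2) and E[e^(2X)]=e^(2μ+2σ²). A sketch, with arbitrary μ, σ:

```python
# Check the part-6 series for f(z) = exp(z): here f^(2n)(mu) = exp(mu) and
# (f²)^(2n)(mu) = 4^n·exp(2·mu), so the terms 2^(-n)/n!·f^(2n)(mu)·σ^(2n)
# collapse to ordinary exponential series.
import math

mu, sigma = 0.3, 0.25
mean_f = math.exp(mu) * sum((sigma ** 2 / 2) ** n / math.factorial(n)
                            for n in range(30))
mean_f2 = math.exp(2 * mu) * sum((2 * sigma ** 2) ** n / math.factorial(n)
                                 for n in range(30))
var_f = mean_f2 - mean_f ** 2
# mean_f matches exp(mu + sigma²/2); var_f matches exp(2mu+sigma²)·(exp(sigma²)-1)
```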

7. 2.0925±0.540.    [Using δ=0.2 and δ²=0.04 and truncating as in part 5.]
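The arithmetic behind this answer can be reproduced as follows (a sketch: the mean series is summed to its least term per part 5, and the variance bracket is truncated after the 114345δ¹² term, which lands on ±0.540 — the third decimal depends on exactly where that slowly-shrinking series is cut):

```python
# Reproduce part 7: (10±1)/(5±1) via the part-4 division formulas.
import math

muX, sigX = 10.0, 1.0
muY, sigY = 5.0, 1.0
d2 = (sigY / muY) ** 2             # δ² = 0.04

# Mean: [1 + δ² + 3δ⁴ + ...]·muX/muY, summed up to its least term.
term, total, k = 1.0, 0.0, 0
while True:
    total += term
    k += 1
    nxt = term * (2 * k - 1) * d2  # next odd-double-factorial term
    if nxt >= term:                # terms stopped shrinking: truncate here
        break
    term = nxt
mean = total * muX / muY           # ≈ 2.0925

# Variance: (σX/μY)² + (μY)⁻²·[(σX)² + (μX)²]·[δ² + 9δ⁴ + ... + 114345δ¹²].
bracket = (d2 + 9 * d2 ** 2 + 75 * d2 ** 3 + 735 * d2 ** 4
           + 8505 * d2 ** 5 + 114345 * d2 ** 6)
var = (sigX ** 2 + (sigX ** 2 + muX ** 2) * bracket) / muY ** 2
std = math.sqrt(var)               # ≈ 0.5405
```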