Puzzle: Counting new records

Puzzle:
Let S be a set of N numbers. Permute them into random order. Scan thru the elements in that order. Each time you encounter a new record max, ring a bell. So for example, if S={7*,8*,5,3,4,9*} you would ring the bell 3 times at the *s.

  1. Prove the expected number of times the bell rings, equals HarmonicNumber(N) = 1+1/2+1/3+...+1/N.
  2. Compute a formula for the probability the bell rings R times (as a function of R and N). Hint: my formula involves "unsigned Stirling numbers of the first kind."
  3. What is the variance in the number of bell-rings?

Answer

1. This is since the probability the nth number is greater than all n-1 preceding, is 1/n. Hence the expected number of new maxima is ha(N)=HarmonicNumber(N)=∑1≤k≤N (1/k) which is asymptotic (when N is large) to

ha(N) ∼ ln(N) + γ + N-1/2 - N-2/12 + N-4/120 -...

where γ≈0.5772156649 is the Euler-Mascheroni constant.

2. But we can actually write down the entire probability distribution. The key fact is that the number of permutations of {1,2,...,N} with R successive maxima is given by the "unsigned Stirling Numbers of the first kind" |s(N,R)|=(-1)N-Rs(N,R). Do not confuse these with the signed numbers s(N,R) or with the Stirling numbers of the second kind S(N,R) denoted with a capital S. For more information about s(N,R) see the wikipedia article or NIST handbook. As a generating function formula,

x(x-1)(x-2)...(x-N+2)(x-N+1) = ∑0≤R≤N s(N,R)xR

which if we are speaking about unsigned Stirling-type-1s is

x(x+1)(x+2)...(x+N-2)(x+N-1) = ∑0≤R≤N |s(N,R)|xR

One can also compute them via the recurrence

|s(N+1,R)| = N·|s(N,R)| + |s(N,R-1)|   for   R>0   with   |s(0,0)|=1   and   |s(M,0)|=|s(0,M)|=0 for M>0.

Incidentally, this recurrence is easily proven using the "key fact" above. (Which can be viewed as proving the recurrence, or as proving the key fact provided that we already knew the recurrence.) The corresponding recurrence for the signed Stirling-first-kinds is

s(N+1,R) = N·s(N,R) + s(N,R-1)   for   R>0   with   s(0,0)=1   and   s(M,0)=s(0,M)=0 for M>0.

Here is a table of |s(N,R)| with N increasing from 0 in the downward direction while R increases from 0 in the rightward direction:

     N=0  1 
     N=1  0    1                      Table of |s(N,R)|
     N=2  0    1      1 
     N=3  0    2      3       1 
     N=4  0    6      11      6      1 
     N=5  0    24     50      35    10     1 
     N=6  0   120    274     225    85     15     1 
     N=7  0   720   1764    1624    735    175    21      1 
     N=8  0   5040  13068   13132   6769   1960   322    28    1
     N=9  0  40320  109584  118124  67284  22449  4536   546   36   1
    N=10  0 362880 1026576 1172700 723680  269325 6327  9450  870  45   1
         R=0  R=1    R=2      R=3    R=4    R=5    R=6   R=7  R=8  R=9 R=10

Note that the recurrence relates a number in the table to the one directly above it and the one diagonally left & upward. For example, the three italicized numbers in the table obey 50=4·11+6, and every such little triangle of three adjacent numbers also works, such as 175=6·15+85, which allows you to easily expand the table.

So let Prob(R) be the probability that the bell rings R times, 1≤R≤N. Then

Prob(R) = |s(N,R)| / N!.

3. The variance in the number of maxima is

variance = ∑1≤R≤N [R-ha(N)]2 s(N,R) / N!

Nmean Rvariance in R
11.0.
23/2 = 1.51/4 = 0.25
311/6≈1.83333317/36≈0.47222222
425/12≈2.0833395/144≈0.6597222
23=82.7178571431.190435091
24=163.3807289931.796382460
25=324.0584951952.444327933
26=644.7438909043.114460402
27=1285.4331470933.795995088
28=2566.1243449634.483309527
29=5126.8165165355.173533687
210=10247.5091756725.865217691
211=20488.2020787726.557632865
212=40968.8951038977.250413939

The variance appears numerically to obey

variance = ha(N) - π2/6 + N-1 + O(N-2)

where π2/6≈1.6449340668482264, which if so would mean that the probability distribution is sharply peaked around its expectation, e.g. when N→∞ the probability→100% that the number of maxima lies within the interval [0.999, 1.001]·ha(N), or indeed any interval containing 1 can replace [0.999, 1.001].

This indeed is correct, as may be verified from the amazing exact formula

variance = harmonic(N,1) - harmonic(N,2)   where   harmonic(N,k) = ∑1≤j≤N j-k.

(And this may be shown by a disgusting inductive proof. I hope there is a nicer proof.) These variance results seem to be new, e.g. are not listed in the NIST compilation of facts about Stirling numbers.

We thank Tanya Khovanova, Joel Brewster Lewis, and Victor Miller for the hint. It turns out our "new" variance exact formula was stated by Ned Glick: Breaking records and breaking boards, Amer. Mathematical Monthly 85 (Jan. 1978) 2-26, on page 14. Glick traces the result that the number of maxima is (with probability→1) asymptotic to ln(N), to Alfred Renyi 1962; Renyi also recognized the connection to Stirling numbers. Cris Moore found nice proofs for the variance and third moment and his techniques actually allow expressing any particular moment in closed form, although it gets more mysterious the higher the moment.


Return to puzzles

Return to main page