Warren D. Smith, Oct.2022.
Table of contents:
The rules of "Approval voting" are: each vote is the set of candidates that voter approves. The most-approved candidate wins.
Suppose there are C candidates (2≤C). Most voters act as follows to decide what approval-ballot to cast.
Or, essentially equivalently, they choose a number A, then approve the best A, disapproving the remaining C-A candidates.
Voters of this kind face only one QUESTION: What should my personal threshold be? Or equivalently: how many candidates should I approve? The answer:
Example: Suppose there are C=7 candidates, and you think the chances are
And suppose your personal utilities for them are UNix=-600, URea=+300, UCar=+100. Then your estimate of your expected utility of the winner is
So in this circumstance you'd approve everybody better than "-225." In particular you'd approve both Reagan and Carter, but disapprove Nixon.
Why is that strategy optimal in a large somewhat-random election? Because then your vote is helping candidates better than your expected winner-value to win, and only them. If your vote does not alter the winner, then you get your expectation. In the (unlikely in a large election) event that it does alter the winner, then you've guaranteed getting a winner better than your prior expectation. Because the election is large and the zillions of other voters act somewhat randomly, you cannot predict or control which candidate you are going to help and which one you are going to hurt – and most likely your vote will have no effect; and if it does have an effect, then it will (with overwhelming likeliness) hurt only one candidate and help only one. Which ones? That is controlled by the randomness, not by you. The only thing you can control is which set of candidates you want to help win, and you can make sure to help one better than expectation, guaranteeing your vote causes (in expectation) an improvement. The fact that any other choice of threshold would sacrifice some expected utility comes from this
LEMMA: For any set of numbers U1, U2, U3, ..., UC not all equal, and any set of probabilities {P1, P2, ..., PC} summing to 1, we maximize ∑k∈S ∑j∉S (Uk-Uj) Pj Pk by choosing the set S to be the indices k of those Uk that are above expectation, the expectation being E=∑1≤j≤C UjPj.
Remarks. Note that your optimum choice of threshold depends on how you think the other voters are going to act, and not solely on your own opinions of the C candidates. If you thought Nixon, Reagan, and Carter had no chance to win because the zillion other voters were all going to approve Roosevelt, Eisenhower, and Bush instead, then your personal approval threshold might be quite different.
Why do we care? We want to be able to compare theoretical predictions for how often voters will approve-3, approve-1, or whatever, versus data from real-world approval-voting elections. If the predictions disagree greatly with reality, then we'll know something is wrong.
Plan. We are going to examine a goodly number of simplistic models of what a "random voter" is in a C-candidate election. Each model shall assume there are a huge (essentially infinite) number of such voters voting; and in each model it will be equally likely, a priori, that any one of the candidates will win. We shall exactly solve (except in one case our solution will be only approximate) each model to determine the exact probability distribution of the number A of candidates such a voter would approve.
It will turn out in the end that there is a massive "unification" – almost all our solutions turn out to be corollaries of just one master theorem, which I'll call the GENERALIZED JANSON THEOREM. Furthermore, most of our models will predict that the usual number of approvals per ballot will be a bit less than C/2, because the distributions we get, although all different, tend to feature a single peak centered at about C/2 or a bit less.
I: 2-valued utilities from coin tosses: thresholding strategy ⇒ "honest" voting. Suppose a "random voter's" opinions are determined by tossing C fair coins. The utility of the Kth candidate to that voter then is "high" if the Kth coin came up "heads," and "low" otherwise (for two particular values "high" and "low" associated with that voter, it does not matter what they are, just that they exist and low<high).
In that case, each threshholding-strategy voter is going to approve all her highs and disapprove all her lows, and she will not care what the other zillion voters think, i.e. she will "vote honestly"!
And then, in a C-candidate election, the chance a random voter will approve exactly A out of the C candidates (0≤A≤C) is proportional to the (A+1)th entry, in the row containing exactly C+1 numbers, of "Pascal's triangle" of "binomial coefficients:"
1 1 1 1 2 1 1 3 3 1 1 4 6 4 1 1 5 10 10 5 1 1 6 15 20 15 6 1 1 7 21 35 35 21 7 1 1 8 28 56 70 56 28 8 1 1 9 36 84 126 126 84 36 9 1 1 10 45 120 210 252 210 120 45 10 1 1 11 55 165 330 462 462 330 165 55 11 1
More entries Z of this triangle may be generated from the two (X and Y) immediately above it using the stencil
X Y Z
where Z=X+Y. For example 35=20+15. The Rth row sums to 2R-1, for example 1+4+6+4+1=16=24.
Asterisk: In the cases A=0 and A=C of no- and all-approval, actually our voter literally does not care how she votes, so in practice she probably would not bother voting at all, in which case the first and last entries (both 1) in each row of Pascal's triangle would, for our purposes, never be used so that 1≤A≤C-1 always.
Some known facts about binomial coefficients.
There is a formula for the Kth entry in the Rth row:
II: Wrongly thresholding based on distribution-mean not your personal expectation: Now consider a different kind of "random voter." She samples C random numbers independently from some particular even-symmetric density (for example a "normal" or "uniform" density) – it does not matter which, and different voters could use different densities. The Kth random number is her utility for the Kth candidate. Assume a huge number of such voters. Then she approves the candidates she considers better than the central value of her density i.e. its distribution-mean.
Note, this is not really the optimum strategy, because the center of her distribution only usually approximately coincides with the expected value (to her) of the winner. But it resembles the optimum strategy.
This kind of voter again is going to approve A of the C candidates with chance proportional to the (A+1)th entry, in the row containing exactly C+1 numbers, of Pascal's triangle; and there will be the same asterisk as before about the A=0 and A=C cases.
III: 3-valued utilities from "throin" tosses: again thresholding strategy ⇒ "semi-honest" voting. Now suppose each "random voter" tosses C fair three-sided coins, aka "throins." Her utility for the Kth candidate is high, medium, or low depending on the Kth throin. (For three particular values "high," "medium," and "low" associated with that voter, it does not matter what they are, all we need is that they exist and low<medium<high.)
In that case, each threshholding-strategy voter is going to approve all her highs, disapprove all her lows, and either disapprove all, or approve all, the mediums – depending on whether she believes the expected value of the winner determined by the other zillion voters is above, or below, "medium." I.e, she will "vote honestly"!
If all the zillion other voters also employ the same methodology then there will be a 50-50 chance of
all-disapprove or all-approve on her mediums.
Due to the same asterisk
as before about the A=0 and A=C cases, let us assume 1≤A≤C-1.
Then the chance of approving exactly A of the C candidates will be proportional to this (integer-valued) formula:
4 9 9 20 24 20 45 60 60 45 102 150 160 150 102 231 378 420 420 378 231 520 952 1120 1120 1120 952 520 1161 2376 3024 3024 3024 3024 2376 1161 2570 5850 8160 8400 8064 8400 8160 5850 2570 5643 14190 21780 23760 22176 22176 23760 21780 14190 5643 12300 33924 57200 67320 63360 59136 63360 67320 57200 33924 12300
The sum of the entries in the (N-1)th row equals 3N-2N-1.
IV: Exact solution for thresholding voters whose utilities are random-uniform numbers. Now consider yet another kind of "random voter." She samples C random numbers independently from a uniform density. The Kth random number is her utility for the Kth candidate. (For example they each could be random real numbers between 0 and 1, albeit the endpoints do not need to be 0 and 1 and could depend upon the voter. E.g, we also could allow voters who first pick their favorite candidate F and their most-hated candidate H, then generate the utilities for the C-2 other candidates by sampling uniform random numbers in the interval with endpoints UH and UF.)
Assume a huge number ("infinity") of such voters. Our voter therefore assumes (correctly) that all C candidates are equally likely, a priori, to win. Hence she approves the candidates she considers better than the average of her C random numbers. And note, this really is optimum strategy.
Such voters approve A of the C candidates, 1≤A≤C-1, with chance proportional to the Ath entry, in the row containing exactly C-1 numbers, of Euler's triangle:
1 1 1 1 4 1 1 11 11 1 1 26 66 26 1 1 57 302 302 57 1 1 120 1191 2416 1191 120 1 1 247 4293 15619 15619 4293 247 1 1 502 14608 88234 156190 88234 14608 502 1 1 1013 47840 455192 1310354 1310354 455192 47840 1013 1
More entries Z may be generated from the two (X and Y) immediately above it using the stencil
X Y Z
where Z=aX+bY where Y and Z lie on the bth / diagonal (counting starting at the topmost /) while X and Z lie on the ath \ diagonal (counting starting at the topmost \). For example 4293=3×1191+6×120 because the rightmost 1191 and 4293 lie on the 3rd \ diagonal while 120 and 4293 lie on the 6th / diagonal. The Rth row contains R numbers and sums to R!, for example 1+26+66+26+1=120=5!.
Example: the chance a random voter will approve A=6 out of C=9 candidates is 4293/8!=4293/40320=477/4480.
Some known facts about Eulerian numbers. There is a formula for the Kth entry in the Rth row:
The entries in the Rth row of the Euler triangle arise from the coefficients of powers
of x in the full expansion of
The full expansion into a double power series of
This whole solution arises from
DOMINIQUE FOATA'S 1977 THEOREM ABOUT EULERIAN NUMBERS: If C≥1 random numbers are sampled independently from a uniform density on a real interval, then (C-1)! times the chance that exactly A of them exceed their mean, equals the Ath number on the row of the Euler triangle containing C-1 numbers.
This is proven in section 2 of Foata's paper "Distributions eulériennes et mahoniennes sur le groupe des permutations" (in French), pp.27-49 in Higher Combinatorics [Berlin 1977, M.Aigner, editor], Boston, Dordrecht, 1977. (This paper also was published translated into Russian; and it also contains a 1-page appendix "Eulerian Partitions of a Unit Hypercube" in English by Richard P. Stanley.)
Before unearthing Foata's theorem, I had conjectured it based on (i) approximate computer output, plus (ii) I knew all the probabilities had to be rational numbers, which appeared to become (and actually do become, in view of Foata's theorem) integers when multiplied by (C-1)!. But I could not see a proof. I then consulted Svante Janson, an expert on Eulerian numbers, who did find a proof. (It's impressive how simple the proof ended up being, considering how complicated our proof-ideas were when Janson & I started working on it.) Janson's proof method actually yields results which both improve over Foata's, and which "unify" all our model-solutions so far, and which yield additional exactly-solvable models.
GENERALIZED SVANTE JANSON THEOREM (2022) ABOUT EULERIAN NUMBERS: Sample C-1 random numbers Δ2, Δ3, ..., ΔC independently from any fixed density ρ(z) on reals z>0. Certain kinds of dependence also are permissible: we can sample the Δj subject to a constraint of the form "the sum of all the Δj is demanded to equal S" for any particular value S you want; and also "sum" could be replaced by "product," or any other "symmetric function"; and the demand "equal S" could be replaced by "lie between S1 and S2" for any particular values S1 and S2 you want.
Then generate C utility values U1, U2, U3, ..., UC, sorted into increasing order, from the Δj's as follows: (i) pick U1 arbitrarily, (ii) for j=2 to C let Uj=Uj-1+Δj, (iii) randomly permute the U's with all C! permutations equally likely. In other words, the Δj are the gaps between the sorted U's. Then the chance exactly A of the Uj's exceed their mean, equals the chance that the maximum value of K with 0≤K≤C-2 such that
is Kmax=A-1. (Also, of course Janson's theorem would remain valid if the word "exceed" were replaced by its opposite "are less than.")
PROOF: Wlog assume U1=0 and that we are using "are less than" and that the U's are sorted. Then Uj=∑2≤k≤j Δk. Let M equal the mean Uj, hence
The chance exactly A of the Uj (which necessarily are j=1,2,3,...,A) are less than M equals the chance exactly A of the CUj are less than CM, which occurs if and only if A is maximal such that
If we cancel out summands occuring on both sides of this inequality we see that happens if and only if if A is maximal such that
Since all the Δj are identically distributed and independent (more precisely: "exchangeable," aka "rearrangeable"), the chance of that equals the chance that the maximum value of K with 0≤K≤C-2 such that
equals A-1. Q.E.D.
COROLLARIES (SPECIAL CASES) OF GENERALIZED JANSON THEOREM:
Foata's theorem is the special case of Janson that arises when ρ(z) happens to be an exponential density, e.g. ρ(z)=e-z (which is what matters if the U's come from a uniform density) because of the following known result due to PIERRE-SIMON LAPLACE and JOHN RIORDAN (re-proven by Richard P. Stanley in his appendix in Foata 1977):
(SEVERAL VERSIONS OF) LAPLACE/RIORDAN/STANLEY THEOREM:
But there are other nice Janson corollaries:
I daresay more corollaries also could be obtained.
We now consider yet another kind of "random voter." She samples C random numbers independently from a normal density. The Kth random number is her utility for the Kth candidate.
The probability that she will approve exactly A among the C candidates, i.e the chance that exactly A among C random normal deviates exceeds their mean, is (C-1)!-1 times the Ath entry in the row containing C-1 numbers, of the following table:
1 1 1 1.0529 3.8942 1.0529 1.1742 10.8259 10.8259 1.1742 1.3839 26.7798 63.6726 26.7798 1.3839 1.717 63.375 294.908 294.908 63.375 1.717 2.23 148.09 1207.95 2323.45 1207.95 148.09 2.23 3.04 347.16 4623.9 15185.9 15185.9 4623.9 347.16 3.04 4.298 823.3 17046.4 88568 149996 88568 17046.4 823.3 4.298
I do not know any easy way to generate the entries in the above triangular array. (They should be expressible in closed form as a linear combination of Schläfli functions, but that's not easy.) My values should be accurate except for small errors in their last decimal place. The reason I have scaled the Rth row (note R=C-1) so that its sum equals R! is to permit easy comparison versus Euler's triangle (which arises using sampling from a uniform instead of normal density).
For any even-symmetric density we always get a bilaterally symmetric triangle of probabilities, which is enough to show that if C=3 then the chances of A=1 and A=2 are exactly equal (to ½).
Usually in real elections the voters both (i) act randomly and independently of each other, but also, very importantly, (ii) share a goodly amount of common thinking. So far, we only modeled part (i). Part (ii) typically causes a certain subset of candidates (the "no-hopers") to be viewed, by almost all voters, as having no chance to win. Only the remaining, often much smaller, subset of candidates have hope to win ("hopers"). In that case, we can approximately predict that the voters will act as though all the hopers have approximately equal winning chances, thus applying our previous models but with a smaller value of C, namely Chope, the number of hopers. Voters who regard the no-hopers as all being worse won't approve any of them; but voters who regard some of the no-hopers as better than the expected winner-value among the hopers, will approve those no-hopers too. Every voter will approve at least 1, and disapprove at least one, of the hopers. Hence the usual number of hoper-approvals on a ballot is predicted to be between 1 and Chope-1, and usually averaging a bit less than Chope/2.
And actually, of course, usually some hopers have more hope than others, which presumably leads to further diminution of the effective value of C.
Both are 1-bordered bilaterally symmetric triangular arrays of positive integers, and both arise as (different) special cases of the generalized-Janson theorem. And that theorem allows us to readily devise simple homotopies which "morph" Pascal triangle ↔ Euler triangle.
It is easy to see from the central limit theorem that the R binomial coefficients forming the Rth row of the Pascal triangle, regarded as the bar heights in a histogram (all bar widths=1), approach a "normal distribution" with variance=R/4. And the Laplace/Riordan/Stanley theorem similarly shows that the R Eulerian numbers forming the Rth row of the Euler triangle, regarded as the bar heights in a histogram, also approach a "normal distribution," but with the smaller variance=R/12.
Another set of analogous facts about the Pascal and Euler triangles are mentioned by Meng Li & Ron Goldman in their paper: Limits of Sums for Binomial and Eulerian Numbers and their Associated Distributions, Discrete Maths. 343,7 (July 2020) #111870.
So the two triangles seem in many ways quite analogous.
But Kevin S. Brown claimed that there is more going on here than a mere "analogy"! He discovered that the following ∞×∞ array of numbers exhibits both Pascal's binomial coefficients, and the Eulerian numbers, as one unified entity obeying X=(c+1)Z-(r+2)Y on stencils
Y X Z
where X and Z both lie in column c of the array (adjacently) while X and Y both lie in row r (adjacently), with c≥-2 and -∞<r<+∞.
c=-2 c=-1 c=0 c=1 c=2 c=3 c=4 c=5 c=6 c=7 c=8 c=9 r=-8 0 0 1 120 4293 88234 1310354 15724248 162512286 1505621508 12843262863 102776998928 r=-7 0 0 1 57 1191 15619 156190 1310354 9738114 66318474 423281535 2571742175 r=-6 0 0 1 26 302 2416 15619 88234 455192 2203488 10187685 45533450 r=-5 0 0 1 11 66 302 1191 4293 14608 47840 152637 478271 r=-4 0 0 1 4 11 26 57 120 247 502 1013 2036 r=-3 1 1 1 1 1 1 1 1 1 1 1 1 r=-2 1 0 0 0 0 0 0 0 0 0 0 0 r=-1 -1 1 0 0 0 0 0 0 0 0 0 0 r=0 0 0 1 0 0 0 0 0 0 0 0 0 r=+1 0 0 1 1 0 0 0 0 0 0 0 0 r=+2 0 0 1 2 1 0 0 0 0 0 0 0 r=+3 0 0 1 3 3 1 0 0 0 0 0 0 r=+4 0 0 1 4 6 4 1 0 0 0 0 0 r=+5 0 0 1 5 10 10 5 1 0 0 0 0 r=+6 0 0 1 6 15 20 15 6 1 0 0 0 r=+7 0 0 1 7 21 35 35 21 7 1 0 0 r=+8 0 0 1 8 28 56 70 56 28 8 1 0 r=+9 0 0 1 9 36 84 126 126 84 36 9 1
For example 102776998928=10×2571742175+6×12843262863, 7=7×28-9×21, 35=5×70-9×35, 0=0+0, 0=1-1, 1=(c+1)0+1, 0=0-(r+2)0 etc. All entries arise from (repeatedly applying) the stencil-relation starting with, e.g, the r=0 row and c=-1 column. (Or from any other row and column.) If desired, one may also extend the array to arbitrarily negative c (even though our picture stops at c=-2) but then will be forced to include rational noninteger entries.
This engenders the suspicion that there is some sort of "duality" lurking between the binomial coefficients and Eulerian numbers, and that many (most?) "binomial identities" may be converted to "dual forms" involving Eulerian numbers, and vice versa. As of 2022, that entire idea remains almost unexplored.
Dominique Foata: Eulerian Polynomials: from Euler's Time to the Present, Invited address at the 10-th Annual Ulam Colloquium, University of Florida, 18 February 2008. Published pp. 253-273 in "The Legacy of Alladi Ramakrishnan in the Mathematical Sciences" (K. Alladi, J.R. Klauder, C.R. Rao, eds.) Springer, New York Dordrecht Heidelberg London, 2010.