Puzzle #109: Information content of ballots

Puzzle:
The number of possible approval-style ballots in an N-candidate election, is 2N; but 2N-2 if the two "silly" (all-approve or all-disapprove) ballots are excluded. The number of possible rank-order-style ballots in an N-candidate election, is N!.

  1. Compare these formulas. Which is greater?
  2. Suppose we have "tactical" (rather than "honest") voters in an election where the two "front runner" candidates are known; and suppose the tactical voters always rate these two candidates maximum or minimum (i.e. never rating them intermediate). Now what are the counting formulas, and which is greater?
  3. Finally, suppose we have tactical voters who employ the "moving average strategy." That is: there is a publicly known ordering of the candidates from a priori most-likely-to-win to a priori least-likely-to-win. The voters begin by awarding the two frontrunners the max and min allowed votes, then go through the remaining candidates in decreasing order of victory-chances, awarding each the max- or min-allowed-vote depending on whether the voter regards him as better than average or worse than average where the "average" is taken among the preceding candidates only. Now what are the counting formulas, and which is greater?
  4. How could you encode a rank-order ballot Gore>Kennedy>Bush using 3 bits? In the other direction, suppose you knew Gore, Kennedy, and Bush were the three candidates; how could you decode the binary number 010 to determine what the rank-order ballot was?
  5. Further, show how to encode and decode three independent rank-order ballots (honest voters) using only 8 bits in all (for an average of only 8/3≈2.667 bits per ballot). More generally, anything which has only W possible values, can be encoded using log2W bits per thing (if there is a long sequence of them) and using ⌈log2W⌉ bits (if there is only one).

Can you draw any "take-home lessons" from this exercise?

Answer

A. When N≤3, there are more approval ballot-types; but when N≥4 there are more rank-order ballot types. When N≤3 the non-silly approval and rank-order counts are exactly equal.

B. The number of tactical approval votes then is 2N-1 while the number of tactical rank-order ballot types is 2·(N-2)!. The approval count is greater when N≤5 but the rank-order count is greater when N≥6. For example, when N=5 we have 16>12, while when N=6 we have 32<48.

C. The number of moving-average-strategy tactical approval votes then is 2N-1 while the number of same-strategy rank-order ballot types is 2N-2 if N≥3. The approval count is always greater, for every N≥3. (Equality when N=2.)

D. Consider the 6=3! possible rank-order ballots in lexicographic order:

B>G>K, B>K>G, G>B>K, G>K>B, K>B>G, K>G>B.

Since Gore>Kennedy>Bush is the 4th ballot, we encode it with the 4th 3-bit binary number (sequence is 000, 001, 010, 011, 100, 101) namely "011." Going in reverse, the ballot "010" is the 3rd hence the rank-order ballot must be G>B>K. (This encoding and decoding scheme works without error provided the encoder and decoder both know how to spell the names of the candidates, so that they can both order them lexicographically.)

E. Consider the 216=63 possible ordered triples of rank-order ballots in lexicographic order. Any such triple can be encoded as a binary 8-bit number since 28=256>216. [This sort of thing is standard "information theory" dating to Claude Shannon in the 1940s, if not earlier.]

Lesson. Approval votes actually provide more information than rank-order votes if N≤3, if we have tactical voters and N≤5, or, if we have tactical moving-average-strategy voters, for all N. One might naively have thought that (since N!>2N for N≥4) rank-order ballots provided more information. However, actually, which provides more information depends on the amount of voter honesty/strategy and on the number N of candidates, and with, e.g, values of N from the real-world elections, it often is approval ballots which provide more information. The "moving average strategy" actually is best strategy for certain rank-order voting methods (e.g. Borda) under certain probabilistic models of the other voters. Of course, you might instead want to use other voting strategies, but probably the moving-average-strategy is not unusual as far as the entropy of the ballots it produces is concerned, among fully-spelled-out voting strategies attempting to be "best."

Incidentally, ballots which are distributed in a non-uniform manner, with greater probabilities for some than others, may be encoded using -∑kpklg(pk) bits (per-ballot average if a long sequence of ballots are being encoded) where lg(x)=log2(x) and where pk is the probability of the kth ballot-type. This is one of Shannon's comparatively easy results and may be proven by

  1. Constructing appropriate binary encoding/decoding schemes. (E.g. "Huffman coding" of "n-grams" works.)
  2. Arguing using combinatorial upper bounds (related to "Stirling's formula") that it is impossible to use asymptotically fewer bits.

Consequently, if, say, voters were usually tactical, but occasionally honest, then the information content of a ballot would be nearly the same as the all-tactical formula. Also if only a few kinds of ballots were extremely common with all other types rare, then ballots could be compressed down to very few bits on average.

Note from Prof. Steven J. Brams: For related combinatorial calculations, see also table on p28 of my book Approval Voting.


Return to puzzles

Return to main page