Spoilage and Error Rates with Range Voting versus other Voting Systems


Theoretical expectations

With plurality voting, your ballot can be invalidated by voting for two candidates (overvoting).

With instant runoff voting, there are more ways to go wrong. E.g. you can co-rank two candidates at any level of the ranking, skip a ranking-level, etc. So it seems "obvious" there are going to be higher ballot spoilage rates with IRV than with plurality.

With range and approval voting, there are fewer ways, in fact many would say no ways, to go wrong, because every way to vote or to fill out numbers within the permitted range, is a valid vote. So you similarly might expect range and approval voting to have lower spoilage rates than plurality.

All that, however, is just theory. The real proof of the pudding is the experiment. As PhD advisors are fond of saying to their poor graduate students, "Show me the data!"

Plurality (real data)

According to official plurality-vote totals: Florida's ballot spoilage rate in 2000 was 3%, and for the US nationwide 2000 presidential election, 1.9 million ballots were spoiled and hence uncounted versus 105 million that were counted, for a spoilage rate of 1.8%.

However, the distribution of invalid ballots in the USA is uneven. USA Today reported that voters in Florida's majority-black precincts were about four times as likely to have their 2000 ballots invalidated as voters in white precincts: 8.9% versus 2.4%. Among the 100 precincts with the highest numbers of disqualified ballots, 83 were majority-black. Allan Lichtman (history professor at American University) conducted a study of ballot rejection rates in Florida for the US Commission on Civil Rights. He found that overall, there was an enormous difference between the rates at which white votes and African-American votes were counted in Florida. When one looks at the variation in ballot spoilage rates across Florida counties, about one-fourth of the variation can be explained solely by knowing how many African American voters were registered there. Controlling for the number of high school graduates and literacy failed to diminish this relationship. For the entire state, the rate of spoiled ballots for African Americans was 14.4% while it was 1.6% for non-African Americans. The US Commission on Civil Rights subsequently claimed that, in 2000 Florida, 54% of the ballots discarded as "spoiled" were cast by African Americans, who were only 11% of the voters.

And here are some spoilage rates from other countries:

Country %invalid
Mexico 2006 presidential (5 major candidates) 2.16%
Yugoslavia 2000 (5 major candidates) 3.03%
Taiwan 2004 2.5%
Taiwan 2000 1%
Russia 1996 1.0 to 1.6%

In California, 1.59% of ballots for governor went uncounted in 1997, and 1.8% of presidential ballots went uncounted in 2000, but only 0.97% of ballots for governor went uncounted in 2001. The 2001 decrease was attributed to new voting machine protocols which immediately reported invalid ballots to voters to give them the option of correcting them.

Range (real data)

Although in our 2004 range voting exit-poll study (#82 here) we did not collect enough data to get a statistically significant prediction of ballot-spoilage rates, nevertheless we shall report our data for whatever it is worth. (It is possible to draw some statistically significant conclusions.)

Out of 54 range vote ballots in which we demanded the voters fill in every score slot with a numeric score (no Xs – intentional blanks – allowed) one score on one ballot was unclear. That is a total ballots-with-problems rate of 1.9%. However, by using all the other 6 scores on that 7-candidate ballot and regarding the unclear one as an X, our unused-score rate is only 1 in 54×7 = 378 scores, i.e. 0.26%.

Out of 68 range vote ballots in which we allowed blank scores, there were zero errors (but 5 voters refused to fill in their ballots at all for privacy reasons). That is a total ballots-with-problems rate of 0%. (Of course, it makes sense that allowing blanks would reduce the error rate.)

Despite the paucity of our data, it is actually possible to turn the per-entry error figure into a statistically significant (though weakened) conclusion by using Poisson statistics. The calculations are as follows. First, our observed per-entry range error rate, based on a single error in 117 ballots each with 7 entries, was 1/(117×7), which is a factor of 14.8 smaller than the US nationwide 1.8% plurality per-race spoilage rate. Second, if our true error rate were 3 (or fewer) times smaller than the plurality rate, the expected number of errors would have been at least $\lambda = 4.93$, so the probability that we would have observed at most one error would have been at most $(1+\lambda)e^{-\lambda} = 5.93\,e^{-4.93} \approx 4.2\%$. Third, if our true error rate were 2 (or fewer) times smaller than the plurality rate, then $\lambda = 7.4$ and the probability that we would have observed at most one error would have been at most $8.4\,e^{-7.4} \approx 0.51\%$. Therefore, with 96% confidence we can say that the range per-entry error rate is at least three times smaller than the plurality per-race error rate, and with 99.5% confidence we can say that the range per-entry error rate is at least twice as small as the plurality per-race error rate.
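The Poisson arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and variable names are made up for illustration, and the inputs are the rounded figures quoted in the text, so the recomputed means land near, not exactly on, the 4.93 and 7.4 used above.

```python
import math

def prob_at_most_one(lam: float) -> float:
    """P(X <= 1) for a Poisson count X with mean lam: (1 + lam) * exp(-lam)."""
    return (1.0 + lam) * math.exp(-lam)

# Inputs quoted in the text.
ENTRIES = 117 * 7        # 117 usable ballots, 7 scores each
PLURALITY_RATE = 0.018   # US 2000 per-race spoilage rate (rounded)

# Observed per-entry error rate, and how many times smaller it is
# than the plurality per-race rate.
observed_rate = 1 / ENTRIES
observed_factor = PLURALITY_RATE / observed_rate

# Tail probabilities for the two Poisson means used in the text.
p_factor_3 = prob_at_most_one(4.93)
p_factor_2 = prob_at_most_one(7.4)

# The same means recomputed from the rounded inputs (assumption:
# expected errors = entries * plurality rate / factor).
lam_3 = ENTRIES * PLURALITY_RATE / 3
lam_2 = ENTRIES * PLURALITY_RATE / 2

print(f"observed factor: {observed_factor:.1f}")
print(f"P(at most 1 error | 3x smaller): {p_factor_3:.2%}")
print(f"P(at most 1 error | 2x smaller): {p_factor_2:.2%}")
print(f"recomputed means: {lam_3:.2f}, {lam_2:.2f}")
```

Because both tail probabilities fall below 5%, the "at least three times smaller" and "at least twice as small" statements survive small changes in the rounded inputs.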

Why does range-voting have a much smaller per-entry error rate than plurality's per-race error rate? One might conjecture that the very "complexity" of having to fill in a number for each candidate makes the voter pay more attention; another conjecture would be that the repetition involved – you fill in a number next to every candidate – makes errors less likely. (In contrast, in many US punch-card-based plurality elections, the voter has to find punch-hole 53, randomly located somewhere on a card, corresponding to a candidate on a separate ballot where it says: "punch hole 53." This is a one-time, error-prone operation.) Range voting may seem more "complicated" than plurality at first glance, but the data indicate that this impression is misleading: the so-called complexity in this case actually helps voters.

Later note: much more range voting data – about 1800 ballots – is now available thanks to the French Range Voting study. It entirely confirms all the conclusions discussed here from our own smaller studies!

Instant Runoff Voting in Australian House of Representatives Elections 2004 (and Irish Presidential elections) – real data

Australian House of Representatives elections have used IRV since 1918. Australia has conducted more IRV elections than any other country. Official data from recent elections is available from http://elections.uwa.edu.au/. For the House of Representatives election of 9 Oct. 2004, that site reports the following percentages of invalid votes:

Place %invalid
Victoria 4.10%
Northern Territory 4.45%
New South Wales 6.12%
Queensland 5.16%
Western Australia 5.32%
South Australia 5.56%
Tasmania 3.59%
Australian Capital Territory 3.44%

Ireland holds its presidential elections using the IRV system. These are low-interest events because the position is largely powerless and ceremonial in nature, and often a single candidate runs unopposed. (Terms are 7 years.) In the 1997 election, which was the last contested one (according to http://www.electionsireland.org and the Binghamton elections archive), Mary McAleese won a 5-candidate race. (In the first rounds 3 candidates were eliminated, and then McAleese beat Banotti in the final round by 706259 to 497516.) The total number of votes which failed to transfer to these two final candidates, i.e. which were not counted in the final round, was 66061 out of 1269836 cast, i.e. 5.2%. Also, there were 0.80% "spoilt" ballots (which could not even be used in the first round), for a total of 6.0% of ballots unusable at some stage of the election process. However, this 6.0% figure is not really a true measure of the rate of invalid ballots, because some of those 5.2% of voters perhaps intentionally declined to specify a preference between McAleese and Banotti, as opposed to having their ballots discarded due to an error.

In 1990, the same websites report, Mary Robinson won a 3-candidate election with 1574651 "valid" and 9444 "spoilt" votes, which is a tiny 0.60% spoilage rate. But in addition to these "spoilt" votes there were 25548 votes which failed to transfer, i.e. failed to specify a preference between the final contenders Robinson and Lenihan. That is 1.62%, which makes 2.22% when both kinds of spoilage are summed. All of these 1990 and 1997 numbers were larger than the exceedingly tiny spoilage rate (0.56%) in the 1973 election, which effectively was a plurality election since there were only 2 candidates.
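The Irish percentages can be reproduced from the vote counts quoted above. The sketch below uses illustrative variable names, and it assumes the 1990 rates are taken relative to the 1574651 "valid" votes, since that choice matches the quoted 0.60% and 1.62%.

```python
def pct(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# 1997: votes in the final round versus votes cast (figures from the text).
mcaleese_final = 706259
banotti_final = 497516
cast_1997 = 1269836
non_transfer_1997 = cast_1997 - (mcaleese_final + banotti_final)
non_transfer_pct_1997 = pct(non_transfer_1997, cast_1997)
# The 0.80% "spoilt" figure is quoted directly in the text.
unusable_pct_1997 = non_transfer_pct_1997 + 0.80

# 1990: assumption - rates are relative to the "valid" vote count.
valid_1990 = 1574651
spoilt_1990 = 9444
non_transfer_1990 = 25548
spoilt_pct_1990 = pct(spoilt_1990, valid_1990)
non_transfer_pct_1990 = pct(non_transfer_1990, valid_1990)
combined_pct_1990 = spoilt_pct_1990 + non_transfer_pct_1990

print(f"1997 non-transferred: {non_transfer_1997} ({non_transfer_pct_1997:.1f}%)")
print(f"1997 unusable at some stage: {unusable_pct_1997:.1f}%")
print(f"1990 spoilt: {spoilt_pct_1990:.2f}%, non-transferred: {non_transfer_pct_1990:.2f}%")
print(f"1990 combined: {combined_pct_1990:.2f}%")
```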

Conclusion: Australian IRV ballots have far higher invalidity rates than the plurality ballots of the USA and the other countries above: every Australian area listed here did worse than every plurality country listed there. Also, although our Irish data may not be exactly what one would want, as far as it goes it also indicates larger invalid-ballot rates in IRV than in plurality contests within Ireland.
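The "every area did worse than every country" claim amounts to checking that the smallest Australian rate exceeds the largest plurality rate. A minimal sketch follows; the dictionary names are illustrative, the Russian range is represented by its upper end (1.6%), and the US figure is the nationwide 1.8% quoted earlier.

```python
# Invalid-ballot percentages copied from the tables in this article.
australia_irv_2004 = {
    "Victoria": 4.10,
    "Northern Territory": 4.45,
    "New South Wales": 6.12,
    "Queensland": 5.16,
    "Western Australia": 5.32,
    "South Australia": 5.56,
    "Tasmania": 3.59,
    "Australian Capital Territory": 3.44,
}
plurality = {
    "USA 2000": 1.8,
    "Mexico 2006": 2.16,
    "Yugoslavia 2000": 3.03,
    "Taiwan 2004": 2.5,
    "Taiwan 2000": 1.0,
    "Russia 1996": 1.6,  # upper end of the quoted 1.0 to 1.6% range
}

# Lowest IRV rate and highest plurality rate, with the places they come from.
best_irv_place = min(australia_irv_2004, key=australia_irv_2004.get)
worst_plurality_place = max(plurality, key=plurality.get)
every_irv_worse = (
    australia_irv_2004[best_irv_place] > plurality[worst_plurality_place]
)

print(f"lowest IRV rate: {best_irv_place} {australia_irv_2004[best_irv_place]}%")
print(f"highest plurality rate: {worst_plurality_place} {plurality[worst_plurality_place]}%")
print(f"every IRV area worse than every plurality country: {every_irv_worse}")
```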

Instant Runoff Voting (versus Plurality) in San Francisco 2004 – real data

Table 1. Valid votes, overvotes and undervotes (also known as DROP-OFF) in IRV races:
Final official results from the SF Dept of Elections (www.sfgov.org/elections)

District  Total voters  Overvotes      Undervotes/drop-off  Invalid ballots  Total valid ballots
1         30,721        156 (0.5%)     1,778 (5.8%)         1,934            28,787
2         39,462        95 (0.3%)      4,879 (12.4%)        4,974            34,488
3         28,317        74 (0.3%)      2,338 (8.3%)         2,412            25,905
5         39,255        394 (1.1%)     3,752 (9.6%)         4,146            35,109
7         34,905        236 (0.7%)     3,030 (8.7%)         3,266            31,639
9         26,275        172 (0.7%)     1,235 (4.7%)         1,407            24,868
11        24,902        219 (0.9%)     1,507 (6.1%)         1,726            23,176
Total     223,837       1,347 (0.60%)  18,519 (8.3%)        19,866           203,971

(Overvotes + Undervotes/drop-off = Invalid ballots)

(Total voters – Invalid ballots = valid ballots) 
Overvote means a voter selected two or more candidates for their top IRV ranking.
Undervote/drop-off means voter ranked nothing on their ballot.

Table 2.  Undervotes/dropoff and overvotes in non-IRV (i.e. plurality) San Francisco races:
Based on official data released 5 Nov. 2004.  The report lacks about 80,000 absentee and provisional ballots that had not yet been counted.

Race            Voters     Undervotes/drop-off  Overvotes  % Overvote  Total valid ballots
President       283,462    0.9%                 312        0.1%        280,581
US Senate       283,462    7.0%                 273        0.1%        263,229
US Rep – 8      229,483    7.5%                 169        0.1%        212,047
US Rep – 12     53,979     11.4%                29         0.1%        47,776
State Sen – 3   160,873    13.0%                99         0.1%        139,826
State Ass – 12  122,445    15.9%                94         0.1%        102,910
State Ass – 13  161,017    12.0%                86         0.1%        141,551
Total           1,294,721  8.1%                 1,062      0.082%

Overvote means a voter selected two or more candidates for the same office. Undervote/drop-off means voter selected no candidate for that race. In either IRV or plurality voting, undervoting has the same effect as not voting at all in that race.

So from this we see that the overvote error rates in San Francisco – same place, same time, same voters, just IRV versus plurality races – ranged from 3 to 11 times higher with IRV than with plurality voting, typically 7 times higher. (If double-ranking a non-top candidate in IRV were also considered – we haven't – then IRV overvote error rates would have been even higher.) And this conclusion is fully statistically significant.
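To illustrate the size of the overvote gap, the sketch below compares the two "Total" rows with a pooled two-proportion z-test. This test is an illustrative choice, not necessarily the calculation behind the significance claim above. It also treats every voter-race pair as independent, which is only an approximation since the same voters appear in several races, and Table 2 is based on preliminary counts.

```python
import math

# "Total" rows from Table 1 (IRV) and Table 2 (plurality).
irv_overvotes, irv_voters = 1347, 223837
plu_overvotes, plu_voters = 1062, 1294721

irv_rate = irv_overvotes / irv_voters
plu_rate = plu_overvotes / plu_voters
ratio = irv_rate / plu_rate

# Pooled two-proportion z-statistic (assumption: independent observations).
pooled = (irv_overvotes + plu_overvotes) / (irv_voters + plu_voters)
std_err = math.sqrt(pooled * (1 - pooled) * (1 / irv_voters + 1 / plu_voters))
z = (irv_rate - plu_rate) / std_err

print(f"IRV overvote rate:       {irv_rate:.3%}")
print(f"plurality overvote rate: {plu_rate:.3%}")
print(f"ratio: {ratio:.1f}")
print(f"z-statistic: {z:.1f}")
```

A z-statistic in the dozens is far beyond any conventional significance threshold, so the gap would remain significant even under a much more conservative treatment of the dependence between races.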

Meanwhile, the undervotes and dropoffs probably mostly were "intentional" rather than "errors," but anyhow were comparable for both Plurality and IRV.

Conclusions

Based on our data, the ballot spoilage rate with range voting would be about half of what it now is with plurality, in terms of the count of ballots-with-problems, and about 1/14 of what it now is, in terms of the entries-affected rate. Here the "half" and "1/14" are not statistically significant conclusions, but the "1/14" claim is statistically significant at the 96% confidence level if it is weakened to "the range entry-affected error rate is at least three times smaller than the plurality per-race ballot-spoilage rate" and at the 99.5% level if it is further weakened to "the range entry-affected error rate is at least twice as small as the plurality per-race ballot-spoilage rate." (This all was written before the French RV study, which can be used to improve the significance of all that stuff greatly.) And keep in mind that our voters had never range-voted before in their lives and did not have the benefit of voting machines – with experience and machines, range error rates presumably would improve even further.

With IRV in San Francisco 2004, the ballot spoilage rates were 7 times larger than under plurality voting, and this is fully statistically significant. Australia in IRV races reports higher spoilage rates everywhere, than every entry in our collection of plurality countries' spoilage rates, but the increase is not a factor of 7, it is more like a factor of 2-to-6. One might conjecture this improvement versus San Francisco is due to the greater level of IRV experience the Australians possess.

In short: Range Voting is better than Plurality is better than IRV with respect to ballot spoilage rates.

Presumably Approval Voting also is better than plurality since "overvoting" is a form of ballot spoilage that is no longer possible. The only approval-voting spoilage-rate data I have – the French study which had 10 spoiled ballots out of 2597 for an extremely low (0.385 ± 0.122)% spoilage rate – certainly is consistent with that theory.
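The text does not say how the ± 0.122 figure was obtained. One plausible reading, offered here only as an assumption, is that it is one standard error of the spoiled-ballot count. The sketch below computes it two ways, as a Poisson count and as a binomial proportion, and both come out close to the quoted value.

```python
import math

# Figures quoted in the text for the French approval-voting study.
spoiled = 10
ballots = 2597

rate = spoiled / ballots

# Assumption 1: the count is Poisson, so its standard error is sqrt(count).
se_poisson = math.sqrt(spoiled) / ballots

# Assumption 2: each ballot is an independent Bernoulli trial.
se_binomial = math.sqrt(rate * (1 - rate) / ballots)

print(f"spoilage rate: {rate:.3%}")
print(f"standard error (Poisson):  {se_poisson:.3%}")
print(f"standard error (binomial): {se_binomial:.3%}")
```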

