Why Debian elections are currently uniquely important real-world data
Very few elections have all of the following properties:
the election was conducted using rank-order preference ballots,
all the data (i.e. all votes) are publicly available on the internet,
the election was large, consequential, and real.
So when we get such rare and treasured data – gold! – it is
advisable to pay attention to it.
The Debian Linux Developers hold such elections annually.
Indeed, these may be the only elections in the entire
world with all these properties.
(Yes, plenty of ranked-ballot elections
are held all the time, e.g. in Australia, but the vote set is never, or certainly very rarely, made
available on the internet.
That may be a deliberate policy intended to prevent anyone from noticing
bad properties of their voting system or election.
Later note: that was written in 2004. The situation now is somewhat better since
I know of about 10 IRV elections whose full data has been put on the internet, albeit
in such a horrible format that I could not bring myself to analyse them. With time more may
appear, and I would appreciate anybody taking the effort to convert them to a more reasonable
data format. Debian's format was already reasonable.)
These are pretty serious elections.
The voters really care who wins; pre-election
debates between the candidates are transcribed and published worldwide
via the internet, and the voting is conducted electronically via a special
vote-by-internet software system.
In the 7 Debian-leader elections held so far, all 7 have featured a
Condorcet Winner, who (by the laws of Debian) was therefore the
election winner. (Also, every election method that starts with the "Smith set"
(the smallest nonempty set of candidates each of whom pairwise beats every candidate outside the set)
and then applies some other election method within the Smith set would, in these 7 cases, yield
the same winner, since the Smith set in all 7 cases was a singleton.)
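A minimal sketch of how a Condorcet Winner can be looked for in ranked ballots. The function names and the toy ballots are hypothetical (they are not Debian data), and the sketch assumes that ranks count from 1 (best), that equal ranks are allowed, and that unranked candidates sit below every ranked one. When a Condorcet Winner exists, the Smith set is just that one candidate.

```python
from itertools import permutations

def pairwise_wins(ballots, candidates):
    """beats[a][b] = number of ballots ranking a strictly above b."""
    beats = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    worst = len(candidates) + 1  # assumption: unranked counts as below all ranked
    for ranks in ballots:
        for a, b in permutations(candidates, 2):
            if ranks.get(a, worst) < ranks.get(b, worst):
                beats[a][b] += 1
    return beats

def condorcet_winner(ballots, candidates):
    """Return the candidate who beats every rival pairwise, or None."""
    beats = pairwise_wins(ballots, candidates)
    for a in candidates:
        if all(beats[a][b] > beats[b][a] for b in candidates if b != a):
            return a
    return None

candidates = ["A", "B", "C"]
# Toy ballots with a Condorcet Winner (A beats B 5-2 and beats C 5-2).
with_winner = ([{"A": 1, "B": 2, "C": 3}] * 3
               + [{"B": 1, "A": 2, "C": 3}] * 2
               + [{"C": 1, "A": 2}] * 2)
# Toy ballots forming a cycle (A beats B, B beats C, C beats A).
cycle = ([{"A": 1, "B": 2, "C": 3}] * 3
         + [{"B": 1, "C": 2, "A": 3}] * 2
         + [{"C": 1, "A": 2, "B": 3}] * 2)

print(condorcet_winner(with_winner, candidates))
print(condorcet_winner(cycle, candidates))
```

In the second toy election no candidate beats all rivals, so a Smith-set-based method would have to fall back on its secondary rule; in the 7 Debian elections that never happened.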
However, it is also possible for us to consider who would have been the winner
under different election systems but with the same votes. That is possible because
in all the Debian elections after 2000 (i.e. 5 so far), the full vote dataset is
publicly available.
Rob Lanphier recently developed a beautiful piece of software called "ElectoWidget" for
carrying out elections using a variety of possible voting systems. We converted the 2001-2005
Debian vote sets into ElectoWidget format and ran the software on them. The results in full beautiful detail
are here: 2001, 2002, 2003, 2004, 2005.
Our goal here is to summarize the highlights of these results.
How we converted the data to allow range and approval voting
Before we start, we need to mention a format issue.
Range Voting wants the candidates to get scores, whereas Debian originally
only asked voters for
candidate orderings (with equalities allowed in the orderings).
And Approval Voting and DMC Voting want candidates
to be approved or disapproved.
So, we had to fake it. Here's the conversion recipe we used:
All unranked candidates were un-approved and pre-converted to coequal bottom rank.
A range of -N to +N was used in an N-candidate election (where NOTA is
included in the value N of the count).
NOTA ("None Of The Above," a pseudo-candidate who also was available in
all the Debian elections) always received a score of 0, if ranked.
All candidates above NOTA were given a score of (N+1-DebianRanking),
so that the top-ranked candidate (rank 1) gets N, the second N-1, and so on, in an
N-candidate election.
All candidates below NOTA were given a score of (NOTArank-DebianRanking)
or the minimum allowed score.
Thus a candidate one slot below NOTA was given -1, two slots below -2, etc.
NOTA and all candidates ranked equal or below NOTA
are un-approved; candidates ranked above NOTA
are approved.
We then proceeded with the election.
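The recipe above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code we actually ran: the function name is hypothetical, ranks are assumed to count from 1 (best), candidates above NOTA are scored N+1-rank so that the top rank receives N as the recipe describes, and "coequal bottom rank" is assumed to mean one rank below the worst rank used on the ballot. The example ballot is made up.

```python
def convert_ballot(ranks, candidates, nota="NOTA"):
    """Convert one ranked ballot into (range scores, approved set).

    ranks: dict candidate -> rank (1 = best, ties allowed);
           unranked candidates are simply omitted.
    """
    n = len(candidates)  # N counts NOTA as a candidate
    # Assumption: unranked candidates share one rank just below the worst used rank.
    bottom = max(ranks.values(), default=0) + 1
    full = {c: ranks.get(c, bottom) for c in candidates}
    nota_rank = full[nota]

    scores, approved = {}, set()
    for c, r in full.items():
        if r < nota_rank:
            # Above NOTA: positive score, and approved.
            scores[c] = n + 1 - r
            approved.add(c)
        else:
            # NOTA itself gets 0; each slot below NOTA costs one point,
            # never going below the minimum allowed score -N.
            scores[c] = max(nota_rank - r, -n)
    return scores, approved

candidates = ["Garbee", "Michlmayr", "Robinson", "Zadka", "NOTA"]
# Hypothetical ballot: Zadka is left unranked.
ballot = {"Garbee": 1, "Michlmayr": 2, "NOTA": 3, "Robinson": 4}
scores, approved = convert_ballot(ballot, candidates)
print(scores)
print(sorted(approved))
```

On this ballot Garbee and Michlmayr are approved with scores 5 and 4, NOTA gets 0, Robinson gets -1, and the unranked Zadka gets -2.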
Highlights (and lowlights) of the 2001-2005 Debian elections
1999: 4 candidates+NOTA (None Of The Above); Wichert Akkerman won.
2001: 311 valid votes. If this election is instead conducted via either IRV or plurality,
Collins still wins, but
there is a near-tie (a one-vote difference in round 1)
between Kumria and NOTA which could have delayed the IRV election
(although fortunately in this particular election
the outcome of that tie cannot matter, so that delay is avoidable).
All methods agreed Collins won, except that
Approval voting said Bdale Garbee won.
2002: 475 successful voters. 509 valid votes (plus 122 rejected as invalid).
N.B.: Debian allowed voters to vote
more than once, but only their last vote counted.
(All methods agreed on the winner.)
2003: 488 successful voters; 510 accepted votes (plus 200 rejected).
This was the hardest of these 7 elections to call;
in all the other elections from 2001 to 2005, the following methods all
agreed on the winner: Schulze-beatpaths, Plurality, IRV, DMC, Simpson-Kramer-minmax-margins,
minmax-wv, Copeland, Range-Voting, Approval-Voting,
and every Condorcet method. But in this election,
these methods called it for Michlmayr, Robinson, Robinson, Michlmayr,
Michlmayr, Michlmayr, Michlmayr, Garbee*, Garbee*, and Michlmayr respectively.
(The asterisk indicates that range and approval voting are
not really applicable, since these ballots
were not originally cast as range ballots; the result shown uses our standard scheme
for conversion of these ballots into range form, which can only approximate what
the voters would have done. Under that scheme, Range and Approval also get
a different winner in 2005, and Approval gives Garbee the win in 2001.)
Total votes tallied=488, which is 58.6% of all possible votes, according
to Debian.
The fact that 200 votes were rejected as invalid versus 510 accepted as valid,
even after 5 years of experience and the best software developers in the world voting
and programming, strikes me as very strong evidence that Debian's Schulze
beatpaths voting system is just too complicated. It perhaps also indicates that IRV
is likewise too prone to generating invalid votes.
A closer look:
According to Debian (and correcting some Debian typos),
of 710 ballots, 533 passed the "SIG" check, meaning 177 failed
it and hence were rejected. All 533 passed the "LDAP" check.
Then 23 "bad ballots" were rejected,
leaving 510 valid votes.
Of these, some were votes cast more than once by the same voter,
in which case the earlier editions were discarded; the final tally was of 488
votes, one per successful voter. (There may also have been, indeed probably were,
many "unsuccessful" voters who were unable to cast a valid vote.)
200/710 is a 28% rejection
rate, which is enormous. 23/533 is a 4.3% rejection rate, which is
also quite large by USA standards. However, a single persistent re-voter could
have caused this high rejection rate (a phenomenon which could not happen in USA elections)
so perhaps this means nothing.
I do not know what "SIG" and "LDAP" and "Bad Ballot" mean.
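The arithmetic in this closer look can be checked in a few lines. The variable names are ours; the counts are the ones quoted above.

```python
total_ballots = 710   # all ballots received
passed_sig = 533      # ballots that passed the "SIG" check
bad_ballots = 23      # ballots later rejected as "bad ballots"

failed_sig = total_ballots - passed_sig   # rejected at the "SIG" stage
valid_votes = passed_sig - bad_ballots    # accepted as valid
rejected = failed_sig + bad_ballots       # all rejections combined

overall_rate = rejected / total_ballots   # 200/710
bad_ballot_rate = bad_ballots / passed_sig  # 23/533

print(failed_sig, valid_votes, rejected)
print(f"{overall_rate:.1%} overall, {bad_ballot_rate:.1%} bad ballots")
```

This reproduces the 177, 510 and 200 figures and the roughly 28% and 4.3% rejection rates.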
Although Michlmayr was the Condorcet Winner, he only won pairwise versus Garbee
by 4 votes (228 versus 224), hence Michlmayr was close to not being a CW.
Branden Robinson had the most top-rank votes (158 sole-top votes out of 453 ballots
cast with somebody sole-top) and hence presumably was the Plurality winner. Robinson
was also the Instant Runoff Voting (IRV)
winner, provided ballots with duplicate rankings are truncated
as soon as the duplicate occurs and, more generally, provided
precisely the IRV rules recommended by the CVD are used.
There seemed to be a preference toward Michlmayr among Garbee
voters, and vice versa.
This helps explain why Branden Robinson lost, despite having the most first-place votes.
However, this also means the Robinson-supporters could have prevented Michlmayr
from being a Condorcet Winner, by agreeing among themselves to rate
Garbee>Michlmayr
a few percent more often. In that case Garbee
would have been the Condorcet Winner.
It thus is conceivable that Garbee
really "should" have won, but was prevented from doing
so as a result of misfired strategic voting by the Robinson (or Zadka) supporters aiming
to prevent one or the other from being a Condorcet Winner.
Rob Lanphier points out that if this single vote
Garbee > Michlmayr > Robinson > Zadka > NOTA
were eliminated from the picture, then Michlmayr would have won the IRV election
(despite the fact that this voter actually ranked Michlmayr above Robinson)!
In other words, this IRV election hinged on a single vote, and
exhibited a "no show paradox" – that voter would have been "better off" by
"not showing up"!
Rob Lanphier also points out that if this other single vote
Michlmayr > Garbee > Robinson > Zadka > NOTA
instead were eliminated from the picture, then Garbee would have won the IRV election
(despite the fact that this voter actually ranked Garbee above Robinson)!
In other words, this IRV election also hinged on a different single vote, making it,
in some sense, a 3-way near-tie as far as IRV is concerned. It also
exhibited a "no show paradox" here: this voter too would have been "better off" by
"not showing up"! Both of these votes also exhibit severe "non-monotonicity," i.e.
in both cases raising "Robinson" to top rank in these votes actually would
have caused Robinson to lose the IRV election!
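To make the "no show paradox" concrete, here is a minimal IRV sketch run on a made-up three-candidate election. The ballots and counts are hypothetical toy data (only the candidate initials R, M, G echo the 2003 race), the function name is ours, and the sketch simply refuses to break elimination ties rather than following any particular official rule. Two like-minded voters are used instead of one so that no tie-breaking is needed.

```python
def irv_winner(ballots):
    """IRV on ballots given as tuples of candidates, best first."""
    remaining = {c for b in ballots for c in b}
    while True:
        # Each ballot counts for its highest-ranked remaining candidate.
        tally = {c: 0 for c in remaining}
        for b in ballots:
            for c in b:
                if c in remaining:
                    tally[c] += 1
                    break
        active = sum(tally.values())
        leader = max(tally, key=tally.get)
        if 2 * tally[leader] > active or len(remaining) == 1:
            return leader
        low = min(tally.values())
        losers = sorted(c for c in remaining if tally[c] == low)
        if len(losers) > 1:
            # Assumption: no tie-break rule is modelled in this sketch.
            raise ValueError(f"elimination tie between {losers}")
        remaining.remove(losers[0])

# Toy electorate: G is eliminated first and G's voters elect M.
base = [("R", "M", "G")] * 5 + [("M", "R", "G")] * 4 + [("G", "M", "R")] * 3
# Two more voters who prefer M to R show up and vote G > M > R.
extra = [("G", "M", "R")] * 2

print(irv_winner(base))
print(irv_winner(base + extra))
```

Without the two extra voters M wins; with them, M is eliminated first and R wins, even though both extra voters ranked M above R. They would have been "better off" staying home.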
Ties and near-ties are very common in IRV elections because there is
an opportunity for one in every single round. (This election features an exact 146-146 tie
between Garbee and Michlmayr in round 3.) In Range Voting ties are inherently
far less common. It is also possible that, if this election had been held using
0-99 range voting, then voters would have used their (new) ability to express different
intensities of preference in such a way as to say, e.g. that they preferred Garbee
over Robinson by more than they preferred Michlmayr over Garbee. It is possible
that the range voters, by acting in this manner, would have yielded a clear victory
for somebody. We will never really know that since range-style votes were never collected.
But with our standard scheme for converting the ballots to range-style and
approval-style ballots,
Garbee came out the clear winner.
The Condorcet method Debian actually used would not have changed its winner
unless at least 4 (and probably 5) votes changed, so, at least in this
particular instance, it was less manipulable and less subject to "tie-crises" than IRV,
where everything hung on single votes.
With Range (-5 to +5 scale, NOTA=0), the totals were
Garbee=1776, Michlmayr=1672, Robinson=1560, NOTA=0, Zadka=332,
so that roughly 8 or more range votes would have needed to change to
alter the winner. Thus range, in this case, was considerably less manipulable
and yielded a clearer victory. This suggests range was more arguably
correct (at least with our converted ballots), so that the truest
winner was Garbee.
2005: 504 successful voters. 531 accepted votes (plus 69 rejected).
All methods agree Robinson won, except range and approval voting say
Anthony Towns
won (with our standard method for converting the ballots into range ballots).
The range victory was very narrow, with Towns getting a total range vote of
1975 versus Matthew Garrett coming in a close second with 1969. However,
Robinson with 1903
was well back in third place.
Altering only a few votes could
change the range-winner to Garrett, but a fairly large number of votes would need
to change to give the victory back to Robinson.
Robinson was the Condorcet winner, with his closest matchup being against Towns,
whom he beat pairwise 245 to 222, i.e. 12 votes would need to change
to alter Debian's Condorcet winner.
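The "12 votes" figure can be checked with a small helper. The function name is hypothetical, and the sketch assumes that a "change" means one voter reversing their preference between the two candidates (which shifts the margin by 2) and that the loser must end up strictly ahead.

```python
def reversals_to_flip(winner_votes, loser_votes):
    """Fewest pairwise preference reversals that put the loser strictly ahead."""
    margin = winner_votes - loser_votes
    # Each reversal moves one vote from winner to loser: margin shrinks by 2.
    return margin // 2 + 1

# Robinson versus Towns in this election: 245 to 222.
print(reversals_to_flip(245, 222))
```

With a margin of 23, eleven reversals still leave Robinson ahead 234 to 233, and the twelfth puts Towns ahead 234 to 233.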
Approval again gave Towns the win with 389 approvals and Garrett
second with 382, with Robinson coming in third with 375. Thus at least 14
approval votes (and probably a good deal more) would have needed to change to
move the approval-victory from Towns back to Robinson.
All this suggests that perhaps
Towns was the clearest winner, in which case
again range was more right than the other methods.
Real-world lessons learned from this real-world data
IRV came out looking very bad in the Debian 2001-2005 elections.
In these, the only 5 elections with full data available,
IRV encountered a nightmare in 2003, with just about everything horrible
that could happen actually happening; and in 2001 IRV also suffered a one-vote
near-tie which (fortunately) did not result in a nightmare in this case, but
certainly represented a "close shave."
At least with our standard scheme for converting the Debian ballots to range ballots,
range and approval voting
both gave different winners in both 2003 and 2005 than every other
method we tried. (And in 2001, Approval's winner differed from Range's.)
If those converted ballots had really been what the Debian voters intended, then
I think there would be a good case that all those other methods gave the wrong winners
in both those years. This demonstrates the fact that intensity of
preference information
really does matter. Often. In real life.
All the other methods entirely disregard intensity information; only
range takes it into account. That seems to indicate range is quite superior
in practice to all the other methods – would it really be best
if, say, Towns and Robinson
were each preferred over the other by about 250 voters, but with Towns always preferred
by a lot and Robinson by a little, for Robinson still to win?
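The scenario in that question can be sketched with made-up numbers. Everything below is hypothetical: the 0-99 scores, the 249/251 split, and the variable names are ours, chosen only to illustrate how a pairwise majority and a range total can point to different winners.

```python
# Hypothetical range ballots on a 0-99 scale.
# 249 voters prefer Towns by a lot; 251 prefer Robinson by a little.
ballots = ([{"Towns": 99, "Robinson": 0}] * 249
           + [{"Towns": 50, "Robinson": 60}] * 251)

# Pairwise comparison: only the direction of each preference counts.
prefer_towns = sum(b["Towns"] > b["Robinson"] for b in ballots)
prefer_robinson = sum(b["Robinson"] > b["Towns"] for b in ballots)
pairwise_winner = "Towns" if prefer_towns > prefer_robinson else "Robinson"

# Range voting: the sizes of the preferences count too.
totals = {c: sum(b[c] for b in ballots) for c in ("Towns", "Robinson")}
range_winner = max(totals, key=totals.get)

print(prefer_towns, prefer_robinson, pairwise_winner)
print(totals, range_winner)
```

Here Robinson wins the pairwise contest 251 to 249, while Towns wins the range total by a wide margin, because the ranking-only count throws away how strongly each voter felt.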