The Debian leader elections (especially 2003)

This page was written in 2005.

Why Debian elections are currently uniquely important real-world data

Very few elections have all three of the following properties:

  1. they were conducted using rank-order preference ballots;
  2. all the data (i.e. all votes) are publicly available on the internet;
  3. they were large, consequential, and real.

So when we get such rare and treasured data (gold!) it is advisable to pay attention to it. The Debian Linux developers hold such elections annually. Indeed, these may be the only elections in the entire world with all these properties. (Yes, plenty of ranked-ballot elections are held all the time in, e.g., Australia, but the vote set is never, or certainly very rarely, made available on the internet. That may be intentional, to prevent anyone from noticing bad properties of their voting system or election. Later note: that was written in 2004. The situation now is somewhat better, since I know of about 10 IRV elections whose full data has been put on the internet, albeit in such a horrible format that I could not bring myself to analyse them. With time more may appear, and I would appreciate anybody taking the effort to convert them to a more reasonable data format. The Debian data was already in a reasonable format.)

These are pretty serious elections. The voters really care who wins, the pre-election debates between candidates are transcribed and published worldwide via the internet, and the votes are cast electronically through a special vote-by-internet software system.

In the 7 Debian-leader elections held so far, all 7 have featured a Condorcet Winner, who (by the laws of Debian) was therefore the election winner. (Also, every election method that starts with the "Smith set", the smallest nonempty set of candidates who each pairwise beat every candidate outside it, and then applies some other election method within that set, would yield the same winner in these 7 cases, since the Smith set was a singleton in all 7.) However, we can also consider who would have won under different election systems with the same votes. That is possible because for all the Debian elections after 2000 (i.e. 5 so far), the full vote dataset is publicly available.
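To make the Condorcet-winner and Smith-set notions concrete, here is a minimal Python sketch. It is not the Debian or ElectoWidget code; the ballot format (a dict mapping candidate to rank, with equal ranks allowed), the function names, and the toy candidates A, B, C are all assumptions made for illustration.

```python
from itertools import combinations

def pairwise_counts(ballots, candidates):
    """wins[a][b] = number of ballots ranking a strictly above b.

    Assumed ballot format: dict candidate -> rank (1 = best). Equal ranks
    are allowed; a candidate missing from a ballot is treated as last.
    """
    wins = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    worst = len(candidates) + 1
    for ballot in ballots:
        for a, b in combinations(candidates, 2):
            ra, rb = ballot.get(a, worst), ballot.get(b, worst)
            if ra < rb:
                wins[a][b] += 1
            elif rb < ra:
                wins[b][a] += 1
    return wins

def condorcet_winner(wins):
    """Candidate who beats every rival head to head, or None."""
    for a in wins:
        if all(wins[a][b] > wins[b][a] for b in wins[a]):
            return a
    return None

def smith_set(wins):
    """Smallest nonempty set whose members each beat every outsider."""
    def beats(a, b):
        return wins[a][b] > wins[b][a]
    # A candidate with the most pairwise wins always lies in the Smith set.
    copeland = {a: sum(beats(a, b) for b in wins[a]) for a in wins}
    members = {max(copeland, key=copeland.get)}
    changed = True
    while changed:
        changed = False
        for x in wins:
            # x must join if some current member fails to beat it.
            if x not in members and any(not beats(m, x) for m in members):
                members.add(x)
                changed = True
    return members

# Hypothetical toy ballots (not Debian data).
clear = [{"A": 1, "B": 2, "C": 3}] * 5 + [{"B": 1, "A": 2, "C": 3}] * 4
cycle = ([{"A": 1, "B": 2, "C": 3}] * 4 + [{"B": 1, "C": 2, "A": 3}] * 3
         + [{"C": 1, "A": 2, "B": 3}] * 2)
clear_wins = pairwise_counts(clear, ["A", "B", "C"])
cycle_wins = pairwise_counts(cycle, ["A", "B", "C"])
```

With the `clear` ballots, A is the Condorcet winner and the Smith set is the singleton {A}, which is the situation reported for every Debian election above. With the `cycle` ballots there is no Condorcet winner and the Smith set contains all three candidates, so the choice of method within the Smith set would matter.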

Rob Lanphier recently developed a beautiful piece of software called "ElectoWidget" for carrying out elections using a variety of possible voting systems. We converted the 2001-2005 Debian vote sets into ElectoWidget format and ran it. The results in full beautiful detail are here: 2001, 2002, 2003, 2004, 2005.

Our goal here is to summarize the highlights of these results.

How we converted the data to allow range and approval voting

Before we start, we need to mention a format issue. Range Voting wants the candidates to get scores, whereas Debian originally only asked voters for candidate orderings (with equalities allowed in the orderings). And Approval Voting and DMC Voting want candidates to be approved or disapproved.

So, we had to fake it. Here's the conversion recipe we used:

We then proceeded with the election.

Highlights (and lowlights) of the 2001-2005 Debian elections

1999: 4 candidates+NOTA (None Of The Above); Wichert Akkerman won.
208 valid votes.
2000: 4 candidates+NOTA; Wichert Akkerman won.
216 valid votes. Debian did not make the full vote dataset available for the 1999-2000 elections so we are unable to say more.
2001: 4 candidates+NOTA; Ben Collins won.
311 valid votes. Had this election been run with either IRV or plurality instead, Collins would still have won, but there is a near-tie (a one-vote difference in round 1) between Kumria and NOTA which could have delayed the IRV election, although fortunately in this particular election the outcome of that tie cannot matter, so that delay is avoidable. All methods agreed Collins won, except that Approval voting said Bdale Garbee won.
2002: 3 candidates+NOTA; Bdale Garbee won.
475 successful voters. 509 valid votes (plus 122 rejected as invalid). N.B.: Debian allowed voters to vote more than once, but only their last vote counted. (All methods agreed on the winner.)
2003: 4 candidates+NOTA; Martin Michlmayr won.
488 successful voters; 510 accepted votes (plus 200 rejected). This was the hardest of these elections to call. In the other 2001-2005 elections the methods listed below agreed on the winner, apart from the range and approval exceptions noted at the end of this entry. But in this election they called it as follows:

  Schulze-beatpaths: Michlmayr
  Plurality: Robinson
  IRV: Robinson
  DMC: Michlmayr
  Simpson-Kramer-minmax-margins: Michlmayr
  minmax-wv: Michlmayr
  Copeland: Michlmayr
  Range-Voting: Garbee*
  Approval-Voting: Garbee*
  Every Condorcet method: Michlmayr

(The asterisk marks that range and approval voting are not really applicable, since these ballots were not originally cast as range ballots; these results use our standard scheme for converting the ballots into range form, which can only approximate what the voters would have done. Range and Approval also get a different winner in 2005, and Approval gives Garbee the win in 2001.)
2004: 3 candidates+NOTA; Martin Michlmayr won.
482 successful voters. 506 accepted votes (plus 52 rejected). All methods agree.
2005: 6 candidates+NOTA; Branden Robinson won.
504 successful voters. 531 accepted votes (plus 69 rejected). All methods agree Robinson won, except range and approval voting, which say Anthony Towns won (with our standard method for converting the ballots into range ballots). The range victory was very narrow, with Towns getting a total range vote of 1975 versus Matthew Garrett coming in a close second with 1969. However, Robinson with 1903 was a distant third. Altering only a few votes could change the range-winner to Garrett, but a fairly large number of votes would need to change to give the victory back to Robinson. Robinson was the Condorcet winner, with his closest matchup being against Towns, whom he beat pairwise 245 to 222, i.e. 12 votes would need to change to alter Debian's Condorcet winner. Approval again gave Towns the win with 389 approvals, with Garrett second on 382 and Robinson third on 375. Thus at least 14 approval votes (and probably a good deal more) would have needed to change to move the approval-victory from Towns back to Robinson. All this suggests that perhaps Towns was the clearest winner, in which case again range was more-right than the other methods.
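The margin arithmetic in the 2005 entry can be checked with a short Python sketch. The vote totals are the ones quoted above; the helper names `flips_to_reverse` and `gap_to_leader` are hypothetical, and "flip" here assumes a voter who switches sides in a pairwise contest, which moves the margin by two.

```python
def flips_to_reverse(winner_votes: int, loser_votes: int) -> int:
    """Fewest voters who must switch sides for the loser to win outright.

    Each switch takes one vote from the winner and gives one to the loser,
    so k switches are enough once loser + k > winner - k.
    """
    margin = winner_votes - loser_votes
    return margin // 2 + 1

def gap_to_leader(totals: dict) -> dict:
    """Points (or approvals) each candidate trails the leader by."""
    leader = max(totals.values())
    return {name: leader - score for name, score in totals.items()}

# Figures quoted in the 2005 entry.
pairwise_flips = flips_to_reverse(245, 222)   # Robinson vs. Towns
range_gaps = gap_to_leader({"Towns": 1975, "Garrett": 1969, "Robinson": 1903})
approval_gaps = gap_to_leader({"Towns": 389, "Garrett": 382, "Robinson": 375})

print(pairwise_flips)   # 12
print(range_gaps)       # {'Towns': 0, 'Garrett': 6, 'Robinson': 72}
print(approval_gaps)    # {'Towns': 0, 'Garrett': 7, 'Robinson': 14}
```

The gaps are raw differences in totals; how many ballots must change to close them depends on how much each changed ballot moves the totals, which is why the text says only "at least" 14 for approval.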

Real-world lessons learned from this real-world data

IRV came out looking very bad in the Debian 2001-2005 elections, the only 5 with full data available. In 2003 IRV encountered a nightmare in which just about everything bad that could happen actually did happen, and in 2001 it also suffered a one-vote near-tie which (fortunately) did not result in a nightmare in this case, but certainly represented a "close shave."
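To show why a one-vote gap in an early IRV round is worrying, here is a minimal Python sketch of an instant-runoff count that records, for each round, the gap between the two lowest candidates. It is an illustration only: it assumes strict rankings (Debian ballots allow equal ranks, which this ignores), a nonempty ballot list, and an arbitrary alphabetical tie-break, and the ballots are hypothetical, not Debian data.

```python
from collections import Counter

def irv(ballots):
    """Instant-runoff count.

    ballots: list of lists, each a strict preference order (best first).
    Returns (winner, rounds); each round is (tally, gap), where gap is the
    vote difference between the two lowest candidates in that round.
    """
    remaining = {c for ballot in ballots for c in ballot}
    rounds = []
    while True:
        tally = Counter({c: 0 for c in remaining})
        for ballot in ballots:
            # Each ballot counts for its highest-ranked surviving candidate.
            top = next((c for c in ballot if c in remaining), None)
            if top is not None:
                tally[top] += 1
        # Lowest first; exact ties broken alphabetically (an assumption).
        ranked = sorted(tally.items(), key=lambda kv: (kv[1], kv[0]))
        gap = ranked[1][1] - ranked[0][1] if len(ranked) > 1 else None
        rounds.append((dict(tally), gap))
        leader, leader_votes = ranked[-1]
        if 2 * leader_votes > sum(tally.values()) or len(remaining) == 1:
            return leader, rounds
        remaining.remove(ranked[0][0])

# Hypothetical ballots with a one-vote gap at the bottom of round 1.
toy = [["A", "C", "B"]] * 6 + [["B", "C", "A"]] * 4 + [["C", "B", "A"]] * 3
winner, rounds = irv(toy)
```

In this toy election C is eliminated by a single vote and B goes on to win; had B been eliminated instead, C would have won. That is the kind of sensitivity a one-vote elimination gap can produce, although in the 2001 Debian count the text above notes that the outcome did not depend on it.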

At least with our standard scheme for converting the Debian ballots to range ballots, range and approval voting each gave a different winner in 2003 and in 2005 than every other method we tried. (And in 2001, Approval's winner differed from Range's.) If those converted ballots had really been what the Debian voters intended, then I think there would be a good case that all those other methods gave the wrong winners in both those years. This demonstrates that intensity-of-preference information really does matter. Often. In real life.

The other methods entirely disregard intensity information; only range takes it into account. That seems to indicate range is quite superior in practice to all the other methods. Would it really be best for Robinson to win if, say, Towns and Robinson were each preferred over the other by about 250 voters, but Towns's supporters preferred him by a lot and Robinson's supporters preferred him only by a little?
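The hypothetical in the previous paragraph can be worked through numerically. The Python sketch below uses invented numbers (249 and 251 voters, scores on an assumed 0-9 scale); they are not the Debian ballots and only illustrate how a pairwise winner and a range winner can differ.

```python
# Hypothetical voter blocs: (number of voters, scores on an assumed 0-9 scale).
blocs = [
    (249, {"Towns": 9, "Robinson": 2}),   # prefer Towns by a lot
    (251, {"Towns": 5, "Robinson": 6}),   # prefer Robinson by a little
]

def pairwise(blocs, a, b):
    """(voters scoring a above b, voters scoring b above a)."""
    a_over_b = sum(n for n, s in blocs if s[a] > s[b])
    b_over_a = sum(n for n, s in blocs if s[b] > s[a])
    return a_over_b, b_over_a

def range_totals(blocs):
    """Sum of all scores given to each candidate."""
    totals = {}
    for n, scores in blocs:
        for cand, score in scores.items():
            totals[cand] = totals.get(cand, 0) + n * score
    return totals

head_to_head = pairwise(blocs, "Robinson", "Towns")
totals = range_totals(blocs)

print(head_to_head)   # (251, 249)
print(totals)         # {'Towns': 3496, 'Robinson': 2004}
```

Under these assumed scores Robinson wins the head-to-head count 251 to 249, so any method that looks only at rankings picks him, while the range totals favour Towns by a wide margin.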
