The following, by Warren D. Smith, is based on, and includes excerpts from, Nate Silver's 24 January 2011 New York Times blog post on the voting methods used to award the OSCAR film academy awards.
Silver's piece was pointed out by professional Instant Runoff Voting (IRV) propagandist Rob Richie, who in a Huffington Post piece (2013-01-11) described Silver's analysis as "fascinating." But as usual, Richie completely forgot to mention that Silver's analysis exposed considerable problems with the OSCAR awards' IRV system; indeed, IRV appears to have robbed Toy Story 3 of what should have been a historic victory, the first ever best picture award for an animated film. We'll give the details and explain 7 problems with IRV as they arise, one by one, during Silver's analysis.
Silver (although he does not realize he is doing so) compares the current OSCAR final-round voting system, based on instant runoff, with the score voting system used by Metacritic.com and by rottentomatoes.com's "tomato meter." These sites, and others such as the Internet Movie Database and Yahoo Movies, work by averaging ratings from large numbers of ordinary people and/or professional or amateur film critics.
Silver constructs a slightly artificial scenario with these 10 films as contenders: The Social Network, Black Swan, The Fighter, Inception, The King's Speech, The Kids Are All Right, True Grit, Toy Story 3, Blue Valentine, and Winter's Bone.
Silver actually predicted those 90% right – the actual nominees in 2011 were the same except that Blue Valentine was replaced by the survival story 127 Hours.
IRV Problem #1: Silver then observes that not all the critics reviewed all the movies. ("You think the Academy's voters have seen all of them either?" he asks.) This problem is trivial for score voting to handle; MetaCritic simply averages whatever scores it gets. But as Silver observes, it is not so easily handled by other voting systems like instant runoff (IRV), the system the OSCARs currently (2012; probably foolishly) use. Silver handles that by "restricting the analysis to those critics who reviewed at least half of them – this leaves us with a 40-person panel – assigning a lukewarm score of 65 (on 0-100 scale) to any movies that the critic bypassed."
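Score voting's trivial handling of incomplete ballots – average whatever scores actually exist – can be sketched in a few lines of Python (the film names and scores below are hypothetical, not Silver's data):

```python
def score_voting_winner(ballots):
    """Each ballot maps film -> score on a 0-100 scale. Films a critic
    skipped are simply absent; each film's average is taken over the
    scores actually submitted for it. No data discarded, none faked."""
    totals, counts = {}, {}
    for ballot in ballots:
        for film, score in ballot.items():
            totals[film] = totals.get(film, 0) + score
            counts[film] = counts.get(film, 0) + 1
    averages = {film: totals[film] / counts[film] for film in totals}
    return max(averages, key=averages.get), averages

# hypothetical ballots from three critics, each skipping some films
ballots = [
    {"Social Network": 95, "Toy Story 3": 92},                        # skipped King's Speech
    {"King's Speech": 88, "Toy Story 3": 99},                         # skipped Social Network
    {"Social Network": 97, "King's Speech": 94, "Toy Story 3": 90},
]
winner, averages = score_voting_winner(ballots)
```

Note the contrast with IRV: a missing score simply shrinks one film's denominator, rather than forcing the analyst to discard or fabricate ballots.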
It is rather sad that we need to begin by throwing out and/or faking a lot of our data-set (vote set). I daresay that Silver, as a statistics professional, absolutely hates to discard data when he does not really have to. But that is an example of the price you pay for having a stupid voting system.
IRV Problem #2: Silver then observes that there is "yet another problem: it was quite common for one or more of the critic's choices to have gotten the same score." Again, that is no problem at all for score voting. But, again, for stupid voting systems like instant runoff, this is illegal and hence a major problem. Silver "solves" this problem by "drawing lots" (i.e. random tie-breaking imposed by Silver). Again, it is sad to alter the vote-set, i.e. data-set, but he does so because he is forced to by the stupid voting system.
So now, Silver has 40 "voters" and for each has artificially constructed a 10-film preference-order ballot from their movie reviews plus (unfortunately) both random tie-breaking and fake-score insertion, altering and editing those votes to make them legal for the stupid OSCAR instant runoff process. (These changes were considerable. Silver gives as an example J. R. Jones's ballot, which involved 5 artificial random tie-breaks plus 2 fake scores among the 10 films!)
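Silver's two forced preprocessing steps – fill in a fake 65 for every unreviewed film, then break score ties by lot – might look like the following sketch (the film list is abbreviated and the scores hypothetical):

```python
import random

FILMS = ["Social Network", "Black Swan", "Toy Story 3"]   # abbreviated list

def scores_to_ranking(scores, films=FILMS, filler=65, rng=random):
    """Silver's two forced edits: insert a fake score (65) for any film
    the critic bypassed, then sort by score with ties broken by lot."""
    filled = {f: scores.get(f, filler) for f in films}
    order = list(films)
    rng.shuffle(order)                    # random order decides the ties...
    order.sort(key=lambda f: filled[f], reverse=True)  # ...because sort is stable
    return order
```

Shuffling before a stable descending sort means equally-scored films land in random relative order, which is exactly "drawing lots."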
Silver now runs that instant runoff process. As a result, the films get eliminated one by one in this order:
IRV Problem #5: Complexity. Note how complicated the above process was, as compared to simply picking greatest average score as the winner as in score voting.
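To make the complexity contrast concrete, here is a minimal sketch of the IRV elimination loop (the ballot data in the test below is made up, not Silver's); compare it with score voting's single "compute one average per film" step:

```python
from collections import Counter

def irv_winner(ballots):
    """ballots: list of strict preference orders, best film first.
    Each round, delete the candidate with the fewest top-place votes;
    ballots then transfer to their next surviving choice."""
    candidates = {c for b in ballots for c in b}
    while len(candidates) > 1:
        # each ballot counts for its highest-ranked surviving candidate
        tops = Counter(next(c for c in b if c in candidates) for b in ballots)
        loser = min(candidates, key=lambda c: tops.get(c, 0))
        candidates.discard(loser)
    return candidates.pop()
```

Even this toy version needs strict complete rankings as input, which is precisely what forced Silver's data discarding, fake scores, and lot-drawing in the first place.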
Silver notes that this winner in his pseudo-election was "no surprise," because Social Network had in fact just won the Critics' Choice Award and had the highest Metacritic score under score voting. Silver therefore prognosticated that Social Network would win the 2011 OSCAR best picture award.
He was wrong: the official winner was The King's Speech. Comparing with score voting (now using rating data from Feb. 2013 to gain extra benefits from hindsight):
| Film | IMDb users | Metacritic | TomatoMeter | Avg tomato critic/user | Yahoo Movies |
|------|------------|------------|-------------|------------------------|--------------|
| King's Speech | 8.2 | 88 | 94 | 8.6, 4.3 | 4.5 stars |
| Social Network | 7.9 | 95 | 96 | 9.0, 4.2 | 4.0 stars |
| Toy Story 3 | 8.5 | 92 | 99 | 8.8, 4.3 | 4.5 stars |
| Winter's Bone | 7.3 | 90 | 94 | 8.3, 3.7 | 4.0 stars |
| The Kids Are All Right | 7.9 | 86 | 93 | 7.8, 3.6 | 3.5 stars |
Notes on the table: Typical IMDb ratings are based on about 260,000 raters each; typical Rotten Tomatoes ratings on about 170,000; typical Yahoo ratings on about 21,000. These counts suggest that statistical "noise" in an IMDb rating (as a percentage of that rating) should be about ±0.3% or less, which is below the roundoff error. Thus the IMDb rating of King's Speech should really be read as "8.2±0.02", a noise level below the ±0.05 error introduced by IMDb rounding to one decimal place. "TomatoMeter" combines the scores of "approved critics." "Avg tomato critic/user" gives two numbers: the average critic score, then the average user score. (Moviefone.com also provides user ratings, but each is presently based on only about 1000 raters, so we ignore them.)
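As a sanity check on that noise estimate, the usual standard-error-of-the-mean formula σ/√n gives the following (the per-rater standard deviation σ is our assumption here; IMDb does not publish it):

```python
import math

n = 260_000   # typical number of IMDb raters per film (from the text above)
sd = 2.0      # ASSUMED standard deviation of individual 1-10 ratings
se = sd / math.sqrt(n)   # standard error of the mean rating
# se comes out near 0.004 rating points, comfortably below both the
# quoted +/-0.02 noise bound and the +/-0.05 one-decimal rounding error.
```

Even doubling the assumed σ leaves the noise well below IMDb's own rounding, so the "noise is negligible" conclusion is robust to the assumption.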
So while King's Speech was a defensible choice as 2011 OSCAR winner versus Social Network (though Social Network was perhaps the slightly better choice), there is a major problem:
IRV Problem #6 – it looks like Toy Story 3 was the best choice (based on the data above)!
Assuming that Toy Story 3 really was the best choice, why was it robbed by the IRV voting system (which made it finish only third in Silver's simulation)? One possible explanation would be that the vote among the final three could have been something like this:
| # voters | preference order |
|----------|------------------|
| 11 | Toy > King > Social |
| 12 | King > Toy > Social |
| 17 | Social > Toy > King |
IRV Problem #7: In this artificial scenario (which perhaps is basically what happened – we do not know, because the Academy keeps its votes secret), the IRV system eliminates Toy, whereupon King wins by 23=11+12 votes versus 17 for Social. (That matches reality in the sense that King really did win; in Silver's incorrect forecast, the final King-versus-Social tally went 23-17 the other way.) But note that in this scenario, Toy would overwhelmingly win a head-to-head vote against King, 28-12, and would also win a head-to-head vote against Social, 23-17. (It might be more realistic to alter the votes a bit, e.g. change "12 and 17" to "14 and 15" and have the 11 Toy-top voters split their second choices between King and Social, but neither change alters anything important.) So if this were what actually happened, then we would have to conclude that the Academy's poor IRV voting system robbed Toy Story 3 of its deserved victory.
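The scenario's arithmetic can be checked mechanically with a short Python sketch (the ballot counts are the hypothetical ones above, not real Academy votes):

```python
from collections import Counter

# hypothetical final-round ballots from the scenario above
ballots = (11 * [["Toy", "King", "Social"]]
         + 12 * [["King", "Toy", "Social"]]
         + 17 * [["Social", "Toy", "King"]])

def irv_winner(ballots):
    """Repeatedly eliminate the candidate with the fewest top-place votes."""
    candidates = {c for b in ballots for c in b}
    while len(candidates) > 1:
        tops = Counter(next(c for c in b if c in candidates) for b in ballots)
        loser = min(candidates, key=lambda c: tops.get(c, 0))
        candidates.discard(loser)
    return candidates.pop()

def pairwise(ballots, x, y):
    """How many ballots rank x ahead of y?"""
    return sum(1 for b in ballots if b.index(x) < b.index(y))

# IRV eliminates Toy first (only 11 top votes), so King beats Social 23-17;
# yet head-to-head, Toy beats King 28-12 and Social 23-17 (Condorcet winner).
```

Toy is the Condorcet winner here – it beats every rival pairwise – yet IRV eliminates it first, which is exactly the pathology being described.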
That would have been historic as the first time ever that an animated film won best picture. And based on the amalgamated opinion of hundreds of thousands of raters plus about 100 critics, it appears to have deserved it! Note that in the data table, Toy Story 3 equalled or beat King's Speech under every single rating method for every single rater group, and was also the top-grossing picture of 2010.
Whodunit? It is now fairly clear Toy Story 3 was robbed. But what to blame this on is not clear (especially since the Academy keeps its votes secret). The obvious possible culprits are