By Brian Olson & Warren D. Smith. Third PRELIMINARY draft August 2011.
There is no issue that is more sensitive to politicians of all colors and ideological persuasions than redistricting. It will determine who wins and loses for eight years.
– Ted Harrington, chair of political science at the University of North Carolina (quoted during Shaw v. Hunt trial, March 1994).
Smith was urged to write an essay about theoretical issues in political districting. This page is his current answer to that request. Brian Olson helped him a great deal and hence is credited as coauthor.
At any level of government, a district means that there is a specific piece of government responsible for your needs and those of the other people in your district. Districts break governance into manageable pieces. School districts let us manage just the schools 'here' and not 'there.' State Legislature and Congressional districts let you elect one specific representative.
"Democracy" is voters choosing their leaders. But when politicians get to draw their own districts, such as (most egregiously) in the USA, the result is the opposite – the politicians choose their voters. Outrageously biased, absurd-looking district maps are common in the 2000-era USA, and not coincidentally, re-election rates to Congress are ≈98%. Voters thus play little role in the "democratic" process. Statehouses often are even worse.
Here's an actual quote (given as evidence in a court case) by a redistricting committee chairman to one of his committee colleagues: "We are going to shove [this map] up your f a and you are going to like it, and I'll f any Republican I can."
US Supreme Court Justice Sandra Day O'Connor apparently regarded this as a virtue, opining that any politician who did not do absolutely everything he could to help his party "ought to be impeached." (To be clear, the fact that a US supreme court justice could emit that quote, absolutely astounds and revolts me; and I would be more in agreement with "any politician who does absolutely everything he can to help his party ought to be impeached.")
Justin Levitt (Law Professor at Loyola) claims that the USA "is unique among industrialized democracies in putting an inherent conflict of interest directly at the heart of the redistricting system."
Computerized gerrymandering systems, and massive secret databases about people compiled by the political parties (they know your gender, your household composition, what magazines you subscribe to, what party you are registered in, what crimes you committed, how often you vote, what products you buy...), are now enabling gerrymandering at extreme levels of detail – individual buildings – making the problem much worse still. All this is an extreme failure of democracy.
Stupid ways to try to avoid this problem, which in the USA usually have not worked, include:
I know of exactly two ways to genuinely fix the problem and produce genuinely unbiased district maps.
Anybody in the world is allowed to run the same algorithm on the same data (both being public) to hopefully get the same district map, verifying there was no cheating; and if it is a simple-enough algorithm, then anybody can check validity to a considerable extent just "by eye," with no computer needed. In the results-based paradigm, again, anybody can confirm the claimed map goodness by computer, or, if it is a simple-enough quality definition, one can confirm/deny fairly well merely by eye.
Why can't we get the best of both worlds by making a computer find the best possible district-map? Because for all three of my favorite reasonable quality metrics, I can prove that finding the best district-map [or even answering the yes/no question "does a map with quality better than X exist?"] is an "NP-complete" problem. (Discussed in the NP-completeness section below.) Given that essentially everybody believes P≠NP, that basically means that there cannot be any computer program that both (i) finds the best map, or even merely answers that yes/no question for arbitrary X, and (ii) can be guaranteed to run quickly enough that it will finish computing in any reasonable amount of time (e.g. your lifetime, the age of the universe, etc).
Note the word "guarantee." Even if some computer program were found by empirical testing to often succeed quickly in finding a provably optimum districting, there would be no guarantee that the very next run would not take longer than the age of the universe. This is unfortunately simply the way it is with NP-complete problems, and is why NP-completeness renders many schemes unacceptable for many election-related and districting-related purposes in which guarantees are required.
OK, can we at least make a computer find a map guaranteed to be no worse than a factor of 2 away from optimum? (Or a factor 1.1? Or 1.01? Or any constant factor at all, even 9999999?) Those at present (year 2011) all remain open questions for any of the three quality-measures discussed below! I suspect such an algorithm will eventually be found (indeed, I have some candidates), but unfortunately I also suspect it will be complicated.
Will that contest paradigm work in real life? I'm presently unconvinced. There was a trial contest to redistrict Arizona in 2011, and it was a total failure; by the announced deadline of 5 June nobody had managed to use the contest website's software to produce any sensible map (the few "entrants" were obviously just failed experiments in using the software), whereupon the deadline was magically changed to 22 June. There was a later magical rescheduling too. I personally tried to use their contest software and failed, as did several others whom I know. There would need to be hundreds, maybe thousands or tens of thousands, of such contests set up to district all the 100s-10000s of areas that need to be districted. MORE ON THIS???
The whole paradigm depends on having enough entrants (and enough entrants with enough competence!) in every contest. Right now, often, even simple manual sanity checks on US-government-published election results simply are not conducted. Not by any person. Not by any group. Not even the losing candidates in that election, even in close races, perform simple arithmetic sanity checks!! I know this because, many years later, when I or somebody else gets prodded by some random news event to finally perform such a sanity check, we often find some incredibly obvious mathematical impossibility (often more than one) – like more votes than voters, or sums that are wrong – with these errors exceeding the official election margin. Meaning nobody ever found them and made them public before. CITE EXAMPLES??? Now, drawing district maps is far more complicated and difficult than performing simple arithmetic checks on published election results. It's pretty much beyond the ability of an unfunded amateur without computerized tools. So it is even less likely to happen.
But in the future, computerized tools will get common and huge numbers of people will enter the contests, right? Maybe they will and maybe they won't. Even if you can easily get and install such tools, and they all work perfectly, and you aren't worried their authors might be evilly trying to malware your computer – it still is far easier to perform simple sanity checks on election results, and I repeat, that isn't happening. And the government can, and often has in the past, published its data in bizarre formats that change with time and place (with errors in them too). For example, just now for 2010 the census department changed its file formats, killing our districting program (which had worked with their 2000-era data). Election data all over the USA is published in absolutely insane, ridiculous formats, not the same in this town as in that town, making it impossible for any simple computer tool to just look at everything on election day and spot obvious insanities. It can take hours to months to rewrite your program to try to eat some new data format, and months of effort per town, with tens of thousands of towns in the USA, adds up to too much effort. In short, the politicians could easily screw up any contest to make it effectively impossible for anybody besides their own entrant to enter it. Just like they made "Jim Crow" laws which for 50+ years made it almost impossible for black people in the Southern USA to vote. Just like they have blatantly gerrymandered, drawing absolutely absurd district shapes, for 200+ years without ever stopping, now having ≈98% re-election rates. Just like the Tweed Ring took over New York and redirected about one-third of its total budget into their own pockets, with only 2 members of the ring ever serving any jail time.
But we'll just solve that problem by making a law that there has to be a simple, never-changing, fixed format for census and map data. Dream on. Let me put it this way: in the history of the universe, I doubt there has ever been such a law, and trying to describe data formats in legalese basically adds up to dysfunctional garbage.
As of 2011, there is an ISO standard for "Geographic information – Geodetic codes and parameters" (ISO/TS 19127:2005) but unfortunately there do not appear to be any ISO standards for "census" or "population" or reporting election results.
Incidentally, we wrote a program to eat 2000 USA census data... and the census dept swore to us they'd never change this data format. It was a de facto standard which tons of software all over the world relied on. So they'd never change it. Presto, very next census (2010) they changed it and all the old software broke.
Will the procedure-based paradigm work in real life? Yes it will. The government itself will need to run the procedure, and it will not be able to stop contestants by introducing bogus and sick data, because it is the only contestant. Even if the government decides to stall and claims it was impossible to produce a district map, other people could easily run the algorithm, so everybody would know what the map was anyway. Gerrymandering will be impossible because the algorithm produces one and only one map, with no adjustability whatsoever, given your state's shape and where its people live. Ivan Ryan programmed the shortest splitline algorithm and ran it on all 50 US states using year-2000 US census data – at a cost in electricity of around 10 cents and total computer runtime (needing only one 2000-era PC) of a few days, which are far less cost and time than the USA's human districters have ever consumed – and it almost always produced clearly-better district maps.
Here are three quality-measures that previous authors have proposed as hopefully suitable for the contest-based paradigm or for evaluative purposes. (In all cases, we demand there be exactly N districts, all equipopulous, and each a connected set.)
About "equipopulousness": It is mathematically feasible to get exact equipopulousness down to at most ±1 person, but in the real world that is infeasible and pointless due to census data-gathering errors. Requiring that the max-population district have ≤1.005 times the alleged population of the min-population district, plus at most 5 people, seems a more-than-adequate compromise sanctioned by US courts, although a 0.7% deviation was judged too much in Karcher v. Daggett (1983).

1. Brian Olson's "average distance to center" measure:
Across all districts and all people, the "best" district map is the one where people have the least average distance to the center of their district. (And this quantity – the average distance from a person to her district-centerpoint – is a numerical measure of goodness for any map; smaller numbers are better.) Note: a similar but much worse idea was put forward by Boyce & Clark 1964.
Annoying but crucial details: "Distance" means geodesic "great circle" distance on the surface of the sphere. This distance between two points with (longitude, latitude) given by (a,b) and (c,d) is

Dist = R·arccos( sin(b)·sin(d) + cos(b)·cos(d)·cos(a−c) )

where R≈6371km is the radius of the Earth and a−c is computed appropriately modulo 360 so that it always is ≤180 degrees. "Center" means, I presume, the "Fermat-Steiner-Weber point" (based on surface-distance) of all the people-locations in that district. The FSW point of N point-sites is defined to be the point X which minimizes the sum of the N distances from X to the sites. There are known efficient algorithms to find X to high accuracy (it's a "convex programming" problem), although these algorithms are not obvious and not very simple. On a flat earth the FSW point is always unique. On a round earth it can be non-unique, but that should not be a problem in practice.
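A minimal Python sketch of this great-circle formula (the function name is ours; we clamp the cosine to [−1,1] to guard against floating-point overshoot):

```python
from math import radians, sin, cos, acos

R_EARTH_KM = 6371.0  # mean Earth radius, as in the text

def great_circle_km(a, b, c, d):
    """Geodesic distance between (longitude a, latitude b) and
    (longitude c, latitude d), all in degrees."""
    a, b, c, d = map(radians, (a, b, c, d))
    s = sin(b) * sin(d) + cos(b) * cos(d) * cos(a - c)
    # clamp against rounding error before taking arccos
    return R_EARTH_KM * acos(max(-1.0, min(1.0, s)))

# One degree of latitude along a meridian is about 111 km.
print(round(great_circle_km(0, 0, 0, 1), 1))  # 111.2
```

Working in radians internally makes the "modulo 360" worry disappear, since cosine is periodic.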
To learn more about FSW points:
http://en.wikipedia.org/wiki/Geometric_median
http://en.wikipedia.org/wiki/Fermat_point
http://mathworld.wolfram.com/FermatPoints.html.
"Weiszfeld's iteration" consists of iteratively replacing the current center X (initially a random point) by the weighted mean of all the sites, where the weight of site k is proportional to the reciprocal of its distance to the current X. (The proportionality constant is chosen so the weights sum to 1.) In a Euclidean space using ordinary Euclidean distance, X converges to the FSW point – in a manner which decreases the summed distance each iteration – provided the sites have a full-dimensional convex hull and provided X never lands exactly on a site. (References on Weiszfeld iteration.)

But on a round Earth using surface distance, it's a little more difficult. The following new Weiszfeld-like algorithm conjecturally works if all the sites lie within the same hemisphere: work in a suitable gnomonic projection of the Earth's surface onto a flat plane. (The virtue of this map projection for districting purposes is that geodesic shortest-paths on the sphere correspond to straight line segments on the plane, hence "convexity" is preserved.) On this flat map, use the Weiszfeld iteration, but letting the weight of site k instead be proportional to
EuclideanDist(X, site_{k})^{−2}·TrueSphericalDist(X, site_{k}). Finally, after X has converged, unproject it back onto the sphere.
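For concreteness, here is a minimal Python sketch of the plain planar Weiszfeld iteration just described (not the spherical variant). The function name, the centroid starting point, and the epsilon guard against X landing exactly on a site are our own additions:

```python
import math

def weiszfeld(sites, iters=200, eps=1e-12):
    """Planar Weiszfeld iteration for the Fermat-Steiner-Weber point
    of a list of (x, y) sites.  A sketch only: starts at the centroid
    and guards against division by zero with eps."""
    x = sum(s[0] for s in sites) / len(sites)
    y = sum(s[1] for s in sites) / len(sites)
    for _ in range(iters):
        wsum = wx = wy = 0.0
        for sx, sy in sites:
            d = math.hypot(x - sx, y - sy)
            w = 1.0 / max(d, eps)      # weight proportional to 1/distance
            wsum += w
            wx += w * sx
            wy += w * sy
        x, y = wx / wsum, wy / wsum    # weighted mean of the sites
    return x, y

# For the corners of an equilateral triangle the FSW point is the centroid.
print(weiszfeld([(0, 0), (1, 0), (0.5, math.sqrt(3) / 2)]))
```

The spherical variant would replace the weight 1/d by the gnomonic-projected weight given above.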
Sidenote on the gnomonic projection: To map (x,y,z) on the unit sphere (x²+y²+z²=1) to (X,Y) in the plane via a gnomonic projection centered at a point C having longitude=A and latitude=B, first rotate (x,y,z) about the North-South z-axis by angle A so that C is moved onto the Greenwich meridian at x=1; then rotate by angle B about the y-axis to move C onto the equator with z=0 [the net effect is to move C to new location (1,0,0)]:

  [ x ]     [ cosB  0  sinB ] [  cosA  sinA  0 ] [ x ]
  [ y ]  ←  [  0    1   0   ] [ −sinA  cosA  0 ] [ y ]
  [ z ]     [ −sinB 0  cosB ] [   0     0    1 ] [ z ]
  new       rotation by        rotation by        old
  xyz       angle B            angle A            xyz

[now it should be the case that x>0; otherwise (x,y,z) was on the other side of the planet from C]; then perform the projection itself: X = y/x; Y = z/x. This last operation can be inverted by x=(1+X^{2}+Y^{2})^{−1/2}, then y=xX, z=xY.

I note that the direction to the Weiszfeld point is always a descent direction (also true for Cooper 1968's variant based on distance^{e} rather than plain distance, for any exponent e with 1≤e<∞; and also true for our gnomonic-projected version of the spherical-distances problem, using Cooper's or my generalized Weiszfeld iterations), so that one can always "back up" from the candidate new Weiszfeld point by repeatedly dividing the step-size by 2 until we find a descent point.
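The two rotations and the projection can be sketched in Python as follows (a hedged translation of the recipe above; function names are ours):

```python
from math import radians, sin, cos, sqrt

def gnomonic(lon, lat, lon0, lat0):
    """Gnomonic projection (degrees in, plane coordinates out)
    centered at (lon0, lat0), via the two-rotation recipe."""
    A, B = radians(lon0), radians(lat0)
    a, b = radians(lon), radians(lat)
    x, y, z = cos(a) * cos(b), sin(a) * cos(b), sin(b)
    # rotate by angle A about the z-axis, then by angle B about the y-axis
    x, y = cos(A) * x + sin(A) * y, -sin(A) * x + cos(A) * y
    x, z = cos(B) * x + sin(B) * z, -sin(B) * x + cos(B) * z
    if x <= 0:
        raise ValueError("point lies on the far hemisphere from the center")
    return y / x, z / x

def ungnomonic(X, Y):
    """Invert the projection back to (x, y, z) in the rotated frame."""
    x = 1.0 / sqrt(1.0 + X * X + Y * Y)
    return x, x * X, x * Y

print(gnomonic(10.0, 20.0, 10.0, 20.0))  # center maps to ≈ (0, 0)
print(gnomonic(45.0, 0.0, 0.0, 0.0))     # ≈ (1, 0), i.e. tan(45°)
```

The second print illustrates the defining property: a point 45° of longitude from the center lands at plane distance tan(45°)=1.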
Relation to Average Inter-person Distance: The average distance from a random person to a random person in the same district is bounded between 1 and 2 times Olson's average distance to district-center.
Olson's motivation: If there were exactly one voting-place in each district, and if each one were optimally located, and if each inhabitant traveled there by personal helicopter to vote, and if all helicopters had identical performance, then the best district map by Olson's criterion would be the one minimizing the total travel expense of all the people each election. But of course in real life all these "if" assumptions are untrue, especially the first. They may, however, have some approximate validity.
2. The "average squared-distance" (area-valued) measure:
The "best" district map is the one where the average squared-distance between two people in the same district (averaged across all districts and all such people-pairs) is minimized.
I intend that "squared distance" now mean the usual Pythagorean straight-line formula in 3-dimensional space.
To convert from (longitude, latitude) pair (α,β) to Euclidean coordinates (x,y,z), use x=cos(α)cos(β), y=sin(α)cos(β), and z=sin(β).

Remarks: It is mathematically equivalent (but simpler to compute) to minimize the average squared-distance between each person and her district-centerpoint, where "center" now simply means the numerical average (in 3-space) of all the (x,y,z) locations of the voters in that district. (It is well known that the average of sites in a Euclidean space uniquely minimizes the sum of squared distances to those sites.) This quantity is a numerical quality-measure for any districting (smaller numbers better). It has the dimensions of squared distance, i.e. area, as opposed to Olson's measure, which has the dimensions of length.
The reasons it is "simpler to compute" this than Olson's measure are
• because squared Euclidean distances are simpler to compute than unsquared surface distances (no trig, inverse-trig, or square roots required; just multiply, add, & subtract)
• averages are much simpler to compute than FSW points.
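To illustrate that simplicity, here is a Python sketch of quality measure 2 for a single district (names are ours; a real program would average over all districts and people):

```python
from math import radians, sin, cos

def to_xyz(lon, lat):
    """(longitude, latitude) in degrees -> point on the unit sphere."""
    a, b = radians(lon), radians(lat)
    return (cos(a) * cos(b), sin(a) * cos(b), sin(b))

def avg_sq_dist_to_center(people_lonlat):
    """Average squared Euclidean (straight-line 3D) distance from each
    person to the district's centroid in 3-space -- quality measure 2
    restricted to one district."""
    pts = [to_xyz(lon, lat) for lon, lat in people_lonlat]
    n = len(pts)
    cx = sum(p[0] for p in pts) / n   # the centroid minimizes the
    cy = sum(p[1] for p in pts) / n   # sum of squared distances,
    cz = sum(p[2] for p in pts) / n   # as noted in the remarks above
    return sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2
               for p in pts) / n

# An antipodal pair has centroid at the origin, so each squared
# distance to center is 1 (unit sphere).
print(avg_sq_dist_to_center([(0.0, 0.0), (180.0, 0.0)]))  # ≈ 1.0
```

As promised, no inverse-trig or square roots appear after the initial coordinate conversion.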
Olson should easily be able to modify his computer program to employ either quality measure 1 or 2 (user chooses which).
The reason the pair-sum and single-sum measures are "equivalent" is that

∑_{1≤i≤N} ∑_{1≤j≤N} (x_{i}−x_{j})^{2}  and  2N ∑_{1≤i≤N} (x_{i}−μ)^{2}  where  Nμ = ∑_{1≤j≤N} x_{j}

are identical, and N (the population of a district) is merely a constant scaling factor. To prove that, realize that both expressions are translation-invariant, i.e. adding any fixed constant vector to all the x_{i} and to μ leaves both of our expressions unaltered, so therefore we can wlog demand μ=0=∑_{j}x_{j}, whereupon it becomes trivial to see they're equal.
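The identity is easy to confirm numerically; this sketch checks it coordinate-wise on random scalars (the vector case follows by summing over coordinates):

```python
import random

# Check: sum_{i,j} (x_i - x_j)^2 == 2N * sum_i (x_i - mu)^2,  N*mu = sum_j x_j
random.seed(1)
xs = [random.uniform(-10, 10) for _ in range(7)]
N = len(xs)
mu = sum(xs) / N
pair_sum = sum((a - b) ** 2 for a in xs for b in xs)
single_sum = 2 * N * sum((a - mu) ** 2 for a in xs)
print(abs(pair_sum - single_sum) < 1e-9)  # True
```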
3. The min-perimeter measure:
The "best" district map minimizes the total length of all the district boundaries (subject to the usual constraints that there be the correct total number of districts and each have the right population and form a connected set). This total length is a quality measure for any map (smaller is better).
This tends to prevent ultra-wiggly district boundaries and makes maps easy to draw. It is probably the easiest measure (among the three) to compute. It also tends (albeit somewhat less directly) to minimize voter travel distances.
Mackenzie's related index: John Mackenzie in his paper Gerrymandering and Legislator Efficiency introduced the following measure Q of badness, which some have branded the "Mackenzie index":
Let P=g+n denote total district perimeter as the sum of mutable political boundary length g plus immutable natural boundary length n. Then the mutable proportion equals g/P. The polygon complexity attributable to political manipulation is Q = (g/P)(P^{2}/A) = gP/A.
For districts with g=P, the least possible Mackenzie index would be achieved on a flat earth by a circle, with Q=4π≈12.57; a square would have Q=16; a 2×1 rectangle would have Q=18; a 3-4-5 right triangle would have Q=24; and a 4×1 rectangle would have Q=25. Therefore in my opinion districts with Q>40 ought to be illegal. But Q values as low as 0 are possible (for a state consisting of exactly 1 district with entire boundary immutable) and unboundedly large values also are possible (very wiggly boundary). The worst districts in the 1999-2009 USA had 200<Q<900.
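A minimal Python sketch of Mackenzie's formula, reproducing the reference values above for shapes whose boundary is entirely mutable (n=0):

```python
from math import pi

def mackenzie_Q(g, n, A):
    """Mackenzie badness index Q = g*P/A for a district with mutable
    (political) boundary length g, immutable (natural) boundary
    length n, and area A, where P = g + n is the total perimeter."""
    P = g + n
    return g * P / A

print(round(mackenzie_Q(2 * pi, 0, pi), 2))  # circle, radius 1: 12.57
print(mackenzie_Q(4.0, 0, 1.0))              # unit square: 16.0
print(mackenzie_Q(6.0, 0, 2.0))              # 2x1 rectangle: 18.0
print(mackenzie_Q(12.0, 0, 6.0))             # 3-4-5 right triangle: 24.0
```

A district bounded entirely by coastline or state line (g=0) gets Q=0, matching the text.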
Mackenzie then used US government district-shape data files to compute Q values for all congressional districts for the 106th through 110th Congresses (years 1999-2008) and used the results to create color-coded maps of the USA highlighting the worst-gerrymandered areas (100<Q≤870) in orange and red, and the least-gerrymandered areas (0≤Q≤75) in green. The worst-gerrymandered states during this era (having greatest average Q for their districts) are given in the table below at left, computed by Mackenzie (2007-2008 only); at right are Mackenzie's five Q-maps combined into one animated GIF. Note that about half the land area of the USA is gerrymandered at Q>50 levels, which I contend ought to be illegal.

Gerrymandertude changes with time. In the earlier 106th Congress (1999-2000) it looks to the eye like the most-gerrymandered states instead were
in approximately worst→best order.
Mackenzie then also tried to ask: "Is there any correlation (or anticorrelation) between the geometric complexity (Q value) of districts and the efficacy of their congressmen at unfairly channeling federal money toward that district?" But he failed to find any such connection (or if one exists, then it is not immediately apparent in his data, the "noise" seeming larger than the "signal").
Generically-unique-optimum theorem: Each of the above three measures generically yields a unique optimum partitioning of the P people into N districts for any P≥N≥1 (the max-population district has at most 1 more person than the min-population district; "people" are regarded as distinct points on the sphere surface). "Generic" means: if the people's locations got perturbed randomly and uniformly by ≤ε, then you'd get uniqueness with probability=1 for any ε>0, no matter how small, starting from any set of people-locations.
Proof: Assume a flat Earth. "Generic real numbers" disobey every polynomial equation with integer coefficients (except for always-true equations such as 1=1). In other words they are "algebraically unrelated." For any partitioning of the P points into N equipopulous subsets we can compute the centerpoints (whether mean or FSW-point) for each subset, and thus its Olson or squared-distance-based quality measure, as an algebraic function of the people's coordinates. Since no two algebraic functions can be equal on generic data (because if they were, you'd get one of those forbidden polynomial equalities), these quality measures will have distinct values for every partitioning, and hence the optimum partition must generically be unique. The perimeter-sum measure also is an algebraic function of the coordinates (for any given partition), and a different function for each partition; so no two partitions can have equal cost, so again the optimum is generically unique.

For a round Earth, one can employ e.g. the stereographic projection to "flatten" it, but then the above proof only immediately works for the sum-of-squared-based measure, because only it is an algebraic function of the flat-map coordinates (the other two measures also require inverse-trigonometric functions). It is, however, possible by using trigonometric "summed angle" identities to convert any cost-equality into an algebraic (and then further into a polynomial) equality. For example, if two sums of arccosines are equal, then take the cosine of both sides, apply cos(A+B)=cos(A)cos(B)−sin(A)sin(B), and use sin(A)^{2}+cos(A)^{2}=1 and cos(arccos(X))=X to convert to an algebraic identity. Q.E.D.
Another way to prove all this is to note that the number of possible ways to write cost-equality formulae is at most countably infinite, whereas the real numbers are uncountably infinite; it is key that for our cost-measures any two set-partitions always have different cost-formulae.
Of course, with special (non-generic) real numbers, it is possible for two different districtings to be exactly tied for least-cost. The virtue is that any slight perturbation of the people-locations will (with probability 1 for a random perturbation) break all ties.
The following inequalities hold, where "optimal" means the least-cost map for that cost measure, i.e. each occurrence of the word "optimal" has a different meaning:
Note the middle inequality is only valid if "distances" are measured in the same way on both sides of it (but sadly spherical & Euclidean distances differ), whereas the outer two "≤"s are valid even using the two different kinds of distances involved in our quality definitions. The middle ≤ becomes valid with our incompatible distances either in the flat-Earth limit or if we weaken it by multiplying its left hand side by (2/π).
Proof: The distance from A to B cannot exceed half the perimeter of the district (A, B both in the district), as you can see by rigidly moving segment AB until both endpoints hit the boundary. Hence the square root of the mean squared distance to center is ≤ the square root of the maximum squared distance to center, which is ≤ perim/2. Of course, that is for the districts that minimize perimeter; for the districts redrawn to minimize mean-squared-distance, this inequality only becomes more true; and if we switch from spherical surface distance to Euclidean 3D distance, it again only becomes more true, since the latter kind of distance is always smaller. Next, for any set of nonnegative real numbers, their mean is always ≤ the square root of their mean square. Of course, that is for the districts that minimize mean-squared-distance; redrawing the districts to minimize mean-unsquared-distance only makes the inequality more true. If the squared distances are measured using straight-line 3D distance but the unsquared distances are measured on the sphere surface, then that reasoning was invalid, but it is salvaged if we weaken it by multiplying the left hand side by (2/π), because the straight-line 3D distance from A to B cannot be shorter than (2/π) times the sphere-surface distance from A to B (and equality occurs only if A and B are antipodal). Q.E.D.
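The (2/π) chord-versus-arc claim at the end is easy to check numerically on a unit sphere, where a central angle θ gives arc length θ and chord (straight-line 3D) length 2·sin(θ/2); the ratio decreases on (0,π], reaching 2/π exactly for antipodal points:

```python
from math import sin, pi

def chord(theta):
    """Straight-line 3D distance between two unit-sphere points
    separated by central angle theta (radians)."""
    return 2 * sin(theta / 2)

# chord/arc ratio stays above 2/pi everywhere on (0, pi] ...
ratios = [chord(t) / t for t in (0.01, 0.5, 1.0, 2.0, 3.0, pi)]
print(all(r >= 2 / pi - 1e-12 for r in ratios))   # True
# ... and hits 2/pi exactly at theta = pi (antipodal points)
print(abs(chord(pi) / pi - 2 / pi) < 1e-15)       # True
```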
It is possible, by positioning the people appropriately, to make each of these ≤ be very strongly obeyed, i.e. "<<"; but with a uniform distribution of people in any nicely-shaped country, the three quantities all will be within constant multiplicative factors of one another.
The above three quality-measures seem to me to be the best three among those that people have proposed over the years (that I am aware of). However, many stupid quality-measures have also been proposed. Many of them are incredibly stupid, to the point where my mind absolutely boggles that professional scientists could actually have proposed them in print.
Example stupid idea #1: The "quality" of a district is the ratio of the radius of the smallest-enclosing circle, divided by the radius of the largest-enclosed circle (smaller ratios better).
Example stupid idea #2: The "quality" of a district is the ratio of the area of the smallest-enclosing circle, divided by the area of the district (smaller ratios better). This has been called the "Roeck (1961) measure."

An even more-peculiar related quality measure was Schwartzberg 1966's – defined on p.108 of Young's review paper as the ratio of a polygonal pseudoperimeter divided by the perimeter of a circle with the same area as the district. This, because it is based on a polygonal pseudoperimeter, is not affected by the real perimeter. Thus in a district-map that is a square grid with 500-mile side-lengths (not bad), replacing all the square sides by arbitrary hugely-wiggly curves would leave the Schwartzberg quality measure unaffected, because (as defined by Young) the pseudoperimeter would still be the original square grid! I saw, however, a website which (re)defined Schwartzberg using genuine, not pseudo, perimeter. Call that "sanitized Schwartzberg"; it makes far more sense. (I do not have access to Schwartzberg's original 1966 paper.)
Example stupid idea #3: Inscribe the region in a rectangle with largest length/width ratio. This ratio is ≥1, with numbers closer to 1 corresponding to "better" regions. (Harris 1964 and Papayanopoulos 1973.)
In all cases the quality of the whole multi-district map is obtained by summing the qualities of all the districts within it.
A big reason these three ideas – and many others – are stupid, is that you can take a multi-district map "optimal" by this measure, then add a ton of essentially arbitrary wiggles to large portions of the district boundaries, while leaving the "quality" of that map exactly the same. Here's an example for stupid measure #1:
The picture shows a disc-shaped "country" subdivided by a diameter-line into two semicircular districts, one in the North and one in the South. This 2-district map is "optimum" by stupid quality measure #1 (the dashed smaller circle is just a mental crutch to help you see that; it is not a district boundary; for how to know this is optimum, see Graham et al 1998 and employ the max ≥ arithmetic mean ≥ harmonic mean ≥ min inequality). However, upon making very-wiggly alterations to two portions of the inter-district boundary as shown, we still have an "exactly optimum" 2-district map (assuming the original map was optimum) according to stupid measure #1.

Similarly, the hexagonal beehive ought to be optimum for both stupid quality measures #1 and #2, but we can alter large portions of each hexagon edge by adding almost arbitrary wiggles and still get an "exactly optimum" tessellation.

Stupid idea #3 just does not work sensibly on a round Earth (there is no such thing as a "rectangle" if that means all four angles are 90 degrees; if it means all 4 angles are equal and the sides come in two equal-length pairs, then a district consisting of the entire Northern Hemisphere is "infinitely bad"!). If we only consider it on a flat Earth, then any region bounded by a constant-width curve (there are a continuum-infinite number of inequivalent such shapes) has ratio=1 and hence is "optimal" by stupid measure #3.
Example stupid idea #4: Yamada 2009 had the idea that the cost of a districting should be the total length of its minimum spanning forest; that is, for each district you find the minimum-length spanning tree (MST) made of inter-person line segments for all the people in that district, and the cost of the whole districting is the sum of the lengths of all the district MSTs.
With Yamada's idea, if the people are located one each at the vertices of the regular-hexagon "beehive," then every district map (so long as each district contains a connected subnetwork of the beehive) is "exactly optimum." This allows incredibly "fuzzy hairy tree-shaped" districts while still achieving "exact optimality." There is virtually no discrimination. And presumably by perturbing these locations arbitrarily slightly we could break the tie in a highly arbitrary manner, in which case insane fern-shaped districts would actually be "the unique optimum"!
Summary of stupidity: In other words, with those stupid quality measures
It's an embarrassing sign of the dysfunctionality of this entire scientific field that, e.g., no previous author (before this web page) has considered the fundamental uniqueness theorems. In short, they never reached "square one."
It seems to me, however, that my three favorite quality measures are invulnerable to those criticisms – the "optimum" map is always a pretty decent map; and since it is generically unique, any attempt to add wiggling to any boundary will destroy optimality.
Example stupid idea #5: The "quality" of a district is the ratio of the area of the convex hull of the district divided by the area of the district.
Example stupid idea #6: Quality of a district is the probability that a district will contain the shortest path between a randomly selected pair of points in it. (Chambers & Miller 2007.)
Example stupid idea #7: An even stupider related idea – the average percentage of reflex angles in a polygonal pseudoperimeter – was advanced by Taylor 1973.
With stupid ideas 5, 6 & 7, every convex shape is "exactly optimum." Thus the "hexagonal beehive" pictured in an upcoming section is optimum, but with these stupid ideas so is the "squished beehive" with long thin needle-like districts (this picture was got by factor-2 squishing, but stupid ideas #5-7 would still regard the result as "exactly optimum" even with factor 99!):
But Taylor's idea #7 goes beyond even that amount of stupidity to enter whole new regimes. Observe that Taylor's idea is not even defined for a district that is not polygonal (e.g. has boundaries that are smooth curves, or fractals). One could try to define it by approximating such a shape by a polygon, but then different sequences of polygons (all sequences approaching the same limit-shape) would yield vastly different Taylor quality measures. Thus, the Taylor measure is not a measure of shape at all!
Not necessarily. Some of them could still have value if used in combination with a non-stupid idea such as our three cost-recommendations. For example, we could minimize the sum of district perimeters (good idea #3) subject to the constraint that we only permit districts having ratio of the radius of the smallest-enclosing circle, divided by the radius of the largest-enclosed circle, less than 2 (stupid idea #1!), and/or subject to the constraint that we only allow convex district shapes (related to stupid ideas #5 and 6). (We then would need to use total district perimeters; i.e. we do not allow using segments of the country-border "for free"; indeed in this formulation we would need to ignore the country-border entirely, regarding the people-locations as the only important thing.)
I don't see anything wrong with this sort of idea-combination (aside from the fact that no district map satisfying the extra constraint exists for, e.g., a very long thin country...), and indeed some such hybrid might be superior to either ingredient alone.
...the optimum way to subdivide it into equipopulous districts would be a regular-hexagon "beehive" tessellation. (This is simultaneously true for all three of our optimality notions. For example, using Olson's measure, the hexagon tessellation has average distance to tile-center which is approximately 98.5883% of what it would have been using the square-grid tessellation.)
However, on a round earth, or in a finite country, or with nonuniform population density, the pictures would change and would no longer (in general) be identical and no longer be the regular-hexagon beehive.
The picture shows a disc-shaped "country," mainly rural, but with one densely-populated city (the small white off-center sub-disc). The population densities in the city and rural areas both are uniform, and both areas have equal total populations. For simplicity we use a flat Earth. The problem is to divide this country into exactly two districts.
Based on the above simple example – and also based on my intuition before I worked this example out – I suspect we have listed our three quality-measures in increasing order of both goodness and simplicity – the best one is the min-perimeter measure (3).
I'm not alone. Election-expert Kathy Dopp has also told me (June 2011) she independently came to the same conclusion for her own reasons: minimizing total perimeter is the best. More precisely (Oct 2011), it seems Dopp prefers to minimize some function of the isoperimetric ratios of the districts in the map. However, it is not clear to me what the best such function should be. See the discussion below about "weighted sum." Similar ideas have also been advanced in many scientific publications by other authors.
Reasons:
In the contest paradigm, if you (a good person) submit a sensible districting, but I (evil) have more funding, hence more compute power and better programmers, than you, then I'll find a "better" map by either measure 1 or 2 (whichever one we are using). I will then be able to modify my map by adding an enormous number of evil wiggles while still having a "better" map than you do (because that will only alter my quality measure by a tiny amount, which goes to zero like the square of the maximum wiggle-size when my wiggles are made small), hence I'll still win the contest.
To hammer that home: suppose I'm able to beat you by 1% due to my extra compute power. I'll keep my 1%-better map secret until 1 second before the contest deadline. I'll then add evil wiggles of order 10% of my district size to my map, and submit that. Note, I'll still be able to beat you, since the square of 10% is 1%. But wiggles of size 10% are actually quite enormous and quite sufficient to enable very severe gerrymandering! Evil will be able to win.
In practice the Democratic and Republican parties probably would have far more funding, compute-power, motivation, and programmer-years to throw at it than everybody else. Hence I worry that we might get an evil-Democrat plan or an evil-Republican plan always winning the contests, with the non-evil, non-biased, non-wiggly maps all losing.
If we both are handed maps with N equipopulous districts, then I'll be able to evaluate the perimeter length without needing to know or think about the locations of every single person. All I need is a tape-measure. Meanwhile you, in order to compute quality-measures 1 and 2, will need to know, and do some arithmetic related to, the location of each of the millions of people. (Although actually I would need to know the people-locations if I wanted to be sure the districts were equipopulous; but not if all I wanted was to compute the cost-measure.)
However, I'm not 100% sure of the conclusion that min-perimeter is the best criterion.
This question of what is the best quality measure probably cannot be answered just via pure mathematics; it requires experiment. If somebody produced the optimum maps (for all three notions of "optimal") for districting about 100 real-world countries, and we looked at them, then we'd have a better idea how well they perform. As of the year 2011 nobody has done that experiment – and it would probably be infeasible to really do it, since optimal districting is NP-hard – but see the pretty pictures section for the closest we've come.
But there are some valid counterarguments that the min-squared-distance-based quality measure (2) is better than my favored min-perimeter measure (3). The case is not entirely one-sided:
A. Under certain rules (discussed below) the min-perimeter measure plausibly seems more vulnerable to manipulation by the census department.
B. The squareddistance cost measure can easily be modified so that districts not only tend to be geographically compact, but also tend to group people with common interests together. To do that, associate an "interest vector" with each person. Each person now has more than 3 coordinates; the extra dimensions represent her interests. We now can use the same squareddistancebased cost measure as before, just in a higherdimensional space. (I happen to hate that idea because: what qualifies as an "interest" and by how much should we scale each interest? Whoever arbitrarily decides those things, will control the world.)
C. The squareddistancebased measure has this beautiful property which in general is disobeyed by the other two:
Voronoi Characterization Theorem: The "optimum" N-district map according to the squared-distance-based measure always subdivides the population in such a way that the districts are "Voronoi regions." That is, associated with each district k is a magic point x_{k} in 4-space, such that the inhabitants of district k consist exactly of the people closer to x_{k} than to any x_{j} with j≠k. [Yes, I did say four-dimensional space! Use coordinates w,x,y,z. Every point on the Earth has w=0, but those N magic points are allowed to have w≥0. The x,y,z coordinates of a district's magic point are just the mean x,y,z for its inhabitants.]
Remark: This makes describing the exact shapes of the districts very simple – they always are convex polyhedra in 3-space – makes it easy for anybody to determine which district they are in (simply compute your N distances to the N magic points; whichever is closest tells you your district) – and finally makes it much easier for a computer to search for the optimum districting. Note also that if our districts are small enough that it is OK to approximate the Earth as "flat," i.e. 2-dimensional, then the regions are Voronoi regions whose magic points now lie merely in 3-space, and the Voronoi polyhedra, when we only look at the plane, reduce to convex polygons. In other words, on a flat Earth, the optimum N-district map always has convex polygonal districts. That's very nice.
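To make the membership test concrete, here is a minimal Python sketch of the "which district am I in" computation. The magic-point coordinates below are invented for illustration, not taken from any real districting:

```python
def my_district(person_xyz, magic_points):
    """Return the index of the district whose 4-D magic point is
    nearest to the person.  A person on the Earth has w = 0; each
    magic point is (w, x, y, z) with w >= 0, so the squared distance
    from the person is w^2 + |person_xyz - (x, y, z)|^2."""
    def sqdist(p, m):
        w, x, y, z = m
        return w*w + (p[0]-x)**2 + (p[1]-y)**2 + (p[2]-z)**2
    return min(range(len(magic_points)),
               key=lambda k: sqdist(person_xyz, magic_points[k]))

# Illustrative (made-up) magic points (w, x, y, z) for N = 3 districts:
magic = [(0.0, 0.0, 0.0, 1.0),
         (0.5, 1.0, 0.0, 0.0),
         (0.0, 0.0, 1.0, 0.0)]
```

Note how the extra w coordinate acts as a per-district additive handicap on squared distance, which is exactly what lets the theorem force equipopulousness.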
Unfortunately on a round earth, it is not quite so simple. The optimum map's districts, although convex polyhedra in 3-space, if just viewed on the sphere surface become usually-nonconvex "polygons" whose "edges" are circular arcs; and it is possible for a district to consist of more than one disconnected such polygon. For example, consider N−1 equipopulous cities equispaced along the equator, plus two half-population cities at the North and South poles, plus a uniform distribution of a comparatively tiny number of rural people. The best N-district map (if N is sufficiently large) then has N−1 identical districts each containing a single equatorial city, plus an N-th district consisting of two disconnected regions containing both poles (all districts have equal area and equal population).
Even so, though, a very concise exact description of the optimum map exists – simply give the 4N coordinates specifying the N magic points. If we only permit thus-specified maps as contest-entrants, that prevents ultra-wiggly district boundaries, whereupon the earlier criticism of the sum-of-squares-based quality measure – that an evil computationally-powerful contest-entrant could submit a wiggly map and still win – is avoided.
Proof: See §5.3 of Spann, Gulotta & Kane for a beautiful proof employing the duality theorem for linear programming (which in turn is discussed in Dantzig 1963). Their C_{j} is the same (up to an arbitrary overall additive constant) as our +w_{j}^{2}, and their p_{i} is the "magic point" associated with district i, whose (x,y,z) coordinates, note, are seen to be the means of the peoplelocations in district i. Q.E.D.
Extension to handle Olson: The same Spann proof also will show a related theorem about what the Olson-optimum districting is: it always is an "additively weighted Voronoi diagram" for the sphere-surface metric (basically you just go through their proof but now using unsquared distances). In other words, associated with each optimum-Olson district k is both a magic point M_{k} on the sphere surface (the FSW point) and a real "additive weight" A_{k}; and the k-th optimal-Olson district consists precisely of the points x on the sphere such that dist(x, M_{k}) + A_{k} ≤ dist(x, M_{j}) + A_{j} for every j≠k (the weights may be negative).
The regions in such a diagram are in general nonconvex (but always connected) and have boundaries described piecewise by certain algebraic curves. On a flat earth the district boundary curves are (piecewise) hyperbola-arcs and line segments; and the regions always are connected, although usually nonconvex.
Our three favorite cost measures for districts were
We then suggested that the cost of a multidistrict map should be just the sum of the individual district costs.
But maybe that latter idea was too simplistic. Perhaps instead of a plain sum, we should use some sort of weighted sum.
Advantages of plain unweighted sum:
Disadvantages of plain unweighted sum: One could argue, however, that what we need to focus on is not economic or physical costs, but rather political costs, as measured in votes. In that view, if we have a tiny (since densely populated) district, some amount of geometrical badness for it should count the same as, not less than, for a much-larger-scaled (since sparsely populated) district of the same shape – since they both contain the same number of votes. This can be accomplished "fairly" by weighting our summands by appropriate negative powers of the district areas – area^{−1/2} for the perimeter-based and Olson (distance) measures, area^{−1} for the squared-distance measure.
Note the powers are chosen so the final cost comes out dimensionless (as it should since votes are dimensionless, unlike areas and lengths). One could also consider using powers of district perimeters or diameters, instead of areas.
Criticisms of area-based (or other) weighting: Unfortunately these kinds of weighting destroy some of the nice theoretical properties of our cost measures. For example, with weighted Olson, optimum-cost maps now will be "additively and multiplicatively weighted Voronoi diagrams." The regions (districts) in such diagrams (even on a flat earth) each now can be very nonconvex, can be disconnected into very many components, and can have many holes.
Also, this kind of weighting can inspire an optimizer to artificially "bloat" small-area districts just in order to decrease their weights, in opposition to the whole goal of keeping districts "compact." With area^{−1/2}-weighted Olson, "good" districts would no longer be cities or chunks of cities anymore; they would be "chunks of cities that always also have a large rural area attached." Ditto if the weighting instead is based on diameter^{−1}.
Weighting by negative powers of district perimeters would be even worse in this respect, since the optimizer then would be inspired to add wiggles to boundaries! Indeed, with district-map costs based on the Olson or squared-distances measures combined via weighted sums with perimeter^{−1} and perimeter^{−2} as weights, optimum maps would not even exist, due to this problem – a devastating indictment!
Verdict: In view of these criticisms, in my opinion at most one among the 9 possible weighted-sum schemes I considered (based on our 3 favorite district costs and 3 types of weighting using powers of area, perimeter, and diameter; 3×3=9) can remain acceptable:
I have called this the sanitized Schwartzberg cost above. Warning: If you do this, then you must not use the outer boundary of the country as part of district-perimeters. (If you tried, then a country with a very wiggly boundary would artificially always get only huge-area districts adjacent to the country's border! For a country with a fractal boundary, the optimum map either would not exist or would be insane!) Instead you must take the view that the country has no boundary and is defined only by its set of residents (points); or the view that the country's boundary "costs nothing" and only inter-district boundaries cost anything.
Sanitized Schwartzberg also will suffer, though to a lesser extent, from the optimizer artificially "bloating" small-area districts. (Sanitized Schwartzberg in some sense does not encourage such bloating, but it also no longer discourages it.) With plain unweighted perimeter minimization, the optimum map for a country with one geographically compact city would usually be: some districts 100% rural, some districts 100% urban, with at most one mixed district. But with sanitized Schwartzberg, it could well be happy to make, say, four mixed urban+rural districts (e.g. three each containing about 0.2 of the city plus 0.8 of a normal-sized rural district, plus one with 0.4 of the city and 0.6 of a normal-sized rural district) and all the rest entirely rural, without any pure-urban districts at all.
An ad hoc compromise scheme which both discourages such bloat, but also tries to gain some of the benefits of Schwartzbergian thinking, would be
Another idea related to the original sanitized Schwartzberg idea would be to minimize this
This proposal would still be dimensionless. Yet another idea related to our original sumofperimeters (dimensionful) idea would be to minimize
It is not obvious to me what is best among the ideas of this ilk.
What are P and NP? A computer program which, given B bits of input, always runs to completion in a number of steps bounded by a polynomial in B, is called a "polynomial-time algorithm." The set of problems soluble by polynomial-time algorithms is called polynomial time (often denoted "P" for short). The set of problems whose solutions, once somehow found, may be verified by a polynomial-time algorithm, is called "NP." It is generally believed – although proving this is a million-dollar open problem – that hard problems exist whose solutions (once found) are easy to verify – i.e. P≠NP. Examples:
An "NP-hard" problem is one such that, if anybody ever found a poly-time algorithm capable of solving it, then there would be poly-time algorithms for every NP problem. An "NP-complete" problem is an NP-hard problem that is in NP. Many NP-hard and NP-complete problems have been identified. One classic example is the "3SAT" problem, which asks whether an N-bit binary sequence exists meeting certain logical conditions (those conditions are specified by the problem poser as part of the input). If P≠NP, then every NP-hard problem actually is hard, i.e. cannot always be solved by any polynomial-time algorithm. In particular, if P≠NP then 3SAT problems cannot always be solved by any polynomial-time algorithm.
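As a concrete illustration of the verify-versus-solve asymmetry, here is a toy Python sketch (the 3-clause formula is a made-up example): checking a proposed assignment takes time linear in the formula size, while the only obvious way to find one tries all 2^n assignments.

```python
from itertools import product

# A 3SAT formula as clauses of literals: positive k means "variable k
# is true," negative k means "variable k is false" (variables from 1).
clauses = [(1, -2, 3), (-1, 2, 3), (-3, -1, -2)]   # made-up example

def verify(assignment, clauses):
    """Polynomial-time check that `assignment` (dict var -> bool)
    satisfies every clause -- the easy 'NP verification' step."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in clauses)

def solve(clauses, nvars):
    """Brute force over all 2^nvars assignments -- exponential time,
    illustrating why *finding* a solution is the hard direction."""
    for bits in product([False, True], repeat=nvars):
        a = dict(enumerate(bits, start=1))
        if verify(a, clauses):
            return a
    return None
```

No subexponential worst-case algorithm for general 3SAT is known; if P≠NP, none exists.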
See Garey & Johnson's book for a much more extensive introduction to NP-completeness, including long lists of NP-complete problems and how to prove problems NP-complete.
It was shown by Megiddo & Supowit 1984 that this yes-no problem is NP-hard:

    Given the (integer) coordinates of a set of people (points) on a flat Earth, an integer N, and a rational number X: does there exist a way to subdivide the people into N equipopulous disjoint subsets ("districts") having Olson cost-measure less than X? (Yes or no?)
Therefore (unless P=NP) Olson-optimal districting is hard, i.e. not accomplishable by any algorithm in time always bounded by any fixed polynomial in the number of people, the number of districts, and the number of bits in all those coordinates.
Mahajan et al 2009, by adapting Megiddo & Supowit's method, showed that the same yes-no problem, but using the sum-of-squared-distances-based cost measure instead of Olson's, is NP-hard.
We should note that those two proofs actually addressed different problems than we said. However, it is almost trivial to modify them to make their NP-hard point-set have equal numbers of points in each "district." Then those proofs show NP-hardness of optimal equipopulous districting. So the realization that equipopulous districting is NP-hard is, strictly speaking, a new result by me; but my "new contribution" is only 1%, because 99% of the work was done in these previous papers.
Another interesting NP-complete problem is the problem of finding the shortest "100-0 gerrymander." More precisely, given N points in the plane, some red and some blue, if you are asked to find the shortest-perimeter polygon containing all the red points but none of the blue points inside, that's NP-hard. There's a very easy NP-hardness proof if we happen to already know that the problem of finding (or merely approximating to within a factor of 1+N^{−2}) the shortest traveling salesman tour of N points in the plane is NP-hard. You simply place N red points in the plane, then place N blue points each extremely near to a red point (each red has a blue "wife"). Then the polygon separating the red and blue points must pass extremely near to each red point in order to separate each husband-wife pair, hence must be a "traveling salesman tour" of the red (or of the blue) points. In other words, optimally solving the 100-0 gerrymander problem is at least as hard as optimally solving the plane traveling salesman problem, hence is NP-hard. You should now find it almost as trivial to prove that the problem of, e.g., finding the shortest-perimeter polygon enclosing exactly 50% of the points and having at least a 51-49 red:blue majority on its inside, is NP-hard (for any particular values in place of 50 and 51). Unfortunately, this all won't stop gerrymanderers, because they don't give a damn about the "quality" (minimizing the length) of their gerrymander.
There are different flavors of NP-hardness. For some optimization problems it is NP-hard even to approximate the optimum value to within any fixed constant multiplicative factor. (Example: the chromatic number problem for graphs. Indeed, even the logarithm of the chromatic number for graphs.) For others, it is NP-hard to get within a factor of 1+ε of the optimum cost for any positive ε below some threshold – this class is called APX-hard. (An example is the "triangle packing problem" – phrased as a maximization problem, this is: given a graph, what is the maximum number of vertex-disjoint triangles that exist inside that graph? A related problem, which also is APX-complete, is the same but for edge-disjoint triangles.) For other problems, it is possible in polynomial time (but which polynomial it is depends upon 1/ε in a possibly non-polynomial manner) to get within 1+ε of the optimum cost for any ε>0 – that is called a "polynomial-time approximation scheme," or PTAS. (Example: the NP-hard "traveling salesman problem" in the Euclidean plane has a PTAS.) No PTAS can exist for any APX-hard or APX-complete problem, unless P=NP.
It is presently unknown whether the districting problem based on either Olson's or the sum-of-squared-distances cost measure has a PTAS, or is APX-hard, or what. Those NP-hardness proofs only pertain to finding the exact optimum and do not address approximability questions.
Another important flavor-distinction in the NP-completeness world is weak versus strong NP-hardness. Suppose some NP-hard problem's input includes a bunch of numbers. If requiring those numbers to be input in unary (greater number of input bits) causes the problem to become soluble by an algorithm running in time bounded by a polynomial in this new, longer input length, then that was only a weakly NP-hard problem. The most famous example of this is the "number partition" problem, where the input is a set of numbers and the yes-no question is: does there exist a subset of those numbers whose sum is exactly half of the total?
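The reason number partition is only weakly hard can be made concrete: a standard dynamic program solves it in time polynomial in the sum of the input numbers, i.e. polynomial in the unary input length. A minimal sketch:

```python
def can_partition(nums):
    """Pseudo-polynomial dynamic program for number partition:
    O(len(nums) * sum(nums)) time and space, i.e. polynomial if the
    numbers are written in unary -- the hallmark of *weak* NP-hardness.
    Returns True iff some subset of nums sums to exactly half the total."""
    total = sum(nums)
    if total % 2:               # odd total: no exact 50-50 split possible
        return False
    reachable = {0}             # subset sums achievable so far
    for n in nums:
        reachable |= {s + n for s in reachable}
    return total // 2 in reachable
```

With numbers written in binary, sum(nums) can be exponential in the input length, so this algorithm does not contradict NP-hardness.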
Our above two districting NP-hardness results both are "strong."
It is trivial to use the number partition problem to see that the following problem is weakly NP-complete:
Two-district min-cut-length problem with cities: The input is a simple polygon "country" in the Euclidean plane, and a set of disjoint circular-disc "cities" lying entirely inside that country, each regarded as containing uniform population density; for each city the problem-input further specifies its positive integer population. Also input is a rational number X. The yes-no question: is there a way to cut the country apart, using total cut-length ≤ X, into two now-disconnected-from-each-other sets, each having exactly equal population?
Furthermore, by shaping the country appropriately with tiny narrow bits and making the cities all lie far from both each other and the country's border, we can see that it is weakly NP-hard even to approximate the minimum cut-length to within any constant factor!
That proof depended heavily upon the requirement that the population-split be exactly 50-50. However, Fukuyama 2006 used a cleverer version of the same idea with a "tree shaped" country to show that the problem remains NP-hard even if we do not demand an exactly-50-50 split but instead are satisfied with an approximately-equal one.
However, this can be criticized for being only a "weak" NP-hardness result. With only this result, it would remain conceivable that, if the population had to be specified by stating coordinates for each person, the min-perimeter districting problem would become polynomial-time soluble, or have a PTAS.
A different criticism: if instead of the cut-length we used the entire sum of district perimeters as the cost measure (i.e. no longer allowing use of the country's own border "for free"), then it would be conceivable that the min-perimeter districting problem would have a PTAS (although finding the exact optimum would remain NP-hard).
The first criticism can be overcome by using the fact (proven in theorem 4.6 of Hunt et al 1998) that partition into triangles remains NP-complete even in planar graphs with bounded maximum valence. It follows easily from this that
Theorem: Min-cut-length districting into 3-person districts is strongly NP-hard in the Euclidean plane, for polygonal countries with polygonal holes, with the locations of each inhabitant specified one at a time.
Proof sketch: Make a country which looks like a "road map" of a planar graph with maximum valence ≤ 5, and make each road have a narrow stretch. In other words, our country is an archipelago whose islands are joined by roads across bridges over the sea; the road surfaces also count as part of the "country." Place one person at each vertex (or equivalently, place one city well inside each island; all cities have the same population). All the narrow portions of the roads have equal tiny widths. There will be V people in all for a V-vertex, E-edge graph, with V divisible by 3. It is possible to divide this country into V/3 different 3-person districts by cutting E−V edges at their narrow parts if, and only if, the graph has a partition into triangles. (Otherwise more cutting would need to be done.) Q.E.D.
The shortest splitline districting algorithm was invented by Warren D. Smith in the early 2000s. It is a very simple mechanical procedure that inputs the shape of a country, the locations of its inhabitants, and a positive number N, and outputs a unique subdivision of that country into N equipopulous districts.
Formal recursive formulation of shortest splitline districting algorithm:

    ShortestSplitLine( State, N ){
        If N=1 then output entire state as the district;
        A = floor(N/2);  B = ceiling(N/2);
        find shortest splitline resulting in A:B population ratio
            (breaking ties, if any, as described in notes);
        Use it to split the state into the two HemiStates SA and SB;
        ShortestSplitLine( SB, B );
        ShortestSplitLine( SA, A );
    }
Notes:
Example: The picture shows Louisiana districted using the shortest splitline algorithm with year-2000 census data.
We want N=7 districts. Split at top level: 7 = 4+3. This is the NNE-directed splitline. Split at 2nd level: 7 = (2+2) + (1+2). Split at 3rd level: 7 = ((1+1) + (1+1)) + ((1) + (1+1)). Result: 7 districts, all exactly equipopulous.
For a theoretical investigation trying to determine the fastest possible algorithmic way to implement the shortest splitline algorithm, see this.
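For readers who want to experiment, here is a toy Python sketch of the splitline recursion. It is emphatically not the official implementation: it works on bare point sets rather than a bordered country, tries only 36 candidate split directions, and uses the point set's extent perpendicular to the split direction as a crude stand-in for the true splitline length.

```python
import math

def splitline(points, N):
    """Toy shortest-splitline sketch: recursively split `points`
    (list of (x, y)) into N equipopulous districts via the
    A:B = floor(N/2):ceil(N/2) recursion."""
    if N == 1:
        return [points]
    A, B = N // 2, (N + 1) // 2
    k = len(points) * A // N              # people on the "A" side
    best = None
    for i in range(36):                   # trial directions, 5-degree steps
        theta = math.pi * i / 36
        d = (math.cos(theta), math.sin(theta))
        order = sorted(points, key=lambda p: p[0]*d[0] + p[1]*d[1])
        # Crude stand-in for "splitline length": the extent of the
        # point set perpendicular to the split direction d.
        perp = [-p[0]*d[1] + p[1]*d[0] for p in points]
        cut = max(perp) - min(perp)
        if best is None or cut < best[0]:
            best = (cut, order[:k], order[k:])
    _, side_a, side_b = best
    return splitline(side_a, A) + splitline(side_b, B)
```

The real algorithm would instead measure each candidate splitline's geodesic length inside the country's actual border and break ties by the published rules.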
We suppose there are P "people" (points with known coordinates) whom we wish to subdivide into N "districts" (disjoint subsets) which are equipopulous (up to a single extra person), P≥N≥2. Begin with some N-district subdivision. Now
Phase 1: Holding the N district "centers" fixed, reassign the people to the centers in the cost-minimizing equipopulous way.
Phase 2: Holding the people-assignment fixed, move each district center to the location minimizing that district's cost.
We simply continue applying these two-phase improvement cycles until no further improvement is possible, then stop. At that point we have found a "local optimum" districting, i.e. one whose cost cannot be decreased either by moving the district centers or by differently assigning the people to those district-centers (but improvement could be possible by doing both simultaneously). Each 2-phase improvement step takes only polynomial time, but it is conceivable that an exponentially long chain of improvements will happen (see Vattani 2011 for a point set in the plane that causes our 2-phase iteration to iterate an exponentially large number of times, all the while keeping all districts equipopulous to within an additive constant). But usually the number needed is small. We can perform this whole optimization process many times starting from different initial district-maps, selecting only the best among the many locally-optimum maps thus produced.
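A minimal Python sketch of the 2-phase cycle for the squared-distance measure, with a greedy equipopulous assignment standing in for the optimal transportation-problem step of phase 1 (so this is an approximation of the method, not a faithful implementation):

```python
def two_phase(points, N, iters=50):
    """Sketch of 2-phase local improvement: phase 1 reassigns people to
    centers equipopulously (greedy, by increasing squared distance);
    phase 2 moves each center to the mean of its district.
    Assumes len(points) is divisible by N."""
    cap = len(points) // N            # people per district
    centers = points[:N]              # crude initial centers
    for _ in range(iters):
        # Phase 1: greedy equipopulous assignment.
        pairs = sorted(
            ((p[0]-c[0])**2 + (p[1]-c[1])**2, i, j)
            for i, p in enumerate(points) for j, c in enumerate(centers))
        assign, counts = {}, [0] * N
        for d, i, j in pairs:
            if i not in assign and counts[j] < cap:
                assign[i] = j
                counts[j] += 1
        # Phase 2: move each center to the mean of its assigned people.
        new_centers = []
        for j in range(N):
            members = [points[i] for i, jj in assign.items() if jj == j]
            new_centers.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
        if new_centers == centers:    # local optimum reached
            break
        centers = new_centers
    districts = [[points[i] for i, jj in assign.items() if jj == j]
                 for j in range(N)]
    return districts, centers
```

As the text warns, this converges only to a local optimum, so in practice one reruns it from many random starting configurations.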
The 2-phase improvement technique above is very elegant. However, it could get stuck in a local, but not global, optimum. It would be nicer if we could somehow guarantee at least some decent probability of getting within some constant factor of optimum quality.
Heuristic algorithm based on matching: Here's an interesting idea. Observe that if P=2N, i.e. if we were trying to divide our country into "districts" each containing exactly 2 people, then the optimum districting (for any of our three quality measures) could actually be found in polynomial(N) time by finding a min-cost matching in the complete (nonbipartite) graph of all people (with edge cost being the squared-Euclidean or unsquared-spherical distance between the two people). But in the real world P≫N. That suggests taking a uniform random sample of, say, 256N people from among the P, then finding their min-cost matching and replacing each matched person-pair by a random point on the line segment joining them (thus reducing to 128N pseudo-people), then continuing on reducing to 64N, 32N, 16N, 8N, 4N, and finally 2N pseudo-people, then employing their matching's midpoints as our N initial district centers for use with the 2-phase improvement procedure. Software exists which will solve an N-point geometric matching task quickly and usually almost-optimally for the numbers of points we have in mind here (Fekete et al 2002).
Easy Lower Bound Theorem: The expected cost of the min-cost matching among 2N random people (using squared distances as edge costs) is a lower bound on the mean-squared-interperson-distance cost measure for the optimum N-district map. Half the expected cost of the min-cost matching among 2N random people (now using unsquared spherical distances as edge costs) is a lower bound on the Olson mean-distance-to-district-center cost measure for the optimum N-district map.
This theorem usually makes it computationally feasible to gain high confidence that some districting is within a factor of 2.37 (or whatever the number happens to be for that map) of optimum under the squared-distance-based and Olson cost measures, and hence (in view of our inequality) for the min-perimeter cost measure too.
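To illustrate the matching ingredient of the bound, here is a brute-force Python sketch. The six sample points are invented; a real application would draw random samples of people and average the matching cost over many samples to estimate the expectation.

```python
def min_matching_cost(points, cost):
    """Exact minimum-cost perfect matching by recursion: pair the first
    point with each other point in turn and recurse on the rest.  Fine
    for the tiny samples used here (say 2N <= 16 points)."""
    if not points:
        return 0.0
    first, rest = points[0], points[1:]
    return min(cost(first, rest[i]) +
               min_matching_cost(rest[:i] + rest[i+1:], cost)
               for i in range(len(rest)))

def sqdist(p, q):
    return (p[0]-q[0])**2 + (p[1]-q[1])**2

# A tiny hand-made "sample" of 2N = 6 people (N = 3 districts); the
# matching cost lower-bounds the squared-distance districting cost:
sample = [(0, 0), (0, 1), (5, 5), (5, 6), (9, 0), (9, 1)]
bound = min_matching_cost(sample, sqdist)
```

For larger samples one would replace the exponential recursion with a polynomial-time matching code such as the Fekete et al software the text cites.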
Heuristic idea based on randomly ordering the people: Here's another interesting idea. Randomly reorder the P people in your country. Now take the locations of the first N people as the initial center-points of your N districts. Now start adding the remaining P−N people one at a time. Each time one is added, adjust the current district definitions (defined by their center-points and additive weights or extra coordinate) so we always have an equipopulous districting of the people added so far. For example, one could agree to always add the new person to the district with the closest center-point (using additively-weighted distances), except that if that resulted in a too-high district-population, then we search for a weight adjustment for that district that moves one of its people out into another district; then if that district now has too many people we need to adjust its weight, etc. After making all P−N adjustments (or we can stop earlier), we'll have found a, hopefully good, districting of the country. It then can be improved further by running 2-phase improvement iterations.
We shall now, for the first time, describe a very simple randomized polynomial-time algorithm for finding N districts (for P people on a flat Earth) which is guaranteed to produce a districting whose expected Olson-cost is O(log P) times the minimum possible cost for the unknown optimum districting. This districting can be used as a randomized starting point for the 2-phase improvement algorithm.
We must admit that "O(log P)" is not a hugely impressive quality guarantee! With P=1000000, log_{2}P≈20, and guaranteeing getting within a factor of O(20) away from optimum is a pretty weak guarantee. We're only mentioning this because nobody else in the entire 50-year history of this field has ever proved any approximate-optimality theorem even close to being this good (poor as it is). As we have remarked before, this field of science is in an extremely dysfunctional state, and most of the work in this area has, so far, been of remarkably low quality.
It then immediately follows from Markov's inequality that this algorithm will, with probability ≥1/2, return a districting with cost ≤ O(log P)×optimal; and rerunning the algorithm R times (with independent random numbers each time) and picking the best districting it finds will reduce the failure probability from 1/2 to ≤2^{−R}.
Expectation≤O(logP)×Optimal Districting Heuristic:
Sketch of why this works: (The argument shall implicitly use the subset monotonicity lemma that Olson's optimal sum-of-distances-to-centers generally increases, and cannot decrease, if we add new people. This lemma is false for the perimeter-length, which is why this algorithm and/or proof only work for the Olson, not the perimeter-length, cost measure.) If some rectangle-tree box contains too many people to fit in an integer number of districts, then its excess people need (in the optimum districting) to have a district "outside" that box, so that the line segments joining each of them to their district center must cross the box wall. (Or, the district center could lie inside the box, in which case the line segments from the people outside it that are in that district must cross the wall. In that case we regard the other sibling box as the one with the "excess" and assign this cost to it.) But (crossing expectation lemma) the length L of any line segment that crosses a box-wall of length W is, in expectation over the random box-shifts, of order ≥W. No matter how crudely it handles those excess people, our algorithm isn't going to give them length more than order W each. So it seems to me (in view of the subset monotonicity and crossing expectation lemmas) that at each level of the rectangle tree the expected excess cost incurred is at most a constant times the total cost of the (unknown) optimal districting. [And it does not matter that these excesses are statistically dependent on each other, due to linearity of expectations: E(a)+E(b)=E(a+b) even if the random variables a and b depend on each other.] The whole tree is only O(log P) levels deep, so the whole procedure should yield a districting whose expected cost is O(log P) times optimal. In fact, very crudely trying to bound the constants, I think the expected cost will be ≤ 6+32·log_{2}P times the optimal districting's Olson-cost. Q.E.D.
Districting method | Contest needed? | Convex districts? | Optimality? | Other
Shortest splitline algorithm | No; extremely fast simple untunable algorithm | Districts are convex polygons drawn on sphere using geodesics as edges | Usually does well with respect to the cut-length cost measure, but not always; can yield a map far worse than optimal. | See results for all 50 US states (year-2000 census)
Minimize sum of district perimeters | Yes (finding optimum is NP-hard) | Districts are (generally nonconvex) polygons | Simplest cost measure to compute. No polynomial-time algorithm is known that can assure even constant-factor approximation, for any constant. | Readily modified to minimize crossings of rivers, roads, etc. – but that would make it vulnerable to redefining or re-engineering "rivers." If district lines are required to follow census-block boundaries (not recommended!), then vulnerable to redefinition of those blocks.
Minimize sum of squared distances to your district center | Yes (finding optimum is NP-hard) | Districts generally nonconvex circular-arc "polygons" on the sphere, and a district can be more than one disconnected polygon; but each district arises from a convex polyhedron in 3-space, and in the "flat Earth" limit all districts become convex polygons. | No algorithm is known assuring constant-factor approximation, for any constant. | Readily modified to add extra "interest" dimensions to group voters with common interests in the same districts – but that would make it vulnerable to redefinitions & rescalings of what valid "interests" are.
Minimize sum of distances to your district center | Yes (finding optimum is NP-hard) | Districts generally nonconvex both on the sphere and on a flat Earth; boundaries are piecewise-algebraic curves that are complicated to describe. | It is probably possible to get within some constant factor of optimum in polynomial worst-case time, but nobody has shown that yet. We've shown how to get within a log factor of optimum. | Olson has programmed a heuristic algorithm to seek districtings good by this measure. See his USA (year-2010 census) results.
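For concreteness, here is a minimal flat-Earth sketch of evaluating the sum-of-distances cost for one district, using the classical Weiszfeld iteration (discussed in the references of Appendix A) to locate the distance-sum-minimizing center; the crude guard against landing exactly on a resident stands in for the proper fixes in those references, and the function names are ours:

```python
# Flat-Earth sketch of cost measure "sum of distances to your district center."
# Real implementations must work on the sphere.
import math

def weiszfeld(points, iters=100):
    """Approximate the point minimizing the sum of distances to `points`."""
    # Start from the centroid.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iters):
        wsum = wx = wy = 0.0
        for px, py in points:
            d = math.hypot(x - px, y - py)
            if d < 1e-12:          # crude bailout if we land exactly on a site
                continue
            w = 1.0 / d
            wsum += w; wx += w * px; wy += w * py
        x, y = wx / wsum, wy / wsum
    return x, y

def district_cost(points):
    """Sum of residents' distances to the district's optimal center."""
    cx, cy = weiszfeld(points)
    return sum(math.hypot(px - cx, py - cy) for px, py in points)

# For the vertices of a 3-4-5 right triangle, the optimal (Fermat) center is
# interior, and the minimal distance sum is about 6.77:
print(district_cost([(0, 0), (4, 0), (0, 3)]))
```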
It's possible in principle for human-drawn districts to better represent "communities of interest" and other fuzzy but real sociological features. I think that desire must be sacrificed in order to get fully automatic and impartial districting. This might sometimes draw awkward lines which cut regions in strange ways. (But it is certain that there are currently a lot of awkward lines cutting regions in strange ways, put there intentionally to defeat democracy!) The kinds of solutions we're proposing here are usually better than what human USA legislators currently draw. This is true essentially 100% of the time as measured by our objective mathematical quality measures, and about 98-100% of the time as measured subjectively by my eye.
Minimizing "travel time" is hard to measure & automate, and fairly easy for evildoers to manipulate.
It is fairly easy to modify cost measure (3) to encourage district boundaries to follow, but not cross, rivers, highways, county lines, etc. E.g. you could count the "cost" of each meter of district boundary as "10 dollars" normally, but only "5 dollars" wherever the district boundary happens to be following a river or highway, and add a penalty of 2 dollars every time it crosses a river. But measures (1) and (2) are not easily modifiable to accomplish these goals. [All the unmodified measures simply ignore rivers, highways, etc.] This easy modifiability is yet another advantage of the min-perimeter quality measure versus the other two.
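The dollar-weighted boundary cost just described can be sketched as follows; the edge representation and function name are hypothetical, and the rates are the illustrative "$10/$5/$2" figures from the text:

```python
# Hypothetical sketch of the river-aware perimeter cost. Each boundary edge
# is tagged with its length, whether it follows a river/highway, and how many
# times it crosses a river.
def boundary_cost(edges, per_meter=10.0, follow_rate=5.0, cross_penalty=2.0):
    """edges: list of (length_m, follows_river, n_river_crossings)."""
    total = 0.0
    for length, follows, crossings in edges:
        rate = follow_rate if follows else per_meter
        total += rate * length + cross_penalty * crossings
    return total

edges = [(100.0, False, 0),   # ordinary boundary: 10 * 100
         (200.0, True, 0),    # boundary running along a river: 5 * 200
         (50.0, False, 1)]    # boundary crossing a river once: 10 * 50 + 2
print(boundary_cost(edges))   # 1000 + 1000 + 502 = 2502.0
```

An adversary could of course tune these three "magic" numbers, which is exactly the manipulation danger discussed next.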
If these magic "dollar" numbers were chosen right we could get better districtings, I suppose – but the danger is that some evil person would try to tune them intentionally wrong to get gerrymandering effects; plus this all makes computer programs more complicated (although not insurmountably so). It also makes things somewhat manipulable by evilly changing the definitions of county lines or of "rivers," building different highways, etc. ("That wasn't a 'river,' that was a 'creek'! That wasn't a 'highway,' that was a mere 'road'!") The joy of the shortest splitline algorithm is that it has absolutely no tunable parameters or unclearly-defined entities, hence cannot be manipulated by building highways, redefining rivers, altering a parameter, etc.
The US Census measures population in small regions called "blocks." In California 2010 there are 533,163 of these blocks; see image (png, 200KB). Note that some blocks are actually quite large, and their shapes are not always simple. The census also provides a coarser subdivision called census "tracts." There are roughly 50 blocks per tract. If every district is made a union of census blocks, then you get "exact" population counts in each district, but the district boundaries will usually be very jagged. If district boundaries cut through census blocks, then we can get much simpler, non-jagged district boundaries (e.g. "straight lines" on a map) but sacrifice exact knowledge of district populations. (We can still model each block as containing uniform population density and thus estimate the district populations fairly well.) Of course, everybody knows the census is not and cannot be exact, i.e. many of its alleged block populations are wrong. So we never had exact knowledge, we just had the illusion of exact knowledge – but we now must sacrifice that illusion!
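The uniform-density estimate can be sketched as follows, with census blocks simplified to axis-aligned rectangles (real blocks are arbitrary polygons, so a real implementation would need polygon clipping; function names are ours):

```python
# Sketch: when a district boundary cuts a census block, credit the district
# with the block's population times the overlapped area fraction.
def overlap_area(a, b):
    """Area of intersection of axis-aligned rectangles (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def district_population(district_rect, blocks):
    """blocks: list of (rect, population). Returns fractional estimate."""
    total = 0.0
    for rect, pop in blocks:
        area = (rect[2] - rect[0]) * (rect[3] - rect[1])
        total += pop * overlap_area(district_rect, rect) / area
    return total

blocks = [((0, 0, 2, 2), 400), ((2, 0, 4, 2), 100)]
# A district covering x in [0, 3] gets all of block 1 and half of block 2:
print(district_population((0, 0, 3, 2), blocks))  # 400 + 50 = 450.0
```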
Census blocks are supposed to break along existing political divisions (counties, zip codes, cities) or roads or natural features. Therefore, following census block boundaries has some advantages versus simple straight lines.
But if a law demanded that district boundaries must always follow census-block boundaries, then everything becomes manipulable via evil alterations to the exact definitions of the census blocks. For example, 'here' the evil manipulator could make the census blocks tend to be long, thin, and oriented East-West; 'there' he could predominantly orient them North-South; and in a third region he could make them tend to have wigglier boundaries than usual. These manipulations could all be fairly small changes which still look fairly innocent, but nevertheless would cause a massive alteration of the "optimum" (based on the min-perimeter measure) districting.
There would be an incentive to appoint a Republican as the head of the census department so he could corrupt things his party's way!
Perhaps such manipulations could force "evil wiggly" boundaries with one or the other of the two other optimality notions also – I'm not sure. I am certain this kind of manipulation potential is greatly diminished if district drawings are unrestricted, i.e. if district boundaries are allowed to cut through census blocks.
Possibly. There are currently some districts in some states which were created to almost guarantee that a Black or Hispanic person gets elected to the US House. This might be considered a "good" kind of gerrymandering. It is a distortion of districting to an end that many people currently agree with. (I can't provide an exact count but believe there are more districts resulting from bad gerrymandering than good.)
We warn you that "good gerrymandering" is a very dangerous road to take. Furthermore, it may not have the effect you think. For example, suppose you create one "majority minority" district, e.g. containing more than 50% Blacks, by drawing absurdly-shaped boundaries to shove more Blacks in there. The idea is that this way, a 90%-White state will elect one Black congressman, as opposed to 10 Whites out of 10. Sounds great. But what also might happen is that a less-absurd map would have several districts with close battles between A and B, and the Blacks in those districts would be enough to swing all those battles toward B. So we'd have several Black-decided battles with a non-absurd map. But with the absurd map, the Blacks would basically always elect exactly one guy, who'd basically be a sure thing, and have zero influence on any other battle. (Also they'd have zero influence even in their own district, because he'd be a sure thing to win. You can't "influence" a sure-thing race.) So the net effect of drawing absurd districts to "help" Blacks might actually be to deprive them of voting power so they could influence fewer races – and every district becomes a noncompetitive "sure thing" race where no voter of any color can affect anything – it is all preordained. This is great for politicians, whose seats are now safe for life, but not so great for voters (of any color), who all now are powerless.
Oops.
Just to hammer that home: in a state with 53% Democratic and 47% Republican voters, you might hope/expect that Democrats would win a majority of the seats. But Blacks in the year-2000 era vote about 90% Democratic, and cramming them all into 1 district to "help" them would result in 1 Democratic winner versus 9 Republican winners! This would be a huge non-help to Blacks and a huge distortion of democracy.
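The arithmetic can be made concrete with a toy computation; all numbers here are illustrative assumptions (10 districts, Blacks = 10% of voters, voting 90% Democratic):

```python
# Packed versus spread: count districts with a Democratic majority.
def dem_seats(district_dem_shares):
    return sum(share > 0.5 for share in district_dem_shares)

# Spread evenly: every district mirrors the statewide 53% Democratic share.
spread = [0.53] * 10
# Packed: one overwhelmingly Democratic district; removing its Democratic
# voters drags the other nine districts just below 50%. The shares still
# average (0.90 + 9 * 0.49) / 10 = 0.531, matching the statewide 53%.
packed = [0.90] + [0.49] * 9

print(dem_seats(spread), dem_seats(packed))  # prints "10 1"
```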
Some people have told me the USA's Voting Rights Act requires "majority minority" districts. The link gives the full text of the VRA, and I see no such requirement. (Also note that many US states right now – and for many decades past too – have had zero majority-minority districts.)
Proportional Representation (PR) is an alternative to districting which does a good job of representing people based on identity groups. With PR, the whole idea of single-winner elections in districts is discarded. Instead you have multiwinner elections. More complicated and cleverly designed election methods then are needed, but they can be designed in such a way that any fraction-F minority can, by voting in a "racist" manner, assure they get about a fraction ≥F of their-color winners (provided such candidates run).
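One way to see the fraction-F guarantee is via the Droop quota used by STV-style multiwinner methods: a bloc commanding k quotas of the vote can always elect k candidates, so a fraction-F bloc gets roughly F of the winners. A minimal sketch (the scenario numbers are illustrative):

```python
# Droop-quota proportionality guarantee in an STV-style multiwinner election.
import math

def droop_quota(total_votes, seats):
    """Smallest vote count such that only `seats` candidates can reach it."""
    return math.floor(total_votes / (seats + 1)) + 1

def guaranteed_seats(bloc_votes, total_votes, seats):
    """Seats a solidly-voting bloc can always win, however others vote."""
    return bloc_votes // droop_quota(total_votes, seats)

# A 25% minority voting as a bloc in a 9-winner election controls 2 quotas:
print(guaranteed_seats(25_000, 100_000, 9))  # prints "2"
```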
It also is possible to conceive of systems with a mix of districting and PR. (Some countries have such mixed systems.) Districting maintains the tradition of having your local representative who is more directly accessible for communication and regional issues. PR better represents the whole mix of identity groups across a state or country.
With or without such changes, I'm going to continue advocating anti-gerrymandering measures because I think they will make far more good changes than bad.
Olson's computer program, as we said, uses enormous computation to try to find a map near-optimum under his quality measure. (Olson's current program "cheats" in several ways – e.g. it does not use true spherical surface distance, instead using a certain ugly approximate distance formula, and it does not use the true FSW point as the "center" – but those issues don't matter much for our comparison purposes here.)
Meanwhile the shortest splitline algorithm uses a far tinier amount of computing and makes no attempt to find an optimum or especially-near-optimum districting – except for the fact that at each split step it chooses the shortest possible splitline. The net effect of that seems to be in the right direction to keep the perimeter measure small, but it might in principle still yield total perimeters far larger than optimum.
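A toy flat-Earth version of the splitline recursion can be sketched as follows; as a simplification it considers only axis-parallel cuts of a rectangle (the real algorithm considers all directions, works on the sphere, and splits district counts as unequally as needed – the function names and rounding rule here are our own assumptions):

```python
# Toy shortest-splitline sketch: split nd districts into a + b (a >= b),
# among the candidate cuts putting the right population fraction on each
# side pick the shortest one, then recurse on the two halves.
def splitline(people, box, nd, out):
    """people: list of (x, y); box = (x0, y0, x1, y1); appends leaves to out."""
    if nd <= 1:
        out.append((box, people))
        return
    a = (nd + 1) // 2
    x0, y0, x1, y1 = box
    k = round(len(people) * a / nd)        # people that side A should get
    best = None
    for axis in (0, 1):                    # 0: vertical cut, 1: horizontal cut
        coords = sorted(p[axis] for p in people)
        cut = (coords[k - 1] + coords[k]) / 2  # puts exactly k people below
        length = (y1 - y0) if axis == 0 else (x1 - x0)
        if best is None or length < best[0]:
            best = (length, axis, cut)
    _, axis, cut = best
    low = [p for p in people if p[axis] < cut]
    high = [p for p in people if p[axis] >= cut]
    if axis == 0:
        b1, b2 = (x0, y0, cut, y1), (cut, y0, x1, y1)
    else:
        b1, b2 = (x0, y0, x1, cut), (x0, cut, x1, y1)
    splitline(low, b1, a, out)
    splitline(high, b2, nd - a, out)

import random
random.seed(7)
people = [(random.random(), random.random()) for _ in range(1000)]
leaves = []
splitline(people, (0.0, 0.0, 1.0, 1.0), 8, leaves)
print([len(p) for _, p in leaves])  # prints eight districts of 125 people each
```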
Consequently, it seemed reasonable to guess that Olson and splitlining would often output quite different maps. However, examining Olson's 50 USA state maps (based on the year-2000 census) versus our splitlining-based maps of the same 50 states, we find that they are actually usually astonishingly similar! Consider Texas. Top left: the official district map produced by the gerrymandering slimeballs in the TX legislature (32 districts). Top right: Olson's near-optimal map. Bottom left: splitlining's map. Bottom right: map by Spann, Gulotta & Kane trying to minimize a sum-of-squared-distances-based measure (they use a method like our 2-phase improver working on a "flat Earth," allowing up to 4% population difference between the most and least populous districts).
The official map is clearly utterly despicable garbage. And if you liked TX's official districts based on the 2000 census, you'll love the upcoming new ones from the 2010 census – below is a detail inset from the Texas State Legislature's "plan C125" for congressional redistricting:
Meanwhile, Olson's, the squared-dist-based, and the splitlining maps are all pretty reasonable. Olson's map has jagged boundaries if viewed closely, mainly because it follows census-block boundaries (if Olson changed his program to cut through census blocks, his boundaries would smooth out). Note that Olson's districts have curved boundaries (even after smoothing), while splitlining and optimizing the squared-distance-based measure both always produce district boundaries made of straight lines (on a flat Earth; on a round Earth splitlining still uses "straight" geodesic segments, but the squared-dist minimizer would employ slightly non-straight curves). The basic structure of the Olson and splitline maps is almost identical, except that Olson has handled the two big districts in the western part of the state differently than (and in this case I think better than) splitlining.
And here's Virginia:
Official | Olson | Splitlining
Again, the official gerrymandering at top left is despicable garbage, and Olson and splitlining both do a decent job and again are very similar, except for the big western district (which this time I think splitlining handles better than Olson) and the fact that splitlining pretty much puts the Delmarva peninsula in the East (Accomack & Northampton counties; Chincoteague, Cape Charles) in its own district, which I also prefer versus Olson. (It can't entirely separate it, since there is not enough population in this peninsula to comprise a full district, but it tries the best it can.) We could similarly go through other states:
TX (32 districts) | Official & Olson | Splitline | Σdist²-based* | As re-gerrymandered in 2003 to favor Republicans
NY (29 districts) | Official & Olson | Splitline* | Σdist²-based | Census dept. provided erroneous geographic data (chopped off part of Long Island near Greenport).
AZ (8 districts) | Official & Olson* | Splitline | Σdist²-based | Districted by an "independent commission" (ha!); note the green district is contiguous, but joined by filaments too fine to see since <½ pixel wide.
IL (19 districts) | Official & Olson | Splitline | Σdist²-based* | Several too-fine-to-see filaments in the official map.
All 50 states | Official & Olson | Splitline | | (Unrelated: what if the entire USA (lower 48) were districted in one go?)
Here's New Jersey (also districted by an "independent commission," ha ha ha):
Official | Olson | Splitline
The conclusion for other states is almost always basically the same as for VA, TX and NJ.
This exercise has led me to be more impressed with splitlining – even though it makes no serious effort to optimize, it nevertheless does quite well most of the time. Certainly Olson and splitlining yield maps far more similar than I would have naively expected.
Tentative "current fairness" analysis: Another kind of comparative analysis has been made by "Walabio" and by Jameson Quinn. Walabio claimed that splitlining's California map was "fairer" than Olson's, in the sense of better simulating proportional representation.
Jameson Quinn then confirmed this assessment and examined some other states besides CA, again concluding that splitlining, at least with USA 2000-era voter distributions, tends to produce "fairer" district maps than Olson's computer program, i.e. maps better simulating proportional representation. I warn the reader that these conclusions are tentative, based on manual inexact examinations of not very many US states. They could be just wrong. Even if they are valid, I'm not sure what we should conclude. In the future, for example, maybe urban voters will prefer Republicans – well, that would just yield reverse-direction unfairness – but more seriously, the geographic distributions of party affiliation within states might change in some other way, whereupon Olson's maps might become the "fairer" ones. I don't currently understand that very well... but anyhow, as matters presently stand, there is a not-very-confident conclusion that splitlining's maps are "fairer" than Olson's.
We thank Stefano Zunino, James Gilmour, "Walabio," and Jameson Quinn for helpful comments and/or error-corrections. The shortest splitline algorithm was programmed by Ivan Ryan.
A. References about Weiszfeldlike iterations:
Weiszfeld's original paper [Tohoku Math. Journal 43 (1937) 355-386]
has been translated into English and republished:
Endre Weiszfeld: On the point for which the sum of the distances to n given points is minimum,
translated and annotated by F. Plastria, Annals of Operations Research 167,1 (March 2009) 7-41.
["Andrew Vazsonyi" and "Endre Weiszfeld" were same person.]
Other papers on this include:
Leon Cooper:
An extension of the Generalized Weber Problem,
J. of Regional Science 8,2 (1968) 181-197.
[Incidentally, this is not the physics Nobel prize winner Leon N. Cooper.]
Ulrich Eckhardt: Weber's problem and Weiszfeld's algorithm in general [Banach] spaces,
Math'l Programming 18,1 (1980) 186-196.
I. Norman Katz:
Local convergence in Fermat's problem, Math'l Programming 4,1 (1973) 98-107;
6,1 (1974) 89-104. [Shows the Weiszfeld algorithm generically converges in ultimately geometric,
aka "linear," fashion, but convergence can be either sub- or superlinear in special cases.]
I. N. Katz & L. Cooper: Optimal location on a sphere,
Computers Math. Applic. 6,2 (1980) 175-196.
[Propose a Weiszfeld-like iteration for the Fermat problem on a round Earth.
It appears to converge generically in numerical experiments, e.g. see 27,6 (1994) 37-50.]
Harold W. Kuhn: A note on Fermat's problem, Math'l Programming 4,1 (1973) 98-107.
[Careful proof of convergence of the Weiszfeld iteration to the optimum. But his final theorem
is only correct in 2 dimensions or if the convex hull of the sites is full-dimensional.]
H. W. Kuhn & R. E. Kuenne:
An efficient algorithm for the numerical solution of the generalized
Weber problem in spatial economics, J. of Regional Science 4,2 (1962) 21-33.
[Found 7 Weiszfeld iterations always sufficed to solve 15 problems with up to 24 sites.
Although it is possible to devise problems for which Weiszfeld's convergence is very slow,
they arise rarely in practice.]
James G. Morris:
Convergence of the Weiszfeld Algorithm
for Weber Problems Using a Generalized "Distance" Function,
Operations Research 29,1 (Jan-Feb 1981) 37-48.
[Morris and Cooper-1968 both consider minimizing the sum of Kth powers
of the distances to the sites, K>0. Morris further
generalizes by allowing L_{p} distance for any p≥1.
The triangle inequality holds if p≥1 and 0<K≤1, but not if K>1 or 0<p<1.
The summed-power-distance function is concave∪ if K≥1 and p≥1, but not if 0<K<1.
Morris devises a Weiszfeld-like iteration and proves convergence if 1≤p≤2 and 0<K≤p
and it never lands exactly on a site. Cooper considers only p=2 and
considers the obvious generalization of Weiszfeld's iteration in which the site-weights are
proportional to the (K-2)th power of the distance to that site. He conjectures this
iteration generically converges to the optimum for every K with 1≤K<∞,
based on the fact that he successfully solved 180 such problems by computer.
But this has only been proven for K=1 and K=2.]
Lawrence M. Ostresh Jr: On the convergence of a class of iterative methods for solving the
Weber location problem, Oper. Res. 26,4 (1978) 597-609;
Convergence and Descent in the Fermat Location Problem,
Transportation Science 12,2 (May 1978) 153-164.
[Ostresh's 2nd paper first notes that
if the sites are noncollinear the
sum-of-distances function is strictly concave∪ hence has a unique minimum.
Given that, he
proves Weiszfeld's descent theorem in 1 paragraph (thm2) then further
proves (thm3) that any
point in a ball centered at the new Weiszfeld point, with radius being
the distance to the old one, yields descent. That makes it trivial to avoid the
slight flaw in the Weiszfeld iteration that if you land exactly on a site, you need
to divide by 0 and die.]
Liping Zhang:
On the convergence
of a modified algorithm for the spherical facility location problem,
Operations Research Letters 31,2 (2003) 161-166.
[Gives an algorithm which he proves under certain circumstances always converges to the
Fermat global-optimum point on a round Earth. Because the sum-of-spherical-distances
function is "spherically concave∪" if all sites lie on the same hemisphere, and "strictly"
so if they do not all lie on a single geodesic, it is clear there is no fundamental difficulty in
devising such an algorithm; many approaches will work. Similar remarks can be made
about the K-powered L_{p} distance variant with K≥1 and p≥1.]
B. The COMAP/UMAP districting-math contest of 2007: UMAP (undergraduate journal of maths & applications) issue 28,3 (Fall 2007), pages 191-332, contains 5 papers on political districting. Except that really, it contains only one paper, because pages 261-450 are missing from the printed journal, though allegedly available on a CD-ROM. (Not in my library!) These papers were judged the best 5 among 351 submissions (by teams of undergraduates all over the world) to a contest. Here are those papers:
Ingredients of our suggested two-phase approach and the Voronoi theorem were discovered, rediscovered, and/or refined by a number of people listed above and below (including by me; the earliest discoverers seem to be Weaver and Hess in the early 1960s), but we've relied mainly on its treatment by Spann et al.
C. Other books and papers:
Sanjeev Arora: Polynomial-time Approximation Schemes for Euclidean TSP and other Geometric Problems, Journal of the ACM 45,5 (1998) 753-782.
Sanjeev Arora: Approximation schemes for NP-hard geometric optimization problems: A survey, Math'l Programming 97,1-2 (July 2003) 43-69.
D. References about measuring district-shape "quality" (mostly embarrassingly bad ideas):
Ronald R. Boyce & W. A. V. Clark: The concept of shape in geography,
Geographical Review 54 (1964) 561-572.
Christopher P. Chambers & Alan D. Miller: A measure of bizarreness,
Quarterly Journal of Political Science 5,1 (2010) 27-44.
C. C. Harris: A scientific method of districting, Behavioral Science 9 (1964) 219-225.
H. F. Kaiser: An objective method for establishing legislative districts,
Midwest J. Political Science 10 (1966) 200-213.
Lee Papayanopoulos: Quantitative measures underlying apportionment methods,
Annals New York Academy of Sciences Special Issue 219 (1973) 181-191.
E. C. Roeck: Measuring compactness as a requirement of legislative apportionment,
Midwest J. Political Science 5 (1961) 70-74.
Joseph E. Schwartzberg: Reapportionment, gerrymanders, and the notion of compactness,
Minnesota Law Review 50 (1966) 443-452.
Peter J. Taylor: A new shape measure for evaluating electoral districts,
American Political Science Review 67,3 (1977) 947-950.
Takeo Yamada: A minimax spanning forest
approach to the political districting problem,
Int'l J. Systems Science 40,5 (May 2009) 471-477.
H. P. Young:
Measuring the compactness of legislative districts,
Legislative Studies Quarterly 13,1 (1988) 105-115.
[Discusses how bad various district-shape-quality measures advanced by
previous authors are, hence suggests giving up on all of them and abandoning the problem as
hopeless. But his review also is of low quality, e.g. it contains not a single theorem
and reaches some too-rapid conclusions via extremely shallow reasoning.]
E. Other Redistricting Websites
Brian Olson has a nice website where he exhibits maps drawn by his open-source program, which tries to use "simulated annealing" and immense amounts of computation to work out probably-near-optimum district maps for all 50 states ("optimum" under a bastardized variant of the least-average-distance-to-center measure described above). He's done that for both the year-2000 and 2010 censuses.
Drs. Michael McDonald & Micah Altman have an automated redistricting and districtinganalysis tool called BARD (Better Automated ReDistricting).
Dave's Redistricting has a program you can run to tinker with district maps if you have microsoft silverlight.
The Shortest Splitline Algorithm tries to make fair maps by choosing a reasonable split line at every step. Also a public-source program; successfully run on year-2000 census data for all 50 states. But the US Census Bureau changed their data format for 2010, breaking the program.
George L Clark has written his own redistricting software and come up with some good solutions, substantially similar to ours yet interestingly different, based on a perimeter measure.
David Burton is working on redistricting for North Carolina. He has a similar approach to Olson but wants districts with equal numbers of voters rather than equal population. (Which goal, incidentally, is unconstitutional according to the US Supreme Court.)
Kevin Baas's autoredistrict.org: "AutoRedistrict is a free and open source computer program that automatically creates fair and compact electoral districts. Simply open a shapefile, load in census and election data (if not already included in the shapefile), and hit 'Go.' AutoRedistrict uses a genetic algorithm to design districts that meet multiple criteria..." It seems highly sophisticated, but also complicated and computationally very intense, and he makes a lot of decisions that seem suspicious to me. Baas has a lot of original ideas.
High quality authoritative congressional district maps are available from nationalatlas.gov