RangeVoting.org - Answer to puzzle #21

Partial Answer to Puzzle #21: Getting rid of cycles in directed graphs

Puzzle #21 – how many arcs must be deleted from an n-node directed graph to get rid of all cycles? Let F(n) denote the maximum, over all n-node directed graphs G (note: singly-directed arcs only; bidirected arcs not allowed), of the minimum number of arcs you need to remove from G in order to cause G to become acyclic. Find values of F(n), upper and lower bounds on F(n), and asymptotics.

Motivation: This problem arises in the theory of Condorcet-like voting schemes. It is also relevant to the theory of the minimum feedback arc set problem, a known APX-hard NP-complete graph problem.

Answer: This innocent-looking puzzle turns out to lie rather deep... The current best known upper and lower bounds on F(n) for n<30 are shown below.

n	0	1	2	3	4	5	6	7	8	9
F(n)	0	0	0	1	1	3	4	7	8	12
n	10	11	12	13	14	15	16	17	18	19
F(n)	15	20	22-24	28-30	30-34	36-40	40-47	47-54	55-62	64-69
n	20	21	22	23	24	25	26	27	28	29
F(n)	65-77	74-87	81-95	92-105	99-116	110-126	122-137	135-150	138-161	151-174

Note on computer aid: I originally proved 8≤F(8)≤10 and 12≤F(9)≤13 and conjectured F(8)=8 and F(9)=12. I then proved the theorem that these two conjectures are equivalent (and they would imply F(10)≤16, F(11)≤20, F(12)≤25, etc). I then noted that the value of F(8) should be within the reach of computers. I then used a computer with the aid of Brendan McKay's files of "tournaments" to prove the conjecture. Also, the computer found the values of F(n) for all n≤8 as a side effect, confirming the values I found by hand (and F(5)=3 is achieved by a unique tournament). The computer would also have been able to confirm F(9)=12 directly, but it would have taken it about a day, so I killed the job after 40 minutes. I estimate that F(10) could be determined by the current program in about 1-2 years. By only searching the semi-regular tournaments (i.e. performing a nonexhaustive search which probably will be good enough), my computer found F(9)≥12, F(10)≥15, and F(11)≥20 (achieved by a unique semi-regular tournament but 55 others reach 19). Although these might be thought only to prove lower bounds because of the non-exhaustive nature of the search, we shall see below with a little extra reasoning how they really prove equality in all three cases. Also, it turns out my values of F(n) for n≤10 had already been proven by A.Kotzig in Discrete Maths. 12 (1975) 17-25, also with the aid of a computer. With a better program (branch and bound search with "early cutoffs"), it would be possible to process all semi-regular tournaments with 12 and 13 nodes in about 1 week, which probably would suffice to determine F(12) and F(13), although I do not see how or why the word "probably" could be replaced by "definitely." Finally, the computer can be used to prove lower bounds by finding the best way to remove arcs from specific graphs devised by human inventors. This was how I proved many lower bounds listed in the next table. It would be good to do an exhaustive search of all cyclic graphs of each size, and of all "cyclic plus one extra vertex" graphs ditto; and to search all possible ways of extending record-showing graphs by adding one extra vertex (and to search all possible ways to remove their vertices). I have not carried out any of those ideas.

In much of the below, however, we shall disregard the computer and work by hand.

News flash: Irene Charon and Olivier Hudry noted that on the 20th page of their paper Aggregation of preferences into a linear order: a survey on the linear ordering problem for weighted or unweighted tournaments, are stated the following results got from [J-C. Bermond & Y. Kodratoff: Une heuristique pour le calcul de l'indice de transitivite' d'un tournoi, RAIRO 10,3 (1976) 83-92] which are sometimes worse than our results above, but mostly better:

F(12)=22 F(13)=28 n = 14 15 16 17 18 19 20 21 22 23 24 25 F(n) lower bound 31 38 40 47 55 64 64 72 81 92 92 103 F(n) upper bound 32 39 44 52 58 67 73 83 91 102 110 122

(End of news flash.)

The results for n≤4 are by exhaustive manual consideration of every possible directed graph.

QRgraph(P): A construction of an infinite family of directed graphs:

The following directed graphs may have something to do with the answer.

Let P be any prime of form 4k-1 (P=3,7,11,19,...). Let G be the directed graph with P vertices got by joining X→X+S whenever S is a nonzero square modulo P. (Note that this is self-consistent because -1 is not a square mod such P.) This directed graph is extremely symmetric. Namely, it is invariant under the group of (P-1)P linear transformations that map X to AX+B mod P with A nonzero (where if A is nonsquare the graph gets orientation-reversed). Note the symmetry group is arc-transitive. Perhaps, for an infinite subset of such P, this digraph maximizes (over all digraphs with the same number of nodes) the minimum number of arcs you need to remove to get acyclicity.

More generally, we can permit P to be any prime power of form 4k-1, not merely a prime – but then arithmetic has to be done in the finite field with P elements. The first interesting case of that is P=27=3³, in which case we can use this finite field.

Lower bounds on F(n) from QRgraphs and related graphs (usually with cyclic symmetries):

n	F(n)≥	Reason
3	1	QRgraph(3) is just the 3-cycle, which proves F(3)=1.
5	3	Cyclic graph with P=5 and S={1,3}. Note, this not actually generated by the QRgraph construction since P=5 is not the right kind of prime and this S are not the quadratic residues mod 5, but it works. We permit arbitrary P and arbitrary modular-increment-sets S in general "cyclic graphs." This is the unique tournament that achieves 3.
6	4	Delete a vertex from QRgraph(7). Another way to prove F(6)≥4 is to use the cyclic graph with P=6 and S={3,5}. Warning: in an earlier version of this file I had said 5 not 4 because of a typo.
7	7	QRgraph(7), which has S={1,2,4}. There are only two graphs that demonstrate F(7)≥7; the other is got from QRgraph(7) by reversing the orientation of a 3-cycle.
8	8	Cyclic graph with P=8 and S={1,3,4}. [Exactly 8 arcs are misdirected with respect to the vertex order 0,5,2,7,4,1,6,3 and this is the best order as shown by exhaustive search by a branch-and-bound program.] Also arises from extending QRgraph(7) by adding a vertex, see text. Also arises from deleting a vertex from the constructions showing F(9)≥12. Exhaustive computer examinations of 8-node tournaments shows F(8)=8.
9	12	Cyclic graph with P=9 and S={2,5,6,8}. Best vertex order 6,7,1,4,5,8,2,0,3. Also another graph based on a Steiner Triple System shows this more elegantly, see text.
10	15	Delete a vertex from QRgraph(11). A computer examination of all 13333 semi-regular 10-node tournaments (which took several hours; "semi-regular" means that every invalence of each node is 4 or 5) found that the one that yielded the largest F(10) lower bound gave F(10)≥15. This indeed proves F(10)=15 because a tournament with a node with indegree≤3 would (by considering deleting all those incoming arcs) be convertable to acyclicity by deleting ≤3+F(9)=3+12=15 arcs.
11	20	QRgraph(11), which has S={1,4,9,5,3}. Best vertex order 4,6,3,2,1,9,0,8,10,5,7. This is the unique regular tournament that achieves 20.
12	21	Delete a vertex from F(13)≥27.
12	22	Delete a vertex from F(13)≥28.
13	27	Cyclic graph with P=13 and S={3,4,5,7,11,12}. Best vertex order 7,0,2,8,3,9,4,10,5,11,6,12,1. This is highly non-unique. There are about 70 thousand different isomorphism types of tournaments which demonstrate the lower bound 27; about one random regular tournament in 22 works.
13	28	Demonstrated by only two graphs that I currently know of. Regular 13-node tournaments demonstrating F(13)≥28 are very rare (about 1 in 200,000 or rarer).
14	29	Add a 14th node with arcs alternately incoming and outgoing from the 13 old vertices to the cyclic graph showing F(13)≥27. Another proof: Cyclic graph with P=14 and S={1,3,4,5,7,8}. Best vertex order 9,8,5,1,11,7,4,3,0,13,10,6,2,12. Another: delete a vertex from F(15)≥36.
14	30	There is a way to add a 14th node to one of the graphs showing F(13)≥28 that proves F(14)≥30.
15	36	Cyclic graph with P=15 and S={1,2,3,5,6,8,11}. Best vertex order 2,1,14,11,9,8,6,3,0,13,12,10,7,5,4.
17	47	Cyclic graph with P=17 and S={3,8,10,11,12,13,15,16}; best vertex order 10,15,0,5,7,12,9,14,2,16,4,6,11,13,1,3,8.
19	64	QRgraph(19). Best vertex order 1,13,16,9,12,3,5,15,8,11,4,7,6,0,18,2,14,17,10. Delete 1, 2 or 3 vertices to show F(18)≥55, F(17)≥47, and F(16)≥40.
20	65	Add a 20th node with arcs alternately incoming and outgoing from the 19 old vertices to QRgraph(19). Also can prove using cyclic graph with P=20 and S={1,3,6,7,8,9,10,15,16}: Best vertex order 16,13,10,7,1,15,0,12,14,6,5,19,4,18,17,9,11,3,8,2.
21	74	Cyclic graph with P=21 and S={1,3,6,7,8,9,10,15,16}; best vertex order 16,13,10,7,1,15,0,12,14,6,5,19,4,18,17,9,11,3,8,2.
22	81	Delete a vertex from QRgraph(23).
23	92	QRgraph(23). Best vertex order 10,1,15,6,20,11,2,16,7,21,12,3,17,8,22,13,4,18,9,0,14,5,19.
24	97	Cyclic graph with P=24 and S={1,3,4,5,6,7,11,12,15,16,22}. Best vertex order 13,10,12,7,9,6,8,3,5,2,4,23,1,22,0,19,21,18,20,15,17,14,16,11. Also can be shown by deleting a vertex from F(25)≥109 or deleting three vertices from F(27)≥133.
24	99	Delete three vertices from F(27)≥135.
25	109	Cyclic graph with P=25 and S={2,3,4,6,7,9,10,11,12,17,20,24}. Best vertex order 14,10,7,3,0,8,21,4,18,1,22,15,23,11,19,16,12,9,5,6,2,24,20,17,13. Also can be shown by deleting two vertices from F(27)≥133.
25	110	Delete two vertices from F(27)≥135.
26	115	Extend the cyclic 25-node graph above by adding a 26th node with arcs alternately incoming and outgoing from the 25 old vertices. Best vertex order 16,9,24,14,7,4,22,12,2,20,17,10,0,18,25,15,8,5,23,13,6,3,21,11,1,19.
26	120	Delete a vertex from F(27)≥133.
26	122	Delete a vertex from F(27)≥135. Best vertex order 10,9,5,12,8,22,7,18,17,20,1,11,23,3,19,13,15,2,4,25,14,21,6,16,24,0.
27	124	Cyclic graph with P=27 and S={1,2,3,4,5,7,10,11,14,15,18,19,21}. Best vertex order 18,14,11,7,4,0,24,20,17,10,3,23,16,13,9,6,2,26,22,19,15,12,8,5,1,25,21.
27	126	Delete four vertices from QRgraph(31).
27	133	I tried several doubly-regular 27-node tournaments arising from 28x28 skew Hadamard matrices but the branch and bound program failed to complete the search on most of them. However, the search did complete on this Hadamard graph finding (after 3 hours) best vertex order 16,0,12,9,1,2,6,3,22,15,8,5,13,18,21,7,20,25,11,17,14,4,26,10,19,23,24. For graphs with 25 or more nodes, the B&B program sometimes fails to complete the search, and failure gets more and more likely the more nodes you have; with 31 nodes it almost always fails.
27	135	Branch and bound program was unable to find the best vertex order for QRgraph(27) directly even after 2 days of computing – and incidentally this graph is a bit tricky to construct because it is not a simple cyclic construction (circulant adjacency matrix) since GF(27) is not just the integers mod 27 because 27 is not a prime, but rather a prime cube. Here is its adjacency matrix. But, utilizing the little grey cells, we observe that since QRgraph(P) has arc-transitive automorphism group, all vertices are equivalent, and hence we can delete one, find the best vertex-ordering of the reduced graph, and then place the extra vertex on top of that ordering. The branch-and-bound code was able to do that i.e. proving F(26)≥115, in 40 minutes, Indeed (after allowing a few more grey cells to contribute!) we realize that we can wlog delete both vertices in an arc then restore them/it to the top of the ordering, because the top arc in the best vertex ordering must be compatibly directed. For that (proving F(25)≥110, and F(26)≥122, and F(27)≥135 indirectly) the branch-and-bound code only needs 7 minutes.
28	136	Cyclic graph with P=28 and S={2,4,7,8,14,15,16,17,18,19,22,23,25}. Best vertex order 16,14,19,17,22,15,20,25,18,23,0,21,26,3,24,1,6,27,4,9,2,7,12,5,10,8,13,11.
28	138	Delete three vertices from QRgraph(31).
29	146	If the cyclic F(28)≥136 graph is extended by adding a 29th node with arcs alternately incoming and outgoing from the 28 old vertices, we get a graph which proves F(29)≥146. [Best vertex order 16,14,19,12,24,25,17,10,2,22,23,15,8,0,20,6,26,18,28,1,11,21,7,3,27,4,13,9.] Another proof arises from the cyclic graph with P=29 and S={1,2,3,5,6,12,13,15,18,19,20,21,22,25}, best vertex order 0,10,26,7,23,4,20,1,17,27,14,24,11,21,8,18,5,15,2,12,28,22,9,25,19,6,16,3,13.
29	151	Delete two consecutive vertices from QRgraph(31). Best vertex order 21,19,11,1,28,14,12,10,27,25,23,9,7,5,22,20,18,4,2,0,17,13,3,26,24,16,15,8,6. (Took 3 hours of computing, and also establishes the next two values because of our earlier remarks about grey cells.)
30	165	Delete a vertex from QRgraph(31).
31	180	QRgraph(31).

Side remark for group-theorists: My conjecture that F(8)=8 (later proved with computer aid) was inspired by the following. Consider extending QRgraph(7) by adding one more node. We claim that no matter how one directs the 7 arcs that connect the 7 old to the one new vertex, it is always possible to remove 8 arcs from the resulting graph to make it acyclic. To see that: First of all, at least 2 of the 7 new arcs must be ingoing (and at least 2 outgoing), otherwise we could simply delete the ≤1 ingoing new arcs plus the 7 old deleted arcs and our claim that 8 deletions suffice would be proven. So suppose wlog there are either 2 ingoing and 5 outgoing among the new arcs, or there are 3 ingoing and 4 outgoing. We now shall use the fact that the 7-vertex digraph is arc-transitive; each of its 21 arcs can be mapped to each other. Select an ingoing arc and (by means of a symmetry transformation of the 7-vertex graph to renumber its vertices) make its source vertex be vertex 0. Now delete all 7 arcs i→j of the 7-graph with j<i. If there is only one more ingoing new arc, we are done since it can be deleted to render the graph acyclic. So assume there are two. In that case, it is easy to see that at least one pair among the three old source-endpoints of the three ingoing arcs, must be connected by an old graph-arc. So by a symmetry we can wlog assume those two endpoints are vertices 0 and 1. Then the remaining ingoing new arc can be deleted to render the graph acyclic. Q.E.D. We also claim (which proves F(8)≥8) that any way whatever to connect an 8th vertex to the 7-node graph by 3 ingoing and 4 outgoing arcs (or their global direction-reversal) requires removal of 8 arcs to make it acyclic. Because: there are 3·4=12 in-out node-pairs among the 7, and 7·6/2=21 arcs in the 7-node graph. Since 12+14>21 by the pigeonhole principle at least some of those pairs must be connected by nondeleted arcs (even after 7 deletions) in the 7-node graph, and in the right direction to make a 3-cycle though the extra vertex. Hence more than 7 deletions are required. Q.E.D. This analysis suggests the conjecture that F(8)=8.

The highly-transitive Mathieu sporadic groups lurk underneath the hood of QRgraph(11) and QRgraph(23) and might enable one to prove some better bounds using similar techniques.

SkewHadamardGraph(N): Another construction of an infinite family of directed graphs: If N=4k-1 and a 4kX4k skew Hadamard matrix exists, then you can try "normalizing" the matrix so its first row is all "-" and first column is all "+" (that is, change signs of rows and columns bodily until this is true), removing this first row and column, and using the rest (ignoring the diagonal) as the adjacency matrix of our N-node directed graph. In fact our QRgraph(P) construction above can be interpreted as merely a special case of this Hadamard construction. (I have only tried a few instances of the skew Hadamard construction and in no case was a new record discovered.)

A divide-and-conquer upper bound: F(a+b)≤F(a)+F(b)+ab/2.

Superadditivity: F(a+b)≥F(a)+F(b).

A proof of a quadratic lower bound on F(n): Employ a Steiner Triple System on n points [which exists iff n=1 or n=3 mod 6, and contains (n-1)n/6 triples, any pair of triples overlapping at ≤1 points]; we can thus force an n-vertex directed graph to contain (n-1)n/6 edge-disjoint 3-cycles. I.e, make a directed 3-cycle on each Steiner triple.

Hence we get this lower bound: F(n)≥(n-1)n/6 if n=1 or n=3 mod 6. More generally you can use a maximum triple-packing ["A(n,4,3)" in the terminology of constant-weight codes rather than a Steiner Triple System (thus extending the method to also handle n=0,2,4,5 mod 6). One can indeed show F(n)≥floor(n*floor((n-1)/2)/3), see Chartrand et al. J.Combinatorial Theory 10 (1971) 12-41. However, these lower bounds are not always optimal, as is shown by the fact that the cyclic Steiner Triple System on 13 nodes (increments 1,3,9,2,6,5) proves F(13)≥26 but no more, whereas our (different) cyclic graph proves F(13)≥27.

Examples: F(7)≥7: consider the graph with vertex set Z mod 7, where there are arcs from X to X+1, X to X+2, and X to X+4 all mod 7. Then the triangles X→X+1→X+3→X all are edge-disjoint so you need to remove at least one arc from each to make it acyclic.

F(8)≥8= A(8,4,3); F(9)≥12 because of the Steiner triple system on 9 points arising from the rows, columns, and generalized diagonals of a 3x3 matrix; F(10)≥13= A(10,4,3); F(11)≥17= A(11,4,3); F(12)≥20= A(12,4,3); F(14)≥28= A(14,4,3); F(16)≥37= A(16,4,3); etc.

To consider the case F(9)≥12 in more detail: Make this graph G: it has 9 vertices arranged in a 3x3 grid (with periodic boundary conditions). Each grid point emits four arcs: to its North, East, NorthEast, and SouthEast neighbors. You can acyclify this graph by deleting the East, SouthEast, and NorthEast arcs emanating from the three West-column nodes, plus the North arcs emanating from the three South-row nodes (12 arcs deleted in all). That proves F(9)≥12. It also suggests the conjecture F(9)=12. One can further see that if any vertex is deleted from this 9-vertex digraph, the resulting 8-vertex digraph can be made acyclic by deleting 8 arcs. That reinforces our previous conjecture that F(8)=8. Also in the other direction, if F(8)=8 then the divide-and-conquer upper bound would prove the conjecture F(9)≤12, so the two conjectures are equivalent.

Increasing nature of F(n): Obviously, F(n+1)≥F(n).

A trivial quadratic upper bound on F(n): Order the n vertices arbitrarily from 1 to n. Now delete all the arcs ij with j>i or delete the ones with j<i, whichever arc-set is smaller. Hence F(n)≤(n-1)n/4.

A proof of some upper bounds on F(n): (Actually, some of these inequalities also can be used "in the other direction" to get lower bounds.) F(n+1)≤F(n)+floor( n/2 ), because use the best way to delete arcs from the graph on the first n vertices, and then remove either all ingoing or all outgoing arcs from the final vertex (whichever set is smaller).

A slightly better version of the above argument is this: F(n+1)≤F(n-1)+n-1 which you can see by adding two vertices and deleting all the arcs that enter that pair (or leave it, whichever arc-set is smaller).

Even better is this: F(n+3)≤F(n)+floor(3n/2), To see that, add three vertices which are not linked in a directed 3-cycle. (Three such vertices must exist in G if n>0, as you can see by drawing 4-vertex configurations, so wlog make them be the "last three" vertices of G.) Then delete all the arcs incoming from the n-vertex outside world, to these 3 (or delete the ones leaving, whichever is smaller).

Examples: F(5)≤F(4)+2=1+2=3, F(6)≤F(4)+4=1+4=5, F(7)≤F(4)+6=1+6=7, F(8)≤F(5)+7=3+7=10, F(8)≤F(6)+6=4+6=10, F(8)≤F(7)+3=7+3=10, F(9)≤F(6)+9=4+9=13, F(10)≤F(9)+4=12+4=16, F(10)≤F(8)+8=8+8=16, F(10)≤F(7)+10=7+10=17, F(11)≤F(8)+12=8+12=20, F(12)≤F(9)+13=12+13=25, F(13)≤F(10)+15=15+15=30.

By employing induction on the lattermost bound starting from the computer-aided result F(9)=12 as basis, we find the following quadratic upper bound (there are actually 6 cases depending on n mod 6, but we only give the n=0 mod 6 case): F(6k)≤(9k-5)k-1. This is slightly stronger than the trivial upper bound (6k+1)3k/2, by an additive improvement of 13k/2+1.

It is possible to do better than all that by using "directed graph Ramsey numbers." Namely, let R(n) be the least number so that any tournament with at least R(n) nodes contains an acyclic n-node subtournament, where trivially R(2)=2, and (as we just noted) R(3)=4, but going further, R(4)=8, R(5)=14, R(6)=28, and R(7)≥32. (These values of R(n) are deducible from page 167 in the article on Tournaments by K. Brooks Reid in The Handbook of Graph Theory CRC Press 2004, or see our puzzle 27.) The sequence R(n) plainly exists for every n because

If the statement "any tournament with exactly R(n) nodes contains an acyclic n-node subtournament" is true, then the statement with "exactly" changed to "at least" also is true (by considering a subtournament).
But that statement is true for each n and some R(n), because consider some fixed ordering of the V vertices and consider only the arcs directed consistently with that ordering; we then have an undirected graph and by the usual Ramsey numbers R(n,n) for undirected graphs there must exist, if V≥R(n,n), a complete subgraph (i.e. acyclic tournament, with arc-orientations restored) on n nodes. Here R(1,1)=1, R(2,2)=2, R(3,3)=6, R(4,4)=18, and R(5,5) is not presently known aside from 43≤R(5,5)≤49.

So we have F(n+k)≤F(n)+nk/2 if n+k≥R(k). For example, F(n+4)≤F(n)+2n if n≥4 and F(n+5)≤F(n)+floor(5n/2) if n≥9 and F(n+6)≤F(n)+3n if n≥22.

Examples: F(8)≤F(4)+8=1+8=9, F(9)≤F(5)+10=3+10=13, F(10)≤F(6)+12=4+12=16, F(11)≤F(7)+14=7+14=21, F(12)≤F(8)+16=8+16=24, F(13)≤F(9)+18=12+18=30, F(14)≤F(9)+22=12+22=34, F(15)≤F(10)+25=15+25=40, F(16)≤F(11)+27=20+27=47, F(17)≤F(12)+30≤54, F(18)≤F(13)+32≤62, F(19)≤F(14)+35≤69, F(20)≤F(15)+37≤77, F(21)≤F(16)+40≤87, F(22)≤F(16)+48≤47+48=95, F(23)≤F(17)+51≤105, F(24)≤F(18)+54≤116, F(25)≤F(19)+57≤126, F(26)≤F(20)+60≤137, F(27)≤F(21)+63≤150, F(28)≤F(22)+66≤161, F(29)≤F(23)+69≤174.

About tournaments: Don Coppersmith, Lisa Fleischer, and Atri Rudra showed in a 7-page SODA 2006 paper Ordering by weighted number of wins gives a good ranking for weighted tournaments (pages 776-782) that the following algorithm will come within a factor of 5 of finding the minimum number of arcs to delete in a "tournament" (i.e. directed graph with every possible singly-directed arc): order the vertices according to their indegree (that is, in voting terminology, in order of their Borda, or in our case Copeland, scores) breaking ties randomly, and then delete all the arcs ij with j>i or delete the ones with j<i, whichever arc-set is smaller. This does not produce anywhere near as good bounds as our results here, but it does have the advantage of being an algorithmic, i.e. constructive, proof, as compared to our proofs which are "nonconstructive." We shall later see that random graphs are (usually nearly) extremal.

More: these papers follow up on Coopersmith et al. (I thank Warren Schudy for this remark): N.Ailon, M.Charikar, A.Newman: Aggregating Inconsistent Information: Ranking and Clustering, appeared in STOC (Sympos. Theor. Computer Sci.) 05. Warren Schudy & Claire Kenyon-Matheiu: How to Rank with Few Errors: A PTAS for Weighted Feedback Arc Set on Tournaments, Brown Univ. Dept. Computer Science Report TR06-144, Nov 2006; Another version appeared in STOC.

Finding the asymptotic behavior via solving the problem for random tournaments

The asymptotic behavior of F(n) has been bounded between two quadratics which asymptotically have ratio 3:2. We shall now show that the upper of those two bounds is the truth: F(n) is asymptotic to n²/4.

Using my Hoeffding tail bound for sums of random variables (paper #85 here) we see that an n-node tournament (a "tournament" is a commonly used name for an n-node directed graph with every possible arc) with randomly directed arcs chosen by (n choose 2) coin tosses will have a number of arcs crossing a given cut into X-node and Y-node pieces (X+Y=n), that is, with probability >1-exp(-2n), at most √(XYn) away (additively) from its expectation.

Therefore if we now consider all (n choose Y) possible ways to thus-cut the graph (which is less than 2ⁿ=exp(0.69...n) ways) then even the one which leads to the most extreme deviation from expectation, will still, with probability >1-exp(-1.3n), (for all sufficiently large n) be at most √(XYn) away (additively) from its expectation.

Now, consider bisecting (X=Y=n/2), then recursively bisecting the two resulting hemi-tournaments, etc. The result will be an ordering of the n nodes. Now the claim is that even with the most-extreme of the n! orderings, the number of not-going-in-the-right-direction tournament arcs will be equal to (n+1)n/4 plus a deviation which is small. Specifically, it is smaller than

n^3/2 (1 + 2(√3)/4 +...) = n^3/2 C where C is a positive constant (since the series is convergent), with probability >1-exp(-1.3n)n, which probability is very close to 1 for large n. If you must know, the value of the constant is, in the n→∞ limit, C=∫₀¹ √((1-u)u) du=π/8.

Now it is a well known fact that the minimum feedback arc set problem is equivalent to the problem of ordering the nodes to minimize the number of not-going-in-the-ordering's-direction arcs. [To see that, realize that an acyclic directed graph has an ordering called a "topological sort," whch may be found by successive removal of zero-invalence nodes, which is free of wrong-way arcs.] Therefore, we conclude that, with high probability (exponentially near 1 as n→∞) a random tournament digraph will have the property that you need to remove at least (n+1)n/4-Cn^3/2 arcs in order to get rid of its cycles. In this result, the best (i.e. least) possible value of C we can get with this proof technique is C=π/8/√(2-ln2)+ε<0.34351567. This is not good enough to improve on any of the lower bounds on F(n) in the table at top, because n=29 is not large enough to reach asymptopia.

Our bound F(n)≥(n+1)n/4-Cn^3/2 matches the trivial upper bound (n+1)n/4 asymptotically and thus solves the problem of determining the asymptotic behavior of F(n). It also proves as a side effect that the Coppersmith-Fleischer-Rudra algorithm – and also the trivial algorithm – both will, on random n-node tournaments, be optimal not merely to within a factor of 5, but with high probability actually optimal to within a factor of 1+O(n^-1/2). I also find it very plausible that my (nonrandom) graph construction based on quadratic residues will achieve these same asymptotics. That conjecture may or may not be within the reach of modern number theory.

Bounding the second term of the asymptotic behavior – Joel Spencer

Joel Spencer on page 136-137 of Networks 1 (1971) gave an incomprehensible argument to show the upper bound F(n)≤(n+1)n/4-Kn^3/2 for some constant K>10^-10, and on page 137 then claimed without proof that by "more careful" reasoning one somehow could improve this to K>0.1577989. If we accept that, then it would match our upper bound (except for the unequal values of the constants K and C).

Later, Spencer's claim (but with the constant K unspecified aside from noting that it is positive) was proved in only one paragraph on page 42 of P.Erdös & J.Spencer: Probabilistic methods in combinatorics (Academic Press 1974), by using their "quasi-Ramsey theorem."

Oddly enough, Erdös & Spencer were unable to get a matching upper bound (as we have done) and said that "the gap seems difficult to eliminate." But the way we did eliminate it, was precisely by using the "probabilistic method" so beloved of these authors – and it was quite simple and nice! So why did we succeed while they did not? The answer is that we used the probabilistic method in a better way, employing a Hoeffding tail bound. They did not try to use a tail bound – and that mistake cost them.

Later note: Erdös & Spencer's gap was closed (aside from C≠K), by Spencer himself rather poorly, and then by W. Fernandez de la Vega: On the maximum cardinality of a consistent set of arcs in a tournament, J. Combin. Theory B 35 (1983) 328-332 who showed C>1.72 (for sufficiently large n) by a tail bound based argument similar to ours. This is not as good a value for the constant C as our value, which is because he used a worse tail bound than we did (or because one of us made an arithmetic error). Spencer, according to WFdlV, had shown it for the far weaker C=4000.

Moral

As a consequence of our and the Spencer bound,

(n+1)n/4 - Cn^3/2 < F(n) < (n+1)n/4 - Kn^3/2 for suitable positive constants C and K and all sufficiently large n. Perhaps this means we cannot expect there to be any nice formula for F(n).

Return to puzzles

Return to main page