Concise Prime-Rich Polynomials and Proth sequences; and speeding up the "Quadratic Sieve" Integer-Factorization algorithm

By Warren D. Smith, January 2009. PRELIMINARY.

ABSTRACT. Suppose we want a nonconstant polynomial f(n) with integer coefficients to output values automatically indivisible by the first K primes. Obviously, f(n)=An-1 works where A is the product of the first K primes. However, this is a long formula because A is asymptotically KlgK bits long ("lg" denotes log2). It appears we can make a much shorter formula by using f(n)=n2-Bn-C where D=B2+4C≡5(mod 8) is nonsquare modulo each of the first K odd primes and both B and C are odd. We present numerical evidence and nonrigorous reasons to believe that with this kind of formula at most K+o(K) bits always suffice. Further, by using polynomials of form f(n)=2nH-F with F odd and suitable highly composite H (for example make H/4 the greatest "primorial" such that H<PK), it appears formulas O(K/lnlnK) bits long always suffice, and if ridiculously large H are permitted perhaps only O(K/lnK).

But (we prove) these formulas cannot exceed the usual kind of asymptotic prime density c/lnN for certain constants c which depend on which formula it is – i.e, our formulas (with fixed parameters) merely yield comparatively large values of c.

We also point out how to devise prime-rich "Proth sequences," for example showing, for every n≥1, that P=28669968894403·4n+1 is indivisible by 2, 3, 5, 7, 11, ..., and 211, and such P are prime if and only if 3(P-1)/2 mod P=-1. This may aid in the hunt for the world's largest prime.

Finally, we show how our same circle of ideas should be useable to speed up the popular "quadratic sieve" general-purpose integer-factorization algorithm.

Consider the following three polynomial (and one non-polynomial) functions of n:

f1(n) = 5397346292805549782720214077673687806275517530364350655459511599582614290n - 1,
f2(n) = n2 - 24206229n - 17485613,
f3(n) = 2n60 - 35426473.
f4(n) = 116807971207·4n+1

All four functions f(n) have the property that, for every integer n, f(n) is indivisible by 2,3,5,..., and 181 (i.e, the first 42 primes). However, their description lengths vary considerably. As a matter of "data compression," it is interesting to try to find the shortest polynomial f(n) with the property that it is indivisible by the first K primes for every n. Surprisingly, it appears achievable to compress down to a number of bits sublinear in K.

Define primorial(K) to be the product of the first K primes, e.g. primorial(3)=2·3·5=30. [Other people have used the notation "5#" for this. But we will not.]

The First Construction, which dates back to Euclid, is

f1(n) = An-1     where     A=primorial(K).

It is well known ("asymptotics of Chebyshev's function") that

lg primorial(K)  =  KlgK + KlglnK - 1 + lglnK/lnK + O(1/lnK)

so that this kind of formula is asymptotically KlgK bits long. (Actually, the statement of asymptoticity to KlgK is equivalent to the "prime number theorem.")

The Second Construction is

f2(n) = n2 - Bn - C     where     B and C are odd and D=B2+4C is nonsquare modulo 3, 5, 7, ..., PK.

Note that D≡5(mod 8) is forced by the fact that B and C are odd:

D = B2+4C = (2k+1)2 + 4(2j+1) = 4(k+1)k + 8j + 5.

Once D is known it is a trivial matter to find suitable B and C, so we may focus on the questions of (1) seeing the construction works, (2) finding D and realizing it is not too large.

1. If n is even, then f2(n)=even+even+odd is odd. If n is odd, then f2(n)=odd+odd+odd is odd. Therefore f2(n) is indivisible by 2. If D is nonsquare modulo an odd prime P, then as a consequence of the quadratic formula, no roots of the quadratic f2(n) exist in the finite field of integers mod P. [More details: n=0 cannot be a root of f2(n) since f2(0)=-C is odd hence nonzero. If f2 had a nonzero root r, then by polynomial division – which involves only field operations – it would also have a second root s with r+s=B and rs=-C. From this we could deduce that 4(r-s)2=D (all of these equalities are in mod P arithmetic; we continually use Galois's theorem that mod P arithmetic is a finite field), so that D would be square modulo P, contrary to assumption.] Therefore f2(n) also is indivisible by 3, 5, 7, ..., PK.

2. As is well known (it follows from, e.g. Galois' theorem that the nonzero elements in a finite field form a cyclic multiplicative group) exactly half the nonzero elements modulo any fixed odd prime P are squares; plus zero is always a square. Hence the "probability" that a random integer D is nonsquare modulo some odd prime P, is exactly (P-1)/(2P). Actually, if we choose D to be a random nonsquare positive integer, then the chances D is nonsquare mod P rise above 1/2 to become (P-1)/(2P-⌊√P⌋). If all the K-1 such probabilities were "independent" we would conclude that the chance that some nonsquare integer D is nonsquare modulo all of the first K-1 odd primes would be at least

2≤k≤K (1/2)  =  21-K

Hence we would probabilistically "expect" the least positive D simultaneously nonsquare modulo all these primes to be about 2K-2 + 2K/2, i.e. expect it to be K+O(1) bits long.

Probabilistic computations of this ilk actually are rigorously valid if we have in mind selecting D uniformly from the integer interval [1, Primorial(K)·j/2] for any j=1,2,3,... – the justification arises from the "Chinese remainder theorem," which can be restated in probabilistic language as saying residuosity modulo relatively-prime moduli X,Y, are "independent" events for random-uniform samples in the interval [1,XY] – and they prove the density of valid D in that interval is asymptotic (when K is large) to about 2-K. [It is also a rigorous theorem that for any fixed nonsquare integer D>0, the fraction of the first K primes modulo which D is nonsquare, is asymptotic to 1/2 when K→∞; this follows from Dirichlet's theorem.] But unfortunately it is not rigorously justified to conclude that the least valid D in the interval is about 2K. Thus the following is only a conjecture:

The least positive integer D simultaneously nonsquare with respect to the first K-1 odd primes, and which is 5 mod 8, is at most K+o(K) bits long.

Provable [with weaker constant] under ERH?? Eric Bach?? This conjecture seems supported by the following numerical computation:

[least    #of initial oddprimes D is nonsquare modulo     last ]
[D,       (only minimal D with D=5 mod 8 listed),         prime]
5, 1, 3      since 5 not a square mod 3, but is square mod 5
53, 2, 5     since 53 is nonsquare mod 3 and mod 5, but is square mod 7
173, 4, 11
293, 5, 13
437, 6, 17
9173, 8, 23
24653, 9, 29
74093, 12, 41
170957, 13, 43
214037, 16, 59
2004917, 17, 61
44401013, 18, 67
71148173, 19, 71
154554077, 21, 79
163520117, 24, 97
261153653, 26, 103
1728061733, 29, 113
9447241877, 30, 127
1134843991613, 31, 131
2272201812173, 32, 137
2585307226157, 33, 139
3589495768493, 34, 149
4316096218013, 36, 157
166872686607413, 37, 163
585941592342893, 41, 181

Same thing except now for NEGATIVE D: -3, 0, 2 -19, 1, 3 -43, 3, 7 -67, 5, 13 -163, 11, 37 -77683, 13, 43 -1333963, 14, 47 -2404147, 16, 59 -20950603, 17, 61 -36254563, 18, 67 -51599563, 19, 71 -96295483, 21, 79 -114148483, 22, 83 -269497867, 26, 103 -585811843, 27, 107 -52947440683, 28, 109 -71837718283, 30, 127 -275848096987, 31, 131 [here and beyond I worry software or hardware may have had a problem...] -1747431378907, 34, 149 -23809638256363, 37, 157 -30059924764123, 41, 181

The particular f2(n) given at the beginning of the paper arises from the last line of the (positive-D part of the) table and the fact that 242062292+4·17485613=585941592342893.

The Third Construction is

f3(n) = 2nH - F    with    F odd and with F not 2 times a perfect Hth power modulo any of the first K-1 odd primes.

Why it works: Because F is odd, every value of f3(n) is odd. If F is chosen so that F is not two times a perfect Hth power modulo an odd prime P, then f3(n) will have no roots mod P and hence every value of f3(n) will be indivisible by P.

The point of this construction is that, if we choose H to be "highly composite," then random integers Y are unlikely to be perfect Hth powers mod odd primes P. For example, if H is divisible by all small primes, then XH≡(1 or 0) mod P because of Lagrange's theorem and the fact that H will be a multiple of P-1. Hence in order to be twice a perfect Hth power modulo such a P, Y would need to be 0 or 2 mod P, which for odd primes P is an event of probability 2/P, i.e, usually much lower than 1/2. So such H should be a big improvement over just using H=2 as in the Second Construction and we expect to get much smaller F here than D there. But sadly, that calculation, while illuminating, was most relevant to the case where P is large and H is very much larger, which is not so interesting for us. For us, a more interesting regime is to consider fixed H and then allow the primes P to get large.

For any fixed even H we can work out the exact geometric mean over all primes P not dividing H, of the probability that a random integer Y is a perfect Hth power (or twice a perfect Hth power; the same expression results) modulo that P. It is:

Geometric Mean Probability  =   ∏j⊥H [1 - GCD(j-1, H)-1]1/φ(H)

Here "j⊥H" is to be read as "j relatively prime to H and 0<j<H" and we are using the facts that

  1. There are φ(H) infinite classes of primes modulo H, where φ(H) is the Euler totient function counting the number of residue classes mod H relatively prime to H, and the primes are asymptotically equinumerous within each class (by Dirichlet's theorem)
  2. The probability a random integer will be a perfect Hth power mod j, where j is any fixed prime, is GCD(j-1, H)-1 (by Galois' cyclic-group theorem)

Thus, the expected number of bits we need to specify F so that F is odd and is simultaneously not (twice a) perfect Hth power modulo T different given random primes simultaneously, is Expected BitLength of Formula = T·EBF(H) where

EBF(H)  =   [-1/φ(H)] ∑j⊥H lg[1 - GCD(j-1, H)-1].

Please interpret an empty product as 1. Then this is, in fact, a rigorous exact result for any even H≥2, despite the flaky probabilistic sound of things [provided "random" means "independently uniform in a width-W positive-integer interval in the limit W→∞" and the "number of bits needed to specify" an integer that satisfies a probability-P property, means lg(1/P)]. Unfortunately, this is not the result we need, because we are interested in the first K-1 odd primes, which is not a random set of primes, and in the least F that works. Any application of this result to this different problem, therefore, will lead to nonrigorous results, albeit (I presume) approximately-correct ones.

Nevertheless, EBF(H) is an interesting function. Here is a list of all the H with 1≤H≤12252240 yielding record-low EBF(H) values:

 H,   EBF(H)                    H,        EBF(H)    prime factorization of H
===================             ============================================
2,    1.000000000               18480,    0.249192,  243·5·7·11
4,    0.7075187498              27720,    0.241689,  23325·7·11
6,    0.6315172026              55440,    0.236906,  24325·7·11
12,   0.4509006957              110880,   0.235757,  25325·7·11
24,   0.4150853517              120120,   0.234396,  233·5·7·11·13
48,   0.4069251910              240240,   0.229734,  243·5·7·11·13
60,   0.3568730739              360360,   0.222812,  23325·7·11·13
120,  0.3284600211              720720,   0.218399,  24325·7·11·13
240,  0.3219582192              1441440,  0.217339,  25325·7·11·13
360,  0.3122973335              2162160,  0.217217,  24335·7·11·13
420,  0.3044328129              2882880,  0.217079,  26325·7·11·13
840,  0.2801703095              3603600,  0.216565,  2432527·11·13
1680, 0.2746074801              4324320,  0.216164,  25335·7·11·13
2520, 0.2663470359              7207200,  0.215515,  2532527·11·13
5040, 0.2610819135              10810800, 0.215394,  2433527·11·13
9240, 0.2542455243              12252240, 0.205401,  24325·7·11·13·17

Thus if instead of using H=2 (essentially as in the Second Construction) which got us a formula of bit length asymptotic to K, we were to use H=12, then we would expect our f3(n) formula would have bitlength only 0.4509...K+o(K), and so on. I do not actually know the best H to use, but as a simply-described heuristic, I suggest choosing H=4Primorial(J) where J is chosen as large as possible subject to H<PK. In that case H will be asymptotically lg(K) bits long, which is negligible, and we conjecture that if this is done, then the least odd F meeting the conditions of the Construction – and hence the bitlength of the entire f3(n) formula – will be O(K/lnlnK) bits long.

The basis for this conjecture is this. Our "expected number of bits per prime" formula for EBF(H) is a sum of small terms. Consider multiplying H by Pj, i.e. moving the primorial to the next one. The asymptotic effect that will have on our "expected number of bits per prime" formula for EBF(H) will be, roughly, to shrink 1/Pj of the terms in the sum by factor of about Pj each (causing them to become comparatively negligible), while leaving the rest alone (and the terms affected will be a representative subset). The net effect is that the sum is multiplied by about (1-1/Pj). Thus, asymptotically, we would expect (with our heuristic choice of H) that EBF(H)≈C∏j≤J(1-1/Pj)≈C/ln(PJ) for two different positive constants C. [The reader may verify from the numerical table above, that EBF(H) does indeed behave this way, to good approximation, whenever H is multiplied by any decently-large prime Pj that was not already a factor of H. The second "≈" is because of the "prime number theorem" or just the (easier) "Mertens theorem." Actually weak forms of the PNT, such as Chebyshev's theorem that 0.92<π(x)ln(x)/x<1.11 for all sufficiently large x, suffice for most of the purposes of this paper. The EBF(H) formula will still be valid to within a constant factor for H this small, even though we are now permitting H to be a slowly-increasing function of K.]

If we understood more precisely exactly what the best H were, that perhaps would buy us another factor of, say, lnlnlnK. Also, we might be able to get more "compression" simply by making H bigger. It might be that by choosing H=4Primorial(J) for very much larger J (so large that the bitlength of H became of the same order as the bitlength of the entire formula) we could get the formula length down to only O(K/lnK) bits. But that is a more-dubious conjecture because it is less-clear that the EBF(H) formula is applicable in that regime. Also, even if true, this would yield a formula with an exponentially-large exponent H, i.e. yielding doubly-exponentially large values of f3(n), which I presume would be less useful. In contrast all the other Constructions (and this one too, if H is chosen according to the original heuristic) only yield singly-exponentially large f(n) values if n=KO(1), i.e. in which the bitlength of the numbers output by f(n) is O(KlgK) rather than exponential in K.

[least    #of initial oddprimes F is not twice a 60th     last ]
[F,       power modulo (only minimal odd F listed),       prime]
1, 7, 19
61, 8, 23
103, 13, 43
181, 21, 79
2113, 26, 103
6703, 27, 107
13723, 32, 137
77479, 33, 139
308953, 35, 151
1613671, 36, 157
3818953, 37, 163
9391819, 38, 167
9562099, 39, 173
35426473, 41, 181
96514681, 44, 197
1046763919, 46, 211
35272529113, 47, 223
63853813393, 50, 233
1616682498421, 51, 239
3979640305009, 52, 241
17460680232889, 54, 257  

Same thing except now for NEGATIVE F:
-1, 5, 13
-31, 6, 17
-61, 7, 19
-73, 10, 31
-127, 13, 43
-277, 26, 103
-24691, 28, 109
-45943, 36, 157
-5940097, 37, 163
-7158043, 38, 167
-11928871, 39, 173
-112891231, 41, 181
-120012421, 42, 191
-186908593, 43, 193
-941402953, 46, 211
-13019844121, 47, 223
-72967326283, 48, 227
-94944140623, 50, 233
-467832295003, 53, 251

The particular f3(n) given at the beginning of the paper arises from the line of the (positive-F part of) this table ending "181."

Reasons to think our constructions achieve approximately optimal "compression"

It is impossible for any nonconstant polynomial f(n) to output asymptotically 100% primes. If, indeed, it output even a single prime P, then by the properties of modular arithmetic f(n+jP) will be divisible by P for every integer j. This will be nonprime except when it is P (or -P if we count negated values too) but that can only occur at most deg(P) times. Hence f's asymptotic prime-density will be ≤∏j(1-1/Pj) where Pj are the primes f outputs. This density upper bound seems weak, though, since it can be made arbitrarily near to 100% by design of f(n).

What prime densities do we expect for the three Constructions we have proposed?

1. Dirichlet's theorem tells us that f1(n), for any fixed A, generates prime density C/ln(N) for 0≤n<N for some constant C (depending on A). Further, the Chinese Remainder Theorem tells us that our formula A=Primorial(K) is the unique least-possible coefficient A>0 of n in any linear-polynomial of n, such that its output is always indivisible by the first K primes.

2 & 3. The Hardy-Littlewood conjectures (or their refinement, the Bateman-Horn conjectures) tell us the presumed asymptotic prime-density of any polynomial and in particular that it is always of the form C/lnN for some constant C (the value of C depends on the polynomial). We can in fact prove an upper bound of this form for, e.g, the prime density of f2(n). To sketch the proof, one first shows using well-known factorization and reciprocity properties of the Jacobi symbol, plus the Dirichlet theorem, that D is square and nonzero modulo asymptotically exactly half of all primes P. Then for each such prime P, the output value of f2(n) is necessarily divisible by P in exactly two cases per every P consecutive n's. Hence the fraction of prime output values of f2(n) for 0≤n<N in the limit N→∞ is ≤∏j(1-2/Pj) where the product is over all j such that Pj is prime with D nonsquare and nonzero modulo Pj and such that, say, 0≤ Pj≤N1/5, and finally this by well known estimates is upperbounded by C/lnN. (My use of the exponent "1/5" suffices to cause the worst-case error estimates from "Brun's sieve" to become neglectibly small. More generally, Brun's sieve can be used to justify most arguments which regard residuosities of numbers in X-wide intervals modulo primes less than Xε as "probabilistically independent," for any sufficiently small constant ε>0. See Cojocaru & Murty 2005 and Halberstam & Richert 1974.)

It is similarly possible to prove a CN/lnN asymptotic upper bound on the number of primes output by any given nonconstant polynomial f(n) for 0≤n<N in the limit N→∞. The same proof works, except that instead of relying on properties of the Jacobi symbol to prove f(n) splits into linear factors modulo asymptotically half the primes, we use the Chebotarev density theorem (see Lenstra & Stevenhagen 1996) to prove f(n) has at least one linear factor modulo asymptotically at least a constant fraction of primes.

We further conjecture that f2's amount of "compression" is "best possible" in the following sense:

It is not possible for a nonconstant quadratic function of n to be always indivisible by the first K primes, if its bitlength is less than K-o(K).

Proth primes

There is prize money offered to find the world's largest primes. Our thinking in this paper can also help a little with that hunt. So far, the "world's largest" title has usually been held by "Mersenne primes," of form P=2N-1 with N prime. However, I suspect prime-hunters will gain some advantages by instead seeking certain kinds of "Proth primes," of form P=Q·2N+1 with Q odd and 3≤Q<2N:

  1. The primality test gets a little easier;
  2. By choosing Q cleverly, we can make the Proth sequence Q·4N+1 be comparatively rich in primes.

A sufficient condition for primality (F.Proth 1878) of Prothian numbers P=Q·2N+1 with Q odd, 3≤Q<2N, is as follows. Pick a small integer g≥3. Now P is prime if g(P-1)/2=-1 mod P. This condition also is necessary, i.e. we get a test for primality, provided g is such that the Jacobi symbol (g|P)=-1, or indeed if g is chosen in any other manner such that we happen to know g is nonsquare mod P whenever P is prime. For example, you can choose g to be any small prime such that P is nonsquare mod g.

Proof: One also would want g(P-1)/q≠1 mod P for all (necessarily odd) factors q of Q. but that is automatic because if it were violated, then g would have to be a square mod P, in which case g(P-1)/2=-1 mod P would be violated. Since it isn't violated, that proves g is a "generator" mod P, which proves P is prime. So we've proven the sufficent condition.

To prove the necessary condition, we need to show that if (g|P)=-1 and P is prime (or indeed if g is chosen in any other manner such that we happen to know g is nonsquare mod P whenever P is prime) then g(P-1)/2=-1 mod P. But this is immediate from Lagrange's theorem (also called "Fermat's little theorem" or the "Fermat-Euler theorem" in number-theory contexts) and the Galois theorem that the multiplicative group mod P, for P prime, is cyclic.

Finally note that P≡1 (mod 4), so that by Gauss's "quadratic reciprocity" theorem, if g is prime then g is nonsquare mod P whenever P is prime, if and only if P is nonsquare mod g. Q.E.D.

One way to choose Q to get prime-rich Proth sequences: If we choose Q so that -Q is nonsquare modulo some prime p, then Q·4n+1 will automatically be indivisible by that p, for all n≥0. Further, in order for -Q to be both odd and nonsquare mod 3 it is necessary for Q≡1 (mod 6). This causes it to be ok to use g=3 in Proth's primality test because 1·4n+1≡2 (mod 3) is always nonsquare mod 3, hence by quadratic reciprocity 3 is always nonsquare Q·4n+1.

Example: -37 is nonsquare mod 3, 5, 7, 11, 13, and 17 (but it is square mod 19 since 37+12=38≡0 mod 19). Hence 37·4n+1 is indivisible by 2,3,5,7,11,13, and 17 for all n>0. Further, P=37·4n+1 is prime if and only if 3(P-1)/2=-1 mod P.

A table of good Q is below. [Note that every Q in the table is indeed 1 mod 6, and that always-indivisibility by the first K primes as usual seems to be accomplishable with Q that are K+o(K) bits long.]

[least    #of initial oddprimes -Q is nonsquare       last ]
[Q,       modulo (only minimal odd Q listed),         prime]
7, 2, 5
37, 6, 17
163, 11, 37
26293, 12, 41
30493, 17, 61
852457, 18, 67
8408233, 22, 83
26915263, 23, 89
97241593, 24, 97
289303393, 28, 109
6697563007, 29, 113
11555912503, 30, 127
46029452527, 34, 149
705361404883, 35, 151

The Fourth Construction: We can do better. Modulo some primes P (specifically those such that P≡±1 mod 8) not only is 4 a square, but indeed a 4th power. We therefore can merely demand that -Q not be a 4th power modulo those primes. Since this is a weaker demand, more Q obey it. We similarly can make even weaker demands modulo P modulo which 4 is a higher power; what we really want is that 4nQ(mod Pk)≠-1 for every n>0 and for all of the first K odd primes Pk. Such Q are tabulated below.

[least    #of initial oddprimes 4nQ is never congruent        last]
[Q,       to -1 modulo (only minimal odd Q listed),          prime]
1, 1, 3 *  
3, 2, 5   
7, 3, 7 *
15, 6, 17
93, 8, 23
163, 11, 37 *
177, 17, 61
106177, 21, 79 *
675565, 24, 97 *
866595, 26, 103
1616035, 27, 107 *
6906463, 28, 109 *
15439495, 29, 113 *
17971167, 32, 137 
77883267, 33, 139 
193845465, 34, 149 
2757073113, 36, 157 
3467475715, 37, 163 * 
13179266445, 38, 167
64340102887, 39, 173 *
90118372723, 40, 179 *
116807971207, 41, 181 * 
591461386167, 42, 191  
5709766030753, 44, 197 * 
28669968894403, 46, 211 * 
[* indicates Q≡1 mod 3 hence g=3 works in Proth test. Otherwise Q≡0 mod 3.]

As you can see from the table, considerably smaller Q suffice. On the usual nonrigorous theoretical grounds, I expect Q will be below cK+o(K) bits long, where c is a new constant

c  =  -limK→∞ 1/(K-1) ∑2≤k≤K lg(1-oPk(4)/Pk)  ≈  0.65

where the sum is taken over the first K-1 odd primes Pk, and oP(4) denotes the "order" of 4 mod P, i.e. the least integer r≥1 such that 4r≡1(mod P).

For Q in the above list which happen to be 1(mod 3), one can use g=3 in the Proth test. For other Q, there is rarely (if ever) an all-purpose g which always is suitable fodder for Proth regardless of n; and hence we would have to choose g differently depending on the n. Some examples include:

In other cases it seems simplest just to use the least prime g such that P is nonsquare mod g (and find this g by trial).

As a check, I computed all P=5709766030753·4n+1 with 0≤n≤2000. Every one was indeed indivisible by all primes≤197. There were exactly 32 primes among them, arising from

n = 7, 17, 29, 33, 35, 45, 47, 71, 74, 76, 89, 108, 116, 120, 187, 188, 192, 193, 347, 351, 355, 397, 442, 472, 498, 570, 933, 1136, 1526, 1703, 1800, 1880.

In contrast, the "expected" number of primes in this set, had it not been specially constructed to obey all our magic conditions, would have been only about

0≤n≤2000 1/ln(5709766030753·4n+1) ≈ 3.31.

To hunt for new largest-prime records, I recommend sieving the P-sequence to remove members divisible by any of (say) the first million primes. Only primes p such that -Q is square mod p, need to be in the sieving set, a fact which saves a factor of 2 in work. Even further conditions could be used to discard even more sieving-primes:

Then apply Proth's test to the remaining candidates.

One also can consider 2Q·4n+1 with Q odd [so that 2Q≡2(mod 4)]. For that we have the following table of record-good Q:

[    #of initial oddprimes P such that 2Q·4n is                 ]
[Q,  never -1 modulo that P (only records with Q≡1 mod 2 listed),   lastprime]
1, 0, 2
3, 1, 3
5, 2, 5 *
9, 3, 7
11, 4, 11 *
29, 9, 29 *
165, 10, 31
231, 12, 41
2079, 13, 43
6269, 15, 53 *
7575, 16, 59
16095, 17, 61
44261, 18, 67 *
93081, 21, 79
144251, 23, 89 *
426755, 26, 103 need to write *s here and below??
2187831, 27, 107
19612401, 28, 109
22585901, 30, 127
98468015, 32, 137
120540749, 36, 157
2354947469, 37, 163
4654589121, 38, 167
75184865175, 39, 173
90815671359, 40, 179
155082595019, 42, 191
2533151376795, 45, 199
33251230847501, 46, 211

Now Q≡2(mod 3) are the starred Q which allow using g=3 in Proth's test.

Improving the "quadratic sieve" integer-factorization algorithm

The "quadratic sieve" is an algorithm (or family of algorithms) for factoring large integers N into primes. It was invented by Carl Pomerance and is related to previous ideas by Brillhart, Morrison, and Dixon. It is remarkably simple and has a very fast "inner loop." At one time the quadratic sieve had the best available (conjectural and empirical) worst-case run-time as a function of the bit-length of N. However, it has now been surpassed by the "general number field sieve." Also, if we are interested in average rather than worst case runtime, the "elliptic curve method" seems superior to the quadratic sieve; thus a good hybrid approach is to run ECM to factor "easy" N quickly, and only resort to QS once it has become apparent that our N is a "hard" number. Because the quadratic sieve is much simpler than GNFS, it is preferred for smaller N. The GNFS only becomes superior to QS for N with more than about 100-130 decimal digits.

We are now going to explain a simple improvement to the quadratic sieve algorithm, which we might call the method of "optimized polynomials." It should speed it up by a large factor; in practice I suspect a speedup-factor of 10 would be plausible. This should push the crossover point with GNFS dramatically upward. In fact, my crude estimates suggest that the crossover for a 10×faster QS versus GNFS would be for N with about 50 more digits than the old QS-GNFS crossover.

The central idea, and purpose of the main loop, of the quadratic sieve is to find many integers x at which the quadratic polynomial f(x)=x2-BN assumes a "smooth" value. (We call a number "smooth" if all its prime factors are small. There also are "large prime" QS variants in which it is permitted to have one, or at most a few, prime factors which are not so small.) Here B is a positive integer constant small in comparison with N. Further, N is known to have no small prime factors itself (since we are not idiots and do some trial-division by all small primes beforehand).

Once enough such x's are found, it is almost always possible to multiplicatively combine a subset of the f(x)'s to find a number that is a square of a known smooth number Y, modulo N. (Just which subset, can be determined using Gaussian elimination, or equivalent linear algebra, over GF2 in ways that are described elsewhere.) And since every f(x) by construction is always a known square (in fact it is just x2) mod N, we thus will find a congruence X2≡Y2 (mod N) with known X and Y. At that point, GCD(X-Y, N) and GCD(X+Y, N) both yield factors of N, and usually nontrivial ones.

The usual procedure is to make a big array representing all x-values near enough to (BN)1/2. A "sieving" procedure can be used to estimate the sum of the logs of the primes dividing every f(x) in the array. The few x with large-enough log-sums are investigated more deeply, with the vast majority discarded. This process is extremely fast because the average amount of work per x is very small.

Our new idea is simply this: choose good B. Specifically, choose B≥1 so that BN is a nonzero square modulo each of the first K primes, where K is as large as possible. (The task of finding good B can be done by a preliminary sieving.) Previously, quadratic sievers simply used B=1, or small values such as B=3, paying no particular attention to their goodness, but only to their smallness. If you foolishly do that, then BN will be square for half the primes p, and just which half, will seem random. If however you choose B wisely then BN will be square for, say, every one of the first 50 primes (followed by half of the remaining primes in a pseudorandom manner as usual). The point is that f(x) cannot be divisible by p unless BN is square mod p; and then it is divisible a fraction 2/p of the time if BN is a nonzero square. By choosing good B, we allow our f(x) to be divisible by all of the first 50 primes rather than only about 25 of them. This should cause "smooth" numbers to become far more frequent versus just choosing random or small B. This will cause the inner loop of QS to produce more x's faster, and that increased rate of output will easily compensate for the effort expended in the preliminary search for good B.

Possible future computational work

A "twin" prime P is a prime such that P+2 also is prime. A "Sophie Germain" prime P is a prime such that 2P+1 also is prime. Note that in our First Construction, f1(n)=An-1, since it is indivisible by the first K primes, is "likely" to be prime; and then both f+2 and 2f+1 also are "likely" to be primes; so that we expect f1(n)'s output to be "rich in twin and Sophie Germain primes."

Our usual sort of methods would also allow us to devise parameters for f2(n) and f3(n) to make them rich in twin and/or Sophie Germain primes, but while achieving superior "data compression":

  1. To make f2(n) be rich in twin primes, choose D so that neither D nor D-4 are square modulo any of the first K odd primes. Then every output value X of f2(n) will have the property that both X and X+2 are indivisible by the first K primes. We expect bitlength 2K+o(K).
    To similarly make f3(n) be rich in twin primes, choose F so that neither F nor F-2 are twice Hth-powers modulo any of the first K odd primes. We expect bitlength O(K/lnlnK) with H chosen according to our usual heuristic.
  2. To make f2(n) be rich in Sophie Germain primes, choose D so that neither D nor D-C-1 are square modulo any of the first K odd primes. Then every output value X of f2(n) will have the property that both X and 2X+1 are indivisible by the first K primes. We expect bitlength 2K+o(K).
    To similarly make f3(n) be rich in Sophie Germain primes, choose F so that neither 2F nor 2F-1 are four times a perfect Hth-power modulo any of the first K odd primes. We expect bitlength O(K/lnlnK) with H chosen according to our usual heuristic.

Is this new? – And Acknowledgements and other remarks

I thank Jen Kruse Anderson for pointing out (essentially) that both negative as well as positive discriminants ought to be considered.

The ideas in this paper are very "elementary" (i.e. do not employ analytic number theory) hence would not a priori be expected to be new. And indeed, it appears my ideas about "prime-rich quadratics" were anticipated by Jacobson & Williams 2003. They actually considered a slightly different problem – maximizing the Hardy-Littlewood conjectural prime-density estimate for a quadratic, rather than (as I have done) simply requiring the quadratic to have values indivisible by the first K primes. Of course, these two problems are highly related, but they are not exactly the same. Jacobson & Williams used special purpose sieving hardware to help find their quadratics, whereas I used an ordinary PC. With special hardware, one can rapidly crank through integers N seeking ones which have certain allowed values modulo all of the first K primes and/or small prime powers (the sets of "allowed" remainders are input by the user of the device; the hardware does the checks for all moduli in parallel). Rather than employing an "AND gate" to detect N satisfying all conditions simultaneously, one could also consider using an "adder" to detect N satisfying a threshhold number of of the conditions (where the conditions could have different "weights" input by the user and the threshhold could be based on a weighted sum). But as far as I know their hardware did not provide thresholding capability.

Such devices were first built by D.H.Lehmer. We claim, however, that this kind of device is stupid (at least for the simple "AND gate" version). More precisely: general-purpose sequential PCs can achieve comparable sieving speeds. The tricks for doing so are as follows. First of all, we precompute all the acceptable N modulo the LCM of the first moduli, and examine only those N, not every N, using a precomputed increments table to skip over the useless ones. If only 1% of all N are acceptable modulo the first 7 moduli, this represents a speedup factor of 100. Second, if there are, say, 100 moduli, we only examine them one at a time, cutting short the examination as soon as the N under test fails. If failure rates are 50% for each modulus, that means the expected number of moduli we examine is only 2, not 100. That represents a speedup factor of 50. Third, we can use precomputed arrays for LCMs of other moduli to speed up and effectvely parallelize the more-commonly-tried acceptability tests (and index-incrementing to avoid the need for slow integer-division). The net result is speed, on a boring, garden variety non-parallel computer, comparable to the special purpose parallel-modulus hardware. (However, if there are a vast number of parallel copies of the special sieving chip, that would be another matter.) I thus sieved up to D=30059924764123 in about 10 minutes on an ordinary gigahertz personal computer, for a sieving rate of 5×1010 numbers per second, considerably exceeding sieving rates quoted for the Manitoba special purpose sieving chip. (Also, I believe if I'd tried harder I probably could have made my software twice as fast.)

Despite the heavy overlap with Jacobson & Williams about prime-rich quadratics it appears that my searches for prime-rich Proth-multipliers and prime-rich high-degree polynomials are new. As evidence for that newness, we point out that as of late 2008, our numbers 96514681, 186908593, 77883267, and 2187831 are nowhere present the "Online Encyclopedia of Integer Sequences" by N.J.A.Sloane with the collaboration of thousands of other authors. The same kind of sieving procedure, whether done by special purpose hardware or, as I did, using software (and the same circle of underlying ideas) is applicable for these problems also. The Jacobson-Williams refinement of maximizing the prime-density estimate rather than merely asking for indivisibility modulo the first K primes, would also be applicable to the Proth and high-degree polynomial problems – but neither I, nor anybody else, has done so as yet. This kind of problem, incidentally, would be suited for special sieving hardware if it provided "weighted threshhold" capability.

Finally, although I have consulted several published descriptions of the "quadratic sieve" factoring algorithm, authored both by its inventor and observer Pomerance as well as by others, none of them mentioned our idea of "choosing good B" via a preliminary sieving. This sieving again could be done via the same kind of software I used or special hardware, and it again rests on the same circle of ideas. (Also, if several relatively-prime B are found by the preliminary sieve, they all can be used in parallel on different machines, i.e. this also is a good way to speed up the highly-parallelizable "multiple polynomial quadratic sieve.") The only real difference is its objective – rather trying to get a lot of numbers indivisible by any of the first K primes, we now seek numbers often divisible by those primes. This opposite-polarity goal is, of course, exactly what one would naively expect when trying to factor numbers, as opposed to trying to construct primes.


Alina Carmen Cojocaru & M. Ram Murty: An introduction to sieve methods and their applications, London Mathematical Society Student Texts #66. Cambridge University Press 2005.

Richard E. Crandall & Carl Pomerance: Prime numbers, a computational perspective, Springer 2001; second edition 2005.

Ivan Damgard & G.S.Frandsen: Efficient algorithms for the gcd and cubic residuosity in the ring of Eisenstein integers, J. Symbolic Computation 39,6 (2005) 643-652.

H.Halberstam & H-E.Richert: Sieve Methods, New York: Academic Press 1974.

G.H.Hardy & E.M.Wright: An Introduction to the theory of numbers, Oxford University Press, London 1938, fifth edition 1979.

A.E.Ingham: The Distribution of Prime Numbers. Cambridge University Press, reprinted 1990.

K.Ireland & M.Rosen: A Classical Introduction to Modern Number Theory, 2nd ed. Springer 1990.

Derrick H. Lehmer: article on his "DLS-127" hardware in book "Computers in Mathematical Research" (ed. R.F.Churchhouse & J-C. Herz) North-Holland 1968.

Michael J. Jacobson Jr. & Hugh C. Williams: New quadratic polynomials with high densities of prime values, Math. Comput. 72,241 (2003) 499-519.

William Stein: Elementary number theory, book intended for publication by Springer-Verlag 2004.

P.Stevenhagen & H.W.Lenstra Jr: Chebotarev and his density theorem, Mathematical Intelligencer 18,2 (1996) 26-37.

Andre Weilert: Fast Computation of the Biquadratic Residue Symbol, J. Number Theory 96,1 (2002) 133-151.

Return to main page