Design #950: factor and SmoothFactor for integers --> FactorINT, FactorINT_TrialDiv, FactorINT_PollardRho - CoCoALib - CoCoA Project

Design #950

factor and SmoothFactor for integers --> FactorINT, FactorINT_TrialDiv, FactorINT_PollardRho

Added by John Abbott over 7 years ago. Updated over 1 year ago.

Status:

Closed

Priority:

Normal

Assignee:

John Abbott

Category:

Safety

Target version:

CoCoALib-0.99850

Start date:

20 Oct 2016

Due date:

% Done:

100%

Estimated time:

Spent time:

9.10 h

Description

I can make factor and SmoothFactor for integers faster in some cases by calling IsProbPrime.
The risk is that a factor returned as irreducible (prime) is not; this risk is very low.

Is this acceptable?
Should the user be allowed to stop these "risky" short-cuts?

Discuss.

2022-02 Now called FactorINT, FactorINT_TrialDiv and FactorINT_PollardRho

Related issues

History

#1 Updated by John Abbott over 7 years ago

Related to Feature #796: CoCoALib function for radical (or SqFree) of a polynomial added

#2 Updated by Anna Maria Bigatti over 7 years ago

ProbSmoothFactor?
or SmoothFactor(n) and SmoothFactor(n, "NoProb")?

#3 Updated by John Abbott over 7 years ago

Status changed from New to In Progress
% Done changed from 0 to 10

Does that mean that you are in favour of offering the user the choice between a fast (and probably correct) answer, or a potentially (very) slow but guaranteed correct answer?

What should the default choice be? Perhaps there should be a global flag (set in the GlobalManager?) which says what the default behaviour should be?

#4 Updated by John Abbott over 6 years ago

Target version changed from CoCoALib-0.99560 to CoCoALib-0.99600

#5 Updated by John Abbott over 6 years ago

Related to Design #1159: Add global enum "verify/DontVerify" added

#6 Updated by John Abbott over 6 years ago

Description updated (diff)

Maybe there could be a fn called IsPrime3 which is guaranteed (reasonably) fast, and returns a bool3?

#7 Updated by John Abbott over 6 years ago

Related to Slug #1170: SmoothFactor: slow when a factor is found added

#8 Updated by John Abbott about 6 years ago

Target version changed from CoCoALib-0.99600 to CoCoALib-0.99650 November 2019

#9 Updated by John Abbott over 5 years ago

Target version changed from CoCoALib-0.99650 November 2019 to CoCoALib-1.0

#10 Updated by John Abbott over 5 years ago

Here is an effect of the idea of testing the remaining factor for primality.

>>> N := NextPrime(2^60);
>>> SmoothFactor(N, 1100000000);
record[RemainingFactor := 1152921504606847009, factors := [], multiplicities := []]
--->  time about 0.01s

>>> Pmax := 1100000000;
>>> p1 := NextPrime(Pmax);
>>> p2 := NextPrime(p1);
>>> NN := p1*p2;
>>> SmoothFactor(NN, 1100000000);
record[RemainingFactor := 1210000028600000153, factors := [], multiplicities := []]
time is about 4.6s

There is a big difference in speed depending on whether the remaining factor is prime or not... This is a natural consequence, but is it desirable?

#11 Updated by John Abbott over 3 years ago

% Done changed from 10 to 20

I have just checked what the code currently does: if many primes have been tried without finding a factor and if the remaining is small (less than 2^64) then IsPrime is called, otherwise no primality test is applied.

One reason for this strategy is for instance trying SmoothFactor(1+factorial(10000), 10000).
Applying IsProbPrime to 1+factorial(100000) takes about 84s on my computer; simply trying all primes up to 10000 takes about 0.02s. So even 1 call to IsProbPrime is highly disadvantageous. See also #1170.

This is not really a main feature of CoCoALib; it would be better to delegate factorization to some external library (which?)

What to do?

#12 Updated by John Abbott over 2 years ago

Robbiano would like factor to work for integers (or some integers, at least).
How can we do this reasonably? Maybe it should simply call SmoothFactor with some predefined bound? Perhaps also call PollardRho?

#13 Updated by John Abbott over 2 years ago

Assignee set to John Abbott

I have now a prototype fn for finding one factor using PollardRhoSeq.
It takes as input N>1 to be factorized, and an upper bound on the number of iterations.
If no factor is found, the return value is 0 (which is obviously impossible).

An interesting test case is N := 4087 which requires PollardRho to try a=2 and a=3 (where the poly is x^2+a).
[I did not find any requiring a>3; but did not look that hard].
Also N=13861

An example with 3 factors: N=28214231

#14 Updated by John Abbott over 2 years ago

% Done changed from 20 to 30

Currently my prototype function is called factor_PollardRho.
It takes 2 args: N to be factorized (non-zero integer),
iters number of iterations [time is roughly linear in number of iters]

Should there be a variant with timeout? Maybe that makes sense only in CoCoALib,
and if there is a structure which can resume computing from where timeout occurred?

If we want an easy-to-use factorization function for integers, it should probably
do just a few iterations of SmoothFactor (upto 1000? Surely not more) and then
switch to PollardRho. I do not really want to invest time and effort into making
a more sophisticated integer factorizer.

#15 Updated by John Abbott over 2 years ago

Anna thinks a name closer to SmoothFactor would be more appropriate because the fns both take 2 args...
No final decision.

#16 Updated by Anna Maria Bigatti over 2 years ago

Drastic change: FactorBound, FactorIter or FactorINTBound, FactorINTIter or FactorINT_bound, FactorINT_iter?

#17 Updated by John Abbott over 2 years ago

Target version changed from CoCoALib-1.0 to CoCoALib-0.99850

Another proposal:

FactorINT(N,TimeOut) tries both TrialDiv and PollardRho; maybe also does IsProbPrime check?
FactorINT_TrialDiv(N,pmax) trial division upto pmax
FactorINT_PollardRho(N, iters) PollardRho method with upto iters iterations (gives 0 if no factor found)

Opinions?

#18 Updated by Anna Maria Bigatti over 2 years ago

John Abbott wrote:

FactorINT_TrialDiv(N,pmax) trial division upto pmax
Opinions?

maybe just FactorINT_div(N,pmax) or FactorINT_smooth(N,pmax) (to recall its previus name)?

#19 Updated by John Abbott over 2 years ago

I suggest making SmoothFactor obsolescent, and just forward the call to FactorINT_TrialDiv.

I think I prefer FactorINT_TrialDiv to FactorINT_smooth because the first name is clearer to everyone,
also FactorINT_PollardRho indicates the method used so it would be more coherent to use the name
FactorINT_TrialDiv since that also identifies the method used.

I do wonder whether there would be any sense in letting the user specify a TimeOut (instead of some other
upper limit). Or perhaps both sorts of limit at the same time?

#20 Updated by John Abbott over 2 years ago

% Done changed from 30 to 60

I have now implemented FactorINT, FactorINT_TrialDiv and FactorINT_PollardRho.
SmoothFactor is now obsolete.
Doc has been updated. Tests and examples have been updated (but no new tests added).

#21 Updated by Anna Maria Bigatti over 2 years ago

Subject changed from factor and SmoothFactor for integers to factor and SmoothFactor for integers --> FactorINT, FactorINT_TrialDiv, FactorINT_PollardRho
Description updated (diff)

#22 Updated by John Abbott over 2 years ago

Oh joy!

Have fun if you decide to document functions whose name contain an underscore: LaTeX gives its usual cryptic, unhelpful error messages (sigh!)

I have reworded the doc to avoid the underscores.

Does anyone object to having fn names containing underscores?
If so, what do you propose as an alternative?

#23 Updated by John Abbott over 1 year ago

Status changed from In Progress to Closed
% Done changed from 60 to 100

The current impl is good enough for most purposes.
Some factors may be reducible if PollardRho is used.
Too bad.
Closing.

#24 Updated by John Abbott over 1 year ago

Related to Design #1716: Qn: factor for BigInt added

Also available in: Atom PDF

Project

General

Profile

CoCoA » CoCoALib

Issues

Custom queries