The class SmallFpImpl is a very low-level implementation class for fast
arithmetic in a small prime finite field. It is not intended for use
by casual CoCoALib users, who should instead see the documentation in
QuotientRing (in particular the function NewZmod), or possibly the
documentation in RingFp, RingFpLog, and RingFpDouble.
The class SmallFpImpl offers the possibility of highly efficient
arithmetic in small prime finite fields. This efficiency comes at a
cost: the interface is rather unnatural and intolerant of mistakes. The
emphasis is unequivocally on speed rather than safety or convenience.
The full speed of SmallFpImpl depends on many of its functions being
inlined.
A SmallFpImpl object cannot be used as a CoCoA ring, even though the
implementation is rather reminiscent of a ring implementation class.
All operations on values must be effected by calling member functions
of the SmallFpImpl class. Here is a brief summary.
  SmallFpImpl ModP(p, convention);  // create SmallFpImpl object
  int n;
  BigInt N;
  SmallFpImpl::value_t a, b, c;

  ModP.myModulus();            // value of p (as a long)
  ModP.myReduceMod(n);         // reduce mod p
  ModP.myAssign(a, b);         // a = b;
  ModP.myAssign(a, n);         // a = n%p; (reduce mod p)
  ModP.myAssign(a, N);         // a = N%p; (reduce mod p)
  ModP.myNegate(a, b);         // a = -b;
  ModP.myAdd(a, b, c);         // a = (b+c)%p;
  ModP.mySub(a, b, c);         // a = (b-c)%p;
  ModP.myMul(a, b, c);         // a = (b*c)%p;
  ModP.myDiv(a, b, c);         // a = (b*inv(c))%p;
                               //     where inv(c) is inverse of c
  ModP.myPower(a, b, c);       // a = (b^c)%p;
                               //     where ^ means "to the power of"
  ModP.myIsZeroAddMul(a,b,c);  // a = (a+b*c)%p; result is (a==0)
  ModP.myIsZero(a);            // a == 0
  ModP.myIsOne(a);             // a == 1
  ModP.myIsMinusOne(a);        // a == -1
  ModP.myExport(a);            // returns a preimage (of type long) according to symm/non-neg convention.
  ModP.myIsEqual(a, b);        // a == b
For myExport the choice between least non-negative and symmetric
residues is determined by the convention specified when constructing
the SmallFpImpl object. This convention may be either
GlobalSettings::SymmResidues or
GlobalSettings::NonNegResidues.
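To illustrate the difference between the two conventions, here is a
minimal self-contained sketch; it is not CoCoALib code. The function
name ExportResidue and its bool flag are hypothetical, and the choice
made for the boundary case (p = 2 exports 1 rather than -1) is an
assumption which the library may not share.

```cpp
#include <cassert>

// Hypothetical helper, not part of CoCoALib: maps a least non-negative
// residue a (with 0 <= a < p) to the preimage chosen by the convention.
long ExportResidue(unsigned long a, unsigned long p, bool symmetric)
{
  assert(p >= 2 && a < p);
  if (symmetric && a > p/2)
    return static_cast<long>(a) - static_cast<long>(p);  // negative representative
  return static_cast<long>(a);                           // least non-negative residue
}
```

With p = 7 the residue 5 would be exported as 5 under the non-negative
convention and as -2 under the symmetric one.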
The following example, which computes an inner product with delayed
reduction, is still preliminary.
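  SmallFpImpl::value_t InnerProd;
  ModP.myAssign(InnerProd, 0);
  size_t k=0;                      // number of products added since last reset
  for (size_t i=0; i < n; ++i)
  {
    InnerProd += v1[i]*v2[i];      // non-reduced multiply and add
    if (++k < ModP.myIterLimit) continue;
    if (InnerProd > ModP.myDrop)   // value has grown large: bring it back down
    {
      InnerProd -= ModP.myDrop;
      k = 0;
    }
  }
  InnerProd = ModP.myReduceMod(InnerProd);  // single final reduction mod p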
Most functions are implemented inline, and no sanity checks are
performed (except when CoCoA_DEBUG is enabled). The constructor
does do some checking.
SmallFpImpl::value_t should be an unsigned integral type; it is a
typedef to a type specified in CoCoA/config.H -- this should allow
fairly easy platform-specific customization.
This code is valid only if the square of myModulus can be represented
in a SmallFpImpl::value_t -- the constructor checks this condition.
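A check of this kind can be written without ever forming the square,
so that the test itself cannot overflow. The sketch below is only an
illustration: the name SquareFits is hypothetical, and this is not
claimed to be the test the constructor actually performs.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical check, not the CoCoALib constructor: returns true
// if p*p can be represented in the unsigned type T.
template <typename T>
bool SquareFits(T p)
{
  static_assert(std::numeric_limits<T>::is_integer && !std::numeric_limits<T>::is_signed,
                "T must be an unsigned integral type");
  if (p == 0) return true;
  // p*p <= max  is equivalent to  p <= max/p  (integer division)
  return p <= std::numeric_limits<T>::max() / p;
}
```

For a 32-bit value type this accepts 65535 and rejects 65536.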
Most functions do not require myModulus to be prime, though division
becomes only a partial map if it is composite; and the function
myIsDivisible is correct only if myModulus is prime. Currently the
constructor rejects non-prime moduli.
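The reason division is only a partial map for a composite modulus is
that an element c has an inverse modulo m only when gcd(c, m) = 1. The
following self-contained sketch shows this using the extended Euclidean
algorithm; the name InverseModOrZero and the convention of returning 0
for "no inverse" are hypothetical, and this is not claimed to be how
myDiv is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of CoCoALib: returns the inverse of c
// modulo m, or 0 if c is not invertible (0 is never a valid inverse).
// Assumes 2 <= m < 2^31 so that the signed arithmetic cannot overflow.
std::uint64_t InverseModOrZero(std::uint64_t c, std::uint64_t m)
{
  assert(m >= 2 && m < (std::uint64_t(1) << 31));
  std::int64_t r0 = static_cast<std::int64_t>(m);
  std::int64_t r1 = static_cast<std::int64_t>(c % m);
  std::int64_t t0 = 0, t1 = 1;       // invariant: t_i * c == r_i (mod m)
  while (r1 != 0)
  {
    const std::int64_t q = r0 / r1;
    const std::int64_t r2 = r0 - q*r1;  r0 = r1;  r1 = r2;
    const std::int64_t t2 = t0 - q*t1;  t0 = t1;  t1 = t2;
  }
  if (r0 != 1) return 0;             // gcd(c, m) != 1: no inverse exists
  if (t0 < 0) t0 += static_cast<std::int64_t>(m);
  return static_cast<std::uint64_t>(t0);
}
```

Modulo 7 every non-zero value has an inverse, whereas modulo 6 the
values 2, 3 and 4 have none.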
The code assumes that each value modulo p is represented as the least
non-negative residue (i.e. the values are represented as integers in
the range 0 to p-1 inclusive). This decision is linked to the fact
that SmallFpImpl::value_t is an unsigned type.
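With an unsigned type and least non-negative residues, subtraction and
negation must avoid going below zero. The sketch below shows one way to
do this; the function names and the alias value_t are hypothetical
stand-ins, and the actual member functions may be written differently.

```cpp
#include <cassert>
#include <cstdint>

using value_t = std::uint32_t;  // hypothetical stand-in for SmallFpImpl::value_t

// All arguments are assumed to be reduced, i.e. in the range 0 to p-1.
value_t AddMod(value_t b, value_t c, value_t p)
{
  assert(b < p && c < p);
  const value_t gap = p - c;             // in the range 1 to p
  return (b >= gap) ? b - gap : b + c;   // never overflows, never wraps below 0
}

value_t SubMod(value_t b, value_t c, value_t p)
{
  assert(b < p && c < p);
  return (b >= c) ? b - c : b + (p - c);
}

value_t NegMod(value_t b, value_t p)
{
  assert(b < p);
  return (b == 0) ? 0 : p - b;
}
```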
Note that myMul and myIsZeroAddMul have "fancy" implementations:
the normal remaindering operation is rather slow on many processors,
and the code given here is usefully faster on Athlons and Mac G5.
The benefit arises from the fact that a "reciprocal" of the modulus
can be precomputed inside the constructor.
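To give an idea of how a precomputed reciprocal can replace the
remaindering operation, here is a sketch of the standard Barrett-style
reduction. It is only an illustration of the general technique: the
class BarrettMod and its member names are hypothetical, and it is not
the code used inside SmallFpImpl (which, for instance, uses different
shifts and checks).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration, NOT the SmallFpImpl implementation.
class BarrettMod
{
public:
  explicit BarrettMod(std::uint32_t p): myP(p), myRecip(0)
  {
    assert(p >= 2 && p <= 65536);            // so that (p-1)^2 fits in 32 bits
    myRecip = (std::uint64_t(1) << 32) / p;  // "reciprocal": floor(2^32/p)
  }

  // Product of two reduced values (each less than p), reduced mod p.
  std::uint32_t mul(std::uint32_t a, std::uint32_t b) const
  {
    assert(a < myP && b < myP);
    const std::uint32_t x = a*b;             // no overflow since a,b < p <= 65536
    // q is either floor(x/p) or one less, so x - q*p lies in 0 to 2p-1
    const std::uint64_t q = (std::uint64_t(x) * myRecip) >> 32;
    std::uint32_t r = x - static_cast<std::uint32_t>(q) * myP;
    if (r >= myP) r -= myP;                  // single correction step
    return r;
  }

private:
  std::uint32_t myP;      // the modulus
  std::uint64_t myRecip;  // precomputed once in the constructor
};
```

The division happens once, in the constructor; each multiplication then
costs two multiplications, a shift and one comparison.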
The constants myDrop and myIterLimit are to allow efficient
exploitation of non-reduced multiplication (e.g. when trying to
compute an inner product modulo p). See example in user doc.
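The idea behind such constants can be illustrated by a self-contained
sketch. Everything in it is an assumption made for the illustration:
the struct DelayedReduction is hypothetical, and the way drop and
iterLimit are computed here is just one choice for which it is easy to
see that the accumulator cannot overflow; the values used by
SmallFpImpl may be defined differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in, NOT the CoCoALib class.
struct DelayedReduction
{
  using value_t = std::uint32_t;
  value_t p;              // modulus, assumed to satisfy 2 <= p <= 32768
  value_t drop;           // largest multiple of p not exceeding max/2
  std::size_t iterLimit;  // number of products added between checks

  explicit DelayedReduction(value_t modulus): p(modulus), drop(0), iterLimit(0)
  {
    assert(p >= 2 && p <= 32768);
    const value_t max = std::numeric_limits<value_t>::max();
    const value_t MaxProd = (p-1)*(p-1);  // largest possible product
    drop = ((max/2)/p)*p;
    iterLimit = drop/MaxProd;             // hence iterLimit*MaxProd <= drop <= max/2
  }

  // Inner product mod p of two vectors of reduced values (each less than p).
  value_t InnerProd(const std::vector<value_t>& v1,
                    const std::vector<value_t>& v2) const
  {
    assert(v1.size() == v2.size());
    value_t acc = 0;                      // invariant: acc <= drop whenever k == 0
    std::size_t k = 0;
    for (std::size_t i = 0; i < v1.size(); ++i)
    {
      acc += v1[i]*v2[i];                 // non-reduced multiply and add
      if (++k < iterLimit) continue;
      if (acc > drop) acc -= drop;        // drop is a multiple of p: residue unchanged
      k = 0;
    }
    return acc % p;                       // single final reduction
  }
};
```

In this sketch only one remaindering operation is performed per inner
product, regardless of the length of the vectors.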
The return type of NumBits is unsigned short, which should offer
a large enough range for the foreseeable future. Would size_t be
better? If so, then the data members myMulShift1 and myMulShift2
should be made into size_t too.
How to distinguish between "homogeneous" myAssign, and myAssign from
a long? The latter must reduce its argument, the former must not!
Need functions for (fast) non-reducing addition and multiplication;
and myDrop and myIterLimit need to be publicly accessible.
Also need a good example to show how to use them.
Why don't myMulShift1 and myMulShift2 have sensible names???
(as used in SmallFpLogImpl, for instance -- have a look).
Can multiplication be made any faster? It is a nuisance to have to
check twice whether the value is less than myModulus. To be honest
I do not yet have a proof that the "nifty reduction modulo p" is
always correct (it is OK for p < 32768, checked by exhaustive search).
No example programs in examples/ directory.