SmallFpLogImpl

User documentation for SmallFpLogImpl
Maintainer documentation for SmallFpLogImpl
Bugs, Shortcomings and other ideas

User documentation for SmallFpLogImpl

The class SmallFpLogImpl is a very low level implementation class for fast arithmetic in a small, prime finite field. It is not intended for use by casual CoCoALib users, who should instead see the documentation in QuotientRing (in particular the function NewZZmod), or possibly the documentation in RingFp, RingFpLog, and RingFpDouble.

Compared to SmallFpImpl the only difference is an implementation detail: multiplication and division are achieved using discrete log tables -- this may be fractionally faster on some processors.

Note that the cost of construction of a SmallFpLogImpl(p) object for larger primes may be quite considerable (linear in p), and the resulting object may occupy quite a lot of space (e.g. probably about 6*p bytes).

All operations on values must be effected by calling member functions of the SmallFpLogImpl class. Here is a brief summary.

    SmallFpLogImpl::IsGoodCtorArg(p);   // true iff ctor SmallFpLogImpl(p) will succeed
    SmallFpLogImpl::ourMaxModulus();    // largest permitted modulus
    SmallFpLogImpl ModP(p, convention); // create SmallFpLogImpl object
    long n;
    BigInt N;
    BigRat q;
    SmallFpImpl::value_t a, b, c;
  
    ModP.myModulus();         // value of p (as a long)
  
    ModP.myReduce(n);         // reduce mod p
    ModP.myReduce(N);         // reduce mod p
    ModP.myReduce(q);         // reduce mod p
  
    ModP.myExport(a);         // returns a preimage (of type long) according to symm/non-neg convention.
  
    ModP.myNegate(a);         // -a mod p
    ModP.myAdd(a, b);         // (a+b)%p;
    ModP.mySub(a, b);         // (a-b)%p;
    ModP.myMul(a, b);         // (a*b)%p;
    ModP.myDiv(a, b);         // (a*inv(b))%p;  where inv(b) is inverse of b
    ModP.myPower(a, n);       // (a^n)%p;  where ^ means "to the power of"
    ModP.myIsZeroAddMul(a,b,c) // a = (a+b*c)%p; result is (a==0)

For myExport the choice between least non-negative and symmetric residues is determined by the convention specified when constructing the SmallFpLogImpl object. This convention may be either GlobalSettings::SymmResidues or GlobalSettings::NonNegResidues.

Maintainer documentation for SmallFpLogImpl

The only clever bit is the economical construction of the log/exp tables in the constructor where we exploit the fact that myRoot to the power (p-1)/2 must be equal to -1.

This implementation uses discrete log/exp tables to effect multiplication and division quickly. Note that the residues themselves (i.e. the values of the ring elements) are held as machine integers whose value is the least non-negative representative of the residue class (i.e. in the range 0 to p-1). In particular, although log tables are used, we do NOT use a logarithmic representation for the field elements.

The log/exp tables are stored in C++ vectors: aside from their construction during the RingFpLogImpl constructor, these vectors are never modified, and are used only for table look-up. The C++ vectors are resized in the body of the constructor to avoid large memory requests when overly large characteristics are supplied as argument.

Besides these tables SmallFpLogImpl also remembers the characteristic in myModulus; myRoot is the primitive root used to generate the log/exp tables.

The members myResidueUPBValue and myIterLimit and myHalfNormalize may be used for delayed normalization in loops: see the inner product example in SmallFpImpl.

As the code currently stands, the modulus must also be small enough that it can fit into an FpTableElem (an unsigned short), and that its square can fit into a value_t. Using shorts in the tables gave slightly better run-time performance in our tests. Furthermore, to permit use of unnormalized products in some algorithms, twice the square of the characteristic must fit into a value_t (i.e. myIterLimit must be greater than zero). The constructor for a RingFpLogImpl checks the size restrictions on the characteristic.

Note that the log table has a slot with index 0 which is never written to nor read from. The exp table is double size so that multiplication can be achieved more easily: the highest slot which could ever be used is that with index 2p-3 (in division), but the constructor fills two extra slots (as this makes the code simpler/neater).

The only slick part of the implementation is the filling of the tables in the constructor, where some effort is made to avoid doing more reductions modulo p than necessary. Note that the primitive root is always calculated (potentially costly!); there is no memorized global table of primitive roots anywhere.

Bugs, Shortcomings and other ideas

It is not as fast as I hoped -- perhaps cache effects?