Crypto++  8.0
Free C++ class library of cryptographic schemes
integer.cpp
1 // integer.cpp - originally written and placed in the public domain by Wei Dai
2 // contains public domain code contributed by Alister Lee and Leonard Janke
3 
4 // Notes by JW: The Integer class needs to do two things. First, it needs
5 // to set function pointers on some platforms, like X86 and X64. The
6 // function pointers select a fast multiply and addition based on the cpu.
7 // Second, it wants to create Integer::Zero(), Integer::One() and
8 // Integer::Two().
9 // The function pointers are initialized in the InitializeInteger class by
10 // calling SetFunctionPointers(). The call to SetFunctionPointers() is
11 // guarded to run once using a double-checked pattern. We don't use C++
12 // std::call_once due to bad interactions between libstdc++, glibc and
13 // pthreads. The bad interactions were causing crashes for us on platforms
14 // like Sparc and PowerPC. Since we are only setting function pointers we
15 // don't have to worry about leaking memory. The worst case seems to be the
16 // pointers gets written twice until the init flag is set and visible to
17 // all threads.
18 // For Integer::Zero(), Integer::One() and Integer::Two(), we use one of three
19 // strategies. First, if initialization priorities are available then we use
20 // them. Initialization priorities are init_priority() on Linux and init_seg()
21 // on Windows. AIX, OS X and several other platforms lack them. Initialization
22 // priorities are platform specific but they are also the most trouble free
23 // with determisitic destruction.
24 // Second, if C++11 dynamic initialization is available, then we use it. After
25 // the std::call_once fiasco we dropped the priority dynamic initialization
26 // to avoid unknown troubles platforms that are tested less frequently. In
27 // addition Microsoft platforms mostly do not provide dynamic initialization.
28 // The MSDN docs claim they do but they don't in practice because we need
29 // Visual Studio 2017 and Windows 10 or above.
30 // Third, we fall back to Wei's original code of a Singleton. Wei's original
31 // code was much simpler. It simply used the Singleton pattern, but it always
32 // produced memory findings on some platforms. The Singleton generates memory
33 // findings because it uses a Create On First Use pattern (a dumb Nifty
34 // Counter) and the compiler had to be smart enough to fold them to return
35 // the same object. Unix and Linux compilers do a good job of folding objects,
36 // but Microsoft compilers do a rather poor job for some versions of the
37 // compiler.
38 // Another problem with the Singleton is resource destruction requires running
39 // resource acquisition in reverse. For resources provided through the
40 // Singletons, there is no way to express the dependency order to safely
41 // destroy resources. (That's one of the problems C++11 dynamic
42 // intitialization with concurrent execution is supposed to solve).
43 // The final problem with Singletons is resource/memory exhaustion in languages
44 // like Java and .Net. Java and .Net load and unload a native DLL hundreds or
45 // thousands of times during the life of a program. Each load produces a
46 // memory leak and they can add up quickly. If they library is being used in
47 // Java or .Net then Singleton must be avoided at all costs.
48 
49 #include "pch.h"
50 #include "config.h"
51 
52 #ifndef CRYPTOPP_IMPORTS
53 
54 #include "integer.h"
55 #include "secblock.h"
56 #include "modarith.h"
57 #include "nbtheory.h"
58 #include "smartptr.h"
59 #include "algparam.h"
60 #include "filters.h"
61 #include "stdcpp.h"
62 #include "asn.h"
63 #include "oids.h"
64 #include "words.h"
65 #include "pubkey.h" // for P1363_KDF2
66 #include "sha.h"
67 #include "cpu.h"
68 #include "misc.h"
69 
70 #include <iostream>
71 
72 #if (_MSC_VER >= 1400) && !defined(_M_ARM)
73  #include <intrin.h>
74 #endif
75 
76 #ifdef __DECCXX
77  #include <c_asm.h>
78 #endif
79 
80 // "Error: The operand ___LKDB cannot be assigned to", http://github.com/weidai11/cryptopp/issues/188
81 #if (__SUNPRO_CC >= 0x5130)
82 # define MAYBE_CONST
83 # define MAYBE_UNCONST_CAST(x) const_cast<word*>(x)
84 #else
85 # define MAYBE_CONST const
86 # define MAYBE_UNCONST_CAST(x) x
87 #endif
88 
89 // "Inline assembly operands don't work with .intel_syntax",
90 // http://llvm.org/bugs/show_bug.cgi?id=24232
91 #if CRYPTOPP_BOOL_X32 || defined(CRYPTOPP_DISABLE_MIXED_ASM)
92 # undef CRYPTOPP_X86_ASM_AVAILABLE
93 # undef CRYPTOPP_X32_ASM_AVAILABLE
94 # undef CRYPTOPP_X64_ASM_AVAILABLE
95 # undef CRYPTOPP_SSE2_ASM_AVAILABLE
96 # undef CRYPTOPP_SSSE3_ASM_AVAILABLE
97 #else
98 # define CRYPTOPP_INTEGER_SSE2 (CRYPTOPP_SSE2_ASM_AVAILABLE && (CRYPTOPP_BOOL_X86))
99 #endif
100 
101 // ***************** C++ Static Initialization ********************
102 
103 NAMESPACE_BEGIN(CryptoPP)
104 
105 // Function body near the middle of the file
106 static void SetFunctionPointers();
107 
108 // Use a double-checked pattern. We are not leaking anything so it
109 // does not matter if a pointer is written twice during a race.
110 // Avoid std::call_once due to too many problems on platforms like
111 // Solaris and Sparc. Also see
112 // http://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146 and
113 // http://github.com/weidai11/cryptopp/issues/707.
114 InitializeInteger::InitializeInteger()
115 {
116  static bool s_flag;
117  MEMORY_BARRIER();
118  if (s_flag == false)
119  {
120  SetFunctionPointers();
121  s_flag = true;
122  MEMORY_BARRIER();
123  }
124 }
125 
126 template <long i>
128 {
129  Integer * operator()() const
130  {
131  return new Integer(i);
132  }
133 };
134 
135 // ***************** Library code ********************
136 
137 inline static int Compare(const word *A, const word *B, size_t N)
138 {
139  while (N--)
140  if (A[N] > B[N])
141  return 1;
142  else if (A[N] < B[N])
143  return -1;
144 
145  return 0;
146 }
147 
148 inline static int Increment(word *A, size_t N, word B=1)
149 {
150  CRYPTOPP_ASSERT(N);
151  word t = A[0];
152  A[0] = t+B;
153  if (A[0] >= t)
154  return 0;
155  for (unsigned i=1; i<N; i++)
156  if (++A[i])
157  return 0;
158  return 1;
159 }
160 
161 inline static int Decrement(word *A, size_t N, word B=1)
162 {
163  CRYPTOPP_ASSERT(N);
164  word t = A[0];
165  A[0] = t-B;
166  if (A[0] <= t)
167  return 0;
168  for (unsigned i=1; i<N; i++)
169  if (A[i]--)
170  return 0;
171  return 1;
172 }
173 
174 static void TwosComplement(word *A, size_t N)
175 {
176  Decrement(A, N);
177  for (unsigned i=0; i<N; i++)
178  A[i] = ~A[i];
179 }
180 
181 static word AtomicInverseModPower2(word A)
182 {
183  CRYPTOPP_ASSERT(A%2==1);
184 
185  word R=A%8;
186 
187  for (unsigned i=3; i<WORD_BITS; i*=2)
188  R = R*(2-R*A);
189 
190  CRYPTOPP_ASSERT(R*A==1);
191  return R;
192 }
193 
194 // ********************************************************
195 
196 #if !defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE) || (defined(__x86_64__) && defined(CRYPTOPP_WORD128_AVAILABLE))
197  #define TWO_64_BIT_WORDS 1
198  #define Declare2Words(x) word x##0, x##1;
199  #define AssignWord(a, b) a##0 = b; a##1 = 0;
200  #define Add2WordsBy1(a, b, c) a##0 = b##0 + c; a##1 = b##1 + (a##0 < c);
201  #define LowWord(a) a##0
202  #define HighWord(a) a##1
203  #ifdef _MSC_VER
204  #define MultiplyWordsLoHi(p0, p1, a, b) p0 = _umul128(a, b, &p1);
205  #ifndef __INTEL_COMPILER
206  #define Double3Words(c, d) d##1 = __shiftleft128(d##0, d##1, 1); d##0 = __shiftleft128(c, d##0, 1); c *= 2;
207  #endif
208  #elif defined(__DECCXX)
209  #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; p1 = asm("umulh %a0, %a1, %v0", a, b);
210  #elif defined(__x86_64__)
211  #if defined(__SUNPRO_CC) && __SUNPRO_CC < 0x5100
212  // Sun Studio's gcc-style inline assembly is heavily bugged as of version 5.9 Patch 124864-09 2008/12/16, but this one works
213  #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "r"(b) : "cc");
214  #else
215  #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
216  #define MulAcc(c, d, a, b) asm ("mulq %6; addq %3, %0; adcq %4, %1; adcq $0, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1), "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
217  #define Double3Words(c, d) asm ("addq %0, %0; adcq %1, %1; adcq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
218  #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcq $0, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b) : "cc");
219  #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
220  #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcq %6, %1; adcq $0, %2;" : "+r"(c), "=r"(e##0), "=r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1) : "cc");
221  #endif
222  #endif
223  #define MultiplyWords(p, a, b) MultiplyWordsLoHi(p##0, p##1, a, b)
224  #ifndef Double3Words
225  #define Double3Words(c, d) d##1 = 2*d##1 + (d##0>>(WORD_BITS-1)); d##0 = 2*d##0 + (c>>(WORD_BITS-1)); c *= 2;
226  #endif
227  #ifndef Acc2WordsBy2
228  #define Acc2WordsBy2(a, b) a##0 += b##0; a##1 += a##0 < b##0; a##1 += b##1;
229  #endif
230  #define AddWithCarry(u, a, b) {word t = a+b; u##0 = t + u##1; u##1 = (t<a) + (u##0<t);}
231  #define SubtractWithBorrow(u, a, b) {word t = a-b; u##0 = t - u##1; u##1 = (t>a) + (u##0>t);}
232  #define GetCarry(u) u##1
233  #define GetBorrow(u) u##1
234 #else
235  #define Declare2Words(x) dword x;
236  #if _MSC_VER >= 1400 && !defined(__INTEL_COMPILER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_IA64))
237  #define MultiplyWords(p, a, b) p = __emulu(a, b);
238  #else
239  #define MultiplyWords(p, a, b) p = (dword)a*b;
240  #endif
241  #define AssignWord(a, b) a = b;
242  #define Add2WordsBy1(a, b, c) a = b + c;
243  #define Acc2WordsBy2(a, b) a += b;
244  #define LowWord(a) word(a)
245  #define HighWord(a) word(a>>WORD_BITS)
246  #define Double3Words(c, d) d = 2*d + (c>>(WORD_BITS-1)); c *= 2;
247  #define AddWithCarry(u, a, b) u = dword(a) + b + GetCarry(u);
248  #define SubtractWithBorrow(u, a, b) u = dword(a) - b - GetBorrow(u);
249  #define GetCarry(u) HighWord(u)
250  #define GetBorrow(u) word(u>>(WORD_BITS*2-1))
251 #endif
252 #ifndef MulAcc
253  #define MulAcc(c, d, a, b) MultiplyWords(p, a, b); Acc2WordsBy1(p, c); c = LowWord(p); Acc2WordsBy1(d, HighWord(p));
254 #endif
255 #ifndef Acc2WordsBy1
256  #define Acc2WordsBy1(a, b) Add2WordsBy1(a, a, b)
257 #endif
258 #ifndef Acc3WordsBy2
259  #define Acc3WordsBy2(c, d, e) Acc2WordsBy1(e, c); c = LowWord(e); Add2WordsBy1(e, d, HighWord(e));
260 #endif
261 
262 class DWord
263 {
264 public:
265 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
266  DWord() {std::memset(&m_whole, 0x00, sizeof(m_whole));}
267 #else
268  DWord() {std::memset(&m_halfs, 0x00, sizeof(m_halfs));}
269 #endif
270 
271 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
272  explicit DWord(word low) : m_whole(low) { }
273 #else
274  explicit DWord(word low)
275  {
276  m_halfs.high = 0;
277  m_halfs.low = low;
278  }
279 #endif
280 
281 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
282  DWord(word low, word high) : m_whole()
283 #else
284  DWord(word low, word high) : m_halfs()
285 #endif
286  {
287 #if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
288 # if (CRYPTOPP_LITTLE_ENDIAN)
289  const word t[2] = {low,high};
290  memcpy(&m_whole, t, sizeof(m_whole));
291 # else
292  const word t[2] = {high,low};
293  memcpy(&m_whole, t, sizeof(m_whole));
294 # endif
295 #else
296  m_halfs.low = low;
297  m_halfs.high = high;
298 #endif
299  }
300 
301  static DWord Multiply(word a, word b)
302  {
303  DWord r;
304  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
305  r.m_whole = (dword)a * b;
306  #elif defined(MultiplyWordsLoHi)
307  MultiplyWordsLoHi(r.m_halfs.low, r.m_halfs.high, a, b);
308  #else
309  CRYPTOPP_ASSERT(0);
310  #endif
311  return r;
312  }
313 
314  static DWord MultiplyAndAdd(word a, word b, word c)
315  {
316  DWord r = Multiply(a, b);
317  return r += c;
318  }
319 
320  DWord & operator+=(word a)
321  {
322  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
323  m_whole = m_whole + a;
324  #else
325  m_halfs.low += a;
326  m_halfs.high += (m_halfs.low < a);
327  #endif
328  return *this;
329  }
330 
331  DWord operator+(word a)
332  {
333  DWord r;
334  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
335  r.m_whole = m_whole + a;
336  #else
337  r.m_halfs.low = m_halfs.low + a;
338  r.m_halfs.high = m_halfs.high + (r.m_halfs.low < a);
339  #endif
340  return r;
341  }
342 
343  DWord operator-(DWord a)
344  {
345  DWord r;
346  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
347  r.m_whole = m_whole - a.m_whole;
348  #else
349  r.m_halfs.low = m_halfs.low - a.m_halfs.low;
350  r.m_halfs.high = m_halfs.high - a.m_halfs.high - (r.m_halfs.low > m_halfs.low);
351  #endif
352  return r;
353  }
354 
355  DWord operator-(word a)
356  {
357  DWord r;
358  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
359  r.m_whole = m_whole - a;
360  #else
361  r.m_halfs.low = m_halfs.low - a;
362  r.m_halfs.high = m_halfs.high - (r.m_halfs.low > m_halfs.low);
363  #endif
364  return r;
365  }
366 
367  // returns quotient, which must fit in a word
368  word operator/(word divisor);
369 
370  word operator%(word a);
371 
372  bool operator!() const
373  {
374  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
375  return !m_whole;
376  #else
377  return !m_halfs.high && !m_halfs.low;
378  #endif
379  }
380 
381  // TODO: When NATIVE_DWORD is in effect, we access high and low, which are inactive
382  // union members, and that's UB. Also see http://stackoverflow.com/q/11373203.
383  word GetLowHalf() const {return m_halfs.low;}
384  word GetHighHalf() const {return m_halfs.high;}
385  word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
386 
387 private:
388  // Issue 274, "Types cannot be declared in anonymous union",
389  // http://github.com/weidai11/cryptopp/issues/274
390  // Thanks to Martin Bonner at http://stackoverflow.com/a/39507183
391  struct half_words
392  {
393  #if (CRYPTOPP_LITTLE_ENDIAN)
394  word low;
395  word high;
396  #else
397  word high;
398  word low;
399  #endif
400  };
401  union
402  {
403  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
404  dword m_whole;
405  #endif
406  half_words m_halfs;
407  };
408 };
409 
410 class Word
411 {
412 public:
413  Word() : m_whole(0) {}
414  Word(word value) : m_whole(value) {}
415  Word(hword low, hword high) : m_whole(low | (word(high) << (WORD_BITS/2))) {}
416 
417  static Word Multiply(hword a, hword b)
418  {
419  Word r;
420  r.m_whole = (word)a * b;
421  return r;
422  }
423 
424  Word operator-(Word a)
425  {
426  Word r;
427  r.m_whole = m_whole - a.m_whole;
428  return r;
429  }
430 
431  Word operator-(hword a)
432  {
433  Word r;
434  r.m_whole = m_whole - a;
435  return r;
436  }
437 
438  // returns quotient, which must fit in a word
439  hword operator/(hword divisor)
440  {
441  return hword(m_whole / divisor);
442  }
443 
444  bool operator!() const
445  {
446  return !m_whole;
447  }
448 
449  word GetWhole() const {return m_whole;}
450  hword GetLowHalf() const {return hword(m_whole);}
451  hword GetHighHalf() const {return hword(m_whole>>(WORD_BITS/2));}
452  hword GetHighHalfAsBorrow() const {return 0-hword(m_whole>>(WORD_BITS/2));}
453 
454 private:
455  word m_whole;
456 };
457 
458 // do a 3 word by 2 word divide, returns quotient and leaves remainder in A
459 template <class S, class D>
460 S DivideThreeWordsByTwo(S *A, S B0, S B1, D *dummy=NULLPTR)
461 {
462  CRYPTOPP_UNUSED(dummy);
463 
464  // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a S
465  CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
466 
467  // estimate the quotient: do a 2 S by 1 S divide.
468  // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
469  // The code change occurred at Commit dc99266599a0e72d.
470 
471  S Q; bool pre = (S(B1+1) == 0);
472  if (B1 > 0 && !pre)
473  Q = D(A[1], A[2]) / S(B1+1);
474  else if (pre)
475  Q = A[2];
476  else
477  Q = D(A[0], A[1]) / B0;
478 
479  // now subtract Q*B from A
480  D p = D::Multiply(B0, Q);
481  D u = (D) A[0] - p.GetLowHalf();
482  A[0] = u.GetLowHalf();
483  u = (D) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - D::Multiply(B1, Q);
484  A[1] = u.GetLowHalf();
485  A[2] += u.GetHighHalf();
486 
487  // Q <= actual quotient, so fix it
488  while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
489  {
490  u = (D) A[0] - B0;
491  A[0] = u.GetLowHalf();
492  u = (D) A[1] - B1 - u.GetHighHalfAsBorrow();
493  A[1] = u.GetLowHalf();
494  A[2] += u.GetHighHalf();
495  Q++;
496  CRYPTOPP_ASSERT(Q); // shouldn't overflow
497  }
498 
499  return Q;
500 }
501 
502 // do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
503 template <class S, class D>
504 inline D DivideFourWordsByTwo(S *T, const D &Al, const D &Ah, const D &B)
505 {
506  // Profiling tells us the original second case was dominant, so it was
507  // promoted to the first If statement. The code change occurred at
508  // Commit dc99266599a0e72d.
509 
510  if (!!B)
511  {
512  S Q[2];
513  T[0] = Al.GetLowHalf();
514  T[1] = Al.GetHighHalf();
515  T[2] = Ah.GetLowHalf();
516  T[3] = Ah.GetHighHalf();
517  Q[1] = DivideThreeWordsByTwo<S, D>(T+1, B.GetLowHalf(), B.GetHighHalf());
518  Q[0] = DivideThreeWordsByTwo<S, D>(T, B.GetLowHalf(), B.GetHighHalf());
519  return D(Q[0], Q[1]);
520  }
521  else // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
522  {
523  return D(Ah.GetLowHalf(), Ah.GetHighHalf());
524  }
525 }
526 
527 // returns quotient, which must fit in a word
528 inline word DWord::operator/(word a)
529 {
530  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
531  return word(m_whole / a);
532  #else
533  hword r[4];
534  return DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a).GetWhole();
535  #endif
536 }
537 
538 inline word DWord::operator%(word a)
539 {
540  #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
541  return word(m_whole % a);
542  #else
543  if (a < (word(1) << (WORD_BITS/2)))
544  {
545  hword h = hword(a);
546  word r = m_halfs.high % h;
547  r = ((m_halfs.low >> (WORD_BITS/2)) + (r << (WORD_BITS/2))) % h;
548  return hword((hword(m_halfs.low) + (r << (WORD_BITS/2))) % h);
549  }
550  else
551  {
552  hword r[4];
553  DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a);
554  return Word(r[0], r[1]).GetWhole();
555  }
556  #endif
557 }
558 
559 // ********************************************************
560 
561 // Use some tricks to share assembly code between MSVC, GCC, Clang and Sun CC.
562 #if defined(__GNUC__)
563  #define AddPrologue \
564  int result; \
565  __asm__ __volatile__ \
566  ( \
567  INTEL_NOPREFIX
568  #define AddEpilogue \
569  ATT_PREFIX \
570  : "=a" (result)\
571  : "d" (C), "a" (A), "D" (B), "c" (N) \
572  : "%esi", "memory", "cc" \
573  );\
574  return result;
575  #define MulPrologue \
576  __asm__ __volatile__ \
577  ( \
578  INTEL_NOPREFIX \
579  AS1( push ebx) \
580  AS2( mov ebx, edx)
581  #define MulEpilogue \
582  AS1( pop ebx) \
583  ATT_PREFIX \
584  : \
585  : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B) \
586  : "%esi", "memory", "cc" \
587  );
588  #define SquPrologue MulPrologue
589  #define SquEpilogue \
590  AS1( pop ebx) \
591  ATT_PREFIX \
592  : \
593  : "d" (s_maskLow16), "c" (C), "a" (A) \
594  : "%esi", "%edi", "memory", "cc" \
595  );
596  #define TopPrologue MulPrologue
597  #define TopEpilogue \
598  AS1( pop ebx) \
599  ATT_PREFIX \
600  : \
601  : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B), "S" (L) \
602  : "memory", "cc" \
603  );
604 #else
605  #define AddPrologue \
606  __asm push edi \
607  __asm push esi \
608  __asm mov eax, [esp+12] \
609  __asm mov edi, [esp+16]
610  #define AddEpilogue \
611  __asm pop esi \
612  __asm pop edi \
613  __asm ret 8
614  #define SaveEBX
615  #define RestoreEBX
616  #define SquPrologue \
617  AS2( mov eax, A) \
618  AS2( mov ecx, C) \
619  SaveEBX \
620  AS2( lea ebx, s_maskLow16)
621  #define MulPrologue \
622  AS2( mov eax, A) \
623  AS2( mov edi, B) \
624  AS2( mov ecx, C) \
625  SaveEBX \
626  AS2( lea ebx, s_maskLow16)
627  #define TopPrologue \
628  AS2( mov eax, A) \
629  AS2( mov edi, B) \
630  AS2( mov ecx, C) \
631  AS2( mov esi, L) \
632  SaveEBX \
633  AS2( lea ebx, s_maskLow16)
634  #define SquEpilogue RestoreEBX
635  #define MulEpilogue RestoreEBX
636  #define TopEpilogue RestoreEBX
637 #endif
638 
639 #ifdef CRYPTOPP_X64_MASM_AVAILABLE
640 extern "C" {
641 int Baseline_Add(size_t N, word *C, const word *A, const word *B);
642 int Baseline_Sub(size_t N, word *C, const word *A, const word *B);
643 }
644 #elif defined(CRYPTOPP_X64_ASM_AVAILABLE) && defined(__GNUC__) && defined(CRYPTOPP_WORD128_AVAILABLE)
645 int Baseline_Add(size_t N, word *C, const word *A, const word *B)
646 {
647  word result;
648  __asm__ __volatile__
649  (
650  INTEL_NOPREFIX
651  AS1( neg %1)
652  ASJ( jz, 1, f)
653  AS2( mov %0,[%3+8*%1])
654  AS2( add %0,[%4+8*%1])
655  AS2( mov [%2+8*%1],%0)
656  ASL(0)
657  AS2( mov %0,[%3+8*%1+8])
658  AS2( adc %0,[%4+8*%1+8])
659  AS2( mov [%2+8*%1+8],%0)
660  AS2( lea %1,[%1+2])
661  ASJ( jrcxz, 1, f)
662  AS2( mov %0,[%3+8*%1])
663  AS2( adc %0,[%4+8*%1])
664  AS2( mov [%2+8*%1],%0)
665  ASJ( jmp, 0, b)
666  ASL(1)
667  AS2( mov %0, 0)
668  AS2( adc %0, %0)
669  ATT_NOPREFIX
670  : "=&r" (result), "+c" (N)
671  : "r" (C+N), "r" (A+N), "r" (B+N)
672  : "memory", "cc"
673  );
674  return (int)result;
675 }
676 
677 int Baseline_Sub(size_t N, word *C, const word *A, const word *B)
678 {
679  word result;
680  __asm__ __volatile__
681  (
682  INTEL_NOPREFIX
683  AS1( neg %1)
684  ASJ( jz, 1, f)
685  AS2( mov %0,[%3+8*%1])
686  AS2( sub %0,[%4+8*%1])
687  AS2( mov [%2+8*%1],%0)
688  ASL(0)
689  AS2( mov %0,[%3+8*%1+8])
690  AS2( sbb %0,[%4+8*%1+8])
691  AS2( mov [%2+8*%1+8],%0)
692  AS2( lea %1,[%1+2])
693  ASJ( jrcxz, 1, f)
694  AS2( mov %0,[%3+8*%1])
695  AS2( sbb %0,[%4+8*%1])
696  AS2( mov [%2+8*%1],%0)
697  ASJ( jmp, 0, b)
698  ASL(1)
699  AS2( mov %0, 0)
700  AS2( adc %0, %0)
701  ATT_NOPREFIX
702  : "=&r" (result), "+c" (N)
703  : "r" (C+N), "r" (A+N), "r" (B+N)
704  : "memory", "cc"
705  );
706  return (int)result;
707 }
708 #elif defined(CRYPTOPP_X86_ASM_AVAILABLE) && CRYPTOPP_BOOL_X86
709 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
710 {
711  AddPrologue
712 
713  // now: eax = A, edi = B, edx = C, ecx = N
714  AS2( lea eax, [eax+4*ecx])
715  AS2( lea edi, [edi+4*ecx])
716  AS2( lea edx, [edx+4*ecx])
717 
718  AS1( neg ecx) // ecx is negative index
719  AS2( test ecx, 2) // this clears carry flag
720  ASJ( jz, 0, f)
721  AS2( sub ecx, 2)
722  ASJ( jmp, 1, f)
723 
724  ASL(0)
725  ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
726  AS2( mov esi,[eax+4*ecx])
727  AS2( adc esi,[edi+4*ecx])
728  AS2( mov [edx+4*ecx],esi)
729  AS2( mov esi,[eax+4*ecx+4])
730  AS2( adc esi,[edi+4*ecx+4])
731  AS2( mov [edx+4*ecx+4],esi)
732  ASL(1)
733  AS2( mov esi,[eax+4*ecx+8])
734  AS2( adc esi,[edi+4*ecx+8])
735  AS2( mov [edx+4*ecx+8],esi)
736  AS2( mov esi,[eax+4*ecx+12])
737  AS2( adc esi,[edi+4*ecx+12])
738  AS2( mov [edx+4*ecx+12],esi)
739 
740  AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
741  ASJ( jmp, 0, b)
742 
743  ASL(2)
744  AS2( mov eax, 0)
745  AS1( setc al) // store carry into eax (return result register)
746 
747  AddEpilogue
748 
749  // http://github.com/weidai11/cryptopp/issues/340
750  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
751  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
752 }
753 
754 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
755 {
756  AddPrologue
757 
758  // now: eax = A, edi = B, edx = C, ecx = N
759  AS2( lea eax, [eax+4*ecx])
760  AS2( lea edi, [edi+4*ecx])
761  AS2( lea edx, [edx+4*ecx])
762 
763  AS1( neg ecx) // ecx is negative index
764  AS2( test ecx, 2) // this clears carry flag
765  ASJ( jz, 0, f)
766  AS2( sub ecx, 2)
767  ASJ( jmp, 1, f)
768 
769  ASL(0)
770  ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
771  AS2( mov esi,[eax+4*ecx])
772  AS2( sbb esi,[edi+4*ecx])
773  AS2( mov [edx+4*ecx],esi)
774  AS2( mov esi,[eax+4*ecx+4])
775  AS2( sbb esi,[edi+4*ecx+4])
776  AS2( mov [edx+4*ecx+4],esi)
777  ASL(1)
778  AS2( mov esi,[eax+4*ecx+8])
779  AS2( sbb esi,[edi+4*ecx+8])
780  AS2( mov [edx+4*ecx+8],esi)
781  AS2( mov esi,[eax+4*ecx+12])
782  AS2( sbb esi,[edi+4*ecx+12])
783  AS2( mov [edx+4*ecx+12],esi)
784 
785  AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
786  ASJ( jmp, 0, b)
787 
788  ASL(2)
789  AS2( mov eax, 0)
790  AS1( setc al) // store carry into eax (return result register)
791 
792  AddEpilogue
793 
794  // http://github.com/weidai11/cryptopp/issues/340
795  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
796  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
797 }
798 
799 #if CRYPTOPP_INTEGER_SSE2
800 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Add(size_t N, word *C, const word *A, const word *B)
801 {
802  AddPrologue
803 
804  // now: eax = A, edi = B, edx = C, ecx = N
805  AS2( lea eax, [eax+4*ecx])
806  AS2( lea edi, [edi+4*ecx])
807  AS2( lea edx, [edx+4*ecx])
808 
809  AS1( neg ecx) // ecx is negative index
810  AS2( pxor mm2, mm2)
811  ASJ( jz, 2, f)
812  AS2( test ecx, 2) // this clears carry flag
813  ASJ( jz, 0, f)
814  AS2( sub ecx, 2)
815  ASJ( jmp, 1, f)
816 
817  ASL(0)
818  AS2( movd mm0, DWORD PTR [eax+4*ecx])
819  AS2( movd mm1, DWORD PTR [edi+4*ecx])
820  AS2( paddq mm0, mm1)
821  AS2( paddq mm2, mm0)
822  AS2( movd DWORD PTR [edx+4*ecx], mm2)
823  AS2( psrlq mm2, 32)
824 
825  AS2( movd mm0, DWORD PTR [eax+4*ecx+4])
826  AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
827  AS2( paddq mm0, mm1)
828  AS2( paddq mm2, mm0)
829  AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
830  AS2( psrlq mm2, 32)
831 
832  ASL(1)
833  AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
834  AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
835  AS2( paddq mm0, mm1)
836  AS2( paddq mm2, mm0)
837  AS2( movd DWORD PTR [edx+4*ecx+8], mm2)
838  AS2( psrlq mm2, 32)
839 
840  AS2( movd mm0, DWORD PTR [eax+4*ecx+12])
841  AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
842  AS2( paddq mm0, mm1)
843  AS2( paddq mm2, mm0)
844  AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
845  AS2( psrlq mm2, 32)
846 
847  AS2( add ecx, 4)
848  ASJ( jnz, 0, b)
849 
850  ASL(2)
851  AS2( movd eax, mm2)
852  AS1( emms)
853 
854  AddEpilogue
855 
856  // http://github.com/weidai11/cryptopp/issues/340
857  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
858  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
859 }
860 CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Sub(size_t N, word *C, const word *A, const word *B)
861 {
862  AddPrologue
863 
864  // now: eax = A, edi = B, edx = C, ecx = N
865  AS2( lea eax, [eax+4*ecx])
866  AS2( lea edi, [edi+4*ecx])
867  AS2( lea edx, [edx+4*ecx])
868 
869  AS1( neg ecx) // ecx is negative index
870  AS2( pxor mm2, mm2)
871  ASJ( jz, 2, f)
872  AS2( test ecx, 2) // this clears carry flag
873  ASJ( jz, 0, f)
874  AS2( sub ecx, 2)
875  ASJ( jmp, 1, f)
876 
877  ASL(0)
878  AS2( movd mm0, DWORD PTR [eax+4*ecx])
879  AS2( movd mm1, DWORD PTR [edi+4*ecx])
880  AS2( psubq mm0, mm1)
881  AS2( psubq mm0, mm2)
882  AS2( movd DWORD PTR [edx+4*ecx], mm0)
883  AS2( psrlq mm0, 63)
884 
885  AS2( movd mm2, DWORD PTR [eax+4*ecx+4])
886  AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
887  AS2( psubq mm2, mm1)
888  AS2( psubq mm2, mm0)
889  AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
890  AS2( psrlq mm2, 63)
891 
892  ASL(1)
893  AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
894  AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
895  AS2( psubq mm0, mm1)
896  AS2( psubq mm0, mm2)
897  AS2( movd DWORD PTR [edx+4*ecx+8], mm0)
898  AS2( psrlq mm0, 63)
899 
900  AS2( movd mm2, DWORD PTR [eax+4*ecx+12])
901  AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
902  AS2( psubq mm2, mm1)
903  AS2( psubq mm2, mm0)
904  AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
905  AS2( psrlq mm2, 63)
906 
907  AS2( add ecx, 4)
908  ASJ( jnz, 0, b)
909 
910  ASL(2)
911  AS2( movd eax, mm2)
912  AS1( emms)
913 
914  AddEpilogue
915 
916  // http://github.com/weidai11/cryptopp/issues/340
917  CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
918  CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
919 }
920 #endif // CRYPTOPP_INTEGER_SSE2
921 #else // CRYPTOPP_SSE2_ASM_AVAILABLE
922 int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
923 {
924  CRYPTOPP_ASSERT (N%2 == 0);
925 
926  Declare2Words(u);
927  AssignWord(u, 0);
928  for (size_t i=0; i<N; i+=2)
929  {
930  AddWithCarry(u, A[i], B[i]);
931  C[i] = LowWord(u);
932  AddWithCarry(u, A[i+1], B[i+1]);
933  C[i+1] = LowWord(u);
934  }
935  return int(GetCarry(u));
936 }
937 
938 int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
939 {
940  CRYPTOPP_ASSERT (N%2 == 0);
941 
942  Declare2Words(u);
943  AssignWord(u, 0);
944  for (size_t i=0; i<N; i+=2)
945  {
946  SubtractWithBorrow(u, A[i], B[i]);
947  C[i] = LowWord(u);
948  SubtractWithBorrow(u, A[i+1], B[i+1]);
949  C[i+1] = LowWord(u);
950  }
951  return int(GetBorrow(u));
952 }
953 #endif
954 
955 static word LinearMultiply(word *C, const word *AA, word B, size_t N)
956 {
957  // http://github.com/weidai11/cryptopp/issues/188
958  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
959 
960  word carry=0;
961  for(unsigned i=0; i<N; i++)
962  {
963  Declare2Words(p);
964  MultiplyWords(p, A[i], B);
965  Acc2WordsBy1(p, carry);
966  C[i] = LowWord(p);
967  carry = HighWord(p);
968  }
969  return carry;
970 }
971 
972 #ifndef CRYPTOPP_DOXYGEN_PROCESSING
973 
974 #define Mul_2 \
975  Mul_Begin(2) \
976  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
977  Mul_End(1, 1)
978 
979 #define Mul_4 \
980  Mul_Begin(4) \
981  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
982  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
983  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
984  Mul_SaveAcc(3, 1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
985  Mul_SaveAcc(4, 2, 3) Mul_Acc(3, 2) \
986  Mul_End(5, 3)
987 
988 #define Mul_8 \
989  Mul_Begin(8) \
990  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
991  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
992  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
993  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
994  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
995  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
996  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
997  Mul_SaveAcc(7, 1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
998  Mul_SaveAcc(8, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
999  Mul_SaveAcc(9, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1000  Mul_SaveAcc(10, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1001  Mul_SaveAcc(11, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1002  Mul_SaveAcc(12, 6, 7) Mul_Acc(7, 6) \
1003  Mul_End(13, 7)
1004 
1005 #define Mul_16 \
1006  Mul_Begin(16) \
1007  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1008  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1009  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1010  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1011  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1012  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1013  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1014  Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1015  Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1016  Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1017  Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1018  Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1019  Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1020  Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1021  Mul_SaveAcc(14, 0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1022  Mul_SaveAcc(15, 1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1023  Mul_SaveAcc(16, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1024  Mul_SaveAcc(17, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1025  Mul_SaveAcc(18, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1026  Mul_SaveAcc(19, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1027  Mul_SaveAcc(20, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1028  Mul_SaveAcc(21, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1029  Mul_SaveAcc(22, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1030  Mul_SaveAcc(23, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1031  Mul_SaveAcc(24, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1032  Mul_SaveAcc(25, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1033  Mul_SaveAcc(26, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1034  Mul_SaveAcc(27, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1035  Mul_SaveAcc(28, 14, 15) Mul_Acc(15, 14) \
1036  Mul_End(29, 15)
1037 
1038 #define Squ_2 \
1039  Squ_Begin(2) \
1040  Squ_End(2)
1041 
1042 #define Squ_4 \
1043  Squ_Begin(4) \
1044  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1045  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1046  Squ_SaveAcc(3, 1, 3) Squ_Diag(2) \
1047  Squ_SaveAcc(4, 2, 3) Squ_NonDiag \
1048  Squ_End(4)
1049 
1050 #define Squ_8 \
1051  Squ_Begin(8) \
1052  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1053  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1054  Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1055  Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1056  Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1057  Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1058  Squ_SaveAcc(7, 1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1059  Squ_SaveAcc(8, 2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1060  Squ_SaveAcc(9, 3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1061  Squ_SaveAcc(10, 4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1062  Squ_SaveAcc(11, 5, 7) Squ_Diag(6) \
1063  Squ_SaveAcc(12, 6, 7) Squ_NonDiag \
1064  Squ_End(8)
1065 
1066 #define Squ_16 \
1067  Squ_Begin(16) \
1068  Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1069  Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1070  Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1071  Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1072  Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1073  Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1074  Squ_SaveAcc(7, 0, 8) Squ_Acc(1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1075  Squ_SaveAcc(8, 0, 9) Squ_Acc(1, 8) Squ_Acc(2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1076  Squ_SaveAcc(9, 0, 10) Squ_Acc(1, 9) Squ_Acc(2, 8) Squ_Acc(3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1077  Squ_SaveAcc(10, 0, 11) Squ_Acc(1, 10) Squ_Acc(2, 9) Squ_Acc(3, 8) Squ_Acc(4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1078  Squ_SaveAcc(11, 0, 12) Squ_Acc(1, 11) Squ_Acc(2, 10) Squ_Acc(3, 9) Squ_Acc(4, 8) Squ_Acc(5, 7) Squ_Diag(6) \
1079  Squ_SaveAcc(12, 0, 13) Squ_Acc(1, 12) Squ_Acc(2, 11) Squ_Acc(3, 10) Squ_Acc(4, 9) Squ_Acc(5, 8) Squ_Acc(6, 7) Squ_NonDiag \
1080  Squ_SaveAcc(13, 0, 14) Squ_Acc(1, 13) Squ_Acc(2, 12) Squ_Acc(3, 11) Squ_Acc(4, 10) Squ_Acc(5, 9) Squ_Acc(6, 8) Squ_Diag(7) \
1081  Squ_SaveAcc(14, 0, 15) Squ_Acc(1, 14) Squ_Acc(2, 13) Squ_Acc(3, 12) Squ_Acc(4, 11) Squ_Acc(5, 10) Squ_Acc(6, 9) Squ_Acc(7, 8) Squ_NonDiag \
1082  Squ_SaveAcc(15, 1, 15) Squ_Acc(2, 14) Squ_Acc(3, 13) Squ_Acc(4, 12) Squ_Acc(5, 11) Squ_Acc(6, 10) Squ_Acc(7, 9) Squ_Diag(8) \
1083  Squ_SaveAcc(16, 2, 15) Squ_Acc(3, 14) Squ_Acc(4, 13) Squ_Acc(5, 12) Squ_Acc(6, 11) Squ_Acc(7, 10) Squ_Acc(8, 9) Squ_NonDiag \
1084  Squ_SaveAcc(17, 3, 15) Squ_Acc(4, 14) Squ_Acc(5, 13) Squ_Acc(6, 12) Squ_Acc(7, 11) Squ_Acc(8, 10) Squ_Diag(9) \
1085  Squ_SaveAcc(18, 4, 15) Squ_Acc(5, 14) Squ_Acc(6, 13) Squ_Acc(7, 12) Squ_Acc(8, 11) Squ_Acc(9, 10) Squ_NonDiag \
1086  Squ_SaveAcc(19, 5, 15) Squ_Acc(6, 14) Squ_Acc(7, 13) Squ_Acc(8, 12) Squ_Acc(9, 11) Squ_Diag(10) \
1087  Squ_SaveAcc(20, 6, 15) Squ_Acc(7, 14) Squ_Acc(8, 13) Squ_Acc(9, 12) Squ_Acc(10, 11) Squ_NonDiag \
1088  Squ_SaveAcc(21, 7, 15) Squ_Acc(8, 14) Squ_Acc(9, 13) Squ_Acc(10, 12) Squ_Diag(11) \
1089  Squ_SaveAcc(22, 8, 15) Squ_Acc(9, 14) Squ_Acc(10, 13) Squ_Acc(11, 12) Squ_NonDiag \
1090  Squ_SaveAcc(23, 9, 15) Squ_Acc(10, 14) Squ_Acc(11, 13) Squ_Diag(12) \
1091  Squ_SaveAcc(24, 10, 15) Squ_Acc(11, 14) Squ_Acc(12, 13) Squ_NonDiag \
1092  Squ_SaveAcc(25, 11, 15) Squ_Acc(12, 14) Squ_Diag(13) \
1093  Squ_SaveAcc(26, 12, 15) Squ_Acc(13, 14) Squ_NonDiag \
1094  Squ_SaveAcc(27, 13, 15) Squ_Diag(14) \
1095  Squ_SaveAcc(28, 14, 15) Squ_NonDiag \
1096  Squ_End(16)
1097 
1098 #define Bot_2 \
1099  Mul_Begin(2) \
1100  Bot_SaveAcc(0, 0, 1) Bot_Acc(1, 0) \
1101  Bot_End(2)
1102 
1103 #define Bot_4 \
1104  Mul_Begin(4) \
1105  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1106  Mul_SaveAcc(1, 2, 0) Mul_Acc(1, 1) Mul_Acc(0, 2) \
1107  Bot_SaveAcc(2, 0, 3) Bot_Acc(1, 2) Bot_Acc(2, 1) Bot_Acc(3, 0) \
1108  Bot_End(4)
1109 
1110 #define Bot_8 \
1111  Mul_Begin(8) \
1112  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1113  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1114  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1115  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1116  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1117  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1118  Bot_SaveAcc(6, 0, 7) Bot_Acc(1, 6) Bot_Acc(2, 5) Bot_Acc(3, 4) Bot_Acc(4, 3) Bot_Acc(5, 2) Bot_Acc(6, 1) Bot_Acc(7, 0) \
1119  Bot_End(8)
1120 
1121 #define Bot_16 \
1122  Mul_Begin(16) \
1123  Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1124  Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1125  Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1126  Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1127  Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1128  Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1129  Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1130  Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1131  Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1132  Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1133  Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1134  Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1135  Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1136  Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1137  Bot_SaveAcc(14, 0, 15) Bot_Acc(1, 14) Bot_Acc(2, 13) Bot_Acc(3, 12) Bot_Acc(4, 11) Bot_Acc(5, 10) Bot_Acc(6, 9) Bot_Acc(7, 8) Bot_Acc(8, 7) Bot_Acc(9, 6) Bot_Acc(10, 5) Bot_Acc(11, 4) Bot_Acc(12, 3) Bot_Acc(13, 2) Bot_Acc(14, 1) Bot_Acc(15, 0) \
1138  Bot_End(16)
1139 
1140 #endif
1141 
1142 #if 0
1143 #define Mul_Begin(n) \
1144  Declare2Words(p) \
1145  Declare2Words(c) \
1146  Declare2Words(d) \
1147  MultiplyWords(p, A[0], B[0]) \
1148  AssignWord(c, LowWord(p)) \
1149  AssignWord(d, HighWord(p))
1150 
1151 #define Mul_Acc(i, j) \
1152  MultiplyWords(p, A[i], B[j]) \
1153  Acc2WordsBy1(c, LowWord(p)) \
1154  Acc2WordsBy1(d, HighWord(p))
1155 
1156 #define Mul_SaveAcc(k, i, j) \
1157  R[k] = LowWord(c); \
1158  Add2WordsBy1(c, d, HighWord(c)) \
1159  MultiplyWords(p, A[i], B[j]) \
1160  AssignWord(d, HighWord(p)) \
1161  Acc2WordsBy1(c, LowWord(p))
1162 
1163 #define Mul_End(n) \
1164  R[2*n-3] = LowWord(c); \
1165  Acc2WordsBy1(d, HighWord(c)) \
1166  MultiplyWords(p, A[n-1], B[n-1])\
1167  Acc2WordsBy2(d, p) \
1168  R[2*n-2] = LowWord(d); \
1169  R[2*n-1] = HighWord(d);
1170 
1171 #define Bot_SaveAcc(k, i, j) \
1172  R[k] = LowWord(c); \
1173  word e = LowWord(d) + HighWord(c); \
1174  e += A[i] * B[j];
1175 
1176 #define Bot_Acc(i, j) \
1177  e += A[i] * B[j];
1178 
1179 #define Bot_End(n) \
1180  R[n-1] = e;
1181 #else
1182 #define Mul_Begin(n) \
1183  Declare2Words(p) \
1184  word c; \
1185  Declare2Words(d) \
1186  MultiplyWords(p, A[0], B[0]) \
1187  c = LowWord(p); \
1188  AssignWord(d, HighWord(p))
1189 
1190 #define Mul_Acc(i, j) \
1191  MulAcc(c, d, A[i], B[j])
1192 
1193 #define Mul_SaveAcc(k, i, j) \
1194  R[k] = c; \
1195  c = LowWord(d); \
1196  AssignWord(d, HighWord(d)) \
1197  MulAcc(c, d, A[i], B[j])
1198 
1199 #define Mul_End(k, i) \
1200  R[k] = c; \
1201  MultiplyWords(p, A[i], B[i]) \
1202  Acc2WordsBy2(p, d) \
1203  R[k+1] = LowWord(p); \
1204  R[k+2] = HighWord(p);
1205 
1206 #define Bot_SaveAcc(k, i, j) \
1207  R[k] = c; \
1208  c = LowWord(d); \
1209  c += A[i] * B[j];
1210 
1211 #define Bot_Acc(i, j) \
1212  c += A[i] * B[j];
1213 
1214 #define Bot_End(n) \
1215  R[n-1] = c;
1216 #endif
1217 
1218 #define Squ_Begin(n) \
1219  Declare2Words(p) \
1220  word c; \
1221  Declare2Words(d) \
1222  Declare2Words(e) \
1223  MultiplyWords(p, A[0], A[0]) \
1224  R[0] = LowWord(p); \
1225  AssignWord(e, HighWord(p)) \
1226  MultiplyWords(p, A[0], A[1]) \
1227  c = LowWord(p); \
1228  AssignWord(d, HighWord(p)) \
1229  Squ_NonDiag \
1230 
1231 #define Squ_NonDiag \
1232  Double3Words(c, d)
1233 
1234 #define Squ_SaveAcc(k, i, j) \
1235  Acc3WordsBy2(c, d, e) \
1236  R[k] = c; \
1237  MultiplyWords(p, A[i], A[j]) \
1238  c = LowWord(p); \
1239  AssignWord(d, HighWord(p)) \
1240 
1241 #define Squ_Acc(i, j) \
1242  MulAcc(c, d, A[i], A[j])
1243 
1244 #define Squ_Diag(i) \
1245  Squ_NonDiag \
1246  MulAcc(c, d, A[i], A[i])
1247 
1248 #define Squ_End(n) \
1249  Acc3WordsBy2(c, d, e) \
1250  R[2*n-3] = c; \
1251  MultiplyWords(p, A[n-1], A[n-1])\
1252  Acc2WordsBy2(p, e) \
1253  R[2*n-2] = LowWord(p); \
1254  R[2*n-1] = HighWord(p);
1255 
1256 
1257 void Baseline_Multiply2(word *R, const word *AA, const word *BB)
1258 {
1259  // http://github.com/weidai11/cryptopp/issues/188
1260  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1261  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1262 
1263  Mul_2
1264 }
1265 
1266 void Baseline_Multiply4(word *R, const word *AA, const word *BB)
1267 {
1268  // http://github.com/weidai11/cryptopp/issues/188
1269  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1270  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1271 
1272  Mul_4
1273 }
1274 
1275 void Baseline_Multiply8(word *R, const word *AA, const word *BB)
1276 {
1277  // http://github.com/weidai11/cryptopp/issues/188
1278  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1279  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1280 
1281  Mul_8
1282 }
1283 
1284 void Baseline_Square2(word *R, const word *AA)
1285 {
1286  // http://github.com/weidai11/cryptopp/issues/188
1287  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1288 
1289  Squ_2
1290 }
1291 
1292 void Baseline_Square4(word *R, const word *AA)
1293 {
1294  // http://github.com/weidai11/cryptopp/issues/188
1295  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1296 
1297  Squ_4
1298 }
1299 
1300 void Baseline_Square8(word *R, const word *AA)
1301 {
1302  // http://github.com/weidai11/cryptopp/issues/188
1303  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1304 
1305  Squ_8
1306 }
1307 
1308 void Baseline_MultiplyBottom2(word *R, const word *AA, const word *BB)
1309 {
1310  // http://github.com/weidai11/cryptopp/issues/188
1311  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1312  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1313 
1314  Bot_2
1315 
1316 // http://github.com/weidai11/cryptopp/issues/340
1317 #if defined(TWO_64_BIT_WORDS)
1318  CRYPTOPP_UNUSED(d0); CRYPTOPP_UNUSED(d1);
1319 #endif
1320 }
1321 
1322 void Baseline_MultiplyBottom4(word *R, const word *AA, const word *BB)
1323 {
1324  // http://github.com/weidai11/cryptopp/issues/188
1325  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1326  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1327 
1328  Bot_4
1329 }
1330 
1331 void Baseline_MultiplyBottom8(word *R, const word *AA, const word *BB)
1332 {
1333  // http://github.com/weidai11/cryptopp/issues/188
1334  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1335  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1336 
1337  Bot_8
1338 }
1339 
1340 #define Top_Begin(n) \
1341  Declare2Words(p) \
1342  word c; \
1343  Declare2Words(d) \
1344  MultiplyWords(p, A[0], B[n-2]);\
1345  AssignWord(d, HighWord(p));
1346 
1347 #define Top_Acc(i, j) \
1348  MultiplyWords(p, A[i], B[j]);\
1349  Acc2WordsBy1(d, HighWord(p));
1350 
1351 #define Top_SaveAcc0(i, j) \
1352  c = LowWord(d); \
1353  AssignWord(d, HighWord(d)) \
1354  MulAcc(c, d, A[i], B[j])
1355 
1356 #define Top_SaveAcc1(i, j) \
1357  c = L<c; \
1358  Acc2WordsBy1(d, c); \
1359  c = LowWord(d); \
1360  AssignWord(d, HighWord(d)) \
1361  MulAcc(c, d, A[i], B[j])
1362 
1363 void Baseline_MultiplyTop2(word *R, const word *A, const word *B, word L)
1364 {
1365  CRYPTOPP_UNUSED(L);
1366  word T[4];
1367  Baseline_Multiply2(T, A, B);
1368  R[0] = T[2];
1369  R[1] = T[3];
1370 }
1371 
1372 void Baseline_MultiplyTop4(word *R, const word *AA, const word *BB, word L)
1373 {
1374  // http://github.com/weidai11/cryptopp/issues/188
1375  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1376  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1377 
1378  Top_Begin(4)
1379  Top_Acc(1, 1) Top_Acc(2, 0) \
1380  Top_SaveAcc0(0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1381  Top_SaveAcc1(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
1382  Mul_SaveAcc(0, 2, 3) Mul_Acc(3, 2) \
1383  Mul_End(1, 3)
1384 }
1385 
1386 void Baseline_MultiplyTop8(word *R, const word *AA, const word *BB, word L)
1387 {
1388  // http://github.com/weidai11/cryptopp/issues/188
1389  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1390  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1391 
1392  Top_Begin(8)
1393  Top_Acc(1, 5) Top_Acc(2, 4) Top_Acc(3, 3) Top_Acc(4, 2) Top_Acc(5, 1) Top_Acc(6, 0) \
1394  Top_SaveAcc0(0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1395  Top_SaveAcc1(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1396  Mul_SaveAcc(0, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1397  Mul_SaveAcc(1, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1398  Mul_SaveAcc(2, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1399  Mul_SaveAcc(3, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1400  Mul_SaveAcc(4, 6, 7) Mul_Acc(7, 6) \
1401  Mul_End(5, 7)
1402 }
1403 
1404 #if !CRYPTOPP_INTEGER_SSE2 // save memory by not compiling these functions when SSE2 is available
1405 void Baseline_Multiply16(word *R, const word *AA, const word *BB)
1406 {
1407  // http://github.com/weidai11/cryptopp/issues/188
1408  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1409  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1410 
1411  Mul_16
1412 }
1413 
1414 void Baseline_Square16(word *R, const word *AA)
1415 {
1416  // http://github.com/weidai11/cryptopp/issues/188
1417  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1418 
1419  Squ_16
1420 }
1421 
1422 void Baseline_MultiplyBottom16(word *R, const word *AA, const word *BB)
1423 {
1424  // http://github.com/weidai11/cryptopp/issues/188
1425  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1426  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1427 
1428  Bot_16
1429 }
1430 
1431 void Baseline_MultiplyTop16(word *R, const word *AA, const word *BB, word L)
1432 {
1433  // http://github.com/weidai11/cryptopp/issues/188
1434  MAYBE_CONST word* A = MAYBE_UNCONST_CAST(AA);
1435  MAYBE_CONST word* B = MAYBE_UNCONST_CAST(BB);
1436 
1437  Top_Begin(16)
1438  Top_Acc(1, 13) Top_Acc(2, 12) Top_Acc(3, 11) Top_Acc(4, 10) Top_Acc(5, 9) Top_Acc(6, 8) Top_Acc(7, 7) Top_Acc(8, 6) Top_Acc(9, 5) Top_Acc(10, 4) Top_Acc(11, 3) Top_Acc(12, 2) Top_Acc(13, 1) Top_Acc(14, 0) \
1439  Top_SaveAcc0(0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1440  Top_SaveAcc1(1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1441  Mul_SaveAcc(0, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1442  Mul_SaveAcc(1, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1443  Mul_SaveAcc(2, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1444  Mul_SaveAcc(3, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1445  Mul_SaveAcc(4, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1446  Mul_SaveAcc(5, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1447  Mul_SaveAcc(6, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1448  Mul_SaveAcc(7, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1449  Mul_SaveAcc(8, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1450  Mul_SaveAcc(9, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1451  Mul_SaveAcc(10, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1452  Mul_SaveAcc(11, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1453  Mul_SaveAcc(12, 14, 15) Mul_Acc(15, 14) \
1454  Mul_End(13, 15)
1455 }
1456 #endif
1457 
1458 // ********************************************************
1459 
1460 #if CRYPTOPP_INTEGER_SSE2
1461 
1462 CRYPTOPP_ALIGN_DATA(16)
1463 CRYPTOPP_TABLE
1464 const word32 s_maskLow16[4] = {
1465  0xffff,0xffff,0xffff,0xffff
1466 };
1467 
1468 #undef Mul_Begin
1469 #undef Mul_Acc
1470 #undef Top_Begin
1471 #undef Top_Acc
1472 #undef Squ_Acc
1473 #undef Squ_NonDiag
1474 #undef Squ_Diag
1475 #undef Squ_SaveAcc
1476 #undef Squ_Begin
1477 #undef Mul_SaveAcc
1478 #undef Bot_Acc
1479 #undef Bot_SaveAcc
1480 #undef Bot_End
1481 #undef Squ_End
1482 #undef Mul_End
1483 
1484 #define SSE2_FinalSave(k) \
1485  AS2( psllq xmm5, 16) \
1486  AS2( paddq xmm4, xmm5) \
1487  AS2( movq QWORD PTR [ecx+8*(k)], xmm4)
1488 
1489 #define SSE2_SaveShift(k) \
1490  AS2( movq xmm0, xmm6) \
1491  AS2( punpckhqdq xmm6, xmm0) \
1492  AS2( movq xmm1, xmm7) \
1493  AS2( punpckhqdq xmm7, xmm1) \
1494  AS2( paddd xmm6, xmm0) \
1495  AS2( pslldq xmm6, 4) \
1496  AS2( paddd xmm7, xmm1) \
1497  AS2( paddd xmm4, xmm6) \
1498  AS2( pslldq xmm7, 4) \
1499  AS2( movq xmm6, xmm4) \
1500  AS2( paddd xmm5, xmm7) \
1501  AS2( movq xmm7, xmm5) \
1502  AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1503  AS2( psrlq xmm6, 16) \
1504  AS2( paddq xmm6, xmm7) \
1505  AS2( punpckhqdq xmm4, xmm0) \
1506  AS2( punpckhqdq xmm5, xmm0) \
1507  AS2( movq QWORD PTR [ecx+8*(k)+2], xmm6) \
1508  AS2( psrlq xmm6, 3*16) \
1509  AS2( paddd xmm4, xmm6) \
1510 
1511 #define Squ_SSE2_SaveShift(k) \
1512  AS2( movq xmm0, xmm6) \
1513  AS2( punpckhqdq xmm6, xmm0) \
1514  AS2( movq xmm1, xmm7) \
1515  AS2( punpckhqdq xmm7, xmm1) \
1516  AS2( paddd xmm6, xmm0) \
1517  AS2( pslldq xmm6, 4) \
1518  AS2( paddd xmm7, xmm1) \
1519  AS2( paddd xmm4, xmm6) \
1520  AS2( pslldq xmm7, 4) \
1521  AS2( movhlps xmm6, xmm4) \
1522  AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1523  AS2( paddd xmm5, xmm7) \
1524  AS2( movhps QWORD PTR [esp+12], xmm5)\
1525  AS2( psrlq xmm4, 16) \
1526  AS2( paddq xmm4, xmm5) \
1527  AS2( movq QWORD PTR [ecx+8*(k)+2], xmm4) \
1528  AS2( psrlq xmm4, 3*16) \
1529  AS2( paddd xmm4, xmm6) \
1530  AS2( movq QWORD PTR [esp+4], xmm4)\
1531 
1532 #define SSE2_FirstMultiply(i) \
1533  AS2( movdqa xmm7, [esi+(i)*16])\
1534  AS2( movdqa xmm5, [edi-(i)*16])\
1535  AS2( pmuludq xmm5, xmm7) \
1536  AS2( movdqa xmm4, [ebx])\
1537  AS2( movdqa xmm6, xmm4) \
1538  AS2( pand xmm4, xmm5) \
1539  AS2( psrld xmm5, 16) \
1540  AS2( pmuludq xmm7, [edx-(i)*16])\
1541  AS2( pand xmm6, xmm7) \
1542  AS2( psrld xmm7, 16)
1543 
1544 #define Squ_Begin(n) \
1545  SquPrologue \
1546  AS2( mov esi, esp)\
1547  AS2( and esp, 0xfffffff0)\
1548  AS2( lea edi, [esp-32*n])\
1549  AS2( sub esp, 32*n+16)\
1550  AS1( push esi)\
1551  AS2( mov esi, edi) \
1552  AS2( xor edx, edx) \
1553  ASL(1) \
1554  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1555  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1556  AS2( movdqa [edi+2*edx], xmm0) \
1557  AS2( psrlq xmm0, 32) \
1558  AS2( movdqa [edi+2*edx+16], xmm0) \
1559  AS2( movdqa [edi+16*n+2*edx], xmm1) \
1560  AS2( psrlq xmm1, 32) \
1561  AS2( movdqa [edi+16*n+2*edx+16], xmm1) \
1562  AS2( add edx, 16) \
1563  AS2( cmp edx, 8*(n)) \
1564  ASJ( jne, 1, b) \
1565  AS2( lea edx, [edi+16*n])\
1566  SSE2_FirstMultiply(0) \
1567 
1568 #define Squ_Acc(i) \
1569  ASL(LSqu##i) \
1570  AS2( movdqa xmm1, [esi+(i)*16]) \
1571  AS2( movdqa xmm0, [edi-(i)*16]) \
1572  AS2( movdqa xmm2, [ebx]) \
1573  AS2( pmuludq xmm0, xmm1) \
1574  AS2( pmuludq xmm1, [edx-(i)*16]) \
1575  AS2( movdqa xmm3, xmm2) \
1576  AS2( pand xmm2, xmm0) \
1577  AS2( psrld xmm0, 16) \
1578  AS2( paddd xmm4, xmm2) \
1579  AS2( paddd xmm5, xmm0) \
1580  AS2( pand xmm3, xmm1) \
1581  AS2( psrld xmm1, 16) \
1582  AS2( paddd xmm6, xmm3) \
1583  AS2( paddd xmm7, xmm1) \
1584 
1585 #define Squ_Acc1(i)
1586 #define Squ_Acc2(i) ASC(call, LSqu##i)
1587 #define Squ_Acc3(i) Squ_Acc2(i)
1588 #define Squ_Acc4(i) Squ_Acc2(i)
1589 #define Squ_Acc5(i) Squ_Acc2(i)
1590 #define Squ_Acc6(i) Squ_Acc2(i)
1591 #define Squ_Acc7(i) Squ_Acc2(i)
1592 #define Squ_Acc8(i) Squ_Acc2(i)
1593 
1594 #define SSE2_End(E, n) \
1595  SSE2_SaveShift(2*(n)-3) \
1596  AS2( movdqa xmm7, [esi+16]) \
1597  AS2( movdqa xmm0, [edi]) \
1598  AS2( pmuludq xmm0, xmm7) \
1599  AS2( movdqa xmm2, [ebx]) \
1600  AS2( pmuludq xmm7, [edx]) \
1601  AS2( movdqa xmm6, xmm2) \
1602  AS2( pand xmm2, xmm0) \
1603  AS2( psrld xmm0, 16) \
1604  AS2( paddd xmm4, xmm2) \
1605  AS2( paddd xmm5, xmm0) \
1606  AS2( pand xmm6, xmm7) \
1607  AS2( psrld xmm7, 16) \
1608  SSE2_SaveShift(2*(n)-2) \
1609  SSE2_FinalSave(2*(n)-1) \
1610  AS1( pop esp)\
1611  E
1612 
1613 #define Squ_End(n) SSE2_End(SquEpilogue, n)
1614 #define Mul_End(n) SSE2_End(MulEpilogue, n)
1615 #define Top_End(n) SSE2_End(TopEpilogue, n)
1616 
1617 #define Squ_Column1(k, i) \
1618  Squ_SSE2_SaveShift(k) \
1619  AS2( add esi, 16) \
1620  SSE2_FirstMultiply(1)\
1621  Squ_Acc##i(i) \
1622  AS2( paddd xmm4, xmm4) \
1623  AS2( paddd xmm5, xmm5) \
1624  AS2( movdqa xmm3, [esi]) \
1625  AS2( movq xmm1, QWORD PTR [esi+8]) \
1626  AS2( pmuludq xmm1, xmm3) \
1627  AS2( pmuludq xmm3, xmm3) \
1628  AS2( movdqa xmm0, [ebx])\
1629  AS2( movdqa xmm2, xmm0) \
1630  AS2( pand xmm0, xmm1) \
1631  AS2( psrld xmm1, 16) \
1632  AS2( paddd xmm6, xmm0) \
1633  AS2( paddd xmm7, xmm1) \
1634  AS2( pand xmm2, xmm3) \
1635  AS2( psrld xmm3, 16) \
1636  AS2( paddd xmm6, xmm6) \
1637  AS2( paddd xmm7, xmm7) \
1638  AS2( paddd xmm4, xmm2) \
1639  AS2( paddd xmm5, xmm3) \
1640  AS2( movq xmm0, QWORD PTR [esp+4])\
1641  AS2( movq xmm1, QWORD PTR [esp+12])\
1642  AS2( paddd xmm4, xmm0)\
1643  AS2( paddd xmm5, xmm1)\
1644 
1645 #define Squ_Column0(k, i) \
1646  Squ_SSE2_SaveShift(k) \
1647  AS2( add edi, 16) \
1648  AS2( add edx, 16) \
1649  SSE2_FirstMultiply(1)\
1650  Squ_Acc##i(i) \
1651  AS2( paddd xmm6, xmm6) \
1652  AS2( paddd xmm7, xmm7) \
1653  AS2( paddd xmm4, xmm4) \
1654  AS2( paddd xmm5, xmm5) \
1655  AS2( movq xmm0, QWORD PTR [esp+4])\
1656  AS2( movq xmm1, QWORD PTR [esp+12])\
1657  AS2( paddd xmm4, xmm0)\
1658  AS2( paddd xmm5, xmm1)\
1659 
1660 #define SSE2_MulAdd45 \
1661  AS2( movdqa xmm7, [esi]) \
1662  AS2( movdqa xmm0, [edi]) \
1663  AS2( pmuludq xmm0, xmm7) \
1664  AS2( movdqa xmm2, [ebx]) \
1665  AS2( pmuludq xmm7, [edx]) \
1666  AS2( movdqa xmm6, xmm2) \
1667  AS2( pand xmm2, xmm0) \
1668  AS2( psrld xmm0, 16) \
1669  AS2( paddd xmm4, xmm2) \
1670  AS2( paddd xmm5, xmm0) \
1671  AS2( pand xmm6, xmm7) \
1672  AS2( psrld xmm7, 16)
1673 
1674 #define Mul_Begin(n) \
1675  MulPrologue \
1676  AS2( mov esi, esp)\
1677  AS2( and esp, 0xfffffff0)\
1678  AS2( sub esp, 48*n+16)\
1679  AS1( push esi)\
1680  AS2( xor edx, edx) \
1681  ASL(1) \
1682  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1683  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1684  ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1685  AS2( movdqa [esp+20+2*edx], xmm0) \
1686  AS2( psrlq xmm0, 32) \
1687  AS2( movdqa [esp+20+2*edx+16], xmm0) \
1688  AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1689  AS2( psrlq xmm1, 32) \
1690  AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1691  AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1692  AS2( psrlq xmm2, 32) \
1693  AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1694  AS2( add edx, 16) \
1695  AS2( cmp edx, 8*(n)) \
1696  ASJ( jne, 1, b) \
1697  AS2( lea edi, [esp+20])\
1698  AS2( lea edx, [esp+20+16*n])\
1699  AS2( lea esi, [esp+20+32*n])\
1700  SSE2_FirstMultiply(0) \
1701 
1702 #define Mul_Acc(i) \
1703  ASL(LMul##i) \
1704  AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1705  AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1706  AS2( movdqa xmm2, [ebx]) \
1707  AS2( pmuludq xmm0, xmm1) \
1708  AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1709  AS2( movdqa xmm3, xmm2) \
1710  AS2( pand xmm2, xmm0) \
1711  AS2( psrld xmm0, 16) \
1712  AS2( paddd xmm4, xmm2) \
1713  AS2( paddd xmm5, xmm0) \
1714  AS2( pand xmm3, xmm1) \
1715  AS2( psrld xmm1, 16) \
1716  AS2( paddd xmm6, xmm3) \
1717  AS2( paddd xmm7, xmm1) \
1718 
1719 #define Mul_Acc1(i)
1720 #define Mul_Acc2(i) ASC(call, LMul##i)
1721 #define Mul_Acc3(i) Mul_Acc2(i)
1722 #define Mul_Acc4(i) Mul_Acc2(i)
1723 #define Mul_Acc5(i) Mul_Acc2(i)
1724 #define Mul_Acc6(i) Mul_Acc2(i)
1725 #define Mul_Acc7(i) Mul_Acc2(i)
1726 #define Mul_Acc8(i) Mul_Acc2(i)
1727 #define Mul_Acc9(i) Mul_Acc2(i)
1728 #define Mul_Acc10(i) Mul_Acc2(i)
1729 #define Mul_Acc11(i) Mul_Acc2(i)
1730 #define Mul_Acc12(i) Mul_Acc2(i)
1731 #define Mul_Acc13(i) Mul_Acc2(i)
1732 #define Mul_Acc14(i) Mul_Acc2(i)
1733 #define Mul_Acc15(i) Mul_Acc2(i)
1734 #define Mul_Acc16(i) Mul_Acc2(i)
1735 
1736 #define Mul_Column1(k, i) \
1737  SSE2_SaveShift(k) \
1738  AS2( add esi, 16) \
1739  SSE2_MulAdd45\
1740  Mul_Acc##i(i) \
1741 
1742 #define Mul_Column0(k, i) \
1743  SSE2_SaveShift(k) \
1744  AS2( add edi, 16) \
1745  AS2( add edx, 16) \
1746  SSE2_MulAdd45\
1747  Mul_Acc##i(i) \
1748 
1749 #define Bot_Acc(i) \
1750  AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1751  AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1752  AS2( pmuludq xmm0, xmm1) \
1753  AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1754  AS2( paddq xmm4, xmm0) \
1755  AS2( paddd xmm6, xmm1)
1756 
1757 #define Bot_SaveAcc(k) \
1758  SSE2_SaveShift(k) \
1759  AS2( add edi, 16) \
1760  AS2( add edx, 16) \
1761  AS2( movdqa xmm6, [esi]) \
1762  AS2( movdqa xmm0, [edi]) \
1763  AS2( pmuludq xmm0, xmm6) \
1764  AS2( paddq xmm4, xmm0) \
1765  AS2( psllq xmm5, 16) \
1766  AS2( paddq xmm4, xmm5) \
1767  AS2( pmuludq xmm6, [edx])
1768 
1769 #define Bot_End(n) \
1770  AS2( movhlps xmm7, xmm6) \
1771  AS2( paddd xmm6, xmm7) \
1772  AS2( psllq xmm6, 32) \
1773  AS2( paddd xmm4, xmm6) \
1774  AS2( movq QWORD PTR [ecx+8*((n)-1)], xmm4) \
1775  AS1( pop esp)\
1776  MulEpilogue
1777 
1778 #define Top_Begin(n) \
1779  TopPrologue \
1780  AS2( mov edx, esp)\
1781  AS2( and esp, 0xfffffff0)\
1782  AS2( sub esp, 48*n+16)\
1783  AS1( push edx)\
1784  AS2( xor edx, edx) \
1785  ASL(1) \
1786  ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1787  ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1788  ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1789  AS2( movdqa [esp+20+2*edx], xmm0) \
1790  AS2( psrlq xmm0, 32) \
1791  AS2( movdqa [esp+20+2*edx+16], xmm0) \
1792  AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1793  AS2( psrlq xmm1, 32) \
1794  AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1795  AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1796  AS2( psrlq xmm2, 32) \
1797  AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1798  AS2( add edx, 16) \
1799  AS2( cmp edx, 8*(n)) \
1800  ASJ( jne, 1, b) \
1801  AS2( mov eax, esi) \
1802  AS2( lea edi, [esp+20+00*n+16*(n/2-1)])\
1803  AS2( lea edx, [esp+20+16*n+16*(n/2-1)])\
1804  AS2( lea esi, [esp+20+32*n+16*(n/2-1)])\
1805  AS2( pxor xmm4, xmm4)\
1806  AS2( pxor xmm5, xmm5)
1807 
1808 #define Top_Acc(i) \
1809  AS2( movq xmm0, QWORD PTR [esi+i/2*(1-(i-2*(i/2))*2)*16+8]) \
1810  AS2( pmuludq xmm0, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1811  AS2( psrlq xmm0, 48) \
1812  AS2( paddd xmm5, xmm0)\
1813 
1814 #define Top_Column0(i) \
1815  AS2( psllq xmm5, 32) \
1816  AS2( add edi, 16) \
1817  AS2( add edx, 16) \
1818  SSE2_MulAdd45\
1819  Mul_Acc##i(i) \
1820 
1821 #define Top_Column1(i) \
1822  SSE2_SaveShift(0) \
1823  AS2( add esi, 16) \
1824  SSE2_MulAdd45\
1825  Mul_Acc##i(i) \
1826  AS2( shr eax, 16) \
1827  AS2( movd xmm0, eax)\
1828  AS2( movd xmm1, [ecx+4])\
1829  AS2( psrld xmm1, 16)\
1830  AS2( pcmpgtd xmm1, xmm0)\
1831  AS2( psrld xmm1, 31)\
1832  AS2( paddd xmm4, xmm1)\
1833 
1834 void SSE2_Square4(word *C, const word *A)
1835 {
1836  Squ_Begin(2)
1837  Squ_Column0(0, 1)
1838  Squ_End(2)
1839 }
1840 
1841 void SSE2_Square8(word *C, const word *A)
1842 {
1843  Squ_Begin(4)
1844 #ifndef __GNUC__
1845  ASJ( jmp, 0, f)
1846  Squ_Acc(2)
1847  AS1( ret) ASL(0)
1848 #endif
1849  Squ_Column0(0, 1)
1850  Squ_Column1(1, 1)
1851  Squ_Column0(2, 2)
1852  Squ_Column1(3, 1)
1853  Squ_Column0(4, 1)
1854  Squ_End(4)
1855 }
1856 
1857 void SSE2_Square16(word *C, const word *A)
1858 {
1859  Squ_Begin(8)
1860 #ifndef __GNUC__
1861  ASJ( jmp, 0, f)
1862  Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1863  AS1( ret) ASL(0)
1864 #endif
1865  Squ_Column0(0, 1)
1866  Squ_Column1(1, 1)
1867  Squ_Column0(2, 2)
1868  Squ_Column1(3, 2)
1869  Squ_Column0(4, 3)
1870  Squ_Column1(5, 3)
1871  Squ_Column0(6, 4)
1872  Squ_Column1(7, 3)
1873  Squ_Column0(8, 3)
1874  Squ_Column1(9, 2)
1875  Squ_Column0(10, 2)
1876  Squ_Column1(11, 1)
1877  Squ_Column0(12, 1)
1878  Squ_End(8)
1879 }
1880 
1881 void SSE2_Square32(word *C, const word *A)
1882 {
1883  Squ_Begin(16)
1884  ASJ( jmp, 0, f)
1885  Squ_Acc(8) Squ_Acc(7) Squ_Acc(6) Squ_Acc(5) Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1886  AS1( ret) ASL(0)
1887  Squ_Column0(0, 1)
1888  Squ_Column1(1, 1)
1889  Squ_Column0(2, 2)
1890  Squ_Column1(3, 2)
1891  Squ_Column0(4, 3)
1892  Squ_Column1(5, 3)
1893  Squ_Column0(6, 4)
1894  Squ_Column1(7, 4)
1895  Squ_Column0(8, 5)
1896  Squ_Column1(9, 5)
1897  Squ_Column0(10, 6)
1898  Squ_Column1(11, 6)
1899  Squ_Column0(12, 7)
1900  Squ_Column1(13, 7)
1901  Squ_Column0(14, 8)
1902  Squ_Column1(15, 7)
1903  Squ_Column0(16, 7)
1904  Squ_Column1(17, 6)
1905  Squ_Column0(18, 6)
1906  Squ_Column1(19, 5)
1907  Squ_Column0(20, 5)
1908  Squ_Column1(21, 4)
1909  Squ_Column0(22, 4)
1910  Squ_Column1(23, 3)
1911  Squ_Column0(24, 3)
1912  Squ_Column1(25, 2)
1913  Squ_Column0(26, 2)
1914  Squ_Column1(27, 1)
1915  Squ_Column0(28, 1)
1916  Squ_End(16)
1917 }
1918 
1919 void SSE2_Multiply4(word *C, const word *A, const word *B)
1920 {
1921  Mul_Begin(2)
1922 #ifndef __GNUC__
1923  ASJ( jmp, 0, f)
1924  Mul_Acc(2)
1925  AS1( ret) ASL(0)
1926 #endif
1927  Mul_Column0(0, 2)
1928  Mul_End(2)
1929 }
1930 
1931 void SSE2_Multiply8(word *C, const word *A, const word *B)
1932 {
1933  Mul_Begin(4)
1934 #ifndef __GNUC__
1935  ASJ( jmp, 0, f)
1936  Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1937  AS1( ret) ASL(0)
1938 #endif
1939  Mul_Column0(0, 2)
1940  Mul_Column1(1, 3)
1941  Mul_Column0(2, 4)
1942  Mul_Column1(3, 3)
1943  Mul_Column0(4, 2)
1944  Mul_End(4)
1945 }
1946 
1947 void SSE2_Multiply16(word *C, const word *A, const word *B)
1948 {
1949  Mul_Begin(8)
1950 #ifndef __GNUC__
1951  ASJ( jmp, 0, f)
1952  Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1953  AS1( ret) ASL(0)
1954 #endif
1955  Mul_Column0(0, 2)
1956  Mul_Column1(1, 3)
1957  Mul_Column0(2, 4)
1958  Mul_Column1(3, 5)
1959  Mul_Column0(4, 6)
1960  Mul_Column1(5, 7)
1961  Mul_Column0(6, 8)
1962  Mul_Column1(7, 7)
1963  Mul_Column0(8, 6)
1964  Mul_Column1(9, 5)
1965  Mul_Column0(10, 4)
1966  Mul_Column1(11, 3)
1967  Mul_Column0(12, 2)
1968  Mul_End(8)
1969 }
1970 
1971 void SSE2_Multiply32(word *C, const word *A, const word *B)
1972 {
1973  Mul_Begin(16)
1974  ASJ( jmp, 0, f)
1975  Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1976  AS1( ret) ASL(0)
1977  Mul_Column0(0, 2)
1978  Mul_Column1(1, 3)
1979  Mul_Column0(2, 4)
1980  Mul_Column1(3, 5)
1981  Mul_Column0(4, 6)
1982  Mul_Column1(5, 7)
1983  Mul_Column0(6, 8)
1984  Mul_Column1(7, 9)
1985  Mul_Column0(8, 10)
1986  Mul_Column1(9, 11)
1987  Mul_Column0(10, 12)
1988  Mul_Column1(11, 13)
1989  Mul_Column0(12, 14)
1990  Mul_Column1(13, 15)
1991  Mul_Column0(14, 16)
1992  Mul_Column1(15, 15)
1993  Mul_Column0(16, 14)
1994  Mul_Column1(17, 13)
1995  Mul_Column0(18, 12)
1996  Mul_Column1(19, 11)
1997  Mul_Column0(20, 10)
1998  Mul_Column1(21, 9)
1999  Mul_Column0(22, 8)
2000  Mul_Column1(23, 7)
2001  Mul_Column0(24, 6)
2002  Mul_Column1(25, 5)
2003  Mul_Column0(26, 4)
2004  Mul_Column1(27, 3)
2005  Mul_Column0(28, 2)
2006  Mul_End(16)
2007 }
2008 
2009 void SSE2_MultiplyBottom4(word *C, const word *A, const word *B)
2010 {
2011  Mul_Begin(2)
2012  Bot_SaveAcc(0) Bot_Acc(2)
2013  Bot_End(2)
2014 }
2015 
2016 void SSE2_MultiplyBottom8(word *C, const word *A, const word *B)
2017 {
2018  Mul_Begin(4)
2019 #ifndef __GNUC__
2020  ASJ( jmp, 0, f)
2021  Mul_Acc(3) Mul_Acc(2)
2022  AS1( ret) ASL(0)
2023 #endif
2024  Mul_Column0(0, 2)
2025  Mul_Column1(1, 3)
2026  Bot_SaveAcc(2) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2027  Bot_End(4)
2028 }
2029 
2030 void SSE2_MultiplyBottom16(word *C, const word *A, const word *B)
2031 {
2032  Mul_Begin(8)
2033 #ifndef __GNUC__
2034  ASJ( jmp, 0, f)
2035  Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2036  AS1( ret) ASL(0)
2037 #endif
2038  Mul_Column0(0, 2)
2039  Mul_Column1(1, 3)
2040  Mul_Column0(2, 4)
2041  Mul_Column1(3, 5)
2042  Mul_Column0(4, 6)
2043  Mul_Column1(5, 7)
2044  Bot_SaveAcc(6) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2045  Bot_End(8)
2046 }
2047 
2048 void SSE2_MultiplyBottom32(word *C, const word *A, const word *B)
2049 {
2050  Mul_Begin(16)
2051 #ifndef __GNUC__
2052  ASJ( jmp, 0, f)
2053  Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2054  AS1( ret) ASL(0)
2055 #endif
2056  Mul_Column0(0, 2)
2057  Mul_Column1(1, 3)
2058  Mul_Column0(2, 4)
2059  Mul_Column1(3, 5)
2060  Mul_Column0(4, 6)
2061  Mul_Column1(5, 7)
2062  Mul_Column0(6, 8)
2063  Mul_Column1(7, 9)
2064  Mul_Column0(8, 10)
2065  Mul_Column1(9, 11)
2066  Mul_Column0(10, 12)
2067  Mul_Column1(11, 13)
2068  Mul_Column0(12, 14)
2069  Mul_Column1(13, 15)
2070  Bot_SaveAcc(14) Bot_Acc(16) Bot_Acc(15) Bot_Acc(14) Bot_Acc(13) Bot_Acc(12) Bot_Acc(11) Bot_Acc(10) Bot_Acc(9) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2071  Bot_End(16)
2072 }
2073 
2074 void SSE2_MultiplyTop8(word *C, const word *A, const word *B, word L)
2075 {
2076  Top_Begin(4)
2077  Top_Acc(3) Top_Acc(2) Top_Acc(1)
2078 #ifndef __GNUC__
2079  ASJ( jmp, 0, f)
2080  Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2081  AS1( ret) ASL(0)
2082 #endif
2083  Top_Column0(4)
2084  Top_Column1(3)
2085  Mul_Column0(0, 2)
2086  Top_End(2)
2087 }
2088 
2089 void SSE2_MultiplyTop16(word *C, const word *A, const word *B, word L)
2090 {
2091  Top_Begin(8)
2092  Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2093 #ifndef __GNUC__
2094  ASJ( jmp, 0, f)
2095  Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2096  AS1( ret) ASL(0)
2097 #endif
2098  Top_Column0(8)
2099  Top_Column1(7)
2100  Mul_Column0(0, 6)
2101  Mul_Column1(1, 5)
2102  Mul_Column0(2, 4)
2103  Mul_Column1(3, 3)
2104  Mul_Column0(4, 2)
2105  Top_End(4)
2106 }
2107 
2108 void SSE2_MultiplyTop32(word *C, const word *A, const word *B, word L)
2109 {
2110  Top_Begin(16)
2111  Top_Acc(15) Top_Acc(14) Top_Acc(13) Top_Acc(12) Top_Acc(11) Top_Acc(10) Top_Acc(9) Top_Acc(8) Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2112 #ifndef __GNUC__
2113  ASJ( jmp, 0, f)
2114  Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2115  AS1( ret) ASL(0)
2116 #endif
2117  Top_Column0(16)
2118  Top_Column1(15)
2119  Mul_Column0(0, 14)
2120  Mul_Column1(1, 13)
2121  Mul_Column0(2, 12)
2122  Mul_Column1(3, 11)
2123  Mul_Column0(4, 10)
2124  Mul_Column1(5, 9)
2125  Mul_Column0(6, 8)
2126  Mul_Column1(7, 7)
2127  Mul_Column0(8, 6)
2128  Mul_Column1(9, 5)
2129  Mul_Column0(10, 4)
2130  Mul_Column1(11, 3)
2131  Mul_Column0(12, 2)
2132  Top_End(8)
2133 }
2134 
2135 #endif // #if CRYPTOPP_INTEGER_SSE2
2136 
2137 // ********************************************************
2138 
2139 typedef int (CRYPTOPP_FASTCALL * PAdd)(size_t N, word *C, const word *A, const word *B);
2140 typedef void (* PMul)(word *C, const word *A, const word *B);
2141 typedef void (* PSqu)(word *C, const word *A);
2142 typedef void (* PMulTop)(word *C, const word *A, const word *B, word L);
2143 
2144 #if CRYPTOPP_INTEGER_SSE2
2145 static PAdd s_pAdd = &Baseline_Add, s_pSub = &Baseline_Sub;
2146 static size_t s_recursionLimit = 8;
2147 #else
2148 static const size_t s_recursionLimit = 16;
2149 #endif // CRYPTOPP_INTEGER_SSE2
2150 
2151 static PMul s_pMul[9], s_pBot[9];
2152 static PSqu s_pSqu[9];
2153 static PMulTop s_pTop[9];
2154 
2155 void SetFunctionPointers()
2156 {
2157  s_pMul[0] = &Baseline_Multiply2;
2158  s_pBot[0] = &Baseline_MultiplyBottom2;
2159  s_pSqu[0] = &Baseline_Square2;
2160  s_pTop[0] = &Baseline_MultiplyTop2;
2161  s_pTop[1] = &Baseline_MultiplyTop4;
2162 
2163 #if CRYPTOPP_INTEGER_SSE2
2164  if (HasSSE2())
2165  {
2166  if (IsP4())
2167  {
2168  s_pAdd = &SSE2_Add;
2169  s_pSub = &SSE2_Sub;
2170  }
2171 
2172  s_recursionLimit = 32;
2173 
2174  s_pMul[1] = &SSE2_Multiply4;
2175  s_pMul[2] = &SSE2_Multiply8;
2176  s_pMul[4] = &SSE2_Multiply16;
2177  s_pMul[8] = &SSE2_Multiply32;
2178 
2179  s_pBot[1] = &SSE2_MultiplyBottom4;
2180  s_pBot[2] = &SSE2_MultiplyBottom8;
2181  s_pBot[4] = &SSE2_MultiplyBottom16;
2182  s_pBot[8] = &SSE2_MultiplyBottom32;
2183 
2184  s_pSqu[1] = &SSE2_Square4;
2185  s_pSqu[2] = &SSE2_Square8;
2186  s_pSqu[4] = &SSE2_Square16;
2187  s_pSqu[8] = &SSE2_Square32;
2188 
2189  s_pTop[2] = &SSE2_MultiplyTop8;
2190  s_pTop[4] = &SSE2_MultiplyTop16;
2191  s_pTop[8] = &SSE2_MultiplyTop32;
2192  }
2193  else
2194 #endif // CRYPTOPP_INTEGER_SSE2
2195  {
2196  s_pMul[1] = &Baseline_Multiply4;
2197  s_pMul[2] = &Baseline_Multiply8;
2198 
2199  s_pBot[1] = &Baseline_MultiplyBottom4;
2200  s_pBot[2] = &Baseline_MultiplyBottom8;
2201 
2202  s_pSqu[1] = &Baseline_Square4;
2203  s_pSqu[2] = &Baseline_Square8;
2204 
2205  s_pTop[2] = &Baseline_MultiplyTop8;
2206 
2207 #if !CRYPTOPP_INTEGER_SSE2
2208  s_pMul[4] = &Baseline_Multiply16;
2209  s_pBot[4] = &Baseline_MultiplyBottom16;
2210  s_pSqu[4] = &Baseline_Square16;
2211  s_pTop[4] = &Baseline_MultiplyTop16;
2212 #endif // !CRYPTOPP_INTEGER_SSE2
2213  }
2214 }
2215 
2216 inline int Add(word *C, const word *A, const word *B, size_t N)
2217 {
2218 #if CRYPTOPP_INTEGER_SSE2
2219  return s_pAdd(N, C, A, B);
2220 #else
2221  return Baseline_Add(N, C, A, B);
2222 #endif // CRYPTOPP_INTEGER_SSE2
2223 }
2224 
2225 inline int Subtract(word *C, const word *A, const word *B, size_t N)
2226 {
2227 #if CRYPTOPP_INTEGER_SSE2
2228  return s_pSub(N, C, A, B);
2229 #else
2230  return Baseline_Sub(N, C, A, B);
2231 #endif // CRYPTOPP_INTEGER_SSE2
2232 }
2233 
2234 // ********************************************************
2235 
2236 
2237 #define A0 A
2238 #define A1 (A+N2)
2239 #define B0 B
2240 #define B1 (B+N2)
2241 
2242 #define T0 T
2243 #define T1 (T+N2)
2244 #define T2 (T+N)
2245 #define T3 (T+N+N2)
2246 
2247 #define R0 R
2248 #define R1 (R+N2)
2249 #define R2 (R+N)
2250 #define R3 (R+N+N2)
2251 
2252 // R[2*N] - result = A*B
2253 // T[2*N] - temporary work space
2254 // A[N] --- multiplier
2255 // B[N] --- multiplicant
2256 
2257 void RecursiveMultiply(word *R, word *T, const word *A, const word *B, size_t N)
2258 {
2259  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2260 
2261  if (N <= s_recursionLimit)
2262  s_pMul[N/4](R, A, B);
2263  else
2264  {
2265  const size_t N2 = N/2;
2266 
2267  size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2268  Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2269 
2270  size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2271  Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2272 
2273  RecursiveMultiply(R2, T2, A1, B1, N2);
2274  RecursiveMultiply(T0, T2, R0, R1, N2);
2275  RecursiveMultiply(R0, T2, A0, B0, N2);
2276 
2277  // now T[01] holds (A1-A0)*(B0-B1), R[01] holds A0*B0, R[23] holds A1*B1
2278 
2279  int c2 = Add(R2, R2, R1, N2);
2280  int c3 = c2;
2281  c2 += Add(R1, R2, R0, N2);
2282  c3 += Add(R2, R2, R3, N2);
2283 
2284  if (AN2 == BN2)
2285  c3 -= Subtract(R1, R1, T0, N);
2286  else
2287  c3 += Add(R1, R1, T0, N);
2288 
2289  c3 += Increment(R2, N2, c2);
2290  CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2291  Increment(R3, N2, c3);
2292  }
2293 }
2294 
2295 // R[2*N] - result = A*A
2296 // T[2*N] - temporary work space
2297 // A[N] --- number to be squared
2298 
2299 void RecursiveSquare(word *R, word *T, const word *A, size_t N)
2300 {
2301  CRYPTOPP_ASSERT(N && N%2==0);
2302 
2303  if (N <= s_recursionLimit)
2304  s_pSqu[N/4](R, A);
2305  else
2306  {
2307  const size_t N2 = N/2;
2308 
2309  RecursiveSquare(R0, T2, A0, N2);
2310  RecursiveSquare(R2, T2, A1, N2);
2311  RecursiveMultiply(T0, T2, A0, A1, N2);
2312 
2313  int carry = Add(R1, R1, T0, N);
2314  carry += Add(R1, R1, T0, N);
2315  Increment(R3, N2, carry);
2316  }
2317 }
2318 
2319 // R[N] - bottom half of A*B
2320 // T[3*N/2] - temporary work space
2321 // A[N] - multiplier
2322 // B[N] - multiplicant
2323 
2324 void RecursiveMultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2325 {
2326  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2327 
2328  if (N <= s_recursionLimit)
2329  s_pBot[N/4](R, A, B);
2330  else
2331  {
2332  const size_t N2 = N/2;
2333 
2334  RecursiveMultiply(R, T, A0, B0, N2);
2335  RecursiveMultiplyBottom(T0, T1, A1, B0, N2);
2336  Add(R1, R1, T0, N2);
2337  RecursiveMultiplyBottom(T0, T1, A0, B1, N2);
2338  Add(R1, R1, T0, N2);
2339  }
2340 }
2341 
2342 // R[N] --- upper half of A*B
2343 // T[2*N] - temporary work space
2344 // L[N] --- lower half of A*B
2345 // A[N] --- multiplier
2346 // B[N] --- multiplicant
2347 
2348 void MultiplyTop(word *R, word *T, const word *L, const word *A, const word *B, size_t N)
2349 {
2350  CRYPTOPP_ASSERT(N>=2 && N%2==0);
2351 
2352  if (N <= s_recursionLimit)
2353  s_pTop[N/4](R, A, B, L[N-1]);
2354  else
2355  {
2356  const size_t N2 = N/2;
2357 
2358  size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2359  Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2360 
2361  size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2362  Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2363 
2364  RecursiveMultiply(T0, T2, R0, R1, N2);
2365  RecursiveMultiply(R0, T2, A1, B1, N2);
2366 
2367  // now T[01] holds (A1-A0)*(B0-B1) = A1*B0+A0*B1-A1*B1-A0*B0, R[01] holds A1*B1
2368 
2369  int t, c3;
2370  int c2 = Subtract(T2, L+N2, L, N2);
2371 
2372  if (AN2 == BN2)
2373  {
2374  c2 -= Add(T2, T2, T0, N2);
2375  t = (Compare(T2, R0, N2) == -1);
2376  c3 = t - Subtract(T2, T2, T1, N2);
2377  }
2378  else
2379  {
2380  c2 += Subtract(T2, T2, T0, N2);
2381  t = (Compare(T2, R0, N2) == -1);
2382  c3 = t + Add(T2, T2, T1, N2);
2383  }
2384 
2385  c2 += t;
2386  if (c2 >= 0)
2387  c3 += Increment(T2, N2, c2);
2388  else
2389  c3 -= Decrement(T2, N2, -c2);
2390  c3 += Add(R0, T2, R1, N2);
2391 
2392  CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2393  Increment(R1, N2, c3);
2394  }
2395 }
2396 
2397 inline void Multiply(word *R, word *T, const word *A, const word *B, size_t N)
2398 {
2399  RecursiveMultiply(R, T, A, B, N);
2400 }
2401 
2402 inline void Square(word *R, word *T, const word *A, size_t N)
2403 {
2404  RecursiveSquare(R, T, A, N);
2405 }
2406 
2407 inline void MultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2408 {
2409  RecursiveMultiplyBottom(R, T, A, B, N);
2410 }
2411 
2412 // R[NA+NB] - result = A*B
2413 // T[NA+NB] - temporary work space
2414 // A[NA] ---- multiplier
2415 // B[NB] ---- multiplicant
2416 
2417 void AsymmetricMultiply(word *R, word *T, const word *A, size_t NA, const word *B, size_t NB)
2418 {
2419  if (NA == NB)
2420  {
2421  // Profiling tells us the original second case was dominant, so it was promoted to the first If statement.
2422  // The code change occurred at Commit dc99266599a0e72d.
2423  if (A != B)
2424  Multiply(R, T, A, B, NA);
2425  else
2426  Square(R, T, A, NA);
2427 
2428  return;
2429  }
2430 
2431  if (NA > NB)
2432  {
2433  std::swap(A, B);
2434  std::swap(NA, NB);
2435  }
2436 
2437  CRYPTOPP_ASSERT(NB % NA == 0);
2438 
2439  if (NA==2 && !A[1])
2440  {
2441  // Profiling tells us the original Default case was dominant, so it was
2442  // promoted to the first Case statement. The code change occurred at
2443  // Commit dc99266599a0e72d.
2444  switch (A[0])
2445  {
2446  default:
2447  R[NB] = LinearMultiply(R, B, A[0], NB);
2448  R[NB+1] = 0;
2449  return;
2450  case 0:
2451  SetWords(R, 0, NB+2);
2452  return;
2453  case 1:
2454  CopyWords(R, B, NB);
2455  R[NB] = R[NB+1] = 0;
2456  return;
2457  }
2458  }
2459 
2460  size_t i;
2461  if ((NB/NA)%2 == 0)
2462  {
2463  Multiply(R, T, A, B, NA);
2464  CopyWords(T+2*NA, R+NA, NA);
2465 
2466  for (i=2*NA; i<NB; i+=2*NA)
2467  Multiply(T+NA+i, T, A, B+i, NA);
2468  for (i=NA; i<NB; i+=2*NA)
2469  Multiply(R+i, T, A, B+i, NA);
2470  }
2471  else
2472  {
2473  for (i=0; i<NB; i+=2*NA)
2474  Multiply(R+i, T, A, B+i, NA);
2475  for (i=NA; i<NB; i+=2*NA)
2476  Multiply(T+NA+i, T, A, B+i, NA);
2477  }
2478 
2479  if (Add(R+NA, R+NA, T+2*NA, NB-NA))
2480  Increment(R+NB, NA);
2481 }
2482 
2483 // R[N] ----- result = A inverse mod 2**(WORD_BITS*N)
2484 // T[3*N/2] - temporary work space
2485 // A[N] ----- an odd number as input
2486 
2487 void RecursiveInverseModPower2(word *R, word *T, const word *A, size_t N)
2488 {
2489  // Profiling tells us the original Else was dominant, so it was
2490  // promoted to the first If statement. The code change occurred
2491  // at Commit dc99266599a0e72d.
2492  if (N!=2)
2493  {
2494  const size_t N2 = N/2;
2495  RecursiveInverseModPower2(R0, T0, A0, N2);
2496  T0[0] = 1;
2497  SetWords(T0+1, 0, N2-1);
2498  MultiplyTop(R1, T1, T0, R0, A0, N2);
2499  MultiplyBottom(T0, T1, R0, A1, N2);
2500  Add(T0, R1, T0, N2);
2501  TwosComplement(T0, N2);
2502  MultiplyBottom(R1, T1, R0, T0, N2);
2503  }
2504  else
2505  {
2506  T[0] = AtomicInverseModPower2(A[0]);
2507  T[1] = 0;
2508  s_pBot[0](T+2, T, A);
2509  TwosComplement(T+2, 2);
2510  Increment(T+2, 2, 2);
2511  s_pBot[0](R, T, T+2);
2512  }
2513 }
2514 
2515 // R[N] --- result = X/(2**(WORD_BITS*N)) mod M
2516 // T[3*N] - temporary work space
2517 // X[2*N] - number to be reduced
2518 // M[N] --- modulus
2519 // U[N] --- multiplicative inverse of M mod 2**(WORD_BITS*N)
2520 
2521 void MontgomeryReduce(word *R, word *T, word *X, const word *M, const word *U, size_t N)
2522 {
2523 #if 1
2524  MultiplyBottom(R, T, X, U, N);
2525  MultiplyTop(T, T+N, X, R, M, N);
2526  word borrow = Subtract(T, X+N, T, N);
2527  // defend against timing attack by doing this Add even when not needed
2528  word carry = Add(T+N, T, M, N);
2529  CRYPTOPP_ASSERT(carry | !borrow);
2530  CRYPTOPP_UNUSED(carry), CRYPTOPP_UNUSED(borrow);
2531  CopyWords(R, T + ((0-borrow) & N), N);
2532 #elif 0
2533  const word u = 0-U[0];
2534  Declare2Words(p)
2535  for (size_t i=0; i<N; i++)
2536  {
2537  const word t = u * X[i];
2538  word c = 0;
2539  for (size_t j=0; j<N; j+=2)
2540  {
2541  MultiplyWords(p, t, M[j]);
2542  Acc2WordsBy1(p, X[i+j]);
2543  Acc2WordsBy1(p, c);
2544  X[i+j] = LowWord(p);
2545  c = HighWord(p);
2546  MultiplyWords(p, t, M[j+1]);
2547  Acc2WordsBy1(p, X[i+j+1]);
2548  Acc2WordsBy1(p, c);
2549  X[i+j+1] = LowWord(p);
2550  c = HighWord(p);
2551  }
2552 
2553  if (Increment(X+N+i, N-i, c))
2554  while (!Subtract(X+N, X+N, M, N)) {}
2555  }
2556 
2557  memcpy(R, X+N, N*WORD_SIZE);
2558 #else
2559  __m64 u = _mm_cvtsi32_si64(0-U[0]), p;
2560  for (size_t i=0; i<N; i++)
2561  {
2562  __m64 t = _mm_cvtsi32_si64(X[i]);
2563  t = _mm_mul_su32(t, u);
2564  __m64 c = _mm_setzero_si64();
2565  for (size_t j=0; j<N; j+=2)
2566  {
2567  p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j]));
2568  p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j]));
2569  c = _mm_add_si64(c, p);
2570  X[i+j] = _mm_cvtsi64_si32(c);
2571  c = _mm_srli_si64(c, 32);
2572  p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j+1]));
2573  p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j+1]));
2574  c = _mm_add_si64(c, p);
2575  X[i+j+1] = _mm_cvtsi64_si32(c);
2576  c = _mm_srli_si64(c, 32);
2577  }
2578 
2579  if (Increment(X+N+i, N-i, _mm_cvtsi64_si32(c)))
2580  while (!Subtract(X+N, X+N, M, N)) {}
2581  }
2582 
2583  memcpy(R, X+N, N*WORD_SIZE);
2584  _mm_empty();
2585 #endif
2586 }
2587 
2588 // R[N] --- result = X/(2**(WORD_BITS*N/2)) mod M
2589 // T[2*N] - temporary work space
2590 // X[2*N] - number to be reduced
2591 // M[N] --- modulus
2592 // U[N/2] - multiplicative inverse of M mod 2**(WORD_BITS*N/2)
2593 // V[N] --- 2**(WORD_BITS*3*N/2) mod M
2594 
2595 void HalfMontgomeryReduce(word *R, word *T, const word *X, const word *M, const word *U, const word *V, size_t N)
2596 {
2597  CRYPTOPP_ASSERT(N%2==0 && N>=4);
2598 
2599 #define M0 M
2600 #define M1 (M+N2)
2601 #define V0 V
2602 #define V1 (V+N2)
2603 
2604 #define X0 X
2605 #define X1 (X+N2)
2606 #define X2 (X+N)
2607 #define X3 (X+N+N2)
2608 
2609  const size_t N2 = N/2;
2610  Multiply(T0, T2, V0, X3, N2);
2611  int c2 = Add(T0, T0, X0, N);
2612  MultiplyBottom(T3, T2, T0, U, N2);
2613  MultiplyTop(T2, R, T0, T3, M0, N2);
2614  c2 -= Subtract(T2, T1, T2, N2);
2615  Multiply(T0, R, T3, M1, N2);
2616  c2 -= Subtract(T0, T2, T0, N2);
2617  int c3 = -(int)Subtract(T1, X2, T1, N2);
2618  Multiply(R0, T2, V1, X3, N2);
2619  c3 += Add(R, R, T, N);
2620 
2621  if (c2>0)
2622  c3 += Increment(R1, N2);
2623  else if (c2<0)
2624  c3 -= Decrement(R1, N2, -c2);
2625 
2626  CRYPTOPP_ASSERT(c3>=-1 && c3<=1);
2627  if (c3>0)
2628  Subtract(R, R, M, N);
2629  else if (c3<0)
2630  Add(R, R, M, N);
2631 
2632 #undef M0
2633 #undef M1
2634 #undef V0
2635 #undef V1
2636 
2637 #undef X0
2638 #undef X1
2639 #undef X2
2640 #undef X3
2641 }
2642 
2643 #undef A0
2644 #undef A1
2645 #undef B0
2646 #undef B1
2647 
2648 #undef T0
2649 #undef T1
2650 #undef T2
2651 #undef T3
2652 
2653 #undef R0
2654 #undef R1
2655 #undef R2
2656 #undef R3
2657 
2658 /*
2659 // do a 3 word by 2 word divide, returns quotient and leaves remainder in A
2660 static word SubatomicDivide(word *A, word B0, word B1)
2661 {
2662  // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a word
2663  CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
2664 
2665  // estimate the quotient: do a 2 word by 1 word divide
2666  word Q;
2667  if (B1+1 == 0)
2668  Q = A[2];
2669  else
2670  Q = DWord(A[1], A[2]).DividedBy(B1+1);
2671 
2672  // now subtract Q*B from A
2673  DWord p = DWord::Multiply(B0, Q);
2674  DWord u = (DWord) A[0] - p.GetLowHalf();
2675  A[0] = u.GetLowHalf();
2676  u = (DWord) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - DWord::Multiply(B1, Q);
2677  A[1] = u.GetLowHalf();
2678  A[2] += u.GetHighHalf();
2679 
2680  // Q <= actual quotient, so fix it
2681  while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
2682  {
2683  u = (DWord) A[0] - B0;
2684  A[0] = u.GetLowHalf();
2685  u = (DWord) A[1] - B1 - u.GetHighHalfAsBorrow();
2686  A[1] = u.GetLowHalf();
2687  A[2] += u.GetHighHalf();
2688  Q++;
2689  CRYPTOPP_ASSERT(Q); // shouldn't overflow
2690  }
2691 
2692  return Q;
2693 }
2694 
2695 // do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
2696 static inline void AtomicDivide(word *Q, const word *A, const word *B)
2697 {
2698  if (!B[0] && !B[1]) // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
2699  {
2700  Q[0] = A[2];
2701  Q[1] = A[3];
2702  }
2703  else
2704  {
2705  word T[4];
2706  T[0] = A[0]; T[1] = A[1]; T[2] = A[2]; T[3] = A[3];
2707  Q[1] = SubatomicDivide(T+1, B[0], B[1]);
2708  Q[0] = SubatomicDivide(T, B[0], B[1]);
2709 
2710 #if defined(CRYPTOPP_DEBUG)
2711  // multiply quotient and divisor and add remainder, make sure it equals dividend
2712  CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2713  word P[4];
2714  LowLevel::Multiply2(P, Q, B);
2715  Add(P, P, T, 4);
2716  CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2717 #endif
2718  }
2719 }
2720 */
2721 
2722 static inline void AtomicDivide(word *Q, const word *A, const word *B)
2723 {
2724  word T[4];
2725  DWord q = DivideFourWordsByTwo<word, DWord>(T, DWord(A[0], A[1]), DWord(A[2], A[3]), DWord(B[0], B[1]));
2726  Q[0] = q.GetLowHalf();
2727  Q[1] = q.GetHighHalf();
2728 
2729 #if defined(CRYPTOPP_DEBUG)
2730  if (B[0] || B[1])
2731  {
2732  // multiply quotient and divisor and add remainder, make sure it equals dividend
2733  CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2734  word P[4];
2735  s_pMul[0](P, Q, B);
2736  Add(P, P, T, 4);
2737  CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2738  }
2739 #endif
2740 }
2741 
2742 // for use by Divide(), corrects the underestimated quotient {Q1,Q0}
2743 static void CorrectQuotientEstimate(word *R, word *T, word *Q, const word *B, size_t N)
2744 {
2745  CRYPTOPP_ASSERT(N && N%2==0);
2746 
2747  AsymmetricMultiply(T, T+N+2, Q, 2, B, N);
2748 
2749  word borrow = Subtract(R, R, T, N+2);
2750  CRYPTOPP_ASSERT(!borrow && !R[N+1]);
2751  CRYPTOPP_UNUSED(borrow);
2752 
2753  while (R[N] || Compare(R, B, N) >= 0)
2754  {
2755  R[N] -= Subtract(R, R, B, N);
2756  Q[1] += (++Q[0]==0);
2757  CRYPTOPP_ASSERT(Q[0] || Q[1]); // no overflow
2758  }
2759 }
2760 
2761 // R[NB] -------- remainder = A%B
2762 // Q[NA-NB+2] --- quotient = A/B
2763 // T[NA+3*(NB+2)] - temp work space
2764 // A[NA] -------- dividend
2765 // B[NB] -------- divisor
2766 
2767 void Divide(word *R, word *Q, word *T, const word *A, size_t NA, const word *B, size_t NB)
2768 {
2769  CRYPTOPP_ASSERT(NA && NB && NA%2==0 && NB%2==0);
2770  CRYPTOPP_ASSERT(B[NB-1] || B[NB-2]);
2771  CRYPTOPP_ASSERT(NB <= NA);
2772 
2773  // set up temporary work space
2774  word *const TA=T;
2775  word *const TB=T+NA+2;
2776  word *const TP=T+NA+2+NB;
2777 
2778  // copy B into TB and normalize it so that TB has highest bit set to 1
2779  unsigned shiftWords = (B[NB-1]==0);
2780  TB[0] = TB[NB-1] = 0;
2781  CopyWords(TB+shiftWords, B, NB-shiftWords);
2782  unsigned shiftBits = WORD_BITS - BitPrecision(TB[NB-1]);
2783  CRYPTOPP_ASSERT(shiftBits < WORD_BITS);
2784  ShiftWordsLeftByBits(TB, NB, shiftBits);
2785 
2786  // copy A into TA and normalize it
2787  TA[0] = TA[NA] = TA[NA+1] = 0;
2788  CopyWords(TA+shiftWords, A, NA);
2789  ShiftWordsLeftByBits(TA, NA+2, shiftBits);
2790 
2791  if (TA[NA+1]==0 && TA[NA] <= 1)
2792  {
2793  Q[NA-NB+1] = Q[NA-NB] = 0;
2794  while (TA[NA] || Compare(TA+NA-NB, TB, NB) >= 0)
2795  {
2796  TA[NA] -= Subtract(TA+NA-NB, TA+NA-NB, TB, NB);
2797  ++Q[NA-NB];
2798  }
2799  }
2800  else
2801  {
2802  NA+=2;
2803  CRYPTOPP_ASSERT(Compare(TA+NA-NB, TB, NB) < 0);
2804  }
2805 
2806  word BT[2];
2807  BT[0] = TB[NB-2] + 1;
2808  BT[1] = TB[NB-1] + (BT[0]==0);
2809 
2810  // start reducing TA mod TB, 2 words at a time
2811  for (size_t i=NA-2; i>=NB; i-=2)
2812  {
2813  AtomicDivide(Q+i-NB, TA+i-2, BT);
2814  CorrectQuotientEstimate(TA+i-NB, TP, Q+i-NB, TB, NB);
2815  }
2816 
2817  // copy TA into R, and denormalize it
2818  CopyWords(R, TA+shiftWords, NB);
2819  ShiftWordsRightByBits(R, NB, shiftBits);
2820 }
2821 
2822 static inline size_t EvenWordCount(const word *X, size_t N)
2823 {
2824  while (N && X[N-2]==0 && X[N-1]==0)
2825  N-=2;
2826  return N;
2827 }
2828 
2829 // return k
2830 // R[N] --- result = A^(-1) * 2^k mod M
2831 // T[4*N] - temporary work space
2832 // A[NA] -- number to take inverse of
2833 // M[N] --- modulus
2834 
2835 unsigned int AlmostInverse(word *R, word *T, const word *A, size_t NA, const word *M, size_t N)
2836 {
2837  CRYPTOPP_ASSERT(NA<=N && N && N%2==0);
2838 
2839  word *b = T;
2840  word *c = T+N;
2841  word *f = T+2*N;
2842  word *g = T+3*N;
2843  size_t bcLen=2, fgLen=EvenWordCount(M, N);
2844  unsigned int k=0;
2845  bool s=false;
2846 
2847  SetWords(T, 0, 3*N);
2848  b[0]=1;
2849  CopyWords(f, A, NA);
2850  CopyWords(g, M, N);
2851 
2852  while (1)
2853  {
2854  word t=f[0];
2855  while (!t)
2856  {
2857  if (EvenWordCount(f, fgLen)==0)
2858  {
2859  SetWords(R, 0, N);
2860  return 0;
2861  }
2862 
2863  ShiftWordsRightByWords(f, fgLen, 1);
2864  bcLen += 2 * (c[bcLen-1] != 0);
2865  CRYPTOPP_ASSERT(bcLen <= N);
2866  ShiftWordsLeftByWords(c, bcLen, 1);
2867  k+=WORD_BITS;
2868  t=f[0];
2869  }
2870 
2871  // t must be non-0; otherwise undefined.
2872  unsigned int i = TrailingZeros(t);
2873  t >>= i;
2874  k += i;
2875 
2876  if (t==1 && f[1]==0 && EvenWordCount(f+2, fgLen-2)==0)
2877  {
2878  if (s)
2879  Subtract(R, M, b, N);
2880  else
2881  CopyWords(R, b, N);
2882  return k;
2883  }
2884 
2885  ShiftWordsRightByBits(f, fgLen, i);
2886  t = ShiftWordsLeftByBits(c, bcLen, i);
2887  c[bcLen] += t;
2888  bcLen += 2 * (t!=0);
2889  CRYPTOPP_ASSERT(bcLen <= N);
2890 
2891  bool swap = Compare(f, g, fgLen)==-1;
2892  ConditionalSwapPointers(swap, f, g);
2893  ConditionalSwapPointers(swap, b, c);
2894  s ^= swap;
2895 
2896  fgLen -= 2 * !(f[fgLen-2] | f[fgLen-1]);
2897 
2898  Subtract(f, f, g, fgLen);
2899  t = Add(b, b, c, bcLen);
2900  b[bcLen] += t;
2901  bcLen += 2*t;
2902  CRYPTOPP_ASSERT(bcLen <= N);
2903  }
2904 }
2905 
2906 // R[N] - result = A/(2^k) mod M
2907 // A[N] - input
2908 // M[N] - modulus
2909 
2910 void DivideByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2911 {
2912  CopyWords(R, A, N);
2913 
2914  while (k--)
2915  {
2916  if (R[0]%2==0)
2917  ShiftWordsRightByBits(R, N, 1);
2918  else
2919  {
2920  word carry = Add(R, R, M, N);
2921  ShiftWordsRightByBits(R, N, 1);
2922  R[N-1] += carry<<(WORD_BITS-1);
2923  }
2924  }
2925 }
2926 
2927 // R[N] - result = A*(2^k) mod M
2928 // A[N] - input
2929 // M[N] - modulus
2930 
2931 void MultiplyByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2932 {
2933  CopyWords(R, A, N);
2934 
2935  while (k--)
2936  if (ShiftWordsLeftByBits(R, N, 1) || Compare(R, M, N)>=0)
2937  Subtract(R, R, M, N);
2938 }
2939 
2940 // ******************************************************************
2941 
2942 static const unsigned int RoundupSizeTable[] = {2, 2, 2, 4, 4, 8, 8, 8, 8};
2943 
2944 static inline size_t RoundupSize(size_t n)
2945 {
2946  if (n<=8)
2947  return RoundupSizeTable[n];
2948  else if (n<=16)
2949  return 16;
2950  else if (n<=32)
2951  return 32;
2952  else if (n<=64)
2953  return 64;
2954  else
2955  return size_t(1) << BitPrecision(n-1);
2956 }
2957 
2959  : reg(2), sign(POSITIVE)
2960 {
2961  reg[0] = reg[1] = 0;
2962 }
2963 
2965  : reg(RoundupSize(t.WordCount())), sign(t.sign)
2966 {
2967  CopyWords(reg, t.reg, reg.size());
2968 }
2969 
2970 Integer::Integer(Sign s, lword value)
2971  : reg(2), sign(s)
2972 {
2973  reg[0] = word(value);
2974  reg[1] = word(SafeRightShift<WORD_BITS>(value));
2975 }
2976 
2977 Integer::Integer(signed long value)
2978  : reg(2)
2979 {
2980  if (value >= 0)
2981  sign = POSITIVE;
2982  else
2983  {
2984  sign = NEGATIVE;
2985  value = -value;
2986  }
2987  reg[0] = word(value);
2988  reg[1] = word(SafeRightShift<WORD_BITS>((unsigned long)value));
2989 }
2990 
2991 Integer::Integer(Sign s, word high, word low)
2992  : reg(2), sign(s)
2993 {
2994  reg[0] = low;
2995  reg[1] = high;
2996 }
2997 
2999 {
3000  if (ByteCount() > sizeof(long))
3001  return false;
3002 
3003  unsigned long value = (unsigned long)reg[0];
3004  value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3005 
3006  if (sign==POSITIVE)
3007  return (signed long)value >= 0;
3008  else
3009  return -(signed long)value < 0;
3010 }
3011 
3012 signed long Integer::ConvertToLong() const
3013 {
3015 
3016  unsigned long value = (unsigned long)reg[0];
3017  value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3018  return sign==POSITIVE ? value : -(signed long)value;
3019 }
3020 
3021 Integer::Integer(BufferedTransformation &encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3022 {
3024 
3025  if (o != LITTLE_ENDIAN_ORDER)
3026  {
3027  Decode(encodedInteger, byteCount, s);
3028  }
3029  else
3030  {
3031  SecByteBlock block(byteCount);
3032  encodedInteger.Get(block, block.size());
3033  std::reverse(block.begin(), block.begin()+block.size());
3034 
3035  Decode(block.begin(), block.size(), s);
3036  }
3037 }
3038 
3039 Integer::Integer(const byte *encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3040 {
3041  CRYPTOPP_ASSERT(encodedInteger && byteCount); // NULL buffer
3043 
3044  if (o != LITTLE_ENDIAN_ORDER)
3045  {
3046  Decode(encodedInteger, byteCount, s);
3047  }
3048  else
3049  {
3050  SecByteBlock block(byteCount);
3051 #if (_MSC_VER >= 1500)
3052  std::reverse_copy(encodedInteger, encodedInteger+byteCount,
3053  stdext::make_checked_array_iterator(block.begin(), block.size()));
3054 #else
3055  std::reverse_copy(encodedInteger, encodedInteger+byteCount, block.begin());
3056 #endif
3057  Decode(block.begin(), block.size(), s);
3058  return;
3059  }
3060 }
3061 
3063 {
3064  // Make explicit call to avoid virtual-dispatch findings in ctor
3065  Integer::BERDecode(bt);
3066 }
3067 
3069 {
3070  Randomize(rng, bitcount);
3071 }
3072 
3073 Integer::Integer(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3074 {
3075  if (!Randomize(rng, min, max, rnType, equiv, mod))
3077 }
3078 
3080 {
3081  Integer r((word)0, BitsToWords(e+1));
3082  r.SetBit(e);
3083  return r;
3084 }
3085 
3087 {
3088  return IsNegative() ? false : (reg[0]==0 && WordCount()==0);
3089 }
3090 
3092 {
3093  if (this != &t)
3094  {
3095  if (reg.size() != t.reg.size() || t.reg[t.reg.size()/2] == 0)
3096  reg.New(RoundupSize(t.WordCount()));
3097  CopyWords(reg, t.reg, reg.size());
3098  sign = t.sign;
3099  }
3100  return *this;
3101 }
3102 
3103 bool Integer::GetBit(size_t n) const
3104 {
3105  // Profiling tells us the original Else was dominant, so it was
3106  // promoted to the first If statement. The code change occurred
3107  // at Commit dc99266599a0e72d.
3108  if (n/WORD_BITS < reg.size())
3109  return bool((reg[n/WORD_BITS] >> (n % WORD_BITS)) & 1);
3110  else
3111  return 0;
3112 }
3113 
3114 void Integer::SetBit(size_t n, bool value)
3115 {
3116  if (value)
3117  {
3118  reg.CleanGrow(RoundupSize(BitsToWords(n+1)));
3119  reg[n/WORD_BITS] |= (word(1) << (n%WORD_BITS));
3120  }
3121  else
3122  {
3123  if (n/WORD_BITS < reg.size())
3124  reg[n/WORD_BITS] &= ~(word(1) << (n%WORD_BITS));
3125  }
3126 }
3127 
3128 byte Integer::GetByte(size_t n) const
3129 {
3130  // Profiling tells us the original Else was dominant, so it was
3131  // promoted to the first If statement. The code change occurred
3132  // at Commit dc99266599a0e72d.
3133  if (n/WORD_SIZE < reg.size())
3134  return byte(reg[n/WORD_SIZE] >> ((n%WORD_SIZE)*8));
3135  else
3136  return 0;
3137 }
3138 
3139 void Integer::SetByte(size_t n, byte value)
3140 {
3141  reg.CleanGrow(RoundupSize(BytesToWords(n+1)));
3142  reg[n/WORD_SIZE] &= ~(word(0xff) << 8*(n%WORD_SIZE));
3143  reg[n/WORD_SIZE] |= (word(value) << 8*(n%WORD_SIZE));
3144 }
3145 
3146 lword Integer::GetBits(size_t i, size_t n) const
3147 {
3148  lword v = 0;
3149  CRYPTOPP_ASSERT(n <= sizeof(v)*8);
3150  for (unsigned int j=0; j<n; j++)
3151  v |= lword(GetBit(i+j)) << j;
3152  return v;
3153 }
3154 
3156 {
3157  Integer result(*this);
3158  result.Negate();
3159  return result;
3160 }
3161 
3163 {
3164  Integer result(*this);
3165  result.sign = POSITIVE;
3166  return result;
3167 }
3168 
3170 {
3171  reg.swap(a.reg);
3172  std::swap(sign, a.sign);
3173 }
3174 
3175 Integer::Integer(word value, size_t length)
3176  : reg(RoundupSize(length)), sign(POSITIVE)
3177 {
3178  reg[0] = value;
3179  SetWords(reg+1, 0, reg.size()-1);
3180 }
3181 
3182 template <class T>
3183 static Integer StringToInteger(const T *str, ByteOrder order)
3184 {
3185  CRYPTOPP_ASSERT( order == BIG_ENDIAN_ORDER || order == LITTLE_ENDIAN_ORDER );
3186 
3187  int radix, sign = 1;
3188  // GCC workaround
3189  // std::char_traits<wchar_t>::length() not defined in GCC 3.2 and STLport 4.5.3
3190  unsigned int length;
3191  for (length = 0; str[length] != 0; length++) {}
3192 
3193  Integer v;
3194 
3195  if (length == 0)
3196  return Integer::Zero();
3197 
3198  switch (str[length-1])
3199  {
3200  case 'h':
3201  case 'H':
3202  radix=16;
3203  break;
3204  case 'o':
3205  case 'O':
3206  radix=8;
3207  break;
3208  case 'b':
3209  case 'B':
3210  radix=2;
3211  break;
3212  default:
3213  radix=10;
3214  }
3215 
3216  // 'str' is of length 1 or more
3217  if (str[0] == '-')
3218  {
3219  sign = -1;
3220  str += 1, length -= 1;
3221  }
3222 
3223  if (length > 2 && str[0] == '0' && (str[1] == 'x' || str[1] == 'X'))
3224  {
3225  radix = 16;
3226  str += 2, length -= 2;
3227  }
3228 
3229  if (order == BIG_ENDIAN_ORDER)
3230  {
3231  for (unsigned int i=0; i<length; i++)
3232  {
3233  int digit, ch = static_cast<int>(str[i]);
3234 
3235  // Profiling showd the second and third Else needed to be swapped
3236  // The code change occurred at Commit dc99266599a0e72d.
3237  if (ch >= '0' && ch <= '9')
3238  digit = ch - '0';
3239  else if (ch >= 'a' && ch <= 'f')
3240  digit = ch - 'a' + 10;
3241  else if (ch >= 'A' && ch <= 'F')
3242  digit = ch - 'A' + 10;
3243  else
3244  digit = radix;
3245 
3246  if (digit < radix)
3247  {
3248  v *= radix;
3249  v += digit;
3250  }
3251  }
3252  }
3253  else if (radix == 16 && order == LITTLE_ENDIAN_ORDER)
3254  {
3255  // Nibble high, low and count
3256  unsigned int nh = 0, nl = 0, nc = 0;
3257  Integer position(Integer::One());
3258 
3259  for (unsigned int i=0; i<length; i++)
3260  {
3261  int digit, ch = static_cast<int>(str[i]);
3262 
3263  if (ch >= '0' && ch <= '9')
3264  digit = ch - '0';
3265  else if (ch >= 'a' && ch <= 'f')
3266  digit = ch - 'a' + 10;
3267  else if (ch >= 'A' && ch <= 'F')
3268  digit = ch - 'A' + 10;
3269  else
3270  digit = radix;
3271 
3272  if (digit < radix)
3273  {
3274  if (nc++ == 0)
3275  nh = digit;
3276  else
3277  nl = digit;
3278 
3279  if (nc == 2)
3280  {
3281  v += position * (nh << 4 | nl);
3282  nc = 0, position <<= 8;
3283  }
3284  }
3285  }
3286 
3287  if (nc == 1)
3288  v += nh * position;
3289  }
3290  else // LITTLE_ENDIAN_ORDER && radix != 16
3291  {
3292  for (int i=static_cast<int>(length)-1; i>=0; i--)
3293  {
3294  int digit, ch = static_cast<int>(str[i]);
3295 
3296  if (ch >= '0' && ch <= '9')
3297  digit = ch - '0';
3298  else if (ch >= 'a' && ch <= 'f')
3299  digit = ch - 'a' + 10;
3300  else if (ch >= 'A' && ch <= 'F')
3301  digit = ch - 'A' + 10;
3302  else
3303  digit = radix;
3304 
3305  if (digit < radix)
3306  {
3307  v *= radix;
3308  v += digit;
3309  }
3310  }
3311  }
3312 
3313  if (sign == -1)
3314  v.Negate();
3315 
3316  return v;
3317 }
3318 
3319 Integer::Integer(const char *str, ByteOrder order)
3320  : reg(2), sign(POSITIVE)
3321 {
3322  *this = StringToInteger(str,order);
3323 }
3324 
3325 Integer::Integer(const wchar_t *str, ByteOrder order)
3326  : reg(2), sign(POSITIVE)
3327 {
3328  *this = StringToInteger(str,order);
3329 }
3330 
3331 unsigned int Integer::WordCount() const
3332 {
3333  return (unsigned int)CountWords(reg, reg.size());
3334 }
3335 
3336 unsigned int Integer::ByteCount() const
3337 {
3338  unsigned wordCount = WordCount();
3339  if (wordCount)
3340  return (wordCount-1)*WORD_SIZE + BytePrecision(reg[wordCount-1]);
3341  else
3342  return 0;
3343 }
3344 
3345 unsigned int Integer::BitCount() const
3346 {
3347  unsigned wordCount = WordCount();
3348  if (wordCount)
3349  return (wordCount-1)*WORD_BITS + BitPrecision(reg[wordCount-1]);
3350  else
3351  return 0;
3352 }
3353 
3354 void Integer::Decode(const byte *input, size_t inputLen, Signedness s)
3355 {
3356  CRYPTOPP_ASSERT(input && inputLen); // NULL buffer
3357  StringStore store(input, inputLen);
3358  Decode(store, inputLen, s);
3359 }
3360 
3362 {
3363  CRYPTOPP_ASSERT(bt.MaxRetrievable() >= inputLen);
3364  if (bt.MaxRetrievable() < inputLen)
3365  throw InvalidArgument("Integer: input length is too small");
3366 
3367  byte b;
3368  bt.Peek(b);
3369  sign = ((s==SIGNED) && (b & 0x80)) ? NEGATIVE : POSITIVE;
3370 
3371  while (inputLen>0 && (sign==POSITIVE ? b==0 : b==0xff))
3372  {
3373  bt.Skip(1);
3374  inputLen--;
3375  bt.Peek(b);
3376  }
3377 
3378  reg.CleanNew(RoundupSize(BytesToWords(inputLen)));
3379  for (size_t i=inputLen; i > 0; i--)
3380  {
3381  (void)bt.Get(b);
3382  reg[(i-1)/WORD_SIZE] |= word(b) << ((i-1)%WORD_SIZE)*8;
3383  }
3384 
3385  if (sign == NEGATIVE)
3386  {
3387  for (size_t i=inputLen; i<reg.size()*WORD_SIZE; i++)
3388  reg[i/WORD_SIZE] |= word(0xff) << (i%WORD_SIZE)*8;
3389  TwosComplement(reg, reg.size());
3390  }
3391 }
3392 
3393 size_t Integer::MinEncodedSize(Signedness signedness) const
3394 {
3395  // Profiling tells us the original second If was dominant, so it
3396  // was promoted to the first If statement. The code change occurred
3397  // at Commit dc99266599a0e72d.
3398  unsigned int outputLen = STDMAX(1U, ByteCount());
3399  const bool pre = (signedness == UNSIGNED);
3400  if (!pre && NotNegative() && (GetByte(outputLen-1) & 0x80))
3401  outputLen++;
3402  if (pre)
3403  return outputLen;
3404  if (IsNegative() && *this < -Power2(outputLen*8-1))
3405  outputLen++;
3406  return outputLen;
3407 }
3408 
3409 // PKCS12_PBKDF and other classes use undersized buffers
3410 void Integer::Encode(byte *output, size_t outputLen, Signedness signedness) const
3411 {
3412  CRYPTOPP_ASSERT(output && outputLen); // NULL buffer
3413  ArraySink sink(output, outputLen);
3414  Encode(sink, outputLen, signedness);
3415 }
3416 
3417 void Integer::Encode(BufferedTransformation &bt, size_t outputLen, Signedness signedness) const
3418 {
3419  if (signedness == UNSIGNED || NotNegative())
3420  {
3421  for (size_t i=outputLen; i > 0; i--)
3422  bt.Put(GetByte(i-1));
3423  }
3424  else
3425  {
3426  // take two's complement of *this
3427  Integer temp = Integer::Power2(8*STDMAX((size_t)ByteCount(), outputLen)) + *this;
3428  temp.Encode(bt, outputLen, UNSIGNED);
3429  }
3430 }
3431 
3433 {
3434  DERGeneralEncoder enc(bt, INTEGER);
3436  enc.MessageEnd();
3437 }
3438 
3439 void Integer::BERDecode(const byte *input, size_t len)
3440 {
3441  CRYPTOPP_ASSERT(input && len); // NULL buffer
3442  StringStore store(input, len);
3443  BERDecode(store);
3444 }
3445 
3447 {
3448  BERGeneralDecoder dec(bt, INTEGER);
3449  if (!dec.IsDefiniteLength() || dec.MaxRetrievable() < dec.RemainingLength())
3450  BERDecodeError();
3451  Decode(dec, (size_t)dec.RemainingLength(), SIGNED);
3452  dec.MessageEnd();
3453 }
3454 
3456 {
3457  DERGeneralEncoder enc(bt, OCTET_STRING);
3458  Encode(enc, length);
3459  enc.MessageEnd();
3460 }
3461 
3463 {
3464  BERGeneralDecoder dec(bt, OCTET_STRING);
3465  if (!dec.IsDefiniteLength() || dec.RemainingLength() != length)
3466  BERDecodeError();
3467  Decode(dec, length);
3468  dec.MessageEnd();
3469 }
3470 
3471 size_t Integer::OpenPGPEncode(byte *output, size_t bufferSize) const
3472 {
3473  CRYPTOPP_ASSERT(output && bufferSize); // NULL buffer
3474  CRYPTOPP_ASSERT(bufferSize >= MinEncodedSize()); // Undersized buffer
3475  ArraySink sink(output, bufferSize);
3476  return OpenPGPEncode(sink);
3477 }
3478 
3480 {
3481  word16 bitCount = word16(BitCount());
3482  bt.PutWord16(bitCount);
3483  size_t byteCount = BitsToBytes(bitCount);
3484  Encode(bt, byteCount);
3485  return 2 + byteCount;
3486 }
3487 
3488 void Integer::OpenPGPDecode(const byte *input, size_t len)
3489 {
3490  CRYPTOPP_ASSERT(input && len); // NULL buffer
3491  StringStore store(input, len);
3492  OpenPGPDecode(store);
3493 }
3494 
3496 {
3497  word16 bitCount;
3498  if (bt.GetWord16(bitCount) != 2 || bt.MaxRetrievable() < BitsToBytes(bitCount))
3499  throw OpenPGPDecodeErr();
3500  Decode(bt, BitsToBytes(bitCount));
3501 }
3502 
3504 {
3505  const size_t nbytes = nbits/8 + 1;
3506  SecByteBlock buf(nbytes);
3507  rng.GenerateBlock(buf, nbytes);
3508  if (nbytes)
3509  buf[0] = (byte)Crop(buf[0], nbits % 8);
3510  Decode(buf, nbytes, UNSIGNED);
3511 }
3512 
3513 void Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max)
3514 {
3515  if (min > max)
3516  throw InvalidArgument("Integer: Min must be no greater than Max");
3517 
3518  Integer range = max - min;
3519  const unsigned int nbits = range.BitCount();
3520 
3521  do
3522  {
3523  Randomize(rng, nbits);
3524  }
3525  while (*this > range);
3526 
3527  *this += min;
3528 }
3529 
3530 bool Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3531 {
3532  return GenerateRandomNoThrow(rng, MakeParameters("Min", min)("Max", max)
3533  ("RandomNumberType", rnType)("EquivalentTo", equiv)("Mod", mod));
3534 }
3535 
3537 {
3538 public:
3539  KDF2_RNG(const byte *seed, size_t seedSize)
3540  : m_counter(0), m_counterAndSeed(seedSize + 4)
3541  {
3542  memcpy(m_counterAndSeed + 4, seed, seedSize);
3543  }
3544 
3545  void GenerateBlock(byte *output, size_t size)
3546  {
3547  CRYPTOPP_ASSERT(output && size); // NULL buffer
3548  PutWord(false, BIG_ENDIAN_ORDER, m_counterAndSeed, m_counter);
3549  ++m_counter;
3550  P1363_KDF2<SHA1>::DeriveKey(output, size, m_counterAndSeed, m_counterAndSeed.size(), NULLPTR, 0);
3551  }
3552 
3553 private:
3554  word32 m_counter;
3555  SecByteBlock m_counterAndSeed;
3556 };
3557 
3559 {
3560  Integer min = params.GetValueWithDefault("Min", Integer::Zero());
3561  Integer max;
3562  if (!params.GetValue("Max", max))
3563  {
3564  int bitLength;
3565  if (params.GetIntValue("BitLength", bitLength))
3566  max = Integer::Power2(bitLength);
3567  else
3568  throw InvalidArgument("Integer: missing Max argument");
3569  }
3570  if (min > max)
3571  throw InvalidArgument("Integer: Min must be no greater than Max");
3572 
3573  Integer equiv = params.GetValueWithDefault("EquivalentTo", Integer::Zero());
3574  Integer mod = params.GetValueWithDefault("Mod", Integer::One());
3575 
3576  if (equiv.IsNegative() || equiv >= mod)
3577  throw InvalidArgument("Integer: invalid EquivalentTo and/or Mod argument");
3578 
3579  Integer::RandomNumberType rnType = params.GetValueWithDefault("RandomNumberType", Integer::ANY);
3580 
3581  member_ptr<KDF2_RNG> kdf2Rng;
3583  if (params.GetValue(Name::Seed(), seed))
3584  {
3585  ByteQueue bq;
3586  DERSequenceEncoder seq(bq);
3587  min.DEREncode(seq);
3588  max.DEREncode(seq);
3589  equiv.DEREncode(seq);
3590  mod.DEREncode(seq);
3591  DEREncodeUnsigned(seq, rnType);
3592  DEREncodeOctetString(seq, seed.begin(), seed.size());
3593  seq.MessageEnd();
3594 
3595  SecByteBlock finalSeed((size_t)bq.MaxRetrievable());
3596  bq.Get(finalSeed, finalSeed.size());
3597  kdf2Rng.reset(new KDF2_RNG(finalSeed.begin(), finalSeed.size()));
3598  }
3599  RandomNumberGenerator &rng = kdf2Rng.get() ? (RandomNumberGenerator &)*kdf2Rng : i_rng;
3600 
3601  switch (rnType)
3602  {
3603  case ANY:
3604  if (mod == One())
3605  Randomize(rng, min, max);
3606  else
3607  {
3608  Integer min1 = min + (equiv-min)%mod;
3609  if (max < min1)
3610  return false;
3611  Randomize(rng, Zero(), (max - min1) / mod);
3612  *this *= mod;
3613  *this += min1;
3614  }
3615  return true;
3616 
3617  case PRIME:
3618  {
3619  const PrimeSelector *pSelector = params.GetValueWithDefault(Name::PointerToPrimeSelector(), (const PrimeSelector *)NULLPTR);
3620 
3621  int i;
3622  i = 0;
3623  while (1)
3624  {
3625  if (++i==16)
3626  {
3627  // check if there are any suitable primes in [min, max]
3628  Integer first = min;
3629  if (FirstPrime(first, max, equiv, mod, pSelector))
3630  {
3631  // if there is only one suitable prime, we're done
3632  *this = first;
3633  if (!FirstPrime(first, max, equiv, mod, pSelector))
3634  return true;
3635  }
3636  else
3637  return false;
3638  }
3639 
3640  Randomize(rng, min, max);
3641  if (FirstPrime(*this, STDMIN(*this+mod*PrimeSearchInterval(max), max), equiv, mod, pSelector))
3642  return true;
3643  }
3644  }
3645 
3646  default:
3647  throw InvalidArgument("Integer: invalid RandomNumberType argument");
3648  }
3649 }
3650 
3651 std::istream& operator>>(std::istream& in, Integer &a)
3652 {
3653  char c;
3654  unsigned int length = 0;
3655  SecBlock<char> str(length + 16);
3656 
3657  std::ws(in);
3658 
3659  do
3660  {
3661  in.read(&c, 1);
3662  str[length++] = c;
3663  if (length >= str.size())
3664  str.Grow(length + 16);
3665  }
3666  while (in && (c=='-' || c=='x' || (c>='0' && c<='9') || (c>='a' && c<='f') || (c>='A' && c<='F') || c=='h' || c=='H' || c=='o' || c=='O' || c==',' || c=='.'));
3667 
3668  if (in.gcount())
3669  in.putback(c);
3670  str[length-1] = '\0';
3671  a = Integer(str);
3672 
3673  return in;
3674 }
3675 
3676 // Ensure base 10 is default
3677 inline int FlagToBase(long f) {
3678  return f == std::ios::hex ? 16 : (f == std::ios::oct ? 8 : 10);
3679 }
3680 
3681 inline char FlagToSuffix(long f) {
3682  return f == std::ios::hex ? 'h' : (f == std::ios::oct ? 'o' : '.');
3683 }
3684 
3685 // Ensure base 10 is default
3686 std::ostream& operator<<(std::ostream& out, const Integer &a)
3687 {
3688  // Get relevant conversion specifications from ostream.
3689  const long f = out.flags() & std::ios::basefield;
3690  const int base = FlagToBase(f);
3691  const char suffix = FlagToSuffix(f);
3692 
3693  Integer temp1=a, temp2;
3694  if (a.IsNegative())
3695  {
3696  out << '-';
3697  temp1.Negate();
3698  }
3699 
3700  if (!a)
3701  out << '0';
3702 
3703  static const char upper[]="0123456789ABCDEF";
3704  static const char lower[]="0123456789abcdef";
3705 
3706  const char* vec = (out.flags() & std::ios::uppercase) ? upper : lower;
3707  unsigned int i=0;
3708  SecBlock<char> s(a.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
3709 
3710  while (!!temp1)
3711  {
3712  word digit;
3713  Integer::Divide(digit, temp2, temp1, base);
3714  s[i++]=vec[digit];
3715  temp1.swap(temp2);
3716  }
3717 
3718  while (i--)
3719  {
3720  out << s[i];
3721  }
3722 
3723 #ifdef CRYPTOPP_USE_STD_SHOWBASE
3724  if (out.flags() & std::ios_base::showbase)
3725  out << suffix;
3726 
3727  return out;
3728 #else
3729  return out << suffix;
3730 #endif
3731 }
3732 
3734 {
3735  if (NotNegative())
3736  {
3737  if (Increment(reg, reg.size()))
3738  {
3739  reg.CleanGrow(2*reg.size());
3740  reg[reg.size()/2]=1;
3741  }
3742  }
3743  else
3744  {
3745  word borrow = Decrement(reg, reg.size());
3746  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3747 
3748  if (WordCount()==0)
3749  *this = Zero();
3750  }
3751  return *this;
3752 }
3753 
3755 {
3756  if (IsNegative())
3757  {
3758  if (Increment(reg, reg.size()))
3759  {
3760  reg.CleanGrow(2*reg.size());
3761  reg[reg.size()/2]=1;
3762  }
3763  }
3764  else
3765  {
3766  if (Decrement(reg, reg.size()))
3767  *this = -One();
3768  }
3769  return *this;
3770 }
3771 
3772 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3773 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3775 {
3776  if (this == &t)
3777  {
3778  return AbsoluteValue();
3779  }
3780  else if (reg.size() >= t.reg.size())
3781  {
3782  Integer result(t);
3783  AndWords(result.reg, reg, t.reg.size());
3784 
3785  result.sign = POSITIVE;
3786  return result;
3787  }
3788  else // reg.size() < t.reg.size()
3789  {
3790  Integer result(*this);
3791  AndWords(result.reg, t.reg, reg.size());
3792 
3793  result.sign = POSITIVE;
3794  return result;
3795  }
3796 }
3797 
3798 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3799 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3800 Integer Integer::Or(const Integer& t) const
3801 {
3802  if (this == &t)
3803  {
3804  return AbsoluteValue();
3805  }
3806  else if (reg.size() >= t.reg.size())
3807  {
3808  Integer result(*this);
3809  OrWords(result.reg, t.reg, t.reg.size());
3810 
3811  result.sign = POSITIVE;
3812  return result;
3813  }
3814  else // reg.size() < t.reg.size()
3815  {
3816  Integer result(t);
3817  OrWords(result.reg, reg, reg.size());
3818 
3819  result.sign = POSITIVE;
3820  return result;
3821  }
3822 }
3823 
3824 // This is a bit operation. We set sign to POSITIVE, so there's no need to
3825 // worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3827 {
3828  if (this == &t)
3829  {
3830  return Integer::Zero();
3831  }
3832  else if (reg.size() >= t.reg.size())
3833  {
3834  Integer result(*this);
3835  XorWords(result.reg, t.reg, t.reg.size());
3836 
3837  result.sign = POSITIVE;
3838  return result;
3839  }
3840  else // reg.size() < t.reg.size()
3841  {
3842  Integer result(t);
3843  XorWords(result.reg, reg, reg.size());
3844 
3845  result.sign = POSITIVE;
3846  return result;
3847  }
3848 }
3849 
3850 void PositiveAdd(Integer &sum, const Integer &a, const Integer& b)
3851 {
3852  // Profiling tells us the original second Else If was dominant, so it
3853  // was promoted to the first If statement. The code change occurred at
3854  // Commit dc99266599a0e72d.
3855  int carry; const bool pre = (a.reg.size() == b.reg.size());
3856  if (!pre && a.reg.size() > b.reg.size())
3857  {
3858  carry = Add(sum.reg, a.reg, b.reg, b.reg.size());
3859  CopyWords(sum.reg+b.reg.size(), a.reg+b.reg.size(), a.reg.size()-b.reg.size());
3860  carry = Increment(sum.reg+b.reg.size(), a.reg.size()-b.reg.size(), carry);
3861  }
3862  else if (pre)
3863  {
3864  carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3865  }
3866  else
3867  {
3868  carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3869  CopyWords(sum.reg+a.reg.size(), b.reg+a.reg.size(), b.reg.size()-a.reg.size());
3870  carry = Increment(sum.reg+a.reg.size(), b.reg.size()-a.reg.size(), carry);
3871  }
3872 
3873  if (carry)
3874  {
3875  sum.reg.CleanGrow(2*sum.reg.size());
3876  sum.reg[sum.reg.size()/2] = 1;
3877  }
3878  sum.sign = Integer::POSITIVE;
3879 }
3880 
3881 void PositiveSubtract(Integer &diff, const Integer &a, const Integer& b)
3882 {
3883  unsigned aSize = a.WordCount();
3884  aSize += aSize%2;
3885  unsigned bSize = b.WordCount();
3886  bSize += bSize%2;
3887 
3888  // Profiling tells us the original second Else If was dominant, so it
3889  // was promoted to the first If statement. The code change occurred at
3890  // Commit dc99266599a0e72d.
3891  if (aSize > bSize)
3892  {
3893  word borrow = Subtract(diff.reg, a.reg, b.reg, bSize);
3894  CopyWords(diff.reg+bSize, a.reg+bSize, aSize-bSize);
3895  borrow = Decrement(diff.reg+bSize, aSize-bSize, borrow);
3896  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3897  diff.sign = Integer::POSITIVE;
3898  }
3899  else if (aSize == bSize)
3900  {
3901  if (Compare(a.reg, b.reg, aSize) >= 0)
3902  {
3903  Subtract(diff.reg, a.reg, b.reg, aSize);
3904  diff.sign = Integer::POSITIVE;
3905  }
3906  else
3907  {
3908  Subtract(diff.reg, b.reg, a.reg, aSize);
3909  diff.sign = Integer::NEGATIVE;
3910  }
3911  }
3912  else
3913  {
3914  word borrow = Subtract(diff.reg, b.reg, a.reg, aSize);
3915  CopyWords(diff.reg+aSize, b.reg+aSize, bSize-aSize);
3916  borrow = Decrement(diff.reg+aSize, bSize-aSize, borrow);
3917  CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3918  diff.sign = Integer::NEGATIVE;
3919  }
3920 }
3921 
3922 // MSVC .NET 2003 workaround
3923 template <class T> inline const T& STDMAX2(const T& a, const T& b)
3924 {
3925  return a < b ? b : a;
3926 }
3927 
3929 {
3930  Integer sum((word)0, STDMAX2(reg.size(), b.reg.size()));
3931  if (NotNegative())
3932  {
3933  if (b.NotNegative())
3934  PositiveAdd(sum, *this, b);
3935  else
3936  PositiveSubtract(sum, *this, b);
3937  }
3938  else
3939  {
3940  if (b.NotNegative())
3941  PositiveSubtract(sum, b, *this);
3942  else
3943  {
3944  PositiveAdd(sum, *this, b);
3945  sum.sign = Integer::NEGATIVE;
3946  }
3947  }
3948  return sum;
3949 }
3950 
3952 {
3953  reg.CleanGrow(t.reg.size());
3954  if (NotNegative())
3955  {
3956  if (t.NotNegative())
3957  PositiveAdd(*this, *this, t);
3958  else
3959  PositiveSubtract(*this, *this, t);
3960  }
3961  else
3962  {
3963  if (t.NotNegative())
3964  PositiveSubtract(*this, t, *this);
3965  else
3966  {
3967  PositiveAdd(*this, *this, t);
3968  sign = Integer::NEGATIVE;
3969  }
3970  }
3971  return *this;
3972 }
3973 
3975 {
3976  Integer diff((word)0, STDMAX2(reg.size(), b.reg.size()));
3977  if (NotNegative())
3978  {
3979  if (b.NotNegative())
3980  PositiveSubtract(diff, *this, b);
3981  else
3982  PositiveAdd(diff, *this, b);
3983  }
3984  else
3985  {
3986  if (b.NotNegative())
3987  {
3988  PositiveAdd(diff, *this, b);
3989  diff.sign = Integer::NEGATIVE;
3990  }
3991  else
3992  PositiveSubtract(diff, b, *this);
3993  }
3994  return diff;
3995 }
3996 
3998 {
3999  reg.CleanGrow(t.reg.size());
4000  if (NotNegative())
4001  {
4002  if (t.NotNegative())
4003  PositiveSubtract(*this, *this, t);
4004  else
4005  PositiveAdd(*this, *this, t);
4006  }
4007  else
4008  {
4009  if (t.NotNegative())
4010  {
4011  PositiveAdd(*this, *this, t);
4012  sign = Integer::NEGATIVE;
4013  }
4014  else
4015  PositiveSubtract(*this, t, *this);
4016  }
4017  return *this;
4018 }
4019 
4021 {
4022  const size_t wordCount = WordCount();
4023  const size_t shiftWords = n / WORD_BITS;
4024  const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4025 
4026  reg.CleanGrow(RoundupSize(wordCount+BitsToWords(n)));
4027  ShiftWordsLeftByWords(reg, wordCount + shiftWords, shiftWords);
4028  ShiftWordsLeftByBits(reg+shiftWords, wordCount+BitsToWords(shiftBits), shiftBits);
4029  return *this;
4030 }
4031 
4033 {
4034  const size_t wordCount = WordCount();
4035  const size_t shiftWords = n / WORD_BITS;
4036  const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4037 
4038  ShiftWordsRightByWords(reg, wordCount, shiftWords);
4039  if (wordCount > shiftWords)
4040  ShiftWordsRightByBits(reg, wordCount-shiftWords, shiftBits);
4041  if (IsNegative() && WordCount()==0) // avoid -0
4042  *this = Zero();
4043  return *this;
4044 }
4045 
4047 {
4048  if (this != &t)
4049  {
4050  const size_t size = STDMIN(reg.size(), t.reg.size());
4051  reg.resize(size);
4052  AndWords(reg, t.reg, size);
4053  }
4054  sign = POSITIVE;
4055  return *this;
4056 }
4057 
4059 {
4060  if (this != &t)
4061  {
4062  if (reg.size() >= t.reg.size())
4063  {
4064  OrWords(reg, t.reg, t.reg.size());
4065  }
4066  else // reg.size() < t.reg.size()
4067  {
4068  const size_t head = reg.size();
4069  const size_t tail = t.reg.size() - reg.size();
4070  reg.resize(head+tail);
4071  OrWords(reg, t.reg, head);
4072  CopyWords(reg+head,t.reg+head,tail);
4073  }
4074  }
4075  sign = POSITIVE;
4076  return *this;
4077 }
4078 
4080 {
4081  if (this == &t)
4082  {
4083  *this = Zero();
4084  }
4085  else
4086  {
4087  if (reg.size() >= t.reg.size())
4088  {
4089  XorWords(reg, t.reg, t.reg.size());
4090  }
4091  else // reg.size() < t.reg.size()
4092  {
4093  const size_t head = reg.size();
4094  const size_t tail = t.reg.size() - reg.size();
4095  reg.resize(head+tail);
4096  XorWords(reg, t.reg, head);
4097  CopyWords(reg+head,t.reg+head,tail);
4098  }
4099  }
4100  sign = POSITIVE;
4101  return *this;
4102 }
4103 
4104 void PositiveMultiply(Integer &product, const Integer &a, const Integer &b)
4105 {
4106  size_t aSize = RoundupSize(a.WordCount());
4107  size_t bSize = RoundupSize(b.WordCount());
4108 
4109  product.reg.CleanNew(RoundupSize(aSize+bSize));
4110  product.sign = Integer::POSITIVE;
4111 
4112  IntegerSecBlock workspace(aSize + bSize);
4113  AsymmetricMultiply(product.reg, workspace, a.reg, aSize, b.reg, bSize);
4114 }
4115 
4116 void Multiply(Integer &product, const Integer &a, const Integer &b)
4117 {
4118  PositiveMultiply(product, a, b);
4119 
4120  if (a.NotNegative() != b.NotNegative())
4121  product.Negate();
4122 }
4123 
4125 {
4126  Integer product;
4127  Multiply(product, *this, b);
4128  return product;
4129 }
4130 
4131 /*
4132 void PositiveDivide(Integer &remainder, Integer &quotient,
4133  const Integer &dividend, const Integer &divisor)
4134 {
4135  remainder.reg.CleanNew(divisor.reg.size());
4136  remainder.sign = Integer::POSITIVE;
4137  quotient.reg.New(0);
4138  quotient.sign = Integer::POSITIVE;
4139  unsigned i=dividend.BitCount();
4140  while (i--)
4141  {
4142  word overflow = ShiftWordsLeftByBits(remainder.reg, remainder.reg.size(), 1);
4143  remainder.reg[0] |= dividend[i];
4144  if (overflow || remainder >= divisor)
4145  {
4146  Subtract(remainder.reg, remainder.reg, divisor.reg, remainder.reg.size());
4147  quotient.SetBit(i);
4148  }
4149  }
4150 }
4151 */
4152 
4153 void PositiveDivide(Integer &remainder, Integer &quotient,
4154  const Integer &a, const Integer &b)
4155 {
4156  unsigned aSize = a.WordCount();
4157  unsigned bSize = b.WordCount();
4158 
4159  if (!bSize)
4160  throw Integer::DivideByZero();
4161 
4162  if (aSize < bSize)
4163  {
4164  remainder = a;
4165  remainder.sign = Integer::POSITIVE;
4166  quotient = Integer::Zero();
4167  return;
4168  }
4169 
4170  aSize += aSize%2; // round up to next even number
4171  bSize += bSize%2;
4172 
4173  remainder.reg.CleanNew(RoundupSize(bSize));
4174  remainder.sign = Integer::POSITIVE;
4175  quotient.reg.CleanNew(RoundupSize(aSize-bSize+2));
4176  quotient.sign = Integer::POSITIVE;
4177 
4178  IntegerSecBlock T(aSize+3*(bSize+2));
4179  Divide(remainder.reg, quotient.reg, T, a.reg, aSize, b.reg, bSize);
4180 }
4181 
4182 void Integer::Divide(Integer &remainder, Integer &quotient, const Integer &dividend, const Integer &divisor)
4183 {
4184  PositiveDivide(remainder, quotient, dividend, divisor);
4185 
4186  if (dividend.IsNegative())
4187  {
4188  quotient.Negate();
4189  if (remainder.NotZero())
4190  {
4191  --quotient;
4192  remainder = divisor.AbsoluteValue() - remainder;
4193  }
4194  }
4195 
4196  if (divisor.IsNegative())
4197  quotient.Negate();
4198 }
4199 
4200 void Integer::DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
4201 {
4202  q = a;
4203  q >>= n;
4204 
4205  const size_t wordCount = BitsToWords(n);
4206  if (wordCount <= a.WordCount())
4207  {
4208  r.reg.resize(RoundupSize(wordCount));
4209  CopyWords(r.reg, a.reg, wordCount);
4210  SetWords(r.reg+wordCount, 0, r.reg.size()-wordCount);
4211  if (n % WORD_BITS != 0)
4212  r.reg[wordCount-1] %= (word(1) << (n % WORD_BITS));
4213  }
4214  else
4215  {
4216  r.reg.resize(RoundupSize(a.WordCount()));
4217  CopyWords(r.reg, a.reg, r.reg.size());
4218  }
4219  r.sign = POSITIVE;
4220 
4221  if (a.IsNegative() && r.NotZero())
4222  {
4223  --q;
4224  r = Power2(n) - r;
4225  }
4226 }
4227 
4229 {
4230  Integer remainder, quotient;
4231  Integer::Divide(remainder, quotient, *this, b);
4232  return quotient;
4233 }
4234 
4236 {
4237  Integer remainder, quotient;
4238  Integer::Divide(remainder, quotient, *this, b);
4239  return remainder;
4240 }
4241 
4242 void Integer::Divide(word &remainder, Integer &quotient, const Integer &dividend, word divisor)
4243 {
4244  if (!divisor)
4245  throw Integer::DivideByZero();
4246 
4247  // IsPowerOf2 uses BMI on x86 if available. There is a small
4248  // but measurable improvement during decryption and signing.
4249  if (IsPowerOf2(divisor))
4250  {
4251  quotient = dividend >> (BitPrecision(divisor)-1);
4252  remainder = dividend.reg[0] & (divisor-1);
4253  return;
4254  }
4255 
4256  unsigned int i = dividend.WordCount();
4257  quotient.reg.CleanNew(RoundupSize(i));
4258  remainder = 0;
4259  while (i--)
4260  {
4261  quotient.reg[i] = DWord(dividend.reg[i], remainder) / divisor;
4262  remainder = DWord(dividend.reg[i], remainder) % divisor;
4263  }
4264 
4265  if (dividend.NotNegative())
4266  quotient.sign = POSITIVE;
4267  else
4268  {
4269  quotient.sign = NEGATIVE;
4270  if (remainder)
4271  {
4272  --quotient;
4273  remainder = divisor - remainder;
4274  }
4275  }
4276 }
4277 
4279 {
4280  word remainder;
4281  Integer quotient;
4282  Integer::Divide(remainder, quotient, *this, b);
4283  return quotient;
4284 }
4285 
4286 word Integer::Modulo(word divisor) const
4287 {
4288  if (!divisor)
4289  throw Integer::DivideByZero();
4290 
4291  word remainder;
4292 
4293  // Profiling tells us the original Else was dominant, so it was
4294  // promoted to the first If statement. The code change occurred
4295  // at Commit dc99266599a0e72d.
4296  if ((divisor & (divisor-1)) != 0) // divisor is not a power of 2
4297  {
4298  // Profiling tells us the original Else was dominant, so it
4299  // was promoted to the first If statement. The code change
4300  // occurred at Commit dc99266599a0e72d.
4301  unsigned int i = WordCount();
4302  if (divisor > 5)
4303  {
4304  remainder = 0;
4305  while (i--)
4306  remainder = DWord(reg[i], remainder) % divisor;
4307  }
4308  else
4309  {
4310  DWord sum(0, 0);
4311  while (i--)
4312  sum += reg[i];
4313  remainder = sum % divisor;
4314  }
4315  }
4316  else // divisor is a power of 2
4317  {
4318  remainder = reg[0] & (divisor-1);
4319  }
4320 
4321  if (IsNegative() && remainder)
4322  remainder = divisor - remainder;
4323 
4324  return remainder;
4325 }
4326 
4328 {
4329  if (!!(*this)) // don't flip sign if *this==0
4330  sign = Sign(1-sign);
4331 }
4332 
4333 int Integer::PositiveCompare(const Integer& t) const
4334 {
4335  // Profiling tells us the original Else was dominant, so it
4336  // was promoted to the first If statement. The code change
4337  // occurred at Commit dc99266599a0e72d.
4338  const unsigned size = WordCount(), tSize = t.WordCount();
4339  if (size != tSize)
4340  return size > tSize ? 1 : -1;
4341  else
4342  return CryptoPP::Compare(reg, t.reg, size);
4343 }
4344 
4345 int Integer::Compare(const Integer& t) const
4346 {
4347  if (NotNegative())
4348  {
4349  if (t.NotNegative())
4350  return PositiveCompare(t);
4351  else
4352  return 1;
4353  }
4354  else
4355  {
4356  if (t.NotNegative())
4357  return -1;
4358  else
4359  return -PositiveCompare(t);
4360  }
4361 }
4362 
4364 {
4365  if (!IsPositive())
4366  return Zero();
4367 
4368  // overestimate square root
4369  Integer x, y = Power2((BitCount()+1)/2);
4370  CRYPTOPP_ASSERT(y*y >= *this);
4371 
4372  do
4373  {
4374  x = y;
4375  y = (x + *this/x) >> 1;
4376  } while (y<x);
4377 
4378  return x;
4379 }
4380 
4381 bool Integer::IsSquare() const
4382 {
4383  Integer r = SquareRoot();
4384  return *this == r.Squared();
4385 }
4386 
4387 bool Integer::IsUnit() const
4388 {
4389  return (WordCount() == 1) && (reg[0] == 1);
4390 }
4391 
4393 {
4394  return IsUnit() ? *this : Zero();
4395 }
4396 
4397 Integer a_times_b_mod_c(const Integer &x, const Integer& y, const Integer& m)
4398 {
4399  CRYPTOPP_ASSERT(m.NotZero());
4400  if (m.IsZero())
4401  throw Integer::DivideByZero();
4402 
4403  return x*y%m;
4404 }
4405 
4406 Integer a_exp_b_mod_c(const Integer &x, const Integer& e, const Integer& m)
4407 {
4408  CRYPTOPP_ASSERT(m.NotZero());
4409  if (m.IsZero())
4410  throw Integer::DivideByZero();
4411 
4412  ModularArithmetic mr(m);
4413  return mr.Exponentiate(x, e);
4414 }
4415 
4417 {
4418  return EuclideanDomainOf<Integer>().Gcd(a, b);
4419 }
4420 
4422 {
4424  CRYPTOPP_ASSERT(m.NotZero());
4425 
4426  if (IsNegative())
4427  return Modulo(m).InverseModNext(m);
4428 
4429  // http://github.com/weidai11/cryptopp/issues/602
4430  if (*this >= m)
4431  return Modulo(m).InverseModNext(m);
4432 
4433  return InverseModNext(m);
4434 }
4435 
4436 Integer Integer::InverseModNext(const Integer &m) const
4437 {
4439  CRYPTOPP_ASSERT(m.NotZero());
4440 
4441  if (m.IsEven())
4442  {
4443  if (!m || IsEven())
4444  return Zero(); // no inverse
4445  if (*this == One())
4446  return One();
4447 
4448  Integer u = m.Modulo(*this).InverseModNext(*this);
4449  return !u ? Zero() : (m*(*this-u)+1)/(*this);
4450  }
4451 
4452  // AlmostInverse requires a 4x workspace
4453  IntegerSecBlock T(m.reg.size() * 4);
4454  Integer r((word)0, m.reg.size());
4455  unsigned k = AlmostInverse(r.reg, T, reg, reg.size(), m.reg, m.reg.size());
4456  DivideByPower2Mod(r.reg, r.reg, k, m.reg, m.reg.size());
4457  return r;
4458 }
4459 
4460 word Integer::InverseMod(word mod) const
4461 {
4462  CRYPTOPP_ASSERT(mod != 0);
4463 
4464  word g0 = mod, g1 = *this % mod;
4465  word v0 = 0, v1 = 1;
4466  word y;
4467 
4468  while (g1)
4469  {
4470  if (g1 == 1)
4471  return v1;
4472  y = g0 / g1;
4473  g0 = g0 % g1;
4474  v0 += y * v1;
4475 
4476  if (!g0)
4477  break;
4478  if (g0 == 1)
4479  return mod-v0;
4480  y = g1 / g0;
4481  g1 = g1 % g0;
4482  v1 += y * v0;
4483  }
4484  return 0;
4485 }
4486 
4487 // ********************************************************
4488 
4490 {
4491  BERSequenceDecoder seq(bt);
4492  OID oid(seq);
4493  if (oid != ASN1::prime_field())
4494  BERDecodeError();
4495  m_modulus.BERDecode(seq);
4496  seq.MessageEnd();
4497  m_result.reg.resize(m_modulus.reg.size());
4498 }
4499 
4501 {
4502  DERSequenceEncoder seq(bt);
4503  ASN1::prime_field().DEREncode(seq);
4504  m_modulus.DEREncode(seq);
4505  seq.MessageEnd();
4506 }
4507 
4509 {
4510  a.DEREncodeAsOctetString(out, MaxElementByteLength());
4511 }
4512 
4514 {
4515  a.BERDecodeAsOctetString(in, MaxElementByteLength());
4516 }
4517 
4519 {
4520  if (a.reg.size()==m_modulus.reg.size())
4521  {
4522  CryptoPP::DivideByPower2Mod(m_result.reg.begin(), a.reg, 1, m_modulus.reg, a.reg.size());
4523  return m_result;
4524  }
4525  else
4526  return m_result1 = (a.IsEven() ? (a >> 1) : ((a+m_modulus) >> 1));
4527 }
4528 
4529 const Integer& ModularArithmetic::Add(const Integer &a, const Integer &b) const
4530 {
4531  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4532  {
4533  if (CryptoPP::Add(m_result.reg.begin(), a.reg, b.reg, a.reg.size())
4534  || Compare(m_result.reg, m_modulus.reg, a.reg.size()) >= 0)
4535  {
4536  CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4537  }
4538  return m_result;
4539  }
4540  else
4541  {
4542  m_result1 = a+b;
4543  if (m_result1 >= m_modulus)
4544  m_result1 -= m_modulus;
4545  return m_result1;
4546  }
4547 }
4548 
4550 {
4551  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4552  {
4553  if (CryptoPP::Add(a.reg, a.reg, b.reg, a.reg.size())
4554  || Compare(a.reg, m_modulus.reg, a.reg.size()) >= 0)
4555  {
4556  CryptoPP::Subtract(a.reg, a.reg, m_modulus.reg, a.reg.size());
4557  }
4558  }
4559  else
4560  {
4561  a+=b;
4562  if (a>=m_modulus)
4563  a-=m_modulus;
4564  }
4565 
4566  return a;
4567 }
4568 
4569 const Integer& ModularArithmetic::Subtract(const Integer &a, const Integer &b) const
4570 {
4571  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4572  {
4573  if (CryptoPP::Subtract(m_result.reg.begin(), a.reg, b.reg, a.reg.size()))
4574  CryptoPP::Add(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4575  return m_result;
4576  }
4577  else
4578  {
4579  m_result1 = a-b;
4580  if (m_result1.IsNegative())
4581  m_result1 += m_modulus;
4582  return m_result1;
4583  }
4584 }
4585 
4587 {
4588  if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4589  {
4590  if (CryptoPP::Subtract(a.reg, a.reg, b.reg, a.reg.size()))
4591  CryptoPP::Add(a.reg, a.reg, m_modulus.reg, a.reg.size());
4592  }
4593  else
4594  {
4595  a-=b;
4596  if (a.IsNegative())
4597  a+=m_modulus;
4598  }
4599 
4600  return a;
4601 }
4602 
4604 {
4605  if (!a)
4606  return a;
4607 
4608  CopyWords(m_result.reg.begin(), m_modulus.reg, m_modulus.reg.size());
4609  if (CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, a.reg, a.reg.size()))
4610  Decrement(m_result.reg.begin()+a.reg.size(), m_modulus.reg.size()-a.reg.size());
4611 
4612  return m_result;
4613 }
4614 
4615 Integer ModularArithmetic::CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
4616 {
4617  if (m_modulus.IsOdd())
4618  {
4619  MontgomeryRepresentation dr(m_modulus);
4620  return dr.ConvertOut(dr.CascadeExponentiate(dr.ConvertIn(x), e1, dr.ConvertIn(y), e2));
4621  }
4622  else
4623  return AbstractRing<Integer>::CascadeExponentiate(x, e1, y, e2);
4624 }
4625 
4626 void ModularArithmetic::SimultaneousExponentiate(Integer *results, const Integer &base, const Integer *exponents, unsigned int exponentsCount) const
4627 {
4628  if (m_modulus.IsOdd())
4629  {
4630  MontgomeryRepresentation dr(m_modulus);
4631  dr.SimultaneousExponentiate(results, dr.ConvertIn(base), exponents, exponentsCount);
4632  for (unsigned int i=0; i<exponentsCount; i++)
4633  results[i] = dr.ConvertOut(results[i]);
4634  }
4635  else
4636  AbstractRing<Integer>::SimultaneousExponentiate(results, base, exponents, exponentsCount);
4637 }
4638 
4640  : ModularArithmetic(m),
4641  m_u((word)0, m_modulus.reg.size()),
4642  m_workspace(5*m_modulus.reg.size())
4643 {
4644  if (!m_modulus.IsOdd())
4645  throw InvalidArgument("MontgomeryRepresentation: Montgomery representation requires an odd modulus");
4646 
4647  RecursiveInverseModPower2(m_u.reg, m_workspace, m_modulus.reg, m_modulus.reg.size());
4648 }
4649 
4651 {
4652  word *const T = m_workspace.begin();
4653  word *const R = m_result.reg.begin();
4654  const size_t N = m_modulus.reg.size();
4655  CRYPTOPP_ASSERT(a.reg.size()<=N && b.reg.size()<=N);
4656 
4657  AsymmetricMultiply(T, T+2*N, a.reg, a.reg.size(), b.reg, b.reg.size());
4658  SetWords(T+a.reg.size()+b.reg.size(), 0, 2*N-a.reg.size()-b.reg.size());
4659  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4660  return m_result;
4661 }
4662 
4664 {
4665  word *const T = m_workspace.begin();
4666  word *const R = m_result.reg.begin();
4667  const size_t N = m_modulus.reg.size();
4668  CRYPTOPP_ASSERT(a.reg.size()<=N);
4669 
4670  CryptoPP::Square(T, T+2*N, a.reg, a.reg.size());
4671  SetWords(T+2*a.reg.size(), 0, 2*N-2*a.reg.size());
4672  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4673  return m_result;
4674 }
4675 
4677 {
4678  word *const T = m_workspace.begin();
4679  word *const R = m_result.reg.begin();
4680  const size_t N = m_modulus.reg.size();
4681  CRYPTOPP_ASSERT(a.reg.size()<=N);
4682 
4683  CopyWords(T, a.reg, a.reg.size());
4684  SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4685  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4686  return m_result;
4687 }
4688 
4690 {
4691 // return (EuclideanMultiplicativeInverse(a, modulus)<<(2*WORD_BITS*modulus.reg.size()))%modulus;
4692  word *const T = m_workspace.begin();
4693  word *const R = m_result.reg.begin();
4694  const size_t N = m_modulus.reg.size();
4695  CRYPTOPP_ASSERT(a.reg.size()<=N);
4696 
4697  CopyWords(T, a.reg, a.reg.size());
4698  SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4699  MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4700  unsigned k = AlmostInverse(R, T, R, N, m_modulus.reg, N);
4701 
4702 // cout << "k=" << k << " N*32=" << 32*N << endl;
4703 
4704  if (k>N*WORD_BITS)
4705  DivideByPower2Mod(R, R, k-N*WORD_BITS, m_modulus.reg, N);
4706  else
4707  MultiplyByPower2Mod(R, R, N*WORD_BITS-k, m_modulus.reg, N);
4708 
4709  return m_result;
4710 }
4711 
4712 // Specialization declared in misc.h to allow us to print integers
4713 // with additional control options, like arbirary bases and uppercase.
4714 template <> CRYPTOPP_DLL
4715 std::string IntToString<Integer>(Integer value, unsigned int base)
4716 {
4717  // Hack... set the high bit for uppercase. Set the next bit fo a suffix.
4718  static const unsigned int BIT_32 = (1U << 31);
4719  const bool UPPER = !!(base & BIT_32);
4720  static const unsigned int BIT_31 = (1U << 30);
4721  const bool BASE = !!(base & BIT_31);
4722 
4723  const char CH = UPPER ? 'A' : 'a';
4724  base &= ~(BIT_32|BIT_31);
4725  CRYPTOPP_ASSERT(base >= 2 && base <= 32);
4726 
4727  if (value == 0)
4728  return "0";
4729 
4730  bool negative = false, zero = false;
4731  if (value.IsNegative())
4732  {
4733  negative = true;
4734  value.Negate();
4735  }
4736 
4737  if (!value)
4738  zero = true;
4739 
4740  SecBlock<char> s(value.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
4741  Integer temp;
4742 
4743  unsigned int i=0;
4744  while (!!value)
4745  {
4746  word digit;
4747  Integer::Divide(digit, temp, value, word(base));
4748  s[i++]=char((digit < 10 ? '0' : (CH - 10)) + digit);
4749  value.swap(temp);
4750  }
4751 
4752  std::string result;
4753  result.reserve(i+2);
4754 
4755  if (negative)
4756  result += '-';
4757 
4758  if (zero)
4759  result += '0';
4760 
4761  while (i--)
4762  result += s[i];
4763 
4764  if (BASE)
4765  {
4766  if (base == 10)
4767  result += '.';
4768  else if (base == 16)
4769  result += 'h';
4770  else if (base == 8)
4771  result += 'o';
4772  else if (base == 2)
4773  result += 'b';
4774  }
4775 
4776  return result;
4777 }
4778 
4779 // Specialization declared in misc.h to avoid Coverity findings.
4780 template <> CRYPTOPP_DLL
4781 std::string IntToString<word64>(word64 value, unsigned int base)
4782 {
4783  // Hack... set the high bit for uppercase.
4784  static const unsigned int HIGH_BIT = (1U << 31);
4785  const char CH = !!(base & HIGH_BIT) ? 'A' : 'a';
4786  base &= ~HIGH_BIT;
4787 
4788  CRYPTOPP_ASSERT(base >= 2);
4789  if (value == 0)
4790  return "0";
4791 
4792  std::string result;
4793  while (value > 0)
4794  {
4795  word64 digit = value % base;
4796  result = char((digit < 10 ? '0' : (CH - 10)) + digit) + result;
4797  value /= base;
4798  }
4799  return result;
4800 }
4801 
4802 #ifndef CRYPTOPP_NO_ASSIGN_TO_INTEGER
4803 // Allow the linker to discard Integer code if not needed.
4804 // Also see http://github.com/weidai11/cryptopp/issues/389.
4805 bool AssignIntToInteger(const std::type_info &valueType, void *pInteger, const void *pInt)
4806 {
4807  if (valueType != typeid(Integer))
4808  return false;
4809  *reinterpret_cast<Integer *>(pInteger) = *reinterpret_cast<const int *>(pInt);
4810  return true;
4811 }
4812 #endif // CRYPTOPP_NO_ASSIGN_TO_INTEGER
4813 
4814 // *************************** C++ Static Initialization ***************************
4815 
4817 {
4818 public:
4819  InitInteger()
4820  {
4821  SetFunctionPointers();
4822  }
4823 };
4824 
4825 // This is not really needed because each Integer can dynamically initialize
4826 // itself, but we take a peephole optimization and initialize the class once
4827 // if init priorities are available. Dynamic initialization will be used if
4828 // init priorities are not available.
4829 
4830 #if defined(HAVE_GCC_INIT_PRIORITY)
4831  const InitInteger s_init __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 10))) = InitInteger();
4832  const Integer g_zero __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 11))) = Integer(0L);
4833  const Integer g_one __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 12))) = Integer(1L);
4834  const Integer g_two __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 13))) = Integer(2L);
4835 #elif defined(HAVE_MSC_INIT_PRIORITY)
4836  #pragma warning(disable: 4075)
4837  #pragma init_seg(".CRT$XCU")
4838  const InitInteger s_init;
4839  const Integer g_zero(0L);
4840  const Integer g_one(1L);
4841  const Integer g_two(2L);
4842  #pragma warning(default: 4075)
4843 #elif HAVE_XLC_INIT_PRIORITY
4844  // XLC needs constant, not a define
4845  #pragma priority(280)
4846  const InitInteger s_init;
4847  const Integer g_zero(0L);
4848  const Integer g_one(1L);
4849  const Integer g_two(2L);
4850 #else
4851  const InitInteger s_init;
4852 #endif
4853 
4854 // ***************** Library code ********************
4855 
4857 {
4858 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4859  return g_zero;
4860 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4861  static const Integer s_zero(0L);
4862  return s_zero;
4863 #else // Potential memory leak. Avoid if possible.
4864  return Singleton<Integer, NewInteger<0L> >().Ref();
4865 #endif
4866 }
4867 
4869 {
4870 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4871  return g_one;
4872 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4873  static const Integer s_one(1L);
4874  return s_one;
4875 #else // Potential memory leak. Avoid if possible.
4876  return Singleton<Integer, NewInteger<1L> >().Ref();
4877 #endif
4878 }
4879 
4881 {
4882 #if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4883  return g_two;
4884 #elif defined(CRYPTOPP_CXX11_DYNAMIC_INIT)
4885  static const Integer s_two(2L);
4886  return s_two;
4887 #else // Potential memory leak. Avoid if possible.
4888  return Singleton<Integer, NewInteger<2L> >().Ref();
4889 #endif
4890 }
4891 
4892 NAMESPACE_END
4893 
4894 #endif // CRYPTOPP_IMPORTS
Used to pass byte array input as part of a NameValuePairs object.
Definition: algparam.h:20
An invalid argument was detected.
Definition: cryptlib.h:202
unsigned int WordCount() const
Determines the number of words required to represent the Integer.
Definition: integer.cpp:3331
Integer MultiplicativeInverse() const
Calculate multiplicative inverse.
Definition: integer.cpp:4392
Integer & Reduce(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4586
bool NotZero() const
Determines if the Integer is non-0.
Definition: integer.h:333
Classes for working with NameValuePairs.
Integer Or(const Integer &t) const
Bitwise OR.
Definition: integer.cpp:3800
Integer & operator|=(const Integer &t)
Bitwise OR Assignment.
Definition: integer.cpp:4058
void swap(SecBlock< T, A > &b)
Swap contents with another SecBlock.
Definition: secblock.h:1041
Integer And(const Integer &t) const
Bitwise AND.
Definition: integer.cpp:3774
a number which is probabilistically prime
Definition: integer.h:95
Utility functions for the Crypto++ library.
Integer Plus(const Integer &b) const
Addition.
Definition: integer.cpp:3928
virtual size_t Peek(byte &outByte) const
Peek a 8-bit byte.
Definition: cryptlib.cpp:550
ByteOrder
Provides the byte ordering.
Definition: cryptlib.h:143
Restricts the instantiation of a class to one static object without locks.
Definition: misc.h:263
void CleanNew(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:980
an unsigned value
Definition: integer.h:85
T GetValueWithDefault(const char *name, T defaultValue) const
Get a named value.
Definition: cryptlib.h:363
bool IsSquare() const
Determine whether this integer is a perfect square.
Definition: integer.cpp:4381
void DEREncode(BufferedTransformation &bt) const
Encode in DER format.
Definition: integer.cpp:3432
size_t DEREncodeUnsigned(BufferedTransformation &out, T w, byte asnTag=INTEGER)
DER Encode unsigned value.
Definition: asn.h:461
virtual void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: cryptlib.cpp:311
size_t size() const
Length of the memory block.
Definition: algparam.h:84
size_t BitsToBytes(size_t bitCount)
Returns the number of 8-bit bytes or octets required for the specified number of bits.
Definition: misc.h:818
This file contains helper classes/functions for implementing public key algorithms.
void DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
Encode absolute value as big-endian octet string.
Definition: integer.cpp:3455
Integer & operator=(const Integer &t)
Assignment.
Definition: integer.cpp:3091
bool FirstPrime(Integer &p, const Integer &max, const Integer &equiv, const Integer &mod, const PrimeSelector *pSelector)
Finds a random prime of special form.
Definition: nbtheory.cpp:379
static Integer Gcd(const Integer &a, const Integer &n)
Calculate greatest common divisor.
Definition: integer.cpp:4416
void resize(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1031
Integer & operator+=(const Integer &t)
Addition Assignment.
Definition: integer.cpp:3951
size_t BitsToWords(size_t bitCount)
Returns the number of words required for the specified number of bits.
Definition: misc.h:838
void PutWord(bool assumeAligned, ByteOrder order, byte *block, T value, const byte *xorBlock=NULL)
Access a block of memory.
Definition: misc.h:2396
unsigned int BytePrecision(const T &value)
Returns the number of 8-bit bytes or octets required for a value.
Definition: misc.h:699
void CleanGrow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1013
Integer & operator--()
Pre-decrement.
Definition: integer.cpp:3754
virtual Element CascadeExponentiate(const Element &x, const Integer &e1, const Element &y, const Integer &e2) const
TODO.
Definition: algebra.cpp:323
Secure memory block with allocator and cleanup.
Definition: secblock.h:688
const Integer & Inverse(const Integer &a) const
Inverts the element in the ring.
Definition: integer.cpp:4603
signed long ConvertToLong() const
Convert the Integer to Long.
Definition: integer.cpp:3012
const Integer & Subtract(const Integer &a, const Integer &b) const
Subtracts elements in the ring.
Definition: integer.cpp:4569
void OpenPGPDecode(const byte *input, size_t inputLen)
Decode from OpenPGP format.
Definition: integer.cpp:3488
Signedness
Used when importing and exporting integers.
Definition: integer.h:83
unsigned int MaxElementByteLength() const
Provides the maximum byte size of an element in the ring.
Definition: modarith.h:232
ASN.1 object identifiers for algorthms and schemes.
Square block cipher.
Definition: square.h:24
Classes for automatic resource management.
bool IsP4()
Determines if the CPU is an Intel P4.
Definition: cpu.h:236
bool IsNegative() const
Determines if the Integer is negative.
Definition: integer.h:336
lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: queue.h:33
Library configuration file.
MontgomeryRepresentation(const Integer &modulus)
Construct a MontgomeryRepresentation.
Definition: integer.cpp:4639
static void DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
Extended Division.
Definition: integer.cpp:4200
Ring of congruence classes modulo n.
Definition: modarith.h:38
Interface for random number generators.
Definition: cryptlib.h:1383
size_t BytesToWords(size_t byteCount)
Returns the number of words required for the specified number of bytes.
Definition: misc.h:828
Common C++ header files.
void Randomize(RandomNumberGenerator &rng, size_t bitCount)
Set this Integer to random integer.
Definition: integer.cpp:3503
std::string IntToString< word64 >(word64 value, unsigned int base)
Converts an unsigned value to a string.
Definition: integer.cpp:4781
void New(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:965
void SetByte(size_t n, byte value)
Set the n-th byte to value.
Definition: integer.cpp:3139
Integer InverseMod(const Integer &n) const
Calculate multiplicative inverse.
Definition: integer.cpp:4421
the value is negative
Definition: integer.h:77
SecBlock<byte> typedef.
Definition: secblock.h:1058
BER Sequence Decoder.
Definition: asn.h:309
Interface for buffered transformations.
Definition: cryptlib.h:1598
bool IsPositive() const
Determines if the Integer is positive.
Definition: integer.h:342
bool NotNegative() const
Determines if the Integer is non-negative.
Definition: integer.h:339
const Integer & Add(const Integer &a, const Integer &b) const
Adds elements in the ring.
Definition: integer.cpp:4529
static const Integer & One()
Integer representing 1.
Definition: integer.cpp:4868
Integer & Accumulate(Integer &a, const Integer &b) const
TODO.
Definition: integer.cpp:4549
P1363 key derivation function.
Definition: pubkey.h:729
byte order is little-endian
Definition: cryptlib.h:145
Sign
Used internally to represent the integer.
Definition: integer.h:73
Integer ConvertIn(const Integer &a) const
Reduces an element in the congruence class.
Definition: modarith.h:292
bool operator!() const
Negation.
Definition: integer.cpp:3086
Pointer that overloads operator ->
Definition: smartptr.h:36
bool IsUnit() const
Determine if 1 or -1.
Definition: integer.cpp:4387
unsigned int ByteCount() const
Determines the number of bytes required to represent the Integer.
Definition: integer.cpp:3336
Classes and functions for secure memory allocations.
void DEREncodeElement(BufferedTransformation &out, const Element &a) const
Encodes element in DER format.
Definition: integer.cpp:4508
Integer Modulo(const Integer &b) const
Remainder.
Definition: integer.cpp:4235
Copy input to a memory buffer.
Definition: filters.h:1136
void BERDecodeElement(BufferedTransformation &in, Element &a) const
Decodes element in DER format.
Definition: integer.cpp:4513
size_t MinEncodedSize(Signedness sign=UNSIGNED) const
Minimum number of bytes to encode this integer.
Definition: integer.cpp:3393
Integer DividedBy(const Integer &b) const
Division.
Definition: integer.cpp:4228
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: integer.cpp:4615
Integer Times(const Integer &b) const
Multiplication.
Definition: integer.cpp:4124
a number with no special properties
Definition: integer.h:93
const byte * begin() const
Pointer to the first byte in the memory block.
Definition: algparam.h:80
size_t Put(byte inByte, bool blocking=true)
Input a byte for processing.
Definition: cryptlib.h:1620
Integer & operator++()
Pre-increment.
Definition: integer.cpp:3733
AlgorithmParameters MakeParameters(const char *name, const T &value, bool throwIfNotUsed=true)
Create an object that implements NameValuePairs.
Definition: algparam.h:502
void swap(Integer &a)
Swaps this Integer with another Integer.
Definition: integer.cpp:3169
Integer()
Creates the zero integer.
Definition: integer.cpp:2958
unsigned int TrailingZeros(word32 v)
Determines the number of trailing 0-bits in a value.
Definition: misc.h:747
Integer & operator<<=(size_t n)
Left-shift Assignment.
Definition: integer.cpp:4020
bool IsZero() const
Determines if the Integer is 0.
Definition: integer.h:330
Exception thrown when an error is encountered decoding an OpenPGP integer.
Definition: integer.h:282
void Negate()
Reverse the Sign of the Integer.
Definition: integer.cpp:4327
int Compare(const Integer &a) const
Perform signed comparison.
Definition: integer.cpp:4345
T Crop(T value, size_t bits)
Truncates the value to the specified number of bits.
Definition: misc.h:806
void BERDecodeAsOctetString(BufferedTransformation &bt, size_t length)
Decode nonnegative value from big-endian octet string.
Definition: integer.cpp:3462
Application callback to signal suitability of a cabdidate prime.
Definition: nbtheory.h:113
void ConditionalSwapPointers(bool c, T &a, T &b)
Performs a branchless swap of pointers a and b if condition c is true.
Definition: misc.h:1237
static Integer Power2(size_t e)
Exponentiates to a power of 2.
Definition: integer.cpp:3079
Multiple precision integer with arithmetic operations.
Definition: integer.h:49
Precompiled header file.
virtual Element Exponentiate(const Element &a, const Integer &e) const
Raises a base to an exponent in the group.
Definition: algebra.cpp:316
const Integer & Multiply(const Integer &a, const Integer &b) const
Multiplies elements in the ring.
Definition: integer.cpp:4650
a signed value
Definition: integer.h:87
size_t OpenPGPEncode(byte *output, size_t bufferSize) const
Encode absolute value in OpenPGP format.
Definition: integer.cpp:3471
const Integer & Half(const Integer &a) const
Divides an element by 2.
Definition: integer.cpp:4518
static const Integer & Two()
Integer representing 2.
Definition: integer.cpp:4880
virtual lword Skip(lword skipMax=LWORD_MAX)
Discard skipMax bytes from the output buffer.
Definition: cryptlib.cpp:573
bool IsPowerOf2(const T &value)
Tests whether a value is a power of 2.
Definition: misc.h:889
const Integer & Square(const Integer &a) const
Square an element in the ring.
Definition: integer.cpp:4663
Integer & operator^=(const Integer &t)
Bitwise XOR Assignment.
Definition: integer.cpp:4079
#define MEMORY_BARRIER
A memory barrier.
Definition: misc.h:227
RandomNumberType
Properties of a random integer.
Definition: integer.h:91
const char * Seed()
ConstByteArrayParameter.
Definition: argnames.h:19
bool IsEven() const
Determines if the Integer is even parity.
Definition: integer.h:348
byte order is big-endian
Definition: cryptlib.h:147
const T & STDMIN(const T &a, const T &b)
Replacement function for std::min.
Definition: misc.h:535
String-based implementation of Store interface.
Definition: filters.h:1195
#define CRYPTOPP_ASSERT(exp)
Debugging and diagnostic assertion.
Definition: trap.h:69
static void Divide(Integer &r, Integer &q, const Integer &a, const Integer &d)
Extended Division.
Definition: integer.cpp:4182
ModularArithmetic(const Integer &modulus=Integer::One())
Construct a ModularArithmetic.
Definition: modarith.h:49
void BERDecodeError()
Raises a BERDecodeErr.
Definition: asn.h:69
Data structure used to store byte strings.
Definition: queue.h:18
Functions for CPU features and intrinsics.
Classes and functions for working with ANS.1 objects.
Classes for SHA-1 and SHA-2 family of message digests.
void SetBit(size_t n, bool value=1)
Set the n-th bit to value.
Definition: integer.cpp:3114
size_t GetWord16(word16 &value, ByteOrder order=BIG_ENDIAN_ORDER)
Retrieve a 16-bit word.
Definition: cryptlib.cpp:793
iterator begin()
Provides an iterator pointing to the first element in the memory block.
Definition: secblock.h:772
unsigned int BitCount() const
Determines the number of bits required to represent the Integer.
Definition: integer.cpp:3345
const char * PointerToPrimeSelector()
const PrimeSelector *
Definition: argnames.h:42
Implementation of BufferedTransformation&#39;s attachment interface.
Classes and functions for number theoretic operations.
byte GetByte(size_t i) const
Provides the i-th byte of the Integer.
Definition: integer.cpp:3128
Exception thrown when division by 0 is encountered.
Definition: integer.h:55
DER Sequence Encoder.
Definition: asn.h:319
void Encode(byte *output, size_t outputLen, Signedness sign=UNSIGNED) const
Encode in big-endian format.
Definition: integer.cpp:3410
Integer Squared() const
Multiply this integer by itself.
Definition: integer.h:609
T1 SaturatingSubtract1(const T1 &a, const T2 &b)
Performs a saturating subtract clamped at 1.
Definition: misc.h:989
virtual lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: cryptlib.cpp:504
Exception thrown when a random number cannot be found that satisfies the condition.
Definition: integer.h:63
bool GenerateRandomNoThrow(RandomNumberGenerator &rng, const NameValuePairs &params=g_nullNameValuePairs)
Generate a random number.
Definition: integer.cpp:3558
Performs modular arithmetic in Montgomery representation for increased speed.
Definition: modarith.h:274
DER General Encoder.
Definition: asn.h:291
bool HasSSE2()
Determines SSE2 availability.
Definition: cpu.h:116
Integer Minus(const Integer &b) const
Subtraction.
Definition: integer.cpp:3974
void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Definition: integer.cpp:3545
void Decode(const byte *input, size_t inputLen, Signedness sign=UNSIGNED)
Decode from big-endian byte array.
Definition: integer.cpp:3354
size_t PutWord16(word16 value, ByteOrder order=BIG_ENDIAN_ORDER, bool blocking=true)
Input a 16-bit word for processing.
Definition: cryptlib.cpp:745
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Definition: modarith.h:306
size_t DEREncodeOctetString(BufferedTransformation &bt, const byte *str, size_t strLen)
DER encode octet string.
Definition: asn.cpp:104
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Definition: integer.cpp:4626
Multiple precision integer with arithmetic operations.
Integer Xor(const Integer &t) const
Bitwise XOR.
Definition: integer.cpp:3826
static const Integer & Zero()
Integer representing 0.
Definition: integer.cpp:4856
const T & STDMAX(const T &a, const T &b)
Replacement function for std::max.
Definition: misc.h:546
Euclidean domain.
Definition: algebra.h:315
void Grow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:995
void BERDecode(const byte *input, size_t inputLen)
Decode from BER format.
Definition: integer.cpp:3439
std::string IntToString< Integer >(Integer value, unsigned int base)
Converts an Integer to a string.
Definition: integer.cpp:4715
virtual size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: cryptlib.cpp:527
Class file for performing modular arithmetic.
Crypto++ library namespace.
bool GetValue(const char *name, T &value) const
Get a named value.
Definition: cryptlib.h:350
Integer & operator>>=(size_t n)
Right-shift Assignment.
Definition: integer.cpp:4032
bool GetBit(size_t i) const
Provides the i-th bit of the Integer.
Definition: integer.cpp:3103
Integer SquareRoot() const
Extract square root.
Definition: integer.cpp:4363
BER General Decoder.
Definition: asn.h:258
bool GetIntValue(const char *name, int &value) const
Get a named value with type int.
Definition: cryptlib.h:386
bool IsConvertableToLong() const
Determines if the Integer is convertable to Long.
Definition: integer.cpp:2998
Object Identifier.
Definition: asn.h:166
size_t Get(byte &outByte)
Retrieve a 8-bit byte.
Definition: queue.cpp:301
Integer AbsoluteValue() const
Retrieve the absolute value of this integer.
Definition: integer.cpp:3162
Integer & operator-=(const Integer &t)
Subtraction Assignment.
Definition: integer.cpp:3997
unsigned int BitPrecision(const T &value)
Returns the number of bits required for a value.
Definition: misc.h:722
size_type size() const
Provides the count of elements in the SecBlock.
Definition: secblock.h:797
Integer & operator &=(const Integer &t)
Bitwise AND Assignment.
void DEREncode(BufferedTransformation &bt) const
Encodes in DER format.
Definition: integer.cpp:4500
lword GetBits(size_t i, size_t n) const
Provides the low order bits of the Integer.
Definition: integer.cpp:3146
Integer operator-() const
Subtraction.
Definition: integer.cpp:3155
the value is positive or 0
Definition: integer.h:75
const Integer & MultiplicativeInverse(const Integer &a) const
Calculate the multiplicative inverse of an element in the ring.
Definition: integer.cpp:4689
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Definition: modarith.h:309
bool IsOdd() const
Determines if the Integer is odd parity.
Definition: integer.h:351
virtual void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the Ring.
Definition: algebra.cpp:334
Interface for retrieving values given their names.
Definition: cryptlib.h:293
Integer ConvertOut(const Integer &a) const
Reduces an element in the congruence class.
Definition: integer.cpp:4676