Integer Patch

From Crypto++ Wiki

The Integer Patch adds little-endian support and output base awareness to the Integer class. The endian support enables the Integer class to construct Integers from strings and byte arrays in little-endian format. It is useful for algorithms like Poly1305 and for MS CAPI interop, where many parameters are provided in little-endian format.

The base support allows Integers to honor the std::ios_base::showbase flag, which is set with std::showbase and cleared with std::noshowbase. Standard C++ streams do not set std::ios_base::showbase by default, so once the feature is enabled the Integer class will not emit the suffix indicating the base unless std::showbase is in effect. The suffixes are b, o, h, or . (the last is for decimal).

An optional lookup table was also added for parsing ASCII strings. The lookup table avoids 4 if/then/else's and 6 compares. It should be useful on processors like ARM in Thumb mode because it avoids the branching (and subsequent stalls) at the expense of 256 bytes.

There is a test source file named integer-test.c++ that allows testing of the little-endian conversion routines.

Note well: this is not part of the Crypto++ library. You must download and install the patch below.

Patch

The changes the patch makes can be found in integer.diff. The essence of the patch is to add an additional ByteOrder parameter to select Integer constructors, and then parse the incoming array in little endian format if LITTLE_ENDIAN_ORDER is specified. A default ByteOrder parameter of BIG_ENDIAN_ORDER is used, so existing code works as expected. The patch also adds awareness for ostream's std::showbase and std::noshowbase.

The files that changed are:

  • config.h
  • integer.h integer.cpp
  • integer-test.c++

Little Endian Integers

For little endian support, the constructors that changed (with their new signatures) are:

  • explicit Integer(const char *str, ByteOrder order = BIG_ENDIAN_ORDER)
  • explicit Integer(const wchar_t *str, ByteOrder order = BIG_ENDIAN_ORDER)
  • Integer(const byte *encodedInteger, size_t byteCount, Signedness s=UNSIGNED, ByteOrder o=BIG_ENDIAN_ORDER)
  • Integer(BufferedTransformation &bt, size_t byteCount, Signedness s=UNSIGNED, ByteOrder o=BIG_ENDIAN_ORDER)

The little-endian functionality was not extended to the various Decode (and Encode) functions.
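To illustrate what the byte-array constructors do with the extra parameter, here is a minimal standalone sketch. It does not use Crypto++ at all: the Order enum and the decode_unsigned function are hypothetical names invented for this example, and the value is limited to 64 bits, whereas Integer is arbitrary precision. With the patch applied, the equivalent library call would presumably pass LITTLE_ENDIAN_ORDER as the fourth constructor argument shown in the signatures above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the library's ByteOrder enum (illustration only).
enum class Order { BigEndian, LittleEndian };

// Decode up to 8 unsigned bytes into a 64-bit value.
// BigEndian:    buf[0] is the most significant byte.
// LittleEndian: buf[0] is the least significant byte.
// Bytes beyond the first 8 are ignored in this sketch.
std::uint64_t decode_unsigned(const std::uint8_t* buf, std::size_t len,
                              Order order = Order::BigEndian)
{
    const std::size_t n = (len < 8) ? len : 8;
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Visit the bytes from most significant to least significant.
        const std::size_t idx = (order == Order::BigEndian) ? i : (n - 1 - i);
        value = (value << 8) | buf[idx];
    }
    return value;
}
```

For the bytes 01 23 45 67, the big-endian reading is 0x01234567 and the little-endian reading is 0x67452301.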

For binary, octal and decimal, the incoming string is simply parsed in reverse. So 1234567890₁₀/LE converts to 0987654321₁₀/BE. The tricky part of little-endian support is handling an odd number of hexadecimal digits or nibbles (a nibble is a 4-bit chunk). In an ideal world, we would always encounter digits or nibbles in pairs, and there would never be a single digit or nibble left over, because an octet in hexadecimal needs two of them.
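The reverse parse for decimal can be sketched as follows. This is a standalone illustration rather than the patch's code: parse_decimal_le is a hypothetical name, the result is limited to 64 bits, and the input is assumed to contain only the characters 0 through 9.

```cpp
#include <cstdint>
#include <string>

// Parse an unsigned decimal string whose digits are stored in
// little-endian order (least significant digit first).
// Assumes only '0'..'9' and a value that fits in 64 bits;
// no validation is performed.
std::uint64_t parse_decimal_le(const std::string& digits)
{
    // Reverse the string so the usual big-endian loop can be reused.
    const std::string reversed(digits.rbegin(), digits.rend());
    std::uint64_t value = 0;
    for (char ch : reversed)
        value = value * 10 + static_cast<std::uint64_t>(ch - '0');
    return value;
}
```

With this sketch, the little-endian string 1234567890 parses to 987654321, which is 0987654321 without the leading zero.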

For example, 0x1FF₁₆/BE is three nibbles. In big endian, they form the two octets 0x01 and 0xFF. To ensure consistent results with the little-endian string 0xFF1₁₆/LE, an odd nibble is shifted down to account for the missing nibble. That is, 0xFF1₁₆/LE is interpreted as 0xFF and 0x01, and not as 0xFF and 0x10. If 0xFF1₁₆/LE were interpreted as 0xFF and 0x10, then that would break the most basic case of 0x1. That is, something that should intuitively be 1 (0x1₁₆/LE) would be interpreted as 10₁₆ or 16₁₀.
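The odd-nibble rule can be sketched in a few lines. The code below is a standalone illustration, not the patch itself: hex_value and parse_hex_le are hypothetical names, the input carries no 0x prefix, the result is limited to 64 bits, and invalid characters are treated as 0 rather than rejected.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Value of one hexadecimal digit; returns 0 for anything else.
// (A real parser would reject invalid characters instead.)
static unsigned hex_value(char ch)
{
    static const std::string digits = "0123456789abcdef";
    const char lower =
        static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    const std::size_t pos = digits.find(lower);
    return pos == std::string::npos ? 0u : static_cast<unsigned>(pos);
}

// Parse a hexadecimal string whose octets are in little-endian order.
// Digits are consumed two at a time; the first octet read is the least
// significant. A lone trailing nibble is treated as the LOW nibble of
// its octet, so "FF1" yields the octets 0xFF and 0x01 (not 0x10).
std::uint64_t parse_hex_le(const std::string& hex)
{
    std::uint64_t value = 0;
    unsigned shift = 0;
    for (std::size_t i = 0; i < hex.size() && shift < 64; shift += 8) {
        unsigned octet;
        if (i + 1 < hex.size()) {
            octet = (hex_value(hex[i]) << 4) | hex_value(hex[i + 1]);
            i += 2;
        } else {
            octet = hex_value(hex[i]);  // odd nibble: shifted down
            i += 1;
        }
        value |= static_cast<std::uint64_t>(octet) << shift;
    }
    return value;
}
```

Under these assumptions, FF1 parses to 0x1FF and a lone 1 parses to 1, so the little-endian results agree with the big-endian reading of 1FF and 1.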

std::showbase and Suffixes

By default, Crypto++ always applies a base suffix to its output and there is no way to control it. If you want Crypto++ Integers to honor std::showbase and std::noshowbase, then uncomment the define CRYPTOPP_USE_STD_SHOWBASE in config.h. The define is already present, and it just needs to be uncommented.

C++ I/O streams don't use std::showbase by default, so the suffixes that Crypto++ normally applies will be suppressed without further action. If you want to show the suffixes, then enable std::showbase by doing something similar to the following:

Integer n(32 + 15);

cout.setf(std::ios::showbase);
cout << std::oct << n << endl;
cout << std::dec << n << endl;
cout << std::hex << n << endl;

cout.unsetf(std::ios::showbase);
cout << std::oct << n << endl;
cout << std::dec << n << endl;
cout << std::hex << n << endl;

cout << std::showbase << endl;
cout << std::oct << n << endl;
cout << std::dec << n << endl;
cout << std::hex << n << endl;

cout << std::noshowbase << endl;
cout << std::oct << n << endl;
cout << std::dec << n << endl;
cout << std::hex << n << endl;

It will produce output similar to:

$ ./integer-test.exe
57o
47.
2fh

57
47
2f

57o
47.
2fh

57
47
2f
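For readers curious how a stream inserter can honor the flag, here is a minimal standalone sketch. It is not the patch's implementation: SuffixedValue is a hypothetical wrapper type around an unsigned long, and the b suffix is omitted because standard streams have no binary base flag.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper type used only for this illustration.
struct SuffixedValue { unsigned long value; };

std::ostream& operator<<(std::ostream& os, const SuffixedValue& v)
{
    const std::ios_base::fmtflags flags = os.flags();
    const std::ios_base::fmtflags base = flags & std::ios_base::basefield;

    // Pick the suffix from the stream's current base.
    char suffix = '.';
    if (base == std::ios_base::hex)
        suffix = 'h';
    else if (base == std::ios_base::oct)
        suffix = 'o';

    // Format the digits in a scratch stream with showbase cleared so the
    // standard "0x" and "0" prefixes never appear.
    std::ostringstream digits;
    digits.flags(flags & ~std::ios_base::showbase);
    digits << v.value;

    os << digits.str();
    if ((flags & std::ios_base::showbase) != 0)
        os << suffix;  // emitted only when std::showbase is in effect
    return os;
}
```

Writing SuffixedValue{47} with std::showbase and std::hex yields 2fh, and with std::noshowbase it yields 2f.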

ASCII Lookup Table

An optional lookup table was added for parsing ASCII strings. If you want to use the lookup table, then uncomment the define CRYPTOPP_USE_ASCII_CHAR_VALUE_LOOKUP_TABLE in config.h. The define is already present, and it just needs to be uncommented. The lookup table avoids 4 if/then/else's and 6 compares. It should be useful on processors like ARM in Thumb mode because it avoids the branching (and subsequent stalls) at the expense of 256 bytes.

The unused values in the table are set to 46, which is the period ('.') character. It's just a filler that can be seen under a debugger. Any value greater than 16 could have been used.
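A table of this kind might look like the sketch below. It is an illustration under stated assumptions, not the table from the patch: kFiller, make_digit_table and digit_value are hypothetical names, the table is built at first use rather than written out as a literal, and the code assumes an ASCII execution character set.

```cpp
#include <array>
#include <cstddef>

// Filler for unused entries: 46 is the period ('.') in ASCII.
constexpr unsigned char kFiller = 46;

// Build the 256-entry table. Assumes ASCII, where 'A'..'F' and
// 'a'..'f' are contiguous.
static std::array<unsigned char, 256> make_digit_table()
{
    std::array<unsigned char, 256> table;
    table.fill(kFiller);
    for (unsigned i = 0; i < 10; ++i)
        table['0' + i] = static_cast<unsigned char>(i);
    for (unsigned i = 0; i < 6; ++i) {
        table['A' + i] = static_cast<unsigned char>(10 + i);
        table['a' + i] = static_cast<unsigned char>(10 + i);
    }
    return table;
}

// Returns 0..15 for a hexadecimal digit, or kFiller for any other
// character. One indexed load replaces the chain of range comparisons.
unsigned char digit_value(char ch)
{
    static const std::array<unsigned char, 256> table = make_digit_table();
    return table[static_cast<unsigned char>(ch)];
}
```

A caller can tell a valid digit from the filler by checking that the returned value is less than the base being parsed.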

Do not use the table on a system that uses EBCDIC as the execution or runtime encoding. EBCDIC character encodings differ from ASCII, and the table won't produce the expected results. For example, in ASCII A is 65₁₀, while in EBCDIC A is 193₁₀. And in EBCDIC lower case letters precede capital letters, while in ASCII capital letters precede lower case letters.

In a morbid sort of humorous way, the C and C++ standards don't guarantee the letters A through F or a through f are contiguous (only the characters 0 through 9). So tests like the following could fail on obscure systems. If you find such a system, then please report it :)

int digit;
char ch = str[idx];

if(ch >= 'A' && ch <= 'F')
    digit = ch - 'A' + 10;
...
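If contiguity of the letters is a concern, one possible alternative is to look the character up by position in a string literal instead of doing range arithmetic. The sketch below is only an illustration of that idea and is not part of the patch; hex_digit_value is a hypothetical name.

```cpp
#include <cstring>

// Returns the value of a hexadecimal digit, or -1 if ch is not one.
// The character is located by position in a literal, so the result does
// not depend on 'A'..'F' or 'a'..'f' being contiguous in the execution
// character set.
int hex_digit_value(char ch)
{
    static const char upper[] = "0123456789ABCDEF";
    static const char lower[] = "0123456789abcdef";

    if (ch == '\0')
        return -1;  // strchr would otherwise match the terminator

    if (const char* p = std::strchr(upper, ch))
        return static_cast<int>(p - upper);
    if (const char* p = std::strchr(lower, ch))
        return static_cast<int>(p - lower);
    return -1;
}
```

The trade-off is a short linear search per character instead of a couple of comparisons, which is unlikely to matter outside of tight parsing loops.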

Testing

To test the patch, drop integer-test.c++ in the cryptopp directory, and then compile it:

$ g++ -DDEBUG=1 -g3 -Os -Wall -Wextra \
  -I. integer-test.c++ ./libcryptopp.a -o integer-test.exe

A typical output line is simply the name of the test (for example, H-1 or Hex-1), followed by the big-endian value(s) and little-endian value(s) of the constructed integer. All the values on a line should be the same. (See the sample output below for the hexadecimal tests.)

Issue the following to determine if there are failures. The grep -B 1 prints the FAILED message along with the line before it, which holds the test name and the values that failed. A failure would look similar to below:

$ ./integer-test.exe | grep -B 1 FAILED
...
H-XXX: 01, 01, 10, 10
FAILED

Sample output for the hexadecimal tests:

$ ./integer-test.exe
H-1a: 0, 0, 0, 0
H-2a: 0, 0, 0, 0
H-3a: 0, 0, 0, 0
H-1b: 0, 0, 0, 0
H-2b: 0, 0, 0, 0
H-3b: 0, 0, 0, 0
H-4a: 1, 1, 1, 1
H-5a: 1, 1, 1, 1
H-4b: -1, -1, -1, -1
H-5b: -1, -1, -1, -1
H-6a: 1, 1, 1, 1
H-7a: 1, 1, 1, 1
H-8a: 1, 1, 1, 1
H-9a: 1, 1, 1, 1
H-6b: -1, -1, -1, -1
H-7b: -1, -1, -1, -1
H-8b: -1, -1, -1, -1
H-9b: -1, -1, -1, -1
H-10: 1, 1, 1, 1
H-11: 123, 123, 123, 123
H-12: 12345, 12345, 12345, 12345
H-13: 1234567, 1234567, 1234567, 1234567
H-14: 123456789, 123456789, 123456789, 123456789
H-15: 123456789ab, 123456789ab, 123456789ab, 123456789ab
H-16: 123456789ab, 123456789ab, 123456789ab, 123456789ab
H-17: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
H-18: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
H-19: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
H-20: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
H-21: 1, 1, 1, 1
H-22: 123, 123, 123, 123
H-23: 12345, 12345, 12345, 12345
H-24: 1234567, 1234567, 1234567, 1234567
H-25: 123456789, 123456789, 123456789, 123456789
H-26: 123456789ab, 123456789ab, 123456789ab, 123456789ab
H-27: 123456789ab, 123456789ab, 123456789ab, 123456789ab
H-28: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
H-29: 123456789abcd, 123456789abcd, 123456789abcd, 123456789abcd
H-30: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
H-31: 123456789abcdef, 123456789abcdef, 123456789abcdef, 123456789abcdef
H-32: 1, 1, 1, 1
H-33: 2301, 2301, 2301, 2301
H-34: 452301, 452301, 452301, 452301
H-35: 67452301, 67452301, 67452301, 67452301
H-36: 8967452301, 8967452301, 8967452301, 8967452301
H-37: ab8967452301, ab8967452301, ab8967452301, ab8967452301
H-38: ab8967452301, ab8967452301, ab8967452301, ab8967452301
H-39: cdab8967452301, cdab8967452301, cdab8967452301, cdab8967452301
H-40: cdab8967452301, cdab8967452301, cdab8967452301, cdab8967452301
H-41: efcdab8967452301, efcdab8967452301, efcdab8967452301, efcdab8967452301
H-41: efcdab8967452301, efcdab8967452301, efcdab8967452301, efcdab8967452301
H-42: 1ff, 1ff, 1ff, 1ff
H-43: ff01, ff01, ff01, ff01
H-44: 1ff, 1ff, 1ff, 1ff
H-45: 0, 0, 0, 0
H-46: 0, 0, 0, 0
...

Downloads

cryptopp-integer.zip - Patch for little-endian support and output base awareness to the Integer class.