RDRAND and RDSEED

Documentation
#include <cryptopp/rdrand.h>

RDRAND is a class that provides access to the rdrand instruction on Intel and AMD processors. RDSEED is a similar class for the rdseed instruction. The classes access a hardware random number generator provided on-die with some IA-32 CPUs. Both RDRAND and RDSEED are declared in the header file rdrand.h.

AMD and Intel each provide both RDRAND and RDSEED. RDRAND was provided with Intel's Ivy Bridge processors, while RDSEED made its debut with Broadwell. AMD added RDRAND in Bulldozer v4, and RDSEED in Ryzen. Intel's RDRAND circuit provides random numbers that satisfy NIST SP800-90A, while RDSEED provides random numbers that satisfy NIST SP800-90B and SP800-90C. It's not clear if AMD processors generate values according to a particular standard.

The library provides unconditional support regardless of compiler or intrinsics availability. GCC ASM and MASM/MASM64 assembly language routines are provided to ensure the library can call the instruction if it's available. In addition, intrinsic support is available if desired, but the compiler must support it and the user must enable it.

Be aware that some versions of GCC produced incorrect code for RDSEED; see GCC Issue 80180, Incorrect codegen from rdseed intrinsic use. Also be aware that RDRAND on some AMD processors stops producing random numbers after being suspended; see Crypto++ Issue 924, rdrand instruction fails after resume on AMD family 22 CPU and rdrand instruction fails after resume on AMD CPU.

The project takes no position on the suitability of RDRAND or RDSEED as a generator. If you are concerned about the Bullrun program and possible tampering, then you can avoid the generators in their entirety. Or, you can use the generators as a source and extract their entropy using a class like HKDF. Or, you can combine the output of RDRAND or RDSEED with another generator using xorbuf from misc.h. Or, you can use the generators directly. It all depends on your risk tolerance and comfort zone.

RDRAND and RDSEED were added to the library in Crypto++ 5.6.3, and support was back-ported to the Visual Studio 2005 solution files.

Classes

rdrand.h provides class files for both RDRAND and RDSEED. The classes are nearly identical. Intel's RDRAND is designed to never underflow. As of Crypto++ 6.0, each generator provides one constructor that takes no arguments. The constructor will throw a RDRAND_Err or RDSEED_Err if the hardware is not available. Be sure to use HasRDRAND or HasRDSEED before creating one. Also see Sample Program below for an example.

Crypto++ 5.6.5 and earlier provided constructors that accepted a number of retries. The retry count was a safety valve in case of bad silicon, and it was recommended by Intel. The retry parameter was removed from Crypto++ 6.0 for four reasons. First, we require good silicon. There's nothing we can really do if the processor is bad. Second, the bookkeeping overhead made the generator run 30% slower. Third, an unconditional retry was a better strategy since the user wants the random bytes without guessing at the number of retries needed. Finally, RDRAND did not underflow and it was hard to guess how many retries were needed for RDSEED.
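The unconditional retry strategy can be sketched as follows. This is a minimal illustration and not the library's code: StepFn and GenerateUnconditional are hypothetical names, and StepFn stands in for a single attempt at the instruction that reports success or underflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

// Hypothetical stand-in for one RDRAND/RDSEED attempt. It returns true
// and writes a word when the hardware reports success.
typedef std::function<bool(uint64_t&)> StepFn;

// Retry each word until it succeeds, then copy it into the output.
// No retry counter is maintained, so the caller always receives exactly
// 'size' bytes (and the loop never ends if the hardware never succeeds).
inline void GenerateUnconditional(const StepFn& step, unsigned char* output, size_t size)
{
    assert(output != NULL || size == 0);
    while (size > 0)
    {
        uint64_t word = 0;
        while (!step(word)) { /* underflow, try again */ }

        // Copy a whole word, or only the bytes still needed at the end
        const size_t n = size < sizeof(word) ? size : sizeof(word);
        std::memcpy(output, &word, n);
        output += n; size -= n;
    }
}
```

The caller never has to pick a retry count, which is the third reason given above. The cost is that a processor which never succeeds would stall the caller, which is why the first reason (good silicon is required) matters.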

After Crypto++ added RNG benchmarks, we also learned the latencies of RDRAND and RDSEED were more important than the number of retries. Even if the library retried a "generate" immediately, the one-cycle jump back to the "generate" instruction was dwarfed by, say, a 10-cycle latency for the RDRAND instruction or a 30-cycle latency for the RDSEED instruction.

Constructor

RDRAND ()

RDSEED ()

Each generator provides one constructor that takes no arguments. The constructor will throw a RDRAND_Err or RDSEED_Err if the hardware is not available.

You can test for the presence of CPU support by calling HasRDRAND() or HasRDSEED(). See the example below at Sample Program.

Seeding

The RDRAND and RDSEED generators do not accept a seed. If you call CanIncorporateEntropy, then it will return false. If you call IncorporateEntropy, then the generator will silently ignore the request.

Generating Bytes

Use GenerateBlock to retrieve a block of bytes. If the instruction is not available, then executing it will result in a SIGILL. The generator does not throw an exception.

You can test for the presence of CPU support by calling HasRDRAND() or HasRDSEED(). See the example below at Sample Program.

Discarding Bytes

The RDRAND and RDSEED generators will discard bytes if requested. The implementation rounds up the number of bytes to machine words and then discards the equivalent number of machine words.
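The rounding described above can be sketched as a small helper. WordsToDiscard is a hypothetical name used only for illustration, and the word size is passed in as a parameter because it differs between 32-bit and 64-bit builds.

```cpp
#include <cassert>
#include <cstddef>

// Number of whole machine words consumed when discarding 'n' bytes,
// assuming the request is rounded up to a multiple of the word size.
inline size_t WordsToDiscard(size_t n, size_t wordSize = sizeof(size_t))
{
    assert(wordSize != 0);
    // Written without (n + wordSize - 1) to avoid overflow for large n
    return n / wordSize + (n % wordSize != 0 ? 1 : 0);
}
```

With an 8-byte word, a request to discard 9 bytes would consume two words, or 16 bytes of generator output.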

If you are experimenting with RDRAND and RDSEED and you want to discard actual bytes (and not machine words), then you will need to modify the Crypto++ sources.

Exceptions

The RDRAND class will throw a RDRAND_Err exception, while the RDSEED class will throw a RDSEED_Err. They only throw an exception when a suitable implementation cannot be located at compile time.

At runtime the availability must be checked with either HasRDRAND() or HasRDSEED(). The generator will throw an exception during construction if the generators are not available.

If you call GenerateBlock and a generator is not available, then a SIGILL will result.
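For background, processors advertise the two instructions through CPUID feature bits. The sketch below only tests those bits on register values the caller has already obtained from CPUID. The function names are hypothetical, and this is not the library's HasRDRAND() or HasRDSEED() implementation, which may apply further rules such as the AMD exclusions noted under Performance.

```cpp
#include <cassert>
#include <cstdint>

// CPUID feature bits documented by Intel and AMD:
//   RDRAND: leaf 1, ECX bit 30
//   RDSEED: leaf 7 (sub-leaf 0), EBX bit 18

// 'leaf1_ecx' is the ECX value returned by CPUID with EAX=1
inline bool CpuidReportsRDRAND(uint32_t leaf1_ecx)
{
    return ((leaf1_ecx >> 30) & 1u) != 0;
}

// 'leaf7_ebx' is the EBX value returned by CPUID with EAX=7, ECX=0
inline bool CpuidReportsRDSEED(uint32_t leaf7_ebx)
{
    return ((leaf7_ebx >> 18) & 1u) != 0;
}
```

In application code, prefer the library's HasRDRAND() and HasRDSEED(), as shown in the Sample Program section.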

Vendor Support

Intel provided the RDRAND circuit in late 2012, while AMD provided equivalent support in June 2015. According to the AMD Programmers Manual, AMD provides both RDRAND and RDSEED circuits. It is not clear if AMD's processors provide FIPS 140-2 support.

The compilers that support the instructions are:

  • Clang added RDRAND in July 2012, Clang 3.2
  • GCC added RDRAND in December 2010, GCC 4.6
  • Intel added RDRAND in September 2011, ICC 12.1
  • Microsoft added RDRAND in August 2012, VS2012
  • Microsoft added RDSEED in November 2013, VS2013
  • Sun added RDRAND in November 2014, Sun Studio 12.4
  • Sun added RDSEED in June 2016, Sun Studio 12.5

If you know of a compiler that supports the instruction but is missing, then please discuss it on the mailing list.

ASM vs Intrinsics

The source files allow either an ASM implementation or Intrinsics. The ASM is more flexible because it does not require the compiler to support the rdrand or rdseed instructions. The ASM also sidesteps occasional problems like GCC Issue 80180, Incorrect codegen from rdseed intrinsic use. The intrinsics are available but must be manually enabled.

The library's assembly code is usually a little faster than the intrinsics. That's because the assembly code generates four machine-word blocks, and then reduces to single machine-words for tail bytes. The 4-word blocks save about a dozen compares and jumps, and it provides about an 8% increase in performance. For example, a RDRAND generator which nominally runs at 198 MiB/s would increase to about 215 MiB/s.
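The blocking strategy can be illustrated in portable C++. This is a sketch of the idea only and not the library's assembly: FillBlocked is a hypothetical name, and the next parameter stands in for one successful read of the instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fill 'output' using 4-word blocks first, then single words, then a
// partial word for the tail bytes. 'next' is any callable that returns
// a uint64_t.
template <class NextWord>
inline void FillBlocked(NextWord next, unsigned char* output, size_t size)
{
    assert(output != NULL || size == 0);
    const size_t W = sizeof(uint64_t);

    // Main loop: four words per iteration, so one loop test per 32 bytes
    while (size >= 4*W)
    {
        uint64_t block[4];
        block[0] = next(); block[1] = next();
        block[2] = next(); block[3] = next();
        std::memcpy(output, block, 4*W);
        output += 4*W; size -= 4*W;
    }

    // Remaining whole words, one at a time
    while (size >= W)
    {
        const uint64_t word = next();
        std::memcpy(output, &word, W);
        output += W; size -= W;
    }

    // Tail: take one more word and keep only the bytes still needed
    if (size > 0)
    {
        const uint64_t word = next();
        std::memcpy(output, &word, size);
    }
}
```

A 45-byte request, for example, is served by one 4-word block, one single word and one partial word, so the main loop is tested far less often than a word-at-a-time loop would be.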

CPU Opcodes

Earlier it was stated ... the ASM source files have the opcodes hard coded into the .CODE section. Here are the relevant opcodes from the MASM/MASM64 sources:

Call_RDRAND_EAX:
    DB 0Fh, 0C7h, 0F0h

Call_RDRAND_RAX:
    DB 048h, 0Fh, 0C7h, 0F0h

Call_RDSEED_EAX:
    DB 0Fh, 0C7h, 0F8h

Call_RDSEED_RAX:
    DB 048h, 0Fh, 0C7h, 0F8h
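
The byte sequences above follow the documented encodings 0F C7 /6 for RDRAND and 0F C7 /7 for RDSEED, with an optional 48h (REX.W) prefix selecting the 64-bit register. The sketch below decodes just those forms by hand. DecodeRdInsn and its types are hypothetical names for illustration, and the decoder deliberately ignores every other prefix and operand form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum Insn { INSN_UNKNOWN, INSN_RDRAND, INSN_RDSEED };

struct Decoded
{
    Insn insn;      // which instruction, if recognized
    bool wide;      // true when the 48h (REX.W) prefix is present
    unsigned reg;   // destination register number, 0 = EAX/RAX
};

// Decode the register form of RDRAND or RDSEED from 'n' bytes at 'p'
inline Decoded DecodeRdInsn(const uint8_t* p, size_t n)
{
    assert(p != NULL);
    Decoded d = { INSN_UNKNOWN, false, 0 };

    if (n > 0 && p[0] == 0x48) { d.wide = true; ++p; --n; }
    if (n != 3 || p[0] != 0x0F || p[1] != 0xC7) return d;

    const uint8_t modrm = p[2];
    if ((modrm >> 6) != 3) return d;          // mod must be 11b (register operand)

    d.reg = modrm & 7;                        // r/m field selects the register
    const unsigned ext = (modrm >> 3) & 7;    // reg field is the opcode extension
    d.insn = (ext == 6) ? INSN_RDRAND : (ext == 7) ? INSN_RDSEED : INSN_UNKNOWN;
    return d;
}
```

Reading the last byte this way, F0h is mod 11b, extension 6, register 0 (RDRAND into EAX or RAX), and F8h is mod 11b, extension 7, register 0 (RDSEED into EAX or RAX).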

You can cross check the opcodes using an assembler like YASM. Simply view the listing file created with the following program:

; rdseed.asm:   a RDSEED program for NASM and YASM
;
; assemble:     nasm -f {win32|win64|elf} -l rdseed.lst rdseed.asm -o rdseed.o

        SECTION .text           ; code section
        global main             ; make label available to linker
main:                           ; standard entry point

        rdseed  eax             ; use "rdseed rax" for the 64-bit form

        mov     ebx,0           ; exit code, 0=normal
        mov     eax,1           ; exit command to kernel
        int     0x80            ; interrupt 80 hex, call kernel

Performance

The throughput of the RDRAND and RDSEED generators varies wildly depending on processor family, CPU sub-architecture and processor manufacturer. Additionally, RDSEED appears to run from 1/2 to 1/5 the rate of RDRAND on Intel hardware. Below is a comparison of data gathered using the Crypto++ benchmark program. Results were cross-validated with Jack Lloyd's Botan.

The benchmark program is basic, and it only uses a single thread running on a single core. Performance can be easily improved by spinning up additional pthreads to perform work on available cores.

| Processor | RDRAND MiB/s | RDRAND Cycles/Byte | RDSEED MiB/s | RDSEED Cycles/Byte | Comment |
|---|---|---|---|---|---|
| Athlon 845 X4 | 1 | 4119 | - | - | AMD Bulldozer v4 @ 3.5 GHz |
| AMD A6 | 2 | 1016 | - | - | AMD A6-9220 @ 1.6 GHz |
| Ryzen 7 1700X | 11 | 282 | 11 | 283 | AMD Ryzen 7 @ 3.4 GHz |
| Celeron J3455 | 6 | 251 | 3 | 419 | Low end Celeron @ 1.5 GHz |
| Atom Z3735 | 9 | 145 | - | - | Low end Atom @ 1.3 GHz |
| Core i5-3200 | 212 | 11 | - | - | Ivy Bridge (3rd gen) @ 2.6 GHz |
| Xeon E5-2666 | 87 | 32 | - | - | Haswell (4th gen) @ 2.9 GHz |
| Core i7-4980 | 78 | 34 | - | - | Haswell (4th gen) @ 2.8 GHz |
| Core i5-5300 | 67 | 39 | 15 | 150 | Broadwell (5th gen) @ 2.3 GHz |
| Core i5-6400 | 66 | 48 | 25 | 121 | Skylake (6th gen) @ 2.7 GHz |
| Core XX-7xxx | ? | ? | ? | ? | Kabylake (7th gen) @ x.x GHz |
| Core i7-8700 | 71 | 43.2 | 24 | 125.6 | Coffee lake (8th gen) @ 3.2 GHz |

RDRAND and RDSEED were disabled for some AMD processors at Commit 95d8f2abfa36. Also see Issue 924, Disable RDRAND on AMD cpu's with family 15h or 16h.
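
The two units in the performance table are related through the clock rate. The helper below is a rough sketch of that conversion, with CyclesPerByte being a hypothetical name. The table values come from the benchmark program, so they will not match this estimate exactly.

```cpp
#include <cassert>
#include <cmath>

// Approximate cycles per byte from throughput (MiB/s) and clock rate (GHz),
// assuming the processor runs at the stated clock for the whole measurement.
inline double CyclesPerByte(double mibPerSec, double clockGHz)
{
    assert(mibPerSec > 0.0);
    const double bytesPerSec = mibPerSec * 1024.0 * 1024.0;
    return (clockGHz * 1e9) / bytesPerSec;
}
```

For the Core i7-8700 row, 71 MiB/s at 3.2 GHz works out to roughly 43 cycles per byte by this estimate, close to the 43.2 listed in the table.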

Sample Program

The first example program guards the use of a RDRAND generator. It also uses member_ptr from smartptr.h to avoid warnings (auto_ptr) and missing classes (unique_ptr) among C++03 and C++11.

member_ptr<RandomNumberGenerator> prng(HasRDRAND() ? new RDRAND : new AutoSeededRandomPool);
SecByteBlock key(AES::DEFAULT_KEYLENGTH), iv(AES::BLOCKSIZE);

prng->GenerateBlock(key, key.size());
prng->GenerateBlock(iv, iv.size());

You can avoid member_ptr and HasRDRAND using code similar to the following.

RandomNumberGenerator* prng;

try {
    // May fail if RDRAND is not available
    prng = new RDRAND;
}
catch (const RDRAND_Err&) {
    // Should never fail, always available
    prng = new AutoSeededRandomPool;
}

prng->GenerateBlock(...);
...

delete prng;

The second example shows how you could XOR a RDRAND generator with another generator.

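#include <cryptopp/cryptlib.h>
#include <cryptopp/rdrand.h>
#include <cryptopp/osrng.h>
#include <cryptopp/filters.h>
#include <cryptopp/files.h>
#include <cryptopp/hex.h>
#include <iostream>

using namespace CryptoPP;

class CombinedRNG : public RandomNumberGenerator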
{
public:
    CombinedRNG(RandomNumberGenerator& rng1, RandomNumberGenerator& rng2)
        : m_rng1(rng1), m_rng2(rng2) {}

    bool CanIncorporateEntropy () const
    {
        return m_rng1.CanIncorporateEntropy() ||
            m_rng2.CanIncorporateEntropy();
    }

    void IncorporateEntropy (const byte *input, size_t length)
    {
        if (m_rng1.CanIncorporateEntropy())
            m_rng1.IncorporateEntropy(input, length);
        if (m_rng2.CanIncorporateEntropy())
            m_rng2.IncorporateEntropy(input, length);
    }

    void GenerateBlock (byte *output, size_t size)
    {
        RandomNumberSource(m_rng1, size, true, new ArraySink(output, size));
        RandomNumberSource(m_rng2, size, true, new ArrayXorSink(output, size));
    }

private:
    RandomNumberGenerator &m_rng1, &m_rng2;
};

int main (int argc, char* argv[])
{
    RDRAND rdrand;
    AutoSeededRandomPool rpool;
    CombinedRNG prng(rdrand, rpool);

    RandomNumberSource src(prng, 32, true, new HexEncoder(new FileSink(std::cout)));
    std::cout << std::endl;

    return 0;
}

The final example shows how you could extract entropy from a RDRAND generator, and use it with a key derivation function.

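#include <cryptopp/rdrand.h>
#include <cryptopp/secblock.h>
#include <cryptopp/aes.h>
#include <cryptopp/sha.h>
#include <cryptopp/hkdf.h>
#include <cryptopp/filters.h>
#include <cryptopp/files.h>
#include <cryptopp/hex.h>
#include <iostream>

using namespace CryptoPP;

int main (int argc, char **argv)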
{
    SecByteBlock key(AES::DEFAULT_KEYLENGTH);

    RDRAND rdrand;
    rdrand.GenerateBlock(key, key.size());

    std::cout << "Pre-extraction:" << std::endl;
    StringSource(key, key.size(), true, new HexEncoder(new FileSink(std::cout)));
    std::cout << std::endl;

    HKDF<SHA256> kdf;
    kdf.DeriveKey(key, key.size(), key, key.size());

    std::cout << "Post-extraction:" << std::endl;
    StringSource(key, key.size(), true, new HexEncoder(new FileSink(std::cout)));
    std::cout << std::endl;

    return 0;
}

The final example produces output similar to below.

$ ./test.exe
Pre-extraction:
651CEA46CE5E469AFCF79BE2F67DEB0C
Post-extraction:
AAAEA9FF9D0A83A1E7573391474B98AB

Downloads

RDRAND.zip - Class files and ASM files for RDRAND and RDSEED. The files can be used with earlier versions of Crypto++, like 5.6.2.