ARM (Command Line)

Crypto++ supports ARM platforms, including Linux, iOS, Windows Phone and Windows Store. Starting around 5.6.3, additional support was added for ARM, which included ARM NEON and ARMv8 CRC and Crypto extensions. BLAKE2 was the first class to receive the additional ARM support. Additional classes include GCM using NEON's 64x64 → 128-bit multiplier.

This wiki page is dedicated to performing native builds for IoT gadgets and selecting appropriate ARM options. IoT gadgets include BeagleBones, CubieTrucks, Banana Pis, HiKeys and other dev boards. In a native build, you either (1) log into the device directly, or (2) SSH into the device, and then build the library just as you would on a desktop computer.

The ARM options presented below are suited for the device you are working on. They are not generic options like the ones Android recommends, which cater to a wide array of devices. Instead, the options are tuned for the host device, and they try to use its full capabilities, like all 32 vector registers from the FPU if it's VFPv4 and NEON capable.

If you are building for Android or iOS, then there are separate pages for those platforms. The pages use the toolchains provided by AOSP and Apple. See Android (Command Line) and iOS (Command Line) for details. Also see ARM (Command Line), ARM Embedded (Command Line) and ARM Embedded (Bare Metal) if building with the arm-linux-gnueabi toolchain.

Recipes

Here is a small list of CXXFLAGS tuned for the test devices used during Crypto++ testing. -mfloat-abi is determined by the platform using CPU feature flags from /proc/cpuinfo, where the platform is the hardware/os/toolchain/applications combination. Modern platforms use hard floats because they are faster for procedure calls.

Wandboard Dual (ARMv7 with NEON):

-march=armv7-a -mtune=cortex-a9 -mfpu=neon -mfloat-abi=hard

BeagleBone Black (ARMv7 with NEON and VFPv3):

-march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=hard

CubieTruck 5 (ARMv7 with NEON and VFPv4):

-march=armv7-a -mtune=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard

Banana Pi (ARMv7 with NEON and VFPv4):

-march=armv7-a -mtune=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard

Raspberry Pi 3 (ARMv7 with NEON and VFPv4):

-march=armv7-a -mfpu=neon-vfpv4 -mfloat-abi=hard 

Raspberry Pi 3 (ARMv8/Aarch32, with CRC, without Crypto):

-march=armv8-a+crc -mtune=cortex-a53 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard

ODROID C2 (ARMv8/Aarch64, with CRC, without Crypto):

-march=armv8-a+crc -mtune=cortex-a53

PINE64 (ARMv8/Aarch64, with CRC and Crypto):

-march=armv8-a+crc+crypto -mtune=cortex-a53

Applied Micro Mustang (ARMv8/Aarch64, without CRC and Crypto):

-march=armv8-a -mtune=cortex-a53

LeMaker HiKey (ARMv8/Aarch64, with CRC and Crypto):

-march=armv8-a+crc+crypto -mtune=cortex-a53

Overdrive 1000 (ARMv8/Aarch64, with CRC and Crypto):

-march=armv8-a+crc+crypto -mtune=cortex-a57

Native Build

A native build does not use specialized CXXFLAGS on ARM. There are a few reasons the Makefile avoids selecting ARM options, but the short answer is that it's not an easy problem because it's the wild, wild west. Notice the build is missing some key options, like -march, -mcpu (or -mtune), -mfpu and -mfloat-abi.

git clone https://github.com/weidai11/cryptopp.git
cd cryptopp

make
g++ -DNDEBUG -g2 -O2 -fPIC -c cryptlib.cpp
g++ -DNDEBUG -g2 -O2 -fPIC -c cpu.cpp
...

What we really want is something like the following:

git clone https://github.com/weidai11/cryptopp.git
cd cryptopp

make
g++ -DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a9 -mfpu=neon -mfloat-abi=hard -fPIC -c cryptlib.cpp
g++ -DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a9 -mfpu=neon -mfloat-abi=hard -fPIC -c cpu.cpp
...

If you know the CXXFLAGS you need, then:

export CXXFLAGS="-DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a9 -mfpu=neon -mfloat-abi=hard"

make
g++ -DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a9 -mfpu=neon -mfloat-abi=hard -fPIC -c cryptlib.cpp
g++ -DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a9 -mfpu=neon -mfloat-abi=hard -fPIC -c cpu.cpp

cryptest.sh

The library's cryptest.sh attempts to set most of the ARM options to exercise the library under real world configurations. If you are in a jam and want options that are mostly correct, then run cryptest.sh and observe the PLATFORM_CXXFLAGS it uses:

# Results from BeagleBone Black (ARMv7, 32-bit)
$ ./cryptest.sh

IS_LINUX: 1
IS_ARM32: 1
HAVE_ARMV7A: 1
HAVE_ARM_NEON: 1
HAVE_ARM_VFPV3: 1

...

PLATFORM_CXXFLAGS: -march=armv7-a  -mfpu=neon  -mfloat-abi=hard

And:

# Results from Banana Pi (ARMv7, 32-bit)
$ ./cryptest.sh

IS_LINUX: 1
IS_ARM32: 1
HAVE_ARMV7A: 1
HAVE_ARM_NEON: 1
HAVE_ARM_VFPV3: 1
HAVE_ARM_VFPV4: 1

...

PLATFORM_CXXFLAGS: -march=armv7-a  -mfpu=neon-vfpv4  -mfloat-abi=hard

And:

# Results from LeMaker HiKey (ARMv8, Aarch64)
$ ./cryptest.sh

IS_LINUX: 1
IS_ARM64: 1
HAVE_ARMV8A: 1
HAVE_ARM_CRC: 1
HAVE_ARM_CRYPTO: 1

...

PLATFORM_CXXFLAGS: -march=armv8-a+crc+crypto

Once you have your options, add them to CXXFLAGS and then make as usual:

$ export CXXFLAGS="-DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a7 -mfpu=neon -mfloat-abi=hard"

$ make
g++ -DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a7 -mfpu=neon -mfloat-abi=hard -fPIC -pipe -c cryptlib.cpp
g++ -DNDEBUG -g2 -O2 -march=armv7-a -mtune=cortex-a7 -mfpu=neon -mfloat-abi=hard -fPIC -pipe -c cpu.cpp
...
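
If you saved the cryptest.sh output to a file, the copy and paste step can be scripted. The following is a minimal sketch, not part of the library. The function name platform_cxxflags_from_log and the file name cryptest.log are hypothetical, and the sketch assumes the log contains a PLATFORM_CXXFLAGS: line in the format shown above.

```shell
# Hypothetical helper (not part of Crypto++): print the flags from the last
# "PLATFORM_CXXFLAGS:" line of a saved log, with runs of spaces squeezed
# to a single space and trailing spaces removed.
platform_cxxflags_from_log() (
    log="$1"
    sed -n 's/^PLATFORM_CXXFLAGS:[[:space:]]*//p' "$log" \
        | tail -n 1 | tr -s ' ' | sed 's/ *$//'
)

# Usage (assumes you captured the output in a file named cryptest.log):
#   flags=$(platform_cxxflags_from_log cryptest.log)
#   export CXXFLAGS="-DNDEBUG -g2 -O2 $flags"
#   make
```

The helper only reads the log. You may still want to add -mtune by hand, since the PLATFORM_CXXFLAGS shown above do not include it.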

ARM Options

You should supply -march, -mcpu (or -mtune), -mfpu and -mfloat-abi when performing native ARM builds. There are two places you generally find the information you need. The first is /proc/cpuinfo and the second is the compiler. A third place, uname -m, is usually good for distinguishing Aarch32 (ARM32) from Aarch64 (ARM64).
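
The uname -m check can be sketched as a small shell function. This is an illustration only. The function name arm_word_size and its output labels are hypothetical, and the patterns assume the machine strings commonly seen on Linux, such as armv7l and aarch64.

```shell
# Hypothetical helper: classify a `uname -m` string as ARM32, ARM64 or OTHER.
# With no argument it inspects the machine it is running on.
arm_word_size() (
    machine="$1"
    if [ -z "$machine" ]; then
        machine=$(uname -m)
    fi
    case "$machine" in
        aarch64*|arm64*) echo "ARM64" ;;   # must be tested before arm*
        arm*)            echo "ARM32" ;;   # armv6l, armv7l, armv8l, ...
        *)               echo "OTHER" ;;
    esac
)

# Usage:
#   arm_word_size            # classify the current machine
#   arm_word_size armv7l     # prints ARM32
```

Keep in mind that uname -m reports what the kernel sees. A 32-bit userland running on a 64-bit kernel may need a closer look at the compiler target.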

/proc/cpuinfo

/proc/cpuinfo provides most of the information on the CPU brand, architecture and feature flags. The problem is that some of the fields are not standard, so you may miss ARMv7-a because a device reports sun8i.

# Results from BeagleBone Black (Aarch32)
$ cat /proc/cpuinfo 
processor       : 0
model name      : ARMv7 Processor rev 2 (v7l)
BogoMIPS        : 996.14
Features        : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc08
CPU revision    : 2
Hardware	: Generic AM33XX (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000

And:

# Results from LeMaker HiKey (Aarch64)
$ cat /proc/cpuinfo 
Processor       : AArch64 Processor rev 3 (aarch64)
processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 
CPU implementer : 0x41
CPU architecture: AArch64
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 3

Hardware	: HiKey Development Board
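
The Features line is the part you will query most often. Below is a minimal sketch of a whole-word feature test. The function name has_cpu_feature is hypothetical, and the sketch assumes the Features field is a space-separated list as in the listings above.

```shell
# Hypothetical helper: succeed if the named flag appears as a whole word on
# the first "Features" line of a cpuinfo-style file (default /proc/cpuinfo).
has_cpu_feature() (
    feature="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    features=$(sed -n 's/^Features[[:space:]]*:[[:space:]]*//p' "$cpuinfo" | head -n 1)
    # Pad with spaces so "vfp" does not match inside "vfpv3".
    case " $features " in
        *" $feature "*) exit 0 ;;
        *)              exit 1 ;;
    esac
)

# Usage:
#   if has_cpu_feature neon; then echo "NEON available"; fi
#   if has_cpu_feature crc32; then echo "CRC available"; fi
```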

Compiler

The compiler is useful for detecting the floating point ABI. If the target is armhf or gnueabihf, then you should use -mfloat-abi=hard. Otherwise, you should use -mfloat-abi=softfp.

# Count the compiler's 'Target' lines that name a hard-float triplet
CXX="${CXX:-g++}"
ARM_HARD_FLOAT=$("$CXX" -v 2>&1 | grep 'Target' | grep -E -i -c '(armhf|gnueabihf)')
if [[ "$ARM_HARD_FLOAT" -ne 0 ]]; then
   PLATFORM_CXXFLAGS+=("-mfloat-abi=hard")
else
   PLATFORM_CXXFLAGS+=("-mfloat-abi=softfp")
fi
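
The same rule can be written as a function over the target triplet, which makes it easy to check without an ARM compiler at hand. This is a sketch under assumptions: the name float_abi_for_target is hypothetical, it reads the triplet from the compiler's -dumpmachine option rather than parsing -v, and it prints nothing for AArch64 because the AArch64 recipes on this page do not use -mfloat-abi.

```shell
# Hypothetical helper: map a compiler target triplet to a -mfloat-abi option.
# With no argument it asks the compiler ($CXX, default g++) for its triplet.
float_abi_for_target() (
    target="$1"
    if [ -z "$target" ]; then
        target=$("${CXX:-g++}" -dumpmachine 2>/dev/null) || target=""
    fi
    case "$target" in
        aarch64*)            ;;                             # no -mfloat-abi on AArch64
        *armhf*|*gnueabihf*) echo "-mfloat-abi=hard" ;;
        *)                   echo "-mfloat-abi=softfp" ;;
    esac
)

# Usage:
#   CXXFLAGS="$CXXFLAGS $(float_abi_for_target)"
```

The fallback to softfp mirrors the rule stated above and is only meaningful on 32-bit ARM targets.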

CPU and Tuning

To tune for a particular cpu, you need to determine the cpu hardware. You are looking for a string like Cortex-A9 or Cortex-A53. You can then specify it through -mcpu or -mtune.

As far as we know, there's no way to get the friendly name through commands. You need to know what the manufacturer puts on the board and use it for -mcpu or -mtune. Sometimes it's not easy to determine. For example, a device may report sun8i, and you need to know that it's an Allwinner SoC based on Cortex-A7 cores.

Once you determine the cpu, you can use it like -mtune=cortex-a7 or -mtune=cortex-a53.
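
One partial workaround is a hand-maintained table keyed on the CPU part field from /proc/cpuinfo. The sketch below is not from cryptest.sh, and the function name mtune_for_cpu_part is hypothetical. It assumes CPU implementer is 0x41 (ARM), lists only a few ARM-designed cores, and fails for anything it does not know, so treat the result as a hint to confirm against the board's documentation.

```shell
# Hypothetical helper: map a "CPU part" value from /proc/cpuinfo to a
# -mtune name. Only valid when "CPU implementer" is 0x41 (ARM).
mtune_for_cpu_part() (
    part=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case "$part" in
        0xc07) echo "cortex-a7" ;;
        0xc08) echo "cortex-a8" ;;
        0xc09) echo "cortex-a9" ;;
        0xd03) echo "cortex-a53" ;;
        0xd07) echo "cortex-a57" ;;
        *)     exit 1 ;;            # unknown part: let the caller decide
    esac
)

# Usage:
#   part=$(sed -n 's/^CPU part[[:space:]]*:[[:space:]]*//p' /proc/cpuinfo | head -n 1)
#   if cpu=$(mtune_for_cpu_part "$part"); then CXXFLAGS="$CXXFLAGS -mtune=$cpu"; fi
```

For the listings on this page, 0xc08 from the BeagleBone Black maps to cortex-a8 and 0xd03 from the HiKey maps to cortex-a53, which agrees with the recipes.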

Architecture

You generally use ARMv7, ARMv7-a, ARMv8, ARMv8-a or ARMv8.1-a. You can usually detect it through uname and /proc/cpuinfo.

uname tells you if it's ARMv7 (32-bit) or ARMv8 (64-bit). Both ARMv7 and ARMv8 need to be further refined with the cpu flags from /proc/cpuinfo.
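
The refinement step can be sketched as follows. This is a simplified illustration and not the cryptest.sh code. The function name march_for is hypothetical, it takes a uname -m string and a Features string as arguments, and it only covers the ARMv7-a and ARMv8-a cases shown in the recipes. It does not handle a 32-bit ARMv8 configuration like the Raspberry Pi 3 recipe.

```shell
# Hypothetical helper: derive -march (and -mfpu on ARMv7) from a `uname -m`
# string ($1) and a space-separated Features string ($2).
march_for() (
    machine="$1"
    features=" $2 "
    # Whole-word test against the padded feature list.
    has() { case "$features" in *" $1 "*) return 0 ;; *) return 1 ;; esac; }
    case "$machine" in
        aarch64*)
            march="-march=armv8-a"
            if has crc32; then march="$march+crc"; fi
            # +crypto is only added when AES, PMULL, SHA1 and SHA2 are all present.
            if has aes && has pmull && has sha1 && has sha2; then
                march="$march+crypto"
            fi
            echo "$march" ;;
        armv7*)
            if has neon && has vfpv4; then
                echo "-march=armv7-a -mfpu=neon-vfpv4"
            elif has neon; then
                echo "-march=armv7-a -mfpu=neon"
            else
                echo "-march=armv7-a"
            fi ;;
        *)  exit 1 ;;   # not an architecture this sketch knows about
    esac
)

# Usage:
#   feats=$(sed -n 's/^Features[[:space:]]*:[[:space:]]*//p' /proc/cpuinfo | head -n 1)
#   march_for "$(uname -m)" "$feats"
```

Fed the BeagleBone Black and HiKey feature lists shown earlier, the sketch produces the same -march and -mfpu values as the PLATFORM_CXXFLAGS reported for those boards. The float ABI still has to come from the compiler.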

/proc/cpuinfo tells you cpu and fpu features, like neon and crypto. Here's the logic from cryptest.sh that's used to determine PLATFORM_CXXFLAGS. It's awful, but you can see the options in the GCC manual at GCC ARM Options and GCC ARM64 Options.