
Linux - Floating Point Calling Convention - Co-Processor/Engine

 

Article updated at 28 Oct 2017

Floating Point Calling Convention

An Application Binary Interface (ABI) defines a calling convention, which defines how parameters are passed from a caller to a called procedure and how a return value is returned. For an ARM CPU the commonly used ABI is called EABI. This ABI allows for two incompatible ways to pass floating point numbers:

Calling Convention   Description                                             GCC flag
EABI soft-float      Floats are passed in normal (integer) registers.        -mfloat-abi=soft or -mfloat-abi=softfp
EABI hard-float      Floats are passed in floating point registers (VFP).    -mfloat-abi=hard

Usually one uses a cross toolchain that is already configured for the required ABI, so the ABI does not have to be given on the command line. The libraries provided with the toolchain are likewise built for that ABI.
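To see which float ABI a given toolchain defaults to, one can inspect GCC's predefined macros. The following is a minimal sketch, not part of the BSP: the function name float_abi_from_macros is hypothetical, and the compiler name in the usage comment is only an assumed example. It relies on GCC for ARM defining __ARM_PCS_VFP for the hard-float calling convention and __SOFTFP__ for -mfloat-abi=soft.

```shell
# Classify the float ABI from a compiler's predefined macros read on stdin.
# Assumed usage (the compiler name is an example, substitute your own):
#   arm-linux-gnueabihf-gcc -dM -E - </dev/null | float_abi_from_macros
float_abi_from_macros() {
    macros=$(cat)
    # Only an ARM-targeting compiler defines __arm__.
    case "$macros" in
        *"#define __arm__ "*) ;;
        *) echo "not-arm"; return 0 ;;
    esac
    case "$macros" in
        *"#define __ARM_PCS_VFP "*) echo "hard" ;;    # floats in VFP registers
        *"#define __SOFTFP__ "*)    echo "soft" ;;    # no FPU instructions at all
        *)                          echo "softfp" ;;  # FPU used, integer-register ABI
    esac
}
```

Adding an explicit -mfloat-abi=... option to the compiler invocation shows how the macros change for a non-default setting.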

Two binaries (e.g. an executable and the C library) compiled with different calling conventions are incompatible with each other. All programs and libraries on a system must be compiled either with soft/softfp or with hard. If you try to run a program built for the soft-float calling convention on a system built for the hard-float calling convention, you get a 'No such file or directory' error even though the file exists and has executable permissions. The message typically refers to the dynamic loader requested by the binary, which is missing on the target, and not to the program file itself.
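A 'No such file or directory' error of this kind can be investigated by looking at the program interpreter a binary requests. The sketch below assumes a glibc-based system that follows the common naming, where hard-float binaries request /lib/ld-linux-armhf.so.3 and soft-float binaries request /lib/ld-linux.so.3. Older toolchains may use the same loader name for both, in which case this check is inconclusive. The function name interp_float_abi is hypothetical.

```shell
# Read "readelf -l <binary>" output on stdin and report the requested loader.
# Assumed usage:
#   readelf -l ./myprogram | interp_float_abi
interp_float_abi() {
    interp=$(sed -n 's/.*Requesting program interpreter: \([^]]*\)].*/\1/p' | head -n 1)
    case "$interp" in
        "")                    echo "none" ;;            # static binary or shared library
        */ld-linux-armhf.so.*) echo "hard $interp" ;;
        */ld-linux.so.*)       echo "soft $interp" ;;
        *)                     echo "unknown $interp" ;;
    esac
}
```

If the reported loader path does not exist on the target root file system, the kernel cannot start the program and reports the missing file.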

Note that the hard float calling convention uses registers of the floating point unit. So this is inherently incompatible with CPUs which do not have such a unit.

Calling Convention used by Toradex Colibri Txx BSP

BSP Version          Calling Convention
BSP V1.x             EABI soft-float
BSP V2.x and later   EABI hard-float

Co-Processor/Engine

ARM CPUs can include a built-in floating point unit (FPU) that accelerates operations on floating point operands. Such an FPU adds instructions to the available instruction set.
Related to this is the NEON instruction set, which lets a single instruction act on multiple data (SIMD). An implementation of this instruction set is the NEON media processing engine. This coprocessor also provides FPU functionality in the form of the VFPv3 instruction set.

See also:
- en.wikipedia.org/wiki/ARM_architecture
- wiki.debian.org/ArmHardFloatPort/VfpComparison

Availability on the Colibri Modules

Module/CPU family     VFP unit     NEON unit
Colibri PXA (1)       -            -
Colibri/Apalis iMX6   VFPv3        Yes
Colibri T20           VFPv3-D16    -
Colibri/Apalis T30    VFPv3        Yes
Colibri VFxx          VFPv3        Yes

1: PXA modules do not have a hardware floating point unit. However, GCC provides optimized software floating point emulation using Intel's integer SIMD extension (iWMMXt).
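On a running module, the units listed above can be cross-checked against the Features line that the Linux kernel reports in /proc/cpuinfo on ARM. The following sketch filters that line for the floating point and SIMD related flags. The function name cpu_fp_features is hypothetical, and the set of flag names it looks for (vfp, vfpv3, vfpv3d16, vfpv4, neon) is a selection that may not cover every kernel version.

```shell
# Print the FP/SIMD related entries of the "Features" line read on stdin.
# Assumed usage on the target:
#   cpu_fp_features </proc/cpuinfo
cpu_fp_features() {
    # Take the first Features line; some kernels print one per core.
    features=$(sed -n 's/^Features[[:space:]]*:[[:space:]]*//p' | head -n 1)
    found=""
    for f in $features; do
        case "$f" in
            vfp|vfpv3|vfpv3d16|vfpv4|neon) found="$found $f" ;;
        esac
    done
    # Strip the leading space; prints an empty line if nothing matched.
    echo "${found# }"
}
```

An empty result, as expected on a Colibri PXA, means the kernel reports no hardware floating point unit.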

Compiler Options

gcc.gnu.org/onlinedocs/gcc-4.7.3/gcc/ARM-Options.html#ARM-Options

Compiler Options used in Toradex Colibri Txx BSP

BSP Version          GCC flags
BSP V1.x             -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16
BSP V2.x and later   -march=armv7-a -mfloat-abi=hard -mfpu=vfpv3-d16

Colibri PXA

Use a compiler configured for the soft-float calling convention and do not pass any FPU-related option to it. The resulting code contains no FPU instructions; it relies instead on library code that performs the calculations with the CPU's integer instruction set.

-march=armv5te -mtune=xscale -O3

Colibri T20

Code built with -mfpu=vfpv3-d16 is also compatible with CPUs that have a NEON coprocessor.
Depending on your calling convention, use 'softfp' or 'hard' for the mfloat-abi option.

-march=armv7-a -mfloat-abi=xxx -mfpu=vfpv3-d16 -mtune=cortex-a9 -O3

Colibri/Apalis iMX6/T30

Depending on your calling convention use 'softfp' or 'hard' for the mfloat-abi option.

-march=armv7-a -mfloat-abi=xxx -mfpu=neon -mtune=cortex-a9 -O3

Colibri VFxx

Depending on your calling convention use 'softfp' or 'hard' for the mfloat-abi option.
The NEON implementation on the Vybrid might even be further optimized with a more specific mfpu option.

-march=armv7-a -mfloat-abi=xxx -mfpu=neon -mtune=cortex-a5 -O3
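The per-module options above can be collected in a small helper for build scripts. This is only a sketch: the function name toradex_cflags and the short module keys (pxa, t20, imx6, t30, vf) are hypothetical, and the flag strings are copied from the sections above with 'xxx' replaced by the requested float ABI.

```shell
# Print the compiler options for a module family and float ABI.
# Usage: toradex_cflags <pxa|t20|imx6|t30|vf> [softfp|hard]   (default: hard)
toradex_cflags() {
    module=$1
    abi=${2:-hard}
    case "$abi" in
        softfp|hard) ;;
        *) echo "unsupported float ABI: $abi" >&2; return 1 ;;
    esac
    case "$module" in
        # The PXA has no FPU, so no -mfloat-abi/-mfpu option is passed.
        pxa)      echo "-march=armv5te -mtune=xscale -O3" ;;
        t20)      echo "-march=armv7-a -mfloat-abi=$abi -mfpu=vfpv3-d16 -mtune=cortex-a9 -O3" ;;
        imx6|t30) echo "-march=armv7-a -mfloat-abi=$abi -mfpu=neon -mtune=cortex-a9 -O3" ;;
        vf)       echo "-march=armv7-a -mfloat-abi=$abi -mfpu=neon -mtune=cortex-a5 -O3" ;;
        *) echo "unknown module: $module" >&2; return 1 ;;
    esac
}
```

A build script could then use, for example, CFLAGS="$(toradex_cflags t20 hard)" before invoking the cross compiler.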

Detecting Floating Point Properties of a Binary File

In our Linux Images all object files use ELF.
en.wikipedia.org/wiki/Executable_and_Linkable_Format

The top-level architecture (e.g. i686 or ARM) can be queried with "file 'afile'", and architecture-specific properties with "readelf -A 'afile'". The readelf from either a native or a cross toolchain can be used for this.

If the calling convention is hard floating point, the line "Tag_ABI_VFP_args: VFP registers" is part of the readelf output. If that tag is missing, the soft floating point calling convention is used.
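The rule above can be applied automatically, for instance when checking many libraries in a root file system. The following sketch summarizes "readelf -A" output in one line. The function name elf_fp_summary is hypothetical, and it assumes the tag names Tag_FP_arch and Tag_Advanced_SIMD_arch as printed in the examples in this article; other binutils versions may name the FPU tag differently.

```shell
# Summarize "readelf -A <file>" output read on stdin.
# Assumed usage:
#   readelf -A usr/lib/libcurl.so.5.3.0 | elf_fp_summary
elf_fp_summary() {
    attrs=$(cat)
    fp_arch=$(printf '%s\n' "$attrs" |
        sed -n 's/^[[:space:]]*Tag_FP_arch:[[:space:]]*//p' | head -n 1)
    simd=$(printf '%s\n' "$attrs" |
        sed -n 's/^[[:space:]]*Tag_Advanced_SIMD_arch:[[:space:]]*//p' | head -n 1)
    # The hard-float calling convention is marked by this exact tag value.
    case "$attrs" in
        *"Tag_ABI_VFP_args: VFP registers"*) abi="hard" ;;
        *)                                   abi="soft" ;;
    esac
    echo "abi=$abi fpu=${fp_arch:-none} simd=${simd:-none}"
}
```

Here "soft" covers both -mfloat-abi=soft and -mfloat-abi=softfp, since both use the same calling convention; the fpu field tells them apart.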

Example for a shared object compiled for a Colibri T20 module.

  • Uses VFPv3-D16 instructions and register set
  • Doesn't use SIMD extensions
  • Hard calling convention
$ file colibri-t20/usr/lib/libcurl.so.5.3.0
colibri-t20/usr/lib/libcurl.so.5.3.0: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked, not stripped
$ readelf -A colibri-t20/usr/lib/libcurl.so.5.3.0
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "7-A"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv3-D16
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_HardFP_use: SP and DP
  Tag_ABI_VFP_args: VFP registers
  Tag_CPU_unaligned_access: v6

Example for a shared object compiled for an NXP/Freescale Vybrid.

  • Uses VFPv3 instructions and register set
  • Uses NEON SIMD extensions
  • Soft calling convention
$ file twr-vf65gs10/usr/lib/libcurl.so.5.3.0
twr-vf65gs10/usr/lib/libcurl.so.5.3.0: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked, not stripped
$ readelf -A twr-vf65gs10/usr/lib/libcurl.so.5.3.0
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "7-A"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv3
  Tag_Advanced_SIMD_arch: NEONv1
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_HardFP_use: SP and DP
  Tag_CPU_unaligned_access: v6

Example for a shared object compiled for a Colibri PXA.

  • Doesn't use a floating point unit
  • Doesn't use SIMD extensions
  • Soft calling convention
$ file colibri-pxa/usr/lib/libcurl.so.5.3.0
colibri-pxa/usr/lib/libcurl.so.5.3.0: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked, not stripped
$ readelf -A colibri-pxa/usr/lib/libcurl.so.5.3.0
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "5TE"
  Tag_CPU_arch: v5TE
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-1
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align8_needed: Yes
  Tag_ABI_align8_preserved: Yes, except leaf SP
  Tag_ABI_enum_size: int