FreeRTOS on the Cortex-M4 of a Colibri iMX7


The NXP/Freescale i.MX 7 SoC, which is the core of the Colibri iMX7 module, implements a heterogeneous asymmetric architecture. Besides the main CPU core(s) based on the ARM Cortex-A7 processor, a secondary general purpose ARM Cortex-M4 core is available too. The secondary core typically runs an RTOS optimized for microcontrollers or a bare-metal application. Toradex provides FreeRTOS™, a free professional grade real-time operating system for microcontrollers, along with drivers and several examples which can be used on our Colibri iMX7 platform. The FreeRTOS™ port is based on the NXP FreeRTOS BSP for i.MX 7.

The source code is available at:

Multiple build systems are supported for building the firmware and examples. We recommend using one of the following alternatives:

  • GNU Makefiles and CMake with GCC (Linaro ARM Embedded/bare-metal toolchain, e.g. 4.8-2014-q3-update or 4.9-2015-q3-update)
  • ARM Development Studio 5 (DS-5) Professional or Ultimate edition (Version 5.23.1)


The Cortex-M4 CPU core lives side by side with the Cortex-A7 based primary CPU cores. Both CPU complexes have access to the same interconnect and hence have equal access to all peripherals (shared bus topology). The graphic below is an incomplete and simplified drawing of the architecture, with emphasis on the sub systems relevant to understanding the heterogeneous asymmetric multicore architecture.

i.MX 7 Heterogeneous Asymmetric Multicore Architecture Block Diagram

There are several types of memory available. The Cortex-M4 provides local memory (Tightly Coupled Memory, TCM), which is relatively small but can be accessed by the CPU without any latency. There are multiple OCRAM areas (On-Chip RAM, typically SRAM) which are relatively fast as well and slightly larger. The third option is the DDR3 based main memory. From a performance perspective one of the internal areas should be selected whenever possible.

A traditional microcontroller typically has internal NOR flash where the firmware is stored and executed from. This is not the case on Colibri iMX7: there is no NOR flash onto which the firmware can be flashed. Instead, the firmware needs to be stored on a mass storage device such as an SD card or the internal NAND flash. The available mass storage devices are not "memory mapped", and hence applications cannot be executed directly from them by either core (no eXecute-In-Place, XIP). Instead, code needs to be loaded into one of the available memory sections before the CPU can start executing it.

The i.MX 7 SoC always boots using the Cortex-A7 core. The core executes the internal boot ROM which typically loads a boot loader such as U-Boot. The boot loader allows loading the firmware from the mass storage device (e.g. NAND flash) into memory, and triggers the Cortex-M4 to start executing the firmware. To upgrade or replace a firmware, one can just replace the firmware binary on the mass storage device.

Memory areas

The two CPU platforms use a different memory layout to access individual sub systems. This table lists some important areas and their memory location for each of the cores side by side. The full list can be found in the i.MX 7 reference manual.

Region       Size                  Cortex-A7              Cortex-M4 (System Bus)  Cortex-M4 (Code Bus)   GCC Linker file
DDR Address  2048MB (less for M4)  0x80000000-0xFFFFFFFF  0x80000000-0xDFFFFFFF   0x10000000-0x1FFEFFFF  MCIMX7D_M4_ddr.ld
OCRAM_PXP    32KB                  0x00940000-0x00947FFF  0x20240000-0x20247FFF   0x00940000-0x00947FFF  -
OCRAM_EPDC   128KB                 0x00920000-0x0093FFFF  0x20220000-0x2023FFFF   0x00920000-0x0093FFFF  MCIMX7D_M4_ocram.ld
OCRAM        128KB                 0x00900000-0x0091FFFF  0x20200000-0x2021FFFF   0x00900000-0x0091FFFF  MCIMX7D_M4_ocram.ld
TCMU         32KB                  0x00800000-0x00807FFF  0x20000000-0x20007FFF   -                      MCIMX7D_M4_tcm.ld
TCML         32KB                  0x007F8000-0x007FFFFF  -                       0x1FFF8000-0x1FFFFFFF  MCIMX7D_M4_tcm.ld
OCRAM_S      32KB                  0x00180000-0x00187FFF  0x00180000-0x00187FFF   0x00000000-0x00007FFF  -
Boot ROM     96KB                  0x00000000-0x00017FFF  0x20020000-0x20037FFF   -                      -

The Cortex-M4 CPU has two buses connected to the main interconnect (modified Harvard architecture). One bus is meant to fetch data (system bus) whereas the other bus is meant to fetch instructions (code bus). To get optimal performance, the program code should be located and linked for a region which is fetched through the code bus, while the data area (e.g. bss or data section) should be located in a region which is fetched through the system bus. There are multiple example linker files in the platform/devices/MCIMX7D/linker/ sub directory which can be used and/or modified. All example firmware below uses the MCIMX7D_M4_tcm.ld linker file (TCML region for code and TCMU region for data).
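Because the two CPU platforms see the same memory at different addresses, it can help to translate between the two views, for instance when a firmware is linked for the Cortex-M4 address but loaded from the Cortex-A7 side. The following is a minimal sketch of such a translation for the TCM and OCRAM rows of the table above. The names mem_region_t, m4_to_a7_addr and M4_ADDR_UNMAPPED are hypothetical and not part of the BSP.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel returned for addresses outside the listed regions. */
#define M4_ADDR_UNMAPPED UINT32_MAX

/* One row of the memory table: the same memory seen from both cores. */
typedef struct {
    const char *name;
    uint32_t a7_base;   /* base address as seen by the Cortex-A7 */
    uint32_t m4_base;   /* base address as seen by the Cortex-M4 */
    uint32_t size;      /* region size in bytes */
} mem_region_t;

/* Values copied from the table above (TCML via code bus, others via system bus). */
static const mem_region_t regions[] = {
    { "TCML",       0x007F8000u, 0x1FFF8000u, 0x08000u },
    { "TCMU",       0x00800000u, 0x20000000u, 0x08000u },
    { "OCRAM",      0x00900000u, 0x20200000u, 0x20000u },
    { "OCRAM_EPDC", 0x00920000u, 0x20220000u, 0x20000u },
    { "OCRAM_PXP",  0x00940000u, 0x20240000u, 0x08000u },
};

/* Translate a Cortex-M4 address into the matching Cortex-A7 address. */
uint32_t m4_to_a7_addr(uint32_t m4_addr)
{
    for (size_t i = 0; i < sizeof regions / sizeof regions[0]; i++) {
        const mem_region_t *r = &regions[i];
        if (m4_addr >= r->m4_base && m4_addr - r->m4_base < r->size)
            return r->a7_base + (m4_addr - r->m4_base);
    }
    return M4_ADDR_UNMAPPED;
}
```

With this table, the Cortex-M4 address 0x1FFF8000 (start of TCML) maps to 0x007F8000 on the Cortex-A7 side.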

Resource Domain Controller

Understanding and configuring the Resource Domain Controller (RDC) is crucial when running two operating systems on the i.MX 7. The RDC grants or prohibits access to peripherals and memory areas for individual bus masters (e.g. CPU, DMA controller) at the hardware level. The RDC allows defining up to 4 resource domains and assigning peripherals and memory locations to those resource domains. By default, the A7 core is in domain 0 and all peripherals are assigned to domain 0. When the FreeRTOS firmware starts, the Cortex-M4 core is in domain 0 too, but the firmware then reassigns the Cortex-M4 and the required peripherals to domain 1 (see board.c and the example specific hardware_init.c).
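To illustrate the idea of per-domain access rights, the sketch below models a permission value for the 4 resource domains. It assumes two adjacent bits per domain (write in the lower bit, read in the upper bit); this layout and the names rdc_perm, rdc_can_read and rdc_can_write are assumptions made for illustration only. Consult the i.MX 7 reference manual and the BSP's board.c for the real registers and driver calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RDC_NUM_DOMAINS 4u

/* Assumed layout: bit (2 * domain) = write, bit (2 * domain + 1) = read. */
static inline uint8_t rdc_perm(unsigned domain, bool read, bool write)
{
    assert(domain < RDC_NUM_DOMAINS);
    uint8_t bits = (uint8_t)((write ? 1u : 0u) | (read ? 2u : 0u));
    return (uint8_t)(bits << (domain * 2u));
}

/* True if the given domain may read according to the permission value. */
static inline bool rdc_can_read(uint8_t perm, unsigned domain)
{
    assert(domain < RDC_NUM_DOMAINS);
    return ((perm >> (domain * 2u)) & 2u) != 0;
}

/* True if the given domain may write according to the permission value. */
static inline bool rdc_can_write(uint8_t perm, unsigned domain)
{
    assert(domain < RDC_NUM_DOMAINS);
    return ((perm >> (domain * 2u)) & 1u) != 0;
}
```

In this model, granting a peripheral exclusively to domain 1 yields rdc_perm(1, true, true), and a bus master in domain 0 (Linux on the Cortex-A7 by default) no longer has access.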

If a device that is used by the Linux kernel running on the Cortex-A7 (e.g. I2C) shall be used on the Cortex-M4 instead, it is important to disable this device in the device tree of the Linux kernel (e.g. set the status property to disabled). The article Device Tree Customization explains in more detail how to alter the device tree.

Get the FreeRTOS Source Code

The FreeRTOS source code is available on our git server. Use git to clone the repository:

$ git clone -b colibri-imx7-m4-freertos-v8 git:// freertos-colibri-imx7/
$ cd freertos-colibri-imx7/

This table shows how the FreeRTOS BSP source code is structured:

Directory Content
doc/ NXP/Freescale FreeRTOS BSP Documentation
examples/ Application examples
examples/imx7_colibri_m4/ Examples ported to Toradex Colibri iMX7
middleware/multicore/open-amp/ OpenAMP based RPMsg stack (remote messaging framework)
platform/ Driver library, startup code and utilities
platform/CMSIS/ Cortex Microcontroller Software Interface Standard (CMSIS) ARM Cortex®-M header files, DSP library source
platform/devices/MCIMX7D/linker/ Linker control files for each supported toolchain
platform/drivers/ Peripheral Drivers
platform/utilities/ Utilities such as debug console
rtos/FreeRTOS/ FreeRTOS Kernel folder

GNU Makefiles and CMake with GCC

Using GNU Makefiles and CMake along with the GCC compiler has been tested on Linux based host systems. GNU Makefiles and CMake can be used on Windows systems by using MinGW. However, on Windows the ARM DS-5 IDE is easier to set up and use (see below).


We recommend using one of the Linaro provided ARM Embedded toolchains, e.g. 4.9 2015 Q3 update. Unpack the toolchain to an appropriate location, e.g. your home directory:

$ tar xjf ~/Downloads/gcc-arm-none-eabi-4_9-2015q3-20150921-linux.tar.bz2

Since the toolchain is compiled as 32-bit binaries, make sure to install the 32-bit versions of libc and libncurses. On Ubuntu 14.04 the commands to install those libraries are:

$ sudo dpkg --add-architecture i386
$ sudo apt-get update
$ sudo apt-get install libc6:i386 libncurses5:i386

With that, gcc should run fine:

$ ~/gcc-arm-none-eabi-4_9-2015q3/bin/arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.9.3 20150529 (release) [ARM/embedded-4_9-branch revision 227977]
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO

Furthermore, GNU make and CMake need to be available. For example, on Ubuntu:

$ sudo apt-get install make cmake


Each example has a sub-directory called armgcc. Enter this sub-directory and use the provided shell scripts to build the example. The scripts expect the environment variable ARMGCC_DIR to point to the Linaro ARM Embedded toolchain:

$ export ARMGCC_DIR=~/gcc-arm-none-eabi-4_9-2015q3/
$ cd examples/imx7_colibri_m4/demo_apps/hello_world/armgcc
$ ./
-- TOOLCHAIN_DIR: /home/ags/gcc-arm-none-eabi-4_9-2015q3/
-- BUILD_TYPE: Debug
-- TOOLCHAIN_DIR: /home/ags/gcc-arm-none-eabi-4_9-2015q3/
-- BUILD_TYPE: Debug
-- The ASM compiler identification is GNU
-- Found assembler: /home/ags/gcc-arm-none-eabi-4_9-2015q3//bin/arm-none-eabi-gcc
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ags/freertos-colibri-imx7/examples/imx7_colibri_m4/demo_apps/hello_world/armgcc
Scanning dependencies of target hello_world
[  4%] Building C object CMakeFiles/hello_world.dir/home/ags/freertos-colibri-imx7/rtos/FreeRTOS/Source/portable/GCC/ARM_CM4F/port.c.obj
[100%] Linking C executable debug/hello_world.elf
[100%] Built target hello_world

With that, the sub directories release/ and debug/ contain the firmware binaries which can be executed on the target (see the instructions of the individual examples below).

ARM Development Studio 5

ARM provides DS-5, an integrated development environment (IDE) suited for C/C++ development on ARM systems. DS-5 provides build capabilities similar to GNU Makefiles with GCC, but is also capable of debugging and profiling the application on actual hardware (through JTAG).

Refer to our article Using ARM DS-5 IDE with Cortex-M4 of a Colibri iMX7.

Running a Firmware on Cortex-M4

U-Boot can be used to start a firmware on the Cortex-M4 core. Use the following commands to load the firmware from an SD card into the Cortex-M4 local TCM (Tightly Coupled Memory) and start it:

Colibri iMX7 # fatload mmc 0:1 0x7F8000 hello_world.bin
Colibri iMX7 # dcache flush
Colibri iMX7 # bootaux 0x7F8000
## Starting auxiliary core at 0x007F8000 ...
Colibri iMX7 # 
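The load address 0x7F8000 is the Cortex-A7 view of the TCML region, which is 32KB in size according to the memory table. A firmware image loaded there must therefore fit into that region. The following is a small sketch of such a check; the helper name fits_in_tcml is hypothetical and the constants are taken from the memory table in this article.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCML_A7_BASE 0x007F8000u  /* TCML as seen by the Cortex-A7 / U-Boot */
#define TCML_SIZE    0x00008000u  /* 32KB */

/* Returns true if an image of image_size bytes loaded at load_addr
 * lies completely inside the TCML region. */
static inline bool fits_in_tcml(uint32_t load_addr, uint32_t image_size)
{
    if (load_addr < TCML_A7_BASE)
        return false;

    uint32_t offset = load_addr - TCML_A7_BASE;
    return offset <= TCML_SIZE && image_size <= TCML_SIZE - offset;
}
```

For example, an image of 20956 bytes loaded at 0x7F8000 passes this check, while anything larger than 32768 bytes does not.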

The FreeRTOS firmware uses Colibri UART_B as its debugging console. Make sure to connect UART_B to your debugging host and start a serial terminal emulator with a baudrate of 115200 on the serial port.

Linux disables unused clocks by default. However, Linux is not aware of which clocks are used by the Cortex-M4 core; therefore, one should use the clk_ignore_unused kernel parameter:

Colibri iMX7 # setenv defargs clk_ignore_unused

By default, our Linux device tree uses UART_B too, which leads to an external abort when the Linux kernel tries to access UART_B. It is recommended to alter the device tree and disable UART_B using the status property (see Device Tree Customization). As a temporary workaround, the following fdt_fixup command can be used in U-Boot:

Colibri iMX7 # setenv fdt_fixup 'fdt addr ${fdt_addr_r} && fdt rm /soc/aips-bus@30800000/spba-bus@30800000/serial@30890000'
Colibri iMX7 # saveenv

Note: In V2.6 Alpha 1, there is an issue rendering the Linux clock driver useless when the Cortex-M4 is started. Due to a second bug, this only manifests on Colibri iMX7D. Use the latest 3.14.52 kernel available in the -next branch to avoid those issues.

Some examples also seem to interfere with SPI; therefore, the ecspi3 node might need to be disabled as well.

Store a Firmware on Flash and Run it on Boot

Our BSP V2.6 Beta 2 and later provision a static UBI volume (m4firmware) for storing the firmware in flash and running it on boot.

Use the following commands to store a firmware:

Colibri iMX7 # ubi part ubi
Colibri iMX7 # fatload mmc 0:1 ${loadaddr} hello_world.bin
Colibri iMX7 # ubi write ${loadaddr} m4firmware ${filesize}

Use the following commands to enable automatic loading and execution of the firmware:

Colibri iMX7 # setenv m4boot 'ubi read 0x7F8000 m4firmware && dcache flush && bootaux 0x7F8000'
Colibri iMX7 # saveenv

With that, U-Boot will load and start the firmware just before booting Linux:

Read 0 bytes from volume m4firmware to 7f8000
No size specified -> Using max size (20956)
## Starting auxiliary core at 0x007F8000 ...


This section provides information about the available examples and how to use them. The source code of the Colibri iMX7 examples is located under examples/imx7_colibri_m4/. All examples are linked to run in the TCM area.

Precompiled firmwares can be found at:

Hello World

  • Firmware Location: examples/imx7_colibri_m4/demo_apps/hello_world/
  • Compiled Example: hello_world.bin

Simple application starting the FreeRTOS kernel and printing Hello World in a FreeRTOS task.

Hello World!

GPIO Example

  • Firmware Location: examples/imx7_colibri_m4/driver_examples/gpio_imx/
  • Compiled Example: gpio_imx_example.bin

GPIO example using input and output as well as interrupt mode. The Toradex example uses EXT_IO1 and EXT_IO2 for buttons/keys and EXT_IO0 for an LED. On a Colibri Evaluation board, you can use header X21 to connect the SO-DIMM signals to a button or LED respectively:

  • EXT_IO0 => X21-LED1
  • EXT_IO1 => X21-SW6
  • EXT_IO2 => X21-SW5
====================== GPIO Example ========================

=================== GPIO Interrupt =====================
The (EXT_IO1) button is configured to trigger GPIO interrupt.
Press the (EXT_IO1) button 3 times to continue.

Button pressed 1 time. 
Button pressed 2 time. 
Button pressed 3 time. 

================= GPIO Functionality==================
The (EXT_IO1) button state is now polled.
Press the (EXT_IO1) button to switch LED on or off

Button released
Button pressed
Button released
Button pressed

Note: The interrupt part of the demo only works as long as Linux is not running. The reason is that Linux uses the same GPIO bank and reconfigures its interrupts. The easiest way to use GPIO in interrupt mode on FreeRTOS while running Linux is to assign a complete GPIO bank to FreeRTOS and disable that GPIO bank in the Linux device tree.

RPMsg TTY Example

  • Firmware Location: examples/imx7_colibri_m4/demo_apps/rpmsg/str_echo_freertos/
  • Compiled Example: rpmsg_str_echo_freertos_example.bin
  • Linux Kernel Module Source: drivers/rpmsg/imx_rpmsg_tty.c

This demo uses a virtual TTY on the Linux side to send an RPMsg message to FreeRTOS. Each string sent to the FreeRTOS example will be sent back to the Linux host.

Start the example rpmsg_str_echo_freertos_example.bin on the Cortex-M4 first, which should already print the following output:

RPMSG String Echo Demo...
RPMSG Init as Remote
init M4 as REMOTE

Then, boot into Linux and execute the following commands:

# modprobe imx_rpmsg_tty
[   61.598675] imx_rpmsg_tty rpmsg0: new channel: 0x400 -> 0x1!
[   61.606640] Install rpmsg tty driver!

With that, we have a TTY device located at /dev/ttyRPMSG. The probe method also sends a "hello world" to the other side, which should be visible in the Cortex-M4 console:

Name service handshake is done, M4 has setup a rpmsg channel [1 ---> 1024]
Get Message From A7 : "hello world!" [len : 12] from slot 0

We can use bash to open the file and assign it the file descriptor 3, write into it, read from it and ultimately close it.

# stty -F /dev/ttyRPMSG -echo
# exec 3<> /dev/ttyRPMSG 
# echo Test >&3 
# cat <&3
# exec 3>&-

On the Cortex-M4 side, this should print the Test message:

Get Message From A7 : "Test" [len : 4] from slot 1
Get New Line From A7 From Slot 2
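The same round trip can also be done from a small C program instead of bash. The following is a minimal sketch using standard POSIX calls; the function name rpmsg_tty_echo is hypothetical, and the sketch assumes that a single read() returns the echoed string, which may not hold for longer messages.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Write msg to the TTY device and read the echoed reply.
 * Returns the number of bytes read, or -1 on error. */
ssize_t rpmsg_tty_echo(const char *dev, const char *msg,
                       char *reply, size_t reply_len)
{
    int fd = open(dev, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    /* Equivalent of "stty -echo": disable local echo on the TTY. */
    struct termios tio;
    if (tcgetattr(fd, &tio) == 0) {
        tio.c_lflag &= ~(tcflag_t)ECHO;
        tcsetattr(fd, TCSANOW, &tio);
    }

    ssize_t n = -1;
    size_t len = strlen(msg);
    if (write(fd, msg, len) == (ssize_t)len)
        n = read(fd, reply, reply_len);  /* blocks until data arrives */

    close(fd);
    return n;
}
```

Calling rpmsg_tty_echo("/dev/ttyRPMSG", "Test\n", buf, sizeof buf) on the target corresponds to the echo and cat commands above.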

To load the module automatically on every boot, a config file in the modules-load.d directory can be used:

# echo imx_rpmsg_tty > /etc/modules-load.d/rpmsg_tty.conf

RPMsg PingPong Example

  • Firmware Location: examples/imx7_colibri_m4/demo_apps/rpmsg/pingpong_freertos/
  • Compiled Example: rpmsg_pingpong_freertos_example.bin
  • Linux Kernel Module Source: drivers/rpmsg/imx_rpmsg_pingpong.c

This example uses the messaging mechanism to send an integer value back and forth, incrementing it on both sides. The demo stops after 100'000 iterations (that is, when 200'000 increments have been executed).

# modprobe imx_rpmsg_pingpong
[   19.538184] imx_rpmsg_pingpong rpmsg0: new channel: 0x400 -> 0x1!
[   19.553916] get 1 (src: 0x1)
[   19.559268] get 3 (src: 0x1)
[   19.564410] get 5 (src: 0x1)
[   19.570272] get 7 (src: 0x1)
[   19.575276] get 9 (src: 0x1)
[   19.580268] get 11 (src: 0x1)
[   19.585275] get 13 (src: 0x1)
[  330.056188] get 199999 (src: 0x1)
[  330.056200] imx_rpmsg_pingpong rpmsg0: goodbye!

On the Cortex-M4 console, the firmware prints the values it receives:

RPMSG PingPong Demo...
RPMSG Init as Remote
init M4 as REMOTE
Name service handshake is done, M4 has setup a rpmsg channel [1 ---> 1024]
Get Data From A7 : 0
Get Data From A7 : 2
Get Data From A7 : 4
Get Data From A7 : 6
Get Data From A7 : 8
Get Data From A7 : 10
Get Data From A7 : 12
Get Data From A7 : 14

Set the kernel console log level to 5 to keep these messages off the debug console:

# echo 5 > /proc/sys/kernel/printk
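The counting pattern visible in the logs (Linux receives odd values, the Cortex-M4 receives even values) can be reproduced with a small host-side simulation. This sketch only models the arithmetic, not the RPMsg transport, and the function name pingpong_last_value is hypothetical.

```c
#include <stdint.h>

/* Simulate the ping-pong exchange and return the last value that the
 * Linux side receives after the given number of iterations. */
uint32_t pingpong_last_value(uint32_t iterations)
{
    uint32_t value = 0;  /* first value sent by Linux */
    uint32_t last = 0;

    for (uint32_t i = 0; i < iterations; i++) {
        value++;         /* Cortex-M4 receives the value, increments it, sends it back */
        last = value;    /* Linux receives the incremented value */
        value++;         /* Linux increments before sending again */
    }
    return last;
}
```

After 100'000 iterations the last value received by Linux is 199999, which matches the final "get 199999" line in the kernel log.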

TAQ (Toradex + Antmicro + QT)

Toradex, together with Antmicro and Qt, created a self-balancing robot called TAQ. You can see the robot in action in this video. The firmware has been built using an early version of the Colibri iMX7 BSP, hence the firmware might not run well with newer Linux BSPs. However, the API is pretty much unchanged; therefore, the source code can still be used as a reference.

Other Examples

There are other examples available. Some of the examples run from DDR (_ddr) and hence require a different load/boot address, and you need to make sure that Linux is not using that DDR memory. Some examples are bare-metal (_bm) examples which do not use the FreeRTOS kernel while still using the driver/RPMsg framework.


Increase available Heap

The heap size of FreeRTOS can be changed by altering the value defined by configTOTAL_HEAP_SIZE in the FreeRTOSConfig.h header file. However, the maximum size will depend on the available memory as defined in the linker file, which ultimately depends on which memory location is used (see Section Memory Areas).
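As a sketch of how the heap size relates to the available memory, the fragment below checks at compile time that the heap fits into the 32KB TCMU data region used by the MCIMX7D_M4_tcm.ld linker file. The heap value of 16KB and the OTHER_DATA_RESERVE budget for stack, bss and data are illustrative assumptions, not the BSP defaults.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the TCMU region (data area when linking with MCIMX7D_M4_tcm.ld). */
#define TCMU_SIZE             (32u * 1024u)

/* Illustrative value only; the real setting lives in FreeRTOSConfig.h. */
#define configTOTAL_HEAP_SIZE ((size_t)(16u * 1024u))

/* Hypothetical budget for stack, .data and .bss outside the FreeRTOS heap. */
#define OTHER_DATA_RESERVE    (8u * 1024u)

/* Memory left over in the data region after heap and reserve. */
#define HEAP_HEADROOM \
    (TCMU_SIZE - OTHER_DATA_RESERVE - configTOTAL_HEAP_SIZE)

/* Fail the build early instead of failing at link time or at run time. */
static_assert(configTOTAL_HEAP_SIZE + OTHER_DATA_RESERVE <= TCMU_SIZE,
              "FreeRTOS heap does not fit into the TCMU data region");
```

When a larger heap is needed, linking against one of the OCRAM or DDR linker files gives more room, at the cost of slower memory.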

Change RPmsg Buffer Count and Size

RPMsg creates two VirtIO ring buffers, one for each direction. The default buffer size is 512 bytes (of which 16 bytes are used by a header) and 256 buffers are allocated for each direction. VirtIO's vring implementation is a sophisticated ring buffer implementation (the paper "virtio: Towards a De-Facto Standard For Virtual I/O Devices" describes the design in more detail). To better understand the memory requirements, a high level overview of the inner workings of VirtIO is required. The system consists of:

  • Buffers (shared, dynamically allocated, 512 bytes by default)
  • The virtio_ring (shared, at predefined location in DDR3, 0x8FFF0000 and 0x8FFF8000)
    • Descriptors array (16 bytes per buffer)
    • Available ring
    • Used ring

By default, Linux is the RPMsg master. The RPMsg master allocates the buffers; Linux does so in DDR3 memory. The location of the virtio_ring as well as the ring length and the size of the buffers are currently hard-coded and must match between Linux and FreeRTOS. Furthermore, the ring length must be a power of two (256, 512, 1024...).

  • Linux: arch/arm/mach-imx/imx_rpmsg.c (RPMSG_NUM_BUFS, RPMSG_BUF_SIZE and location of vring in imx_rpmsg_probe)
  • FreeRTOS: middleware/multicore/open-amp/porting/imx7d_m4/platform_info.c (RPMSG_NUM_BUFS, VRING0/1_BASE) and middleware/multicore/open-amp/rpmsg/rpmsg_core.h (RPMSG_BUFFER_SIZE)
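The memory requirements that follow from these settings can be estimated with a few lines of arithmetic. The sketch below uses the sizes given above (16 byte header, 16 byte descriptors) together with the standard split-ring layout of VirtIO (2 byte ring entries in the available ring, 8 byte entries in the used ring). The helper names and the 4096 byte ring alignment are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define RPMSG_HDR_SIZE        16u  /* header bytes inside each buffer */
#define VRING_DESC_SIZE       16u  /* bytes per descriptor */
#define VRING_USED_ELEM_SIZE   8u  /* bytes per used ring element */

/* The ring length must be a power of two. */
bool is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Usable payload bytes per buffer. */
size_t rpmsg_payload_size(size_t buf_size)
{
    return buf_size - RPMSG_HDR_SIZE;
}

/* Total buffer memory for both directions. */
size_t rpmsg_pool_bytes(size_t num_bufs, size_t buf_size)
{
    return 2u * num_bufs * buf_size;
}

/* Size of one vring: descriptors plus available ring, padded to the
 * alignment (assumed to be a power of two), followed by the used ring. */
size_t vring_bytes(size_t num, size_t align)
{
    size_t desc_avail = VRING_DESC_SIZE * num + 2u * (3u + num);
    desc_avail = (desc_avail + align - 1u) & ~(align - 1u);
    return desc_avail + 2u * 3u + VRING_USED_ELEM_SIZE * num;
}
```

With the defaults (256 buffers of 512 bytes), each buffer carries 496 bytes of payload and the buffers for both directions occupy 256KB in total. Under the assumed 4096 byte alignment, one vring takes 10246 bytes, which fits into the 0x8000 bytes between the two predefined vring locations.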



To troubleshoot Linux problems (e.g. when Linux hangs during boot after the firmware has been started), the kernel's earlyprintk mechanism can help. Often the kernel does actually start booting but crashes just before the serial port (and hence the serial console) is initialized. The earlyprintk mechanism starts printing kernel log messages from the very beginning.

The earlyprintk functionality is not part of the default kernel and hence needs to be enabled in the kernel configuration (see also Build U-Boot and Linux Kernel from Source Code).

Kernel hacking  --->
[*] Kernel low-level debugging functions (read help!)
      Kernel low-level debugging port (i.MX7D Debug UART)  --->
    (1) i.MX Debug UART Port Selection
[*] Early printk

Problems often appear due to the RDC (Resource Domain Controller). For example, if the firmware on the secondary CPU acquired exclusive access to a certain peripheral, the primary core(s) running Linux won't be able to access that peripheral anymore. This leads to an external abort on the primary CPU, as seen below:

[    0.426158] Unhandled fault: imprecise external abort (0x1c06) at 0x76563000
[    0.433355] Internal error: : 1c06 [#1] SMP ARM
[    0.437921] Modules linked in:
[    0.441175] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.52-03016-g6bcfa7b-dirty #5
[    0.449155] task: 8c0a8000 ti: 8c09e000 task.ti: 8c09e000
[    0.454606] PC is at i2c_imx_probe+0x36c/0x4cc
[    0.459236] LR is at i2c_imx_probe+0x320/0x4cc

To resolve this, make sure not to access the same peripheral from both the primary and the secondary CPU.