FreeRTOS on the Cortex-M4 of an Apalis iMX8X and Colibri iMX8X


Article updated at 10 Mar 2021

The i.MX8X applications processor is a feature- and performance-scalable multicore platform that includes a freely usable Cortex-M4 core. This secondary core typically runs an RTOS optimized for microcontrollers or a bare-metal application. Toradex provides FreeRTOS™, a free professional-grade real-time operating system for microcontrollers, along with drivers and several examples that can be used on our i.MX8X-based platforms. The FreeRTOS™ port is based on NXP's MCUXpresso SDK for i.MX8X.

The Toradex modules featuring this SoC include:

  • Apalis iMX8QXP
  • Apalis iMX8DXP
  • Colibri iMX8QXP
  • Colibri iMX8DX

At the time of writing this article (November 2019), one build system is supported for building the firmware and examples:

  • GNU Makefiles and CMake with GCC (Linaro ARM Embedded/bare-metal toolchain, e.g. 4.8-2014-q3-update or 4.9-2015-q3-update)

Toradex is integrating support for FreeRTOS into Visual Studio Code. Please contact us on the Toradex Community if you want to test preliminary versions of the VSCode support. To learn more see: Developing M4 Applications Using Visual Studio Code


The Cortex-M4 CPU core lives side by side with the Cortex-A35 based primary CPU cores. Both CPU complexes are connected to the same interconnect and hence have equal access to all peripherals (shared bus topology). The graphic below is an incomplete and simplified drawing of the architecture, with emphasis on the subsystems needed to understand the heterogeneous asymmetric multicore architecture.

[Figure: i.MX 8X Heterogeneous Asymmetric Multicore Architecture Block Diagram]

There are several types of memory available. The Cortex-M4 has local memory (Tightly Coupled Memory, TCM), which is relatively small but can be accessed by the CPU with zero wait states. For applications requiring more memory, the system DRAM is also accessible by the M4 core. From a performance perspective, the TCM should be used whenever possible.

A traditional microcontroller typically has internal NOR flash where the firmware is stored and executed from. This is not the case on Toradex modules based on the i.MX8X SoC: there is no NOR flash onto which the firmware can be flashed. Instead, the firmware needs to be stored on a mass storage device such as an SD card or the internal eMMC flash. The available mass storage devices are not memory mapped, and hence applications cannot be executed directly from them by any of the cores (no eXecute-In-Place, XIP). Instead, code needs to be loaded into one of the available memory regions before the CPU can start executing it.

The M4 firmware can either be placed in the common boot container, in which case it is loaded and started by the boot ROM, or be placed on a mass storage device, in which case U-Boot needs to be configured to load and execute it.

Memory areas

The two CPU platforms use different memory layouts to access the individual subsystems. The table below lists some important areas and their memory locations for each of the cores, side by side. The full list can be found in the i.MX8QM reference manual.

Region  Size       Cortex-A35               M4 (Code Bus)            M4 (System Bus)
DDR     2GB(*1)    0x80000000-0xFFFFFFFF    0x00100000-0x1BFFFFFF    0x80000000-0xDFFFFFFF
TCML    128KB      0x34FE0000-0x34FFFFFF    0x1FFE0000-0x1FFFFFFF    -
TCMU    128KB      0x35000000-0x3501FFFF    -                        0x20000000-0x2001FFFF

(*1): Full DRAM range is 0x8_00000000 - 0xB_FFFFFFFF. Only a part of the DRAM is accessible by the M4 core.
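
Because the same TCM appears at different addresses on each side, a tool running on the Cortex-A35 (for example a loader) has to translate M4 link addresses into its own view. The following is a minimal sketch of that translation using only the TCML and TCMU rows of the table above. The type `mem_window_t` and the function `m4_tcm_to_a35` are hypothetical names chosen for illustration, and the sketch assumes that offsets within a window map linearly between the two views.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One memory region as seen from both CPU complexes (hypothetical type). */
typedef struct {
    const char *name;
    uint32_t m4_base;   /* base address in the Cortex-M4 view  */
    uint32_t a35_base;  /* base address in the Cortex-A35 view */
    uint32_t size;      /* region size in bytes                */
} mem_window_t;

/* Values taken from the TCML and TCMU rows of the table (128KB each). */
static const mem_window_t tcm_windows[] = {
    { "TCML", 0x1FFE0000u, 0x34FE0000u, 0x20000u },
    { "TCMU", 0x20000000u, 0x35000000u, 0x20000u },
};

/*
 * Translate an M4 TCM address into the Cortex-A35 view.
 * Returns true and stores the result in *a35_addr if m4_addr lies in
 * TCML or TCMU; returns false otherwise.
 * Assumption: offsets inside a window map 1:1 between both views.
 */
static bool m4_tcm_to_a35(uint32_t m4_addr, uint32_t *a35_addr)
{
    assert(a35_addr != NULL);

    for (size_t i = 0; i < sizeof tcm_windows / sizeof tcm_windows[0]; i++) {
        const mem_window_t *w = &tcm_windows[i];

        if (m4_addr >= w->m4_base && m4_addr - w->m4_base < w->size) {
            *a35_addr = w->a35_base + (m4_addr - w->m4_base);
            return true;
        }
    }
    return false;
}
```

The DDR rows are deliberately left out, since the table does not state which Cortex-A35 address corresponds to the M4 code bus alias of the DRAM.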

The Cortex-M4 CPU has two buses connected to the main interconnect (modified Harvard architecture). One bus is meant to fetch data (system bus) whereas the other bus is meant to fetch instructions (code bus).
To get optimal performance, the program code should be located and linked for a region which is going to be fetched through the code bus, while the data area (e.g. bss or data section) should be located in a region which is fetched through the system bus.

The TCML and TCMU regions can be accessed with zero wait states and thus provide massively better performance than DRAM, even when DRAM accesses are cached. Therefore, it is advisable to place all code and data in the TCM whenever possible.
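
As a rough illustration of this placement rule, the sketch below checks whether a code section fits entirely into TCML (reached through the code bus) and whether a data section fits entirely into TCMU (reached through the system bus). The addresses come from the table above. The macro and function names are hypothetical, and in a real project the section start addresses and sizes would come from the linker map file rather than from literals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* M4 view of the TCM banks, taken from the memory table. */
#define M4_TCML_BASE      0x1FFE0000u  /* code bus side   */
#define M4_TCMU_BASE      0x20000000u  /* system bus side */
#define M4_TCM_BANK_SIZE  0x20000u     /* 128KB per bank  */

/* In the M4 view, TCMU starts directly where TCML ends. */
static_assert(M4_TCML_BASE + M4_TCM_BANK_SIZE == M4_TCMU_BASE,
              "TCML and TCMU are expected to be contiguous");

/* True if [start, start + size) lies completely inside the given bank. */
static bool fits_in_bank(uint32_t start, uint32_t size, uint32_t bank_base)
{
    return start >= bank_base &&
           size <= M4_TCM_BANK_SIZE &&
           (start - bank_base) <= M4_TCM_BANK_SIZE - size;
}

/* Code (e.g. the text section) should be fetched through the code bus. */
static bool code_fits_tcml(uint32_t text_start, uint32_t text_size)
{
    return fits_in_bank(text_start, text_size, M4_TCML_BASE);
}

/* Data (e.g. the data and bss sections) should go through the system bus. */
static bool data_fits_tcmu(uint32_t data_start, uint32_t data_size)
{
    return fits_in_bank(data_start, data_size, M4_TCMU_BASE);
}
```

A firmware image whose sections fail these checks may still run from DRAM, but according to the text above it would do so with noticeably lower performance.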

Get the FreeRTOS Source Code

The FreeRTOS source code is currently only available from NXP's MCUXpresso web page. The steps to download it are as follows (as of 2019-11-12):

  1. Register and log into MCUXpresso
  2. On the main page, select Explore and Filter Devices
  3. Select Board on the left side
  4. Navigate to Processors ↠ i.MX ↠ 8QuadXPlus ↠ MIMX8QXx ↠ MIMX8QX6xxxFZ
  5. Click the button to Generate MCUXpresso SDK
  6. Click the Download SDK button to get the source code.

The examples provided by NXP are designed for NXP's own carrier board. Because the Colibri modules use a different pinout, the examples need to be adjusted. The following table lists the relevant pins and should help with this adjustment.

BGA Ball Nr  BGA Ball Name     AltFn  Function         Direction  SODIMM  Evaluation Board function  Evaluation Board Jumper
AF28         SCU_GPIO0_00      2      M40_UART0_RX     MICO ←     144     LCD_D_22                   X8-40
AH30         SCU_GPIO0_01      2      M40_UART0_TX     MOCI →     146     LCD_D_23                   X8-41
(G17)H26     ENET0_RGMII_RXD3  4      LSIO_GPIO5_IO08  -          135     EXT_IO_0                   X11-40
H14          USB_SS3_TC1       4      LSIO_GPIO4_IO04  -          133     EXT_IO_1                   X11-41
G15          USB_SS3_TC2       4      LSIO_GPIO4_IO05  -          127     EXT_IO_2                   X11-42
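
When adapting an example, it can help to keep this mapping in one place in the source tree. The sketch below encodes the table rows as a constant array and offers two lookups. The struct `colibri_pin_t` and both lookup functions are hypothetical helpers and are not part of the MCUXpresso SDK; the actual pin muxing still has to be done with the SDK's own pad configuration calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of the pin table above (hypothetical type, not an SDK type). */
typedef struct {
    const char *ball_name;    /* BGA ball name                */
    uint8_t     alt_fn;       /* alternate function number    */
    const char *function;     /* function selected by alt_fn  */
    uint16_t    sodimm;       /* Colibri SODIMM pin           */
    const char *eval_signal;  /* Evaluation Board function    */
    const char *eval_jumper;  /* Evaluation Board jumper pin  */
} colibri_pin_t;

static const colibri_pin_t colibri_pins[] = {
    { "SCU_GPIO0_00",     2, "M40_UART0_RX",    144, "LCD_D_22", "X8-40"  },
    { "SCU_GPIO0_01",     2, "M40_UART0_TX",    146, "LCD_D_23", "X8-41"  },
    { "ENET0_RGMII_RXD3", 4, "LSIO_GPIO5_IO08", 135, "EXT_IO_0", "X11-40" },
    { "USB_SS3_TC1",      4, "LSIO_GPIO4_IO04", 133, "EXT_IO_1", "X11-41" },
    { "USB_SS3_TC2",      4, "LSIO_GPIO4_IO05", 127, "EXT_IO_2", "X11-42" },
};

#define COLIBRI_PIN_COUNT (sizeof colibri_pins / sizeof colibri_pins[0])

static_assert(COLIBRI_PIN_COUNT == 5, "one entry per row of the table");

/* Find a pin by its SODIMM number; returns NULL if it is not in the table. */
static const colibri_pin_t *colibri_pin_by_sodimm(unsigned sodimm)
{
    for (size_t i = 0; i < COLIBRI_PIN_COUNT; i++) {
        if (colibri_pins[i].sodimm == sodimm) {
            return &colibri_pins[i];
        }
    }
    return NULL;
}

/* Find a pin by function name; returns NULL if it is not in the table. */
static const colibri_pin_t *colibri_pin_by_function(const char *function)
{
    for (size_t i = 0; i < COLIBRI_PIN_COUNT; i++) {
        if (strcmp(colibri_pins[i].function, function) == 0) {
            return &colibri_pins[i];
        }
    }
    return NULL;
}
```

For example, `colibri_pin_by_function("M40_UART0_TX")` yields the entry for SODIMM pin 146, which is routed to X8-41 on the Evaluation Board.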

The sample applications have not yet been tested by Toradex.