FreeRTOS on the Cortex-M4s of an Apalis iMX8


Article updated at 24 May 2021

The i.MX8QM applications processor is a feature- and performance-scalable multicore platform that includes two Cortex-M4 cores. These secondary cores typically run an RTOS optimized for microcontrollers or a bare-metal application. Toradex provides FreeRTOS™, a free professional-grade real-time operating system for microcontrollers, along with drivers and several examples that can be used on our Apalis iMX8QM platform. The FreeRTOS™ port is based on NXP's MCUXpresso SDK for i.MX8QM.

The build system supported to build the firmware and examples is:


The two Cortex-M4 CPU cores live side by side with the Cortex-A53/A72 based primary CPU cores. Both CPU complexes have access to the same interconnect and hence have equal access to all peripherals (shared bus topology). The graphic below is an incomplete and simplified drawing of the architecture, with emphasis on the subsystems needed to understand the heterogeneous asymmetric multicore architecture.

  [Figure: i.MX 8QM Heterogeneous Asymmetric Multicore Architecture Block Diagram]

There are several types of memory available. The Cortex-M4 provides local memory (Tightly Coupled Memory, TCM), which is relatively small but can be accessed by the CPU without any latency. For applications requiring more memory, the system DRAM is accessible by the M4 cores. From a performance perspective, the TCM should be used whenever possible.

A traditional microcontroller typically has internal NOR flash in which the firmware is stored and from which it is executed. This is not the case on the Apalis iMX8QM: there is no NOR flash onto which the firmware can be flashed. Instead, the firmware needs to be stored on a mass storage device such as an SD card or the internal eMMC flash. The available mass storage devices are not "memory mapped", and hence applications cannot be executed directly from them by any of the cores (no eXecute-In-Place, XIP). Instead, code needs to be loaded into one of the available memory sections before the CPU can start executing it.

The M4 firmware can be placed in the common boot container, so that it is loaded and started by the boot ROM, or it can be placed on a mass storage device. In the latter case, U-Boot needs to be configured to load and execute the M4 firmware.

Memory areas

The two CPU platforms use a different memory layout to access individual sub systems. This table lists some important areas and their memory location for each of the cores side by side. The full list can be found in the i.MX8QM reference manual.

Region         Size      Cortex-A53/A72          M4-0 (Code Bus)         M4-0 (System Bus)       M4-1 (Code Bus)         M4-1 (System Bus)
DDR Address    2GB(*1)   0x80000000-0xFFFFFFFF   0x00100000-0x1BFFFFFF   0x80000000-0xDFFFFFFF   0x00100000-0x1BFFFFFF   0x80000000-0xDFFFFFFF
TCML for M4-0  128KB     0x34FE0000-0x34FFFFFF   0x1FFE0000-0x1FFFFFFF   -                       N/A                     N/A
TCMU for M4-0  128KB     0x35000000-0x3501FFFF   -                       0x20000000-0x2001FFFF   N/A                     N/A
TCML for M4-1  128KB     0x38FE0000-0x38FFFFFF   N/A                     N/A                     0x1FFE0000-0x1FFFFFFF   -
TCMU for M4-1  128KB     0x39000000-0x3901FFFF   N/A                     N/A                     -                       0x20000000-0x2001FFFF

(*1): Full DRAM range is 0x8_00000000 - 0xB_FFFFFFFF. Only a part of the DRAM is accessible by the M4 cores.
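
The TCM rows of the table can be turned into a small address-translation helper, for example to work out where the Cortex-A side (U-Boot or Linux) sees a symbol taken from an M4 linker map. The following Python sketch is not part of NXP's SDK. The function name `m4_tcm_to_acore` and the constants are illustrative, and the sketch assumes that TCML and TCMU form one contiguous 256 KB window in both address maps, as the table suggests. DDR addresses are deliberately not handled, because the table does not state how the M4 code bus alias maps onto the DRAM.

```python
# Illustrative helper, not part of NXP's SDK: translate an M4-local TCM
# address into the alias seen by the Cortex-A53/A72 cores.
# All addresses are taken from the memory table above.

M4_TCM_BASE = 0x1FFE0000      # start of TCML as seen by either M4 core
TCM_SIZE = 2 * 128 * 1024     # TCML + TCMU, assumed contiguous in both views

# Start of the TCML alias in the Cortex-A address map, per M4 core.
ACORE_TCM_BASE = {0: 0x34FE0000, 1: 0x38FE0000}


def m4_tcm_to_acore(core: int, m4_addr: int) -> int:
    """Return the Cortex-A address that aliases `m4_addr` in the TCM of `core`."""
    if core not in ACORE_TCM_BASE:
        raise ValueError(f"unknown M4 core: {core}")
    offset = m4_addr - M4_TCM_BASE
    if not 0 <= offset < TCM_SIZE:
        raise ValueError(f"0x{m4_addr:08X} is not a TCM address")
    return ACORE_TCM_BASE[core] + offset
```

For instance, `m4_tcm_to_acore(0, 0x20000000)` gives the Cortex-A address of the first TCMU byte of M4 core 0, which the table lists as 0x35000000.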

The Cortex-M4 CPU has two buses connected to the main interconnect (modified Harvard architecture). One bus is meant to fetch data (system bus) whereas the other bus is meant to fetch instructions (code bus).
To get optimal performance, the program code should be located and linked for a region which is going to be fetched through the code bus, while the data area (e.g. bss or data section) should be located in a region which is fetched through the system bus.

The TCML and TCMU regions can be accessed with zero wait states and thus provide much better performance than DRAM, even when the DRAM is cached. It is therefore advisable to place all code and data in the TCM whenever possible.
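
The placement guidance above can be checked mechanically against the section addresses of a linked image. The sketch below is one possible way to do that in Python, using only the M4 address ranges from the table. The function names and the `sections` dictionary are hypothetical, and treating only `.text` as code is a simplifying assumption; a real linker map may contain further code sections.

```python
# Illustrative lookup built from the memory table above (M4 view only).
# Each entry: (first address, last address, memory, bus).
M4_REGIONS = [
    (0x00100000, 0x1BFFFFFF, "DDR", "code"),
    (0x1FFE0000, 0x1FFFFFFF, "TCML", "code"),
    (0x20000000, 0x2001FFFF, "TCMU", "system"),
    (0x80000000, 0xDFFFFFFF, "DDR", "system"),
]


def classify(addr):
    """Return (memory, bus) for an M4 address, or None if it is not in the table."""
    for first, last, memory, bus in M4_REGIONS:
        if first <= addr <= last:
            return memory, bus
    return None


def placement_warnings(sections):
    """Check a {section name: start address} mapping against the guidance above."""
    warnings = []
    for name, addr in sections.items():
        region = classify(addr)
        if region is None:
            warnings.append(f"{name}: 0x{addr:08X} is outside the listed regions")
            continue
        memory, bus = region
        # Assumption: only .text holds instructions; everything else is data.
        wanted = "code" if name == ".text" else "system"
        if bus != wanted:
            warnings.append(f"{name}: fetched through the {bus} bus, {wanted} bus preferred")
        if memory == "DDR":
            warnings.append(f"{name}: placed in DDR, TCM is faster if it fits")
    return warnings
```

A layout with `.text` in TCML and `.data` in TCMU produces no warnings, while a `.text` section linked to 0x80000000 is flagged twice: once for being fetched through the system bus and once for living in DDR.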

Get the FreeRTOS Source Code

The FreeRTOS source code is currently only available on NXP's MCUXpresso web page:

Here are the steps to download the resources (as of 2019-10-04)

  1. Register and log in to MCUXpresso
  2. On the main page, select Explore and Filter Devices
  3. Select Board on the left side
  4. Navigate to Processors → i.MX → 8QuadMax → MIMX8QMx → MIMX8QM6xxxFF
  5. Click the button to Generate MCUXpresso SDK
  6. Click the Download SDK button to get the source code.

The standard FreeRTOS and bare-metal examples provided by NXP use the M4's tightly coupled UARTs to communicate with the user. This section describes how to configure the Apalis Evaluation Board V1.1A in order to make the required UART ports accessible from your development PC, and how to load and run binary examples.

Warning: The setup supports running the samples on either M4 core 0 or M4 core 1, but not on both cores at the same time.

Hardware Setup


Rotate the Apalis Evaluation Board (EvB) so that the SD card sockets and audio connectors are pointing towards you.

  • X3 is the large horizontal jumper row away from you. X2 is shorted to the near end of X3 and holds the signals going directly to the Apalis module. X4 is shorted to the far end of X3 and holds the signals to the external circuitry.
  • X6 is the large vertical jumper row on the right side of the EvB. X5 is shorted to the left side of X6 and holds the signals going directly to the Apalis module. X7 is shorted to the right side of X6 and holds the signals to the external circuitry.
  • X28 is the stacked DSub-9 connector at the far right end of the EvB. We will use the upper DSub-9 connector to communicate with the M4 core.
  • X29 is the USB-B connector on the far end of the EvB, between the stacked DSub-9 connectors. We will use it to communicate with U-Boot and Linux.

Connection between Apalis Evaluation Board and PC

  • Connect a USB cable between X29 and your PC (115200 bps, 8N1)
  • Connect an RS232-to-USB converter (through a null-modem adapter) between X28 and your PC (115200 bps, 8N1)

Configuring the Evaluation Board

To Run Examples on M40

In order to run the regular examples on M4 core 0, the following jumpers on the Apalis Evaluation Board (EvB) need to be removed:

Remove Jumper   Apalis Standard Function
X6-35           UART1_DTR (to PC)
X6-32           UART2_TXD (to PC)
X6-29           UART2_RXD (from PC)
X6-40           UART1_DSR (from PC)

Then add two patch wires to route the M4_0 tightly-coupled UART to the DSub-9 connector X28:

Patch From   Patch To   Comment
X5-35        X7-32      iMX8 -> PC
X5-40        X7-29      PC -> iMX8

To Run Examples on M41

In order to run the regular examples on M4 core 1, the following jumpers on the Apalis Evaluation Board (EvB) need to be removed:

Remove Jumper   Apalis Standard Function
X3-12           PWM4 (to PC)
X6-32           UART2_TXD (to PC)
X6-29           UART2_RXD (from PC)
X3-13           PWM3 (from PC)

Then add two patch wires to route the M4_1 tightly-coupled UART to the DSub-9 connector X28:

Patch From   Patch To   Comment
X2-12        X7-32      iMX8 -> PC
X2-13        X7-29      PC -> iMX8

Run The Examples from U-Boot

At the time of writing this article, the tested U-Boot version was:

U-Boot 2020.04-5.1.0-devel+git.0a26a04408ca (Dec 28 2020 - 14:02:46 +0000)

Currently, U-Boot does not support loading .elf files, so we need to use .bin files. A raw .bin file carries no load address or entry point information, which leaves more room for usage errors.
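
Because a raw binary has no header to validate, a quick host-side sanity check before copying it to the SD card can catch some of these usage errors, such as accidentally passing an .elf file. The Python sketch below relies on the standard Cortex-M vector table layout (first word: initial stack pointer, second word: reset handler address with the Thumb bit set). Everything else is an assumption made for illustration: that the image starts with the vector table, that it is linked to run from TCML at 0x1FFE0000 with its stack in the TCM, and that it fits into the 128 KB TCML. The function name `check_m4_image` is hypothetical.

```python
import struct

# M4 view of the TCM, taken from the memory table in this article.
TCML_START = 0x1FFE0000
TCML_SIZE = 128 * 1024
TCM_END = 0x20020000          # one past the last TCMU byte


def check_m4_image(image: bytes) -> list:
    """Return a list of problems found in a raw M4 image (empty if none)."""
    if len(image) < 8:
        return ["image is too short to contain a vector table"]

    problems = []
    # Assumption: the whole image has to fit into TCML.
    if len(image) > TCML_SIZE:
        problems.append("image is larger than the 128 KB TCML")

    # Cortex-M vector table: word 0 = initial stack pointer,
    # word 1 = reset handler address (bit 0 set to select Thumb state).
    initial_sp, reset = struct.unpack_from("<II", image, 0)

    # Assumption: the stack lives somewhere in TCML or TCMU.
    if not TCML_START < initial_sp <= TCM_END:
        problems.append(f"initial SP 0x{initial_sp:08X} is outside the TCM")
    if reset & 1 == 0:
        problems.append("reset vector has the Thumb bit (bit 0) cleared")

    # Assumption: the image is linked to start at the TCML base address.
    entry = reset & ~1
    if not TCML_START <= entry < TCML_START + len(image):
        problems.append(f"reset handler 0x{entry:08X} is outside the image")
    return problems
```

A typical use would be `check_m4_image(open("hello_world_m40.bin", "rb").read())`. An image linked for DRAM instead of the TCM would be reported as faulty by this sketch, so the address constants need to be adapted in that case.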

Prepare your environment as follows:

  1. Open two terminals on the PC:
    • One for the connection to U-Boot / Linux
    • One for the communication with the M4
  2. Copy the hello_world_m40.bin (or whatever example you want to run) onto an SD card
  3. Insert the SD card into the right-side SD card socket of the EvB

Then, each time you want to run the example, put the keyboard focus into the U-Boot/Linux terminal and follow the steps below:

  1. Turn on the EvB

  2. Press Space in the terminal to enter U-Boot command line.

  3. Optional: To verify that your binary file is accessible on the SD card, enter:

> ls mmc 2

  4. Enter the following sequence to run the M4 code on core 0. We also added the U-Boot output for clarity:

> fatload mmc 2 ${loadaddr} hello_world_m40.bin && dcache flush && bootaux ${loadaddr} 0
18536 bytes read in 15 ms (1.2 MiB/s)
## Starting auxiliary core at 0x80280000 ...
Power on M4 and MU
Copy M4 image from 0x80280000 to TCML 0x34fe0000
Start M4 0
bootaux complete

hello_world_m40.bin is the name of your application and the trailing 0 stands for core 0.

To run the example on core 1 instead, simply replace the trailing "0" with a "1" in the command above.
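
When the command is pasted or sent repeatedly, for example from a script driving the serial console, it can help to generate it rather than edit it by hand. The following Python sketch only assembles the command string shown above. The function name `bootaux_command` is hypothetical, and the default MMC device number 2 is simply the one used in this article.

```python
LOADADDR = "${loadaddr}"  # left as-is: expanded by U-Boot, not by Python


def bootaux_command(binary: str, core: int, mmc_dev: int = 2) -> str:
    """Build the U-Boot command line that loads `binary` and starts an M4 core."""
    if core not in (0, 1):
        raise ValueError("the i.MX8QM only has M4 cores 0 and 1")
    return (
        f"fatload mmc {mmc_dev} {LOADADDR} {binary}"
        f" && dcache flush && bootaux {LOADADDR} {core}"
    )
```

Calling `bootaux_command("hello_world_m40.bin", 0)` reproduces the command used above; passing `1` as the second argument yields the variant for core 1.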