Running FreeRTOS on the Cortex-M4 of a Colibri iMX7
Introduction
The objective of this article is to guide you through case-oriented examples of running FreeRTOS on the Cortex-M of a Colibri iMX7 System on Module, focusing on executing sample application firmware that leverages the Heterogeneous Multicore Processing architecture.
The use cases described in this article were tested and validated with FreeRTOS on the Cortex-M running alongside an embedded Linux image (Linux BSP) on the A-cores.
Prerequisites
- Set up the SDK and Toolchain as described in the article Setting Up MCUXpresso SDK and Toolchain for Cortex-M development.
- Consider the architecture and memory areas of the specific i.MX-based SoC as explained in the article Cortex-M and Memory Areas Overview on Toradex SoMs
Overview
The NXP/Freescale i.MX 7 SoC, which is the core of the Colibri iMX7 module, implements a heterogeneous asymmetric architecture. Besides the main CPU core(s) based on the ARM Cortex-A7 processor, a secondary general-purpose ARM Cortex-M4 core is available. The secondary core typically runs an RTOS optimized for microcontrollers or a bare-metal application. Toradex provides FreeRTOS™, a free professional-grade real-time operating system for microcontrollers, along with drivers and several examples which can be used on our Colibri iMX7 platform. The FreeRTOS™ port is based on the NXP FreeRTOS BSP for i.MX 7.
Resource Domain Controller
Understanding and configuring the Resource Domain Controller (RDC) is crucial when running two operating systems on the i.MX 7. The RDC grants or denies access to peripherals and memory areas for individual bus masters (e.g. CPU, DMA controller) at the hardware level. It allows up to 4 resource domains to be defined, and peripherals and memory locations to be assigned to those domains. By default, the A7 core is in domain 0 and all peripherals are assigned to domain 0. When the FreeRTOS firmware starts, the Cortex-M4 core is in domain 0 too; the firmware then reassigns the Cortex-M4 and the required peripherals to domain 1 (see board.c and the example-specific hardware_init.c).
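To make the idea of per-domain permissions more concrete, the following minimal sketch builds an access mask with two bits per resource domain. The bit layout (write at bit 2n, read at bit 2n+1 for domain n) and the helper names are assumptions made for illustration only; consult the i.MX 7 reference manual and the RDC driver under platform/drivers/ for the actual register layout and API.

```c
#include <stdbool.h>
#include <stdint.h>

#define RDC_DOMAIN_COUNT 4u /* the RDC supports up to 4 resource domains */

/* Hypothetical helper: build an access mask with two bits per domain.
 * Assumed layout: bit 2n = write, bit 2n+1 = read for domain n. */
static uint8_t rdc_access_mask(unsigned domain, bool read, bool write)
{
    if (domain >= RDC_DOMAIN_COUNT)
        return 0; /* invalid domain: grant nothing */
    uint8_t bits = (uint8_t)((write ? 1u : 0u) | (read ? 2u : 0u));
    return (uint8_t)(bits << (domain * 2u));
}

/* Check whether a mask grants a domain the requested access. */
static bool rdc_domain_may(uint8_t mask, unsigned domain, bool read, bool write)
{
    uint8_t want = rdc_access_mask(domain, read, write);
    return domain < RDC_DOMAIN_COUNT && (mask & want) == want;
}
```

Under these assumptions, rdc_access_mask(1, true, true) yields 0x0C: domain 1 (FreeRTOS) may read and write the peripheral, while domain 0 (Linux) has no access at all.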
If a device that is used by the Linux kernel running on the Cortex-A7 (e.g. I2C) shall be used on the Cortex-M4, it is important to disable this device in the device tree of the Linux kernel (e.g. set the status property to disabled). The article Device Tree Customization explains in more detail how to alter the device tree.
The source code
This table shows how the FreeRTOS BSP source code is structured:
Directory | Content |
---|---|
doc/ | NXP/Freescale FreeRTOS BSP Documentation |
examples/ | Application examples |
examples/imx7_colibri_m4/ | Examples ported to Toradex Colibri iMX7 |
middleware/multicore/open-amp/ | OpenAMP based RPMsg stack (remote messaging framework) |
platform/ | Driver library, startup code and utilities |
platform/CMSIS/ | Cortex Microcontroller Software Interface Standard (CMSIS) ARM Cortex®-M header files, DSP library source |
platform/devices/MCIMX7D/linker/ | Linker control files for each supported toolchain |
platform/drivers/ | Peripheral Drivers |
platform/utilities/ | Utilities such as debug console |
rtos/FreeRTOS/ | FreeRTOS Kernel folder |
Preparing the Environment
Install the necessary dependencies:
$ sudo apt-get install make cmake libc6:i386 libncurses6:i386
Get the FreeRTOS Source Code: The FreeRTOS source code is available on our git server. Use git to clone the repository:
$ git clone -b master https://github.com/toradex/FreeRTOS-Colibri-iMX7 freertos-colibri-imx7/
$ cd freertos-colibri-imx7/
Download and install the GCC toolchain: Use one of the Linaro provided ARM Embedded toolchains, e.g. 4.9 2015 Q3 update. Unpack the toolchain to an appropriate location, e.g. your home directory:
$ tar xjf ~/Downloads/gcc-arm-none-eabi-4_9-2015q3-20150921-linux.tar.bz2
Compile the binary: Each example has a sub-directory called armgcc. Enter this sub-directory and use the provided shell scripts to build the example. The scripts expect the environment variable ARMGCC_DIR to point to the Linaro ARM Embedded toolchain:
$ export ARMGCC_DIR=~/gcc-arm-none-eabi-4_9-2015q3/
$ cd examples/imx7_colibri_m4/demo_apps/hello_world/armgcc
$ ./build_all.sh
-- TOOLCHAIN_DIR: /home/ags/gcc-arm-none-eabi-4_9-2015q3/
-- BUILD_TYPE: Debug
-- TOOLCHAIN_DIR: /home/ags/gcc-arm-none-eabi-4_9-2015q3/
-- BUILD_TYPE: Debug
-- The ASM compiler identification is GNU
-- Found assembler: /home/ags/gcc-arm-none-eabi-4_9-2015q3//bin/arm-none-eabi-gcc
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ags/freertos-colibri-imx7/examples/imx7_colibri_m4/demo_apps/hello_world/armgcc
Scanning dependencies of target hello_world
[ 4%] Building C object CMakeFiles/hello_world.dir/home/ags/freertos-colibri-imx7/rtos/FreeRTOS/Source/portable/GCC/ARM_CM4F/port.c.obj
...
[100%] Linking C executable debug/hello_world.elf
[100%] Built target hello_world
With that, the sub-directories release/ and debug/ contain the firmware binaries which can be executed on the target (see the instructions of the individual examples below).
Load and Run the Firmware on Cortex-M4
The FreeRTOS firmware uses Colibri UART_B (e.g. RS232 X25-Top on the Colibri Evaluation Board) as its debugging console. Make sure to connect UART_B to your debugging host and start a serial terminal emulator with a baud rate of 115200 on the serial port to see the FreeRTOS debug output.
U-Boot can be used to start a firmware on the Cortex-M4 core. Toradex recommends using the *.elf file format exclusively, since its header contains the load/entry address and it allows multiple independent sections to be loaded. Make sure you have access to the U-Boot console by connecting the serial console on UART_A.
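To illustrate what "the header contains the entry address" means, here is a minimal sketch that reads the entry point out of an ELF32 little-endian image held in memory. The helper names are hypothetical; the field offsets (e_entry at byte 24 of a 52-byte header) follow the ELF32 specification. This is not how U-Boot implements its loader, only an illustration of the header layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: extract e_entry from an ELF32 little-endian image.
 * Returns 0 on success, -1 if the buffer is not a usable ELF32 header. */
static int elf32_entry(const unsigned char *img, size_t len, uint32_t *entry)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (img == NULL || entry == NULL || len < 52) /* ELF32 header is 52 bytes */
        return -1;
    if (memcmp(img, magic, sizeof magic) != 0)
        return -1;
    if (img[4] != 1 || img[5] != 1) /* EI_CLASS = ELFCLASS32, EI_DATA = LSB */
        return -1;

    *entry = le32(img + 24); /* e_entry sits at offset 24 in an ELF32 header */
    return 0;
}
```

A raw *.bin image carries no such header, which is why the load and entry addresses would have to be supplied separately for that format.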
By default, our Linux device tree uses UART_B too, which leads to an external abort when the Linux kernel tries to access UART_B. It is recommended to alter the device tree and disable UART_B using the status property, which can be done by applying device tree overlays.
Steps for eMMC-based Colibri iMX7
Follow the steps described at How to Load Compiled Binaries into Cortex-M to understand how to load the firmware using the ext4load method.
Steps for NAND-based Colibri iMX7
For the Colibri iMX7S 256MB and Colibri iMX7D 512MB, follow the steps below:
> ubi part ubi
> fatload mmc 0:1 ${loadaddr} hello_world.elf
> ubi write ${loadaddr} m4firmware ${filesize}
U-Boot 2016.11 sometimes seems to have issues (UBI volume corruption) when ubi part ubi is called multiple times. Try to attach the UBI volume only once.
From within Linux, the firmware can be stored using ubiupdatevol. Check sysfs to verify which device got assigned to the m4firmware UBI volume (compare device major/minor). On a newly flashed module the m4firmware volume device is usually /dev/ubi0_2:
# cat /sys/class/ubi/ubi0/ubi0_2/name
m4firmware
# ubiupdatevol /dev/ubi0_2 hello_world.elf
Use the following commands to enable automatic loading and execution of an ELF firmware binary:
> setenv m4boot 'ubi read ${loadaddr} m4firmware && bootaux ${loadaddr}'
> saveenv
With that, U-Boot will load and start the firmware just before booting Linux:
...
Read 0 bytes from volume m4firmware to 7f8000
No size specified -> Using max size (20956)
## Starting auxiliary core at 0x007F8000 ...
...
Examples
This section provides information about the available examples and how to use them. The source code of the Colibri iMX7 examples is located under examples/imx7_colibri_m4/. All examples are linked to run in the TCM area.
Precompiled firmwares can be found at: https://developer-archives.toradex.com/files/toradex-dev/uploads/media/Colibri/FreeRTOS/Binaries/
Hello World
- Firmware Location: examples/imx7_colibri_m4/demo_apps/hello_world/
- Compiled Example: hello_world.elf
Simple application starting the FreeRTOS kernel and printing Hello World in a FreeRTOS task.
Hello World!
GPIO Example
- Firmware Location: examples/imx7_colibri_m4/driver_examples/gpio_imx/
- Compiled Example: gpio_imx_example.elf
GPIO example using input and output as well as interrupt mode. The Toradex example uses EXT_IO1 and EXT_IO2 for buttons/keys and EXT_IO0 for an LED. On a Colibri Evaluation Board, you can use header X21 to connect the SO-DIMM signals to a button or LED respectively:
- EXT_IO0 => X21-LED1
- EXT_IO1 => X21-SW6
- EXT_IO2 => X21-SW5
====================== GPIO Example ========================
=================== GPIO Interrupt =====================
The (EXT_IO1) button is configured to trigger GPIO interrupt.
Press the (EXT_IO1) button 3 times to continue.
Button pressed 1 time.
Button pressed 2 time.
Button pressed 3 time.
================= GPIO Functionality==================
The (EXT_IO1) button state is now polled.
Press the (EXT_IO1) button to switch LED on or off
Button released
Button pressed
Button released
Button pressed
...
Note: The interrupt part of the demo only works as long as Linux is not running. The reason is that Linux uses the same GPIO bank and reconfigures its interrupts. The easiest way to use GPIO in interrupt mode on FreeRTOS while using Linux is to assign a complete GPIO bank to FreeRTOS and disable that GPIO bank in the Linux device tree.
RPMsg TTY Example
- Firmware Location: examples/imx7_colibri_m4/demo_apps/rpmsg/str_echo_freertos/
- Compiled Example: rpmsg_str_echo_freertos_example.elf
- Linux Kernel Module Source: drivers/rpmsg/imx_rpmsg_tty.c
This demo uses a virtual TTY on the Linux side to send an RPMsg message to FreeRTOS. Each string sent to the FreeRTOS example will be sent back to the Linux host.
Start the example rpmsg_str_echo_freertos_example.elf on the Cortex-M4 first, which should already print the following output:
RPMSG String Echo Demo...
RPMSG Init as Remote
init M4 as REMOTE
Then, boot into Linux and execute the following commands:
# modprobe imx_rpmsg_tty
[ 61.598675] imx_rpmsg_tty rpmsg0: new channel: 0x400 -> 0x1!
[ 61.606640] Install rpmsg tty driver!
...
With that, we have a newly created TTY device that we can use to communicate with the M4 core. The probe method also sends a "hello world" to the other side, which should be visible in the Cortex-M4 console:
...
Name service handshake is done, M4 has setup a rpmsg channel [1 ---> 1024]
Get Message From A7 : "hello world!" [len : 12] from slot 0
We can use bash to open the file and assign it the file descriptor 3, write into it, read from it and ultimately close it.
# DEV=/dev/`ls /dev/|grep RPMSG`
# stty -F $DEV -echo
# exec 3<> $DEV
# echo Test >&3
# cat <&3
Test
^C
# exec 3>&-
On the Cortex-M4 side, this should print the Test message:
...
Get Message From A7 : "Test" [len : 4] from slot 1
Get New Line From A7 From Slot 2
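The same open, write, read and close sequence can also be performed from a small C program instead of bash. The sketch below is a minimal illustration under a few assumptions: the helper name rpmsg_tty_echo is hypothetical, and the device path has to be passed in by the caller (for example the node reported by ls /dev | grep RPMSG), since the exact device name is not fixed here.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: send one line to the RPMsg TTY and read the echo.
 * Returns the number of bytes read back, or -1 on any error. */
static ssize_t rpmsg_tty_echo(const char *dev, const char *msg,
                              char *reply, size_t reply_len)
{
    int fd = open(dev, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    /* Equivalent of "stty -echo": stop the TTY layer from echoing input. */
    struct termios tio;
    if (tcgetattr(fd, &tio) == 0) {
        tio.c_lflag &= ~(tcflag_t)ECHO;
        tcsetattr(fd, TCSANOW, &tio);
    }

    ssize_t n = -1;
    size_t len = strlen(msg);
    if (write(fd, msg, len) == (ssize_t)len && write(fd, "\n", 1) == 1)
        n = read(fd, reply, reply_len); /* blocks until the M4 answers */

    close(fd);
    return n;
}
```

Note that the read call blocks until data arrives, so the firmware must be running on the Cortex-M4 and the imx_rpmsg_tty module must be loaded before calling the helper.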
To load the module automatically on every boot, a config file in the modules-load.d directory can be used:
# echo imx_rpmsg_tty > /etc/modules-load.d/rpmsg_tty.conf
RPMsg Char Example
- Firmware Location: examples/imx7_colibri_m4/demo_apps/rpmsg/str_echo_bm
- Linux Kernel Module Source: drivers/rpmsg/rpmsg_char.c
The Linux kernel should be built with the following options enabled:
CONFIG_RPMSG_CHAR=y
CONFIG_RPMSG_VIRTIO=y
CONFIG_HAVE_IMX_RPMSG=y
CONFIG_RPMSG=y
Here RPMSG String Echo Bare Metal Demo is used which reports all newly created dynamic endpoints and reports all messages sent to it.
Start the example rpmsg_str_echo_bm.elf on the Cortex-M4 first, which should already print the following output:
RPMSG String Echo Bare Metal Demo...
RPMSG Init as Remote
Then, build this example application, put it into your rootfs and boot Linux:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/rpmsg.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    char data_buf[] = {'a', 'b', 'c', 'd', 'e', '\0'};
    struct rpmsg_endpoint_info ept_info = {"rpmsg-openamp-demo-channel", 0x2, 0x1};

    int fd = open("/dev/rpmsg_ctrl0", O_RDWR);
    if (fd < 0) {
        perror("open /dev/rpmsg_ctrl0");
        return 1;
    }
    /* create endpoint interface; /dev/rpmsg0 is created */
    if (ioctl(fd, RPMSG_CREATE_EPT_IOCTL, &ept_info) < 0) {
        perror("RPMSG_CREATE_EPT_IOCTL");
        close(fd);
        return 1;
    }
    /* create endpoint; the backend creates it when the device is opened */
    int fd_ept = open("/dev/rpmsg0", O_RDWR);
    if (fd_ept < 0) {
        perror("open /dev/rpmsg0");
        close(fd);
        return 1;
    }
    /* receive data from remote device */
    if (read(fd_ept, data_buf, sizeof(data_buf)) < 0)
        perror("read");
    /* send data to remote device */
    if (write(fd_ept, data_buf, sizeof(data_buf)) < 0)
        perror("write");
    /* destroy endpoint */
    ioctl(fd_ept, RPMSG_DESTROY_EPT_IOCTL);
    close(fd_ept);
    close(fd);
    return 0;
}
When executing this application, it will create a new dynamic endpoint using the /dev/rpmsg_ctrl0 device and send the "abcde" string to the remote M4 application:
UART A:
root@colibri-imx7:/# ./test
[ 256.791124] virtio_rpmsg_bus virtio0: RPMsg ept created, rpmsg-openamp-demo-channel:2,0.
[ 256.813333] virtio_rpmsg_bus virtio0: msg received with no recipient
UART B:
RPMSG String Echo Bare Metal Demo...
RPMSG Init as Remote
Name service handshake is done, M4 has setup a rpmsg channel [1 ---> 2]
Name service handshake is done, M4 has setup a rpmsg channel [0 ---> 2]
Name service handshake is done, M4 has setup a rpmsg channel [1 ---> 2]
Get Message From Master Side : "abcde" [len : 6] from slot 0
RPMsg PingPong Example
- Firmware Location: examples/imx7_colibri_m4/demo_apps/rpmsg/pingpong_freertos/
- Compiled Example: rpmsg_pingpong_freertos_example.elf
- Linux Kernel Module Source: drivers/rpmsg/imx_rpmsg_pingpong.c
This example uses the messaging mechanism to send an integer value back and forth, incrementing it on both sides. The demo will stop after 100'000 iterations (that is, when 200'000 increments have been executed).
# modprobe imx_rpmsg_pingpong
[ 19.538184] imx_rpmsg_pingpong rpmsg0: new channel: 0x400 -> 0x1!
[ 19.553916] get 1 (src: 0x1)
[ 19.559268] get 3 (src: 0x1)
[ 19.564410] get 5 (src: 0x1)
[ 19.570272] get 7 (src: 0x1)
[ 19.575276] get 9 (src: 0x1)
[ 19.580268] get 11 (src: 0x1)
[ 19.585275] get 13 (src: 0x1)
...
[ 330.056188] get 199999 (src: 0x1)
[ 330.056200] imx_rpmsg_pingpong rpmsg0: goodbye!
RPMSG PingPong Demo...
RPMSG Init as Remote
init M4 as REMOTE
Name service handshake is done, M4 has setup a rpmsg channel [1 ---> 1024]
Get Data From A7 : 0
Get Data From A7 : 2
Get Data From A7 : 4
Get Data From A7 : 6
Get Data From A7 : 8
Get Data From A7 : 10
Get Data From A7 : 12
Get Data From A7 : 14
...
Set the kernel console log level to 5 to keep these messages off the debug console:
# echo 5 > /proc/sys/kernel/printk
TAQ (Toradex + Antmicro + QT)
Toradex, together with Antmicro and Qt, created a self-balancing robot called TAQ. You can see the robot in action in this video. The firmware was built using an early version of the Colibri iMX7 BSP, hence it might not run well with newer Linux BSPs. However, the API is largely unchanged, so the source code can still be used as a reference.
- TAQ User Interface: https://github.com/mitchcurtis/robot-faces
- TAQ Firmware: https://github.com/antmicro/imx7-taq-demo/
Other Examples
There are other examples available. Some of the examples run from DDR (_ddr) and hence require a different load/boot address; you must also make sure that Linux is not using that DDR memory. Some examples are bare-metal (_bm) examples, which do not use the FreeRTOS kernel but still use the driver/RPMsg framework.
Customization
Increase available Heap
The heap size of FreeRTOS can be changed by altering the value defined by configTOTAL_HEAP_SIZE in the FreeRTOSConfig.h header file. However, the maximum size will depend on the available memory as defined in the linker file, which ultimately depends on which memory location is used (see Section Memory Areas).
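Because the FreeRTOS heap is typically a statically allocated array, it competes with the application's other static data and stacks for the same RAM region. The following minimal sketch shows one way to catch an oversized heap at build time. All sizes and the APP_* names are hypothetical placeholders, not values from the BSP; take the real figures from the linker file of the example you build.

```c
#include <stddef.h>

/* Hypothetical figures for illustration; take the real ones from the linker file. */
#define APP_DATA_REGION_SIZE   (32u * 1024u) /* assumed size of the data RAM region */
#define APP_STATIC_AND_STACK   (8u * 1024u)  /* assumed .data/.bss/stack usage */

/* In FreeRTOSConfig.h this is the value to change. */
#define configTOTAL_HEAP_SIZE  ((size_t)(16u * 1024u))

/* Fail the build early if the heap cannot fit next to everything else. */
_Static_assert(configTOTAL_HEAP_SIZE + APP_STATIC_AND_STACK <= APP_DATA_REGION_SIZE,
               "configTOTAL_HEAP_SIZE does not fit into the data region");

/* Bytes left for other uses after heap and static data are placed. */
static size_t heap_headroom(void)
{
    return APP_DATA_REGION_SIZE - APP_STATIC_AND_STACK - configTOTAL_HEAP_SIZE;
}
```

With the placeholder numbers above, 8 KiB of headroom remains; raising the heap beyond 24 KiB would make the build fail instead of producing a firmware that overflows its memory region.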
Change RPmsg Buffer Count and Size
RPMsg creates two VirtIO ring buffers, one for each direction. The default buffer size is 512 bytes (of which 16 bytes are used by a header) and 256 buffers are allocated for each direction. VirtIO's vring implementation is a sophisticated ring buffer implementation (the paper virtio: Towards a De-Facto Standard For Virtual I/O Devices describes the design in more detail). To better understand the memory requirements, a high-level overview of the inner workings of VirtIO is required. The system consists of:
- Buffers (shared, dynamically allocated, 512 bytes by default)
- The virtio_ring (shared, at predefined location in DDR3, 0x8FFF0000 and 0x8FFF8000)
- Descriptors array (16 bytes per buffer)
- Available ring
- Used ring
By default, Linux is the RPMsg master. The RPMsg master allocates the buffers; Linux does so in DDR3 memory. The location of the virtio_ring as well as the ring length and the size of the buffers are currently hard-coded and must match between Linux and FreeRTOS. Furthermore, the ring length must be a power of two (256, 512, 1024...).
- Linux: arm/mach-imx/imx_rpmsg.c (RPMSG_NUM_BUFS, RPMSG_BUF_SIZE and location of vring in imx_rpmsg_probe)
- FreeRTOS: middleware/multicore/open-amp/porting/imx7d_m4/platform_info.c (RPMSG_NUM_BUFS, VRING0/1_BASE) and middleware/multicore/open-amp/rpmsg/rpmsg_core.h (RPMSG_BUFFER_SIZE)
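Before changing the buffer count or size, it can help to estimate the resulting memory footprint. The sketch below does this arithmetic with hypothetical helper names. The vring size formula follows the split-ring layout of the virtio specification (16-byte descriptors, 2-byte available ring entries, 8-byte used ring entries); the alignment value passed in is an assumption and must be checked against the actual Linux and FreeRTOS configuration.

```c
#include <stdbool.h>
#include <stddef.h>

#define VRING_DESC_SIZE      16u /* bytes per descriptor */
#define VRING_USED_ELEM_SIZE 8u  /* per the virtio specification */

/* The ring length must be a non-zero power of two. */
static bool ring_len_valid(unsigned num)
{
    return num != 0 && (num & (num - 1)) == 0;
}

/* Size of one vring (descriptors + available ring + used ring), following
 * the split-ring layout of the virtio specification. */
static size_t vring_bytes(unsigned num, size_t align)
{
    size_t desc_avail = (size_t)VRING_DESC_SIZE * num + 2u * (3u + num);
    size_t used = 2u * 3u + (size_t)VRING_USED_ELEM_SIZE * num;
    return ((desc_avail + align - 1) & ~(align - 1)) + used;
}

/* Shared buffer memory for both directions. */
static size_t buffer_bytes(unsigned num, size_t buf_size)
{
    return 2u * (size_t)num * buf_size;
}

/* Payload left in a buffer after the 16-byte RPMsg header. */
static size_t payload_bytes(size_t buf_size)
{
    return buf_size - 16u;
}
```

With the defaults (256 buffers of 512 bytes per direction), the buffers occupy 256 KiB in total and each message can carry 496 bytes of payload. Assuming a 4096-byte alignment, one vring needs 10246 bytes, which fits into the 0x8000 bytes between the two default vring addresses.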
Run FreeRTOS and Windows CE
Demo with prebuilt M4 binary:
- Download the Toradex CE Libraries and build the Rpmsg_Demo application.
- Run Rpmsg_Demo.exe application along with rpmgsp_pingpong_example.bin
- Verify whether the M4 firmware is running by displaying the messages received from M4.
Build M4 binary on Windows 7 machine
- Install Git and clone the Toradex FreeRTOS repository:
git clone -b master https://github.com/toradex/FreeRTOS-Colibri-iMX7
- Download and install MinGW.
- Select the mingw32-base and msys-base packages and apply the changes.
- Install cmake.
- Download the zip package of the GNU ARM Embedded Toolchain and extract its content into C:\MinGW\bin. You should see the folders arm-none-eabi, bin, lib and share below C:\MinGW\bin.
- Set the user environment variables (My Computer > Properties > Advanced System settings > Environment variables): ARMGCC_DIR=C:\MinGW\bin and PATH=C:\MinGW\bin.
- Run build_all.bat, which you can find in the FreeRTOS source code directories, to build the FreeRTOS source code.
Extract entry point instruction from firmware:
- Run C:\MinGW\msys\1.0\msys.bat file in a command prompt.
- Run command readelf -h firmwareFile.elf, it will output entry point address of the firmware file.
Configure the library through API:
- Configure LoadAddr, the address into which the firmware needs to be loaded.
- Configure CodeSize, the code block size of the firmware.
- Configure ExecuteAddr, the entry point instruction of the firmware. Please refer to the Memory areas section for more details.
- Configure RxRingAddr, Rx shared memory address. This must be identical to the "VRING0_BASE" value in the file "\freertos-toradex\middleware\multicore\open-amp\porting\vf6xx_m4\platform_info.c".
- Configure TxRingAddr, Tx shared memory address. This must be identical to the "VRING1_BASE" value in the file "\freertos-toradex\middleware\multicore\open-amp\porting\vf6xx_m4\platform_info.c".
- Configure ChannelName, communication channel name to filter the packets. This must be identical to the configured channel name in proc_table structure in "\freertos-toradex\middleware\multicore\open-amp\porting\vf6xx_m4\platform_info.c".
- Call Rpmsg_Open to initialize the firmware.
- Get name of the event using DataAvailableEvent for creating an event and wait till you receive any message.
- Flush the message buffer by calling FlushRcvFifo and start transmitting and receiving data.
Troubleshooting
Linux
To troubleshoot Linux problems (e.g. when Linux hangs during boot after the firmware has been started), the kernel's earlyprintk mechanism can help. Often the kernel does actually start booting but crashes just before the serial port (and hence the serial console) is initialized. The earlyprintk mechanism starts printing kernel log messages from the very beginning.
The earlyprintk functionality is not part of the default kernel and hence needs to be enabled in the kernel configuration (see also Build U-Boot and Linux Kernel from Source Code).
Kernel hacking --->
[*] Kernel low-level debugging functions (read help!)
Kernel low-level debugging port (i.MX7D Debug UART) --->
(1) i.MX Debug UART Port Selection
[*] Early printk
Problems often appear due to the RDC (Resource Domain Controller). For example, if the firmware on the secondary CPU acquired exclusive access to a certain peripheral, the primary core(s) running Linux won't be able to access that peripheral anymore. This leads to an external abort on the primary CPU, as seen below:
...
[ 0.426158] Unhandled fault: imprecise external abort (0x1c06) at 0x76563000
[ 0.433355] Internal error: : 1c06 [#1] SMP ARM
[ 0.437921] Modules linked in:
[ 0.441175] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.52-03016-g6bcfa7b-dirty #5
[ 0.449155] task: 8c0a8000 ti: 8c09e000 task.ti: 8c09e000
[ 0.454606] PC is at i2c_imx_probe+0x36c/0x4cc
[ 0.459236] LR is at i2c_imx_probe+0x320/0x4cc
...
To resolve this, make sure to not access the same peripheral on the primary and secondary CPU.
GPT
The recommended clock for the General Purpose Timer (GPT) is the 24 MHz oscillator clock, since it is guaranteed to be always enabled. If a faster clock is required, the System PLL PFD0 can be used; however, Linux needs to be modified to make sure the PFD does not get disabled by the Linux clock infrastructure. Adding the following two lines of code at the bottom of imx7d_clocks_init in arch/arm/mach-imx/clk-imx7d.c prevents Linux from disabling the System PLL PFD0:
/* Make sure pfd0 stays on in CCM_ANALOG part */
clk_prepare_enable(clks[IMX7D_PLL_SYS_PFD0_392M_CLK]);
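Whether the faster clock is actually needed depends on the timer resolution the application requires. As a rough aid, the following sketch computes how many timer counts correspond to a given period for a chosen source clock and divider. The helper name is hypothetical and the code does not use the BSP's GPT driver; it only illustrates the arithmetic, assuming a 32-bit counter.

```c
#include <stdint.h>

#define GPT_CLK_OSC_24M_HZ 24000000u /* 24 MHz oscillator */

/* Hypothetical helper: number of timer counts for a period in microseconds,
 * given the source clock and a divider (prescaler >= 1). Returns 0 on
 * invalid input or if the result does not fit a 32-bit counter. */
static uint32_t gpt_counts_for_us(uint32_t clk_hz, uint32_t prescaler,
                                  uint32_t period_us)
{
    if (prescaler == 0)
        return 0;
    uint64_t counts = (uint64_t)clk_hz * period_us /
                      ((uint64_t)prescaler * 1000000u);
    return counts > UINT32_MAX ? 0 : (uint32_t)counts;
}
```

For example, a 1 ms period on the undivided 24 MHz oscillator corresponds to 24000 counts, i.e. a resolution of roughly 42 ns per count, which is sufficient for many applications without touching the Linux clock code.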