Build TensorFlow Lite Applications With NPU Support on SL1680-Based Modules on Torizon OS
Introduction
In this article, you will learn the basic workflow to build containerized TensorFlow Lite applications with Neural Processing Unit (NPU) support on SL1680-based modules running Torizon OS.
Prerequisites
- Hardware Prerequisites:
    - An SL1680-based Toradex Single Board Computer (SBC)
    - A compatible Carrier Board
- Software Prerequisites:
    - A Toradex SBC with Torizon OS installed
    - A configured build environment, as described in the Configure Build Environment for Torizon Containers article
    - A quantized .tflite model supported by the VX delegate. CPU fallback remains available for models that are not compatible with the NPU.
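Quantized models store weights and activations as 8-bit integers, which is what the NPU operates on directly. As background, TensorFlow Lite relates an integer value q to its real value through the affine mapping real = scale * (q - zero_point). The sketch below illustrates that mapping in plain Python; the scale and zero point shown are illustrative values for an input range of [0.0, 1.0] over uint8, not taken from any specific model:

```python
def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map a quantized uint8 value back to its real value."""
    return scale * (q - zero_point)

def quantize(real: float, scale: float, zero_point: int) -> int:
    """Map a real value to its quantized representation, clamped to uint8."""
    q = round(real / scale) + zero_point
    return max(0, min(255, q))

# Illustrative parameters for an input range of [0.0, 1.0] over uint8
scale, zero_point = 1.0 / 255.0, 0
mid_gray = quantize(0.5, scale, zero_point)
print(mid_gray, dequantize(mid_gray, scale, zero_point))
```

Models whose tensors are not quantized this way cannot be offloaded to the NPU, which is why the runtime keeps the CPU fallback available.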
Build TensorFlow Lite Applications
On your host machine, create a Dockerfile to set up the necessary dependencies to run TensorFlow Lite with NPU support. The produced container will include the runtime provided by Synaptics, which is designed to run on any hardware accelerator available on the system.
You do not need to use Toradex's packages to train your models. Toradex recommends using the upstream TensorFlow libraries for training.
ARG IMAGE_ARCH=linux/arm64
FROM --platform=${IMAGE_ARCH} torizon/debian-sl1680:4
# Define build arguments for the application
ARG TF_LITE_MODEL=""
ARG IMAGE_DATASET=""
ARG LABELMAP_FILE=""
ARG INFERENCE_SCRIPT=""
ARG WORKDIR_PATH="/app"
# Environment setup
ENV DEBIAN_FRONTEND=noninteractive \
PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1
WORKDIR ${WORKDIR_PATH}
# Install dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends \
\
# Python dependencies
python3 \
python3-venv \
python3-pip \
\
# Python libraries
python3-numpy \
python3-pil \
\
# TensorFlow Lite dependencies
libtensorflow-lite2.15.0-1 \
tflite-vx-delegate \
tim-vx \
\
# Synaptics dependencies
synap-runtime \
synap-prebuilts-libs \
synap-vxk \
&& rm -rf /var/lib/apt/lists/*
# Set up a Python virtual environment and install the TensorFlow Lite runtime
RUN python3 -m venv ${WORKDIR_PATH}/.venv --system-site-packages
# Install Python dependencies
RUN . ${WORKDIR_PATH}/.venv/bin/activate && \
pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir tflite-runtime
# Copy the application code into the container
COPY ${TF_LITE_MODEL} ${WORKDIR_PATH}/
COPY ${LABELMAP_FILE} ${WORKDIR_PATH}/
COPY ${IMAGE_DATASET} ${WORKDIR_PATH}/
COPY ${INFERENCE_SCRIPT} ${WORKDIR_PATH}/
As an example, we are going to use the NXP-provided image classification demo (label_image) for the remainder of this article. Download the required resources using the commands below:
# Download the input image
$ curl -L https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
-o grace_hopper.bmp
# Download and extract the model
$ curl -L https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224_quant.tgz \
| tar xz --wildcards --no-anchored '*.tflite'
# Download and extract the labels file
$ curl -L https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz \
| tar xz --strip-components=1 --wildcards --no-anchored '*/labels.txt'
# Download the inference script
$ curl -L https://raw.githubusercontent.com/nxp-imx/tensorflow-imx/lf-6.12.49_2.2.0/tensorflow/lite/examples/python/label_image.py \
-o label_image.py
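The labels file downloaded above maps each class index in the model's output tensor to a human-readable name. A minimal sketch of the post-processing step that label_image.py performs, shown here with synthetic data standing in for the real 1001-entry labels file and the model's uint8 output tensor:

```python
def top_k_labels(scores, labels, k=3):
    """Return the k highest-scoring (label, score) pairs from a model output."""
    ranked = sorted(enumerate(scores), key=lambda item: item[1], reverse=True)
    return [(labels[i], s) for i, s in ranked[:k]]

# Synthetic example: five fake class scores and labels
labels = ["background", "tench", "goldfish", "shark", "ray"]
scores = [0, 12, 200, 30, 5]

for label, score in top_k_labels(scores, labels, k=2):
    print(f"{label}: {score}")
```

The demo script applies the same idea to the model's real output: find the highest-scoring indices and look up their names in labels.txt.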
Before building the image, make sure the TF_LITE_MODEL, IMAGE_DATASET, LABELMAP_FILE, and INFERENCE_SCRIPT build arguments point to valid file paths, either by editing their default values in the Dockerfile or by passing them with --build-arg at build time.
Finally, build the container image using the command below. Make sure to replace <username> and <image-name> with your Docker Hub username and the name for your image, respectively.
$ docker build -t <username>/<image-name> .
$ docker push <username>/<image-name>
Run the Application
Once you have built and pushed your Docker image to the container registry, you can run the container on your SL1680-based module. Use the following command to start and attach to the container with the necessary permissions and environment variables to access the NPU:
# docker run -it \
-v /dev:/dev \
--device-cgroup-rule "c 10:119 rmw" \
<username>/<image-name>
Finally, execute the TensorFlow Lite demo application inside the container:
## source .venv/bin/activate
## python3 label_image.py --model_file mobilenet_v1_1.0_224_quant.tflite --image grace_hopper.bmp --label_file labels.txt --ext_delegate=/usr/lib/aarch64-linux-gnu/libvx_delegate.so
For comparison, run the following command to execute the application on the CPU, omitting the external delegate:
## python3 label_image.py --model_file mobilenet_v1_1.0_224_quant.tflite --image grace_hopper.bmp --label_file labels.txt
Performance Comparison
The following table provides a performance comparison of the inference times for the CPU and NPU on the Luna SL1680 module when running the TensorFlow Lite demo application:
| SoM | HW accelerator | Inference Time | FPS (1/Inference Time) |
|---|---|---|---|
| Luna SL1680 | None (CPU) | 74.10 ms | 13.49 FPS |
| Luna SL1680 | NPU | 2.10 ms | 476.19 FPS |
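The FPS column follows directly from the inference time (FPS = 1000 / inference time in milliseconds). A quick sanity check of the figures in the table above:

```python
def fps(inference_time_ms: float) -> float:
    """Frames per second achievable at a given per-frame inference time."""
    return 1000.0 / inference_time_ms

# Figures from the table above
cpu_fps = fps(74.10)  # ~13.5 FPS on the CPU
npu_fps = fps(2.10)   # ~476.2 FPS on the NPU
print(f"NPU speedup over CPU: {npu_fps / cpu_fps:.1f}x")
```

In other words, offloading this model to the NPU yields roughly a 35x reduction in inference time over the CPU on the Luna SL1680.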