Hailo AI Software Suite


Hailo devices are accompanied by a comprehensive AI Software Suite that enables the compilation of deep learning models and the implementation of AI applications in production environments. The model build environment integrates with common ML frameworks, so it fits into existing development ecosystems. The runtime environment enables integration and deployment on host processors, such as x86- and ARM-based products, when using Hailo-8, and directly on the Hailo-15 vision processor.

Key Components of the AI Software Suite:

Model Build Environment

Model Zoo, a variety of common and state-of-the-art pre-trained models and tasks in TensorFlow and ONNX.

Dataflow Compiler, for offline compilation and optimization of user’s models for Hailo devices.

Runtime Environment

TAPPAS, a set of full application examples, implementing pipeline elements and pre-trained AI tasks.

HailoRT, a production-grade, lightweight runtime software package that runs on the host processor for real-time inference of deep learning models compiled by the Dataflow Compiler.

AI Vision Processor


TAPPAS, a set of full application examples, implementing pipeline elements and pre-trained AI tasks.

HailoRT, a production-grade, lightweight runtime software package that runs on Hailo-15™ for real-time inference of deep learning models compiled by the Dataflow Compiler.

Breathe life into your edge products with Hailo’s AI Accelerators and Vision Processors

Dataflow Compiler

A Complete & Scalable Software Toolchain

Hailo devices are accompanied by a comprehensive dataflow compiler that seamlessly integrates with existing deep learning development frameworks to allow smooth and easy integration in existing development ecosystems.

Full deployment flow toolchain capabilities:

Model Translation from industry-standard frameworks to Hailo executable format

Model Optimization to the internal representation, using a state-of-the-art quantization scheme

Automated resource allocation for meeting user requirements in FPS, latency, and power consumption

Compilation of models into Hailo binary by the dedicated deep learning compiler

Loading binary and running inference on the Hailo target device

Supports both standalone inference, which allows direct access to the device, and TensorFlow-integrated inference for easy integration with existing environments
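The quantization step in the flow above can be illustrated with a generic sketch. It assumes a plain 8-bit affine (scale and zero-point) scheme, which is a common approach but not necessarily the scheme the Dataflow Compiler applies; all function names here are hypothetical and not part of any Hailo API.

```python
def quantize_params(values, num_bits=8):
    """Derive a scale and zero point that map the value range onto unsigned ints."""
    qmax = 2 ** num_bits - 1
    # The representable range must include 0.0 so that zero maps exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax or 1.0  # fall back to 1.0 for an all-zero tensor
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1], clamping out-of-range values."""
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in qvalues]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
scale, zero_point = quantize_params(weights)
restored = dequantize(quantize(weights, scale, zero_point), scale, zero_point)
```

Comparing `restored` with `weights` shows the rounding error that quantization introduces, which is why a toolchain reports quantized accuracy alongside full-precision accuracy.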

Analysis and debug tools:

Emulator providing bit-exact emulation of chip behavior

Profiler providing chip performance estimations (e.g. FPS, power and latency)

Find the best model to run on your Hailo device with the new Model Explorer


Runtime Software Suite

HailoRT is a production-grade, lightweight, scalable runtime software package that provides a robust library with intuitive APIs for optimized performance. Our AI SDK enables developers to build fast AI application pipelines for production and is also suitable for evaluation and prototyping. HailoRT runs directly on the Hailo AI Vision Processor; when a Hailo AI Accelerator is used, it runs on the host processor and enables high-throughput inference with one or more Hailo devices. HailoRT is available as open-source software on Hailo's GitHub.

HailoRT Key capabilities:

Multi-Host architecture support – supports both x86 & ARM architectures

Flexible interfaces for AI applications – C/C++ and Python API

Easy integration with devices and pipelines – standard frameworks support: GStreamer and ONNX Runtime

Supports multiple streams – process multiple video streams simultaneously

Supports high throughput inferencing with up to 16 Hailo AI Accelerator devices

Seamless control interfaces for two-way control and data communication with the Hailo NN core

Key components:

Runtime Frameworks Integration

  • pyHailoRT – Python API for loading models onto the Hailo NN core and sending and receiving data
  • GStreamer plugin – provides the “hailonet” element, which runs inference on GStreamer frames according to the configured network. This element can be used multiple times in a GStreamer pipeline to run multiple networks in parallel
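As a rough illustration of using the “hailonet” element more than once, the sketch below builds a gst-launch style pipeline description with one branch per network. The `hef-path` property name, the surrounding elements, and the file names are assumptions for illustration and should be checked against the HailoRT GStreamer documentation; `build_pipeline` is a hypothetical helper.

```python
def build_pipeline(video_path, hef_paths):
    """Build a pipeline description with one hailonet branch per HEF file.

    Assumption: hailonet takes the compiled network through a `hef-path`
    property. The source, tee, and sink elements are generic placeholders.
    """
    source = (
        f"filesrc location={video_path} ! decodebin ! videoconvert ! tee name=t"
    )
    branches = [
        f"t. ! queue ! hailonet hef-path={hef} ! fakesink" for hef in hef_paths
    ]
    return " ".join([source, *branches])


# Hypothetical file names, for illustration only.
pipeline = build_pipeline("video.mp4", ["detector.hef", "segmenter.hef"])
print(pipeline)
```

The resulting string could be passed to `gst-launch-1.0` or to a GStreamer parse-launch call once the element and property names have been verified on the target system.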

Integration Tool

Verifies hardware integration with the Hailo AI Accelerator


Command-line application for controlling the Hailo device(s), running inference on the device(s), and collecting inference statistics and device events

HailoRT Library

Robust user-space runtime C/C++ API for control and data transfer to and from Hailo devices

Yocto Layer

  • Enables integration of HailoRT into a Yocto build for both Hailo AI Vision Processor and Hailo AI Accelerators
  • Includes recipes for the HailoRT library, pyHailoRT, PCIe driver and NN Core driver

Model Zoo

The Hailo Model Zoo provides deep learning models for various computer vision tasks. The pre-trained models can be used to create fast prototypes on Hailo devices.

Additionally, the Hailo Model Zoo Github repository provides users with the capability to quickly and easily reproduce Hailo’s published performance on the common models and architectures included in our Model Zoo.

Main features include

A variety of common and state-of-the-art pre-trained models and tasks in TensorFlow and ONNX

Model details, including full precision accuracy vs. quantized model accuracy measured on Hailo devices

Each model also includes a binary HEF file that is fully supported in the Hailo toolchain and Application suite (for registered users only)

Model Explorer

The Hailo Model Explorer is a dynamic tool designed to help users explore the models on the Model Zoo and select the best NN models for their AI applications.

The Model Zoo lets users quickly and easily reproduce Hailo's published performance on the common models and architectures it includes, and retrain these models. The collection encompasses both common and state-of-the-art models available in TensorFlow and ONNX formats.
The pre-trained models can be used for rapid prototyping on Hailo devices, and each model is accompanied by a binary HEF file that is fully supported within the Hailo toolchain and Application suite (accessible to registered users only).

Selecting an appropriate model for your application can be challenging because of factors such as inference speed, model availability, desired accuracy, and licensing. Inference speed is unique among these because it cannot be easily estimated without the underlying hardware.
Unfortunately, no single intrinsic model attribute (e.g., FLOPS, parameters, size of activation maps, etc.) is a reliable predictor of inference speed and, to complicate things further, different hardware architectures have different optimal workloads. While it is possible to measure the inference time for each model, doing so can be tedious and time-consuming.
The Model Explorer was designed to help users make better-informed decisions and get maximum efficiency on the Hailo platform. It offers an interactive interface with filters based on Hailo device, task, model, FPS, and accuracy, allowing users to explore numerous NN models from Hailo’s library.
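The kind of filtering described above can be sketched in a few lines. This is only an illustration of selecting by FPS and accuracy thresholds, not how the Model Explorer is implemented; the two entries reuse the Hailo-8 object detection figures quoted in the Object detection section of this page, and `ModelEntry` and `select_models` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    fps: float       # measured throughput on the target device
    accuracy: float  # mAP on COCO


# Figures quoted on this page for inference on Hailo-8.
CATALOG = [
    ModelEntry("YOLOv5m", 218, 42.46),
    ModelEntry("SSD-MobileNet-v1", 1055, 23.17),
]


def select_models(catalog, min_fps=0.0, min_accuracy=0.0):
    """Return models meeting both thresholds, most accurate first."""
    matches = [
        m for m in catalog if m.fps >= min_fps and m.accuracy >= min_accuracy
    ]
    return sorted(matches, key=lambda m: m.accuracy, reverse=True)


fast_enough = select_models(CATALOG, min_fps=500)
accurate_enough = select_models(CATALOG, min_accuracy=40)
```

With these two entries, no model satisfies both a 500 FPS floor and a 40 mAP floor at once, which is exactly the speed-versus-accuracy tradeoff that makes measured numbers necessary.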

Read more about how the Hailo Model Zoo works and what it can do on the Hailo Blog.


Streamline the Development of Edge AI Applications

Hailo has developed TAPPAS to streamline the development and deployment of edge applications that require high AI performance. This package of reference applications helps users accelerate their time-to-market by reducing the time and effort spent on development. TAPPAS includes a user-friendly, GStreamer-based set of fully functional application examples that incorporate pipeline elements and pre-trained AI tasks. These examples are built on top of advanced Deep Neural Networks that showcase the best-in-class throughput and power efficiency of Hailo’s AI processors. In addition, TAPPAS illustrates how Hailo’s system integration works by showcasing specific use cases on predefined software and hardware platforms. By using TAPPAS, you can simplify integration with Hailo’s runtime software stack and have a starting point for fine-tuning your applications.

Object detection

Detection on one video file source by running a single-stream object detection pipeline

Detecting and classifying objects within an image is a crucial task in computer vision, known as object detection. Deep learning models trained on COCO, a popular dataset for object detection, offer varying tradeoffs between performance and accuracy. For instance, when running inference on Hailo-8, the YOLOv5m model achieves 218 FPS and 42.46 mAP accuracy, while the SSD-MobileNet-v1 model attains 1055 FPS and 23.17 mAP accuracy. The COCO dataset includes 80 unique classes of objects for general usage scenarios, including both indoor and outdoor scenes.

License Plate Recognition (LPR)

Automatic license plate recognition application based on a complex pipeline utilizing model scheduling

The License Plate Recognition (LPR) pipeline, also referred to as Automatic Number Plate Recognition (ANPR), is commonly used in the Intelligent Transportation Systems (ITS) market. This example application demonstrates automatic model switching between 3 different networks in a complex pipeline. It runs in parallel a YOLOv5m model for vehicle detection, a YOLOv4-tiny model for license plate detection, and an lprnet model for text extraction.

Read more in our License Plate Recognition Blog Post

Multi-stream Object Detection

Detection apps with several available neural networks, delivering unique functionalities and supporting multiple streams

Multi-stream object detection is used in diverse applications across different industries, including complex ones like Smart City traffic management and Intelligent Transportation Systems (ITS). You can either use your own object detection network or rely on pre-built models like YOLOv5m, which are all trained on the COCO dataset. Notably, these models offer unique capabilities such as Tiling, which uses Hailo’s high throughput to handle high-resolution images (FHD, 4K) by dividing them into smaller tiles. Processing high-resolution images proves particularly useful in crowded locations and public safety applications where small objects abound, for instance, in crowd analytics for Retail and Smart Cities, among other use cases.

Multi-Camera Multi-Person Tracking (RE-ID)

Tracking specific objects or people across multiple cameras utilizing model scheduling

Multi-person re-identification across different streams is essential for security and retail applications. This includes the identification of a specific person multiple times, either in a specific location over time, or along a trail between multiple locations. This example application demonstrates NN model switching, of both the YOLOv5s and repvgg_a0_person_reid deep learning models, trained on Hailo’s dataset, in a complex pipeline with inference-based decision-making. This is achieved using the model scheduler, an automatic tool for model switching, which enables processing multiple models simultaneously at runtime.

Semantic Segmentation

Application used for partitioning an image into multiple image segments

Semantic segmentation aims to assign a specific class to each pixel within the input image and to recognize collections of pixels that form distinct categories. This technique is commonly used in ADAS applications to enable the vehicle to determine where the road, sidewalk, other vehicles, and pedestrians are. It also improves defect detection in quality control through optical inspection in industrial automation, and improves the precision of detail detection in medical imaging cameras, retail cameras, and more. In this specific setup, the pipeline relies on the Cityscapes dataset, which contains images captured from the perspective of a vehicle’s front-facing camera, encompassing 19 distinct classes. The pre-configured TAPPAS semantic segmentation pipeline showcases the robust computational capacity necessary for handling an FHD input video stream (1080p) while employing the FCN8-ResNet-v1-18 network.

Depth Estimation

Estimates the depth or distance information from a given 2D image or video, providing a perception of the three-dimensional structure

Depth estimation takes a single 2D image, estimates the depth or distance at each point, and turns the result into a 3D mapping. It enables automotive cameras to better understand the distance to objects, helps industrial inspection cameras in tasks like defect detection and quality control, and can improve the accuracy of person detection for security cameras by providing more detailed spatial information.

In this example, we use the fast_depth deep learning model, trained on the NYUv2 dataset, which predicts a distance matrix (a different depth for each pixel) with the same shape as the input frame.

Instance Segmentation

Application identifies, outlines and colors different objects and persons for precise object localization and separation

Instance segmentation merges the capabilities of object detection (which identifies and categorizes objects) and semantic segmentation (which assigns a class to each pixel) to produce a distinct mask for each object within a given scene. This task becomes especially crucial when bounding boxes lack precision for localization, and when the application requires pixel-level differentiation between objects. This application uses either the yolov5seg or the YOLACT architecture, with the models trained on the COCO dataset.

Pose Estimation

Understanding and analyzing human activities or detecting and tracking suspicious or abnormal human poses or movements

Pose estimation is a computer vision technology that detects and tracks human body poses in images or videos. Its uses range from recognizing emergency situations at home or on the factory floor to analyzing customer behavior for better business outcomes. It involves localizing the different parts of the human body, such as the head, shoulders, arms, legs, and torso, and estimating their positions and orientations in 3D space. This pipeline includes a combination of centerpose models pre-trained on the COCO dataset.

Facial Detection and Recognition

Application used for surveillance and security, authentication and access control, and human-computer interaction

Face detection is a common application of an object detection network to a single object class: faces. The face detection network was trained on the WIDER dataset, and its output is the predicted bounding boxes of all the faces in the frame. This application demonstrates how to crop the Region of Interest (ROI) produced by the detector and feed it to a second network that predicts facial landmarks for each detected face. Facial landmarks are important features for analyzing face orientation, structure, and so on.
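The two-stage flow described here, cropping each detected ROI and passing it to a second network, can be sketched as follows. The frame is modeled as a nested list of pixel rows, and `detect` and `predict_landmarks` are hypothetical stand-ins for the two networks; none of these names come from TAPPAS.

```python
def crop_roi(frame, box):
    """Crop a (x0, y0, x1, y1) box from a frame stored as a list of pixel rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def run_cascade(frame, detect, predict_landmarks):
    """Run the detector, then run the second network on each cropped ROI."""
    return [(box, predict_landmarks(crop_roi(frame, box))) for box in detect(frame)]


# Stand-in "networks" for illustration: a fixed detection and a size reporter.
frame = [[(x, y) for x in range(8)] for y in range(6)]
fake_detect = lambda f: [(2, 1, 5, 4)]
fake_landmarks = lambda roi: {"height": len(roi), "width": len(roi[0])}

results = run_cascade(frame, fake_detect, fake_landmarks)
```

Cropping from the original frame, rather than from the resized detector input, keeps as much detail as possible for the second network.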

Tiling

Allows examining specific sections of an image in greater detail without compromising its resolution

To apply the processing power of Hailo devices to large input resolutions, we can divide an input frame into multiple tiles and run an object detector on each tile individually. For instance, consider an object that occupies 10×10 pixels in a 4K input frame. Once the frame is resized for a 640×640 detector, such as YOLOv5m, this object retains only about 1 pixel of information, making it nearly impossible to detect. To address this challenge, we use tiles to divide the input frame into smaller patches and detect the object in each tile without sacrificing information due to resizing. The tiles are identified by blue rectangles, and we use an SSD-MobileNet-v1 model pre-trained on the VisDrone dataset.
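The arithmetic behind this example can be worked through in a short sketch. It assumes a 3840×2160 frame, an aspect-preserving resize to the detector input, and non-overlapping 640×640 tiles; the TAPPAS tiling pipeline may use different tile sizes and overlaps, and both helper functions are hypothetical.

```python
import math


def scaled_object_size(obj_px, frame_w, frame_h, net_size):
    """Side length of an object after the whole frame is resized to fit net_size."""
    return obj_px * net_size / max(frame_w, frame_h)


def tile_grid(frame_w, frame_h, tile, overlap=0.0):
    """Return (x, y, w, h) tiles that cover the frame; edge tiles shift inward."""
    stride = max(1, int(tile * (1.0 - overlap)))

    def starts(length):
        if length <= tile:
            return [0]
        count = math.ceil((length - tile) / stride) + 1
        return [min(i * stride, length - tile) for i in range(count)]

    return [
        (x, y, min(tile, frame_w), min(tile, frame_h))
        for y in starts(frame_h)
        for x in starts(frame_w)
    ]


# A 10-pixel object in a 4K frame shrinks to under 2 pixels per side.
side_after_resize = scaled_object_size(10, 3840, 2160, 640)

# Tiling keeps the object at its native 10x10 size inside one of the tiles.
tiles = tile_grid(3840, 2160, 640)
```

Under these assumptions the frame is covered by 24 tiles (6 columns by 4 rows), so the detector runs 24 times per frame; this is where high inference throughput becomes the enabling factor.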

Breathe life into your edge applications with the Hailo AI processors