Project description

Overview

During the last decade, artificial intelligence (AI) has received particular attention from the academic and industrial communities, making it possible to develop complex systems such as those required for Autonomous Driving (AD) and Advanced Driver-Assistance Systems (ADAS). These systems process large amounts of data, to which a series of AI inferences is applied. Typical research in the AI community focuses on improving recognition rates, without considering the execution constraints of AI inference. In this project, we want to improve both the predictability and the efficiency of running AI deep learning (DL) inferences, by exploiting their characteristics to enable timing-aware implementations with novel scheduling policies on complex high-performance heterogeneous embedded computing platforms. The HeRITAGES project will achieve these goals by (i) evaluating the timing performance of DL implementations on different compute elements, to characterize and extract their execution features; (ii) modeling complex DL inference workloads, composed of dependent DL inferences, with the HPC-DAG task model; (iii) studying the impact of the extracted features, such as task memory footprint and inter-task interference, on resource co-scheduling; and (iv) designing novel scheduling frameworks to support the approaches proposed in the HeRITAGES project.

HeRITAGES approach

The HeRITAGES project takes an end-to-end approach: it aims to design a complete prototype framework that takes as input the architecture of the DL inferences, their dependencies, and a hardware platform model, and that, after multiple graph transformations, generates code for a timing-valid real-time implementation of the input system. The HeRITAGES workflow is organized in four stages, as shown in Figure ??.

  1. S1: The first stage, denoted as S1 in the figure, performs the first transformation of the DL inference design into a DAG. This transformation is based only on DL inference characteristics, e.g., the dependencies between the neurons of a given layer and those of the next layers.
  2. S2: The second stage, denoted as S2, will transform the resulting DAGs of S1 into a set of task specifications.
  3. S3: This stage performs the schedulability tests and the sub-task-to-core allocation. Its goal is to select a single concrete task for every task specification.
  4. S4: In this stage, the concrete DAGs undergo the final transformation: C code is generated for the CPU, the GPU, and the other accelerators. This transformation is described in WP4.
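The HPC-DAG task model itself is specified in the project's work packages; purely as an illustration of stages S1 and S2, a layered inference can be flattened into a DAG of sub-tasks whose dependencies follow the layer structure. The sketch below is hypothetical (names, WCET values, and the fully connected layer-to-layer dependency pattern are assumptions, not the project's actual transformation):

```python
from dataclasses import dataclass, field

@dataclass
class SubTask:
    name: str                   # e.g. "conv1/0"
    wcet: float                 # worst-case execution time estimate (arbitrary units)
    succs: list = field(default_factory=list)

def inference_to_dag(layers):
    """S1 sketch: turn a list of (layer_name, width, wcet) into a DAG where
    every sub-task of layer i precedes every sub-task of layer i+1."""
    dag, prev = [], []
    for name, width, wcet in layers:
        cur = [SubTask(f"{name}/{k}", wcet) for k in range(width)]
        for p in prev:
            p.succs.extend(cur)   # assumed fully connected layer-to-layer dependency
        dag.extend(cur)
        prev = cur
    return dag

def critical_path(dag):
    """Longest WCET chain through the DAG: a lower bound on any schedule length."""
    memo = {}
    def longest(t):
        if t.name not in memo:
            memo[t.name] = t.wcet + max((longest(s) for s in t.succs), default=0.0)
        return memo[t.name]
    return max(longest(t) for t in dag)

dag = inference_to_dag([("conv1", 2, 3.0), ("pool1", 1, 1.0), ("fc", 1, 2.0)])
# 4 sub-tasks; critical path 3.0 + 1.0 + 2.0 = 6.0
```

Stage S2 would then attach real-time attributes (period, deadline, alternative implementations per compute element) to such sub-tasks to form task specifications.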

Start date: 01/10/2023, End Date: 30/09/2027

Jobs

Research Engineer (Open position since 29/08/2023)

In its starting phase, the project will require an engineering effort to prepare several implementations of AI inferences used in advanced imaging and computer vision. These implementations will be diversified to broaden the exploration of GPU resources from a predictability and performance perspective. The engineer will support the PhD student for 6 months, adapting and extending the benchmark according to the project's requirements and evolutions. As part of the HeRITAGES project, we are seeking to hire a development engineer to undertake the following tasks:

  • Creating a benchmark comprising multiple implementations of various neural network inferences and network types on different hardware targets: CPU, GPU, and DLA, as well as hybrid implementations.
  • Developing a CPU-GPU execution model within Phylog.
  • Quantifying the interference caused by executing different inferences concurrently on a heterogeneous execution platform.
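On the target platform, interference quantification would involve GPU and DLA engines and vendor profiling tools; as a minimal CPU-only sketch of the measurement methodology (the workload and all names are hypothetical stand-ins), interference can be quantified by timing a workload solo versus co-running with competitors:

```python
import threading
import time

def workload(n=200_000):
    # Stand-in compute kernel; on the real platform this would be a DL inference.
    s = 0
    for i in range(n):
        s += i * i
    return s

def measure(f, *co_runners):
    """Time f while co_runners execute concurrently (interference scenario)."""
    threads = [threading.Thread(target=g) for g in co_runners]
    for t in threads:
        t.start()
    t0 = time.perf_counter()
    f()
    dt = time.perf_counter() - t0
    for t in threads:
        t.join()
    return dt

solo = measure(workload)                           # baseline, no contention
contended = measure(workload, workload, workload)  # two co-running instances
print(f"solo: {solo:.4f}s  contended: {contended:.4f}s")
```

The ratio contended/solo gives a first, coarse interference indicator; a platform-level study would repeat the measurement many times and report worst-case rather than single-shot figures.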

The contract duration is 1 year.

Desired profile: 1 to 5 years of experience.

Start date: As soon as possible.

Ph.D position (Hiring process closed)

Recent commercial hardware platforms for embedded real-time systems feature heterogeneous processing units and computing accelerators on the same System-on-Chip. When designing deep learning inferences with timing constraints for such architectures, the designer faces a number of non-trivial choices, such as how to implement an abstract DL inference design and how to execute it on heterogeneous hardware featuring CPUs, GPUs, and DLAs. The PhD thesis will consist in (i) building a hardware model based on the documentation of the Pascal, Volta, and Ampere GPU architectures and on benchmarking results; (ii) studying algorithms for the automatic conversion of DL inference designs into HPC-DAG tasks, taking into account the heterogeneous nature of the target architecture; and (iii) proposing novel schedulability techniques (allocation and schedulability analysis) according to the specification of the neural networks.
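The allocation problem of step (iii) can be hinted at with a toy greedy list scheduler: each DAG sub-task has a different WCET per core type, and the scheduler picks, in topological order, the core giving the earliest finish time. This is only an illustrative sketch (task names, WCET values, and the greedy heuristic are assumptions); the project's actual analysis must additionally account for migration costs, memory interference, and DLA constraints:

```python
def list_schedule(tasks, deps, wcet, cores):
    """tasks: names in topological order; deps: task -> set of predecessors;
    wcet: (task, core_type) -> cost; cores: one core type per physical core.
    Returns the makespan; the DAG is deemed schedulable if it fits the deadline."""
    finish = {}                     # task -> finish time
    core_free = [0.0] * len(cores)  # per-core earliest availability
    for t in tasks:
        ready = max((finish[p] for p in deps.get(t, ())), default=0.0)
        # Greedy choice: the core giving the earliest finish time for this sub-task.
        best = min(range(len(cores)),
                   key=lambda c: max(core_free[c], ready) + wcet[t, cores[c]])
        finish[t] = max(core_free[best], ready) + wcet[t, cores[best]]
        core_free[best] = finish[t]
    return max(finish.values())

# Hypothetical 3-task DAG (a and b precede c) on one CPU core and one GPU core.
wcet = {("a", "cpu"): 4, ("a", "gpu"): 1,
        ("b", "cpu"): 2, ("b", "gpu"): 5,
        ("c", "cpu"): 3, ("c", "gpu"): 1}
makespan = list_schedule(["a", "b", "c"], {"c": {"a", "b"}}, wcet, ["cpu", "gpu"])
# a runs on the GPU (finish 1), b on the CPU (finish 2), c on the GPU (finish 3)
```

Note how heterogeneity drives the allocation: task a is far cheaper on the GPU while b is cheaper on the CPU, so the two run in parallel on different core types.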

The contract duration is 3 years.

Start date: 01/11/2023.

Deliverables

WP 0
  • Data Management Plan (DMP), due 03/2024
  • Mid-term report, due 10/2025
  • Final report [M48], due 10/2027
WP 1
  • Benchmark of neural network implementations: performance and predictability perspective, due 08/2024
  • Timing performance model of DL inference implementations: key performance indicators, due 01/2025
WP 2
  • Techniques for the automatic conversion of abstract neural network inferences to DAGs, due 06/2025
  • Expressing alternative design patterns of abstract neural networks using HPC-DAGs, due 12/2025
WP 3
  • Schedulability analysis for a set of CNN-inference DAGs on homogeneous platforms, due 09/2025
  • Task mapping for a set of HPC-DAG DL inferences on heterogeneous platforms, due 09/2025
  • Global scheduling approach for HPC-DAG DL inferences onto heterogeneous platforms, due 07/2027
WP 4
  • Extension of the PRUDA scheduler to DL inference implementations, due 06/2027
  • Software scheduler for the DLA, due 09/2027
  • The HPC-DAG compiler, due 09/2027

Project-related scientific production

TBC