Training and Deploying Next-Generation Quantized Neural Networks on Microcontrollers

Description

The design and deployment of highly efficient neural networks (NNs) on microcontroller-class systems (MCUs) has recently received intense attention from the research community, with MCUNet [1] representing the current state of the art. Due to the architectural limitations of commodity MCUs, exploiting sub-byte formats for a better tradeoff between model size and accuracy has seen only limited attention: while many approaches to network design and quantization have been proposed, only a few publications present a flow to map these networks to real systems.

The PULP family of MCUs developed at IIS has hardware support for ultra-low-precision (down to 2 bits) SIMD arithmetic, which was introduced with the express goal of supporting such networks. Furthermore, we have been developing QuantLab, a framework for training quantized NNs, and have recently created a prototype flow for automatically integerizing arbitrary-precision networks.
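To illustrate why sub-byte formats shrink a model, the following minimal sketch packs signed 2-bit weights four to a byte and unpacks them again. The function names `pack_2bit` and `unpack_2bit` and the lane ordering (first value in the lowest bits) are assumptions made for this example; they do not describe the actual register layout of the PULP SIMD instructions or any QuantLab API.

```python
def pack_2bit(values):
    """Pack signed 2-bit integers (range -2..1) into bytes, four per byte.

    Hypothetical layout: value 0 of each group occupies the two lowest bits.
    """
    if any(v < -2 or v > 1 for v in values):
        raise ValueError("values must lie in [-2, 1]")
    # Pad with zeros so that the length is a multiple of four.
    padded = list(values) + [0] * (-len(values) % 4)
    packed = bytearray()
    for i in range(0, len(padded), 4):
        byte = 0
        for lane, v in enumerate(padded[i:i + 4]):
            # Masking with 0b11 yields the 2-bit two's complement encoding.
            byte |= (v & 0b11) << (2 * lane)
        packed.append(byte)
    return bytes(packed)


def unpack_2bit(packed, count):
    """Recover the first `count` signed 2-bit integers from packed bytes."""
    values = []
    for byte in packed:
        for lane in range(4):
            field = (byte >> (2 * lane)) & 0b11
            # Fields 2 and 3 encode the negative values -2 and -1.
            values.append(field - 4 if field >= 2 else field)
    return values[:count]


weights = [1, -2, 0, -1, 1]
packed = pack_2bit(weights)
restored = unpack_2bit(packed, len(weights))
```

Here five weights occupy two bytes instead of the five bytes an 8-bit representation would need.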

The goal of this project is to leverage these existing tools to train and deploy mixed-precision networks on PULP-based systems. More precisely, in this project, you will:

Select one or more suitable state-of-the-art networks (e.g. MCUNet) to target a given PULP platform specification
(Re)train this network with different per-layer precisions, using approaches from the literature or of your own design to determine the precision for each layer
Integerize the mixed-precision network for execution on PULP using our newly developed pipeline
Compare the accuracy-latency-model size tradeoff to the baseline 8-bit model
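The model-size side of this comparison can be sketched as follows. This is a minimal illustration, assuming plain symmetric uniform quantization and a made-up three-layer network; the layer names, parameter counts, bit-width assignment, and the helpers `quantize_symmetric` and `model_size_bytes` are all hypothetical and are not part of QuantLab or the integerization pipeline.

```python
def quantize_symmetric(weights, bits):
    """Map float weights to signed integers of the given bit-width.

    Returns the integer values and the scale such that w ~= q * scale.
    """
    if bits < 2:
        raise ValueError("this sketch supports 2 or more bits")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest level and clamp to the representable range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def model_size_bytes(layer_params, layer_bits):
    """Weight storage in bytes for a per-layer bit-width assignment."""
    total_bits = sum(layer_params[name] * layer_bits[name] for name in layer_params)
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical network: parameter count per layer.
layer_params = {"conv1": 864, "conv2": 18432, "fc": 10240}

baseline_bits = {name: 8 for name in layer_params}
mixed_bits = {"conv1": 8, "conv2": 4, "fc": 2}  # assumed assignment

baseline_size = model_size_bytes(layer_params, baseline_bits)  # 29536 bytes
mixed_size = model_size_bytes(layer_params, mixed_bits)        # 12640 bytes
```

In the actual project, the per-layer bit-widths would come from the chosen precision-selection method, and accuracy and latency would have to be measured on the trained network and the target platform rather than estimated.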

A detailed task description and project plan will be uploaded soon. If you are interested in this project or have any questions, please do not hesitate to contact us!

Status: Available

Looking for 1-2 students for a Semester project, or potentially a single student for a Master's thesis.

Supervision: Georg Rutishauser, Moritz Scherer

Prerequisites

Machine Learning
Python
C

Character

25% Theory

75% Implementation

Literature

[1] J. Lin et al., MCUNet: Tiny Deep Learning on IoT Devices
[2] M. Rusci et al., Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers

Professor

Luca Benini


Practical Details
