Knowledge Distillation for Embedded Machine Learning

Description

Most high-performance neural networks used on datasets like ImageNet have millions or even billions of parameters and are trained and executed on several GPUs at once. Such networks far exceed the memory and compute budgets of devices like microcontrollers and cannot be deployed to them directly. However, using novel training techniques, we can transfer the knowledge of these well-trained networks to smaller networks that can be deployed to embedded devices.

Knowledge distillation is one such training approach for deep neural networks: it uses well-trained large networks, or ensembles of specialized models, to train smaller, more efficient networks. The technique shows considerable potential for deploying models to embedded devices when combined with well-established quantization techniques. The goal of this project is to develop a knowledge distillation algorithm and evaluate it for training networks for embedded devices, comparing it to traditional training methods.
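To make the idea concrete, the following is a minimal sketch of a distillation loss in the style of [1], written in plain Python so that the arithmetic is visible. The function names, the weighting factor `alpha`, and the default temperature are illustrative assumptions, not part of the project specification; a PyTorch implementation would compute the same quantities on tensors.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Log-probabilities of a temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum for s in scaled]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    alpha and temperature are hypothetical hyperparameters chosen for
    illustration; they would have to be tuned per task.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))
    # Ordinary cross-entropy of the student against the true label (T = 1).
    hard = -log_softmax(student_logits)[label]
    # The T^2 factor keeps the soft-target gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard

if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 0.8, 0.3]
    print(distillation_loss(student, teacher, label=0))
```

The soft-target term vanishes when the student reproduces the teacher's logits exactly, so with `alpha=1.0` the loss measures only how far the student is from the teacher.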

The main goals of the project are listed below; not all of them have to be met in a single semester project:

Develop a framework for distillation-based training in PyTorch

Combine knowledge distillation with quantization to optimize model size

Evaluate knowledge distillation as a method for deployment of networks to embedded devices
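As an illustration of the quantization goal, the sketch below applies symmetric per-tensor 8-bit quantization to a list of weights. The project description does not prescribe a quantization scheme, so this particular scheme and the helper names `quantize_int8` and `dequantize` are assumptions made for the example.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer representation."""
    return [v * scale for v in q]

if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    # Storing 8-bit integers instead of 32-bit floats cuts weight memory by 4x,
    # at the cost of a rounding error of at most half a quantization step.
    print(q, scale, restored)
```

A student network trained by distillation could be quantized this way after training, or the rounding could be simulated during training so that the student learns to tolerate it; which variant works better for a given device is one of the questions the evaluation would have to answer.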

Status: Available

Looking for a student for a semester project.

Supervision: Xiaying Wang, Moritz Scherer

Prerequisites

Machine Learning
Python

Character

20% Theory

80% Implementation

Literature

[1] G. Hinton et al., Distilling the Knowledge in a Neural Network
[2] T. Furlanello et al., Born Again Neural Networks

Professor

Luca Benini

