Probing the limits of fake-quantised neural networks


Deep neural networks (DNNs) usually require substantial computational resources to deliver their statistical power. However, in many applications where latency and data privacy are essential constraints, it might be necessary to execute these models on resource-constrained computing systems such as embedded, mobile, or edge devices. The research field of TinyML has approached the problem of making DNNs more efficient from several angles: optimising network topologies in terms of accuracy-per-parameter or accuracy-per-operation; reducing model size via techniques such as parameter pruning; and replacing high-bit-width floating-point operands with low-bit-width integer operands, which yields so-called quantised neural networks (QNNs). QNNs have the double benefit of reducing model size and replacing energy-costly floating-point arithmetic with energy-efficient integer arithmetic. These properties make QNNs an ideal fit for resource-constrained hardware.

At training time, QNNs use floating-point operands to leverage the optimised floating-point software kernels provided by the chosen deep learning framework. However, these floating-point parameters are constrained to a discrete set of values, so that applying elementary arithmetic properties (e.g., associativity, distributivity) allows the trained networks to be transformed into fully integerised programs that can be deployed to the target hardware. Unfortunately, this conversion process is lossy, and techniques that reduce the errors it introduces are crucial to making QNNs useful in practice.
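The idea can be illustrated with a minimal sketch in plain Python. It is not taken from the project material: the symmetric uniform quantiser, the scales `s_w` and `s_x`, and the helper names `fake_quantise` and `to_integer` are all assumptions chosen for illustration. Each fake-quantised operand is a float of the form scale × integer, so the distributive property lets the product of the two scales be factored out of the dot product, leaving a purely integer accumulation.

```python
def fake_quantise(x, scale, n_bits=8):
    """Snap x to the nearest multiple of `scale` within the signed n_bits range.

    The result is still a float ("fake" quantisation), as used at training time.
    """
    q_min, q_max = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    q = max(q_min, min(q_max, round(x / scale)))
    return q * scale


def to_integer(x_fq, scale):
    """Recover the integer code behind a fake-quantised float."""
    return round(x_fq / scale)


# Hypothetical scales and values, chosen only for illustration.
s_w, s_x = 0.05, 0.1
weights = [0.31, -0.12, 0.77]
inputs = [1.23, -0.48, 2.5]

# Fake-quantised path: every operand is a float lying on a quantisation grid.
w_fq = [fake_quantise(w, s_w) for w in weights]
x_fq = [fake_quantise(x, s_x) for x in inputs]
fake_result = sum(w * x for w, x in zip(w_fq, x_fq))

# True-quantised path: integer operands, integer accumulation, one final rescale.
w_int = [to_integer(w, s_w) for w in w_fq]
x_int = [to_integer(x, s_x) for x in x_fq]
acc_int = sum(w * x for w, x in zip(w_int, x_int))
true_result = acc_int * (s_w * s_x)

print(w_int, x_int, acc_int)
print(fake_result, true_result)
```

The two results agree only up to floating-point rounding, because the fake-quantised path rounds after every multiplication and addition while the integer path rounds only in the final rescale. Small discrepancies of this kind are one possible source of the conversion error mentioned above.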

In this project, you will explore the impact of different floating-point formats on what we call the fake-to-true conversion process.
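As a rough illustration of why the floating-point format might matter, the sketch below stores one fake-quantised value in three IEEE 754 formats and measures how far the stored value drifts from the ideal grid point. It uses only Python's standard `struct` module; the scale, the integer code, and the helper name `round_to_format` are hypothetical, and the experiment is an assumption about what such a study could look like rather than the project's actual methodology.

```python
import struct


def round_to_format(x, fmt):
    """Round a Python float (binary64) to the nearest value representable in `fmt`."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


# Hypothetical quantisation scale and integer code.
scale = 0.1
q = 37
ideal = q * scale  # the fake-quantised value as computed in binary64

# struct format codes: "e" = binary16, "f" = binary32, "d" = binary64.
formats = {"binary16": "<e", "binary32": "<f", "binary64": "<d"}

errors = {}
recovered = {}
for name, fmt in formats.items():
    stored = round_to_format(ideal, fmt)
    errors[name] = abs(stored - ideal)           # drift off the quantisation grid
    recovered[name] = round(stored / scale)      # integer code read back
    print(f"{name}: stored={stored!r} error={errors[name]:.3e} code={recovered[name]}")
```

In this particular case the integer code is still recovered in every format, but the representation error grows as the format gets narrower; with finer scales or larger codes the drift could become large enough to change the rounded integer.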

Project description

Skills and project character



  • Fundamental concepts of deep learning (convolutional neural networks, backpropagation, computational graphs)
  • Numerical representation formats (integer, floating-point)
  • Numerical analysis
  • Python programming
  • C/C++ programming


  • Knowledge of the PyTorch deep learning framework
  • Knowledge of digital arithmetic (e.g., two's complement, overflow, wraparound)

Project character

  • 20% Theory
  • 40% C/C++ and Python coding
  • 40% Deep learning


The student and the advisors will meet on a weekly basis to check the progress of the project, clarify doubts, and decide the next steps. The schedule of this weekly update meeting will be agreed at the beginning of the project by both parties. Of course, additional meetings can be organised to address urgent issues.

At the end of the project, you will present your work in a 15-minute talk (20 minutes if the project is carried out as a Master Thesis) in front of the IIS team and defend it during the 5-minute discussion that follows.


Luca Benini

Status: Available

We are looking for 1 Master student. It is possible to complete the project either as a Semester Project or a Master Thesis.

Supervisors: Matteo Spallanzani, Renzo Andri (Huawei RC Zurich)