== Introduction ==
Deep neural networks usually require huge computational resources to deliver their statistical power.
However, in many applications where latency and data privacy are important constraints, it might be necessary to execute these models on resource-constrained computing systems such as embedded, mobile, or edge devices.
The research field of TinyML has approached the problem of making DNNs more efficient from several angles: optimising network topologies in terms of accuracy-per-parameter or accuracy-per-operation; reducing model size via techniques such as parameter pruning; replacing high-bit-width floating-point operands with low-bit-width integer operands, training so-called quantised neural networks (QNNs).
QNNs have the double benefit of reducing model size and replacing energy-costly floating-point arithmetic with energy-efficient integer arithmetic, properties that make them an ideal fit for resource-constrained hardware.

At training time, QNNs use floating-point operands to leverage the optimised floating-point software kernels provided by the chosen deep learning framework.
However, these floating-point parameters are constrained in such a way that applying elementary arithmetic properties (e.g., associativity, distributivity) yields fully integerised programs that can be deployed to the target hardware.
Unfortunately, this conversion process is lossy, and techniques that reduce the errors it introduces are crucial to making QNNs useful in practice.
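To make this concrete, here is a minimal sketch (the scales, weights, and 16-bit fixed-point multiplier below are made-up assumptions for illustration, not the project's actual toolchain) of how the floating-point scales factor out of a fake-quantised dot product, and of where the conversion loss enters:

<syntaxhighlight lang="python">
import numpy as np

# Fake-quantised view (training time): operands are floats constrained
# to an integer grid scaled by a floating-point step size.
w_scale, x_scale = 0.0437, 0.0210              # hypothetical quantisation steps
w_int = np.array([-3, 1, 4], dtype=np.int32)   # underlying integer weights
x_int = np.array([7, -2, 5], dtype=np.int32)   # underlying integer inputs
w_fake = (w_scale * w_int).astype(np.float32)
x_fake = (x_scale * x_int).astype(np.float32)

# Training-time output: plain floating-point arithmetic.
y_fake = np.dot(w_fake, x_fake)

# By distributivity, the scales factor out, leaving a pure integer dot product.
acc = np.dot(w_int, x_int)              # integer arithmetic only
y_true = (w_scale * x_scale) * acc      # analytically equal to y_fake

# On integer-only hardware, the combined scale must itself be rounded,
# e.g. to a hypothetical 16-bit fixed-point multiplier: this step is lossy.
M = round(w_scale * x_scale * 2**16)
y_hw = (M * acc) / 2**16

print(y_fake, y_true, y_hw)   # y_hw deviates slightly from y_fake / y_true
</syntaxhighlight>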
In this project, you will explore the impact of different floating-point formats on what we call the fake-to-true conversion process.
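For intuition, the following toy snippet (the scale value is again a made-up assumption) measures the error committed merely by storing the same requantisation scale in narrower floating-point formats:

<syntaxhighlight lang="python">
import numpy as np

s = 0.0437 * 0.0210   # hypothetical combined requantisation scale

# Relative error introduced just by representing the scale in each
# floating-point format, before any further fixed-point rounding.
for fmt in (np.float64, np.float32, np.float16):
    s_fmt = float(fmt(s))
    print(f"{fmt.__name__}: {s_fmt:.10e}  rel. error = {abs(s_fmt - s) / s:.2e}")
</syntaxhighlight>

The narrower the format, the coarser the grid on which the scale lands, and the larger the deviation that propagates into every integerised output.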
== Project description ==
  
: [https://iis-projects.ee.ethz.ch/images/6/68/Probing_the_limits_of_fake-quantised_neural_networks.pdf Project Description]
== Skills and project character ==
=== Skills ===

Required:
* Fundamental concepts of deep learning (convolutional neural networks, backpropagation, computational graphs)
* Numerical representation formats (integer, floating-point)
* Numerical analysis
* Python programming
* C/C++ programming

Optional:
* Knowledge of the PyTorch deep learning framework
* Knowledge of digital arithmetic (e.g., two's complement, overflow, wraparound)

=== Project character ===

* 20% Theory
* 40% C/C++ and Python coding
* 40% Deep learning
 
== Logistics ==

The student and the advisor will meet on a weekly basis to check the progress of the project, clarify doubts, and decide the next steps.
The schedule of this weekly update meeting will be agreed by both parties at the beginning of the project.
Of course, additional meetings can be organised to address urgent issues.

At the end of the project, you will present your work in a 15-minute talk (20 minutes if the project is carried out as a Master Thesis) in front of the IIS team and defend it in the following 5-minute discussion.
== Professor ==

: [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini]

== Status: Available ==

We are looking for 1 Master student.
It is possible to complete the project either as a Semester Project or a Master Thesis.

Supervisors: [[:User:spmatteo | Matteo Spallanzani]] [mailto:spmatteo@iis.ee.ethz.ch spmatteo@iis.ee.ethz.ch], [[:User:andrire | Renzo Andri (Huawei RC Zurich)]]
  
[[#top|↑ top]]
 
 
[[Category:Digital]]
[[Category:Deep Learning Projects]]
[[Category:Deep Learning Acceleration]]
[[Category:Available]]
[[Category:Semester Thesis]]
[[Category:Master Thesis]]
[[Category:Hot]]
[[Category:spmatteo]]
[[Category:andrire]]
