
Efficient TNN compression

From iis-projects

Revision as of 10:29, 5 November 2020 by Scheremo (talk | contribs)

Introduction

While most recent research on ultra-low-precision neural networks has focused on Binary Neural Networks (BNNs), Ternary Neural Networks (TNNs) have very recently been shown to outperform BNNs both in energy efficiency and in statistical accuracy on classification tasks. Furthermore, ternary feature maps have been shown to be spatially smooth: moving from one pixel to a neighbouring pixel typically flips only very few bits. Exploiting this smoothness with an efficient compression algorithm directly saves memory bandwidth and, consequently, energy. A promising approach developed at IIS is extended bit-plane compression (EBPC), which has been shown to outperform similar schemes.
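The smoothness property can be made concrete with a small experiment. The sketch below (illustrative only; the function names are not part of EBPC) counts symbol transitions between neighbouring pixels in a ternary row: a smooth row with long runs of identical values has far fewer transitions, and therefore far less information to encode, than an unstructured row over the same alphabet.

```python
import random

def ternary_transitions(row):
    """Count symbol changes between neighbouring pixels of a ternary row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

# A smooth feature-map row: long runs of identical ternary values.
smooth = [0] * 6 + [1] * 5 + [-1] * 5
print(ternary_transitions(smooth))  # -> 2, i.e. highly compressible

# A noisy row over the same alphabet, with no spatial structure.
random.seed(0)
noisy = random.choices([-1, 0, 1], k=16)
print(ternary_transitions(noisy))   # many more transitions on average
```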

Project description

In this project, the feature map smoothness property in TNNs will be studied and exploited to design an efficient compression algorithm that can be implemented in hardware. The student is required to:

  1. Study the state-of-the-art in feature map compression algorithms, especially EBPC [1]
  2. Design and verify a ternary compression algorithm in SystemVerilog or VHDL
  3. Benchmark the implementation against the state-of-the-art

Depending on the progress and interests of the student, the following points can be investigated:

  • Design optimization in terms of area, energy and parallelizability
  • Integration and evaluation of the compressor and decompressor with an ultra-low power TNN accelerator (CUTIE [2])
  • Training of TNNs to further benchmark the compression scheme
  • Expansion of the design to allow for higher and mixed-precision compression

Literature

[1]: EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators

[2]: CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration with Better-than-Binary Energy Efficiency

Required Skills

  • Basic knowledge of SystemVerilog or VHDL and digital circuit design (VLSI 1)
  • Basic knowledge or interest in Information Theory / Compression Algorithms
  • Interest in algorithms

Skills you might find useful

  • Basic knowledge of or experience with physical layout tools (VLSI 2)
  • Knowledge of machine learning tools and best practices

Meetings & Presentations

The student and advisor(s) agree on weekly meetings to discuss all relevant decisions and decide on how to proceed. Of course, additional meetings can be organized to address urgent issues. At the end of the project, you have to present and defend your work in a 15-minute presentation followed by a 5-minute discussion as part of the IIS colloquium.