Efficient TNN Inference on PULP Systems
Ternary neural networks (TNNs) are highly quantized machine learning models in which all parameters and activations take one of three values: -1, 0, or 1. At IIS, we have been working with TNNs for some time and have a well-established flow for training them. While we have already developed an accelerator (CUTIE, see references), the execution of TNNs on microcontrollers, such as the PULP family of ultra-low-power RISC-V microcontrollers developed at IIS, still offers substantial potential for optimization.
The goal of this project is to enable efficient execution of TNNs on PULP-based systems. This will be achieved through a two-track process, which is why this project is proposed as a joint semester thesis for two Master’s students:
- In the Hardware Track, one student will add TNN-specific ISA extensions to a baseline PULP core (RI5CY), which already supports sub-byte arithmetic operations (e.g., 16x2b MAC), and build a system around the improved core, which will be taped out. You get to build your own microcontroller!
- In the Software Track, the other student will build on existing software to implement an efficient framework for executing TNNs. They will start with software support for the baseline architecture and its pre-existing ISA extensions. In a second step, the new TNN-specific ISA extensions developed in the Hardware Track will be used to make TNN inference more efficient, and the results of the two steps will be compared.
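To make the "16x2b MAC" mentioned above more concrete, here is a minimal software sketch of what such an operation computes: sixteen 2-bit operands packed into one 32-bit word, multiplied pairwise and accumulated. This is only an illustration, not the RI5CY instruction or the project's actual kernel. The 2-bit encoding (00 = 0, 01 = +1, 11 = -1) and the function names `pack16`, `sext2`, `mac16x2b` and `ternary_dot` are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding (not taken from the project description): each ternary
 * value is a 2-bit two's-complement field: 00 = 0, 01 = +1, 11 = -1. */

/* Pack 16 ternary values (each -1, 0 or +1) into one 32-bit word. */
static uint32_t pack16(const int8_t *v)
{
    uint32_t word = 0;
    for (int i = 0; i < 16; i++) {
        word |= ((uint32_t)(v[i] & 0x3)) << (2 * i);
    }
    return word;
}

/* Sign-extend a 2-bit field (0..3) to a signed integer: 0->0, 1->+1, 3->-1. */
static int32_t sext2(uint32_t field)
{
    return (int32_t)(field & 1u) - 2 * (int32_t)((field >> 1) & 1u);
}

/* Software model of a 16x2b multiply-accumulate: 16 pairwise products of
 * 2-bit operands are added to the accumulator. */
static int32_t mac16x2b(uint32_t a, uint32_t w, int32_t acc)
{
    for (int i = 0; i < 16; i++) {
        acc += sext2((a >> (2 * i)) & 3u) * sext2((w >> (2 * i)) & 3u);
    }
    return acc;
}

/* Dot product of two packed ternary vectors, n_words * 16 elements long. */
int32_t ternary_dot(const uint32_t *a, const uint32_t *w, size_t n_words)
{
    int32_t acc = 0;
    for (size_t k = 0; k < n_words; k++) {
        acc = mac16x2b(a[k], w[k], acc);
    }
    return acc;
}
```

On a core with a hardware sub-byte MAC, the inner loop of `mac16x2b` would presumably collapse into a single instruction, so the cost of a ternary layer is dominated by how activations and weights are packed and loaded. Note that the assumed 2-bit encoding leaves one of the four code points unused.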
For more details, please have a look at the detailed project description linked below and don't hesitate to contact us!
Looking for two students for a semester project, or potentially a single student for a Master's thesis.
- Machine Learning
- SystemVerilog knowledge
- VLSI I
- 20% Theory
- 80% Implementation
- M. Scherer et al., CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration with Better-than-Binary Energy Efficiency