An all Standard-Cell Based Energy Efficient HW Accelerator for DSP and Deep Learning Applications
In the past couple of years, the Integrated Systems Laboratory (IIS) has developed various hardware accelerators for deep learning and digital signal processing applications as part of larger systems-on-chip (SoCs) leveraging the PULP (Parallel Ultra-Low Power) architecture. Most of these accelerators boil down to hardware implementations of simple linear algebra operations such as matrix-vector multiplications or convolutions. While the existing solutions achieve high throughput and are at least one order of magnitude more energy efficient than software implementations of these algorithms, they all use static random access memories (SRAMs) to store the large amounts of input data and intermediate results. SRAMs provide an area-efficient way to store volatile data, but their potential for aggressive voltage scaling, and thus for increasing the system's energy efficiency, is limited. The goal of this project is the development of a hardware matrix-vector multiplication accelerator that relies solely on standard cells as the state-holding elements of the design.
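As a point of reference for the operation mentioned above, a matrix-vector multiplication can be written as a sequence of multiply-accumulate (MAC) steps, one accumulator per output element. The following is only a minimal software sketch, for example as a golden model for a testbench. The function name `matvec_reference` and the use of unbounded Python integers are assumptions made for illustration and say nothing about the accelerator's actual architecture or number format.

```python
def matvec_reference(matrix, vector):
    """Compute y = M * x with explicit multiply-accumulate steps.

    Hypothetical reference model: `matrix` is a list of rows,
    `vector` is a list with one entry per matrix column.
    """
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("every matrix row must match the vector length")

    result = []
    for row in matrix:
        acc = 0  # one accumulator per output element
        for weight, x in zip(row, vector):
            acc += weight * x  # a single MAC operation
        result.append(acc)
    return result


if __name__ == "__main__":
    print(matvec_reference([[1, 2], [3, 4]], [5, 6]))
```

Each inner-loop iteration corresponds to one MAC operation, so an N x M matrix requires N * M MACs and reads every matrix entry once. This is why memory bandwidth and memory energy matter so much for this kind of accelerator.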
Standard cell memories (SCMs) have a number of advantages over traditional SRAMs. Since they are described entirely in RTL, they are completely technology independent, whereas porting an SRAM-based design to a new technology requires the availability of memory compilers that hopefully match the characteristics of the memories used in the old design. SCMs can provide arbitrary amounts of read and write bandwidth, the lack of which is a commonly observed bottleneck of hardware accelerators in general. Most important for energy efficiency, however, is that they still operate at very low supply voltages, well below nominal conditions, where traditional SRAM macros fail. Because dynamic power consumption scales quadratically with the operating voltage, even small differences in the minimum operating voltage can lead to large differences in power consumption and, to a certain degree, also in energy efficiency.
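The quadratic dependence can be illustrated with the standard first-order model for dynamic power, P = a * C * V^2 * f (switching activity a, switched capacitance C, supply voltage V, clock frequency f). The sketch below only evaluates this proportionality. The function name, the 1.2 V nominal voltage and the example voltages are assumptions chosen for illustration, not measurements of any SRAM or SCM. Leakage power and the lower maximum clock frequency at reduced voltage are deliberately ignored.

```python
def relative_dynamic_power(v_dd, v_nominal, freq_scale=1.0):
    """Dynamic power relative to nominal, from P ~ a * C * V^2 * f.

    Activity and capacitance are assumed unchanged, so they cancel.
    `freq_scale` is the clock frequency relative to nominal.
    """
    if v_dd <= 0 or v_nominal <= 0:
        raise ValueError("voltages must be positive")
    return (v_dd / v_nominal) ** 2 * freq_scale


if __name__ == "__main__":
    v_nominal = 1.2  # assumed nominal supply voltage in volts
    for v_dd in (1.2, 0.9, 0.8, 0.6):  # hypothetical operating points
        ratio = relative_dynamic_power(v_dd, v_nominal)
        print(f"V_DD = {v_dd:.1f} V -> {ratio:.2f} x nominal dynamic power")
```

At a fixed clock frequency the same ratio also applies to the dynamic energy per operation. In practice the achievable frequency drops and leakage becomes more significant at low voltage, which is one reason the gain in energy efficiency is smaller than the gain in power.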
During this project you will design your own hardware accelerator from scratch and optimize it for energy efficiency using low-power ASIC design methodology. The accelerator will then be integrated into an existing microcontroller architecture (PULPissimo). As the final step of the project, you will have the opportunity to tape out the microcontroller including your HW accelerator as part of a TSMC 65nm multi-project wafer (MPW) run, which will give you practical experience not only in front-end but also in back-end design.
To work on this project, you will need:
- to have worked in the past with at least one RTL language (SystemVerilog or Verilog or VHDL) - having followed (or simultaneously following) the VLSI1 / VLSI2 courses is recommended
- to have prior knowledge of hardware design and computer architecture - having followed the Advanced System-on-Chip Design course is recommended
Other skills that you might find useful include:
- to be familiar working on Linux and in a command line terminal
- familiarity with a scripting language (e.g. Python)
- to be strongly motivated for a difficult but super-cool project where you get the unique chance to acquire hands-on experience in ASIC development.
If you want to work on this project but think that you do not have some of the required skills, we can give you some preliminary exercises to help you fill in the gaps.
Status: In Progress
- Supervision: Manuel Eggimann
- Date: Autumn 2020
Meetings & Presentations
The students and advisor(s) agree on weekly meetings to discuss all relevant decisions and decide on how to proceed. Of course, additional meetings can be organized to address urgent issues.
Around the middle of the project there is a design review, where senior members of the lab review your work (bring all the relevant information, such as prelim. specifications, block diagrams, synthesis reports, testing strategy, ...) to make sure everything is on track and decide whether further support is necessary. They also make the definite decision on whether the chip is actually manufactured (no reason to worry, if the project is on track) and whether more chip area, a different package, ... is provided. For more details confer to .
At the end of the project, you have to present/defend your work during a 15 min. presentation and 5 min. of discussion as part of the IIS colloquium.
- The EDA wiki with lots of information on the ETHZ ASIC design flow (internal only) 
- The IIS/DZ coding guidelines