Difference between revisions of "Investigating the Cost of Special-Case Handling in Low-Precision Floating-Point Dot Product Units (1S)"
From iis-projects
Lbertaccini (talk | contribs) (Created page with "<!-- Investigating the Cost of Special-Case Handling in Low-Precision Floating-Point Dot Product Units (1S) --> Category:Digital Category:Acceleration_and_Transprecisio...") |
Lbertaccini (talk | contribs) |
||
(4 intermediate revisions by the same user not shown) | |||
Line 7: | Line 7: | ||
[[Category:2023]] | [[Category:2023]] | ||
[[Category:Semester Thesis]] | [[Category:Semester Thesis]] | ||
− | [[Category: | + | [[Category:Completed]] |
[[Category:Lbertaccini]] | [[Category:Lbertaccini]] | ||
= Overview = | = Overview = | ||
− | == Status: | + | == Status: Completed == |
+ | * Student: Maurus Item | ||
* Type: Semester Thesis | * Type: Semester Thesis | ||
* Professor: Prof. Dr. L. Benini | * Professor: Prof. Dr. L. Benini | ||
Line 22: | Line 23: | ||
[[File:Fpu_block_diagram.png|thumb|300px|FPnew block diagram [1]. Each operation group block can be instantiated through a parameter. In the figure, the FPU was instantiated without a DivSqrt module.]] | [[File:Fpu_block_diagram.png|thumb|300px|FPnew block diagram [1]. Each operation group block can be instantiated through a parameter. In the figure, the FPU was instantiated without a DivSqrt module.]] | ||
− | Low-precision floating-point (FP) formats are getting more and more traction in the context of neural network (NN) training. Employing low-precision formats such as 8-bit FP data | + | Low-precision floating-point (FP) formats are getting more and more traction in the context of neural network (NN) training. Employing low-precision formats, such as 8-bit FP data types, reduce the model's memory footprint and open new opportunities to increase the system's energy efficiency. |
A low-precision FP dot product unit was recently developed at IIS [1], [2]. The module computes 8 or 16-bit dot products and accumulates the result in larger precision. It has been designed following the standard IEEE-754 directives. However, supporting all the special cases can be costly in hardware and some of these special cases might be unnecessary for low-precision training. The goal of this project is to evaluate such costs. | A low-precision FP dot product unit was recently developed at IIS [1], [2]. The module computes 8 or 16-bit dot products and accumulates the result in larger precision. It has been designed following the standard IEEE-754 directives. However, supporting all the special cases can be costly in hardware and some of these special cases might be unnecessary for low-precision training. The goal of this project is to evaluate such costs. | ||
Line 52: | Line 53: | ||
− | ===Status: | + | ===Status: Completed=== |
Latest revision as of 18:20, 3 November 2023
Contents
Overview
Status: Completed
- Student: Maurus Item
- Type: Semester Thesis
- Professor: Prof. Dr. L. Benini
- Supervisors:
Introduction
Low-precision floating-point (FP) formats are getting more and more traction in the context of neural network (NN) training. Employing low-precision formats, such as 8-bit FP data types, reduce the model's memory footprint and open new opportunities to increase the system's energy efficiency.
A low-precision FP dot product unit was recently developed at IIS [1], [2]. The module computes 8 or 16-bit dot products and accumulates the result in larger precision. It has been designed following the standard IEEE-754 directives. However, supporting all the special cases can be costly in hardware and some of these special cases might be unnecessary for low-precision training. The goal of this project is to evaluate such costs.
Project
- Investigation of the Sdotp unit and its fundamental blocks
- RTL modifications to the Sdotp unit. Support for special cases will be incrementally removed/modified, and its costs will be assessed.
Character
- 20% Literature / architecture review
- 40% RTL implementation
- 40% Evaluation
Prerequisites
- Strong interest in computer architecture
- Experience with digital design in SystemVerilog as taught in VLSI I
- Experience with ASIC implementation flow (synthesis) as taught in VLSI II
References
[1] https://arxiv.org/abs/2207.03192 MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V cores
[2] https://github.com/openhwgroup/cvfpu/tree/feature/expanding_dotp