Investigating the Cost of Special-Case Handling in Low-Precision Floating-Point Dot Product Units (1S)

Overview

Status: Completed

Introduction

Figure: FPnew block diagram [1]. Each operation group block can be instantiated through a parameter; in the figure, the FPU was instantiated without a DivSqrt module.

Low-precision floating-point (FP) formats are gaining more and more traction in the context of neural network (NN) training. Employing low-precision formats, such as 8-bit FP data types, reduces the model's memory footprint and opens new opportunities to increase the system's energy efficiency.

A low-precision FP dot product unit was recently developed at IIS [1], [2]. The module computes 8-bit or 16-bit dot products and accumulates the result in a larger precision. It was designed following the IEEE 754 standard. However, supporting all of the special cases (e.g., NaN and infinity handling) can be costly in hardware, and some of them might be unnecessary for low-precision training. The goal of this project is to evaluate these costs.
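
To make the special cases concrete, the following Python sketch models one expanding sum-of-dot-products step in the spirit of ExSdotp [1], i.e. acc + a0*b0 + a1*b1 with narrow products accumulated into a wider format. It is a behavioural illustration only: the function name exsdotp_step, the (result, invalid) return value, and the use of Python doubles in place of the FP8/FP16 formats are assumptions for readability, not the RTL interface.

    import math

    def exsdotp_step(a0, b0, a1, b1, acc):
        """Behavioural model of acc + a0*b0 + a1*b1 with IEEE-754 special cases.

        Returns (result, invalid): 'invalid' mirrors the IEEE invalid-operation
        flag the hardware would raise for 0 x inf or inf - inf.
        """
        # Any NaN input propagates quietly to the result.
        if any(math.isnan(x) for x in (a0, b0, a1, b1, acc)):
            return math.nan, False

        invalid = False
        products = []
        for x, y in ((a0, b0), (a1, b1)):
            if (x == 0.0 and math.isinf(y)) or (math.isinf(x) and y == 0.0):
                invalid = True                 # 0 x inf: invalid operation
                products.append(math.nan)
            else:
                products.append(x * y)         # an inf operand gives an inf product

        if any(math.isnan(p) for p in products):
            return math.nan, invalid

        result = acc + products[0] + products[1]
        if math.isnan(result):                 # only reachable here via inf - inf
            invalid = True
        return result, invalid

    # Ordinary operands: no special handling kicks in.
    print(exsdotp_step(0.5, 2.0, -1.0, 0.25, 1.0))     # (1.75, False)
    # 0 x inf in the first product: NaN result, invalid flag set.
    print(exsdotp_step(0.0, math.inf, 1.0, 1.0, 0.0))  # (nan, True)

Each of these checks corresponds to comparators and result-selection logic in the actual datapath, which is exactly the hardware cost this project quantifies.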


Project

  • Investigation of the Sdotp unit and its fundamental blocks
  • RTL modifications to the Sdotp unit: support for special cases will be incrementally removed or modified, and the cost of each step will be assessed (see the behavioural sketch after this list)
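
As an illustration of what one such incremental step could look like at the behavioural level, the hypothetical variant below drops the NaN/inf handling from the sketch above. In hardware, the corresponding change removes the special-case comparators and result-selection muxes; the actual saving is measured on the SystemVerilog RTL through synthesis, not in a Python model.

    def exsdotp_step_reduced(a0, b0, a1, b1, acc):
        """Hypothetical reduced variant: special-case logic stripped.

        Valid only under the assumption that operands are always finite
        numbers, as may be acceptable for low-precision NN training; the
        NaN/inf detection and propagation paths of the full unit are gone.
        """
        return acc + a0 * b0 + a1 * b1, False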

Character

  • 20% Literature / architecture review
  • 40% RTL implementation
  • 40% Evaluation

Prerequisites

  • Strong interest in computer architecture
  • Experience with digital design in SystemVerilog as taught in VLSI I
  • Experience with ASIC implementation flow (synthesis) as taught in VLSI II

References

[1] MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V cores. https://arxiv.org/abs/2207.03192

[2] CVFPU, expanding_dotp branch. https://github.com/openhwgroup/cvfpu/tree/feature/expanding_dotp

