Optimizing the Pipeline in our Floating Point Architectures (1S)

From iis-projects


Revision as of 18:29, 5 August 2022


Overview

Status: Available

Introduction

FPnew block diagram [1]. Each operation group block can be instantiated through a parameter. In the figure, FPnew was instantiated without a DivSqrt module.

Floating-point (FP) arithmetic is fundamental to a wide range of applications, from high-performance computing to neural network training. FP architectures usually have a long critical path and need to be pipelined to meet the system's operating frequency. A flexible, highly parametrized open-source floating-point unit (FPU) called FPnew [1,2] has been developed at IIS.

FPnew is optimized for high performance and energy efficiency. It is organized internally into modules, each of which carries out one operation group (add/mul, divsqrt, cast, comparisons, dot-product). Each operation group block, except the DivSqrt module, which implements an iterative algorithm, contains a parametrized number of pipeline registers. Currently, all registers are placed close to the input boundaries, and timing is optimized during the backend. However, this can make the backend flow longer and more complex. The goal of this project is to place the pipeline registers manually, optimizing for timing, and to compare the resulting implementations against the baseline.
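To make the trade-off concrete, the following is a toy model only, not a description of FPnew's actual datapath. It treats an operation group as a chain of logic blocks with assumed delays (the values in `BLOCK_DELAYS_PS` are invented for illustration) and compares the critical path when all registers sit at the input boundary against the best placement found by exhaustive search.

```python
from itertools import combinations

# Hypothetical per-block combinational delays in picoseconds (illustrative only).
BLOCK_DELAYS_PS = [300, 900, 400, 600, 200]


def stage_delays(delays, cuts):
    """Return the total delay of each pipeline stage.

    A cut at index i places a register in front of block i, so a cut at 0
    is a register at the input boundary of the chain.
    """
    bounds = [0, *sorted(cuts), len(delays)]
    return [sum(delays[a:b]) for a, b in zip(bounds, bounds[1:])]


def critical_path(delays, cuts):
    """The critical path is the slowest stage between two registers."""
    return max(stage_delays(delays, cuts))


def best_cuts(delays, num_regs):
    """Exhaustively search interior cut positions for the shortest critical path."""
    positions = range(1, len(delays))
    if num_regs > len(positions):
        raise ValueError("more registers than interior cut positions")
    return min(combinations(positions, num_regs),
               key=lambda cuts: critical_path(delays, cuts))


if __name__ == "__main__":
    at_input = (0, 0)  # both registers at the input boundary
    placed = best_cuts(BLOCK_DELAYS_PS, 2)
    print("registers at input:", critical_path(BLOCK_DELAYS_PS, at_input), "ps")
    print("manual placement  :", critical_path(BLOCK_DELAYS_PS, placed), "ps at cuts", placed)
```

In this model, registers stacked at the input leave the whole chain in a single stage, which is why a later optimization step is needed to spread them. Real critical paths depend on the synthesized netlist, so the delays would have to come from timing reports rather than assumed constants.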


Project

  • Investigation of the FPU timing. This will require you to
    • Understand where the critical paths in the unit are
    • Understand how the critical paths are broken when different numbers of pipeline registers are inserted
  • RTL modifications to FPnew to manually optimize the pipeline for different numbers of pipeline registers
  • Implementation of a Python generator that takes the number of pipeline levels as an input and places the registers in the positions you identified
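As a rough idea of what such a generator could look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the cut-point names, the placement table, and the emitted `localparam` names are hypothetical and do not correspond to FPnew's actual parameters or module structure. The real positions would come from the timing investigation.

```python
# Hypothetical candidate register locations in an add/mul datapath, in order.
CUT_POINTS = ["input", "after_mul", "after_align", "after_add", "after_norm", "output"]

# Hypothetical hand-picked placements, keyed by the number of pipeline levels.
PLACEMENTS = {
    0: [],
    1: ["after_align"],
    2: ["after_mul", "after_add"],
    3: ["after_mul", "after_align", "after_norm"],
}


def generate_config(num_levels: int) -> str:
    """Emit a SystemVerilog snippet that enables one register per chosen cut point."""
    if num_levels not in PLACEMENTS:
        raise ValueError(f"no placement defined for {num_levels} pipeline levels")
    chosen = PLACEMENTS[num_levels]
    lines = [
        f"// Auto-generated for {num_levels} pipeline level(s)",
        f"localparam int unsigned NUM_PIPE_REGS = {num_levels};",
    ]
    for name in CUT_POINTS:
        enabled = int(name in chosen)  # 1 if a register is placed at this point
        lines.append(f"localparam bit REG_{name.upper()} = 1'b{enabled};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(generate_config(2))
```

Keeping the placements in a lookup table separates the hand-identified positions from the code that emits RTL, so new pipeline depths can be added without touching the generator logic.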

Character

  • 15% Literature / architecture review
  • 30% RTL implementation
  • 40% Evaluation
  • 15% Python generator

Prerequisites

  • Strong interest in computer architecture
  • Experience with digital design in SystemVerilog as taught in VLSI I
  • Experience with the ASIC implementation flow (synthesis) as taught in VLSI II

References

[1] FPnew: An Open-Source Multiformat Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing. https://ieeexplore.ieee.org/abstract/document/9311229

[2] https://github.com/openhwgroup/cvfpu/

