Overview

Status: Completed

Student: Mingrui Yuan
Type: Semester Thesis
Professor: Prof. Dr. L. Benini
Supervisors:
- Luca Bertaccini: lbertaccini@iis.ee.ethz.ch

Introduction

Figure: FPnew block diagram [1]. Each operation group block can be instantiated through a parameter. In the figure, FPnew is instantiated without a DivSqrt module.

Floating-point (FP) arithmetic is fundamental to a wide range of applications, from high-performance computing to neural network training. FP datapaths usually have long critical paths and need to be pipelined to meet the system's operating frequency. A flexible, highly parametrized open-source floating-point unit (FPU) called FPnew [1,2] has been developed at IIS.

FPnew is optimized for high performance and energy efficiency. It is organized internally in modules, each carrying out one operation group (add/mul, divsqrt, cast, comparisons, dot-product). Each operation group block (except the DivSqrt module, which implements an iterative algorithm) contains a parametrized number of pipeline registers. Currently, all registers are placed close to the input boundaries, and timing is optimized during the backend. However, this can make the backend a longer and more complex process. The goal of this project is to manually place the pipeline registers to optimize timing and to compare the resulting designs against the baseline implementation.
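To make the placement problem concrete, here is a minimal sketch that models a datapath as a chain of combinational segments and compares input-boundary placement with a balanced manual placement. The delay values and the function names `critical_path` and `best_cuts` are assumptions for illustration only; they are not taken from FPnew or from any synthesis report.

```python
from itertools import combinations


def critical_path(delays, cuts):
    """Longest register-to-register delay for a given register placement.

    delays: combinational delay of each segment, in dataflow order.
    cuts:   one index per register; a cut at index i places a register
            directly in front of segment i (index 0 is the input boundary).
    """
    bounds = [0, *sorted(cuts), len(delays)]
    return max(sum(delays[a:b]) for a, b in zip(bounds, bounds[1:]))


def best_cuts(delays, num_regs):
    """Exhaustively search the interior cut positions for the placement
    with the shortest critical path. Raises ValueError if num_regs exceeds
    the number of interior positions."""
    interior = range(1, len(delays))
    return min(
        combinations(interior, num_regs),
        key=lambda cuts: critical_path(delays, cuts),
    )


# Hypothetical segment delays in arbitrary time units (not measured data).
delays = [3, 5, 2, 4]

# Two registers stacked at the input boundary: the whole chain stays
# in a single register-to-register path until the backend moves them.
at_input = critical_path(delays, (0, 0))

# Two registers placed manually at the best interior positions.
manual = best_cuts(delays, 2)
balanced = critical_path(delays, manual)

print(at_input, manual, balanced)
```

In this toy model the input-boundary placement leaves the full chain delay as the critical path, while the manual placement splits it into shorter stages. A real evaluation would replace the hypothetical delays with timing data from synthesis.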

Project

Investigation of the FPU timing. This will require you to:
- Understand which paths in the unit are critical
- Understand how the critical paths are broken when different numbers of pipeline registers are inserted
RTL modifications to FPnew to manually optimize the pipeline for different numbers of pipeline registers
Implementation of a Python generator that takes the number of pipeline levels as input and places the registers at the positions you identified
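One possible shape for such a generator is sketched below, assuming the chosen positions are stored in a lookup table keyed by pipeline depth and emitted as a SystemVerilog parameter. The position names, the placement table, and the parameter name `PipeRegMask` are all hypothetical; they describe a generic FMA-style datapath and do not come from the FPnew source.

```python
# Hypothetical candidate register positions in dataflow order
# (generic FMA-style datapath, not taken from the FPnew RTL).
POSITIONS = [
    "input",
    "post_multiply",
    "post_align",
    "post_add",
    "post_normalize",
    "output",
]

# Hypothetical hand-picked placements per pipeline depth. In the project,
# these would come from the timing investigation, not from this table.
PLACEMENTS = {
    0: [],
    1: ["post_add"],
    2: ["post_multiply", "post_add"],
    3: ["post_multiply", "post_add", "post_normalize"],
}


def register_mask(num_regs):
    """Return a 0/1 list with one entry per candidate position."""
    if num_regs not in PLACEMENTS:
        raise ValueError(f"no placement defined for {num_regs} registers")
    chosen = set(PLACEMENTS[num_regs])
    return [int(pos in chosen) for pos in POSITIONS]


def emit_sv_param(num_regs, name="PipeRegMask"):
    """Render the mask as a SystemVerilog localparam line.

    Bit i of the mask corresponds to POSITIONS[i], so the list is
    reversed to put position 0 in the least significant bit.
    """
    mask = register_mask(num_regs)
    bits = "".join(str(b) for b in reversed(mask))
    width = len(mask)
    return f"localparam logic [{width - 1}:0] {name} = {width}'b{bits};"


print(emit_sv_param(2))
```

Keeping the placements in a plain table separates the timing analysis, which decides the positions, from the code generation, which only renders them. The RTL would then need a matching parameter that enables a register at each flagged position.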

Character

15% Literature / architecture review
30% RTL implementation
40% Evaluation
15% Python generator

Prerequisites

Strong interest in computer architecture
Experience with digital design in SystemVerilog as taught in VLSI I
Experience with ASIC implementation flow (synthesis) as taught in VLSI II

References

[1] FPnew: An Open-Source Multiformat Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing, https://ieeexplore.ieee.org/abstract/document/9311229

[2] https://github.com/openhwgroup/cvfpu/

Optimizing the Pipeline in our Floating Point Architectures (1S) - iis-projects