Universal Stream Semantic Registers for Snitch (1S)

From iis-projects

[[Category:High Performance SoCs]]
[[Category:Computer Architecture]]
[[Category:Acceleration_and_Transprecision]]
[[Category:2021]]
[[Category:Semester Thesis]]

Revision as of 14:52, 10 August 2021


Overview

Status: Available

Introduction

Processors often access data as memory streams: sequences of memory requests that follow predefined address patterns. Recent architectural extensions [1, 2] propose handling such streams in hardware. This frees processors from explicitly computing addresses and issuing requests, increasing compute throughput. It also decouples data movement from execution, hiding architectural latencies and maximizing bandwidth utilization.

In our group, we developed Stream Semantic Registers (SSRs) [1]. These map memory streams directly to general-purpose registers in a RISC-V core, such that simply accessing a register loads or stores data. The stream's addresses are computed by an address generator, which is programmed with the stream's address pattern (loop bounds, strides, ...) beforehand.
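To make the notion of an address pattern concrete, the following is a minimal software sketch of an affine address generator, assuming a nested-loop pattern described by per-loop bounds and byte strides. The names `stream_cfg_t`, `stream_length`, `stream_address`, and the limit `SSR_MAX_DIMS` are hypothetical and chosen for illustration only; they do not reflect the actual SSR configuration interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SSR_MAX_DIMS 4 /* assumed limit for this sketch, not a hardware fact */

/* Hypothetical software model of a stream's affine address pattern. */
typedef struct {
    uintptr_t base;                 /* address of the first element */
    uint32_t  dims;                 /* number of nested loops in use */
    uint32_t  bound[SSR_MAX_DIMS];  /* iterations per loop, innermost first */
    int32_t   stride[SSR_MAX_DIMS]; /* byte stride per loop, innermost first */
} stream_cfg_t;

/* Total number of elements the stream produces. */
static size_t stream_length(const stream_cfg_t *cfg) {
    assert(cfg->dims >= 1 && cfg->dims <= SSR_MAX_DIMS);
    size_t n = 1;
    for (uint32_t d = 0; d < cfg->dims; d++) {
        n *= cfg->bound[d];
    }
    return n;
}

/* Address of the i-th element in stream order. */
static uintptr_t stream_address(const stream_cfg_t *cfg, size_t i) {
    assert(i < stream_length(cfg));
    intptr_t offset = 0;
    for (uint32_t d = 0; d < cfg->dims; d++) {
        size_t idx = i % cfg->bound[d]; /* position within loop d */
        i /= cfg->bound[d];             /* carry into the next outer loop */
        offset += (intptr_t)idx * cfg->stride[d];
    }
    return cfg->base + (uintptr_t)offset;
}
```

In this model, a row-major walk over a 3x4 block of 64-bit values inside a wider matrix would use bounds {4, 3} and strides {8, row pitch in bytes}. A hardware address generator would produce the same sequence incrementally rather than recomputing it per element.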

SSRs are used in the Snitch cluster [3] along with the floating-point repetition (FREP) hardware loop; together, they enable floating-point unit (FPU) utilizations near 100% on regular problems. In this context, we recently extended SSRs to also handle indirect streams [4] for sparse workloads, and are actively working on further extensions.
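As an illustration of the kind of access pattern an indirect stream targets, here is a plain C reference version of a sparse-dense dot product. It is only a sketch of the workload, not Snitch code: the function name `sparse_dot` and its parameters are assumptions, and no SSR or FREP interface is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sparse-dense dot product: vals[i] is multiplied with dense[idcs[i]].
 * The read dense[idcs[i]] is an indirect access: its address depends on
 * a value loaded from the index array rather than on a fixed stride. */
static double sparse_dot(const double *vals, const uint32_t *idcs,
                         size_t nnz, const double *dense, size_t dense_len) {
    double acc = 0.0;
    for (size_t i = 0; i < nnz; i++) {
        assert(idcs[i] < dense_len); /* guard against out-of-range indices */
        acc += vals[i] * dense[idcs[i]];
    }
    return acc;
}
```

On a core without stream support, each iteration spends instructions on loading the index, computing the address, and loading the operand; handling the stream in hardware is meant to take this work off the core.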

However, there is a fundamental limitation to SSRs as currently implemented in Snitch systems: they only support streaming double-precision (64-bit) floating-point data. Adding support for integer types and different element sizes (8, 16, 32, and 64 bits) would enable accelerating many more scenarios, such as graph processing.
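One question such an extension raises is how a narrow element is presented in a 64-bit register. The sketch below models one possible behavior in software: load an element of 1, 2, 4, or 8 bytes and sign- or zero-extend it to 64 bits. The function `load_element` is hypothetical, and a little-endian byte order is assumed; the actual hardware behavior is part of the architecture specification to be developed in the project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: fetch one stream element of `size` bytes and extend
 * it to 64 bits. Assumes a little-endian host, so the copied bytes land in
 * the low-order end of `raw`. */
static uint64_t load_element(const void *addr, unsigned size, bool is_signed) {
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    uint64_t raw = 0;
    memcpy(&raw, addr, size); /* upper bytes stay zero: zero extension */
    if (is_signed && size < 8) {
        /* Sign-extend using unsigned arithmetic only: flip the element's
         * sign bit, then subtract it again modulo 2^64. */
        uint64_t sign = (uint64_t)1 << (8 * size - 1);
        raw = (raw ^ sign) - sign;
    }
    return raw;
}
```

Combined with an address generator whose strides are given in bytes, the same pattern description could then serve 8-, 16-, 32-, and 64-bit streams by changing only the element size.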

Project

In this project, we want to:

  • Extend SSRs to support variably-sized types for stream elements.
  • Extend the work-in-progress Snitch integer processing unit (IPU) to support integer SSRs.
  • Write simple programs (e.g. linear algebra, graph algorithm kernels) demonstrating the use of integer and variable-size streams.
  • Evaluate the performance, area, energy, and timing impact of these extensions on the above applications.

The project can be simplified, adapted, or extended to suit your needs and wishes.

Character

  • 20% Architecture specification
  • 40% RTL implementation
  • 40% Verification and Evaluation

Prerequisites

  • Strong interest in computer architecture and/or memory systems
  • Experience with HDLs (preferably SystemVerilog) as taught in VLSI I
  • Knowledge of the ASIC tool flow or parallel enrollment in VLSI II
  • Basic knowledge of embedded / bare-metal programming in C

References

[1] https://ieeexplore.ieee.org/document/9068465

[2] https://ieeexplore.ieee.org/document/8980305

[3] https://ieeexplore.ieee.org/document/9216552

[4] https://arxiv.org/abs/2011.08070