Enhancing our DMA Engine with Vector Processing Capabilities (1-2S/B)

From iis-projects


Latest revision as of 11:13, 21 June 2022


Overview

Status: Reserved

Introduction

Traditional vector processors are complex and require a large circuit area. For example, Ara [1], our in-house vector processor (RISC-V Vector Extension version 0.10), has a complexity of multiple MGE (million gate equivalents). In addition to these area requirements, vector processors are register machines: they must load data into their internal register file before processing it and store the results back to memory afterwards, which reduces their energy efficiency.

In our group, we are developing a modular and extensible high-performance direct memory access (DMA) engine. So far, the DMA engine can only copy a stream of data without modifying it. Thanks to its modularity, a small, lightweight stream processing unit can be added to extend its capabilities with simple vector operations.


Project

In this project, you will extend the DMA with simple vector manipulation capabilities such as:

  • Writing a stream of 0s, 1s, or pseudo-random numbers for initialization of memory locations
  • Summing the elements of a vector
  • Finding the minimum and maximum elements
  • Performing element-wise operations like scalar multiplication, scalar addition, and logical operations (AND, OR, XOR, NOT)
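The operations listed above can be captured in a small behavioral reference model, for example as a golden model to compare RTL simulation results against. The following is a minimal sketch under several assumptions that are not part of the project description: 32-bit unsigned elements, wrap-around arithmetic, and a xorshift generator standing in for whatever pseudo-random source the hardware would use. All function names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

MASK32 = 0xFFFF_FFFF  # assumption: 32-bit unsigned stream elements


def fill(value: int, length: int) -> Iterator[int]:
    """Emit `length` copies of a constant word (e.g. all zeros or all ones)."""
    for _ in range(length):
        yield value & MASK32


def xorshift_stream(seed: int, length: int) -> Iterator[int]:
    """Emit pseudo-random words (xorshift32; a stand-in, not the real design)."""
    state = seed & MASK32
    if state == 0:
        raise ValueError("xorshift32 needs a non-zero seed")
    for _ in range(length):
        state ^= (state << 13) & MASK32
        state ^= state >> 17
        state ^= (state << 5) & MASK32
        yield state


def reduce_sum(stream: Iterable[int]) -> int:
    """Sum all elements, wrapping at the assumed element width."""
    return sum(stream) & MASK32


def reduce_min_max(stream: Iterable[int]) -> Tuple[int, int]:
    """Return (minimum, maximum) of a non-empty stream in a single pass."""
    it = iter(stream)
    lo = hi = next(it)
    for x in it:
        lo, hi = min(lo, x), max(hi, x)
    return lo, hi


def elementwise(stream: Iterable[int], op: str, scalar: int = 0) -> Iterator[int]:
    """Apply one scalar or logical operation to every element of the stream."""
    ops = {
        "mul": lambda x: x * scalar,
        "add": lambda x: x + scalar,
        "and": lambda x: x & scalar,
        "or": lambda x: x | scalar,
        "xor": lambda x: x ^ scalar,
        "not": lambda x: ~x,
    }
    fn = ops[op]
    for x in stream:
        yield fn(x) & MASK32  # mask models fixed-width hardware results
```

For instance, `list(elementwise([3, 1, 4], "mul", 2))` yields `[6, 2, 8]`, and `reduce_min_max([3, 1, 4])` yields `(1, 4)`. Writing each operation as a generator mirrors the streaming nature of the DMA data path: elements are consumed and produced one at a time, without a register file.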

In addition to implementing these capabilities, you will:

  • Investigate how to enhance the generic DMA transfer request with the newly gained operations; this will include adaptations to the software driver of the DMA
  • Simulate the hardware to gather performance numbers (throughput, latency, ...)
  • Synthesize your updated DMA to investigate the influence of your additions on area and timing


Depending on the remaining time and your personal interests, more complex vector operations can be added, such as:

  • Rounding operations
  • SIMD compression / decompression
  • Vector addition
  • Data compression
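Vector addition differs from the scalar operations in that it combines two input streams rather than one stream and a constant. A minimal behavioral sketch, assuming 32-bit unsigned elements with wrap-around (the function name and element width are assumptions for illustration):

```python
from typing import Iterable, Iterator

MASK32 = 0xFFFF_FFFF  # assumption: 32-bit unsigned stream elements


def vector_add(a: Iterable[int], b: Iterable[int]) -> Iterator[int]:
    """Add two streams element by element, wrapping at 32 bits.

    zip() stops at the shorter input, so both streams are expected
    to have the same length.
    """
    for x, y in zip(a, b):
        yield (x + y) & MASK32
```

For example, `list(vector_add([1, 2], [10, 20]))` yields `[11, 22]`.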


Character

  • 20% Architecture
  • 40% RTL implementation
  • 20% Verification
  • 20% Evaluation

Prerequisites

  • Strong interest in memory systems
  • Experience with digital design in SystemVerilog as taught in VLSI I
  • Preferred: Knowledge of or experience with AXI [2] and RISC-V

References

[1] https://github.com/pulp-platform/ara

[2] https://github.com/pulp-platform/axi