Creating A Reshuffling Mid-end For Reorganizing Data Inside The Compute Cluster (1-3S/B)

From iis-projects

Revision as of 11:26, 29 August 2023

Overview

Status: Available

  • Type: Bachelor / Semester Thesis
  • Professor: Prof. Dr. L. Benini
  • Supervisors:

Introduction

At IIS, we are developing a scalable and flexible family of DMA engines called iDMA [1]. iDMA serves as the cluster-level DMA in both the Snitch and PULP clusters. In this role, iDMA has fine-grained access to the cluster-internal tightly-coupled data memory (TCDM).

Traditionally, an accelerator that reorganizes data, e.g. by transposing a matrix, requires a large internal buffer: it reads the data in a dense format, reshuffles it, and writes it out again as a dense stream. This buffer is dedicated and special-purpose. Our idea is to create a reshuffling accelerator based on iDMA that instead uses the cluster TCDM as its buffer.
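
To make the idea concrete, the following is a minimal software sketch of a matrix transpose that reads a dense stream and scatters each word to its final position in a flat, word-addressable memory. The list `tcdm` only stands in for a region of the cluster TCDM, and the function name is an assumption made for illustration. It is a behavioral model, not the iDMA interface or the intended hardware design.

```python
def transpose_via_shared_memory(src, rows, cols):
    """Transpose a row-major rows x cols matrix stored as a flat list of words."""
    if len(src) != rows * cols:
        raise ValueError("src length must equal rows * cols")

    # Stand-in for a region of the shared TCDM (assumption: word-addressable).
    tcdm = [None] * (rows * cols)

    for r in range(rows):
        for c in range(cols):
            # Dense read at r*cols + c, word-granular scattered write at c*rows + r.
            tcdm[c * rows + r] = src[r * cols + c]
    return tcdm


matrix = [1, 2, 3,
          4, 5, 6]
transposed = transpose_via_shared_memory(matrix, rows=2, cols=3)
print(transposed)  # [1, 4, 2, 5, 3, 6]
```

The read side stays dense while the write side needs word-granular access, which is why fine-grained access to the TCDM matters for this approach.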


Project

You will first investigate common data reshuffling operations and define their reshuffling characteristics. You will then implement these operations in our iDMA engine, together with the changes required to access the Snitch TCDM at word (or even sub-word) granularity and to use it as a buffer. Finally, you will evaluate your approach against accelerators that use a dedicated internal buffer.
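
One possible way to capture the "reshuffling characteristics" of an operation is as a small descriptor of strides and repetition counts, which is then expanded into word-granular source and destination addresses. The sketch below assumes such a descriptor. The class `Reshuffle2D`, its fields, and the helper functions are hypothetical names chosen for illustration, not part of iDMA, and de-interleaving is used only as an assumed second example of a reshuffle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reshuffle2D:
    """Hypothetical 2-D reshuffle descriptor; all strides are in words."""
    inner_count: int
    outer_count: int
    src_inner_stride: int
    src_outer_stride: int
    dst_inner_stride: int
    dst_outer_stride: int


def expand(desc):
    """Yield (src_addr, dst_addr) word-address pairs in issue order."""
    for o in range(desc.outer_count):
        for i in range(desc.inner_count):
            yield (o * desc.src_outer_stride + i * desc.src_inner_stride,
                   o * desc.dst_outer_stride + i * desc.dst_inner_stride)


def run_reshuffle(desc, src):
    """Apply the descriptor to a flat list of words and return the result."""
    dst = [None] * (desc.inner_count * desc.outer_count)
    for s, d in expand(desc):
        dst[d] = src[s]
    return dst


def transpose_descriptor(rows, cols):
    # Dense reads along each row, writes strided by the number of rows.
    return Reshuffle2D(inner_count=cols, outer_count=rows,
                       src_inner_stride=1, src_outer_stride=cols,
                       dst_inner_stride=rows, dst_outer_stride=1)


def deinterleave_descriptor(n_pairs):
    # [a0, b0, a1, b1, ...] -> [a0, a1, ..., b0, b1, ...]
    return transpose_descriptor(rows=n_pairs, cols=2)


print(run_reshuffle(transpose_descriptor(2, 3), [1, 2, 3, 4, 5, 6]))
print(run_reshuffle(deinterleave_descriptor(3), [0, 1, 2, 3, 4, 5]))
```

In this model, de-interleaving turns out to be a transpose with two columns, which hints that several operations may share one parameterized address generator. Whether that carries over to the hardware is part of what the investigation would need to establish.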

Character

  • 20% Getting familiar with iDMA and Snitch, evaluating reshuffle operations
  • 30% Implementing the reshuffle operation in the iDMA
  • 30% Integrating your accelerator in Snitch
  • 20% Evaluation


Prerequisites

  • Interest in memory systems
  • Experience with digital design in SystemVerilog as taught in VLSI I

References

[1] “A High-performance, Energy-efficient Modular DMA Engine Architecture” https://arxiv.org/abs/2305.05240