Creating an At-memory Low-overhead Bufferless Matrix Transposition Accelerator (1-3S/B)

From iis-projects


Overview

Status: Available

Introduction

At IIS we are developing a scalable and flexible family of DMA engines, called iDMA [1]. iDMA is the cluster-level DMA in both the Snitch and PULP clusters. When implemented as this cluster-level engine, iDMA has fine-granular access to the cluster-internal tightly-coupled data memory (TCDM).

Traditionally, an accelerator that reorganizes data, e.g., transposes a matrix, requires a large dedicated internal buffer: it reads the data in as a dense stream, reshuffles it in the buffer, and writes it out again as a dense stream. Our idea is to build such a reshuffling accelerator inside iDMA that uses the already present cluster TCDM as its buffer, eliminating the dedicated special-purpose buffer.
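The dense-in/reshuffle/dense-out pattern described above can be sketched in plain C. This is only an illustrative software model, not the iDMA implementation: the scratch array stands in for the accelerator's internal buffer, which the project would replace with the cluster TCDM; the matrix dimensions are arbitrary.

```c
#include <string.h>

#define ROWS 4
#define COLS 6

/* Transpose a dense row-major ROWS x COLS matrix by streaming it
 * through a scratch buffer:
 *   1. dense read of the source stream into the buffer,
 *   2. reshuffled dense write-out in transposed order.
 * In a traditional accelerator, `buf` is a dedicated internal buffer;
 * the project idea is to use the cluster TCDM in its place. */
static void transpose_through_buffer(const int *src, int *dst) {
    int buf[ROWS * COLS];            /* stand-in for the internal buffer */
    memcpy(buf, src, sizeof buf);    /* dense read                        */
    for (int r = 0; r < ROWS; r++)   /* reshuffled dense write-out        */
        for (int c = 0; c < COLS; c++)
            dst[c * ROWS + r] = buf[r * COLS + c];
}
```

The buffer is what makes both the input and output streams dense; without it, either the reads or the writes would have to be strided, which is exactly the access-granularity question the project explores on the TCDM.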

The resulting stream accelerators are extremely lightweight yet highly performant.

Project

You first investigate common data-reshuffling operations and characterize their access patterns. You then implement these reshuffling operations in our iDMA engine, adding the changes required to access the Snitch TCDM at word (or even sub-word) granularity and to use it as a buffer. Finally, you evaluate your approach against accelerators that use a dedicated internal buffer. Depending on progress, this work can directly lead to a publication.

Character

  • 20% Getting familiar with iDMA and Snitch, evaluating reshuffle operations
  • 30% Implementing the reshuffle operation in the iDMA
  • 30% Integrating your accelerator in Snitch
  • 20% Evaluation


Prerequisites

  • Interest in memory systems
  • Experience with digital design in SystemVerilog as taught in VLSI I

References

[1] “A High-performance, Energy-efficient Modular DMA Engine Architecture” https://arxiv.org/abs/2305.05240