Implementing DSP Instructions in Banshee (1S)

Overview

  • Type: Semester Thesis

Revision as of 13:15, 15 September 2022


Status: Completed


  • 60% Rust programming
  • 20% Bare-metal C programming
  • 20% Evaluation


  • Experience or interest in learning Rust
  • Experience with C


In the quest for high-performance computing systems, few architectural models retain the flexibility of manycore systems. These systems integrate many small cores (hundreds or even thousands) that work independently to execute highly parallelizable algorithms. Exploring new architectures and writing software for manycore systems is very challenging and requires the support of good simulation tools at various levels of abstraction.

At ETH, we have developed Banshee, an LLVM-based binary translator capable of simulating our manycore architectures [1]. It is written in Rust, making it easy to extend, and thanks to its static binary translation, it reaches a performance of up to 72 GIPS, outperforming RTL simulation by several orders of magnitude.

One of the manycore systems developed at ETH is MemPool [2], [3]. It boasts 256 lightweight 32-bit Snitch cores [4]. They implement the RISC-V instruction set architecture (ISA), a modular and open ISA [5]. Despite its size, MemPool manages to give all 256 cores low-latency access to the shared L1 memory, with a zero-load latency of at most five cycles. Therefore, all cores can efficiently communicate, making MemPool suitable for various workloads and easy to program. To improve MemPool’s performance, we have recently added a custom ISA extension with specialized DSP instructions such as multiply-accumulate or SIMD instructions [6]. The instructions are a subset of the xpulp instruction set [7].
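
As a rough illustration of what such DSP instructions compute, the following Rust sketch models a multiply-accumulate and a two-lane 16-bit SIMD addition as pure functions on 32-bit register values. The function names `mac` and `simd_add_h` are hypothetical, and the wrap-around behavior is an assumption made for this sketch; the authoritative instruction definitions are those in [6] and [7]. It is not Banshee's actual implementation.

```rust
/// Hypothetical model of a 32-bit multiply-accumulate: rd <- rd + rs1 * rs2.
/// Assumption: results wrap around at the 32-bit register width.
fn mac(rd: u32, rs1: u32, rs2: u32) -> u32 {
    rd.wrapping_add(rs1.wrapping_mul(rs2))
}

/// Hypothetical model of a packed SIMD add on two independent 16-bit lanes.
/// Assumption: each lane wraps on overflow and carries do not cross lanes.
fn simd_add_h(rs1: u32, rs2: u32) -> u32 {
    let lo = (rs1 as u16).wrapping_add(rs2 as u16);
    let hi = ((rs1 >> 16) as u16).wrapping_add((rs2 >> 16) as u16);
    ((hi as u32) << 16) | (lo as u32)
}

fn main() {
    // One instruction replaces a separate multiply and add.
    assert_eq!(mac(10, 3, 4), 22);
    // The low lane overflows (0xFFFF + 1) without disturbing the high lane.
    assert_eq!(simd_add_h(0x0001_FFFF, 0x0002_0001), 0x0003_0000);
    println!("mac and simd_add_h sketches behave as expected");
}
```

The SIMD case shows why such instructions cannot be modeled as a plain 32-bit addition: the carry out of the lower lane must be discarded rather than propagated into the upper lane.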

The goal of this project is to add the xpulp instruction set extension to Banshee so that these instructions can also be simulated there. In a first step, the focus lies on the instructions currently supported by Snitch, followed by the rest of the xpulp set. While adding the instructions, we also want to evaluate their impact on signal-processing kernels and compare Banshee’s accuracy with the RTL model.

Project Description

  • Implement MemPool’s instructions in Banshee
    • Analyze the subsets necessary to support the full extension.
    • Add them to Banshee by emitting the corresponding LLVM IR or writing a high-level description of the functionality in Rust.
    • Verify your implementation with MemPool’s test infrastructure.
  • Add the complete xpulp set
    • Decide on the order of the most useful subsets and add as many as time allows.
    • Verify your implementation by extending MemPool’s test infrastructure.
  • Evaluate the performance gain those instructions bring and the accuracy of Banshee.
    • Use existing DSP kernels and/or implement your own to evaluate the benefit your instructions bring.
    • Compare the estimated speedup with the real performance gain observed in RTL.
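
The evaluation step above can be sketched as follows, assuming cycle counts for a kernel are available from both Banshee and the RTL simulation, with and without the new instructions. How these counts are obtained is not specified here; the helper names and all numbers below are placeholders for illustration, not measurements.

```rust
/// Speedup of the run using the new instructions over the baseline run.
fn speedup(baseline_cycles: u64, extended_cycles: u64) -> f64 {
    baseline_cycles as f64 / extended_cycles as f64
}

/// Relative error of an estimate with respect to a reference value.
fn relative_error(estimate: f64, reference: f64) -> f64 {
    (estimate - reference).abs() / reference.abs()
}

fn main() {
    // Placeholder cycle counts (hypothetical, not measured results).
    let banshee_speedup = speedup(1000, 400); // estimated by the simulator
    let rtl_speedup = speedup(1200, 600); // observed in RTL
    let error = relative_error(banshee_speedup, rtl_speedup);
    println!(
        "estimated {:.2}x, observed {:.2}x, relative error {:.1}%",
        banshee_speedup,
        rtl_speedup,
        error * 100.0
    );
}
```

Reporting the relative error per kernel, rather than only an average, makes it easier to see which instruction subsets the simulator models least accurately.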

Project Realization


Weekly meetings will be held between the student and the assistants. The exact time and location of these meetings will be determined within the first week of the project to fit the student’s and the assistants’ schedules. These meetings will be used to evaluate the status and progress of the project. Besides these regular meetings, additional meetings can be organized to address urgent issues.

Weekly Reports

Semester Thesis: The student is advised, but not required, to write a weekly report at the end of each week and to send it to their advisors. The weekly report should briefly summarize the work, progress, and findings of the week, plan the actions for the next week, and raise open questions. It is also an important means for the student to develop a goal-oriented approach to work.

Coding Guidelines

HDL Code Style

Adopting a consistent code style is one of the most important steps toward making your code easy to understand. If signals, processes, and modules are always named consistently, any inconsistency can be detected more easily. Moreover, if a design group shares the same naming and formatting conventions, all members immediately feel at home with each other’s code. At IIS, we use lowRISC’s style guide for SystemVerilog HDL:

Software Code Style

We generally suggest that you use style guides or code formatters provided by the language’s developers or community. For example, we recommend LLVM’s or Google’s code styles with clang-format for C/C++, PEP-8 and pylint for Python, and the official style guide with rustfmt for Rust.

Version Control

Even in the context of a student project, keeping a precise history of changes is essential to a maintainable codebase. You may also need to collaborate with others, adopt their changes to existing code, or work on different versions of your code concurrently. For all of these purposes, we heavily use Git as a version control system at IIS. If you have no previous experience with Git, we strongly advise you to familiarize yourself with the basic Git workflow before you start your project.


Documentation is an important and often overlooked aspect of engineering. A final report has to be completed within this project.

English is the de facto common language of engineering. Therefore, the final report should preferably be written in English.

Any word processing software may be used to write the report. Nevertheless, the IIS staff strongly encourages the use of LaTeX together with Inkscape or any other vector drawing software (for block diagrams).

If you write the report in LaTeX, we offer an instructive, ready-to-use template, which can be forked from the Git repository at

Final Report

The final report has to be presented at the end of the project. A digital copy needs to be handed in and remains the property of IIS. Note that this task description is part of your report and has to be attached to your final report.


There will be a presentation (15 min presentation and 5 min Q&A) at the end of this project in order to present your results to a wider audience. The exact date will be determined towards the end of the work.


In order to complete the project successfully, the following deliverables have to be submitted at the end of the work:

  • Final report incl. presentation slides
  • Source code and documentation for all developed software and hardware
  • Testsuites (software) and testbenches (hardware)
  • Synthesis and implementation scripts, results, and reports


[1] PULP Team, “Banshee GitHub.” 2021.

[2] M. Cavalcante, S. Riedel, A. Pullini, and L. Benini, “MemPool: A shared-L1 memory many-core cluster with a low-latency interconnect,” in 2021 Design, Automation, and Test in Europe Conference and Exhibition (DATE), 2021, pp. 701–706.

[3] S. Riedel and M. Cavalcante, “MemPool GitHub.” 2021.

[4] F. Zaruba, F. Schuiki, T. Hoefler, and L. Benini, “Snitch: A 10 kGE Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads,” IEEE Transactions on Computers, pp. 1–1, Feb. 2020.

[5] A. Waterman and K. Asanović, “The RISC-V Instruction Set Manual Volume I: Unprivileged ISA - Document Version 20191213,” RISC-V Foundation, 2019.

[6] S. Mazzola, “ISA extensions in the Snitch Processor for Signal Processing,” Apr. 2021.

[7] OpenHW Group, “cv32e40p User Manual.” 2021.