Revision as of 13:59, 16 August 2022

Some examples of (regular) stencil regions

Overview

Status: Available

Type: Semester or Bachelor Thesis
Professor: Prof. Dr. L. Benini
Supervisors:
- Paul Scheffler: paulsc@iis.ee.ethz.ch
- Luca Colagrande: colluca@iis.ee.ethz.ch

Introduction

Stencil codes are algorithms that iteratively process data on n-dimensional grids by accessing arrays in fixed, possibly irregular patterns relative to each grid point [1]. They are widespread in high-performance computing (HPC) and underlie various problems in physical simulation, economics, and image processing, among other domains.
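
As a minimal illustration of the definition above, the following plain C sketch applies a 1D three-point stencil iteratively. It is not taken from the Snitch evaluation: the function names, the coefficient parameters, and the choice to copy boundary points unchanged are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* One sweep of a 1D three-point stencil (hypothetical helper):
 *   out[i] = c_l * in[i-1] + c_c * in[i] + c_r * in[i+1]
 * for all interior points. The two boundary points are copied unchanged,
 * which is an assumption of this sketch, not a general rule. */
void stencil3_sweep(const double *in, double *out, size_t n,
                    double c_l, double c_c, double c_r) {
    assert(n >= 2);
    out[0] = in[0];
    out[n - 1] = in[n - 1];
    for (size_t i = 1; i + 1 < n; ++i) {
        out[i] = c_l * in[i - 1] + c_c * in[i] + c_r * in[i + 1];
    }
}

/* Run `steps` sweeps, ping-ponging between buffers `a` and `b`.
 * Returns a pointer to whichever buffer holds the final result. */
double *stencil3_iterate(double *a, double *b, size_t n, size_t steps,
                         double c_l, double c_c, double c_r) {
    for (size_t t = 0; t < steps; ++t) {
        stencil3_sweep(a, b, n, c_l, c_c, c_r);
        double *tmp = a; /* swap roles: the output becomes the next input */
        a = b;
        b = tmp;
    }
    return a;
}
```

Every interior point reads the same fixed pattern of neighbours (offsets -1, 0, +1), and the sweep is repeated over the whole grid; these two properties are what make the loop a stencil code.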

We recently evaluated the performance of a few stencil kernels on our Snitch cluster [2], which is designed for energy-efficient HPC. For this purpose, it includes a few instruction set extensions [3, 4] that enable floating-point unit (FPU) utilizations approaching 100%.

We found that a recent extension to Snitch, indirection stream semantic registers (ISSRs) [4], is highly effective at accelerating stencil codes. An ISSR loads a predefined sequence of elements from a high-bandwidth scratchpad memory directly into a processor register as an instruction uses it, enabling very high FPU utilization for arbitrary stencil shapes.
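
The access pattern described above can be modelled in plain C. The sketch below is only a software stand-in: it uses no Snitch intrinsics or ISSR configuration calls, and `stencil_shape_t`, `stream_read`, `stencil_point`, and `stencil2d_apply` are hypothetical names introduced for illustration. It assumes the stencil shape is given as a list of flattened index offsets with one coefficient each, so that an arbitrary shape reduces to a predefined sequence of indexed reads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of an arbitrary stencil shape: `len` taps, each
 * with a flattened index offset relative to the grid point and a weight. */
typedef struct {
    const ptrdiff_t *offs;
    const double *coeffs;
    size_t len;
} stencil_shape_t;

/* Software stand-in for one element of an indirection stream:
 * returns data[base + offs[k]]. On hardware, the load would be issued by
 * the stream unit rather than by an explicit instruction in the loop. */
static double stream_read(const double *data, size_t base,
                          const stencil_shape_t *s, size_t k) {
    return data[(ptrdiff_t)base + s->offs[k]];
}

/* Weighted sum over all taps of the shape at one grid point. */
double stencil_point(const double *grid, size_t base,
                     const stencil_shape_t *s) {
    double acc = 0.0;
    for (size_t k = 0; k < s->len; ++k) {
        acc += s->coeffs[k] * stream_read(grid, base, s, k);
    }
    return acc;
}

/* Apply the shape to every interior point of a row-major rows x cols grid.
 * `halo` is the border width that the offsets may reach into; border
 * entries of `out` are left untouched. */
void stencil2d_apply(const double *in, double *out, size_t rows, size_t cols,
                     size_t halo, const stencil_shape_t *s) {
    assert(rows > 2 * halo && cols > 2 * halo);
    for (size_t r = halo; r + halo < rows; ++r) {
        for (size_t c = halo; c + halo < cols; ++c) {
            out[r * cols + c] = stencil_point(in, r * cols + c, s);
        }
    }
}
```

In this model the inner loop contains one multiply-accumulate per tap plus the indexed load; the point of streaming the operands into a register is that the load and index arithmetic no longer occupy instruction slots, which is consistent with the high FPU utilization reported above. A five-point star on a grid of width W would use the offsets {-W, -1, 0, 1, W}; changing the shape only changes the offset and coefficient arrays, not the kernel.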

Project

In this project, you will extend our evaluation of stencil acceleration with ISSRs to a larger number of stencils from various sources. These may include:

Benchmark suites like PolyBench [5] or Rodinia [6].
Stencil benchmark collections like that of MeteoSwiss [7].
Example stencils for stencil code generators like StencilFlow [8] or AN5D [9].

The goal is to port a representative subset of stencil kernels to Snitch, accelerate them using ISSRs, and evaluate the performance benefits. Motivated students may also work towards a stencil code generator that produces Snitch code from high-level stencil descriptions, similar to the generators mentioned above.

Character

25% Literature / Architecture review
50% Bare-metal C and assembly programming
25% Performance evaluation

Prerequisites

Knowledge of bare-metal C and assembly programming
Strong interest in computer architecture
Preferred: Knowledge of or prior experience with RISC-V assembly and programming ISA extensions
Preferred: Prior experience with high-performance and/or numerical computing

References

[1] https://en.wikipedia.org/wiki/Iterative_Stencil_Loops

[2] https://ieeexplore.ieee.org/document/9216552, https://github.com/pulp-platform/snitch

[3] https://ieeexplore.ieee.org/document/9068465

[4] https://ieeexplore.ieee.org/document/9474230

[5] https://github.com/MatthiasJReisinger/PolyBenchC-4.2.1

[6] https://www.cs.virginia.edu/rodinia/doku.php

[7] https://github.com/MeteoSwiss-APN/stencil_benchmarks

[8] https://www.computer.org/csdl/proceedings-article/cgo/2021/09370315/1rSR4s1zlUA

[9] https://dl.acm.org/doi/abs/10.1145/3368826.3377904

Difference between revisions of "Accelerating Stencil Workloads on Snitch using ISSRs (1-2S/B)" - iis-projects
