Accelerating Stencil Workloads on Snitch using ISSRs (1-2S/B)
= Overview =
Revision as of 13:57, 16 August 2022
- Type: Semester or Bachelor Thesis
- Professor: Prof. Dr. L. Benini
Stencil codes are algorithms that iteratively process data on n-dimensional grids by accessing arrays in fixed, possibly irregular patterns relative to each grid point. They are widespread in high-performance computing (HPC) and underlie various problems in physical simulation, economics, and image processing, among other domains.
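To make the definition concrete, the following is a minimal sketch of one sweep of a 5-point Jacobi-style stencil in plain C. The function name `jacobi2d_step`, the row-major layout, the uniform coefficient of 0.2, and the copy-through boundary handling are illustrative assumptions and are not taken from the project itself.

```c
#include <stddef.h>

/* One sweep of a 5-point stencil over an n-by-n grid stored row-major.
 * Assumption for illustration: boundary points are copied unchanged and
 * each interior point becomes the average of itself and its four direct
 * neighbours (hence the coefficient 0.2). */
void jacobi2d_step(size_t n, const double *in, double *out)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            size_t c = i * n + j; /* linear index of grid point (i, j) */
            if (i == 0 || j == 0 || i == n - 1 || j == n - 1) {
                out[c] = in[c];   /* boundary: no full neighbourhood */
                continue;
            }
            /* Fixed access pattern relative to the centre point c:
             * west, east, north, south. */
            out[c] = 0.2 * (in[c] + in[c - 1] + in[c + 1]
                            + in[c - n] + in[c + n]);
        }
    }
}
```

An iterative solver would call such a function repeatedly, swapping `in` and `out` between sweeps.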
We recently evaluated the performance of a few stencil kernels on our Snitch cluster, which is designed for energy-efficient HPC. For this purpose, it includes a few instruction set extensions [3, 4] that enable floating-point unit (FPU) utilizations approaching 100%.
We found that a recent extension to Snitch, indirection stream semantic registers (ISSRs), is highly effective in accelerating stencil codes. ISSRs load a predefined sequence of elements from a high-bandwidth scratchpad memory directly into a processor register as it is being used by an instruction, enabling very high FPU utilizations for arbitrary stencil shapes.
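The access pattern described above can be modelled in plain C as an index-driven gather feeding a multiply-accumulate loop. This is only a software sketch of the idea: it does not use the Snitch ISSR programming interface, and the function name `stencil_point_indirect` and its parameters are hypothetical.

```c
#include <stddef.h>

/* Evaluate an arbitrary-shape stencil at one grid point.
 * grid    : base pointer of the (flattened) grid
 * centre  : linear index of the point being updated
 * offsets : predefined sequence of signed index offsets, one per tap
 * coeffs  : stencil coefficient for each tap
 * taps    : number of taps in the stencil
 * Assumption: every centre + offsets[k] is a valid index into grid. */
double stencil_point_indirect(const double *grid, ptrdiff_t centre,
                              const ptrdiff_t *offsets,
                              const double *coeffs, size_t taps)
{
    double acc = 0.0;
    for (size_t k = 0; k < taps; ++k) {
        /* In this C model the indexed load is an explicit instruction.
         * The text describes ISSRs as delivering such a predefined
         * element sequence directly into a register operand instead. */
        acc += coeffs[k] * grid[centre + offsets[k]];
    }
    return acc;
}
```

Because the stencil shape lives entirely in the `offsets` array, the same loop covers regular and irregular stencils alike, which is the property that makes index-driven streaming attractive for this workload class.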
In this project, you will extend our evaluation on accelerating stencils with ISSRs to a larger number of stencils from various sources. These may include:
- Benchmark suites like PolyBench or Rodinia.
- Stencil benchmark collections like that of MeteoSwiss.
- Example stencils for stencil code generators like StencilFlow or AN5D.
The goal is to port a representative subset of stencil kernels to Snitch, accelerate them using ISSRs, and evaluate the performance benefits. Motivated students may also work towards creating a stencil code generator that produces Snitch code from high-level stencil descriptions, similar to the generators mentioned above.
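As a rough illustration of what a first step in such a generator might look like, the sketch below flattens a high-level 2-D stencil description into linear index offsets for a row-major grid. The `stencil_tap_t` type and the `flatten_taps` function are hypothetical; they do not reflect the input formats of StencilFlow, AN5D, or any existing Snitch tooling.

```c
#include <stddef.h>

/* Hypothetical high-level description of one stencil tap:
 * a (row, column) displacement from the centre and its coefficient. */
typedef struct {
    int di;       /* row offset relative to the centre point */
    int dj;       /* column offset relative to the centre point */
    double coeff; /* weight applied to the element at this offset */
} stencil_tap_t;

/* Convert 2-D tap displacements into linear offsets for a row-major
 * grid with row_stride elements per row, and split out the weights.
 * offsets and coeffs must each have room for n_taps entries. */
void flatten_taps(const stencil_tap_t *taps, size_t n_taps,
                  ptrdiff_t row_stride,
                  ptrdiff_t *offsets, double *coeffs)
{
    for (size_t k = 0; k < n_taps; ++k) {
        offsets[k] = (ptrdiff_t)taps[k].di * row_stride + taps[k].dj;
        coeffs[k] = taps[k].coeff;
    }
}
```

For a grid with 8 elements per row, the four direct neighbours (north, west, east, south) flatten to the offsets -8, -1, 1, and 8.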
- 25% Literature / Architecture review
- 50% Bare-metal C and Assembly programming
- 25% Performance evaluation
- Knowledge of bare-metal C and assembly programming
- Strong interest in computer architecture
- Preferred: Knowledge of or prior experience with RISC-V Assembly and programming ISA extensions
- Preferred: Prior experience with high-performance and/or numerical computing