http://iis-projects.ee.ethz.ch/api.php?action=feedcontributions&user=Cykoenig&feedformat=atomiis-projects - User contributions [en]2024-03-29T11:32:30ZUser contributionsMediaWiki 1.28.0http://iis-projects.ee.ethz.ch/index.php?title=Benchmarking_a_RISC-V-based_Server_on_LLMs/Foundation_Models_(SA_or_MA)&diff=10257Benchmarking a RISC-V-based Server on LLMs/Foundation Models (SA or MA)2024-03-11T17:46:06Z<p>Cykoenig: </p>
<hr />
<div><!-- Benchmarking a RISC-V-based Server on LLMs/Foundation Models (SA or MA) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:2023]]<br />
[[Category:Master Thesis]]<br />
[[Category:Hot]]<br />
[[Category:Xiaywang]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Semester or Master Thesis (multiple students possible)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Xiaywang | Xiaying Wang]]: [mailto:xiaywang@iis.ee.ethz.ch xiaywang@iis.ee.ethz.ch]<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
Milk-V is a company committed to delivering high-quality RISC-V products to developers, enterprises, and consumers. It focuses on the development of both hardware and software ecosystems around the RISC-V architecture. Milk-V strongly supports open-source initiatives and aims to enrich the RISC-V product landscape, hoping that through its efforts and those of the community, the future of RISC-V products will be as vast and luminous as the Milky Way.<br />
<br />
The Milk-V Pioneer is a developer motherboard utilizing the SOPHON SG2042 [1], designed in the standard microATX (mATX) form factor. It offers PC-like interfaces and compatibility with PC industrial standards, aiming to provide a native RISC-V development environment and desktop experience. The Pioneer is targeted at RISC-V developers and hardware pioneers, offering a platform to engage with cutting-edge RISC-V technology. This motherboard serves as an excellent choice for those interested in exploring and developing within the RISC-V architecture.<br />
<br />
[[File:Pioneer.jpg|400px|]] [2]<br />
<br />
= Project description =<br />
<br />
In this project, you will run LLMs and foundation models, e.g., Whisper, on Milk-V servers and benchmark their performance.<br />
<br />
You will first select a framework to execute LLMs in C/C++, for instance llama.cpp [3]. You will then evaluate one or several models using this framework on the SG2042 CPU. Finally, you will identify potential limitations or improvements of the code related to the microarchitecture.<br />
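<br />
As a starting point for the evaluation step, a small microbenchmark can help separate microarchitectural effects from framework overhead. The sketch below is not taken from llama.cpp: it times a plain single-precision matrix-vector product, a kernel that typically dominates token generation, and reports GFLOP/s. The names <code>gemv_f32</code> and <code>bench_gemv_gflops</code> are hypothetical, and <code>clock()</code> measures CPU time of a single thread, so treat the numbers as a rough baseline only.<br />

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* y = W * x for a row-major (rows x cols) matrix. */
static void gemv_f32(const float *w, const float *x, float *y,
                     size_t rows, size_t cols) {
    for (size_t r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (size_t c = 0; c < cols; c++)
            acc += w[r * cols + c] * x[c];
        y[r] = acc;
    }
}

/* Hypothetical helper: runs `iters` GEMV calls and returns GFLOP/s.
 * Returns -1.0 on allocation failure and 0.0 if the run was too short
 * for the timer to resolve. */
static double bench_gemv_gflops(size_t rows, size_t cols, int iters) {
    assert(rows > 0 && cols > 0 && iters > 0);

    float *w = malloc(rows * cols * sizeof *w);
    float *x = malloc(cols * sizeof *x);
    float *y = malloc(rows * sizeof *y);
    if (!w || !x || !y) {
        free(w); free(x); free(y);
        return -1.0;
    }
    for (size_t i = 0; i < rows * cols; i++) w[i] = 1.0f;
    for (size_t i = 0; i < cols; i++) x[i] = 0.5f;

    volatile float sink = 0.0f; /* keeps the compiler from removing the loop */
    clock_t t0 = clock();
    for (int it = 0; it < iters; it++) {
        gemv_f32(w, x, y, rows, cols);
        sink += y[0];
    }
    clock_t t1 = clock();
    (void)sink;

    free(w); free(x); free(y);

    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return 0.0;
    /* One multiply and one add per matrix element. */
    return 2.0 * (double)rows * (double)cols * (double)iters / secs / 1e9;
}
```

Calling <code>bench_gemv_gflops(4096, 4096, 100)</code> from a small <code>main</code> and comparing builds with and without vector extensions enabled is one possible way to see how much of the SG2042 the compiler actually uses.<br />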
<br />
== Character ==<br />
<br />
* 20% Literature/architecture review<br />
* 60% Programming<br />
* 20% Evaluation<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in computer architecture<br />
* Experience in C programming<br />
* Preferred: Knowledge or prior experience with RISC-V<br />
<br />
= References =<br />
<br />
[1] https://github.com/milkv-pioneer/pioneer-files/blob/main/hardware/SG2042-TRM.pdf<br />
<br />
[2] https://milkv.io/docs/pioneer/<br />
<br />
[3] https://github.com/ggerganov/llama.cpp</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Benchmarking_a_RISC-V-based_Server_on_LLMs/Foundation_Models_(SA_or_MA)&diff=10256Benchmarking a RISC-V-based Server on LLMs/Foundation Models (SA or MA)2024-03-11T17:18:50Z<p>Cykoenig: </p>
<hr />
<div><!-- Benchmarking a RISC-V-based Server on LLMs/Foundation Models (SA or MA) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:2023]]<br />
[[Category:Master Thesis]]<br />
[[Category:Hot]]<br />
[[Category:Xiaywang]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Semester or Master Thesis (multiple students possible)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Xiaywang | Xiaying Wang]]: [mailto:xiaywang@iis.ee.ethz.ch xiaywang@iis.ee.ethz.ch]<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
Milk-V is a company committed to delivering high-quality RISC-V products to developers, enterprises, and consumers. It focuses on the development of both hardware and software ecosystems around the RISC-V architecture. Milk-V strongly supports open-source initiatives and aims to enrich the RISC-V product landscape, hoping that through its efforts and those of the community, the future of RISC-V products will be as vast and luminous as the Milky Way.<br />
<br />
The Milk-V Pioneer is a developer motherboard utilizing the SOPHON SG2042, designed in the standard microATX (mATX) form factor. It offers PC-like interfaces and compatibility with PC industrial standards, aiming to provide a native RISC-V development environment and desktop experience. The Pioneer is targeted at RISC-V developers and hardware pioneers, offering a platform to engage with cutting-edge RISC-V technology. This motherboard serves as an excellent choice for those interested in exploring and developing within the RISC-V architecture.<br />
<br />
[[File:Pioneer.jpg|700px|]] [1]<br />
<br />
= Project description =<br />
<br />
In this project, you will run LLMs and foundation models, e.g., Whisper, on Milk-V servers and benchmark their performance.<br />
<br />
== Character ==<br />
<br />
* 20% Literature/architecture review<br />
* 60% Programming<br />
* 20% Evaluation<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in computer architecture<br />
* Experience in C programming<br />
* Preferred: Knowledge or prior experience with RISC-V<br />
<br />
= References =<br />
<br />
[1] https://milkv.io/docs/pioneer/</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Pioneer.jpg&diff=10255File:Pioneer.jpg2024-03-11T17:12:45Z<p>Cykoenig: </p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Writing_a_Hero_runtime_for_EPAC_(1-3S/B)&diff=10178Writing a Hero runtime for EPAC (1-3S/B)2024-02-15T13:27:23Z<p>Cykoenig: </p>
<hr />
<div><!-- Writing a Hero runtime for EPAC (1-3S/B) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Reserved]]<br />
<br />
= Overview =<br />
<br />
== Status: Reserved ==<br />
<br />
* Type: Computer Architecture Bachelor / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
EPAC is one of the chips resulting from the European Processor Initiative (EPI) consortium, in which ETH Zurich is involved. The EPAC and EPAC1.5 chips have been successfully taped out in recent years and are available at IIS for testing and software development. This heterogeneous chip contains four RISC-V Avispado cores along with two STX accelerator tiles and one variable-precision floating-point core.<br />
<br />
[[File:Epac_backend.png|500px]]<br />
''Source: [https://www.european-processor-initiative.eu/accelerator/]''<br />
<br />
[https://www.openmp.org/specifications/ OpenMP] is an Application Programming Interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. OpenMP allows developers to write parallel programs that can run on a wide range of hardware, including multi-core processors and symmetric multiprocessing (SMP) systems. OpenMP uses pragma directives to exploit parallelism in the annotated code regions. These directives are embedded in the source code and guide the compiler in generating parallel executable code. In addition to compiler support, an OpenMP runtime library abstracts the details of thread creation and management from the programmer, simplifying the parallelization process.<br />
<br />
Starting with version 4.0, OpenMP introduced a target directive, which allows offloading computations to accelerators by explicitly marking the code regions that are suited to run on the accelerator.<br />
<br />
<syntaxhighlight lang="c"><br />
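#include <stdio.h><br />
#include <omp.h><br />
<br />
int main() {<br />
    printf("Hello from the host\n");<br />
<br />
    /* Offload the following block to accelerator device 1 */<br />
    #pragma omp target device(1)<br />
    {<br />
        printf("Hello from the accelerator main thread\n");<br />
        /* Fork a team of threads on the accelerator */<br />
        #pragma omp parallel<br />
        printf("Hello from the accelerator thread %i\n", omp_get_thread_num());<br />
    }<br />
<br />
    return 0;<br />
}</syntaxhighlight><br />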
<br />
<br />
The HERO stack, developed at IIS, provides an implementation of the OpenMP runtime that runs on several of our SoCs with maximal code reuse.<br />
<br />
= Project =<br />
<br />
This project aims to port the HERO stack to the EPAC1.5 chip in order to benchmark several OpenMP-based kernels and applications on this state-of-the-art heterogeneous SoC.<br />
<br />
To reach this goal, the student will have to become familiar with the EPAC1.5 chip architecture and its interface (the chip is programmable via an FPGA host setup). The student will then need to understand the HERO runtime and port it to EPAC. Finally, they will implement some benchmarks or applications using OpenMP targeting the accelerators inside the chip.<br />
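<br />
To give an idea of what such a benchmark might look like, here is a minimal sketch of an offloaded AXPY kernel with a simple timing wrapper. It only uses standard OpenMP constructs (<code>target</code>, <code>map</code>, <code>parallel for</code>); the function names <code>axpy_offload</code> and <code>time_axpy</code> are hypothetical and not part of HERO. It targets the default device and falls back to the host when no accelerator is available or when compiled without OpenMP, which is an assumption made to keep it portable.<br />

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Timestamp in seconds: OpenMP wall clock when available,
 * CPU time from clock() otherwise. */
static double now_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* y[i] += a * x[i]. The target region copies x to the device and
 * copies y in both directions; without an offload device (or without
 * OpenMP support) the loop simply runs on the host. */
static void axpy_offload(float a, const float *x, float *y, int n) {
    assert(n >= 0 && x != NULL && y != NULL);
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    {
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }
}

/* Hypothetical helper: average seconds per offloaded call over `iters` runs.
 * The measured time includes the data transfers implied by the map clauses. */
static double time_axpy(float a, const float *x, float *y, int n, int iters) {
    assert(iters > 0);
    double t0 = now_seconds();
    for (int it = 0; it < iters; it++)
        axpy_offload(a, x, y, n);
    return (now_seconds() - t0) / iters;
}
```

Because the timing includes the host-to-device copies, comparing it against a version that keeps the data resident on the accelerator is one way to estimate the offloading overhead of the runtime.<br />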
<br />
== Character ==<br />
<br />
* 20% Get familiar with the SoC architecture and its programming interface<br />
* 20% Study the HERO runtimes<br />
* 40% Propose and implement the runtime plugin (written in C) for EPAC<br />
* 20% Benchmark some OpenMP kernels on the STX accelerator<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in C, knowledge of C++<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
<br />
= References =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP]<br />
[https://arxiv.org/pdf/1712.06497.pdf Original HERO paper]<br />
[https://www.european-processor-initiative.eu/accelerator/ EPAC chip]</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Building_an_RTL_top_level_for_a_Mempool-based_Heterogeneous_SoC_(M/1-3S)&diff=10157Building an RTL top level for a Mempool-based Heterogeneous SoC (M/1-3S)2024-02-13T13:57:18Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Smazzola]]<br />
[[Category:Mbertuletti]]<br />
[[Category:Reserved]]<br />
<br />
= Overview =<br />
<br />
== Status: Reserved ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Smazzola | Sergio Mazzola]]: [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch] <br />
** [[:User:Mbertuletti | Marco Bertuletti]]: [mailto:mbertuletti@iis.ee.ethz.ch mbertuletti@iis.ee.ethz.ch] <br />
<br />
= Introduction =<br />
<br />
MemPool [1] is an example of the massively parallel SoCs built at IIS. It integrates 256 Snitch cores and 1 MiB of shared L1 memory. Despite its size, MemPool gives all cores low-latency access to the shared L1 memory, with a maximum latency of only five cycles when no contention occurs. This enables efficient communication among all cores, making MemPool easy to program and suitable for various workload domains.<br />
<br />
Today, MemPool is a standalone cluster of accelerators with distributed memory, but it is meant to be programmed by, and to serve, a host subsystem.<br />
<br />
Cheshire [2] is an open-source SoC from our group that features a 64-bit RISC-V core and various peripherals such as UART, SPI, I2C, VGA, and more. It is intended as a pluggable host system that can be reused in heterogeneous SoCs.<br />
<br />
The goal of this work is to build an RTL top level for a future SoC that combines a Cheshire host subsystem with a MemPool accelerator subsystem.<br />
<br />
= Project =<br />
<br />
This work will go through several of the steps required when proposing a new SoC. After a first architectural proposal, the student will build the top level of the future SoC in SystemVerilog and verify the communication between the host and accelerator subsystems.<br />
<br />
Then, the student will adapt the existing FPGA flow of Cheshire to test the Linux boot on this new platform.<br />
<br />
Finally, a Master thesis student will extend this work with one of the following points:<br />
<br />
* ''Extend the HERO runtime for MemPool and benchmark OpenMP [3] kernels on this platform''<br />
* ''Adapt previous synthesis and implementation flows to get an area estimate of the SoC in GF12''<br />
* Integrate a verified RISC-V-compliant IOMMU [4] to simplify shared-memory-based communication between MemPool and Cheshire<br />
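<br />
Once the top level exists, the communication between the host and accelerator subsystems can be exercised from host software in addition to RTL simulation. The following is a minimal sketch of such a check, assuming the host can obtain a pointer to a memory region shared with the accelerator (for example through a driver mapping; how that pointer is obtained is not covered here). The name <code>mem_check_u32</code> is hypothetical.<br />

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address-dependent test word, so that stuck or aliased address lines
 * produce mismatches instead of going unnoticed. */
static uint32_t test_pattern(size_t index, uint32_t seed) {
    return seed ^ (uint32_t)(index * 0x9E3779B9u);
}

/* Writes a pattern to `words` 32-bit words starting at `base`, reads it
 * back, and returns the number of mismatching words (0 means success).
 * `base` is volatile so every access really reaches the interconnect. */
static size_t mem_check_u32(volatile uint32_t *base, size_t words,
                            uint32_t seed) {
    assert(base != NULL || words == 0);

    /* Write the whole region before reading anything back. */
    for (size_t i = 0; i < words; i++)
        base[i] = test_pattern(i, seed);

    size_t errors = 0;
    for (size_t i = 0; i < words; i++) {
        if (base[i] != test_pattern(i, seed))
            errors++;
    }
    return errors;
}
```

On the real platform, <code>base</code> would point into the accelerator's memory as seen from the host; on a workstation the same function can be tried on an ordinary array to validate the test itself.<br />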
<br />
== Character ==<br />
<br />
* 40% Architecture pre-study, RTL top level<br />
* 20% Verification of the memory accesses across the chip<br />
* 40% FPGA implementation and booting Linux<br />
<br />
Master thesis:<br />
<br />
After completing the three points above, an estimated 30% of the thesis time will be dedicated to one of the stretch goals defined in the Project section.<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in SystemVerilog<br />
* Proficient in C<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[1] https://pulp-platform.org/docs/lugano2023/MemPool_05_06_23.pdf<br />
<br />
[2] https://github.com/pulp-platform/cheshire<br />
<br />
[3] https://www.openmp.org/specifications/<br />
<br />
[4] https://github.com/zero-day-labs/riscv-iommu/tree/main</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Building_an_RTL_top_level_for_a_Mempool-based_Heterogeneous_SoC_(M/1-3S)&diff=10156Building an RTL top level for a Mempool-based Heterogeneous SoC (M/1-3S)2024-02-13T09:40:12Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Smazzola]]<br />
[[Category:Mbertuletti]]<br />
[[Category:Reserved]]<br />
<br />
= Overview =<br />
<br />
== Status: Reserved ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Smazzola | Sergio Mazzola]]: [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch] <br />
** [[:User:Mbertuletti | Marco Bertuletti]]: [mailto:mbertuletti@iis.ee.ethz.ch mbertuletti@iis.ee.ethz.ch] <br />
<br />
= Introduction =<br />
<br />
MemPool [1] is an example of the massively parallel SoCs built at IIS. It integrates 256 Snitch cores and 1 MiB of shared L1 memory. Despite its size, MemPool gives all cores low-latency access to the shared L1 memory, with a maximum latency of only five cycles when no contention occurs. This enables efficient communication among all cores, making MemPool easy to program and suitable for various workload domains.<br />
<br />
Today, MemPool is a standalone cluster of accelerators with distributed memory, but it is meant to be programmed by, and to serve, a host subsystem.<br />
<br />
Cheshire [2] is an open-source SoC from our group that features a 64-bit RISC-V core and various peripherals such as UART, SPI, I2C, VGA, and more. It is intended as a pluggable host system that can be reused in heterogeneous SoCs.<br />
<br />
The goal of this work is to build an RTL top level for a future SoC that combines a Cheshire host subsystem with a MemPool accelerator subsystem.<br />
<br />
= Project =<br />
<br />
This work will go through several of the steps required when proposing a new SoC. After a first architectural proposal, the student will build the top level of the future SoC in SystemVerilog and verify the communication between the host and accelerator subsystems.<br />
<br />
Then, the student will adapt the existing FPGA flow of Cheshire to test the Linux boot on this new platform.<br />
<br />
Finally, a Master thesis student will extend this work with one of the following points:<br />
<br />
* Extend the HERO runtime for MemPool and benchmark OpenMP [3] kernels on this platform<br />
* Adapt previous synthesis and implementation flows to get an area estimate of the SoC in GF12<br />
* Integrate a verified RISC-V-compliant IOMMU [4] to simplify shared-memory-based communication between MemPool and Cheshire<br />
<br />
== Character ==<br />
<br />
* 40% Architecture pre-study, RTL top level<br />
* 20% Verification of the memory accesses across the chip<br />
* 40% FPGA implementation and booting Linux<br />
<br />
Master thesis:<br />
<br />
After completing the three points above, an estimated 30% of the thesis time will be dedicated to one of the stretch goals defined in the Project section.<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in SystemVerilog<br />
* Proficient in C<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[1] https://pulp-platform.org/docs/lugano2023/MemPool_05_06_23.pdf<br />
<br />
[2] https://github.com/pulp-platform/cheshire<br />
<br />
[3] https://www.openmp.org/specifications/<br />
<br />
[4] https://github.com/zero-day-labs/riscv-iommu/tree/main</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Building_an_RTL_top_level_for_a_Mempool-based_Heterogeneous_SoC_(M/1-3S)&diff=10155Building an RTL top level for a Mempool-based Heterogeneous SoC (M/1-3S)2024-02-13T09:40:02Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Smazzola]]<br />
[[Category:Mbertuletti]]<br />
[[Category:Reserved]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Smazzola | Sergio Mazzola]]: [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch] <br />
** [[:User:Mbertuletti | Marco Bertuletti]]: [mailto:mbertuletti@iis.ee.ethz.ch mbertuletti@iis.ee.ethz.ch] <br />
<br />
= Introduction =<br />
<br />
MemPool [1] is an example of the massively parallel SoCs built at IIS. It integrates 256 Snitch cores and 1 MiB of shared L1 memory. Despite its size, MemPool gives all cores low-latency access to the shared L1 memory, with a maximum latency of only five cycles when no contention occurs. This enables efficient communication among all cores, making MemPool easy to program and suitable for various workload domains.<br />
<br />
Today, MemPool is a standalone cluster of accelerators with distributed memory, but it is meant to be programmed by, and to serve, a host subsystem.<br />
<br />
Cheshire [2] is an open-source SoC from our group that features a 64-bit RISC-V core and various peripherals such as UART, SPI, I2C, VGA, and more. It is intended as a pluggable host system that can be reused in heterogeneous SoCs.<br />
<br />
The goal of this work is to build an RTL top level for a future SoC that combines a Cheshire host subsystem with a MemPool accelerator subsystem.<br />
<br />
= Project =<br />
<br />
This work will go through several of the steps required when proposing a new SoC. After a first architectural proposal, the student will build the top level of the future SoC in SystemVerilog and verify the communication between the host and accelerator subsystems.<br />
<br />
Then, the student will adapt the existing FPGA flow of Cheshire to test the Linux boot on this new platform.<br />
<br />
Finally, a Master thesis student will extend this work with one of the following points:<br />
<br />
* Extend the HERO runtime for MemPool and benchmark OpenMP [3] kernels on this platform<br />
* Adapt previous synthesis and implementation flows to get an area estimate of the SoC in GF12<br />
* Integrate a verified RISC-V-compliant IOMMU [4] to simplify shared-memory-based communication between MemPool and Cheshire<br />
<br />
== Character ==<br />
<br />
* 40% Architecture pre-study, RTL top level<br />
* 20% Verification of the memory accesses across the chip<br />
* 40% FPGA implementation and booting Linux<br />
<br />
Master thesis:<br />
<br />
After completing the three points above, an estimated 30% of the thesis time will be dedicated to one of the stretch goals defined in the Project section.<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in SystemVerilog<br />
* Proficient in C<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[1] https://pulp-platform.org/docs/lugano2023/MemPool_05_06_23.pdf<br />
<br />
[2] https://github.com/pulp-platform/cheshire<br />
<br />
[3] https://www.openmp.org/specifications/<br />
<br />
[4] https://github.com/zero-day-labs/riscv-iommu/tree/main</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Building_an_RTL_top_level_for_a_Mempool-based_Heterogeneous_SoC_(M/1-3S)&diff=10154Building an RTL top level for a Mempool-based Heterogeneous SoC (M/1-3S)2024-02-13T09:29:14Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Smazzola]]<br />
[[Category:Mbertuletti]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Smazzola | Sergio Mazzola]]: [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch] <br />
** [[:User:Mbertuletti | Marco Bertuletti]]: [mailto:mbertuletti@iis.ee.ethz.ch mbertuletti@iis.ee.ethz.ch] <br />
<br />
= Introduction =<br />
<br />
MemPool [1] is an example of the massively parallel SoCs built at IIS. It integrates 256 Snitch cores and 1 MiB of shared L1 memory. Despite its size, MemPool gives all cores low-latency access to the shared L1 memory, with a maximum latency of only five cycles when no contention occurs. This enables efficient communication among all cores, making MemPool easy to program and suitable for various workload domains.<br />
<br />
Today, MemPool is a standalone cluster of accelerators with distributed memory, but it is meant to be programmed by, and to serve, a host subsystem.<br />
<br />
Cheshire [2] is an open-source SoC from our group that features a 64-bit RISC-V core and various peripherals such as UART, SPI, I2C, VGA, and more. It is intended as a pluggable host system that can be reused in heterogeneous SoCs.<br />
<br />
The goal of this work is to build an RTL top level for a future SoC that combines a Cheshire host subsystem with a MemPool accelerator subsystem.<br />
<br />
= Project =<br />
<br />
This work will go through several of the steps required when proposing a new SoC. After a first architectural proposal, the student will build the top level of the future SoC in SystemVerilog and verify the communication between the host and accelerator subsystems.<br />
<br />
Then, the student will adapt the existing FPGA flow of Cheshire to test the Linux boot on this new platform.<br />
<br />
Finally, a Master thesis student will extend this work with one of the following points:<br />
<br />
* Extend the HERO runtime for MemPool and benchmark OpenMP [3] kernels on this platform<br />
* Adapt previous synthesis and implementation flows to get an area estimate of the SoC in GF12<br />
* Integrate a verified RISC-V-compliant IOMMU [4] to simplify shared-memory-based communication between MemPool and Cheshire<br />
<br />
== Character ==<br />
<br />
* 40% Architecture pre-study, RTL top level<br />
* 20% Verification of the memory accesses across the chip<br />
* 40% FPGA implementation and booting Linux<br />
<br />
Master thesis:<br />
<br />
After completing the three points above, an estimated 30% of the thesis time will be dedicated to one of the stretch goals defined in the Project section.<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in SystemVerilog<br />
* Proficient in C<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[1] https://pulp-platform.org/docs/lugano2023/MemPool_05_06_23.pdf<br />
<br />
[2] https://github.com/pulp-platform/cheshire<br />
<br />
[3] https://www.openmp.org/specifications/<br />
<br />
[4] https://github.com/zero-day-labs/riscv-iommu/tree/main</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Building_an_RTL_top_level_for_a_Mempool-based_Heterogeneous_SoC_(M/1-3S)&diff=10153Building an RTL top level for a Mempool-based Heterogeneous SoC (M/1-3S)2024-02-13T09:07:28Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Smazzola]]<br />
[[Category:Mbertuletti]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Smazzola | Sergio Mazzola]]: [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch] <br />
** [[:User:Mbertuletti | Marco Bertuletti]]: [mailto:mbertuletti@iis.ee.ethz.ch mbertuletti@iis.ee.ethz.ch] <br />
<br />
= Introduction =<br />
<br />
MemPool [1] is an example of the massively parallel SoCs built at IIS. It integrates 256 Snitch cores and 1 MiB of shared L1 memory. Despite its size, MemPool gives all cores low-latency access to the shared L1 memory, with a maximum latency of only five cycles when no contention occurs. This enables efficient communication among all cores, making MemPool easy to program and suitable for various workload domains.<br />
<br />
Today, MemPool is a standalone cluster of accelerators with distributed memory, but it is meant to be programmed by, and to serve, a host subsystem.<br />
<br />
Cheshire [2] is an open-source SoC from our group that features a 64-bit RISC-V core and various peripherals such as UART, SPI, I2C, VGA, and more. It is intended as a pluggable host system that can be reused in heterogeneous SoCs.<br />
<br />
The goal of this work is to build an RTL top level for a future SoC that combines a Cheshire host subsystem with a MemPool accelerator subsystem.<br />
<br />
= Project =<br />
<br />
This work will go through several of the steps required when proposing a new SoC. After a first architectural proposal, the student will build the top level of the future SoC in SystemVerilog and verify the communication between the host and accelerator subsystems.<br />
<br />
Then, the student will adapt the existing FPGA flow of Cheshire to test the Linux boot on this new platform.<br />
<br />
Finally, a Master thesis student will extend this work with one of the following points:<br />
<br />
* Extend the HERO runtime for MemPool and benchmark OpenMP [3] kernels on this platform<br />
* Adapt previous synthesis and implementation flows to get an area estimate of the SoC in GF22<br />
* Integrate a verified RISC-V-compliant IOMMU [4] to simplify shared-memory-based communication between MemPool and Cheshire<br />
<br />
== Character ==<br />
<br />
* 40% Architecture pre-study, RTL top level<br />
* 20% Verification of the memory accesses across the chip<br />
* 40% FPGA implementation and booting Linux<br />
<br />
Master thesis:<br />
<br />
After completing the three points above, an estimated 30% of the thesis time will be dedicated to one of the stretch goals defined in the Project section.<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in SystemVerilog<br />
* Proficient in C<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[1] https://pulp-platform.org/docs/lugano2023/MemPool_05_06_23.pdf<br />
<br />
[2] https://github.com/pulp-platform/cheshire<br />
<br />
[3] https://www.openmp.org/specifications/<br />
<br />
[4] https://github.com/zero-day-labs/riscv-iommu/tree/main</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=User:Cykoenig&diff=10149User:Cykoenig2024-02-12T10:49:51Z<p>Cykoenig: </p>
<hr />
<div>= Cyril Koenig =<br />
<br />
[[File:Cykoenig_face_pulp_team.png|thumb|200px|]]<br />
<br />
==Contact==<br />
<br />
* '''e-mail''': [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<!-- * '''phone''': +41 44 632 81 49 --><br />
* '''office''': ETZ J76.2<br />
<br />
==Research==<br />
<br />
[[High Performance SoCs]]<br />
<br />
[[Heterogeneous SoCs]]<br />
<br />
==Projects==<br />
<br />
===Available Projects===<br />
<DynamicPageList><br />
category = Available<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
===Projects In Progress===<br />
<DynamicPageList><br />
category = In progress<br />
category = Cykoenig<br />
</DynamicPageList><br />
<br />
<DynamicPageList><br />
category = Reserved<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
===Completed Projects===<br />
<DynamicPageList><br />
category = Completed<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
===Archived===<br />
<DynamicPageList><br />
category = Archived<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Building_an_RTL_top_level_for_a_Mempool-based_Heterogeneous_SoC_(M/1-3S)&diff=10148Building an RTL top level for a Mempool-based Heterogeneous SoC (M/1-3S)2024-02-12T10:49:01Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Smazzola]]<br />
[[Category:Mbertuletti]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Smazzola | Sergio Mazzola]]: [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch] <br />
** [[:User:Mbertuletti | Marco Bertuletti]]: [mailto:mbertuletti@iis.ee.ethz.ch mbertuletti@iis.ee.ethz.ch] <br />
<br />
= Introduction =<br />
<br />
MemPool[1] is an example of the massively parallel SoCs built at IIS. It integrates 256 Snitch cores and 1 MiB of shared L1 memory. Despite its size, MemPool gives all cores low-latency access to the shared L1 memory, with a maximum latency of only five cycles when no contention occurs. This enables efficient communication among all cores, making MemPool suitable for various workload domains and easy to program.<br />
<br />
Today, MemPool is a standalone cluster of accelerators with distributed memory, but it is intended to be programmed by, and to work on behalf of, a host subsystem.<br />
<br />
Cheshire[2] is an open-source SoC from our group that features a 64-bit RISC-V core and various peripherals such as UART, SPI, I2C, and VGA. It is intended as a pluggable host system that can be reused in heterogeneous SoCs.<br />
<br />
The goal of this work is to build an RTL top level for a future SoC that combines a Cheshire host subsystem with a MemPool accelerator subsystem.<br />
<br />
= Project =<br />
<br />
This work will cover several of the steps required when proposing a new SoC. After a first architectural proposal, the student will build the top level of the future SoC in SystemVerilog and verify the communication between the host and accelerator subsystems.<br />
<br />
Then, the student will adapt the existing FPGA flow of Cheshire to test the Linux boot on this new platform.<br />
<br />
Finally, a Master thesis student will extend this work with one of the following points:<br />
<br />
* Extend the HERO runtime for MemPool and benchmark OpenMP [3] kernels on this platform<br />
* Adapt previous synthesis and implementation flows to get an area estimation of the SoC in GF22<br />
<br />
== Character ==<br />
<br />
* 40% Architecture pre-study, RTL top level<br />
* 20% Verification of memory accesses across the chip<br />
* 40% FPGA implementation and booting Linux<br />
<br />
Master thesis:<br />
<br />
After completing the three points above, an estimated 30% of the thesis time will be dedicated to one of the stretch goals defined in the Project section.<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in SystemVerilog<br />
* Proficient in C<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[1] https://pulp-platform.org/docs/lugano2023/MemPool_05_06_23.pdf<br />
<br />
[2] https://github.com/pulp-platform/cheshire<br />
<br />
[3] https://www.openmp.org/specifications/</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Building_an_RTL_top_level_for_a_Mempool-based_Heterogeneous_SoC_(M/1-3S)&diff=10147Building an RTL top level for a Mempool-based Heterogeneous SoC (M/1-3S)2024-02-12T10:43:39Z<p>Cykoenig: Created page with "<!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --> Category:Digital Category:High Performance SoCs Cat..."</p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Smazzola]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Smazzola | Sergio Mazzola]]: [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch] <br />
<br />
= Introduction =<br />
<br />
MemPool[1] is an example of the massively parallel SoCs built at IIS. It integrates 256 Snitch cores and 1 MiB of shared L1 memory. Despite its size, MemPool gives all cores low-latency access to the shared L1 memory, with a maximum latency of only five cycles when no contention occurs. This enables efficient communication among all cores, making MemPool suitable for various workload domains and easy to program.<br />
<br />
Today, MemPool is a standalone cluster of accelerators with distributed memory, but it is intended to be programmed by, and to work on behalf of, a host subsystem.<br />
<br />
Cheshire[2] is an open-source SoC from our group that features a 64-bit RISC-V core and various peripherals such as UART, SPI, I2C, and VGA. It is intended as a pluggable host system that can be reused in heterogeneous SoCs.<br />
<br />
The goal of this work is to build an RTL top level for a future SoC that combines a Cheshire host subsystem with a MemPool accelerator subsystem.<br />
<br />
= Project =<br />
<br />
This work will cover several of the steps required when proposing a new SoC. After a first architectural proposal, the student will build the top level of the future SoC in SystemVerilog and verify the communication between the host and accelerator subsystems.<br />
<br />
Then, the student will adapt the existing FPGA flow of Cheshire to test the Linux boot on this new platform.<br />
<br />
Finally, a Master thesis student will extend this work with one of the following points:<br />
<br />
* Extend the HERO runtime for MemPool and benchmark OpenMP [3] kernels on this platform<br />
* Adapt previous synthesis and implementation flows to get an area estimation of the SoC in GF22<br />
<br />
== Character ==<br />
<br />
* 40% Architecture pre-study, RTL top level<br />
* 20% Verification of memory accesses across the chip<br />
* 40% FPGA implementation and booting Linux<br />
<br />
Master thesis:<br />
<br />
After completing the three points above, an estimated 30% of the thesis time will be dedicated to one of the stretch goals defined in the Project section.<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in SystemVerilog<br />
* Proficient in C<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[1] https://pulp-platform.org/docs/lugano2023/MemPool_05_06_23.pdf<br />
<br />
[2] https://github.com/pulp-platform/cheshire<br />
<br />
[3] https://www.openmp.org/specifications/</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Writing_a_Hero_runtime_for_EPAC_(1-3S/B)&diff=10146Writing a Hero runtime for EPAC (1-3S/B)2024-02-12T10:19:31Z<p>Cykoenig: </p>
<hr />
<div><!-- Writing a Hero runtime for EPAC (1-3S/B) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Bachelor / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
EPAC is one of the chips resulting from the European Processor Initiative (EPI) consortium, in which ETH Zurich is involved. The EPAC and EPAC1.5 chips have been successfully taped out in the past years and are available at IIS for testing and software development. This heterogeneous chip contains four RISC-V Avispado cores along with two STX accelerator tiles and one variable-precision floating-point core.<br />
<br />
[[File:Epac_backend.png|500px]]<br />
''Source: [https://www.european-processor-initiative.eu/accelerator/]''<br />
<br />
[https://www.openmp.org/specifications/ OpenMP] is an Application Programming Interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. OpenMP allows developers to write parallel programs that can run on a wide range of hardware, including multi-core processors and symmetric multiprocessing (SMP) systems. OpenMP uses pragma directives to exploit parallelism in the annotated code regions. These directives are embedded in the source code and guide the compiler in generating parallel executable code. In addition to compiler support, an OpenMP runtime library abstracts the details of thread creation and management from the programmer, simplifying the parallelization process.<br />
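<br />
As an illustration of the pragma-based model described above, the following minimal sketch sums an array with a work-sharing loop. The function name <code>parallel_sum</code> is hypothetical and is not part of HERO or EPAC. The sketch assumes a compiler with OpenMP support (for example <code>-fopenmp</code> with GCC or Clang); if the pragma is ignored, the loop simply runs sequentially.<br />

```c
#include <stddef.h>

/* Hypothetical helper: sum n doubles.
 * With OpenMP enabled, the loop iterations are split across threads. */
double parallel_sum(const double *x, size_t n) {
    double total = 0.0;
    /* Each thread accumulates a private partial sum; the partial sums
     * are combined into total when the loop ends. */
    #pragma omp parallel for reduction(+ : total)
    for (size_t i = 0; i < n; i++) {
        total += x[i];
    }
    return total;
}
```

The <code>reduction</code> clause is what lets the threads update <code>total</code> without explicit locking.<br />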
<br />
Starting from version 4.0, OpenMP introduced a target directive, which allows offloading computations to accelerators by explicitly specifying the code regions amenable to execute on the accelerator.<br />
<br />
<syntaxhighlight lang="c"><br />
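#include <stdio.h><br />
#include <omp.h><br />
<br />
int main(void) {<br />
    // Runs on the host<br />
    printf("Hello from the host\n");<br />
<br />
    // Offload the following block to device number 1<br />
    #pragma omp target device(1)<br />
    {<br />
        printf("Hello from the accelerator main thread\n");<br />
        // Fork a team of threads on the accelerator<br />
        #pragma omp parallel<br />
        printf("Hello from the accelerator thread %i\n", omp_get_thread_num());<br />
    }<br />
<br />
    return 0;<br />
}</syntaxhighlight><br />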
<br />
<br />
The HERO stack, developed at IIS, provides an implementation of the OpenMP runtime that can run on several of our SoCs with maximal code reuse.<br />
<br />
= Project =<br />
<br />
This project aims to port the HERO stack to the EPAC1.5 chip in order to benchmark multiple OpenMP-based kernels and applications on this state-of-the-art heterogeneous SoC.<br />
<br />
To reach this goal, the student will first become familiar with the EPAC1.5 chip architecture and its interface (the chip is programmable via an FPGA host setup). The student will then need to understand the HERO runtime and port it to EPAC. Finally, they will implement some benchmarks or applications using OpenMP that target the accelerators inside the chip.<br />
<br />
== Character ==<br />
<br />
* 20% Get familiar with the SoC architecture and its programming interface<br />
* 20% Study the HERO runtimes<br />
* 40% Propose and implement the runtime plugin (written in C) for EPAC<br />
* 20% Benchmark some OpenMP kernels on the STX accelerator<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in C, knowledge of C++<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
<br />
= References =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP]<br />
[https://arxiv.org/pdf/1712.06497.pdf Original HERO paper]<br />
[https://www.european-processor-initiative.eu/accelerator/ EPAC chip]</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Extending_the_HERO_RISC-V_HPC_stack_to_support_multiple_devices_on_heterogeneous_SoCs_(M/1-3S)&diff=10145Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S)2024-02-12T10:19:24Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP] is an Application Programming Interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. OpenMP allows developers to write parallel programs that can run on a wide range of hardware, including multi-core processors and symmetric multiprocessing (SMP) systems. OpenMP uses pragma directives to exploit parallelism in the annotated code regions. These directives are embedded in the source code and guide the compiler in generating parallel executable code. In addition to compiler support, an OpenMP runtime library abstracts the details of thread creation and management from the programmer, simplifying the parallelization process.<br />
<br />
Starting from version 4.0, OpenMP introduced a target directive, which allows offloading computations to accelerators by explicitly specifying the code regions amenable to execute on the accelerator.<br />
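<br />
A minimal sketch of such an offloaded region is shown below, assuming a compiler with OpenMP offloading support. The function name <code>scale_on_device</code> is hypothetical and not part of HERO. The array sections in the <code>map</code> clauses tell the runtime which data to copy to the device before the region and which to copy back afterwards. When no device is available, or when the pragmas are ignored, the loop typically runs on the host instead.<br />

```c
#include <stddef.h>

/* Hypothetical helper: out[i] = k * in[i], computed in a target region.
 * in[0:n] is copied to the device, out[0:n] is copied back to the host. */
void scale_on_device(const double *in, double *out, size_t n, double k) {
    #pragma omp target map(to : in[0:n]) map(from : out[0:n])
    {
        /* Inside the target region, share the loop among device threads. */
        #pragma omp parallel for
        for (size_t i = 0; i < n; i++) {
            out[i] = k * in[i];
        }
    }
}
```

Mapping explicit array sections, rather than bare pointers, makes the amount of data transferred for each offload visible in the source.<br />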
<br />
The HERO stack, developed at IIS, provides an implementation of the OpenMP runtime that can run on several of our SoCs with maximal code reuse.<br />
<br />
HERO is today capable of compiling separate applications for different devices, but it does not yet support offloading to different devices within the same application.<br />
<br />
= Project =<br />
<br />
The goal of the project is to extend the HERO software stack and toolchain to allow offloading from one host running Linux to multiple devices simultaneously.<br />
<br />
The proposed extension of HERO will be tested on our latest (emulated) heterogeneous platform.<br />
<br />
[[File:Hero_carfield.png|500px|]]<br />
<br />
The project aims to enable the following code:<br />
<br />
<syntaxhighlight lang=C><br />
<br />
#include <stdio.h><br />
<br />
// MEMCPY_SAFETY and MEMCPY_CLUSTER are assumed to be device numbers defined elsewhere<br />
int main(int argc, char *argv[]) {<br />
<br />
    printf("I am CVA6\n");<br />
    double *a, *b, *c, *d, *e, *f;<br />
<br />
    #pragma omp target device(MEMCPY_SAFETY) map(to : a, b) map(from : c)<br />
    {<br />
        // Process your data on Safety<br />
    }<br />
<br />
    #pragma omp target device(MEMCPY_CLUSTER) map(to : d, e) map(from : f)<br />
    {<br />
        // Process your data on Cluster<br />
    }<br />
<br />
    printf("Back on CVA6\n");<br />
<br />
    return 0;<br />
}<br />
<br />
</syntaxhighlight><br />
<br />
This will rely on several pre-existing HERO libraries and drivers:<br />
<br />
[[File:Hero_heterogeneous.png|500px|]]<br />
<br />
== Character ==<br />
<br />
* 20% Study the LLVM project and HERO extensions<br />
* 20% Get familiar with the SoC architecture and its FPGA implementation<br />
* 60% Propose and implement the runtime library extensions (written in C) to communicate with multiple devices<br />
* (Optional) Consider compiler support for asynchronous and simultaneous offloading<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in C, knowledge of C++<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP]<br />
[https://arxiv.org/pdf/1712.06497.pdf Original HERO paper]</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Extending_the_HERO_RISC-V_HPC_stack_to_support_multiple_devices_on_heterogeneous_SoCs_(M/1-3S)&diff=10144Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S)2024-02-12T10:18:34Z<p>Cykoenig: /* Project */</p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP] is an Application Programming Interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. OpenMP allows developers to write parallel programs that can run on a wide range of hardware, including multi-core processors and symmetric multiprocessing (SMP) systems. OpenMP uses pragma directives to exploit parallelism in the annotated code regions. These directives are embedded in the source code and guide the compiler in generating parallel executable code. In addition to compiler support, an OpenMP runtime library abstracts the details of thread creation and management from the programmer, simplifying the parallelization process.<br />
<br />
Starting from version 4.0, OpenMP introduced a target directive, which allows offloading computations to accelerators by explicitly specifying the code regions amenable to execute on the accelerator.<br />
<br />
The HERO stack, developed at IIS, provides an implementation of the OpenMP runtime that can run on several of our SoCs with maximal code reuse.<br />
<br />
HERO is today capable of compiling separate applications for different devices, but it does not yet support offloading to different devices within the same application.<br />
<br />
= Project =<br />
<br />
The goal of the project is to extend the HERO software stack and toolchain to allow offloading from one host running Linux to multiple devices simultaneously.<br />
<br />
The proposed extension of HERO will be tested on our latest (emulated) heterogeneous platform.<br />
<br />
[[File:Hero_carfield.png|500px|]]<br />
<br />
The project aims to enable the following code:<br />
<br />
<syntaxhighlight lang=C><br />
<br />
#include <stdio.h><br />
<br />
// MEMCPY_SAFETY and MEMCPY_CLUSTER are assumed to be device numbers defined elsewhere<br />
int main(int argc, char *argv[]) {<br />
<br />
    printf("I am CVA6\n");<br />
    double *a, *b, *c, *d, *e, *f;<br />
<br />
    #pragma omp target device(MEMCPY_SAFETY) map(to : a, b) map(from : c)<br />
    {<br />
        // Process your data on Safety<br />
    }<br />
<br />
    #pragma omp target device(MEMCPY_CLUSTER) map(to : d, e) map(from : f)<br />
    {<br />
        // Process your data on Cluster<br />
    }<br />
<br />
    printf("Back on CVA6\n");<br />
<br />
    return 0;<br />
}<br />
<br />
</syntaxhighlight><br />
<br />
This will rely on several pre-existing HERO libraries and drivers:<br />
<br />
[[File:Hero_heterogeneous.png|500px|]]<br />
<br />
== Character ==<br />
<br />
* 20% Study the LLVM project and HERO extensions<br />
* 20% Get familiar with the SoC architecture and its FPGA implementation<br />
* 60% Propose and implement the runtime library extensions (written in C) to communicate with multiple devices<br />
* (Optional) Consider compiler support for asynchronous and simultaneous offloading<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in C, knowledge of C++<br />
* Willing to learn about Linux and Linux drivers<br />
<br />
= References =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP]<br />
[https://arxiv.org/pdf/1712.06497.pdf Original HERO paper]</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Epac_backend.png&diff=10143File:Epac backend.png2024-02-12T10:07:51Z<p>Cykoenig: Cykoenig uploaded a new version of File:Epac backend.png</p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Epac_backend.png&diff=10142File:Epac backend.png2024-02-12T10:05:42Z<p>Cykoenig: </p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Extending_the_HERO_RISC-V_HPC_stack_to_support_multiple_devices_on_heterogeneous_SoCs_(M/1-3S)&diff=10141Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S)2024-02-02T18:01:41Z<p>Cykoenig: </p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP] is an Application Programming Interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. OpenMP allows developers to write parallel programs that can run on a wide range of hardware, including multi-core processors and symmetric multiprocessing (SMP) systems. OpenMP uses pragma directives to exploit parallelism in the annotated code regions. These directives are embedded in the source code and guide the compiler in generating parallel executable code. In addition to compiler support, an OpenMP runtime library abstracts the details of thread creation and management from the programmer, simplifying the parallelization process.<br />
<br />
Starting from version 4.0, OpenMP introduced a target directive, which allows offloading computations to accelerators by explicitly specifying the code regions amenable to execute on the accelerator.<br />
<br />
The HERO stack, developed at IIS, provides an implementation of the OpenMP runtime that can run on several of our SoCs with maximal code reuse.<br />
<br />
HERO is today capable of compiling separate applications for different devices, but it does not yet support offloading to different devices within the same application.<br />
<br />
= Project =<br />
<br />
The goal of the project is to extend the HERO software stack and toolchain to allow offloading from one host running Linux to multiple devices simultaneously.<br />
<br />
The proposed extension of HERO will be tested on our latest (emulated) heterogeneous platform.<br />
<br />
[[File:Hero_carfield.png|500px|]]<br />
<br />
The project aims to enable the following code:<br />
<br />
<syntaxhighlight lang=C><br />
<br />
#include <stdio.h><br />
<br />
// MEMCPY_SAFETY and MEMCPY_CLUSTER are assumed to be device numbers defined elsewhere<br />
int main(int argc, char *argv[]) {<br />
<br />
    printf("I am CVA6\n");<br />
    double *a, *b, *c, *d, *e, *f;<br />
<br />
    #pragma omp target device(MEMCPY_SAFETY) map(to : a, b) map(from : c)<br />
    {<br />
        // Process your data on Safety<br />
    }<br />
<br />
    #pragma omp target device(MEMCPY_CLUSTER) map(to : d, e) map(from : f)<br />
    {<br />
        // Process your data on Cluster<br />
    }<br />
<br />
    printf("Back on CVA6\n");<br />
<br />
    return 0;<br />
}<br />
<br />
</syntaxhighlight><br />
<br />
This will rely on several pre-existing HERO libraries and drivers:<br />
<br />
[[File:Hero_heterogeneous.png|500px|]]<br />
<br />
== Character ==<br />
<br />
* 20% Study the LLVM project and HERO extensions<br />
* 20% Get familiar with the SoC architecture and its FPGA implementation<br />
* 60% Propose and implement the runtime library extensions (written in C) to communicate with multiple devices<br />
* (Optional) Consider compiler support for asynchronous and simultaneous offloading<br />
<br />
== Prerequisites ==<br />
<br />
* Good knowledge of computer architectures<br />
* Proficient in C, knowledge of C++<br />
* Willing to learn about Linux and Linux drivers <br />
<br />
= References =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP]<br />
[https://arxiv.org/pdf/1712.06497.pdf Original HERO paper]</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Hero_heterogeneous2.png&diff=10140File:Hero heterogeneous2.png2024-02-02T17:51:51Z<p>Cykoenig: </p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Hero_heterogeneous.png&diff=10139File:Hero heterogeneous.png2024-02-02T17:48:57Z<p>Cykoenig: Cykoenig uploaded a new version of File:Hero heterogeneous.png</p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Hero_heterogeneous.png&diff=10138File:Hero heterogeneous.png2024-02-02T17:47:15Z<p>Cykoenig: </p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Hero_carfield.png&diff=10137File:Hero carfield.png2024-02-02T17:36:39Z<p>Cykoenig: </p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Extending_the_HERO_RISC-V_HPC_stack_to_support_multiple_devices_on_heterogeneous_SoCs_(M/1-3S)&diff=10136Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S)2024-02-02T17:36:14Z<p>Cykoenig: /* Project */</p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP] is an Application Programming Interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. OpenMP allows developers to write parallel programs that can run on a wide range of hardware, including multi-core processors and symmetric multiprocessing (SMP) systems. OpenMP uses pragma directives to exploit parallelism in the annotated code regions. These directives are embedded in the source code and guide the compiler in generating parallel executable code. In addition to compiler support, an OpenMP runtime library abstracts the details of thread creation and management from the programmer, simplifying the parallelization process.<br />
<br />
Starting from version 4.0, OpenMP introduced a target directive, which allows offloading computations to accelerators by explicitly specifying the code regions amenable to execute on the accelerator.<br />
<br />
The HERO stack, developed at IIS, provides an implementation of the OpenMP runtime that can run on several of our SoCs with maximal code reuse.<br />
<br />
HERO is today capable of compiling separate applications for different devices, but it does not yet support offloading to different devices within the same application.<br />
<br />
= Project =<br />
<br />
The goal of the project is to extend the HERO software stack and toolchain to allow offloading from one host running Linux to multiple devices simultaneously.<br />
<br />
The proposed extension of HERO will be tested on our latest (emulated) heterogeneous platform.<br />
<br />
== Character ==<br />
<br />
== Prerequisites ==<br />
<br />
= References =</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Extending_the_HERO_RISC-V_HPC_stack_to_support_multiple_devices_on_heterogeneous_SoCs_(M/1-3S)&diff=10135Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S)2024-02-02T17:32:17Z<p>Cykoenig: /* Introduction */</p>
<hr />
<div><!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
[https://www.openmp.org/specifications/ OpenMP] is an Application Programming Interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. OpenMP allows developers to write parallel programs that can run on a wide range of hardware, including multi-core processors and symmetric multiprocessing (SMP) systems. OpenMP uses pragma directives to exploit parallelism in the annotated code regions. These directives are embedded in the source code and guide the compiler in generating parallel executable code. In addition to compiler support, an OpenMP runtime library abstracts the details of thread creation and management from the programmer, simplifying the parallelization process.<br />
<br />
Starting from version 4.0, OpenMP introduced a target directive, which allows offloading computations to accelerators by explicitly specifying the code regions amenable to execute on the accelerator.<br />
<br />
The HERO stack, developed at IIS, provides an implementation of the OpenMP runtime that can run on several of our SoCs with maximal code reuse.<br />
<br />
HERO is today capable of compiling separate applications for different devices, but it does not yet support offloading to different devices within the same application.<br />
<br />
= Project =<br />
<br />
<br />
== Character ==<br />
<br />
== Prerequisites ==<br />
<br />
= References =</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Writing_a_Hero_runtime_for_EPAC_(1-3S/B)&diff=10132Writing a Hero runtime for EPAC (1-3S/B)2024-01-25T09:16:07Z<p>Cykoenig: </p>
<hr />
<div><!-- Writing a Hero runtime for EPAC (1-3S/B) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Bachelor / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
= Project =<br />
<br />
<br />
== Character ==<br />
<br />
== Prerequisites ==<br />
<br />
= References =</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Extending_the_HERO_RISC-V_HPC_stack_to_support_multiple_devices_on_heterogeneous_SoCs_(M/1-3S)&diff=10131Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S)2024-01-25T09:15:46Z<p>Cykoenig: Created page with "<!-- Creating Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --> Category:Digital Category:High Performance SoCs Cat..."</p>
<hr />
<div><!-- Extending the HERO RISC-V HPC stack to support multiple devices on heterogeneous SoCs (M/1-3S) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Master / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
= Project =<br />
<br />
<br />
== Character ==<br />
<br />
== Prerequisites ==<br />
<br />
= References =</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=User:Cykoenig&diff=10130User:Cykoenig2024-01-25T09:10:24Z<p>Cykoenig: </p>
<hr />
<div>= Cyril Koenig =<br />
<br />
[[File:Cykoenig_face_pulp_team.png|thumb|200px|]]<br />
<br />
==Contact==<br />
<br />
* '''e-mail''': [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<!-- * '''phone''': +41 44 632 81 49 --><br />
* '''office''': ETZ J76.2<br />
<br />
==Research==<br />
<br />
[[High Performance SoCs]]<br />
<br />
[[Heterogeneous SoCs]]<br />
<br />
==Projects==<br />
<br />
===Available Projects===<br />
<DynamicPageList><br />
category = Available<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
===Projects In Progress===<br />
<DynamicPageList><br />
category = In progress<br />
category = Cykoenig<br />
</DynamicPageList><br />
<br />
<DynamicPageList><br />
category = Reserved<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Completed Projects'''<br />
<DynamicPageList><br />
category = Completed<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Archived'''<br />
<DynamicPageList><br />
category = Archived<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Writing_a_Hero_runtime_for_EPAC_(1-3S/B)&diff=10129Writing a Hero runtime for EPAC (1-3S/B)2024-01-25T09:10:08Z<p>Cykoenig: </p>
<hr />
<div><!-- Writing a Hero runtime for EPAC (1-3S/B) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Bachelor / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
= Project =<br />
<br />
<br />
== Character ==<br />
<br />
== Prerequisites ==<br />
<br />
= References =</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Writing_a_Hero_runtime_for_EPAC_(1-3S/B)&diff=10128Writing a Hero runtime for EPAC (1-3S/B)2024-01-25T09:09:44Z<p>Cykoenig: /* Status: Available */</p>
<hr />
<div><!-- Writing a Hero runtime for EPAC (1-3S/B) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Cykoenig]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Bachelor / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
= Project =<br />
<br />
<br />
== Character ==<br />
<br />
== Prerequisites ==<br />
<br />
= References =</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Writing_a_Hero_runtime_for_EPAC_(1-3S/B)&diff=10127Writing a Hero runtime for EPAC (1-3S/B)2024-01-25T09:08:19Z<p>Cykoenig: Created page with "<!-- Creating Creating A Technology-independent USB1.0 Host Implementation Targetting ASICSs (1-3S/B) --> Category:Digital Category:High Performance SoCs Category:H..."</p>
<hr />
<div><!-- Writing a Hero runtime for EPAC (1-3S/B) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Computer Architecture]]<br />
[[Category:2024]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Cykoenig]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Computer Architecture Bachelor / Semester Thesis<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
= Introduction =<br />
<br />
= Project =<br />
<br />
<br />
== Character ==<br />
<br />
== Prerequisites ==<br />
<br />
= References =</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=User:Cykoenig&diff=10126User:Cykoenig2024-01-25T09:01:52Z<p>Cykoenig: /* Research */</p>
<hr />
<div>= Cyril Koenig =<br />
<br />
[[File:Cykoenig_face_pulp_team.png|thumb|200px|]]<br />
<br />
==Contact==<br />
<br />
* '''e-mail''': [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<!-- * '''phone''': +41 44 632 81 49 --><br />
* '''office''': ETZ J76.2<br />
<br />
==Research==<br />
<br />
[[High_Performance_SoCs]]<br />
<br />
[[Heterogeneous_SoCs]]<br />
<br />
==Projects==<br />
<br />
===Available Projects===<br />
<DynamicPageList><br />
category = Available<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
===Projects In Progress===<br />
<DynamicPageList><br />
category = In progress<br />
category = Cykoenig<br />
</DynamicPageList><br />
<br />
<DynamicPageList><br />
category = Reserved<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Completed Projects'''<br />
<DynamicPageList><br />
category = Completed<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Archived'''<br />
<DynamicPageList><br />
category = Archived<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=User:Cykoenig&diff=10125User:Cykoenig2024-01-25T09:01:43Z<p>Cykoenig: /* Cyril Koenig */</p>
<hr />
<div>= Cyril Koenig =<br />
<br />
[[File:Cykoenig_face_pulp_team.png|thumb|200px|]]<br />
<br />
==Contact==<br />
<br />
* '''e-mail''': [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<!-- * '''phone''': +41 44 632 81 49 --><br />
* '''office''': ETZ J76.2<br />
<br />
==Research==<br />
<br />
[[High_Performance_SoCs]]<br />
[[Heterogeneous_SoCs]]<br />
<br />
==Projects==<br />
<br />
===Available Projects===<br />
<DynamicPageList><br />
category = Available<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
===Projects In Progress===<br />
<DynamicPageList><br />
category = In progress<br />
category = Cykoenig<br />
</DynamicPageList><br />
<br />
<DynamicPageList><br />
category = Reserved<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Completed Projects'''<br />
<DynamicPageList><br />
category = Completed<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Archived'''<br />
<DynamicPageList><br />
category = Archived<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=High_Performance_SoCs&diff=10124High Performance SoCs2024-01-25T08:54:15Z<p>Cykoenig: Ordered users by alphabetical order and added Cykoenig</p>
<hr />
<div>==High-Performance Systems-on-Chip==<br />
<br />
[[File:Snitch-bd.png|thumb|350px|The ''Snitch'' cluster couples tiny RISC-V ''Snitch'' cores with performant double-precision FPUs to minimize the control-to-compute ratio; it uses hardware loop buffers and stream semantic registers to achieve almost full FPU utilization.]]<br />
[[File:Floorplan_baikonur.png|thumb|350px|''Baikonur'', a 22 nm chip integrating two application-grade RISC-V Ariane cores and 3 Snitch clusters with 8 cores each.]]<br />
[[File:Manticore_concept.png|thumb|350px|Concept art for ''Manticore'', a Snitch-based 22 nm system with 4096 cores on multiple chiplets and with HBM2 memory.]]<br />
<br />
Today, a multitude of data-driven applications such as machine learning, scientific computing, and big data demand an ever-increasing amount of '''parallel floating-point performance''' from computing systems. Increasingly, such applications must scale across a wide range of platforms and energy budgets, from supercomputers simulating next week's weather to smartphone cameras correcting for low-light conditions.<br />
<br />
This brings challenges on multiple fronts:<br />
<br />
* '''Energy Efficiency''' becomes a major concern: As logic density increases, supplying these systems with energy and managing their heat dissipation requires increasingly complex solutions.<br />
<br />
* '''Memory bandwidth and latency''' become a major bottleneck as the amount of processed data increases. Despite continuous advances, memory lags behind computing in scaling, and many data-driven problems today are memory-bound.<br />
<br />
* '''Parallelization and scaling''' bring challenges of their own: on-chip interconnects may introduce significant area and performance overheads as they grow, and both the data and instruction streams of cores may compete for valuable memory bandwidth and interfere in a destructive way.<br />
<br />
While all state-of-the-art high-performance computing systems are constrained by the above issues, they are also subject to a fundamental trade-off between efficiency and flexibility. This forms a design space which includes the following paradigms:<br />
<br />
* '''Accelerators''' are designed to do one thing very well: they are very energy efficient and performant and usually offer predetermined data movement. However, they are not or barely programmable, inflexible, and monolithic in their design.<br />
<br />
* '''Superscalar Out-of-Order CPUs''', on the other end, provide extreme flexibility, full programmability, and reasonable performance across various workloads. However, they require large area and energy overheads for a given performance, use memory inefficiently, and are often hard to scale well to manycore systems.<br />
<br />
* '''GPUs''' are parallel and data-oriented by design, yet still meaningfully programmable, aiming for a sweet spot between scalability, efficiency, and programmability. However, they are still subject to memory access challenges and often require manual memory management for decent performance.<br />
<br />
'''How can we further improve on these existing paradigms?''' Can we design decently efficient and performant, yet freely programmable systems with scalable, performant memory systems?<br />
<br />
If these questions sound intriguing to you, consider joining us for a project or thesis! You can find currently available projects and our contact information below.<br />
<br />
==Our Activities==<br />
<br />
We are primarily interested in '''architecture design and hardware implementation''' for high-performance systems. However, ensuring high performance requires us to consider the '''entire hardware-software stack''':<br />
<br />
* '''HPC Software''': Design and porting of high-performance applications, benchmarks, compiler tools, and operating systems (Linux) to our hardware.<br />
* '''Hardware-software codesign''': Design of performance-aware algorithms and kernels and hardware that can be efficiently programmed for use in processor-based systems.<br />
* '''Architecture''': RTL implementation of energy-efficient designs with an emphasis on high utilization and throughput, as well as on efficient interoperability with existing IPs.<br />
* '''SoC design and Implementation''': Design of full high-performance systems-on-chips; implementation and tapeout on modern silicon technologies such as TSMC's 65 nm and GlobalFoundries' 22 nm nodes.<br />
* '''IC testing and Board-Level design''': Testing of the returned chips with industry-grade automated test equipment (ATE) and design of system-level demonstrator boards.<br />
<br />
Our current interests include systems with '''low control-to-compute ratios''', high-performance '''on-chip interconnects''', and '''scalable many-core systems'''. However, we are always happy to explore new domains; if you have an interesting idea, contact us and we can discuss it in detail!<br />
<br />
==Who are we==<br />
<br />
<!----------Benz----------><br />
{|<br />
| style="padding: 10px" | [[File:Tbenz_face_pulp_team.jpg|frameless|left|96px]]<br />
|<br />
===[[:User:Tbenz | Thomas Benz]]===<br />
* '''e-mail''': [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 05 18<br />
* '''office''': ETZ J85<br />
|}<br />
<br />
<!----------Bertaccini----------><br />
{|<br />
| style="padding: 10px" | [[File:lbertaccini_photo.jpg|frameless|left|96px]]<br />
|<br />
===[[:User:Lbertaccini | Luca Bertaccini]]===<br />
* '''e-mail''': [mailto:lbertaccini@iis.ee.ethz.ch lbertaccini@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 55 58<br />
* '''office''': ETZ J78<br />
|}<br />
<br />
<!----------Bertuletti----------><br />
{|<br />
| style="padding: 10px" | [[File: Mbertuletti_squaredpicture.png|frameless|left|96px]]<br />
|<br />
<br />
===[[:User:Mbertuletti| Marco Bertuletti]]===<br />
* '''e-mail''': [mailto:mbertuletti@iis.ee.ethz.ch mbertuletti@iis.ee.ethz.ch]<br />
* '''office''': ETZ J69.2<br />
|}<br />
<br />
<!----------Colagrande----------><br />
{|<br />
| style="padding: 10px" | [[File:Colluca picture.png|frameless|left|96px]]<br />
|<br />
<br />
===[[:User:Colluca| Luca Colagrande]]===<br />
* '''e-mail''': [mailto:colluca@iis.ee.ethz.ch colluca@iis.ee.ethz.ch]<br />
* '''office''': OAT U21<br />
|}<br />
<br />
<!----------Fischer----------><br />
{|<br />
| style="padding: 10px" | [[File:Tim_Fischer.jpeg|frameless|left|96px]]<br />
|<br />
<br />
===[[:User:Fischeti| Tim Fischer]]===<br />
* '''e-mail''': [mailto:fischeti@iis.ee.ethz.ch fischeti@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 59 12<br />
* '''office''': ETZ J76.2<br />
|}<br />
<br />
<!----------Garofalo----------><br />
{|<br />
| style="padding: 10px" | [[File: Agarofalo_new.jpeg|frameless|left|96px]]<br />
|<br />
===[[:User:Agarofalo| Angelo Garofalo]]===<br />
* '''e-mail''': [mailto:agarofalo@iis.ee.ethz.ch agarofalo@iis.ee.ethz.ch]<br />
* '''office''': ETZ J78<br />
|}<br />
<br />
<!----------Koenig----------><br />
{|<br />
| style="padding: 10px" | [[File: Cykoenig_face_pulp_team.png|frameless|left|96px]]<br />
|<br />
===[[:User:Cykoenig| Cyril Koenig]]===<br />
* '''e-mail''': [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
* '''office''': ETZ J76.2<br />
|}<br />
<br />
<!----------Mazzola----------><br />
{|<br />
| style="padding: 10px" | [[File:Smazzola_face_1to1.png|frameless|left|96px]]<br />
|<br />
===[[:User:Smazzola | Sergio Mazzola]]===<br />
* '''e-mail''': [mailto:smazzola@iis.ee.ethz.ch smazzola@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 81 49<br />
* '''office''': ETZ J76.2<br />
|}<br />
<br />
<!----------Perotti----------><br />
{|<br />
| style="padding: 10px" | [[File:Mperotti_face_pulp_team.jpg|frameless|left|96px]]<br />
|<br />
===[[:User:Mperotti | Matteo Perotti]]===<br />
* '''e-mail''': [mailto:mperotti@iis.ee.ethz.ch mperotti@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 05 25<br />
* '''office''': OAT U21<br />
|}<br />
<br />
<!----------Riedel----------><br />
{|<br />
| style="padding: 10px" | [[File:Sriedel_face_pulp_team.jpg|frameless|left|96px]]<br />
|<br />
<br />
===[[:User:Sriedel | Samuel Riedel]]===<br />
* '''e-mail''': [mailto:sriedel@iis.ee.ethz.ch sriedel@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 65 69<br />
* '''office''': ETZ J71.2<br />
|}<br />
<br />
<!----------Scheffler----------><br />
{|<br />
| style="padding: 10px" | [[File:Paulsc_face_1to1.png|frameless|left|96px]]<br />
|<br />
===[[:User:Paulsc | Paul Scheffler]]===<br />
* '''e-mail''': [mailto:paulsc@iis.ee.ethz.ch paulsc@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 09 15<br />
* '''office''': ETZ J85<br />
|}<br />
<br />
<!----------Wistoff----------><br />
{|<br />
| style="padding: 10px" | [[File:Nwistoff_face_pulp_team.JPG|frameless|left|96px]]<br />
|<br />
===[[:User:Nwistoff | Nils Wistoff]]===<br />
* '''e-mail''': [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 06 75<br />
* '''office''': ETZ J85<br />
|}<br />
<br />
<!----------Zhang----------><br />
{|<br />
| style="padding: 10px" | [[File: Yichao_Photo.jpeg|frameless|left|96px]]<br />
|<br />
<br />
===[[:User:Yiczhang| Yichao Zhang]]===<br />
* '''e-mail''': [mailto:yiczhang@iis.ee.ethz.ch yiczhang@iis.ee.ethz.ch]<br />
* '''office''': ETZ J76.2<br />
|}<br />
<br />
<!----------Retired members----------><br />
<!--Retired members<br />
{|<br />
| style="padding: 10px" | [[File:Akurth_face_pulp_team.jpeg|frameless|left|96px]]<br />
|<br />
===[[:User:Akurth | Andreas Kurth]]===<br />
* '''e-mail''': [mailto:akurth@iis.ee.ethz.ch akurth@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 04 87<br />
* '''office''': ETZ J69.2<br />
|}<br />
<br />
{|<br />
| style="padding: 10px" | [[File:Zarubaf_face_pulp_team.jpg|frameless|left|96px]]<br />
|<br />
===[[:User:Zarubaf | Florian Zaruba]]===<br />
* '''e-mail''': [mailto:zarubaf@iis.ee.ethz.ch zarubaf@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 65 56<br />
* '''office''': ETZ J89<br />
|}<br />
{|<br />
| style="padding: 10px" | [[File:Fschuiki_face_pulp_team.jpg|frameless|left|96px]]<br />
|<br />
===[[:User:Fschuiki | Fabian Schuiki]]===<br />
* '''e-mail''': [mailto:fschuiki@iis.ee.ethz.ch fschuiki@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 67 89<br />
* '''office''': ETZ J89<br />
|}<br />
{|<br />
| style="padding: 10px" | [[File:Matheusd_face_1to1.png|frameless|left|96px]]<br />
|<br />
===[[:User:Matheusd | Matheus Cavalcante]]===<br />
* '''e-mail''': [mailto:matheusd@iis.ee.ethz.ch matheusd@iis.ee.ethz.ch]<br />
* '''phone''': +41 44 632 54 96<br />
* '''office''': ETZ J69.2<br />
|}<br />
--><br />
<br />
<!--<br />
Who are we<br />
What do we do<br />
Where to find us<br />
--><br />
<br />
==Projects==<br />
<br />
All projects are annotated with one or more possible ''project types'' (M/S/B/G) and a ''number of students'' (1 to 3). <br />
<br />
* '''M''': Master's thesis: ''26 weeks'' full-time (6 months) for ''one student only''<br />
* '''S''': Semester project: ''14 weeks'' half-time (1 semester lecture period) or ''7 weeks'' full-time for ''1-3 students''<br />
* '''B''': Bachelor's thesis: ''14 weeks'' half-time (1 semester lecture period) for ''one student only''<br />
* '''G''': Group project: ''14 weeks'' part-time (1 semester lecture period) for ''2-3 students''<br />
<br />
Usually, these are merely suggestions from our side; proposals can often be reformulated to fit students' needs.<br />
<br />
===Available Projects===<br />
<DynamicPageList><br />
category = Available<br />
category = Digital<br />
category = High Performance SoCs<br />
suppresserrors=true<br />
ordermethod=sortkey<br />
order=ascending<br />
</DynamicPageList><br />
<br />
===Projects In Progress===<br />
<DynamicPageList><br />
category = In progress<br />
category = Digital<br />
category = High Performance SoCs<br />
suppresserrors=false<br />
ordermethod=sortkey<br />
order=ascending<br />
</DynamicPageList><br />
===Completed Projects===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = High Performance SoCs<br />
suppresserrors=true<br />
</DynamicPageList><br />
===Reserved Projects===<br />
<DynamicPageList><br />
category = Reserved<br />
category = Digital<br />
category = High Performance SoCs<br />
suppresserrors=true<br />
ordermethod=sortkey<br />
order=ascending<br />
</DynamicPageList></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=A_Flexible_FPGA-Based_Peripheral_Platform_Extending_Linux-Capable_Systems_on_Chip_(1-3S/B)&diff=10123A Flexible FPGA-Based Peripheral Platform Extending Linux-Capable Systems on Chip (1-3S/B)2024-01-25T08:49:13Z<p>Cykoenig: </p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:FPGA]]<br />
[[Category:2022]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Tbenz]]<br />
[[Category:Nwistoff]]<br />
[[Category:Cykoenig]]<br />
[[Category:Completed]]<br />
<br />
<br />
== Status: Completed ==<br />
<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
** [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
== Introduction ==<br />
<br />
Recent PULP chips, such as [http://asic.ethz.ch/2022/Occamy.html Occamy] or [http://asic.ethz.ch/2022/Neo.html Neo], feature a Linux-capable '''CVA6''' core [1] and a '''Serial Link''' off-chip interface. While the chips contain a few basic peripherals that allow running Linux (such as UART, GPIO), further peripherals to extend their functionality (e.g. Ethernet, USB Host, DVI/HDMI) are desirable.<br />
<br />
The idea of this project is to bring up an FPGA-based peripheral platform that can be connected to existing chips via Serial Link to extend their functionality.<br />
<br />
== Project ==<br />
<br />
This project involves both creating the FPGA platform and extending the software stack (e.g. Linux) running on the ASIC to use it.<br />
<br />
===== Tasks =====<br />
<br />
====== Hardware Design ====== <br />
<br />
* '''Serial-Link-capable base platform:''' Design an FPGA platform that can be accessed via Serial Link.<br />
<br />
* '''Integration of an Ethernet Peripheral:''' Integrate an Ethernet controller that communicates with the on-board Ethernet PHY.<br />
<br />
* '''Integration of PAPER:''' Integrate PAPER, a DVI/HDMI peripheral developed in previous student projects.<br />
<br />
====== Software Design ======<br />
<br />
* '''Bare-Metal Applications:''' Prototype bare-metal applications that access the integrated peripherals.<br />
<br />
* '''U-Boot and Linux Device Drivers:''' Integrate the necessary drivers in U-Boot and Linux to use the integrated peripherals.<br />
<br />
<br />
====== Stretch Goals ======<br />
<br />
Depending on your progress and interests, several further steps can be considered, such as:<br />
<br />
* '''Integration of further peripherals:''' Further peripherals can be integrated in hardware and software, such as SPI devices and a USB Host.<br />
<br />
We can also discuss targeting a subset of the tasks above depending on your time frame and interests.<br />
<br />
===== Requirements ===== <br />
<br />
* Strong interest in system design and hardware/software interaction<br />
* Experience with HDLs (preferably SystemVerilog) such as taught in VLSI I<br />
* Basic knowledge of operating systems<br />
<br />
Composition: 40% RTL Implementation, 20% Verification, 40% Software Design<br />
<br />
===== Project Supervisors ===== <br />
* [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
* [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
* [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
== References ==<br />
<br />
* [1] https://github.com/openhwgroup/cva6</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=Benchmarking_a_heterogeneous_217-core_MPSoC_on_HPC_applications_(M/1-3S)&diff=10122Benchmarking a heterogeneous 217-core MPSoC on HPC applications (M/1-3S)2024-01-25T08:47:53Z<p>Cykoenig: </p>
<hr />
<div><!-- Benchmarking a heterogeneous 217-core MPSoC on HPC applications (SA or MA) --><br />
<br />
[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:2023]]<br />
[[Category:Master Thesis]]<br />
[[Category:Hot]]<br />
[[Category:Colluca]]<br />
[[Category:Cykoenig]]<br />
[[Category:Available]]<br />
[[Category:Completed]]<br />
<br />
[[File:Manticore concept.png|thumb|Manticore concept architecture]]<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Semester or Master Thesis (multiple students possible)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Colluca | Luca Colagrande]]: [mailto:colluca@iis.ee.ethz.ch colluca@iis.ee.ethz.ch]<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
** [[:User:Vivianep | Viviane Potocnik]]: [mailto:vivianep@iis.ee.ethz.ch vivianep@iis.ee.ethz.ch]<br />
* Previously involved students:<br />
** Gioele Gottardo [SA]<br />
** Bettina Lory [MA]<br />
<br />
= Introduction =<br />
<br />
Occamy is a massively parallel multiprocessor system-on-chip (MPSoC) designed for energy-efficient high-performance computing (HPC) applications. It is a concrete implementation of the Manticore concept architecture presented at HotChips 2020 [1]. It couples a 64-bit RISC-V application-class out-of-order CVA6 core [2,3], which can boot Linux, with a many-core accelerator comprising 216 energy-efficient 32-bit RISC-V Snitch cores [4,5]. The accelerator cores are tightly coupled to a set of software-managed L1 scratch-pad memories (SPMs). Unlike with hardware-managed caches, data movement between the L2 and L1 memories has to be explicitly programmed in software, for which several DMA engines are provided. This design decision improves overall energy efficiency.<br />
<br />
This heterogeneous platform guarantees high single-thread performance by executing sequential code on the host (CVA6 core), while parallel code regions can be offloaded to the accelerator to take advantage of its higher energy efficiency and peak performance. <br />
This heterogeneous programming model raises many interesting questions, such as:<br />
* When is it beneficial to offload a computation to the accelerator?<br />
* What is the optimal number of accelerator cores to select for offload?<br />
Given the rising popularity of heterogeneous MPSoCs [6,7], being able to confidently answer these questions should be considered a valuable takeaway of this thesis.<br />
<br />
Snitch's floating-point subsystem is also of particular interest. Several ISA extensions have been developed to improve its energy efficiency, namely stream semantic registers (SSRs) [8,9] and the floating-point repetition (FREP) instruction [5], which enable load/store elision and pseudo dual-issue execution, respectively. Further extensions are under development.<br />
<br />
= Project description =<br />
<br />
In this project, you will port a series of HPC kernels from the PolyBench/C benchmark [10] to Occamy. You will optimize the kernels to take advantage of the heterogeneous architecture and software-defined data movement. An additional goal would be to explore the applicability of SSRs and FREP to the kernels, and potentially other extensions under development. The applications will be developed in C. A bare-metal runtime is provided, hiding the details of the hardware beneath a set of convenience functions. The programs will be run in RTL simulation. To speed up development, we might opt for a downscaled version of Occamy with a reduced number of accelerator cores.<br />
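<br />
To give a feel for what software-defined data movement can look like, here is a minimal host-side sketch of double buffering: while one L1 tile is being processed, the next one is already being fetched. The functions dma_start and dma_wait, the TILE size, and sum_double_buffered are hypothetical stand-ins modeled with memcpy; they are not the API of the provided bare-metal runtime.<br />

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 64  /* elements per L1 tile (assumed size) */

/* Hypothetical stand-ins for a DMA engine. In this host model the copy
 * completes immediately; on real hardware the start would be non-blocking. */
static void dma_start(float *dst, const float *src, size_t n) {
    memcpy(dst, src, n * sizeof(float));
}
static void dma_wait(void) { /* nothing to wait for in this host model */ }

/* Sum an array residing in "L2" by streaming it through two "L1" tiles. */
float sum_double_buffered(const float *l2, size_t n) {
    static float l1[2][TILE];
    float acc = 0.0f;
    size_t ntiles = (n + TILE - 1) / TILE;
    assert(l2 != NULL || n == 0);
    if (ntiles == 0) return 0.0f;

    /* Prologue: fetch the first tile. */
    dma_start(l1[0], l2, n < TILE ? n : TILE);

    for (size_t t = 0; t < ntiles; t++) {
        size_t len = (t + 1 == ntiles) ? n - t * TILE : TILE;
        dma_wait();  /* tile t is now available in l1[t % 2] */

        /* Kick off the transfer of tile t+1 into the other buffer... */
        if (t + 1 < ntiles) {
            size_t off = (t + 1) * TILE;
            size_t next = (n - off < TILE) ? n - off : TILE;
            dma_start(l1[(t + 1) % 2], l2 + off, next);
        }

        /* ...and compute on tile t in the meantime. */
        for (size_t i = 0; i < len; i++) {
            acc += l1[t % 2][i];
        }
    }
    return acc;
}
```

With a real asynchronous DMA engine, the transfer of tile t+1 would proceed in the background while the inner loop works on tile t, which is the communication/computation overlap targeted in the optimization step.<br />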
<br />
== Detailed task description ==<br />
<br />
To break it down in more detail, you will:<br />
<br />
* '''Gain a deep understanding of the PolyBench kernels''', in particular of: <br />
** the underlying algorithms;<br />
** the data movement and communication patterns;<br />
** the parallelism they expose, i.e. distinguish sequential vs. parallel code regions, data vs. task parallelism, etc.;<br />
* '''Understand the Occamy architecture and familiarize yourself with the software stack'''<br />
* '''Select a suitable subset of kernels to implement'''<br />
* '''Implement the kernels on Occamy'''<br />
** '''A)''' Port the original sources to run on the CVA6 host<br />
** '''B)''' Offload amenable code regions to the accelerator<br />
** '''C)''' Optimize data movement, overlapping communication and computation where possible<br />
* '''Compare the performance and energy efficiency of the implementations in A), B) and C)'''<br />
** Analyze the speedup in Amdahl's terms<br />
** Understand and locate where the major performance losses occur<br />
** Compare the attained FPU utilization and performance to the architecture's peak values<br />
** Suggest new hardware features or ISA extensions to further improve the attained performance<br />
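<br />
For the speedup analysis in Amdahl's terms, a small helper such as the following may be handy. It is only a sketch of the textbook formula; the function name amdahl_speedup and its parameters are chosen here for illustration.<br />

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction p of the original runtime
 * is accelerated by a factor s and the remaining (1 - p) is unchanged. */
double amdahl_speedup(double p, double s) {
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}
```

For example, if 90% of a kernel's runtime is offloadable and were to scale perfectly across all 216 accelerator cores (an optimistic assumption), amdahl_speedup(0.9, 216.0) yields a speedup of about 9.6, showing how strongly the sequential part on the host bounds the overall gain.<br />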
<br />
== Optional stretch goals ==<br />
<br />
Additional stretch goals may include:<br />
<br />
* Study which kernels could be optimized with the SSR or FREP ISA extensions<br />
* Where applicable, optimize the kernels with SSRs or FREP<br />
* Categorize the kernels based on their use of collective communication (multicast, reductions) and synchronization primitives (barriers, locks)<br />
* Compare your results to a GPU or server-class CPU implementation [11]<br />
<br />
== Character ==<br />
<br />
* 20% Literature/architecture review<br />
* 60% Bare-metal C programming<br />
* 20% Evaluation<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in computer architecture<br />
* Experience in bare-metal or embedded C programming<br />
* Experience with digital design in SystemVerilog as taught in VLSI I<br />
* Preferred: Knowledge or prior experience with RISC-V<br />
* Preferred: Experience with ASIC implementation flow as taught in VLSI II<br />
<br />
= References =<br />
<br />
[1] [https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9296802 Manticore: A 4096-Core RISC-V Chiplet Architecture for Ultraefficient Floating-Point Computing] <br /><br />
[2] [https://github.com/openhwgroup/cva6 CVA6 core Github repository] <br /><br />
[3] [https://arxiv.org/pdf/1904.05442.pdf The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-ready 1.7GHz 64bit RISC-V Core in 22nm FDSOI Technology] <br /><br />
[4] [https://github.com/pulp-platform/snitch_cluster Snitch cluster Github repository] <br /><br />
[5] [https://arxiv.org/pdf/2002.10143.pdf Snitch: A tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads] <br /><br />
[6] [https://en.wikichip.org/wiki/nvidia/tegra/xavier Nvidia Tegra Xavier Wikichip article] <br /><br />
[7] [https://en.wikipedia.org/wiki/ARM_big.LITTLE Arm big.Little Wikipedia article] <br /><br />
[8] [https://arxiv.org/pdf/1911.08356.pdf Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores] <br /><br />
[9] [https://arxiv.org/pdf/2011.08070.pdf Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra] <br /><br />
[10] [http://web.cse.ohio-state.edu/~pouchet.2/software/polybench/ PolyBench/C Website] <br /><br />
[11] [https://iis-git.ee.ethz.ch/bjoernf/PolyBench-ACC PolyBench port to HERO architecture] <br /><br />
[12] [https://arxiv.org/pdf/1712.06497.pdf HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA] <br /><br />
[13] [https://github.com/MatthiasJReisinger/PolyBenchC-4.2.1/blob/master/polybench.pdf PolyBench 4.2.1 kernel descriptions]</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=User:Cykoenig&diff=10121User:Cykoenig2024-01-25T08:47:20Z<p>Cykoenig: </p>
<hr />
<div>= Cyril Koenig =<br />
<br />
[[File:Cykoenig_face_pulp_team.png|thumb|200px|]]<br />
<br />
==Contact==<br />
<br />
* '''e-mail''': [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<!-- * '''phone''': +41 44 632 81 49 --><br />
* '''office''': ETZ J76.2<br />
<br />
==Projects==<br />
<br />
===Available Projects===<br />
<DynamicPageList><br />
category = Available<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
===Projects In Progress===<br />
<DynamicPageList><br />
category = In progress<br />
category = Cykoenig<br />
</DynamicPageList><br />
<br />
<DynamicPageList><br />
category = Reserved<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Completed Projects'''<br />
<DynamicPageList><br />
category = Completed<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList><br />
<br />
'''Archived'''<br />
<DynamicPageList><br />
category = Archived<br />
category = Cykoenig<br />
suppresserrors=true<br />
</DynamicPageList></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=File:Cykoenig_face_pulp_team.png&diff=10120File:Cykoenig face pulp team.png2024-01-25T08:43:21Z<p>Cykoenig: </p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=A_Flexible_FPGA-Based_Peripheral_Platform_Extending_Linux-Capable_Systems_on_Chip_(1-3S/B)&diff=8464A Flexible FPGA-Based Peripheral Platform Extending Linux-Capable Systems on Chip (1-3S/B)2022-12-20T10:07:22Z<p>Cykoenig: /* Project Supervisors */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:FPGA]]<br />
[[Category:2022]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Tbenz]]<br />
[[Category:Nwistoff]]<br />
[[Category:Available]]<br />
<br />
<br />
== Status: Available ==<br />
<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
** [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
== Introduction ==<br />
<br />
Recent PULP chips, such as [http://asic.ethz.ch/2022/Occamy.html Occamy] or [http://asic.ethz.ch/2022/Neo.html Neo], feature a Linux-capable '''CVA6''' core [1] and a '''Serial Link''' off-chip interface. While the chips contain a few basic peripherals that allow running Linux (such as UART, GPIO), further peripherals to extend their functionality (e.g. Ethernet, USB Host, DVI/HDMI) are desirable.<br />
<br />
The idea of this project is to bring up an FPGA-based peripheral platform that can be connected to existing chips via Serial Link to extend their functionality.<br />
<br />
== Project ==<br />
<br />
This project involves both creating the FPGA platform and extending the software stack (e.g. Linux) running on the ASIC to use it.<br />
<br />
===== Tasks =====<br />
<br />
====== Hardware Design ====== <br />
<br />
* '''Serial-Link-capable base platform:''' Design an FPGA platform that can be accessed via Serial Link.<br />
<br />
* '''Integration of an Ethernet Peripheral:''' Integrate an Ethernet controller that communicates with the on-board Ethernet PHY.<br />
<br />
* '''Integration of PAPER:''' Integrate PAPER, a DVI/HDMI peripheral developed in previous student projects.<br />
<br />
====== Software Design ======<br />
<br />
* '''Bare-Metal Applications:''' Prototype bare-metal applications that access the integrated peripherals.<br />
<br />
* '''U-Boot and Linux Device Drivers:''' Integrate the necessary drivers in U-Boot and Linux to use the integrated peripherals.<br />
<br />
<br />
====== Stretch Goals ======<br />
<br />
Depending on your progress and interests, several further steps can be considered, such as:<br />
<br />
* '''Integration of further peripherals:''' Further peripherals can be integrated in hardware and software, such as SPI devices and a USB Host.<br />
<br />
We can also discuss targeting a subset of the tasks above depending on your time frame and interests.<br />
<br />
===== Requirements ===== <br />
<br />
* Strong interest in system design and hardware/software interaction<br />
* Experience with HDLs (preferably SystemVerilog), as taught in VLSI I<br />
* Basic knowledge of operating systems<br />
<br />
Composition: 40% RTL Implementation, 20% Verification, 40% Software Design<br />
<br />
===== Project Supervisors ===== <br />
* [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
* [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
* [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
== References ==<br />
<br />
* [1] https://github.com/openhwgroup/cva6</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=A_Flexible_FPGA-Based_Peripheral_Platform_Extending_Linux-Capable_Systems_on_Chip_(1-3S/B)&diff=8463A Flexible FPGA-Based Peripheral Platform Extending Linux-Capable Systems on Chip (1-3S/B)2022-12-20T10:07:10Z<p>Cykoenig: /* Status: Available */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:FPGA]]<br />
[[Category:2022]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Tbenz]]<br />
[[Category:Nwistoff]]<br />
[[Category:Available]]<br />
<br />
<br />
== Status: Available ==<br />
<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
** [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
** [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
<br />
== Introduction ==<br />
<br />
Recent PULP chips, such as [http://asic.ethz.ch/2022/Occamy.html Occamy] or [http://asic.ethz.ch/2022/Neo.html Neo], feature a Linux-capable '''CVA6''' core [1] and a '''Serial Link''' off-chip interface. While the chips contain a few basic peripherals that allow running Linux (such as UART, GPIO), further peripherals to extend their functionality (e.g. Ethernet, USB Host, DVI/HDMI) are desirable.<br />
<br />
The idea of this project is to bring up an FPGA-based peripheral platform that can be connected to existing chips via Serial Link to extend their functionality.<br />
<br />
== Project ==<br />
<br />
This project involves both creating the FPGA platform and extending the software stack (e.g. Linux) running on the ASIC to use it.<br />
<br />
===== Tasks =====<br />
<br />
====== Hardware Design ====== <br />
<br />
* '''Serial-Link-capable base platform:''' Design an FPGA platform that can be accessed via Serial Link.<br />
<br />
* '''Integration of an Ethernet Peripheral:''' Integrate an Ethernet controller that communicates with the on-board Ethernet PHY.<br />
<br />
* '''Integration of PAPER:''' Integrate PAPER, a DVI/HDMI peripheral developed in previous student projects.<br />
<br />
====== Software Design ======<br />
<br />
* '''Bare-Metal Applications:''' Prototype bare-metal applications that access the integrated peripherals.<br />
<br />
* '''U-Boot and Linux Device Drivers:''' Integrate the necessary drivers in U-Boot and Linux to use the integrated peripherals.<br />
<br />
<br />
====== Stretch Goals ======<br />
<br />
Depending on your progress and interests, several further steps can be considered, such as:<br />
<br />
* '''Integration of further peripherals:''' Further peripherals can be integrated in hardware and software, such as SPI devices and a USB Host.<br />
<br />
We can also discuss targeting a subset of the tasks above depending on your time frame and interests.<br />
<br />
===== Requirements ===== <br />
<br />
* Strong interest in system design and hardware/software interaction<br />
* Experience with HDLs (preferably SystemVerilog), as taught in VLSI I<br />
* Basic knowledge of operating systems<br />
<br />
Composition: 40% RTL Implementation, 20% Verification, 40% Software Design<br />
<br />
===== Project Supervisors ===== <br />
* [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
* [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
* [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
<br />
== References ==<br />
<br />
* [1] https://github.com/openhwgroup/cva6</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=User:Cykoenig&diff=8317User:Cykoenig2022-11-07T16:34:56Z<p>Cykoenig: </p>
<hr />
<div>Contact<br />
<br />
Office: ETZ J71.2<br />
E-Mail: cykoenig@iis.ee.ethz.ch<br />
<br />
Interests<br />
<br />
Processor Design<br />
Operating Systems<br />
ASIC/FPGA</div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=User:Cykoenig&diff=8316User:Cykoenig2022-11-07T16:32:35Z<p>Cykoenig: Created blank page</p>
<hr />
<div></div>Cykoenighttp://iis-projects.ee.ethz.ch/index.php?title=A_Flexible_FPGA-Based_Peripheral_Platform_Extending_Linux-Capable_Systems_on_Chip_(1-3S/B)&diff=8315A Flexible FPGA-Based Peripheral Platform Extending Linux-Capable Systems on Chip (1-3S/B)2022-11-07T16:32:03Z<p>Cykoenig: /* Project Supervisors */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:FPGA]]<br />
[[Category:2022]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Bachelor Thesis]]<br />
[[Category:Tbenz]]<br />
[[Category:Nwistoff]]<br />
[[Category:Available]]<br />
<br />
<br />
== Status: Available ==<br />
<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
** [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
<br />
== Introduction ==<br />
<br />
Recent PULP chips, such as [http://asic.ethz.ch/2022/Occamy.html Occamy] or [http://asic.ethz.ch/2022/Neo.html Neo], feature a Linux-capable '''CVA6''' core [1] and a '''Serial Link''' off-chip interface. While the chips contain a few basic peripherals that allow running Linux (such as UART, GPIO), further peripherals to extend their functionality (e.g. Ethernet, USB Host, DVI/HDMI) are desirable.<br />
<br />
The idea of this project is to bring up an FPGA-based peripheral platform that can be connected to existing chips via Serial Link to extend their functionality.<br />
<br />
== Project ==<br />
<br />
This project involves both creating the FPGA platform and extending the software stack (e.g. Linux) running on the ASIC to use it.<br />
<br />
===== Tasks =====<br />
<br />
====== Hardware Design ====== <br />
<br />
* '''Serial-Link-capable base platform:''' Design an FPGA platform that can be accessed via Serial Link.<br />
<br />
* '''Integration of an Ethernet Peripheral:''' Integrate an Ethernet controller that communicates with the on-board Ethernet PHY.<br />
<br />
* '''Integration of PAPER:''' Integrate PAPER, a DVI/HDMI peripheral developed in previous student projects.<br />
<br />
====== Software Design ======<br />
<br />
* '''Bare-Metal Applications:''' Prototype bare-metal applications that access the integrated peripherals.<br />
<br />
* '''U-Boot and Linux Device Drivers:''' Integrate the necessary drivers in U-Boot and Linux to use the integrated peripherals.<br />
<br />
<br />
====== Stretch Goals ======<br />
<br />
Depending on your progress and interests, several further steps can be considered, such as:<br />
<br />
* '''Integration of further peripherals:''' Further peripherals can be integrated in hardware and software, such as SPI devices and a USB Host.<br />
<br />
We can also discuss targeting a subset of the tasks above depending on your time frame and interests.<br />
<br />
===== Requirements ===== <br />
<br />
* Strong interest in system design and hardware/software interaction<br />
* Experience with HDLs (preferably SystemVerilog), as taught in VLSI I<br />
* Basic knowledge of operating systems<br />
<br />
Composition: 40% RTL Implementation, 20% Verification, 40% Software Design<br />
<br />
===== Project Supervisors ===== <br />
* [[:User:Tbenz | Thomas Benz]]: [mailto:tbenz@iis.ee.ethz.ch tbenz@iis.ee.ethz.ch]<br />
* [[:User:Cykoenig | Cyril Koenig]]: [mailto:cykoenig@iis.ee.ethz.ch cykoenig@iis.ee.ethz.ch]<br />
* [[:User:Nwistoff | Nils Wistoff]]: [mailto:nwistoff@iis.ee.ethz.ch nwistoff@iis.ee.ethz.ch]<br />
<br />
== References ==<br />
<br />
* [1] https://github.com/openhwgroup/cva6</div>Cykoenig