Big Data Analytics Benchmarks for Ara

From iis-projects

Latest revision as of 11:34, 3 November 2023


Status: In-Progress

  • Professor: Prof. Dr. L. Benini
  • Supervisors:
    • Chi Zhang: chizhang@iis.ee.ethz.ch
    • Matteo Perotti: mperotti@iis.ee.ethz.ch

Introduction

Vector processing is becoming a widespread option when dealing with highly parallel data workloads, thanks to its intrinsic computational capabilities and flexibility. A vector core can sustain high computational throughput using deep pipelines and multiple parallel units.

What a time for a project on a vector processor! RISC-V has almost finished ratifying its open-source vector ISA, RVV (a process that took many years!), and many companies and universities are producing their first RVV-compatible cores. ETH is at the forefront of this race with its agile in-order vector processor Ara, freshly updated from the immature RVV 0.5 specification.

In the age of big data, high-performance data analytics is in demand. Now is the time to apply the highly parallel computational capabilities of our vector processor Ara to big data analytics! In this project, you will write high-performance big data analytics benchmarks based on the open-source vector ISA RVV 0.5, evaluate them on Ara, and tune them for the best performance.


Tasks

  • Familiarize yourself with vector processor Ara
    • Run the Ara RTL simulation
    • Execute the existing benchmarks
    • Understand how a vector processor works and how the chaining technique operates
  • Familiarize yourself with a set of popular big data analytics workloads, including:
    • Naive Bayes
    • SVM
    • K-means clustering
    • Breadth-first search
    • Multilayer perceptron
    • Graph neural network
  • Code big data analytics benchmarks for Ara, while thinking about:
    • How to vectorize these workloads
    • How to schedule memory accesses and computation to take full advantage of vector chaining and reach high functional-unit utilization
  • Evaluate the big data analytics benchmarks
    • Run your benchmarks on Ara and collect performance metrics: functional-unit utilization, bandwidth, bus utilization, etc.
    • Build a roofline model while varying the data set size and the number of Ara lanes
  • Write a report and prepare a presentation.
  • Possible BONUS goals.
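
As a starting point for the "how to vectorize these workloads" task, the following is a minimal, portable C sketch of the strip-mined loop structure behind the distance step of K-means clustering. It is not Ara code and uses no RVV instructions: the inner loops only stand in for what vector instructions would do, and `SKETCH_VLMAX`, `squared_distance`, and `nearest_centroid` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical maximum vector length in elements. On real hardware this
 * comes from the vector-length configuration, not from a constant. */
#define SKETCH_VLMAX 8

/* Squared Euclidean distance between a point and a centroid, written as a
 * strip-mined loop: each outer iteration processes one strip of at most
 * SKETCH_VLMAX elements, and the inner loop stands in for the vector
 * subtract and multiply-accumulate instructions. */
static float squared_distance(const float *point, const float *centroid,
                              size_t dim)
{
    float acc[SKETCH_VLMAX] = {0.0f}; /* one partial sum per element slot */

    for (size_t base = 0; base < dim; base += SKETCH_VLMAX) {
        /* Elements handled in this strip (the last strip may be shorter). */
        size_t vl = dim - base;
        if (vl > SKETCH_VLMAX) {
            vl = SKETCH_VLMAX;
        }
        for (size_t i = 0; i < vl; i++) {
            float diff = point[base + i] - centroid[base + i];
            acc[i] += diff * diff;
        }
    }

    /* Final reduction of the partial sums to a scalar. */
    float sum = 0.0f;
    for (size_t i = 0; i < SKETCH_VLMAX; i++) {
        sum += acc[i];
    }
    return sum;
}

/* Index of the centroid closest to the point. The centroids are stored
 * row-major as a k x dim array. */
static size_t nearest_centroid(const float *point, const float *centroids,
                               size_t k, size_t dim)
{
    assert(k > 0);
    size_t best = 0;
    float best_dist = squared_distance(point, centroids, dim);
    for (size_t c = 1; c < k; c++) {
        float d = squared_distance(point, centroids + c * dim, dim);
        if (d < best_dist) {
            best_dist = d;
            best = c;
        }
    }
    return best;
}
```

Keeping the partial sums per element slot and reducing only once at the end is one possible design choice: it keeps the loop body free of a serial dependency on a single scalar, which is the kind of structure that tends to map well onto chained vector operations.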


Requirements

  • Strong interest and basic knowledge in computer architecture and operating systems, both on the HW and SW sides
  • Experience with SystemVerilog HDL, such as taught in VLSI I
  • Knowledge of bare-metal C and assembly programming
  • Bonus: familiarity with vector processors and the RISC-V vector extension (RVV)

Character

  • 25% Literature / Architecture review
  • 50% Bare-metal C and Assembly programming
  • 25% Performance evaluation
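
For the performance evaluation part, the roofline model from the task list can be sketched in a few lines of C. The sketch assumes that both peak compute throughput and memory bandwidth scale linearly with the number of lanes; the struct `lane_model_t`, the function names, and any per-lane numbers passed in are illustrative assumptions, not measured Ara figures.

```c
#include <assert.h>

/* Hypothetical per-lane machine parameters (to be replaced with the values
 * of the configuration under evaluation). */
typedef struct {
    double flop_per_cycle_per_lane;  /* peak compute throughput per lane */
    double bytes_per_cycle_per_lane; /* peak memory bandwidth per lane   */
} lane_model_t;

/* Roofline bound in FLOP/cycle for a kernel with the given arithmetic
 * intensity (FLOP per byte moved): min(peak, bandwidth * intensity). */
static double roofline_flop_per_cycle(lane_model_t m, unsigned lanes,
                                      double intensity)
{
    assert(lanes > 0 && intensity >= 0.0);
    double peak = m.flop_per_cycle_per_lane * lanes;
    double mem_bound = m.bytes_per_cycle_per_lane * lanes * intensity;
    return mem_bound < peak ? mem_bound : peak;
}

/* Fraction of peak compute throughput that a measured kernel achieves. */
static double functional_unit_utilization(double achieved_flop_per_cycle,
                                          lane_model_t m, unsigned lanes)
{
    assert(lanes > 0);
    return achieved_flop_per_cycle / (m.flop_per_cycle_per_lane * lanes);
}
```

Sweeping `lanes` and the arithmetic intensity of each benchmark (which typically changes with the data set size) yields the points to plot against the measured throughput.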


References

[1] Ara: https://arxiv.org/pdf/1906.00478.pdf

[2] Ara source code: https://github.com/pulp-platform/ara

[3] RVV: https://github.com/riscv/riscv-v-spec/releases/tag/v1.0

[4] Big data analytics: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-015-0030-3

[5] How AI and ML Applications Will Benefit from Vector Processing: https://www.enterpriseai.news/2020/07/31/how-ai-and-ml-applications-will-benefit-from-vector-processing/

[6] A survey on platforms for big data analytics: https://link.springer.com/article/10.1186/s40537-014-0008-6