Big Data Analytics Benchmarks for Ara
- Professor: Prof. Dr. L. Benini
Vector processing is becoming a widespread option for highly data-parallel workloads, thanks to its intrinsic computational capabilities and flexibility. A vector core can sustain high computational throughput using deep pipelines and multiple parallel units.
What a time for a project on a vector processor! RISC-V has almost finished ratifying its open-source vector ISA RVV (a process that lasted many years!), and many companies and universities are producing their first RVV-compatible cores. ETH is at the forefront of this race with its agile in-order vector processor Ara, freshly updated from the immature RVV 0.5 draft specification.
In the age of big data, high-performance big data analysis is in demand. Now is the time to leverage the highly parallel computational capabilities of our vector processor Ara for big data analytics! In this project, you will code high-performance big data analytics benchmarks based on the open-source vector ISA RVV 1.0, evaluate them on the vector processor Ara, and try to achieve the best possible performance.
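To give a flavor of what coding such a benchmark involves, the following is a minimal sketch of strip-mining, the loop structure commonly used for vector ISAs such as RVV. It is written in portable C so that it runs without an RVV toolchain. The kernel is the squared Euclidean distance from the k-means assignment step. The names `VLMAX_ELEMS`, `set_vl`, and `sq_distance` are hypothetical and only model the behavior. On Ara, each pass of the inner loop would correspond to a few vector instructions operating on `vl` elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical maximum vector length in elements. On real hardware this
 * depends on VLEN, the element width, and the register grouping (LMUL). */
#define VLMAX_ELEMS 8

/* Stand-in for a vsetvl-style request: the hardware grants at most
 * VLMAX_ELEMS elements per strip. */
static size_t set_vl(size_t remaining) {
    return remaining < VLMAX_ELEMS ? remaining : VLMAX_ELEMS;
}

/* Squared Euclidean distance between a point and a centroid, written as
 * a strip-mined loop. */
static float sq_distance(const float *point, const float *centroid, size_t dim) {
    assert(point != NULL && centroid != NULL);

    /* Models a vector accumulator register, one partial sum per element. */
    float acc[VLMAX_ELEMS] = {0.0f};

    for (size_t done = 0; done < dim; ) {
        size_t vl = set_vl(dim - done);
        /* One pass over i models a vector load pair, a vector subtract,
         * and a vector multiply-accumulate on vl elements. */
        for (size_t i = 0; i < vl; i++) {
            float diff = point[done + i] - centroid[done + i];
            acc[i] += diff * diff;
        }
        done += vl;
    }

    /* Final reduction of the partial sums to a scalar. */
    float sum = 0.0f;
    for (size_t i = 0; i < VLMAX_ELEMS; i++) {
        sum += acc[i];
    }
    return sum;
}
```

Keeping the partial sums in a vector accumulator and reducing only once at the end is one possible design choice. It avoids a reduction in every strip, which tends to leave more room for back-to-back dependent vector operations.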
- Familiarize yourself with vector processor Ara
- Run the Ara RTL simulation
- Execute the existing benchmarks
- Understand how a vector processor works and how the chaining technique operates
- Familiarize yourself with several popular big data analytics workloads, including:
- Naive Bayes
- K-means clustering
- Breadth-first search
- Multilayer perceptron
- Graph neural network
- Code big data analytics benchmarks for Ara, while thinking about:
- How to vectorize these workloads
- How to schedule memory accesses and computation to take full advantage of vector chaining and reach high functional unit utilization
- Evaluate the big data analytics benchmarks
- Run your benchmarks on Ara and collect performance metrics such as functional unit utilization, bandwidth, and bus utilization
- Build a roofline model while varying the data set size and the number of Ara lanes
- Write a report and prepare a presentation.
- Possible BONUS goals.
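The roofline evaluation mentioned in the task list can be sketched as follows. This is only an illustration of the standard roofline formula, where attainable performance is the minimum of peak compute and operational intensity times peak memory bandwidth. The per-lane figures `FLOP_PER_LANE_PER_CYCLE` and `BYTES_PER_LANE_PER_CYCLE` are placeholder assumptions, not measured Ara values, and should be replaced with the numbers of the configuration under test. All function names are hypothetical.

```c
#include <assert.h>

/* Placeholder per-lane peaks (assumptions for illustration only). */
#define FLOP_PER_LANE_PER_CYCLE  2.0  /* e.g. one fused multiply-add per cycle */
#define BYTES_PER_LANE_PER_CYCLE 4.0  /* assumed memory interface share per lane */

/* Standard roofline bound in FLOP/cycle:
 * min(peak compute, operational intensity * peak memory bandwidth). */
static double roofline_flop_per_cycle(double peak_flop_per_cycle,
                                      double peak_bytes_per_cycle,
                                      double flop_per_byte) {
    assert(peak_flop_per_cycle > 0.0);
    assert(peak_bytes_per_cycle > 0.0);
    assert(flop_per_byte >= 0.0);

    double memory_bound = flop_per_byte * peak_bytes_per_cycle;
    return memory_bound < peak_flop_per_cycle ? memory_bound
                                              : peak_flop_per_cycle;
}

/* Attainable performance for a given lane count, assuming both peaks
 * scale linearly with the number of lanes. */
static double attainable_for_lanes(int lanes, double flop_per_byte) {
    assert(lanes > 0);
    return roofline_flop_per_cycle(lanes * FLOP_PER_LANE_PER_CYCLE,
                                   lanes * BYTES_PER_LANE_PER_CYCLE,
                                   flop_per_byte);
}
```

Under these placeholder numbers, a kernel with an operational intensity of 0.25 FLOP/byte on 4 lanes is memory bound at 4 FLOP/cycle, while one at 2 FLOP/byte is compute bound at 8 FLOP/cycle. Plotting the measured performance of each benchmark against this bound, for several data set sizes and lane counts, shows how close each kernel gets to the limit.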
- Strong interest and basic knowledge in computer architecture and operating systems, both on the HW and SW sides
- Experience with SystemVerilog HDL, such as taught in VLSI I
- Knowledge of bare-metal C and assembly programming
- Bonus: familiarity with vector processors and the RISC-V RVV extension
- 25% Literature / Architecture review
- 50% Bare-metal C and Assembly programming
- 25% Performance evaluation
 Ara: https://arxiv.org/pdf/1906.00478.pdf
 Ara source code: https://github.com/pulp-platform/ara
 RVV: https://github.com/riscv/riscv-v-spec/releases/tag/v1.0
 Big data analytics: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-015-0030-3
 How AI and ML Applications Will Benefit from Vector Processing: https://www.enterpriseai.news/2020/07/31/how-ai-and-ml-applications-will-benefit-from-vector-processing/
 A survey on platforms for big data analytics: https://link.springer.com/article/10.1186/s40537-014-0008-6