Serverless Benchmarks on RISC-V (M)

Introduction

The introduction of function-as-a-service (FaaS) or serverless cloud services has reopened a number of interesting research questions about software architecture, operating systems, security, and hardware [6]. For example, the steady move from monolithic applications to collections of microservices is based upon a number of advantages such as increased modularity and functional independence [4]. The modification, upgrade, or deployment of functionality becomes significantly simpler, and scaling out of individual services can be made much more dynamic. Microservices also allow for greater heterogeneity in both software (e.g. using different languages) and hardware (e.g. accelerators). Heterogeneous hardware such as TPUs, VPUs, and FPGAs is finding increasing use in datacenters for accelerating applications. This can be even more significant when using serverless (FaaS) architectures.

However, this modularity comes with potential drawbacks including increased inter-function latency, unforeseen interactions due to the complexity of the functional graph, and challenges in deployment, allocation, and management. These microservices/functions have been shown to have very different performance characteristics from traditional monolithic applications. For example, a short-running function makes significantly less use of the sophisticated branch predictors and large caches found in modern server processors [7]. This preliminary research suggests that the complexity of modern processors is not especially well suited to serverless functions, and we would like to undertake a more thorough exploration of the design space of microarchitectural features.

One way to examine different microarchitectural features is to use an open-source CPU design which allows for relatively easy modifications of various units (e.g. branch predictor, cache, ALUs) as well as additions to the ISA. The first step is to get a set of serverless benchmarks running on an open-source CPU, in this case the RISC-V CPU CVA6 (formerly Ariane) [8]. CVA6 is a thoroughly documented and tested application-class 6-stage RISC-V CPU capable of booting Linux [1]. The ultimate goal is to compare the performance of a broad set of functions running on various processor architectures (e.g. Intel/AMD x86, ARMv8, RISC-V) in order to gain insight into what types of architectural features and resources are best suited for different types of lambdas.

Project

In order to test benchmarks on the CVA6 core, it must be built and loaded onto the Xilinx UltraScale+ FPGA found on Enzian. As the CVA6 core already builds in the Xilinx toolchain, and there are reports of it working on the VCU118 [2], which contains the same FPGA model as Enzian, this should not be a significant roadblock, but it may involve modifying block diagrams, projects, etc.

In parallel, a benchmarking suite (or subset of benchmarks) should be decided upon and tested. There are a number of possible benchmarks that could be used [5, 3]. Most importantly, it should (be made to) run on a RISC-V Linux distribution. This does not initially have to be the CVA6 core on the FPGA, but could be on a hard RISC-V core for testing and debugging. Ideally, we would also run the same set of benchmarks on an Intel/AMD x86 processor (possibly via Cloudlab), and an ARMv8 such as the ThunderX1 found in Enzian.

Experimental Evaluation

Extensive measurements should be collected for analyzing the power and performance of the benchmarks across architectures. Aside from obvious metrics such as latency and execution time, we place special emphasis on the behavior of specific microarchitectural elements (e.g. branch predictors, caches) as well as energy measurements (e.g. instructions per joule). It is important to keep in mind that this is exploratory work, so certain desired measurements may not be available on the CVA6 core. These limitations will not impact the outcome of this thesis.
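As a minimal sketch of how the metrics mentioned above could be derived from raw measurements, the following Python snippet computes instructions per cycle, miss rates, and instructions per joule for a single run. The record layout and all field names (RunCounters, avg_power_w, etc.) are hypothetical; which counters are actually available depends on the platform, and energy is approximated here as average power times runtime.

```python
from dataclasses import dataclass


@dataclass
class RunCounters:
    """Raw values from one benchmark run (hypothetical field names)."""
    instructions: int     # retired instructions
    cycles: int           # elapsed core cycles
    branches: int         # executed branch instructions
    branch_misses: int    # mispredicted branches
    cache_accesses: int   # cache accesses (e.g. L1 data cache)
    cache_misses: int     # cache misses at the same level
    runtime_s: float      # wall-clock execution time in seconds
    avg_power_w: float    # average power over the run in watts


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or NaN if the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def derived_metrics(c: RunCounters) -> dict:
    """Compute derived metrics from raw counters.

    Assumption: energy is estimated as average power times runtime.
    """
    energy_j = c.avg_power_w * c.runtime_s
    return {
        "ipc": ratio(c.instructions, c.cycles),
        "branch_miss_rate": ratio(c.branch_misses, c.branches),
        "cache_miss_rate": ratio(c.cache_misses, c.cache_accesses),
        "energy_j": energy_j,
        "instructions_per_joule": ratio(c.instructions, energy_j),
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration, not measured results.
    run = RunCounters(
        instructions=2_000_000, cycles=4_000_000,
        branches=200_000, branch_misses=10_000,
        cache_accesses=500_000, cache_misses=25_000,
        runtime_s=0.5, avg_power_w=2.0,
    )
    for name, value in derived_metrics(run).items():
        print(f"{name}: {value}")
```

Keeping the raw counters and the derived metrics separate makes it easier to compare architectures where only a subset of counters is available: missing denominators yield NaN instead of aborting the analysis.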

Work Plan

The work consists of the following units:

  1. A critical survey of the relevant related work and important ideas.
  2. Running the CVA6 core on the Enzian FPGA.
  3. Porting one or more serverless function benchmarks to the RISC-V architecture.
  4. An evaluation of the work based on the Experimental Evaluation section above, or subsequent revisions.

Requirements

  • Experience with HDLs (preferably SystemVerilog) such as taught in VLSI I
  • Programming Languages: C, Python
  • Tools: git, Vivado, Vitis

Composition: 30% Architecture specification, 30% RTL implementation, 40% Software implementation

Project Supervisors

References

  • [1] CVA6 (fka Ariane) Github Repository. https://github.com/openhwgroup/cva6
  • [2] CVA6 Github Issue Tracker: Support more FPGA boards 154. https://github.com/openhwgroup/cva6/issues/154
  • [3] Copik et al., "SeBS: A serverless benchmark suite for function-as-a-service computing", Middleware 2021.
  • [4] Gan et al., "An open-source benchmark suite for microservices and their hardware-software implications for cloud and edge systems", ASPLOS 2019.
  • [5] Maissen et al., "FaaSdom: A benchmark suite for serverless computing", DEBS 2020.
  • [6] Raza et al., "SoK: Function-as-a-Service: From An Application Developer’s Perspective", JSys 2021.
  • [7] Shahrad et al., "Architectural implications of function-as-a-service computing", MICRO 2019.
  • [8] Zaruba and Benini, "Energy and performance analysis of a Linux-ready 1.7-GHz 64-bit RISC-V core in 22-nm FDSOI technology", VLSI 2019.