Difference between revisions of "Development of statistics and contention monitoring unit for PULP"

From iis-projects

* Type: Semester Thesis

Latest revision as of 16:14, 10 September 2022


Overview

Status: In progress

Introduction

Multiprocessor Systems-on-Chip (MPSoCs) are becoming increasingly popular in the domain of Critical Real-Time Embedded Systems (CRTES). In such systems, validation and verification (V&V) form a synergetic process that aims at detecting both temporal violations and functional bugs [1]. Compared with single-core designs, MPSoCs have the advantage of reducing CPU overload, which improves overall system performance [2].

Nevertheless, MPSoCs complicate temporal verification and Worst-Case Execution Time (WCET) analysis in CRTES because tasks executing on multiple cores contend for shared resources and interfere with one another. This degrades the system’s predictability, a key feature in safety-critical designs such as those employed in the automotive, aerospace and robotics fields.

Statistics units have recently been employed to support V&V of CRTES and to deploy safety measures. In particular, SafeSU [4] provides an open-source [5] implementation that improves both the observability and the controllability of multicore timing interference. It is based on three building blocks:

1. Cycle Contention Stack (CCS): monitors the working and contention cycles of a shared resource, i.e. the cycles a core spends using the resource or busy-waiting on it because of other cores’ requests.

2. Maximum-Contention Control Unit (MCCU): allocates fixed timing interference quotas (contention budget) on access counts, and triggers an interrupt whenever the quotas are exceeded [3].

3. Request Duration Counter (RDC): coupled with the MCCU, monitors the duration of an event. It is used to detect when the quota set by the MCCU is exceeded.
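The interplay of the three blocks can be illustrated with a small behavioural model in C. This is only a sketch under stated assumptions, not the SafeSU implementation: the type and function names (`cycle_obs_t`, `stat_unit_t`, `su_init`, `su_step`), the core count, the single per-access `weight` and the use of a longest-request watermark for the RDC are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 4u /* hypothetical core count for this sketch */

/* What the unit observes on one shared resource during one clock cycle. */
typedef struct {
    int  owner;               /* core being served, or -1 if the resource is idle */
    bool req_start;           /* true on the first cycle of a new request */
    bool waiting[NUM_CORES];  /* cores stalled on the resource in this cycle */
} cycle_obs_t;

typedef struct {
    uint32_t working[NUM_CORES];    /* CCS: cycles spent using the resource */
    uint32_t contention[NUM_CORES]; /* CCS: cycles stalled behind another core */
    uint32_t quota[NUM_CORES];      /* MCCU: remaining budget per core */
    bool     irq[NUM_CORES];        /* MCCU: sticky "quota exceeded" flag */
    uint32_t weight;                /* MCCU: assumed cost charged per access */
    uint32_t cur_len;               /* RDC: cycles of the request in flight */
    uint32_t max_len;               /* RDC: longest request observed so far */
} stat_unit_t;

/* Clear all counters and give every core the same initial quota. */
void su_init(stat_unit_t *su, uint32_t quota, uint32_t weight)
{
    *su = (stat_unit_t){0};
    for (unsigned c = 0; c < NUM_CORES; c++) {
        su->quota[c] = quota;
    }
    su->weight = weight;
}

/* Advance the model by one clock cycle. */
void su_step(stat_unit_t *su, const cycle_obs_t *obs)
{
    assert(obs->owner >= -1 && obs->owner < (int)NUM_CORES);

    /* CCS: classify the cycle for every core. */
    for (unsigned c = 0; c < NUM_CORES; c++) {
        if ((int)c == obs->owner) {
            su->working[c]++;
        } else if (obs->waiting[c] && obs->owner >= 0) {
            su->contention[c]++;
        }
    }

    if (obs->owner < 0) {       /* idle cycle: no request in flight */
        su->cur_len = 0;
        return;
    }

    if (obs->req_start) {
        unsigned c = (unsigned)obs->owner;
        /* MCCU: charge the access to the issuing core's quota. */
        if (su->quota[c] >= su->weight) {
            su->quota[c] -= su->weight;
        } else {
            su->quota[c] = 0;
            su->irq[c] = true;  /* would raise an interrupt in hardware */
        }
        su->cur_len = 0;        /* RDC: a new request starts */
    }

    /* RDC: track the duration of the current request. */
    su->cur_len++;
    if (su->cur_len > su->max_len) {
        su->max_len = su->cur_len;
    }
}
```

Calling `su_step` once per simulated cycle with the observed owner and stalled cores accumulates the counters. With `su_init(&su, 2, 1)`, the third access issued by the same core sets its `irq` flag, because the first two accesses use up the whole quota.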

The design relies on performance monitoring counters (PMCs) at both the core and system levels [1]. For each shared resource, a common approach is to leave the shared functional unit block (FUB), e.g. the interconnect or memory controller, unmodified and to introduce an external unit instead. Keeping the FUB interface and internal design untouched [3] reduces the need for a full V&V re-iteration of the whole FUB.

SafeSU relies on industry-grade, verified designs for aerospace applications (Cobham Gaisler’s NOEL-V, for example) and serves as the reference design for this project.


Project

At IIS we have developed ControlPULP, a complete HW/SW platform whose primary use case is to serve as an embedded parallel power controller for HPC systems [6][7]. However, several applications in other domains (automotive and aerospace above all) call for a control platform that, besides providing a complete framework, implements features for safety and predictability.

ControlPULP is a parallel embedded MCU with a configurable cluster of eight 32-bit RISC-V cores (CV32E40P [8]). Each core has a private instruction cache (I$), and the cores communicate with a tightly-coupled data memory (TCDM) through a low-latency logarithmic interconnect.
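To make the source of contention in the cluster concrete, the following C sketch counts how many TCDM requests would have to stall in a single cycle. It assumes a word-interleaved, multi-banked memory in which each bank serves one request per cycle. The bank count and the names `NUM_BANKS`, `tcdm_bank` and `tcdm_stalled_requests` are hypothetical and not taken from the ControlPULP sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BANKS 16u /* hypothetical number of TCDM banks */

/* Assumed word-level interleaving: consecutive 32-bit words map to
 * consecutive banks. */
static unsigned tcdm_bank(uint32_t addr)
{
    return (addr >> 2) % NUM_BANKS;
}

/* Given the byte addresses requested by the cores in one cycle, return how
 * many requests lose arbitration because another request targets the same
 * bank (each bank is assumed to serve one request per cycle). */
unsigned tcdm_stalled_requests(const uint32_t *addr, size_t n)
{
    unsigned per_bank[NUM_BANKS] = {0};
    unsigned stalled = 0;

    assert(addr != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        unsigned b = tcdm_bank(addr[i]);
        if (per_bank[b] > 0) {
            stalled++;          /* bank already taken in this cycle */
        }
        per_bank[b]++;
    }
    return stalled;
}
```

Under these assumptions, accesses to consecutive words proceed in parallel, whereas accesses whose addresses differ by a multiple of `4 * NUM_BANKS` bytes collide on one bank. Stalled cycles of this kind are what a contention monitoring unit would count.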

The goal of this project is to bring ControlPULP closer to a CRTES by developing its own statistics and contention monitoring unit. The unit should take advantage of the existing PMCs in the processor, introduce new ones if needed, and implement the logic necessary to monitor and control contention in the cluster. Inspiration can be taken from the open-source SafeSU.

Stretch goal: The statistics unit can play a fundamental role within one cluster by monitoring and controlling the contention among the worker cores. When the system is scaled up to a multi-cluster configuration, however, the clusters start interfering with one another on the shared L2 memory, which they access through a shared interconnect:

[Figure: Multicluster cpulp.png]

Following the approach used within one cluster, the student will have the opportunity to scale the monitoring/contention unit to a multi-cluster system. This involves design choices about how to scale the unit in hardware, as well as about its effect on task scheduling when a quota is exceeded.
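One possible reaction to an exceeded quota is sketched below in C as an illustration of the design space, not as a prescribed solution. It assumes one budget of L2 accesses per cluster and per period, and a scheduler that stops dispatching tasks to a cluster until the next period once that budget is exceeded. The cluster count and all names (`l2_monitor_t`, `l2_account`, `l2_may_dispatch`, `l2_new_period`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CLUSTERS 2u /* hypothetical number of clusters */

typedef struct {
    uint32_t budget[NUM_CLUSTERS];    /* L2 accesses allowed per period */
    uint32_t used[NUM_CLUSTERS];      /* L2 accesses seen in this period */
    bool     throttled[NUM_CLUSTERS]; /* set once the budget is exceeded */
} l2_monitor_t;

/* Record n L2 accesses issued by a cluster. Returns true exactly once per
 * period, at the moment the cluster exceeds its budget (the point at which
 * hardware would raise an interrupt). Counter overflow is ignored here. */
bool l2_account(l2_monitor_t *m, unsigned cluster, uint32_t n)
{
    assert(cluster < NUM_CLUSTERS);

    m->used[cluster] += n;
    if (!m->throttled[cluster] && m->used[cluster] > m->budget[cluster]) {
        m->throttled[cluster] = true;
        return true;
    }
    return false;
}

/* Scheduler hook: may new tasks be dispatched to this cluster? */
bool l2_may_dispatch(const l2_monitor_t *m, unsigned cluster)
{
    assert(cluster < NUM_CLUSTERS);
    return !m->throttled[cluster];
}

/* Start a new accounting period: replenish every cluster's budget. */
void l2_new_period(l2_monitor_t *m)
{
    for (unsigned c = 0; c < NUM_CLUSTERS; c++) {
        m->used[c] = 0;
        m->throttled[c] = false;
    }
}
```

Throttling until the next period is only one option. Other reactions, such as lowering the priority of the offending tasks or migrating them to another cluster, fit the same interface and are part of the design choices mentioned above.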

The implemented design will be verified in RTL simulation and eventually on FPGA, for which a ControlPULP mapping already exists.

Character

  • 20% Literature / architecture review
  • 45% RTL implementation
  • 15% Bare-metal C programming
  • 20% Evaluation

Prerequisites

  • Strong interest in computer architecture
  • Experience with digital design in SystemVerilog as taught in VLSI I
  • Experience with C programming


References

[1] https://ieeexplore.ieee.org/document/7509440

[2] Aurix training central processing unit: [1]

[3] https://ieeexplore.ieee.org/abstract/document/8715155

[4] https://ieeexplore.ieee.org/document/9465444

[5] https://gitlab.bsc.es/caos_hw/hdl_ip/bsc_pmu/-/tree/develop/

[6] Ottaviano, A. et al. (2022). ControlPULP: A RISC-V Power Controller for HPC Processors with Parallel Control-Law Computation Acceleration. In: Orailoglu, A., Reichenbach, M., Jung, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2022. Lecture Notes in Computer Science, vol 13511. Springer, Cham. https://doi.org/10.1007/978-3-031-15074-6_8

[7] control-pulp github (soon)

[8] https://github.com/openhwgroup/cv32e40p