PULPonFPGA: Lightweight Virtual Memory Support - Page Table Walker

Revision as of 09:48, 22 June 2016 by Vogelpi

Intro

While high-end heterogeneous systems-on-chip (SoCs) increasingly support heterogeneous uniform memory access (hUMA), their low-power counterparts targeting the embedded domain still lack basic features such as virtual memory support for accelerators. Instead of simply passing virtual address pointers, sharing data between the host processor and the accelerators requires explicit data management involving copies, which hampers both programmability and performance.

At IIS, we study the integration of programmable many-core accelerators into embedded heterogeneous SoCs. We have developed a mixed hardware/software solution that enables lightweight virtual memory support for many-core accelerators in such SoCs [1,2]. Our solution is based on the Remapping Address Block (RAB), a hardware input/output translation lookaside buffer (IOTLB) that is managed efficiently by a kernel-level driver module running on the host CPU. On an IOTLB miss, the hardware interrupts the host CPU, which triggers a miss-handling routine in the driver. Because this routine relies on standard Linux kernel application programming interfaces (APIs), it is easily portable to other host CPU architectures. However, it cannot be executed in interrupt context, which causes substantial scheduling delays and an overall high IOTLB miss penalty.
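
The hit/miss flow described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the RAB's actual design: the entry layout, the table size, the round-robin replacement, and all names (`iotlb_t`, `iotlb_translate`, `demo_handler`) are hypothetical, and the miss handler is called synchronously here, whereas the real routine is triggered by an interrupt and scheduled by the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IOTLB_ENTRIES 4u                      /* hypothetical table size */
#define PAGE_SHIFT    12u                     /* 4 KiB pages assumed */
#define PAGE_MASK     ((1u << PAGE_SHIFT) - 1u)

typedef struct {
    bool     valid;
    uint32_t vpn;                             /* virtual page number */
    uint32_t pfn;                             /* physical frame number */
} iotlb_entry_t;

typedef struct {
    iotlb_entry_t entry[IOTLB_ENTRIES];
    unsigned      next_victim;                /* round-robin replacement index */
    unsigned      misses;                     /* number of misses handled */
} iotlb_t;

/* Hypothetical stand-in for the driver's miss-handling routine. */
typedef bool (*miss_handler_t)(uint32_t vpn, uint32_t *pfn);

/* Search all entries; on a hit, compose the physical address. */
static bool iotlb_lookup(const iotlb_t *tlb, uint32_t va, uint32_t *pa)
{
    uint32_t vpn = va >> PAGE_SHIFT;
    for (unsigned i = 0; i < IOTLB_ENTRIES; i++) {
        if (tlb->entry[i].valid && tlb->entry[i].vpn == vpn) {
            *pa = (tlb->entry[i].pfn << PAGE_SHIFT) | (va & PAGE_MASK);
            return true;
        }
    }
    return false;
}

/* Translate va; on a miss, ask the handler for a mapping and refill one entry. */
static bool iotlb_translate(iotlb_t *tlb, uint32_t va,
                            miss_handler_t handler, uint32_t *pa)
{
    if (iotlb_lookup(tlb, va, pa))
        return true;                          /* hit: no host involvement */

    tlb->misses++;
    uint32_t pfn;
    if (!handler(va >> PAGE_SHIFT, &pfn))
        return false;                         /* no valid mapping exists */

    iotlb_entry_t *e = &tlb->entry[tlb->next_victim];
    tlb->next_victim = (tlb->next_victim + 1u) % IOTLB_ENTRIES;
    e->valid = true;
    e->vpn   = va >> PAGE_SHIFT;
    e->pfn   = pfn;
    return iotlb_lookup(tlb, va, pa);         /* retry, now a hit */
}

/* Toy mapping used for illustration only: frame = page + 0x100. */
static bool demo_handler(uint32_t vpn, uint32_t *pfn)
{
    *pfn = vpn + 0x100u;
    return true;
}
```

In this model, the first access to a page costs one call to the handler, and later accesses to the same page hit directly. The cost of that handler call is the miss penalty this project aims to reduce.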

Short Description

The goal of this project is to accelerate the miss-handling routine by implementing a custom page table walker for the ARMv7/ARMv8 architecture used by the host CPU on our evaluation platform [3]. In a first step, the page table walker will be implemented in software as part of the kernel-level driver module. After the routine has been verified and profiled with real heterogeneous applications on the evaluation platform, it shall either be ported to a dedicated microcontroller core inside the RAB or be implemented in dedicated hardware. The first step removes the scheduling delay caused by the kernel APIs; the second step removes the interrupt latency of the host CPU.
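
To give an idea of what a software page table walker does, the following user-space sketch walks a two-level table in a small simulated physical memory. It assumes the ARMv7-A short-descriptor translation table format (no LPAE) and deliberately ignores access permissions, domains, memory attributes, supersections, and ARMv8. All names (`phys_mem`, `read_phys`, `ptw_walk`, `demo_setup`) and the example addresses are hypothetical; an in-kernel implementation would also have to follow the page table layout that Linux actually maintains on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 8192u                       /* 32 KiB of simulated physical memory */
static uint32_t phys_mem[MEM_WORDS];

/* Read one descriptor; out-of-range reads return 0, which decodes as a fault. */
static uint32_t read_phys(uint32_t pa)
{
    uint32_t idx = pa >> 2;
    return (idx < MEM_WORDS) ? phys_mem[idx] : 0u;
}

/* Walk the tables rooted at ttb. Returns 0 and sets *pa on success, -1 on a fault. */
static int ptw_walk(uint32_t ttb, uint32_t va, uint32_t *pa)
{
    /* First level: 4096 entries, indexed by VA[31:20]. */
    uint32_t l1 = read_phys(ttb + ((va >> 20) << 2));

    switch (l1 & 0x3u) {
    case 0x2u:                                /* section or supersection */
        if (l1 & (1u << 18))
            return -1;                        /* supersection: not handled here */
        *pa = (l1 & 0xFFF00000u) | (va & 0x000FFFFFu);
        return 0;
    case 0x1u: {                              /* pointer to a second-level table */
        /* Second level: 256 entries, indexed by VA[19:12]. */
        uint32_t l2_base = l1 & 0xFFFFFC00u;
        uint32_t l2 = read_phys(l2_base + (((va >> 12) & 0xFFu) << 2));
        if (l2 & 0x2u) {                      /* 4 KiB small page */
            *pa = (l2 & 0xFFFFF000u) | (va & 0x00000FFFu);
            return 0;
        }
        if (l2 & 0x1u) {                      /* 64 KiB large page */
            *pa = (l2 & 0xFFFF0000u) | (va & 0x0000FFFFu);
            return 0;
        }
        return -1;                            /* invalid second-level entry */
    }
    default:
        return -1;                            /* invalid or reserved entry */
    }
}

/* Build a tiny example address space and return the table base address. */
static uint32_t demo_setup(void)
{
    const uint32_t ttb = 0x0000u;             /* first-level table, 16 KiB aligned */
    const uint32_t l2  = 0x4000u;             /* second-level table, 1 KiB aligned */

    memset(phys_mem, 0, sizeof phys_mem);
    phys_mem[(ttb >> 2) + 0x001u] = l2 | 0x1u;           /* VA 0x001xxxxx: L2 table */
    phys_mem[(l2 >> 2) + 0x03u]   = 0x80045000u | 0x2u;  /* VA 0x00103xxx: small page */
    phys_mem[(ttb >> 2) + 0x002u] = 0x90000000u | 0x2u;  /* VA 0x002xxxxx: 1 MiB section */
    return ttb;
}
```

With the tables from `demo_setup()`, `ptw_walk()` resolves a small-page address in two descriptor reads and a section address in one, and it reports a fault for unmapped addresses. Because the walk only reads memory and never sleeps, a routine of this kind is a candidate for running directly in interrupt context, on a dedicated core, or in hardware.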

Status: Available

Looking for 1 Interested Master Student (Semester Project)
Supervision: Pirmin Vogel, Andrea Marongiu

Character

10% Theory, Algorithms and Simulation
40% C programming, Linux kernel hacking
20% VHDL/SystemVerilog, FPGA Design
30% Verification

Prerequisites

VLSI I
VHDL/SystemVerilog, C
Embedded Linux experience
Experience with Linux kernel-level driver development is an advantage, but not strictly required.

Professor

Luca Benini

References

  1. P. Vogel, A. Marongiu, L. Benini, "Lightweight Virtual Memory Support for Many-Core Accelerators in Heterogeneous Embedded SoCs", Proceedings of the 10th International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS'15), Amsterdam, The Netherlands, 2015. link
  2. P. Vogel, A. Marongiu, L. Benini, "Lightweight Virtual Memory Support for Zero-Copy Sharing of Pointer-Rich Data Structures in Heterogeneous Embedded SoCs", to be published, 2016.
  3. PULPonFPGA Howto DZ EDA Wiki entry link
  4. Memory Part 3: Virtual Memory LWN article link
  5. Virtual Memory in the IA-64 Linux Kernel excerpt from IA-64 Linux Kernel: Design and Implementation link
