Evaluating memory access pattern specializations in OoO, server-grade cores (M)

From iis-projects

Revision as of 16:21, 26 November 2020 by Paulsc


Introduction

In recent research [1], we explored the opportunity of adding streaming semantics to the processor memory architecture. This was done for in-order CPUs through the concept of stream-semantic registers. These registers implicitly track a memory stream and effectively offload stream access management from the main core pipeline, achieving significant benefits on in-order CPUs. Similar ideas have been proposed in the context of more complex, out-of-order (OoO) cores [2], using an ISA extension as the interface to define memory access streams and a stream-management coprocessor to map streams into registers and manage memory access. The goal of this MSc thesis project is to implement an evaluation workbench to further analyze these state-of-the-art techniques and the related tradeoffs when they are implemented in superscalar, OoO CPUs. This analysis will serve as a base for a detailed study of the interaction of the stream-management coprocessor with the memory subsystem and the OoO cores, as well as for an exploration of new ideas, such as offloading address computation for indirect accesses into the memory controller.
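To make the idea of a memory stream concrete, the following is a minimal software sketch, not the mechanism of [1] or [2]. It assumes a stream is described by a base address, a stride, and a length. The names AffineStream, indirect_addresses, and dot_product_streams are hypothetical and chosen for illustration only. The sketch shows how, once a stream is configured, the loop body needs no explicit address arithmetic, and how an indirect access of the form data[index[i]] requires one extra address computation per loaded index value.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class AffineStream:
    """Hypothetical descriptor of a strided (affine) memory stream."""
    base: int    # byte address of the first element
    stride: int  # byte distance between consecutive elements
    length: int  # number of elements in the stream

    def addresses(self) -> Iterator[int]:
        # Address generation is done here, outside the compute loop,
        # mimicking what stream hardware would take over from the core.
        for i in range(self.length):
            yield self.base + i * self.stride


def indirect_addresses(index_values: List[int], data_base: int,
                       elem_size: int) -> List[int]:
    """Addresses touched by data[index[i]] for already-loaded index values."""
    return [data_base + idx * elem_size for idx in index_values]


def dot_product_streams(memory: Dict[int, int], a: AffineStream,
                        b: AffineStream) -> int:
    """Dot product in which each operand is read from a configured stream."""
    total = 0
    for addr_a, addr_b in zip(a.addresses(), b.addresses()):
        total += memory[addr_a] * memory[addr_b]
    return total
```

In this toy model, memory is a plain dictionary from byte address to value. A real implementation would instead expose the streams through registers and ISA instructions, as described in the cited work.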

Project Description

The primary goal of this project is to develop and validate an implementation of a stream coprocessor similar to the state-of-the-art proposals [1,2] in the gem5 simulator [3], on top of an existing model of an ARM server-grade multicore processor. The project will be co-mentored by researchers from Huawei's Zurich Research Center, who will provide the baseline model and support for extending it. The gem5 model developed during the project will then be useful for exploring the microarchitectural implementation of such techniques in greater detail and possibly for evaluating new solutions. The project can be roughly split into the following main parts:

Part I – Familiarization with the simulator, baseline model, and state-of-the-art techniques

The first few weeks will be devoted to understanding the relevant parts of the gem5 simulator and baseline model and to reviewing and understanding the techniques we want to reproduce. This part will conclude with the implementation of a simple component in gem5 for a "best case" study of the performance improvements achievable with the stream-semantic concepts. This simplified implementation will mainly involve modifications to the ISA description and implementation to support the new stream-related instructions, as well as to the Load/Store Unit of the OoO model and the hardware data prefetchers.
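As a rough illustration of what a "best case" bound means, the sketch below applies Amdahl's law. This is an assumption of this example, not the methodology prescribed for the project. The function name best_case_speedup and its parameters are hypothetical. If a fraction of the baseline runtime is attributable to stream accesses and that fraction is accelerated by some factor (infinite in the ideal case), the overall speedup is bounded as computed here.

```python
import math


def best_case_speedup(stream_fraction: float,
                      stream_speedup: float = math.inf) -> float:
    """Amdahl's-law bound on overall speedup.

    stream_fraction: share of baseline runtime spent on stream accesses (0..1).
    stream_speedup:  factor by which that share is accelerated; math.inf
                     models the idealized case where its cost vanishes.
    """
    if not 0.0 <= stream_fraction <= 1.0:
        raise ValueError("stream_fraction must be in [0, 1]")
    if stream_speedup <= 0.0:
        raise ValueError("stream_speedup must be positive")
    remaining = 1.0 - stream_fraction
    accelerated = 0.0 if math.isinf(stream_speedup) else stream_fraction / stream_speedup
    denominator = remaining + accelerated
    return math.inf if denominator == 0.0 else 1.0 / denominator
```

For example, best_case_speedup(0.5) bounds the overall gain at 2x when half of the runtime is due to stream accesses. The simulation-based limit study would replace this analytic estimate with measured numbers.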

Part II – Implementation of the stream-management coprocessor

This part aims to refine the work done in Part I and make it more realistic. The goal is to understand and overcome the challenges of integrating the stream-management coprocessor into an OoO pipeline, especially regarding speculation. Most parts of the back-end of the OoO model's pipeline will need to be modified, and details not fully described in the papers might need to be architected. This part will also include an evaluation of the performance difference between the limit study and the more realistic implementation.

Part III – Validation of the model against published results

To validate the robustness of the implementation, its results will be compared with those in the literature. The results are not expected to match perfectly, but they should show the same trends. This activity will partly proceed concurrently with Part II.
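One possible way to quantify "same trends" is a rank correlation between the per-benchmark speedups of the model and the published ones. The sketch below computes Kendall's tau (the tau-a variant, which does not correct for ties). The choice of metric and the function name kendall_tau are assumptions of this example and are not taken from the cited papers.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(model: Sequence[float], published: Sequence[float]) -> float:
    """Kendall's tau-a between two equally long lists of per-benchmark results.

    Returns +1.0 if both lists rank the benchmarks identically and -1.0 if
    the rankings are exactly reversed.
    """
    if len(model) != len(published) or len(model) < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    concordant = discordant = 0
    for (m1, p1), (m2, p2) in combinations(zip(model, published), 2):
        sign = (m1 - m2) * (p1 - p2)
        if sign > 0:
            concordant += 1    # both sources order this pair the same way
        elif sign < 0:
            discordant += 1    # the sources disagree on this pair
    pairs = len(model) * (len(model) - 1) // 2
    return (concordant - discordant) / pairs
```

A value close to 1.0 would indicate that the model and the literature agree on which benchmarks benefit most, even if the absolute speedups differ.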

Part IV (optional) – Implementation and validation of alternative solutions

While working on Parts I and II, new ideas might emerge; if time permits, they will be implemented and tested against the base model. This part is optional, as it strongly depends on the outcomes of the previous parts. If interesting new solutions can be validated, the mentors will advise on preparing a paper submission to an appropriate venue.

Part V (optional) – Open-sourcing the implementation

We encourage open-sourcing the gem5 components implemented in the project by contributing them to the gem5 community through the official review process. However, this step will require further review and code cleanup, and it might not be possible to complete it within the timeline of the thesis project, so it is not a requirement for successfully completing the project.

Project Management

This project will be co-mentored by ETH IIS and Huawei's Zurich Research Center. Huawei will provide support with the simulation infrastructure.

Meetings

There will be a regular schedule of meetings (e.g., weekly), plus additional on-demand meetings to address specific issues or discussions. Depending on the COVID-related measures, the meetings will take place online via a conferencing platform or at the ETH or Huawei offices, as agreed between the student and the supervisors.

Project Report

Documentation is an important and often overlooked aspect of engineering. A final report has to be completed as part of this project. English is the de facto common language of engineering, so the final report should preferably be written in English. Any word processing software may be used to write the report; however, the IIS and Huawei staff strongly encourage the use of LaTeX together with Inkscape, Tgif, or any other vector drawing software (for block diagrams).

Presentation

At the end of the project, there will be a presentation (20 minutes of presentation and 5 minutes of Q&A) to present your results to a wider audience. The exact date will be determined towards the end of the work.

Required Skills

Necessary skills for successfully taking on the project:

  • Good proficiency with modern C++ (at least C++11) and Python 3
  • Understanding of basic microarchitectural concepts (pipelining, out-of-order execution, caches, prefetching)
  • Willingness to "get your hands dirty" and implement advanced techniques in gem5
  • Ability to work independently on the implementation and to raise questions and issues to get help and guidance as necessary
  • Previous experience with gem5 (desirable, but not necessary)
  • Previous experience with adding opcodes to LLVM (desirable, but not necessary)

Project Supervisors

References