Investigation of the high-performance multi-threaded OoO IBM A2O Core (1-3S)
IBM recently contributed their A2O processor core to the open-source community. The A2O is a 2-way multithreaded out-of-order core optimized for single-thread performance. It is written entirely in Verilog 2001.
Even though the A2O primarily targets embedded applications, it features high computational throughput, running at up to 3 GHz in a 45 nm technology node. It was created as an application-grade, Linux-capable processor to be integrated in large SoCs primarily targeting applications like artificial intelligence and autonomous driving.
For us at IIS, this core presents a great opportunity to advance from rather simple, pipelined in-order cores (RI5CY, Zero-riscy, Ariane) to a fully fledged commercial superscalar, multi-threaded, out-of-order processor. The RTL exposes many knobs for tuning and tweaking the A2O core. As of now, only a single configuration has been tested and successfully implemented on an FPGA. There are therefore plenty of opportunities to experiment with the parameters and investigate their impact on performance, area, and timing.
The project will be divided into the following subtasks:
- Initial exploration: get familiar with the A2O core and understand its structure
- RTL synthesis: process the Verilog source of the A2O core to be compatible with our synthesis toolchain and synthesize the default configuration
- RTL simulation: understand the interface of the core and create a testbench that can execute binaries on the processor
- Parameter exploration: understand what parameters can be tweaked and how they influence performance, area, and speed in a 22 nm node
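The parameter exploration step could be organized as a small sweep script. The following is only a minimal sketch under stated assumptions: the knob names and values (`threads`, `issue_queue_entries`, `l1d_kib`) are hypothetical placeholders, not actual A2O parameters, and the metrics (`perf`, `fmax`, `area`) would have to come from real simulation and synthesis runs. It enumerates all knob combinations and keeps the configurations that are not dominated in performance, clock frequency, and area.

```python
from itertools import product

# Hypothetical knobs and values (assumption): the real parameter names and
# legal ranges must be taken from the A2O RTL.
KNOBS = {
    "threads": [1, 2],
    "issue_queue_entries": [8, 16],
    "l1d_kib": [16, 32],
}


def enumerate_configs(knobs):
    """Yield one dict per point in the Cartesian product of the knob values."""
    names = list(knobs)
    for values in product(*(knobs[name] for name in names)):
        yield dict(zip(names, values))


def dominates(a, b):
    """Return True if result a is no worse than b in every metric and
    strictly better in at least one. Higher perf and fmax are better,
    lower area is better."""
    no_worse = (a["perf"] >= b["perf"] and a["fmax"] >= b["fmax"]
                and a["area"] <= b["area"])
    better = (a["perf"] > b["perf"] or a["fmax"] > b["fmax"]
              or a["area"] < b["area"])
    return no_worse and better


def pareto_front(results):
    """Keep only the results that no other result dominates."""
    return [r for r in results
            if not any(dominates(other, r) for other in results)]


configs = list(enumerate_configs(KNOBS))
```

Each entry of `configs` would be handed to the synthesis and simulation flows, and the measured metrics collected into a list of dicts for `pareto_front`. A full Cartesian sweep grows quickly with the number of knobs, so in practice one would probably vary only a few parameters at a time.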
Requirements:
- Profound knowledge of computer architecture
- Experience with HDLs as taught in VLSI I
- Preferably previous experience with FPGAs and/or an ASIC toolflow (simulation and synthesis)
Composition: 40% initial exploration and base-line synthesis, 30% RTL simulation, 40% parameter exploration