Mapping Networks on Reconfigurable Binary Engine Accelerator

Short Description

We have recently designed an accelerator called the Reconfigurable Binary Engine (RBE) [1]. The RBE architecture exploits two computational concepts.

Computational Concept: Binary Based Quantization (BBQ)

RBE aims to offer freely configurable accuracy, making it possible to balance power and performance against accuracy. The design is inspired by ABC-Net [2], which is based on the following two innovations:

Linear combination of multiple binary weight bases.
Employing multiple binary activations to alleviate the information loss.

The RBE architecture uses these two innovations to emulate quantized NNs by choosing the binary weights to correspond to the individual bits of the quantized weights. One quantized NN can therefore be emulated by a superposition of QA×QW binary NNs, each weighted by a power of two, where QW is the quantization level (bit width) of the weights and QA that of the activations. From now on we call this concept Binary Based Quantization (BBQ); it allows the RBE to perform convolutions with configurable arithmetic precision in a flexible and power-scalable way. BBQ can be applied to complete NNs as well as to single layers.
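The decomposition described above can be checked numerically. The following is a minimal sketch, not the RBE implementation: it assumes unsigned integer weights and activations (how the RBE handles signed values or offsets is not covered here), and the helper names `bit_planes`, `binary_dot` and `bbq_dot` are made up for illustration.

```python
import random


def bit_planes(values, n_bits):
    """Split unsigned integers into n_bits binary planes (LSB first)."""
    return [[(v >> b) & 1 for v in values] for b in range(n_bits)]


def binary_dot(a_bits, w_bits):
    """Dot product of two {0,1} vectors: bitwise AND followed by a sum."""
    return sum(a & w for a, w in zip(a_bits, w_bits))


def bbq_dot(acts, weights, qa, qw):
    """Emulate a quantized dot product with QA x QW binary dot products.

    Each binary result is weighted by 2**(i + j), where i is the
    activation bit index and j is the weight bit index.
    """
    a_planes = bit_planes(acts, qa)
    w_planes = bit_planes(weights, qw)
    total = 0
    for i, a_bits in enumerate(a_planes):
        for j, w_bits in enumerate(w_planes):
            total += (1 << (i + j)) * binary_dot(a_bits, w_bits)
    return total


# Illustrative sizes only (assumed, not taken from the RBE documentation).
QA, QW, N = 4, 3, 32
rng = random.Random(0)
acts = [rng.randrange(1 << QA) for _ in range(N)]
weights = [rng.randrange(1 << QW) for _ in range(N)]

# The superposition of binary dot products matches the integer dot product.
reference = sum(a * w for a, w in zip(acts, weights))
assert bbq_dot(acts, weights, QA, QW) == reference
```

Lowering QA or QW in this sketch reduces the number of binary dot products proportionally, which is the knob that trades accuracy against compute.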

Architecture

The RBE accelerator consists of three parts:

Control Unit - contains all control-related logic:
- A whole tensor tile is handled in a single job, with the help of the HWPE uloop (a tiny microcoded loop processor)
- Classic HWPE programming interface + hardwired controller
Streamer Unit - handles all requests:
- Source - includes the address and request generation for reading data from the TCDM memory
- Sink - includes the address and request generation for writing data back to the TCDM memory
Engine Unit - performs all computation. The unit includes the following modules:
- A grid of 9x9=81 Block units (9 columns of 9 Block units each)
- Each Block includes 4 Binary Convolution Engine modules, Binconv for short
- Each Binconv performs a QW x 1-bit 32x32 matrix-vector product in QW x 32 cycles (32 bMAC/cycle)
- The reduced Binconv results are scaled by a power of two and accumulated
- The accumulated results of all Blocks in each of the 9 columns are again accumulated and stored in the Accumulator Banks
- After the full accumulation, the values are quantized by the Quantization module and streamed out
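The Binconv figures listed above can be turned into a small functional and cycle-count model. This is a hedged sketch under stated assumptions: it assumes unsigned QW-bit weights and 1-bit activations, and that one output row (32 binary MACs) is processed per cycle for one weight bit plane at a time. The actual scheduling inside the RBE may differ, and the function names are hypothetical.

```python
BLOCKS = 9 * 9            # grid of Block units, from the list above
BINCONVS_PER_BLOCK = 4    # Binconv modules per Block
BMAC_PER_CYCLE = 32       # binary MACs per Binconv per cycle
SIZE = 32                 # matrix-vector product is SIZE x SIZE


def binconv_matvec(weights, act_bits, qw):
    """Model one Binconv: QW-bit weight matrix times a 1-bit activation vector.

    Assumption: each cycle handles one row of one weight bit plane
    (32 binary MACs), so the inner body runs qw * 32 times.
    """
    acc = [0] * SIZE
    cycles = 0
    for bit in range(qw):              # one weight bit plane at a time
        for row in range(SIZE):        # assumed: one row per cycle
            popcount = sum(
                ((weights[row][col] >> bit) & 1) & act_bits[col]
                for col in range(SIZE)
            )
            acc[row] += popcount << bit  # scale by a power of two, accumulate
            cycles += 1
    return acc, cycles


def peak_bmac_per_cycle():
    """Peak binary MACs per cycle if every Binconv is busy (upper bound)."""
    return BLOCKS * BINCONVS_PER_BLOCK * BMAC_PER_CYCLE


# Illustrative data (assumed): 3-bit weights, alternating activation bits.
demo_qw = 3
demo_w = [[(r + c) % (1 << demo_qw) for c in range(SIZE)] for r in range(SIZE)]
demo_a = [c % 2 for c in range(SIZE)]
demo_acc, demo_cycles = binconv_matvec(demo_w, demo_a, demo_qw)

# The bit-serial result equals the plain integer matrix-vector product.
assert demo_acc == [
    sum(demo_w[r][c] * demo_a[c] for c in range(SIZE)) for r in range(SIZE)
]
assert demo_cycles == demo_qw * 32
```

With the numbers from the list, `peak_bmac_per_cycle()` evaluates to 81 x 4 x 32 = 10368 binary MACs per cycle. This is a theoretical peak only; achieved throughput depends on the layer shape and on how QA and QW are mapped onto the Blocks.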

Literature

[1] RBE GitHub (see the documentation there)
[2] X. Lin, C. Zhao and W. Pan. "Towards Accurate Binary Convolutional Neural Network." Advances in Neural Information Processing Systems, 2017.
[3] Dory GitHub
[4] Dory Examples GitHub
[5] Nemo GitHub
[6] Quantlab GitHub

Status: Available

Looking for 1-2 Semester/Master students

Contact: Gianna Paulin, Thorir Mar Ingolfsson

Prerequisites

VLSI I
C coding
Python coding (ideally: PyTorch)

Character

20% Theory

20% HW understanding

40% ML Tools: Nemo, Dory, PyTorch

20% Embedded C programming

Professor

Luca Benini

Project Organization

Weekly Meetings

The student shall meet with the advisor(s) every week to discuss any issues or problems that arose during the previous week and to agree on the next steps. These meetings provide a guaranteed time slot for the mutual exchange of information on how to proceed, for clearing up questions from either side, and for ensuring the student’s progress.

Report / Presentation

Documentation is an important and often overlooked aspect of engineering. One final report has to be completed within this project. Any word processing software may be used to write the report; nevertheless, the IIS staff strongly encourages the use of LaTeX together with Tgif, draw.io (see http://bourbon.usc.edu:8001/tgif/index.html and http://www.dz.ee.ethz.ch/en/information/how-to/drawing-schematics.html) or any other vector drawing software (for block diagrams).

Final Report

A digital copy of the report, the presentation, the developed software, build script/project files, drawings/illustrations, acquired data, etc. needs to be handed in at the end of the project. Note that this task description is part of your report and has to be attached to your final report.

Presentation

At the end of the project, the outcome of the thesis will be presented in a 15-minute (SA) or 20-minute (MA) talk, followed by 5 minutes of discussion, in front of interested people of the Integrated Systems Laboratory. The presentation is open to the public, so you are welcome to invite interested friends. The exact date will be determined towards the end of the work.
