Bringing XNOR-nets (ConvNets) to Silicon

[Images: Origami chip, ADAS application, labeled scene]

Short Description

Imaging sensor networks, UAVs, smartphones, and other embedded computer vision systems require power-efficient, low-cost, and high-speed implementations of synthetic vision systems capable of recognizing and classifying objects in a scene. Many popular algorithms in this area require the evaluation of multiple layers of filter banks. Almost all state-of-the-art synthetic vision systems are based on features extracted using multi-layer convolutional networks (ConvNets). When evaluating ConvNets, most of the time (80% to 90%) is spent performing the convolutions.
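
As a rough back-of-the-envelope illustration (the layer dimensions here are assumed for illustration, not taken from a specific network): a convolutional layer with n_in input and n_out output feature maps of size h×w and k×k filters performs h·w·n_in·n_out·k² multiply-accumulate operations, so a single 3×3 layer with 128 input and 128 output feature maps of size 112×112 already requires about 1.85 billion operations per frame.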

We have built a very successful accelerator for this task, Origami, and have since moved towards lower arithmetic precision with YodaNN, which uses binary weights. Energy efficiency is critical for bringing ConvNets to mobile devices, and I/O bandwidth limits how far we can go: fully-binary ConvNets could improve on this by requiring only ~1 bit/pixel instead of the ~12 bit/pixel needed before. The current state of the art in this direction is XNOR-nets, but we want to understand them better and exploit their structure to build an even more efficient ConvNet accelerator with almost no multipliers and relatively small adders.
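
To make the arithmetic concrete, below is a minimal C sketch of the core idea (our own illustration, not the planned accelerator architecture: the bit encoding and the helper binary_dot64 are assumptions, and __builtin_popcountll is a GCC/Clang builtin). With activations and weights constrained to {-1, +1} and packed one bit per value, a dot product reduces to an XNOR followed by a population count:

 /* Dot product of two 64-element {-1,+1} vectors packed into 64-bit
  * words. Assumed encoding: bit 1 represents +1, bit 0 represents -1.
  * An XNOR bit is 1 where the two factors agree (product +1) and 0
  * where they differ (product -1), so the dot product equals
  * (#agreeing bits) - (#differing bits) = 2*popcount(XNOR) - 64. */
 #include <stdint.h>
 #include <stdio.h>

 static int binary_dot64(uint64_t a, uint64_t b)
 {
     uint64_t agree = ~(a ^ b);               /* bitwise XNOR */
     return 2 * __builtin_popcountll(agree) - 64;
 }

 int main(void)
 {
     uint64_t act = 0xF0F0F0F0F0F0F0F0ULL;    /* example packed activations */
     uint64_t wgt = 0xFF00FF00FF00FF00ULL;    /* example packed weights */
     printf("dot product = %d\n", binary_dot64(act, wgt));
     return 0;
 }

In hardware, the same observation replaces each multiplier with an XNOR gate feeding a popcount adder tree, which is where "almost no multipliers and relatively small adders" comes from.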

Status: In Progress

1-2 Master's thesis students or 2-3 semester project students
Supervision: Lukas Cavigelli, Renzo Andri

Prerequisites

  • Interest in VLSI architecture exploration and computer vision
  • VLSI 1 or equivalent

Character

15% Theory / Literature Research
15% Software Evaluations
70% VLSI Architecture, Implementation & Verification

Professor

Luca Benini


Detailed Task Description

A detailed task description will be worked out right before the project, taking the student's interests and capabilities into account.

Literature

  • Hardware Acceleration of Convolutional Networks:
    • R. Andri, L. Cavigelli, D. Rossi, L. Benini, "YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights", arXiv:1606.05487, 2016.
    • L. Cavigelli, D. Gschwend, C. Mayer, S. Willi, B. Muheim, L. Benini, "Origami: A Convolutional Network Accelerator", Proc. ACM/IEEE GLSVLSI'15.
    • F. Conti, L. Benini, "A Ultra-Low-Energy Convolution Engine for Fast Brain-Inspired Vision in Multicore Clusters", Proc. ACM/IEEE DATE'15.
    • Y.-H. Chen, T. Krishna, J. Emer, V. Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks", Proc. IEEE ISSCC'16.
    • C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello, Y. LeCun, "NeuFlow: A Runtime Reconfigurable Dataflow Processor for Vision", Proc. IEEE ECV'11@CVPR'11.
    • V. Gokhale, J. Jin, A. Dundar, B. Martini, E. Culurciello, "A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks", Proc. IEEE CVPRW'14.
    • C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, J. Cong, "Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks", Proc. ACM FPGA'15.
  • M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi, "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks", arXiv:1603.05279, 2016.
  • L. Cavigelli, M. Magno, L. Benini, "Accelerating Real-Time Embedded Scene Labeling with Convolutional Networks", Proc. ACM/IEEE/EDAC DAC'15.

Practical Details

Links

  • The EDA wiki with lots of information on the ETHZ ASIC design flow (internal only)
  • The IIS/DZ coding guidelines

