Rethinking our Convolutional Network Accelerator Architecture
Short Description
Imaging sensor networks, UAVs, smartphones, and other embedded computer vision systems require power-efficient, low-cost, and high-speed implementations of synthetic vision systems capable of recognizing and classifying objects in a scene. Many popular algorithms in this area require evaluating multiple layers of filter banks, and almost all state-of-the-art synthetic vision systems are based on features extracted using multi-layer convolutional networks (ConvNets). When evaluating ConvNets, 80% to 90% of the time is spent performing the convolutions. We have built a successful accelerator for these convolutions, Origami. Nevertheless, it has limitations: its filter sizes are fixed at synthesis time, and there is room for improvement in energy efficiency. We are looking for your creativity to rethink the architecture, completely or incrementally, to make it more versatile and energy-efficient.
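To make the workload concrete, the following is a minimal software sketch of one filter-bank layer evaluated by direct convolution, together with a count of the multiply-accumulate (MAC) operations it performs. It is an illustration only and does not describe Origami's datapath; the function name `conv_layer`, the data layout, and the toy sizes are assumptions made for this example. As is common for ConvNets, the kernel is not flipped, so strictly speaking this computes a cross-correlation.

```python
def conv_layer(inputs, filters):
    """Evaluate one filter-bank layer by direct 2D convolution (no padding).

    inputs:  list of input channels, each an H x W list of lists
    filters: filters[o][i] is the k x k kernel linking input channel i
             to output channel o (hypothetical layout for this sketch)
    Returns (outputs, macs), where macs counts multiply-accumulates.
    """
    n_in = len(inputs)
    height, width = len(inputs[0]), len(inputs[0][0])
    k = len(filters[0][0])
    out_h, out_w = height - k + 1, width - k + 1

    outputs, macs = [], 0
    for kernels in filters:  # one iteration per output channel
        out = [[0.0] * out_w for _ in range(out_h)]
        for i in range(n_in):  # sum contributions of all input channels
            for y in range(out_h):
                for x in range(out_w):
                    acc = 0.0
                    for dy in range(k):
                        for dx in range(k):
                            acc += inputs[i][y + dy][x + dx] * kernels[i][dy][dx]
                            macs += 1
                    out[y][x] += acc
        outputs.append(out)
    return outputs, macs


# Toy example: one 8x8 input channel, two output channels, 3x3 kernels.
image = [[[1.0] * 8 for _ in range(8)]]
bank = [[[[1.0] * 3 for _ in range(3)]] for _ in range(2)]
outputs, macs = conv_layer(image, bank)
print(len(outputs), len(outputs[0]), len(outputs[0][0]), macs)
```

The MAC count grows as (output channels) x (input channels) x (output pixels) x k², so the kernel size k directly shapes the inner loops. In software k is just a runtime value; in a hardware accelerator a fixed k can be baked into the datapath, which is one way to read the "filter sizes fixed at synthesis time" limitation mentioned above.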
Status: Not Available
- 1-2 Master thesis or 1-3 semester project students
- Supervision: Lukas Cavigelli
Prerequisites
- Interest in VLSI architecture exploration and computer vision
- VLSI 1 or equivalent
Character
- 15% Theory / Literature Research
- 85% VLSI Architecture, Implementation & Verification
Professor
Detailed Task Description
A detailed task description will be worked out right before the project, taking the student's interests and capabilities into account.
Literature
- Hardware Acceleration of Convolutional Networks:
- Lukas Cavigelli, David Gschwend, Christoph Mayer, Samuel Willi, Beat Muheim, Luca Benini, "Origami: A Convolutional Network Accelerator", Proc. ACM/IEEE GLS-VLSI'15 [1] [2]
- F. Conti, L. Benini, "A Ultra-Low-Energy Convolution Engine for Fast Brain-Inspired Vision in Multicore Clusters", Proc. ACM/IEEE DATE'15 [3]
- Yu-Hsin Chen, Tushar Krishna, Joel Emer, Vivienne Sze, "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks", Proc. ISSCC'16
- C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello and Y. LeCun, "NeuFlow: A Runtime Reconfigurable Dataflow Processor for Vision", Proc. IEEE ECV'11@CVPR'11 [4]
- V. Gokhale, J. Jin, A. Dundar, B. Martini and E. Culurciello, "A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks", Proc. IEEE CVPRW'14 [5]
- Chen Zhang, Peng Li, Guangyu Sun, Yijin Guan, Bingjun Xiao, Jason Cong, "Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks", Proc. FPGA'15 [6]
- L. Cavigelli, M. Magno, L. Benini, "Accelerating Real-Time Embedded Scene Labeling with Convolutional Networks", Proc. ACM/IEEE/EDAC DAC'15 [7]