Convolutional Neural Networks on our Ultra-Low Power Multi-Core Platform PULP



Description

Neural networks are achieving record-breaking results in all common machine learning tasks and have attracted considerable attention from the machine learning community and well-known companies like Google, Facebook, and Microsoft. Until now, convolutional neural networks have been deployed on power-hungry devices like server clusters or highly parallel GPUs, as their number of parameters and computational complexity were too high for devices with a tight energy budget.

Recently published algorithms propose binarizing the weights, which reduces the memory needed for weight storage by a factor of 32 and lowers the arithmetic complexity, as no multiply-accumulate (MAC) operations are needed anymore [1-3]. Rastegari et al. go even further with XNOR networks and also binarize the feature maps, such that the MAC operations are replaced by XNOR operations [2]. Even though the information capacity of the network is reduced, the accuracy was shown to drop by only about 10% or less (e.g. 75% instead of 85% top-1 accuracy in the sound recognition CNN).
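
To illustrate the core operation, the following minimal sketch (not code from any existing implementation; names and packing are assumptions) shows how the binary dot product works: activations and weights are packed 32 per machine word, with bit value 1 encoding +1 and bit value 0 encoding -1, so one XNOR plus a population count replaces 32 multiply-accumulates.

  #include <stdint.h>

  /* Binary dot product in the style of XNOR networks [2]:
   * bit 1 encodes +1, bit 0 encodes -1; one XNOR + popcount
   * replaces 32 MAC operations. */
  static inline int binary_dot(const uint32_t *a, const uint32_t *w, int n_words)
  {
      int matches = 0;
      for (int i = 0; i < n_words; i++) {
          /* XNOR marks the bit positions where activation and weight agree. */
          matches += __builtin_popcount(~(a[i] ^ w[i]));
      }
      /* Sum over {-1,+1}: matching positions count +1, the rest -1. */
      return 2 * matches - 32 * n_words;
  }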

In a previous thesis, an XNOR network for sound recognition was efficiently implemented on the STM32F469NI board, which features an ST microcontroller with an ARM Cortex-M4F core.

The goal of this thesis is to port the network to PULP. To exploit the parallelism inherent in neural networks, PULP supports OpenMP; a minimal sketch of how a layer can be parallelized is shown below. Furthermore, for fast development you will develop the software on the virtual platform of PULP. The virtual platform is a cycle-accurate simulator of PULP in which all parameters (e.g. memory sizes) can be freely chosen, which helps to speed up application development on the newest PULP architecture, even before it has been taped out in silicon.
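
As a minimal sketch of the parallelization approach (the dimensions and function names are illustrative assumptions, not an existing PULP API), the following code distributes the output channels of one binary layer across the cluster cores with an OpenMP parallel for, reusing the binary_dot() kernel sketched above:

  #define N_OUT    64   /* output channels (illustrative)  */
  #define WORDS_IN 128  /* packed input words per channel  */

  void binary_layer(const uint32_t a[WORDS_IN],
                    const uint32_t w[N_OUT][WORDS_IN],
                    int32_t out[N_OUT])
  {
      /* Output channels are independent, so each core can compute its
       * share without synchronization; OpenMP splits the loop iterations
       * across the available cores. */
      #pragma omp parallel for
      for (int o = 0; o < N_OUT; o++)
          out[o] = binary_dot(a, w[o], WORDS_IN);
  }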


Status: Available

Looking for 1-2 semester students
Supervision: Renzo Andri

Prerequisites

  • Familiarity with C/C++ programming.
  • Knowledge of OpenMP or GPU programming would be an asset.
  • Prior knowledge of neural networks is not required for this project.

Character

30% Theory
50% C/C++ and OpenMP programming
20% Experimental evaluation and documentation

Professor

Luca Benini


Detailed Task Description

Meetings & Presentations

The student(s) and advisor(s) agree on weekly meetings to discuss all relevant decisions and decide on how to proceed. Of course, additional meetings can be organized to address urgent issues.

Literature

[1] Courbariaux, Matthieu, Yoshua Bengio, and Jean-Pierre David. "BinaryConnect: Training deep neural networks with binary weights during propagations." Advances in Neural Information Processing Systems. 2015.

[2] Rastegari, Mohammad, et al. "XNOR-Net: ImageNet classification using binary convolutional neural networks." European Conference on Computer Vision. Springer International Publishing, 2016.

[3] Andri, Renzo, Lukas Cavigelli, Davide Rossi, and Luca Benini. "YodaNN: An ultra-low power convolutional neural network accelerator based on binary weights." IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pp. 236-241. IEEE, 2016.
