
Difference between pages "Design and Implementation of a Convolutional Neural Network Accelerator ASIC" and "PULP"

From iis-projects

(Difference between pages)
(Step-by-Step Workflow)
 
(Related Chips)
 
Line 1: Line 1:
<!--[[File:Pedestrians.png|500px|thumb]]--->
+
__NOTOC__
 +
==PULP - an Open-Source Parallel Ultra-Low-Power Processing-Platform==
  
==Short Description==
+
This is a joint project between the [http://www.iis.ee.ethz.ch Integrated Systems Laboratory (IIS)] of ETH Zurich and the [http://www.dei.unibo.it/en/research/research-facilities/Labs/eess-energy-efficient-embedded-systems Energy-efficient Embedded Systems] (EEES) group of UNIBO to develop an open-source, scalable hardware and software research platform with the goal of breaking the pJ/op barrier within a power envelope of a few mW.
Imaging sensor networks, UAVs, smartphones, and other embedded computer vision systems require power-efficient, low-cost, and high-speed implementations of synthetic vision systems capable of recognizing and classifying objects in a scene. Many popular algorithms in this area require the evaluation of multiple layers of filter banks. Almost all state-of-the-art synthetic vision systems are based on features extracted using multi-layer convolutional networks (ConvNets). When evaluating ConvNets, most of the time (80% to 90%) is spent performing the convolutions. This work focuses on building an accelerator that performs this step faster and more power-efficiently.
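To make the dominant cost concrete, the following is a minimal reference sketch of a direct 2D convolution over the "valid" region, together with a count of the multiply-accumulate (MAC) operations it needs. The function names and the list-of-rows data layout are assumptions made for illustration; this is a software model, not the accelerator architecture of this project.

```python
def conv2d_valid(image, kernel):
    """Direct 2D convolution over the 'valid' region (no padding).

    image and kernel are lists of rows. The kernel is flipped, i.e. this is
    a true convolution; many ConvNet frameworks use correlation instead.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0
            # One multiply-accumulate per kernel tap and output pixel.
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * kernel[kh - 1 - i][kw - 1 - j]
            out[y][x] = acc
    return out


def mac_count(ih, iw, kh, kw):
    """Number of MACs for one valid 2D convolution of the given sizes."""
    return (ih - kh + 1) * (iw - kw + 1) * kh * kw
```

For example, conv2d_valid([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, -1]]) yields [[4, 4], [4, 4]], and mac_count(3, 3, 2, 2) is 16. The MAC count grows with the product of output size and kernel size, which is why this inner loop is the natural target for acceleration.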
 
  
<!--
+
[mailto:lbenini@iis.ee.ethz.ch Inquiries] from interested partners are welcome.
More and more video surveillance data is being collected for real-time surveillance and storage. Privacy is a real issue, posing a legal obstacle when public places are being monitored: real-time surveillance is not allowed in such cases, and stored data can (even for internal use) only be accessed with a court order.
 
  
With the use of privacy enhancement techniques, this is different. Currently, such systems are often based on simple motion detection, blurring everything that has moved recently. This can be an option if there is little activity and only very low detail is needed. However, when monitoring a crowded area the results are useless, and important details such as a person's movements are completely hidden.
+
''....more to follow.... stay tuned!''
  
This project aims to overcome this limitation by using deep learning techniques to detect pedestrians/persons and using temporal/motion information to improve the delineation of moving objects. This way the pedestrians can be overpainted, blurred, or overlaid with motion-based information, protecting their privacy while providing better information to security personnel.
+
===Related Available Projects===
--->
+
<DynamicPageList>
===Status: In Progress ===
+
category = PULP
: David Gschwend, Christoph Mayer, Samuel Willi
+
category = Available
: Supervision: [[:User:Lukasc | Lukas Cavigelli]], [[:User:muheim| Beat Muheim]]
+
</DynamicPageList>
: Date: Fall Semester 2014 (sem14h17, sem14h18, sem14h19)
+
=== Related Chips ===
 +
* [http://asic.ethz.ch/2013/Pulp.html Pulp v1] The first version of the PULP platform realized in 28nm FDSOI technology with 4 parallel cores.
 +
* [http://asic.ethz.ch/2013/Or10n.html Or10n] An optimized implementation of the OpenRISC processor developed to be used within PULP.
 +
* [http://asic.ethz.ch/2013/Sir10us.html Sir10us] A cryptographic application that uses the Or10n processor developed for PULP.
 +
* [http://asic.ethz.ch/2014/Artemis.html Artemis] 4 core PULP system including FPU.
 +
* [http://asic.ethz.ch/2014/Hecate.html Hecate] 4-core PULP system with 2 shared FPUs.
 +
* [http://asic.ethz.ch/2014/Selene.html Selene] 4-core PULP system with 1 shared FPU using a logarithmic number system.
  
===Prerequisites===
+
===Links===
: Knowledge of Matlab
+
* [http://www-micrel.deis.unibo.it/sitonew/links/index.html PULP page in University of Bologna]
: Interest in video processing and VLSI design
 
: VLSI 1 and enrolment in VLSI 2 are required
 
: At least one student has to test the chip as part of the VLSI 3 lecture if the ASIC is to be manufactured.
 
<!--
 
===Status: Completed ===
 
: Fall Semester 2014 (sem13h2)
 
: Matthias Baer, Renzo Andri
 
--->
 
<!--
 
===Status: In Progress ===
 
: Student A, StudentB
 
: Supervision: [[:User:Mluisier | Mathieu Luisier]]
 
--->
 
  
===Character===
+
[[Category:PULP]]
: 10% Theory / Literature Research
 
: 60% VLSI Architecture, Implementation & Verification
 
: 30% VLSI back-end Design
 
 
 
===Professor===
 
: [http://www.iis.ee.ethz.ch/portrait/staff/lbenini.en.html Luca Benini]
 
[[#top|↑ top]]
 
 
 
==Detailed Task Description==
 
 
 
===Goals===
 
: Explore various architectures for the 2D convolutions used in convolutional networks, taking the constraints of an ASIC design into account, and perform fixed-point analyses for the most viable architecture(s).
 
: Get to know the ASIC design flow from specification through architecture exploration to implementation, functional verification, back-end design and silicon testing.
 
 
 
===Step-by-Step Workflow===
 
# Do some first project planning. Create a time schedule and set some milestones based on what you have learned as part of the VLSI lectures.
 
# Get to understand the basic concepts of convolutional networks.
 
# Catch up on relevant previous work, in particular the papers we give to you.
 
# Become aware of the possibilities and limitations of the technology used; make some very rough estimates of area and timing. Also consider setting some target specifications for your chip.
 
# Come up with and evaluate/discuss several possible architectures (architecture exploration), and implement the datapath/most resource-relevant parts to get a first impression of the most promising architecture(s). Also give some first thoughts to testability.
 
# Run detailed fixed-point analyses to determine the signal width in all parts of the data path.
 
# Create high-quality, synthesizable VHDL code for your circuit. Please respect the lab's coding guidelines and continuously verify proper functionality of the individual parts of your design.
 
# Implement the necessary configuration interface, ...
 
# Perform thorough functional verification. This is very important.
 
# Take your final implementation through the backend design process.
 
# Write a project report. Include all major decisions taken during the design process and justify your choices. Include everything that deviates from the standard case: show off everything that took time to figure out and all your ideas that have influenced the project.
 
 
 
Be aware that these steps cannot always be performed one after the other and often need some initial guesses followed by several iterations.
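The fixed-point analysis step in the workflow above can be prototyped in software before any VHDL is written. The following is a minimal sketch, assuming signed two's-complement values with round-to-nearest and saturation; the function names, the rounding mode, and the error metric are illustrative assumptions, not the lab's prescribed method.

```python
def quantize(value, frac_bits, total_bits):
    """Convert value to a signed fixed-point integer code with saturation."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    code = int(round(value * scale))  # round to nearest representable step
    return max(lo, min(hi, code))     # saturate instead of wrapping around


def dequantize(code, frac_bits):
    """Convert a fixed-point integer code back to a real value."""
    return code / (1 << frac_bits)


def max_abs_error(samples, frac_bits, total_bits):
    """Worst-case quantization error over a set of representative samples."""
    return max(
        abs(s - dequantize(quantize(s, frac_bits, total_bits), frac_bits))
        for s in samples
    )


def min_frac_bits(samples, int_bits, tolerance, max_frac_bits=32):
    """Smallest fractional width whose worst-case error meets the tolerance.

    int_bits includes the sign bit. Returns None if no width up to
    max_frac_bits is sufficient.
    """
    for frac_bits in range(max_frac_bits + 1):
        if max_abs_error(samples, frac_bits, int_bits + frac_bits) <= tolerance:
            return frac_bits
    return None
```

In practice the samples would come from a floating-point software model of the network run on representative input data, and the sweep would be repeated for each signal in the data path, since each one can tolerate a different width.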
 
 
 
===Meetings & Presentations===
 
The students and advisor(s) agree on weekly meetings to discuss all relevant decisions and decide on how to proceed.
 
 
 
Around the middle of the project there is a design review, where senior members of the institute review your work (bring all the relevant information, such as preliminary specifications, block diagrams, synthesis reports, testing strategy, ...) to make sure everything is on track and to decide whether further support is necessary. They also make the final decision on whether the chip is actually manufactured (no reason to worry if the project is on track) and whether more chip area, a different package, ... is provided.
 
 
 
At the end of the project, you have to present/defend your work during a 15 min. presentation and 5 min. of discussion as part of the IIS colloquium.
 
 
 
===Deliverables===
 
* description of the most promising architectures, and argumentation on the decision taken (as part of the report)
 
* synthesizable, verified VHDL code
 
* generated test vector files
 
* synthesis scripts & relevant software models developed for verification
 
* GDS II data & bonding diagram
 
* datasheet (part of report)
 
* project report
 
 
 
===Literature===
 
NeuFlow [http://www.neuflow.org/] in general, and in particular:
 
 
 
* C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello and Y. LeCun, "NeuFlow: A Runtime Reconfigurable Dataflow Processor for Vision", Proc. IEEE ECV'11@CVPR'11 [http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=5981829]
 
 
 
* Vinayak Gokhale, Jonghoon Jin, Aysegul Dundar, Berin Martini and Eugenio Culurciello, "A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks", Proc. IEEE CVPRW'14 [http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6910056]
 
 
 
and a not-yet-published paper of our group.
 
<!-- F. Conti, L. Benini, "A Ultra-Low-Energy Convolution Engine for Fast Brain-Inspired Vision in Multicore Clusters", submitted to IEEE DAC'15, under review. -->
 
 
 
===Practical Details===
 
* '''[[Project Plan]]'''
 
* '''[[Project Meetings]]'''
 
* '''[[Design Review]]'''
 
* '''[[Coding Guidelines]]'''
 
* '''[[Final Report]]'''
 
* '''[[Final Presentation]]'''
 
 
 
==Results==
 
 
 
==Links==
 
* The EDA wiki with lots of information on the ETHZ ASIC design flow (internal only) [http://eda.ee.ethz.ch/]
 
* The IIS/DZ coding guidelines [http://www.dz.ee.ethz.ch/en/information/hdl-help/vhdl-naming-conventions.html]
 
 
 
 
 
[[#top|↑ top]]
 
 
 
[[Category:Digital]]
 
[[Category:In progress]]
 
[[Category:Semester Thesis]]
 
 
 
 

Revision as of 14:47, 14 January 2015

PULP - an Open-Source Parallel Ultra-Low-Power Processing-Platform

This is a joint project between the Integrated Systems Laboratory (IIS) of ETH Zurich and the Energy-efficient Embedded Systems (EEES) group of UNIBO to develop an open-source, scalable hardware and software research platform with the goal of breaking the pJ/op barrier within a power envelope of a few mW.
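The relation between energy per operation and power budget is simple arithmetic: sustained throughput equals power divided by energy per operation. The sketch below illustrates this; the function name and the example values of 1 mW and 5 mW are assumptions chosen to illustrate "a few mW", not measured figures for any PULP chip.

```python
def ops_per_second(power_mw, energy_pj_per_op):
    """Sustained operations per second for a power budget and energy per op.

    power_mw is in milliwatts and energy_pj_per_op in picojoules.
    """
    if energy_pj_per_op <= 0:
        raise ValueError("energy per operation must be positive")
    # 1 mW = 1e-3 J/s and 1 pJ = 1e-12 J, so the ratio scales by 1e9.
    return power_mw / energy_pj_per_op * 1e9


# Illustrative values only: at exactly 1 pJ/op, each mW sustains 1 Gop/s.
one_mw = ops_per_second(1, 1)
five_mw = ops_per_second(5, 1)
```

Under these assumptions, a 1 mW budget at 1 pJ/op sustains about 1e9 operations per second, and 5 mW about 5e9. Getting below 1 pJ/op raises the throughput available within the same power envelope.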

Inquiries from interested partners are welcome.

....more to follow.... stay tuned!

Related Available Projects

Related Chips

  • Pulp v1 The first version of the PULP platform realized in 28nm FDSOI technology with 4 parallel cores.
  • Or10n An optimized implementation of the OpenRISC processor developed to be used within PULP.
  • Sir10us A cryptographic application that uses the Or10n processor developed for PULP.
  • Artemis 4 core PULP system including FPU.
  • Hecate 4-core PULP system with 2 shared FPUs.
  • Selene 4-core PULP system with 1 shared FPU using a logarithmic number system.

Links