Improving Scene Labeling with Hyperspectral Data
[[File:Hyperspectral.jpg|300px|thumb]]
[[File:x1-adas.jpg|250px|thumb]]
[[File:Labeled-scene.png|200px|thumb]]
==Description==
Hyperspectral imaging differs from normal RGB imaging in that it does not capture the amount of light within three spectral bins (red, green, blue), but within many more (e.g. 16 or 25), and not necessarily in the visible range of the spectrum. This allows the camera to capture more information than humans can process visually with their eyes, and thus opens up very interesting opportunities to outperform even the best-trained humans; in fact, you can see it as a step towards spectroscopic analysis of materials.
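As a small illustrative example of what the extra bands buy you: a simple combination of two bands can already separate materials that look identical in RGB. The classic normalized difference vegetation index (NDVI) compares a near-infrared and a red band; the reflectance values in the test below are made up for illustration.

```python
# Toy example of a spectral index: the normalized difference vegetation
# index NDVI = (NIR - red) / (NIR + red). Healthy vegetation reflects
# strongly in the near infrared, so it scores high; materials like
# asphalt, which look similarly dark in RGB, do not.

def ndvi(nir, red):
    """Compute NDVI from near-infrared and red reflectance values."""
    return (nir - red) / (nir + red)
```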
  
Recently, a novel hyperspectral imaging sensor has been presented [[https://vimeo.com/73617050 video], [http://www2.imec.be/content/user/File/Brochures/2015/IMEC%20HYPERSPECTRAL%20SNAPSHOT%20MOSAIC%20IMAGER%2020150421.pdf pdf]] and adopted in the first industrial computer vision cameras [[http://www.ximea.com/files/brochures/xiSpec-Hyperspectral-cameras-2015-brochure.pdf pdf], [http://www.ximea.com/en/usb3-vision-camera/hyperspectral-usb3-cameras link]]. These new cameras weigh only 31 grams without the lens, as opposed to the old cameras, which used complex optics with beam splitters, could not provide a large number of channels, and were very heavy, extremely expensive and not mobile [[http://www.helimetrex.com.au/tetra.html link], [http://www.fluxdata.com/multispectral-cameras link]].
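The sensor referenced above uses a snapshot-mosaic layout: a small filter pattern repeats across the pixel array, so a single raw frame interleaves all spectral bands spatially. A minimal sketch of rearranging such a frame into a per-band spectral cube, assuming a 4×4 (16-band) repeating pattern; the exact layout of a real sensor may differ and would come from the vendor's pixel-format description.

```python
# Sketch: extracting a 16-band spectral cube from a snapshot-mosaic frame.
# Assumes a 4x4 repeating filter pattern (16 bands), so the band of a
# pixel is band = (row % 4) * 4 + (col % 4). Illustrative only.

def demosaic(frame):
    """frame: 2D list (H x W raw sensor values), H and W multiples of 4.
    Returns a list of 16 band images, each of size (H/4) x (W/4)."""
    h, w = len(frame), len(frame[0])
    bands = [[[0] * (w // 4) for _ in range(h // 4)] for _ in range(16)]
    for r in range(h):
        for c in range(w):
            b = (r % 4) * 4 + (c % 4)
            bands[b][r // 4][c // 4] = frame[r][c]
    return bands
```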
We have acquired such a camera and would like to explore its use for image understanding/scene labeling/semantic segmentation (see the labeled image). Your task would be to evaluate this camera and integrate it into a working scene labeling system [[http://dl.acm.org/citation.cfm?id=2744788 paper]]; the work would be very diverse:
* create a software interface to read the imaging data from the camera
* collect some images to build a dataset for evaluation (fused together with data from a high-res RGB camera)
* adapt the convolutional network we use for scene labeling to profit from the new data (don't worry, we will help you :) )
* create a system from the individual parts (build a case/box mounting the cameras, dev board, WiFi module, ...) and do some programming to make it all work together smoothly and efficiently
* cross your fingers, hoping that we will outperform all the existing approaches to scene labeling in urban areas
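As a rough illustration of the fusion step mentioned above: the hyperspectral bands have a lower spatial resolution than the high-res RGB frames, so one simple approach (an assumption for this sketch, not the prescribed method) is to upsample them to the RGB grid and stack all channels into one multi-channel input for the network. All names here are illustrative.

```python
# Illustrative sketch of fusing low-res hyperspectral bands with a high-res
# RGB image into one multi-channel input (channels x height x width).
# Nearest-neighbour upsampling is just the simplest choice.

def upsample_nn(band, factor):
    """Upsample a 2D list by an integer factor using nearest neighbour."""
    return [[band[r // factor][c // factor]
             for c in range(len(band[0]) * factor)]
            for r in range(len(band) * factor)]

def fuse(rgb_channels, hs_bands, factor):
    """Stack RGB channels and upsampled hyperspectral bands channel-wise."""
    return list(rgb_channels) + [upsample_nn(b, factor) for b in hs_bands]
```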
  
===Status: Completed===
: Dominic Bernath
 
: Supervision: [[:User:Lukasc | Lukas Cavigelli]]
: Date: Autumn Semester 2015
[[Category:Software]] [[Category:System]] [[Category:Completed]] [[Category:Semester Thesis]] [[Category:2015]]
  
 
===Prerequisites===
* Knowledge of C/C++
* Interest in computer vision and system engineering
 
  
 
===Character===
: 10% Literature Research
: 40% Programming
: 20% Collecting Data
: 30% System Integration
  
 
===Professor===
: Luca Benini
  
 
==Detailed Task Description==
 
 
  
 
===Meetings & Presentations===
The students and advisor(s) agree on weekly meetings to discuss all relevant decisions and decide on how to proceed. Of course, additional meetings can be organized to address urgent issues.

At the end of the project, you have to present/defend your work during a 15 min. or 25 min. presentation and 5 min. of discussion as part of the IIS colloquium (as required for any semester or master thesis at D-ITET).
 
  
  
 
===Practical Details===
* '''[[Project Plan]]'''
* '''[[Project Meetings]]'''
* '''[[Final Report]]'''
* '''[[Final Presentation]]'''
 
  
 
[[#top|↑ top]]

Latest revision as of 11:29, 5 February 2016