Deep Learning Projects
From iis-projects
Revision as of 12:22, 7 December 2018
We list a few projects below to give you an idea of what we do. However, we constantly develop new project ideas, and in this rapidly advancing research area some approaches may become obsolete. Please contact the people responsible for the project closest to what you would like to do, and come talk to us.
Prerequisites
We have no strict general requirements, as they depend heavily on the exact project steps. The projects will be adapted to the skills and interests of the student(s) -- just come talk to us! If you don't know about GPU programming or CNNs or ... just let us know, and together we can determine a useful way forward -- after all, you are here not only to learn about project work, but also to develop your technical skills.
Only hard requirements:
- Excitement for deep learning
- For VLSI projects: VLSI 1 or equivalent
Available Projects
Status | Type | Project Name | Description | Platform | Workload Type | First Contact(s) |
---|---|---|---|---|---|---|
available | SA/MA | Stand-Alone Edge Computing with GAP8 | Detailed description: Stand-Alone_Edge_Computing_with_GAP8 | Embedded | SW/HW (PCB-level) | Renzo Andri, Andres Gomez |
available | MA | INQ Accelerator | INQ is a quantization technique that has been shown to work very well for neural networks. The weights are quantized to levels of ±2^n. Since multiplications with powers of two reduce to bit shifts, INQ is well suited for HW acceleration. In this thesis you will design an ASIC that executes INQ-quantized networks. | ASIC | ASIC | Renzo Andri |
available | MA | Distributed/Federated Learning | With the increasing number of IoT devices equipped with many sensors, it is not feasible to always stream all the data back to a server. Instead, the model must be trained on the node itself, and the networks synchronized/merged periodically. | Embedded GPU | SW (algo, evals) | Renzo Andri, Lukas Cavigelli |
available | MA/SA | On-chip Learning | Neural networks are compute- and resource-intensive and are usually run on power-hungry GPU clusters, but we would like to exploit them on ubiquitous IoT devices as well. To get there, we need to develop new hardware architectures optimized for this application. This also includes evaluating new algorithmic approaches that can reduce the compute or memory footprint of these networks. | ASIC | HW (ASIC) | Renzo Andri |
Workload types: SW (GPU), SW (microcontr.), SW (algorithm evals), HW (FPGA), HW (ASIC), HW (PCB)
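As a sketch of the INQ idea from the table above: weights are constrained to signed powers of two, so each multiplication in a MAC collapses into a bit shift. The snippet below is a minimal illustration, not the actual INQ training procedure; the function names and the exponent range are illustrative choices.

```python
import numpy as np

def inq_quantize(w, n_min=-7, n_max=0):
    """Quantize weights to signed powers of two: each entry becomes +-2^n or 0.
    n_min/n_max bound the exponent range (an illustrative choice)."""
    sign = np.sign(w)
    with np.errstate(divide="ignore"):
        n = np.round(np.log2(np.abs(w)))  # nearest power-of-two exponent
    n = np.clip(n, n_min, n_max)
    q = sign * 2.0 ** n
    q[np.abs(w) < 2.0 ** (n_min - 1)] = 0.0  # weights too small map to the zero level
    return q

def shift_mul(x_int, exponent):
    """Multiply an integer activation by 2^exponent with a bit shift,
    as a power-of-two-weight MAC would do in hardware."""
    return x_int << exponent if exponent >= 0 else x_int >> -exponent

w = np.array([0.26, -0.9, 0.05])
print(inq_quantize(w))  # each entry is now +-2^n or 0
```

Note that the shift replaces a full multiplier: `shift_mul(5, 3)` computes 5 * 2^3 = 40 with a single left shift, which is what makes INQ attractive for an ASIC datapath.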
On-Going Projects
Status | Type | Project Name | Description | Platform | Workload Type | First Contact(s) |
---|---|---|---|---|---|---|
taken | SA | SAR Data Analysis | We would like to explore the automated analysis of aerial synthetic aperture radar (SAR) images. Essentially, we have one very high-resolution image of a Swiss city and no labels. This project is not about labeling a lot of data, but about exploring various options for supervised (cf. paper) or semi-/unsupervised learning to segment these images using very little labeled data. | Workstation | SW (algo evals) | Xiaying Wang, Lukas Cavigelli, Michele Magno |
taken | MA/2x SA | DNN Training Accelerator | The compute effort to train state-of-the-art CNNs is tremendous and largely spent on GPUs, or less frequently on specialized HW (e.g. Google's TPUs). Their energy efficiency and often their performance are limited by DRAM accesses. When storing all the data required for the gradient descent step of typical DNNs, there is no way to keep it in on-chip SRAM--even across multiple, very large chips. Recently, invertible ResNets have been presented (cf. paper), which allow trading these storage requirements for some additional compute effort--a huge opportunity. In this project, you will perform an architecture exploration to analyze how this could best be exploited. | ASIC | HW (ASIC) | Lukas Cavigelli |
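The invertible-ResNet idea behind the DNN Training Accelerator project can be sketched with an additive coupling: because the block's inputs can be reconstructed exactly from its outputs, activations need not be stored for the backward pass and can be recomputed instead. A toy numpy sketch, where the branch functions f and g are arbitrary stand-ins for the residual sub-networks:

```python
import numpy as np

def f(z):
    return np.tanh(z)      # stand-in for one residual branch (e.g. a conv block)

def g(z):
    return 0.5 * z ** 2    # invertibility comes from the coupling, not from f or g

def forward(x1, x2):
    # Split-channel coupling: each half is updated using only the other half.
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def inverse(y1, y2):
    # Recover the inputs exactly from the outputs -- no stored activations needed.
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

x1, x2 = np.random.randn(4), np.random.randn(4)
r1, r2 = inverse(*forward(x1, x2))
assert np.allclose(x1, r1) and np.allclose(x2, r2)
```

In a training accelerator this trades one extra evaluation of f and g per block for the DRAM traffic of storing every intermediate activation, which is the trade-off the project proposes to explore.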
Completed Projects
Status | Type | Project Name | Description | Platform | Workload Type | First Contact(s) |
---|---|---|---|---|---|---|
completed FS18 | SA | CBinfer for Speech Recognition | We have recently published an approach to dramatically reduce computation effort when performing object detection on video streams with limited frame-to-frame changes (cf. paper). We think this approach could also be applied to audio signals for continuous listening to voice commands: when looking at MFCCs or the short-term Fourier transform, changes in the spectrum between neighboring time windows are also limited. | Embedded GPU (Tegra X2) | SW (GPU, algo evals) | Lukas Cavigelli |
completed HS18 | MA | One-shot/Few-shot Learning | One-shot learning comes in handy whenever it is not possible to collect a large dataset. Consider for example face identification as a means of opening your apartment's door, where the user provides a single picture (not 100s) and is recognized reliably from then on. In this project you would apply a method called Prototypical Networks (cf. [paper, code]) to learn to identify faces. Once you have trained such a DNN, you will optimize it for an embedded system to run it in real time. For a master thesis, an interesting additional step could be to look at expanding this further to share information between multiple nodes/cameras and learn to re-identify faces as they evolve over time. | Embedded GPU or Microcontroller | SW (algo, uC) | Lukas Cavigelli, Renzo Andri |
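The classification rule of Prototypical Networks, used in the one-shot learning project above, is simple once the embedding network is trained: each class is represented by the mean embedding of its support examples, and a query is assigned to the nearest prototype. A minimal numpy sketch of just this rule (the embeddings here are hand-picked 2-D points standing in for the output of a trained DNN):

```python
import numpy as np

def prototypes(support_emb, support_lbl, n_classes):
    """Class prototype = mean embedding of that class's support examples."""
    return np.stack([support_emb[support_lbl == c].mean(axis=0)
                     for c in range(n_classes)])

def classify(query_emb, protos):
    """Assign each query to the nearest prototype (squared Euclidean distance)."""
    d = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)

# Toy 2-class, 1-shot example: one support embedding per identity.
support = np.array([[0.0, 0.0], [4.0, 4.0]])
labels = np.array([0, 1])
protos = prototypes(support, labels, 2)
print(classify(np.array([[0.5, 0.2], [3.8, 4.1]]), protos))  # [0 1]
```

Enrolling a new face then only requires computing one more prototype from a single picture, with no retraining -- which is what makes the method attractive for the door-opening scenario.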
Where to find us
Renzo Andri, ETZ J 76.2, andrire@iis.ee.ethz.ch
Lukas Cavigelli, ETZ J 76.2, cavigelli@iis.ee.ethz.ch