Deep Learning Projects
What is Deep Learning?

Nowadays, machine learning systems are the go-to choice when the cost of analytically deriving closed-form expressions to solve a given problem is prohibitive (e.g., doing so would be very time-consuming, or the knowledge about the problem is insufficient). Machine learning systems can be particularly effective when the amount of data is large, since the statistics become more and more stable as the amount of data increases. Amongst machine learning systems, deep neural networks (DNNs) have established a reputation for their effectiveness and simplicity. To understand this success as compared to that of other machine learning systems, it is important to consider not only the accuracy of DNNs, but also their computational properties. The training algorithm (an iterative application of backpropagation and stochastic gradient descent) is linear in the dataset size, making it more appealing in big-data contexts than, for instance, support vector machines (SVMs). DNNs do not use branching instructions, making them predictable programs and allowing efficient access patterns to be designed for the memory hierarchies of the computing devices (exploiting spatial and temporal locality). DNNs are parallelizable, both at the neuron level and at the layer level. These predictability and parallelizability properties make DNNs an ideal fit for modern SIMD architectures and distributed computing systems.
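As a concrete illustration, here is a minimal sketch of the BP+SGD scheme in PyTorch (the model, data, and hyperparameters are toy placeholders): each iteration touches one mini-batch, so the cost of a training pass grows linearly with the dataset size.

  # Minimal sketch of the BP+SGD training scheme (toy model and random data).
  import torch
  import torch.nn as nn

  model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
  optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
  loss_fn = nn.CrossEntropyLoss()

  dataset = torch.utils.data.TensorDataset(
      torch.randn(256, 16), torch.randint(0, 4, (256,)))
  loader = torch.utils.data.DataLoader(dataset, batch_size=32)

  for epoch in range(3):
      for x, y in loader:              # one pass over the data: O(dataset size)
          optimizer.zero_grad()
          loss = loss_fn(model(x), y)  # forward pass
          loss.backward()              # backpropagation
          optimizer.step()             # stochastic gradient descent update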


The main drawback of these systems is their size: millions or even billions of parameters are a common feature of many top-performing DNNs, and a proportional amount of arithmetic operations must be performed to process each data sample. Hence, to reduce the pressure of DNNs on the underlying computing infrastructure, research in computational deep learning has focused on two families of optimizations: topological and hardware-oriented. Topological optimizations are concerned with network topologies (AKA network architectures) that are more efficient in terms of accuracy-per-parameter or accuracy-per-MAC (multiply-accumulate operation). As a specific form of topological optimization, pruning strategies aim at maximizing the number of zero-valued operands (parameters and/or activations) in order to 1) take advantage of sparsity when storing the model and 2) minimize the number of effective arithmetic operations (i.e., the operations that do not involve zero-valued operands and must therefore actually be executed). Hardware-oriented optimizations are instead concerned with replacing time-consuming and energy-hungry operations, such as evaluations of transcendental functions or floating-point MAC operations, with more efficient counterparts, such as piecewise linear activation functions (e.g., the ReLU) and integer MAC operations (as in quantized neural networks, QNNs).
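To make the pruning idea concrete, the following sketch (with an illustrative magnitude threshold) shows how zeroed weights translate into both storage sparsity and fewer effective MACs:

  # Sketch: magnitude pruning of a weight matrix and the resulting MAC savings.
  import torch

  w = torch.randn(64, 64)           # dense layer weights
  mask = w.abs() > 0.5              # keep only "large" weights (illustrative threshold)
  w_pruned = w * mask

  total_macs = w.numel()            # one MAC per weight for a single input vector
  effective_macs = int(mask.sum())  # only non-zero weights require an actual MAC
  print(f"sparsity: {1 - effective_macs / total_macs:.1%}, "
        f"effective MACs: {effective_macs}/{total_macs}")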


Foundation models and LLMs for Health

Foundation Models and Large Language Models (LLMs) are gaining significant traction in artificial intelligence, particularly due to their potential applications in the health sector. This project is dedicated to developing sophisticated methodologies for applying foundation models and LLMs to health-related applications, specifically the analysis of electroencephalogram (EEG) brain signals.

In healthcare and biomedical research, advanced computational models, notably Foundation Models and Large Language Models (LLMs), are revolutionizing the understanding and interpretation of intricate biosignals. We stand at the vanguard of this change, exploring the capabilities of these models for the analysis and interpretation of critical biosignals, including electroencephalograms (EEG) and electrocardiograms (ECG).

Foundation Models, encompassing a spectrum of robust, pre-trained models, are transforming our ability to process and interpret large datasets. Initially trained on extensive and diverse datasets, these models are adaptable for specific tasks, offering remarkable accuracy and efficiency. This adaptability renders them particularly beneficial for biosignal analysis, where the intricacies of EEG and ECG data demand both precision and contextual understanding.

As a subset of Foundation Models, LLMs have demonstrated efficacy in processing and generating human language. At IIS, we are pioneering the application of LLMs in the domain of biosignal interpretation, extending beyond textual data. This entails training the models to interpret the 'language' of biosignals, translating complex patterns into actionable insights.
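One plausible way to let a transformer-style model read this 'language' (a sketch under the assumption of a patch-based tokenizer, as used by several recent biosignal models; all shapes are illustrative) is to cut the raw signal into fixed-length patches and embed each patch as a token:

  # Sketch: turning a raw EEG window into a token sequence for a transformer
  # (hypothetical shapes; real models differ in channel handling and embedding).
  import torch
  import torch.nn as nn

  eeg = torch.randn(1, 4, 1024)    # (batch, channels, samples): a 4-channel EEG window
  patch_len = 64

  # Split each channel into non-overlapping patches of 64 samples each.
  patches = eeg.unfold(dimension=2, size=patch_len, step=patch_len)  # (1, 4, 16, 64)
  tokens = patches.reshape(1, -1, patch_len)                         # (1, 64, 64)

  embed = nn.Linear(patch_len, 128)  # per-patch embedding into the model dimension
  token_embeddings = embed(tokens)   # (1, 64, 128), ready for a transformer encoder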

Our emphasis on EEG and ECG signals is motivated by the profound insights these biosignals provide into human health. EEGs, capturing brain activity, and ECGs, monitoring heart rhythms, are instrumental in diagnosing and managing various health conditions. By leveraging Foundation Models and LLMs, we aim to refine diagnostic accuracy, predict health outcomes, and personalize patient care.

IIS invites Master's students to immerse themselves in this pioneering area. Our projects offer avenues to engage with state-of-the-art technologies, apply them to real-world health challenges, and contribute to shaping a future where healthcare is more predictive, preventive, and personalized. We encourage your participation in this exhilarating endeavor to redefine the confluence of healthcare and technology.

Links

  • BrainGPT: https://braingpt.org/

Contact: Dr. Philipp Mayer (mayerph@iis.ee.ethz.ch, office ETF F108)

Hardware-oriented neural architecture search (NAS)

The problems of topology selection and pruning can be considered instances of the classical statistics problems of model selection and feature selection, respectively. In the scope of deep learning, model selection is also called neural architecture search (NAS). When designing a DNN topology, you have a large number of degrees of freedom at your disposal: the number of layers, the number of neurons in each layer, the connectivity of each neuron, and so on; moreover, the number of choices for each degree of freedom is huge. These properties imply that the design space for a DNN can grow exponentially, making exhaustive searches prohibitive. Therefore, to increase the efficiency of the exploration, stochastic optimization tools are the preferred choice: evolutionary algorithms, reinforcement learning, gradient-based techniques, or even random graph generation. An interesting feature of model selection is that specific constraints can be enforced on the search space so that desired properties are always respected. For instance, given a storage budget describing a hard limitation of the chosen computing platform, the network generation algorithm can be limited to proposing topologies that do not exceed a given number of parameters. This capability of incorporating HW features as constraints on the search space makes NAS algorithms very interesting in the context of generating HW-friendly DNNs.
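As a toy illustration of constraint-driven search, the sketch below (a hypothetical search space; real NAS would use the stochastic optimizers mentioned above) randomly samples MLP topologies and keeps only those within a parameter budget:

  # Sketch: random topology search under a hard parameter budget.
  import random
  import torch.nn as nn

  PARAM_BUDGET = 50_000                      # hard limit of the target platform

  def sample_topology():
      depth = random.randint(1, 4)
      widths = [random.choice([16, 32, 64, 128]) for _ in range(depth)]
      layers, in_features = [], 16
      for w in widths:
          layers += [nn.Linear(in_features, w), nn.ReLU()]
          in_features = w
      layers.append(nn.Linear(in_features, 4))
      return nn.Sequential(*layers)

  candidates = []
  while len(candidates) < 10:
      model = sample_topology()
      n_params = sum(p.numel() for p in model.parameters())
      if n_params <= PARAM_BUDGET:           # enforce the HW constraint on the search space
          candidates.append((model, n_params))
  # Each candidate would then be trained (or its accuracy estimated) and ranked.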

Contacts: Thorir Mar Ingolfsson, Cristian Cioflan, Victor Jung

Algorithms & Frameworks for Quantization and Deployment for Deep Neural Networks (DNNs)

The typical training algorithm for DNNs is an iterative application of the backpropagation algorithm (BP) and stochastic gradient descent (SGD). When the quantization is not “aggressive” (i.e., when the parameters and feature maps can be represented as integers with a precision of 8 bits or more), many solutions are available, either in the specialized literature or in commercial software, that can convert models pre-trained with gradient descent into quantized counterparts (post-training quantization). But when the precision is extremely reduced (i.e., 1-bit or 2-bit operands), these solutions can no longer be applied, and quantization-aware training algorithms are needed. The naive application of gradient descent (which, in theory, is not even correct, since the quantization functions have zero gradient almost everywhere) to train these QNNs yields major accuracy drops. Hence, it is likely that suitable training algorithms for QNNs will require replacing the standard BP+SGD scheme, which is suited to differentiable optimization, with search strategies that are more apt for discrete optimization.
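A widely used compromise between the two worlds is the straight-through estimator (STE), which applies the non-differentiable quantizer in the forward pass but treats it as the identity in the backward pass; a minimal sketch for 1-bit weights:

  # Sketch: straight-through estimator (STE) for binarized weights.
  import torch

  class BinarizeSTE(torch.autograd.Function):
      @staticmethod
      def forward(ctx, w):
          return torch.sign(w)       # non-differentiable 1-bit quantizer

      @staticmethod
      def backward(ctx, grad_output):
          return grad_output         # pretend the quantizer is the identity

  w = torch.randn(8, 8, requires_grad=True)  # latent full-precision weights
  x = torch.randn(4, 8)
  y = x @ BinarizeSTE.apply(w).t()   # forward uses the binary weights
  y.sum().backward()                 # backward still updates the latent w
  print(w.grad.shape)                # torch.Size([8, 8])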

Contacts: Victor Jung, Cristian Cioflan, Georg Rutishauser, Philip Wiese

Hardware Acceleration of DNNs and QNNs

Deep Learning (DL) and Artificial Intelligence (AI) are quickly becoming dominant paradigms for all kinds of analytics, complementing or replacing traditional data science methods. Successful at-scale deployment of these algorithms requires executing them directly at the data source, i.e., in the IoT end-nodes collecting the data. However, due to the extreme constraints of these devices (in terms of power, memory footprint, and area cost), performing full DL inference in-situ in low-power end-nodes requires a breakthrough in computational performance and efficiency. It is widely known that the numerical representation typically used when developing DL algorithms (single-precision floating-point) encodes a higher precision than what is actually required to achieve high quality-of-results in inference (Courbariaux et al. 2016); this fact can be exploited in the design of energy-efficient hardware for DL. For example, by using ternary weights, meaning all network weights are quantized to {-1, 0, 1}, we can design the fundamental compute units in hardware without using a HW-expensive multiplication unit. Additionally, ternarization allows us to store the weights much more compactly on-chip.
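The arithmetic simplification is easy to see in a sketch: with weights restricted to {-1, 0, 1}, a dot product reduces to additions, subtractions, and skips (plain Python for illustration; in hardware this becomes a simple add/subtract/bypass datapath):

  # Sketch: dot product with ternary weights, using no multiplications at all.
  activations = [3, -1, 4, 1, 5]
  weights     = [1,  0, -1, 1, 0]    # ternary: each weight selects add, skip, or subtract

  acc = 0
  for a, w in zip(activations, weights):
      if w == 1:
          acc += a                   # add
      elif w == -1:
          acc -= a                   # subtract
      # w == 0: skip the operation entirely
  print(acc)                         # 3 - 4 + 1 = 0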


Contacts: Angelo Garofalo, Georg Rutishauser, Moritz Scherer, Arpan Suravi Prasad, Gamze İslamoğlu, Philip Wiese

Event-Driven Computing

With the increasing demand for "smart" algorithms on mobile and wearable devices, the energy cost of computing is becoming the bottleneck for battery lifetime. One approach to defusing this bottleneck is to reduce the compute activity on such devices: one of the most popular approaches uses sensor information to determine whether it is worth running expensive computations or whether there is not enough activity in the environment. This approach is called event-driven computing. Event-driven architectures can be implemented for many applications, from pure sensing platforms to multi-core systems for machine learning on the edge. At IIS, we cover most of these applications. Besides working with novel, state-of-the-art sensors and sensing platforms to push the limits of the lifetime of wearables and mobile devices, we also work with cutting-edge computing systems like Intel Loihi for Spiking Neural Networks to minimize the energy cost of machine intelligence.
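The principle fits in a few lines (a sketch with a hypothetical sensor read-out and an illustrative threshold; real systems implement the cheap check in an always-on front-end):

  # Sketch: event-driven gating of an expensive inference call.
  import numpy as np

  ACTIVITY_THRESHOLD = 0.2                # illustrative; tuned per sensor and application

  def read_sensor_window():
      return np.random.randn(256) * 0.05  # stand-in for a real accelerometer/audio buffer

  def expensive_inference(window):
      print("running the full network...")  # placeholder for the costly DNN

  for _ in range(100):                    # stand-in for the real acquisition loop
      window = read_sensor_window()
      if np.abs(window).mean() > ACTIVITY_THRESHOLD:  # cheap activity check
          expensive_inference(window)     # wake the expensive compute only on "events"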

Contacts: Alfio Di Mauro, Moritz Scherer, Arpan Suravi Prasad


On-Device Training

The fast development of the Internet of Things (IoT) comes with a growing need for smart end-node devices able to execute Deep Learning networks locally. Processing the data on-device has many advantages: not only does it drastically reduce the latency and the communication energy cost, but it also takes one step towards autonomous IoT end-nodes. Most current research efforts focus on inference, under the "train-then-deploy" paradigm. However, this leaves the device unable to cope with real-life phenomena such as data distribution shifts or class increments. At IIS, we are actively researching new methods to tackle this significant challenge in the context of tightly memory-constrained devices such as microcontrollers (MCUs).
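One memory-frugal strategy often explored in this setting (a sketch, not a specific IIS method) is to freeze the deployed backbone and train only the small classifier head, which also makes class increments cheap:

  # Sketch: on-device adaptation by training only the last layer of a frozen backbone.
  import torch
  import torch.nn as nn

  backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())  # deployed feature extractor
  for p in backbone.parameters():
      p.requires_grad = False            # frozen: no gradient memory for the backbone

  classifier = nn.Linear(32, 3)          # small head: the only trainable weights
  optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
  loss_fn = nn.CrossEntropyLoss()

  # Adapt to freshly collected on-device samples (toy data here).
  x, y = torch.randn(64, 16), torch.randint(0, 3, (64,))
  for _ in range(10):
      optimizer.zero_grad()
      loss = loss_fn(classifier(backbone(x)), y)
      loss.backward()                    # gradients only for the tiny classifier
      optimizer.step()

  # A class increment just grows the head while reusing the old weights:
  old = classifier
  classifier = nn.Linear(32, 4)
  with torch.no_grad():
      classifier.weight[:3] = old.weight
      classifier.bias[:3] = old.bias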


Contacts: Cristian Cioflan, Viviane Potocnik, Victor Jung

Prerequisites

We have no strict, general requirements, as they are highly dependent on the exact project steps. The projects will be adapted to the skills and interests of the student(s) -- just come talk to us! If you don't know about GPU programming or CNNs or ... just let us know, and together we can determine a useful way to go -- after all, you are here to learn not only about project work but also to develop your technical skills.

The only hard requirements are:

  • Excitement for deep learning
  • For HW Design projects: VLSI 1, VLSI 2 or equivalent

Tags

All our projects fall into one of three categories. Therefore, look out for the following tags:

  • Algorithmic - you will mainly perform algorithmic evaluations using languages and frameworks such as Python, PyTorch, and TensorFlow, as well as our in-house frameworks like Quantlab, DORY, NEMO
  • Embedded Coding - you will implement, e.g., C code for one of our microcontrollers
  • HW Design - you will design HW, including writing RTL and simulating, synthesizing, and laying out (backend) the design


Available Projects

New projects are constantly being added; check back often! If you have any questions or would like to propose your own ideas, do not hesitate to contact us!


Projects in Progress


Completed Projects