Exploring NAS spaces with C-BRED

Introduction

Deep neural networks (DNNs) are critical components of modern machine learning systems. Pushing the boundaries of DNN accuracy requires considerable expertise and time. This task is even more challenging when constraints such as model size and latency are introduced. For these reasons, the deep learning research community has invested considerable effort in neural architecture search (NAS). NAS encompasses a series of techniques to create spaces containing multiple candidate network architectures (NAS spaces) and to search these spaces for the best candidates (NAS algorithms). NAS spaces are usually designed to be large to increase the likelihood that they include high-quality networks, i.e., networks that perform better than their sibling candidates. However, the size of NAS spaces poses a non-trivial challenge to the convergence of NAS algorithms. Therefore, most NAS flows must strike a balance between the size of the target NAS space and the complexity of the NAS algorithm that searches it.
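To make the notion of a NAS space concrete, the sketch below enumerates a toy cell-based space in the style of NAS-Bench-201, where each architecture assigns one of five candidate operations to each of the six edges of a four-node cell. The operation names and the brute-force enumeration are illustrative only and do not use the actual benchmark API.

    # Illustrative sketch: a cell-based NAS space in the style of NAS-Bench-201.
    # Each architecture assigns one operation to each of the 6 edges of a
    # 4-node cell, so the space contains 5**6 = 15625 candidate architectures.
    import itertools

    OPS = ["none", "skip_connect", "nor_conv_1x1", "nor_conv_3x3", "avg_pool_3x3"]
    NUM_EDGES = 6  # edges of a fully connected 4-node cell

    space = list(itertools.product(OPS, repeat=NUM_EDGES))
    print(len(space))  # 15625 candidates, each a tuple of per-edge operations

Even this small example shows why exhaustively training every candidate is impractical, and why NAS algorithms and space-reduction techniques are needed.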

A possible alternative is restricting the search to sub-spaces containing high-quality networks. In a previous project, we developed Clustering-Based REDuction (C-BRED), a new algorithm to reduce NAS spaces to high-quality subsets. C-BRED works under the assumption that DNNs with similar computational graphs (i.e., similar information flows) also have similar performance. Under this assumption, we can use graph distances and clustering algorithms to create candidate sub-spaces through unsupervised learning. Then, to select the best cluster, we need a way to estimate the likelihood that a cluster contains high-quality networks. Ideally, we would like to know the accuracies of the enclosed networks; but if we had access to this information, there would be no point in using NAS. Training-free (TF) statistics are quantities describing neural networks that can be computed before running any training iteration and that correlate well with post-training accuracy. C-BRED hence uses TF statistics to identify the most promising cluster.
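The following Python sketch illustrates this pipeline under simplifying assumptions: architectures are given as networkx directed graphs, the graph distance is the (expensive) exact graph edit distance, clustering is spectral clustering on a precomputed similarity matrix, and the TF statistic is a user-supplied callable. The function and parameter names are hypothetical and do not reflect the existing C-BRED codebase.

    # Minimal sketch of the C-BRED idea: cluster architectures by graph
    # distance, then pick the cluster with the best training-free score.
    import numpy as np
    import networkx as nx
    from sklearn.cluster import SpectralClustering

    def reduce_space(graphs, tf_score, n_clusters=4):
        n = len(graphs)
        # 1. Pairwise graph distances (unsupervised structural comparison).
        dist = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                d = nx.graph_edit_distance(graphs[i], graphs[j])
                dist[i, j] = dist[j, i] = d
        # 2. Turn distances into similarities and cluster the space.
        sim = np.exp(-dist / (dist.std() + 1e-12))
        labels = SpectralClustering(
            n_clusters=n_clusters, affinity="precomputed"
        ).fit_predict(sim)
        # 3. Score each cluster by the mean TF statistic of its members and
        #    return the most promising sub-space.
        scores = [
            np.mean([tf_score(g) for g, l in zip(graphs, labels) if l == k])
            for k in range(n_clusters)
        ]
        best = int(np.argmax(scores))
        return [g for g, l in zip(graphs, labels) if l == best]

In practice, cheaper graph distances and approximate clustering are needed for spaces with thousands of architectures; exploring such choices is part of this project.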

In the first part of this project, you will re-design and refactor the existing C-BRED codebase to support additional graph distances, additional clustering algorithms, and additional TF statistics. In the second part of this project, you will apply the new code to two popular NAS benchmark spaces (NAS-Bench-101 and NAS-Bench-201) and tune C-BRED hyper-parameters to turn it into a robust pre-processing algorithm for NAS flows.
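As a rough illustration of what "supporting additional components" could look like after the refactoring, the sketch below defines pluggable interfaces for graph distances, clustering algorithms, and TF statistics. All class and method names are assumptions made for this example and are not the current C-BRED interfaces.

    # Hypothetical plug-in interfaces for a refactored C-BRED codebase;
    # the names and signatures are illustrative assumptions.
    from abc import ABC, abstractmethod
    import numpy as np

    class GraphDistance(ABC):
        """Pairwise distance between computational graphs."""
        @abstractmethod
        def pairwise(self, graphs: list) -> np.ndarray:
            """Return an (n, n) symmetric distance matrix."""

    class Clusterer(ABC):
        """Partition a NAS space given precomputed distances."""
        @abstractmethod
        def fit_predict(self, distances: np.ndarray) -> np.ndarray:
            """Return a cluster label for each architecture."""

    class TFStatistic(ABC):
        """Training-free score of an untrained network."""
        @abstractmethod
        def __call__(self, network) -> float:
            """Higher values should correlate with post-training accuracy."""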


Project description


Skills and project character

Skills

Required:

  • Fundamental concepts of deep learning (convolutional neural networks, backpropagation, computational graphs)
  • Python programming
  • Familiarity with the PyTorch deep learning framework
  • Familiarity with the scikit-learn Python package

Optional:

  • Familiarity with the ONNX standard
  • Familiarity with dimensionality reduction, clustering, and kernel methods
  • Familiarity with elementary concepts of statistics (random variable, probability distribution)

Project character

  • 10% Software engineering
  • 60% Python coding
  • 30% Data science applied to deep learning


Logistics

The student and the advisors will meet on a weekly basis to check the progress of the project, clarify doubts, and decide the next steps. The student and the advisors will also have regular code reviews, whose frequency will depend on the stage of the project. The schedule of these meetings will be agreed upon by both parties at the beginning of the project. Of course, additional meetings can be organised to address urgent issues.

At the end of the project, you will have to present your work in a 15-minute talk (20 minutes if carried out as a Master Thesis) in front of the IIS team and defend it in the following 5-minute discussion.


Professor

Luca Benini


Status: Available

We are looking for 1 Master student. It is possible to complete the project either as a Semester Project or a Master Thesis.

Supervisors: Matteo Spallanzani spmatteo@iis.ee.ethz.ch, Thorir Mar Ingolfsson, Victor Jung jungvi@iis.ee.ethz.ch