Exploring schedules for incremental and annealing quantization algorithms

From iis-projects


Revision as of 11:50, 16 November 2020


Empirical evidence supports the view that quantization-aware training algorithms are necessary when aggressive quantization schemes (1-bit, 2-bit) are considered. Most algorithms in this family interpret the quantization problem as an approximation task, where the quantized neural network (QNN) is obtained either by projecting a full-precision network onto a constrained space [1 XNOR, 2 INQ], or by progressively “hardening” a relaxed version of a QNN towards its natural discrete definition [3 ANA, 4 relaxed quantization]. A central idea in several of these algorithms is that of achieving quantization progressively. The rationale for this choice is to allow the still-full-precision part of the network to compensate for the error introduced at each step of the quantization process.

For instance, the incremental network quantization (INQ) algorithm [2] defines a partition P = {p_{1}, …, p_{T}} of the weight space of a given network topology and assigns to each of its elements p_{i} an integer t_{i} representing its quantization epoch. The algorithm starts by training a full-precision network; whenever a quantization epoch t_{i} is reached, the weights in p_{i} are projected onto the corresponding quantized weight space and are no longer updated, while the weights that have not yet been quantized remain free to adapt and compensate for the error introduced. In this way, by epoch t_{T}, all the network’s weights are quantized.
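The incremental scheme above can be sketched in a few lines. The partition, the quantization epochs, the gradients, and the power-of-two quantizer below are illustrative assumptions made for this sketch, not the exact INQ configuration:

```python
import numpy as np

def quantize_pow2(w):
    """Project nonzero weights onto the nearest power of two, keeping the sign."""
    out = np.zeros_like(w)
    nz = w != 0
    exp = np.round(np.log2(np.abs(w[nz])))
    out[nz] = np.sign(w[nz]) * 2.0 ** exp
    return out

def inq_step(weights, frozen_mask, grads, lr):
    """One training step: only not-yet-quantized weights receive updates."""
    return weights - lr * grads * (~frozen_mask)

rng = np.random.default_rng(0)
w = rng.normal(size=8)                     # toy "network" weights
frozen = np.zeros(8, dtype=bool)           # which weights are already quantized
partition = [np.array([0, 1, 2]), np.array([3, 4, 5]), np.array([6, 7])]
quant_epochs = [2, 4, 6]                   # t_i assigned to each p_i (assumed)

for epoch in range(1, 7):
    if epoch in quant_epochs:
        p = partition[quant_epochs.index(epoch)]
        w[p] = quantize_pow2(w[p])         # project p_i onto the quantized space
        frozen[p] = True                   # freeze: no further updates for p_i
    g = rng.normal(size=8)                 # stand-in for real loss gradients
    w = inq_step(w, frozen, g, lr=0.1)     # unfrozen weights compensate
```

After the last quantization epoch every weight is frozen, so the final parameter vector lies entirely in the quantized space.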

Another example is the additive noise annealing (ANA) algorithm [4]. In this case, the target QNN is regularised by adding noise to its parameters, which allows gradients and updates to be computed. To each parameter m ∊ M an annealing schedule f_m(t) is attached, governing the amount of noise applied to that parameter. By ensuring that all the schedules {f_m(t)}_{m ∊ M} are decreasing and eventually reach zero at the final training epoch, each regularised layer is sharpened until it converges to its quantized counterpart, which will be implemented on the real hardware. The annealing prioritises the lower layers and then proceeds to the upper layers. The original, intuitive rationale was that the features of the lower layers should stabilise before those of the upper layers, allowing the latter to adapt to the hardening of the former; this strategy has since also been motivated by theoretical results [Analytical aspects].
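A family of schedules {f_m(t)} with these properties can be sketched as follows. The linear shape and the per-layer start/end offsets are assumptions for illustration; ANA itself does not prescribe this particular parametrisation:

```python
def linear_schedule(t, t_start, t_end, sigma0=1.0):
    """Noise standard deviation at epoch t: flat at sigma0 until t_start,
    then annealed linearly, reaching exactly zero at t_end."""
    if t <= t_start:
        return sigma0
    if t >= t_end:
        return 0.0
    return sigma0 * (t_end - t) / (t_end - t_start)

# Lower layers start (and finish) annealing earlier than upper layers,
# so their features stabilise first; the offsets here are illustrative.
T = 100
schedules = {f"layer{k}": (10 * k, 60 + 10 * k) for k in range(4)}

# By epoch T every schedule has reached zero, i.e. every layer is hard.
sigmas_at_T = {name: linear_schedule(T, ts, te)
               for name, (ts, te) in schedules.items()}
```

Each schedule is non-increasing in t, and the staggered (ts, te) windows encode the lower-to-upper annealing order described above.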

Despite the promising results shown by these algorithms, the tuning of their scheduling hyper-parameters is not yet well understood and requires tedious, time-consuming iterative searches. When learning algorithms are governed by hyper-parameters (e.g., the learning rate and the momentum in stochastic gradient descent), an appealing possibility is to learn these hyper-parameters automatically with another machine learning system, a process called meta-learning. For instance, this approach has recently been applied to learning the hyper-parameters governing stochastic gradient descent.
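One simple instance of such meta-learning is hypergradient descent, which adapts the learning rate online using the gradient of the loss with respect to the learning rate itself. The toy quadratic objective and the constants below are chosen purely for illustration:

```python
import numpy as np

def train(steps=200, alpha=0.01, beta=1e-4):
    """Gradient descent on L(w) = w^2 / 2 while meta-learning the rate alpha."""
    w = np.array([5.0])
    g_prev = np.zeros_like(w)
    for _ in range(steps):
        g = w                              # dL/dw for L = w^2 / 2
        alpha += beta * float(g @ g_prev)  # hypergradient update of alpha
        w = w - alpha * g                  # ordinary parameter update
        g_prev = g
    return float(w[0]), alpha

w_final, lr_final = train()
```

Because consecutive gradients initially point the same way, alpha grows from its conservative starting value, and the inner optimisation converges faster than with the fixed rate; a similar mechanism could, in principle, be applied to scheduling hyper-parameters.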

Project description

In this project, we will start by designing suitable parametric models to describe the scheduling processes of the INQ and ANA algorithms. You will then be in charge of implementing the meta-learning algorithm and collecting empirical data. Finally, we will analyse your findings and, if possible, derive heuristic rules that improve the effectiveness and efficiency of the INQ and ANA algorithms.

If time remains, we could also consider deploying some of the models trained with the improved algorithms on ternary network accelerators developed by other members of the IIS team.




  • Algorithms & data structures
  • Python programming
  • Basic knowledge in deep learning (convolutional neural networks, backpropagation)


  • Knowledge of the PyTorch deep learning framework
  • C/C++ programming


Professor: Luca Benini

Status: Available

Possible to complete as a Semester Thesis or as a Master Thesis.

Supervisor: Matteo Spallanzani


The student(s) and the advisor will meet on a weekly basis to check the progress of the project, clarify doubts and decide on the next steps. The schedule of this weekly update meeting will be agreed upon by both parties at the beginning of the project. Of course, additional meetings can be organised to address urgent issues.

At the end of the project, you will have to present your work in a 15-minute talk in front of the IIS team and defend it in the 5-minute discussion that follows.