Extreme-Edge Experience Replay for Keyword Spotting

Overview

Status: Available

Introduction

In an ever-changing world, deploying neural networks at the edge with the sole purpose of solving a single, predefined, offline-learned task is becoming obsolete. Oftentimes, models experience on-site domain shifts (e.g., keyword spotting systems pretrained for warehouses and deployed on construction sites)[1], or new functionalities (i.e., new classes)[2] are added directly on the target device. In such contexts, on-device continual learning -- both domain-incremental and class-incremental -- enables the deployed neural network to remain up-to-date. Nevertheless, while learning new tasks, the model must not forget the previously learned ones, thus avoiding the so-called catastrophic forgetting effect[3].

Rehearsal-based methods[4][6] avoid or reduce catastrophic forgetting by maintaining a subset of already-seen samples in a memory buffer, also called a reservoir. During the adaptation stage, this subset is replayed together with the newly available samples to jointly train the model, thus preventing overfitting on the new domain/class and maintaining generalization. The TinyML constraints associated with low-power, always-on keyword spotting (i.e., memory, storage, latency)[5] drastically limit the size of the memory buffer and hence the number of samples it can hold. Moreover, since each incoming sample is a candidate for the replay buffer, the selection method running alongside inference must be lightweight, so as not to incur significant overhead in a real-time application.
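
To fix ideas, the following is a minimal Python sketch of the classic reservoir-sampling buffer update used by experience replay methods such as [4]; the class and method names are hypothetical, and the uniform selection policy shown here is exactly the O(1)-per-sample baseline that a lightweight, on-device selection method would aim to match or improve upon.

```python
import random

class ReservoirBuffer:
    """Fixed-size replay buffer updated with reservoir sampling.

    Each incoming sample is kept with probability capacity / n_seen,
    so the buffer remains a uniform subset of the stream at O(1) cost
    per sample -- cheap enough to run alongside inference.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = []   # stored (feature, label) pairs
        self.n_seen = 0     # total number of stream samples observed

    def update(self, x, y):
        self.n_seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append((x, y))
        else:
            # Overwrite a random slot with probability capacity / n_seen.
            idx = random.randrange(self.n_seen)
            if idx < self.capacity:
                self.samples[idx] = (x, y)

    def sample(self, batch_size):
        # Draw a replay minibatch to mix with the new-task minibatch.
        k = min(batch_size, len(self.samples))
        return random.sample(self.samples, k)
```

During adaptation, each training step would then concatenate a fresh minibatch with a draw from the buffer before the forward/backward pass, so old and new data are optimized jointly.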

The objective of this project is to propose an energy-efficient, real-time, rehearsal-based method for keyword spotting in the context of class-, task-, and domain-incremental learning.


Character

  • 20% literature research
  • 70% architectural implementation and optimizations
  • 10% evaluation

Prerequisites

  • Must be familiar with Python.
  • Knowledge of deep learning basics and of a deep learning framework such as PyTorch or TensorFlow, acquired in a course, a project, or through self-study with tutorials.

Project Goals

The main tasks of this project are:

  • Task 1: Familiarize yourself with the project specifics (1-2 Weeks)

    Learn about DNN training and PyTorch, and how to visualize results with TensorBoard. Read up on class-, task-, and domain-incremental learning, as well as rehearsal-based methods. Read up on DNN models aimed at time series (e.g., DS-CNNs, TCNs, Transformer and Conformer networks) and on recent advances in keyword spotting; a minimal DS-CNN building block is sketched after this task list.

  • Task 2: Propose evaluation methodology and evaluate related work (4-6 weeks)

    Propose an evaluation methodology (e.g., [3]) for audio-based tasks in the context of continual learning; two commonly used continual-learning metrics are sketched after this task list.

    Considering publicly available methods and state-of-the-art implementations, evaluate experience replay techniques for keyword spotting.

    Expand the accuracy evaluation to also account for the TinyML constraints (i.e., memory, storage, latency).

  • Task 3: Propose novel selection method (4-6 weeks)

    Propose and implement a lightweight memory-buffer update technique and analyse it using the previously defined evaluation methodology.

    Evaluate the proposed methodology considering different neural network topologies and model sizes.

  • (Only if conducted as Master's thesis) Task 4: Hardware-in-the-loop evaluation (6-8 weeks)

    Familiarize yourself with the GAP9 architecture and its deployment tools.

    Implement the proposed selection method(s) and network update scheme.

    Evaluate the learning costs on-device.

  • Task 5: Gather and Present Final Results (2-3 Weeks)

    Gather final results.

    Prepare a presentation (15-20 min. + 5 min. discussion).

    Write a final report. Document all major decisions taken during the design process and justify your choices. Include everything that deviates from the standard case: show off everything that took time to figure out and all the ideas that influenced the project.
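
As a starting point for Task 1, below is a minimal PyTorch sketch of the depthwise-separable convolution block from which DS-CNN keyword-spotting models are built; the layer sizes and the example input shape are illustrative assumptions, not the project's target topology.

```python
import torch
import torch.nn as nn

class DSConvBlock(nn.Module):
    """One depthwise-separable convolution block: a per-channel
    (depthwise) convolution followed by a 1x1 (pointwise) convolution,
    each followed by batch normalization and ReLU."""

    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        # groups=in_ch makes the first convolution depthwise.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride,
                                   padding=kernel_size // 2,
                                   groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU()

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

# Illustrative use: a batch of 8 spectrogram-like feature maps.
x = torch.randn(8, 64, 25, 5)
y = DSConvBlock(64, 64)(x)   # -> torch.Size([8, 64, 25, 5])
```

The depthwise/pointwise split is what keeps such models within keyword-spotting memory and compute budgets: it replaces one dense convolution with two much cheaper ones.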
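For Task 2, a common starting point for a continual-learning evaluation methodology is the accuracy matrix acc[i, j], i.e., the test accuracy on task j measured after training on task i, from which average accuracy and forgetting can be derived. The sketch below shows one conventional definition of these metrics; it is an assumption about a reasonable methodology, not a prescription from the cited references.

```python
import numpy as np

def cl_metrics(acc):
    """Summary metrics from an accuracy matrix acc[i, j]: test accuracy
    on task j after training on task i (lower-triangular in practice,
    since future tasks are unseen)."""
    T = acc.shape[0]
    # Mean accuracy over all tasks after training on the final task.
    avg_acc = float(acc[T - 1].mean())
    forgetting = 0.0
    if T > 1:
        # Best accuracy ever reached on a task minus its final accuracy,
        # averaged over all tasks except the last one.
        forgetting = float(np.mean([acc[:T - 1, j].max() - acc[T - 1, j]
                                    for j in range(T - 1)]))
    return avg_acc, forgetting

# Illustrative 3-task example:
acc = np.array([[0.90, 0.00, 0.00],
                [0.80, 0.85, 0.00],
                [0.70, 0.80, 0.90]])
print(cl_metrics(acc))  # (0.8, 0.125)
```

The TinyML extension of Task 2 would then report these accuracy metrics alongside the buffer's memory footprint and the selection method's per-sample latency.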

Project Organization

Weekly Meetings

The student shall meet with the advisor(s) every week to discuss any issues or problems that arose during the previous week and to agree on the next steps. These meetings provide a guaranteed time slot for a mutual exchange of information on how to proceed, clear up any questions from either side, and ensure the student's progress.

Report

Documentation is an important and often overlooked aspect of engineering. One final report has to be completed within this project. Any form of word-processing software is allowed for writing the report; nevertheless, the use of LaTeX with Tgif (see http://bourbon.usc.edu:8001/tgif/index.html and http://www.dz.ee.ethz.ch/en/information/how-to/drawing-schematics.html) or any other vector drawing software (for block diagrams) is strongly encouraged by the IIS staff.

Final Report

A digital copy of the report, the presentation, the developed software, build script/project files, drawings/illustrations, acquired data, etc. needs to be handed in at the end of the project. Note that this task description is part of your report and has to be attached to your final report.

Presentation

At the end of the project, the outcome of the work will be presented in a 15- to 20-minute talk followed by 5 minutes of discussion in front of interested people of the Integrated Systems Laboratory. The presentation is open to the public, so you are welcome to invite interested friends. The exact date will be determined towards the end of the work.

References

[1] Cristian Cioflan, Lukas Cavigelli, Manuele Rusci, Miguel De Prado, and Luca Benini. Towards On-device Domain Adaptation for Noise-Robust Keyword Spotting. 2022.

[2] Hamed Hemati, Andrea Cossu, Antonio Carta, Julio Hurtado, Lorenzo Pellegrini, Davide Bacciu, Vincenzo Lomonaco, and Damian Borth. Class-Incremental Learning with Repetition. 2023.

[3] Vincenzo Lomonaco and Davide Maltoni. CORe50: a New Dataset and Benchmark for Continuous Object Recognition. 2017.

[4] David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy Lillicrap, and Gregory Wayne. Experience Replay for Continual Learning. 2019.

[5] Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, Jeffery Liao, Anton Lokhmotov, Francisco Massa, Peng Meng, Paulius Micikevicius, Colin Osborne, Gennady Pekhimenko, Arun Tejusve Raghunath Rajan, Dilip Sequeira, Ashish Sirasao, Fei Sun, Hanlin Tang, Michael Thomson, Frank Wei, Ephrem Wu, Lingjie Xu, Koichi Yamada, Bing Yu, George Yuan, Aaron Zhong, Peizhao Zhang, and Yuchen Zhou. MLPerf Inference Benchmark. 2020.

[6] Tao Zhuo, Zhiyong Cheng, Zan Gao, and Mohan Kankanhalli. Continual Learning with Strong Experience Replay. 2023.