
EEG artifact detection with machine learning


Revision as of 10:45, 10 September 2021 by Thoriri


Introduction

Epilepsy is a central nervous system disorder in which brain activity becomes abnormal, causing seizures or periods of unusual behavior, sensations, and sometimes loss of awareness. The gold standard for diagnosis is electroencephalography (EEG), but conventional EEG systems are cumbersome and can make patients uncomfortable because of perceived stigmatization. Both patients and caregivers would therefore benefit from wearable long-term EEG monitoring devices. Such devices must be robust to noise and artifacts, which arise from external disturbances or patient movement and contaminate the EEG signal.

Project Description

In this project, the student works with the Temple University Hospital (TUH) EEG Corpus [1], which is labeled with multiple artifact types. The data is labeled channel-wise, so three sub-datasets can be extracted from it.

  • Binary classification, e.g., Artifact: 0, Non-artifact: 1. If any channel contains an artifact, the whole sample is labeled as an artifact, so the label array has dimension (N, 1), where N is the total number of samples.
  • Binary classification, e.g., Artifact: 0, Non-artifact: 1, applied per channel. If an artifact appears on one channel but not on another, each channel is labeled accordingly, so the label array has dimension (N, C), where C is the total number of channels.
  • Multioutput classification. Several artifact types have been labeled, so each channel receives the label of the specific artifact type present on it. The label array again has dimension (N, C), with each entry holding an artifact class instead of a binary flag.

This yields three datasets:

  • 1 -> Normal binary classification
  • 2 -> Multilabel binary classification
  • 3 -> Multioutput-multilabel classification

The difficulty increases from task 1 to task 3 (1 < 2 < 3), with task 1 being the easiest.
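As a rough illustration of how the three label layouts relate, the sketch below derives all of them from one channel-wise annotation array. The input encoding (0 for a clean channel, a positive integer for each artifact type) and the name `build_label_sets` are assumptions made for this example; they are not taken from the corpus. The binary labels follow the example convention used above (Artifact: 0, Non-artifact: 1).

```python
import numpy as np


def build_label_sets(channel_classes):
    """Derive the three label layouts from channel-wise annotations.

    channel_classes: integer array of shape (N, C).
    Assumed encoding (hypothetical): 0 = clean channel, k > 0 = artifact type k.
    """
    channel_classes = np.asarray(channel_classes)
    is_clean = channel_classes == 0

    # Task 2: one binary label per channel, shape (N, C).
    # Artifact -> 0, non-artifact -> 1.
    y_channel = is_clean.astype(np.int64)

    # Task 1: a sample counts as non-artifact only if every channel is clean,
    # shape (N, 1).
    y_sample = is_clean.all(axis=1, keepdims=True).astype(np.int64)

    # Task 3: the artifact type per channel, shape (N, C).
    y_multi = channel_classes.copy()

    return y_sample, y_channel, y_multi


# Toy annotations: 3 samples, 3 channels.
raw = np.array([
    [0, 0, 0],  # fully clean sample
    [0, 2, 0],  # artifact type 2 on the second channel
    [1, 1, 3],  # artifacts on every channel
])
y_sample, y_channel, y_multi = build_label_sets(raw)
print(y_sample.shape, y_channel.shape, y_multi.shape)
```

Deriving tasks 1 and 2 from the task 3 labels keeps the three datasets consistent with each other, since every sample shares the same underlying annotation.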

The student explores methods of detecting these artifacts, ranging from classical machine learning methods such as Random Forest [2] and AdaBoost, which require feature extraction beforehand, to deep learning methods such as Variational Autoencoders (VAEs) [3], which generally do not require any feature engineering.
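To make the feature extraction step for the classical methods more concrete, here is a minimal sketch that turns windowed EEG into a flat feature matrix. The choice of features (variance, peak-to-peak amplitude, line length, relative power above 30 Hz), the function name `extract_features`, the 250 Hz sampling rate, and the window shape are all assumptions for illustration; the project does not prescribe a feature set.

```python
import numpy as np


def extract_features(windows, fs):
    """Compute simple per-channel features from EEG windows.

    windows: array of shape (N, C, T) with N windows, C channels, T time steps.
    fs: sampling rate in Hz.
    Returns an array of shape (N, C * 4).
    """
    windows = np.asarray(windows, dtype=np.float64)
    n, c, t = windows.shape

    # Time-domain features, each of shape (N, C).
    variance = windows.var(axis=-1)
    peak_to_peak = windows.max(axis=-1) - windows.min(axis=-1)
    line_length = np.abs(np.diff(windows, axis=-1)).sum(axis=-1)

    # Relative spectral power at or above 30 Hz, from the FFT periodogram.
    spectrum = np.abs(np.fft.rfft(windows, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(t, d=1.0 / fs)
    total = spectrum[..., 1:].sum(axis=-1) + 1e-12  # skip the DC bin
    high = spectrum[..., freqs >= 30.0].sum(axis=-1)
    rel_high = high / total

    feats = np.stack([variance, peak_to_peak, line_length, rel_high], axis=-1)
    return feats.reshape(n, c * 4)


# Synthetic stand-in data: 8 windows, 4 channels, 256 time steps.
rng = np.random.default_rng(0)
demo_windows = rng.standard_normal((8, 4, 256))
X = extract_features(demo_windows, fs=250.0)
print(X.shape)
```

A matrix like `X`, paired with one label per window, could then be passed to a tree ensemble such as scikit-learn's `RandomForestClassifier` or `AdaBoostClassifier`. A VAE-based approach would instead consume the raw windows directly.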

The overall goal of the project is to present a method that detects artifacts as soon as they occur, preferably with minimal computational resources.

Required Skills

  • Intermediate knowledge of machine learning
  • Basic knowledge of Python


References

  • [1] The Temple University Artifact Corpus: An Annotated Corpus of EEG Artifacts
  • [2] Random Forests
  • [3] Variational Autoencoder based Anomaly Detection using Reconstruction Probability

Status: Available

Looking for Semester and Master Project Students
Supervision: Thorir Mar Ingolfsson, Andrea Cossettini, Simone Benatti


  • 20% literature review
  • 80% implementation


Luca Benini
