
Self-Learning Drones based on Neural Networks



Neural networks are achieving record-breaking results in many common machine learning tasks and have attracted considerable attention from the machine learning community and from well-known companies such as Google, Facebook, and Microsoft. Building on this success, neural networks have recently been applied to reinforcement learning as well.

Reinforcement learning differs from classic supervised and semi-supervised learning: the learning agent tries to optimize a reward function by interacting with its environment and learning from experience. In March 2016, Google DeepMind made headlines by beating Lee Sedol, the second-best player in the world, at the game of Go [1], a game that had been considered too complex to be solved with classic machine learning methods.

The goal of this thesis is to port recently published methods [1][2][3], which have mainly been applied to games (e.g. Go and a set of Atari games), to a new cyber-physical use case. Drones are a promising example of machines that can learn from their surrounding environment [4]. Moreover, interest in using autonomous vehicles in dangerous scenarios, such as natural disasters or hazardous areas, is growing quickly. Autonomous robots can exploit complementary sense-act capabilities to support human operators in surveillance and rescue tasks.

Our final goal is to have drones that can learn and optimize their movements autonomously while they operate (i.e. online learning).

In this project we propose to implement a reinforcement learning technique to control the movements of our rotorcraft during a specific task (e.g. follow-me, reach the target, or hold a constant position).
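As a rough illustration of the idea, the following is a minimal sketch of a "reach the target" task, assuming a toy one-dimensional world with seven cells. It uses plain tabular Q-learning, a much simpler relative of the deep Q-network approach in [2], and is not the method developed in the thesis. All names (N_CELLS, TARGET, step, train, greedy_policy), the reward values, and the hyperparameters are hypothetical choices made for this example.

```python
import random

N_CELLS = 7          # positions 0..6 on a line (hypothetical toy world)
TARGET = 5           # cell the agent should reach
ACTIONS = (-1, +1)   # move one cell left / one cell right


def step(pos, action):
    """Toy environment: apply the move and return (next_pos, reward, done)."""
    nxt = min(max(pos + action, 0), N_CELLS - 1)   # walls at both ends
    done = nxt == TARGET
    reward = 1.0 if done else -0.01                # small cost per move
    return nxt, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table purely from interaction (epsilon-greedy Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]       # q[position][action index]
    starts = [c for c in range(N_CELLS) if c != TARGET]
    for _ in range(episodes):
        pos = rng.choice(starts)
        for _ in range(50):                        # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))    # explore
            else:
                a = 0 if q[pos][0] > q[pos][1] else 1   # exploit
            nxt, reward, done = step(pos, ACTIONS[a])
            # Q-learning target: reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[pos][a] += alpha * (target - q[pos][a])
            pos = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the action with the highest learned value in every cell."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told where the target is; it only observes its position and the reward. On a real drone the state and action spaces are continuous and high-dimensional, so the table would have to be replaced by a function approximator such as a neural network.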

Status: Completed

Semester Thesis by: Fabian Mueller, Maximilian Shuette, Gian-Marco Hutter
Supervision: Renzo Andri, Daniele Palossi


  • Familiarity with C/C++ programming.
  • Knowledge of GPU programming would be an asset.
  • Basic knowledge of ROS [5] and Torch [6] is a plus.


20% Theory
30% Tools familiarization (ROS, Torch)
35% C/CUDA programming
15% Experimental evaluation


Luca Benini


Detailed Task Description

Meetings & Presentations

The student(s) and advisor(s) agree on weekly meetings to discuss all relevant decisions and decide on how to proceed. Of course, additional meetings can be organized to address urgent issues.


[1] Google DeepMind, AlphaGo (slides at ICML'16)

[2] Volodymyr Mnih et al., Human-level control through deep reinforcement learning

[3] Tom Zahavy et al., Graying the black box: Understanding DQNs

[4] Scaramuzza et al., Vision-Controlled Micro Flying Robots: From System Design to Autonomous Navigation and Mapping in GPS-Denied Environments

[5] ROS

[6] Torch

Practical Details
