Self-Learning Drones based on Neural Networks
Neural networks are achieving record-breaking results in many common machine learning tasks and have attracted considerable attention from the machine learning community as well as from well-known companies such as Google, Facebook, and Microsoft. Building on this success, neural networks have recently reached the field of reinforcement learning, too.
Reinforcement learning differs from classic supervised and semi-supervised learning: the learning agent tries to maximize a reward function, but it interacts with its environment and learns from experience. Just in March 2016, Google DeepMind made headlines by beating Lee Sedol, then considered the world's second-best player, in the game of Go, a game which was considered too complex to be solved with classic machine learning methods.
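The interaction-and-reward loop described above can be sketched with tabular Q-learning on a toy problem. The corridor environment, hyperparameters, and Python implementation below are illustrative choices of ours (the project itself targets C/CUDA and Torch); the point is only that the agent receives nothing but rewards from interaction, yet learns a policy that maximizes them.

```python
import random

# Toy 1-D corridor: states 0..4, goal at state 4.
# Actions: 0 = left, 1 = right. Reward 1.0 on reaching the goal, else 0.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1   # learning rate, discount, exploration

def step(s, a):
    """Apply the action; return (next state, reward, episode done)."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection with random tie-breaking
            if rng.random() < EPS or Q[s][0] == Q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s2, r, done = step(s, a)
            # Q-learning update: pull Q(s,a) toward r + gamma * max_a' Q(s',a')
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = train()
policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(N_STATES)]
```

After training, the greedy policy moves right in every non-goal state, i.e. it has learned the shortest path to the reward purely from experience.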
The goal of this thesis is to port recently published methods (mainly applied to games, e.g., Go and a set of Atari games) to a new cyber-physical use-case scenario. Drones are a promising example of machines that can learn from their surrounding environment. Moreover, interest in the use of autonomous vehicles in dangerous scenarios, such as natural disasters or hazardous areas, is growing quickly. Autonomous robots can exploit complementary sense-act capabilities, supporting human operators in accomplishing surveillance and rescue tasks.
Our final goal is to have drones that are able to learn and optimize their movements autonomously, during their own activity (i.e., online learning).
In this project we propose to implement a reinforcement learning technique aimed at controlling the movements of our rotorcraft during a specific task (e.g., follow-me, reach the target, keep a constant position, etc.).
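Casting such control tasks into the reinforcement learning framework means, above all, choosing a reward function for each task. The shapes below are a minimal sketch of one possible design, assuming the state contains the drone position; they are illustrative assumptions, not the project's final reward design.

```python
import math

def dist(p, q):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def reward_hold_position(p, p_home):
    # Keep constant position: penalize any drift from the hover point.
    return -dist(p, p_home)

def reward_reach_target(p, p_target, tol=0.1):
    # Reach the target: sparse bonus only once the target is reached.
    return 1.0 if dist(p, p_target) < tol else 0.0

def reward_follow_me(p, p_leader, standoff=2.0):
    # Follow-me: keep a fixed standoff distance to the moving leader.
    return -abs(dist(p, p_leader) - standoff)
```

Dense rewards (as in hold-position and follow-me) typically make learning easier, while sparse rewards (as in reach-the-target) describe the goal more directly; which trade-off works on the rotorcraft is part of what the project would evaluate.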
- Semester Thesis by: Fabian Mueller, Maximilian Shuette, Gian-Marco Hutter
- Supervision: Renzo Andri, Daniele Palossi
Prerequisites
- Familiarity with C/C++ programming.
- Knowledge of GPU programming would be an asset.
- Basic knowledge of ROS and Torch is favorable.
Work Distribution
- 20% Theory
- 30% Tools familiarization (ROS, Torch)
- 35% C/CUDA programming
- 15% Experimental evaluation
Detailed Task Description
Meetings & Presentations
The student(s) and advisor(s) agree on weekly meetings to discuss all relevant decisions and decide on how to proceed. Of course, additional meetings can be organized to address urgent issues.
References
[1] Google DeepMind, AlphaGo (slides at ICML'16), http://icml.cc/2016/tutorials/AlphaGo-tutorial-slides.pdf
[2] Volodymyr Mnih et al., Human-level control through deep reinforcement learning, www.nature.com/nature/journal/v518/n7540/pdf/nature14236.pdf
[3] Tom Zahavy et al., Graying the black box: Understanding DQNs, http://jmlr.org/proceedings/papers/v48/zahavy16.pdf
[4] Scaramuzza et al., Vision-Controlled Micro Flying Robots: From System Design to Autonomous Navigation and Mapping in GPS-Denied Environments, http://www.margaritachli.com/papers/RAM2014article.pdf
[5] ROS, http://www.ros.org/
[6] Torch, http://torch.ch/