Android reliability governor

Revision as of 19:04, 31 August 2014 by Barandre

Short Description

Reliability, R(t), is the probability that a given system does not fail before time t. It is becoming a major concern in modern multiprocessor systems, as technology scaling increases the incidence of phenomena such as Time-Dependent Dielectric Breakdown (TDDB), Bias Temperature Instability (BTI), and Hot Carrier Injection (HCI), which degrade processors under voltage and temperature stress. In addition, process variation makes reliability vary between different instances of the same device and between similar blocks within the same device. State-of-the-art design strategies for tolerating silicon degradation rely on guardbanding and design margins, which have become too conservative and performance-limiting because of the growing gap between the worst-case and the typical device realization.
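The definition above can be written compactly. The symbols T (the time-to-failure random variable) and F (its cumulative distribution function) are introduced here only for illustration and do not appear in the original text. The mean time to failure (MTTF) is included as one common way to turn R(t) into a single lifetime figure; the identity holds for any non-negative T.

```latex
R(t) = P(T > t) = 1 - F(t), \qquad
\mathrm{MTTF} = E[T] = \int_0^{\infty} R(t)\,dt
```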

Dynamic Reliability Management (DRM) techniques aim to trade off processor performance against lifetime at run time by modulating the operating temperature and voltage to counteract degradation when it is present. Recent work modulates this trade-off as a function of workload requirements, with the goal of preserving performance for important tasks while degrading that of less critical ones. Degradation happens on a much larger time scale than computation and the O.S. time quantum. This can be exploited to design speculative control techniques that allow QoS-critical tasks to run above the reliability management constraint for short periods while still meeting the reliability constraint on average over the long time scale.
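The speculative idea described above can be illustrated with a small credit-based policy. This is a minimal sketch under stated assumptions, not the governor developed in the project: the types `struct opp` and `struct drm_state`, the functions `drm_select` and `drm_account`, and the per-operating-point `aging_rate` values are all hypothetical. An aging rate of 1.0 is assumed to be the rate that exactly meets the lifetime target, so the accumulated credit measures how far the device is ahead of (positive) or behind (negative) its long-term reliability constraint.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operating point: a frequency plus an assumed relative
 * aging rate (1.0 = the rate that exactly meets the lifetime target). */
struct opp {
    unsigned int freq_khz;
    double aging_rate;
};

/* Hypothetical controller state. */
struct drm_state {
    double credit;   /* accumulated (target - actual) aging, in epochs */
    double max_debt; /* how far below zero QoS-critical tasks may go   */
};

/* Return the index of the fastest operating point allowed for the next
 * control epoch. opps[] must be sorted by ascending frequency, n >= 1.
 * Non-critical work must keep the credit non-negative; critical work
 * may borrow up to max_debt, to be repaid in later epochs. */
size_t drm_select(const struct drm_state *s, const struct opp *opps,
                  size_t n, int critical)
{
    double floor = critical ? -s->max_debt : 0.0;
    size_t i = n;

    assert(n > 0);
    while (i > 1) {
        i--;
        if (s->credit + (1.0 - opps[i].aging_rate) >= floor)
            return i;
    }
    return 0; /* the slowest point is always permitted */
}

/* Update the credit after running one epoch at the chosen point. */
void drm_account(struct drm_state *s, const struct opp *chosen)
{
    s->credit += 1.0 - chosen->aging_rate;
}
```

With three illustrative operating points whose aging rates are 0.5, 1.0 and 2.0, a non-critical task starting from zero credit is held at the middle point, while a critical task may use the fastest point until the debt limit is reached. After that, slower points repay the borrowed credit, so the constraint is met on average over the long time scale.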

Today's multiprocessors can scale voltage and frequency dynamically by means of the O.S. power manager, which in Linux is implemented by the power governor, while temperature is managed by a feedback control loop. In the last year we have designed the first Linux reliability governor, which enables current and future Android devices to extend their lifetime by correctly tuning the power supply over the device lifetime.

The ODROID platform features the ARM big.LITTLE cluster architecture, which can be exploited to support more aggressive reliability management: using the LITTLE (big) cluster extends (reduces) the overall lifetime, without a significant performance drawback.

In this project we will design a big.LITTLE-aware reliability management solution. In the first phase, the candidate will study how the two clusters degrade differently under their different operating conditions and will implement a new reliability governor inside the Linux kernel. The candidate will also study the Android O.S. to identify the different reliability constraints of different apps.

Status: Available

Looking for Interested Students
Supervisors: Andrea Bartolini


C Language
Interest in Linux Kernel Development and Android Design


20% Theory
60% Implementation
20% Testing


Luca Benini

Detailed Task Description


Practical Details