Hardware Acceleration

From iis-projects
Revision as of 17:29, 16 November 2020

An NVIDIA Tesla V100 GP-GPU. This cutting-edge accelerator provides huge computational power on a massive 800 mm² die.
Google's Cloud TPU (Tensor Processing Unit). This machine-learning accelerator does one thing extremely well: multiply-accumulate operations.

Accelerators are the backbone of big data and scientific computing. While general-purpose processor architectures such as Intel's x86 provide good performance across a wide variety of applications, many computationally demanding tasks have only become feasible with the advent of general-purpose GPUs. Because these GPUs support a much narrower set of operations, their architecture can be optimized far more aggressively for efficiency. Such accelerators are not limited to the high-performance sector: in low-power computing, they allow complex tasks such as computer vision or cryptography to be performed under a very tight power budget, tasks that would not be feasible without a dedicated accelerator.

General-Purpose Computing

TBA

Nils Wistoff

Paul Scheffler

Manuel Eggimann

Fabian Schuiki

Computational Units

The last decade has seen explosive growth in the quest for energy-efficient architectures and systems. An era of exponentially improving computing efficiency, driven mostly by CMOS technology scaling, is coming to an end as Moore's law falters. The obstacle of the so-called thermal or power wall is fueling a push towards computing paradigms that hold energy efficiency as the ultimate figure of merit for any hardware design.

The broad term "computational units" covers a wide range of hardware accelerators for a multitude of different systems, such as floating-point units (FPUs) for processors, or dedicated accelerators for cryptography, signal processing, etc. Such computational units are housed within full systems, which usually impose stringent requirements in terms of performance, size, and efficiency.

Key topics of interest are energy-efficient accelerators at various extremes of the design space, covering high-performance, ultra-low-power, or minimum-area implementations, as well as the exploration of novel paradigms in computing, arithmetic, and processor architectures.


Luca Bertaccini

Matteo Perotti

Stefan Mach

smach@iis.ee.ethz.ch

ETZ J89

Hardware Acceleration of DNNs and QNNs

Deep Learning (DL) and Artificial Intelligence (AI) are quickly becoming dominant paradigms for all kinds of analytics, complementing or replacing traditional data science methods. Successful at-scale deployment of these algorithms requires running them directly at the data source, i.e., in the IoT end-nodes collecting the data. However, due to the extreme constraints of these devices (in terms of power, memory footprint, and area cost), performing full DL inference in situ in low-power end-nodes requires a breakthrough in computational performance and efficiency. It is widely known that the numerical representation typically used when developing DL algorithms (single-precision floating-point) encodes higher precision than is actually required to achieve high quality-of-results in inference (Courbariaux et al. 2016); this fact can be exploited in the design of energy-efficient hardware for DL. For example, with ternary weights, i.e., all network weights quantized to {-1, 0, 1}, the fundamental compute units can be designed without a hardware-expensive multiplication unit. Additionally, the weights can be stored much more compactly on-chip.
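To make the multiplier-free idea concrete, below is a minimal C sketch (a hypothetical illustration, not code from an IIS accelerator): with weights restricted to {-1, 0, +1}, each product in a dot product degenerates into an add, a subtract, or a skip, and each weight fits in two bits instead of 32.

#include <stdint.h>
#include <stdio.h>

/* Ternary-weight dot product: weights in {-1, 0, +1} mean each
 * "multiplication" is just an add, a subtract, or a skip, so no
 * hardware multiplier is needed. Two bits per weight suffice for
 * on-chip storage, versus 32 bits for single-precision floats. */
static int32_t ternary_dot(const int8_t *x, const int8_t *w, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++) {
        if (w[i] == 1)
            acc += x[i];        /* weight +1: accumulate the input */
        else if (w[i] == -1)
            acc -= x[i];        /* weight -1: accumulate the negated input */
        /* weight 0: contributes nothing, skip entirely */
    }
    return acc;
}

int main(void)
{
    const int8_t x[4] = { 5, -3, 7, 2 };
    const int8_t w[4] = { 1,  0, -1, 1 };   /* ternary weights */
    printf("dot = %d\n", (int)ternary_dot(x, w, 4));   /* 5 - 7 + 2 = 0 */
    return 0;
}

In hardware, the same case split becomes a small multiplexer and an adder in front of the accumulator instead of a full multiplier array, which is where the area and energy savings come from.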


Gianna Paulin

Georg Rutishauser


Moritz Scherer


Projects Overview

Available Projects


Projects In Progress


Completed Projects

The Logarithmic Number Unit chip Selene.
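The chip's name points at logarithmic number system (LNS) arithmetic: a value is stored as a sign bit plus the logarithm of its magnitude, so multiplication and division reduce to fixed-point addition and subtraction of exponents. The C sketch below illustrates the principle only (using doubles for clarity); it is not Selene's actual datapath, which is not described on this page.

#include <math.h>
#include <stdio.h>

/* Logarithmic number system (LNS) sketch: represent v as
 * (sign, log2|v|). Multiplication then needs no multiplier:
 * XOR the signs and add the logarithms. Zero and LNS addition
 * are glossed over here; real LNS hardware handles them with
 * special cases and lookup tables or interpolation. */
typedef struct { int sign; double log2mag; } lns_t;

static lns_t lns_from_double(double v)
{
    lns_t r = { v < 0, log2(fabs(v)) };   /* v must be nonzero */
    return r;
}

static double lns_to_double(lns_t a)
{
    double mag = exp2(a.log2mag);
    return a.sign ? -mag : mag;
}

/* Multiply: sign is the XOR of the signs, magnitude is the sum
 * of the logarithms -- an adder replaces the multiplier. */
static lns_t lns_mul(lns_t a, lns_t b)
{
    lns_t r = { a.sign ^ b.sign, a.log2mag + b.log2mag };
    return r;
}

int main(void)
{
    lns_t a = lns_from_double(-3.0);
    lns_t b = lns_from_double(4.0);
    printf("%g\n", lns_to_double(lns_mul(a, b)));   /* prints -12 */
    return 0;
}

The trade-off is that addition and subtraction, trivial in fixed- or floating-point arithmetic, become the expensive operations in an LNS and are typically approximated with lookup tables, which is what makes such designs an interesting point in the design space.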