http://iis-projects.ee.ethz.ch/api.php?action=feedcontributions&user=Andrire&feedformat=atomiis-projects - User contributions [en]2024-03-29T11:45:05ZUser contributionsMediaWiki 1.28.0http://iis-projects.ee.ethz.ch/index.php?title=Approximate_Matrix_Multiplication_based_Hardware_Accelerator_to_achieve_the_next_10x_in_Energy_Efficiency:_Full_System_Intregration&diff=8064Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Intregration2022-09-15T06:56:52Z<p>Andrire: </p>
<hr />
<div><!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --><br />
<br />
[[Category:Digital]]<br />
[[Category:Acceleration_and_Transprecision]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Deep Learning Projects]]<br />
[[User:Andrire]]<br />
<br />
[[Category:2022]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
<!-- [[Category:Available]]--><br />
<br />
<br />
= Overview =<br />
<br />
== Status: Not Available ==<br />
<br />
* Type: Semester Thesis (2 students), Master Thesis (1 student)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** Jannis Schönleber: [mailto:janniss@ethz.ch janniss@ethz.ch]<br />
** Lukas Cavigelli (Huawei), [mailto:lukas.cavigelli@huawei.com lukas.cavigelli@huawei.com]<br />
** Renzo Andri (Huawei), [mailto:renzo.andri@huawei.com renzo.andri@huawei.com]<br />
<br />
= Introduction =<br />
[[File:maddness_floorplan.png|thumb|350px|Floorplan of the MADDness accelerator.]]<br />
The continued growth in DNN model parameter counts, application domains and general adoption has led to an explosion in the required computing power and energy. The energy demands in particular have grown large enough to be economically unviable or to make cooling extremely difficult, which has driven a push for more energy-efficient solutions. Energy-efficient accelerators have a long tradition at IIS, with a multitude of proven designs published in the past. Standard accelerator architectures increase throughput via higher memory bandwidth, improved memory hierarchies or reduced precision (FP16, INT8, INT4). The accelerator in this project takes a different approach: it uses an approximate matrix multiplication (AMM) algorithm called MADDness, which replaces the matrix multiplication with a lookup into a look-up table (LUT) and an addition. This can significantly reduce the overall compute and energy requirements.<br />
<br />
= Project Details =<br />
<br />
The MADDness algorithm is split into two parts: an encoding part, which translates the input matrix A into addresses of the LUT, and a decoding part, which adds the corresponding LUT values together to compute the approximate output of the matrix multiplication. MADDness is then integrated into deep neural networks. The most commonly seen layers in DNNs, convolutional and linear layers, can both be replaced by MADDness. The current RTL implementation includes both the encoder and the decoder unit and is fully tested in simulation. Additionally, a post-layout simulation-based energy estimation has been done. The accelerator is not yet integrated into a full system.<br />
Energy estimates for the current implementation in GF 22nm FDX technology suggest an energy efficiency of up to 32 TMACs/W, compared to around 0.7 TMACs/W (FP16) for a state-of-the-art datacenter GPU, the NVIDIA A100 (TSMC 7nm FinFET).<br />
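For intuition only, the figures quoted above work out to roughly a 46x efficiency gap, or about 31 fJ per MAC for the accelerator (a back-of-the-envelope calculation, not a new measurement):

```python
# Back-of-the-envelope comparison using the figures quoted above.
maddness_tmacs_per_w = 32.0   # GF 22nm FDX estimate for the accelerator
a100_tmacs_per_w = 0.7        # NVIDIA A100 (TSMC 7nm FinFET), FP16

# Efficiency ratio: ~45.7x
ratio = maddness_tmacs_per_w / a100_tmacs_per_w

# Energy per MAC in femtojoules: 1 J = 1e15 fJ, 32 TMACs/W = 32e12 MAC/J
energy_per_mac_fj = 1e15 / (maddness_tmacs_per_w * 1e12)   # 31.25 fJ/MAC
```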
In this project, we would like to integrate the accelerator into a full system. The end goal is a tape-out-ready full system that includes the MADDness accelerator. A full system includes a suitable memory hierarchy to support the bandwidth needs. We envision an integration into one of the existing PULP systems (for example, PULP clusters or ARA). Evaluating which system suits the accelerator best and defining the final architecture is part of the thesis.<br />
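The encode/decode split described above can be illustrated with a small product-quantization-style sketch in NumPy. Note the simplifications: prototypes here are random rather than learned, and encoding uses a brute-force nearest-prototype search, whereas the real MADDness encoder uses a fast learned hash tree. The decode step shows the key property: the output is computed with only table lookups and additions, no multiplications.

```python
import numpy as np

def train_luts(B, prototypes):
    # Precompute LUT[c][k] = prototypes[c][k] @ B_c for each subspace c,
    # where B_c is the block of rows of B belonging to subspace c.
    C, K, d = prototypes.shape               # subspaces, codebook size, subdim
    return np.stack([prototypes[c] @ B[c * d:(c + 1) * d] for c in range(C)])

def encode(A, prototypes):
    # Assign each length-d subvector of each row of A to its nearest prototype.
    C, K, d = prototypes.shape
    codes = np.empty((A.shape[0], C), dtype=np.int64)
    for c in range(C):
        sub = A[:, c * d:(c + 1) * d]                              # (M, d)
        dists = ((sub[:, None, :] - prototypes[c][None]) ** 2).sum(-1)
        codes[:, c] = dists.argmin(1)
    return codes

def decode(codes, luts):
    # Sum the LUT rows selected by the codes: only lookups and adds.
    M, C = codes.shape
    return sum(luts[c][codes[:, c]] for c in range(C))             # (M, N)

rng = np.random.default_rng(0)
M, D, N, C, K = 64, 32, 16, 8, 16            # D must be divisible by C
A = rng.normal(size=(M, D))
B = rng.normal(size=(D, N))
prototypes = rng.normal(size=(C, K, D // C))

luts = train_luts(B, prototypes)             # offline, once per B
approx = decode(encode(A, prototypes), luts) # online, per input A
exact = A @ B
```

If every subvector of A is exactly one of the prototypes, the approximation becomes exact; in general, accuracy depends on how well the learned prototypes cover the input distribution.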
More information can be found here:<br />
* Code: https://github.com/joennlae/halutmatmul<br />
* Reference Paper: https://arxiv.org/abs/2106.10860<br />
* HN discussion: https://news.ycombinator.com/item?id=28375096<br />
* and please do not hesitate to reach out to me: janniss@ethz.ch<br />
<br />
= Project Plan =<br />
Acquire background knowledge & familiarize with the project <br />
* Read up on the MADDness algorithm and product quantization methods<br />
* Familiarize yourself with the current state of the project<br />
* Familiarize yourself with the IIS compute environment<br />
Set up the project & rerun RTL simulations <br />
* Set up the project with all its dependencies<br />
* Try to rerun the current RTL simulations<br />
Evaluate suitable systems to integrate and refine an architecture <br />
* Define bandwidth needs and brainstorm suitable memory hierarchies<br />
* Spreadsheet-based evaluation of the different target systems (PULP clusters, ARA, etc.), including exploring different configurations and estimating the chip size<br />
* Decide on a final architecture that we will pursue for the remainder of the project<br />
Integrate the accelerator into the defined architecture <br />
* Implementing the integration in SystemVerilog & add testbenches<br />
* Replace the currently used standard cell memories with compiled memories<br />
Set up the design flow <br />
* Set up and integrate the design flow (most likely TSMC 65 nm) into the project<br />
Synthesize + Place-and-Route & make design tape-out ready <br />
* Synthesize + Place-and-Route the design<br />
* Get a working post-layout simulation<br />
* Place macros & power routing, IR drop checks<br />
* The goal is to have everything ready for a design review: http://eda.ee.ethz.ch/index.php?title=Design_review (ETH domain)<br />
Project finalization <br />
* Prepare final report<br />
* Prepare project presentation<br />
* Clean up code<br />
<br />
<br />
== Character ==<br />
<br />
* 15% Literature / architecture review<br />
* 15% Design Evaluation<br />
* 30% RTL implementation (SystemVerilog)<br />
* 10% low-level software implementation (C)<br />
* 30% ASIC tape-out preparation<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in computer architecture<br />
* Experience with digital design in SystemVerilog as taught in VLSI I<br />
* Experience with ASIC implementation flow (synthesis) as taught in VLSI II<br />
* Some experience with C or a comparable language for low-level SW glue code<br />
<br />
If you want to work on this project, but you think that you do not match some of the required skills, please get in touch with us and we can provide preliminary exercises to help you fill in the gap.<br />
<br />
<br />
===Status: Not Available ===</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=User:Andrire&diff=8042User:Andrire2022-09-13T09:19:02Z<p>Andrire: /* Available Projects */</p>
<hr />
<div><br />
<br />
==Dr. Renzo Andri -- Contact Information==<br />
* '''Office''': Huawei Research Center <br />
* '''e-mail''': renzo.andri AT huawei com<br />
[[Category:Supervisors]]<br />
[[Category:Digital]]<br />
<br />
==Interests==<br />
Dr. Renzo Andri earned his PhD under the supervision of Prof. Luca Benini at the Integrated Systems Laboratory at ETH. In his research, he focused on energy-efficient machine learning accelerators, from efficient embedded systems design to full-custom ASIC design. <br />
He is currently working as Senior Researcher at Huawei Research in Zurich, conducting research on energy-efficient compute architectures and machine learning acceleration.<br />
<br />
* Computer Vision<br />
* Machine Learning, Neural Networks<br />
* FPGA & Digital ASIC Design<br />
* Low-Power Design<br />
* C/C++/CUDA software development<br />
* Embedded systems<br />
<br />
==Available Projects==<br />
We are providing student projects (master and semester theses) in collaboration with the IIS.<br />
See project under: [[Huawei Research]]<br />
<br />
<br />
<DynamicPageList><br />
supresserrors = true<br />
category = Available<br />
category = andrire<br />
</DynamicPageList><br />
<br />
<!--<DynamicPageList><br />
supresserrors = true<br />
category = Available<br />
category = Andrire<br />
</DynamicPageList><br />
<br />
== Projects in Progress==<br />
<DynamicPageList><br />
supresserrors = true<br />
category = In progress<br />
category = Andrire<br />
</DynamicPageList>--><br />
<!--<br />
==Completed Projects==<br />
===2015===<br />
<DynamicPageList><br />
supresserrors = true<br />
category = Completed<br />
category = Lukasc<br />
category = 2015<br />
</DynamicPageList><br />
<br />
--></div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Approximate_Matrix_Multiplication_based_Hardware_Accelerator_to_achieve_the_next_10x_in_Energy_Efficiency:_Training_Strategy_And_Algorithmic_optimizations&diff=8041Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Training Strategy And Algorithmic optimizations2022-09-13T09:17:43Z<p>Andrire: </p>
<hr />
<div><!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --><br />
<br />
[[Category:Acceleration_and_Transprecision]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Deep Learning Projects]]<br />
<br />
[[Category:2022]]<br />
[[Category:Master Thesis]]<br />
[[Category:Available]]<br />
[[Category:Hot]]<br />
[[Category:andrire]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Master Thesis (1 student)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** Jannis Schönleber: [mailto:janniss@ethz.ch janniss@ethz.ch]<br />
** Lukas Cavigelli (Huawei), [mailto:lukas.cavigelli@huawei.com lukas.cavigelli@huawei.com]<br />
** Renzo Andri (Huawei), [mailto:renzo.andri@huawei.com renzo.andri@huawei.com]<br />
<br />
= Introduction =<br />
[[File:maddness_floorplan.png|thumb|350px|Figure 1: Clock layout of the MADDness accelerator using ASAP7 technology]]<br />
The continued growth in DNN model parameter counts, application domains and general adoption has led to an explosion in the required computing power and energy. The energy demands in particular have grown large enough to be economically unviable or to make cooling extremely difficult, which has driven a push for more energy-efficient solutions. Energy-efficient accelerators have a long tradition at IIS, with a multitude of proven designs published in the past. Standard accelerator architectures increase throughput via higher memory bandwidth, improved memory hierarchies or reduced precision (FP16, INT8, INT4). The accelerator in this project takes a different approach: it uses an approximate matrix multiplication (AMM) algorithm called MADDness, which replaces the matrix multiplication with a lookup into a look-up table (LUT) and an addition. This can significantly reduce the overall compute and energy requirements.<br />
<br />
= Project Details =<br />
<br />
The MADDness algorithm is split into two parts: an encoding part, which translates the input matrix A into addresses of the LUT, and a decoding part, which adds the corresponding LUT values together to compute the approximate output of the matrix multiplication. MADDness is then integrated into deep neural networks. The most commonly seen layers in DNNs, convolutional and linear layers, can both be replaced by MADDness. Fully tested drop-in PyTorch layers have already been developed and used. So far, only single-layer replacement has been analyzed rigorously: layers have been replaced with the MADDness algorithm, but the network has not been retrained with the new outputs of the replaced layers. <br />
Energy estimates with the current implementation using GF 22nm FDX technology suggest an energy efficiency of up to 32 TMACs/W compared to a state-of-the-art datacenter NVIDIA A100 (TSMC 7nm FinFET) at around 0.7 TMACs/W (FP16).<br />
In this project, we would like to investigate if we can improve our accelerator’s accuracy by implementing a retraining strategy and framework. The goal would be to be able to replace multiple layers of a DNN without a significant drop in accuracy. A new realm of possible inter-layer optimization can then be analyzed afterwards. For example: Only calculating the needed dimensions for the next MADDness layer or including the activation layer into the MADDness algorithm. <br />
More information can be found here:<br />
* Code: https://github.com/joennlae/halutmatmul<br />
* Reference Paper: https://arxiv.org/abs/2106.10860<br />
* HN discussion: https://news.ycombinator.com/item?id=28375096<br />
* and please do not hesitate to reach out to me: janniss@ethz.ch<br />
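To make the encode/decode split above concrete, here is a small, self-contained NumPy sketch of LUT-based approximate matrix multiplication in the spirit of MADDness. It is illustrative only: the prototypes are picked by sampling rows of A rather than by the learned hash-based encoder of the paper, and all function names are ours for illustration, not the project's actual API.<br />

```python
import numpy as np

def train_prototypes(A, C, K, seed=0):
    """Pick K prototype sub-vectors per codebook by sampling rows of A
    (a stand-in for the learned encoder of the MADDness paper)."""
    rng = np.random.default_rng(seed)
    splits = np.array_split(np.arange(A.shape[1]), C)   # C disjoint column groups
    protos = [A[rng.choice(A.shape[0], K, replace=False)][:, s] for s in splits]
    return splits, protos                               # protos[c] has shape (K, |s_c|)

def encode(A, splits, protos):
    """Encoding: map each row of A to its nearest prototype index per codebook."""
    idx = np.empty((A.shape[0], len(splits)), dtype=np.int64)
    for c, (s, P) in enumerate(zip(splits, protos)):
        dists = ((A[:, s][:, None, :] - P[None, :, :]) ** 2).sum(-1)  # (N, K)
        idx[:, c] = dists.argmin(1)
    return idx

def build_lut(B, splits, protos):
    """Precompute the partial products prototype @ B-subblock per codebook."""
    return [P @ B[s, :] for s, P in zip(splits, protos)]  # each entry: (K, M)

def decode(idx, lut):
    """Decoding: approximate A @ B by summing looked-up partial products."""
    return sum(lut[c][idx[:, c], :] for c in range(idx.shape[1]))

A = np.random.randn(256, 64).astype(np.float32)
B = np.random.randn(64, 32).astype(np.float32)
splits, protos = train_prototypes(A, C=16, K=16)
approx = decode(encode(A, splits, protos), build_lut(B, splits, protos))
```

The approximation quality depends entirely on the encoder; the real MADDness encoder replaces the nearest-prototype search with a few cheap threshold comparisons, which is exactly what the hardware exploits.<br />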
<br />
= Project Plan =<br />
1. Acquire background knowledge & familiarize with the project (3 weeks)<br />
* Read up on the MADDness algorithm and product quantization methods<br />
* Familiarize yourself with the current state of the project<br />
* Familiarize yourself with the IIS compute environment<br />
2. Set up the project & rerun the single-layer analysis (2 weeks)<br />
* Set up the project and rerun a single-layer analysis (for example, for ResNet-50)<br />
* Update the single-layer analysis with a larger LeViT model than previously used<br />
3. Set up and evaluate a first retraining pipeline (8 weeks)<br />
* Use the simple method of replacing one layer with MADDness and then retraining the following layers. After that, freeze the replaced layer and proceed with the next one.<br />
* Evaluate and optimize the pipeline, including a detailed analysis of the accuracy development for the ResNet-50, LeViT and DS-CNN networks<br />
* Integrate the developed retraining framework into the existing learning framework<br />
4. Extend the MADDness algorithm with intra-layer optimizations (10 weeks)<br />
* Include the activation function in the MADDness algorithm<br />
* Can we optimize memory bandwidth and/or compute by computing only the dimensions needed for the following MADDness layer?<br />
* Is the encoding function that we are using the most accurate one? Can we improve it?<br />
5. Project finalization (3 weeks)<br />
* Prepare the final report<br />
* Prepare the project presentation<br />
* Clean up the code<br />
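The replace-freeze-retrain idea of step 3 can be sketched as a simple schedule. This is only an outline of the strategy: `replace_with_maddness`, `retrain`, and `accuracy` are hypothetical stand-ins for the project's actual training utilities, and the acceptance criterion is our illustrative choice.<br />

```python
def progressive_replacement(layers, replace_with_maddness, retrain, accuracy,
                            max_drop=0.01):
    """Replace layers one at a time with their MADDness version; after each
    replacement, retrain only the layers that follow, and keep the change
    only if accuracy stays within `max_drop` of the original baseline."""
    baseline = accuracy(layers)
    replaced = []
    for i in range(len(layers)):
        # Swap layer i for its MADDness drop-in, leaving the rest unchanged.
        candidate = layers[:i] + [replace_with_maddness(layers[i])] + layers[i + 1:]
        # Retrain only the downstream layers; the replaced layer stays frozen.
        candidate = retrain(candidate, trainable=range(i + 1, len(layers)))
        if baseline - accuracy(candidate) <= max_drop:
            layers = candidate          # accept: freeze the replaced layer, move on
            replaced.append(i)
    return layers, replaced
```

With real PyTorch modules, `replace_with_maddness` would swap in the existing drop-in MADDness layers and `retrain` would fine-tune the still-trainable tail of the network.<br />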
<br />
<br />
== Character ==<br />
<br />
* 20% Literature / project review<br />
* 40% Retraining pipeline implementation in Python<br />
* 30% Algorithmic optimisations<br />
* 10% Detailed analysis and preparation of results<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in Deep Learning and Hardware accelerators<br />
* Experience with Python, preferably with PyTorch or a similar machine learning framework (e.g. TensorFlow)<br />
<br />
<br />
If you want to work on this project, but you think that you do not match some of the required skills, please get in touch with us and we can provide preliminary exercises to help you fill in the gap.<br />
<br />
<br />
===Status: Available ===</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Approximate_Matrix_Multiplication_based_Hardware_Accelerator_to_achieve_the_next_10x_in_Energy_Efficiency:_Training_Strategy_And_Algorithmic_optimizations&diff=8040Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Training Strategy And Algorithmic optimizations2022-09-13T09:16:27Z<p>Andrire: /* Project Details */</p>
<hr />
<div><!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --><br />
<br />
[[Category:Acceleration_and_Transprecision]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Deep Learning Projects]]<br />
<br />
[[Category:2022]]<br />
[[Category:Master Thesis]]<br />
[[Category:Available]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Master Thesis (1 student)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** Jannis Schönleber: [mailto:janniss@ethz.ch janniss@ethz.ch]<br />
** Lukas Cavigelli (Huawei), [mailto:lukas.cavigelli@huawei.com lukas.cavigelli@huawei.com]<br />
** Renzo Andri (Huawei), [mailto:renzo.andri@huawei.com renzo.andri@huawei.com]<br />
<br />
= Introduction =<br />
[[File:maddness_floorplan.png|thumb|350px|Figure 1: Clock layout of the MADDness accelerator using ASAP7 technology]]<br />
The continued growth in DNN model parameter count, application domains, and general adoption has led to an explosion in the required computing power and energy. Energy needs in particular have become large enough to be economically unviable or to make systems extremely difficult to cool, driving a push for more energy-efficient solutions. Energy-efficient accelerators have a long tradition at IIS, with a multitude of proven designs published in the past. Standard accelerator architectures try to increase throughput via higher memory bandwidth, an improved memory hierarchy, or reduced precision (FP16, INT8, INT4). The accelerator in this project takes a different approach: it uses an approximate matrix multiplication (AMM) algorithm called MADDness, which replaces the matrix multiplication with a lookup into a look-up table (LUT) and an addition. This can significantly reduce the overall compute and energy needs.<br />
<br />
= Project Details =<br />
<br />
The MADDness algorithm is split into two parts: an encoding part, which translates the input matrix A into addresses of the LUT, and a decoding part, which adds the corresponding LUT values together to calculate the approximate output of the matrix multiplication. MADDness is then integrated into deep neural networks. The layers most commonly seen in DNNs are convolutional and linear layers; both can be replaced by MADDness. Fully tested drop-in PyTorch layers have already been developed and used. So far, only a single-layer replacement analysis has been done rigorously: the layers have been replaced with the MADDness algorithm, but the network has not been retrained on the resulting new layer outputs. <br />
Energy estimates for the current implementation in GF 22nm FDX technology suggest an energy efficiency of up to 32 TMACs/W, compared to around 0.7 TMACs/W (FP16) for a state-of-the-art datacenter GPU such as the NVIDIA A100 (TSMC 7nm FinFET).<br />
In this project, we would like to investigate whether we can improve the accelerator's accuracy by implementing a retraining strategy and framework. The goal is to be able to replace multiple layers of a DNN without a significant drop in accuracy. A new realm of possible inter-layer optimizations can then be analyzed afterwards, for example computing only the dimensions needed by the next MADDness layer, or including the activation function in the MADDness algorithm. <br />
More information can be found here:<br />
* Code: https://github.com/joennlae/halutmatmul<br />
* Reference Paper: https://arxiv.org/abs/2106.10860<br />
* HN discussion: https://news.ycombinator.com/item?id=28375096<br />
* and please do not hesitate to reach out to me: janniss@ethz.ch<br />
<br />
= Project Plan =<br />
1. Acquire background knowledge & familiarize with the project (3 weeks)<br />
* Read up on the MADDness algorithm and product quantization methods<br />
* Familiarize yourself with the current state of the project<br />
* Familiarize yourself with the IIS compute environment<br />
2. Set up the project & rerun the single-layer analysis (2 weeks)<br />
* Set up the project and rerun a single-layer analysis (for example, for ResNet-50)<br />
* Update the single-layer analysis with a larger LeViT model than previously used<br />
3. Set up and evaluate a first retraining pipeline (8 weeks)<br />
* Use the simple method of replacing one layer with MADDness and then retraining the following layers. After that, freeze the replaced layer and proceed with the next one.<br />
* Evaluate and optimize the pipeline, including a detailed analysis of the accuracy development for the ResNet-50, LeViT and DS-CNN networks<br />
* Integrate the developed retraining framework into the existing learning framework<br />
4. Extend the MADDness algorithm with intra-layer optimizations (10 weeks)<br />
* Include the activation function in the MADDness algorithm<br />
* Can we optimize memory bandwidth and/or compute by computing only the dimensions needed for the following MADDness layer?<br />
* Is the encoding function that we are using the most accurate one? Can we improve it?<br />
5. Project finalization (3 weeks)<br />
* Prepare the final report<br />
* Prepare the project presentation<br />
* Clean up the code<br />
<br />
<br />
== Character ==<br />
<br />
* 20% Literature / project review<br />
* 40% Retraining pipeline implementation in Python<br />
* 30% Algorithmic optimisations<br />
* 10% Detailed analysis and preparation of results<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in Deep Learning and Hardware accelerators<br />
* Experience with Python, preferably with PyTorch or a similar machine learning framework (e.g. TensorFlow)<br />
<br />
<br />
If you want to work on this project, but you think that you do not match some of the required skills, please get in touch with us and we can provide preliminary exercises to help you fill in the gap.<br />
<br />
<br />
===Status: Available ===</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Approximate_Matrix_Multiplication_based_Hardware_Accelerator_to_achieve_the_next_10x_in_Energy_Efficiency:_Training_Strategy_And_Algorithmic_optimizations&diff=8039Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Training Strategy And Algorithmic optimizations2022-09-13T09:15:44Z<p>Andrire: </p>
<hr />
<div><!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --><br />
<br />
[[Category:Acceleration_and_Transprecision]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Deep Learning Projects]]<br />
<br />
[[Category:2022]]<br />
[[Category:Master Thesis]]<br />
[[Category:Available]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Master Thesis (1 student)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** Jannis Schönleber: [mailto:janniss@ethz.ch janniss@ethz.ch]<br />
** Lukas Cavigelli (Huawei), [mailto:lukas.cavigelli@huawei.com lukas.cavigelli@huawei.com]<br />
** Renzo Andri (Huawei), [mailto:renzo.andri@huawei.com renzo.andri@huawei.com]<br />
<br />
= Introduction =<br />
[[File:maddness_floorplan.png|thumb|350px|Figure 1: Clock layout of the MADDness accelerator using ASAP7 technology]]<br />
The continued growth in DNN model parameter count, application domains, and general adoption has led to an explosion in the required computing power and energy. Energy needs in particular have become large enough to be economically unviable or to make systems extremely difficult to cool, driving a push for more energy-efficient solutions. Energy-efficient accelerators have a long tradition at IIS, with a multitude of proven designs published in the past. Standard accelerator architectures try to increase throughput via higher memory bandwidth, an improved memory hierarchy, or reduced precision (FP16, INT8, INT4). The accelerator in this project takes a different approach: it uses an approximate matrix multiplication (AMM) algorithm called MADDness, which replaces the matrix multiplication with a lookup into a look-up table (LUT) and an addition. This can significantly reduce the overall compute and energy needs.<br />
<br />
= Project Details =<br />
<br />
The MADDness algorithm is split into two parts: an encoding part, which translates the input matrix A into addresses of the LUT, and a decoding part, which adds the corresponding LUT values together to calculate the approximate output of the matrix multiplication. MADDness is then integrated into deep neural networks. The layers most commonly seen in DNNs are convolutional and linear layers; both can be replaced by MADDness. Fully tested drop-in PyTorch layers have already been developed and used. So far, only a single-layer replacement analysis has been done rigorously: the layers have been replaced with the MADDness algorithm, but the network has not been retrained on the resulting new layer outputs. <br />
Energy estimates for the current implementation in GF 22nm FDX technology suggest an energy efficiency of up to 32 TMACs/W, compared to around 0.7 TMACs/W (FP16) for a state-of-the-art datacenter GPU such as the NVIDIA A100 (TSMC 7nm FinFET).<br />
In this project, we would like to investigate whether we can improve the accelerator's accuracy by implementing a retraining strategy and framework. The goal is to be able to replace multiple layers of a DNN without a significant drop in accuracy. A new realm of possible inter-layer optimizations can then be analyzed afterwards, for example computing only the dimensions needed by the next MADDness layer, or including the activation function in the MADDness algorithm. <br />
More information can be found here:<br />
* Code: https://github.com/joennlae/halutmatmul<br />
* Reference Paper: https://arxiv.org/abs/2106.10860<br />
* HN discussion: https://news.ycombinator.com/item?id=28375096<br />
* and please do not hesitate to reach out to me: janniss@ethz.ch<br />
<br />
<br />
= Project Plan =<br />
1. Acquire background knowledge & familiarize with the project (3 weeks)<br />
* Read up on the MADDness algorithm and product quantization methods<br />
* Familiarize yourself with the current state of the project<br />
* Familiarize yourself with the IIS compute environment<br />
2. Set up the project & rerun the single-layer analysis (2 weeks)<br />
* Set up the project and rerun a single-layer analysis (for example, for ResNet-50)<br />
* Update the single-layer analysis with a larger LeViT model than previously used<br />
3. Set up and evaluate a first retraining pipeline (8 weeks)<br />
* Use the simple method of replacing one layer with MADDness and then retraining the following layers. After that, freeze the replaced layer and proceed with the next one.<br />
* Evaluate and optimize the pipeline, including a detailed analysis of the accuracy development for the ResNet-50, LeViT and DS-CNN networks<br />
* Integrate the developed retraining framework into the existing learning framework<br />
4. Extend the MADDness algorithm with intra-layer optimizations (10 weeks)<br />
* Include the activation function in the MADDness algorithm<br />
* Can we optimize memory bandwidth and/or compute by computing only the dimensions needed for the following MADDness layer?<br />
* Is the encoding function that we are using the most accurate one? Can we improve it?<br />
5. Project finalization (3 weeks)<br />
* Prepare the final report<br />
* Prepare the project presentation<br />
* Clean up the code<br />
<br />
<br />
== Character ==<br />
<br />
* 20% Literature / project review<br />
* 40% Retraining pipeline implementation in Python<br />
* 30% Algorithmic optimisations<br />
* 10% Detailed analysis and preparation of results<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in Deep Learning and Hardware accelerators<br />
* Experience with Python, preferably with PyTorch or a similar machine learning framework (e.g. TensorFlow)<br />
<br />
<br />
If you want to work on this project, but you think that you do not match some of the required skills, please get in touch with us and we can provide preliminary exercises to help you fill in the gap.<br />
<br />
<br />
===Status: Available ===</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Approximate_Matrix_Multiplication_based_Hardware_Accelerator_to_achieve_the_next_10x_in_Energy_Efficiency:_Training_Strategy_And_Algorithmic_optimizations&diff=8038Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Training Strategy And Algorithmic optimizations2022-09-13T09:14:44Z<p>Andrire: Created page with "<!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --> Category:Digital Cat..."</p>
<hr />
<div><!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --><br />
<br />
[[Category:Digital]]<br />
[[Category:Acceleration_and_Transprecision]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Deep Learning Projects]]<br />
<br />
[[Category:2022]]<br />
[[Category:Master Thesis]]<br />
[[Category:Available]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Master Thesis (1 student)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** Jannis Schönleber: [mailto:janniss@ethz.ch janniss@ethz.ch]<br />
** Lukas Cavigelli (Huawei), [mailto:lukas.cavigelli@huawei.com lukas.cavigelli@huawei.com]<br />
** Renzo Andri (Huawei), [mailto:renzo.andri@huawei.com renzo.andri@huawei.com]<br />
<br />
= Introduction =<br />
[[File:maddness_floorplan.png|thumb|350px|Figure 1: Clock layout of the MADDness accelerator using ASAP7 technology]]<br />
The continued growth in DNN model parameter count, application domains, and general adoption has led to an explosion in the required computing power and energy. Energy needs in particular have become large enough to be economically unviable or to make systems extremely difficult to cool, driving a push for more energy-efficient solutions. Energy-efficient accelerators have a long tradition at IIS, with a multitude of proven designs published in the past. Standard accelerator architectures try to increase throughput via higher memory bandwidth, an improved memory hierarchy, or reduced precision (FP16, INT8, INT4). The accelerator in this project takes a different approach: it uses an approximate matrix multiplication (AMM) algorithm called MADDness, which replaces the matrix multiplication with a lookup into a look-up table (LUT) and an addition. This can significantly reduce the overall compute and energy needs.<br />
<br />
= Project Details =<br />
<br />
The MADDness algorithm is split into two parts: an encoding part, which translates the input matrix A into addresses of the LUT, and a decoding part, which adds the corresponding LUT values together to calculate the approximate output of the matrix multiplication. MADDness is then integrated into deep neural networks. The layers most commonly seen in DNNs are convolutional and linear layers; both can be replaced by MADDness. Fully tested drop-in PyTorch layers have already been developed and used. So far, only a single-layer replacement analysis has been done rigorously: the layers have been replaced with the MADDness algorithm, but the network has not been retrained on the resulting new layer outputs. <br />
Energy estimates for the current implementation in GF 22nm FDX technology suggest an energy efficiency of up to 32 TMACs/W, compared to around 0.7 TMACs/W (FP16) for a state-of-the-art datacenter GPU such as the NVIDIA A100 (TSMC 7nm FinFET).<br />
In this project, we would like to investigate whether we can improve the accelerator's accuracy by implementing a retraining strategy and framework. The goal is to be able to replace multiple layers of a DNN without a significant drop in accuracy. A new realm of possible inter-layer optimizations can then be analyzed afterwards, for example computing only the dimensions needed by the next MADDness layer, or including the activation function in the MADDness algorithm. <br />
More information can be found here:<br />
* Code: https://github.com/joennlae/halutmatmul<br />
* Reference Paper: https://arxiv.org/abs/2106.10860<br />
* HN discussion: https://news.ycombinator.com/item?id=28375096<br />
* and please do not hesitate to reach out to me: janniss@ethz.ch<br />
<br />
<br />
= Project Plan =<br />
1. Acquire background knowledge & familiarize with the project (3 weeks)<br />
* Read up on the MADDness algorithm and product quantization methods<br />
* Familiarize yourself with the current state of the project<br />
* Familiarize yourself with the IIS compute environment<br />
2. Set up the project & rerun the single-layer analysis (2 weeks)<br />
* Set up the project and rerun a single-layer analysis (for example, for ResNet-50)<br />
* Update the single-layer analysis with a larger LeViT model than previously used<br />
3. Set up and evaluate a first retraining pipeline (8 weeks)<br />
* Use the simple method of replacing one layer with MADDness and then retraining the following layers. After that, freeze the replaced layer and proceed with the next one.<br />
* Evaluate and optimize the pipeline, including a detailed analysis of the accuracy development for the ResNet-50, LeViT and DS-CNN networks<br />
* Integrate the developed retraining framework into the existing learning framework<br />
4. Extend the MADDness algorithm with intra-layer optimizations (10 weeks)<br />
* Include the activation function in the MADDness algorithm<br />
* Can we optimize memory bandwidth and/or compute by computing only the dimensions needed for the following MADDness layer?<br />
* Is the encoding function that we are using the most accurate one? Can we improve it?<br />
5. Project finalization (3 weeks)<br />
* Prepare the final report<br />
* Prepare the project presentation<br />
* Clean up the code<br />
<br />
<br />
== Character ==<br />
<br />
* 20% Literature / project review<br />
* 40% Retraining pipeline implementation in Python<br />
* 30% Algorithmic optimisations<br />
* 10% Detailed analysis and preparation of results<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in Deep Learning and Hardware accelerators<br />
* Experience with Python, preferably with PyTorch or a similar machine learning framework (e.g. TensorFlow)<br />
<br />
<br />
If you want to work on this project, but you think that you do not match some of the required skills, please get in touch with us and we can provide preliminary exercises to help you fill in the gap.<br />
<br />
<br />
===Status: Available ===</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Approximate_Matrix_Multiplication_based_Hardware_Accelerator_to_achieve_the_next_10x_in_Energy_Efficiency:_Full_System_Intregration&diff=8037Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Intregration2022-09-13T09:11:26Z<p>Andrire: </p>
<hr />
<div><!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --><br />
<br />
[[Category:Digital]]<br />
[[Category:Acceleration_and_Transprecision]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Deep Learning Projects]]<br />
<br />
[[Category:2022]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Available]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Semester Thesis (2 students), Master Thesis (1 student)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** Jannis Schönleber: [mailto:janniss@ethz.ch janniss@ethz.ch]<br />
** Lukas Cavigelli (Huawei), [mailto:lukas.cavigelli@huawei.com lukas.cavigelli@huawei.com]<br />
** Renzo Andri (Huawei), [mailto:renzo.andri@huawei.com renzo.andri@huawei.com]<br />
<br />
= Introduction =<br />
[[File:maddness_floorplan.png|thumb|350px|Floorplan of the MADDness accelerator.]]<br />
The continued growth in DNN model parameter count, application domains, and general adoption has led to an explosion in the required computing power and energy. Energy needs in particular have become large enough to be economically unviable or to make systems extremely difficult to cool, driving a push for more energy-efficient solutions. Energy-efficient accelerators have a long tradition at IIS, with a multitude of proven designs published in the past. Standard accelerator architectures try to increase throughput via higher memory bandwidth, an improved memory hierarchy, or reduced precision (FP16, INT8, INT4). The accelerator in this project takes a different approach: it uses an approximate matrix multiplication (AMM) algorithm called MADDness, which replaces the matrix multiplication with a lookup into a look-up table (LUT) and an addition. This can significantly reduce the overall compute and energy needs.<br />
<br />
= Project Details =<br />
<br />
The MADDness algorithm is split into two parts: an encoding part, which translates the input matrix A into addresses of the LUT, and a decoding part, which adds the corresponding LUT values together to calculate the approximate output of the matrix multiplication. MADDness is then integrated into deep neural networks. The layers most commonly seen in DNNs are convolutional and linear layers; both can be replaced by MADDness. The current RTL implementation is fully post-simulation tested and includes both the encoder and the decoder unit. Additionally, a post-layout simulation-based energy estimation has been done. The accelerator is not yet integrated into a full system.<br />
Energy estimates for the current implementation in GF 22nm FDX technology suggest an energy efficiency of up to 32 TMACs/W, compared to around 0.7 TMACs/W (FP16) for a state-of-the-art datacenter GPU such as the NVIDIA A100 (TSMC 7nm FinFET).<br />
In this project, we would like to integrate the accelerator into a full system. The aim is a tape-out-ready full system that includes the MADDness accelerator, together with a suitable memory hierarchy to support the bandwidth needs. We envision integration into one of the existing PULP systems (for example, PULP clusters or ARA). Evaluating which system suits the accelerator best and defining the final architecture is part of the thesis.<br />
More information can be found here:<br />
* Code: https://github.com/joennlae/halutmatmul<br />
* Reference Paper: https://arxiv.org/abs/2106.10860<br />
* HN discussion: https://news.ycombinator.com/item?id=28375096<br />
* and please do not hesitate to reach out to me: janniss@ethz.ch<br />
<br />
= Project Plan =<br />
Acquire background knowledge & familiarize with the project<br />
* Read up on the MADDness algorithm and product quantization methods<br />
* Familiarize yourself with the current state of the project<br />
* Familiarize yourself with the IIS compute environment<br />
Set up the project & rerun RTL simulations<br />
* Set up the project with all its dependencies<br />
* Rerun the current RTL simulations<br />
Evaluate suitable systems for integration and refine an architecture<br />
* Define bandwidth needs and brainstorm suitable memory hierarchies<br />
* Spreadsheet-based evaluation of the different target systems (e.g., PULP clusters, ARA); this includes exploring different configurations and estimating the resulting chip size<br />
* Decide on a final architecture to pursue for the remainder of the project<br />
Integrate the accelerator into the defined architecture<br />
* Implement the integration in SystemVerilog & add testbenches<br />
* Replace the currently used standard-cell memories with compiled memories<br />
Set up the design flow<br />
* Set up and integrate the design flow (most likely tsmc65) into the project<br />
Synthesize + place-and-route & make the design tape-out ready<br />
* Synthesize + place-and-route the design<br />
* Get a working post-layout simulation<br />
* Place macros & power routing, IR-drop checks<br />
* The goal is to have everything ready for a design review: http://eda.ee.ethz.ch/index.php?title=Design_review (ETH domain)<br />
Project finalization<br />
* Prepare the final report<br />
* Prepare the project presentation<br />
* Clean up the code<br />
<br />
<br />
== Character ==<br />
<br />
* 15% Literature / architecture review<br />
* 15% Design Evaluation<br />
* 30% RTL implementation (SystemVerilog)<br />
* 10% low-level software implementation (C)<br />
* 30% ASIC tape-out preparation<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in computer architecture<br />
* Experience with digital design in SystemVerilog as taught in VLSI I<br />
* Experience with ASIC implementation flow (synthesis) as taught in VLSI II<br />
* Light experience with C or a comparable language for low-level SW glue code<br />
<br />
If you want to work on this project, but you think that you do not match some of the required skills, please get in touch with us and we can provide preliminary exercises to help you fill in the gap.<br />
<br />
<br />
===Status: Available ===</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Approximate_Matrix_Multiplication_based_Hardware_Accelerator_to_achieve_the_next_10x_in_Energy_Efficiency:_Full_System_Intregration&diff=8036Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Intregration2022-09-13T09:11:06Z<p>Andrire: </p>
<hr />
<div><!-- Approximate Matrix Multiplication based Hardware Accelerator to achieve the next 10x in Energy Efficiency: Full System Integration (2S,1M) --><br />
<br />
[[Category:Digital]]<br />
[[Category:Acceleration_and_Transprecision]]<br />
[[Category:High Performance SoCs]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Deep Learning Projects]]<br />
<br />
[[Category:2022]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Available]]<br />
<br />
<br />
= Overview =<br />
<br />
== Status: Available ==<br />
<br />
* Type: Semester Thesis (2 students), Master Thesis (1 student)<br />
* Professor: Prof. Dr. L. Benini<br />
* Supervisors:<br />
** Jannis Schönleber: [mailto:janniss@ethz.ch janniss@ethz.ch]<br />
** Lukas Cavigelli (Huawei), [mailto:lukas.cavigelli@huawei.com lukas.cavigelli@huawei.com]<br />
** Renzo Andri (Huawei), [mailto:renzo.andri@huawei.com renzo.andri@huawei.com]<br />
<br />
= Introduction =<br />
[[File:maddness_floorplan.png|thumb|350px|Floorplan of the MADDness accelerator.]]<br />
The continued growth in DNN model parameter count, application domains, and general adoption has led to an explosion in the required computing power and energy. Energy needs in particular have become large enough to be economically unviable or to make systems extremely difficult to cool, driving a push for more energy-efficient solutions. Energy-efficient accelerators have a long tradition at IIS, with a multitude of proven designs published in the past. Standard accelerator architectures try to increase throughput via higher memory bandwidth, an improved memory hierarchy, or reduced precision (FP16, INT8, INT4). The accelerator in this project takes a different approach: it uses an approximate matrix multiplication (AMM) algorithm called MADDness, which replaces the matrix multiplication with a lookup into a look-up table (LUT) and an addition. This can significantly reduce the overall compute and energy needs.<br />
<br />
= Project Details =<br />
<br />
The MADDness algorithm is split into two parts: an encoding part, which translates the input matrix A into addresses of the LUT, and a decoding part, which adds the corresponding LUT values together to calculate the approximate output of the matrix multiplication. MADDness is then integrated into deep neural networks. The most common layers in DNNs are convolutional and linear layers, both of which can be replaced by MADDness. The current RTL implementation is fully tested in simulation and includes both the encoder and the decoder unit. Additionally, a post-layout simulation-based energy estimation has been done. The accelerator is not yet integrated into a full system.<br />
Energy estimates for the current implementation in GF 22nm FDX technology suggest an energy efficiency of up to 32 TMACs/W, compared to around 0.7 TMACs/W (FP16) for a state-of-the-art datacenter NVIDIA A100 (TSMC 7nm FinFET).<br />
In this project, we would like to integrate the accelerator into a full system. The goal is a tape-out-ready full system that includes the MADDness accelerator, together with a suitable memory hierarchy to support its bandwidth needs. We envision an integration into one of the existing PULP systems (for example, a PULP cluster or ARA). Evaluating which system suits the accelerator best and defining the final architecture are part of the thesis.<br />
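As a rough illustration of the encode/decode split described above, here is a pure-Python sketch of product-quantization-style AMM. It is a simplified stand-in, not the actual halutmatmul implementation: real MADDness learns hash-tree encoders, whereas this sketch encodes a subvector by picking its nearest prototype, and all function names are hypothetical.<br />

```python
# Simplified product-quantization AMM sketch (illustrative only; names are
# hypothetical and do not come from the halutmatmul codebase).

def encode(row, prototypes):
    """Encoding: map each subvector of `row` to a LUT address (prototype index)."""
    codes = []
    for c, protos in enumerate(prototypes):  # one codebook per column group
        d = len(protos[0])                   # subvector length
        sub = row[c * d:(c + 1) * d]
        dists = [sum((s - p) ** 2 for s, p in zip(sub, proto)) for proto in protos]
        codes.append(dists.index(min(dists)))
    return codes

def build_lut(prototypes, b_col):
    """Offline: precompute each prototype's dot product with a column of B."""
    lut = []
    for c, protos in enumerate(prototypes):
        d = len(protos[0])
        sub_b = b_col[c * d:(c + 1) * d]
        lut.append([sum(p * b for p, b in zip(proto, sub_b)) for proto in protos])
    return lut

def decode(codes, lut):
    """Decoding: the approximate dot product is only LUT reads and additions."""
    return sum(lut[c][code] for c, code in enumerate(codes))
```

When a row's subvectors happen to match the prototypes exactly, `decode(encode(row, prototypes), build_lut(prototypes, b_col))` reproduces the exact dot product; in general it is an approximation whose error depends on how well the learned prototypes cover the input distribution.<br />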
More information can be found here:<br />
* Code: https://github.com/joennlae/halutmatmul<br />
* Reference Paper: https://arxiv.org/abs/2106.10860<br />
* HN discussion: https://news.ycombinator.com/item?id=28375096<br />
* and please do not hesitate to reach out to me: janniss@ethz.ch<br />
<br />
= Project Plan =<br />
Acquire background knowledge & familiarize with the project <br />
* Read up on the MADDness algorithm and product quantization methods<br />
* Familiarize yourself with the current state of the project<br />
* Familiarize yourself with the IIS compute environment<br />
Set up the project & rerun RTL simulations <br />
* Set up the project with all its dependencies<br />
* Try to rerun the current RTL simulations<br />
Evaluate suitable systems to integrate and refine an architecture <br />
* Define bandwidth needs and brainstorm suitable memory hierarchies<br />
* Spreadsheet-based evaluation of the different target systems (e.g., PULP clusters, ARA); this includes exploring different configurations and estimating the size of the chip<br />
* Decide on a final architecture that we will pursue for the remainder of the project<br />
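For the bandwidth evaluation above, a quick back-of-the-envelope calculation can anchor the spreadsheet: in a MADDness-style decoder, each output element needs one LUT read per codebook every cycle. The sketch below is illustrative only; all parameter values (outputs per cycle, codebook count, LUT entry width, clock frequency) are assumptions for demonstration, not figures from the actual accelerator.<br />

```python
# Illustrative LUT read bandwidth estimate (all parameter values assumed).

def lut_read_bandwidth_gbs(outputs_per_cycle, num_codebooks, lut_entry_bits, freq_mhz):
    """Each output element requires one LUT read per codebook every cycle."""
    bits_per_cycle = outputs_per_cycle * num_codebooks * lut_entry_bits
    bytes_per_second = bits_per_cycle / 8 * freq_mhz * 1e6
    return bytes_per_second / 1e9  # GB/s

# Example: 16 outputs/cycle, 32 codebooks, 16-bit LUT entries, 500 MHz
print(lut_read_bandwidth_gbs(16, 32, 16, 500))  # -> 512.0
```

Plugging candidate systems' clock frequencies and memory port widths into such a model gives a first-order feasibility check before any RTL work.<br />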
Integrate the accelerator into the defined architecture <br />
* Implement the integration in SystemVerilog & add testbenches<br />
* Replace the currently used standard cell memories with compiled memories<br />
Set up the design flow <br />
* Set up and integrate the design flow (most likely tsmc65) into the project<br />
Synthesize + Place-and-Route & make design tape-out ready <br />
* Synthesize + Place-and-Route the design<br />
* Get a working post-layout simulation<br />
* Place macros & power routing, IR drop checks<br />
* The goal is to have everything ready for a design review: http://eda.ee.ethz.ch/index.php?title=Design_review (ETH domain)<br />
Project finalization <br />
* Prepare final report<br />
* Prepare project presentation<br />
* Clean up code<br />
<br />
<br />
== Character ==<br />
<br />
* 15% Literature / architecture review<br />
* 15% Design Evaluation<br />
* 30% RTL implementation (SystemVerilog)<br />
* 10% low-level software implementation (C)<br />
* 30% ASIC tape-out preparation<br />
<br />
== Prerequisites ==<br />
<br />
* Strong interest in computer architecture<br />
* Experience with digital design in SystemVerilog as taught in VLSI I<br />
* Experience with ASIC implementation flow (synthesis) as taught in VLSI II<br />
* Some experience with C or a comparable language for low-level SW glue code<br />
<br />
<br />
===Status: Not Available ===</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=7883Huawei Research2022-08-01T09:11:18Z<p>Andrire: /* Available and On-Going Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available and On-Going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics, and we supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 4%;"|Status !! style="width: 2%;"|Year !! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
| open || 2022 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://careers.huaweirc.ch/jobs/1946580-internship-digital-ic-design-engineer HW Design and Enhancement for ML Acceleration System] || AI Acceleration || digital VLSI design || [mailto:renzo.andri@huawei.com Renzo Andri]<br />
|- <br />
| open || 2022 || Internship || Digital VLSI Design Intern (ML Acceleration)|| We offer up to two internships from autumn on ML acceleration on Ascend. Details follow. If you are interested, get in contact. || AI Acceleration || digital VLSI design || [mailto:renzo.andri@huawei.com Renzo Andri]<br />
|- <br />
| finished || 2021 || Semester Thesis || [[New RVV 1.0 Vector Instructions for Ara]] || Extending the Ara vector processor for full compliance with the RVV 1.0 vector standard || Processor Design || digital VLSI design || [[:User:Mperotti | Matteo Perotti]] (ETH), Renzo Andri (Huawei)<br />
|-<br />
| finished || 2021 || Semester Thesis || Digital VLSI Design (ML Acceleration)|| The Winograd algorithm is widely used for the efficient computation of the convolutions typical of ML applications (e.g., image classification), and a novel complex-Winograd variant shows promising properties for further reducing computational complexity. In this project, we would like to evaluate the actual benefits in hardware by designing an accelerator that exploits the new algorithm. || AI Acceleration || digital VLSI design || Renzo Andri, TBD PhD student at IIS<br />
|-<br />
| finished || 2021 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Renzo Andri<br />
|-<br />
| finished || 2021 || Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Renzo Andri, Lukas Cavigelli<br />
|-<br />
|}<br />
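The complex-Winograd project listed above builds on the classic real-valued Winograd convolution trick. As a rough illustration of where the multiplication savings come from, here is a minimal 1-D F(2,3) sketch (the function name and example values are ours for illustration only, not part of the project):<br />
<br />
```python
# 1-D Winograd F(2,3): two outputs of a 3-tap filter with
# 4 elementwise multiplications instead of the 6 a direct
# sliding dot product needs.
import numpy as np

# Standard F(2,3) transform matrices
B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Correlate a 3-tap filter g with a 4-sample input tile d -> 2 outputs."""
    m = (G @ g) * (B_T @ d)   # the 4 multiplications happen here
    return A_T @ m            # outputs recovered with additions only

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, 1.0, -1.0])
print(winograd_f23(d, g))     # equals the direct sliding dot product
```
<br />
The complex variant referenced in the project description applies the same transform idea over complex-valued tiles to save further multiplications; the sketch above only shows the real-valued baseline.<br />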
<br />
==Contact==<br />
* Renzo Andri, firstname.lastname at huawei com<br />
* Lukas Cavigelli, firstname.lastname at huawei com<br />
<br />
==Detailed Information==<br />
<br />
===Internship Digital VLSI Design for ML Acceleration (Taken)===<br />
This internship has been taken. If you are interested in similar topics, get in contact with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
'''Your Responsibilities'''<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters and computations from an ML framework to the HW Accelerator.<br />
* Synthesis and Backend/Layout and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
'''Requirements - Your background'''<br />
* You are currently enrolled in a Master's degree or PhD in electrical engineering, computer engineering, computer science, or a related field at a reputable university; or you graduated within the last six months<br />
* Solid Digital VLSI Design knowledge Front-end and preferably also Back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge of computer arithmetic.<br />
* Basic knowledge of machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here].</div>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available and On-Going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are open to discuss also other topics. We are also supervising master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us, we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 4%;"|Status !! style="width: 2%;"|Year !! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
| open || 2022 || Internship || Digital VLSI Design Intern (ML Acceleration)|| We offer up to two internships from autumn on ML acceleration on Ascend. Details follow. If you are interested, get in contact. || AI Acceleration || digital VLSI design || [mailto:renzo.andri@huawei.com Renzo Andri]<br />
|-<br />
| finished || 2021 || Semester Thesis || [[New RVV 1.0 Vector Instructions for Ara]] || Implementation and complimenting the Ara vector processor for full compliance of the RVV 1.0 Vector standard || Processor Design || digital VLSI design || [[:User:Mperotti | Matteo Perotti]] (ETH), Renzo Andri (Huawei),<br />
|-<br />
| finished || 2021 || Semester Thesis || Digital VLSI Design (ML Acceleration)|| Winograd has been exploited for efficient calculation of convolutions which are typically used in ML applications (e.g., image classification), a novel algorithm shows nice properties to use complex Winograd to further reduce the computational complexity. In this project, we would like to evaluate the actual benefits in HW by designing an accelerator exploiting the new algorithm. || AI Acceleration || digital VLSI design || Renzo Andri, TBD PhD student at IIS<br />
|-<br />
| finished || 2021 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Renzo Andri<br />
|-<br />
| finished || 2021 || Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Renzo Andri, Lukas Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
* Renzo Andri, firstname.lastname at huawei com<br />
* Lukas Cavigelli, firstname.lastname at huawei com<br />
<br />
==Detailed Information==<br />
<br />
===Internship Digital VLSI Design for ML Acceleration (Taken)===<br />
This internship has been taken, if you are interested in similar topics, get in contact with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member in our motivated and multicultural team, you will support to design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
'''Your Responsibilities'''<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters and computations from a ML framework to the HW Accelerator.<br />
* Synthesis and Backend/Layout and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
'''Requirements - Your background'''<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, compute engineering or computer science, or any related fields at a reputable university; or you graduated within the last six months<br />
* Solid Digital VLSI Design knowledge Front-end and preferably also Back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge in computer arithmetics.<br />
* Basic knowledge in machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested to develop with us the next generation of machine learning hardware, then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=7877Huawei Research2022-07-26T12:14:23Z<p>Andrire: /* Available and On-Going Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available and On-Going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are open to discuss also other topics. We are also supervising master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us, we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 4%;"|Status !! style="width: 2%;"|Year !! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
| open || 2022 || Internship || Digital VLSI Design Intern (ML Acceleration)|| We offer up to two internships from autumn on ML acceleration on Ascend. Details follow. If you are interested, get in contact. || AI Acceleration || digital VLSI design || [[mailto:renzo.andri@huawei.com Renzo Andri]]<br />
|-<br />
| finished || 2021 || Semester Thesis || [[New RVV 1.0 Vector Instructions for Ara]] || Implementation and complimenting the Ara vector processor for full compliance of the RVV 1.0 Vector standard || Processor Design || digital VLSI design || [[:User:Mperotti | Matteo Perotti]] (ETH), Renzo Andri (Huawei),<br />
|-<br />
| finished || 2021 || Semester Thesis || Digital VLSI Design (ML Acceleration)|| Winograd has been exploited for efficient calculation of convolutions which are typically used in ML applications (e.g., image classification), a novel algorithm shows nice properties to use complex Winograd to further reduce the computational complexity. In this project, we would like to evaluate the actual benefits in HW by designing an accelerator exploiting the new algorithm. || AI Acceleration || digital VLSI design || Renzo Andri, TBD PhD student at IIS<br />
|-<br />
| finished || 2021 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Renzo Andri<br />
|-<br />
| finished || 2021 || Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Renzo Andri, Lukas Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
* Renzo Andri, firstname.lastname at huawei com<br />
* Lukas Cavigelli, firstname.lastname at huawei com<br />
<br />
==Detailed Information==<br />
<br />
===Internship Digital VLSI Design for ML Acceleration (Taken)===<br />
This internship has been taken, if you are interested in similar topics, get in contact with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member in our motivated and multicultural team, you will support to design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
'''Your Responsibilities'''<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters and computations from a ML framework to the HW Accelerator.<br />
* Synthesis and Backend/Layout and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
'''Requirements - Your background'''<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, compute engineering or computer science, or any related fields at a reputable university; or you graduated within the last six months<br />
* Solid Digital VLSI Design knowledge Front-end and preferably also Back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge in computer arithmetics.<br />
* Basic knowledge in machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested to develop with us the next generation of machine learning hardware, then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=7876Huawei Research2022-07-26T12:14:14Z<p>Andrire: /* Available and On-Going Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available and On-Going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are open to discuss also other topics. We are also supervising master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us, we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 4%;"|Status !! style="width: 2%;"|Year !! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
| open || 2022 || Internship || Digital VLSI Design Intern (ML Acceleration)|| We offer up to two internships from autumn on ML acceleration on Ascend. Details follow. If you are interested, get in contact. || AI Acceleration || digital VLSI design || [[mailto:renzo.andri@huawei.com | Renzo Andri]]<br />
|-<br />
| finished || 2021 || Semester Thesis || [[New RVV 1.0 Vector Instructions for Ara]] || Implementation and complimenting the Ara vector processor for full compliance of the RVV 1.0 Vector standard || Processor Design || digital VLSI design || [[:User:Mperotti | Matteo Perotti]] (ETH), Renzo Andri (Huawei),<br />
|-<br />
| finished || 2021 || Semester Thesis || Digital VLSI Design (ML Acceleration)|| Winograd has been exploited for efficient calculation of convolutions which are typically used in ML applications (e.g., image classification), a novel algorithm shows nice properties to use complex Winograd to further reduce the computational complexity. In this project, we would like to evaluate the actual benefits in HW by designing an accelerator exploiting the new algorithm. || AI Acceleration || digital VLSI design || Renzo Andri, TBD PhD student at IIS<br />
|-<br />
| finished || 2021 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Renzo Andri<br />
|-<br />
| finished || 2021 || Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Renzo Andri, Lukas Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
* Renzo Andri, firstname.lastname at huawei com<br />
* Lukas Cavigelli, firstname.lastname at huawei com<br />
<br />
==Detailed Information==<br />
<br />
===Internship Digital VLSI Design for ML Acceleration (Taken)===<br />
This internship has been taken, if you are interested in similar topics, get in contact with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member in our motivated and multicultural team, you will support to design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
'''Your Responsibilities'''<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters and computations from a ML framework to the HW Accelerator.<br />
* Synthesis and Backend/Layout and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
'''Requirements - Your background'''<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, compute engineering or computer science, or any related fields at a reputable university; or you graduated within the last six months<br />
* Solid Digital VLSI Design knowledge Front-end and preferably also Back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge in computer arithmetics.<br />
* Basic knowledge in machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested to develop with us the next generation of machine learning hardware, then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=7875Huawei Research2022-07-26T12:14:00Z<p>Andrire: /* Available and On-Going Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available and On-Going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are open to discuss also other topics. We are also supervising master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us, we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 4%;"|Status !! style="width: 2%;"|Year !! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
| open || 2022 || Internship || Digital VLSI Design Intern (ML Acceleration)|| We offer up to two internships from autumn on ML acceleration on Ascend. Details follow. If you are interested, get in contact. || AI Acceleration || digital VLSI design || [mailto:renzo.andri@huawei.com | Renzo Andri]<br />
|-<br />
| finished || 2021 || Semester Thesis || [[New RVV 1.0 Vector Instructions for Ara]] || Implementation and complimenting the Ara vector processor for full compliance of the RVV 1.0 Vector standard || Processor Design || digital VLSI design || [[:User:Mperotti | Matteo Perotti]] (ETH), Renzo Andri (Huawei),<br />
|-<br />
| finished || 2021 || Semester Thesis || Digital VLSI Design (ML Acceleration)|| Winograd has been exploited for efficient calculation of convolutions which are typically used in ML applications (e.g., image classification), a novel algorithm shows nice properties to use complex Winograd to further reduce the computational complexity. In this project, we would like to evaluate the actual benefits in HW by designing an accelerator exploiting the new algorithm. || AI Acceleration || digital VLSI design || Renzo Andri, TBD PhD student at IIS<br />
|-<br />
| finished || 2021 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Renzo Andri<br />
|-<br />
| finished || 2021 || Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Renzo Andri, Lukas Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
* Renzo Andri, firstname.lastname at huawei com<br />
* Lukas Cavigelli, firstname.lastname at huawei com<br />
<br />
==Detailed Information==<br />
<br />
===Internship Digital VLSI Design for ML Acceleration (Taken)===<br />
This internship has been taken, if you are interested in similar topics, get in contact with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member in our motivated and multicultural team, you will support to design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
'''Your Responsibilities'''<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters and computations from a ML framework to the HW Accelerator.<br />
* Synthesis and Backend/Layout and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
'''Requirements - Your background'''<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, compute engineering or computer science, or any related fields at a reputable university; or you graduated within the last six months<br />
* Solid Digital VLSI Design knowledge Front-end and preferably also Back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge in computer arithmetics.<br />
* Basic knowledge in machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested to develop with us the next generation of machine learning hardware, then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=7874Huawei Research2022-07-26T12:13:52Z<p>Andrire: /* Available and On-Going Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available and On-Going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are open to discuss also other topics. We are also supervising master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us, we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 4%;"|Status !! style="width: 2%;"|Year !! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
| open || 2022 || Internship || Digital VLSI Design Intern (ML Acceleration)|| We offer up to two internships from autumn on ML acceleration on Ascend. Details follow. If you are interested, get in contact. || AI Acceleration || digital VLSI design || [[mailto:renzo.andri@huawei.com | Renzo Andri]]<br />
|-<br />
| finished || 2021 || Semester Thesis || [[New RVV 1.0 Vector Instructions for Ara]] || Implementation and complimenting the Ara vector processor for full compliance of the RVV 1.0 Vector standard || Processor Design || digital VLSI design || [[:User:Mperotti | Matteo Perotti]] (ETH), Renzo Andri (Huawei),<br />
|-<br />
| finished || 2021 || Semester Thesis || Digital VLSI Design (ML Acceleration)|| Winograd has been exploited for efficient calculation of convolutions which are typically used in ML applications (e.g., image classification), a novel algorithm shows nice properties to use complex Winograd to further reduce the computational complexity. In this project, we would like to evaluate the actual benefits in HW by designing an accelerator exploiting the new algorithm. || AI Acceleration || digital VLSI design || Renzo Andri, TBD PhD student at IIS<br />
|-<br />
| finished || 2021 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Renzo Andri<br />
|-<br />
| finished || 2021 || Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Renzo Andri, Lukas Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
* Renzo Andri, firstname.lastname at huawei com<br />
* Lukas Cavigelli, firstname.lastname at huawei com<br />
<br />
==Detailed Information==<br />
<br />
===Internship Digital VLSI Design for ML Acceleration (Taken)===<br />
This internship has been taken. If you are interested in similar topics, please get in contact with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
'''Your Responsibilities'''<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters and computations from an ML framework to the HW accelerator.<br />
* Synthesis and Backend/Layout and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
'''Requirements - Your background'''<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering, computer science, or a related field at a reputable university; or you graduated within the last six months<br />
* Solid digital VLSI design knowledge in front-end and preferably also back-end design (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge in computer arithmetic.<br />
* Basic knowledge in machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested to develop with us the next generation of machine learning hardware, then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=7611Huawei Research2022-02-09T13:17:02Z<p>Andrire: </p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available and On-Going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics, and we supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 4%;"|Status !! style="width: 2%;"|Year !! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
| open || 2021 || Semester Thesis || [[New RVV 1.0 Vector Instructions for Ara]] || Extending the Ara vector processor to achieve full compliance with the RVV 1.0 vector standard || Processor Design || digital VLSI design || [[:User:Mperotti | Matteo Perotti]] (ETH), Renzo Andri (Huawei)<br />
|-<br />
| on-going || 2021 || Semester Thesis || Digital VLSI Design (ML Acceleration)|| The Winograd algorithm has been exploited for the efficient calculation of convolutions, which are at the core of typical ML applications (e.g., image classification). A novel algorithm based on complex Winograd shows promising properties for further reducing the computational complexity. In this project, we would like to evaluate the actual benefits in HW by designing an accelerator that exploits the new algorithm. || AI Acceleration || digital VLSI design || Renzo Andri, TBD PhD student at IIS<br />
|-<br />
| on-going || 2021 || Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Renzo Andri<br />
|-<br />
| on-going || 2021 || Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Renzo Andri, Lukas Cavigelli<br />
|-<br />
|}<br />
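The complex-Winograd project above builds on the standard Winograd convolution trick. As a rough, real-valued sketch (the textbook F(2,3) case in NumPy, not the project's actual complex-valued algorithm), two outputs of a 3-tap convolution are obtained with 4 elementwise multiplications instead of 6:

```python
import numpy as np

# Winograd F(2,3) transform matrices (Lavin & Gray formulation)
B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs of a 1-D 3-tap convolution via 4 elementwise multiplies."""
    return A_T @ ((G @ g) * (B_T @ d))

d = np.array([1.0, 2.0, 3.0, 4.0])           # 4 input samples
g = np.array([0.5, 1.0, -1.0])               # 3 filter taps
direct = np.array([d[0:3] @ g, d[1:4] @ g])  # naive sliding dot products
assert np.allclose(winograd_f23(d, g), direct)
```

The saving compounds in 2-D (F(2x2,3x3) needs 16 multiplies instead of 36), which is why the algorithm is attractive for convolution accelerators.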
<br />
==Contact==<br />
* Renzo Andri, firstname.lastname at huawei com<br />
* Lukas Cavigelli, firstname.lastname at huawei com<br />
<br />
==Detailed Information==<br />
<br />
===Internship Digital VLSI Design for ML Acceleration (Taken)===<br />
This internship has been taken. If you are interested in similar topics, please get in contact with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
'''Your Responsibilities'''<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters and computations from an ML framework to the HW accelerator.<br />
* Synthesis and Backend/Layout and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
'''Requirements - Your background'''<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering or computer science, or any related field at a reputable university; or you graduated within the last six months<br />
* Solid Digital VLSI Design knowledge Front-end and preferably also Back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge in computer arithmetic.<br />
* Basic knowledge in machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=7016Probing the limits of fake-quantised neural networks2021-09-24T19:16:16Z<p>Andrire: /* Logistics */</p>
<hr />
<div>== Introduction ==<br />
<br />
Deep neural networks usually require huge computational resources to deliver their statistical power.<br />
However, in many applications where latency and data privacy are important constraints, it might be necessary to execute these models on resource-constrained computing systems such as embedded, mobile, or edge devices.<br />
The research field of TinyML has approached the problem of making DNNs more efficient from several angles: optimising network topologies in terms of accuracy-per-parameter or accuracy-per-operation; reducing model size via techniques such as parameter pruning; and replacing high-bit-width floating-point operands with low-bit-width integer operands, i.e., training so-called quantised neural networks (QNNs).<br />
QNNs have the double benefit of reducing model size and replacing energy-costly floating-point arithmetic with energy-efficient integer arithmetic, properties that make them an ideal fit for resource-constrained hardware.<br />
<br />
At training time, QNNs use floating-point operands to leverage the optimised floating-point software kernels provided by the chosen deep learning frameworks.<br />
However, these floating-point parameters are constrained in such a way that applying elementary arithmetic properties (e.g., associativity, distributivity) yields fully integerised programs that can be deployed to the target hardware.<br />
Unfortunately, this conversion process is a lossy one, and the techniques that can reduce the errors it introduces are crucial to make QNNs useful in practice.<br />
<br />
In this project, you will explore the impact of different floating-point formats on what we call the fake-to-true conversion process.<br />
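As a minimal illustration of the fake-quantisation idea described above (NumPy only; the function name and the signed 8-bit configuration are illustrative, not the project's actual setup), values stay in floating point but are snapped onto a uniform integer grid:

```python
import numpy as np

def fake_quantise(x, scale, n_bits=8):
    """Simulate uniform signed quantisation while staying in floating point.

    Values are snapped onto the grid {scale * q : q integer}, with q clipped
    to the signed n_bits range, but the result is still a float array.
    """
    q_min = -(2 ** (n_bits - 1))          # -128 for 8 bits
    q_max = 2 ** (n_bits - 1) - 1         # +127 for 8 bits
    q = np.clip(np.round(x / scale), q_min, q_max)  # integer grid indices
    return q * scale                                # back to (fake) float values

x = np.array([0.30, -1.27, 0.004, 2.00])
xq = fake_quantise(x, scale=0.01)
# Every entry of xq is now an integer multiple of 0.01 within [-1.28, 1.27];
# e.g. 2.00 saturates to 1.27 and 0.004 rounds to 0.0.
```

Because `xq` is still a float tensor, standard floating-point kernels and backpropagation machinery can be reused during training; the true integer program is only derived afterwards, in the fake-to-true conversion this project studies.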
<br />
<br />
== Project description ==<br />
<br />
: [https://iis-projects.ee.ethz.ch/images/6/68/Probing_the_limits_of_fake-quantised_neural_networks.pdf Project Description]<br />
<br />
== Skills and project character ==<br />
<br />
=== Skills ===<br />
<br />
Required:<br />
* Fundamental concepts of deep learning (convolutional neural networks, backpropagation, computational graphs)<br />
* Numerical representation formats (integer, floating-point)<br />
* Numerical analysis<br />
* Python programming<br />
* C/C++ programming<br />
<br />
Optional:<br />
* Knowledge of the PyTorch deep learning framework<br />
* Knowledge of digital arithmetic (e.g., two's complement, overflow, wraparound)<br />
<br />
=== Project character ===<br />
<br />
* 20% Theory<br />
* 40% C/C++ and Python coding<br />
* 40% Deep learning<br />
<br />
<br />
== Logistics ==<br />
<br />
The student and the advisor will meet on a weekly basis to check the progress of the project, clarify doubts, and decide the next steps.<br />
The schedule of this weekly update meeting will be agreed at the beginning of the project by both parties.<br />
Of course, additional meetings can be organised to address urgent issues.<br />
<br />
At the end of the project, you will have to present your work in a 15-minute talk in front of the IIS team and defend it in the following 5-minute discussion.<br />
<br />
== Professor ==<br />
<br />
: [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini]<br />
<br />
<br />
== Status: Available ==<br />
<br />
We are looking for 1 Master student.<br />
It is possible to complete the project either as a Semester Project or a Master Thesis.<br />
<br />
Supervisors: [[:User:spmatteo | Matteo Spallanzani]] [mailto:spmatteo@iis.ee.ethz.ch spmatteo@iis.ee.ethz.ch], [[:User:andrire | Renzo Andri (Huawei RC Zurich)]]<br />
<br />
<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
[[Category:Digital]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
[[Category:Telecommunications]]<br />
<br />
STATUS<br />
[[Category:Hot]]<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
---><br />
<br />
[[Category:Digital]]<br />
[[Category:Deep Learning Projects]]<br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Available]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Hot]]<br />
[[Category:spmatteo]]<br />
[[Category:andrire]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=7014Probing the limits of fake-quantised neural networks2021-09-24T19:15:24Z<p>Andrire: /* Project description */</p>
<hr />
<div>== Introduction ==<br />
<br />
Deep neural networks usually require huge computational resources to deliver their statistical power.<br />
However, in many applications where latency and data privacy are important constraints, it might be necessary to execute these models on resource-constrained computing systems such as embedded, mobile, or edge devices.<br />
The research field of TinyML has approached the problem of making DNNs more efficient from several angles: optimising network topologies in terms of accuracy-per-parameter or accuracy-per-operation; reducing model size via techniques such as parameter pruning; and replacing high-bit-width floating-point operands with low-bit-width integer operands, i.e., training so-called quantised neural networks (QNNs).<br />
QNNs have the double benefit of reducing model size and replacing energy-costly floating-point arithmetic with energy-efficient integer arithmetic, properties that make them an ideal fit for resource-constrained hardware.<br />
<br />
At training time, QNNs use floating-point operands to leverage the optimised floating-point software kernels provided by the chosen deep learning frameworks.<br />
However, these floating-point parameters are constrained in such a way that applying elementary arithmetic properties (e.g., associativity, distributivity) yields fully integerised programs that can be deployed to the target hardware.<br />
Unfortunately, this conversion process is a lossy one, and the techniques that can reduce the errors it introduces are crucial to make QNNs useful in practice.<br />
<br />
In this project, you will explore the impact of different floating-point formats on what we call the fake-to-true conversion process.<br />
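A minimal sketch of why the fake-to-true step is lossy (illustrative only; the shift width and the conversion actually studied in the project may differ): the floating-point rescaling factor of a layer is typically replaced by an integer multiplier and a bit shift, which only approximates it:

```python
def dyadic_approximation(scale, shift=16):
    """Approximate a float scale by m / 2**shift with integer m."""
    m = int(round(scale * (1 << shift)))
    return m, m / (1 << shift)

scale = 0.0037                    # hypothetical float rescaling factor
m, approx = dyadic_approximation(scale)
error = abs(scale - approx)       # per-step discrepancy from the conversion

acc = 12345                       # an integer accumulator, e.g. from a conv layer
true_float = acc * scale          # what the fake-quantised network computes
true_int = (acc * m) >> 16        # what the integerised program computes
# true_int only approximates true_float; such small per-layer errors can
# accumulate across a deep network, which is why the conversion must be probed.
```

The choice of `shift` trades multiplier width against approximation error, one of the degrees of freedom a project like this can explore.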
<br />
<br />
== Project description ==<br />
<br />
: [https://iis-projects.ee.ethz.ch/images/6/68/Probing_the_limits_of_fake-quantised_neural_networks.pdf Project Description]<br />
<br />
== Skills and project character ==<br />
<br />
=== Skills ===<br />
<br />
Required:<br />
* Fundamental concepts of deep learning (convolutional neural networks, backpropagation, computational graphs)<br />
* Numerical representation formats (integer, floating-point)<br />
* Numerical analysis<br />
* Python programming<br />
* C/C++ programming<br />
<br />
Optional:<br />
* Knowledge of the PyTorch deep learning framework<br />
* Knowledge of digital arithmetic (e.g., two's complement, overflow, wraparound)<br />
<br />
=== Project character ===<br />
<br />
* 20% Theory<br />
* 40% C/C++ and Python coding<br />
* 40% Deep learning<br />
<br />
<br />
== Logistics ==<br />
<br />
The student and the advisor will meet on a weekly basis to check the progress of the project, clarify doubts, and decide the next steps.<br />
The schedule of this weekly update meeting will be agreed at the beginning of the project by both parties.<br />
Of course, additional meetings can be organised to address urgent issues.<br />
<br />
At the end of the project, you will have to present your work in a 20-minute talk in front of the IIS team and defend it in the following 5-minute discussion.<br />
<br />
<br />
== Professor ==<br />
<br />
: [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini]<br />
<br />
<br />
== Status: Available ==<br />
<br />
We are looking for 1 Master student.<br />
It is possible to complete the project either as a Semester Project or a Master Thesis.<br />
<br />
Supervisors: [[:User:spmatteo | Matteo Spallanzani]] [mailto:spmatteo@iis.ee.ethz.ch spmatteo@iis.ee.ethz.ch], [[:User:andrire | Renzo Andri (Huawei RC Zurich)]]<br />
<br />
<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
[[Category:Digital]]<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
[[Category:Telecommunications]]<br />
<br />
STATUS<br />
[[Category:Hot]]<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
---><br />
<br />
[[Category:Digital]]<br />
[[Category:Deep Learning Projects]]<br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Available]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Hot]]<br />
[[Category:spmatteo]]<br />
[[Category:andrire]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=6990Probing the limits of fake-quantised neural networks2021-09-24T17:03:10Z<p>Andrire: </p>
<hr />
<div>==Short Description==<br />
Abstract of the project<br />
<br />
===Status: Available ===<br />
: Looking for 1 Semester/Master student<br />
: Contact: [[:User:spmatteo | Matteo Spallanzani]], [[:User:andrire | Renzo Andri (Huawei RC Zurich)]]<br />
<br />
===Prerequisites===<br />
: Experience in C/C++ and Python programming<br />
: Knowledge of numerical analysis<br />
: (Recommended) Experience in the PyTorch deep learning framework<br />
: (Recommended) Knowledge of numerical representation formats and digital arithmetic<br />
<br />
===Character===<br />
: 20% Theory<br />
: 40% C/C++ and Python coding<br />
: 40% Deep learning<br />
<br />
===Professor===<br />
: [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini]<br />
<br />
[[#top|↑ top]]<br />
<br />
==Detailed Task Description==<br />
<br />
===Goals===<br />
===Practical Details===<br />
* '''[[Project Plan]]'''<br />
* '''[[Project Meetings]]'''<br />
* '''[[Design Review]]'''<br />
* '''[[Coding Guidelines]]'''<br />
* '''[[Final Report]]'''<br />
* '''[[Final Presentation]]'''<br />
<br />
==Results== <br />
<br />
==Links== <br />
<br />
[[#top|↑ top]]<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
<br />
[[Category:Digital]]<br />
SUB CATEGORIES<br />
NEW CATEGORIES<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
SUB CATEGORIES<br />
[[Category:Telecommunications]]<br />
<br />
<br />
STATUS<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
[[Category:Hot]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
<br />
---><br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Spmatteo]]<br />
[[Category:Andrire]]<br />
[[Category:Hot]]<br />
[[Category:Available]]<br />
[[Category:Digital]]<br />
[[Category:Deep Learning Projects]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=6989Probing the limits of fake-quantised neural networks2021-09-24T17:00:40Z<p>Andrire: </p>
<hr />
<div>==Short Description==<br />
Abstract of the project<br />
<br />
===Status: Available ===<br />
: Looking for 1 Semester/Master student<br />
: Contact: [[:User:spmatteo | Matteo Spallanzani]], [[:User:andrire | Renzo Andri (Huawei RC Zurich)]]<br />
<br />
===Prerequisites===<br />
: Experience in C/C++ and Python programming<br />
: Knowledge of numerical analysis<br />
: (Recommended) Experience in the PyTorch deep learning framework<br />
: (Recommended) Knowledge of numerical representation formats and digital arithmetic<br />
<br />
===Character===<br />
: 20% Theory<br />
: 40% C/C++ and Python coding<br />
: 40% Deep learning<br />
<br />
===Professor===<br />
: [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini]<br />
<br />
[[#top|↑ top]]<br />
<br />
==Detailed Task Description==<br />
<br />
===Goals===<br />
===Practical Details===<br />
* '''[[Project Plan]]'''<br />
* '''[[Project Meetings]]'''<br />
* '''[[Design Review]]'''<br />
* '''[[Coding Guidelines]]'''<br />
* '''[[Final Report]]'''<br />
* '''[[Final Presentation]]'''<br />
<br />
==Results== <br />
<br />
==Links== <br />
<br />
[[#top|↑ top]]<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
<br />
[[Category:Digital]]<br />
SUB CATEGORIES<br />
NEW CATEGORIES<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
SUB CATEGORIES<br />
[[Category:Telecommunications]]<br />
<br />
<br />
STATUS<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
[[Category:Hot]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
<br />
---><br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Spmatteo]]<br />
[[Category:Andrire]]<br />
[[Category:Hot]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=6986Probing the limits of fake-quantised neural networks2021-09-24T16:52:58Z<p>Andrire: /* Status: Available */</p>
<hr />
<div>==Short Description==<br />
Abstract of the project<br />
<br />
===Status: Available ===<br />
: Looking for 1 Semester/Master student<br />
: Contact: [[:User:spmatteo | Matteo Spallanzani]], [[:User:andrire | Renzo Andri (Huawei RC Zurich)]]<br />
<br />
===Prerequisites===<br />
: C/C++ and Python programming<br />
<br />
===Character===<br />
: 20% Theory<br />
: 40% C/C++, Python coding<br />
: 40% Deep learning<br />
<br />
===Professor===<br />
<!-- : [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini] ---><br />
<br />
[[#top|↑ top]]<br />
<br />
==Detailed Task Description==<br />
<br />
===Goals===<br />
===Practical Details===<br />
* '''[[Project Plan]]'''<br />
* '''[[Project Meetings]]'''<br />
* '''[[Design Review]]'''<br />
* '''[[Coding Guidelines]]'''<br />
* '''[[Final Report]]'''<br />
* '''[[Final Presentation]]'''<br />
<br />
==Results== <br />
<br />
==Links== <br />
<br />
[[#top|↑ top]]<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
<br />
[[Category:Digital]]<br />
SUB CATEGORIES<br />
NEW CATEGORIES<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
SUB CATEGORIES<br />
[[Category:Telecommunications]]<br />
<br />
<br />
STATUS<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
[[Category:Hot]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
<br />
---><br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Spmatteo]]<br />
[[Category:Andrire]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=6985Probing the limits of fake-quantised neural networks2021-09-24T16:51:26Z<p>Andrire: /* Prerequisites */</p>
<hr />
<div>==Short Description==<br />
Abstract of the project<br />
<br />
===Status: Available ===<br />
: Looking for 1 Semester/Master student<br />
: Contact: [[:User:spmatteo | Matteo Spallanzani]]<br />
===Prerequisites===<br />
: C/C++ and Python programming<br />
<br />
===Character===<br />
: 20% Theory<br />
: 40% C/C++, Python coding<br />
: 40% Deep learning<br />
<br />
===Professor===<br />
<!-- : [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini] ---><br />
<br />
[[#top|↑ top]]<br />
<br />
==Detailed Task Description==<br />
<br />
===Goals===<br />
===Practical Details===<br />
* '''[[Project Plan]]'''<br />
* '''[[Project Meetings]]'''<br />
* '''[[Design Review]]'''<br />
* '''[[Coding Guidelines]]'''<br />
* '''[[Final Report]]'''<br />
* '''[[Final Presentation]]'''<br />
<br />
==Results== <br />
<br />
==Links== <br />
<br />
[[#top|↑ top]]<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
<br />
[[Category:Digital]]<br />
SUB CATEGORIES<br />
NEW CATEGORIES<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
SUB CATEGORIES<br />
[[Category:Telecommunications]]<br />
<br />
<br />
STATUS<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
[[Category:Hot]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
<br />
---><br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Spmatteo]]<br />
[[Category:Andrire]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=6984Probing the limits of fake-quantised neural networks2021-09-24T16:49:36Z<p>Andrire: </p>
<hr />
<div>==Short Description==<br />
Abstract of the project<br />
<br />
===Status: Available ===<br />
: Looking for 1 Semester/Master student<br />
: Contact: [[:User:spmatteo | Matteo Spallanzani]]<br />
===Prerequisites===<br />
: C/C++ and Python programming<br />
: VLSI II (''recommended'')<br />
<br />
===Character===<br />
: 20% Theory<br />
: 40% C/C++, Python coding<br />
: 40% Deep learning<br />
<br />
===Professor===<br />
<!-- : [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini] ---><br />
<br />
[[#top|↑ top]]<br />
<br />
==Detailed Task Description==<br />
<br />
===Goals===<br />
===Practical Details===<br />
* '''[[Project Plan]]'''<br />
* '''[[Project Meetings]]'''<br />
* '''[[Design Review]]'''<br />
* '''[[Coding Guidelines]]'''<br />
* '''[[Final Report]]'''<br />
* '''[[Final Presentation]]'''<br />
<br />
==Results== <br />
<br />
==Links== <br />
<br />
[[#top|↑ top]]<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
<br />
[[Category:Digital]]<br />
SUB CATEGORIES<br />
NEW CATEGORIES<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
SUB CATEGORIES<br />
[[Category:Telecommunications]]<br />
<br />
<br />
STATUS<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
[[Category:Hot]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
<br />
---><br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:Spmatteo]]<br />
[[Category:Andrire]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Probing_the_limits_of_fake-quantised_neural_networks&diff=6983Probing the limits of fake-quantised neural networks2021-09-24T16:46:18Z<p>Andrire: </p>
<hr />
<div>==Short Description==<br />
Abstract of the project<br />
<br />
===Status: Available ===<br />
: Looking for 1 Semester/Master student<br />
: Contact: [[:User:spmatteo | Matteo Spallanzani]]<br />
===Prerequisites===<br />
: C/C++ and Python programming<br />
: VLSI II (''recommended'')<br />
<br />
===Character===<br />
: 20% Theory<br />
: 40% C/C++, Python coding<br />
: 40% Deep learning<br />
<br />
===Professor===<br />
<!-- : [http://www.iis.ee.ethz.ch/people/person-detail.html?persid=194234 Luca Benini] ---><br />
<br />
[[#top|↑ top]]<br />
<br />
==Detailed Task Description==<br />
<br />
===Goals===<br />
===Practical Details===<br />
* '''[[Project Plan]]'''<br />
* '''[[Project Meetings]]'''<br />
* '''[[Design Review]]'''<br />
* '''[[Coding Guidelines]]'''<br />
* '''[[Final Report]]'''<br />
* '''[[Final Presentation]]'''<br />
<br />
==Results== <br />
<br />
==Links== <br />
<br />
[[#top|↑ top]]<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:IIP]]<br />
[[Category:cat1]]<br />
[[Category:cat2]]<br />
[[Category:cat3]]<br />
[[Category:cat4]]<br />
[[Category:cat5]]<br />
<br />
<br />
[[Category:Digital]]<br />
SUB CATEGORIES<br />
NEW CATEGORIES<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:SmartSensors]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Hyperdimensional Computing]] <br />
<br />
[[Category:Competition]] <br />
[[Category:EmbeddedAI]] <br />
<br />
<br />
[[Category:ASIC]]<br />
[[Category:FPGA]]<br />
<br />
[[Category:System Design]]<br />
[[Category:Processor]]<br />
[[Category:Telecommunications]]<br />
[[Category:Modelling]]<br />
[[Category:Software]]<br />
[[Category:Audio]]<br />
<br />
[[Category:Analog]]<br />
[[Category:Nano-TCAD]]<br />
<br />
[[Category:AnalogInt]]<br />
SUB CATEGORIES<br />
[[Category:Telecommunications]]<br />
<br />
<br />
STATUS<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
[[Category:Hot]]<br />
<br />
TYPE OF WORK<br />
[[Category:Group Work]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:Oprecomp]]<br />
[[Category:Antarex]]<br />
[[Category:Hercules]]<br />
[[Category:Icarium]]<br />
[[Category:PULP]]<br />
[[Category:ArmaSuisse]]<br />
[[Category:Mnemosene]]<br />
[[Category:Aloha]]<br />
[[Category:Ampere]]<br />
[[Category:ExaNode]]<br />
[[Category:EPI]]<br />
[[Category:Fractal]]<br />
<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
[[Category:2015]]<br />
[[Category:2016]]<br />
[[Category:2017]]<br />
[[Category:2018]]<br />
[[Category:2019]]<br />
[[Category:2020]]<br />
<br />
<br />
---><br />
[[Category:Deep Learning Acceleration]] <br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=6971Huawei Research2021-09-20T08:17:25Z<p>Andrire: /* On-going Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
==Internship Digital VLSI Design for ML Acceleration (Taken)==<br />
This internship has been taken. If you are interested in similar topics, please get in touch with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
===Your Responsibilities===<br />
<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters, and computations from an ML framework to the HW accelerator.<br />
* Synthesis, backend/layout, and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
<br />
===Requirements - Your background===<br />
<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering, computer science, or any related field at a reputable university; or you graduated within the last six months<br />
* Solid digital VLSI design knowledge, front-end and preferably also back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge of computer arithmetic.<br />
* Basic knowledge of machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here].<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics. We also supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
<br />
==On-going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics. We also supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
<br />
| Semester Thesis || Digital VLSI Design (ML Acceleration)|| Winograd transforms have been exploited for the efficient computation of the convolutions that are typical in ML applications (e.g., image classification). A novel algorithm shows promising properties for using complex-valued Winograd to reduce the computational complexity even further. In this project, we would like to evaluate the actual benefits in HW by designing an accelerator that exploits the new algorithm. || AI Acceleration || digital VLSI design || Dr. Andri, TBD PhD student at IIS<br />
|-<br />
| Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Dr. Andri<br />
|-<br />
| Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Dr. Andri, Dr. Cavigelli<br />
|-<br />
|}<br />
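As background for the Winograd-based semester-thesis topic listed above, a minimal sketch of the real-valued F(2,3) case (the project itself targets a complex-valued variant, which is not shown here) illustrates how the transform trades the direct method's 6 multiplications for 4 when computing two outputs of a 3-tap filter:

```python
# Minimal Winograd F(2,3) sketch: two outputs of a 3-tap correlation
# (as used in CNN layers) with 4 multiplications instead of the
# direct method's 6. Follows the standard F(2,3) transform matrices.

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 outputs."""
    # Input transform (B^T d): additions only.
    v0, v1 = d[0] - d[2], d[1] + d[2]
    v2, v3 = d[2] - d[1], d[1] - d[3]
    # Filter transform (G g): can be precomputed once per filter.
    u0, u3 = g[0], g[2]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    # The 4 element-wise multiplications.
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3
    # Output transform (A^T m): additions only.
    return (m0 + m1 + m2, m1 - m2 - m3)

def direct_corr(d, g):
    """Reference: direct 3-tap correlation, 6 multiplications."""
    return tuple(sum(d[i + k] * g[k] for k in range(3)) for i in range(2))

d, g = [1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0]
print(winograd_f23(d, g))   # (-2.0, -2.0)
print(direct_corr(d, g))    # (-2.0, -2.0)
```

Since the filter transform depends only on the weights, it can be precomputed, leaving only the element-wise multiplies and a few additions at inference time; this multiplication saving is what a hardware accelerator can exploit.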
<br />
==Contact==<br />
: Internship at Huawei Research in Zurich-Oerlikon<br />
: Contact (at Huawei RC Zurich): Dr. Renzo Andri, surname.name at huawei com<br />
: Contact (at Huawei RC Zurich): Dr. Lukas Cavigelli, surname.name at huawei com</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=6970Huawei Research2021-09-20T08:12:32Z<p>Andrire: </p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
==Internship Digital VLSI Design for ML Acceleration (Taken)==<br />
This internship has been taken. If you are interested in similar topics, please get in touch with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
===Your Responsibilities===<br />
<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters, and computations from an ML framework to the HW accelerator.<br />
* Synthesis, backend/layout, and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
<br />
===Requirements - Your background===<br />
<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering, computer science, or any related field at a reputable university; or you graduated within the last six months<br />
* Solid digital VLSI design knowledge, front-end and preferably also back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge of computer arithmetic.<br />
* Basic knowledge of machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here].<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics. We also supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
<br />
==On-going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics. We also supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
<br />
<br />
| Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Dr. Andri<br />
|-<br />
| Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Dr. Andri, Dr. Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
: Internship at Huawei Research in Zurich-Oerlikon<br />
: Contact (at Huawei RC Zurich): Dr. Renzo Andri, surname.name at huawei com<br />
: Contact (at Huawei RC Zurich): Dr. Lukas Cavigelli, surname.name at huawei com</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=User:Andrire&diff=6969User:Andrire2021-09-20T08:11:30Z<p>Andrire: /* Dr. Renzo Andri -- Contact Information */</p>
<hr />
<div><br />
<br />
==Dr. Renzo Andri -- Contact Information==<br />
* '''Office''': Huawei Research Center <br />
* '''e-mail''': renzo.andri AT huawei com<br />
[[Category:Supervisors]]<br />
[[Category:Digital]]<br />
<br />
==Interests==<br />
Dr. Renzo Andri earned his PhD under the supervision of Prof. Luca Benini at the Integrated Systems Laboratory at ETH. In his research, he focused on energy-efficient machine learning accelerators, from efficient embedded systems design to full-custom ASIC design. <br />
He is currently working as a Senior Researcher at Huawei Research in Zurich, conducting research on energy-efficient compute architectures and machine learning acceleration.<br />
<br />
* Computer Vision<br />
* Machine Learning, Neural Networks<br />
* FPGA & Digital ASIC Design<br />
* Low-Power Design<br />
* C/C++/CUDA software development<br />
* Embedded systems<br />
<br />
==Available Projects==<br />
We provide student projects (master's and semester theses) in collaboration with the IIS.<br />
See projects under: [[Huawei Research]]<br />
<br />
<!--<DynamicPageList><br />
supresserrors = true<br />
category = Available<br />
category = Andrire<br />
</DynamicPageList><br />
<br />
== Projects in Progress==<br />
<DynamicPageList><br />
supresserrors = true<br />
category = In progress<br />
category = Andrire<br />
</DynamicPageList>--><br />
<!--<br />
==Completed Projects==<br />
===2015===<br />
<DynamicPageList><br />
supresserrors = true<br />
category = Completed<br />
category = Lukasc<br />
category = 2015<br />
</DynamicPageList><br />
<br />
--></div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=User:Andrire&diff=6968User:Andrire2021-09-20T08:11:13Z<p>Andrire: </p>
<hr />
<div><br />
<br />
==Dr. Renzo Andri -- Contact Information==<br />
* '''Office''': Huawei Research Center <br />
* '''e-mail''': name.andri AT huawei com<br />
[[Category:Supervisors]]<br />
[[Category:Digital]]<br />
<br />
==Interests==<br />
Dr. Renzo Andri earned his PhD under the supervision of Prof. Luca Benini at the Integrated Systems Laboratory at ETH. In his research, he focused on energy-efficient machine learning accelerators, from efficient embedded systems design to full-custom ASIC design. <br />
He is currently working as a Senior Researcher at Huawei Research in Zurich, conducting research on energy-efficient compute architectures and machine learning acceleration.<br />
<br />
* Computer Vision<br />
* Machine Learning, Neural Networks<br />
* FPGA & Digital ASIC Design<br />
* Low-Power Design<br />
* C/C++/CUDA software development<br />
* Embedded systems<br />
<br />
==Available Projects==<br />
We provide student projects (master's and semester theses) in collaboration with the IIS.<br />
See projects under: [[Huawei Research]]<br />
<br />
<!--<DynamicPageList><br />
supresserrors = true<br />
category = Available<br />
category = Andrire<br />
</DynamicPageList><br />
<br />
== Projects in Progress==<br />
<DynamicPageList><br />
supresserrors = true<br />
category = In progress<br />
category = Andrire<br />
</DynamicPageList>--><br />
<!--<br />
==Completed Projects==<br />
===2015===<br />
<DynamicPageList><br />
supresserrors = true<br />
category = Completed<br />
category = Lukasc<br />
category = 2015<br />
</DynamicPageList><br />
<br />
--></div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=6534Huawei Research2021-05-10T08:53:48Z<p>Andrire: /* Available Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
==Internship Digital VLSI Design for ML Acceleration (Taken)==<br />
This internship has been taken. If you are interested in similar topics, please get in touch with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
===Your Responsibilities===<br />
<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters, and computations from an ML framework to the HW accelerator.<br />
* Synthesis, backend/layout, and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
<br />
===Requirements - Your background===<br />
<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering, computer science, or any related field at a reputable university; or you graduated within the last six months<br />
* Solid digital VLSI design knowledge, front-end and preferably also back-end (e.g., VLSI I-II)<br />
* You have worked on a VLSI project (e.g., semester/master thesis at IIS) and used industry-standard tools like Design Compiler, Innovus, Modelsim or similar.<br />
* Basic knowledge of computer arithmetic.<br />
* Basic knowledge of machine learning is an asset.<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here].<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics. We also supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
<br />
==On-going Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics. We also supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
<br />
<br />
| Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Dr. Andri<br />
|-<br />
| Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Dr. Andri, Dr. Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
: Internship at Huawei Research in Zurich-Oerlikon<br />
: Contact (at Huawei RC Zurich): Dr. Renzo Andri, surname.name at huawei com<br />
: Contact (at Huawei RC Zurich): Dr. Lukas Cavigelli, surname.name at huawei com</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=6533Huawei Research2021-05-10T08:52:54Z<p>Andrire: /* Internship Digital VLSI Design for ML Acceleration */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
==Internship Digital VLSI Design for ML Acceleration (Taken)==<br />
This internship has been taken. If you are interested in similar topics, please get in touch with us.<br />
<br />
For the new ZRC Laboratory, we were looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
===Your Responsibilities===<br />
<br />
* Design and Implementation of Digital VLSI HW architecture (RTL) for Machine Learning Acceleration<br />
* Mapping of data, parameters, and computations from an ML framework to the HW accelerator.<br />
* Synthesis, backend/layout, and gate-level power simulation<br />
* Scientific evaluation and potential publication.<br />
<br />
<br />
===Requirements - Your background===<br />
<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering, computer science, or a related field at a reputable university, or you graduated within the last six months<br />
* Solid digital VLSI design knowledge, front-end and preferably also back-end (e.g., VLSI I–II)<br />
* You have worked on a VLSI project (e.g., a semester/master thesis at IIS) and used industry-standard tools such as Design Compiler, Innovus, Modelsim or similar<br />
* Basic knowledge of computer arithmetic<br />
* Basic knowledge of machine learning is an asset<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash, etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here].<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics, and we supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
<br />
<br />
| Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Dr. Andri<br />
|-<br />
| Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Dr. Andri, Dr. Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
: Internship at Huawei Research in Zurich-Oerlikon<br />
: Contact (at Huawei RC Zurich): Dr. Renzo Andri, surname.name at huawei com<br />
: Contact (at Huawei RC Zurich): Dr. Lukas Cavigelli, surname.name at huawei com</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=6528Huawei Research2021-04-16T08:49:54Z<p>Andrire: /* Available Projects */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
==Internship Digital VLSI Design for ML Acceleration==<br />
For this new ZRC Laboratory, we are currently looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
===Your Responsibilities===<br />
<br />
* Design and implementation of a digital VLSI HW architecture (RTL) for machine learning acceleration<br />
* Mapping of data, parameters and computations from an ML framework to the HW accelerator<br />
* Synthesis, back-end/layout and gate-level power simulation<br />
* Scientific evaluation and potential publication<br />
<br />
<br />
===Requirements - Your background===<br />
<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering, computer science, or a related field at a reputable university, or you graduated within the last six months<br />
* Solid digital VLSI design knowledge, front-end and preferably also back-end (e.g., VLSI I–II)<br />
* You have worked on a VLSI project (e.g., a semester/master thesis at IIS) and used industry-standard tools such as Design Compiler, Innovus, Modelsim or similar<br />
* Basic knowledge of computer arithmetic<br />
* Basic knowledge of machine learning is an asset<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash, etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here].<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
We are also open to discussing other topics, and we supervise master's and semester theses in collaboration with the Integrated Systems Laboratory. Feel free to contact us; we are happy to hear from you.<br />
<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
<br />
<br />
| Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Dr. Andri<br />
|-<br />
| Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Dr. Andri, Dr. Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
: Internship at Huawei Research in Zurich-Oerlikon<br />
: Contact (at Huawei RC Zurich): Dr. Renzo Andri, surname.name at huawei com<br />
: Contact (at Huawei RC Zurich): Dr. Lukas Cavigelli, surname.name at huawei com</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Huawei_Research&diff=6527Huawei Research2021-04-16T08:46:52Z<p>Andrire: /* Requirements - Your background */</p>
<hr />
<div>[[Category:Digital]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]] <br />
[[Category:Available]] <br />
[[Category:2020]]<br />
[[Category:Hot]]<br />
[[File:Huawei.jpg]]<br />
===About the Huawei Future Computing Laboratory===<br />
<br />
With 18 sites across Europe and 1500 researchers, Huawei’s European Research Institute (ERI) oversees fundamental and applied technology research, academic research cooperation projects, and strategic technical planning across our network of European R&D facilities. Huawei’s ERI includes the new Zurich Research Center (ZRC), located in Zurich, Switzerland. A major element of ZRC is a new research laboratory focused on fundamental research in the area of future computing systems (new hardware, new software, new algorithms).<br />
<br />
The research work of the lab will be carried out not only by Huawei’s internal research staff but also by our academic research partners in universities across Europe. The lab will provide an “open research environment” where academics will be encouraged to visit and work on fundamental long-term research alongside Huawei staff in an environment that, like the best universities and research institutes, is open and conducive to such scientific work.<br />
<br />
==Internship Digital VLSI Design for ML Acceleration==<br />
For this new ZRC Laboratory, we are currently looking for an outstanding Digital VLSI Design Intern. As a key member of our motivated and multicultural team, you will help design and evaluate novel VLSI architectures for energy-efficient machine learning acceleration.<br />
<br />
===Your Responsibilities===<br />
<br />
* Design and implementation of a digital VLSI HW architecture (RTL) for machine learning acceleration<br />
* Mapping of data, parameters and computations from an ML framework to the HW accelerator<br />
* Synthesis, back-end/layout and gate-level power simulation<br />
* Scientific evaluation and potential publication<br />
<br />
<br />
===Requirements - Your background===<br />
<br />
* You are currently enrolled in a Master’s degree or PhD in electrical engineering, computer engineering, computer science, or a related field at a reputable university, or you graduated within the last six months<br />
* Solid digital VLSI design knowledge, front-end and preferably also back-end (e.g., VLSI I–II)<br />
* You have worked on a VLSI project (e.g., a semester/master thesis at IIS) and used industry-standard tools such as Design Compiler, Innovus, Modelsim or similar<br />
* Basic knowledge of computer arithmetic<br />
* Basic knowledge of machine learning is an asset<br />
* Strong coding and scripting skills (SystemVerilog/VHDL, Python, TCL, Bash, etc.)<br />
* Excellent communication and writing skills in English<br />
<br />
Interested in developing the next generation of machine learning hardware with us? Then apply [https://apply.workable.com/huawei-16/j/CE22FFA23B/ here].<br />
<br />
<br />
<!--===Useful Reading===<br />
Coming soon<br />
<br />
===Prerequisites===<br />
*General interest in Deep Learning and memory/system design<br />
*VLSI I and VLSI II (''recommended'')<br />
--><br />
<br />
==Available Projects==<br />
We are inviting applications from students to conduct their master’s thesis work or an internship project at the Huawei Future Computing Lab in Zurich on these exciting new topics. <br />
<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|-<br />
! style="width: 5%;"|Type !! style="width: 20%"|Project !! style="width: 40%"|Description !! style="width: 5%"|Topic !! style="width: 15%"|Workload Type || Contact <br />
|-<br />
<br />
<br />
| Internship || Digital VLSI Design Intern (ML Acceleration)|| [https://apply.workable.com/huawei-16/j/CE22FFA23B/ Link to description] || AI Acceleration || digital VLSI design || Dr. Andri<br />
|-<br />
| Internship || High-Performance Machine Learning Kernel Development || [https://apply.workable.com/huawei-16/j/E29D785D1A/ Link to description] || AI Acceleration || hardware-level SW development || Dr. Andri, Dr. Cavigelli<br />
|-<br />
|}<br />
<br />
==Contact==<br />
: Internship at Huawei Research in Zurich-Oerlikon<br />
: Contact (at Huawei RC Zurich): Dr. Renzo Andri, surname.name at huawei com<br />
: Contact (at Huawei RC Zurich): Dr. Lukas Cavigelli, surname.name at huawei com</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Digital&diff=6526Digital2021-04-16T08:44:39Z<p>Andrire: /* Topic List */</p>
<hr />
<div>__NOTOC__<br />
<imagemap><br />
Image:Project_Map_2020_11.png|780px<br />
rect 0 0 260 130 [[High Performance SoCs]]<br />
rect 520 0 780 130 [[Energy Efficient SoCs]]<br />
rect 0 130 260 260 [[Hardware Acceleration]]<br />
rect 520 130 780 260 [[Biomedical Circuits, Systems, and Applications]]<br />
rect 0 260 260 390 [[SW/HW Predictability and Security]]<br />
rect 260 260 520 390 [[Deep Learning Projects|Deep Learning Acceleration]]<br />
rect 520 260 780 390 [[Embedded Systems and autonomous UAVs]]<br />
default [[Digital]]<br />
</imagemap><br />
<br />
===Topic List===<br />
* '''[[High Performance SoCs]]'''<br />
** '''[[Heterogeneous Acceleration Systems]]'''<br />
* '''[[Energy Efficient SoCs]]'''<br />
* '''[[Hardware Acceleration]]'''<br />
* '''[[Biomedical Circuits, Systems, and Applications]]'''<br />
** '''[[Human Intranet]]'''<br />
** '''[[Digital Medical Ultrasound Imaging]]'''<br />
* '''[[SW/HW Predictability and Security]]'''<br />
** '''[[Predictable Execution]]'''<br />
** '''[[Cryptography|Cryptographic Hardware]]'''<br />
* '''[[Deep Learning Projects|Machine Learning / Deep Learning]]'''<br />
** '''[[Event-Driven Computing]]'''<br />
* '''[[Embedded Systems and autonomous UAVs]]'''<br />
** '''[[Energy Efficient Autonomous UAVs]]'''<br />
** '''[[Low Power Embedded Systems]]'''<br />
** '''[[Embedded Artificial Intelligence:Systems And Applications]]'''<br />
* '''[[ASIC Design Projects]]'''<br />
<br />
==External Collaborations==<br />
<imagemap><br />
Image:Project_Map_2020_11_external.png|520px<br />
rect 0 65 260 195 [[Biomedical System on Chips]]<br />
rect 260 65 540 195 [[Wireless Communication Systems for the IoT]]<br />
rect 0 195 260 324 [[IBM Research]]<br />
rect 260 195 540 324 [[Students' International Competitions: F1(AMZ), Swissloop, Educational Rockets]]<br />
</imagemap><br />
<br />
===Topic List===<br />
* '''[[Biomedical System on Chips]]'''<br />
* '''[[Wireless Communication Systems for the IoT]]'''<br />
* '''[[IBM Research]]'''<br />
* '''[[Huawei_Research|Huawei Research - Future Computing Laboratory (Computer Architecture and Machine Learning Acceleration)]]'''<br />
* '''[[Students' International Competitions: F1(AMZ), Swissloop, Educational Rockets]]'''<br />
* '''[[Physics is looking for PULP]]'''<br />
<br />
==Active Projects==<br />
These are the projects that are currently active:<br />
<DynamicPageList><br />
category = In progress<br />
category = Digital<br />
</DynamicPageList><br />
<br />
==Completed Projects==<br />
These are projects that were completed in the last few years:<br />
===2019===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2019<br />
suppresserrors=true<br />
</DynamicPageList><br />
===2018===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2018<br />
suppresserrors=true<br />
</DynamicPageList><br />
===2017===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2017<br />
suppresserrors=true<br />
</DynamicPageList><br />
===2016===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2016<br />
suppresserrors=true<br />
</DynamicPageList><br />
===2015===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2015<br />
suppresserrors=true<br />
</DynamicPageList><br />
===2014===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2014<br />
</DynamicPageList><br />
===2013===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2013<br />
</DynamicPageList><br />
===2012===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2012<br />
</DynamicPageList><br />
===2011===<br />
<DynamicPageList><br />
category = Completed<br />
category = Digital<br />
category = 2011<br />
</DynamicPageList><br />
<br />
===ASICs===<br />
<DynamicPageList><br />
category = ASIC<br />
category = Available<br />
</DynamicPageList><br />
<br />
[[Category:Computer Architecture]]<br />
[[Category:Acceleration and Transprecision]]<br />
[[Category:Heterogeneous Acceleration Systems]]<br />
[[Category:Event-Driven Computing]]<br />
[[Category:Predictable Execution]]<br />
[[Category:Low Power Embedded Systems]]<br />
[[Category:Embedded Artificial Intelligence:Systems And Applications]]<br />
[[Category:Transient Computing]]<br />
[[Category:System on Chips for IoTs]]<br />
[[Category:Energy Efficient Autonomous UAVs]]<br />
[[Category:Biomedical System on Chips]]<br />
[[Category:Digital Medical Ultrasound Imaging]]<br />
[[Category:Cryptography]]<br />
[[Category:Deep Learning Acceleration]]<br />
[[Category:Human Intranet]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=User:Andrire&diff=6525User:Andrire2021-04-14T12:47:00Z<p>Andrire: /* Interests */</p>
<hr />
<div><br />
<br />
==Dr. Renzo Andri -- Contact Information==<br />
* '''Office''': Huawei Research Center <br />
* '''e-mail''': name.andri AT huawei com<br />
[[Category:Supervisors]]<br />
[[Category:Digital]]<br />
<br />
==Interests==<br />
Dr. Renzo Andri earned his PhD under the supervision of Prof. Luca Benini at the Integrated Systems Laboratory at ETH Zurich. In his research, he focused on energy-efficient machine learning accelerators, from efficient embedded systems design to full-custom ASIC design. <br />
He is currently working as a Senior Researcher at Huawei Research in Zurich, conducting research on energy-efficient compute architectures and machine learning acceleration.<br />
<br />
* Computer Vision<br />
* Machine Learning, Neural Networks<br />
* FPGA & Digital ASIC Design<br />
* Low-Power Design<br />
* C/C++/CUDA software development<br />
* Embedded systems<br />
<br />
==Available Projects==<br />
See project under: [[Huawei Research]]<br />
<br />
<!--<DynamicPageList><br />
supresserrors = true<br />
category = Available<br />
category = Andrire<br />
</DynamicPageList><br />
<br />
== Projects in Progress==<br />
<DynamicPageList><br />
supresserrors = true<br />
category = In progress<br />
category = Andrire<br />
</DynamicPageList>--><br />
<!--<br />
==Completed Projects==<br />
===2015===<br />
<DynamicPageList><br />
supresserrors = true<br />
category = Completed<br />
category = Lukasc<br />
category = 2015<br />
</DynamicPageList><br />
<br />
--></div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=User:Andrire&diff=6524User:Andrire2021-04-14T12:42:39Z<p>Andrire: </p>
<hr />
<div><br />
<br />
==Dr. Renzo Andri -- Contact Information==<br />
* '''Office''': Huawei Research Center <br />
* '''e-mail''': name.andri AT huawei com<br />
[[Category:Supervisors]]<br />
[[Category:Digital]]<br />
<br />
==Interests==<br />
* Computer Vision<br />
* Machine Learning, Neural Networks<br />
* FPGA & Digital ASIC Design<br />
* Low-Power Design<br />
* C/C++/CUDA software development<br />
* Embedded systems<br />
<br />
==Available Projects==<br />
See project under: [[Huawei Research]]<br />
<br />
<!--<DynamicPageList><br />
supresserrors = true<br />
category = Available<br />
category = Andrire<br />
</DynamicPageList><br />
<br />
== Projects in Progress==<br />
<DynamicPageList><br />
supresserrors = true<br />
category = In progress<br />
category = Andrire<br />
</DynamicPageList>--><br />
<!--<br />
==Completed Projects==<br />
===2015===<br />
<DynamicPageList><br />
supresserrors = true<br />
category = Completed<br />
category = Lukasc<br />
category = 2015<br />
</DynamicPageList><br />
<br />
--></div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Neural_Networks_Framwork_for_Embedded_Plattforms&diff=6523Neural Networks Framwork for Embedded Plattforms2021-04-14T12:40:44Z<p>Andrire: </p>
<hr />
<div>==Short Description==<br />
Neural networks have seen huge hype in the last decade, yet very few implementations have been presented on embedded platforms, as neural networks are too compute-intensive. However, recent work has shown that correctly trained reduced-precision networks reach performance similar to their high-precision equivalents. In previous projects, we have successfully developed neural network implementations on a general-purpose microcontroller platform and on our own PULP multicore processor platforms. But as applications and research change fast, these implementations constantly need to be adapted. In this thesis, you will use a common machine learning framework such as Torch or TensorFlow to evaluate and train the network and export it in a compilable form for efficient inference on the aforementioned platforms.<br />
The work may also include implementation and evaluation of hardware-based special-purpose instructions on the PULP platform to accelerate the computing task.<br />
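As an illustration of the export step, the sketch below quantizes a flat list of float weights to int8 and renders them as a C array that can be compiled for the target platform. The helper names are hypothetical; in the actual flow the weights would be read from the trained Torch/TensorFlow model.<br />

```python
def quantize_int8(weights, scale=None):
    """Symmetric int8 quantization of a flat list of float weights."""
    if scale is None:
        # Scale so that the largest magnitude maps to 127 (fall back to 1.0 for all-zero weights).
        scale = max(abs(w) for w in weights) / 127.0 or 1.0
    return [max(-128, min(127, round(w / scale))) for w in weights], scale

def emit_c_array(name, quantized):
    """Render quantized weights as a C array, ready to compile for the target."""
    body = ", ".join(str(v) for v in quantized)
    return "const int8_t %s[%d] = {%s};" % (name, len(quantized), body)

q, scale = quantize_int8([0.5, -1.0, 1.0], scale=0.5)
print(emit_c_array("fc1_weights", q))  # → const int8_t fc1_weights[3] = {1, -2, 2};
```

In a real flow these values would come from the trained model (e.g., its parameter tensors), and the generated header would be compiled together with the inference kernels for the microcontroller or PULP target.<br />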
<br />
<br />
===Status: Available ===<br />
: Looking for 1 Master or 1 semester project students<br />
: Contact: [[:User:Andrire | Renzo Andri]], [[:User:Lukasc | Lukas Cavigelli]]<br />
<br />
===Prerequisites===<br />
: Knowledge of C/C++<br />
: Interest in machine learning<br />
<br />
<br />
<br />
===Character===<br />
: 20% Theory / Literature Research <br />
: 80% software development<br />
<br />
===Professor===<br />
: [http://www.iis.ee.ethz.ch/portrait/staff/lbenini.en.html Luca Benini]<br />
[[#top|↑ top]]<br />
<br />
==Detailed Task Description==<br />
<br />
===Goals===<br />
===Practical Details===<br />
* '''[[Project Plan]]'''<br />
* '''[[Project Meetings]]'''<br />
* '''[[Design Review]]'''<br />
* '''[[Coding Guidelines]]'''<br />
* '''[[Final Report]]'''<br />
* '''[[Final Presentation]]'''<br />
<br />
==Results== <br />
<br />
<br />
[[#top|↑ top]]<br />
<br />
[[Category:Digital]]<br />
[[Category:Available]]<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:System Design]]<br />
<br />
<!-- <br />
<br />
COPY PASTE FROM THE LIST BELOW TO ADD TO CATEGORIES<br />
<br />
GROUP<br />
[[Category:Digital]]<br />
[[Category:Analog]]<br />
[[Category:Nano-TCAD]]<br />
[[Category:Nano Electronics]]<br />
<br />
STATUS<br />
[[Category:Available]]<br />
[[Category:In progress]]<br />
[[Category:Completed]]<br />
[[Category:Hot]]<br />
<br />
TYPE OF WORK<br />
[[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:PhD Thesis]]<br />
[[Category:Research]]<br />
<br />
NAMES OF EU/CTI/NT PROJECTS<br />
[[Category:UltrasoundToGo]]<br />
[[Category:IcySoC]]<br />
[[Category:PSocrates]]<br />
[[Category:UlpSoC]]<br />
[[Category:Qcrypt]]<br />
<br />
YEAR (IF FINISHED)<br />
[[Category:2010]]<br />
[[Category:2011]]<br />
[[Category:2012]]<br />
[[Category:2013]]<br />
[[Category:2014]]<br />
<br />
---><br />
<br />
[[Category:Digital]] [[Category:Semester Thesis]]<br />
[[Category:Master Thesis]]<br />
[[Category:System Design]]<br />
[[Category:Completed]]</div>Andrirehttp://iis-projects.ee.ethz.ch/index.php?title=Stand-Alone_Edge_Computing_with_GAP8&diff=6522Stand-Alone Edge Computing with GAP82021-04-14T12:38:49Z<p>Andrire: </p>
<hr />
<div>'''Stand-Alone Edge Computing with GAP8 (status:completed)'''<br />
<br />
Current trends in low-power systems point towards an edge computing paradigm. As such, sensing systems can have significant computational resources to analyze and aggregate data before forwarding it to the cloud. This can be particularly challenging with data-intensive sensors such as cameras or microphones. In recent years, machine learning has become the most effective way to extract meaningful information from these data sources. However, until very recently, running machine learning on low-power, resource-constrained devices was very inefficient in terms of memory and energy. <br />
<br />
IoT application processors like GAP8 are a key building block for integrating artificial intelligence and advanced classification into next-generation wireless sensing devices. Thanks to its 8 RISC-V cores and its convolution hardware accelerator, GAP8 can perform complex computation with a mW-range power budget. In our lab, we have developed a fully-integrated GAP8-based sensor node with video and audio processing capabilities, as well as low power, long range communication. <br />
<br />
In this project, the student will continue the development of this sensor node by building an application that classifies audio segments. An audio classifier implemented with an XNOR network already exists on GAP8; it can serve as the basis for a functioning stand-alone device that also communicates. In addition to exploiting parallelism, the student can implement ISA-level improvements (e.g., using intrinsics, or the HWCE for the non-binary layers) or develop energy-aware mapping algorithms to improve the energy efficiency of the application.<br />
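To make the XNOR idea concrete, here is a minimal sketch (plain Python, not GAP8 code) of the core binary operation: signs are packed into bit vectors, and the dot product reduces to an XNOR followed by a population count, which is what makes such networks cheap on embedded targets.<br />

```python
def binarize(values):
    """Pack the signs of real-valued weights/activations into an integer bitmask (bit = 1 for >= 0)."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits

def xnor_dot(a_bits, b_bits, n):
    """Binary dot product: popcount of XNOR over n bits, mapped back to the +/-1 domain."""
    matches = bin(~(a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
    return 2 * matches - n

a = binarize([0.3, -1.2, 0.7, -0.1])  # signs: +, -, +, -
w = binarize([1.0, -0.5, -0.9, 0.4])  # signs: +, -, -, +
print(xnor_dot(a, w, 4))              # → 0
```

On the actual device, the same operation would map onto 32-bit words using bitwise instructions and a hardware population count where available, with the HWCE handling the non-binary layers; that mapping is where the parallelization and ISA-level work of this project comes in.<br />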
<br />
This project will be developed in close collaboration with MiroMico AG (Zurich).<br />
<br />
'''Prerequisites''':<br />
<br />
(not all need to be met by a single candidate)<br />
<br />
* Experience using laboratory instrumentation: signal generators, oscilloscopes, DAQ cards, Matlab, etc.<br />
* Knowledge of microcontroller and PC programming (C/C++, preferably embedded C)<br />
* Basic knowledge of, or interest in, signal processing, wireless communication and machine learning<br />
* Motivation to build and test a real system<br />
<br />
'''Detailed Task Description'''<br />
<br />
A detailed task description will be worked out right before the project, taking the student's interests and capabilities into account.<br />
<br />
<br />
'''Contacts'''<br />
<br />
[[:User:Andrire|Renzo Andri]], andrire@iis.ee.ethz.ch<br />
<br />
Andres Gomez, gomez@miromico.ch<br />
<br />
<br />
[[Category:Andrire]]<br />
<br />
[[Category:Completed]]<br />
[[Category:Semester Thesis]]</div>Andrire