Nearest Neighbours Training: a PyTorch implementation

MITTONE, GIANLUCA
2018/2019

Abstract

The presence of Deep Neural Networks (DNNs) is constantly growing in our everyday life: from speech recognition to image classification, our world is becoming increasingly deep learning-based. These models are very complex, though: they generally require large amounts of data and computational power to work properly, and without such resources training and inference times quickly become unacceptable. Many effective approaches mitigate these issues by parallelising and distributing the workload across a cluster of computers, possibly exploiting specialised hardware such as GPUs and TPUs. These high-performance techniques can be implemented in a variety of ways, each with its own pros and cons. This work addresses in particular the training phase of DNNs, starting from a theoretical perspective on neural networks and then moving to practical topics such as the most common approaches to learning optimisation and machine learning software. A brief discussion of the gradient descent method, and of how distributed training affects it, then leads to a recently developed distributed training strategy called Nearest Neighbours Training (NNT). The main focus of this work is an experimental NNT implementation within the PyTorch framework, discussed from the code level up to its experimental performance on different datasets, highlighting the differences between this innovative approach and the more traditional ones.
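
As a purely illustrative aside, the gradient descent update that the abstract refers to corresponds, in PyTorch terms, to the standard optimisation loop sketched below. The toy model, data and hyperparameters are assumptions made for this example only; the sketch is not the thesis's NNT implementation.

import torch
import torch.nn as nn

# Minimal single-process sketch of one gradient descent step in PyTorch.
# Model, data and hyperparameters are illustrative placeholders, not taken from the thesis.
model = nn.Linear(10, 2)                                   # toy model
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # plain SGD

x = torch.randn(32, 10)                                    # toy mini-batch
y = torch.randint(0, 2, (32,))                             # toy labels

optimizer.zero_grad()          # clear gradients from the previous step
loss = loss_fn(model(x), y)    # forward pass
loss.backward()                # backpropagation: compute gradients
optimizer.step()               # gradient descent update of the weights

In traditional data-parallel distributed training each worker runs this same loop on its own shard of the data, and the gradients are averaged across workers before the update step; NNT, as the abstract notes, differs from these more traditional approaches.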
File in this record:
794926_thesis_mittone_final.pdf — Adobe PDF, 2.09 MB (not available)
Type: other attached material

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/51765