
Non-asymptotic approximations of Deep Gaussian Neural Networks via second-order Poincaré inequalities

BORELLO, PAOLO
2022/2023

Abstract

Neural Networks (NNs) are widely used and highly effective in Machine Learning. Wide and deep NNs have shown particularly strong capabilities, and understanding the infinite-width limit of Gaussian NNs has therefore been a focus of research. Attention has recently shifted towards non-asymptotic approximations of Gaussian NNs. Eldan et al. (2021) established quantitative Central Limit Theorems (CLTs) for NNs with one hidden layer, and Basteri and Trevisan (2022) did so for deep Gaussian NNs. Building on these works, Bordino et al. (2023) proposed a novel approach that uses second-order Poincaré inequalities to obtain quantitative Gaussian approximations. Their publication, however, covers only one-hidden-layer architectures. This thesis summarizes the findings of Bordino et al. and extends them to fully-connected architectures with multiple hidden layers. In doing so, it offers insights into non-asymptotic approximations of deep Gaussian NNs, with implications for understanding their behavior and performance.
Language: English
Files in this item:
File: 891165_borello_tesi.pdf
Access: not available
Type: Other attached material
Size: 546.62 kB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/147961