Non-asymptotic approximations of Deep Gaussian Neural Networks via second-order Poincaré inequalities
BORELLO, PAOLO
2022/2023
Abstract
Neural Networks (NNs) are widely used and highly effective in Machine Learning. The strong performance of very wide and deep NNs has motivated the study of the infinite-width limit of Gaussian NNs, that is, NNs whose weights are initialized with Gaussian distributions. More recently, attention has shifted towards non-asymptotic approximations of Gaussian NNs. Eldan et al. (2021) established quantitative Central Limit Theorems (CLTs) for NNs with one hidden layer, and Basteri and Trevisan (2022) did so for deep Gaussian NNs. Building upon these works, Bordino et al. (2023) proposed a novel approach based on second-order Poincaré inequalities to obtain quantitative Gaussian approximations; however, their publication covers only one-hidden-layer architectures. This thesis summarizes Bordino et al.'s findings and extends them to fully-connected architectures with multiple hidden layers. It thereby offers insight into non-asymptotic approximations of deep Gaussian NNs, with implications for understanding their behavior and performance.

File | Size | Format |
---|---|---|
891165_borello_tesi.pdf (not available; type: other attached material) | 546.62 kB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/147961