Non-asymptotic approximations of Deep Gaussian Neural Networks via second-order Poincaré inequalities

Neural Networks (NNs) are widely used and highly effective in Machine Learning. Very wide and deep NNs have shown particularly strong capabilities, and the infinite-width limit of Gaussian NNs has become a focus of research. More recently, attention has shifted to non-asymptotic approximations of Gaussian NNs. Eldan et al. (2021) and Basteri and Trevisan (2022) established quantitative Central Limit Theorems (CLTs) for one-hidden-layer NNs and for deep Gaussian NNs, respectively. Building on these works, Bordino et al. (2023) proposed a novel approach that uses second-order Poincaré inequalities to obtain quantitative Gaussian approximations. Their publication, however, covers only one-hidden-layer architectures. This thesis aims to summarize Bordino et al.'s findings and extend them to fully connected architectures with multiple hidden layers. It thereby offers insight into non-asymptotic approximations of deep Gaussian NNs, with implications for understanding their behavior and performance.

BORELLO, PAOLO

2022/2023

Brief record

Faculty/Department	MATEMATICA "GIUSEPPE PEANO"
Degree programme	STOCHASTICS AND DATA SCIENCE
Language	ENG
Supervisor	FAVARO, Stefano
Thesis access mode	IMPORTED FROM TESIONLINE
Appears in types	Master's Degree Programme (Corso di Laurea Magistrale)

Files in this item:

File	Size	Format
891165_borello_tesi.pdf (not available; type: other attached material)	546.62 kB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/147961