Exploring the Taxonomy of the Flatten Layer in VGG-16 Network
BUSCA, FABIANO
2023/2024
Abstract
Convolutional Neural Networks (CNNs) have become powerful tools in various computer vision tasks, yet their internal workings often remain opaque. Understanding the hidden layers of CNNs is crucial for improving their interpretability and performance. This thesis investigates the characteristics of the flatten layer, the final layer before the dense neural network layers, within the VGG-16 model, pre-trained on ImageNet and evaluated on images from the COCO dataset. We conduct experiments across three contexts (full images, bounding boxes, and image segmentations) to assess their impact on model predictions. Using dimensionality reduction techniques such as Principal Component Analysis (PCA) and clustering algorithms like K-Means, we analyze the outputs of the flatten layer. Our findings provide deeper insights into how context and dimensionality reduction enhance our understanding of CNN representations. Additionally, we demonstrate an alternative method for predicting samples by integrating PCA and K-Means. Ultimately, this thesis establishes a well-organized taxonomy of the flatten layer outputs in VGG-16, providing a clear structure, and suggests a practical approach to bypass traditional model fine-tuning.
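The pipeline the abstract describes, flatten-layer features reduced with PCA and then clustered with K-Means, with a new sample predicted via its nearest cluster centroid, can be sketched roughly as follows. This is an illustrative sketch only: the feature matrix is random data standing in for real VGG-16 flatten-layer activations (25088 dimensions for a 224x224 input), and the parameter choices (50 principal components, 10 clusters) are assumptions, not the settings used in the thesis.

```python
# Hedged sketch of the PCA + K-Means pipeline described in the abstract.
# Random features stand in for VGG-16 flatten-layer activations; in the
# real setting they would be extracted from the pre-trained network.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 200 samples of 25088-dim features (VGG-16 flatten size for 224x224 inputs).
features = rng.normal(size=(200, 25088)).astype(np.float32)

# Step 1: dimensionality reduction (50 components is an arbitrary choice here).
pca = PCA(n_components=50)
reduced = pca.fit_transform(features)

# Step 2: cluster the reduced representations (10 clusters, also illustrative).
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(reduced)

# Step 3: "predict" a new sample without fine-tuning, by projecting it into
# PCA space and assigning it to the nearest cluster centroid.
new_sample = rng.normal(size=(1, 25088)).astype(np.float32)
pred_cluster = kmeans.predict(pca.transform(new_sample))
```

In practice each cluster would be mapped to a label (e.g. the majority class of its members), which is what lets this route serve as an alternative to fine-tuning the dense layers.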
File | Size | Format
---|---|---
978059_thesis.pdf (type: other attached material; not available for download) | 2.55 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/110570