Exploring the Taxonomy of the Flatten Layer in VGG-16 Network
BUSCA, FABIANO
2023/2024
Abstract
Convolutional Neural Networks (CNNs) have become powerful tools in various computer vision tasks, yet their internal workings often remain opaque. Understanding the hidden layers of CNNs is crucial for improving their interpretability and performance. This thesis investigates the characteristics of the flatten layer, the final layer before the dense neural network layers, within the VGG-16 model, pre-trained on ImageNet and evaluated on images from the COCO dataset. We conduct experiments across three contexts (full images, bounding boxes, and image segmentations) to assess their impact on model predictions. Using dimensionality reduction techniques such as Principal Component Analysis (PCA) and clustering algorithms like K-Means, we analyze the outputs of the flatten layer. Our findings provide deeper insights into how context and dimensionality reduction enhance our understanding of CNN representations. Additionally, we demonstrate an alternative method for predicting samples by integrating PCA and K-Means. Ultimately, this thesis establishes a well-organized taxonomy of the flatten layer outputs in VGG-16, providing a clear structure, and suggests a practical approach to bypass traditional model fine-tuning.
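The pipeline the abstract describes, flatten-layer features reduced with PCA and then clustered with K-Means, with a new sample predicted via its nearest cluster centroid, can be sketched roughly as follows. This is an illustrative sketch only: the feature matrix is random data standing in for real VGG-16 flatten-layer activations (25088 dimensions for a 224x224 input), and the parameter choices (50 principal components, 10 clusters) are assumptions, not the settings used in the thesis.

```python
# Hedged sketch of the PCA + K-Means pipeline described in the abstract.
# Random features stand in for VGG-16 flatten-layer activations; in the
# real setting they would be extracted from the pre-trained network.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 200 samples of 25088-dim features (VGG-16 flatten size for 224x224 inputs).
features = rng.normal(size=(200, 25088)).astype(np.float32)

# Step 1: dimensionality reduction (50 components is an arbitrary choice here).
pca = PCA(n_components=50)
reduced = pca.fit_transform(features)

# Step 2: cluster the reduced representations (10 clusters, also illustrative).
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(reduced)

# Step 3: "predict" a new sample without fine-tuning, by projecting it into
# PCA space and assigning it to the nearest cluster centroid.
new_sample = rng.normal(size=(1, 25088)).astype(np.float32)
pred_cluster = kmeans.predict(pca.transform(new_sample))
```

In practice each cluster would be mapped to a label (e.g. the majority class of its members), which is what lets this route serve as an alternative to fine-tuning the dense layers.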
File | Size | Format
---|---|---
978059_thesis.pdf (type: other attached material; not available for download) | 2.55 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/110570