Encoding rotation representations of synthetic datasets in quaternion-based deep learning models
GRASSI, ALESSANDRO
2020/2021
Abstract
In recent years, artificial intelligence research has transformed the field of computer vision through the development of new models and algorithms. In addition, the spread of the internet has given researchers access to massive amounts of data for training networks. Despite this, it can still be challenging to find a suitable dataset, assuming one even exists. To address this issue, we use computer graphics to render the training datasets, which allows us to generate large numbers of synthetic images in a short time while maintaining close-to-real-world quality. Specifically, for our training we rendered several datasets with Blender, reaching almost half a million RGBA images of sixty objects divided into six classes. Each image is labeled with the object class and its rotation, expressed as a quaternion. Furthermore, this work compares the performance and limitations of several neural network architectures, including multilayer perceptrons, convolutional networks, and capsule networks, in predicting both the orientation of rendered objects and their labels. This is achieved by grouping output neurons so that they encode not only the probability of existence for each class, but also a valid rotation as a unit-length quaternion.
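The grouped output encoding described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch head (the thesis code is not available here, so the framework choice, class name, and layer sizes are all assumptions): each class gets a group of five output neurons, one existence logit plus four raw values normalized to unit length so that they always form a valid rotation quaternion.

```python
import torch
import torch.nn as nn

class ClassQuaternionHead(nn.Module):
    """Hypothetical output head: one group of five neurons per class,
    encoding an existence probability and a unit rotation quaternion."""

    def __init__(self, in_features: int, num_classes: int = 6):
        super().__init__()
        self.num_classes = num_classes
        # 1 existence logit + 4 raw quaternion components per class
        self.fc = nn.Linear(in_features, num_classes * 5)

    def forward(self, x: torch.Tensor):
        out = self.fc(x).view(-1, self.num_classes, 5)
        probs = torch.sigmoid(out[..., 0])  # per-class probability of existence
        quats = out[..., 1:]
        # Normalize each 4-vector to unit length so it always
        # represents a valid 3D rotation.
        quats = quats / quats.norm(dim=-1, keepdim=True).clamp_min(1e-8)
        return probs, quats

# Example: a batch of 8 feature vectors from some backbone
head = ClassQuaternionHead(in_features=128, num_classes=6)
probs, quats = head(torch.randn(8, 128))
assert torch.allclose(quats.norm(dim=-1), torch.ones(8, 6), atol=1e-5)
```

Normalizing the raw 4-vector at the output, rather than constraining the weights, is one common way to guarantee unit length; the actual architectures compared in the thesis may enforce this differently.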
File | Type | Size | Format
---|---|---|---
881798_encoding_rotation_representations_of_synthetic_datasets_in_quaternions_based_deep_learning_models.pdf (not available) | Other attached material | 2.84 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/138659