Encoding rotation representations of synthetic datasets in quaternion-based deep learning models
GRASSI, ALESSANDRO
2020/2021
Abstract
In recent years, artificial intelligence research has transformed the field of computer vision through the development of new models and algorithms. In addition, the spread of the internet has given researchers access to massive amounts of data for training networks. Despite this, it can still be challenging to find a suitable dataset, assuming one even exists. To address this issue, we use computer graphics to render the training datasets, which allows us to generate large numbers of synthetic images in a short time while maintaining close-to-real-world quality. Specifically, for our training we rendered several datasets with Blender, reaching almost half a million RGBA images of sixty objects divided into six classes. Each image is labeled with the object class and its rotation, expressed as a quaternion. Furthermore, this work compares the performance and limitations of several neural network architectures, including multilayer perceptrons, convolutional networks, and capsule networks, in predicting both the orientation of rendered objects and their labels. This is achieved by grouping output neurons so that they encode not only the probability of existence for each class, but also a valid rotation as a unit-length quaternion.
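The grouped output encoding described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch head (the thesis code is not available here, so the framework choice, class name, and layer sizes are all assumptions): each class gets a group of five output neurons, one existence logit plus four raw values normalized to unit length so that they always form a valid rotation quaternion.

```python
import torch
import torch.nn as nn

class ClassQuaternionHead(nn.Module):
    """Hypothetical output head: one group of five neurons per class,
    encoding an existence probability and a unit rotation quaternion."""

    def __init__(self, in_features: int, num_classes: int = 6):
        super().__init__()
        self.num_classes = num_classes
        # 1 existence logit + 4 raw quaternion components per class
        self.fc = nn.Linear(in_features, num_classes * 5)

    def forward(self, x: torch.Tensor):
        out = self.fc(x).view(-1, self.num_classes, 5)
        probs = torch.sigmoid(out[..., 0])  # per-class probability of existence
        quats = out[..., 1:]
        # Normalize each 4-vector to unit length so it always
        # represents a valid 3D rotation.
        quats = quats / quats.norm(dim=-1, keepdim=True).clamp_min(1e-8)
        return probs, quats

# Example: a batch of 8 feature vectors from some backbone
head = ClassQuaternionHead(in_features=128, num_classes=6)
probs, quats = head(torch.randn(8, 128))
assert torch.allclose(quats.norm(dim=-1), torch.ones(8, 6), atol=1e-5)
```

Normalizing the raw 4-vector at the output, rather than constraining the weights, is one common way to guarantee unit length; the actual architectures compared in the thesis may enforce this differently.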
File | Type | Size | Format
---|---|---|---
881798_encoding_rotation_representations_of_synthetic_datasets_in_quaternions_based_deep_learning_models.pdf (not available) | Other attached material | 2.84 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/138659