On the Use of DeepFakes for Data Augmentation
CASSANO, ENRICO
2021/2022
Abstract
Data augmentation is a technique for extracting or generating additional training data from an existing dataset. It can help in training a model by mitigating one of the most persistent problems in this task: overfitting. Overfitting is often caused by insufficient training data and manifests as poor performance when the model is tested on previously unseen data. Building larger datasets from real data is neither cheap nor trivial, so new approaches are needed to reduce the resources required to enlarge a dataset. The main goal of this thesis is to develop a tool capable of augmenting data to support human behavioural and facial classification. In addition, we wanted to study how data augmentation (more precisely, AI-based augmentation) affects the performance of a classifier. To this end, we developed a data augmentation tool called FaceSwapAugmentation: it takes an existing video from a dataset and replaces the face in it with a second one. Within the scope of this thesis, the model considered is trained on the MaskDAR[1] dataset, and its purpose is to understand and classify the behaviour of a person who is driving. A key aspect of FaceSwapAugmentation is that the replacement face can either be taken from another existing video or be AI-generated, which makes it possible to generate an unlimited number of face swaps. The face-swapped videos are then used to enlarge the original dataset, which is in turn used to train the classifier. The objective is to understand whether this technique can improve the model's performance without the need to acquire a larger dataset of real data. We also wanted to compare the generation methods offered by FaceSwapAugmentation.
To answer the first question, the best and most feasible test was to build an augmented dataset with FaceSwapAugmentation and then use it to train the classifier, to see whether performance improved. The first results showed slightly worse performance for the model trained and tested on fake data than for the model trained and tested on real data. This result, however, needs further consideration and study, taking into account both the resources that can be saved by using AI-generated data and the reduced privacy concerns when making a dataset public. The second research question was answered by assigning an objective score to each face swap produced in each workflow of FaceSwapAugmentation. For each workflow, 30 deepfakes were made; this yielded a distribution of scores per workflow, and the distributions were then compared. The comparison showed that the AI-generated face swaps (ThisPersonDoesNotExist) produce the best deepfakes.

File | Size | Format |
---|---|---|
912344_definitiva_tesi_triennale_enrico_cassano.pdf (not available; type: other attached material) | 6.96 MB | Adobe PDF |
Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/20.500.14240/85694