GAN oversampling: application of a Generative Adversarial Network to imbalanced datasets.

FANCHIN, FRANCESCO

2017/2018

Abstract

Machine learning has gained increasing importance over the last decade, and class prediction in particular is critical for many real-world applications. Fraud analysis, cancer prediction, and similar tasks are notable examples of imbalanced problems, in which one class of the dataset under consideration contains too few instances. Standard machine learning algorithms, such as neural networks, require a minimum amount of data to learn the different classes effectively; in an imbalanced setting, they fail to predict the minority class. In this thesis, an oversampling algorithm was developed to increase the size of the minority class, with the aim of improving the performance of a classifier. A Generative Adversarial Network (GAN), a neural network for image generation, was used to artificially generate minority-class data. The GAN was adapted to the features of each dataset. The overall framework was tested on the MNIST dataset of handwritten digits and on various real-world datasets, ranging from seismic bumps to cancer detection. Finally, a comparison with the SMOTE package for imbalanced problems is provided.

Brief record

Faculty/Department: Physics (FISICA)
Degree programme: Physics of Complex Systems (FISICA DEI SISTEMI COMPLESSI)
Language: ENG
Supervisor: PANISSON, Andre'
Thesis access mode: imported from TESIONLINE
Appears in types: Master's degree programme (Corso di Laurea Magistrale)

Files in this record:

File	Size	Format
848585_francescofanchintesilmfsc.pdf (not available; type: other attached material)	3.05 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/50558