Cost-Sensitive Binary Minimax Risk Classifiers: a theoretical lower bound for the performance and application to the medical field.

AMBROSI, SARA

2021/2022

Abstract

This thesis opens with a chapter that introduces the problem of binary classification in supervised machine learning. The second, third and fourth chapters present the core of the research, and the fifth chapter reports the conclusions. The core chapters are outlined below.
• Chapter 2. The second chapter presents the general theory behind Minimax Risk Classifiers (MRCs). The first section mathematically defines the problem that MRCs are built to solve. The second section states the building blocks of general MRCs. The third section shows how the explicit expression of binary MRCs under 0-1 loss can be obtained. The last section computes the explicit form of the classifier under the additional assumption that the marginal distribution of the labels is known.
• Chapter 3. The third chapter develops the theory of cost-sensitive binary MRCs, which is the main result of the author's research. The first section derives the expression of MRCs under cost-sensitive loss. The second section finds a theoretical upper bound for the curve of errors on sensitivity and specificity. The third section shows possible feature mappings.
• Chapter 4. The fourth chapter presents the experimental results on real datasets. In the first section, all the classifiers presented are applied to a real dataset of COVID-19 cases. The second section applies the cost-sensitive version of MRCs to other real medical datasets. Medical datasets were chosen because cost-sensitive classification is useful for medical prevention.
The MATLAB code developed for this work is available on GitHub at https://github.com/saraambrosi/Minimax-Risk-Classifiers.

Brief record

	Faculty/Department
	
				MATEMATICA "GIUSEPPE PEANO"
			
	Degree programme
	
				STOCHASTICS AND DATA SCIENCE
			
	Language
	
				ENG
			
	Supervisor
	
				MONTAGNA, Silvia
			
	Thesis access mode
	
				IMPORT FROM TESIONLINE
			
	Appears in types:
	
				Master's degree programme

Files in this item:

File	Size	Format
947199_tesi_sara_ambrosi.pdf (not available; type: Other attached material)	2.96 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/87292