Multi-armed stochastic bandits: an optimistic approach in sequential learning

Despite the extensive research on multi-armed bandits, only a few articles in the literature have underlined how much the choice of the principle of "optimism in the face of uncertainty" matters for performance. In this thesis we outline several strategies that follow this principle, translating a problem of a stochastic and sequential nature into an optimistic optimization framework. Particular attention is given not only to the search space under the exploration-exploitation dilemma, but also to how the strategies can be reformulated when no information is available on the smoothness of the mean-payoff function. We also present a recent result on the non-existence of an adaptive strategy for the semi-metric, leaving open the problem of which other consistent and suitable assumptions might help to structure adaptive algorithms in the field of active-sequential learning.

GAZZANI, GUIDO

2018/2019

Short record

Faculty/Department	MATEMATICA "GIUSEPPE PEANO"
Degree programme	MATEMATICA
Language	ENG
Supervisor	DI NARDO, Elvira
Thesis consultation mode	Import from TESIONLINE
Appears in types:	Master's degree programme (Corso di Laurea Magistrale)

Files in this item:

File	Size	Format
858861_finalthesis.pdf (not available; type: other attached material)	1.42 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/48676
