Bayesian nonparametric analysis of Shakespeare's poems

DUMA, MATTIA
2015/2016

Abstract

Based on an observed sample of size n, we focus on predicting a key aspect of the outcome of an additional sample of size m, namely the so-called discovery probability. In particular, conditionally on an observed basic sample of size n, we derive a novel estimator of the probability of detecting, at the (n + m + 1)-th observation, species that have been observed with any given frequency in the enlarged sample of size n + m. We compare the Good-Turing frequency estimator with a new Bayesian nonparametric estimator, developed by researchers at the University of Turin in 2012, on a fascinating literary problem. On November 14, 1985, the Shakespearean scholar Gary Taylor discovered a nine-stanza poem, the so-called Taylor poem, whose attribution to Shakespeare is still debated. Statistical inference is carried out with both estimators. The consistency of the word usage in the Taylor poem with that of the Shakespearean canon is examined using a nonparametric empirical Bayes model. We also consider poems by Jonson, Marlowe and Donne, as well as four poems definitely attributed to Shakespeare. On balance, the poem is found to fit previous Shakespearean usage reasonably well. Furthermore, the new estimator not only proves reliable on real data and consistent with the Good-Turing statistic, but is also more powerful in another respect: since it is obtained with a Monte Carlo method, we can also estimate the distribution of the estimator. This last feature is the most important result, because it gives us something that the Good-Turing estimator cannot.
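As a point of reference for the comparison described in the abstract, the classical Good-Turing estimate can be sketched in a few lines. This is only an illustrative sketch, not the thesis's code: it covers the simplest case of the next observation after the observed sample (no additional sample of size m), where the probability of drawing a species seen exactly l times is estimated as (l + 1) * m_(l+1) / n, with m_(l+1) the number of species seen l + 1 times. The function name `good_turing_discovery` and the toy word list are assumptions made for the example.

```python
from collections import Counter


def good_turing_discovery(sample, l):
    """Good-Turing estimate of the probability that the next observation
    belongs to a species seen exactly `l` times in `sample`.

    With l = 0 this is the probability of discovering a new species.
    (Hypothetical helper, written for illustration only.)
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    species_counts = Counter(sample)                  # species -> frequency
    freq_of_freq = Counter(species_counts.values())   # frequency -> number of species
    return (l + 1) * freq_of_freq[l + 1] / n


# Toy data: each word type plays the role of a species.
words = "to be or not to be that is the question".split()
p_new = good_turing_discovery(words, 0)    # next word is a new word type
p_once = good_turing_discovery(words, 1)   # next word was seen exactly once
print(p_new, p_once)
```

Unlike the Monte Carlo based estimator mentioned in the abstract, this formula yields a single point estimate and says nothing about the distribution of the estimator.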

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/88427