Bayesian isotonic logistic regression via constrained splines: an application to estimating the serve advantage in professional tennis

ORANI, VANESSA
2017/2018

Abstract

Modeling and predicting the outcome of tennis matches has attracted much attention in recent years. In elite tennis, it is well established that the serving player wins more points than the receiving player. However, it is less clear how the serve advantage relates to the number of shots played. Kovalchik (2018b) proposes modeling the probability of winning a point as a function of rally length. In particular, the author uses an exponential decay function to model the decline in serve advantage, plus an asymptote representing the difference between the players' rallying abilities. In this Master's thesis, we focus on the role of the serve advantage in winning a point as a function of rally length. Our model falls into the Bradley-Terry class of models (Bradley and Terry, 1952) and builds upon Kovalchik (2018b). In particular, we represent the logit of the probability of winning a point on serve as a linear combination of B-spline basis functions, where the spline coefficients are athlete-specific. The basis function decomposition allows for more flexible curves (if needed) than those attainable via an exponential decay. At the beginning of a rally, the serve advantage decreases naturally. For high values of rally length, on the other hand, the data are sparse, so we ensure the non-increasing behavior of the splines via their control polygon, which essentially amounts to imposing a constraint on the coefficients of the spline basis. We also consider the rallying ability of each player, as in Kovalchik (2018b), and further consider an extension to investigate how the type of court affects a player's rallying ability. We apply our methodology to a dataset of Grand Slam singles matches. With our model, it is possible to estimate the longitudinal trajectory of the probability of winning the point as a function of rally length, and to predict this trajectory for a pool of subjects in a test set. We are also able to estimate other quantities of potential interest, such as the serve advantage and the rallying ability of each player. We perform sensitivity analyses to investigate how our results and conclusions are affected by different choices of prior distributions on the model parameters and of hyperparameter values. The analysis is programmed in R (R Core Team, 2013), and the model is implemented in JAGS (Denwood, 2016).
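
The constraint described in the abstract can be illustrated with a minimal sketch. It is not the thesis code (which is written in R and JAGS); it is a pure-Python illustration, under assumed values, of the general fact that a B-spline whose coefficients are non-increasing is itself non-increasing, so the resulting point-winning probability never rises with rally length. The knot vector, the degree, the coefficient values and the names bspline_basis, win_probability and rally_diff are all hypothetical choices made for this example.

```python
import math


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function of the given degree at x
    using the Cox-de Boor recursion."""
    n_intervals = len(knots) - 1
    # Degree 0: indicator of each half-open knot interval [t_i, t_{i+1}).
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(n_intervals)]
    # Close the right end so that x == last knot falls in the last
    # non-empty interval.
    if x == knots[-1]:
        for i in range(n_intervals - 1, -1, -1):
            if knots[i] < knots[i + 1]:
                basis[i] = 1.0
                break
    # Raise the degree one step at a time; zero-width spans contribute 0.
    for d in range(1, degree + 1):
        new_basis = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = ((x - knots[i]) / left_den * basis[i]
                    if left_den > 0 else 0.0)
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            new_basis.append(left + right)
        basis = new_basis
    return basis


def inv_logit(z):
    """Map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def win_probability(x, coefs, knots, degree, rally_diff=0.0):
    """Probability that the server wins a point of rally length x.

    The logit is a linear combination of B-spline basis functions (the
    serve-advantage curve) plus rally_diff, a hypothetical constant that
    stands in for the difference in rallying ability between the players.
    """
    basis = bspline_basis(x, knots, degree)
    logit = sum(c * b for c, b in zip(coefs, basis)) + rally_diff
    return inv_logit(logit)


# Assumed setup: cubic spline on rally lengths 1..15 with clamped knots.
degree = 3
knots = [1, 1, 1, 1, 4, 7, 10, 15, 15, 15, 15]
# Non-increasing coefficients (illustrative values, not estimates).
coefs = [1.2, 0.9, 0.5, 0.3, 0.2, 0.2, 0.2]

rally_lengths = list(range(1, 16))
probs = [win_probability(x, coefs, knots, degree) for x in rally_lengths]
is_non_increasing = all(p >= q - 1e-12 for p, q in zip(probs, probs[1:]))
```

In a Bayesian fit, an ordering of this kind would be imposed on the athlete-specific coefficients through the prior rather than fixed by hand as it is here; how the thesis encodes it in JAGS is not stated in the abstract.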
ENG
Files in this item:
File: 849411_thesis.pdf (not available)
Type: Other attached material
Size: 1.28 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/97104