Women's football has too often been compared with men's football mainly on the basis of the players' physical attributes, which gives an incomplete picture when the characteristics of a football team are studied analytically. Using an open soccer-logs data set provided by Wyscout, this thesis statistically analyses and compares men's and women's national football teams on their technical qualities, measured through the event data obtained from the last World Cup championships. An event is an action, such as a pass, a shot, a foul or a save attempt, made by a team's player in a match. Initial results show, for example, significant differences in key playing events, such as the number of passes, the percentage of accurate passes and the number of free kicks made by the national teams during a match. Using existing methods and algorithms [Pappalardo et al., 2019a, Cintia et al., 2015], two groups of variables were computed: variables related to the technical characteristics of a team, such as the average pass velocity and the average ball possession recovery time, which can also describe the intensity of a game; and variables that summarize and quantify the individual and collective performance of a team's players in a single value, such as the H indicator or the players' ratings aggregated for each team via mean and standard deviation. For example, the higher the standard deviation of the ratings in a given match, the more the team was characterized by players who individually outperformed their teammates. Finally, all these features were fed into classification algorithms, namely Decision Tree, Random Forest and AdaBoost classifiers, with the task of classifying a team in a game as male (class 0) or female (class 1).
All the classifiers were validated through 10-fold cross-validation on a training set, and all showed good predictive performance, indicating that a men's football team can be distinguished from a women's one (and vice versa) on the basis of technical skills. Moreover, after fitting a Decision Tree on different versions of the training set and examining the importance of each variable in the decision path each time, we find that the most important differences lie in variables such as the variability of players' individual performance, pass velocity, ball recovery time and the percentage of accurate passes made by the teams.
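
As a rough illustration of the rating aggregation mentioned in the abstract, the sketch below collapses one team's player ratings for a match into a mean and a standard deviation. The rating values, the team labels and the helper `aggregate_ratings` are hypothetical, and the population standard deviation is an assumption; the abstract does not state how the ratings were scaled or which estimator the thesis used.

```python
from statistics import mean, pstdev

# Hypothetical per-player ratings for two team-match observations.
# The values are invented for illustration, not taken from the thesis data.
ratings_by_team = {
    "team_a": [0.50, 0.52, 0.48, 0.51, 0.49],  # players perform at a similar level
    "team_b": [0.30, 0.32, 0.90, 0.31, 0.67],  # a few players stand out
}


def aggregate_ratings(ratings):
    """Collapse one team's player ratings into (mean, standard deviation).

    Uses the population standard deviation; this is an assumption made
    for the sketch, not a detail confirmed by the abstract.
    """
    return mean(ratings), pstdev(ratings)


# One (mean, std) feature pair per team-match observation.
features = {team: aggregate_ratings(r) for team, r in ratings_by_team.items()}

for team, (avg, spread) in features.items():
    print(f"{team}: mean={avg:.3f}, std={spread:.3f}")
```

Both hypothetical teams have the same mean rating, but the second has a much larger standard deviation, which is the situation the abstract describes as a team characterized by players who individually outperformed their teammates.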

Analysis of the technical attributes of men's and women's national football teams: a comparison through a statistical machine learning approach

PONTILLO, GIUSEPPE
2018/2019


Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/101434