Item response theory and robust design of a multistage test

Item Response Theory (IRT) is the main framework for studying multiple-choice tests. It provides tools, models, and properties for describing and estimating a candidate's ability and the parameters that characterize the questions (such as difficulty and discrimination). Consequently, it is widely used in the design of tests that assess candidates' skills. A particular case is the mastery test, which aims to classify candidates into two categories, usually pass/fail or master/non-master. In this context, it is important to establish the thresholds on which the decision rule is based. Mastery tests can have either fixed or variable length. In the latter case, the candidate answers an initial set of questions and, depending on the score obtained and the predetermined thresholds, the outcome is "exam passed," "exam not passed," or a further set of questions; the process continues until the candidate is classified as passing or failing. Special importance is therefore given to the calculation of the threshold scores and to the classification errors, which determine the accuracy of the test. In the design of Multi-Stage Mastery Tests, the concept of robustness, that is, the ability to remain reliable despite deviations and errors in the underlying assumptions, becomes essential. Specifically, the objective of this work is to construct such a test, establish it as a reference model, and study its robustness by modifying the parameters that characterize the population and the response model on which it was designed.
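The variable-length decision rule described above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from this work: the response model is taken to be the two-parameter logistic (2PL) model, the decision is based on the cumulative number-correct score, and the function names and threshold values are hypothetical.

```python
import math
from typing import Sequence


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL model (assumed here): probability that a candidate with ability
    theta answers correctly an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def classify(stage_scores: Sequence[int],
             lower: Sequence[int],
             upper: Sequence[int]) -> str:
    """Sequential mastery decision on the cumulative score (an assumption).

    After each stage the running total is compared with that stage's
    thresholds: total >= upper -> "pass", total < lower -> "fail",
    otherwise the candidate continues to the next stage.
    Setting lower == upper at the last stage forces a final decision.
    """
    total = 0
    for score, lo, hi in zip(stage_scores, lower, upper):
        total += score
        if total >= hi:
            return "pass"
        if total < lo:
            return "fail"
    # Reached only if the scores end before a decision has been made.
    raise ValueError("candidate not yet classified: more stages needed")


# Hypothetical two-stage test: continue after stage 1 if 4 <= score < 8.
LOWER = [4, 12]
UPPER = [8, 12]

print(classify([9], LOWER, UPPER))     # decided after the first stage
print(classify([6, 5], LOWER, UPPER))  # decided after the second stage
```

In this sketch, widening the gap between the lower and upper thresholds at the first stage sends more candidates to the second stage, which lengthens the test for them.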
