In recent decades, the theory of complex networks has provided many quantitative tools for studying systems in which nodes and interactions can be identified. In particular, bipartite networks are used to represent systems that contain two distinct types of nodes. In this thesis I study some models for bipartite networks, starting from the case study of a host-pathogen interaction system. Ecological networks, especially those describing interactions between pathogens and hosts, are particularly interesting for physicists and epidemiologists. Indeed, they are a fundamental resource for the study of infectious diseases of humans and animals, including zoonoses. Infectious diseases, moreover, are an extremely topical problem, as the recent SARS-CoV-2 pandemic shows. In my research I analyze the host-pathogen network with two main objectives. First, I look for communities of similar hosts, in particular hosts that are infected by the same pathogens as humans. Then I try to identify which interactions, although absent from the database, have some probability of existing. To begin, I project the bipartite network onto the set of hosts, with the aim of transferring as much information as possible from the original network to a monopartite network, on which classical algorithms can be used for community detection. Specifically, I apply several projection methods: the so-called naive method and three more sophisticated methods called entropy-based. I obtain four very different results. The naive projection yields a network that is not informative at all, whereas two of the entropy-based methods yield much more meaningful projections, in which ecologically well-defined host communities can be identified.
In the second part of the thesis I use Bayesian inference methods and Monte Carlo techniques to reconstruct the overall real network of interactions, combining the available data with some hypotheses. These hypotheses concern the mechanisms by which the ecological network was generated and then observed. I analyze the results obtained, in particular the unobserved interactions with a non-negligible probability of existence, and I discuss how effectively the model addresses a link prediction problem. Finally, I propose some ideas for developing Bayesian reconstruction models that might be better suited to the ecological problem of interest.
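
As an illustration of the projection step mentioned in the abstract, the following is a minimal sketch of a naive projection onto the host layer. It assumes that "naive" means linking two hosts whenever they share at least one pathogen, which is a common reading but may differ from the definition used in the thesis. The function name `naive_projection`, the host and pathogen labels, and the toy data are all hypothetical and are not taken from the thesis database.

```python
from itertools import combinations


def naive_projection(host_pathogens):
    """Return the host-host edges of a naive projection.

    Assumption (not confirmed by the source): two hosts are linked
    whenever they share at least one pathogen. The edge value is the
    number of pathogens the two hosts have in common.
    """
    edges = {}
    # Visit every unordered pair of hosts exactly once.
    for a, b in combinations(sorted(host_pathogens), 2):
        shared = host_pathogens[a] & host_pathogens[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical toy data: each host maps to the set of pathogens observed in it.
toy_network = {
    "host_A": {"p1", "p2"},
    "host_B": {"p1"},
    "host_C": {"p2", "p3"},
    "host_D": {"p3"},
}

projection = naive_projection(toy_network)
for (a, b), weight in sorted(projection.items()):
    print(a, b, weight)
```

On real data, a few widespread pathogens may be enough to link most pairs of hosts under this rule, which could be one reason why such a projection carries little information.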
Models for bipartite complex networks: a case study of a host-pathogen network (Modelli per reti complesse bipartite: caso studio di una rete di interazioni ospite-patogeno)
BONACINA, FRANCESCO
2018/2019
Abstract
In recent decades, network science has provided many quantitative tools for studying real-world systems represented by nodes and the interactions among them. In particular, bipartite networks are used to represent systems with two distinct types of nodes. In this thesis I study some models for bipartite networks, starting from the case study of a host-pathogen interaction system. Ecological networks, especially those describing host-pathogen interactions, are particularly interesting for physicists and epidemiologists. Indeed, they are a crucial source of information for research on infectious diseases of humans and animals, including zoonoses. Moreover, infectious diseases are a timely issue, as demonstrated by the current COVID-19 pandemic. In my thesis, I analyze the host-pathogen network with two main objectives. The first is to identify communities of similar hosts, in particular hosts that could share the same pathogens with humans. The second is to define a probability of existence for those interactions between pathogens and hosts that have not been recorded in the database. To start, I project the bipartite network onto the layer of hosts, with the aim of transferring as much information as possible about the original network structure into a monopartite graph, on which classical community detection algorithms can be run. To do so, I compare different projection methods: the so-called naive projection and three entropy-based algorithms. I find clearly different outcomes among the four approaches. The naive projection gives a non-informative network, while two of the entropy-based methods give much more informative projections, in which well-defined ecological host communities can be detected.
In the second part of the thesis I use Bayesian inference and Markov chain Monte Carlo (MCMC) techniques to reconstruct the real, unknown network of interactions, starting from the observed interactions and from some hypotheses. These hypotheses concern the generative and observational mechanisms of the network. I analyze the results obtained, in particular the non-observed interactions for which the estimated existence probability is not negligible, and I discuss the effectiveness of the model in addressing a link prediction problem. Finally, I propose some ideas for developing Bayesian reconstruction models that would better fit the ecological problem.

File | Size | Format |
---|---|---|
787353_francescobonacina_modelsforbipartitecomplexnetworksacasestudyofahost-pathogennetwork.pdf (not available; type: other attached material) | 5.42 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/153048