Breaking the mold: Enhancing Transformer-based WSD through synset-to-lemma training
GABUTTI, DANIEL
2023/2024
Abstract
Transformers are the state-of-the-art tool for sequence-to-sequence processing, leveraging deep contextual understanding to accurately interpret meaning based on surrounding information. Traditionally, Transformer models disambiguate words or lemmas in an input sentence by extracting contextual embeddings and predicting the most probable sense of an ambiguous term. We propose a new technique to harness the power of Transformers: our architecture is instead trained to process sequences of senses and predict the most probable corresponding lemmas. Through this inverse procedure, we aim to capture additional, previously overlooked contextual information, turning the Transformer's output into the input for a selection mechanism that identifies the most probable contextual sense of each ambiguous word.
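The selection mechanism described above can be illustrated with a minimal sketch: for each candidate sense of an ambiguous word, substitute it into the sense sequence, run the sense-to-lemma model, and keep the sense under which the model assigns the highest probability to the lemma actually observed. The names below (`model`, `sense_seq`, `candidate_synsets`) are illustrative assumptions, not the thesis's actual API.

```python
import torch

def select_sense(model, sense_seq, position, candidate_synsets, target_lemma_id):
    """Pick the candidate synset under which the observed lemma is most probable.

    Assumes a hypothetical sense-to-lemma Transformer `model` that maps a
    batch of sense-ID sequences to per-position logits over the lemma
    vocabulary, shaped (batch, seq_len, lemma_vocab_size).
    """
    best_synset, best_score = None, float("-inf")
    for synset_id in candidate_synsets:
        trial_seq = list(sense_seq)
        trial_seq[position] = synset_id  # hypothesize this sense for the word
        with torch.no_grad():
            logits = model(torch.tensor([trial_seq]))
            log_probs = torch.log_softmax(logits, dim=-1)
        # Probability the model assigns to the lemma actually observed here.
        score = log_probs[0, position, target_lemma_id].item()
        if score > best_score:
            best_synset, best_score = synset_id, score
    return best_synset
```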
File | Size | Format
---|---|---
Tesi Magistrale - Daniel Gabutti.pdf (not available) | 935.79 kB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/164320