
Deep neural networks for character-based sequence-to-sequence models with a copy mechanism

BONETTA, GIOVANNI
2017/2018

Abstract

Nowadays deep learning is reshaping the way scientists tackle problems that were once thought almost impossible to solve. The applications of neural networks seem endless, and the research community is making great efforts to push the boundaries even further. One of the great challenges of deep learning concerns language: how a machine can model speech, written text and dialogue the way humans do is still not entirely clear. Natural Language Generation (NLG) is the research field that focuses on automatically generating narratives and reports, in easy-to-read language, that describe, summarize or explain input data. This task is becoming increasingly important as companies pile up huge amounts of data that humans cannot read and understand in a straightforward way. A paradigm shift is occurring as researchers figure out how to use neural networks and deep learning as building blocks in NLG algorithms, obtaining far more accurate results than with previous approaches. Here we analyze the problem of generating fluent English descriptions from table-like data. We focused on developing a sequence-to-sequence model that handles this task by leveraging two major features: the ability to work in a character-based manner, reading and generating character by character; and the ability to switch between a generation mechanism and a copying mechanism when required. Working with characters instead of words is a challenge that brings several problems, such as a harder training phase and a higher error probability during inference. Nevertheless, our work shows that these issues can be solved, and the effort is repaid with a model able to handle more general inputs and outputs. On top of this, our copying technique, integrated with a shift mechanism, adds the ability to learn to generate parts of the output by taking them directly from the input; a feature that is particularly useful when the input contains rare words such as proper names.
We tested the neural network's performance on a restaurant-related dataset, scoring promising results and showing that deep learning is indeed an effective way to model language.
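To give a flavor of the generate/copy switch described above, the following is a minimal sketch (not taken from the thesis itself) of one common way such a mechanism can be realized: a pointer-style mixture, where a gate blends a generation distribution over the character vocabulary with a copy distribution obtained by scattering attention weights onto the input characters. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def mix_distributions(gen_logits, attn_logits, input_char_ids, vocab_size, gate):
    """Blend a generation distribution with a copy distribution.

    gen_logits:     logits over the character vocabulary (generation head)
    attn_logits:    attention logits over input positions (copy head)
    input_char_ids: character id at each input position
    gate:           scalar in [0, 1]; 1 = pure generation, 0 = pure copying
    """
    p_gen = softmax(gen_logits)        # distribution over vocabulary
    attn = softmax(attn_logits)        # distribution over input positions
    p_copy = np.zeros(vocab_size)
    for pos, cid in enumerate(input_char_ids):
        p_copy[cid] += attn[pos]       # scatter attention mass onto characters
    return gate * p_gen + (1.0 - gate) * p_copy
```

Characters that appear in the input (for instance, those of a rare proper name) receive extra probability mass through the copy term, which is why such a mechanism helps with out-of-vocabulary content.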
Files in this product:
823832_thesis_823832.pdf — 1.87 MB, Adobe PDF — not available (Type: other attached material)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/95984