
Knowledge-Enhanced Neural Networks for Machine Reading Comprehension

Mihaylov, Todor Borisov

Abstract

Machine Reading Comprehension is a language understanding task in which a system is expected to read a given passage of text and, typically, answer questions about it. When humans approach the task of reading comprehension, in addition to the presented text they usually draw on knowledge they already have, such as commonsense and world knowledge, or on previously acquired language skills: understanding the events in a text and their arguments (who did what to whom), their participants, and their relations in discourse. In contrast, neural network approaches to machine reading comprehension have focused on training end-to-end systems that rely only on annotated task-specific data.

In this thesis, we explore approaches for tackling the reading comprehension problem, motivated by how a human would solve the task, using existing background and commonsense knowledge or knowledge from various linguistic tasks.

First, we develop a neural reading comprehension model that integrates external commonsense knowledge encoded as a key-value memory. Instead of relying only on document-to-question interaction or discrete features, our model attends to relevant external knowledge and combines this knowledge with the context representation before inferring the answer. This allows the model to retrieve and incorporate knowledge from an external knowledge source that is not explicitly stated in the text but is relevant for inferring the answer. We demonstrate that the proposed approach improves the performance of very strong baseline models for cloze-style reading comprehension and open-book question answering. By including knowledge explicitly, our model can also provide evidence about the background knowledge used in the reasoning process.
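To illustrate the general idea of attending over a key-value knowledge memory and gating it into the context representation, the following is a minimal PyTorch-style sketch; the dimensions, the dot-product scoring, and the gated combination are illustrative assumptions, not the exact architecture developed in the thesis.

# Minimal sketch: key-value memory attention over pre-encoded external knowledge.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeyValueKnowledgeAttention(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        # Gate that mixes the context representation with the retrieved knowledge.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, context, keys, values):
        # context: (batch, hidden)          -- question-aware passage representation
        # keys:    (batch, n_facts, hidden) -- encoded knowledge keys (e.g. subject + relation)
        # values:  (batch, n_facts, hidden) -- encoded knowledge values (e.g. object)
        scores = torch.bmm(keys, context.unsqueeze(-1)).squeeze(-1)      # (batch, n_facts)
        weights = F.softmax(scores, dim=-1)                              # attention over facts
        knowledge = torch.bmm(weights.unsqueeze(1), values).squeeze(1)   # (batch, hidden)
        gate = torch.sigmoid(self.gate(torch.cat([context, knowledge], dim=-1)))
        # Knowledge-enriched representation used before answer inference.
        return gate * context + (1 - gate) * knowledge

Because the attention weights over facts are explicit, the most highly weighted memory entries can be inspected as evidence for the background knowledge used in a prediction.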

Further, we examine the impact of transferring linguistic knowledge from low-level linguistic tasks into a reading comprehension system via neural representations. Our experiments show that the knowledge captured by neural representations trained on these linguistic tasks can be adapted and combined to improve the reading comprehension model early in training and when it is trained on small portions of the data.
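One simple way to realize such a transfer is to reuse encoders pre-trained on auxiliary linguistic tasks (e.g. tagging tasks) as additional, frozen feature extractors for the reading comprehension encoder. The sketch below shows this pattern; the module interfaces, the freezing, and the concatenation strategy are assumptions for illustration, not the thesis's exact setup.

# Illustrative sketch: combining frozen encoders from auxiliary linguistic tasks
# with a task-specific encoder for reading comprehension.
import torch
import torch.nn as nn

class TransferEnrichedEncoder(nn.Module):
    def __init__(self, base_encoder: nn.Module, aux_encoders, hidden_dim: int, aux_dim: int):
        super().__init__()
        self.base_encoder = base_encoder                   # trained on the RC task itself
        self.aux_encoders = nn.ModuleList(aux_encoders)    # pre-trained on linguistic tasks
        for enc in self.aux_encoders:                      # keep transferred knowledge fixed
            for p in enc.parameters():
                p.requires_grad = False
        self.project = nn.Linear(hidden_dim + len(aux_encoders) * aux_dim, hidden_dim)

    def forward(self, token_embeddings):
        # Each encoder is assumed to map (batch, seq, emb) to a per-token representation.
        base = self.base_encoder(token_embeddings)                      # (batch, seq, hidden)
        aux = [enc(token_embeddings) for enc in self.aux_encoders]      # each (batch, seq, aux_dim)
        combined = torch.cat([base] + aux, dim=-1)
        return self.project(combined)                                   # (batch, seq, hidden)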

Finally, we propose to use structured linguistic annotations as the basis for a Discourse-Aware Semantic Self-Attention encoder that we employ for reading comprehension of narrative texts. We extract relations between discourse units, events and their arguments, and co-referring mentions using available annotation tools. The empirical evaluation shows that the investigated structures improve overall performance (by up to +3.4 ROUGE-L), in particular intra-sentential and cross-sentential discourse relations, sentence-internal semantic role relations, and long-distance coreference relations. We also show that dedicating self-attention heads to intra-sentential relations and to relations connecting neighboring sentences is beneficial for finding answers to questions in longer contexts. These findings encourage the use of discourse-semantic annotations to enhance the generalization capacity of self-attention models for machine reading comprehension.
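The core mechanism of dedicating attention heads to particular linguistic relations can be sketched as multi-head self-attention with per-head boolean masks derived from the annotations (e.g. one head restricted to within-sentence SRL arcs, another to coreferent mentions). The following minimal sketch assumes such masks are built beforehand; the mask construction and head assignment are illustrative, not the encoder proposed in the thesis.

# Minimal sketch: self-attention where each head is restricted by a relation-specific mask.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationMaskedSelfAttention(nn.Module):
    def __init__(self, hidden_dim: int, num_heads: int):
        super().__init__()
        assert hidden_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_dim // num_heads
        self.qkv = nn.Linear(hidden_dim, 3 * hidden_dim)
        self.out = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x, head_masks):
        # x:          (batch, seq, hidden)
        # head_masks: (batch, num_heads, seq, seq) boolean; True where head h may attend,
        #             e.g. head 0 only within the same sentence, head 1 only between
        #             coreferent mentions. Every row should at least allow the diagonal
        #             (a token attending to itself) so the softmax stays well defined.
        b, s, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
        scores = torch.matmul(q, k.transpose(-2, -1)) / self.head_dim ** 0.5
        scores = scores.masked_fill(~head_masks, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        ctx = torch.matmul(attn, v).transpose(1, 2).reshape(b, s, -1)
        return self.out(ctx)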

Document type: Dissertation
Supervisor: Frank, Prof. Dr. Anette
Place of Publication: Heidelberg
Date of thesis defense: 21 February 2022
Date Deposited: 08 Feb 2024 13:18
Date: 2024
Faculties / Institutes: Neuphilologische Fakultät > Institut für Computerlinguistik
DDC-classification: 400 Linguistics
Controlled Keywords: Reading Comprehension, Deep Learning, Background Knowledge, Neural Networks, Computational Linguistics