eprintid: 27919
rev_number: 18
eprint_status: archive
userid: 4991
dir: disk0/00/02/79/19
datestamp: 2020-02-21 12:24:57
lastmod: 2020-02-27 11:03:19
status_changed: 2020-02-21 12:24:57
type: doctoralThesis
metadata_visibility: show
creators_name: Moosavi, Nafise Sadat
title: Robustness in Coreference Resolution
subjects: ddc-004
subjects: ddc-400
divisions: i-90500
adv_faculty: af-09
cterms_swd: Natural Language Processing
cterms_swd: Coreference resolution
cterms_swd: Robustness
abstract: Coreference resolution is the task of determining different expressions of a text that refer to the same entity. The resolution of coreferring expressions is an essential step for automatic interpretation of the text. While coreference information is beneficial for various NLP tasks like summarization, question answering, and information extraction, state-of-the-art coreference resolvers are barely used in any of these tasks. The problem is the lack of robustness in coreference resolution systems. A coreference resolver that gets higher scores on the standard evaluation set does not necessarily perform better than the others on a new test set.
In this thesis, we introduce robustness in coreference resolution by (1) introducing a reliable evaluation framework for recognizing robust improvements, and (2) proposing a solution that results in robust coreference resolvers.
As the first step of setting up the evaluation framework, we introduce a reliable evaluation metric, called LEA, that overcomes the drawbacks of the existing metrics. We analyze LEA based on various types of errors in coreference outputs and show that it results in reliable scores. In addition to an evaluation metric, we also introduce an evaluation setting in which we disentangle coreference evaluations from parsing complexities. Coreference resolution is affected by parsing complexities for detecting the boundaries of expressions that have complex syntactic structures. We reduce the effect of parsing errors in coreference evaluation by automatically extracting a minimum span for each expression. We then emphasize the importance of out-of-domain evaluations and generalization in coreference resolution and discuss the reasons behind the poor generalization of state-of-the-art coreference resolvers.
Finally, we show that enhancing state-of-the-art coreference resolvers with linguistic features is a promising approach for making coreference resolvers robust across domains. Incorporating linguistic features with all their values does not improve performance. However, we introduce an efficient pattern mining approach, called EPM, that mines all feature-value combinations that are discriminative for coreference relations. We then incorporate only the feature-values that are discriminative for coreference relations. Employing EPM feature-values improves performance significantly across various domains.
date: 2020
id_scheme: DOI
id_number: 10.11588/heidok.00027919
ppn_swb: 1691157325
own_urn: urn:nbn:de:bsz:16-heidok-279197
date_accepted: 2019-07-22
language: eng
bibsort: MOOSAVINAFROBUSTNESS2020
full_text_status: public
place_of_pub: Heidelberg
citation: Moosavi, Nafise Sadat (2020) Robustness in Coreference Resolution. [Dissertation]
document_url: https://archiv.ub.uni-heidelberg.de/volltextserver/27919/1/thesis.pdf