Unrestricted Bridging Resolution

Hou, Yufang

Preview

PDF, English
Download (1MB) | Terms of use

Citation of documents: Please do not cite the URL that is displayed in your browser location input, instead use the DOI, URN or the persistent URL below, as we can guarantee their long-time accessibility.

DOI: 10.11588/heidok.00020344
URN: urn:nbn:de:bsz:16-heidok-203444

Abstract

Anaphora plays a major role in discourse comprehension and accounts for the coherence of a text. In contrast to identity anaphora which indicates that a noun phrase refers back to the same entity introduced by previous descriptions in the discourse, bridging anaphora or associative anaphora links anaphors and antecedents via lexico-semantic, frame or encyclopedic relations.

In recent years, various computational approaches have been developed for bridging resolution. However, most of them only consider antecedent selection, assuming that bridging anaphora recognition has been performed. Moreover, they often focus on subproblems, e.g., only part-of bridging or definite noun phrase anaphora. This thesis addresses the problem of unrestricted bridging resolution, i.e., recognizing bridging anaphora and finding links to antecedents where bridging anaphors are not limited to definite noun phrases and semantic relations between anaphors and their antecedents are not restricted to meronymic relations.

In this thesis, we solve the problem using a two-stage statistical model. Given all mentions in a document, the first stage predicts bridging anaphors by exploring a cascading collective classification model. We cast bridging anaphora recognition as a subtask of learning fine-grained information status (IS). Each mention in a text gets assigned one IS class, bridging being one possible class. The model combines the binary classifiers for minority categories and a collective classifier for all categories in a cascaded way. It addresses the multi-class imbalance problem (e.g., the wide variation of bridging anaphora and their relative rarity compared to many other IS classes) within a multi-class setting while still keeping the strength of the collective classifier by investigating relational autocorrelation among several IS classes. The second stage finds the antecedents for all predicted bridging anaphors at the same time by exploring a joint inference model. The approach models two mutually supportive tasks (i.e., bridging anaphora resolution and sibling anaphors clustering) jointly, on the basis of the observation that semantically/syntactically related anaphors are likely to be sibling anaphors, and hence share the same antecedent. Both components are based on rich linguistically-motivated features and discriminatively trained on a corpus (ISNotes) where bridging is reliably annotated. Our approaches achieve substantial improvements over the reimplementations of previous systems for all three tasks, i.e., bridging anaphora recognition, bridging anaphora resolution and full bridging resolution.

The work is – to our knowledge – the first bridging resolution system that handles the unrestricted phenomenon in a realistic setting. The methods in this dissertation were originally presented in Markert et al. (2012) and Hou et al. (2013a; 2013b; 2014). The thesis gives a detailed exposition, carrying out a thorough corpus analysis of bridging and conducting a detailed comparison of our models to others in the literature, and also presents several extensions of the aforementioned papers.

Document type:	Dissertation
Supervisor:	Strube, Prof. Dr. Michael
Date of thesis defense:	22 January 2016
Date Deposited:	31 Mar 2016 06:45
Date:	2016
Faculties / Institutes:	Neuphilologische Fakultät > Institut für Computerlinguistik
DDC-classification:	004 Data processing Computer science 400 Linguistics
Uncontrolled Keywords:	natural language processing, bridging resolution, associative anaphora, information status