[ACL 2019] Employing the Correspondence of Relations and Connectives to Identify Implicit Discourse Relations via Label Embeddings

Introduction

Discourse parsing reveals the discourse units (e.g., text spans, sentences, clauses) of a document and how these units relate to each other to make the document coherent.

This work focuses on the task of implicit discourse relation recognition (IDRR), which aims to identify the discourse relations (e.g., cause, contrast) between adjacent text spans in documents.

IDRR is a fundamental problem in discourse analysis (Knott, 2014; Webber et al., 1999) with important applications in question answering (Liakata et al., 2013; Jansen et al., 2014) and text summarization (Gerani et al., 2014; Yoshida et al., 2014), to name a few.

Due to its importance, IDRR has been studied actively in the literature, leading to recent advances based on deep learning (Chen et al., 2016; Qin et al., 2016; Zhang et al., 2016; Lan et al., 2017; Dai and Huang, 2018).

Consider the following two text spans (called arguments), taken from Qin et al. (2017), as an example:

Argument 1: Never mind.

Argument 2: You already know the answer

An IDRR model should be able to recognize that argument 2 is the cause of argument 1 (i.e., the Cause relation) in this case.

This is a challenging problem because the models must rely solely on the text of the arguments to predict the discourse relations accurately.

The problem would become more manageable if connective/marker cues (e.g., “but”, “so”) were provided to connect the two arguments according to their discourse relations (Qin et al., 2017).

In the example above, it is beneficial for the models to know that “because” can be a connective of the two arguments that is consistent with their discourse relation (i.e., Cause).

In fact, a human annotator can also benefit from the connectives between arguments when he or she needs to assign discourse relations for pairs of arguments (Qin et al., 2017).

This is demonstrated in the Penn Discourse Treebank dataset (PDTB) (Prasad et al., 2008), a major benchmark dataset for IDRR, where the annotators first inject the connectives between the arguments (called the “implicit connectives”) to aid the relation assignment of the arguments later (Qin et al., 2017).

Motivated by the relevance of connectives for IDRR, some recent deep learning work has explored methods that transfer knowledge from the implicit connectives to support discourse relation prediction using multi-task learning frameworks (Qin et al., 2017; Bai and Zhao, 2018).

The typical approach is to simultaneously predict the discourse relations and the implicit connectives for the input arguments, with the model parameters for the two prediction tasks shared/tied to allow knowledge transfer (Liu et al., 2016; Wu et al., 2016; Lan et al., 2017; Bai and Zhao, 2018).
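
The shared-parameter setup described above can be sketched roughly as follows. This is a minimal illustration and not the architecture of any of the cited papers: the class name `MultiTaskSketch`, the layer sizes, the single tanh layer standing in for the argument encoder, and the `connective_weight` parameter are all assumptions made for the example.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MultiTaskSketch:
    """Hypothetical model: one shared encoder, one output head per task."""

    def __init__(self, input_dim, hidden_dim, num_relations, num_connectives, seed=0):
        rng = random.Random(seed)
        # The shared layer is used by both tasks, so both losses would update it.
        self.shared = random_matrix(hidden_dim, input_dim, rng)
        self.relation_head = random_matrix(num_relations, hidden_dim, rng)
        self.connective_head = random_matrix(num_connectives, hidden_dim, rng)

    def forward(self, features):
        hidden = [math.tanh(h) for h in matvec(self.shared, features)]
        relation_probs = softmax(matvec(self.relation_head, hidden))
        connective_probs = softmax(matvec(self.connective_head, hidden))
        return relation_probs, connective_probs

    def joint_loss(self, features, relation_id, connective_id, connective_weight=1.0):
        """Sum of the two cross-entropy losses (the weighting is an assumption)."""
        relation_probs, connective_probs = self.forward(features)
        relation_loss = -math.log(relation_probs[relation_id])
        connective_loss = -math.log(connective_probs[connective_id])
        return relation_loss + connective_weight * connective_loss


# Toy usage: the feature vector and label ids are made up.
model = MultiTaskSketch(input_dim=4, hidden_dim=3, num_relations=4, num_connectives=5)
features = [0.2, -0.1, 0.5, 0.3]
relation_probs, connective_probs = model.forward(features)
loss = model.joint_loss(features, relation_id=1, connective_id=2)
```

Because both heads read the same hidden vector, gradients from the connective loss would reach the shared layer during training. Nothing in this sketch tells the model which connective goes with which relation.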

Unfortunately, such multitask learning models for IDRR share the limitation of failing to exploit the mapping between the implicit connectives and the discourse relations.

In particular, each implicit connective in the PDTB dataset can be naturally mapped, based on its semantics, to the corresponding discourse relations, and this mapping can be further employed to transfer knowledge from the connectives to the relations.

For instance, in the PDTB dataset, the connective “consequently” uniquely corresponds to the relation cause while the connective “in contrast” can be associated with the relation comparison.
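
As a rough illustration, such a connective-relation mapping could be stored as a simple lookup table. Only the two entries below come from the text; the names `CONNECTIVE_TO_RELATIONS`, `relations_for`, and `is_consistent` are hypothetical, and a full mapping would have to be derived from the PDTB annotations themselves.

```python
# Connective -> set of relations it can signal.
# Only these two entries are taken from the text; the rest of the table
# would need to be built from the PDTB annotations.
CONNECTIVE_TO_RELATIONS = {
    "consequently": {"Cause"},
    "in contrast": {"Comparison"},
}


def relations_for(connective):
    """Return the relations a connective maps to (empty set if not in the table)."""
    return CONNECTIVE_TO_RELATIONS.get(connective.lower(), set())


def is_consistent(connective, relation):
    """Check whether a (connective, relation) pair agrees with the mapping."""
    return relation in relations_for(connective)
```

A value is a set rather than a single label because the text only says that "consequently" corresponds uniquely to one relation, which leaves room for other connectives to map to more than one.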

In this work, we argue that the knowledge transfer facilitated by such a connective-relation mapping can indeed help to improve the performance of the multi-task learning models for IDRR with deep learning.

Consequently, in order to exploit the connective-relation mapping, we propose to embed the implicit connectives and the discourse relations into the same space that would be used to transfer the knowledge between connective and relation predictions via the mapping.
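
The idea of a shared embedding space can be pictured with a toy sketch. This section does not spell out the formulation, so everything here is an assumption rather than the authors' model: labels are scored by a dot product with an argument-pair representation, and a hypothetical `mapping_penalty` pulls each connective embedding toward the embedding of the relation it maps to. All vectors are made-up numbers.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(representation, label_embeddings):
    """Score every label by its dot product with the representation."""
    names = list(label_embeddings)
    probs = softmax([dot(representation, label_embeddings[n]) for n in names])
    return dict(zip(names, probs))


def mapping_penalty(connective_embeddings, relation_embeddings, mapping):
    """Mean squared distance between each connective and its mapped relation."""
    total = 0.0
    for connective, relation in mapping.items():
        c = connective_embeddings[connective]
        r = relation_embeddings[relation]
        total += sum((a - b) ** 2 for a, b in zip(c, r))
    return total / len(mapping) if mapping else 0.0


# Toy 2-dimensional embeddings living in the same space (illustrative values).
relation_embeddings = {"Cause": [1.0, 0.0], "Comparison": [0.0, 1.0]}
connective_embeddings = {"consequently": [0.9, 0.1], "in contrast": [0.2, 0.8]}
mapping = {"consequently": "Cause", "in contrast": "Comparison"}

representation = [2.0, 0.5]  # hypothetical encoding of an argument pair
relation_probs = predict(representation, relation_embeddings)
connective_probs = predict(representation, connective_embeddings)
penalty = mapping_penalty(connective_embeddings, relation_embeddings, mapping)
```

Since both label sets share one space, the same representation scores connectives and relations alike. Adding a term such as `penalty` to a training loss would be one way for the mapping to influence both predictions.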

We introduce several mechanisms to encourage both knowledge sharing and representation distinction for the embeddings of the connectives and relations for IDRR.

In the experiments, we extensively demonstrate that the novel embeddings of connectives and relations along with the proposed mechanisms significantly improve the multi-task learning models for IDRR.

We achieve state-of-the-art performance for IDRR in several settings of the benchmark dataset PDTB.

Related Work

There has been much research on IDRR since the creation of the PDTB dataset (Prasad et al., 2008).

Early work manually designed various features for IDRR (Pitler et al., 2009; Lin et al., 2009; Wang et al., 2010; Zhou et al., 2010; Braud and Denis, 2015; Lei et al., 2018), while recent approaches apply deep learning to significantly improve the performance of IDRR (Zhang et al., 2015; Ji et al., 2015a; Chen et al., 2016; Liu et al., 2016; Qin et al., 2016; Zhang et al., 2016; Cai and Zhao, 2017; Lan et al., 2017; Wu et al., 2017; Dai and Huang, 2018; Kishimoto et al., 2018).

The work most closely related to ours involves multi-task learning models for IDRR that employ connectives as auxiliary labels for predicting the discourse relations.

In the feature-based approach, Zhou et al. (2010) employ a pipeline that first predicts the connectives and then assigns discourse relations accordingly, while Lan et al. (2013) use the connective-relation mapping to automatically generate synthetic data.

In recent deep learning work on IDRR, Liu et al. (2016), Wu et al. (2016), Lan et al. (2017), and Bai and Zhao (2018) simultaneously predict connectives and relations with deep learning models whose parameters are shared, while Qin et al. (2017) develop adversarial networks to encourage the relation models to mimic the features learned from the connective incorporation.

However, none of these works employs embeddings of connectives and relations to transfer knowledge through the connective-relation mapping and deep learning, as we do in this work.