A Regularized Competition Model for Question Difficulty Estimation in Community Question Answering Services-20160520

1、Information

publication: EMNLP 2014

author: Jing Liu (this paper extends the model from the earlier SIGIR paper)

2、What

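Estimate how difficult questions are in community question answering (CQA) services. The paper models difficulty from two directions: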

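1) Comparisons from competitions: Competition Model. For a question q with asker u_a, best answerer u_b, and other answerers u_{o_1}, ..., u_{o_M}, the competition set is
$C_q = \{u_a \prec q,\ q \prec u_b,\ u_a \prec u_b,\ u_{o_1} \prec u_b,\ \cdots,\ u_{o_M} \prec u_b\}$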
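As a rough illustration of the competition set above, the sketch below builds the pairwise competitions for a single question thread. The function name `build_competitions` and the `(loser, winner)` tuple encoding are assumptions made for illustration, not notation from the paper.

```python
def build_competitions(question, asker, best_answerer, other_answerers):
    """Return a list of (loser, winner) pairs for one question thread.

    A pair (x, y) encodes x ≺ y, i.e. y "beats" x.
    """
    pairs = [
        (asker, question),          # u_a ≺ q: the question beats its asker
        (question, best_answerer),  # q ≺ u_b: the best answerer beats the question
        (asker, best_answerer),     # u_a ≺ u_b: the best answerer beats the asker
    ]
    # u_o ≺ u_b for every other answerer in the thread
    pairs.extend((other, best_answerer) for other in other_answerers)
    return pairs


# Hypothetical thread: one asker, one best answerer, two other answerers.
example = build_competitions("q1", "alice", "bob", ["carol", "dave"])
```

Each thread with M other answerers therefore contributes 3 + M competitions.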

2) Question text similarities for QDE: questions of similar difficulty tend to have similar textual descriptions. (This addresses the cold-start problem.)

3、Dataset

Stack Overflow:

A question-and-answer site for programming and IT topics.

Data download URL:

http://www.ics.uci.edu/~duboisc/stackoverflow/

qid: Unique question id
i: User id of questioner
qs: Score of the question
qt: Time of the question (in epoch time)
tags: A comma-separated list of the tags associated with the question. Examples of tags are "html", "R", "mysql", "python", and so on; often between two and six tags are used on each question.
qvc: Number of views of this question (at the time of the datadump)
qac: Number of answers for this question (at the time of the datadump)
aid: Unique answer id
j: User id of answerer
as: Score of the answer
at: Time of the answer
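
The fields above can be grouped into question threads before any competitions are extracted. This is a minimal sketch, assuming the data is a comma-separated file with a header row that uses exactly these field names; the sample rows are made up, and the real file layout may differ. Treating the highest-scored answer as the best answer is also an assumption, since the field list has no accepted-answer flag.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; the column names follow the field list above.
SAMPLE = '''qid,i,qs,qt,tags,qvc,qac,aid,j,as,at
1,10,5,1250000000,"python,mysql",120,2,100,20,3,1250000600
1,10,5,1250000000,"python,mysql",120,2,101,21,7,1250001200
2,11,1,1250002000,"html",40,1,102,20,0,1250002500
'''


def load_threads(handle):
    """Group answer rows by question id (one input row per answer)."""
    threads = defaultdict(lambda: {"asker": None, "tags": [], "answers": []})
    for row in csv.DictReader(handle):
        thread = threads[row["qid"]]
        thread["asker"] = row["i"]
        thread["tags"] = row["tags"].split(",")
        thread["answers"].append({"answerer": row["j"], "score": int(row["as"])})
    return dict(threads)


threads = load_threads(io.StringIO(SAMPLE))
# Assumption: the highest-scored answer stands in for the "best" answer.
best = max(threads["1"]["answers"], key=lambda a: a["score"])
```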

4、How

input: question-user competitions, question-question competitions, and question text similarity.

output: pairwise comparison results (which question in a pair is more difficult).

method: RCM (Regularized Competition Model)
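
To make the idea of combining competitions with text similarity concrete, here is a minimal sketch. It assumes a hinge loss on each competition plus a smoothness penalty that pulls textually similar questions toward similar scores, optimized by plain subgradient descent. The function name, hyperparameters, and defaults are all illustrative assumptions; this is not claimed to be the paper's exact objective or optimizer.

```python
def fit_rcm_sketch(n_items, competitions, similarities,
                   margin=1.0, reg=0.1, lr=0.05, epochs=200):
    """Subgradient descent on hinge losses plus a similarity smoothness term.

    competitions: list of (loser, winner) index pairs, meaning loser ≺ winner.
    similarities: list of (a, b, weight) triples over question indices.
    """
    theta = [0.0] * n_items
    for _ in range(epochs):
        grad = [0.0] * n_items
        # Hinge loss: we want theta[winner] - theta[loser] >= margin.
        for loser, winner in competitions:
            if theta[winner] - theta[loser] < margin:
                grad[winner] -= 1.0
                grad[loser] += 1.0
        # Smoothness: similar questions should receive similar scores.
        for a, b, weight in similarities:
            diff = theta[a] - theta[b]
            grad[a] += reg * weight * diff
            grad[b] -= reg * weight * diff
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy setup: 0 = easy question, 1 = hard question,
# 2 = a user, 3 = a new question with no competitions of its own.
competitions = [(0, 2), (2, 1)]   # user beats item 0; item 1 beats user
similarities = [(1, 3, 1.0)]      # item 3 reads like item 1
theta = fit_rcm_sketch(4, competitions, similarities)
```

In this toy setup item 3 takes part in no competition, so any score it receives comes only from its similarity to item 1, which is the cold-start situation the similarity term is meant to handle.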

5、Evaluation

accuracy: ACC = (# correctly judged question pairs) / (# all question pairs)
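
The accuracy measure can be computed directly from estimated scores and a set of annotated question pairs. This is a small sketch; the function name, the `(easier, harder)` pair encoding, and the sample values are assumptions for illustration.

```python
def pairwise_accuracy(scores, labeled_pairs):
    """Fraction of gold (easier, harder) pairs that the scores order correctly."""
    if not labeled_pairs:
        raise ValueError("no question pairs to evaluate")
    correct = sum(1 for easier, harder in labeled_pairs
                  if scores[harder] > scores[easier])
    return correct / len(labeled_pairs)


# Hypothetical estimated difficulties and gold annotations.
scores = {"q1": 0.2, "q2": 1.5, "q3": 0.9}
gold = [("q1", "q2"), ("q3", "q2"), ("q3", "q1")]
acc = pairwise_accuracy(scores, gold)
```

Here two of the three gold pairs are ordered correctly, so `acc` is 2/3.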

baselines: PageRank, TS (TrueSkill), CM (Competition Model)

6、Additional analysis

1) Different ways of computing text similarity
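
One common way to compute text similarity is cosine similarity over term-count vectors. The sketch below shows only that one choice, with naive whitespace tokenization as a simplifying assumption; it does not reproduce the specific variants compared in the paper.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between whitespace-tokenized term-count vectors."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    dot = sum(counts_a[term] * counts_b[term] for term in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical question titles.
sim = cosine_similarity("how to join tables in mysql",
                        "join two tables in mysql")
```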

2) Estimating difficulty scores for cold-start questions: KNN
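
A KNN estimate for a cold-start question can be sketched as averaging the scores of its most similar questions with known difficulty. The unweighted average, the function name, and the sample numbers are assumptions; the paper's exact neighbor weighting is not reproduced here.

```python
def knn_difficulty(similarities, known_scores, k=3):
    """Average the scores of the k most similar questions with known difficulty.

    similarities: dict mapping a known question id to its similarity
                  with the new (cold-start) question.
    """
    ranked = sorted(known_scores,
                    key=lambda qid: similarities.get(qid, 0.0),
                    reverse=True)
    neighbors = ranked[:k]
    if not neighbors:
        raise ValueError("no questions with known difficulty")
    return sum(known_scores[qid] for qid in neighbors) / len(neighbors)


# Hypothetical known scores and similarities to a new question.
known = {"q1": 0.2, "q2": 1.5, "q3": 0.9, "q4": -0.4}
sims = {"q1": 0.1, "q2": 0.8, "q3": 0.7, "q4": 0.05}
estimate = knn_difficulty(sims, known, k=2)
```

With k=2 the two nearest neighbors are q2 and q3, so the estimate is the mean of their scores.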

3) Examples of text words at different difficulty levels

7、Conclusion

posted @ 2016-05-20 13:13 白婷