[Paper Notes] Money, glory and cheap talk: analyzing strategic behavior of contestants in simultaneous crowdsourcing contests on TopCoder.com (WWW, 2010)

Nikolay Archak. 2010. Money, glory and cheap talk: analyzing strategic behavior of contestants in simultaneous crowdsourcing contests on TopCoder.com. In Proceedings of the 19th international conference on World wide web (WWW '10). ACM, New York, NY, USA, 21-30. DOI=10.1145/1772690.1772694 http://doi.acm.org/10.1145/1772690.1772694

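    The author, Nikolay Archak, is a remarkable figure: Russian, born in the 1980s, and a PhD student at New York University a few years ago. I first came across him on TopCoder, where he ranked near the top in algorithm, design, and development competitions alike and was equally at home in Java and .NET, an admirably all-round skill set. Later I was struck to find two first-author papers of his (both full papers) on the WWW 2010 accepted paper list. Reaching this level in any one of these areas is hard enough; being top-tier in all of them at once is astonishing. The paper discussed here is one of his two WWW 2010 papers.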


1.  Terms in the title


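crowdsourcing

    In mainland China the term is translated as “众包”. The following is excerpted from “Crowdsourcing for CAD/CAM?” URL (the link may no longer work)
        “Crowdsourcing (众包) is a term coined in the June 2006 issue of the American magazine Wired (Jeff Howe introduced it in an article) to describe a new business model in which companies use the Internet to hand out work, find ideas, or solve technical problems. Wikipedia's definition (added by Howe himself) is: crowdsourcing is the practice of a company or institution taking work once performed by employees and outsourcing it, on a free and voluntary basis, to an unspecified (and usually large) network of people. Crowdsourced tasks are usually undertaken by individuals, but tasks that require collaboration among many people may also take the form of open-source-style peer production. Jonathan Corney (author of “3D Modeling with ACIS”) also cites this definition in the article and comments on it.”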

    Howe's original words: “Simply defined, crowdsourcing represents the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call. This can take the form of peer-production (when the job is performed collaboratively), but is also often undertaken by sole individuals. The crucial prerequisite is the use of the open call format and the large network of potential laborers.”

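    In Taiwan some translate it as “群众外包” (“crowd outsourcing”); for details see: 群众外包(Crowdsourcing)浪潮的兴起(一) – 专家的新挑战 (“The Rise of the Crowdsourcing Wave (1): A New Challenge for Experts”)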

cheap talk

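    A quick web search shows that “cheap talk” is a fairly academic topic belonging to game theory; “cheap talk game” has been translated into Chinese as “空谈博弈”. Wikipedia explains cheap talk as follows: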
    “In game theory, cheap talk is communication between players which does not directly affect the payoffs of the game. This is in contrast to signaling in which sending certain messages may be costly for the sender depending on the state of the world. The classic example is of an expert (say, ecological) trying to explain the state of the world to an uninformed decision maker (say, politician voting on a deforestation bill). The decision maker, after hearing the report from the expert, must then make a decision which affects the payoffs of both players.”
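    The definition quoted above can be made concrete with a minimal sketch. Everything in it is a hypothetical illustration rather than anything from the paper: the two states, the two actions, the payoff numbers, and the assumption that the expert always prefers protection are invented. The point is only that the payoff function takes no message argument, so a message can matter solely through the action it induces.

```python
# Hypothetical two-state game; labels and payoff numbers are illustrative only.
BEST_ACTION = {"endangered": "protect", "healthy": "log"}


def payoffs(state, action):
    """Return (sender_payoff, receiver_payoff).

    There is deliberately no message parameter: in cheap talk, what was
    said never enters the payoffs directly.
    """
    sender = 1 if action == "protect" else 0  # assumed biased expert
    receiver = 1 if action == BEST_ACTION[state] else 0
    return sender, receiver


def play(state, message, receiver_strategy):
    """The message matters only through the action the receiver picks."""
    action = receiver_strategy(message)
    return payoffs(state, action)


def trusting(message):
    """Receiver who believes the report and acts on it."""
    return BEST_ACTION[message]


def ignoring(message):
    """Receiver who disregards the report and always logs."""
    return "log"


truthful = play("healthy", "healthy", trusting)
lie = play("healthy", "endangered", trusting)
ignored_truth = play("healthy", "healthy", ignoring)
ignored_lie = play("healthy", "endangered", ignoring)
```

    With a trusting receiver the lie changes the outcome, but only because it changes the chosen action; with a receiver who ignores the report, the two messages give identical payoffs.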

2.  (S4) Hypotheses proposed in the paper and their results

Hypothesis | Claim (for higher rated members) | Verification result


(a) with more inherent skills & abilities 
-> deliver better solutions
(b) inherently care more about their ratings
-> consistently put more effort into the competition to keep the status high


with more accrued experience
-> deliver better solutions


the rating is “addictive”: members who achieve a high rating today tend to contribute more in the future to keep their status high (verification result: F)


experience less competition in the project choice phase
-> can afford to choose easier, better paying or less competitive projects and deliver higher scores


expect fiercer competition from components 
-> have to deliver better solutions in order to win

Note: the first five components were excluded when testing Hypothesis III.

3. Method

(1) A crawler was used to collect data from the TopCoder website, covering 09/02/2003 to 08/23/2009.

(2) The paper proposes a formula relating the individual factors to the final score (see S4 for the meaning of each factor):



Based on the collected dataset, the paper applies estimation methods such as Ordinary Least Squares (OLS) and the Generalized Method of Moments (GMM).
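As a reminder of what OLS does, here is a minimal single-regressor sketch in pure Python. It is not the paper's specification, which involves several factors: the function name `ols_fit` and the (rating, score) pairs are made up for illustration only.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation; the slope is not identified")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up (rating, score) pairs, NOT data from the paper.
ratings = [900, 1100, 1300, 1500, 1700]
scores = [78.0, 82.0, 85.0, 90.0, 95.0]
intercept, slope = ols_fit(ratings, scores)
```

On this toy data the fitted slope is 0.021 score points per rating point, with an intercept of 58.7. A real analysis would use a multiple-regression routine from a statistics library and report standard errors as well.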
Other concepts involved include first-order stochastic dominance (FSD), the Mann-Whitney-Wilcoxon stochastic dominance test, the t statistic, and the Bernoulli (binomial probability) test.
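To make two of these concepts concrete, the sketch below checks first-order stochastic dominance between two samples via their empirical CDFs and computes the Mann-Whitney U statistic by direct pair counting. The function names and both score lists are assumptions made up for illustration, and the code stops at the statistic itself (no p-value), so it is not the paper's test procedure.

```python
def empirical_cdf(sample, t):
    """Fraction of observations in `sample` that are <= t."""
    return sum(1 for v in sample if v <= t) / len(sample)


def first_order_dominates(a, b):
    """True if sample `a` first-order stochastically dominates sample `b`.

    That is, F_a(t) <= F_b(t) at every observed point, with strict
    inequality at at least one point.
    """
    points = sorted(set(a) | set(b))
    diffs = [empirical_cdf(b, t) - empirical_cdf(a, t) for t in points]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def mann_whitney_u(a, b):
    """U statistic for `a`: pairs (x in a, y in b) with x > y; ties count 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )


# Made-up score samples, NOT data from the paper.
high_rated = [88, 91, 93, 95, 97]
low_rated = [75, 80, 86, 90, 92]

dominates = first_order_dominates(high_rated, low_rated)
u_high = mann_whitney_u(high_rated, low_rated)
u_low = mann_whitney_u(low_rated, high_rated)
```

Here `high_rated` dominates `low_rated`, and 22 of the 25 cross-sample pairs favor the high-rated sample. The two U statistics always sum to the number of pairs, which is a quick sanity check on the counting.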

4. Summary

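The paper is clearly structured: S1 is the overall introduction, S2 the literature review, and S3 introduces TopCoder and the dataset used. S4 and S5 form the main body: S4 opens by stating the hypotheses, and empirical analyses are carried out separately for the Competition and Registration Phases. The paper ends with the conclusion (S6).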

    The paper contains a slip: it has only four tables, yet S5 refers to “Table 5”.

5. Some vocabulary from the paper

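empirical analysis: 实证分析 or 经验分析 (analysis grounded in observed data)
equilibrium payoff: 均衡收益 (a player's payoff in equilibrium)
cumulative distribution function (CDF): 累积分布函数
endogeneity: 内生性, also rendered 内源性 (my own simple understanding: several factors are correlated with one another rather than independent; more precisely, in econometrics an explanatory variable is correlated with the error term)
ceteris paribus: (Latin) all other things being equal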
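
The endogeneity entry above can be illustrated with a small omitted-variable sketch. All numbers are invented and noise-free, and the variable names are hypothetical: `y` is built as `1.0 * x + 2.0 * z`, and regressing `y` on `x` alone recovers the true effect of `x` only when the omitted `z` is uncorrelated with `x`.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of the least-squares line of y on a single regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


TRUE_EFFECT_OF_X = 1.0  # assumed for this toy example
x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Case 1: the omitted factor z moves together with x (endogeneity).
z_correlated = [1.0, 1.0, 2.0, 3.0, 3.0]
y_endogenous = [TRUE_EFFECT_OF_X * xi + 2.0 * zi
                for xi, zi in zip(x, z_correlated)]
biased_slope = ols_slope(x, y_endogenous)

# Case 2: the omitted factor has zero sample covariance with x.
z_unrelated = [2.0, 1.0, 3.0, 1.0, 2.0]
y_exogenous = [TRUE_EFFECT_OF_X * xi + 2.0 * zi
               for xi, zi in zip(x, z_unrelated)]
clean_slope = ols_slope(x, y_exogenous)
```

In the first case the estimated slope is 2.2 rather than 1.0, because the regression attributes part of the effect of `z` to `x`; in the second case the slope is 1.0. This is one reason empirical papers turn to estimators beyond plain OLS when regressors may be endogenous.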

