[Article Excerpts] The Case for Cloud Computing (ITPro, 2009)
Time: 3.5 hours
Robert L. Grossman, "The Case for Cloud Computing," IT Professional, vol. 11, no. 2, pp. 23-27, Mar./Apr. 2009, doi:10.1109/MITP.2009.40
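The author is a heavyweight who is well established in both academia and industry. His publication list includes several papers at the level of SC, SIGKDD, and KDD, and he is also a member of the CCA08 organizing committee (it looks like CCA is worth following). This is yet another magazine-style article about cloud computing. Articles of this kind are generally less interesting than research papers that tackle a specific problem, and after reading many of them they get a bit tiresome. Once I learned about the author's background, though, my admiration for him sparked my interest in this article. After all, it comes from a senior figure who has spent many years immersed in both academia and industry, so it is worth a few hours of quiet reading.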
Research Interests
Data mining, HPC and networking, grid computing etc.
Academic background
University of Illinois at Chicago. Professor, Department of Mathematics, Statistics, & Computer Science, Department of Electrical Engineering and Computer Science, 1995-present.
Cornell University. Visiting Associate Professor, Department of Computer Science 1994-1995. Visiting Scientist, Mathematical Sciences Institute, 1989.
University of California at Berkeley. NSF Postdoctoral Research Fellow. 1984-1988.
Princeton University. 1980-1984. Ph.D. Mathematics, 1985.
Harvard University. 1976-1980. A.B., Mathematics, 1980.
Industry experience
Open Data Group. Founder and Managing Partner, 2001-present.
Magnify, Inc. Founder and Chairman, 1994-2005; CEO, 1994-2001.
Excerpts from the article follow.
1. There is no unified definition of cloud computing yet. The author proposes a workable one:
"clouds, or clusters of distributed computers, provide on-demand resources and services over a network, usually the Internet, with the scale and reliability of a data center."
Compared with Ian Foster's definition of cloud computing, this one is much the same, only more concise.
2. The author divides clouds into two types.
(1) provide computing instances on demand
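For example, Amazon's EC2 service and Eucalyptus (an open-source system that uses the same API as the EC2 cloud; a free version of the software is available for building your own cloud).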
(2) provide computing capacity on demand
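For example, Google's MapReduce and Hadoop (an open-source implementation of MapReduce).
The two types differ in their development models, in how hard they are to standardize (the difficulty of achieving portability and interoperability differs), and in the benchmarks that apply to them.
3. Why the author thinks cloud computing is attracting attention (background)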
(1) Scale
"Some companies that rely on cloud computing have infrastructures that scale over several (or more) data centers."
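I do not fully understand this sentence. My reading is that some companies' existing infrastructure is already large enough to support a cloud computing platform, so the hardware was ready for cloud computing to emerge.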
(2) Simplicity
The emergence of programming models such as MapReduce has made distributed programming simpler.
(3) Pricing
The "pay as you go" pricing model is one factor in the commercial success of cloud computing.
4. MapReduce
"At its core, MapReduce is a style of parallel programming supported by capacity-on-demand clouds."
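The author illustrates MapReduce with an example: "compute an inverted index in parallel for a large collection of Web pages".
Each node stores some of the Web pages.
map phase: each Web page is processed on its local node, producing a list of <wj, pi> pairs (one for each word wj on the page).
shuffle phase: a partition function h(wj) assigns each key to a machine in the cloud for further processing. Each node sends its data to the node that needs it (as determined by the partition function).
sort phase: each node sorts the list of <wj, pi> pairs it holds.
reduce phase: key-value pairs with the same key are merged, yielding the inverted index in the form (wk; p1, p2, …).
The programmer defines the Map and Reduce functions; the system provides the Shuffle and Sort functions.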
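To make the four phases concrete for myself, here is a minimal single-process Python sketch of the inverted-index example. It is not the article's code. The function names (map_page, partition, shuffle, reduce_node, inverted_index), the CRC32-based partition function, and the sample pages are all my own assumptions, and the "nodes" are just in-memory lists rather than real machines.

```python
from zlib import crc32


def map_page(page_id, text):
    """Map: emit one (word, page_id) pair for each distinct word on the page."""
    return [(word, page_id) for word in sorted(set(text.lower().split()))]


def partition(word, num_nodes):
    """Partition function h(w): choose the node responsible for a key.
    CRC32 is an assumed stand-in; any deterministic hash would do."""
    return crc32(word.encode("utf-8")) % num_nodes


def shuffle(mapped_per_node, num_nodes):
    """Shuffle: route every pair to the node picked by the partition function."""
    inboxes = [[] for _ in range(num_nodes)]
    for pairs in mapped_per_node:
        for word, page_id in pairs:
            inboxes[partition(word, num_nodes)].append((word, page_id))
    return inboxes


def reduce_node(sorted_pairs):
    """Reduce: merge pairs that share a key into word -> [page ids]."""
    index = {}
    for word, page_id in sorted_pairs:
        index.setdefault(word, []).append(page_id)
    return index


def inverted_index(pages_per_node):
    """Run map, shuffle, sort, and reduce over simulated nodes."""
    num_nodes = len(pages_per_node)
    # Map phase: each node processes only the pages it stores locally.
    mapped = [
        [pair for page_id, text in pages.items() for pair in map_page(page_id, text)]
        for pages in pages_per_node
    ]
    inboxes = shuffle(mapped, num_nodes)
    result = {}
    for inbox in inboxes:
        inbox.sort()                       # Sort phase, per node
        result.update(reduce_node(inbox))  # Reduce phase, per node
    return result


# Hypothetical sample data: two nodes holding three pages in total.
pages_per_node = [
    {"p1": "cloud computing scales", "p2": "cloud pricing"},
    {"p3": "computing on demand"},
]
index = inverted_index(pages_per_node)
```

In this sketch only map_page and reduce_node correspond to what a programmer would write; partition, shuffle, and the sort call stand in for what the system provides.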
5. Standards & Interoperability
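(1) De facto standards
Amazon's APIs: the de facto standard for clouds that provide on-demand instances.
Hadoop: the most prevalent system that provides on-demand capacity. However, portability and interoperability are harder to achieve for this type of cloud.
(2) Formal (written) standards
No formal standards for clouds have emerged yet, but several organizations are working in this direction, such as the Cloud Computing Interoperability Forum and the Open Cloud Consortium.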
(3) Service-based frameworks for clouds
Thrift: makes it easier for cloud-based applications to access different storage clouds.
(4) Common language
There is ongoing work on languages for MapReduce-style parallel programming.
(5) Create a standard that enables different clouds to interoperate
6. Benchmarks
(1) TeraSort
the most common method for measuring cloud performance.
(2) Cloudstone
benchmark for clouds providing on-demand instances.
(3) MalStone
benchmark for clouds providing on-demand capacity.