Studying Hadoop or MapReduce can be a daunting task if you get your hand dirty at the start.
I followed the schedule as follows :
- Start with very basics of MR with code.google.com/edu/parallel/dsd-tutorial.html code.google.com/edu/parallel/mapreduce-tutorial.html
- Then go for the first two lectures in www.cs.washington.edu/education/courses/cse490h/08au/lectures.htm A very good course intro to MapReduce and Hadoop.
- Read the seminal paper labs.google.com/papers/mapreduce.html and its improvements in the updated version http://www.cs.washington.edu/education/courses/cse490h/08au/readings/communications200801-dl.pdf
- Then go for all the other videos in the U.Washington link given above.
- Try youtubing the terms Map reduce and hadoop to find videos by ORielly and Google RoundTable for good overview of the future of Hadoop and MapReduce
- Then off to the most important videos -
Cloudera Videos
www.cloudera.com/resources/?media=Video
and
Google MiniLecture Series
code.google.com/edu/submissions/mapreduce-minilecture/listing.html
Along with all the Multimedia above we need good written material
Documents:
- Architecture diagrams at hadooper.blogspot.com are good to have on your wall
- Hadoop: The definitive guide goes more into the nuts and bolts of the whole system where as Hadoop in Action is a good read with lots of teaching examples to learn the concepts of hadoop. Pro Hadoop is not for beginners
- pdfs of the documentation from Apache Foundation
hadoop.apache.org/common/docs/current/
and hadoop.apache.org/common/docs/stable/
will help you learn as to how model your problem into a MR solution in order to gain the advantages of Hadoop in total. - HDFS paper by Yahoo! Research is also a good read in order to gain in depth knowledge of hadoop
- Subscribe to the User Mailing List of Commons, MapReduce and HDFS in order to know problems, solutions and future solutions.
- Try the http://developer.yahoo.com/hadoop/tutorial/module1.html link for beginners to expert path to Hadoop
For Any Queries ...
Contact Apache, Google, Bing, Yahoo!
Others:
Your question seems overly broad - To get a resource to use while looking at source code you should narrow your focus of what you want to study. This will make it easier for you (and any on SO) to find papers/topics covering that topic.
I've dug into the Hadoop source a few times. Normally with a very specific class I needed to learn about. In these cases an external resource wasn't really needed, and since I had the class name, I just googled for that and found resources.
If I were to start trying to understand the hadoop source at a higher level I'd get the source code and my copy of Hadoop: The Definitive Guide and use that as a reference to understand the higher level connections of the source code.
I won't claim that this would be a perfect solution. H:TDG is at a more technical level than the other hadoop books I have and I find it to be very informative. H:TDG is what I'd start with and as I found areas I wanted to dig into more, I would start searching for those specifically.
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 记一次.NET内存居高不下排查解决与启示
· 阿里最新开源QwQ-32B,效果媲美deepseek-r1满血版,部署成本又又又降低了!
· 开源Multi-agent AI智能体框架aevatar.ai,欢迎大家贡献代码
· Manus重磅发布:全球首款通用AI代理技术深度解析与实战指南
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· 没有Manus邀请码?试试免邀请码的MGX或者开源的OpenManus吧
2010-12-05 《测试驱动开发》读书笔记