Installing Hadoop 0.20 under Cygwin

Select these packages during Cygwin setup:

Editors: the two vim packages; Base: sed; Net: openssh and openssl; Libs: libintl3 and libintl8
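After setup completes, it can help to confirm that the selected packages actually put their commands on PATH. The sketch below is one way to do that; the function name missing_tools is hypothetical, and the command names in the example call are an assumption about what those packages install.

```shell
# missing_tools: print each named command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example call (command names assumed from the package list above):
# missing_tools vim sed ssh ssh-keygen ssh-host-config openssl
```

No output means every listed command was found.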

Environment variables:

path=%JAVA_HOME%\bin;%JAVA_HOME%\jre6\bin;D:\Program Files\MySQL\MySQL Server 5.5\bin;D:\Program Files\Python2.7;D:\Program Files\MongoDB\bin;D:\Program Files\TortoiseSVN\bin;%MAVEN_HOME%\bin;D:\cygwin64\bin;D:\cygwin64\usr\bin

CYGWIN=ntsec tty

Then run the installation. When it finishes, set up and start the sshd service.

Open a Cygwin terminal and run:

tree@treePC ~
$ ssh-host-config

*** Query: Overwrite existing /etc/ssh_config file? (yes/no) yes
*** Info: Creating default /etc/ssh_config file
*** Query: Overwrite existing /etc/sshd_config file? (yes/no) yes
*** Info: Creating default /etc/sshd_config file
*** Info: Privilege separation is set to yes by default since OpenSSH 3.3.
*** Info: However, this requires a non-privileged account called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no
*** Info: Updating /etc/sshd_config file

*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: [] ntsec
*** Info: On Windows Server 2003, Windows Vista, and above, the
*** Info: SYSTEM account cannot setuid to other users -- a capability
*** Info: sshd requires.  You need to have or to create a privileged
*** Info: account.  This script will help you do so.

*** Info: You appear to be running Windows XP 64bit, Windows 2003 Server,
*** Info: or later.  On these systems, it's not possible to use the LocalSystem
*** Info: account for services that can change the user id without an
*** Info: explicit password (such as passwordless logins [e.g. public key
*** Info: authentication] via sshd).

*** Info: If you want to enable that functionality, it's required to create
*** Info: a new account with special privileges (unless a similar account
*** Info: already exists). This account is then used to run these special
*** Info: servers.

*** Info: Note that creating a new user requires that the current account
*** Info: have Administrator privileges itself.

*** Info: No privileged account could be found.

*** Info: This script plans to use 'cyg_server'.
*** Info: 'cyg_server' will only be used by registered services.
*** Query: Do you want to use a different name? (yes/no) no
*** Query: Create new privileged user account 'cyg_server'? (yes/no) yes
*** Info: Please enter a password for new user cyg_server.  Please be sure
*** Info: that this password matches the password rules given on your system.
*** Info: Entering no password will exit the configuration.
*** Query: Please enter the password:
*** Query: Reenter:

*** Info: User 'cyg_server' has been created with password 'lee**jan'.
*** Info: If you change the password, please remember also to change the
*** Info: password for the installed services which use (or will soon use)
*** Info: the 'cyg_server' account.

*** Info: Also keep in mind that the user 'cyg_server' needs read permissions
*** Info: on all users' relevant files for the services running as 'cyg_server'.
*** Info: In particular, for the sshd server all users' .ssh/authorized_keys
*** Info: files must have appropriate permissions to allow public key
*** Info: authentication. (Re-)running ssh-user-config for each user will set
*** Info: these permissions correctly. [Similar restrictions apply, for
*** Info: instance, for .rhosts files if the rshd server is running, etc].


*** Info: The sshd service has been installed under the 'cyg_server'
*** Info: account.  To start the service now, call `net start sshd' or
*** Info: `cygrunsrv -S sshd'.  Otherwise, it will start automatically
*** Info: after the next reboot.

*** Info: Host configuration finished. Have fun!

 

 

OK, now start sshd, using either command from the output above (net start sshd or cygrunsrv -S sshd):

 

Configure passwordless SSH:

In the terminal, run:

tree@treePC ~/.ssh
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/tree/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/tree/.ssh/id_rsa.
Your public key has been saved in /home/tree/.ssh/id_rsa.pub.
The key fingerprint is:
a0:4c:cd:34:e4:78:6f:c4:38:12:85:eb:d6:77:c9:94 tree@treePC
The key's randomart image is:
+--[ RSA 2048]----+
|    .+=          |
|    .B +         |
|    +.O o  .     |
|   o.+ =  E      |
|   .o.  So .     |
|    o ... +      |
|   .   . .       |
|                 |
|                 |
+-----------------+

 

Add the public key to the authorized keys file:

tree@treePC ~/.ssh
$ pwd
/home/tree/.ssh

tree@treePC ~/.ssh
$ ls
id_rsa  id_rsa.pub

tree@treePC ~/.ssh
$ cp id_rsa.pub authorized_keys

tree@treePC ~/.ssh
$ ls
authorized_keys  id_rsa  id_rsa.pub
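Note that cp replaces any keys already present in authorized_keys. A minimal sketch of an alternative that appends the key only if it is missing and then tightens permissions is shown below. The function name authorize_key is hypothetical, and the 700/600 modes are the values sshd commonly expects, not something the steps above set.

```shell
# authorize_key PUBKEY_FILE SSH_DIR
# Appends the public key to SSH_DIR/authorized_keys unless that exact line
# is already there, then restricts permissions on the directory and file.
authorize_key() {
  pub=$1
  dir=$2
  mkdir -p "$dir" && chmod 700 "$dir" || return 1
  touch "$dir/authorized_keys"
  grep -qxF -- "$(cat "$pub")" "$dir/authorized_keys" \
    || cat "$pub" >> "$dir/authorized_keys"
  chmod 600 "$dir/authorized_keys"
}

# Example call for the key generated above:
# authorize_key ~/.ssh/id_rsa.pub ~/.ssh
```

Running it twice leaves a single copy of the key in the file.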

 

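When that is done, restart Cygwin and run ssh localhost to check that the connection works.

If it reports an error, restart the sshd service, then restart Cygwin and try again.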
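To check specifically that the login works without a password (which the Hadoop start scripts rely on), ssh can be run in batch mode so that it fails instead of prompting. This is only a sketch: the function name and the SSH_CMD override are hypothetical, the latter added so the check can be exercised without a running sshd. On a fresh setup, run a plain ssh localhost once first to accept the host key, since batch mode cannot answer that prompt.

```shell
# can_ssh_without_password HOST
# BatchMode=yes makes ssh fail rather than ask for a password or passphrase.
# SSH_CMD is a hypothetical override; by default the real ssh is used.
can_ssh_without_password() {
  "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example call:
# can_ssh_without_password localhost && echo "passwordless login OK"
```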

 

Create a symbolic link to the JDK on the C: drive:

tree@treePC ~
$ ln -s "/cygdrive/c/Program Files/Java/jdk1.6.0_43" "/usr/local/jdk"
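The link gives Hadoop a JDK path without the space in "Program Files", which is presumably why it is used here. A quick sanity check that the link resolves to a usable JDK might look like the sketch below; the function name jdk_link_ok is hypothetical.

```shell
# jdk_link_ok LINK
# Succeeds if LINK is a symlink whose target contains an executable
# bin/java (bin/java.exe is also accepted, since a Windows JDK ships java.exe).
jdk_link_ok() {
  [ -L "$1" ] && { [ -x "$1/bin/java" ] || [ -x "$1/bin/java.exe" ]; }
}

# Example call for the link created above:
# jdk_link_ok /usr/local/jdk && echo "JDK link OK"
```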

 

Configuration files:

In hadoop-env.sh, point JAVA_HOME at the JDK:

# The java implementation to use.  Required.
export JAVA_HOME=/usr/local/jdk

 

core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>  
        <name>hadoop.tmp.dir</name>  
        <value>/hadoop-0.20.2/temp</value>  
    </property>  
    <property>  
        <name>fs.default.name</name>  
        <value>hdfs://localhost:9000</value>  
    </property>    
</configuration>

 

hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>  
        <name>dfs.name.dir</name>  
        <value>/hadoop-0.20.2/hdfs/name</value>  
    </property>  
    <property>  
        <name>dfs.data.dir</name>  
        <value>/hadoop-0.20.2/hdfs/data</value>  
    </property>  
    <property>  
        <name>dfs.replication</name>  
        <value>1</value>  
    </property>  
</configuration>

 

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>  
        <name>mapred.job.tracker</name>  
        <value>localhost:9001</value>  
    </property>  
    <property>  
        <name>mapred.local.dir</name>  
        <value>/hadoop-0.20.2/mapred</value>  
    </property>  
</configuration>
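Before formatting, it can be useful to read a value back out of the three site files to catch typos in property names. The sketch below is a simple line-based reader, not a real XML parser: it assumes each <name> and <value> element sits on one line, as in the files above. The function name get_prop and the conf/ path in the example call are assumptions.

```shell
# get_prop FILE NAME
# Prints the <value> of the property whose <name> equals NAME.
get_prop() {
  awk -v want="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == want) print v }
  ' "$1"
}

# Example call (path assumed; adjust to where the site files live):
# get_prop /hadoop-0.20.2/conf/core-site.xml fs.default.name
```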

 

After configuring, format the NameNode:

tree@treePC /hadoop-0.20.2/bin
$ ./hadoop namenode -format
14/02/28 19:18:42 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = treePC/192.168.0.157
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
14/02/28 19:18:42 INFO namenode.FSNamesystem: fsOwner=treepc\tree,None,root,Administrators,Users,HomeUsers,ora_dba
14/02/28 19:18:42 INFO namenode.FSNamesystem: supergroup=supergroup
14/02/28 19:18:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/02/28 19:18:42 INFO common.Storage: Image file of size 101 saved in 0 seconds.
14/02/28 19:18:43 INFO common.Storage: Storage directory \hadoop-0.20.2\hdfs\name has been successfully formatted.
14/02/28 19:18:43 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at treePC/192.168.0.157
************************************************************/
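Besides the "successfully formatted" log line, the result can be checked on disk. The sketch below assumes that formatting writes a current/VERSION file under the dfs.name.dir directory, which is the layout this Hadoop version is understood to use; the function name is hypothetical.

```shell
# name_dir_formatted DIR
# Succeeds if DIR looks like a formatted NameNode storage directory.
name_dir_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Example call, using the dfs.name.dir value from hdfs-site.xml:
# name_dir_formatted /hadoop-0.20.2/hdfs/name && echo "NameNode is formatted"
```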

 

Start Hadoop:

tree@treePC /hadoop-0.20.2/bin
$ ./start-all.sh
starting namenode, logging to /hadoop-0.20.2/bin/../logs/hadoop-tree-namenode-treePC.out
localhost: starting datanode, logging to /hadoop-0.20.2/bin/../logs/hadoop-tree-datanode-treePC.out
localhost: starting secondarynamenode, logging to /hadoop-0.20.2/bin/../logs/hadoop-tree-secondarynamenode-treePC.out
starting jobtracker, logging to /hadoop-0.20.2/bin/../logs/hadoop-tree-jobtracker-treePC.out
localhost: starting tasktracker, logging to /hadoop-0.20.2/bin/../logs/hadoop-tree-tasktracker-treePC.out
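The start script only reports that each daemon was launched, not that it stayed up. One way to check is to feed the output of the JDK's jps tool into a small filter that lists which of the five daemons named above are absent. This is a sketch; the function name missing_daemons is hypothetical, and it assumes jps prints lines of the form "<pid> <name>".

```shell
# missing_daemons
# Reads jps-style lines ("<pid> <name>") on stdin and prints every
# expected Hadoop daemon that does not appear in them.
missing_daemons() {
  awk '
    { seen[$2] = 1 }
    END {
      n = split("NameNode DataNode SecondaryNameNode JobTracker TaskTracker", want, " ")
      for (i = 1; i <= n; i++) if (!(want[i] in seen)) print want[i]
    }'
}

# Example call:
# jps | missing_daemons
```

No output means all five daemons are running. If one is listed, its log file under the logs directory is the place to look.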

 

Run the pi estimation demo:

tree@treePC /hadoop-0.20.2/bin
$ ./hadoop jar ../hadoop-0.20.2-examples.jar pi 2 10
Number of Maps  = 2
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Starting Job
14/02/28 19:20:16 INFO mapred.FileInputFormat: Total input paths to process : 2
14/02/28 19:20:17 INFO mapred.JobClient: Running job: job_201402281919_0001
14/02/28 19:20:18 INFO mapred.JobClient:  map 0% reduce 0%
14/02/28 19:20:26 INFO mapred.JobClient:  map 100% reduce 0%
14/02/28 19:20:38 INFO mapred.JobClient:  map 100% reduce 100%
14/02/28 19:20:40 INFO mapred.JobClient: Job complete: job_201402281919_0001
14/02/28 19:20:40 INFO mapred.JobClient: Counters: 18
14/02/28 19:20:40 INFO mapred.JobClient:   Job Counters
14/02/28 19:20:40 INFO mapred.JobClient:     Launched reduce tasks=1
14/02/28 19:20:40 INFO mapred.JobClient:     Launched map tasks=2
14/02/28 19:20:40 INFO mapred.JobClient:     Data-local map tasks=2
14/02/28 19:20:40 INFO mapred.JobClient:   FileSystemCounters
14/02/28 19:20:40 INFO mapred.JobClient:     FILE_BYTES_READ=131
14/02/28 19:20:40 INFO mapred.JobClient:     HDFS_BYTES_READ=236
14/02/28 19:20:40 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=251
14/02/28 19:20:40 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
14/02/28 19:20:40 INFO mapred.JobClient:   Map-Reduce Framework
14/02/28 19:20:40 INFO mapred.JobClient:     Reduce input groups=4
14/02/28 19:20:40 INFO mapred.JobClient:     Combine output records=0
14/02/28 19:20:40 INFO mapred.JobClient:     Map input records=2
14/02/28 19:20:40 INFO mapred.JobClient:     Reduce shuffle bytes=56
14/02/28 19:20:40 INFO mapred.JobClient:     Reduce output records=0
14/02/28 19:20:40 INFO mapred.JobClient:     Spilled Records=8
14/02/28 19:20:40 INFO mapred.JobClient:     Map output bytes=36
14/02/28 19:20:40 INFO mapred.JobClient:     Map input bytes=48
14/02/28 19:20:40 INFO mapred.JobClient:     Combine input records=0
14/02/28 19:20:40 INFO mapred.JobClient:     Map output records=4
14/02/28 19:20:40 INFO mapred.JobClient:     Reduce input records=4
Job Finished in 24.927 seconds
Estimated value of Pi is 3.80000000000000000000
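The estimate of 3.8 is far from pi only because the run used 2 x 10 = 20 sample points. Assuming the example computes 4 x (points inside the circle) / (total points), the usual form for this kind of estimator, a result of 3.8 corresponds to 19 of the 20 points falling inside. The arithmetic can be reproduced as follows; pi_estimate is a hypothetical helper, not part of Hadoop.

```shell
# pi_estimate INSIDE TOTAL
# Prints 4 * INSIDE / TOTAL, the assumed formula behind the demo's result.
pi_estimate() {
  awk -v inside="$1" -v total="$2" 'BEGIN { print 4 * inside / total }'
}

pi_estimate 19 20   # prints 3.8
```

Passing larger values for the two arguments after pi (number of maps and samples per map) should bring the estimate closer to 3.14159.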

Check the web interfaces in a browser:

MapReduce:

http://localhost:50030

 

 

HDFS:

http://localhost:50070

 

 

OK, if the job runs successfully, the installation is working.

posted @ 2014-02-28 20:51  大树的博客