Setting up the test environment today:
First, Hadoop and Hive. I tried to scp them over from the production machines, but found the test box cannot connect to production. Switched to sz/rz; rz errored out. A colleague pointed out that production *can* scp to the test box, so I pushed the files from that side instead. Done.
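The workaround, sketched with placeholder hostnames and paths (run on the production side — the test machine cannot initiate the connection, but production can):

```shell
# Run ON the production host. Hostnames and paths are placeholders.
# Push instead of pull: test -> production is blocked,
# but production -> test works.
scp hadoop.tar.gz hive.tar.gz user@test-host:/home/user/pkgs/
```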
Then, configuring the environment:
Hadoop: not much to say. Set JAVA_HOME in hadoop-env.sh,
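For reference, the only line that usually needs touching in hadoop-env.sh is the JDK path (the path below is a placeholder — use wherever your JDK actually lives):

```shell
# hadoop-env.sh -- point Hadoop at the JDK (path is a placeholder)
export JAVA_HOME=/usr/local/jdk1.6.0
```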
then edit hadoop-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- NEEDED TO CHANGE -->
  <property>
    <name>hadoop.job.ugi</name>
    <value>user,pwd</value>
    <description>username, password used by client</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://xxx:1234</value>
    <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>domain:2468</value>
    <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
  </property>
</configuration>
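With hadoop-site.xml in place, a quick sanity check that the client actually reaches the test cluster (commands from this generation of Hadoop; output depends on the cluster, so this is just a sketch):

```shell
# List the HDFS root to confirm fs.default.name is reachable.
hadoop fs -ls /
# Confirm the JobTracker address is picked up.
hadoop job -list
```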
Then set up the Hive environment (hive-site.xml):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/xxx/xxx</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <!--value>jdbc:mysql://tc-nslog-cube01.tc.baidu.com:3308/inf_udw_hive</value-->
    <value>jdbc:mysql://ip:port/db</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>user</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>1234</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive-${user.name}</value>
    <description>Scratch space for Hive jobs</description>
  </property>
  <property>
    <name>datanucleus.fixedDatastore</name>
    <value>true</value>
  </property>
</configuration>
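A quick smoke test that Hive can reach the MySQL metastore (assuming the MySQL JDBC driver jar has been dropped into Hive's lib directory — an error here usually means the JDBC URL or credentials above are wrong):

```shell
# Any metastore-touching statement will do; SHOW TABLES is the cheapest.
hive -e 'SHOW TABLES;'
```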
The most troublesome part was installing the MySQL server that holds Hive's metadata. There was no ready-made package, so I had to fetch one myself. First I checked the production MySQL version (the web lists four ways to do this; I connected with the mysql client and ran the status command). Then I looked for the matching version on www.mysql.com, which only had source. I downloaded it locally, but rz refused to upload it; the fix, found online, was to run rz -be and untick the "upload as ASCII" checkbox in the dialog. Upload succeeded.

Next came ./configure --prefix, make, make install. I'm not sure whether it actually worked: I was never prompted to set a root password, and the mysql client couldn't connect to localhost, so this path failed. (One suggestion online was to run the mysql_install_db bash script in the install root, but I couldn't get it to execute.)

Then I noticed the system had rpm. To identify the Linux release I checked for /etc/redhat-release; the file was there, so I read it to confirm the version, found the matching MySQL rpm, downloaded and installed it. It was missing the dependency perl-DBI-1.40-8.x86_64.rpm; googled it, downloaded it, but installing it errored out, reportedly a permissions problem. sudo doesn't work properly on this RedHat box (presumably never configured), and su needs the root password, which I don't have — that road was blocked too.
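The rpm route above, reconstructed as a sketch (the dependency file name is from my log; the MySQL package name is a placeholder that depends on what status reported, and the rpm steps need root):

```shell
# 1. Find the production MySQL version (run inside the mysql client):
#      mysql> status
# 2. Confirm the distro so the right rpm can be chosen:
cat /etc/redhat-release
# 3. Install the dependency first, then the server package
#    (second file name is a placeholder; both steps need root):
rpm -ivh perl-DBI-1.40-8.x86_64.rpm
rpm -ivh MySQL-server-<version>.x86_64.rpm
```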
Along the way vim started showing garbled characters, fixed by changing the encoding of the current session in SecureCRT.
Then a colleague gave me a link, a magic script appeared — tweak its config, run it, and the install succeeded. I really must dig into that script tomorrow.