Configuring HBase on Ubuntu (English Version) [Big Data Processing Technology]
1. Experimental environment
Component | Version
---|---
OS | Ubuntu 20.04.4 LTS
JDK | 1.8.0_144
Hadoop | 2.7.2
HBase | 1.1.2
2. Install and set up HBase in pseudo-distributed mode.
2.1 Download the HBase installation package (hbase-1.1.2-bin.tar.gz).
2.2 Unzip hbase-1.1.2-bin.tar.gz to the directory /usr/local.
The command is as follows:
sudo tar -zxf hbase-1.1.2-bin.tar.gz -C /usr/local
2.3 Rename the extracted directory hbase-1.1.2 to hbase for ease of use.
The command is as follows:
sudo mv /usr/local/hbase-1.1.2 /usr/local/hbase
2.4 Configure environment variables
Add the bin directory of the HBase installation to the PATH.
Edit the file ~/.bashrc. The command is as follows:
vi ~/.bashrc
Add /usr/local/hbase/bin to the export PATH line
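For example, after editing, the PATH line in ~/.bashrc might look like the following (keep whatever other entries your PATH line already contains):

```shell
# Append the HBase bin directory to the existing PATH
export PATH=$PATH:/usr/local/hbase/bin
```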
After editing, execute the source command to make the above configuration take effect immediately on the current terminal. The command is as follows:
source ~/.bashrc
2.5 Grant the hadoop user ownership of the HBase directory
The commands are as follows:
cd /usr/local
sudo chown -R hadoop ./hbase
2.6 Check the HBase version and confirm that the HBase installation is successful.
The command is as follows:
/usr/local/hbase/bin/hbase version
If the version information is printed, HBase has been installed successfully.
2.7 Configure the pseudo-distributed mode.
2.7.1 configure /usr/local/hbase/conf/hbase-env.sh
The command is as follows:
vi /usr/local/hbase/conf/hbase-env.sh
Set JAVA_HOME to the JDK installation directory, HBASE_CLASSPATH to the Hadoop configuration directory /usr/local/hadoop/conf, and HBASE_MANAGES_ZK to true so that HBase manages its own ZooKeeper instance:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_144
export HBASE_CLASSPATH=/usr/local/hadoop/conf
export HBASE_MANAGES_ZK=true
2.7.2 configure hbase-site.xml
configure /usr/local/hbase/conf/hbase-site.xml
Open hbase-site.xml with vi and add the following properties between the <configuration> tags. The command is as follows:
vi /usr/local/hbase/conf/hbase-site.xml
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
The property hbase.rootdir specifies the storage path of HBase data on HDFS, and hbase.cluster.distributed is set to true so that HBase runs in (pseudo-)distributed mode.
This assumes that the current Hadoop cluster is running in pseudo-distributed mode on the local machine, with the NameNode listening on port 9000.
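The value of hbase.rootdir must match Hadoop's default filesystem URI. In a typical pseudo-distributed setup (an assumption here; your Hadoop configuration may differ), Hadoop's core-site.xml contains a property like:

```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>
```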
2.7.3 Test run HBase
(1) First, log in via SSH. Then switch to the directory /usr/local/hadoop and start Hadoop.
The command is as follows:
ssh localhost
cd /usr/local/hadoop
./sbin/start-dfs.sh
Enter the command jps. If NameNode, DataNode, and SecondaryNameNode are listed, Hadoop has been started successfully.
jps
(2) Switch to the directory /usr/local/hbase and start HBase. The command is as follows:
cd /usr/local/hbase
bin/start-hbase.sh
Enter the shell interface:
bin/hbase shell
3. Use HBase shell commands for the following tasks:
(1) List all the tables in HBase.
Create table student
create 'student', 'Sname', 'Ssex', 'Sage', 'Sdept', 'course'
If you get "ERROR: Can't get master address from ZooKeeper; znode data == null" when creating table student, see the Problems and Solutions section (5.1).
If something is wrong with your table to the extent that you want to delete it, see the Problems and Solutions section (5.2).
List all the tables in HBase.
list
(2) Print the description of a user-specified table.
Insert data into table student
put 'student','95001','Sname','LiYing'
put 'student','95001','course:math','80'
Print the description of table student
describe 'student'
(3) Scan a user-specified table and print the table contents.
Scan table student and print the table contents.
scan 'student'
(4) Add and delete a column family in an existing table.
Add a column family in table student.
describe 'student'
alter 'student', 'score'
Delete a column family in table student.
alter 'student', {NAME=>'score', METHOD => 'delete'}
(5) Add and delete a column in an existing table.
Add a column in table student.
put 'student','95001','course:english','90'
Delete a column in table student.
delete 'student', '95001', 'course:english'
(6) Empty a given table.
Empty table student.
truncate 'student'
(7) Count the number of rows in a table.
Count the number of rows in table student.
count 'student'
4. Use Java API for the following tasks:
4.1 Configure IDEA
4.1.1 New Project in IDEA
4.1.2 Import external jar packages into the project
We need to import all the jar files in the lib directory under the HBase installation directory (/usr/local/hbase/lib).
4.2 List all the tables in HBase.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import java.io.IOException;
public class HbaseList{
public static Configuration configuration;
public static Connection connection;
public static Admin admin;
/**
* Establish the connection
*/
public static void init(){
configuration = HBaseConfiguration.create();
configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
try{
connection = ConnectionFactory.createConnection(configuration);
admin = connection.getAdmin();
}catch (IOException e){
e.printStackTrace();
}
System.out.println("Successfully connected");
}
/**
* Close the connection
*/
public static void close(){
try{
if(admin != null){
admin.close();
}
if(null != connection){
connection.close();
}
System.out.println("Successfully closed\n");
}catch (IOException e){
e.printStackTrace();
}
}
/**
* List existing tables
*/
public static void listTables() throws IOException {
init();
HTableDescriptor[] hTableDescriptors = admin.listTables();
for(HTableDescriptor hTableDescriptor :hTableDescriptors){
System.out.println("Table: "+hTableDescriptor.getNameAsString());
}
System.out.println("Successfully list tables");
close();
}
public static void main(String[] args)throws IOException{
listTables();
}
}
4.3 Print the description of a user-specified table.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import java.io.IOException;
public class HbaseDescription{
public static Configuration configuration;
public static Connection connection;
public static Admin admin;
/**
* Print the description of a user-specified table
*/
public static void getTableDescription(String table_name) throws IOException {
init();
TableName tableName = TableName.valueOf(table_name);
HTableDescriptor tableDescriptor = admin.getTableDescriptor(tableName);
System.out.println(tableDescriptor.toString());
System.out.println("Successfully get tables description");
close();
}
public static void main(String[] args)throws IOException{
getTableDescription("student");
}
/**
* Establish the connection
*/
public static void init(){
configuration = HBaseConfiguration.create();
configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
try{
connection = ConnectionFactory.createConnection(configuration);
admin = connection.getAdmin();
}catch (IOException e){
e.printStackTrace();
}
System.out.println("Successfully connected");
}
/**
* Close the connection
*/
public static void close(){
try{
if(admin != null){
admin.close();
}
if(null != connection){
connection.close();
}
System.out.println("Successfully closed\n");
}catch (IOException e){
e.printStackTrace();
}
}
}
4.4 Scan a user-specified table and print the table contents.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;
public class HbaseScanAndContent{
public static Configuration configuration;
public static Connection connection;
public static Admin admin;
/**
* Scan a user-specified table and print the table contents
*/
public static void getTableScanAndContent(String table_name) throws IOException {
init();
TableName tableName = TableName.valueOf(table_name);
Table table = connection.getTable(tableName);
Scan scan = new Scan();
ResultScanner scanner = table.getScanner(scan);
for (Result result:scanner){
printRecord(result);
}
System.out.println("Successfully scanned the table and printed its contents");
close();
}
public static void printRecord(Result result){
for(Cell cell:result.rawCells()){
System.out.print("row key: "+ Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength()));
System.out.print(" column family: "+ Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength()));
System.out.print(" column: "+ Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength()));
System.out.print(" value: "+ Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
System.out.println(" time stamp: "+cell.getTimestamp());
}
}
public static void main(String[] args)throws IOException{
getTableScanAndContent("student");
}
/**
* Establish the connection
*/
public static void init(){
configuration = HBaseConfiguration.create();
configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
try{
connection = ConnectionFactory.createConnection(configuration);
admin = connection.getAdmin();
}catch (IOException e){
e.printStackTrace();
}
System.out.println("Successfully connected");
}
/**
* Close the connection
*/
public static void close(){
try{
if(admin != null){
admin.close();
}
if(null != connection){
connection.close();
}
System.out.println("Successfully closed\n");
}catch (IOException e){
e.printStackTrace();
}
}
}
4.5 Add and delete a column family in an existing table
4.5.1 Add a column family in table student
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;
public class HbaseAddColumnFamily{
public static Configuration configuration;
public static Connection connection;
public static Admin admin;
/**
* Add a column family to a table
* @param table_name table name
* @param colFamily column family name
* @throws IOException
*/
public static void addColFamily(String table_name, String colFamily) throws IOException {
init();
TableName tableName = TableName.valueOf(table_name);
HColumnDescriptor f = new HColumnDescriptor(colFamily); // the new column family
admin.addColumn(tableName, f);
System.out.println("Add column family successful");
close();
}
public static void main(String[] args)throws IOException{
addColFamily("student", "score");
}
// Establish the connection
public static void init(){
configuration = HBaseConfiguration.create();
configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
try{
connection = ConnectionFactory.createConnection(configuration);
admin = connection.getAdmin();
}catch (IOException e){
e.printStackTrace();
}
}
// Close the connection
public static void close(){
try{
if(admin != null){
admin.close();
}
if(null != connection){
connection.close();
}
}catch (IOException e){
e.printStackTrace();
}
}
}
4.5.2 Delete a column family in table student
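The original write-up includes no code listing for this step. The following is a minimal sketch that mirrors HbaseAddColumnFamily above and uses the HBase 1.1 Admin.deleteColumn API to remove the score column family; the class and method names are my own choices, and running it requires the live HBase/Hadoop setup from section 2.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;

public class HbaseDeleteColumnFamily {
    public static void deleteColFamily(String table_name, String colFamily) throws IOException {
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        // try-with-resources closes the connection and admin automatically
        try (Connection connection = ConnectionFactory.createConnection(configuration);
             Admin admin = connection.getAdmin()) {
            TableName tableName = TableName.valueOf(table_name);
            // deleteColumn removes the whole column family and all data stored in it
            admin.deleteColumn(tableName, Bytes.toBytes(colFamily));
            System.out.println("Delete column family successful");
        }
    }

    public static void main(String[] args) throws IOException {
        deleteColFamily("student", "score");
    }
}
```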
4.6 Add and delete a column in an existing table.
4.6.1 Add a column course:english in table student.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import java.io.IOException;
public class HbaseAddColumn{
public static Configuration configuration;
public static Connection connection;
public static Admin admin;
/**
* Insert data into a specified column of a specified row
* @param tableName table name
* @param rowKey row key
* @param colFamily column family name
* @param col column name (may be empty if the column family has no sub-columns)
* @param val value
*/
public static void insertRow(String tableName,String rowKey,String colFamily,String col,String val) throws IOException {
init();
Table table = connection.getTable(TableName.valueOf(tableName));
Put put = new Put(rowKey.getBytes());
put.addColumn(colFamily.getBytes(), col.getBytes(), val.getBytes());
table.put(put);
System.out.println("Insert Row successful");
table.close();
close();
}
public static void main(String[] args)throws IOException{
insertRow("student", "95001", "course", "english", "88");
}
// Establish the connection
public static void init(){
configuration = HBaseConfiguration.create();
configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
try{
connection = ConnectionFactory.createConnection(configuration);
admin = connection.getAdmin();
}catch (IOException e){
e.printStackTrace();
}
}
// Close the connection
public static void close(){
try{
if(admin != null){
admin.close();
}
if(null != connection){
connection.close();
}
}catch (IOException e){
e.printStackTrace();
}
}
}
4.6.2 Delete a column course:english in table student.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import java.io.IOException;
public class HbaseDeleteColumn{
public static Configuration configuration;
public static Connection connection;
public static Admin admin;
/**
* Delete the data in a specified column
* @param tableName table name
* @param rowKey row key
* @param colFamily column family name
* @param col column name
*/
public static void deleteRow(String tableName,String rowKey,String colFamily,String col) throws IOException {
init();
Table table = connection.getTable(TableName.valueOf(tableName));
Delete delete = new Delete(rowKey.getBytes());
delete.addColumn(colFamily.getBytes(), col.getBytes());
table.delete(delete);
System.out.println("Delete Row successful");
table.close();
close();
}
public static void main(String[] args)throws IOException{
deleteRow("student", "95001", "course", "english");
}
// Establish the connection
public static void init(){
configuration = HBaseConfiguration.create();
configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
try{
connection = ConnectionFactory.createConnection(configuration);
admin = connection.getAdmin();
}catch (IOException e){
e.printStackTrace();
}
}
// Close the connection
public static void close(){
try{
if(admin != null){
admin.close();
}
if(null != connection){
connection.close();
}
}catch (IOException e){
e.printStackTrace();
}
}
}
4.7 Empty a given table.
If you get a **TableNotDisabledException** when emptying the table, see the Problems and Solutions section (5.3): a table must be disabled before it can be truncated or dropped. In fact, that is exactly the error encountered here.
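The HbaseEmptyTable class referenced in the stack trace in section 5.3 is not listed in this document. The following is a minimal sketch of how it could be written so that the exception is avoided: the table is disabled first, then emptied with the HBase 1.1 Admin.truncateTable API. The class and method names are my own choices, and running it requires the live HBase/Hadoop setup from section 2.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import java.io.IOException;

public class HbaseEmptyTable {
    public static void emptyTable(String table_name) throws IOException {
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        // try-with-resources closes the connection and admin automatically
        try (Connection connection = ConnectionFactory.createConnection(configuration);
             Admin admin = connection.getAdmin()) {
            TableName tableName = TableName.valueOf(table_name);
            // A table must be disabled before truncating or dropping it,
            // otherwise TableNotDisabledException is thrown.
            if (admin.isTableEnabled(tableName)) {
                admin.disableTable(tableName);
            }
            // Delete all rows; true preserves the existing region split points
            admin.truncateTable(tableName, true);
            System.out.println("Empty table successful");
        }
    }

    public static void main(String[] args) throws IOException {
        emptyTable("student");
    }
}
```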
4.8 Count the number of rows in a table.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import java.io.IOException;
public class HbaseCountTableRows{
public static Configuration configuration;
public static Connection connection;
public static Admin admin;
/**
* Count the number of rows in a table
* @param table_name table name
* @throws IOException
*/
public static void countTableRows(String table_name) throws IOException {
init();
Table table = connection.getTable(TableName.valueOf(table_name));
Scan scan = new Scan();
ResultScanner scanner = table.getScanner(scan);
int num = 0;
for (Result result = scanner.next(); result != null; result = scanner.next()) {
num++;
}
System.out.println("row numbers:" + num);
scanner.close();
close();
}
public static void main(String[] args)throws IOException{
countTableRows("student");
}
// Establish the connection
public static void init(){
configuration = HBaseConfiguration.create();
configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
try{
connection = ConnectionFactory.createConnection(configuration);
admin = connection.getAdmin();
}catch (IOException e){
e.printStackTrace();
}
}
// Close the connection
public static void close(){
try{
if(admin != null){
admin.close();
}
if(null != connection){
connection.close();
}
}catch (IOException e){
e.printStackTrace();
}
}
}
5. Problems and Solutions
5.1 HBase shell ERROR: Can't get master address from ZooKeeper; znode data == null
Exit the HBase shell and check the running Java processes:
exit
jps
If HMaster and HRegionServer are missing from the jps output, restart Hadoop and HBase using the following commands:
stop-hbase.sh
stop-all.sh
cd /usr/local/hadoop
./sbin/start-dfs.sh
cd /usr/local/hbase
bin/start-hbase.sh
Run jps again; once HMaster and HRegionServer appear in the output, the problem has been solved.
5.2 HBase shell Delete a table
First, you need to disable the table by executing the disable command followed by the table name:
disable 'table_name'
Replace 'table_name' with the name of the table you want to delete.
Next, you can proceed to drop the table using the drop command:
drop 'table_name'
Again, replace 'table_name' with the name of the table you wish to delete.
Please note that before performing the deletion, ensure that you have backed up any important table data, as the deletion is irreversible and all data associated with the table will be permanently removed.
5.3 TableNotDisabledException error
Specific error message:
Exception in thread "main" org.apache.hadoop.hbase.TableNotDisabledException: student
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.util.ForeignExceptionUtil.toIOException(ForeignExceptionUtil.java:45)
at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.convertResult(HBaseAdmin.java:4206)
at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.waitProcedureResult(HBaseAdmin.java:4164)
at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.get(HBaseAdmin.java:4098)
at org.apache.hadoop.hbase.client.HBaseAdmin.deleteTable(HBaseAdmin.java:777)
at HbaseEmptyTable.emptyTable(HbaseEmptyTable.java:21)
at HbaseEmptyTable.main(HbaseEmptyTable.java:30)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableNotDisabledException): student
at org.apache.hadoop.hbase.master.HMaster.checkTableModifiable(HMaster.java:1976)
at org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.prepareDelete(DeleteTableProcedure.java:256)
at org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.executeFromState(DeleteTableProcedure.java:101)
at org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.executeFromState(DeleteTableProcedure.java:59)
at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:107)
at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:400)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:869)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:673)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:626)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:70)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.run(ProcedureExecutor.java:413)
Process finished with exit code 1
Here's how you can handle the error:
- Check if the table is enabled or disabled: Before deleting a table, ensure that it is disabled. You can use the HBase shell command is_disabled 'table_name' to check the status of the table. If it returns true, it means the table is already disabled. If it returns false, you need to disable it before proceeding.
- Disable the table: If the table is enabled, you need to disable it using the HBase shell command disable 'table_name'. In your case, execute disable 'student' to disable the "student" table.
- Retry the deletion: Once the table is disabled, you can retry deleting it using the drop command: drop 'table_name'. In this case, execute drop 'student' to delete the "student" table.
By following these steps, you should be able to delete the table without encountering the TableNotDisabledException error.