Apache HBase – Java Client API with Docker HBase

HBase is the Hadoop database: a distributed, scalable, big data store. Use HBase when you need random, real-time read/write access to your Big Data.

I used standalone HBase running inside Docker for this exercise.

The first step is to install Docker if you don't have it already, then follow the steps below to set up HBase in Docker.

  1. Refer to the repository https://github.com/sel-fish/hbase.docker and follow its instructions to install Docker HBase.
  2. I have an Ubuntu VM, so I used my own hostname instead of ‘myhbase’. If you used your actual hostname, you don’t need to update the /etc/hosts file, but do check /etc/hosts and verify it contains an entry like the one below.

    
    <<MACHINE_IP_ADDRESS>> <<HOSTNAME>>
    
    
  3. My docker run command looks like this:
    
    docker run -d -h $(hostname) -p 2181:2181 -p 60000:60000 -p 60010:60010 -p 60020:60020 -p 60030:60030 --name hbase debian-hbase
    
    
  4. Once the container is running, check http://localhost:60010 (Master) and http://localhost:60030 (Region Server).
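Before opening the web UIs, you can sanity-check the published ports from the host. The sketch below is my own small helper (not part of HBase) that probes a TCP port with plain java.net.Socket; the port list matches the docker run command above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if something is listening on host:port within the timeout.
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String... args) {
        // Ports published by the docker run command above.
        int[] ports = {2181, 60000, 60010, 60020, 60030};
        for (int port : ports) {
            System.out.println("localhost:" + port + " open? "
                    + isOpen("localhost", port, 1000));
        }
    }
}
```

If any port reports closed, re-check the -p mappings in the docker run command.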

pom.xml


<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>1.3.0</version>
</dependency>

To access the HBase shell, follow the steps below:


1. Run 'docker exec -it hbase bash' to enter the container
2. Go to the '/opt/hbase/bin/' folder
3. Run './hbase shell' and it will open up the HBase shell

You can use the HBase shell inside the Docker container to perform all the basic operations (create table, list, put, get and scan).


root@HOST-NAME:/opt/hbase/bin# ./hbase shell
2017-02-15 14:55:26,117 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2017-02-15 14:55:27,095 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 1.2.0-cdh5.7.0, r49168a0b3987d5d8b1f1b359417666f477a0618e, Wed Jul 20 23:13:03 EDT 2016

hbase(main):001:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 3.0000 average load

hbase(main):002:0> list
TABLE                                                                                                                                                                                         
customer                                                                                                                                                                                      
1 row(s) in 0.0330 seconds

=> ["customer"]
hbase(main):003:0> create 'user','personal'
0 row(s) in 1.2540 seconds

=> Hbase::Table - user
hbase(main):004:0> list
TABLE                                                                                                                                                                                         
customer                                                                                                                                                                                      
user                                                                                                                                                                                          
2 row(s) in 0.0080 seconds

=> ["customer", "user"]
hbase(main):005:0> list 'user'
TABLE                                                                                                                                                                                         
user                                                                                                                                                                                          
1 row(s) in 0.0090 seconds

=> ["user"]
hbase(main):006:0> put 'user','row1','personal:name','bala'
0 row(s) in 0.1500 seconds

hbase(main):007:0> put 'user','row2','personal:name','chandar'
0 row(s) in 0.0110 seconds

hbase(main):008:0> scan 'user'
ROW                                              COLUMN+CELL                                                                                                                                  
 row1                                            column=personal:name, timestamp=1487170597246, value=bala                                                                                    
 row2                                            column=personal:name, timestamp=1487170608622, value=chandar                                                                                 
2 row(s) in 0.0700 seconds

hbase(main):009:0> get 'user' , 'row2'
COLUMN                                           CELL                                                                                                                                         
 personal:name                                   timestamp=1487170608622, value=chandar                                                                                                       
1 row(s) in 0.0110 seconds
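The shell session above maps directly onto HBase's logical data model: a table is a sorted map of row keys, each row holds column-family:qualifier cells, and every value is a plain byte array. As a mental model only (this is my own toy sketch in standard Java collections, not HBase code), put/get/scan behave roughly like this:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

public class ToyTable {

    // Row key -> ("family:qualifier" -> value); TreeMap keeps rows sorted,
    // mirroring HBase's lexicographic row-key ordering.
    private final TreeMap<String, TreeMap<String, byte[]>> rows = new TreeMap<>();

    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .put(column, value.getBytes(StandardCharsets.UTF_8));
    }

    String get(String row, String column) {
        TreeMap<String, byte[]> cells = rows.get(row);
        byte[] value = (cells == null) ? null : cells.get(column);
        return (value == null) ? null : new String(value, StandardCharsets.UTF_8);
    }

    // Mimics 'scan': walk all rows in key order.
    void scan() {
        for (Map.Entry<String, TreeMap<String, byte[]>> row : rows.entrySet()) {
            System.out.println("row=" + row.getKey() + " cells=" + row.getValue().keySet());
        }
    }

    public static void main(String... args) {
        ToyTable user = new ToyTable();
        user.put("row1", "personal:name", "bala");
        user.put("row2", "personal:name", "chandar");
        user.scan();
        System.out.println(user.get("row2", "personal:name")); // prints "chandar"
    }
}
```

The real thing adds versions (timestamps), distribution across region servers and persistence, but the key-ordered map is the core idea.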



The hbase-site.xml looks like the following. It is available in the Docker container under /opt/hbase/conf.

hbase-site.xml


<configuration>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
  </property>
  <property>
    <name>hbase.master.info.port</name>
    <value>60010</value>
  </property>
  <property>
    <name>hbase.regionserver.port</name>
    <value>60020</value>
  </property>
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>60030</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.localcluster.port.ephemeral</name>
    <value>false</value>
  </property>
</configuration>

Create Table



import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateTable {

    public static void main(String... args) throws Exception {
        System.out.println("Creating Htable starts");
        Configuration config = HBaseConfiguration.create();
        //config.set("hbase.zookeeper.quorum", "HOSTNAME");
        //config.set("hbase.zookeeper.property.clientPort","2181");
        Connection connection = ConnectionFactory.createConnection(config);
        Admin admin = connection.getAdmin();
        TableName tableName = TableName.valueOf("customer");
        if (!admin.tableExists(tableName)) {
            HTableDescriptor htable = new HTableDescriptor(tableName);
            htable.addFamily(new HColumnDescriptor("personal"));
            htable.addFamily(new HColumnDescriptor("address"));
            admin.createTable(htable);
        } else {
            System.out.println("customer table already exists");
        }
        admin.close();
        connection.close();
        System.out.println("Creating Htable Done");
    }
}

List Tables



import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ListTable {

    public static void main(String... args) throws Exception {
        Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
        Admin admin = connection.getAdmin();
        HTableDescriptor[] tableDescriptors = admin.listTables();
        for (HTableDescriptor tableDescriptor : tableDescriptors) {
            System.out.println("Table Name:"+ tableDescriptor.getNameAsString());
        }
        admin.close();
        connection.close();
    }
}


Delete Table



import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

import java.io.IOException;

public class DeleteTable {

    public static void main(String... args) {

        System.out.println("DeleteTable Starts");
        Connection connection = null;
        Admin admin = null;

        try {
            connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
            TableName tableName = TableName.valueOf("customer");
            admin = connection.getAdmin();
            admin.disableTable(tableName);
            admin.deleteTable(tableName);
            if(!admin.tableExists(tableName)){
                System.out.println("Table is deleted");
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (admin != null) admin.close();
                if (connection != null) connection.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        System.out.println("DeleteTable Done");
    }
}

Delete Data



import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteData {

    public static void main(String... args) throws Exception {
        System.out.println("DeleteData starts");
        Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
        TableName tableName = TableName.valueOf("customer");
        Table table = connection.getTable(tableName);
        Delete delete = new Delete(Bytes.toBytes("row1"));
        table.delete(delete);
        Get get = new Get(Bytes.toBytes("row1"));
        Result result = table.get(get);
        System.out.println("result:"+result);
        if (result.value() == null) {
            System.out.println("Delete Data is successful");
        }
        table.close();
        connection.close();
    }

}

To populate the HBase table:


import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PopulateData {

    public static void main(String... args) throws Exception {

        Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());

        TableName tableName = TableName.valueOf("customer");
        Table table = connection.getTable(tableName);

        Put p = new Put(Bytes.toBytes("row1"));
        //Customer table has personal and address column families. So insert data for 'name' column in 'personal' cf
        // and 'city' for 'address' cf
        p.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("name"), Bytes.toBytes("bala"));
        p.addColumn(Bytes.toBytes("address"), Bytes.toBytes("city"), Bytes.toBytes("new york"));
        table.put(p);
        Get get = new Get(Bytes.toBytes("row1"));
        Result result = table.get(get);
        byte[] name = result.getValue(Bytes.toBytes("personal"), Bytes.toBytes("name"));
        byte[] city = result.getValue(Bytes.toBytes("address"), Bytes.toBytes("city"));
        System.out.println("Name: " + Bytes.toString(name) + " City: " + Bytes.toString(city));
        table.close();
        connection.close();
    }
}
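Note that HBase stores every cell value as a raw byte array; Bytes.toBytes(String) and Bytes.toString(byte[]) are UTF-8 conversions. The stdlib-only sketch below shows the equivalent round trip without the hbase-client dependency, as a reminder of what those helpers actually do:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteRoundTrip {

    public static void main(String... args) {
        // Equivalent of Bytes.toBytes("bala"): UTF-8 encode the string.
        byte[] encoded = "bala".getBytes(StandardCharsets.UTF_8);

        // Equivalent of Bytes.toString(encoded): UTF-8 decode back.
        String decoded = new String(encoded, StandardCharsets.UTF_8);

        System.out.println(Arrays.toString(encoded)); // the raw bytes stored in the cell
        System.out.println(decoded);                  // prints "bala"
    }
}
```

This is also why non-String types (longs, ints) need the matching Bytes.toLong/Bytes.toInt on the way out: the cell itself carries no type information.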

To scan the table:


import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class ScanTable {

    public static void main(String... args) {
        Connection connection = null;
        ResultScanner scanner = null;
        try {
            connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
            TableName tableName = TableName.valueOf("customer");
            Table table = connection.getTable(tableName);
            Scan scan = new Scan();
            // Scanning the required columns
            scan.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("name"));

            scanner = table.getScanner(scan);

            // Reading values from scan result
            for (Result result = scanner.next(); result != null; result = scanner.next())
                System.out.println("Found row : " + result);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (scanner != null) scanner.close();
            if (connection != null) try {
                connection.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}