Saturday, July 25, 2015

Installing HBase in pseudo-distributed mode

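In this post, we will install HBase in pseudo-distributed mode on Ubuntu. In a previous post, we set up Hadoop (HDFS) in pseudo-distributed mode. We first run HBase in standalone mode to verify the installation, then switch it to pseudo-distributed mode backed by HDFS.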

To get started, first let's get the latest version of HBase from http://hbase.apache.org/


Next we download the HBase distribution to our local file system and extract the contents.

I downloaded HBase 1.1.0.1 from the Apache mirror and saved it to a local folder. To extract the tar file, use the following command:

$ tar -xzvf hbase-1.1.0.1-bin.tar.gz


After extracting the contents, we need to make some configuration changes. We start by opening the following file.

$ cd hbase-1.1.0.1
$ cd conf
$ sudo gedit hbase-site.xml

You will see a nearly empty file with an empty configuration XML node. Enter the following configuration for the HBase root directory and the ZooKeeper data directory.

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/arthgallo/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/arthgallo/zookeeper</value>
  </property>
</configuration>

Here is a screenshot of the updated file.
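
If you would rather not use a graphical editor, the same file can be written from the terminal. The sketch below is one way to do it. The function name write_standalone_site and its two arguments are made up for this example, and it overwrites any existing hbase-site.xml in the target folder.

```shell
# Hypothetical helper: write a standalone-mode hbase-site.xml.
# $1 = HBase conf directory, $2 = base directory for the HBase and ZooKeeper data
write_standalone_site() {
  conf_dir="$1"
  data_dir="$2"
  # The heredoc is unquoted so that $data_dir is expanded inside the XML
  cat > "$conf_dir/hbase-site.xml" <<EOF
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$data_dir/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>$data_dir/zookeeper</value>
  </property>
</configuration>
EOF
}

# Example, run from the folder that contains the extracted distribution:
#   write_standalone_site hbase-1.1.0.1/conf "$HOME"
```

Passing an absolute base directory such as /home/arthgallo yields the same file:///home/arthgallo/hbase value as the hand-edited version.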



Next we need to make sure JAVA_HOME is set properly. Open the environment file to set it (adjust the path if HBase lives somewhere other than /usr/local/hbase on your machine).

$ sudo gedit /usr/local/hbase/conf/hbase-env.sh

Uncomment the JAVA_HOME line and point it to the path where the JDK is installed. If you have trouble working out the right value for JAVA_HOME, refer to this post for how to do so.
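
One common way to find a candidate value is to follow the java symlink back to the real installation. This is a rough sketch, not the only approach. The helper name java_home_from_binary is invented here, and it assumes the usual layout where the binary sits at <JAVA_HOME>/bin/java or <JAVA_HOME>/jre/bin/java.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the path of a java binary.
java_home_from_binary() {
  java_bin="$1"
  home="${java_bin%/bin/java}"   # strip the trailing /bin/java
  home="${home%/jre}"            # strip a trailing /jre, if present
  printf '%s\n' "$home"
}

# Resolve symlinks (for example /usr/bin/java -> /etc/alternatives/java -> real path)
# and print the candidate, but only if a java binary is on the PATH.
if command -v java >/dev/null 2>&1; then
  java_home_from_binary "$(readlink -f "$(command -v java)")"
fi
```

The printed path is what would go on the export JAVA_HOME= line in hbase-env.sh. Check that the directory really contains bin/java before using it.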

The entry to be made in hbase-env.sh is shown below.



Now, in the terminal, cd to the folder where HBase was extracted and run the start script.

$ cd ~/Work/Servers/HBase/hbase-1.1.0.1
$ cd bin
$ ./start-hbase.sh



Now enter jps to make sure that HBase started correctly.

$ jps

We can see the HMaster process, which tells us that HBase started correctly.
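
The same check can be scripted instead of read by eye. Here is a minimal sketch. The function name missing_processes is an invention for this post, and it relies on jps printing one "<pid> <class name>" pair per line.

```shell
# Hypothetical helper: print every expected JVM process name that is absent.
# $1 = text output of jps, remaining arguments = expected process names
missing_processes() {
  jps_output="$1"
  shift
  for name in "$@"; do
    # Compare the second column exactly so that "Master" does not match "HMaster"
    if ! printf '%s\n' "$jps_output" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      printf '%s\n' "$name"
    fi
  done
}

# Example for standalone mode, where everything runs inside the HMaster JVM:
#   missing_processes "$(jps)" HMaster
# Example for pseudo-distributed mode with HBase managing ZooKeeper:
#   missing_processes "$(jps)" HMaster HRegionServer HQuorumPeer
```

An empty result means every expected process was found.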


Next we will connect to the HBase instance and run a few commands to make sure everything is working correctly.

In the bin folder, enter the following command:

$ ./hbase shell



This will connect to the HBase shell client. Enter the following commands on the hbase shell to make sure everything is working correctly.

> create 'test', 'cf'
> list 'test'
> put 'test', 'row1', 'cf:a', 'value1'
> put 'test', 'row2', 'cf:b', 'value2'
> put 'test', 'row3', 'cf:c', 'value3'
> scan 'test'
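
These commands can also be saved to a file and passed to the HBase shell, which is handy when the check has to be repeated. A small sketch follows. The helper name write_smoke_test and the file location under /tmp are arbitrary choices for this example.

```shell
# Hypothetical helper: write the smoke-test commands to the file given as $1.
write_smoke_test() {
  # The heredoc is quoted so the single quotes and commas are written verbatim
  cat > "$1" <<'EOF'
create 'test', 'cf'
list 'test'
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
scan 'test'
exit
EOF
}

write_smoke_test /tmp/hbase_smoke_test.txt
# Then, from the bin folder:
#   ./hbase shell /tmp/hbase_smoke_test.txt
```

The final exit line makes the shell quit once the script has run. Leave it out to stay at the prompt for the cleanup commands that follow.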


Now that we have verified that everything is working, it is time to clean up and switch to the pseudo-distributed mode.

Enter the following commands at the HBase shell prompt:

> disable 'test'
> drop 'test'
> quit



The next step is to stop HBase.

$ ./stop-hbase.sh


Next we need to edit the HBase configuration file to run it in pseudo-distributed mode.

Open hbase-site.xml again and replace its contents with the following entries:

<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/zookeeper</value>
  </property>
</configuration>


Save the file and close it.
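
A frequent cause of startup failures at this point is an hbase.rootdir that does not match the NameNode address configured in Hadoop. The sketch below reads a value back out of the file so the two can be compared. The function name site_property is hypothetical, and the parsing assumes each <name> line is followed directly by its <value> line, as in the listing above. It is not a general XML parser.

```shell
# Hypothetical helper: print the value of one property from a Hadoop-style XML file.
# $1 = path to the XML file, $2 = property name
site_property() {
  file="$1"
  key="$2"
  awk -v k="<name>$key</name>" '
    index($0, k) { want = 1; next }            # found the requested <name> line
    want && match($0, /<value>.*<\/value>/) {
      # skip the 7-character <value> prefix and drop the 8-character </value> suffix
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$file"
}

# Example (the second command needs a working Hadoop installation):
#   site_property conf/hbase-site.xml hbase.rootdir
#   hdfs getconf -confKey fs.defaultFS
```

The rootdir value should begin with whatever the second command prints. In this setup that is assumed to be hdfs://localhost:9000, matching the Hadoop configuration from the earlier post.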

Make sure the zookeeper folder exists and that the zookeeper and hbase directories are owned by the user that runs HBase (hadoop_admin in this setup).

$ sudo mkdir -p /usr/local/zookeeper
$ sudo chown -R hadoop_admin /usr/local/zookeeper
$ sudo chown -R hadoop_admin /usr/local/hbase



Make sure HDFS (Hadoop Distributed File System) is installed and running. To check whether HDFS is running, enter the following command in the terminal window:

$ hdfs dfsadmin -report

It will show a report as follows.



Run the start-hbase.sh script from the bin folder again to start HBase.


Once it has started, you should see the following.


Run the jps command to see the running processes.

$ jps



Next, open a new terminal, start the HBase shell again, and run the same commands as before.

> create 'test', 'cf'
> list 'test'
> put 'test', 'row1', 'cf:a', 'value1'
> put 'test', 'row2', 'cf:b', 'value2'
> put 'test', 'row3', 'cf:c', 'value3'
> scan 'test'

The scan should list the three rows again, which confirms that everything executed correctly.



That's it, folks. HBase is now set up in pseudo-distributed mode.

Also, if you have set up Thrift on your machine, you can start the HBase Thrift server with the following command:

$ /usr/local/hbase/bin/hbase-daemon.sh start thrift

