Distributed HBASE & ZooKeeper Installation and Configuration

Hi,

I’m happy to share the output of an interesting lab exercise. In this post, let’s install HBase and ZooKeeper and issue some commands in the HBase shell.

HBase is an open source, non-relational, distributed database modeled after Google’s BigTable and is written in Java. It is developed as part of Apache Software Foundation’s Apache Hadoop project and runs on top of HDFS (Hadoop Distributed Filesystem), providing BigTable-like capabilities for Hadoop.

Apache ZooKeeper is a software project of the Apache Software Foundation. It is essentially a distributed hierarchical key-value store, which is used to provide a distributed configuration service, synchronization service, and naming registry for large distributed systems.[1] ZooKeeper was a sub-project of Hadoop but is now a top-level project in its own right.

HBASE download and configuration

hadoop@gandhari:/opt/hadoop-2.6.4$ wget https://archive.cloudera.com/cdh5/cdh/5/hbase-1.0.0-cdh5.5.1.tar.gz

hadoop@gandhari:/opt/hadoop-2.6.4$ gunzip hbase-1.0.0-cdh5.5.1.tar.gz

hadoop@gandhari:/opt/hadoop-2.6.4$ tar -xvf hbase-1.0.0-cdh5.5.1.tar

hadoop@gandhari:/opt/hadoop-2.6.4$ ln -s hbase-1.0.0-cdh5.5.1/ hbase
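
The gunzip, tar and ln steps above can be folded into one helper, since tar can decompress gzip archives itself with -z. This is only a minimal sketch: it assumes the archive unpacks into a single top-level directory, and the function name install_tarball is made up for this illustration.

```shell
# install_tarball ARCHIVE LINK
# Extracts a .tar.gz archive into the current directory and creates a
# symlink LINK that points at the archive's top-level directory.
# Assumption: the archive contains exactly one top-level directory.
install_tarball() {
    archive=$1
    link=$2
    # First path component of the first entry, e.g. "hbase-1.0.0-cdh5.5.1"
    top=$(tar -tzf "$archive" | awk -F/ 'NR == 1 { print $1 }')
    [ -n "$top" ] || return 1
    tar -xzf "$archive" || return 1
    ln -s "$top" "$link"
}
```

For the download above it would be called as install_tarball hbase-1.0.0-cdh5.5.1.tar.gz hbase.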

hadoop@gandhari:/opt/hadoop-2.6.4$ mkdir /tmp/hbase

hadoop@gandhari:/opt/hadoop-2.6.4$ cd hbase/conf

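Add the following entries to hbase-env.sh. HBASE_MANAGES_ZK=false tells HBase to use the separately installed ZooKeeper instead of starting its own.

export HBASE_MANAGES_ZK=false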
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
export HADOOP_HOME=/opt/hadoop


hadoop@gandhari:/opt/hadoop-2.6.4/hbase/conf$ vi ~/.bashrc

#HBASE VARIABLES
export HBASE_HOME=/opt/hadoop/hbase
export PATH=$PATH:$HBASE_HOME/bin


hadoop@gandhari:/opt/hadoop-2.6.4/hbase/conf$ source ~/.bashrc

ZooKeeper – Download and Configuration

hadoop@gandhari:/opt/hadoop-2.6.4/hbase/conf$ cd $HOME
hadoop@gandhari:~$ pwd
/opt/hadoop

hadoop@gandhari:~$ wget https://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.5.1.tar.gz

hadoop@gandhari:~$ gunzip zookeeper-3.4.5-cdh5.5.1.tar.gz

hadoop@gandhari:~$ tar -xvf zookeeper-3.4.5-cdh5.5.1.tar

hadoop@gandhari:~$ ln -s zookeeper-3.4.5-cdh5.5.1/ zookeeper

hadoop@gandhari:~$ cd zookeeper
hadoop@gandhari:~/zookeeper$ mkdir zookeeper

hadoop@gandhari:~/zookeeper$ cd conf

hadoop@gandhari:~/zookeeper/conf$ cp zoo_sample.cfg zoo.cfg

hadoop@gandhari:~/zookeeper/conf$ vi zoo.cfg

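Add the following entries to zoo.cfg. Use an absolute path for dataDir, because zoo.cfg is a plain properties file and $HOME is not expanded in it.

dataDir=/opt/hadoop/zookeeper/zookeeper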
server.0=gandhari:2888:3888


Port the ZooKeeper configuration to HBase by copying zoo.cfg into the HBase conf directory.

hadoop@gandhari:~/zookeeper/conf$ cp zoo.cfg /opt/hadoop/hbase/conf/

Create a myid file in the dataDir folder of ZooKeeper containing the entry 0. This is the server instance number and it matches the server.0 entry in zoo.cfg.

hadoop@gandhari:~/zookeeper$ echo '0' > /opt/hadoop/zookeeper/zookeeper/myid
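
ZooKeeper expects the number in myid to match one of the server.N entries in zoo.cfg. Here is a small sketch that checks this, assuming dataDir is written as an absolute path; check_myid is a hypothetical helper name, not part of ZooKeeper.

```shell
# check_myid ZOO_CFG
# Reads dataDir from the given zoo.cfg, reads dataDir/myid, and succeeds
# only if zoo.cfg contains a matching "server.<myid>=" line.
check_myid() {
    cfg=$1
    # If dataDir appears more than once, the last definition is used
    data_dir=$(sed -n 's/^dataDir=//p' "$cfg" | tail -n 1)
    [ -n "$data_dir" ] || { echo "no dataDir in $cfg" >&2; return 1; }
    [ -f "$data_dir/myid" ] || { echo "missing $data_dir/myid" >&2; return 1; }
    # Strip the trailing newline and any stray whitespace from the id
    id=$(tr -d '[:space:]' < "$data_dir/myid")
    grep -q "^server\.$id=" "$cfg" || { echo "no server.$id entry in $cfg" >&2; return 1; }
    echo "myid $id matches server.$id"
}
```

With the layout used in this post, the call would be check_myid /opt/hadoop/zookeeper/conf/zoo.cfg.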

hadoop@gandhari:/etc/hadoop/conf$ cd /opt/hadoop/etc/hadoop/

hadoop@gandhari:~/etc/hadoop$ cp core-site.xml /opt/hadoop/hbase/conf/
hadoop@gandhari:~/etc/hadoop$ cp hdfs-site.xml /opt/hadoop/hbase/conf/
hadoop@gandhari:~/etc/hadoop$ cp yarn-site.xml /opt/hadoop/hbase/conf/
hadoop@gandhari:~/etc/hadoop$ cp mapred-site.xml.template /opt/hadoop/hbase/conf/mapred-site.xml

Reconfigure HBASE with ZooKeeper

hadoop@gandhari:~/etc/hadoop$ cd $HOME
hadoop@gandhari:~$ cd hbase/conf/

hadoop@gandhari:~/hbase/conf$ vi hbase-site.xml

<configuration>
 <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed Zookeeper
      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
    </description>
  </property>
 <property>
    <name>hbase.regionserver.hlog.replication</name>
    <value>1</value>
  </property>
 <property>
    <name>hbase.tmp.dir</name>
    <value>/tmp/hbase</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://gandhari:9000/hbase</value>
  </property>
 <property>
  <name>hbase.zookeeper.quorum</name>
  <value>gandhari:2181</value>
 </property>
 <property>
  <name>zookeeper.session.timeout</name>
  <value>15000</value>
 </property>
 <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/hadoop/zookeeper/zookeeper</value>
    <description>Property from ZooKeeper config zoo.cfg.
    The directory where the snapshot is stored.
    </description>
  </property>
</configuration>
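
Since hbase.zookeeper.property.dataDir has to agree with dataDir in zoo.cfg, it can help to read values back out of hbase-site.xml after editing. This is a rough sketch using awk, not a real XML parser. It assumes each name and value element sits on its own line, as in the file above, and get_hbase_prop is a hypothetical helper name.

```shell
# get_hbase_prop FILE NAME
# Prints the <value> that follows <name>NAME</name> in a Hadoop-style
# configuration file. Assumes one <name> or <value> element per line.
get_hbase_prop() {
    awk -v want="$2" '
        /<name>/ {
            line = $0
            sub(/.*<name>/, "", line); sub(/<\/name>.*/, "", line)
            found = (line == want)      # remember whether this is our property
        }
        found && /<value>/ {
            line = $0
            sub(/.*<value>/, "", line); sub(/<\/value>.*/, "", line)
            print line
            exit
        }
    ' "$1"
}
```

For example, get_hbase_prop hbase-site.xml hbase.rootdir should print the HDFS URL configured above.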

Let’s start ZooKeeper

hadoop@gandhari:~/hbase/conf$ cd /opt/hadoop/zookeeper/bin

hadoop@gandhari:~/zookeeper/bin$ ./zkServer.sh start
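
Before starting HBase it is worth confirming that ZooKeeper is actually answering. ZooKeeper 3.4.x replies imok to the four-letter command ruok on its client port. This minimal sketch assumes nc (netcat) is installed and that the client port is the default 2181; zk_is_ok is a made-up function name.

```shell
# zk_is_ok HOST [PORT]
# Sends the "ruok" four-letter command to ZooKeeper and succeeds only
# if the reply is "imok". PORT defaults to 2181.
zk_is_ok() {
    host=$1
    port=${2:-2181}
    reply=$(echo ruok | nc "$host" "$port" 2>/dev/null)
    [ "$reply" = "imok" ]
}
```

A typical call would be zk_is_ok gandhari && echo "ZooKeeper is up". The bundled script offers a similar check with ./zkServer.sh status.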

Now let’s start HBase

hadoop@gandhari:~/zookeeper/bin$ cd /opt/hadoop/hbase/bin/

hadoop@gandhari:~/hbase/bin$ ./hbase-daemon.sh start master

hadoop@gandhari:~/hbase/bin$ ./hbase-daemon.sh start regionserver

Let’s start the HBASE shell

hadoop@gandhari:~/hbase/bin$ ./hbase shell

hbase(main):001:0> status

ERROR: Can't get master address from ZooKeeper; znode data == null

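This error means the HMaster has not registered its address in ZooKeeper. Here it appeared because the Hadoop daemons were not running, so the master could not come up. Make sure all the servers are started and verify with jps.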

$ jps

10693 NodeManager
10229 DataNode
10086 NameNode
11254 HRegionServer
10936 JobHistoryServer
10569 ResourceManager
11131 HMaster
10411 SecondaryNameNode
11356 Jps
11070 QuorumPeerMain
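
The jps listing above can also be checked mechanically. This small sketch reads jps output on standard input and prints any expected daemon that is missing. The list of expected names is an assumption taken from the output above, and missing_daemons is a hypothetical helper name.

```shell
# missing_daemons
# Reads `jps` output on stdin and prints one line per expected daemon
# that is not listed. Prints nothing if all of them are running.
missing_daemons() {
    # Second column of jps output is the main class name
    running=$(awk '{ print $2 }')
    for d in NameNode DataNode QuorumPeerMain HMaster HRegionServer; do
        # -x matches whole lines, so SecondaryNameNode does not count as NameNode
        echo "$running" | grep -qx "$d" || echo "$d"
    done
}
```

It would be used as jps | missing_daemons; an empty result means every daemon in the list was found.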

hbase(main):001:0> status
 1 servers, 0 dead, 2.0000 average load

hbase(main):002:0> status 'simple'
 1 live servers
 gandhari:60020 1472360461080
 requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=26, maxHeapMB=1958, numberOfStores=2, numberOfStorefiles=0, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=8, writeRequestsCount=5, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[]
 0 dead servers
 Aggregate load: 0, regions: 2

hbase(main):003:0> status 'summary'
 1 servers, 0 dead, 2.0000 average load

hbase(main):005:0> status 'detailed'
 version 1.0.0-cdh5.5.1
 0 regionsInTransition
 master coprocessors: []
 1 live servers
 gandhari:60020 1472360461080
 requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=26, maxHeapMB=1958, numberOfStores=2, numberOfStorefiles=0, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=8, writeRequestsCount=5, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, coprocessors=[]
 "hbase:meta,,1"
 numberOfStores=1, numberOfStorefiles=0, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=2, writeRequestsCount=3, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=0.0
 "hbase:namespace,,1472360489768.21310113a36cdc875d33fdac0b6060fd."
 numberOfStores=1, numberOfStorefiles=0, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=6, writeRequestsCount=2, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, completeSequenceId=-1, dataLocality=0.0
 0 dead servers

hbase(main):006:0> list
 TABLE
 0 row(s) in 0.0400 seconds

=> []
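
The same status check can be run from a script by piping the command into hbase shell instead of typing it. This sketch pulls the live and dead server counts out of a summary line like the one shown earlier. It assumes the "N servers, M dead" wording printed by this HBase version, and parse_status is a hypothetical helper name.

```shell
# parse_status
# Reads hbase shell `status` output on stdin and prints
# "live=<N> dead=<M>" taken from the first "N servers, M dead" match.
parse_status() {
    grep -o '[0-9][0-9]* servers, [0-9][0-9]* dead' |
        awk 'NR == 1 { print "live=" $1 " dead=" $3 }'
}
```

A possible invocation is echo "status" | hbase shell 2>/dev/null | parse_status.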
