Saturday, February 16, 2013

Managing Multiple Hadoop Clients

Introduction

I maintain several clusters, most of which are used for testing various versions of Hadoop. I needed a way to automate switching the client-side versions of Java and Hadoop on my laptop.

So I put together a simple system, built on update-alternatives and symbolic links, that handles the switch for me.

Setup

Create a folder for your Hadoop distributions:
mkdir /opt/hadoop
Create a subfolder for each Hadoop distribution (this walkthrough uses hdp1.2):
mkdir /opt/hadoop/hdp1.2
Extract your distribution of Hadoop into its own folder:
tar xvf hadoop-1.1.2.21.tar.gz
mv hadoop-1.1.2.21 /opt/hadoop/hdp1.2/hadoop
In the same subfolder, extract your version of Java:
tar xvf jdk-6u31-linux-x64.tar.gz -C /opt/hadoop/hdp1.2/
Create a symbolic link to the Java folder (the archive above unpacks into jdk1.6.0_31):
cd /opt/hadoop/hdp1.2/
ln -s jdk1.6.0_31 java
Modify JAVA_HOME in hadoop-env.sh:
nano hadoop/conf/hadoop-env.sh

Uncomment:
#export JAVA_HOME=

Change to (replacing <distro> with the subfolder name, e.g. hdp1.2):
export JAVA_HOME=/opt/hadoop/<distro>/java
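
If you would rather not open an editor for every distribution, the JAVA_HOME change can be scripted. The following is only a sketch: set_java_home is a hypothetical helper name, and it assumes GNU sed (for -i) and a hadoop-env.sh that contains a single, possibly commented-out, "export JAVA_HOME=" line.

```shell
# Hypothetical helper: point JAVA_HOME in a hadoop-env.sh at the given path.
# Assumes GNU sed and a line of the form "# export JAVA_HOME=..." (commented or not).
set_java_home() {
    env_file="$1"
    java_home="$2"
    # "|" is used as the sed delimiter because the path contains slashes.
    sed -i "s|^#* *export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$env_file"
}

# Example, using the paths from this walkthrough:
# set_java_home /opt/hadoop/hdp1.2/hadoop/conf/hadoop-env.sh /opt/hadoop/hdp1.2/java
```

The helper rewrites the line whether or not it is still commented out, so it is safe to run again later.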
Modify bin/hadoop:
nano hadoop/bin/hadoop

Comment out the following lines:
#bin=`dirname "$0"`
#bin=`cd "$bin"; pwd`

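Add the following lines. Because /usr/bin/hadoop will be a symbolic link, the script has to resolve the link to find its real install directory: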
bin=`readlink -f "$0"`
if [ -z "$bin" ]; then
bin="$0"
fi
bin=`dirname "$bin"`
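
To see what the replacement lines do, here is a small throwaway demo. It is a sketch only: the script name whereami and the temp directory layout are made up for illustration, and it assumes a readlink that supports -f (GNU coreutils).

```shell
# Demo: how a script finds its own directory when invoked through a symlink.
# Everything lives in a temp directory; the names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/real/bin" "$demo/links"

cat > "$demo/real/bin/whereami" <<'EOF'
#!/bin/sh
# Same logic as the lines added to bin/hadoop
bin=`readlink -f "$0"`
if [ -z "$bin" ]; then
  bin="$0"
fi
bin=`dirname "$bin"`
echo "$bin"
EOF
chmod +x "$demo/real/bin/whereami"

# Call the script through a symlink, the way /usr/bin/hadoop will be called.
ln -s "$demo/real/bin/whereami" "$demo/links/whereami"
resolved=$("$demo/links/whereami")
echo "$resolved"    # the real bin directory, not $demo/links

# Remove "$demo" when you are done experimenting.
```

With the original dirname "$0" logic, the same call would have reported the links directory, and bin/hadoop would have looked for its helper scripts in /usr/bin.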
Add update-alternatives entry
Syntax:
update-alternatives --install <link> <name> <path> <priority>

Command:
sudo update-alternatives --install /usr/bin/hadoop hadoop /opt/hadoop/hdp1.2/hadoop/bin/hadoop 100
Install Hadoop configs
Drop your cluster's configuration files into the distribution's conf directory (hadoop/conf).

Repeat these steps for each additional Hadoop client.
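
Once several distributions are unpacked, registering them one by one gets tedious. The sketch below prints one --install command per distribution it finds so you can review them first. register_cmds is a hypothetical helper, and it assumes every distribution follows the <base>/<distro>/hadoop/bin/hadoop layout used here, with no spaces in the paths.

```shell
# Hypothetical helper: print one update-alternatives command per distribution
# found under a base directory. Nothing is changed; the commands are only printed.
register_cmds() {
    base="$1"
    priority="${2:-100}"
    for launcher in "$base"/*/hadoop/bin/hadoop; do
        [ -e "$launcher" ] || continue   # skip when the glob matches nothing
        echo "update-alternatives --install /usr/bin/hadoop hadoop $launcher $priority"
    done
}

# Review the output, then apply it as root, for example:
# register_cmds /opt/hadoop | sudo sh
```

Every distribution gets the same priority here; pass a second argument if you want a different value.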

 

Folder Structure

After setting everything up, your folder structure could look something like this:
  • /opt/hadoop/
    • cdh3u5/
    • cdh4.1/
    • hdp1.1/
    • hdp1.2/
      • hadoop/
      • java -> jdk1.6.0_31/
      • jdk1.6.0_31/
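
A quick way to confirm that a distribution folder matches this layout is a small check like the one below. It is a sketch, and check_distro is a hypothetical name; it only tests for the launcher script and the java link described above.

```shell
# Hypothetical helper: verify that a distribution folder has the expected layout.
check_distro() {
    d="$1"
    # The launcher that update-alternatives will point at must be executable.
    [ -x "$d/hadoop/bin/hadoop" ] || { echo "missing launcher: $d/hadoop/bin/hadoop"; return 1; }
    # "java" must be a symlink that resolves to an existing directory.
    [ -L "$d/java" ] && [ -d "$d/java" ] || { echo "missing or broken java link: $d/java"; return 1; }
    echo "ok: $d"
}

# Example:
# for d in /opt/hadoop/*/; do check_distro "${d%/}"; done
```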

 

Switching Hadoop Distributions

You can switch between Hadoop distributions by entering the following:

sudo update-alternatives --config hadoop

You'll then be presented with a menu where you can select which version of Hadoop you'd like to use.
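
If you already know which distribution you want, update-alternatives also accepts --set, which skips the menu. The wrapper below is a sketch: use_hadoop, HADOOP_BASE and DRY_RUN are hypothetical names introduced for illustration, and the path layout is the one from this post.

```shell
# Hypothetical wrapper: switch to a named distribution without the menu.
# HADOOP_BASE and DRY_RUN are illustrative variables, not update-alternatives options.
use_hadoop() {
    distro="$1"
    target="${HADOOP_BASE:-/opt/hadoop}/$distro/hadoop/bin/hadoop"
    if [ -n "${DRY_RUN:-}" ]; then
        # Only show what would be run.
        echo "sudo update-alternatives --set hadoop $target"
    else
        sudo update-alternatives --set hadoop "$target"
    fi
}

# Examples:
# DRY_RUN=1 use_hadoop hdp1.2
# use_hadoop cdh4.1
```

Running update-alternatives --display hadoop afterwards shows which alternative is currently selected.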

 

Other Possible Tweaks

  • You could store all your Java distributions in /opt/java and then point each distribution's java symbolic link at the one it needs.
  • You could use slave links within update-alternatives to handle the java symbolic links.
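
The slave-link idea could look something like the sketch below, which prints an --install command that carries a Java link along with the Hadoop one. The link path /opt/hadoop/java and the name hadoop-java are assumptions made up for this example, as is the helper name install_with_java_cmd. I am also assuming that update-alternatives is happy to point a slave link at a directory, so try it on a test machine first.

```shell
# Hypothetical helper: print an --install command that ties a Java slave link
# to the Hadoop master link. /opt/hadoop/java and "hadoop-java" are illustrative.
install_with_java_cmd() {
    distro_dir="$1"          # e.g. /opt/hadoop/hdp1.2
    priority="${2:-100}"
    echo "update-alternatives --install /usr/bin/hadoop hadoop $distro_dir/hadoop/bin/hadoop $priority" \
         "--slave /opt/hadoop/java hadoop-java $distro_dir/java"
}

# Review the command, then run it as root, for example:
# install_with_java_cmd /opt/hadoop/hdp1.2 | sudo sh
```

With an entry like this, a shell-level JAVA_HOME=/opt/hadoop/java would follow whichever Hadoop distribution is selected.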
