Collected thoughts and musings George's Sun Blog

Thursday Oct 09, 2008

In Part 1 of my series on building a virtual Hadoop cluster with OpenSolaris, we built and started a zone.  In Part 2, we will:

  1. Set up ssh keys so that you can log into the zone without using your password
  2. Clone the original zone into node2
  3. Clone the original zone into node3
  4. Set up /etc/hosts
  5. Initialize the known_hosts file
  6. Install 'whoami' in the zones

1. Setting up ssh keys

While conceptually simple, this part always seems tricky.  I'm working on a set of scripts to automate this process, but until then, follow the instructions below to set up your ssh keys.  Setting up ssh keys is important, since Hadoop requires that you can log into other nodes without typing your password.

We start by creating a new ssh key for use with Hadoop:

$ ssh-keygen -t dsa -C 'hadoop_key' -f ~/.ssh/hadoop -N ''

Note that after the -N there are two single quotes right next to each other.

You should see something similar to the following:

gporter@pukapuka:~> ssh-keygen -t dsa -C 'hadoop_key' -f ~/.ssh/hadoop -N ''
Generating public/private dsa key pair.
Your identification has been saved in /export/home/gporter/.ssh/hadoop.
Your public key has been saved in /export/home/gporter/.ssh/hadoop.pub.
The key fingerprint is:
61:b8:61:f8:96:22:d9:30:2e:77:8f:a8:d9:30:14:9d hadoop_key
gporter@pukapuka:~>

Now, let's authorize this key we just created:

$ cat ~/.ssh/hadoop.pub >> ~/.ssh/authorized_keys
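If you run the append command above more than once, you end up with duplicate entries in authorized_keys.  Here is a minimal sketch of a version that is safe to re-run.  The function name 'authorize_key' is made up for illustration, and it assumes a grep that understands the POSIX -q, -x, -F and -f options.  It also restricts the file's permissions, which sshd setups commonly expect.

```shell
# authorize_key PUBKEY_FILE AUTH_FILE
# Hypothetical helper (not from the original post): appends the public key
# to AUTH_FILE only if that exact line is not already present.
authorize_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -x: match whole lines, -F: fixed strings, -f: read the pattern from the key file
    if ! grep -qxFf "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # Keep the file private to its owner
    chmod 600 "$auth"
}
```

You would call it as: authorize_key ~/.ssh/hadoop.pub ~/.ssh/authorized_keys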

We need to propagate this new key to the newly created zone, as well:

$ scp -r ~/.ssh 192.168.3.11:

Type in your password, and your keys will be transferred over.

Now try logging into the zone:

$ ssh -i ~/.ssh/hadoop 192.168.3.11

The '-i ~/.ssh/hadoop' tells ssh to use our 'Hadoop' key when logging into the node.

gporter@pukapuka:~> ssh -i ~/.ssh/hadoop 192.168.3.11
Last login: Thu Oct  9 13:11:53 2008 from 192.168.3.1
Sun Microsystems Inc.   SunOS 5.11      snv_96  November 2008
$

Before we can clone the zone, we have to first halt it:

$ pfexec zoneadm -z node1 halt

2. Cloning node2

$ pfexec zonecfg -z node2 create -t node1

$ pfexec zonecfg -z node2 set zonepath=/export/zones/node2

$ pfexec zonecfg -z node2

This will bring you to the interactive editing mode:

zonecfg:node2> select net address=192.168.3.11

zonecfg:node2:net> set address=192.168.3.12

zonecfg:node2:net> end

zonecfg:node2> commit

zonecfg:node2> exit

We can now clone the newly configured node2 zone:

$ pfexec zoneadm -z node2 clone node1

This should just take a couple of seconds...

Now update the nodename on the newly created zone:

$ pfexec vi /export/zones/node2/root/etc/nodename

And change 'node1' to 'node2'.  Exit out of 'vi' and start up the zone:

$ pfexec zoneadm -z node2 boot

You should be able to log into the newly created zone:

$ ssh -i ~/.ssh/hadoop 192.168.3.12

3. Cloning node3

$ pfexec zonecfg -z node3 create -t node1

$ pfexec zonecfg -z node3 set zonepath=/export/zones/node3

$ pfexec zonecfg -z node3

This will bring you to the interactive editing mode:

zonecfg:node3> select net address=192.168.3.11

zonecfg:node3:net> set address=192.168.3.13

zonecfg:node3:net> end

zonecfg:node3> commit

zonecfg:node3> exit

We can now clone the newly configured node3 zone:

$ pfexec zoneadm -z node3 clone node1

This should just take a couple of seconds...

Now update the nodename on the newly created zone:

$ pfexec vi /export/zones/node3/root/etc/nodename

And change 'node1' to 'node3'.  Exit out of 'vi' and start up the zone:

$ pfexec zoneadm -z node3 boot

You should be able to log into the newly created zone:

$ ssh -i ~/.ssh/hadoop 192.168.3.13
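Since the steps for node2 and node3 differ only in the zone name and IP address, they can be generated from a small function.  The sketch below only prints the commands so you can review them before running anything.  The name 'clone_zone_commands' is hypothetical.  It assumes that zonecfg accepts a quoted, semicolon-separated list of subcommands on the command line (and commits on exit in that mode) in place of the interactive session, and it writes the nodename with 'tee' instead of editing it in 'vi'.

```shell
# clone_zone_commands NAME IP
# Hypothetical helper: prints the commands for cloning one zone from node1.
# Assumes node1 (192.168.3.11) is the source zone and is already halted.
clone_zone_commands() {
    name=$1
    ip=$2
    cat <<EOF
pfexec zonecfg -z $name create -t node1
pfexec zonecfg -z $name set zonepath=/export/zones/$name
pfexec zonecfg -z $name "select net address=192.168.3.11; set address=$ip; end"
pfexec zoneadm -z $name clone node1
echo $name | pfexec tee /export/zones/$name/root/etc/nodename
pfexec zoneadm -z $name boot
EOF
}

# Print the command list for each of the two cloned zones
clone_zone_commands node2 192.168.3.12
clone_zone_commands node3 192.168.3.13
```

Once the printed commands look right, you could pipe the output to 'sh' to run them.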

4. Setting up /etc/hosts

If the zones are not already booted, boot them now:

$ pfexec zoneadm -z node1 boot

$ pfexec zoneadm -z node2 boot

$ pfexec zoneadm -z node3 boot

Use 'vi' to create a host file for use with the zones:

$ pfexec vi /tmp/hosts

Put the following text in the file:

::1     localhost
127.0.0.1 localhost

192.168.3.1 master
192.168.3.11 node1
192.168.3.12 node2
192.168.3.13 node3
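If you plan to add more zones later, the same file can be generated instead of typed.  This is a rough sketch that assumes the addressing used in this series (master at 192.168.3.1, nodeN at 192.168.3.(10+N)); the function name 'make_hosts' is invented for the example.

```shell
# make_hosts COUNT
# Hypothetical generator: prints a hosts file for master plus COUNT nodes.
make_hosts() {
    count=$1
    printf '::1     localhost\n'
    printf '127.0.0.1 localhost\n'
    printf '192.168.3.1 master\n'
    i=1
    while [ "$i" -le "$count" ]; do
        # nodeN gets the address 192.168.3.(10+N)
        printf '192.168.3.%d node%d\n' $((10 + i)) "$i"
        i=$((i + 1))
    done
}

make_hosts 3
```

Redirect the output to write the file: make_hosts 3 > /tmp/hosts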

In order to copy these files to the zones in the proper place, we'll need to copy them over as root.  To enable the zones to accept files from root, log into each zone, and modify the /etc/ssh/sshd_config file:

$ ssh 192.168.3.11

$ su

<type in root password>

node1# vi /etc/ssh/sshd_config

Find the line that reads "PermitRootLogin no", and change the no to a yes.  Save the file, then restart the ssh service and exit:

# svcadm restart ssh

# exit

Repeat these steps for node2 and node3.
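If you would rather not open 'vi' in each zone, the same change can be made with sed.  This is only a sketch: the function name 'enable_root_login' is made up, and it goes through a temporary file because not every sed can edit a file in place.  Allowing root logins over ssh is acceptable on a throwaway test cluster like this one, but think twice before doing it elsewhere.

```shell
# enable_root_login CONFIG_FILE
# Hypothetical helper: changes "PermitRootLogin no" to "PermitRootLogin yes".
enable_root_login() {
    conf=$1
    tmp="$conf.tmp.$$"
    # Write the edited copy first, then overwrite the original's contents
    # with cat so the file keeps its owner and permissions.
    sed 's/^PermitRootLogin[[:space:]][[:space:]]*no[[:space:]]*$/PermitRootLogin yes/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Inside a zone, as root, this would be followed by the same service restart: enable_root_login /etc/ssh/sshd_config && svcadm restart ssh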

We'll now copy the hosts file to each zone:

$ scp /tmp/hosts root@192.168.3.11:/etc

$ scp /tmp/hosts root@192.168.3.12:/etc

$ scp /tmp/hosts root@192.168.3.13:/etc

Now append the following lines to the /etc/hosts file in the global zone:

192.168.3.1 master
192.168.3.11 node1
192.168.3.12 node2
192.168.3.13 node3
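Each entry in a hosts file consists of the address first, followed by one or more names, and it is easy to get the order backwards.  Here is a small sanity check, sketched with awk.  The function name 'check_hosts' is hypothetical, and the address test is deliberately loose: anything with four dotted numbers, or containing a colon, counts as an address.

```shell
# check_hosts FILE
# Hypothetical helper: prints every entry whose first field does not look
# like an IPv4 or IPv6 address, and returns non-zero if any were found.
check_hosts() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $1 !~ /:/ {
            print FILENAME ": line " NR ": " $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Running check_hosts /etc/hosts should print nothing if every entry starts with an address.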

You should now be able to refer to nodes by their friendly name, rather than their IP address:

$ ssh -i ~/.ssh/hadoop node2

5. Initializing 'known_hosts'

One of the last things we have to do before we can install Hadoop is initialize our 'known_hosts' file.  It's a pretty arcane thing to do, but trust me, it will come in handy later.  The problem is that if you are in one of your newly created zones and you try to ssh back out to the global zone, ssh will ask whether it should record the global zone's host key in your 'known_hosts' file.  This is exactly what happens when you start Hadoop (the secondary namenode is started by ssh'ing from one of the local zones), and if that question pops up, everything will just hang.

Luckily it's a pretty easy fix.  Just ssh into each local zone, then from within that zone, ssh back out to the global zone:

$ ssh node1
node1$ ssh master
master$ exit
node1$ exit
$

Do this for node2:

$ ssh node2
node2$ ssh master
master$ exit
node2$ exit
$

and node3:

$ ssh node3
node3$ ssh master
master$ exit
node3$ exit
$
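The three sessions above follow the same pattern, so they can be written as a loop.  The sketch below is an assumption about how you might script it, not something Hadoop requires.  The function name 'prime_known_hosts' and the DRY_RUN switch are invented for the example, and by default it only prints the commands.  When run for real, you still answer 'yes' to the host key question (and type any passwords) once per zone.

```shell
# With DRY_RUN=1 (the default in this sketch) the commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

# prime_known_hosts NODE...
# Hypothetical helper: from each node, ssh back to master and exit right away.
prime_known_hosts() {
    for node in "$@"; do
        # -t allocates a terminal so the inner ssh can ask its yes/no question
        cmd="ssh -t $node ssh master exit"
        if [ "$DRY_RUN" = 1 ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

prime_known_hosts node1 node2 node3
```

Set DRY_RUN=0 before calling the function to actually make the connections.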

6. Setting up 'whoami'

For some reason, the system utility 'whoami' is not installed into the zones by default.  We must do this now.

$ scp /usr/bin/whoami root@node1:/usr/bin

$ scp /usr/bin/whoami root@node2:/usr/bin

$ scp /usr/bin/whoami root@node3:/usr/bin

We should be ready to install Hadoop at this point, which is the subject of Part 3 of this tutorial series.


           