Setting up Storage Server with HDFS
This topic describes a basic procedure for setting up Storage Server with HDFS.
Prerequisites
- Kerberos must be installed on the client and host machines. If it is not installed, install the following packages on UNIX (see https://help.ubuntu.com/10.04/serverguide/kerberos.html for more information):
krb5-kdc
krb5-admin-server
- Open a terminal and type the following:
sudo apt-get install krb5-kdc krb5-admin-server
- Set the realm as WSO2.ORG.
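For reference, a minimal /etc/krb5.conf for this realm might look like the following sketch; the KDC and admin-server hostnames are placeholders for the machine on which you installed the Kerberos packages.
[libdefaults]
    default_realm = WSO2.ORG

[realms]
    WSO2.ORG = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }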
Starting the Storage Server node
- Follow the steps below to create a keytab with the following service principals.
- admin/carbon.super - password: admin
- datanode/carbon.super - password: node0
- If you are starting a data node, add the data node principal as well.
- Create the keytab as follows.
The default carbon.keytab file that comes with the Storage Server pack already contains these service principals. However, if a new datanode needs to be added, its service principal, which must be unique, is added to this keytab and a new carbon.keytab file needs to be created. By default, SS is configured to start one namenode and one datanode. Cache the principal's key using the following command:
ktutil: addent -password -p <your principal> -k 1 -e <encryption algo>
The following is a sample for this:
deep@den:~$ ktutil
ktutil: addent -password -p admin/carbon.super@WSO2.ORG -k 1 -e des-cbc-md5
Password for admin/carbon.super@WSO2.ORG: admin
ktutil: addent -password -p datanode/carbon.super@WSO2.ORG -k 1 -e des-cbc-md5
Password for datanode/carbon.super@WSO2.ORG: datanode
Write a keytab for the service principal using the following command:
ktutil: write_kt <keytab file name>
The following is a sample for this:
ktutil: wkt carbon.keytab
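Optionally, you can verify the entries written to the keytab with klist (assuming the MIT Kerberos client utilities are available on the machine):
klist -kt carbon.keytab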
Copy the created keytab file to [SS_HOME]/repository/conf/etc/hadoop/keytabs/ and rename it to carbon.keytab.
- Start the server with HDFS enabled.
./wso2server.sh -enable.hdfs.startup
- Access the Carbon configuration menu and create a new service principal for the data nodes, with the relevant passwords.
When the namenode starts, go to the Carbon console on the namenode and create a service principal for the datanodes. At minimum, datanode/carbon.super should be added, plus a principal for any other datanode you intend to start.
If your namenode is up and HDFS is set up properly, you will see lines similar to the following in your console:
[2013-11-12 16:57:36,561] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - fsOwner=admin/node0@WSO2.ORG
[2013-11-12 16:57:36,561] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - supergroup=admin
[2013-11-12 16:57:36,561] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - isPermissionEnabled=true
[2013-11-12 16:57:36,565] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - dfs.block.invalidate.limit=100
[2013-11-12 16:57:36,565] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - isAccessTokenEnabled=true accessKeyUpdateInterval=600 min(s), accessTokenLifetime=600 min(s)
[2013-11-12 16:57:36,571] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - Registered FSNamesystemStateMBean and NameNodeMXBean
[2013-11-12 16:57:36,586] INFO {org.apache.hadoop.hdfs.server.namenode.NameNode} - Caching file names occuring more than 10 times
[2013-11-12 16:57:36,593] INFO {org.apache.hadoop.hdfs.server.common.Storage} - Number of files = 1
[2013-11-12 16:57:36,596] INFO {org.apache.hadoop.hdfs.server.common.Storage} - Number of files under construction = 0
[2013-11-12 16:57:36,597] INFO {org.apache.hadoop.hdfs.server.common.Storage} - Image file of size 134 loaded in 0 seconds.
[2013-11-12 16:57:36,597] INFO {org.apache.hadoop.hdfs.server.common.Storage} - Edits file repository/data/hadoop/dfs/name/current/edits of size 30 edits # 2 loaded in 0 seconds.
[2013-11-12 16:57:36,598] INFO {org.apache.hadoop.hdfs.server.common.Storage} - Image file of size 158 saved in 0 seconds.
[2013-11-12 16:57:36,853] INFO {org.apache.hadoop.hdfs.server.common.Storage} - Image file of size 158 saved in 0 seconds.
[2013-11-12 16:57:37,016] INFO {org.apache.hadoop.hdfs.server.namenode.NameCache} - initialized with 0 entries 0 lookups
[2013-11-12 16:57:37,016] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - Finished loading FSImage in 456 msecs
[2013-11-12 16:57:37,023] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - Total number of blocks = 0
[2013-11-12 16:57:37,024] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - Number of invalid blocks = 0
[2013-11-12 16:57:37,024] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - Number of under-replicated blocks = 0
[2013-11-12 16:57:37,024] INFO {org.apache.hadoop.hdfs.server.namenode.FSNamesystem} - Number of over-replicated blocks = 0
If a datanode needs to be started, open another terminal and run the following command:
$ HADOOP_SECURE_DN_USER=<username> sudo -E bin/hadoop datanode
Starting multiple datanodes pointing to one namenode
Change the following property values in the hdfs-site.xml file to point to the namenode:
- dfs.http.address
- dfs.https.port
- dfs.https.address
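As an illustration, the corresponding hdfs-site.xml entries could look like the following sketch; the host name and ports are placeholder values, so substitute the actual namenode host and the ports used in your setup.
<property>
    <name>dfs.http.address</name>
    <value>namenode.example.com:50070</value>
</property>
<property>
    <name>dfs.https.port</name>
    <value>50470</value>
</property>
<property>
    <name>dfs.https.address</name>
    <value>namenode.example.com:50470</value>
</property>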
Change the following properties in the core-site.xml file to point to the namenode:
- fs.default.name
- hadoop.security.group.mapping.service.url
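A sketch of the matching core-site.xml entries is shown below; the filesystem URI and the group-mapping service URL are placeholder values and depend on how the namenode is configured in your deployment.
<property>
    <name>fs.default.name</name>
    <!-- Placeholder: HDFS URI of the namenode -->
    <value>hdfs://namenode.example.com:9000</value>
</property>
<property>
    <name>hadoop.security.group.mapping.service.url</name>
    <!-- Placeholder: URL of the group-mapping service running on the namenode -->
    <value>https://namenode.example.com:9443/services/</value>
</property>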
Change the following in the hdfs-site.xml file to start the datanode:
- dfs.datanode.address
- dfs.datanode.https.address
- dfs.datanode.http.address
- dfs.datanode.ipc.address
- dfs.replication
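For example, a second datanode on the same machine might use hdfs-site.xml entries like the following; all ports below are placeholder values chosen only to avoid clashing with the first datanode's ports.
<property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50011</value>
</property>
<property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:50076</value>
</property>
<property>
    <name>dfs.datanode.https.address</name>
    <value>0.0.0.0:50476</value>
</property>
<property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50021</value>
</property>
<property>
    <name>dfs.replication</name>
    <!-- Placeholder: set to the number of replicas you want across your datanodes -->
    <value>2</value>
</property>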
- Add the datanode IP address and port to the slaves file, one entry per line, as in the sample below.
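A hypothetical slaves file with two datanodes could look like this; the IP addresses and ports are sample values only.
# one datanode entry per line
192.168.1.10:50010
192.168.1.11:50011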
- When starting multiple datanodes on the same machine, make sure you change the PID_DIR and the IDENT_STRING for the datanode in the hadoop-env.sh file.
If you are starting a secure datanode, add the following lines:
# The directory where pid files are stored for secured datanode
export HADOOP_SECURE_DN_PID_DIR=/tmp/2
Alternatively, add the following:
# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER_02