This command notifies the system that the Oracle Java JDK is available for use.
Reload your system-wide PATH from /etc/profile by typing the following command:
. /etc/profile
Test to see if Oracle Java was installed correctly on your system.
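One way to run this test is sketched below, assuming a POSIX shell. The java_status variable is only an illustrative name; the important part is that `java -version` prints the installed version, so a missing or misconfigured JDK shows up immediately.

```shell
# Check whether a java binary is on the PATH and report its version.
# java_status is a hypothetical helper variable used only in this sketch.
if command -v java >/dev/null 2>&1; then
    java -version 2>&1      # java prints its version banner on stderr
    java_status="found"
else
    java_status="missing"
fi
echo "java: $java_status"
```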
Adding a dedicated Hadoop system user.
We will use a dedicated Hadoop user account for running Hadoop. While that is not required, it is recommended because it helps to separate the Hadoop installation from other software applications and user accounts running on the same machine.
a. Adding a group:
sudo addgroup hadoop
b. Creating a user and adding the user to the group:
sudo adduser --ingroup hadoop hduser
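To confirm that both steps took effect, a quick check along these lines can help. It is a sketch that assumes a group named hadoop and a user named hduser (adjust the two variables if you chose other names), and it only reads the system databases, so it needs no sudo.

```shell
# Names are assumptions for this sketch; change them if yours differ.
HADOOP_GROUP="${HADOOP_GROUP:-hadoop}"
HADOOP_USER="${HADOOP_USER:-hduser}"

# getent queries the group database (local files or a directory service).
if getent group "$HADOOP_GROUP" >/dev/null 2>&1; then
    group_status="present"
else
    group_status="absent"
fi

# id -Gn lists the names of every group the user belongs to.
if id -Gn "$HADOOP_USER" 2>/dev/null | tr ' ' '\n' | grep -qx "$HADOOP_GROUP"; then
    member_status="member"
else
    member_status="not-member"
fi

echo "group $HADOOP_GROUP: $group_status"
echo "user $HADOOP_USER: $member_status of $HADOOP_GROUP"
```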
Configuring SSH access:
SSH key-based authentication is required so that the master node can log in to the slave nodes (and the secondary node) to start and stop them, and also to the local machine if you want to use Hadoop on it. For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost for the hduser user we created in the previous section.
Before this step, make sure that SSH is up and running on your machine and that it is configured to allow SSH public key authentication.
Generating an SSH key for the hduser user:
a. Log in as hduser with sudo.
b. Run this key generation command:
ssh-keygen -t rsa -P ""
It will ask for the file name in which to save the key; just press Enter so that it generates the key in ‘/home/hduser/.ssh’.
Enable SSH access to your local machine with this newly created key.
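Enabling access means appending the new public key to the user's authorized_keys file. A minimal sketch follows, assuming the default key name id_rsa and OpenSSH's ssh-keygen. The SSH_DIR variable is introduced here only so the steps can be tried against a different directory; it is not part of the original setup.

```shell
# SSH_DIR is an assumption of this sketch; by default it is the current user's ~/.ssh.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Generate an RSA key with an empty passphrase unless one already exists.
if [ ! -f "$SSH_DIR/id_rsa" ]; then
    ssh-keygen -q -t rsa -P "" -f "$SSH_DIR/id_rsa"
fi

# Append the public key to authorized_keys only if it is not already listed.
touch "$SSH_DIR/authorized_keys"
if ! grep -qxF "$(cat "$SSH_DIR/id_rsa.pub")" "$SSH_DIR/authorized_keys"; then
    cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
fi
chmod 600 "$SSH_DIR/authorized_keys"
```

Afterwards, running `ssh localhost` as hduser should log in without asking for a password; the first connection asks you to confirm the host's fingerprint.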
At this point Hadoop is installed on your node.
Create a folder for tmp:
root@arrakis[~]# mkdir -p $HADOOP_HOME/tmp
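Note that if HADOOP_HOME is unset, the path in that mkdir command expands to /tmp, so the command succeeds without creating anything useful. A small guard, sketched here with a hypothetical helper named make_hadoop_tmp, avoids that:

```shell
# make_hadoop_tmp is a hypothetical helper for this sketch.
make_hadoop_tmp() {
    # Refuse to run when HADOOP_HOME is unset or empty.
    if [ -z "${HADOOP_HOME:-}" ]; then
        echo "HADOOP_HOME is not set" >&2
        return 1
    fi
    mkdir -p "$HADOOP_HOME/tmp"
}

make_hadoop_tmp || echo "tmp folder not created"
```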
Configuration: Multi-node setup
Add the IP addresses of the master and all slaves to /etc/hosts, on both the master and all the slave nodes
Add the association between the hostnames and the IP addresses of the master and the slaves to /etc/hosts on all the nodes. Make sure that all the nodes in the cluster are able to ping each other.
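As an illustration, the entries and the connectivity check could look like the sketch below. The hostnames master, slave1 and slave2 and the 192.0.2.x addresses are placeholders, not values from this setup, and check_nodes is a hypothetical helper. The -W option is assumed to be the reply timeout in seconds, as in the Linux ping.

```shell
# Example /etc/hosts lines (placeholder addresses and names), identical on every node:
#   192.0.2.10  master
#   192.0.2.11  slave1
#   192.0.2.12  slave2

# check_nodes is a hypothetical helper: send one ping to each host given
# as an argument and report whether a reply came back.
check_nodes() {
    for host in "$@"; do
        if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host UNREACHABLE"
        fi
    done
}

# Run on each node with your own hostnames, for example:
#   check_nodes master slave1 slave2
```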