Installation of Apache Hadoop for One Node Cluster, Assignments of Data Warehousing

A step-by-step guide on the installation of Apache Hadoop for a one node cluster on a machine. It includes commands, screenshots, and explanations of the functionality of each file used in the configuration of Apache Hadoop. The steps include installing Java, checking the Java version, setting up SSH keys, and configuring Hadoop-related files.

Typology: Assignments

2019/2020

Uploaded on 09/19/2020

cabinet-shah-zhylnzpttr



Question 2

Hadoop installation

Show the steps and commands used in the installation of Apache Hadoop for a one-node cluster on your machine. Each step must be supported with screenshots of your machine with your name on the terminal. Explain the functionality of each file used in the configuration of Apache Hadoop.

Step 1: If Java is not installed, install it with the command below.
$ sudo apt-get install openjdk-8-jdk

Step 2: Check the Java version by running this command.
$ java -version

Step 3: Run the command below and copy its output. It is the Java installation path, which is needed later when JAVA_HOME is set.
$ readlink -f /usr/bin/java | sed "s:bin/java::"
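Step 3 can be wrapped in a small helper so the result is easy to reuse. The sketch below applies the same sed expression to any path it is given. The function name java_home_from and the sample path are illustrative assumptions, not part of the original steps.

```shell
# Hypothetical helper: strip the trailing "bin/java" from a resolved java path,
# using the same sed expression as Step 3.
java_home_from() {
  printf '%s\n' "$1" | sed "s:bin/java::"
}

# On a real machine the input would come from: readlink -f /usr/bin/java
# The path below is only a sample value for illustration.
java_home_from "/opt/example-jdk/bin/java"
```

On an actual installation, java_home_from "$(readlink -f /usr/bin/java)" should print the same value as the Step 3 command.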

Step 6: Generate an RSA key pair with an empty passphrase.
$ ssh-keygen -t rsa -P ""

Step 7: Append the public key to the list of authorized keys.
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
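Running Step 7 more than once appends the same key again each time. A minimal sketch of a repeat-safe variant follows; the helper name append_key_once is hypothetical, and the chmod 600 line reflects the usual convention that authorized_keys should be writable only by its owner.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present.
append_key_once() {
  pub_file=$1
  auth_file=$2
  key=$(cat "$pub_file") || return 1
  touch "$auth_file"
  # -x matches whole lines, -F treats the key as a fixed string.
  if ! grep -qxF -- "$key" "$auth_file"; then
    printf '%s\n' "$key" >> "$auth_file"
  fi
  # Keep the file readable and writable by the owner only.
  chmod 600 "$auth_file"
}
```

It could be called as append_key_once "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys" in place of the cat command.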

Step 8: Check that SSH to the local machine works.
$ ssh localhost

Step 9: Open the .bashrc file with the command
$ gedit ~/.bashrc
then paste the following lines at the end of the file:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd
export HADOOP_INSTALL=/home/cabinetshah/hadoop-3.1.
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
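The variables from Step 9 only take effect in a new shell or after the file is reloaded. The sketch below is one way to confirm that the Hadoop bin and sbin directories ended up on PATH; both function names are hypothetical.

```shell
# Hypothetical helper: succeed if the given directory is an entry of PATH.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: report the two directories that Step 9 adds to PATH.
check_hadoop_path() {
  for dir in "$1/bin" "$1/sbin"; do
    if on_path "$dir"; then
      echo "on PATH: $dir"
    else
      echo "missing from PATH: $dir"
    fi
  done
}
```

After running source ~/.bashrc, calling check_hadoop_path "$HADOOP_INSTALL" should list both directories as being on PATH.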

Step 12: Run these two commands to create the Hadoop temporary directory and make your user its owner.
$ sudo mkdir -p /home/cabinetshah/hadoop-3.1.3/tmp
$ sudo chown cabinetshah:cabinetshah /home/cabinetshah/hadoop-3.1.3/tmp
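Hadoop needs to write to this directory as the normal user, which is why the ownership is changed after the directory is created with sudo. A small check along these lines can confirm the result; the helper name ensure_writable_dir is an assumption made for illustration.

```shell
# Hypothetical helper: create a directory if needed and confirm
# that the current user can write to it.
ensure_writable_dir() {
  mkdir -p "$1" && [ -d "$1" ] && [ -w "$1" ]
}
```

For example, ensure_writable_dir /home/cabinetshah/hadoop-3.1.3/tmp || echo "not writable" prints a message only when the directory cannot be used.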

Step 13: Go to the hadoop-3.1.3 folder -> etc folder -> hadoop folder, open the core-site.xml file in a text editor, and paste the following lines inside the <configuration> element:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/cabinetshah/hadoop-3.1.3/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>
Save and quit.
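core-site.xml holds settings shared by all Hadoop components: here, the base temporary directory and the URI of the default file system. As a rough sketch, the same file could be generated from the shell instead of being edited by hand. The function name write_core_site and its two parameters are hypothetical. Newer Hadoop releases name the second property fs.defaultFS and treat fs.default.name as a deprecated alias; the sketch keeps the name used in this assignment.

```shell
# Hypothetical helper: write a minimal core-site.xml into the given
# configuration directory, using the two properties from Step 13.
# $1 = configuration directory, $2 = Hadoop temporary directory
write_core_site() {
  conf_dir=$1
  tmp_dir=$2
  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF
}
```

Note that the function overwrites any existing core-site.xml in the target directory, so it is best tried on a scratch directory first.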

Step 16: Go to the hadoop-3.1.3 folder -> etc folder -> hadoop folder, open the hdfs-site.xml file in a text editor, and paste the following lines inside the <configuration> element:
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/cabinetshah/hadoop-3.1.3/hadoop_store/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/cabinetshah/hadoop-3.1.3/hadoop_store/hdfs/datanode</value>
</property>
Save and quit.

Step 17: Run the following command to format the NameNode.
$ hadoop namenode -format
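Formatting wipes the NameNode metadata, so it is normally done only once on a fresh installation. The sketch below shows one possible guard. It assumes that a formatted NameNode directory contains a current/VERSION file, and the function name namenode_is_formatted is hypothetical.

```shell
# Hypothetical helper: succeed if the NameNode directory already looks formatted.
# Assumption: a formatted NameNode directory contains current/VERSION.
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Possible usage with the directory configured in hdfs-site.xml:
#   name_dir=/home/cabinetshah/hadoop-3.1.3/hadoop_store/hdfs/namenode
#   if namenode_is_formatted "$name_dir"; then
#     echo "NameNode already formatted, skipping"
#   else
#     hadoop namenode -format
#   fi
```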