Limit the Storage contributed by Data Node to Name Node (Hadoop Cluster)
Use case: We want to restrict the amount of storage a Data Node contributes to the Name Node.
But why?
Simply so that the remaining storage can be used for other purposes.
So, we will use the Partitioning Concept to achieve our purpose.
Pre-Requisite:
Setup Hadoop Cluster first (Name Node/Master Node and Data Node/Slave Nodes)
Steps to be followed:
- Create and attach a Virtual Hard Disk to the Slave Node.
- Create partition in the created Hard Disk.
- Format the created partition (formatting creates the inode table, which stores the metadata for every file and directory on the partition).
- Mount the created Partition with the Directory which is to be shared with the Name Node.
Create and Attach Virtual Hard Disk
Make sure your Slave Node machine is in the OFF (powered-off) state.
1. Go to the Settings of the Data Node virtual machine.
2. Click the (+) icon in the Storage settings.
3. Click Create.
4. Click Next.
5. Click Next again.
6. Click Next.
7. Click Choose.
8. Now start the Data Node and open a terminal.
9. Check if the newly created Virtual Hard Disk is attached or not using the following command:
fdisk -l
10. Now, Create Partition in the Virtual Hard Disk
But why do we need to create a partition? A raw disk is just unallocated space; partitioning carves it into usable regions, and the standard practice is to format and mount a partition rather than the raw device.
Run the ‘fdisk /dev/sdb’ command and respond to the prompts:
a). Type ‘n’ for a new partition
b). Type ‘p’ for a primary partition (default)
c). Partition number: 1 (default)
d). First sector: 2048 (default; a 1 MiB alignment offset)
e). Last sector: +2G (creates a 2 GiB partition)
f). Type ‘w’ to write the partition table and exit
11. Run “ fdisk -l /dev/sdb ” to check the details of the partition.
12. Now wait for the system to register the newly created partition. udev creates the device node (/dev/sdb1) for it; ‘udevadm settle’ waits until that processing has finished.
Run the following:
udevadm settle
13. Format the partition. But why do we format? Formatting creates an inode table that holds the metadata for every file on the partition. It’s like an index table for the filesystem.
Run the following:
mkfs.ext4 /dev/sdb1
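The formatting step can also be rehearsed safely on a file-backed image (fs_demo.img is a hypothetical name used only for this demo); on the Data Node the target is /dev/sdb1. A sketch that formats the image and then inspects the inode table mkfs just created:

```shell
# fs_demo.img is a hypothetical stand-in for the real partition /dev/sdb1
truncate -s 512M fs_demo.img

# -F: allow formatting a regular file instead of a block device; -q: quiet
mkfs.ext4 -F -q fs_demo.img

# tune2fs -l dumps the superblock metadata, including the inode count
tune2fs -l fs_demo.img | grep -i 'inode count'
```

The "Inode count" line confirms that the inode table described above now exists on the filesystem.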
14. Create a directory under the root filesystem, then mount the newly created partition on it.
Run the following:
mkdir /datanode_partition
mount /dev/sdb1 /datanode_partition
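Note that a mount made this way lasts only until reboot. To remount /datanode_partition automatically at boot, an entry can be added to /etc/fstab; a sketch, assuming the partition is /dev/sdb1 as above:

```
/dev/sdb1  /datanode_partition  ext4  defaults  0  0
```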
15. Now, go to the /etc/hadoop directory and open hdfs-site.xml.
In the value tag of the data-directory property, add the name of the directory which you have mounted on the newly created partition.
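A sketch of the relevant hdfs-site.xml entry, assuming the /datanode_partition directory from this walkthrough; the property is dfs.datanode.data.dir in Hadoop 2.x and later (older Hadoop 1.x releases used dfs.data.dir):

```xml
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/datanode_partition</value>
  </property>
</configuration>
```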
16. Start the Data Node now (make sure the Name Node is running):
hadoop-daemon.sh start datanode
17. Run
hadoop dfsadmin -report
(This will show the Data Nodes connected to the Name Node, their IP addresses, the storage they contribute, etc.)
Summary:
So, we have contributed only a limited amount of storage to the Name Node by attaching a new virtual hard disk and creating a partition in it. The directory to be shared is then mounted on that partition, and that directory is what the Data Node contributes to the cluster.
Thank you
Hope you liked it!!