In the previous blog “Smattering of HDFS“, we learnt that “The NameNode is a Single Point of Failure for the HDFS Cluster”. Each cluster had a single NameNode and if that machine became unavailable, the whole cluster would become unavailable until the NameNode is restarted or brought up on a different machine. Now in this blog, we will learn about resolving the failure issue of NameNode.
Issues that arise when NameNode fails/crashes-
The metadata for the HDFS like Namespace Information, block information etc, when in use needs to be stored in main memory, but for persistence storage, it is to be stored in disk. The NameNode stores two types of information:
1. in-memory fsimage – It is the latest and updated snapshot of the Hadoop filesystem namespace.
2. editLogs – It is the sequence of changes made to the filesystem after NameNode started.
The total availablity of HDFS
View original post 310 more words