the hadoop distributed file systemcis.csuohio.edu/~sschung/cis601/hdfs_shuha.pdf · 2017. 5. 2. ·...

THE HADOOP DISTRIBUTED FILE SYSTEM

Authors: Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler

Presented by Suhua Wei, Yong Yu

OUTLINE

• Architecture of Hadoop Distributed File System(HDFS)

• Report on the experience using HDFS at Yahoo!

Courtesy of http://revieweasyhomemadecookies.com/what-is-hadoopecosystem/

HDFS OVERVIEW

• HDFS stores file system metadata and application data separately

• HDFS stores metadata on a dedicated server, called the NameNode

• Application data are stored on other servers called DataNodes

• All servers are fully connected and communicate with each other using TCP-based protocols.

ARCHITECTURE

• An HDFS client creates a new file by giving its path to the NameNode

• For each block of the file, the NameNode returns a list of DataNodes to host its replicas

• The client then pipelines data to the chosen DataNodes, which eventually confirm the creation of the block replicas to the NameNode

ARCHITECTURE

• NameNode

• The HDFS namespace is a hierarchy of files and directories. Files and directories are represented on the NameNode by inodes, which record attributes like permissions, modification and access times, and namespace and disk space quotas.

• The NameNode maintains the namespace tree and the mapping of file blocks to DataNodes

• The HDFS keeps the entire namespace in RAM.

• Image: the inode data and the list of blocks belonging to each file

• Checkpoint: the persistent record of the image stored in the local host’s native file system.

• Journal: the modification log of the image which is stored in the local host’s native file system

• To improve durability, redundant copies of the checkpoint and journal can be made at other servers.

ARCHITECTURE

• DataNode

• During startup, each DataNode connects to the NameNode and performs a handshake to verify the namespace ID and the software version of the DataNode

• After the handshake, the DataNode registers with the NameNode.

• During normal operation, DataNodes send heartbeats to the NameNode to confirm that the DataNode is operating and that the block replicas it hosts are available. The default heartbeat interval is 3 seconds

• The NameNode uses its replies to heartbeats to send instructions to the DataNode. The instructions include commands to:

• replicate blocks to other nodes

• remove local block replicas

• re-register or to shut down the node

• send an immediate block report
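The heartbeat exchange above can be sketched as a small bookkeeping class. This is a minimal illustration, not the HDFS implementation: `HeartbeatMonitor`, `queue_command`, the command strings, and the explicit `now` argument are all hypothetical names, and the ten-minute timeout is an assumption that does not appear on the slides.

```python
from dataclasses import dataclass, field

HEARTBEAT_INTERVAL_S = 3.0   # default interval stated on the slide
DEAD_AFTER_S = 600.0         # assumption: silence after which a node counts as out of service


@dataclass
class HeartbeatMonitor:
    """Toy NameNode-side bookkeeping (hypothetical, not a real HDFS class)."""
    last_seen: dict = field(default_factory=dict)  # node id -> time of last heartbeat
    pending: dict = field(default_factory=dict)    # node id -> commands awaiting delivery

    def queue_command(self, node_id, command):
        """Remember a command until the node's next heartbeat arrives."""
        self.pending.setdefault(node_id, []).append(command)

    def heartbeat(self, node_id, now):
        """Record a heartbeat and return the commands carried by the reply."""
        self.last_seen[node_id] = now
        return self.pending.pop(node_id, [])

    def dead_nodes(self, now):
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(n for n, t in self.last_seen.items() if now - t > DEAD_AFTER_S)


monitor = HeartbeatMonitor()
monitor.heartbeat("dn1", now=0.0)
monitor.queue_command("dn1", "SEND_BLOCK_REPORT")
print(monitor.heartbeat("dn1", now=HEARTBEAT_INTERVAL_S))
```

The point of the design is that the NameNode never calls a DataNode directly; commands wait in a queue and travel back on the reply to the next heartbeat.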

ARCHITECTURE

• HDFS Clients

• The HDFS client is a code library that exports the HDFS file system interface

• Reads data by transferring data from a DataNode directly

• Writing data

• The client first asks the NameNode to choose DataNodes to host replicas of the first block of the file.

• The client sets up a node-to-node pipeline and sends data to the first DataNode

• When the first block is filled, the client requests new DataNodes to be chosen to host replicas of the next block.

• Unlike conventional file systems, HDFS provides an API that exposes the locations of a file's blocks, which allows applications like the MapReduce framework to schedule a task where the data are located.
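To illustrate why exposing block locations matters, the sketch below assigns one task per block and prefers a free node that already hosts a replica. It is a simplified assumption of how a scheduler might use the information, not the MapReduce scheduler itself; `schedule`, the block ids, and the node names are hypothetical.

```python
def schedule(block_locations, free_nodes):
    """Assign each block's task to a node, preferring nodes that host a replica.

    block_locations: dict mapping block id -> list of nodes hosting a replica
                     (stands in for what the location API would report).
    free_nodes:      iterable of nodes with a free task slot.
    """
    free = set(free_nodes)
    assignment = {}
    for block_id, locations in block_locations.items():
        local = [node for node in locations if node in free]
        if local:
            chosen = local[0]      # data-local: no network transfer needed
        elif free:
            chosen = min(free)     # fall back to any free node (remote read)
        else:
            break                  # no capacity left
        assignment[block_id] = chosen
        free.remove(chosen)
    return assignment


locations = {"blk_1": ["dn1", "dn2"], "blk_2": ["dn1", "dn3"], "blk_3": ["dn1"]}
print(schedule(locations, ["dn1", "dn3", "dn4"]))
```

In this run the tasks for `blk_1` and `blk_2` read from local disks, and only the task for `blk_3` has to read over the network.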

REDUNDANCY MECHANISMS

• Image and Journal

• Image: the inode data and the list of blocks belonging to each file (file system metadata)

• Checkpoint: the persistent record of the image written to disk.

• Journal: the modification log of the image, which ensures that changes are persistent

• CheckpointNode and BackupNode

• A NameNode can alternatively be run as a CheckpointNode or BackupNode

• The CheckpointNode periodically combines the existing checkpoint and journal to create a new checkpoint and empty journal

• Creating periodic checkpoints is one way to protect the file system metadata. Good practice is to create a daily checkpoint

• A BackupNode acts like a shadow of the NameNode and keeps an up-to-date copy of the image in memory
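The relationship between image, checkpoint, and journal can be shown with a toy model in which the image is a dictionary from file path to block list and the journal is a list of operations. The operation names (`create`, `add_block`, `delete`) and both function names are assumptions made for this sketch; the real journal format is not described on the slides.

```python
def apply_journal(checkpoint, journal):
    """Replay journal entries on a copy of the checkpoint and return the image."""
    image = {path: list(blocks) for path, blocks in checkpoint.items()}
    for op, path, *args in journal:
        if op == "create":
            image[path] = []
        elif op == "add_block":
            image[path].append(args[0])
        elif op == "delete":
            image.pop(path, None)
        else:
            raise ValueError(f"unknown journal operation: {op}")
    return image


def create_checkpoint(checkpoint, journal):
    """Combine checkpoint and journal into a new checkpoint and an empty journal."""
    return apply_journal(checkpoint, journal), []


old_checkpoint = {"/logs/a": ["blk_1"]}
journal = [("create", "/logs/b"), ("add_block", "/logs/b", "blk_2"), ("delete", "/logs/a")]
new_checkpoint, new_journal = create_checkpoint(old_checkpoint, journal)
print(new_checkpoint, new_journal)
```

Replaying a short journal on a recent checkpoint is what keeps a restart fast, which is one reason to checkpoint periodically.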

FILE I/O OPERATIONS AND REPLICA MANAGEMENT

• File Read and Write

• HDFS implements a single-writer, multiple-reader model

• The figure at right shows the data pipeline during block construction, including the acknowledgement messages.
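The data pipeline can be approximated with a small simulation: the client splits a block into packets, each DataNode stores a packet and forwards it downstream, and an acknowledgement travels back only if every node in the pipeline accepted the packet. This is a synchronous simplification under stated assumptions; `send_packet`, `write_block`, the 4-byte packet size, and the `failed` set are hypothetical and chosen only to keep the example small.

```python
def send_packet(pipeline, packet, stores, failed=frozenset()):
    """Store the packet on each node in order; True means every node acknowledged."""
    if not pipeline:
        return True                      # past the last node: nothing left to acknowledge
    head = pipeline[0]
    if head in failed:
        return False                     # no acknowledgement comes back upstream
    stores.setdefault(head, []).append(packet)
    return send_packet(pipeline[1:], packet, stores, failed)


def write_block(data, pipeline, packet_size=4, failed=frozenset()):
    """Send data packet by packet; return (acknowledged packets, bytes held per node)."""
    stores = {}
    acked = 0
    for offset in range(0, len(data), packet_size):
        packet = data[offset:offset + packet_size]
        if not send_packet(pipeline, packet, stores, failed):
            break                        # stop at the first unacknowledged packet
        acked += 1
    return acked, {node: b"".join(parts) for node, parts in stores.items()}


acked, replicas = write_block(b"hello hdfs", ["dn1", "dn2", "dn3"])
print(acked, replicas)
```

The sketch waits for each acknowledgement before sending the next packet, which is a simplification made here for readability.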


• Block Placement

• For a large cluster, a common practice is to spread the nodes across multiple racks.

• Nodes of a rack share a switch, and rack switches are connected by one or more core switches.

• The default HDFS replica placement policy:

• No DataNode contains more than one replica of any block

• No rack contains more than two replicas of the same block, provided there are sufficient racks on the cluster
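The two placement constraints can be expressed as a check plus a simple chooser. This sketch is not the HDFS placement algorithm: it walks the racks round-robin purely to satisfy the constraints, and `choose_targets`, `satisfies_policy`, and the rack and node names are hypothetical. The per-rack limit is a parameter; its default of two follows the HDFS paper and is an assumption here. Unlike HDFS, which relaxes the rack rule when racks are scarce, the sketch simply returns fewer targets.

```python
from collections import Counter


def satisfies_policy(targets, rack_of, max_per_rack=2):
    """True if no node appears twice and no rack exceeds max_per_rack replicas."""
    one_per_node = len(set(targets)) == len(targets)
    per_rack = Counter(rack_of[node] for node in targets)
    return one_per_node and all(count <= max_per_rack for count in per_rack.values())


def choose_targets(nodes_by_rack, replication=3, max_per_rack=2):
    """Pick up to `replication` distinct nodes, visiting racks round-robin."""
    chosen = []
    per_rack = Counter()
    depth = max((len(nodes) for nodes in nodes_by_rack.values()), default=0)
    for i in range(depth):
        for rack in sorted(nodes_by_rack):
            if len(chosen) == replication:
                return chosen
            nodes = nodes_by_rack[rack]
            if i < len(nodes) and per_rack[rack] < max_per_rack:
                chosen.append(nodes[i])
                per_rack[rack] += 1
    return chosen  # may be shorter than requested if the constraints cannot be met


cluster = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
rack_of = {node: rack for rack, nodes in cluster.items() for node in nodes}
targets = choose_targets(cluster)
print(targets, satisfies_policy(targets, rack_of))
```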


• Replication management

• The NameNode endeavors to ensure that each block always has the intended number of replicas

• The necessity for re-replication may arise due to many reasons:

• a DataNode may become unavailable

• a replica may become corrupted

• a hard disk on a DataNode may fail

• or the replication factor of a file may be increased

• When a block becomes over-replicated, the NameNode chooses a replica to remove; when a block becomes under-replicated, it is put in the replication priority queue. A background thread periodically scans the head of the replication queue to decide where to place new replicas.
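A minimal sketch of this bookkeeping, assuming that blocks with the fewest live replicas are served first (the slides do not state the priority rule, so this ordering is an assumption). The function name `classify_blocks` and the block ids are hypothetical.

```python
import heapq


def classify_blocks(live_replicas, target_replicas):
    """Split blocks into a replication priority queue and an over-replicated list.

    live_replicas:   dict block id -> number of replicas currently available
    target_replicas: dict block id -> intended replication factor
    """
    queue = []    # min-heap of (live count, block id): fewest replicas come out first
    excess = []   # blocks with more replicas than intended
    for block_id, live in live_replicas.items():
        target = target_replicas[block_id]
        if live < target:
            heapq.heappush(queue, (live, block_id))
        elif live > target:
            excess.append(block_id)
    return queue, sorted(excess)


live = {"blk_a": 2, "blk_b": 1, "blk_c": 3, "blk_d": 4}
target = {block_id: 3 for block_id in live}
queue, excess = classify_blocks(live, target)
print(heapq.heappop(queue), excess)
```

A background thread in this model would repeatedly pop the head of `queue` and pick a new target node for that block.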

HDFS AT YAHOO!

• Large HDFS clusters at Yahoo! include about 3500 nodes (60 million files). A typical cluster node has:

• 2 quad-core Xeon processors @ 2.5 GHz

• Red Hat Enterprise Linux Server Release 5.1

• Sun Java JDK

• 4 directly attached SATA drives (one terabyte each)

• 1-gigabit Ethernet

PERFORMANCE BENCHMARKS

• The DFSIO benchmark measures average throughput for read, write, and append operations.

Benchmark            Operation      Throughput

DFSIO                Read           66 MB/s per node

DFSIO                Write          40 MB/s per node

Production cluster   Read           1.02 MB/s per node

Production cluster   Write          1.09 MB/s per node

Sort                 1 TB sort      22.1 MB/s per node

Sort                 1000 TB sort   9.35 MB/s per node

PERFORMANCE BENCHMARKS

• The NNThroughput benchmark is a single-node process which starts the NameNode application and runs a series of client threads on the same node

• The benchmark measures the number of operations per second performed by the NameNode

SUMMARY

• The Hadoop Distributed File System is designed to store very large data sets reliably and to stream these data sets to user applications at high bandwidth

• The HDFS architecture consists of a single NameNode, many DataNodes, and the HDFS client
