In HDFS, the block placement policy puts one replica on the same rack as the writer and the other two replicas on different nodes of a different rack.
But why doesn't it place one of the other two replicas on the same rack as the original block? Wouldn't that be more efficient, since it would take less bandwidth than writing both of the other replicas to the other rack?
Data replication is performed as follows:
The NameNode selects the DataNodes that will host the replicas. It balances data placement across nodes and compiles a list of target nodes for replication.
The 1st replica is placed on the first node from the list.
The 2nd replica is copied to another node in the same server rack.
The 3rd replica is written to an arbitrary node in another server rack.
Any remaining replicas are placed arbitrarily.
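
To make the steps above concrete, here is a rough Python sketch of that selection order. It is only an illustration of the description in this answer, not the NameNode's actual code: the function name `choose_targets`, the `topology` dictionary (rack name to node names), and the rule that the first replica goes to the writer's own node when the writer is a DataNode are all assumptions made for the example. It also ignores the load and free-space balancing that the NameNode performs.

```python
import random
from typing import Dict, List, Optional


def choose_targets(
    topology: Dict[str, List[str]],   # hypothetical input: rack name -> node names
    writer: str,                      # node issuing the write (assumed to be known)
    replication: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick target nodes for one block, following the order described above."""
    if replication <= 0:
        return []
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    all_nodes = list(rack_of)
    targets: List[str] = []

    def pick(candidates: List[str]) -> bool:
        """Add one random, not-yet-used node from candidates; report success."""
        free = [n for n in candidates if n not in targets]
        if not free:
            return False
        targets.append(rng.choice(free))
        return True

    # 1st replica: the writer's node if it is a DataNode (assumption), else any node.
    if writer in rack_of:
        targets.append(writer)
    elif not pick(all_nodes):
        return targets  # empty cluster, nothing to choose from

    first_rack = rack_of[targets[0]]

    # 2nd replica: another node in the same rack; fall back to any free node.
    if len(targets) < replication:
        pick(topology[first_rack]) or pick(all_nodes)

    # 3rd replica: a node in a different rack; fall back to any free node.
    if len(targets) < replication:
        remote = [n for n in all_nodes if rack_of[n] != first_rack]
        pick(remote) or pick(all_nodes)

    # Remaining replicas: arbitrary free nodes until we run out.
    while len(targets) < replication and pick(all_nodes):
        pass

    return targets
```

For a two-rack cluster such as `{"rack1": ["a", "b"], "rack2": ["c", "d"]}` with writer `"a"`, the sketch keeps two replicas on `rack1` and sends the third to `rack2`, so only one copy has to cross racks while the block still survives the loss of a whole rack.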