
Scale, Distribution, Replication

Data for modern applications is very large (TB, PB, EB)
  • not feasible to store on a single machine
  • not feasible to store in a single location
Many systems opt for massive networks of simple nodes
  • each node holds moderate amount of data
  • each data item is replicated on several nodes
  • nodes clustered in different geographic sites
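One way to picture replication is as a function from a data item's key to the set of nodes that hold it. The sketch below is only an illustration, not a description of any particular system: the function name replica_nodes, the node names, and the default of 3 copies are all assumptions, and real systems typically use more elaborate schemes such as consistent hashing.

```python
import hashlib

def replica_nodes(key, nodes, copies=3):
    """Return the `copies` distinct nodes assumed to hold `key`.

    Hypothetical placement rule: hash the key to a starting node,
    then use the following nodes in list order, wrapping around.
    `nodes` must contain distinct node names.
    """
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    # A cryptographic hash gives a stable, well-spread starting position.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]

# Hypothetical cluster: six nodes spread over three sites.
nodes = ["syd-1", "syd-2", "tok-1", "tok-2", "dub-1", "dub-2"]
holders = replica_nodes("customer:4711", nodes)
```

Each node ends up with only a share of the keys, while every key is still held by several nodes.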
Benefits:
  • reliability, fault-tolerance, availability
  • proximity: clients can read the replica closest to them
  • scope for parallel execution/evaluation
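The availability and proximity benefits can be sketched together: a client reads from the nearest replica that is still up, and falls back to a more distant one if the nearest has failed. The function pick_replica, the node names, and the latency figures below are made up for illustration.

```python
def pick_replica(replicas, latency_ms, down=frozenset()):
    """Choose the live replica with the lowest latency to the client.

    `replicas`   : nodes holding the requested item
    `latency_ms` : hypothetical client-to-node latencies
    `down`       : nodes currently unreachable
    """
    live = [r for r in replicas if r not in down]
    if not live:
        raise RuntimeError("no live replica holds this item")
    return min(live, key=lambda r: latency_ms[r])

# Assumed latencies as seen by a client in Sydney.
replicas = ["syd-1", "tok-2", "dub-1"]
latency_ms = {"syd-1": 10, "tok-2": 120, "dub-1": 280}

nearest = pick_replica(replicas, latency_ms)                   # local site
fallback = pick_replica(replicas, latency_ms, down={"syd-1"})  # site failure
```

With a single copy, the failure of syd-1 would make the item unavailable; with three copies the read simply becomes slower.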