[prev] 83 [next]

External Mergesort Cost Analysis (cont)

While this is an O(NlogN) algorithm, each iteration reads and write the entire data file.

Reduce iterations via an initial in-memory sorting pass:

  • reading data in chunks (e.g. 512 Items at a time)
  • sort each chunk in memory using e.g. QuickSort
  • write each chunk out after sorting
Mergesort algorithm then starts with chunk-sorted file.

If chunks are size 2k, need k less iterations.