[prev] 24 [next]

Hashing Performance

Aims:
  • distribute tuples evenly amongst buckets
  • have most buckets nearly full   (attempt to minimise wasted space)
Note: if data distribution not uniform, address distribution can't be uniform.

Best case: every bucket contains same number of tuples.

Worst case: every tuple hashes to same bucket.

Average case: some buckets have more tuples than others.

Use overflow pages to handle "overfull" buckets  (cf. sorted files)

All tuples in each bucket must have same hash value.