[prev] 10 [next]

Grace Hash Join

Basic approach (for R ⋈ S ):
  • partition both relations on join attribute using hashing (h1)
  • load each partition of R into N-buffer hash table (h2)
  • scan through corresponding partition of S to form results
  • repeat until all partitions exhausted
For best-case cost (O(bR + bS) ):
  • need ≥ √bR buffers to hold largest partition of outer relation
If < √bR buffers or poor hash distribution
  • need to scan some partitions of S multiple times