Week 10 Lecture (6)

6

Simple Hash Join

Basic approach:

hash part of outer relation R into memory buffers (build)
scan inner relation S, using hash to search (probe)
- if R.i=S.j, then h(R.i)=h(S.j) (hash to same buffer)
- only need to check one memory buffer for each S tuple
repeat until whole of R has been processed

No overflows allowed in in-memory hash table

works best with uniform hash function
can be adversely affected by data/hash skew