
Join Summary

No single join algorithm is best in all circumstances.

Which algorithm is best for a given query depends on:

  • sizes of relations being joined, size of buffer pool
  • any indexing on relations, whether relations are sorted
  • which attributes and operations are used in the query
  • number of tuples in S matching each tuple in R
  • distribution of data values (uniform, skew, ...)

Choosing the "best" join algorithm is critical because the cost difference between the best and worst choice can be very large.

E.g.   Join[id=stude](Student,Enrolled):   3,000 ... 2,000,000
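
To illustrate how far apart such costs can be, here is a minimal sketch that compares three standard textbook cost estimates, measured in page reads/writes and ignoring the cost of writing the result. The page counts, the buffer count, and all function names are assumptions chosen for this sketch; they are not taken from the example above.

```python
import math

def page_nested_loop_cost(b_outer, b_inner):
    # Read the outer relation once; scan the inner once per outer page.
    return b_outer + b_outer * b_inner

def block_nested_loop_cost(b_outer, b_inner, n_buffers):
    # Assumes n_buffers - 2 buffers hold a chunk of the outer relation,
    # with one buffer for the inner relation and one for output.
    chunks = math.ceil(b_outer / (n_buffers - 2))
    return b_outer + chunks * b_inner

def grace_hash_join_cost(b_outer, b_inner):
    # Partition phase reads and writes both relations; probe phase reads both.
    return 3 * (b_outer + b_inner)

# Hypothetical sizes (assumed for illustration only).
b_student, b_enrolled, n_buffers = 1000, 2000, 102

costs = {
    "page nested loop": page_nested_loop_cost(b_student, b_enrolled),
    "block nested loop": block_nested_loop_cost(b_student, b_enrolled, n_buffers),
    "grace hash join": grace_hash_join_cost(b_student, b_enrolled),
}
cheapest = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name:18s} {cost:>10,d} pages")
print("cheapest:", cheapest)
```

Under these assumed sizes the estimates differ by more than two orders of magnitude. Changing the buffer count or swapping the outer and inner relations changes the ranking, which is why the choice depends on the factors listed above.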