Scan, Sort, Project


Implementing Relational Operations


Relational Operations2/90

DBMS core = relational engine, with implementations of

In this part of the course:


... Relational Operations3/90

Implementation of relational operations in DBMS:

[Diagram:Pics/scansortproj/dbmsarch-small.png]


... Relational Operations4/90

All relational operations return a set of tuples.

Can represent a typical operation programmatically as:

ResultSet = {}  // initially an empty set
while (t = nextRelevantTuple()) {
   // format tuple according to projection
   t' = formatResultTuple(t,Projection)
   // add next relevant tuple to result set
   ResultSet = ResultSet  t'
}
return ResultSet

All of the hard work is in the nextRelevantTuple() function.


... Relational Operations5/90

nextRelevantTuple() for selection operator:

nextRelevantTuple() for join operator: Two ways to handle the ResultSet


... Relational Operations6/90

There are three "dimensions of variation" in this system:

We consider combinations of these, e.g. Also consider updates (insert/delete) on file structures.


Query Types7/90

Queries fall into a number of classes:

Type SQL RelAlg a.k.a.
Scan select * from R R -
Proj select x,y from R Proj[x,y]R -
Sort select * from R
order by x
Sort[x]R ord

Different query classes exhibit different query processing behaviours.


... Query Types8/90

Type SQL RelAlg a.k.a.
Sel1 select * from R
where id = k
Sel[id=k]R one
Seln select * from R
where a = k
Sel[a=k]R -
Selpmr select * from R
where a=j and b=k
Sel[a=j ∧ b=k]R pmr
Range1d select * from R
where a>j and a<k
Sel[a>j ∧ a<k]R rng
Rangend select * from R
where a>j and a<k
    and b>m and b<n
Sel[...]R space


... Query Types9/90

Type SQL RelAlg a.k.a.
Join1 select * from R,S
where R.id = S.r
R Join[id=r] S -
EquiJoin select * from R,S
where R.v=S.w and R.x=S.y
R Join[v=w ∧ x=y] S -
ThetaJoin select * from R,S
where R.x op S.y
R Join[...] S -
Similar select * from R
where R.* ≅ Object
R ≅ Obj sim


Cost Models


Cost Models11/90

An important aspect of this course is

Won't be using asymptotic complexity (O(n)) for this

Rather, we attempt to develop cost models

Cost is measured in terms of number of page reads/writes.


... Cost Models12/90

Assumptions in our cost models:


... Cost Models13/90

In developing cost models, we also assume:

[Diagram:Pics/scansortproj/file-struct0-small.png]


... Cost Models14/90

Typical values for measures used in cost models:

Quantity Symbol E.g. Value
total # tuples r 106
record size R 128 bytes
total # pages b 105
page size B 8192 bytes
# tuples per page c 60
page read/write time Tr,Tw 10 msec
process page in memory - ≅ 0
# pages containing
answers for query q
bq ≥ 0


Example file structures15/90

When describing file structures

[Diagram:Pics/scansortproj/one-page-small.png]


... Example file structures16/90

Consider three simple file structures:

All files are composed of b primary blocks/pages

[Diagram:Pics/scansortproj/file-struct0-small.png]

Some records in each page may be marked as "deleted".


... Example file structures17/90

Heap file with b = 4, c = 4:

[Diagram:Pics/scansortproj/file-heap-small.png]


... Example file structures18/90

Sorted file with b = 4, c = 4:

[Diagram:Pics/scansortproj/file-sort1-small.png]


... Example file structures19/90

Hashed file with b = 3, c = 4, h(k) = k%3

[Diagram:Pics/scansortproj/file-hash-small.png]


... Example file structures20/90

Indexed file with b = 4, c = 4, bi = 2, ci = 8:

[Diagram:Pics/scansortproj/file-index-small.png]


Scanning


Scanning22/90

Consider the query:

select * from T;

Conceptually:

for each tuple t in relation T {
   add tuple t to result set
}

[Diagram:Pics/scansortproj/file-struct0-small.png]


... Scanning23/90

Implemented via iteration over file containing T:

for each page P in file of relation T {
   for each tuple t in page P {
      add tuple t to result set
   }
}

Cost: read every data page once

Cost = b.Tr


... Scanning24/90

In terms of file operations:

// implementation of "select * from T"

File inf;   // data file handle
int p;      // input file page number
Buffer buf; // input file buffer
int i;      // current record in input buf
Tuple t;    // data for current record

inf = openFile(fileName("T"), READ)
for (p = 0; p < nPages(inf); p++) {
    buf = readPage(inf,p);
    for (i = 0; i < nTuples(buf); i++) {
        t = getTuple(buf,i);
        add t to result set
}   }


... Scanning25/90

Scan implementation when file has overflow pages, e.g.

[Diagram:Pics/scansortproj/file-struct1-small.png]


... Scanning26/90

In this case, the implementation changes to:

for each page P in file of relation T {
    for each tuple t in page P {
        add tuple t to result set
    }
    for each overflow page V of page P {
        for each tuple t in page V {
            add tuple t to result set
}   }   }

Cost: read each data and overflow page once

Cost = (b + bOv).Tr

where bOv = total number of overflow pages


... Scanning27/90

In terms of file operations:

// implementation of "select * from T"

File inf;   // data file handle
File ovf;   // overflow file handle
int p;      // input file page number
int ovp;    // overflow file page number
Buffer buf; // input file buffer
int i;      // current record in input buf
Tuple t;    // data for current record

inf = openFile(fileName("T"), READ)
ovf = openFile(ovFileName("T"), READ)
for (p = 0; p < nPages(inf); p++) {
    buf = readPage(inf,p);
    for (i = 0; i < nTuples(buf); i++) {
        t = getTuple(buf,i);
        add t to result set
    }
    ovp = ovflow(buf);
    while (ovp != NO_PAGE) {
        buf = readPage(ovf,ovp);
        for (i = 0; i < nTuples(buf); i++) {
            t = getTuple(buf,i);
            add t to result set
        }
        ovp = ovflow(buf);
    }
}

Cost: read data+ovflow page    Cost = (b+bov).Tr


Selection via Scanning28/90

Consider a one query like:

select * from Employee where id = 762288;

In an unordered file, search for matching record requires:

[Diagram:Pics/scansortproj/file-search-small.png]

Guaranteed at most one answer; could be in any page.


... Selection via Scanning29/90

In terms of file operations (assuming var delcarations as before):

inf = openFile(fileName("Employee"), READ);
for (p = 0; p < nPages(inf); p++)
    buf = readPage(inf,p);
    for (i = 0; i < nTuples(buf); i++) {
        t = getTuple(buf,i);
        if (getField(t,"id") == 762288)
           return t;
}   }

For different selection condition, simply replace   (getField(t,"id")==762288)


... Selection via Scanning30/90

Cost analysis for one searching in unordered file

Assumptions: Costavg = Trb/2    Costmin = Tr    Costmax = Trb


File Copying31/90

Consider an SQL statement like:

create table T as (select * from S);

Effectively, copies data from one file to another.

[Diagram:Pics/scansortproj/file-copy-small.png]

Conceptually:

make empty relation T
for each tuple t in relation S {
    append tuple t to relation T
}


... File Copying32/90

In terms of file operations:

File inf,outf; // input/output file handles
int ip,op;     // input/output page numbers
int i;         // record number in input buf
Tuple t;       // current record
Buffer buf;    // input file buffer
Buffer obuf;   // output file buffer

inf = openFile(fileName("S"), READ);
outf = openFile(fileName("T"), CREATE);
clear(obuf);
for (ip = op = 0; ip < nPages(inf); ip++) {
    buf = readPage(inf, ip);
    for (i = 0; i < nTuples(buf); i++) {
        t = getTuple(i, buf);
        addTuple(t, obuf);
        if (isFull(obuf)) {
            writePage(outf,op++,obuf);
            clear(obuf);
}   }   }
if (nTuples(obuf) > 0)
    writePage(outf,op,obuf);


... File Copying33/90

Cost analysis for file copying:

Cost:   binTr + boutTw    or    T.(bin + bout )   (assuming T = Tr = Tw)


Iterators34/90

Higher-levels of DBMS are given a view of scanning as:

cursor = initScan(relName,condition);
while (tup = getNextTuple(cursor)) {
	process tup
}
endScan(cursor);

Also known as iterator.


... Iterators35/90

Implementation of simple scan iterator (via file operations):

typedef struct {
    File   inf;   // data file handle
    Buffer buf;   // input buffer
    int    curp;  // current page number
    int    curi;  // current record number
    Expr   cond;  // representation of condition
} Cursor;


... Iterators36/90

Implementation of simple scan iterator (continued):

Cursor *initScan(char *rel, char *cond)
{
    Cursor *c;
    c = malloc(sizeof(Cursor));
    c->inf = openFile(fileName(rel),READ);
    c->buf = readPage(c->inf,0);
    c->curp = 0;
    c->curi = 0;
    c->cond = makeTestableCondition(cond);
    return c;
}
void endScan(Course *c)
{
    closeFile(c->inf);
    freeExpr(c->cond);
    free(c);
}


... Iterators37/90

Implementation of simple scan iterator (continued):

Tuple getNextTuple(Cursor *c)
{
getNextTuple:
    if (c->curi < nTuples(c->buf))
        return getTuple(c->buf, c->curi++);
    else {
        // no more tuples in this page; get next page
        c->curp++;
        if (c->curp == nPages(c->inf))
            return NULL;  // no more pages
        else {
            c->buf = readPage(c->inf,c->curp);
            c->curi = 0;
            goto getNextTuple;
        }
    }
}


... Iterators38/90

Implementation of full iterator interface via file operations:

typedef struct {
    File   inf;   // data file handle
    File   ovf;   // overflow file handle
    Buffer buf;   // input buffer
    int    curp;  // current page number
    int    curop; // current ovflow page number
    int    curi;  // current record number
    Expr   cond;  // representation of condition
} Cursor;


... Iterators39/90

Implementation of full iterator interface (continued):

Cursor *initScan(char *rel, char *cond)
{
    Cursor *c;

    c = malloc(sizeof(Cursor));
    c->inf = openFile(fileName(rel),READ);
    c->ovf = openFile(ovFileName(rel),READ);
    c->buf = readPage(c->inf,0);
    c->curp = 0;
    c->curop = NO_PAGE;
    c->curi = 0;
    c->cond = makeTestableCondition(cond)
    return c;
}
void endScan(Course *c)
{
    closeFile(c->inf);
    if (c->ovf) closeFile(c->ovf);
    freeExpr(c->cond);
    free(c);
}


... Iterators40/90

Implementation of scanning interface (continued):


Tuple getNextTuple(Cursor *c)
{
getNextTuple:
    if (c->curi < nTuples(c->buf))
        return getTuple(c->buf, c->curi++);
    else {
        // no more tuples in this page; get next page
        if (c->curop == NO_PAGE) {
            c->curop = ovflow(c->buf);
            if (c->curop != NO_PAGE) {
                // start ovflow chain scan
getNextOvPage:
                c->buf = readPage(c->ovf,c->curop);
                c->curi = 0;
                goto getNextTuple;
            }
            else {
getNextDataPage:
                c->curp++;
                if (c->curp == nPages(c->inf))
                    return NULL;  // no more pages
                else {
                    c->buf = readPage(c->inf,c->curp);
                    c->curi = 0;
                    goto getNextTuple;
                }
            }
        }
        else {
            // continue ovflow chain scan
            c->curop = ovflow(c->buf);
            if (c->curop == NO_PAGE)
                goto getNextDataPage;
            else
                goto getNextOvPage;
        }
    }
}


Scanning in PostgreSQL41/90

Scanning defined in: /backend/access/heap/heapam.c

Implements iterator data/operations:


... Scanning in PostgreSQL42/90

typedef struct HeapScanDescData
{
  // scan parameters 
  Relation      rs_rd;        // heap relation descriptor 
  Snapshot      rs_snapshot;  // snapshot ... tuple visibility 
  int           rs_nkeys;     // number of scan keys 
  ScanKey       rs_key;       // array of scan key descriptors 
  ...
  // state set up at initscan time 
  PageNumber    rs_npages;    // number of pages to scan 
  PageNumber    rs_startpage; // page # to start at 
  ...
  // scan current state, initally set to invalid 
  HeapTupleData rs_ctup;      // current tuple in scan
  PageNumber    rs_cpage;     // current page # in scan
  Buffer        rs_cbuf;      // current buffer in scan
   ...
} HeapScanDescData;


Scanning in other File Structures43/90

Above examples are for heap files

Other access file structures in PostgreSQL:


Sorting


The Sort Operation45/90

Sorting is explicit in queries only in the order by clause

select * from Students order by name;

More important, sorting is used internally in other operations:


External Sorting46/90

Sort methods such as quicksort are designed for in-memory data.

For data on disks, need external sorting techniques.

The standard external sorting method (merge sort) works by


... External Sorting47/90

Sorting tuples within pages

[Diagram:Pics/scansortproj/sorted-page-small.png]


Two-way Merge Sort48/90

Requires three in-memory buffers:

[Diagram:Pics/scansortproj/two-way-buf-small.png]


Assumption: cost of merge on two buffers ≅ 0.


... Two-way Merge Sort49/90

Two-way merge-sort method:

read each page into buffer, sort it, write it
numberOfRuns = b; runLength = 1;
while (numberOfRuns > 1) {
    for each pair of adjacent runs {
        merge the pair of runs to output, by
          - read pages from runs into input
              buffers, one page at a time
          - apply merge algorithm to transfer
              tuples to output buffer
          - flush output buffer when full and
              when merge finished
   }
   numberOfRuns = numberOfRuns / 2
   runLength = runLength * 2
}


... Two-way Merge Sort50/90

Example:

[Diagram:Pics/scansortproj/two-way-ex-small.png]


... Two-way Merge Sort51/90

Two-way merge-sort method (improved):

numberOfRuns = b; runLength = 1;
while (numberOfRuns > 1) {
    for each pair of adjacent runs {
        merge the pair of runs to output, by
          - read pages from runs into input
              buffers, one page at a time
          - if (runLength == 1)
              sort contents of each input buffer
          - apply merge algorithm to transfer
              tuples to output buffer
          - flush output buffer when full and
              when merge finished
   }
   numberOfRuns = numberOfRuns / 2
   runLength = runLength * 2
}

Avoids first pass to sort contents of individual pages.


... Two-way Merge Sort52/90

Consider file where b = 2k:

Method also works ok when


... Two-way Merge Sort53/90

Example:

[Diagram:Pics/scansortproj/two-way-ex2-small.png]


Merging Two Sorted Pages54/90

Method using operations on files and buffers:


// Pre:  buffers B1,B2; outfile position op
// Post: tuples from B1,B2 output in order
i1 = i2 = 0; clear(Out);
R1 = getTuple(B1,i1); R2 = getTuple(B2,i2);
while (i1 < nTuples(B1) && i2 < nTuples(B2)) {
    if (lessThan(R1,R2))
        { addTuple(R1,Out); i1++; R1 = getTuple(B1,i1); }
    else
        { addTuple(R2,Out); i2++; R2 = getTuple(B2,i2); }
    if (isFull(Out))
        { writePage(outf,op++,Out); clear(Out); }
}
for (i1=i1; i1 < nTuples(B1); i1++) {
    addTuple(getTuple(B1,i1), Out);
    if (isFull(Out))
        { writePage(outf,op++,Out); clear(Out); }
}
for (i2=i2; i2 < nTuples(B2); i2++) {
    addTuple(getTuple(B2,i2), Out);
    if (isFull(Out))
        { writePage(outf,op++,Out); clear(Out); }
}
if (nTuples(Out) > 0) writePage(outf,op,Out);


Merging Runs vs Merging Pages55/90

In the above, we merged two input buffers.

In general, we need to merge sorted "runs" of pages.

The only difference that this makes to the above method:

R1 = getTuple(B1,i1);

becomes

if (i1 == nTuples(B1)) {
    B1 = readPage(inf,ip++); i1 = 0;
}
R1 = getTuple(B1,i1);


Comparison for Sorting56/90

Above assumes that we have a function to compare tuples.

Mechanism needs to be generic, to handle all of:

select * from Employee order by eid;
select * from Employee order by name;
select * from Employee order by age;

Envisage a function tupCompare(r1,r2,f) (cf. C's strcmp)


... Comparison for Sorting57/90

SQL allows sorting based on multiple fields, e.g.

select * from Employee order by name,age;

How to specify sort condition for several attributes?

E.g. sort key is (A1,A2,..An)


int tupCompare(Tuple r1, Tuple r2, SortKeys A)
{
         if (r1.A1 != r2.A1) return(r1.A1 - r2.A1);
    else if (r1.A2 != r2.A2) return(r1.A2 - r2.A2);
    else if ...
    ...
    else if (r1.An != r2.An) return(r1.An - r2.An);
    else return 0;
}

Assume that ri.Ai is a method for extracting value of Aith attribute of record ri


Cost of Two-way Merge Sort58/90

For a file containing b data pages:

Gives total cost:   2.b.log2b

Example: Relation with r=105 and c=50     b=2000 pages.

Number of passes for sort:   log22000  =  11

Reads/writes entire file 11 times!    Can we do better?


n-Way Merge Sort59/90

Merge passes use:   B total memory buffers,   n input buffers,   B-n output buffers

[Diagram:Pics/scansortproj/n-way-buf-small.png]

Typically, consider only one output buffer, i.e. B = n + 1

Note that initial sort pass uses all B buffers ..


... n-Way Merge Sort60/90

Method:


// Produce B-page-long runs
for each group of B pages in Rel {
    read pages into memory buffers
    sort group in memory
    write pages out to Temp
}
// Merge runs until everything sorted
// n-way merge, where n=B-1
numberOfRuns = b/B
while (numberOfRuns > 1) {
    for each group of n runs in Temp {
        merge into a single run via input buffers
        write run to newTemp via output buffer
    }
    numberOfRuns = numberOfRuns/n
    Temp = newTemp // swap input/output files
}


... n-Way Merge Sort61/90

Method for merging n runs (n input buffers, 1 output buffer):


for i = 1..n {
   read first page of run[i] into a buffer[i]
   set current tuple cur[i] to first tuple in buffer[i]
}
while (more than 1 run still has tuples) {
   s = find buffer with smallest tuple as cur[i]
   copy tuple cur[i] to output buffer
   if (output buffer full) { write it and clear it}
   advance cur[i] to next tuple
   if (no more tuples in buffer[i]) {
      if (no more pages in run[i])
         mark run[i] as complete
      else {
         read next page of run[i] into buffer[i]
         set cur[i] to first tuple in buffer[i]
}  }  }
copy tuples in non-empty buffer to output


Cost of n-Way Merge Sort62/90

Consider file where b = 4096, B = 16 total buffers:

(cf. two-way merge sort which needs 11 passes)

For b data pages and n=15 input buffers (15-way merge)

Cost = 2.b.(1 + lognb0),   where b0 = b/B


... Cost of n-Way Merge Sort63/90

Costs (number of passes) for varying b and B (n=B-1):

b B=3 B=16 B=128
100 7 2 1
1000 10 3 2
10,00 13 4 2
100,000 17 5 3
1,000,000 20 5 3
 

In the above, we assume that

Elapsed time could be reduced by double-buffering


Sorting in PostgreSQL64/90

Sort uses a polyphase merge-sort (from Knuth):

Tuples are mapped to SortTuple structs for sorting: If all data fits into memory, sort using qsort().

If memory fills while reading, form "runs" and do disk-based sort.


... Sorting in PostgreSQL65/90

Disk-based sort has phases:

Many references to "tapes" since Knuth's original algorithm was described in terms of merging data from magnetic tapes.

Effectively, a "tape" is a sorted run.

Implementation of "tapes": backend/utils/sort/logtape.c


... Sorting in PostgreSQL66/90

Sorting is generic and comparison operators are defined in catalog:

// gets pointer to function via pg_operator
SelectSortFunction(Oid sortOperator,
                   bool nulls_first,
                   Oid *sortFunction,
                   int *sortFlags);

// returns negative, zero, positive
ApplySortFunction(FmgrInfo *sortFunction,
                  int sortFlags,
                  Datum datum1, bool isNull1,
                  Datum datum2, bool isNull2);

Flags indicate: ascending/descending, nulls-first/last.


Projection


The Projection Operation68/90

Consider the query:

select distinct name,age from Employee;

If the Employee relation has four tuples such as:

(94002, John, Sales, Manager,   32)
(95212, Jane, Admin, Manager,   39)
(96341, John, Admin, Secretary, 32)
(91234, Jane, Admin, Secretary, 21)

then the result of the projection is:

(Jane, 21)   (Jane, 39)   (John, 32)

Note that duplicate tuples (e.g. (John,32)) are eliminated.


... The Projection Operation69/90

The projection operation needs to:

  1. scan the entire relation as input

    (straightforward, whichever file organisation is used)

  2. remove unwanted attributes in output

    (straightforward, manipulating internal record structure)

  3. eliminate any duplicates produced

    (not as simple as other operations ...)

There are two approaches for task 3: sorting or hashing.


Removing Attributes70/90

Projecting attributes involves creating a new tuple, using only some values from the original tuple.

Precisely how to achieve this depends on tuple internals.

Removing attributes from fixed-length tuples:

[Diagram:Pics/scansortproj/fixed-len-small.png]


... Removing Attributes71/90

Removing attributes from variable-length tuples:

[Diagram:Pics/scansortproj/var-len-small.png]


Sort-based Projection72/90

Overview of the method:

  1. Scan input relation Rel and produce a file of tuples containing only the projected attributes
  2. Sort this file of tuples using the combination of all attributes as the sort key
  3. Scan the sorted result, comparing adjacent tuples, and discard duplicates
Requires a temporary file/relation (Temp)


... Sort-based Projection73/90

The method, in detail:

// Inputs: relName, attrList
inf = openFile(fileName(relName),READ);
tempf = openFile(tmpName,CREATE);
clear(outbuf); j = 0;
for (p = 0; p < nPages(inf); p++) {
    buf = readPage(inf,p);
    for (i = 0; i < nTuples(buf); i++) {
        tup = getTuple(buf,i);
        newtup = project(tup,attrList);
        addTuple(newtup,outbuf);
        if (isFull(outbuf)) {
            writePage(tempf,j++,outbuf);
            clear(outbuf);
        }
    }
}
mergeSort(tempf);

(continued ...)


... Sort-based Projection74/90

(... continued)

tempf = openFile(tmpName,READ);
outf = openFile(result,CREATE);
clear(outbuf); prev = EMPTY; j = 0;
for (p = 0; p < nPages(tempf); p++) {
    buf = readPage(tempf,p);
    for (i = 0; i < nTuples(buf); i++) {
        tup = getTuple(buf,i);
        if (tupCompare(tup,prev) != 0) {
            addTuple(tup,outbuf);
            if (isFull(outbuf)) {
                writePage(outf,j++,outbuf);
                clear(outbuf);
            }
            prev = tup;
        }
    }
}


Cost of Sort-based Projection75/90

The costs involved are (assuming B+1 buffers for sort):

Total cost = sum of above = bR + 2.bT + 2.bT(1 + logBb0 ) + bOut

Note that we often ignore cost of writing the result; especially when comparing different algorithms for the same relational operation.


Improving Sort-based Projection76/90

Some approaches for improving the cost:


Hash-based Projection77/90

Overview of the method:

  1. Scan input relation Rel and produce a set of hash partitions based on the projected attributes
  2. Scan each hash partition looking for duplicates
  3. Once each partition is duplicate-free, write out the remaining tuples
The method requires:


Hash Functions78/90

Hash function h(tuple,range):

Implementation issues for hash functions:


... Hash Functions79/90

Usual approach in hash function:

Example hash function for character strings:


unsigned int hash(char *val, int b)
{
    char *cp;
    unsigned int v, sum = 0;
    for (c = val; *c != '\0'; c++) {
        v = *c + (*(c+1) << 8);
        sum += (sum + 2153*v) % 19937;
    }
    return(sum % b);
}


Hash-based Projection80/90

Partitioning phase:

[Diagram:Pics/scansortproj/hash-project-small.png]


... Hash-based Projection81/90

Algorithm for partitioning phase:

for each page P in relation Rel {
    for each tuple t in page P {
        t' = project(t, attrList)
        H = h1(t', B-1)
        write t' to partition[H]
}   }

Each partition could be implemented as a simple data file.


... Hash-based Projection82/90

Duplicate elimination phase:

[Diagram:Pics/scansortproj/hash-project2-small.png]


... Hash-based Projection83/90

Algorithm for duplicate elimination phase:

for each partition P in 0..B-2 {
    for each tuple t in partition P {
        H = h2(t, B-1)
        if (!(t occurs in buffer[H]))
            append t to buffer H
    }
    output contents of all buffers
    clear all buffers
}


Cost of Hash-based Projection84/90

The total cost is the sum of the following:

To ensure that B is larger than the largest partition ...


... Cost of Hash-based Projection85/90

If the largest partition had more than B-1 pages

This would potentially increase the cost by a large amount
(worst case is one additional page read for every record after hash bucket fills)


Index-only Projection86/90

Under the conditions:

can do projection without accessing data file.

Basic idea:


... Index-only Projection87/90

Method:

for each entry I in index file {
    tup = project(I.key, attrList)
    if (tupCompare(tup,prev) != 0) {
        addTuple(outbuf,tup)
        if (isFull(outbuf)) {
            writePage(outf,op++,outbuf);
            clear(outbuf);
        }
        prev = tup;
}   }

"for each index entry": loop over index pages and loop over entries in each page


Cost of Index-only Projection88/90

Assume that the index (see details later):

Costs involved in index-only projection: Total cost:   bi + bOut   ≪   bR + bOut


Comparison of Projection Methods89/90

Difficult to compare, since they make different assumptions:

Best case scenario for each (assuming B+1 in-memory buffers):


Projection in PostgreSQL90/90

Code for projection forms part of execution iterators:

Functions involved with projection:


Produced: 13 Jun 2016