Heap Files

COMP9315 21T1 ♢ Heap Files ♢ [0/13]
❖ Heap Files

Heap files

[Diagram:Pics/file-struct/heap-file2.png]


Note: this is not "heap" as in the top-to-bottom ordered tree.

COMP9315 21T1 ♢ Heap Files ♢ [1/13]
❖ Selection in Heaps


For all selection queries, the only possible strategy is:

// select * from R where C
rel = openRelation("R", READ);
for (p = 0; p < nPages(rel); p++) {
    get_page(rel, p, buf);
    for (i = 0; i < nTuples(buf); i++) {
        T = get_tuple(buf, i);
        if (T satisfies C)
            add tuple T to result set
    }
}

i.e. linear scan through file searching for matching tuples

COMP9315 21T1 ♢ Heap Files ♢ [2/13]
❖ Selection in Heaps (cont)

The heap is scanned from the first to the last page:

[Diagram:Pics/file-struct/heapscan.png]

Costrange  =  Costpmr  =  b

If we know that only one tuple matches the query (one query),
a simple optimisation is to stop the scan once that tuple is found.

Costone :      Best = 1      Average = b/2      Worst = b

COMP9315 21T1 ♢ Heap Files ♢ [3/13]
❖ Insertion in Heaps

Insertion: new tuple is appended to file (in last page).

rel = openRelation("R", READ|WRITE);
pid = nPages(rel)-1;
get_page(rel, pid, buf);
if (size(newTup) > size(buf))
   { deal with oversize tuple }
else {
   if (!hasSpace(buf,newTup))
      { pid++; nPages(rel)++; clear(buf); }
   insert_record(buf,newTup);
   put_page(rel, pid, buf);
}

Costinsert  =  1r + 1w

COMP9315 21T1 ♢ Heap Files ♢ [4/13]
❖ Insertion in Heaps (cont)

Alternative strategy:


PostgreSQL's strategy:
COMP9315 21T1 ♢ Heap Files ♢ [5/13]
❖ Insertion in Heaps (cont)

Dealing with oversize tuple t:

for i in 1 .. nAttr(t) {
   if (t[i] not oversized) continue
   off = appendToFile(ovf, t[i])
   t[i] = (OVERSIZE, off)
}
insert into buf as before


[Diagram:Pics/storage/ovsize-values.png]

COMP9315 21T1 ♢ Heap Files ♢ [6/13]
❖ Insertion in Heaps (cont)

PostgreSQL's tuple insertion:

heap_insert(Relation relation,    // relation desc
            HeapTuple newtup,     // new tuple data
            CommandId cid, ...)   // SQL statement

COMP9315 21T1 ♢ Heap Files ♢ [7/13]
❖ Deletion in Heaps

SQL:  delete from R  where Condition

Implementation of deletion:

rel = openRelation("R",READ|WRITE);
for (p = 0; p < nPages(rel); p++) {
    get_page(rel, p, buf);
    ndels = 0;
    for (i = 0; i < nTuples(buf); i++) {
        tup = get_tuple(buf,i);
        if (tup satisfies Condition)
            { ndels++; delete_record(buf,i); }
    }
    if (ndels > 0) put_page(rel, p, buf);
    if (ndels > 0 && unique) break;
}

COMP9315 21T1 ♢ Heap Files ♢ [8/13]
❖ Deletion in Heaps (cont)

PostgreSQL tuple deletion:

heap_delete(Relation relation,    // relation desc
            ItemPointer tid, ..., // tupleID
            CommandId cid, ...)   // SQL statement


Vacuuming eventually compacts space in each page.
COMP9315 21T1 ♢ Heap Files ♢ [9/13]
❖ Updates in Heaps

SQL:  update R  set F = val   where Condition

Analysis for updates is similar to that for deletion

Costupdate  =  br + bqw

Complication: new tuple larger than old version  (too big for page)

Solution:   delete, re-organise free space, then insert

COMP9315 21T1 ♢ Heap Files ♢ [10/13]
❖ Updates in Heaps (cont)

PostgreSQL tuple update:

heap_update(Relation relation,     // relation desc
            ItemPointer otid,      // old tupleID
            HeapTuple newtup, ..., // new tuple data
            CommandId cid, ...)    // SQL statement

COMP9315 21T1 ♢ Heap Files ♢ [11/13]
❖ Heaps in PostgreSQL

PostgreSQL stores all table data in heap files (by default).

Typically there are also associated index files.

If a file is more useful in some other form:

Heap file implementation:   src/backend/access/heap
COMP9315 21T1 ♢ Heap Files ♢ [12/13]
❖ Heaps in PostgreSQL (cont)

PostgreSQL "heap file" may use multiple physical files

COMP9315 21T1 ♢ Heap Files ♢ [13/13]


Produced: 7 Mar 2021