❖ Abstract Data Types |
A data type is ...
FILE *
❖ DTs, ADTs, GADTs |
We want to distinguish ...
DT = (non-abstract) data type (e.g. C strings)
char s[10];
Set a, b, c;
Set<int> a, b, c;
Set<int> a; Set<char> b;
❖ Interface/Implementation |
ADT interface provides
FILE*
❖ Collections |
Many of the ADTs we deal with ...
❖ ... Collections |
Typical operations on collections
❖ Example: Set ADT |
Set data type: collection of unique integer values.
"Book-keeping" operations:
Set newSet()
void dropSet(Set)
void showSet(Set)
{1,2,3...}
void readSet(FILE*,Set)
Set SetCopy(Set)
❖ ... Example: Set ADT |
Data-type operations:
void SetInsert(Set,int)
void SetDelete(Set,int)
int SetMember(Set,int)
Set SetUnion(Set,Set)
Set SetIntersect(Set,Set)
int SetCard(Set)
Set
❖ Set ADT Interface |
// Set.h ... interface to Set ADT #ifndef SET_H #define SET_H #include <stdio.h> #include <stdbool.h> typedef struct SetRep *Set; Set newSet(); // create new empty set void dropSet(Set); // free memory used by set Set SetCopy(Set); // make a copy of a set void SetInsert(Set,int); // add value into set void SetDelete(Set,int); // remove value from set bool SetMember(Set,int); // set membership Set SetUnion(Set,Set); // union Set SetIntersect(Set,Set); // intersection int SetCard(Set); // cardinality void showSet(Set); // display set on stdout void readSet(FILE *, Set); // read+insert set values #endif
❖ ... Set ADT Interface |
Example set client: set of small odd numbers
#include "Set.h" ... Set s = newSet(); for (int i = 1; i < 26; i += 2) SetInsert(s,i); showSet(s); putchar('\n');
Outputs:
{1,3,5,7,9,11,13,15,17,19,21,23,25}
❖ Set Applications |
Example: eliminating duplicates
#include "Set.h"
...
// scan a list of items in a file
int item;
Set seenItems = newSet();
FILE *in = fopen(FileName,"r");
while (fscanf(in, "%d", &item) == 1) {
if (!SetMember(seenItems, item)) {
SetInsert(seenItems, item);
process item;
}
}
fclose(in);
❖ Set ADT Pre/Post-conditions |
Each Set
Express these semantics in detail via statements of:
assert()
But only during the development/testing phase
assert()
At the very least, implement as comments at start of functions.
❖ ... Set ADT Pre/Post-conditions |
If x
T
T
x
*x
*x
T
❖ ... Set ADT Pre/Post-conditions |
Examples of defining pre-/post-conditions:
// pre: true // post: valid(Set,res) and res = {} Set newSet() { ... } // pre: valid(Set,s) and valid(int n) // post: n ∈ s' void SetInsert(Set s, int n) { ... } // pre: valid(Set,s1) and valid(Set,s2) // post: ∀ n ∈ res, n ∈ s1 or n ∈ s2 Set SetUnion(Set s1, Set s2) { ... } // pre: valid(Set,s) // post: res = |s| int SetCard(Set s) { ... }
❖ ... Sets as Unsorted Arrays |
Concrete data structure (in C):
#define MAXELEMS 1000
// concrete data structure
struct SetRep {
int elems[MAXELEMS];
int nelems;
};
Need to set upper bound on number of elements
Could do statically (as above) or dynamically
Set newSet(int maxElems) { ... }
❖ ... Sets as Unsorted Arrays |
Set creation:
// create new empty set Set newSet() { Set s = malloc(sizeof(struct SetRep)); if (s == NULL) { fprintf(stderr, "Insufficient memory\n"); exit(EXIT_FAILURE); } s->nelems = 0; // assert(isValid(s)); return s; }
❖ ... Sets as Unsorted Arrays |
Checking membership:
// set membership test int SetMember(Set s, int n) { // assert(isValid(s)); int i; for (i = 0; i < s->nelems; i++) if (s->elems[i] == n) return TRUE; return FALSE; }
❖ ... Sets as Unsorted Arrays |
Costs for set operations on unsorted array:
❖ Sets as Sorted Arrays |
Same data structure as for unsorted array.
Differences in
❖ ... Sets as Sorted Arrays |
Costs for set operations on sorted array:
❖ ... Sets as Linked Lists |
Concrete data structure (in C):
typedef struct Node { int value; struct Node *next; } Node; struct SetRep { Node *elems; // pointer to first node Node *last; // pointer to last node int nelems; // number of nodes };
❖ ... Sets as Linked Lists |
Set creation:
// create new empty set
Set newSet()
{
Set s = malloc(sizeof(struct SetRep));
if (s == NULL) {...}
s->nelems = 0;
s->elems = s->last = NULL;
return s;
}
❖ ... Sets as Linked Lists |
Checking membership:
// set membership test int SetMember(Set s, int n) { // assert(isValid(s)); Node *cur = s->elems; while (cur != NULL) { if (cur->value == n) return true; cur = cur->next; } return false; }
❖ ... Sets as Linked Lists |
Costs for set operations on linked list:
If we don't have nelems
❖ Sets as Bit-strings |
Set is a very long bit-string, typically an array of words
Restrict possible values that can be stored in the Set
bit|1
bit&0
i
❖ ... Sets as Bit-strings |
Concrete data structure (in C):
#define NBITS 1024
#define NWORDS (NBITS/32)
typedef unsigned int Word;
typedef Word Bits[NWORDS];
struct SetRep {
int nelems;
Bits bits; // Word bits[NWORDS]
};
Set
❖ ... Sets as Bit-strings |
Implementation as bit-strings requires extra functions:
getBit(Bits b, int i)
setBit(Bits b, int i)
unsetBit(Bits b, int i)
getBit(Bits b, int i) { int whichWord = i / 32; int whichBit = i % 32; Word mask = (1 << whichBit) return (b[whichWord] & mask) >> whichBit; }
❖ ... Setting and unsetting bits |
Powers of two by bit-shifting - don't use pow(...)
math.h
❖ Performance of Set Implementations |
Performance comparison:
Data Structure | insert | delete | member | ∪, ∩ | storage |
unsorted array | O(n) | O(n) | O(n) | O(n.m) | O(N) |
sorted array | O(n) | O(n) | O(log2n) | O(n+m) | O(N) |
unsorted linked list | O(n) | O(n) | O(n) | O(n.m) | O(n) |
sorted linked list | O(n) | O(n) | O(n) | O(n+m) | O(n) |
bit-maps | O(1) | O(1) | O(1) | O(N) | O(N) |
n,m = #elems, N = max #elems,
Produced: 9 Feb 2021