The Problem (Re-stated)
Inputs
- a database of N objects
(obj1, obj2, ..., objN)
- representations of objects as d-dimensional vectors
(vobj1, vobj2, ..., vobjN)
(each vector may represent one or more features of the object)
- a query object q with associated d-dimensional vector (vq)
- a distance measure D(vi,vj) : [0..1)
(D=0 -> vi=vj)
Outputs
- a list of the k nearest objects in the database
[a1, a2, ... ak]
- ordered by distance to query object
D(va1,vq)
D(va2,vq) ...
D(vak,vq)
And all done as efficiently as possible ...
... where time is the most important efficiency factor
|