
Experiments

Measures for accuracy:

  • Acc1   average number of exact top-10 entries that appear in the CurveIx top-10

  • Acc2   how frequently CurveIx returns all 10 of the exact top-10
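
The two accuracy measures can be sketched in a few lines. This is only an illustration, assuming each answer set is a list of object identifiers ordered by rank; the function name `accuracy_measures` and its arguments are hypothetical, not taken from the CurveIx code.

```python
def accuracy_measures(exact_sets, approx_sets, k=10):
    """Return (Acc1, Acc2) over a set of queries.

    exact_sets  -- per-query exact top-k ids (e.g. from a linear scan)
    approx_sets -- per-query top-k ids returned by the index
    Both names are assumptions made for this sketch.
    """
    # Number of exact top-k entries recovered by the index, per query.
    overlaps = [
        len(set(exact[:k]) & set(approx[:k]))
        for exact, approx in zip(exact_sets, approx_sets)
    ]
    # Acc1: average number of exact entries found in the approximate top-k.
    acc1 = sum(overlaps) / len(overlaps)
    # Acc2: fraction of queries where all k exact entries were found.
    acc2 = sum(1 for o in overlaps if o == k) / len(overlaps)
    return acc1, acc2
```

For two queries where the index recovers 10 and 8 of the exact entries, this gives Acc1 = 9.0 and Acc2 = 0.5.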

Measures for efficiency:

  • Size   size of Db file + Index file

  • Dist   number of distance calculations required

  • IO   total amount of I/O performed

To determine how these measures vary:

  • built databases of size 5K, 10K, 15K, 20K   (each a superset of the previous)

  • for each database, ran a "benchmark" set of 25 queries

  • for each query, ran with 3, 5, 10, 20, 30, 40 curve-neighbours
    (but because of a curve-mapping problem, only got results for 20, 30, 40)

  • for each query, ran with 20, 40, 60, 80, 100 curves
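
One way to enumerate these runs is sketched below. It assumes the settings were fully crossed (every database size with every curve-neighbour count and every curve count), which the list above does not state explicitly; the names `DB_SIZES`, `NEIGHBOURS`, `CURVES` and `experiment_grid` are illustrative only.

```python
from itertools import product

# Parameter values listed above; the full cross-product is an assumption.
DB_SIZES = [5_000, 10_000, 15_000, 20_000]
NEIGHBOURS = [3, 5, 10, 20, 30, 40]
CURVES = [20, 40, 60, 80, 100]
N_QUERIES = 25  # size of the benchmark query set

def experiment_grid(db_sizes=DB_SIZES, neighbours=NEIGHBOURS, curves=CURVES):
    """Return every (db_size, curve_neighbours, curves) configuration."""
    return list(product(db_sizes, neighbours, curves))

grid = experiment_grid()
# Each configuration is run once per benchmark query.
total_runs = len(grid) * N_QUERIES
```

Under that assumption there are 120 configurations per query and 3000 runs in total; restricting `neighbours` to the values that produced results (20, 30, 40) halves both figures.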

Also implemented a linear scan version for comparison and to collect the exact answer sets.
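
A minimal sketch of such a linear scan follows. It assumes feature vectors are plain numeric sequences compared by Euclidean distance; the distance function actually used is not stated here, and the name `linear_scan_top_k` is hypothetical.

```python
import heapq
import math

def linear_scan_top_k(db, query, k=10):
    """Return (ids of the k nearest vectors, number of distance calculations).

    db    -- sequence of feature vectors; the id of a vector is its position
    query -- feature vector of the same dimensionality
    Euclidean distance is an assumption made for this sketch.
    """
    # One distance calculation per database entry, so Dist == len(db).
    scored = ((math.dist(vec, query), idx) for idx, vec in enumerate(db))
    nearest = heapq.nsmallest(k, scored)
    return [idx for _, idx in nearest], len(db)
```

Because every entry is examined, the result is the exact answer set against which the index output can be scored, and the distance count gives the baseline for the Dist measure.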

