Query Size Estimation using Machine Learning (i)

Banchong Harangsri, John Shepherd, Anne H.H. Ngu

Proceedings of the International Conference on Artificial Intelligence (TAAI-96),
National Sun Yat-Sen University, Kaohsiung, Taiwan, R.O.C., December 1996.

(Compressed Postscript ... 110KB)


We propose two novel notions in this paper: the first is that machine learning techniques can be used to solve the problem of query size estimation and the second is a new generic algorithm to correct the training set of queries in response to updates. The main advantage for machine learning is that no database scan is required to collect statistics for query size estimation. The training set correction algorithm is useful in that it allows us to "re-vitalise" some existing query size estimation methods whose performance previously deteriorated in the presence of high update loads. A by-product of this is that the length of training sets can be fixed -- the size of the training set determines the level of error in query estimation. Our experimental results show that (1) the machine learning technique is superior to a recent curve fitting method in approximating query result sizes and (2) the machine learning technique still performs as well after the correction algorithm is applied.

Keys: Databases, Query size estimation, Query optimisation, Machine learning


Recent Publications ... John Shepherd ... CSE ... UNSW