Improving the R*-tree with Outlier Handling Techniques
Tian Xia and Donghui Zhang
Abstract
The R*-tree, as a state-of-the-art spatial index, has already found
its way into commercial systems like Oracle. In this paper, we aim at
improving query performance of the R*-tree. We focus on five widely
used spatial queries: range query, aggregation query, nearest neighbor
query, skyline query, and join query. The idea is to store
outlier objects in internal tree nodes. The new structure is
named the RO-tree. Here an outlier is an object
which is located far from other objects or has large extent (we
consider both point objects and objects with extent). If such objects
are stored at higher levels of the tree, the lower-level nodes have
smaller minimum bounding rectangles and thus the index performs
better. To support the dynamic nature of the index,
several structural and algorithmic changes are needed.
The paper discusses these changes. In particular, we show
how to identify and handle the outlier objects during page
overflow/underflow, using gain/loss metrics. Extensive experiments
reveal that the RO-tree significantly outperforms
the R*-tree for all the five queries.