A Novel Improvement to the R*-tree Spatial Index using Gain/Loss Metrics
Donghui Zhang and Tian Xia
Abstract
The R*-tree is a state-of-the-art spatial index structure. It has
already found its way into commercial systems. The most important
improvement of the R*-tree over the original R-tree is that it
utilizes forced reinsertion. That is, if a disk page overflows, some
objects are removed from the page and reinserted into the index. The
goals are: (a) to reduce the MBR area; and (b) to keep the shape of
the MBR close to a square. However, no existing work consists of a
unified metric which can be used to balance the two criteria. For
example, if there are two methods to remove objects from a rectangle,
and one results in a rectangle with smaller area, while the other
results in a square with slightly larger area, which method shall we
choose? The R*-tree algorithm selects objects whose distances to the
center of the page's MBR are the largest. However, this is not
optimal. In this paper, we formally define the quality of a
rectangle and the gain to shrink a rectangle. Then we provide
algorithms to shrink the MBRs with the goal to maximize the gain. The
algorithms are experimentally compared with the R*-tree's reinsertion
algorithm. Furthermore, as the opposite of gain, we define
the loss of expanding a rectangle. While inserting an object
into the R*-tree, we need to choose a sub-tree to put the object
in. With the new metric, we can choose the sub-tree with the least
loss. Finally, we integrate the new algorithms into the R*-tree.