Spatial data indexing

Spatial Data Indexing using R-tree and Quad-tree for Mobile Geographical Information System (GIS) Technology

Abstract

The latest Mobile GIS technology is possible to manage spatial components of daily business project in corporate database and to apply proper geographical analysis efficiently in a wireless focused application. Basically, we use wireless internet technology for transfer process in spatial data from server to client or vice versa. However, the problem in Wireless Internet is system bottlenecks that can make the process of transferring data inefficient due to large amount of spatial data. Thus, optimization in the process of transferring and retrieving data is an essential issue that must be considered. The use of R-tree and Quad-tree spatial data indexing method can optimize the process. With the rapid proliferation of these databases in the past decade, extensive research has been conducted on the design of efficient data structures to enable fast spatial searching. Commercial database vendors like Oracle have also started implementing these spatial indexing to cater the large and diverse GIS.

Introduction

Mobile GIS can be described as an extension of a Geographical Information System (GIS). Previously, GIS can only run in an office with a desktop GIS. By using mobile GIS, user can retrieve, transfer, update, manipulate, analyze and display geographic information anywhere at any time. The standard technology integrated in Mobile GIS application are wireless network for internet transfer and data access, mobile device to run GIS application everywhere and Global Positioning System (GPS) to detect the location.

Rapid improvement in the Mobile GIS technology can solve mobile application device problems such as its small bandwidth, limitation of application capability, color resolution and small screen display (Vckovski, 1999). Recent developments of internet and Mobile GIS technology enable process of spatial data transferring, collection, processing and dissemination with large amount of geographical data (Kraak, 2002).

Indexing is one of database optimization processes which can be created using one or more database table columns to provide the foundation of rapid searching and efficient access of ordered records. Spatial indexing has a great methodology for managing records and it is identified based on its organization with a place. Some of the records are strongly connected to a place. Like other structures of indexing, geographical indexing may be merged with other indices. The difference is that spatial index has particular access process to retrieve spatial data from within the data-store and to optimize spatial queries by spatial databases.

T

his research attempts to provide a suitable spatial data indexing method using R-tree and Quad-tree which can reduce time of spatial data processing in Mobile GIS technology.

Related work

Mobile GIS

Geographic Information System (GIS) can be viewed as an information system that is used to retrieve, manipulate, store and analyze geographic information for making decision in the field.

GIS should be observed as a technology, not simply as a computer system. The mandatory aspect for this technology is the level of integration of these tools to supply smoothing process and functions of geographic data environment (Peng and Tsou, 2003).

Over few years, the development of geo-information technology that started in 1950s has been moved to the Mobile GIS where the maps possible to be displayed and processed on small mobile devices like PDA and Mobile Phones. Initially geo-information technology only focused on mainframe computers, then it is going to change to desktop computer GIS through to local networking GIS and to the Web GIS (Rajinder, 2004). GIS technology can be applied to many applications such as in the area of petroleum, environmental management, resource management, asset management, urban planning, tourism planning, marketing, health, cartography and many other widespread disciplines. The new GIS technology which is based on mobile GIS is expected to improve GIS technology to be more productive and efficient.

Spatial Data Indexing

Spatial data indexing is based on the physical grouping of items that are indexed. With spatial indexing, it is even possible to provide VR (virtual reality) fly through (or fly over, or fly under) to help users maintain the context of their search as the users expand or narrow the scope of their search. Spatial indexing has a richness that parallels indexing by subject or author. The idea of spatial indexing is very powerful, that for records management, records are found based on their association with a place. Many records are tightly coupled to a place. Like other forms of indexing, geographic indexing can be combined with other indices. There are many Indexing methods used in spatial data indexing technique such as Kd-tree, Z-order, UB-tree, Quad-tree, Oc-tree, Grid, and R-tree. However, we will discuss only two techniques used in this research which are R-tree and Quad-tree.

R-tree and Quad-tree

R-tree and Quad-tree indexes that use extensive framework are the best spatial data indexing methods among any other existing spatial indexing methods for low-dimensional spatial data. For queries processing, R-tree approach may be more efficient due to better maintenance of spatial immediacy, but may be slow in updating or index creating and implementation of own concurrency protocols on top of table-level concurrency mechanisms, since R-tree is built logically as a tree and physically using tables inside the database and search involves recursive SQL for traversing tree from root to relevant leaves. Meanwhile, Quad- tree results in simpler index creation, faster update and inherite configuration in B-tree concurrency control protocols, since those indexes compute tile approximations for geometries and use existing B-tree indexes for performing spatial search and other DML operations (Kothuri et al, 2002).

1.4 Issues and problems in Mobile GIS

Mobile GIS needs wireless technology, but a problem comes from system bottleneck. According to Chen and Shi research, the main problem in system bottleneck is because of database side. Database area can be grouped into two branches, i.e database designs and access database. Due to the majority latest issues and problems come from design database, our research tries to focus on this area, by choosing R-tree and Quad tree spatial data indexing method, since both are the best spatial data indexing method for mobile GIS technology. Secondly, we find the nearest research that delivered similar methodology which brings new improvement for our research.

The database is an essential component in GIS and without doubt, a poor access is a burden to performance. Concentrating on optimizing the database will give better GIS application performance. Since the majority of the latest issues and problem come from design database area, focusing on design database area that can be solved using spatial data indexing method is an essential matter. Spatial index is very important in GIS because it is used by spatial database to optimize spatial queries that can fasten transferring process and spatial data retrieval through Mobile GIS network.

R-tree and Quad-tree

R-tree

R-trees are tree data structures that derived from B-trees. It is used for representing multidimensional point data, but is used for spatial process methods for indexing multidimensional information. Each node in the tree communicates to the smallest d-dimensional rectangle that surrounds its child nodes. The leaf nodes include cursor to the real geometric entity in the database, as an alternative of child. The entity is represented by the smallest associated rectangle in which they are enclosed [7].

There are many possible approaches to divide a node. The first goal is to allocate the records among the nodes so that probability that the nodes will be visited to subsequent searches will be reduced. The total areas of the wrapping rectangles for the nodes should be minimized to achieve this goal. Another goal is to decrease the probability that both nodes are observe in successive searches. The area common to both nodes should be minimized to achieve this goal. This following figure is to describe R-tree collection of rectangles and the spatial extension of the enveloping rectangles.

Many researchers explored about R-tree spatial indexing, because R-tree is one of the best spatial indexing and recommended in Oracle spatial database. A new spatial cluster grouping algorithm and R-tree insertion algorithm has been proposed. It introduces the k-means clustering method and employs the 3D overlap volume, 3D coverage volume and the minimum bounding box shape value of nodes as the integrative grouping criteria [8]. A scalable technique called Seeded Clustering that allows us to maintain R-tree indices by bulk insertion while keeping pace with high data arrival rates has also been proposed [9]. A bottom-up update strategy for R-trees that generalizes existing update techniques and aims to improve update performance has also been proposed [10]. A novel bulk insertion technique for R-Trees using Oracle 10g that is fast and does not compromise on the quality of the resulting already presented [11]. A generalization for the family of R-trees, called the Multi-scale R-tree, that allows efficient retrieval of geometric objects at different levels of detail has also been proposed [12]. The real-time mobile GIS based on the HBR-tree to manage mass of location data efficiently have been explored [13]. A Technique combining existing Q + R-tree and Quad Tree in terms of range query execution time by a high order of magnitude has also been proposed [14]. An efficient protocol for the kNN search on a broadcast R-tree, which is a popular multi-dimensional index tree, in a wireless broadcast environment in terms of latency and tuning time as well as memory usage has also been proposed [15].

Q-tree

A quad tree is a tree data structure in which each internal node has up to four children. Quad trees are most often used to divide two dimensional spaces by recursively subdividing it into four quadrants or regions. The regions may be square or rectangular, or may have arbitrary shapes. Quad trees have two features which are: (1) they decompose space into adaptable cells and (2) each cell (or bucket) has a maximum capacity. When maximum capacity is reached, the bucket splits, and the tree directory follows the spatial decomposition of the Quad tree [16]. To cover the geometry, Quad tree spatial data indexing uses fixed-size tiles controlled by tile resolution. Each tile has a fixed-size and shape, because the coordinate space has been decomposed a specific number of times. Then the tessellation will be terminated. However all of this process could happen if the resolution is the sole controlling factor [17]. As we can see in Fig. 3, the tiles can be sorted linearly and systematically at the exact level. Then it will visit the tiles which determined by a space-filling curve. The unique numeric identifiers are also assigned to the tiles.

Many researchers had explored about quad tree spatial indexing, even though quad tree is not recommended to use for particular spatial indexing compare to R-tree spatial data indexing. However, there are many advantages using quad tree spatial indexing in the special situation. An algorithm based on applying Eigen space method has been presented to a quad-tree of related set images to solve the pose estimation problem in the presence of occlusion and/or background clutter. The inability to easily locate the desired object and apply the appropriate normalizations, are efficiently overcome by the recursive quad tree procedure [18]. Fluid flow solver based on unstructured quad tree / octree has been explored also. This indexing type Eulerian grids is coupled with a spectral Finite Element (p-FEM) structural mechanics solver based on a Lagrangian description can predict bidirectional fluid-structure interaction (FSI) [19]. The new structure of Multiversion Linear Quadtree (MVLQ) has been present based on spatio-temporal access method. This indexing structure can be used as an index mechanism for storing and accessing evolving raster images [20]. A novel fractal image coding based on Quadtree partition of the adaptive threshold value has been proposed to show better performance including the improved quality of the decoded image, shorter compression time and higher compression ratio with another spatial indexing structure [21]. An alternative non-pointer quad tree node codification to manage geographical spatial data also has been presented. In this structure, new codification is based on a variable sequence of z-ordered base four digits [22]. A new quad tree-based decomposition of a polygon possibly with holes also has been presented. An approximate this indexing structure is good for ray shooting in the average case as defined by Aronov and Fortune [23]. MPI collective algorithm selection process and explore the applicability of the quad tree encoding method has been proposed. This method constructs quad trees with different properties from the measured algorithm performance data and analyzes the quality and performance of decision functions generated from these trees. Selecting the close-to-optimal collective algorithm based on the parameters of the collective call at run time is an important step in achieving good performance of MPI applications [24]. The matrix quad tree also has been presented based on the weighted in the form of modification. The technique of distance transform is extended to the weighted regions and applied to the robot path planning [25]. A new data dissemination protocol that exploits "Quadtree-based network space partitioning" to provide more efficient routing among multiple mobile stimuli and sink nodes also has been proposed. Because of a common hierarchy of cluster-head nodes is constructed where the data delivery to mobile sinks is independent of the current position of mobile stimuli [26]. A 3D mesh is generated by using the original quadtree triangulation also algorithm has been proposed with a nontrivial algorithm for quadtree triangulation. The aim of the study was to increase the accuracy of a terrain triangulation while maintaining or reducing the number of triangles [27]. A new data dissemination protocol that exploits "Quadtreebased network space partitioning" to provide more efficient routing among multiple mobile stimuli and sink nodes has been proposed. Simulation results show that in that work significantly reduces average energy consumption while maintaining comparably higher data delivery ratio [28]. A novel computational paradigm for detection of blood vessels in fundus images based on RGB components and quad tree decomposition also has been proposed. The proposed algorithm employs median filtering, quad tree decomposition, post filtration of detected edges, and morphological reconstruction on retinal images [29]. A distributed quad tree index that adapts the MX-CIF quad tree was described that enables more powerful accesses to data in P2P networks. This index has been implemented for various prototype P2P applications [30]. The coordinate space (for the layer where all geometric objects are located) is issued to a procedure called tessellation in the linear quad tree indexing scheme of oracle spatial 10g, which delineates elite and extensive cover tiles for all of pile up geometry. Tessellation is done by crumbling the coordinate space in a standard hierarchical mode. The range of coordinates, the coordinate space, is viewed as a rectangle. At the first level of decomposition, the rectangle is divided into halves along each coordinate dimension, producing four tiles. Each tile that cooperates with the geometry being tessellated is further decomposed into four tiles. This procedure continues until some termination criteria, such as size of the tiles or the maximum number of tiles to cover the geometry is met. The results of the tessellation process on geometry are stored in a table, referred to as the SDOINDEX table [17]. Spatial quadtree indexing utilizes fixed-size tiles to cover geometry. Fixed-size tiles are managed by tile resolution. If the resolution is the single controlling reason, then tessellation terminates when the coordinate space has been crumbled an exact number of times. Consequently, each tile is of a fixed amount and form. Fixed-size tile resolution is managed by a user-selectable keyword named sdo_level. Smaller fixed-size tiles give better geometry approximations.

QuadR-tree Spatial Data Indexing

QuadRtree is the hybrid of Quad tree and R-tree. This selection engine is used in this research to combine those spatial data indexing methods. This research suggests that engine is needed to improve Mobile GIS performance. Some tables use Quad tree spatial data indexing method, while the others use R-tree spatial data indexing method. One or two spatial data indexing method will be applied in a single spatial database. The use of R-tree and Quad tree spatial data indexing method in different tables is based on the condition of data and requirement of applications which will increase the speed of spatial data processing.

This research explores the knowledged base of R-tree and Quad tree spatial data indexing method. Each of those methods has different advantages and disadvantages based on application requirements and type of data. Quad-tree and R-tree are used together for data indexing in a single spatial database system then the database can be optimized with appropriate spatial data indexing method and can be contributed to improve spatial data transferring speed. Thus, the process of transferring and retrieving spatial data using wireless technology will be more efficient and effective.

Result

The model used for testing and evaluation in this research has been simplified with the following assumption:

  1. Time is one-dimensional and linearly ordered;
  2. Oracle 10g spatial database are used to develop the GIS model;
  3. Connection to the server and data retrieval via an active TCP/IP connection using Hyper Text Transfer Protocol.

The average response time from the mobile client was measured as the time spent (sec) from the moment the query is issued to the moment the results of the query are generated.

R-tree Spatial Data Indexing

The result for R-tree spatial index is depicted in Fig. 4, which is plotted as a two-dimensional graph to illustrate the results of average response time in Mobile GIS application. The line with blue color corresponds to the time measurement from the data that use an R-tree Spatial Indexing, while the line with red color represents the time measurement from the data that doesn't use Spatial Indexing. We observe that the time saved increases as the database size expands. For the number of records, over 30,000, the time required in the data that use an R-tree Spatial Indexing is almost two times more than the data that doesn't use Spatial Indexing. It is noted, however, that this is related to the actual time of the query, since the database volume is varied as time changes. The response time is slightly different to the result shown in Fig. 4. Nevertheless, we can easily discover that the time response on the data that use an R-tree Spatial Indexing is faster than the data that doesn't use Spatial Indexing.

Quad-tree Spatial Data Indexing

The result for Quad tree spatial index is depicted in Fig. 5, which is plotted the results of average response time in Mobile GIS application. The line with blue color corresponds to the time measurement from the data that use a quadtree Spatial Indexing, while the line with red color represents the time measurement from the data that doesn't use Spatial Indexing. We observe that the time saved increases as the database size expands. For the number of records, over 30,000, the time required in the data that use a quadtree Spatial Indexing is almost two times more than the data that doesn't use Spatial Indexing. It is noted, however, that this is related to the actual time of the query, since the database volume is varied as time changes. The response time is slightly different to the result shown in figure (Fig. 4). Nevertheless, we can easily discover that the time response on the data that use a quadtree Spatial Indexing is faster than the data that doesn't use Spatial Indexing.

Please be aware that the free essay that you were just reading was not written by us. This essay, and all of the others available to view on the website, were provided to us by students in exchange for services that we offer. This relationship helps our students to get an even better deal while also contributing to the biggest free essay resource in the UK!