Reachability indexes for relational keyword search

Abstract

Due to its considerable ease of use, relational keyword search (R-KWS) has become increasingly popular. Its simplicity, however, comes at the cost of intensive query processing. Specifically, R-KWS explores a vast search space comprising all possible combinations of keyword occurrences in any attribute of every table. Existing systems follow two general methodologies for query processing: (i) graph-based, which traverses a materialized data graph, and (ii) operator-based, which executes relational operator trees on an underlying DBMS. In both cases, much of the computation is wasted on graph traversals or operator tree executions that fail to return results. Motivated by this observation, we introduce a comprehensive framework for reachability indexing that eliminates such fruitless operations. We describe a range of indexes that capture various types of join reachability. Extensive experiments demonstrate that the proposed techniques significantly improve performance, often by several orders of magnitude.
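
To make the idea of pruning fruitless work concrete, here is a minimal sketch of one possible reachability check. It is an illustration under stated assumptions, not one of the indexes proposed in the paper. The tuple identifiers, the `max_hops` bound, and the functions `build_keyword_reachability` and `can_answer` are all hypothetical. The sketch treats the database as an undirected graph whose edges are key/foreign-key joins, and precomputes, for each tuple, the set of keywords found within `max_hops` joins.

```python
from collections import deque


def build_keyword_reachability(edges, keywords_of, max_hops):
    """For each tuple, record the keywords found within max_hops join edges.

    edges: iterable of (tuple_id, tuple_id) pairs, one per join edge (assumed).
    keywords_of: dict mapping every tuple_id to the set of keywords it contains.
    """
    # Build an undirected adjacency list over all tuples.
    adj = {t: set() for t in keywords_of}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    index = {}
    for start in adj:
        # Bounded breadth-first search from each tuple.
        seen = {start}
        frontier = deque([(start, 0)])
        reachable = set(keywords_of[start])
        while frontier:
            node, dist = frontier.popleft()
            if dist == max_hops:
                continue  # do not expand beyond the hop bound
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    reachable |= keywords_of[nxt]
                    frontier.append((nxt, dist + 1))
        index[start] = reachable
    return index


def can_answer(index, start_tuple, query_keywords):
    """True if every query keyword lies within the hop bound of start_tuple."""
    return set(query_keywords) <= index[start_tuple]


# Hypothetical toy data: two authors, each linked to one paper via a join tuple.
keywords_of = {
    "author:1": {"smith"},
    "writes:1": set(),
    "paper:1": {"indexing"},
    "author:2": {"jones"},
    "writes:2": set(),
    "paper:2": {"keyword"},
}
edges = [
    ("author:1", "writes:1"),
    ("writes:1", "paper:1"),
    ("author:2", "writes:2"),
    ("writes:2", "paper:2"),
]
index = build_keyword_reachability(edges, keywords_of, max_hops=2)
```

With this index, a search starting at `author:1` for the keywords `smith` and `keyword` can be skipped without any traversal or operator tree execution, because `can_answer` returns `False`. A search for `smith` and `indexing` passes the check and would go on to be explored.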
