
MTCache is a prototype mid-tier database caching solution for SQL Server that transparently offloads part of the query workload from a backend server to front-end servers. The goal is to improve system throughput and scalability without requiring application changes. This paper outlines the architecture of MTCache and highlights several of its key features: modeling of data as materialized views, integration with query optimization, and support for queries with explicit data freshness and consistency requirements.

...les are possible, for example, range control tables and a PMV may have several control tables. Further details will be provided in an upcoming paper. 6 Related Work Our approach is similar to DBCache [8, 1, 2, 3] in the sense that both systems transparently offload some queries to front-end servers, forward all updates to the backend server and rely on replication to propagate updates. DBCache was origina...

Many applications today are designed for a multi-tier environment typically consisting of browser-based clients, application servers and a backend database server. Application servers do not maintain persistent state and typically run on fairly inexpensive

...erver that achieves these goals. It builds on SQL Server’s support for materialized views, distributed queries and transactional replication. In this sense, it resembles the approach taken by DBCache [1][3] but MTCache allows caching not only of complete tables but also horizontal and vertical subsets of tables and materialized views. DBCache always uses the cached version of a table when it is refer...

Scientific database federations are geographically distributed and network bound. Thus, they could benefit from proxy caching. However, existing caching techniques are not suitable for their workloads, which compare and join large data sets. Existing techniques reduce parallelism by conducting distributed queries in a single cache and lose the data reduction benefits of performing selections at each database. We develop the bypass-yield formulation of caching, which reduces network traffic in wide-area database federations, while preserving parallelism and data reduction. Bypass-yield caching is altruistic; caches minimize the overall network traffic generated by the federation, rather than focusing on local performance. We present an adaptive, workload-driven algorithm for managing a bypass-yield cache. We also develop on-line algorithms that make no assumptions about workload: a k-competitive deterministic algorithm and a randomized algorithm with minimal space complexity. We verify the efficacy of bypass-yield caching by running workload traces collected from the Sloan Digital Sky Survey through a prototype implementation.

...o not directly apply to our objective of minimizing network cost. We should also mention that there have been several initiatives in the industry that develop caching systems using relational objects [1, 40], and overlapping dynamic materialized views as objects [2]. Their cache policies may be considered simple compared to hierarchical and widely distributed systems like Squid [42], but they do underline...

by
Niraj Tolia, M. Satyanarayanan, Adam Wolbach
- In Proceedings of the 5th International Conference on Mobile Systems, Applications and Services, 2007

We report on the design, implementation, and evaluation of a system called Cedar that enables mobile database access with good performance over low-bandwidth networks. This is accomplished without degrading consistency. Cedar exploits the disk storage and processing power of a mobile client to compensate for weak connectivity. Its central organizing principle is that even a stale client replica can be used to reduce data transmission volume from a database server. The reduction is achieved by using content addressable storage to discover and elide commonality between client and server results. This organizing principle allows Cedar to use an optimistic approach to solving the difficult problem of database replica control. For laptop-class clients, our experiments show that Cedar improves the throughput of read-write workloads by 39% to as much as 224% while reducing response time by 28% to as much as 79%.

...t are later committed to the master copy. Like Cedar, a number of systems have advocated middle-tier database caching where parts of the database are replicated at the edge for web-based applications [3, 4, 5, 36]. These systems, based on an avoidance-based approach, require tight integration with the database to ensure timely propagation of updates and are usually targeted towards workloads that do not requir...

With the growing use of dynamic web content generated from relational databases, traditional caching solutions for throughput and latency improvements are ineffective. We describe a middleware layer called Ganesh that reduces the volume of data transmitted without semantic interpretation of queries or results. It achieves this reduction through the use of cryptographic hashing to detect similarities with previous results. These benefits do not require any compromise of the strict consistency semantics provided by the back-end database. Further, Ganesh does not require modifications to applications, web servers, or database servers, and works with closed-source applications and databases. Using two benchmarks representative of dynamic web sites, measurements of our prototype show that it can increase end-to-end throughput by as much as twofold for non-data intensive applications and by as much as tenfold for data intensive ones.
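The hashing idea in this abstract can be illustrated with a minimal Python sketch. It is not Ganesh's actual protocol: the row-level granularity, the use of SHA-256, and the names `digest`, `encode_result`, and `decode_result` are all assumptions made for illustration. The sender replaces any row the receiver is known to hold with its digest, and the receiver rebuilds the full result from its local cache.

```python
import hashlib


def digest(row):
    """Return the SHA-256 hex digest of one result row (a tuple of values)."""
    return hashlib.sha256(repr(row).encode("utf-8")).hexdigest()


def encode_result(rows, receiver_known):
    """Sender side (hypothetical): send a digest in place of any row whose
    digest is in receiver_known, and record newly sent rows as known."""
    encoded = []
    for row in rows:
        h = digest(row)
        if h in receiver_known:
            encoded.append(("hash", h))   # receiver already holds this row
        else:
            encoded.append(("row", row))  # first time: ship the full row
            receiver_known.add(h)
    return encoded


def decode_result(encoded, cache):
    """Receiver side (hypothetical): rebuild the full result, looking up
    digests in the local cache and caching any newly received rows."""
    rows = []
    for kind, payload in encoded:
        if kind == "hash":
            rows.append(cache[payload])
        else:
            cache[digest(payload)] = payload
            rows.append(payload)
    return rows
```

In this sketch the sender tracks which digests the receiver holds, so a second query whose result overlaps an earlier one transmits only short digests for the repeated rows. The decoded result is identical to what the database returned, which is why this style of reduction need not weaken consistency.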

by
Niraj Tolia, M. Satyanarayanan
- In Proceedings of the 16th International World Wide Web Conference (WWW 2007), 2007

opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF or Carnegie Mellon University. All unidentified trademarks mentioned in the paper are properties of their respective owners.

...ience to be highly variable because there is no caching to insulate the client from bursty loads. Previous attempts in caching dynamic database content have generally weakened transactional semantics [3, 5] or required application modifications [20, 44]. We report on a new solution that takes the form of a database-agnostic middleware layer called Ganesh. Ganesh makes no effort to semantically interpret...

Current projects that automate the collection of provenance information use a centralized architecture for managing the resulting metadata; that is, provenance is gathered at remote hosts and submitted to a central provenance management service. In contrast, we are developing a completely decentralized system with each computer maintaining the authoritative repository of the provenance gathered on it. Our model has several advantages, such as scaling to large amounts of metadata generation, providing low-latency access to provenance metadata about local data, avoiding the need for synchronization with a central service after operating while disconnected from the network, and letting users retain control over their data provenance records. We describe the SPADE project’s support for tracking data provenance in distributed environments, including how queries can be optimized with provenance sketches, pre-caching, and caching.

...This is especially true for caching recursive lineage queries. Centralized database management systems have also addressed scalability concerns by using caching mechanisms. Times Ten [48] and DBCache [2] use middle-tier caching to avoid bottlenecks at the central back-end server. DBProxy [4] supports structured data caching at edge servers. It caches a large number of overlapping and dynamically chan...

A major problem in web database applications and on the Internet in general is the scalable delivery of data. One proposed solution for this problem is a hybrid system that uses multicast push to scalably deliver the most popular data, and reserves traditional unicast pull for delivery of less popular data. However, such a hybrid scheme introduces a variety of data management problems at the server. In this paper we examine three of these problems: the push popularity problem, the document classification problem, and the bandwidth division problem. The push popularity problem is to estimate the popularity of the documents in the web site. The document classification problem is to determine which documents should be pushed and which documents must be pulled. The bandwidth division problem is to determine how much of the server bandwidth to devote to pushed documents and how much of the server bandwidth should be reserved for pulled documents. We propose simple and elegant solutions for these problems. We report on experiments with our system that validate our algorithms.

The social and economic importance of large bodies of programs and data that are potentially long-lived has attracted much attention in the commercial and research communities. Here we concentrate on a set of methodologies and technologies called persistent programming. In particular we review programming language support for the concept of orthogonal persistence, a technique for the uniform treatment of objects irrespective of their types or longevity. While research in persistent programming has become unfashionable, we show how the concept is beginning to appear as a major component of modern systems. We relate these attempts to the original principles of orthogonal persistence and give a few hints about how the concept may be utilised in the future.

...bution. In many of today’s enterprise systems the programmer must, by necessity, not only manage mappings from the language to the database but also from the language to the Memcached [52] or DBCache [53] layers, and from those layers to the database. Thus, when we consider the impedance mismatch problem in our systems it is important to recognise that the object-relational mapping is not the only map...