Fast data stream algorithms using associative memories

Abstract

The primary goal of data stream research is to develop space and time efficient solutions for answering continuous on-line summarization queries. Research efforts over the last decade have resulted in a number of efficient algorithms with varying degrees of space and time complexities. While these techniques are developed in a standard CPU setting, many of their applications such as click-fraud detection and network-traffic summarization typically execute on special networking architectures called Network Processing Units (NPUs). These NPUs interface with special associative memories known as Ternary Content Addressable Memories (TCAMs) to provide gigabit rate forwarding at network routers. In this paper, we describe how the integrated architecture of NPU and TCAMs can be exploited towards achieving the goal of developing high-speed stream summarization solutions. We propose two TCAM-conscious solutions for the frequent elements problem in data streams and present a comprehensive evaluation of these techniques on a state-of-the-art networking platform.

Original language

English

Title of host publication

SIGMOD 2007

Subtitle of host publication

Proceedings of the ACM SIGMOD International Conference on Management of Data

abstract = "The primary goal of data stream research is to develop space and time efficient solutions for answering continuous on-line summarization queries. Research efforts over the last decade have resulted in a number of efficient algorithms with varying degrees of space and time complexities. While these techniques are developed in a standard CPU setting, many of their applications such as click-fraud detection and network-traffic summarization typically execute on special networking architectures called Network Processing Units (NPUs). These NPUs interface with special associative memories known as Ternary Content Addressable Memories (TCAMs) to provide gigabit rate forwarding at network routers. In this paper, we describe how the integrated architecture of NPU and TCAMs can be exploited towards achieving the goal of developing high-speed stream summarization solutions. We propose two TCAM-conscious solutions for the frequent elements problem in data streams and present a comprehensive evaluation of these techniques on a state-of-the-art networking platform.",

N2 - The primary goal of data stream research is to develop space and time efficient solutions for answering continuous on-line summarization queries. Research efforts over the last decade have resulted in a number of efficient algorithms with varying degrees of space and time complexities. While these techniques are developed in a standard CPU setting, many of their applications such as click-fraud detection and network-traffic summarization typically execute on special networking architectures called Network Processing Units (NPUs). These NPUs interface with special associative memories known as Ternary Content Addressable Memories (TCAMs) to provide gigabit rate forwarding at network routers. In this paper, we describe how the integrated architecture of NPU and TCAMs can be exploited towards achieving the goal of developing high-speed stream summarization solutions. We propose two TCAM-conscious solutions for the frequent elements problem in data streams and present a comprehensive evaluation of these techniques on a state-of-the-art networking platform.

AB - The primary goal of data stream research is to develop space and time efficient solutions for answering continuous on-line summarization queries. Research efforts over the last decade have resulted in a number of efficient algorithms with varying degrees of space and time complexities. While these techniques are developed in a standard CPU setting, many of their applications such as click-fraud detection and network-traffic summarization typically execute on special networking architectures called Network Processing Units (NPUs). These NPUs interface with special associative memories known as Ternary Content Addressable Memories (TCAMs) to provide gigabit rate forwarding at network routers. In this paper, we describe how the integrated architecture of NPU and TCAMs can be exploited towards achieving the goal of developing high-speed stream summarization solutions. We propose two TCAM-conscious solutions for the frequent elements problem in data streams and present a comprehensive evaluation of these techniques on a state-of-the-art networking platform.