"... Abstract. Clickstream can be a rich source of data for analysing user behaviour, but the volume of these logs makes it difficult to identify and categorise behavioural patterns. In this paper, we introduce the Automatic Pattern Discovery (APD) method, a technique for automated processing of Clickstr ..."

Abstract. Clickstream can be a rich source of data for analysing user behaviour, but the volume of these logs makes it difficult to identify and categorise behaviouralpatterns. In this paper, we introduce the AutomaticPattern Discovery (APD) method, a technique for automated processing

"... We describe a method for the automatic acquisition of the hyponymy lexical relation from unrestricted text. Two goals motivate the approach: (i) avoidante of the need for pre-encoded knowledge and (ii) applicability across a wide range of text. We identify a set of lexico-syntactic patterns that are ..."

We describe a method for the automatic acquisition of the hyponymy lexical relation from unrestricted text. Two goals motivate the approach: (i) avoidante of the need for pre-encoded knowledge and (ii) applicability across a wide range of text. We identifya set of lexico-syntactic patterns

"... Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."

Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of adata set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all

"... The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of tra#c and the size and complexity of Web sites. The complexity of tasks such as Web site design, Web server design, and of simply navigating through a Web site have increased along with this growth. An i ..."

server logs. This paper presents several data preparation techniques in order to identify unique users and user sessions. Also, a method to divide user sessions into semantically meaningful transactions is defined and successfully tested against two other methods. Transactions identified

by
James Newsome, Dawn Song
- In Network and Distributed Systems Security Symposium, 2005

"... Software vulnerabilities have had a devastating effect on the Internet. Worms such as CodeRed and Slammer can compromise hundreds of thousands of hosts within hours or even minutes, and cause millions of dollars of damage [32, 51]. To successfully combat these fast automatic Internet attacks, we nee ..."

—semanticanalysis based signature generation. We show that by backtracing the chain of tainted data structure rooted at the detection point, TaintCheck can automaticallyidentify which original flow and which part of the original flow have caused the attack and identify important invariants of the payload that can

"... ■ Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases ..."

■ Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery

"... In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work releva ..."

In this overview paper we motivate the need for and research issues arising froma new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work

"... Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad applica-tion combines computational “vertices ” with communica-tion “channels ” to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of availa ..."

simultaneously on multi-ple computers, or on multiple CPU cores within a computer. The application can discover the size and placement of data at run time, and modify the graph as the computation pro-gresses to make efficient use of the available resources. Dryad is designed to scale from powerful multi-core sin