CSE Technical Reports Sorted by Technical Report Number

Chip multiprocessors with multiple simpler cores are gaining popularity because they have the potential to drive future performance gains without exacerbating the problems of power dissipation and hardware complexity. These designs provide real benefits for server-class applications that are explicitly multi-threaded. However, for desktop and other systems, there is a large code base of single-thread applications that have yet to see any benefit from multicore systems. While these applications were designed for execution on a single processor, many regions of computation have statistically suitable structure for extracting thread-level parallelism. In this work, we examine automatic extraction of statistical loop-level parallelism from single-thread applications. Our approach is to utilize simple hardware mechanisms combined with intelligent compiler code generation to perform low-cost targeted thread-level speculation. Our technique combines memory dependence profiling to identify suitable loops, stylized code generation to untangle register dependences between iterations, and a hybrid compiler/hardware recovery mechanism to handle infrequent memory dependences. We show that significant amounts of statistical loop-level parallelism indeed exist in non-numeric applications, and present the architectural extensions and detailed compiler algorithms required to exploit it.
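The memory dependence profiling step described above can be pictured with a small sketch. Everything here, including the `profile_loop` helper and the per-iteration address sets, is invented for illustration and is not the paper's mechanism: the idea is simply that a loop whose iterations rarely touch addresses written by earlier iterations is a good candidate for speculation.

```python
# Hypothetical sketch of memory-dependence profiling: record the addresses
# each loop iteration reads and writes, then count cross-iteration
# conflicts. Loops with rare conflicts are speculation candidates.

def profile_loop(iterations):
    """iterations: list of (reads, writes) address sets, one per iteration.
    Returns the fraction of iterations that conflict with an earlier one."""
    conflicts = 0
    prior_writes = set()
    for reads, writes in iterations:
        # A cross-iteration dependence: this iteration touches an address
        # written by an earlier iteration.
        if (reads | writes) & prior_writes:
            conflicts += 1
        prior_writes |= writes
    return conflicts / len(iterations)

# A loop where iteration i only touches a[i]: fully independent.
independent = [({100 + i}, {100 + i}) for i in range(10)]
# A loop carrying a dependence through a single accumulator address 500.
dependent = [({500}, {500}) for _ in range(10)]

print(profile_loop(independent))  # 0.0
print(profile_loop(dependent))    # 0.9: every iteration after the first conflicts
```

A real profiler would of course work on dynamic traces and weight conflicts by frequency; the point is only that a single scalar per loop suffices to rank candidates.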

Numerous attacks, such as worms, phishing, and botnets, threaten the availability of the Internet, the integrity of its hosts, and the privacy of its users. A core element of defense against these attacks is anti-virus (AV) software, a service that detects, removes, and characterizes these threats. The ability of these products to successfully characterize these threats has far-reaching effects, from facilitating sharing across organizations to detecting the emergence of new threats and assessing risk in quarantine and cleanup. In this paper, we examine the ability of existing host-based anti-virus products to provide semantically
meaningful information about the malicious software and tools (or malware) used by attackers. Using a large, recent collection of malware that spans a variety of attack vectors (e.g., spyware, worms, spam), we show that different AV products characterize malware in ways that are inconsistent across AV products,
incomplete across malware, and that fail to be concise in their semantics. To address these limitations, we propose a new classification technique that describes malware behavior in terms of system state changes (e.g., files written, processes created) rather than in sequences or patterns of system calls. To address the sheer volume of malware and diversity of its behavior, we provide a method for automatically categorizing these profiles of malware into groups that reflect similar classes of behaviors and demonstrate how behavior-based clustering provides a more direct and effective way of classifying and analyzing Internet malware.
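The clustering idea can be made concrete with a toy sketch. The state-change profiles and the `jaccard`/`cluster` helpers below are illustrative assumptions, not the paper's exact algorithm: samples whose sets of state changes overlap strongly land in the same behavioral group.

```python
# Hedged sketch: group malware samples by the similarity of their
# state-change profiles (files written, processes created, registry keys
# touched). Profiles and thresholds here are invented for illustration.

def jaccard(a, b):
    """Set similarity: |intersection| / |union|."""
    return len(a & b) / len(a | b)

def cluster(profiles, threshold=0.6):
    """Greedy clustering: put a sample in the first cluster whose
    representative profile is similar enough, else start a new cluster."""
    clusters = []  # list of (representative_profile, [sample names])
    for name, prof in profiles.items():
        for rep, members in clusters:
            if jaccard(prof, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((prof, [name]))
    return [members for _, members in clusters]

profiles = {
    "worm.a": {"write:C:/svc.exe", "proc:svc.exe", "reg:Run/svc"},
    "worm.b": {"write:C:/svc.exe", "proc:svc.exe", "reg:Run/svc", "net:smtp:25"},
    "spy.a":  {"write:C:/hook.dll", "reg:BHO/hook"},
}
print(cluster(profiles))  # [['worm.a', 'worm.b'], ['spy.a']]
```

Comparing whole state-change sets rather than system-call sequences is what makes the profiles robust to reordering and to syscall-level polymorphism.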

We describe the application of the Reveal formal functional
verification system to six representative hardware test cases.
Reveal employs counterexample-guided abstraction refinement,
or CEGAR, and is suitable for verifying the complex
control logic of designs with wide datapaths. Reveal performs
automatic datapath abstraction yielding an approximation of
the original design with a much smaller state space. This
approximation is subsequently used to verify the correctness of
control logic interactions. If the approximation proves to be
too coarse, it is automatically refined based on the spurious
counterexample it generates. Such refinement can be viewed
as a form of on-demand “learning” similar in spirit to conflict-
based learning in modern Boolean satisfiability solvers.
The abstraction/refinement process is iterated until the design
is shown to be correct or an actual design error is reported.
The Reveal system allows some user control over the abstraction
and refinement steps. Using six representative benchmarks
as case studies, we examine the effect on Reveal’s
performance of the various available options for abstraction
and refinement. We also show examples of the types of
learned “facts” that Reveal discovers in the refinement step.
Based on our initial experience with this system, we believe
that automating the verification for a useful class of hardware
designs is now quite feasible.
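The abstraction/refinement loop described above has a simple shape, sketched below on a toy example. This is an illustrative CEGAR skeleton, not Reveal's implementation: the "datapath" is a small arithmetic function treated as uninterpreted by the abstraction, and each spurious counterexample teaches the abstraction one concrete fact.

```python
# Illustrative CEGAR skeleton: check the abstraction, and when the
# abstract counterexample does not replay on the concrete design, refine
# the abstraction with a learned fact. All names here are invented.

def cegar(abstraction, check_abstract, replay_concrete, refine):
    while True:
        cex = check_abstract(abstraction)
        if cex is None:
            return "correct"                    # abstract model is safe
        if replay_concrete(cex):
            return ("bug", cex)                 # counterexample is real
        abstraction = refine(abstraction, cex)  # spurious: learn a fact

CONCRETE_INPUTS = range(8)

def f(x):                       # concrete datapath: doubling then halving
    return (x * 2) // 2         # equals x for these inputs

def check_abstract(facts):
    """Abstractly, f is uninterpreted: any output consistent with the
    learned facts is possible, so f(x) != x seems reachable."""
    for x in CONCRETE_INPUTS:
        if x not in facts:
            return (x, x + 1)   # abstract cex: pretend f(x) = x + 1
    return None

def replay_concrete(cex):
    x, v = cex
    return f(x) == v            # does the violation really occur?

def refine(facts, cex):
    x, _ = cex
    return {**facts, x: f(x)}   # learn the concrete value of f(x)

print(cegar({}, check_abstract, replay_concrete, refine))  # "correct"
```

Each learned fact here plays the role of the "facts" Reveal discovers during refinement; the loop terminates once the abstraction is tight enough to rule out all spurious violations.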

Detecting the presence of buffer overflow attacks in network messages has been a major focus of the
security community. Only knowing whether a message contains an attack, however, is not always enough
to mitigate the threat. Sometimes it is also critical to know what the attack does. Some attacks will open
up backdoors on the victim hosts, while others may download a secondary malware payload from a rogue
web server. Understanding the exploit behavior can be very helpful in determining the proper response.
Unfortunately, shell code from buffer overflow attacks is written in low-level assembly language, and is
often obfuscated with encoding routines. The current method of analyzing shell code, manual reverse
engineering, can be time-consuming and requires significant expertise. Furthermore, new analysis must
be done for each unique payload, which makes manual analysis nearly impossible for large-scale
polymorphic attacks.
In response to the need for automated attack payload analysis, we introduce Spector. Spector uses
symbolic execution to extract meaningful high-level application programming interface (API) calls from
shell code. This information exposes the code’s real functionality and can aid in attack mitigation.
Spector’s high-level output also helps facilitate classification of different attack payloads that have the
same behavior. To evaluate Spector, we tested it with over 23,000 unique payloads gathered from
lightweight honeypot deployments. It identified eleven different classes of shell code, and was able to
process all the payloads in just over three hours. Spector was also able to successfully classify
polymorphic instances of the same shell code that were generated using two polymorphism tools. In this
paper, we present the architecture and implementation of Spector, as well as our experience using it to
analyze real attack payloads.

Suppose filesystems supported cloning of any subset of directories, even those containing gigabytes of data, almost instantaneously, while guaranteeing isolation between the original copy and the cloned copy. We show that this simple abstraction can greatly simplify many routine tasks, both for a single operating system and for clusters of similar virtual machines in a virtual server farm. In this paper, we describe the design of Prism, a file server that supports lightweight cloning and centralized file management of multiple filesystems. Unlike copying in standard file systems or cloning operations on virtual disks, the Prism filesystem supports high-level operations to create fast clones of any part of a filesystem. After cloning, each cloned file system can evolve independently, providing similar read/write semantics as if the file systems were copied from each other. This abstraction is useful for supporting virtual server farms. It is possible to create filesystems for new virtual machines interactively, while protecting private home directories, log files, and password
files. Furthermore, when cloned systems have a high degree of
similarity, Prism makes centralized scanning operations (e.g., for malware checking) for multiple cloned file systems significantly faster than scanning each file system individually. Prism’s capabilities are also applicable to individual operating systems. Prism can help provide lightweight file system isolation between processes to prevent untrusted code from damaging the
filesystem.
Prism is currently implemented by extending the ext3 filesystem. On the Postmark benchmark and the Apache build workload, Prism’s performance is comparable to that of a standard ext3 filesystem. For applications that require scanning or comparing multiple, cloned filesystems, Prism is significantly faster. This paper describes the design and evaluation of the Prism filesystem.
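The copy-on-write intuition behind lightweight cloning can be sketched in a few lines. This toy is not Prism's on-disk format: a clone initially shares its parent's data, and a write affects only the copy being written, which is what gives the isolation guarantee described above.

```python
# Toy sketch of copy-on-write cloning (invented for illustration): a
# clone shares the parent's file data; a write diverges only that copy.

class CowTree:
    def __init__(self, blocks=None):
        self.blocks = blocks if blocks is not None else {}

    def clone(self):
        # Cheap clone: copy only the map of references; the file data
        # itself is shared until one side writes.
        return CowTree(dict(self.blocks))

    def write(self, path, data):
        self.blocks[path] = data    # affects only this copy

    def read(self, path):
        return self.blocks.get(path)

base = CowTree()
base.write("/etc/passwd", "root:x:0:0")
vm1 = base.clone()                  # new VM's filesystem, near-instant
vm1.write("/etc/passwd", "root:x:0:0\nalice:x:1000")
print(base.read("/etc/passwd"))     # unchanged: "root:x:0:0"
print(vm1.read("/etc/passwd"))
```

The same sharing is what makes centralized scanning fast: blocks common to many clones only need to be examined once.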

Banks and other financial institutions are providing more and more
information and services for their potential and existing customers
via their web sites. By
providing services such as online banking, they enable their customers to take
care of business at any time and from any place.
Banks are heavily regulated and have a substantial financial stake in
providing security to their customers. Thus, one would expect that
the latest security technology is deployed on their sites and the sites are analyzed regularly by security-oriented IT professionals.
However, we found substantial gaps in online security at a large number of financial institutions,
including banks and brokerages. To assess the effectiveness of online security procedures, we examined 706 financial institutions for seven types of vulnerabilities that could be exploited by sophisticated attackers.
To our surprise, a large percentage of these institutions are susceptible to several of the
vulnerabilities. For each vulnerability, we recommend solutions
that we hope financial institutions will adopt
to improve the security of their web sites. We also designed a modified proxy server that users can use to
identify several of these vulnerabilities in the web sites they visit.
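One kind of check such a proxy might perform is sketched below. The specific vulnerability shown, a password field served on a plain-HTTP page, is an assumption chosen for illustration, and the `insecure_login_page` helper is invented: the point is only that these checks are mechanical enough to automate client-side.

```python
# Hypothetical sketch: flag a page fetched over plain HTTP that contains
# a password field, since credentials typed into it can be intercepted
# or the form's target spoofed before submission.
from html.parser import HTMLParser

class PasswordFormFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.has_password_field = False

    def handle_starttag(self, tag, attrs):
        if tag == "input" and dict(attrs).get("type") == "password":
            self.has_password_field = True

def insecure_login_page(page_url, html):
    """True if an HTTP (non-TLS) page asks the user for a password."""
    finder = PasswordFormFinder()
    finder.feed(html)
    return page_url.startswith("http://") and finder.has_password_field

page = ('<form action="https://bank.example/login">'
        '<input type="password" name="p"></form>')
print(insecure_login_page("http://bank.example/", page))   # True
print(insecure_login_page("https://bank.example/", page))  # False
```

Note that the form in the example posts to HTTPS and is still flagged: the page carrying the form must itself be protected, or an attacker can rewrite the form's target in transit.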

We analyze the ACL Anthology citation network in an attempt to identify
the most "central" papers and authors using graph-based methods.
Citation data was obtained using text extraction from the library of PDF
files with some post-processing performed to clean up the results.
Manual annotation
of the references was then performed to complete the citation network.
The analysis compares metrics across publication years and venues, such
as citations in and out. The most cited papers, the most central papers, and
the papers with the highest impact factor are also identified.
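One representative graph-based centrality measure is PageRank, sketched below on a toy citation network. The implementation and the tiny graph are illustrative only; the actual ACL Anthology data and the exact metrics used are not shown here.

```python
# Illustrative PageRank over a citation graph: paper -> papers it cites.
# A paper is central if it is cited by papers that are themselves central.

def pagerank(graph, damping=0.85, iters=50):
    nodes = set(graph) | {c for cited in graph.values() for c in cited}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for paper, cited in graph.items():
            if cited:
                share = damping * rank[paper] / len(cited)
                for c in cited:
                    new[c] += share
        # Dangling nodes (cite nothing) redistribute rank uniformly.
        dangling = sum(rank[n] for n in nodes if not graph.get(n))
        for n in nodes:
            new[n] += damping * dangling / len(nodes)
        rank = new
    return rank

citations = {"P1": ["P0"], "P2": ["P0", "P1"], "P3": ["P0"]}
scores = pagerank(citations)
print(max(scores, key=scores.get))  # "P0", cited by everything else
```

Raw in-citation counts and PageRank often agree on toy graphs like this one, but diverge on real networks where *who* cites you matters as much as how often.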

The University of Michigan CLAIR (Computational Linguistics and
Information Retrieval) group has developed the Clair Library. A
collection of utilities and code modules, it is intended to simplify a
number of generic tasks in Natural Language Processing (NLP),
Information Retrieval (IR), and Network Analysis (NA). Its architecture
is also designed for easy integration with external software.

We consider the problem of tracking information over time and across
sources in clusters of news stories about emergency events. While
previous approaches have focused on finding new information at the
document and sentence levels (e.g. TDT FSD and the TREC Novelty track,
respectively), we are interested in following information at the
factual level. Specifically, we propose to develop a "fact tracking"
system that, when given a user's factual question of interest, returns
a matrix displaying the extracted answers by time and source. As a
first step towards this goal, we present an empirical analysis of a
corpus of breaking news stories that have been manually annotated for
factual questions and answers. Our current goal is to compare
extracted answers to a given question and to examine how features such
as lexical similarity and source information relate to the
chronological and semantic relationships between them. Our study shows
that while there appears to be no direct relationship between the
lexical similarity and publication-time difference of a given answer
pair, lexical similarity is highly correlated with whether the answers
come from the same source and with whether they express the same
answer to the given question.
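The proposed output format, a matrix of extracted answers indexed by time and source, is easy to picture. The answer records below are invented for illustration; the `fact_matrix` helper is a sketch of the arrangement, not the extraction system itself.

```python
# Sketch: arrange extracted answers to one factual question ("how many
# casualties?") into a time-by-source matrix. Data here is fabricated.

def fact_matrix(answers):
    """answers: list of (time, source, text).
    Returns ({time: {source: text}}, sorted source list)."""
    times = sorted({t for t, _, _ in answers})
    sources = sorted({s for _, s, _ in answers})
    grid = {t: {s: "" for s in sources} for t in times}
    for t, s, text in answers:
        grid[t][s] = text
    return grid, sources

answers = [
    ("09:00", "AP",  "at least 10 dead"),
    ("09:30", "CNN", "12 dead"),
    ("11:00", "AP",  "14 confirmed dead"),
]
grid, sources = fact_matrix(answers)
for t in grid:
    print(t, [grid[t][s] for s in sources])
```

Reading down a column shows one source revising its answer over time; reading across a row shows sources disagreeing at the same moment, which is exactly the structure the annotation study examines.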

We present a series of experiments to demonstrate the validity of
Relative Utility (RU) as a measure for evaluating extractive
summarization systems. Like some other evaluation metrics, it compares
sentence selection between machine and reference summarizers.
Additionally, RU is applicable in both single-document and
multi-document summarization, is extendable to arbitrary compression
rates with no extra annotation effort, and takes into account both
random system performance and interjudge agreement. RU also provides
an option for penalizing summaries that include sentences with
redundant information. Our results are based on the JHU summary corpus
and indicate that Relative Utility is a reasonable, and often superior,
alternative to several common summary evaluation metrics. We also
give a comparison of RU with some other well-known metrics with
respect to the correlation with the human judgements on the DUC
corpus.
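The core computation behind Relative Utility can be sketched as follows. Each judge assigns a utility score to every sentence, and a system summary is scored by the utilities of the sentences it selected, normalized between random and best possible performance. The normalization shown is one common form of such a score, assumed here for illustration rather than taken verbatim from the metric's definition.

```python
# Hedged sketch of a Relative Utility-style score: utilities are summed
# judge scores per sentence; the system's selection is normalized so the
# best possible summary scores 1.0 and an average random one scores 0.0.

def relative_utility(utilities, selected, k):
    """utilities: sentence -> summed judge utility.
    selected: the k sentences the system chose."""
    vals = sorted(utilities.values(), reverse=True)
    best = sum(vals[:k])                      # oracle: top-k sentences
    random_avg = k * sum(vals) / len(vals)    # expected score of k random picks
    system = sum(utilities[s] for s in selected)
    return (system - random_avg) / (best - random_avg)

utilities = {"s1": 10, "s2": 9, "s3": 4, "s4": 1}
print(relative_utility(utilities, ["s1", "s2"], k=2))  # 1.0: best pick
print(relative_utility(utilities, ["s3", "s4"], k=2))  # negative: worse than random
```

Because utilities are graded rather than binary, a system that picks a near-optimal sentence gets partial credit, which is what lets RU scale to arbitrary compression rates without re-annotation.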

As virtualization continues to become increasingly popular in enterprise and organizational networks, operators and administrators are turning to live migration of virtual machines for the purpose of workload balancing and management. However, the security of live virtual machine migration has yet to be analyzed. This paper looks at this poorly explored area and attempts to empirically demonstrate the importance of securing the migration process. We begin by defining and investigating three classes of threats to virtual machine migration: control plane, data plane, and migration module threats. We then show how a malicious party using these attack strategies can exploit the latest versions of the popular Xen and VMware virtual machine monitors and present a tool to automate the manipulation of a guest operating system’s memory during a live virtual machine migration. Using this experience, we discuss strategies to address the deficiencies in virtualization software and secure the live migration process.

This paper presents novel techniques for checking
the soundness of a type system automatically using
a software model checker.
Our idea is to systematically generate every type
correct intermediate program state (within some
finite bounds), execute the program one step
forward if possible using its small step
operational semantics, and then check that the
resulting intermediate program state is also type
correct---but do so efficiently by detecting
similarities in this search space and pruning away
large portions of the search space.
Thus, given only a specification of type
correctness and the small step operational
semantics for a language, our system automatically
checks type soundness by checking that the
progress and preservation theorems hold for the
language (albeit for program states of at most
some finite size).
Our preliminary experimental results on several
languages---including a language of integer and
boolean expressions, a simple imperative
programming language, an object-oriented language
which is a subset of Java, and a language with
ownership types---indicate that our approach is
feasible and that our search space pruning
techniques do indeed significantly reduce what is
otherwise an extremely large search space.
Our paper thus makes contributions both in the
area of checking soundness of type systems, and in
the area of reducing the state space of a software
model checker.
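The check described above can be demonstrated on a tiny language of integers, booleans, and addition. The encoding below is an illustrative sketch of the bounded progress/preservation check, without the paper's state-space pruning: enumerate type-correct states up to a size bound, step each non-value once, and assert the result still type-checks.

```python
# Bounded soundness check for a toy expression language:
# terms are ("int", n) | ("bool", b) | ("plus", e1, e2).
from itertools import product

def typeof(e):
    tag = e[0]
    if tag in ("int", "bool"):
        return tag
    if tag == "plus" and typeof(e[1]) == "int" and typeof(e[2]) == "int":
        return "int"
    return None                       # ill-typed

def is_value(e):
    return e[0] in ("int", "bool")

def step(e):
    """Small-step operational semantics; None if e is a value or stuck."""
    if e[0] == "plus":
        if not is_value(e[1]):
            return ("plus", step(e[1]), e[2])
        if not is_value(e[2]):
            return ("plus", e[1], step(e[2]))
        return ("int", e[1][1] + e[2][1])
    return None

def terms(depth):
    """Every term up to the given nesting depth (the finite bound)."""
    if depth == 0:
        return [("int", 0), ("int", 1), ("bool", True)]
    smaller = terms(depth - 1)
    return smaller + [("plus", a, b) for a, b in product(smaller, smaller)]

for e in terms(2):
    t = typeof(e)
    if t is None:
        continue                      # only type-correct states matter
    if not is_value(e):
        e2 = step(e)
        assert e2 is not None, e      # progress: well-typed terms step
        assert typeof(e2) == t, e     # preservation: the type is kept
print("progress and preservation hold up to the bound")
```

Even this naive enumeration visits 156 terms at depth 2; the pruning techniques the paper contributes are what keep the same idea tractable for languages with objects and ownership types.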