Technical Reports


Stretchcam: zooming using thin, elastic optics

Daniel Sims, Oliver Cossairt, Yonghao Yue, Shree Nayar

2017-12-31

Stretchcam is a thin camera with a lens capable of zooming with small actuations. In our design, an elastic lens array is placed on top of a sparse, rigid array of pixels. This lens array is then stretched using a small mechanical motion in order to change the field of view of the system. In this paper, we characterize such a system and present simulations that demonstrate Stretchcam's capabilities. We follow this with images captured from a prototype device of the proposed design. Our prototype achieves 1.5 times zoom when the scene is only 300 mm away, with only a 3% change in the lens array's original length.

Recent variants of Recurrent Neural Networks (RNNs)---in particular, Long Short-Term Memory (LSTM) networks---have established RNNs as a deep learning staple in modeling sequential data in a variety of machine learning tasks. However, RNNs are still often used as a black box with limited understanding of the hidden representation that they learn. Existing approaches such as visualization are limited by the manual effort to examine the visualizations and require considerable expertise, while neural attention models change, rather than interpret, the model. We propose a technique to search for neurons based on existing interpretable models, features, or programs.

State machine replication (SMR) leverages distributed consensus protocols such as PAXOS to keep multiple replicas of a program consistent in the face of replica failures or network partitions. This fault tolerance is enticing for implementing a principled SMR system that replicates general programs, especially server programs that demand high availability. Unfortunately, SMR assumes deterministic execution, but most server programs are multithreaded and thus non-deterministic. Moreover, existing SMR systems provide narrow state machine interfaces to suit specific programs, and it can be quite strenuous and error-prone to orchestrate a general program into these interfaces. This paper presents CRANE, an SMR system that transparently replicates general server programs. CRANE achieves distributed consensus on the socket API, a common interface to almost all server programs. It leverages deterministic multithreading (specifically, our prior system PARROT) to make multithreaded replicas deterministic. It uses a new technique we call time bubbling to efficiently tackle a difficult challenge of non-deterministic network input timing. Evaluation on five widely used server programs (e.g., Apache, ClamAV, and MySQL) shows that CRANE is easy to use, has moderate overhead, and is robust.

Dynamic reconfiguration systems guided by coarse-grained program phases have found success in improving overall program performance and energy efficiency. These performance/energy savings are limited by the granularity at which program phases are detected, since phases that occur at a finer granularity go undetected and reconfiguration opportunities are missed. In this study, we detect program phases using interval sizes on the order of tens, hundreds, and thousands of program cycles. This is in stark contrast with prior phase detection studies, where the interval size is on the order of several thousands to millions of cycles. The primary goal of this study is to begin to fill a gap in the literature on phase detection by characterizing super fine-grained program phases and demonstrating an application where detection of these relatively short-lived phases can be instrumental. Traditional models for phase detection, including basic block vectors and working set signatures, are used to detect super fine-grained phases, as well as a less traditional model based on microprocessor activity. Finally, we show an analytical case study where super fine-grained phases are applied to voltage and frequency scaling optimizations.
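The super fine-grained detection described above can be illustrated with a toy sketch of the basic-block-vector model, one of the traditional models the study uses. The interval size and similarity threshold below are illustrative placeholders, not the study's actual parameters:

```python
# Sketch of basic-block-vector (BBV) phase detection at super fine granularity.
from collections import Counter

def bbv(block_trace):
    """Build a normalized basic block vector from a trace of block IDs."""
    counts = Counter(block_trace)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}

def manhattan(v1, v2):
    """Manhattan distance between two sparse frequency vectors."""
    keys = set(v1) | set(v2)
    return sum(abs(v1.get(k, 0.0) - v2.get(k, 0.0)) for k in keys)

def detect_phases(trace, interval=100, threshold=0.5):
    """Mark a phase boundary whenever consecutive interval BBVs differ by
    more than the threshold. An interval of ~100 cycles is "super
    fine-grained" compared with the millions used in prior studies.
    The first interval always starts a phase."""
    phases, prev = [], None
    for i in range(0, len(trace), interval):
        cur = bbv(trace[i:i + interval])
        if prev is None or manhattan(prev, cur) > threshold:
            phases.append(i)  # phase boundary at this interval's start
        prev = cur
    return phases
```

Running this on a synthetic trace flags a boundary wherever the block-frequency profile of consecutive intervals diverges, which is the signal a reconfiguration system could act on.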

Just like bugs in single-threaded programs can lead to vulnerabilities, bugs in multithreaded programs can also lead to concurrency attacks. Unfortunately, there is little quantitative data on how well existing tools can detect these attacks. This paper presents the first quantitative study on concurrency attacks and their implications on tools. Our study on 10 widely used programs reveals 26 concurrency attacks with broad threats (e.g., OS privilege escalation), and we built scripts to successfully exploit 10 of them. Our study further reveals that only extremely small portions of inputs and thread interleavings (or schedules) can trigger these attacks, and that existing concurrency bug detectors work poorly because they lack help identifying the vulnerable inputs and schedules. Our key insight is that the reports of existing detectors already contain useful hints about which inputs and schedules will likely lead to attacks and which will not (e.g., benign bug reports). With this insight, this paper presents a new directed concurrency attack detection approach and its implementation, OWL. It extracts hints from the reports with static analysis, augments existing detectors by pruning out the benign inputs and schedules, and then directs detectors and its own runtime vulnerability verifiers to work on the remaining, likely vulnerable inputs and schedules. Evaluation shows that OWL pruned 94.3% of the reports caused by benign inputs or schedules and detected 7 known concurrency attacks. OWL also detected 3 previously unknown concurrency attacks, including a use-after-free attack in SSDB confirmed as CVE-2016-1000324, an integer overflow and HTML integrity violation in Apache, and three new MySQL data races confirmed with bug IDs 84064, 84122, and 84241. All OWL source code, exploit scripts, and results are available at https://github.com/ruigulala/ConAnalysis.

It has long been known that George Fabyan's Riverbank Laboratories provided the U.S. military with cryptanalytic and training services during World War I. The relationship has always been seen as voluntary. Newly discovered evidence raises the question of whether Fabyan was in fact paid, at least in part, for his services, but available records do not provide a definitive answer.

New information has been discovered about Frank Miller's 1882 one-time pad. These documents explain Miller's threat model and show that he had a reasonably deep understanding of the problem; they also suggest that his scheme was used more than had been supposed.

Why Are We Permanently Stuck in an Elevator? A Software Engineering Perspective on Game Bugs

Iris Zhang

2016-06-01

In the past decade, the complexity of video games has increased dramatically, and so has the complexity of the software systems behind them. The difficulty in designing and testing games invariably leads to bugs that manifest themselves in funny video reels of graphical glitches and millions of submitted support tickets. This paper presents an analysis of game developers and their teams who have knowingly released bugs, to see what factors may motivate them to do so. It examines different development environments as well as varied types of game platforms and play-styles. Above all, it seeks out how established research on software development best practices and challenges should inform our understanding of these bugs. These findings may lead to targeted efforts to mitigate some of the factors leading to glitches, tailored to the specific needs of the game development team.

The paradigms of design patterns and software engineering methodologies are methods that apply to areas outside the software space. As a business owner and student, I implement many software principles daily in both my work and personal life. After experiencing the power of Agile methodologies outside the scope of software engineering, I always think about how I can integrate the computer science skills that I am learning at Columbia into my life. For my study, I seek to learn about other software engineering development processes that can be useful in life. I theorize that if a model such as Agile can provide me with useful tools, then a model that the government and most of the world trusts should have paradigms I can learn from as well. The software model I will study is open source software (OSS). My research examines the lateral software standards of OSS and closed source software (CSS). For the scope of this paper, I will focus my research primarily on Linux as the OSS model and Agile as the CSS model. OSS has had an extraordinary impact on the software revolution [1], and CSS models have gained such popularity that their paradigms extend far beyond the software engineering space. Before delving into research, I thought the methodologies of OSS and CSS would be radically different. My study shall describe the similarities that exist between these two methodologies. In the process of my research, I was able to implement the values and paradigms that define the OSS development model to work more productively in my business. Software engineering core values and models can be used as tools to improve our lives.

The aim of the user study conducted is primarily threefold:
• To accurately judge, based on a number of parameters, whether showing similar code helps in code comprehension.
• To investigate separately a number of cases involving dynamic code, static code, the effect of options on accuracy of responses, and so on.
• To distribute the user survey; collect data, responses, and feedback from the user study; and draw conclusions.

Identifying similar code in software systems can assist many software engineering tasks, including program understanding. While most approaches focus on identifying code that looks alike, some researchers propose to detect instead code that functions alike, known as functional clones. However, previous work has highlighted the technical challenges of detecting these functional clones in object-oriented languages such as Java. We propose a novel technique, In-Vivo Clone Detection, a language-agnostic technique that detects functional clones in arbitrary programs by observing and mining inputs and outputs. We implemented this technique targeting programs that run on the JVM, creating HitoshiIO (available freely on GitHub), a tool to detect functional code clones. Our experimental results show that it is powerful in detecting these functional clones, finding 185 methods that are functionally similar across a corpus of 118 projects, even when there are only very few inputs available. In a random sample of the detected clones, HitoshiIO achieves a 68+% true positive rate, while the false positive rate is only 15%.
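The in-vivo input/output matching idea can be sketched as follows. HitoshiIO itself operates on JVM bytecode; this Python fragment only illustrates the matching step, with invented thresholds (the 0.68 figure echoes the reported true positive rate but is not the tool's actual parameter):

```python
# Sketch: call two methods functional clones when they agree on enough of
# the inputs both were observed on at runtime.
def functional_clones(io_logs, min_overlap=3, min_agree=0.68):
    """io_logs: {method_name: {input_tuple: output}} observed at runtime."""
    names = sorted(io_logs)
    clones = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = set(io_logs[a]) & set(io_logs[b])
            if len(shared) < min_overlap:
                continue  # too few common inputs to judge similarity
            agree = sum(io_logs[a][x] == io_logs[b][x] for x in shared)
            if agree / len(shared) >= min_agree:
                clones.append((a, b))
    return clones
```

The key property this illustrates is language-agnosticism: nothing in the matching step inspects syntax, only observed input/output behavior.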

Web applications are becoming increasingly ubiquitous because they offer many useful services to consumers and businesses. Many of these web applications are quite storage-intensive. Cloud computing offers attractive and economical choices for meeting their storage needs. Unfortunately, it remains challenging for developers to best leverage them to minimize cost. This paper presents Grandet, a storage system that greatly reduces storage cost for web applications deployed in the cloud. Grandet provides both a key-value interface and a file system interface, supporting a broad spectrum of web applications. Under the hood, it supports multiple heterogeneous stores, and unifies them by placing each data object at the store deemed most economical. We implemented Grandet on Amazon Web Services and evaluated Grandet on a diverse set of four popular open-source web applications. Our results show that Grandet reduces their cost by an average of 42.4%, and it is fast, scalable, and easy to use. The source code of Grandet is at http://columbia.github.io/grandet.
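The core placement idea, choosing the most economical store per object, can be sketched with a toy cost model. The store names and prices below are invented for illustration; they are neither Grandet's actual cost model nor real AWS pricing:

```python
# Sketch: pick the backend store minimizing estimated monthly cost per object.
STORES = {
    # $/GB-month storage, $/1k GET requests, $/GB transfer out (made-up numbers)
    "s3_standard":   {"storage": 0.023,  "get": 0.0004,  "transfer": 0.09},
    "s3_infrequent": {"storage": 0.0125, "get": 0.001,   "transfer": 0.09},
    "dynamo_like":   {"storage": 0.25,   "get": 0.00025, "transfer": 0.09},
}

def monthly_cost(store, size_gb, gets_per_month, egress_gb):
    """Estimated monthly cost of keeping one object at a given store."""
    p = STORES[store]
    return (p["storage"] * size_gb
            + p["get"] * gets_per_month / 1000
            + p["transfer"] * egress_gb)

def pick_store(size_gb, gets_per_month, egress_gb):
    """Return the store with the lowest estimated cost for this object."""
    return min(STORES,
               key=lambda s: monthly_cost(s, size_gb, gets_per_month, egress_gb))
```

Large, rarely read objects land on the cheap-storage store, while tiny, hot objects land on the cheap-request store, which is the intuition behind unifying heterogeneous backends by cost.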

ARM servers are becoming increasingly common, making server technologies such as virtualization for ARM of growing importance. We present the first in-depth study of ARM virtualization performance on ARM server hardware, including measurements of two popular ARM and x86 hypervisors, KVM and Xen. We show how the ARM hardware support for virtualization can support much faster transitions between the VM and the hypervisor, a key hypervisor operation. However, current hypervisor designs, including both KVM (Type 2) and Xen (Type 1), are not able to leverage this performance benefit in practice for real application workloads. We discuss the reasons why and show that other factors related to hypervisor software design and implementation have a larger role in overall performance than the speed of microarchitectural operations. Based on our measurements, we discuss changes to ARM's hardware virtualization support that can potentially bridge the gap to bring its faster virtual machine exit mechanism to modern Type 2 hypervisors running real applications. These changes have been incorporated into the latest ARM architecture.

Use of Fast Multipole to Accelerate Discrete Circulation-Preserving Vortex Sheets for Soap Films and Foams

Fang Da, Christopher Batty, Chris Wojtan, Eitan Grinspun

2015-11-07

We report the integration of a Fast Multipole Method (FMM) template library, "FMMTL", into the discrete circulation-preserving vortex sheets method to accelerate the Biot-Savart integral. We measure the speed-up on a bubble oscillation test with varying mesh resolution. We also report a few examples with higher complexity than previously achieved.

Recursive functions and data types pose significant challenges for a Haskell-to-hardware compiler. Directly translating these structures yields infinitely large circuits; a subtler approach is required. We propose a sequence of abstraction-lowering transformations that exposes time and memory in a Haskell program, producing a simpler form for hardware translation. This paper outlines these transformations on a specific example; future research will focus on generalizing and automating them in our group's compiler.

Cyber-physical systems (CPS) are systems featuring a tight combination of, and coordination between, the system's computational and physical elements. Cyber-physical systems range from critical infrastructure such as power grids and transportation systems to health and biomedical devices. System reliability, i.e., the ability of a system to perform its intended function under a given set of environmental and operational conditions for a given period of time, is a fundamental requirement of cyber-physical systems. An unreliable system often leads to disruption of service, financial cost, and even loss of human life. An important and prevalent type of cyber-physical system meets the following criteria: processing large amounts of data; employing software as a system component; running online continuously; and having an operator-in-the-loop because of human judgment and an accountability requirement for safety-critical systems. This thesis aims to improve system reliability for this type of cyber-physical system.
To improve system reliability for this type of cyber-physical system, I present a system evaluation approach entitled automated online evaluation (AOE), which is a data-centric runtime monitoring and reliability evaluation approach that works in parallel with the cyber-physical system to conduct automated evaluation along the workflow of the system continuously, using computational intelligence and self-tuning techniques, and to provide operator-in-the-loop feedback on reliability improvement. For example, abnormal input and output data at or between the multiple stages of the system can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop. The operator can then take actions and make changes to the system based on the alerts in order to achieve minimal system downtime and increased system reliability. One technique used by the approach is data quality analysis using computational intelligence, which applies computational intelligence to evaluate data quality in an automated and efficient way in order to make sure the running system performs reliably as expected. Another technique used by the approach is self-tuning, which automatically self-manages and self-configures the evaluation system to ensure that it adapts itself based on the changes in the system and feedback from the operator. To implement the proposed approach, I further present a system architecture called autonomic reliability improvement system (ARIS).
This thesis investigates three hypotheses. First, I claim that the automated online evaluation empowered by data quality analysis using computational intelligence can effectively improve system reliability for cyber-physical systems in the domain of interest as indicated above. In order to prove this hypothesis, a prototype system needs to be developed and deployed in various cyber-physical systems while certain reliability metrics are required to measure the system reliability improvement quantitatively. Second, I claim that the self-tuning can effectively self-manage and self-configure the evaluation system based on the changes in the system and feedback from the operator-in-the-loop to improve system reliability. Third, I claim that the approach is efficient. It should not have a large impact on the overall system performance and should introduce only minimal extra overhead to the cyber-physical system. Some performance metrics should be used to measure the efficiency and added overhead quantitatively.
Additionally, in order to conduct efficient and cost-effective automated online evaluation for data-intensive CPS, which requires large volumes of data and devotes much of its processing time to I/O and data manipulation, this thesis presents COBRA, a cloud-based reliability assurance framework. COBRA provides automated multi-stage runtime reliability evaluation along the CPS workflow using data relocation services, a cloud data store, data quality analysis and process scheduling with self-tuning to achieve scalability, elasticity and efficiency.
Finally, in order to provide a generic way to compare and benchmark system reliability for CPS and to extend the approach described above, this thesis presents FARE, a reliability benchmark framework that employs a CPS reliability model, a set of methods and metrics on evaluation environment selection, failure analysis, and reliability estimation.
The main contributions of this thesis include validation of the above hypotheses and empirical studies of ARIS automated online evaluation system, COBRA cloud-based reliability assurance framework for data-intensive CPS, and FARE framework for benchmarking reliability of cyber-physical systems. This work has advanced the state of the art in the CPS reliability research, expanded the body of knowledge in this field, and provided some useful studies for further research.

Efficient sampling algorithms have been developed for approximating answers to aggregate queries on large data sets. In some formulations of the problem, concentration inequalities (such as Hoeffding's inequality) are used to estimate the confidence interval for an approximated aggregate value. Samples are usually drawn until the confidence interval is sufficiently small, regardless of how the approximated query answers will be used (for example, in interactive visualizations). In this report, we show how to exploit visualization-specific properties to reduce the sampling complexity of a sampling-based approximate query processing algorithm while preserving certain visualization guarantees (the visual property of relative ordering) with a very high probability.
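The Hoeffding-based stopping rule mentioned above can be made concrete. For samples bounded in an interval of width R, Hoeffding's inequality gives P(|mean_est - mean| >= eps) <= 2*exp(-2*n*eps^2 / R^2), so for confidence 1 - delta the confidence-interval half-width after n samples is eps(n) = R*sqrt(ln(2/delta) / (2n)). This sketch (not the report's algorithm, which further exploits visualization properties) computes both directions of that bound:

```python
# Hoeffding confidence-interval half-width and the matching sample budget.
import math

def hoeffding_half_width(n, delta, value_range):
    """CI half-width eps after n samples of values bounded in a range of
    width value_range, at confidence level 1 - delta."""
    return value_range * math.sqrt(math.log(2 / delta) / (2 * n))

def samples_needed(target_eps, delta, value_range):
    """Smallest n such that the Hoeffding CI half-width <= target_eps:
    invert eps(n) to get n >= R^2 * ln(2/delta) / (2 * eps^2)."""
    return math.ceil(
        (value_range ** 2) * math.log(2 / delta) / (2 * target_eps ** 2))
```

Note the quadratic dependence on the target half-width: halving eps quadruples the sample budget, which is exactly why relaxing precision requirements for visualization can cut sampling cost substantially.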

Detecting "similar code" is fundamental to many software engineering tasks. Current tools can help detect code with statically similar syntactic features (code clones). Unfortunately, some code fragments that behave alike without similar syntax may be missed. In this paper, we propose the term "code relatives" to refer to code with dynamically similar execution features. Code relatives can be used for such tasks as implementation-agnostic code search and classification of code with similar behavior for human understanding, which code clone detection cannot achieve. To detect code relatives, we present DyCLINK, which constructs an approximate runtime representation of code using a dynamic instruction graph. With our link-analysis-based subgraph matching algorithm, DyCLINK detects fine-grained code relatives efficiently. In our experiments, DyCLINK analyzed 290+ million prospective subgraph matches. The results show that DyCLINK detects not only code relatives, but also code clones that the state-of-the-art system is unable to identify. In a code classification problem, DyCLINK achieved 96% precision on average, compared with the competitor's 61%.

Dynamic taint tracking is an information flow analysis that can be applied to many areas of testing. Phosphor is the first portable, accurate and performant dynamic taint tracking system for Java. While previous systems for performing general-purpose taint tracking in the JVM required specialized research JVMs, Phosphor works with standard off-the-shelf JVMs (such as Oracle's HotSpot and OpenJDK's IcedTea). Phosphor also differs from previous portable JVM taint tracking systems that were not general purpose (e.g. tracked only tags on Strings and no other type), in that it tracks tags on all variables. We have also made several enhancements to Phosphor, allowing it to track taint tags through control flow (in addition to data flow), as well as allowing it to track an arbitrary number of relationships between taint tags (rather than be limited to only 32 tags). In this demonstration, we show how developers writing testing tools can benefit from Phosphor, and explain briefly how to interact with it.

Abstraction in hardware description languages stalled at the
register-transfer level decades ago, yet few alternatives have had
much success, in part because they provide only modest gains in
expressivity. We propose to make a much larger jump: a compiler that
synthesizes hardware from behavioral functional specifications. Our
compiler translates general Haskell programs into a restricted
intermediate representation before applying a series of
semantics-preserving transformations, concluding with a simple
syntax-directed translation to SystemVerilog. Here, we present the
overall framework for this compiler, focusing on the IRs involved and
our method for translating general recursive functions into equivalent
hardware. We conclude with experimental results that depict the
performance and resource usage of the circuitry generated with our
compiler.

With the widespread use of mobile systems, there is a growing demand for apps that can enable users to collaboratively use multiple mobile systems, including hardware device features such as cameras, displays, speakers, microphones, sensors, and input. We present M2, a system for multi-mobile computing by enabling remote sharing and combining of devices across multiple mobile systems. M2 leverages higher-level device abstractions and encoding and decoding hardware in mobile systems to define a cross-platform interface for remote device sharing to operate seamlessly across heterogeneous mobile hardware and software. M2 can be used to build new multi-mobile apps as well as make existing unmodified apps multi-mobile aware through the use of fused devices, which transparently combine multiple devices into a more capable one. We have implemented an M2 prototype on Android that operates across heterogeneous hardware and software, including using Android and iOS remote devices, the latter allowing iOS users to also run Android apps. Our results using unmodified apps from Google Play show that M2 can enable even display-intensive 2D and 3D games to use remote devices across multiple mobile systems with modest overhead and qualitative performance indistinguishable from using local device hardware.

Dynamic Inference of Likely Metamorphic Properties to Support Differential Testing

Fang-Hsiang Su, Jonathan Bell, Christian Murphy, Gail Kaiser

2015-02-27

Metamorphic testing is an advanced technique for testing programs that lack a test oracle, such as machine learning applications. Because these programs have no general oracle to identify their correctness, traditional testing techniques such as unit testing may not be helpful for developers to detect potential bugs. This paper presents a novel system, KABU, which can dynamically infer properties that describe the characteristics of a program before and after transforming its input at the method level. Metamorphic Properties (MPs) are pivotal to detecting potential bugs in programs without test oracles, but most previous work relies solely on human effort to identify them. This paper also proposes a new testing concept, Metamorphic Differential Testing (MDT). By comparing the MPs between different versions of the same application, KABU can detect potential bugs in the program. We have performed a preliminary evaluation of KABU by comparing the MPs detected by humans with the MPs detected by KABU. Our preliminary results are very promising: KABU can find more MPs than human developers, and its differential testing mechanism is effective at detecting function changes in programs.

We present DisCo, a novel display-camera communication system. DisCo enables displays and cameras to communicate with each other, while also displaying and capturing images for human consumption. Messages are transmitted by temporally modulating the display brightness at high frequencies so that they are imperceptible to humans. Messages are received by a rolling shutter camera which converts the temporally modulated incident light into a spatial flicker pattern. In the captured image, the flicker pattern is superimposed on the pattern shown on the display. The flicker and the display pattern are separated by capturing two images with different exposures. The proposed system performs robustly in challenging real-world situations such as occlusion, variable display size, defocus blur, perspective distortion and camera rotation. Unlike several existing visible light communication methods, DisCo works with off-the-shelf image sensors. It is compatible with a variety of sources (including displays, single LEDs), as well as reflective surfaces illuminated with light sources. We have built hardware prototypes that demonstrate DisCo's performance in several scenarios. Because of its robustness, speed, ease of use and generality, DisCo can be widely deployed in several CHI applications, such as advertising, pairing of displays with cell-phones, tagging objects in stores and museums, and indoor navigation.

This is a contribution for the November 2014 Dagstuhl workshop on affordable Internet access. The contribution describes the issues of availability, affordability and relevance, with a particular focus on the experience with providing universal broadband Internet access in the United States.

Among all classes of parallel programming abstractions, lock-free data structures are considered one of the most scalable and efficient because of their fine-grained style of synchronization. However, they are also challenging for developers and tools to verify because of the huge number of possible interleavings that result from fine-grained synchronizations.
This paper addresses this fundamental tension between the performance and verifiability of lock-free data structures. We present TXIT, a system that greatly reduces the set of possible interleavings by inserting transactions into the implementation of a lock-free data structure. We leverage hardware transactional memory support from Intel Haswell processors to enforce these artificial transactions. Evaluation on six popular lock-free data structures shows that TXIT makes it easy to verify lock-free data structures while incurring acceptable runtime overhead. Further analysis shows that two inefficiencies in Haswell are the largest contributors to this overhead.

For some applications, it is impossible or impractical to know what the correct output should be for an arbitrary input, making testing difficult. Many machine-learning applications for "big data", bioinformatics, and cyber-physical systems fall in this scope: they do not have a test oracle. Metamorphic Testing, a simple testing technique that does not require a test oracle, has been shown to be effective for testing such applications. We present Metamorphic Runtime Checking, a novel approach that conducts metamorphic testing of both the entire application and individual functions during a program's execution. We have applied Metamorphic Runtime Checking to 9 machine-learning applications, finding it to be on average 170% more effective than traditional metamorphic testing at only the full application level.
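As a minimal illustration of testing without an oracle, the sketch below checks two metamorphic properties of a stand-in function (the mean): permutation invariance and additive shift. The function and properties are invented examples for illustration, not drawn from the nine applications studied:

```python
# Metamorphic checks: we cannot say what the "correct" output is, but we
# can assert how the output must change (or not) under input transformations.
import random

def model_score(xs):
    """Stand-in for an application with no test oracle (here: the mean)."""
    return sum(xs) / len(xs)

def check_metamorphic_properties(f, xs, trials=100):
    for _ in range(trials):
        # Property 1: permuting the input must not change the output.
        shuffled = random.sample(xs, len(xs))
        assert abs(f(shuffled) - f(xs)) < 1e-9
        # Property 2: adding a constant c to every input shifts the output by c.
        c = random.uniform(-10, 10)
        assert abs(f([x + c for x in xs]) - (f(xs) + c)) < 1e-9
    return True
```

Checking such properties at the level of individual functions during execution, rather than only over the whole application, is the distinction Metamorphic Runtime Checking draws.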

We present PANDA, an open-source tool that has been purpose-built to support whole system reverse engineering. It is built upon the QEMU whole system emulator, and so analyses have access to all code executing in the guest and all data. PANDA adds the ability to record and replay executions, enabling iterative, deep, whole system analyses. Further, the replay log files are compact and shareable, allowing for repeatable experiments. A nine billion instruction boot of FreeBSD, e.g., is represented by only a few hundred MB. Further, PANDA leverages QEMU's support of thirteen different CPU architectures to make analyses of those diverse instruction sets possible within the LLVM IR. In this way, PANDA can have a single dynamic taint analysis, for example, that precisely supports many CPUs. PANDA analyses are written in a simple plugin architecture which includes a mechanism to share functionality between plugins, increasing analysis code re-use and simplifying complex analysis development. We demonstrate PANDA's effectiveness via a number of use cases, including enabling an old but legitimate version of Starcraft to run despite a lost CD key, in-depth diagnosis of an Internet Explorer crash, and uncovering the censorship activities and mechanisms of a Chinese IM client.

Discovering code clones in a runtime environment helps software engineers identify hard-to-find logic-based bugs. Yet most research in the area of code clone discovery deals with source code, due to the complexity of finding clones in a dynamic environment. KAMINO manipulates Java bytecode to track control and data flow dependencies at the method level of Java programs during runtime. It then matches similar flows to find semantic code clones. With positive preliminary results showing that KAMINO can find code clones, future tests will compare its robustness to existing code clone detection tools.

Detecting, Isolating and Enforcing Dependencies Between and Within Test Cases

Jonathan Bell

2014-07-06

Testing stateful applications is challenging, as it can be difficult to identify hidden dependencies on program state. These dependencies may manifest between several test cases, or simply within a single test case. When it's left to developers to document, understand, and respond to these dependencies, a mistake can result in unexpected and invalid test results. Although current testing infrastructure does not leverage state dependency information, we argue that it could, and that by doing so testing can be improved. Our results thus far show that by recovering dependencies between test cases and modifying the popular testing framework, JUnit, to utilize this information, we can optimize the testing process, reducing the time needed to run tests by 62% on average. Our ongoing work is to apply similar analyses to improve existing state-of-the-art test suite prioritization techniques and state-of-the-art test case generation techniques. This work is advised by Professor Gail Kaiser.
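The kind of hidden dependency described here can be demonstrated with a toy detector that runs test pairs in different orders and flags outcome changes. This mirrors only the general idea; the actual system recovers dependencies for real JUnit suites, not via this brute-force scheme:

```python
# Toy dependency detector: a test "depends on" another if its outcome
# differs when run after that test versus on fresh state.
def find_dependencies(tests):
    """tests: {name: callable(state) -> bool}; state is a shared mutable dict
    standing in for hidden program state."""
    baseline = {name: t({}) for name, t in tests.items()}  # fresh-state runs
    deps = []
    for first in tests:
        for second in tests:
            if first == second:
                continue
            state = {}
            tests[first](state)                 # run `first`, mutating state
            if tests[second](state) != baseline[second]:
                deps.append((second, first))    # `second` depends on `first`
    return deps
```

A scheduler armed with such pairs can keep dependent tests in a valid order while freely reordering or parallelizing the rest, which is the optimization opportunity the abstract describes.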

In correlation-based time-of-flight (C-ToF) imaging systems, light sources with temporally varying intensities illuminate the scene. Due to global illumination, the temporally varying radiance received at the sensor is a combination of light received along multiple paths. Recovering scene properties (e.g., scene depths) from the received radiance requires separating these contributions, which is challenging due to the complexity of global illumination and the additional temporal dimension of the radiance.
We propose phasor imaging, a framework for performing fast inverse light transport analysis using C-ToF sensors. Phasor imaging is based on the idea that by representing light transport quantities as phasors and light transport events as phasor transformations, light transport analysis can be simplified in the temporal frequency domain. We study the effect of temporal illumination frequencies on light transport, and show that for a broad range of scenes, global radiance (multi-path interference) vanishes for frequencies higher than a scene-dependent threshold. We use this observation for developing two novel scene recovery techniques. First, we present Micro ToF imaging, a ToF based shape recovery technique that is robust to errors due to global illumination. Second, we present a technique for separating the direct and global components of radiance. Both techniques require capturing as few as 3-4 images and minimal computations. We demonstrate the validity of the presented techniques via simulations and experiments performed with our hardware prototype.
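As a toy illustration of the phasor view (our own sketch, not the paper's algorithm): a measurement at one temporal illumination frequency is a complex phasor, and if global radiance vanishes above the scene-dependent threshold, a high-frequency phasor captures direct transport alone, so the global component falls out by subtraction.

```python
import cmath

# Toy phasor-domain direct/global separation. Assumes, for illustration only,
# that the direct phasor is identical at the two measurement frequencies and
# that the high frequency is above the scene-dependent threshold, so its
# measurement contains no global (multi-path) term.

def phasor(amplitude, phase):
    return amplitude * cmath.exp(1j * phase)

def separate(low_freq_phasor, high_freq_phasor):
    direct = high_freq_phasor                       # global term has vanished
    global_part = low_freq_phasor - high_freq_phasor  # what the low freq added
    return direct, global_part
```
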

Cloud computing offers a scalable, low-cost, and resilient platform for critical applications. Securing these applications against attacks targeting unknown vulnerabilities is an unsolved challenge. Network anomaly detection addresses such zero-day attacks by modeling attributes of attack-free application traffic and raising alerts when new traffic deviates from this model. Content anomaly detection (CAD) is a variant of this approach that models the payloads of such traffic instead of higher level attributes. Zero-day attacks then appear as outliers to properly trained CAD sensors. In the past, CAD was unsuited to cloud environments due to the relative overhead of content inspection and the dynamic routing of content paths to geographically diverse sites. We challenge this notion and introduce new methods for efficiently aggregating content models to enable scalable CAD in dynamically-pathed environments such as the cloud. These methods eliminate the need to exchange raw content, drastically reduce network and CPU overhead, and offer varying levels of content privacy. We perform a comparative analysis of our methods using Random Forest, Logistic Regression, and Bloom Filter-based classifiers for operation in the cloud or other distributed settings such as wireless sensor networks. We find that content model aggregation offers statistically significant improvements over non-aggregate models with minimal overhead, and that distributed and non-distributed CAD have statistically indistinguishable performance. Thus, these methods enable the practical deployment of accurate CAD sensors in a distributed attack detection infrastructure.
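The model-aggregation idea can be sketched with Bloom filters, one of the classifier families evaluated. The parameters and payload strings below are hypothetical; the point is that aggregating two sites' content models is a bitwise OR of fixed-size filters, so no raw content needs to be exchanged.

```python
import hashlib

# Minimal Bloom-filter content model (illustrative; sizes are arbitrary).
M = 1024  # filter size in bits
K = 3     # number of hash functions

def _positions(item):
    return [int(hashlib.sha256(f"{i}:{item}".encode()).hexdigest(), 16) % M
            for i in range(K)]

def add(filter_bits, item):
    for p in _positions(item):
        filter_bits[p] = 1

def contains(filter_bits, item):
    return all(filter_bits[p] for p in _positions(item))

def aggregate(a, b):
    # Union of two content models: bitwise OR, no raw content exchanged.
    return [x | y for x, y in zip(a, b)]
```
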

Vernam, Mauborgne, and Friedman: The One-Time Pad and the Index of Coincidence

Steven M. Bellovin

2014-05-13

The conventional narrative for the invention of the AT&T one-time pad was related by David Kahn. Based on the evidence available in the AT&T patent files and from interviews and correspondence, he concluded that Gilbert Vernam came up with the need for randomness, while Joseph Mauborgne realized the need for a non-repeating key. Examination of other documents suggests a different narrative. It is most likely that Vernam came up with the need for non-repetition; Mauborgne, though, apparently contributed materially to the invention of the two-tape variant. Furthermore, there is reason to suspect that he suggested the need for randomness to Vernam. However, neither Mauborgne, Herbert Yardley, nor anyone at AT&T really understood the security advantages of the true one-time tape. Col. Parker Hitt may have; William Friedman definitely did. Finally, we show that Friedman's attacks on the two-tape variant likely led to his invention of the index of coincidence, arguably the single most important publication in the history of cryptanalysis.
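For reference, the index of coincidence mentioned above is the probability that two letters drawn at random from a text are identical. A direct computation of the standard statistic:

```python
from collections import Counter

def index_of_coincidence(text):
    """IC = sum over letters of f*(f-1) / (N*(N-1)), ignoring non-letters."""
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))
```

For uniformly random text over 26 letters the expected IC is about 0.038, while English plaintext sits near 0.067, which is what makes the statistic useful for cryptanalysis.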

Data privacy when using online systems like Facebook and Amazon has become an increasingly popular topic in the last few years. This thesis consists of the following four projects, which aim to address issues of privacy and software engineering.
First, only a little is known about how users and developers perceive privacy and which concrete measures would mitigate their privacy concerns. To investigate privacy requirements, we conducted an online survey with closed and open questions and collected 408 valid responses. Our results show that users often reduce privacy to security, with data sharing and data breaches being their biggest concerns. Users are more concerned about the content of their documents and their personal data such as location than about their interaction data. Unlike users, developers clearly prefer technical measures like data anonymization and think that privacy laws and policies are less effective. We also observed interesting differences between people from different geographies. For example, people from Europe are more concerned about data breaches than people from North America. People from Asia/Pacific and Europe believe that content and metadata are more critical for privacy than people from North America. Our results contribute to developing a user-driven privacy framework that is based on empirical evidence in addition to the legal, technical, and commercial perspectives.
Second, a related challenge is to make privacy more understandable in complex systems that may have a variety of user interface options, which may change often. As social network platforms have evolved, the ability for users to control how and with whom information is being shared introduces challenges concerning the configuration and comprehension of privacy settings. To address these concerns, our crowd-sourced approach simplifies the understanding of privacy settings by using data collected from 512 users over a 17-month period to generate visualizations that allow users to compare their personal settings to an arbitrary subset of individuals of their choosing. To validate our approach, we conducted an online survey with closed and open questions and collected 59 valid responses, after which we conducted follow-up interviews with 10 respondents. Our results showed that 70% of respondents found visualizations using crowd-sourced data useful for understanding privacy settings, and 80% preferred a crowd-sourced tool for configuring their privacy settings over current privacy controls.
Third, as software evolves over time, changes might introduce bugs that breach users' privacy. Further, there might be system-wide policy changes that could make users' settings more or less private than before. We present a novel technique that end-users can use to detect changes in privacy, i.e., regression testing for privacy. Using a social approach to detecting privacy bugs, we present two prototype tools. Our evaluation shows the feasibility and utility of our approach for detecting privacy bugs. We highlight two interesting case studies on the bugs that were discovered using our tools. To the best of our knowledge, this is the first technique that leverages regression testing for detecting privacy bugs from an end-user perspective.
Fourth, approaches to addressing these privacy concerns typically require substantial extra computational resources, which might be beneficial where privacy is concerned, but may have significant negative impact with respect to Green Computing and sustainability, another major societal concern. Spending more computation time results in spending more energy and other resources that make the software system less sustainable. Ideally, what we would like are techniques for designing software systems that address these privacy concerns but which are also sustainable: systems where privacy could be achieved "for free," i.e., without having to spend extra computational effort. We describe how privacy can indeed be achieved for free, as an accidental and beneficial side effect of doing some existing computation, in web applications and online systems that have access to user data. We show the feasibility, sustainability, and utility of our approach and what types of privacy threats it can mitigate.
Finally, we generalize the problem of privacy and its tradeoffs. As Social Computing has increasingly captivated the general public, it has become a popular research area for computer scientists. Social Computing research focuses on online social behavior and using artifacts derived from it for providing recommendations and other useful community knowledge. Unfortunately, some of that behavior and knowledge incur societal costs, particularly with regards to Privacy, which is viewed quite differently by different populations as well as regulated differently in different locales. But clever technical solutions to those challenges may impose additional societal costs, e.g., by consuming substantial resources at odds with Green Computing, another major area of societal concern. We propose a new crosscutting research area, Societal Computing, that focuses on the technical tradeoffs among computational models and application domains that raise significant societal issues. We highlight some of the relevant research topics and open problems that we foresee in Societal Computing. We feel that these topics, and Societal Computing in general, need to gain prominence as they will provide useful avenues of research leading to increasing benefits for society as a whole.

We report the results from experiments on the convergence of the multimaterial mesh-based surface tracking method introduced by the same authors. Under mesh refinement, approximately first order convergence or higher in L1 and L2 is shown for vertex positions, face normals and non-manifold junction curves in a number of scenarios involving the new operations proposed in the method.

Cyberwar is very much in the news these days. It is tempting to try to understand the economics of such an activity, if only qualitatively. What effort is required? What can such attacks accomplish? What does this say, if anything, about the likelihood of cyberwar?

This paper introduces energy exchanges, a set of abstractions that allow applications to help hardware and operating systems manage power and energy consumption. Using annotations, energy exchanges dictate when, where, and how to trade performance or accuracy for power in ways that only an application's developer can decide. In particular, the abstractions offer audits and budgets, which watch and cap the power or energy of some piece of the application. The interface also exposes energy and power usage reports, which an application may use to change its behavior. Such information complements existing system-wide energy management by operating systems or hardware, which provides global fairness and protection but is unaware of the internal dynamics of an application.
Energy exchanges are implemented as a user-level C++ library. The library employs an accounting technique to attribute shares of system-wide energy consumption (provided by system-wide hardware energy meters available on newer hardware platforms) to individual application threads. With these per-thread meters and careful tracking of an application's activity, the library exposes energy and power usage for program regions of interest via the energy exchange abstractions with negligible runtime or power overhead. We use the library to demonstrate three applications of energy exchanges: (1) the prioritization of a mobile game's energy use over third-party advertisements, (2) dynamic adaptation of the framerate of a video tracking benchmark to maximize performance and accuracy within the confines of a given energy allotment, and (3) the triggering of computational sprints and corresponding cooldowns based on time, system TDP, and power consumption.
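The real library is C++ over hardware energy meters; the Python sketch below only mimics the shape of the audit/budget abstractions, with a fake meter standing in for the per-thread energy accounting (all names here are our own, not the library's API).

```python
# Hypothetical meter: in the real system this would be a per-thread share of
# a hardware energy counter; here it is just a number we can bump by hand.
class FakeMeter:
    def __init__(self):
        self.joules = 0.0
    def read(self):
        return self.joules

class Audit:
    """Watch the energy used by a region of the application."""
    def __init__(self, meter):
        self.meter = meter
    def __enter__(self):
        self.start = self.meter.read()
        return self
    def __exit__(self, *exc):
        self.used = self.meter.read() - self.start
        return False

class Budget(Audit):
    """Cap a region's energy; over_budget reports whether the cap was blown."""
    def __init__(self, meter, cap):
        super().__init__(meter)
        self.cap = cap
    def __exit__(self, *exc):
        super().__exit__(*exc)
        self.over_budget = self.used > self.cap
        return False
```

An application would wrap a region (e.g., rendering an advertisement) in a `Budget` and consult `over_budget` to decide whether to degrade or skip that work.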

Dynamic taint analysis is a well-known information flow analysis problem with many possible applications. Taint tracking allows for analysis of application data flow by assigning labels to inputs, and then propagating those labels through data flow. Taint tracking systems traditionally compromise among performance, precision, accuracy, and portability. Performance can be critical, as these systems are typically intended to be deployed with software, and hence must have low overhead. To be deployed in security-conscious settings, taint tracking must also be accurate and precise.
Dynamic taint tracking must be portable in order to be easily deployed and adopted for real world purposes, without requiring recompilation of the operating system or language interpreter, and without requiring access to application source code.
We present Phosphor, a dynamic taint tracking system for the Java Virtual Machine (JVM) that simultaneously achieves our goals of performance, accuracy, precision, and portability. Moreover, to our knowledge, it is the first portable general-purpose taint tracking system for the JVM. We evaluated Phosphor's performance on two commonly used JVM languages (Java and Scala), on two versions of two commonly used JVMs (Oracle's HotSpot and OpenJDK's IcedTea), and on Android's Dalvik Virtual Machine, finding its runtime overhead to be impressively low: as little as 3% (53% on average) on the DaCapo macro benchmark suite. This paper describes the approach that Phosphor uses to achieve portable taint tracking in the JVM.
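Conceptually, taint tracking can be sketched as values carrying label sets whose union propagates through operations. This is an illustrative toy only: Phosphor itself achieves this by instrumenting JVM bytecode rather than by wrapping values.

```python
# Toy taint propagation: each value carries a frozenset of labels, and an
# operation's result carries the union of its operands' labels.
class Tainted:
    def __init__(self, value, labels=frozenset()):
        self.value = value
        self.labels = frozenset(labels)

    def __add__(self, other):
        if isinstance(other, Tainted):
            return Tainted(self.value + other.value,
                           self.labels | other.labels)
        # Untainted operands contribute no labels.
        return Tainted(self.value + other, self.labels)
```

A sink (say, a SQL query builder) could then inspect `labels` to reject values derived from untrusted inputs.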

Despite the variety of choices regarding hardware and software, to date a large number of computer systems remain identical. Characteristic examples of this trend are Windows on x86 and Android on ARM. This homogeneity, sometimes referred to as a "computing oligoculture," provides a fertile ground for malware in the highly networked world of today. One way to counter this problem is to diversify systems so that attackers cannot quickly and easily compromise a large number of machines. For instance, if each system has a different ISA, the attacker has to invest more time in developing exploits that run on every system manifestation. It is not that each individual attack gets harder, but the spread of malware slows down. Further, if the diversified ISA is kept secret from the attacker, the bar for exploitation is raised even higher. In this paper, we show that system diversification can be realized by enabling diversity at the lowest hardware/software interface, the ISA, with almost zero performance overhead. We also describe how practical development and deployment problems of diversified systems can be handled easily in the context of popular software distribution models, such as the mobile app store model. We demonstrate our proposal with an OpenSPARC FPGA prototype.

Students traditionally learn microarchitecture by studying textual descriptions with diagrams but few analogies. Several popular textbooks on this topic introduce concepts such as pipelining and caching in the context of simple paper-only architectures. While this instructional style allows important concepts to be covered within a given class period, students have difficulty bridging the gap between what is covered in classes and real-world implementations. Discussing concrete implementations and complications would, however, take too much time. In this paper, we propose a technique of representing microarchitecture building blocks with animated metaphors to accelerate the process of learning about complex microarchitectures. We represent hardware implementations as road networks that include specific patterns of traffic flow found in microarchitectural behavior. Our experiences indicate an 83% improvement in understanding memory system microarchitecture. We believe the mental models developed by these students will serve them in remembering microarchitectural behavior and extend to learning new microarchitectures more easily.

Recent advances in hardware security have led to the development of FANCI (Functional Analysis for Nearly-Unused Circuit Identification), an analysis tool that identifies stealthy, malicious circuits within hardware designs that can perform backdoor behavior. Evaluations of such tools against benchmarks and academic attacks are not always equivalent to the dynamic attack scenarios that can arise in the real world. For this reason, we apply a red team/blue team approach to stress-test FANCI's ability to efficiently detect malicious backdoor circuits within hardware designs. In the Embedded Systems Challenge (ESC) 2013, teams from research groups on multiple continents created designs with malicious backdoors hidden in them as part of a red team effort to circumvent FANCI. Notably, these backdoors were not placed into a priori known designs; the red team was allowed to create arbitrary, unspecified designs. Two interesting results came out of this effort. The first was that FANCI was surprisingly resilient to this wide variety of attacks and was not circumvented by any of the stealthy backdoors created by the red teams. The second was that frequent-action backdoors, which are backdoors that are not made stealthy, were often successful. These results emphasize the importance of combining FANCI with a reasonable degree of validation testing. The blue team efforts also exposed some aspects of the FANCI prototype that make analysis time-consuming in some cases, which motivates further development of the prototype in the future.

Recent works have shown promise in using microarchitectural execution patterns to detect malware programs. These detectors belong to a class of detectors known as signature-based detectors, as they catch malware by comparing a program's execution pattern (signature) to execution patterns of known malware programs. In this work, we propose a new class of detectors, anomaly-based hardware malware detectors, that do not require signatures for malware detection and thus can catch a wider range of malware, including potentially novel ones. We use unsupervised machine learning to build profiles of normal program execution based on data from performance counters, and use these profiles to detect significant deviations in program behavior that occur as a result of malware exploitation. We show that real-world exploitation of popular programs such as IE and Adobe PDF Reader on a Windows/x86 platform can be detected with nearly perfect certainty. We also examine the limits and challenges in implementing this approach in the face of a sophisticated adversary attempting to evade anomaly-based detection. The proposed detector is complementary to previously proposed signature-based detectors, and the two can be used together to improve security.
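A minimal sketch of the anomaly-detection idea, assuming per-counter baselines and a simple z-score threshold (our own illustration; the paper builds unsupervised machine-learning models rather than this thresholding):

```python
import statistics

def fit(baseline_samples):
    """Learn per-counter (mean, stdev) from attack-free runs.
    baseline_samples: list of samples, each a list of counter values."""
    counters = list(zip(*baseline_samples))
    return [(statistics.mean(c), statistics.stdev(c)) for c in counters]

def is_anomalous(profile, sample, threshold=3.0):
    """Flag a sample whose z-score on any counter exceeds the threshold."""
    for (mean, stdev), x in zip(profile, sample):
        if stdev > 0 and abs(x - mean) / stdev > threshold:
            return True
    return False
```

In a real deployment the "counters" would be performance-counter readings (cache misses, branch mispredictions, etc.) sampled during execution of the protected program.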

Smartphones today offer ground-breaking features that were once only a figment of the imagination, and for ever-demanding users, the already long list of features a smartphone supports keeps growing with time. These features aid both personal and professional use. Extrapolating the features of a present-day smartphone into the future, our lives as smartphone users are going to become unimaginably agile.
Given this current and future potential, the ability to virtualize smartphones, with all their real-world features, on a virtual platform is a boon for those who want to rigorously experiment with and customize smartphone hardware without spending an extra penny. Once smartphones can be virtualized independently at scale, the idea takes an interesting turn when the sensors, too, are virtualized in a way that is close to their real-world behavior.
When accessible remotely with real-time responsiveness, this real-world behavior becomes valuable in many real-world systems, notably life-saving systems such as those that instantaneously receive alerts about harmful magnetic radiation in deep mining areas. Such systems could be installed at scale on desktops or large servers as virtualized smartphones whose virtualized sensors remotely fetch readings from the real hardware sensors of a real smartphone in real time. Based on these readings, people working in the affected areas can be alerted, and thus saved, by the operators at the desktops or servers hosting the virtualized smartphones.
In addition, a direct and major advantage of such real-hardware-driven sensor emulation in an emulated Android(-x86) environment is that Android applications that use sensors can now run on the emulator and act under the influence of real hardware sensors via the emulated ones. The current work on sensor emulation is quite different from existing and past sensor-related work. The difference comes from full-fledged sensor emulation in a virtualized smartphone environment, as opposed to building sophisticated physical systems that aggregate readings from real hardware sensors, possibly remotely and in real time; for example, wireless-sensor-network-based remote sensing systems that install real hardware sensors in remote places and collect their readings at a centralized server for real-time or offline analysis. Such systems merely collect real sensor readings at a centralized entity; in the current work, by contrast, the emulated sensors behave exactly like the remote real hardware sensors. The emulated sensors can be calibrated, sped up, or slowed down (in terms of their sampling frequency), and they influence a sensor-based application running inside the virtualized smartphone exactly as the real hardware sensors of a real phone would influence the same application running on that phone. In essence, the current work is about generalizing sensors, with all their real-world characteristics, in a virtualized platform, rather than just providing a framework to send and receive sensor readings over the network between real and virtual phones.
Realizing these advantages of sensor emulation, which adds virtualized sensor support to emulated environments, the current work emulates a total of ten sensors present in the real smartphone used, a Samsung Nexus S running Android; virtual phones run Android-x86. The reason for choosing Android-x86 for the virtual phone is that x86-based Android devices are feature-rich compared to ARM-based ones; for example, a full-fledged x86 desktop or tablet has more features than a relatively small smartphone. Of the ten sensors, five are real and the rest are virtual or synthetic. The real ones are the accelerometer, magnetometer, light, proximity, and gyroscope sensors, whereas the virtual ones are the corrected gyroscope, linear acceleration, gravity, orientation, and rotation vector sensors. The emulated Android-x86 runs Android release Jelly Bean 4.3.1, which differs only slightly, in terms of bug fixes, from the Android Jelly Bean 4.3 running on the real smartphone.
One noteworthy aspect of the accomplished sensor emulation is that it is demand-less: exactly the same sensor-based Android applications are able to use the sensors on the real and virtual phones, with absolutely no difference in their sensor-based behavior.
The emulation's core idea is socket communication between two modules of the Hardware Abstraction Layer (HAL), which is driver-agnostic, carried out remotely over a wireless network in real time. Apart from the paired-real-device scenario, in which real hardware sensor readings are fetched, the sensor emulation is also compatible with a remote-server scenario, in which artificially generated sensor readings are fetched from a remote server. Because the sensor emulation is built on mere end-to-end socket communication, the real and virtual phones can run different Android(-x86) releases with no real impact on the emulation.
Once completed, the sensor emulation was evaluated for each emulated sensor using applications from the Android Market as well as the Amazon Appstore. The application categories include both basic sensor-test applications that show raw sensor readings and advanced 3D sensor-driven games that are emulator-compatible, especially in terms of graphics. The evaluations showed the current work of sensor emulation to be generic, efficient, robust, fast, accurate, and real.
As of this writing (January 2014), the current work is the sole system-level sensor virtualization work that embraces remoteness in real time for emulated Android-x86 systems. It is important to note that although the current work targets Android-x86, the code makes no assumptions that the underlying platform is x86; hence, the work is logically compatible with ARM-based emulated Android environments as well, though this was not actually tested.
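The socket-based exchange of sensor readings at the heart of the design might be sketched as follows. This is a minimal illustration with a hypothetical line-delimited JSON message format; the actual work operates between Android HAL modules over a wireless network.

```python
import json
import socket

# One side (the real phone) serializes a sensor reading; the other side
# (the emulated Android-x86 HAL module) parses it and feeds it to apps.

def send_reading(sock, sensor, values):
    msg = json.dumps({"sensor": sensor, "values": values}) + "\n"
    sock.sendall(msg.encode())

def recv_reading(sock_file):
    # sock_file: a file-like wrapper (e.g., sock.makefile()) for line reads.
    return json.loads(sock_file.readline())
```
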

We present a study of traffic behavior of two popular over-the-top (OTT) video streaming services (YouTube and Netflix). Our analysis is conducted on different mobile devices (iOS and Android) over various wireless networks (Wi-Fi, 3G and LTE) under dynamic network conditions. Our measurements show that the video players frequently discard a large amount of video content although it is successfully delivered to a client.
We first investigate the root cause of this unwanted behavior. Then, we propose a Quality-of-Service (QoS)-aware video streaming architecture for Long Term Evolution (LTE) networks to reduce wasted network resources and improve user experience. The architecture includes a selective packet discarding mechanism that can be placed in packet data network gateways (P-GW). In addition, our QoS-aware rules assist video players in selecting an appropriate resolution under fluctuating channel conditions. We monitor network conditions and configure QoS parameters to control the available maximum bandwidth in real time. In our experimental setup, the proposed platform saves up to 20.58% of downlink bandwidth and improves user experience by reducing the buffer underflow period to an average of 32 seconds.
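The selective discarding mechanism can be sketched as a deadline check at the gateway (field names are our own illustration; the actual P-GW logic is more involved): a packet whose playback deadline has already passed would be discarded by the client's player anyway, so forwarding it only wastes downlink bandwidth.

```python
def forward(packets, now):
    """Keep only packets still useful to the video player.
    Each packet is a dict with a hypothetical 'playback_deadline' field."""
    return [p for p in packets if p["playback_deadline"] > now]
```
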

We investigate video server selection algorithms in a distributed video-on-demand system. We conduct a detailed study of the YouTube Content Delivery Network (CDN) on PCs and mobile devices over Wi-Fi and 3G networks under varying network conditions. We find that a location-aware video server selection algorithm assigns a video content server based on the network attachment point of a client, and that such distance-based algorithms carry the risk of directing a client to a less-than-optimal content server even when better-performing video delivery servers exist. To solve this problem, we propose using dynamic network information, such as packet loss rates and Round Trip Time (RTT) between an edge node of a wireless network (e.g., an Internet Service Provider (ISP) router in a Wi-Fi network or a Radio Network Controller (RNC) node in a 3G network) and the video content servers, to find the optimal content server when a video is requested. Our empirical study shows that the proposed architecture can provide higher TCP performance, leading to better viewing quality compared to location-based video server selection algorithms.
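A sketch of the dynamic selection idea (the scoring formula and weight are our own illustration, not the paper's): prefer the content server with the best measured path from the wireless edge node, rather than the nearest one.

```python
def select_server(measurements, loss_weight=1000.0):
    """measurements: {server_name: (rtt_ms, loss_rate)}.
    Lower score is better; loss is weighted heavily because retransmissions
    hurt TCP throughput far more than a few extra ms of RTT."""
    def score(server):
        rtt_ms, loss_rate = measurements[server]
        return rtt_ms + loss_weight * loss_rate
    return min(measurements, key=score)
```

Here a lossy nearby server loses to a clean but more distant one, which is exactly the case where distance-based selection misfires.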

When belief propagation (BP) converges, it does so to a stationary point of the Bethe free energy $\F$, and is often strikingly accurate. However, it may converge only to a local optimum or may not converge at all. An algorithm was recently introduced for attractive binary pairwise MRFs which is guaranteed to return an $\epsilon$-approximation to the global minimum of $\F$ in polynomial time provided the maximum degree $\Delta=O(\log n)$, where $n$ is the number of variables. Here we significantly improve this algorithm and derive several results including a new approach based on analyzing first derivatives of $\F$, which leads to performance that is typically far superior and yields a fully polynomial-time approximation scheme (FPTAS) for attractive models without any degree restriction. Further, the method applies to general (non-attractive) models, though with no polynomial time guarantee in this case, leading to the important result that approximating $\log$ of the Bethe partition function, $\log Z_B=-\min \F$, for a general model to additive $\epsilon$-accuracy may be reduced to a discrete MAP inference problem. We explore an application to predicting equipment failure on an urban power network and demonstrate that the Bethe approximation can perform well even when BP fails to converge.
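For orientation, the Bethe free energy $\F$ referred to above has the standard form below, written in the usual notation with pairwise beliefs $b_{ij}$, singleton beliefs $b_i$, node degrees $d_i$, and potentials $\psi$ (the paper's exact parameterization may differ):

```latex
\F = U_B - H_B, \quad\text{where}\\
U_B = -\sum_{(i,j)\in E}\sum_{x_i,x_j} b_{ij}(x_i,x_j)\log\psi_{ij}(x_i,x_j)
      -\sum_{i}\sum_{x_i} b_i(x_i)\log\psi_i(x_i),\\
H_B = -\sum_{(i,j)\in E}\sum_{x_i,x_j} b_{ij}(x_i,x_j)\log b_{ij}(x_i,x_j)
      +\sum_{i}(d_i-1)\sum_{x_i} b_i(x_i)\log b_i(x_i).
```

The minimization is over beliefs satisfying normalization and pairwise marginal consistency, and $\log Z_B = -\min \F$ as stated in the abstract.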

Introductory computer science courses traditionally focus on exposing students to basic programming and computer science theory, leaving little or no time to teach students about software testing. Many students' mental model when they start learning programming is that "if it compiles and runs without crashing, it must work fine." Thus exposure to testing, even at a very basic level, can be very beneficial to students. In the short term, they will do better on their assignments, as testing before submission might help them discover bugs in their implementations that they had not realized were there. In the long term, they will appreciate the importance of testing as part of the software development life cycle.

As voice, multimedia, and data services are converging to IP, there is a need for a new networking architecture to support future innovations and applications. Users are consuming Internet services from multiple devices that have multiple network interfaces such as Wi-Fi, LTE, Bluetooth, and possibly wired LAN. Such diverse network connectivity can be used to increase both reliability and performance by running applications over multiple links, sequentially for seamless user experience, or in parallel for bandwidth and performance enhancements. The existing networking stack, however, offers almost no support for intelligently exploiting such network, device, and location diversity.
In this work, we survey recently proposed protocols and architectures that enable heterogeneous networking support. Upon evaluation, we abstract common design patterns and propose a unified networking architecture that makes better use of a heterogeneous dynamic environment, both in terms of networks and devices. The architecture enables mobile nodes to make intelligent decisions about how and when to use each or a combination of networks, based on access policies. With this new architecture, we envision a shift from current applications, which support a single network, location, and device at a time to applications that can support multiple networks, multiple locations, and multiple devices.

To provide high performance at practical power levels, tomorrow's
chips will have to consist primarily of application-specific logic
that is only powered on when needed. This paper discusses
synthesizing such logic from the functional language Haskell. The
proposed approach, which consists of rewriting steps that ultimately
dismantle the source program into a simple dialect that enables a
syntax-directed translation to hardware, enables aggressive
parallelization and the synthesis of application-specific distributed
memory systems. Transformations include scheduling arithmetic
operations onto specific data paths, replacing recursion with
iteration, and improving data locality by inlining recursive types. A
compiler based on these principles is under development.
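For illustration, the recursion-to-iteration transformation mentioned above can be sketched in Python (the proposed compiler operates on Haskell; this analogy only shows the effect of the rewrite):

```python
# Recursive form, as a compiler might receive it.
def fib_rec(n):
    return n if n < 2 else fib_rec(n - 1) + fib_rec(n - 2)

# After the recursion-to-iteration rewrite: an explicit loop over
# accumulator state, which maps directly onto a hardware datapath
# (two registers and an adder).
def fib_iter(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert all(fib_rec(n) == fib_iter(n) for n in range(15))
```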

Social network platforms have transformed how people communicate and share information. However, as these platforms have evolved, the ability for users to control how and with whom information is being shared introduces challenges concerning the configuration and comprehension of privacy settings. To address these concerns, our crowdsourced approach simplifies the understanding of privacy settings by using data collected from 512 users over a 17-month period to generate visualizations that allow users to compare their personal settings to an arbitrary subset of individuals of their choosing. To validate our approach we conducted an online survey with closed and open questions and collected 50 valid responses, after which we conducted follow-up interviews with 10 respondents. Our results showed that 80% of users found visualizations using crowdsourced data useful for understanding privacy settings, and 70% preferred a crowdsourced tool for configuring their privacy settings over current privacy controls.

Us and Them --- A Study of Privacy Requirements Across North America, Asia, and Europe

Swapneel Sheth, Gail Kaiser, Walid Maalej

2013-09-15

Data privacy when using online systems like Facebook and Amazon has
become an increasingly popular topic in the
last few years. However, little is known about how users and
developers perceive privacy and which concrete measures would mitigate privacy concerns. To investigate privacy requirements,
we conducted an online survey with closed and open questions and
collected 408 valid responses.
Our results show that users often reduce privacy to security, with
data sharing and data breaches being their biggest concerns. Users are more
concerned about the content of their documents and personal data such as location than
their interaction data.
Unlike users, developers clearly prefer technical measures like data
anonymization and think that privacy laws and policies are less effective.
We also observed interesting differences between people from different
geographies. For example, people from Europe are more concerned about
data breaches than people from North America. People from Asia/Pacific
and Europe believe that content and
metadata are more critical for privacy than people from North America.
Our results contribute to developing a user-driven privacy framework
that is based on empirical evidence in addition to the legal,
technical, and commercial perspectives.

Testing large software packages can become very time intensive. To address this problem, researchers have investigated techniques such as Test Suite Minimization. Test Suite Minimization reduces the number of tests in a suite by removing tests that appear redundant, at the risk of a reduction in fault-finding ability since it can be difficult to identify which tests are truly redundant. We take a completely different approach to solving the same problem of long-running test suites by instead reducing the time needed to execute each test, an approach that we call Unit Test Virtualization. With Unit Test Virtualization, we reduce the overhead of isolating each unit test with a lightweight virtualization container. We describe the empirical analysis that grounds our approach and provide an implementation of Unit Test Virtualization targeting Java applications. We evaluated our implementation, VMVM, using 20 real-world Java applications and found that it reduces test suite execution time by up to 97% (on average, 62%) when compared to traditional unit test execution. We also compared VMVM to a well-known Test Suite Minimization technique, finding the reduction provided by VMVM to be four times greater, while still executing every test with no loss of fault-finding ability.

Challenges arise in testing applications that do not have test oracles, i.e., for which it is impossible or impractical to know what the correct output should be for general input. Metamorphic testing, introduced by Chen et al., has been shown to be a simple yet effective technique in testing these types of applications: test inputs are transformed in such a way that it is possible to predict the expected change to the output, and if the output resulting from this transformation is not as expected, then a fault must exist.
Here, we improve upon previous work by presenting a new technique called Metamorphic Runtime Checking, which automatically conducts metamorphic testing of both the entire application and individual functions during a program's execution. This new approach improves the scope, scale, and sensitivity of metamorphic testing by allowing for the identification of more properties and execution of more tests, and increasing the likelihood of detecting faults not found by application-level properties. We also present the results of new mutation analysis studies that demonstrate that Metamorphic Runtime Checking can kill an average of 170% more mutants than traditional, application-level metamorphic testing alone, and advances the state of the art in testing applications without oracles.
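As an illustration of the kind of metamorphic property such a system checks, consider a function with no oracle for arbitrary inputs: related inputs must still yield predictably related outputs. The sine properties below are our own illustrative example, not from the paper:

```python
import math

# Function under test: no oracle gives the "right" value of sin(x) for
# arbitrary x, but metamorphic relations constrain related outputs.
def check_sine_metamorphic(x, tol=1e-9):
    # Property 1: sin(pi - x) == sin(x)
    assert abs(math.sin(math.pi - x) - math.sin(x)) < tol
    # Property 2: sin(-x) == -sin(x)
    assert abs(math.sin(-x) + math.sin(x)) < tol

# Any violation on any tested input reveals a fault -- no oracle needed.
for x in [0.0, 0.3, 1.7, 2.9]:
    check_sine_metamorphic(x)
```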

We study the ability of students in a senior/graduate software engineering course to understand and
apply metamorphic testing, a relatively recently invented advance in software testing research that
complements conventional approaches such as equivalence partitioning and boundary analysis. We
previously reported our investigation of the fall 2011 offering of the Columbia University course COMS
W4156 Advanced Software Engineering, and here report on the fall 2012 offering and contrast it to
the previous year. Our main findings are: 1) Although the students in the second offering did not do
very well on the newly added individual assignment specifically focused on metamorphic testing,
thereafter they were better able to find metamorphic properties for their team projects than the
students from the previous year who did not have that preliminary homework and, perhaps most
significantly, did not have the solution set for that homework. 2) Students in the second offering did
reasonably well using the relatively novel metamorphic testing technique vs. traditional black box
testing techniques in their projects (such comparison data is not available for the first offering). 3)
Finally, in both semesters, the majority of the student teams were able to apply metamorphic testing to
their team projects after only minimal instruction, which would imply that metamorphic testing is a
viable strategy for student testers.

Low-latency anonymous communication networks, such as Tor, are geared towards web browsing, instant messaging, and other semi-interactive
applications. To achieve acceptable quality of service, these systems
attempt to preserve packet inter-arrival characteristics, such as inter-packet delay. Consequently, a powerful adversary can mount traffic analysis attacks by observing similar traffic patterns at various points of the network, linking together otherwise unrelated network connections. Previous research has shown that having access to
a few Internet exchange points is enough for monitoring a significant percentage of the network paths from Tor nodes to destination servers.
Although the capacity of current networks makes packet-level monitoring at such a scale quite challenging, adversaries could potentially use less accurate but readily available traffic monitoring functionality, such as Cisco's NetFlow, to mount large-scale traffic analysis attacks.
In this paper, we assess the feasibility and effectiveness of practical
traffic analysis attacks against the Tor network using NetFlow data.
We present an active traffic analysis method based on deliberately
perturbing the characteristics of user traffic at the server side,
and observing a similar perturbation at the client side through
statistical correlation. We evaluate the accuracy of our method using both in-lab testing, as well as data gathered from a public Tor relay serving hundreds of users. Our method revealed the actual sources of anonymous traffic with 100% accuracy for the in-lab tests,
and achieved an overall accuracy of about 81.4% for the real-world experiments, with an average false positive rate of 6.4%.
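The statistical-correlation step can be illustrated with a toy Pearson correlation between an injected server-side throughput pattern and candidate client-side flows. All numbers here are made up for illustration; the real attack operates on coarse NetFlow records:

```python
# Sketch of the correlation step (not the Tor tooling itself): the server
# perturbs throughput in a known on/off pattern; we test which candidate
# client-side flow correlates most strongly with that pattern.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

server_pattern = [100, 100, 10, 10, 100, 100, 10, 10]  # injected perturbation
flow_a = [95, 102, 12, 9, 98, 101, 14, 11]             # the real client flow
flow_b = [55, 60, 58, 61, 57, 59, 60, 56]              # unrelated traffic

assert pearson(server_pattern, flow_a) > pearson(server_pattern, flow_b)
```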

Video streaming on mobile devices is on the rise. According to recent reports, mobile video streaming traffic accounted for 52.8% of total mobile data traffic in 2011, and it is forecast to reach 66.4% in 2015. We analyzed the network traffic behaviors of the two most popular HTTP-based video streaming services: YouTube and Netflix. Our research indicates that the network traffic behavior depends on factors such as the type of device, the multimedia application in use, and network conditions. Furthermore, we found that a large part of the downloaded video content can be discarded by a video player even though it is successfully delivered to the client. This unwanted behavior often occurs when the video player changes the resolution under fluctuating network conditions and the playout buffer is full while downloading a video. Some of the measurements show that the discarded data may exceed 35% of the total video content.

Energy optimizations are being aggressively pursued today. Can these optimizations open up security vulnerabilities? In this invited talk at the Energy Secure System Architectures Workshop (run by Pradip Bose from the IBM Watson Research Center), I discussed the security implications of energy optimizations, the capabilities of attackers, ease of exploitation, and the potential payoff to the attacker. I presented a mini tutorial on security for computer architects, and a personal research wish list for this emerging topic.

This paper presents a review of modern-day schlieren optics system and its application. Schlieren imaging systems provide a powerful technique to visualize changes or nonuniformities in refractive index of air or other transparent media. With the popularization of computational imaging techniques and widespread availability of digital imaging systems, schlieren systems provide novel methods of viewing transparent fluid dynamics. This paper presents a historical background of the technique, describes the methodology behind the system, presents a mathematical proof of schlieren fundamentals, and lists various recent applications and advancements in schlieren studies.

The increasing number of 802.11 APs and wireless devices results in more contention, which causes unsatisfactory WiFi network performance. In addition, non-WiFi devices sharing the same spectrum with 802.11 networks such as microwave ovens, cordless phones, and baby monitors severely interfere with WiFi networks. Although the problem sources can be easily removed in many cases, it is difficult for end users to identify the root cause.
We introduce WiSlow, a software tool that diagnoses the root causes of poor WiFi performance with user-level network probes and leverages peer collaboration to identify the location of the causes. We elaborate on two main methods: packet loss analysis and 802.11 ACK pattern analysis.

The Internet of Things (IoT) enables the physical world to be connected and controlled over the Internet. This paper presents a smart gateway platform that connects everyday objects such as lights, thermometers, and TVs over the Internet. The proposed hardware architecture is implemented on an Arduino platform with a variety of off-the-shelf home automation technologies such as Zigbee and X10. Using the microcontroller-based platform, the SECE (Sense Everything, Control Everything) system allows users to create various IoT services such as monitoring sensors, controlling actuators, triggering action events, and periodic sensor reporting. We give an overview of the Arduino-based smart gateway architecture and its integration into SECE.

Mobile devices are vertically integrated systems that are powerful, useful platforms, but unfortunately limit user choice and lock users and developers into a particular mobile ecosystem, such as iOS or Android. We present Chameleon, a multi-persona binary compatibility architecture that allows mobile device users to run applications built for different mobile ecosystems together on the same smartphone or tablet. Chameleon enhances the domestic operating system of a device with personas to mimic the application binary interface of a foreign operating system to run unmodified foreign binary applications. To accomplish this without reimplementing the entire foreign operating system from scratch, Chameleon provides four key mechanisms. First, a multi-persona binary interface is used that can load and execute both domestic and foreign applications that use different sets of system calls. Second, compile-time code adaptation makes it simple to reuse existing unmodified foreign kernel code in the domestic kernel. Third, API interposition and passport system calls make it possible to reuse foreign user code together with domestic kernel facilities to support foreign kernel functionality in user space. Fourth, schizophrenic processes allow foreign applications to use domestic libraries to access proprietary software and hardware interfaces on the device. We have built a Chameleon prototype and demonstrate that it imposes only modest performance overhead and can run iOS applications from the Apple App Store together with Android applications from Google Play on a Nexus 7 tablet running the latest version of Android.

As ARM CPUs become increasingly common in mobile devices and servers, there is a
growing demand for providing the benefits of virtualization for ARM-based
devices. We present our experiences building the Linux ARM hypervisor, KVM/ARM,
the first full system ARM virtualization solution that can run unmodified guest
operating systems on ARM multicore hardware. KVM/ARM introduces split-mode
virtualization, allowing a hypervisor to split its execution across CPU modes to
take advantage of CPU mode-specific features. This allows KVM/ARM to leverage
Linux kernel services and functionality to simplify hypervisor development and
maintainability while utilizing recent ARM hardware virtualization extensions to
run application workloads in guest operating systems with comparable performance
to native execution. KVM/ARM has been successfully merged into the mainline
Linux 3.9 kernel, ensuring that it will gain wide adoption as the virtualization
platform of choice for ARM. We provide the first measurements on real hardware
of a complete hypervisor using ARM hardware virtualization support. Our results
demonstrate that KVM/ARM has modest virtualization performance and power costs,
and can achieve lower performance and power costs compared to x86-based Linux
virtualization on multicore hardware.

FARE: A Framework for Benchmarking Reliability of Cyber-Physical Systems

Leon Wu, Gail Kaiser

2013-04-01

A cyber-physical system (CPS) is a system featuring a tight combination of, and coordination between, the system's computational and physical elements. System reliability is a critical requirement of cyber-physical systems. An unreliable CPS often leads to system malfunctions, service disruptions, financial losses, and even loss of human life. Improving CPS reliability requires an objective measurement, estimation, and comparison of CPS reliability. This paper describes FARE (Failure Analysis and Reliability Estimation), a framework for benchmarking reliability of cyber-physical systems. Prior research has proposed reliability benchmarks for specific CPS, such as wind power plants and wireless sensor networks, and has studied the reliability of CPS components, such as software and specific hardware. To the best of our knowledge, however, there is no reliability benchmark framework for CPS in general. The FARE framework provides a CPS reliability model and a set of methods and metrics for evaluation environment selection, failure analysis, and reliability estimation for benchmarking CPS reliability. It not only provides a retrospective evaluation and estimation of CPS reliability using past data, but also provides a mechanism for continuous monitoring and evaluation of CPS reliability for runtime enhancement. The framework is extensible for accommodating new reliability measurement techniques and metrics. It is also generic and applicable to a wide range of CPS applications. For an empirical study, we applied the FARE framework to a smart building management system for a large commercial building in New York City. Our experiments showed that FARE is easy to implement, accurate for comparison, and can be used for building useful industry benchmarks and standards after accumulating enough data.

Our accelerating computational demand and the rise of multicore
hardware have made parallel programs increasingly pervasive and
critical. Yet, these programs remain extremely difficult to write,
test, analyze, debug, and verify. In this article, we provide our view
on why parallel programs, specifically multithreaded programs, are
difficult to get right. We present a promising approach we call stable
multithreading to dramatically improve reliability, and summarize
our last four years' research on building and applying stable
multithreading systems.

Through a series of mechanical, semantics-preserving transformations,
I show how a three-line recursive Haskell program (Fibonacci) can be
transformed to a hardware description language -- Verilog -- that can be
synthesized on an FPGA. This report lays groundwork for a
compiler that will perform this transformation automatically.

We discuss practical details and basic scalability for two recent ideas for hardware encryption for trojan prevention. The broad idea is to encrypt the data used as inputs to hardware circuits to make it more difficult for malicious attackers to exploit hardware trojans. The two methods we discuss are data obfuscation and fully homomorphic encryption (FHE). Data obfuscation is a technique wherein specific data inputs are encrypted so that they can be operated on within a hardware module without exposing the data itself to the hardware. FHE is a technique recently discovered to be theoretically possible. With FHE, not only the data but also the operations and the entire circuit are encrypted. FHE primarily exists as a theoretical construct currently. It has been shown that it can theoretically be applied to any program or circuit. It has also been applied in a limited respect to some software. Some initial algorithms for hardware applications have been proposed. We find that data obfuscation is efficient enough to be immediately practical, while FHE is not yet in the practical realm. There are also scalability concerns regarding current algorithms for FHE.

As Social Computing has increasingly captivated the general public, it
has become a popular research area for computer scientists. Social
Computing research focuses on online social behavior and using
artifacts derived from it for providing recommendations and other
useful community knowledge. Unfortunately, some of that behavior and
knowledge incur societal costs, particularly with regards to Privacy,
which is viewed quite differently by different populations as well as
regulated differently in different locales. But clever technical
solutions to those challenges may impose additional societal costs,
e.g., by consuming substantial resources at odds with Green Computing,
another major area of societal concern. We propose a new cross-cutting
research area, Societal Computing, that focuses on the
technical tradeoffs among computational models and application domains
that raise significant societal issues. We highlight some of the
relevant research topics and open problems that we foresee in Societal
Computing.
We feel that
these topics, and Societal Computing in general, need to gain prominence
as they will provide useful avenues of research leading to
increasing benefits for society as a whole.
This thesis will consist of the following four projects that aim to address the issues of Societal Computing.
First, privacy in the context of ubiquitous social computing systems has become a
major concern for society at large.
As the number of online social computing systems that collect user data
grows, concerns with privacy are further exacerbated.
Examples of such online systems include social networks, recommender systems, and so on.
Approaches to addressing these privacy concerns typically require
substantial extra computational resources, which might be beneficial where
privacy is concerned, but may have significant negative impact with respect
to Green Computing and sustainability, another major societal concern.
Spending more computation time results in spending more energy and other
resources that make the software system less sustainable.
Ideally, what we would like are techniques for designing software systems
that address these privacy concerns but which are also sustainable: systems
where privacy could be achieved "for free," i.e., without having to spend
extra computational effort.
We describe how privacy can indeed be achieved for free, as an
accidental and beneficial side effect of doing some existing computation, in web applications and online systems that have access to user data.
We show the feasibility, sustainability, and utility of our approach and what types of privacy threats it can mitigate.
Second, we aim to understand what the expectations and needs of end-users and software developers are with respect to privacy in social systems.
Some questions that we want to answer are:
Do end-users care about privacy?
What aspects of privacy are the most important to end-users?
Do we need different privacy mechanisms for technical vs. non-technical users?
Should we customize privacy settings and systems based on the geographic location of the users?
We have created a large scale user study using an online questionnaire to gather privacy requirements from a variety of stakeholders.
We also plan to conduct follow-up semi-structured interviews.
This user study will help us answer these questions.
Third, a related challenge is to make privacy more understandable in complex systems that may have a variety of user interface options, which may change often.
Our approach is to use crowdsourcing to learn how other users deal with privacy and which settings are commonly used, giving users feedback on aspects such as how public or private their settings are, which settings are typically used by others, and where a given user's settings differ from those of a trusted group of friends.
We have a large dataset of privacy settings for over 500 users on Facebook, and we plan to create a user study that will use this data to make privacy settings more understandable.
Finally, end-users of such systems find it increasingly hard to understand complex privacy settings.
As software evolves over time, this might introduce bugs that breach users' privacy.
Further, there might be system-wide policy changes that could change users' settings to be more or less private than before.
We present a novel technique that can be used by end-users for detecting changes in privacy, i.e., regression testing for privacy.
Using a social approach for detecting privacy bugs, we present two prototype tools.
Our evaluation shows the feasibility and utility of our approach for detecting privacy bugs.
We highlight two interesting case studies on the bugs that were discovered using our tools.
To the best of our knowledge, this is the first technique that leverages regression testing for detecting privacy bugs from an end-user perspective.
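A minimal sketch of the "regression testing for privacy" idea: snapshot which parts of a profile are publicly visible, then flag any later widening of visibility. The profile structure and field names here are hypothetical, not the authors' actual tools:

```python
# Sketch: detect a privacy regression by diffing visibility snapshots.
def visible_fields(profile):
    """Return the set of fields currently exposed to the public."""
    return {name for name, setting in profile.items() if setting == "public"}

baseline = {"email": "private", "photos": "friends", "location": "private"}
after_update = {"email": "private", "photos": "friends", "location": "public"}

# Fields that became public since the baseline snapshot are regressions.
leaked = visible_fields(after_update) - visible_fields(baseline)
assert leaked == {"location"}  # location silently became public
```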

Accurately determining a user's floor location is essential for
minimizing delays in emergency response. This paper presents a floor
localization system intended for emergency calls. We aim to provide
floor-level accuracy with minimum infrastructure support. Our
approach is to use multiple sensors, all available in today's
smartphones, to trace a user's vertical movements inside buildings.
We make three contributions. First, we present a hybrid architecture
for floor localization with emergency calls in mind. The architecture
combines beacon-based infrastructure and sensor-based dead reckoning,
striking the right balance between accurately determining a user's
location and minimizing the required infrastructure. Second, we
present the elevator module for tracking a user's movement in an
elevator. The elevator module addresses three core challenges that
make it difficult to accurately derive displacement from acceleration.
Third, we present the stairway module which determines the number of
floors a user has traveled on foot. Unlike previous systems that
track users' footsteps, our stairway module uses a novel landing
counting technique.
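The core difficulty the elevator module addresses, deriving displacement from acceleration, can be sketched as double numerical integration. The samples below are idealized and noise-free; real accelerometer data adds bias and drift, which is exactly what makes the problem hard:

```python
# Double-integrate vertical acceleration to estimate elevator displacement
# (trapezoidal rule). Idealized samples only; real data is noisy and biased.
def integrate(samples, dt):
    out, total = [0.0], 0.0
    for a, b in zip(samples, samples[1:]):
        total += 0.5 * (a + b) * dt
        out.append(total)
    return out

dt = 0.1  # seconds per sample
# Idealized elevator ride: speed up 1 s, cruise 2 s, slow down 1 s.
accel = [1.0] * 10 + [0.0] * 20 + [-1.0] * 10   # m/s^2
velocity = integrate(accel, dt)                  # m/s
displacement = integrate(velocity, dt)           # m

# The cab ends at rest, roughly 3 m above where it started.
assert abs(velocity[-1]) < 1e-6
assert 2.0 < displacement[-1] < 4.0
```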

Alias analysis is perhaps one of the most crucial and widely used analyses, and has attracted tremendous research efforts over the years. Yet, advanced alias analyses are extremely difficult to get right, and the bugs in these analyses are most likely the reason that they have not been adopted to production compilers. This paper presents NEONGOBY, a system for effectively detecting errors in alias analysis implementations, improving their correctness and hopefully widening their adoption. NEONGOBY works by dynamically observing pointer addresses during the execution of a test program and then checking these addresses against an alias analysis for errors. It is explicitly designed to (1) be agnostic to the alias analysis it checks for maximum applicability and ease of use and (2) detect alias analysis errors that manifest on real-world programs and workloads. It reduces false positives and performance overhead using a practical selection of techniques. Evaluation on three popular alias analyses and real-world programs Apache and MySQL shows that NEONGOBY effectively finds 29 alias analysis bugs with only 2 false positives and reasonable overhead. To enable alias analysis builders to start using NEONGOBY today, we have released it open-source at https://github.com/wujingyue/neongoby, along with our error detection results and proposed patches.

Inference in general Markov random fields (MRFs) is NP-hard, though identifying the maximum a posteriori (MAP) configuration of pairwise MRFs with submodular cost functions is efficiently solvable using graph cuts. Marginal inference, however, even for this restricted class, is in \#P. We prove new formulations of derivatives of the Bethe free energy, provide bounds on the derivatives and bracket the locations of stationary points, introducing a new technique called Bethe bound propagation. Several results apply to pairwise models whether associative or not. Applying these to discretized pseudo-marginals in the associative case we present a polynomial time approximation scheme for global optimization provided the maximum degree is $O(\log n)$, and discuss several extensions.
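For context, the Bethe free energy whose derivatives the paper studies has the standard form (as in Yedidia et al.; this is the textbook definition in our notation, with beliefs $b$, energies $E$, and node degrees $d_i$, not the paper's new formulations):

```latex
F_{\mathrm{Bethe}}(b)
  = \sum_{(i,j)} \sum_{x_i, x_j} b_{ij}(x_i, x_j)\, E_{ij}(x_i, x_j)
  + \sum_{i} \sum_{x_i} b_i(x_i)\, E_i(x_i)
  + \sum_{(i,j)} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \ln b_{ij}(x_i, x_j)
  - \sum_{i} (d_i - 1) \sum_{x_i} b_i(x_i) \ln b_i(x_i)
```

The first two sums are the average energy; the last two are the negated Bethe entropy, which counts each node's entropy once by discounting the $d_i$ edge terms that include it.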

I describe in detail the circuitry of the original 1972 Pong video
arcade game and how I reconstructed it on an FPGA -- a modern-day
programmable logic device. In the original circuit, I discover some
sloppy timing and a previously unidentified bug that subtly affected
gameplay. I emulate the quasi-synchronous behavior of the original
circuit by running a synchronous "simulation" circuit with a
2X clock and replacing each flip-flop with a circuit that
effectively simulates one. The result is an accurate reproduction
that exhibits many idiosyncrasies of the original.

A conventional camera has a limited depth of field (DOF), which often results in defocus blur and loss of image detail. The technique of image refocusing allows a user
to interactively change the plane of focus and DOF of an image after it is captured. One way to achieve refocusing is to capture the entire light field. But this requires a significant compromise of spatial resolution. This is because of the dimensionality gap - the captured information (a light field) is 4-D, while the information required for refocusing (a focal stack) is only 3-D.
In this paper, we present an imaging system that directly captures a focal stack by physically sweeping the focal plane. We first describe how to sweep the focal plane
so that the aggregate DOF of the focal stack covers the entire desired depth range without gaps or overlaps. Since the focal stack is captured in a duration of time when scene objects can move, we refer
to the captured focal stack as a duration focal stack. We then propose an algorithm for computing a space-time in-focus index map from the focal stack, which represents the time at which each pixel is best focused. The algorithm is designed to enable a seamless refocusing experience, even for textureless regions and at depth discontinuities.
We have implemented two prototype focal-sweep cameras and captured several duration focal stacks. Results obtained using our method can be viewed at www.focalsweep.com.
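A toy sketch of computing an in-focus index map: for each pixel, select the frame of the stack (i.e., the capture time) at which a local sharpness measure is largest. The Laplacian focus measure and random stack below are illustrative assumptions, not the paper's algorithm, which additionally handles textureless regions and depth discontinuities:

```python
import numpy as np

def in_focus_index(stack):
    """stack: (T, H, W) grayscale focal stack.
    Returns an (H, W) map of the frame index where each pixel is sharpest,
    using absolute Laplacian response as a simple focus measure."""
    lap = np.abs(
        -4 * stack
        + np.roll(stack, 1, axis=1) + np.roll(stack, -1, axis=1)
        + np.roll(stack, 1, axis=2) + np.roll(stack, -1, axis=2)
    )
    return lap.argmax(axis=0)

# Demo on a random 3-frame stack of 8x8 images.
rng = np.random.default_rng(0)
stack = rng.random((3, 8, 8))
index_map = in_focus_index(stack)
assert index_map.shape == (8, 8)
```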

This paper is an attempt to understand the effectiveness of teaching metamorphic properties in a senior/graduate software engineering course classroom environment through gauging the success achieved by students in identifying these properties on the basis of the lectures and materials provided in class. The main findings were: (1) most of the students either misunderstood what metamorphic properties are or fell short of identifying all the metamorphic properties in their respective projects, (2) most of the students that were successful in finding all the metamorphic properties in their respective projects had incorporated certain arithmetic rules into their project logic, and (3) most of the properties identified were numerical metamorphic properties. A possible reason for this could be that the two relevant lectures given in class cited examples of metamorphic properties that were based on numerical properties. Based on the findings of the case study, pertinent suggestions were made in order to improve the impact of lectures provided for Metamorphic Testing.

A Competitive-Collaborative Approach for Introducing Software Engineering in a CS2 Class

Swapneel Sheth, Jonathan Bell, Gail Kaiser

2012-11-05

Introductory Computer Science (CS) classes are typically competitive in nature.
The cutthroat nature of these classes comes from students attempting to get as high a grade as possible, which may or may not correlate with actual learning.
Further, there is very little collaboration allowed in most introductory CS classes.
Most assignments are completed individually since many educators feel that students learn the most, especially in introductory classes, by working alone.
In addition to completing "normal" individual assignments, which have many benefits, we wanted to expose students to collaboration early (via, for example, team projects).
In this paper, we describe how we leveraged competition and collaboration in a CS2 to help students learn aspects of computer science better --- in this case, good software design and software testing --- and summarize student feedback.

Gamification, or the use of game elements in non-game contexts, has become an increasingly popular approach to increasing end-user engagement in many contexts, including employee productivity, sales, recycling, and education.
Our preliminary work has shown that gamification can be used to boost student engagement and learning in basic software testing.
We seek to expand our gamified software engineering approach to motivate other software engineering best practices.
We propose to build a game layer on top of traditional continuous integration technologies to increase student engagement in development, documentation, bug reporting, and test coverage.
This poster describes our approach and presents some early results showing feasibility.

Emergency communication systems are undergoing a transition from
the legacy PSTN-based system to an IP-based next-generation system.
In the next generation system, GPS accurately provides a user's
location when the user makes an emergency call outdoors using a mobile
phone. Indoor positioning, however, presents a challenge because GPS
does not generally work indoors. Moreover, unlike outdoors, vertical
accuracy is critical indoors because an error of a few meters will send
emergency responders to a different floor in a building.
This paper presents an indoor positioning system which focuses on
improving the accuracy of vertical location. We aim to provide
floor-level accuracy with minimal infrastructure support. Our
approach is to use multiple sensors available in today's smartphones
to trace users' vertical movements inside buildings.
We make three contributions. First, we present the elevator module
for tracking a user's movement in elevators. The elevator module
addresses three core challenges that make it difficult to accurately
derive displacement from acceleration. Second, we present the
stairway module which determines the number of floors a user has
traveled on foot. Unlike previous systems that track users'
footsteps, our stairway module uses a novel landing counting technique.
Third, we present a hybrid architecture that combines the sensor-based
components with minimal and practical infrastructure. The
infrastructure provides initial anchor and periodic corrections of a
user's vertical location indoors. The architecture strikes the right
balance between the accuracy of location and the feasibility of
deployment for the purpose of emergency communication.
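The core difficulty the elevator module addresses, deriving displacement from acceleration, can be illustrated with a naive double-integration sketch. This is not the paper's implementation (the module exists precisely because raw integration suffers from noise, drift, and gravity removal); the function names and trapezoidal scheme are illustrative assumptions.

```python
# Sketch: estimating vertical displacement by double-integrating a clean,
# gravity-compensated acceleration signal. Real accelerometer data would
# need the noise/drift handling the elevator module provides.

def integrate(samples, dt):
    """Trapezoidal integration of a uniformly sampled signal."""
    total, result = 0.0, [0.0]
    for a, b in zip(samples, samples[1:]):
        total += 0.5 * (a + b) * dt
        result.append(total)
    return result

def displacement(accel, dt):
    """Double-integrate vertical acceleration (gravity already removed)."""
    velocity = integrate(accel, dt)
    return integrate(velocity, dt)

# A symmetric accelerate/cruise/decelerate elevator profile, 10 ms samples:
profile = [1.0] * 100 + [0.0] * 300 + [-1.0] * 100
d = displacement(profile, dt=0.01)
```

For this idealized profile the final displacement comes out close to the analytic 4 m; with real sensor data, small acceleration errors accumulate quadratically, which is why the paper pairs the sensors with infrastructure-provided corrections.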

An Autonomic Reliability Improvement System for Cyber-Physical Systems

Leon Wu, Gail Kaiser

2012-09-17

System reliability is a fundamental requirement of cyber-physical systems. Unreliable systems can lead to disruption of service, financial cost and even loss of human life. Typical cyber-physical systems are designed to process large amounts of data, employ software as a system component, run online continuously and retain an operator-in-the-loop because of human judgment and accountability requirements for safety-critical systems. This paper describes a data-centric runtime monitoring system named ARIS (Autonomic Reliability Improvement System) for improving the reliability of these types of cyber-physical systems. ARIS employs automated online evaluation, working in parallel with the cyber-physical system to continuously conduct automated evaluation at multiple stages in the system workflow and provide real-time feedback for reliability improvement. This approach enables effective evaluation of data from cyber-physical systems. For example, abnormal input and output data can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop, who can then take actions and make changes to the system based on these alerts in order to achieve minimal system downtime and higher system reliability. We have implemented ARIS in a large commercial building cyber-physical system in New York City, and our experiment has shown that it is effective and efficient in improving building system reliability.

With the global pool of data growing at over 2.5 quintillion bytes per day and over 90% of all data in existence created in the last two years alone [23], there can be little doubt that we have entered the big data era. This trend has brought database performance to the forefront of high throughput, low energy system design. This paper explores targeted deployment of hardware accelerators to improve the throughput and efficiency of database processing. Partitioning, a critical operation when manipulating large data sets, is often the limiting factor in database performance, and represents a significant amount of the overall runtime of database processing workloads.
This paper describes a hardware-software streaming framework and a hardware accelerator for range partitioning, or HARP. The streaming framework offers a seamless execution environment for database processing elements such as HARP. HARP offers high throughput, as well as orders-of-magnitude gains in power and area efficiency. A detailed analysis of a 32nm physical design shows 9.3 times the throughput of a highly optimized and optimistic software implementation, while consuming just 3.6% of the area and 2.6% of the power of a single Xeon core in the same technology generation.
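Range partitioning itself, the operation HARP accelerates, is simple to state in software. The sketch below routes keys by splitter values using binary search; it illustrates the semantics only, not the HARP hardware design.

```python
# Illustrative range partitioning: each key goes to the partition whose
# key range contains it. Splitter values here are arbitrary examples.
from bisect import bisect_right

def range_partition(keys, splitters):
    """Route each key to the partition whose range contains it."""
    parts = [[] for _ in range(len(splitters) + 1)]
    for k in keys:
        parts[bisect_right(splitters, k)].append(k)
    return parts

parts = range_partition([5, 17, 3, 42, 23, 11], splitters=[10, 20])
# parts[0] holds keys <= 10, parts[1] keys in (10, 20], parts[2] keys > 20
```

The hardware wins come from streaming records past a comparator tree instead of performing this per-key search on a general-purpose core.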

Privacy in social computing systems has become a major concern.
End-users of such systems find it increasingly hard to understand complex privacy settings.
As software evolves over time, this might introduce bugs that breach users' privacy.
Further, there might be system-wide policy changes that could make users' settings more or less private than before.
We present a novel technique that can be used by end-users for detecting changes in privacy, i.e., regression testing for privacy.
Using a social approach for detecting privacy bugs, we present two prototype tools.
Our evaluation shows the feasibility and utility of our approach for detecting privacy bugs.
We highlight two interesting case studies on the bugs that were discovered using our tools.
To the best of our knowledge, this is the first technique that leverages regression testing for detecting privacy bugs from an end-user perspective.

When programs fail in the field, developers are often left with limited information to diagnose the failure. Automated error reporting tools can assist in bug report generation but without precise steps from the end user it is often difficult for developers to recreate the failure. Advanced remote debugging tools aim to capture sufficient information from field executions to recreate failures in the lab but often have too much overhead to practically deploy. We present CHRONICLER, an approach to remote debugging that captures non-deterministic inputs to applications in a lightweight manner, assuring faithful reproduction of client executions. We evaluated CHRONICLER by creating a Java implementation, CHRONICLERJ, and then by using a set of benchmarks mimicking real world applications and workloads, showing its runtime overhead to be under 10% in most cases (worst case 86%), while an existing tool showed overhead over 100% in the same cases (worst case 2,322%).

We present an algorithm for unrolling recursion in the Haskell
functional language. Adapted from a similar algorithm proposed by
Rugina and Rinard for imperative languages, it essentially inlines a
function in itself as many times as requested. This algorithm aims to
increase the available parallelism in recursive functions, with an eye
toward its eventual application in a Haskell-to-hardware compiler. We
first illustrate the technique on a series of examples, then describe
the algorithm, and finally show its Haskell source, which operates as
a plug-in for the Glasgow Haskell Compiler.
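The inlining step can be illustrated in Python rather than Haskell: one unroll substitutes the function body for each recursive call, exposing more independent work per invocation. The example below is a hand-unrolled Fibonacci, an assumption about what one pass of the algorithm produces, not output of the GHC plug-in itself.

```python
# fib with its recursive calls inlined once: each call now does the work
# of two levels of the original recursion, halving the call depth.

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def fib_unrolled(n):
    if n < 2:
        return n
    # fib(n-1) with its body substituted in place of the call:
    a = (n - 1) if n - 1 < 2 else fib_unrolled(n - 2) + fib_unrolled(n - 3)
    # fib(n-2), likewise substituted:
    b = (n - 2) if n - 2 < 2 else fib_unrolled(n - 3) + fib_unrolled(n - 4)
    return a + b

assert all(fib(n) == fib_unrolled(n) for n in range(15))
```

In a hardware context, the four inner calls of the unrolled body can be scheduled in parallel, which is the parallelism the abstract refers to.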

Through a series of mechanical transformations, I show how a three-line
recursive Haskell function (Fibonacci) can be translated into a
hardware description language -- VHDL -- for efficient execution on an
FPGA. The goal of this report is to lay the groundwork for a compiler
that will perform these transformations automatically, hence the
development is deliberately pedantic.

Heavy hitters are data items that occur at high frequency
in a data set. Heavy hitters are among the most important
items for an organization to summarize and understand during
analytical processing. In data sets with sufficient skew,
the number of heavy hitters can be relatively small. We
take advantage of this small footprint to compute aggregate
functions for the heavy hitters in fast cache memory.
We design cache-resident, shared-nothing structures that
hold only the most frequent elements from the table. Our
approach works in three phases. It first samples and picks
heavy hitter candidates. It then builds a hash table and
computes the exact aggregates of these candidates. Finally,
if necessary, a validation step identifies the true heavy hitters
from among the candidates based on the query specification.
We identify trade-offs between the hash table capacity and
performance. Capacity determines how many candidates
can be aggregated. We optimize performance by the use of
perfect hashing and SIMD instructions. SIMD instructions
are utilized in novel ways to minimize cache accesses, beyond
simple vectorized operations. We use bucketized and
cuckoo hash tables to increase capacity, to adapt to different
datasets and query constraints.
The performance of our method is an order of magnitude
faster than in-memory aggregation over a complete set of
items if those items cannot be cache resident. Even for item
sets that are cache resident, our SIMD techniques enable
significant performance improvements over previous work.
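The three phases described above can be sketched with a plain dict standing in for the cache-resident hash table. The sampling rate, candidate cap, and frequency threshold below are illustrative assumptions; the real system uses perfect and cuckoo hashing with SIMD, not Python dictionaries.

```python
# Phase 1: sample to pick heavy-hitter candidates.
# Phase 2: exact aggregation, but only over the candidates.
# Phase 3: validate candidates against the query's frequency threshold.
import random
from collections import Counter

def heavy_hitter_sums(rows, sample_rate=0.1, threshold=100, seed=1):
    rng = random.Random(seed)
    sample = [k for k, _ in rows if rng.random() < sample_rate]
    candidates = {k for k, _ in Counter(sample).most_common(8)}
    sums, counts = {k: 0 for k in candidates}, Counter()
    for k, v in rows:
        if k in candidates:
            sums[k] += v
            counts[k] += 1
    return {k: sums[k] for k in candidates if counts[k] >= threshold}

rows = [("a", 1)] * 500 + [("b", 2)] * 300 + [("c", 5)] * 10
result = heavy_hitter_sums(rows)
```

Because only the small candidate set is aggregated exactly, the working structure stays cache-resident even when the full key set would not, which is the source of the order-of-magnitude speedup claimed above.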

A high percentage of newly-constructed commercial office buildings experience energy consumption that exceeds specifications and system failures after being put into use. This problem is even worse for older buildings. We present a new approach, ‘predictive building energy optimization’, which uses machine learning (ML) and automated online evaluation of historical and real-time building data to improve efficiency and reliability of building operations without requiring large amounts of additional capital investment. Our ML approach uses a predictive model to generate accurate energy demand forecasts and automated analyses that can guide optimization of building operations. In parallel, an automated online evaluation system monitors efficiency at multiple stages in the system workflow and provides building operators with continuous feedback. We implemented a prototype of this application in a large commercial building in Manhattan. Our predictive machine learning model applies Support Vector Regression (SVR) to the building’s historical energy use and temperature and wet-bulb humidity data from the building’s interior and exterior in order to model performance for each day. This predictive model closely approximates actual energy usage values, with some seasonal and occupant-specific variability, and the dependence of the data on day-of-the-week makes the model easily applicable to different types of buildings with minimal adjustment. In parallel, an automated online evaluator monitors the building’s internal and external conditions, control actions and the results of those actions. Intelligent real-time data quality analysis components quickly detect anomalies and automatically transmit feedback to building management, who can then take necessary preventive or corrective actions. Our experiments show that this evaluator is responsive and effective in further ensuring reliable and energy-efficient operation of building systems.

Aperture Evaluation for Defocus Deblurring and Extended Depth of Field

Changyin Zhou, Shree Nayar

2012-04-17

For a given camera setting, scene points that lie outside of depth of field (DOF) will appear defocused (or blurred). Defocus causes the loss of image details. To recover scene details from a defocused region, deblurring techniques must be employed. It is well known that the deblurring quality is closely related to the defocus kernel or point-spread-function (PSF), whose shape is largely determined by the aperture pattern of the camera. In this paper, we propose a comprehensive framework of aperture evaluation for the purpose of defocus deblurring, which takes the effects of image noise, deblurring algorithm, and the structure of natural images into account. By using the derived evaluation criterion, we are able to solve for the optimal coded aperture patterns. Extensive simulations and experiments are then conducted to compare the optimized coded apertures with previously proposed ones.
The proposed framework of aperture evaluation is further extended to evaluate and optimize extended depth of field (EDOF) cameras. EDOF cameras (e.g., focal sweep and wavefront coding camera) are designed to produce PSFs which are less sensitive to depth variation, so that people can deconvolve the whole image using a single PSF without knowing scene depth. Different choices of camera parameters or the PSF to deconvolve with lead to different deblurring qualities. With the derived evaluation criterion, we are able to derive the optimal PSF to deconvolve with in a closed-form and optimize camera parameters for the best deblurring results.

Given recent increases in the size of main memory in modern machines, it is now common to store large data sets in RAM for faster processing. Multidimensional access methods aim to provide efficient access to large data sets when queries apply predicates to some of the data dimensions. We examine multidimensional access methods in the context of an in-memory column store tuned for on-line analytical processing or scientific data analysis. We propose a multidimensional data structure that contains a novel combination of a grid array and several bitmaps. The base data is clustered in an order matching that of the index structure. The bitmaps contain one bit per block of data, motivating the
term ``blockmap.'' The proposed data structures are compact, typically taking less than one bit of space per row of data. Partition boundaries can be chosen in a way that reflects both the query workload and the data distribution, and boundaries are not required to evenly divide the data if there is a bias in the query distribution. We examine the theoretical performance of the data structure and experimentally measure its performance on three modern CPUs and one GPU processor. We demonstrate that efficient multidimensional access can be achieved with minimal space overhead.
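The one-bit-per-block idea can be shown with a toy example. In the paper the bitmaps are precomputed from partition boundaries over clustered data; here the bitmap is built per-predicate purely to illustrate block skipping, and the block size is an arbitrary choice.

```python
# Toy "blockmap": one bit per block of clustered data, letting a scan
# skip every block whose bit says it cannot contain a match.

BLOCK = 4
data = [1, 2, 2, 3,  7, 8, 8, 9,  15, 16, 17, 18]  # clustered base data

def build_blockmap(data, predicate):
    """Set bit i iff any row in block i satisfies the predicate."""
    bits = 0
    for i in range(0, len(data), BLOCK):
        if any(predicate(x) for x in data[i:i + BLOCK]):
            bits |= 1 << (i // BLOCK)
    return bits

def scan(data, bits, predicate):
    """Scan only the blocks whose bit is set."""
    out = []
    for i in range(0, len(data), BLOCK):
        if bits >> (i // BLOCK) & 1:
            out.extend(x for x in data[i:i + BLOCK] if predicate(x))
    return out

pred = lambda x: 7 <= x <= 9
bits = build_blockmap(data, pred)
hits = scan(data, bits, pred)   # only the middle block is touched
```

With one bit per block rather than per row, the space cost stays below one bit per row, matching the compactness claim above.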

Searching images based on descriptions of image attributes is an intuitive process that can be easily understood by humans and recently made feasible by a few promising works in both the computer vision and multimedia communities. In this report, we describe some experiments of image retrieval methods that utilize weak attributes.

A number of computational imaging techniques have been introduced to improve image quality by increasing light throughput. These techniques use optical coding to measure a stronger signal level. However, the performance of these techniques is limited by the decoding step, which amplifies noise. While it is well understood that optical coding can increase performance at low light levels, little is known about the quantitative performance advantage of computational imaging in general settings. In this paper, we derive the performance bounds for various computational imaging techniques. We then discuss the implications of these bounds for several real-world scenarios (illumination conditions, scene properties and sensor noise characteristics). Our results show that computational imaging techniques provide a significant performance advantage in a surprisingly small set of real world settings. These results can be readily used by practitioners to design the most suitable imaging systems given the application at hand.

High Availability for Carrier-Grade SIP Infrastructure on Cloud Platforms

Jong Yul Kim, Henning Schulzrinne

2012-03-19

SIP infrastructure on cloud platforms has the potential to be both scalable and highly available. In our previous project, we focused on the scalability aspect of SIP services on cloud platforms; the focus of this project is on the high availability aspect. We investigated the effects of component faults on service availability with the goal of understanding how high availability can be guaranteed even in the face of component faults. The experiments were conducted empirically on a real system that runs on Amazon EC2. Our analysis shows that most component faults are masked with a simple automatic failover technique. However, we have also identified fundamental problems that cannot be addressed by simple failover techniques: a problem involving DNS caching in resolvers, and a problem involving static failover configurations. Recommendations on how to solve these problems are included in the report.

The goal of this project is to demonstrate the feasibility of automatic detection of
metamorphic properties of individual functions. Properties of interest here, as described
in Murphy et al.’s SEKE 2008 paper “Properties of Machine Learning Applications for Use
in Metamorphic Testing”, include:
1. Permutation of the order of the input data
2. Addition of numerical values by a constant
3. Multiplication of numerical values by a constant
4. Reversal of the order of the input data
5. Removal of part of the data
6. Addition of data to the dataset
While focusing on permutative, additive, and multiplicative properties in functions and
applications, I have sought to identify common programming constructs and code
fragments that strongly indicate that these properties will hold, or fail to hold, along an
execution path in which the code is evaluated. I have constructed a syntax for
expressions representing these common constructs and have also mapped a collection
of these expressions to the metamorphic properties they uphold or invalidate. I have
then developed a general framework to evaluate these properties for programs as a
whole.
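To make concrete what it means for such a property to "hold", the sketch below checks properties 1 (permutation) and 2 (addition by a constant) against a simple mean function. This illustrates the properties themselves, not the detection framework; the expected output relation for the additive check is specific to the mean and is an assumption of the example.

```python
# Checking two metamorphic properties on mean(): permuting the input
# should leave the output unchanged, and adding a constant c to every
# input should shift the output by exactly c.
import random

def mean(xs):
    return sum(xs) / len(xs)

def holds_permutation(f, xs, trials=20, tol=1e-9):
    """Property 1: output unchanged when the input order is permuted."""
    base = f(xs)
    for _ in range(trials):
        ys = xs[:]
        random.shuffle(ys)
        if abs(f(ys) - base) > tol:
            return False
    return True

def holds_additive(f, xs, c=3.0, tol=1e-9):
    """Property 2 (for mean): adding c to every input shifts output by c."""
    return abs(f([x + c for x in xs]) - (f(xs) + c)) <= tol

xs = [1.0, 4.0, 2.5, 9.0]
assert holds_permutation(mean, xs)
assert holds_additive(mean, xs)
```

The project's goal is to infer such guarantees statically from programming constructs along execution paths, rather than testing them dynamically as this sketch does.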

One of the primary concerns of users of cloud-based services and applications is the risk of unauthorized access to their private information. For the common setting in which the infrastructure provider and the online service provider are different, end users have to trust their data to both parties, although they interact solely with the service provider. This paper presents CloudFence, a framework that allows users to independently audit the treatment of their private data by third-party online services, through the intervention of the cloud provider that hosts these services.
CloudFence is based on a fine-grained data flow tracking platform exposed by the cloud provider to both developers of cloud-based applications, as well as their users. Besides data auditing for end users, CloudFence allows service providers to confine the use of sensitive data in well-defined domains using data tracking at arbitrary granularity, offering additional protection against inadvertent leaks and unauthorized access. The results of our experimental evaluation with real-world applications, including an e-store platform and a cloud-based backup service, demonstrate that CloudFence requires just a few changes to existing application code, while it can detect and prevent a wide range of security breaches, ranging from data leakage attacks using SQL injection, to personal data disclosure due to missing or erroneously implemented access control checks.

As the U.S. power grid transforms itself into the Smart Grid, it has become less reliable in recent years. Power grid
failures lead to huge financial costs and affect people's lives. Using
a statistical analysis and holistic approach, this paper analyzes
the New York City power grid failures: failure patterns and
climatic effects. Our findings include: higher peak electrical load
increases likelihood of power grid failure; increased subsequent
failures among electrical feeders sharing the same substation;
underground feeders fail less than overhead feeders; cables and
joints installed during certain years are more likely to fail; higher
weather temperature leads to more power grid failures. We further
suggest preventive maintenance, intertemporal consumption,
and electrical load optimization for failure prevention. We also
estimate that the predictability of the power grid component
failures correlates with the cycles of the North Atlantic Oscillation
(NAO) Index.

In 1996, Tennenhouse and Wetherall proposed active networks, where users can inject code modules into network nodes. The proposal sparked intense debate and follow-on research, but ultimately failed to win over the networking community. Fifteen years later, the problems that motivated the active networks proposal persist.
We call for a revival of active networks. We present NetServ, a fully integrated active network system that provides all the necessary functionality to be deployable, addressing the core problems that prevented the practical success of earlier approaches.
We make the following contributions. We present a hybrid approach to active networking, which combines the best qualities from the two extreme approaches: integrated and discrete. We built a working system that strikes the right balance between security and performance by leveraging current technologies. We suggest an economic model based on NetServ between content providers and ISPs. We built four applications to illustrate the model.

A Large-Scale, Longitudinal Study of User Profiles in World of Warcraft

Jonathan Bell, Swapneel Sheth, Gail Kaiser

2011-12-29

We present a survey of usage of the popular Massively Multiplayer Online Role Playing Game, World of Warcraft.
Players within this game often self-organize into communities with similar interests and/or styles of play. By mining publicly available data, we collected a dataset consisting of the complete player history for approximately six million characters, with partial data for another six million characters. The paper provides a thorough description of the distributed approach used to collect this massive community data set, and then focuses on an analysis of player achievement data in particular, exposing trends in play from this highly successful game.
From this data, we present several findings regarding player profiles. We correlate achievements with motivations based upon a previously-defined motivation model, and then classify players based on the categories of achievements that they pursued. Experiments show that players who fall within each of these buckets play differently, and that as players progress through game content, their play style evolves as well.

GRAND is an experimental extension of Git, a distributed
revision control system, which enables the synchronization
of Git repositories over Content-Centric Networks (CCN).
GRAND brings some of the benefits of CCN to Git, such as
transparent caching, load balancing, and the ability to fetch
objects by name rather than location. Our implementation
is based on CCNx, a reference implementation of a content
router. The current prototype consists of two components:
git-daemon-ccnx allows a node to publish its local Git
repositories to CCNx Content Store; git-remote-ccnx implements
CCNx transport on the client side. This adds CCN
to the set of transport protocols supported by Git, alongside
HTTP and SSH.

Content delivery networks play a crucial role in today’s Internet. They serve a large portion of the multimedia on the Internet and solve problems of scalability and, indirectly, network congestion (at a price). However, most content delivery networks rely on a statically deployed configuration of nodes and network topology that makes it hard to grow and scale dynamically. We present ActiveCDN, a novel CDN architecture that allows a content publisher to dynamically scale their content delivery services using network virtualization and cloud computing techniques.

Dynamic data flow tracking (DFT) deals with the tagging and tracking of "interesting" data as they propagate during program execution. DFT has been repeatedly implemented by a variety of tools for numerous purposes, including protection from zero-day and cross-site scripting attacks, detection and prevention of information leaks, as well as for the analysis of legitimate and malicious software. We present libdft, a dynamic DFT framework that unlike previous work is at once fast, reusable, and works with commodity software and hardware. libdft provides an API, which can be used to painlessly deliver DFT-enabled tools that can be applied on unmodified binaries, running on common operating systems and hardware, thus facilitating research and rapid prototyping.
We explore different approaches for implementing the low-level aspects of instruction-level data tracking, introduce a more efficient and 64-bit capable shadow memory, and identify (and avoid) the common pitfalls responsible for the excessive performance overhead of previous studies. We evaluate libdft using real applications with large codebases like the Apache and MySQL servers, and the Firefox web browser. We also use a series of benchmarks and utilities to compare libdft with similar systems. Our results indicate that it performs at least as fast, if not faster, than previous solutions, and to the best of our knowledge, we are the first to evaluate the performance overhead of a fast dynamic DFT implementation in such depth. Finally, our implementation is freely available as open source software.
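The propagation rule at the heart of DFT can be modeled in a few lines: values carry tag sets, and every operation unions the tags of its operands. This toy `Tainted` class is only a model of the rule; libdft itself implements it at the instruction level on unmodified x86 binaries via shadow memory.

```python
# Toy data flow tracking: tags attach to values at input boundaries and
# propagate through arithmetic by unioning operand tag sets.

class Tainted:
    def __init__(self, value, tags=frozenset()):
        self.value, self.tags = value, frozenset(tags)
    def __add__(self, other):
        return Tainted(self.value + other.value, self.tags | other.tags)
    def __mul__(self, other):
        return Tainted(self.value * other.value, self.tags | other.tags)

user_input = Tainted(7, {"network"})   # tagged at the input boundary
constant = Tainted(3)                  # untainted program constant
result = user_input * constant + Tainted(1)
# the "network" tag survives both operations, so a sink check on
# result.tags can detect that network data reached this value
```

A DFT tool then flags whenever a tagged value reaches a sensitive sink (a file write, a system call argument), which is how the leak-detection uses cited above are built.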

Privacy in the context of ubiquitous social computing systems has become a major concern for the society at large.
As the number of online social computing systems that collect user data grows, this privacy threat is further exacerbated.
There has been some work, both recent and older, on addressing these privacy concerns.
These approaches typically require extra computational resources, which might be beneficial where privacy is concerned, but when dealing with Green Computing and sustainability, this is not a great option.
Spending more computation time results in spending more energy and more resources that make the software system less sustainable.
Ideally, what we would like are techniques for designing software systems that address these privacy concerns but which are also sustainable - systems where privacy could be achieved ``for free,'' i.e., without having to spend extra computational effort.
In this paper, we describe how privacy can be achieved for free - an accidental and beneficial side effect of doing some existing computation - and what types of privacy threats it can mitigate.
More precisely, we describe a ``Privacy for Free'' design pattern and show its feasibility, sustainability, and utility in building complex social computing systems.

As our society gains a better understanding of how humans have negatively impacted the environment, research related to reducing carbon emissions and overall energy consumption has become increasingly important. One of the simplest ways to reduce energy usage is by making current buildings less wasteful. By improving energy efficiency, this method of lowering our carbon footprint is particularly worthwhile because it reduces energy costs of operating the building, unlike many environmental initiatives that require large monetary investments. In order to improve the efficiency of the heating, ventilation, and air conditioning (HVAC) system of a Manhattan skyscraper, 345 Park Avenue, a predictive computer model was designed to forecast the amount of energy the building will consume. This model uses Support Vector Machine Regression (SVMR), a method that builds a regression based purely on historical data of the building, requiring no knowledge of its size, heating and cooling methods, or any other physical properties. SVMR employs time-delay coordinates as a representation of the past to create the feature vectors for SVM training. This pure dependence on historical data makes the model very easily applicable to different types of buildings with few model adjustments. The SVM regression model was built to predict a week of future energy usage based on past energy, temperature, and dew point temperature data.
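The time-delay coordinate construction mentioned above can be shown directly: each feature vector is the previous d observations and the target is the next one. The delay dimension d = 3 and the toy load series are illustrative assumptions; the actual model also folds in temperature and dew point data.

```python
# Building (feature, target) pairs from a scalar series using time-delay
# coordinates, the representation the SVM regression trains on.

def time_delay_embedding(series, d):
    """Each feature vector is d consecutive values; target is the next."""
    features = [series[i:i + d] for i in range(len(series) - d)]
    targets = series[d:]
    return features, targets

load = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]   # hypothetical energy readings
X, y = time_delay_embedding(load, d=3)
# X[0] == [10.0, 12.0, 11.0] is paired with target y[0] == 13.0
```

Because the features are built purely from the building's own history, no physical model of the building is needed, which is what makes the approach transferable with few adjustments.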

Traditional access control models often assume that the entity enforcing access control
policies is also the owner of data and resources. This assumption no longer holds
when data is outsourced to a third-party storage provider, such as the \emph{cloud}.
Existing access control solutions mainly focus on preserving confidentiality of stored
data from unauthorized access and the storage provider.
However, in this setting, access control policies as
well as users' access patterns also become privacy sensitive information that
should be protected from the cloud. We propose a two-level access control scheme
that combines coarse-grained access control enforced at the cloud, which keeps
communication overhead acceptable while limiting the information that
the cloud learns from its partial view of the access rules and the access patterns,
and fine-grained cryptographic access control enforced at the user's side, which provides the desired expressiveness
of the access control policies. Our solution handles both \emph{read} and \emph{write} access control.

Stable Flight and Object Tracking with a Quadricopter using an Android Device

Benjamin Bardin, William Brown, Paul S. Blaer

2011-09-09

We discuss a novel system architecture for quadricopter control, the Robocopter platform, in which the quadricopter can behave near-autonomously and processing is handled by an Android device on the quadricopter. The Android device communicates with a laptop, receiving commands from the host and sending imagery and sensor data back. We also discuss the results of a series of tests of our platform on our first hardware iteration, named Jabberwock.

System reliability is a fundamental requirement of cyber-physical systems, i.e., systems
featuring a tight combination of, and coordination between, the system's computational and
physical elements. Cyber-physical systems range from critical infrastructure, such as the
power grid and transportation systems, to health and biomedical devices. An unreliable system
often leads to disruption of service, financial cost and even loss of human life. This thesis aims
to improve system reliability for cyber-physical systems that meet the following criteria: processing
large amounts of data; employing software as a system component; running online continuously;
and having an operator-in-the-loop because of human judgment and accountability requirements
for safety-critical systems. I limit the scope to this type of cyber-physical system because
such systems are important and becoming more prevalent.
To improve system reliability for this type of cyber-physical systems, I propose a system
evaluation approach named automated online evaluation. It works in parallel with the cyber-physical
system to conduct automated evaluation continuously at multiple stages along the system's
workflow and to provide operator-in-the-loop feedback on reliability improvement.
In this approach, data from the cyber-physical system are evaluated. For example, abnormal
input and output data can be detected and flagged through data quality analysis. As a result, alerts
can be sent to the operator-in-the-loop. The operator can then take actions and make changes to
the system based on the alerts in order to achieve minimal system downtime and higher system
reliability. To implement the proposed approach, I further propose a system architecture named
ARIS (Autonomic Reliability Improvement System).
One technique used by the approach is data quality analysis using computational intelligence,
which evaluates data quality in an automated and efficient way to ensure that the running
system performs reliably as expected. The computational intelligence is enabled by machine
learning, data mining, statistical and probabilistic analysis, and other intelligent techniques.
In a cyber-physical system, the data collected from the system, e.g., software bug reports,
system status logs, and error reports, are stored in databases. In my approach, these data are
analyzed via data mining and other intelligent techniques to extract useful information on
system reliability, including erroneous data and abnormal system states. This reliability-related
information is directed to operators so that proper actions can be taken, sometimes proactively
based on predictive results, to ensure the proper and reliable execution of the system.
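As a much-simplified illustration of this kind of statistical data quality check, abnormal readings can be flagged by their deviation from the sample mean (the threshold and data below are hypothetical, not from the deployed system):

```python
import statistics

def flag_abnormal(readings, z_threshold=3.0):
    """Return indices of readings whose z-score exceeds the threshold.

    A crude stand-in for the data quality analysis stage: flagged
    readings would trigger alerts to the operator-in-the-loop.
    """
    mu = statistics.fmean(readings)
    sd = statistics.pstdev(readings)
    if sd == 0:
        return []  # all readings identical: nothing to flag
    return [i for i, r in enumerate(readings) if abs(r - mu) / sd > z_threshold]
```

For instance, twenty nominal sensor readings of 10.0 followed by a spike of 100.0 would flag only the spike. A production system would use richer models (trend detection, classification) rather than a single global z-score.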
Another technique used by the approach is self-tuning, which automatically manages and
configures the evaluation system so that it adapts to changes in the monitored system and to
operator feedback. This adaptation keeps the evaluation system functioning properly, which
leads to a more robust evaluation system and improved system reliability.
As a feasibility study of the proposed approach, I first present the NOVA (Neutral Online
Visualization-aided Autonomic) system, a data quality analysis system for improving the
reliability of a power grid cyber-physical system. I then present a feasibility study on the
effectiveness of several self-tuning techniques, including data classification, redundancy
checking, and trend detection. The self-tuning leads to an adaptive evaluation system that
works better under system changes and operator feedback, which in turn improves system reliability.
The contribution of this work is an automated online evaluation approach that can improve
system reliability for cyber-physical systems in the domain of interest indicated above. It
enables online reliability assurance for deployed systems that cannot be robustly tested prior
to actual deployment.

We introduce the use of describable visual attributes for face images.
Describable visual attributes are labels that can be given to an image to describe its appearance. This thesis focuses mostly on images of faces and the attributes used to describe them, although the concepts also apply to other domains. Examples of face attributes include gender, age, jaw shape, nose size, etc. The advantages of an attribute-based representation for vision tasks are manifold: they can be composed to create descriptions at various levels of specificity; they are generalizable, as they can be learned once and then applied to recognize new objects or categories without any further training; and they are efficient, possibly requiring exponentially fewer attributes (and training data) than explicitly naming each category. We show how one can create and label large datasets of real-world images to train classifiers which measure the presence, absence, or degree to which an attribute is expressed in images. These classifiers can then automatically label new images.
We demonstrate the current effectiveness and explore the future potential of using attributes for image search, automatic face replacement in images, and face verification, via both human and computational experiments. To aid other researchers in studying these problems, we introduce two new large face datasets, named FaceTracer and PubFig, with labeled attributes and identities, respectively.
Finally, we also show the effectiveness of visual attributes in a completely different domain: plant species identification. To this end, we have developed and publicly released the Leafsnap system, which has been downloaded by over half a million users. The mobile phone application is a flexible electronic field guide with high-quality images of the tree species in the Northeast US. It also gives users instant access to our automatic recognition system, greatly simplifying the identification process.

When public transportation stations have access points to provide Internet access to passengers, public transportation becomes a more attractive travel and commute option. However, the Internet connectivity is intermittent because passengers can access the Internet only when a bus or a train is within the networking coverage of an access point at a stop. To efficiently handle this intermittent network for the public transit system, we propose Internet Cache on Wheels (ICOW), a system that provides a low-cost way for bus and train operators to offer access to Internet content. Each bus or train car is equipped with a smart cache that serves popular content to passengers. The cache updates its content based on passenger requests when it is within range of Internet access points placed at bus stops, train stations or depots. We have developed a system architecture and built a prototype of the ICOW system. Our evaluation and analysis show that ICOW is significantly more efficient than having passengers contact Internet access points individually and ensures continuous availability of content throughout the journey.

Data Quality Assurance and Performance Measurement of Data Mining for Preventive Maintenance of Power Grid

Leon Wu, Gail Kaiser, Cynthia Rudin, Roger Anderson

2011-07-01

Ensuring reliability as the electrical grid morphs into the "smart grid" will require innovations in how we assess the state of the grid, for the purpose of proactive maintenance, rather than reactive maintenance; in the future, we will not only react to failures, but also try to anticipate and avoid them using predictive modeling (machine learning and data mining) techniques. To help in meeting this challenge, we present the Neutral Online Visualization-aided Autonomic evaluation framework (NOVA) for evaluating machine learning and data mining algorithms for preventive maintenance on the electrical grid. NOVA has three stages provided through a unified user interface: evaluation of input data quality, evaluation of machine learning and data mining results, and evaluation of the reliability improvement of the power grid. A prototype version of NOVA has been deployed for the power grid in New York City, and it is able to evaluate machine learning and data mining systems effectively
and efficiently.

Networks-on-chip (NoC) are critical to the design
of complex multi-core system-on-chip (SoC) architectures. Since
SoCs are characterized by a combination of high performance
requirements and stringent energy constraints, NoCs must be
realized with low-power design techniques. Moreover, since a semi-custom
design flow based on standard-cell technology libraries is
essential to cope with the SoC design complexity challenges under
tight time-to-market constraints, NoCs must be implemented
using logic synthesis. In this paper we analyze the major power
reduction that clock gating can deliver when applied to the
synthesis of a NoC in the context of a semi-custom automated
design flow.

Software testing traditionally receives little attention in early computer science courses. However, we believe that if exposed to testing early, students will develop positive habits for future work. As we have found that students typically are not keen on testing, we propose an engaging and socially-oriented approach to teaching software testing in introductory and intermediate computer science courses. Our proposal leverages the power of gaming utilizing our previously described system HALO. Unlike many previous approaches, we aim to present software testing in disguise - so that students do not recognize (at first) that they are being exposed to software testing. We describe how HALO could be integrated into course assignments as well as the benefits that HALO creates.

Modern network security research has demonstrated a clear need for the open sharing of traffic datasets between organizations, a need that has so far been hindered by the challenge of removing sensitive content beforehand. Network Data Anonymization (NDA) is emerging as a field dedicated to this problem, with its main direction focusing on the removal of identifiable artifacts that might pierce privacy, such as usernames and IP addresses. However, recent research has demonstrated that more subtle statistical artifacts, also present, may yield fingerprints that are just as differentiable as the former. This result highlights certain shortcomings in current anonymization frameworks, particularly their neglect of the behavioral idiosyncrasies of network protocols, applications, and users. Recent anonymization results have shown that the extent to which utility and privacy can be obtained is mainly a function of the information in the data that one is, and is not, aware of. This paper leverages the predictability of network behavior in our favor to augment existing frameworks through a new machine-learning-driven anonymization technique. Our approach substitutes individual identities with group identities, where members are grouped based on behavioral similarities, essentially providing anonymity-by-crowds in a statistical mix-net. We derive time-series models of network traffic behavior that quantifiably model the discriminative features of network "behavior" and introduce a kernel-based framework for anonymity which fits together naturally with network-data modeling.

Just as errors in sequential programs can lead to security
exploits, errors in concurrent programs can lead to concurrency
attacks. In this paper, we present an in-depth
study of concurrency attacks and how they may affect existing
defenses. Our study yields several interesting findings.
For instance, we find that concurrency attacks can
corrupt non-pointer data, such as user identifiers, which
existing memory-safety defenses cannot handle. Inspired
by our findings, we propose new defense directions and
fixes to existing defenses.

Mutation testing applies mutation operators to modify program source code or byte code in small ways, and then runs these modified programs (i.e., mutants) against a test suite in order to evaluate the quality of the test suite. In this paper, we first describe a general fault model for concurrent programs and some limitations of previously developed sets of first-order concurrency mutation operators. We then present our new mutation testing approach, which employs synchronization-centric second-order mutation operators that are able to generate subtle concurrency bugs not represented by first-order mutation. These operators are used in addition to the synchronization-centric first-order mutation operators to form a small set of effective concurrency mutation operators for mutant generation. Our empirical study shows that our set of operators is effective in mutant generation with limited cost and demonstrates that this new approach is easy to implement.
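The core mutation testing loop is independent of the concurrency-specific operators the paper introduces. The sketch below applies a toy first-order operator (swapping + and -) and checks whether a test suite "kills" the mutant; the operator, source, and test suite here are illustrative, not the paper's:

```python
import ast

class SwapAddSub(ast.NodeTransformer):
    """Toy first-order mutation operator: swap Add <-> Sub in expressions."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        elif isinstance(node.op, ast.Sub):
            node.op = ast.Add()
        return node

SRC = "def total(a, b):\n    return a + b\n"

def run_suite(fn):
    # A stand-in test suite: True means all tests pass.
    return fn(2, 3) == 5

# The original program passes the suite.
ns = {}
exec(SRC, ns)
assert run_suite(ns["total"])

# Generate the mutant and re-run the suite; a "killed" mutant fails it,
# which is evidence the suite can detect this class of fault.
tree = SwapAddSub().visit(ast.parse(SRC))
ast.fix_missing_locations(tree)
mns = {}
exec(compile(tree, "<mutant>", "exec"), mns)
killed = not run_suite(mns["total"])
```

A surviving mutant (one the suite cannot kill) indicates a gap in the test suite; concurrency mutation operators apply the same loop to synchronization constructs instead of arithmetic.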

Software bugs reported by human users and automatic error reporting software are often stored in bug tracking tools (e.g., Bugzilla and Debbugs). These accumulated bug reports may contain valuable information that could be used to improve the quality of bug reporting, reduce quality assurance effort and cost, analyze software reliability, and predict future bug report trends. In this paper, we present BUGMINER, a tool that is able to derive useful information from a historic bug report database using data mining, use this information to perform completion and redundancy checks on a new or given bug report, and estimate the bug report trend using statistical analysis. Our empirical studies of the tool using several real-world bug report repositories show that it is effective, easy to implement, and has relatively high accuracy despite low quality data.
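A redundancy check of the kind BUGMINER performs could, at its simplest, compare a new report against historic ones with a bag-of-words cosine similarity. This is a sketch of the general idea only; the threshold and reports below are made up, and the actual tool uses richer data mining:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two reports as bags of lowercase tokens."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def redundancy_check(new_report, history, threshold=0.6):
    """Return historic reports similar enough to be likely duplicates."""
    return [r for r in history if cosine(new_report, r) >= threshold]
```

For example, a new report "crash on malformed utf8 input to server" would match a historic "server crashes on malformed utf8 input" but not an unrelated login-speed report.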

Ensuring reliability as the electrical grid morphs into the “smart grid” will require innovations in how we assess the state of the grid, for the purpose of proactive maintenance, rather than reactive maintenance – in the future, we will not only react to failures, but also try to anticipate and avoid them using predictive modeling (machine learning) techniques. To help in meeting this challenge, we present the Neutral Online Visualization-aided Autonomic evaluation framework (NOVA) for evaluating machine learning algorithms for preventive maintenance on the electrical grid. NOVA has three stages provided through a unified user interface: evaluation of input data quality, evaluation of machine learning results, and evaluation of the reliability improvement of the power grid. A prototype version of NOVA has been deployed for the power grid in New York City, and it is able to evaluate machine learning systems effectively and efficiently.

The star discrepancy is a measure of how uniformly distributed a finite
point set is in the d-dimensional unit cube. It is related to high-dimensional numerical
integration of certain function classes as expressed by the Koksma-Hlawka
inequality. A sharp version of this inequality states that the worst-case error of approximating
the integral of functions from the unit ball of some Sobolev space by
an equal-weight cubature is exactly the star discrepancy of the set of sample points.
In many applications, as, e.g., in physics, quantum chemistry or finance, it is essential
to approximate high-dimensional integrals. Thus with regard to the Koksma-
Hlawka inequality the following three questions are very important:
(i) What are good bounds with explicitly given dependence on the dimension d for
the smallest possible discrepancy of any n-point set for moderate n?
(ii) How can we construct point sets efficiently that satisfy such bounds?
(iii) How can we calculate the discrepancy of given point sets efficiently?
We want to discuss these questions and survey and explain some approaches to
tackle them relying on metric entropy, randomization, and derandomization.
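For concreteness, the star discrepancy of a point set P = {x_1, ..., x_n} in [0,1]^d is the supremum, over anchored boxes [0,y), of |(number of points in [0,y))/n - volume([0,y))|. Question (iii) is hard in general (computing the star discrepancy is known to be NP-hard), but in one dimension there is a well-known closed form for sorted points, which this sketch implements:

```python
def star_discrepancy_1d(points):
    """Exact 1-D star discrepancy of points in [0, 1]:
    D* = max over i of max(i/n - x_(i), x_(i) - (i-1)/n),
    where x_(1) <= ... <= x_(n) are the sorted points."""
    xs = sorted(points)
    n = len(xs)
    return max(
        max((i + 1) / n - x, x - i / n)
        for i, x in enumerate(xs)
    )
```

The centered regular grid x_(i) = (2i - 1)/(2n) attains the optimal 1-D value 1/(2n), e.g. D* = 0.125 for {0.125, 0.375, 0.625, 0.875}.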

A NEW RANDOMIZED ALGORITHM TO APPROXIMATE THE STAR DISCREPANCY BASED ON THRESHOLD ACCEPTING

MICHAEL GNEWUCH, MAGNUS WAHLSTROM, CAROLA WINZEN

2011-05-24

We present a new algorithm for estimating the star discrepancy of arbitrary point
sets. Similar to the algorithm for discrepancy approximation of Winker and Fang [SIAM J. Numer.
Anal. 34 (1997), 2028–2042] it is based on the optimization algorithm threshold accepting. Our
improvements include, amongst others, a non-uniform sampling strategy which is more suited for
higher-dimensional inputs and additionally takes into account the topological characteristics of given
point sets, and rounding steps which transform axis-parallel boxes, on which the discrepancy is to be
tested, into critical test boxes. These critical test boxes provably yield higher discrepancy values, and
contain the box that exhibits the maximum value of the local discrepancy. We provide comprehensive
experiments to test the new algorithm. Our randomized algorithm computes the exact discrepancy
frequently in all cases where this can be checked (i.e., where the exact discrepancy of the point set
can be computed in feasible time). Most importantly, in higher dimension the new method behaves
clearly better than all previously known methods.
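Threshold accepting itself is a deterministic-acceptance cousin of simulated annealing: a neighbor is accepted whenever it is no worse than the current solution minus a shrinking threshold. The generic maximization skeleton below is ours for illustration; it omits the paper's tailored ingredients (non-uniform sampling, rounding to critical test boxes):

```python
import random

def threshold_accepting(f, neighbor, x0, thresholds, seed=0):
    """Maximize f. Accept a neighbor whenever its value is at least
    the current value minus the current threshold."""
    rng = random.Random(seed)
    x, best = x0, x0
    for t in thresholds:
        y = neighbor(x, rng)
        if f(y) >= f(x) - t:
            x = y
        if f(x) > f(best):
            best = x
    return best

# Toy use: climb an integer-valued objective with unit steps.
peak = threshold_accepting(
    f=lambda x: -(x - 7) ** 2,
    neighbor=lambda x, rng: x + rng.choice([-1, 1]),
    x0=0,
    thresholds=[0.5] * 200,
)
```

For discrepancy estimation, f would be the local discrepancy of a candidate test box and the neighbor function would perturb box coordinates.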

Cellphones are increasingly ubiquitous, so much so that many users are
inconveniently forced to carry multiple cellphones to accommodate
work, personal, and geographic mobility needs. We present Cells, a
virtualization architecture for enabling multiple virtual
smartphones to run simultaneously on the same physical cellphone
device in a securely isolated manner. Cells introduces a usage model
of having one foreground virtual phone and multiple background
virtual phones. This model enables a new device namespace mechanism
and novel device proxies that integrate with lightweight operating system virtualization to efficiently and securely multiplex phone hardware devices across multiple virtual phones while providing native hardware device performance to all applications. Virtual phone features
include fully-accelerated graphics for gaming, complete power
management features, and full telephony functionality with separately
assignable telephone numbers and caller ID support. We have
implemented a Cells prototype that supports multiple Android virtual
phones on the same phone hardware. Our performance results
demonstrate that Cells imposes only modest runtime and memory
overhead, works seamlessly across multiple hardware devices including
Google Nexus 1 and Nexus S phones and an NVIDIA tablet, and transparently runs all existing Android applications without any modifications.

While there has been a lot of research towards improving the accuracy of recommender systems, the resulting systems have tended to become increasingly narrow in suggestion variety.
An emerging trend in recommendation systems is to actively seek out diversity in recommendations, where the aim is to provide unexpected, varied, and serendipitous recommendations to the user.
Our main contribution in this paper is a new approach to diversity in recommendations called ``Social Diversity,'' a technique that uses social network information to diversify recommendation results.
Social Diversity utilizes social networks in recommender systems to leverage the diverse underlying preferences of different user communities to introduce diversity into recommendations.
This form of diversification ensures that users in different social networks (who may not collaborate in real life, since they are in a different network) share information, helping to prevent siloization of knowledge and recommendations.
We describe our approach and show its feasibility in providing diverse recommendations for the MovieLens dataset.
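One simple way to realize this idea, sketched here as an illustration of the general diversification strategy rather than the paper's exact algorithm, is to interleave each community's ranked recommendations so the top-k draws from every community:

```python
from itertools import zip_longest

def socially_diverse_top_k(ranked_by_community, k):
    """Round-robin across each community's ranked recommendation list,
    skipping duplicates, so every community contributes to the top-k."""
    result, seen = [], set()
    for tier in zip_longest(*ranked_by_community.values()):
        for item in tier:
            if item is not None and item not in seen:
                seen.add(item)
                result.append(item)
                if len(result) == k:
                    return result
    return result
```

A user whose own community ranks only blockbusters still sees the film-club community's top pick near the head of the list, which is the anti-siloization effect described above.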

Combining a Baiting and a User Search Profiling Techniques for Masquerade Detection

Malek Ben Salem, Salvatore J. Stolfo

2011-05-06

Masquerade attacks are characterized by an adversary stealing
a legitimate user's credentials and using them to impersonate the victim
and perform malicious activities, such as stealing information. Prior work
on masquerade attack detection has focused on profiling legitimate user behavior
and detecting abnormal behavior indicative of a masquerade attack.
Like any anomaly-detection based technique, detecting masquerade attacks
by profiling user behavior suffers from a significant number of false positives.
We extend prior work and provide a novel integrated detection approach in
this paper. We combine a user behavior profiling technique with a baiting
technique in order to more accurately detect masquerade activity. We show
that using this integrated approach reduces the false positives by 36% when
compared to user behavior profiling alone, while achieving almost perfect detection
results. We also show how this combined detection approach serves as
a mechanism for hardening the masquerade attack detector against mimicry
attacks.

With the increase in application complexity, the need for network
fault diagnosis for end-users has grown. However,
existing failure diagnosis techniques fail to assist end-users
in accessing applications and services.
We present DYSWIS, an automatic network fault detection
and diagnosis system for end-users. The key idea is
collaboration of end-users; a node requests multiple nodes
to diagnose a network fault in real time to collect diverse information
from different parts of the networks and infer the
cause of failure. DYSWIS leverages a DHT network to search
for collaborating nodes with the appropriate network properties
required to diagnose a failure. The framework allows dynamic
updating of rules and probes into a running system.
Another key aspect is the contribution of expert knowledge (rules
and probes) by application developers, vendors, and network
administrators, thereby enabling crowdsourcing of the diagnosis
strategy for a growing set of applications.
We have implemented the framework and the software
and tested them using our test bed and PlanetLab to show
that several complex commonly occurring failures can be
detected and diagnosed successfully using DYSWIS, while
single-user probing with traditional tools fails to pinpoint the
cause of such failures. We validate that our base modules
and rules are sufficient to detect the infrastructural failures that
cause the majority of application failures.

Eyeball ISPs today are under-utilizing an important asset: edge
routers. We present NetServ, a programmable node architecture aimed
at turning edge routers into distributed service hosting platforms.
This allows ISPs to allocate router resources to content publishers
and application service providers motivated to deploy content and
services at the network edge. This model provides important benefits
over currently available solutions like CDN. Content and services can
be brought closer to end users by dynamically installing and removing
custom modules as needed throughout the network.
Unlike previous programmable router proposals which focused on
customizing features of a router, NetServ focuses on deploying content
and services. All our design decisions reflect this change in focus.
We set three main design goals: a wide-area deployment, a multi-user
execution environment, and a clear economic benefit. We built a
prototype using Linux, NSIS signaling, and the Java OSGi framework.
We also implemented four prototype applications: ActiveCDN provides
publisher-specific content distribution and processing; KeepAlive
Responder and Media Relay reduce the infrastructure needs of telephony
providers; and Overload Control makes it possible to deploy more
flexible algorithms to handle excessive traffic.

An important problem in reliability engineering is
to predict the failure rate, that is, the frequency with which
an engineered system or component fails. This paper presents a
new method of estimating failure rate using a semiparametric
model with Gaussian process smoothing. The method is able to
provide accurate estimation based on historical data and it does
not make strong a priori assumptions of failure rate pattern (e.g.,
constant or monotonic). Our experiments applying this method
to power system failure data, in comparison with other models, show
its efficacy and accuracy. This method can be used in estimating
reliability for many other systems, such as software systems or
components.
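As a drastically simplified stand-in for the paper's semiparametric Gaussian-process model, the smoothing component alone can be illustrated with a Gaussian-kernel (Nadaraya-Watson) estimate of the failure rate from historical counts. The bandwidth and data here are hypothetical:

```python
import math

def gaussian_smooth(times, counts, query_t, bandwidth=2.0):
    """Nadaraya-Watson kernel estimate of the failure rate at query_t:
    a weighted average of historical counts, with weights decaying as a
    Gaussian in the distance from query_t."""
    w = [math.exp(-((query_t - t) ** 2) / (2 * bandwidth ** 2)) for t in times]
    s = sum(w)
    return sum(wi * c for wi, c in zip(w, counts)) / s if s else 0.0
```

Unlike a constant or monotonic parametric fit, such a smoother follows whatever pattern the data exhibit; the full model in the paper additionally quantifies uncertainty, which this sketch does not.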

User-contributed messages on social media sites such as Twitter have emerged as powerful, real-time means of information sharing on the Web. These short messages tend to reflect a variety of events in real time, earlier than other social media sites such as Flickr or YouTube,
making Twitter particularly well suited as a source of real-time event content. In this paper, we explore approaches for analyzing the stream of Twitter messages to distinguish between messages about real-world events and non-event messages. Our approach relies on a rich family of aggregate statistics of topically similar message clusters, including temporal, social, topical, and Twitter-centric features. Our large-scale experiments over millions of Twitter messages show the effectiveness of our approach for surfacing real-world event content on Twitter.

Concurrent programming languages are growing in importance with the advent
of multicore systems. Two major concerns in any concurrent program are data
races and deadlocks. Each is a potentially subtle bug that can be caused by nondeterministic
scheduling choices in most concurrent formalisms. Unfortunately,
traditional race and deadlock detection techniques fail on both large programs and
small programs with complex behaviors.
We believe the solution is model-based design, where the programmer is presented
with a constrained higher-level language that prevents certain unwanted
behavior. We present the SHIM model that guarantees the absence of data races by
eschewing shared memory.
This dissertation provides SHIM-based techniques that aid determinism: models
that guarantee determinism, compilers that generate deterministic code, and
libraries that provide deterministic constructs. Additionally, we avoid deadlocks,
a consequence of improper synchronization. A SHIM program may deadlock if it
violates a communication protocol. We provide efficient techniques for detecting
and deterministically breaking deadlocks in programs that use the SHIM model.
We evaluate the efficiency of our techniques with a set of benchmarks. We
have also extended our ideas to other languages. The ultimate goal is to provide
deterministic deadlock-free concurrency along with efficiency. Our hope is that
these ideas will be used in the future while designing complex concurrent systems.
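Deadlock detection of the kind described above ultimately reduces to finding a cycle in a wait-for graph. The minimal sketch below is generic, not SHIM's actual channel-protocol analysis:

```python
def has_deadlock(waits_for):
    """Detect a cycle in a wait-for graph.

    waits_for maps each task to the tasks it is blocked on; a cycle
    means a set of tasks that can never make progress. Uses DFS with
    three implicit colors: unvisited, "visiting" (on the stack), "done".
    """
    state = {}

    def dfs(n):
        state[n] = "visiting"
        for m in waits_for.get(n, ()):
            if state.get(m) == "visiting":
                return True  # back edge: cycle found
            if m not in state and dfs(m):
                return True
        state[n] = "done"
        return False

    return any(dfs(n) for n in waits_for if n not in state)
```

Detection is the easy half; deterministically breaking the deadlock, as the dissertation does, additionally requires choosing the same victim or recovery action on every run.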

This report describes the motivation behind implementing Zeroconf in an open source SIP phone (Linphone) and the architecture of the implemented solution. It also describes the roadblocks encountered and how they were tackled in the implementation. It concludes with a few notes on future enhancements that may be implemented at a later date.

The invention of the one-time pad is generally credited to Gilbert S. Vernam and
Joseph O. Mauborgne. We show that it was invented about 35 years earlier by a
Sacramento banker named
Frank Miller. We provide a tentative identification of which Frank Miller it
was, and speculate on whether or not Mauborgne might have known of Miller's work,
especially via his colleague Parker Hitt.

Increasingly, people are sharing
sensitive personal information via online social networks (OSN).
While such networks do permit users to control what they share with whom,
access control policies are notoriously difficult to configure correctly;
this raises the question of whether OSN users' privacy settings match their sharing intentions.
We present the results of an empirical evaluation that measures privacy attitudes and intentions
and compares these against the privacy settings on Facebook. Our results indicate a serious mismatch: every one of the 65 participants
in our study confirmed that at least one of the identified violations was in fact a sharing violation.
In other words, OSN users' privacy settings are incorrect.
Furthermore, a majority of users cannot or will not fix such errors.
We conclude that the current approach to privacy settings is fundamentally flawed
and cannot be fixed;
a fundamentally different approach is needed. We present recommendations to
ameliorate the current problems, as well as provide suggestions for future
research.

In recent years, computer games have become increasingly social and collaborative in nature.
Massively multiplayer online games, in which a large number of players collaborate with each other to achieve common goals in the game, have become extremely pervasive.
By working together towards a common goal, players become more engrossed in the game.
In everyday work environments, this sort of engagement would be beneficial, and is often sought out.
We propose an approach to software engineering called HALO that builds upon the properties found in popular games, by turning work into a game environment.
Our proposed approach can be viewed as a model for a family of prospective games that would support the software development process.
Utilizing operant conditioning and flow theory, we create an immersive software development environment conducive to increased productivity.
We describe the mechanics of HALO and how it could fit into typical software engineering processes.

Health care professionals rely on software to simulate anatomical and
physiological elements of the human body for purposes of training, prototyping, and decision making. Software can also be used to simulate medical processes and protocols to measure cost effectiveness and resource utilization. Whereas much of the software engineering research into simulation software focuses on validation (determining that the simulation accurately models real-world activity), to date there has been little investigation into the testing of simulation software itself, that is, the ability to effectively search for errors in the implementation. This is particularly challenging because often there is no test oracle to indicate whether the results of the simulation are correct. In this paper, we present an approach to systematically testing simulation software in the absence of test oracles, and evaluate the effectiveness of the technique.
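One widely used strategy for the oracle problem, sketched here purely as an illustration (the paper's actual technique may differ), is metamorphic testing: instead of checking a single output against a known answer, check relations between outputs of related runs. For a toy Monte Carlo simulation of exponential service times, doubling the rate should roughly halve the estimated mean:

```python
import random

def mc_mean_service_time(rate, n=20000, seed=1):
    """Monte Carlo estimate of the mean exponential service time (1/rate)."""
    rng = random.Random(seed)
    return sum(rng.expovariate(rate) for _ in range(n)) / n

# Metamorphic relation: no oracle gives the "right" estimate directly,
# but doubling the rate must roughly halve the mean, up to sampling noise.
m1 = mc_mean_service_time(1.0)
m2 = mc_mean_service_time(2.0)
assert abs(m1 / m2 - 2.0) < 0.05
```

An implementation bug that, say, dropped samples or mis-scaled the rate would typically violate such a relation even though no run's output is individually checkable.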

Protocols and System Design, Reliability, and Energy Efficiency in Peer-to-Peer Communication Systems

Salman Abdul Baset

2011-02-04

Modern Voice-over-IP (VoIP) communication systems provide a bundle of services to their users. These services range from the most basic voice-based services such as voice calls and voicemail to more advanced ones such as conferencing, voicemail-to-text, and online address books. Besides voice, modern VoIP systems provide video calls and video conferencing, presence, instant messaging (IM), and even desktop sharing services. These systems also let their users establish a voice, video, or a text session with devices in cellular, public switched telephone network (PSTN), or other VoIP networks.
The peer-to-peer (p2p) paradigm for building VoIP systems involves minimal or no use of managed servers and is therefore attractive from an administrative and economic perspective. However, the benefits of using the p2p paradigm in VoIP systems are not without their challenges. First, p2p communication (VoIP) systems can be deployed in environments with varying requirements of scalability, connectivity, security, interoperability, and performance. These requirements bring forth the question of designing open and standardized protocols for diverse deployments. Second, the presence of restrictive network address
translators (NATs) and firewalls prevents machines from directly exchanging packets and is problematic from the perspective of establishing direct media sessions. The p2p communication systems address this problem by using an intermediate peer with unrestricted
connectivity to relay the session or by preferring the use of TCP. This technique for addressing connectivity problems raises questions about the reliability and session quality of p2p communication systems compared with the traditional client-server VoIP systems. Third,
while administrative overheads are likely to be lower in running p2p communication systems as compared to client-server, can the same be said about the energy efficiency? Fourth, what type of techniques can be used to gain insights into the performance of a deployed
p2p VoIP system like Skype?
The thesis addresses the challenges in designing, building, and analyzing peer-to-peer communication systems. The thesis presents Peer-to-Peer Protocol (P2PP), an open protocol for building p2p communication systems with varying operational requirements. P2PP
is now part of the IETF's P2PSIP protocol and is on track to become an RFC. The thesis describes the design and implementation of OpenVoIP, a proof-of-concept p2p communication system to demonstrate the feasibility of P2PP and to explore issues in building p2p communication systems. The thesis introduces a simple and novel analytical model for analyzing the reliability of peer-to-peer communication systems and analyzes the feasibility of TCP for sending real-time traffic. The thesis then analyzes the energy efficiency of peer-to-peer and client-server VoIP systems and shows that p2p VoIP systems are less energy efficient than client-server even if the peers consume a small amount of energy for running the p2p network. Finally, the thesis presents an analysis of the Skype protocol which indicates that Skype is free-riding on the network bandwidth of universities.

Anonymous communication networks like Tor
partially protect the confidentiality of their users' traffic by
encrypting all intra-overlay communication.
However, when the relayed traffic
reaches the boundaries of the overlay network towards its actual
destination, the original user traffic is inevitably exposed. At this
point, unless end-to-end encryption is used, sensitive user data can
be snooped by a malicious or compromised exit node, or by any other
rogue network entity on the path towards the actual destination.
We explore the use of decoy traffic for the detection of traffic
interception on anonymous proxying systems. Our approach is based on
the injection of traffic that exposes bait credentials for decoy
services that require user authentication. Our aim is to entice
prospective eavesdroppers to access decoy accounts on servers under
our control using the intercepted credentials. We have deployed our
prototype implementation in the Tor network using decoy IMAP and SMTP
servers. During the course of six months, our system detected eight
cases of traffic interception that involved eight different Tor exit
nodes. We provide a detailed analysis of the detected incidents,
discuss potential improvements to our system, and outline how our
approach can be extended for the detection of HTTP session hijacking
attacks.
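The bookkeeping behind the bait-credential idea can be sketched in a few lines. This is a hypothetical illustration (class and identifier names are invented, not from the deployed system): because each decoy credential is unique, a later login attempt on the decoy IMAP/SMTP server identifies exactly which exit relay observed the cleartext traffic.

```python
import secrets

class BaitTracker:
    """Illustrative sketch: map each unique decoy credential to the
    exit node through which it was exposed in cleartext."""

    def __init__(self):
        self._issued = {}  # credential -> exit node it traversed

    def issue(self, exit_node):
        """Create a fresh, unique decoy credential to send through exit_node."""
        cred = ("decoy-user-" + secrets.token_hex(4), secrets.token_hex(8))
        self._issued[cred] = exit_node
        return cred

    def report_login(self, cred):
        """Called when the decoy server sees a login attempt with cred;
        returns the suspected eavesdropping relay, or None if unknown."""
        return self._issued.get(cred)

tracker = BaitTracker()
cred = tracker.issue("exit-node-A")
# ... credential is sent unencrypted through exit-node-A ...
assert tracker.report_login(cred) == "exit-node-A"
```

Uniqueness of each credential is what turns a single observed login into an attribution: no two exit nodes ever see the same bait.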

The hardware industry's rapid development of multicore and many-core hardware has outpaced the software industry's transition from sequential to parallel programs. Most applications are still sequential, and many cores on parallel machines remain unused. We propose a tool that uses data-dependence profiling and binary rewriting to parallelize executables without access to source code. Our technique uses Bernstein's conditions to identify independent sets of basic blocks that
can be executed in parallel, introducing a level of granularity between fine-grained instruction level and coarse grained task level parallelism. We analyze dynamically generated control and data dependence graphs to find independent sets of basic blocks which can be parallelized.
We then propose to parallelize these candidates using binary rewriting techniques. Our technique aims to demonstrate the parallelism that remains in serial applications by exposing concrete opportunities for parallelism.
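Bernstein's conditions reduce to three set intersections on the read and write sets of the two regions, as in this minimal sketch (the read/write sets are invented for illustration; the actual tool derives them from dynamic dependence profiles):

```python
def bernstein_independent(reads1, writes1, reads2, writes2):
    """Two code regions may run in parallel (Bernstein's conditions) iff
    neither writes data the other reads, and they write disjoint data."""
    return (not (writes1 & reads2) and
            not (reads1 & writes2) and
            not (writes1 & writes2))

# Basic-block read/write sets as a data-dependence profiler might
# report them (variable names are illustrative):
b1 = ({"a", "b"}, {"x"})   # reads a, b; writes x
b2 = ({"c"}, {"y"})        # reads c;    writes y
b3 = ({"x"}, {"z"})        # reads x, so it depends on b1

assert bernstein_independent(*b1, *b2)      # b1 and b2 can run in parallel
assert not bernstein_independent(*b1, *b3)  # b3 must wait for b1
```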

Masquerade attacks pose a grave security problem that is a consequence of identity theft. Detecting masqueraders is very hard. Prior work has focused on profiling legitimate user behavior and detecting deviations from that normal behavior that could potentially signal an ongoing
masquerade attack. Such approaches suffer from high false positive rates.
Other work investigated the use of trap-based mechanisms as a means
for detecting insider attacks in general. In this paper, we investigate the use of such trap-based mechanisms for the detection of masquerade attacks. We evaluate the desirable properties of decoys deployed within a
user's file space for detection. We investigate the trade-offs between these properties through two user studies, and propose recommendations for effective masquerade detection using decoy documents based on findings from our user studies.

Real-world large-scale data collection poses an important
challenge in the security field. Insider and masquerader attack data collection poses an even greater challenge. Very few organizations acknowledge such breaches because of liability concerns and potential implications for their market value. This has caused a scarcity of real-world data sets that could be used to study insider and masquerader attacks. In this paper, we present the design, technical, and procedural challenges encountered during our own masquerade data gathering project. We also share some lessons learned from this several-year project related to the Institutional Review Board process and to user study design.

A computational camera uses a combination of optics and software to produce images that cannot be taken with traditional cameras. In the last decade, computational imaging has emerged as a vibrant field of research. A wide variety of computational cameras have been demonstrated - some designed to achieve new imaging functionalities and others to reduce the complexity of traditional imaging.
In this article, we describe how computational cameras have evolved and present a taxonomy for the technical approaches they use. We explore the benefits and limits of computational imaging, and describe how it is related to the adjacent and overlapping fields of digital imaging, computational photography and computational image sensors.

We extend the notion of L2-B-discrepancy introduced in [E. Novak, H. Woźniakowski, L2 discrepancy and multivariate integration, in: Analytic number theory. Essays in honour of Klaus Roth. W. W. L. Chen, W. T. Gowers, H. Halberstam, W. M. Schmidt, and R. C. Vaughan (Eds.), Cambridge University Press, Cambridge, 2009, 359–388] to what we want to call weighted geometric L2-discrepancy. This extended notion allows us to consider weights to moderate the importance of different groups of variables, and additionally volume measures different from the Lebesgue measure as well as classes of test sets different from measurable subsets of Euclidean spaces.
We relate the weighted geometric L2-discrepancy to numerical integration defined over weighted reproducing kernel Hilbert spaces and settle in this way an open problem posed by Novak and Woźniakowski.
Furthermore, we prove an upper bound for the numerical integration error for cubature formulas that use admissible sample points. The set of admissible sample points may actually be a subset of the integration domain of measure zero. We illustrate that particularly in infinite dimensional numerical integration it is crucial to distinguish between the whole integration domain and the set of those sample points that actually can be used by algorithms.

We present a comprehensive survey of Voice over IP security academic research, using a set of 245 publications forming a closed cross-citation set. We classify these papers according to an extended version of the VoIP Security Alliance (VoIPSA) Threat Taxonomy. Our goal is to provide a roadmap for researchers seeking to understand existing capabilities and to identify gaps in addressing the numerous threats and vulnerabilities present in VoIP systems. We discuss the implications of our findings with respect to vulnerabilities reported in a variety of VoIP products.
We identify two specific problem areas (denial of service, and service abuse) as requiring significantly more attention from the research community. We also find that the overwhelming majority of the surveyed work takes a black box view of VoIP systems that avoids examining their internal structure and implementation. Such an approach may miss the mark in terms of addressing the main sources of vulnerabilities, i.e., implementation bugs and misconfigurations. Finally, we argue for further work on understanding cross-protocol and cross-mechanism vulnerabilities (emergent properties), which are the byproduct of a highly complex system-of-systems and an indication of the issues in future large-scale systems.

Masquerade attacks are a common security problem
that is a consequence of identity theft. Masquerade detection may
serve as a means of building more secure and dependable systems
that authenticate legitimate users by their behavior. Prior work
has focused on user command modeling to identify abnormal
behavior indicative of impersonation. This paper extends prior
work by modeling user search behavior to detect deviations indicating
a masquerade attack. We hypothesize that each individual
user knows their own file system well enough to search in a
limited, targeted and unique fashion in order to find information
germane to their current task. Masqueraders, on the other
hand, will likely not know the file system and layout of another
user's desktop, and would likely search more extensively and
broadly in a manner that is different than the victim user being
impersonated. We devise a taxonomy of Windows applications
and user commands that are used to abstract sequences of
user actions and identify actions linked to search activities. The
experimental results show that modeling search behavior reliably
detects all masqueraders with a very low false positive rate of
1.1%, far better than prior published results. The limited set of
features used for search behavior modeling also results in large
performance gains over the same modeling techniques that use
larger sets of features.

As Social Computing has increasingly captivated the general public, it
has become a popular research area for computer scientists. Social
Computing research focuses on online social behavior and using
artifacts derived from it for providing recommendations and other
useful community knowledge. Unfortunately, some of that behavior and
knowledge incur societal costs, particularly with regards to Privacy,
which is viewed quite differently by different populations as well as
regulated differently in different locales. But clever technical
solutions to those challenges may impose additional societal costs,
e.g., by consuming substantial resources at odds with Green Computing,
another major area of societal concern. We propose a new crosscutting
research area, \emph{Societal Computing}, that focuses on the
technical tradeoffs among computational models and application domains
that raise significant societal issues. We highlight some of the
relevant research topics and open problems that we foresee in Societal
Computing.
We feel that
these topics, and Societal Computing in general, need to gain prominence
as they will provide useful avenues of research leading to
increasing benefits for society as a whole.

This paper describes a work-in-progress to demonstrate the feasibility of integrating services in the Internet core. The project aims to reduce or eliminate the so-called ossification of the Internet. Here we discuss the recent contributions of two of the team members at Columbia University. We will describe experiences setting up a Juniper router, running packet forwarding tests, preparing for the GENI demo, and starting prototype 2 of NetServ.

Recommender systems are becoming increasingly popular.
As these systems become commonplace and the number of users increases, it will become important for these systems to be able to cope with a large and diverse set of users whose recommendation needs may be very different from each other.
In particular, large scale recommender systems will need to ensure that users' requests for recommendations can be answered with low response times and high throughput.
In this paper, we explore how to use caches and cached data mining to improve the performance of recommender systems by improving throughput and reducing response time for providing recommendations.
We describe the structure of our cache, which can be viewed as a prefetch cache that prefetches all types of supported recommendations, and how it is used in our recommender system.
We also describe the results of our simulation experiments to measure the efficacy of our cache.
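A prefetch cache of the kind described can be sketched as follows. This is a hypothetical illustration of the idea (the class, the LRU eviction policy, and the toy recommender are invented here, not taken from the paper's system): the first request for any recommendation type for a user computes and caches all supported types, so follow-up requests for that user are hits.

```python
from collections import OrderedDict

class PrefetchRecCache:
    """Illustrative prefetch cache: a miss for one recommendation type
    prefetches every supported type for that user in a single step."""

    def __init__(self, compute_all, capacity=1000):
        self._compute_all = compute_all  # user_id -> {rec_type: result}
        self._cache = OrderedDict()      # LRU order: user_id -> dict
        self._capacity = capacity
        self.hits = self.misses = 0

    def get(self, user_id, rec_type):
        if user_id not in self._cache:
            self.misses += 1
            self._cache[user_id] = self._compute_all(user_id)
            if len(self._cache) > self._capacity:
                self._cache.popitem(last=False)  # evict least-recent user
        else:
            self.hits += 1
            self._cache.move_to_end(user_id)
        return self._cache[user_id][rec_type]

def fake_recommender(user_id):
    """Stand-in for the expensive data-mining step."""
    return {"items": [f"item-{user_id}-1"], "friends": [f"user-{user_id + 1}"]}

cache = PrefetchRecCache(fake_recommender)
cache.get(7, "items")    # miss: prefetches both rec types for user 7
cache.get(7, "friends")  # hit: already prefetched
assert (cache.hits, cache.misses) == (1, 1)
```

Prefetching all types on one miss trades a larger per-miss cost for lower latency on the follow-up requests a user session typically makes.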

In application domains that do not have a test oracle, such as machine learning and scientific computing, quality assurance is a challenge because it is difficult or impossible to know in advance what the correct output should be for general input. Previously, metamorphic testing has been shown to be a simple yet effective technique in detecting defects, even without an oracle. In metamorphic testing, the application's ``metamorphic properties'' are used to modify existing test case input to produce new test cases in such a manner that, when given the new input, the new output can easily be computed based on the original output. If the new output is not as expected, then a defect must exist. In practice, however, metamorphic testing can be a manually intensive technique for all but the simplest cases. The transformation of input data can be laborious for large data sets, and errors can occur in comparing the outputs when they are very complex. In this paper, we present a tool called Amsterdam that automates metamorphic testing by allowing the tester to easily set up and conduct metamorphic tests with little manual intervention, merely by specifying the properties to check, configuring the framework, and running the software. Additionally, we describe an approach called Heuristic Metamorphic Testing, which addresses issues related to false positives and non-determinism, and we present the results of new empirical studies that demonstrate the effectiveness of metamorphic testing techniques at detecting defects in real-world programs without test oracles.

Deployed multithreaded applications contain many races
because these applications are difficult to write, test, and
debug. Worse, the number of races in deployed applications
may drastically increase due to the rise of multicore
hardware and the immaturity of current race detectors.
LOOM is a “live-workaround” system designed to
quickly and safely bypass application races at runtime.
LOOM provides a flexible and safe language for developers
to write execution filters that explicitly synchronize
code. It then uses an evacuation algorithm to safely install
the filters to live applications to avoid races. It reduces
its performance overhead using hybrid instrumentation
that combines static and dynamic instrumentation.
We evaluated LOOM on nine real races from a diverse
set of six applications, including MySQL and Apache.
Our results show that (1) LOOM can safely fix all evaluated
races in a timely manner, thereby increasing application
availability; (2) LOOM incurs little performance
overhead; (3) LOOM scales well with the number of application
threads; and (4) LOOM is easy to use.

Baseline: Metrics for setting a baseline for web vulnerability scanners

Huning Dai, Michael Glass, Gail Kaiser

2010-09-22

As web scanners are becoming more popular because they are faster and cheaper than security consultants, the trend of relying on these scanners also brings a great hazard: users can choose a weak or outdated scanner and trust incomplete results. Therefore, benchmarks are created to both evaluate and compare the scanners. Unfortunately, most existing benchmarks suffer from various drawbacks, often by testing against inappropriate criteria that do not reflect the user's needs. To deal with this problem, we present an approach called Baseline that coaches the user in picking the minimal set of weaknesses (i.e., a baseline) that a qualified scanner should be able to detect and also helps the user evaluate the effectiveness and efficiency of the scanner in detecting those chosen weaknesses. Baseline's goal is not to serve as a generic ranking system for web vulnerability scanners, but instead to help users choose the most appropriate scanner for their specific needs.

We study the tractability of computing $\varepsilon$-approximations of the
Fredholm problem of the second kind: given $f\in F_d$ and $q\in Q_{2d}$,
find $u\in L_2(I^d)$ satisfying
\[
u(x) - \int_{I^d} q(x,y)u(y)\,dy = f(x)
\qquad\forall\,x\in I^d=[0,1]^d.
\]
Here, $F_d$ and $Q_{2d}$ are spaces of $d$-variate right hand functions and
$2d$-variate kernels that are continuously embedded in~$L_2(I^d)$
and~$L_2(I^{2d})$, respectively. We consider the worst case setting, measuring
the approximation error for the solution $u$ in the $L_2(I^d)$-sense. We say
that a problem is tractable if the minimal number of information operations
of $f$ and $q$ needed to obtain an $\varepsilon$-approximation is
sub-exponential in $\varepsilon^{-1}$ and~$d$. One information operation
corresponds to the evaluation of one linear functional or one function
value. The lack of sub-exponential behavior may be defined in various ways,
and so we have various kinds of tractability. In particular, the problem
is strongly polynomially tractable if the minimal number of information
operations is bounded by a polynomial in $\varepsilon^{-1}$ for all~$d$.
We show that tractability (of any kind whatsoever) for the Fredholm problem
is equivalent to tractability of the $L_2$-approximation problems over the
spaces of right-hand sides and kernel functions. So (for example) if both
these approximation problems are strongly polynomially tractable, so is the
Fredholm problem. In general, the upper bound provided by this proof is
essentially non-constructive, since it involves an interpolatory algorithm
that exactly solves the Fredholm problem (albeit for finite-rank
approximations of~$f$ and~$q$). However, if linear functionals are
permissible and $F_d$ and~$Q_{2d}$ are tensor product spaces, we are
able to surmount this obstacle; that is, we provide a fully-constructive
algorithm that provides an approximation with nearly-optimal cost, i.e.,
one whose cost is within a factor $\ln\,\varepsilon^{-1}$ of being optimal.

Encrypted search --- performing queries on protected data --- is a well
researched problem. However, existing solutions have inherent
inefficiency that raises questions of practicality.
Here, we step back from the goal of achieving maximal privacy
guarantees in an encrypted search scenario to consider efficiency as
a priority. We propose a privacy
framework for search that allows tuning and optimization of the
trade-offs between privacy and efficiency.
As an instantiation of
the privacy framework we introduce a tunable search system based on
the SADS scheme and provide detailed measurements demonstrating the
trade-offs of the constructed system. We also analyze other existing
encrypted search schemes with respect to this framework. We further
propose a protocol that addresses the challenge of document content
retrieval in a search setting with relaxed privacy requirements.

The IPsec protocol promised easy, ubiquitous encryption. That has never happened. For the most part, IPsec usage is confined to VPNs for road warriors, largely due to needless configuration complexity and incompatible implementations. We have designed a simple VPN configuration language that hides the unwanted complexities. Virtually no options are necessary or possible. The administrator specifies the absolute minimum of information: the authorized hosts, their operating systems, and a little about the network topology; everything else, including certificate generation, is automatic. Our implementation includes a multitarget compiler, which generates implementation-specific configuration files for three different platforms; others are easy to add.

We study the numerical integration problem for functions with
infinitely many variables. The functions we want to integrate
are from a reproducing kernel Hilbert space which is endowed with
a weighted norm.
We study the worst case $\epsilon$-complexity which
is defined as the minimal cost among all algorithms whose worst
case error over the Hilbert space unit ball is at most $\epsilon$.
Here we assume that the
cost of evaluating a function depends polynomially on the number
of active variables.
The infinite-dimensional integration problem is (polynomially)
tractable if the
$\epsilon$-complexity is bounded by a constant times a power of
$1/\epsilon$. The smallest such power is called the exponent of
tractability.
First we study finite-order weights. We provide improved lower
bounds for the exponent of tractability for general finite-order weights
and improved upper bounds for three newly defined classes
of finite-order weights.
The constructive upper bounds are obtained by multilevel algorithms
that use for each level quasi-Monte Carlo integration points whose
projections onto specific sets of coordinates exhibit a small
discrepancy.
The newly defined finite-intersection weights model the situation where
each group of variables interacts with at most $\rho$ other groups
of variables, where $\rho$ is some fixed number.
For these weights we obtain a sharp upper bound. This is the
first class of weights for which the exact exponent of tractability
is known for any possible decay of the weights and for any polynomial
degree of the cost function. For the other two classes of finite-order
weights our upper bounds are sharp if, e.g.,
the decay of the
weights
is fast or slow enough.
We extend our analysis to the case of arbitrary weights.
In particular, from our results for finite-order
weights, we conclude a lower bound on the exponent
of tractability for arbitrary weights and a constructive upper bound for
product weights.
Although we confine ourselves for simplicity to
explicit upper bounds for four classes of
weights, we stress that our multilevel algorithm together with our
default choice of quasi-Monte Carlo points
is applicable to any class of weights.

Many software security vulnerabilities only reveal themselves under certain conditions, i.e., particular configurations and inputs together with a certain runtime environment. One approach to detecting these vulnerabilities is fuzz testing that feeds randomly generated inputs to the software and witnesses its failures. However, typical fuzz testing makes no guarantees regarding the syntactic and semantic validity of the input, or of how much of the input space will be explored. To address these problems, we present a new testing methodology called Configuration Fuzzing. Configuration Fuzzing is a technique whereby the configuration of the running application is mutated at certain execution points, in order to check for vulnerabilities that only arise in certain conditions. As the application runs in the deployment environment, this testing technique continuously fuzzes the configuration and checks "security invariants'' that, if violated, indicate a vulnerability. We discuss the approach and introduce a prototype framework called ConFu (CONfiguration FUzzing testing framework) for implementation. We also present the results of case studies that demonstrate the approach's feasibility and evaluate its performance.

Masquerade attacks are a common security problem that is
a consequence of identity theft. Prior work has focused on user command
modeling to identify abnormal behavior indicative of impersonation. This
paper extends prior work by modeling user search behavior to detect
deviations indicating a masquerade attack. We hypothesize that each
individual user knows their own file system well enough to search in a
limited, targeted and unique fashion in order to find information germane
to their current task. Masqueraders, on the other hand, will likely
not know the file system and layout of another user's desktop, and would
likely search more extensively and broadly in a manner that is different
than the victim user being impersonated. We extend prior research by
devising taxonomies of UNIX commands and Windows applications that
are used to abstract sequences of user commands and actions. The experimental
results show that modeling search behavior reliably detects
all masqueraders with a very low false positive rate of 0.13%, far better
than prior published results. The limited set of features used for search
behavior modeling also results in large performance gains over the same
modeling techniques that use larger sets of features.

Recommender systems have become increasingly popular. Most research on recommender systems has focused on recommendation algorithms. There has been relatively little research, however, in the area of generalized system architectures for recommendation systems. In this paper, we introduce weHelp - a reference architecture for social recommender systems. Our architecture is designed to be application and domain agnostic, but we briefly discuss here how it applies to recommender systems for software engineering.

Electronic health record (EHR) systems have significant potential advantages over traditional paper-based systems, but they require that providers assume responsibility for data entry. One significant barrier to adoption of EHRs is the perception of slowed data-entry by providers. This study compares the speed of data-entry using computer-based templates vs. paper for a large eye clinic, using 10 subjects and 10 simulated clinical scenarios. Data-entry into the EHR was significantly slower (p<0.01) than traditional paper forms.

Mutation testing is a white-box fault-based software
testing technique that applies mutation operators to
modify program source code or byte code in small ways and
then runs these modified programs (i.e., mutants) against a
test suite in order to measure its effectiveness and locate the
weaknesses either in the test data or in the program that are
seldom or never exposed during normal execution. In this paper, we describe our implementation of a generic
mutation testing framework and the results of applying three
sets of concurrency mutation operators on four example Java
programs through empirical study and analysis.

Applications in the fields of scientific computing, simulation, optimization, machine learning, etc. are sometimes said to be "non-testable programs" because there is no reliable test oracle to indicate what the correct output should be for arbitrary input. In some cases, it may be impossible to know the program's correct output a priori; in other cases, the creation of an oracle may simply be too hard. These applications typically fall into a category of software that Weyuker describes as "Programs which were written in order to determine the answer in the first place. There would be no need to write such programs, if the correct answer were known." The absence of a test oracle clearly presents a challenge when it comes to detecting subtle errors, faults, defects or anomalies in software in these domains.
Without a test oracle, it is impossible to know in general what the expected output should be for a given input, but it may be possible to predict how changes to the input should effect changes in the output, and thus identify expected relations among a set of inputs and among the set of their respective outputs. This approach, introduced by Chen et al., is known as "metamorphic testing". In metamorphic testing, if test case input x produces an output f(x), the function's so-called "metamorphic properties" can then be used to guide the creation of a transformation function t, which can then be applied to the input to produce t(x); this transformation then allows us to predict the expected output f(t(x)), based on the (already known) value of f(x). If the new output is as expected, it is not necessarily right, but any violation of the property indicates a defect. That is, though it may not be possible to know whether an output is correct, we can at least tell whether an output is incorrect.
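The f(x), t(x), f(t(x)) pattern can be made concrete with a minimal sketch. The sine identity used here is a standard textbook metamorphic property, chosen only for illustration; the function names are invented: we cannot verify sin(x) against an oracle at arbitrary x, but the property sin(π − x) = sin(x) lets us check the implementation for consistency against itself.

```python
import math
import random

random.seed(1234)  # fixed seed so the checks below are deterministic

def metamorphic_test_sine(f, trials=1000):
    """Check the metamorphic property f(pi - x) == f(x) on random inputs.
    A violation proves a defect; passing does not prove correctness."""
    for _ in range(trials):
        x = random.uniform(-100, 100)   # the transformation t(x) = pi - x
        if abs(f(math.pi - x) - f(x)) > 1e-9:
            return False                # property violated: f is defective
    return True                         # consistent, but not proven correct

assert metamorphic_test_sine(math.sin)
# A subtly broken "sine" is caught without any oracle:
assert not metamorphic_test_sine(lambda x: math.sin(x) + 0.001 * x)
```

Note the asymmetry the thesis describes: the passing run tells us only that no violation was found, while the failing run is a definite defect report.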
This thesis investigates three hypotheses. First, I claim that an automated approach to metamorphic testing will advance the state of the art in detecting defects in programs without test oracles, particularly in the domains of machine learning, simulation, and optimization. To demonstrate this, I describe a tool for test automation, and present the results of new empirical studies comparing the effectiveness of metamorphic testing to that of other techniques for testing applications that do not have an oracle. Second, I suggest that conducting function-level metamorphic testing in the context of a running application will reveal defects not found by metamorphic testing using system-level properties alone, and introduce and evaluate a new testing technique called Metamorphic Runtime Checking. Third, I hypothesize that it is feasible to continue this type of testing in the deployment environment (i.e., after the software is released), with minimal impact on the user, and describe a generalized approach called In Vivo Testing.
Additionally, this thesis presents guidelines for identifying metamorphic properties, explains how metamorphic testing fits into the software development process, and discusses suggestions for both practitioners and researchers who need to test software without the help of a test oracle.

Robust, efficient, and accurate contact response remains a challenging problem in the simulation of deformable materials. Contact models should robustly handle contact between geometry by preventing interpenetrations. This should be accomplished while respecting natural laws in order to maintain physical correctness. We simultaneously desire to achieve these criteria as efficiently as possible to minimize simulation runtimes. Many methods exist that partially achieve these properties, but none yet fully attain all three. This thesis investigates existing methodologies with respect to these attributes, and proposes a novel algorithm for the simulation of deformable materials that demonstrates them all. This new method is analyzed and optimized, paving the way for future work in this simplified but powerful manner of simulation.

Cybersecurity mechanisms have become increasingly important as online and offline worlds converge. Strong authentication and accountability are key tools for dealing with online attacks, and we would like to realize them through a token-based, centralized identity management system. In this report, we present a privacy-preserving group of protocols comprising a unique per-user digital identity card, with which its owner is able to authenticate himself, prove possession of attributes, register himself to multiple online organizations (anonymously or not) and provide proof of membership. Unlike
existing credential-based identity management systems, this card is revocable, i.e., its legal owner may invalidate it if physically lost, and still recover its content and registrations into a new credential.
This card will protect an honest individual's anonymity when applicable as well as ensure his activity is known only to appropriate users.

Tractability of multivariate problems has become nowadays
a popular research subject. Polynomial tractability means that
a d-variate problem can be solved to within $\varepsilon$ with
polynomial cost in $\varepsilon^{-1}$ and d. Unfortunately, many
multivariate problems are not polynomially tractable.
This holds for all non-trivial unweighted linear tensor product problems.
By an unweighted problem we mean the case when all variables and
groups of variables play the same role.
It seems natural to ask what is the
``smallest'' non-exponential function $T:[1,\infty)\times
[1,\infty)\to[1,\infty)$ for which we have
T-tractability of unweighted linear tensor product problems. That is, when
the cost of a multivariate problem can be bounded
by a multiple of a power of $T(\varepsilon^{-1},d)$.
Under natural assumptions, it turns out that this function is
$T^{qpol}(x,y):=\exp((1+\ln\,x)(1+\ln y))$
for all $x,y\in[1,\infty)$.
The function $T^{qpol}$ goes to infinity faster than any
polynomial although not ``much'' faster, and that is why we refer to
$T^{qpol}$-tractability as quasi-polynomial tractability.
The main purpose of this paper is to promote quasi-polynomial
tractability especially for the study of unweighted multivariate problems.
We do this for the worst case and randomized settings and for
algorithms using arbitrary linear functionals or only function values.
We prove relations between quasi-polynomial tractability
in these two settings and for the two classes of algorithms.
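To see why $T^{qpol}$ grows only slightly faster than any polynomial, one can expand the exponent (a routine rewriting, included here for illustration):
\[
T^{qpol}(x,y)=\exp\bigl((1+\ln x)(1+\ln y)\bigr)
=\exp\bigl(1+\ln x+\ln y+\ln x\,\ln y\bigr)
= e\,y\,x^{\,1+\ln y}.
\]
Thus for each fixed dimension $d$ the cost bound is polynomial in $\varepsilon^{-1}$, with a degree $1+\ln d$ that grows only logarithmically with $d$.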

We introduce BotSwindler, a bait injection system designed to delude and detect crimeware by forcing it to reveal itself during the exploitation of monitored information. Our implementation of BotSwindler relies upon an out-of-host software agent to drive user-like interactions in a virtual machine, seeking to convince malware residing within the guest OS that it has captured legitimate credentials. To aid in the accuracy and realism of the simulations, we introduce a low overhead approach, called virtual machine verification, for verifying whether the guest OS is in one of a predefined set of states. We provide empirical evidence to show that BotSwindler can be used to induce malware into performing observable actions and demonstrate how this approach is superior to that used in other tools. We present results from a user study to illustrate the believability of the simulations and show that financial bait information can be used to effectively detect compromises through experimentation with real credential-collecting malware.

Current banking systems do not aim to protect user privacy. Purchases made from a single bank account can be linked to each other by many parties. This could be addressed in a straightforward way by generating unlinkable credentials from a single master credential using
Camenisch and Lysyanskaya's algorithm; however, if bank accounts are
taxable, some report must be made to the tax authority about each account. Using unlinkable credentials, digital cash, and zero knowledge proofs of knowledge, we present a solution that prevents anyone, even the tax authority, from knowing which accounts belong to which users, or from being able to link any account to another or to purchases or deposits.

Many software security vulnerabilities only reveal themselves under certain conditions, i.e., particular configurations and inputs together with a certain runtime environment. One approach to detecting these vulnerabilities is fuzz testing. However, typical fuzz testing makes no guarantees regarding the syntactic and semantic validity of the input, or of how much of the input space will be explored. To address these problems, we present a new testing methodology called Configuration Fuzzing. Configuration Fuzzing is a technique whereby the configuration of the running application is mutated at certain execution points, in order to check for vulnerabilities that only arise in certain conditions. As the application runs in the deployment environment, this testing technique continuously fuzzes the configuration and checks "security invariants'' that, if violated, indicate a vulnerability. We discuss the approach and introduce a prototype framework called ConFu (CONfiguration FUzzing testing framework) for implementation. We also present the results of case studies that demonstrate the approach's feasibility and evaluate its performance.
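The core loop of the technique can be illustrated with a minimal Python sketch. The configuration keys, the invariant, and all function names below are hypothetical illustrations of the idea, not ConFu's actual API:

```python
import copy
import random

# Hypothetical application configuration; the keys are invented for illustration.
BASE_CONFIG = {"auth_required": True, "max_upload_kb": 512, "debug_mode": False}

def fuzz_config(config, rng):
    """The mutation step: randomly alter one configuration option."""
    mutated = copy.deepcopy(config)
    key = rng.choice(list(mutated))
    if isinstance(mutated[key], bool):   # check bool first (bool is a subclass of int)
        mutated[key] = not mutated[key]
    else:
        mutated[key] = rng.randint(0, 4096)
    return mutated

def security_invariant(config, request):
    """Example invariant: unauthenticated requests must never reach sensitive data."""
    served = config["auth_required"] is False or request["authenticated"]
    return not (served and not request["authenticated"] and request["sensitive"])

def check_at_execution_point(request, trials=100, seed=0):
    """At an instrumented execution point, fuzz the config repeatedly and
    collect every mutated configuration that violates the invariant."""
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        cfg = fuzz_config(BASE_CONFIG, rng)
        if not security_invariant(cfg, request):
            violations.append(cfg)
    return violations

bad = check_at_execution_point({"authenticated": False, "sensitive": True})
```

Each entry of `bad` is a configuration under which the security invariant fails, i.e., a candidate vulnerability that only arises under that configuration.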

Empirical Evaluation of Approaches to Testing Applications without Test Oracles

Christian Murphy, Gail Kaiser

2010-02-05

Software testing of applications in fields like scientific computation, simulation, machine learning, etc. is particularly challenging because many applications in these domains have no reliable "test oracle" to indicate whether the program's output is correct when given arbitrary input. A common approach to testing such applications has been to use a "pseudo-oracle", in which multiple independently-developed implementations of an algorithm process an input and the results are compared. Other approaches include the use of
program invariants, formal specification languages, trace and log file analysis, and metamorphic testing.
In this paper, we present the results of two empirical studies in which we compare the effectiveness of some of these approaches, including metamorphic testing, pseudo-oracles, and runtime assertion checking. We also analyze the results in terms of the software development process, and discuss suggestions for practitioners and researchers who need to test software without a test oracle.

Automatic Detection of Previously-Unseen Application States for Deployment Environment Testing and Analysis

Christian Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser

2010-01-19

For large, complex software systems, it is typically impossible in terms of time and cost to reliably test the application in all possible execution states and configurations before releasing it into production. One proposed way of addressing this problem has been to continue testing and analysis of the application in the field, after it has been deployed. The theory behind this "perpetual testing" approach is that over time, defects will reveal themselves given that multiple instances of the same application may be run globally with different configurations, in different environments, under different patterns of usage, and in different system states.
A practical limitation of many automated approaches to deployment environment testing and analysis is the potentially high performance overhead incurred by the necessary instrumentation. However, it may be possible to reduce this overhead by selecting test cases and performing analysis only in previously-unseen application states, thus reducing the number of redundant tests and analyses that are run. Solutions for fault detection, model checking, security testing, and fault localization in deployed software may all benefit from a technique that ignores application states that have already been tested or explored.
In this paper, we apply such a technique to a testing methodology called "In Vivo Testing", which conducts tests in deployed applications, and present a solution that ensures that tests are only executed in states that the application has not previously encountered. In addition to discussing our implementation, we present the results of an empirical study that demonstrates its effectiveness, and explain how the new approach can be generalized to assist other automated testing and analysis techniques.
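The gating idea can be sketched as follows. This is our illustration only; the state abstraction and hook names are hypothetical, not the paper's implementation:

```python
import hashlib

class InVivoTester:
    """Run in-vivo tests only in application states not previously encountered."""

    def __init__(self):
        self.seen = set()          # fingerprints of already-tested states
        self.tests_run = 0

    def fingerprint(self, state):
        # Abstract the state down to the fields that matter, then hash it so
        # membership checks stay cheap regardless of state size.
        canonical = repr(sorted(state.items())).encode()
        return hashlib.sha256(canonical).hexdigest()

    def maybe_test(self, state, test):
        fp = self.fingerprint(state)
        if fp in self.seen:
            return False           # redundant: this state was already exercised
        self.seen.add(fp)
        self.tests_run += 1
        test(state)
        return True

tester = InVivoTester()
ran_first = tester.maybe_test({"cache": "warm", "users": 3}, lambda s: None)
ran_again = tester.maybe_test({"cache": "warm", "users": 3}, lambda s: None)
```

The second call returns without testing, which is exactly the instrumentation-overhead saving the paper targets: repeated states pay only a hash-and-lookup, not a full test.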

Machine Learning algorithms have provided important core functionality to support solutions in many scientific computing applications - such as computational biology, computational linguistics, and others. However, it is difficult to test such applications because often there is no "test oracle" to indicate what the correct output should be for arbitrary input. To help address the quality of scientific computing software, in this paper we present a technique for testing the implementations of machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called "metamorphic testing", which has been shown to be effective in such cases. Also presented is a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method has very high effectiveness in killing mutants, and that observing expected cross-validation result alone is not sufficient to test for the correctness of a supervised classification program. Metamorphic testing is strongly recommended as a complementary approach. Finally we discuss how our findings can be used in other areas of computational science and engineering.
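To make the metamorphic idea concrete, here is a small sketch (ours, not the paper's case-study framework): a toy nearest-centroid classifier and one metamorphic property, namely that predictions must not depend on the order of the training examples:

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Tiny stand-in for a classifier under test (not the paper's subject code)."""
    centroids = {}
    for x, y in zip(train_x, train_y):
        sx, n = centroids.get(y, ([0.0] * len(x), 0))
        centroids[y] = ([a + b for a, b in zip(sx, x)], n + 1)
    best, best_d = None, float("inf")
    for label, (sx, n) in sorted(centroids.items()):  # sorted: deterministic ties
        c = [v / n for v in sx]
        d = sum((a - b) ** 2 for a, b in zip(c, query))
        if d < best_d:
            best, best_d = label, d
    return best

def metamorphic_permutation_test(train_x, train_y, query):
    """Metamorphic property: reversing the training data is an input
    transformation t(x) that must leave the prediction f(x) unchanged."""
    base = nearest_centroid_predict(train_x, train_y, query)
    rev = nearest_centroid_predict(train_x[::-1], train_y[::-1], query)
    return base == rev

ok = metamorphic_permutation_test(
    [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [2.9, 3.1]],
    ["a", "a", "b", "b"],
    [0.1, 0.0],
)
```

No oracle for the "correct" class is needed: only the relation between the two outputs is checked, which is what makes the approach applicable to non-testable programs.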

Opportunistic networks, which are wireless network "islands" formed when transient and highly mobile nodes meet for a short period of time, are becoming commonplace as wireless devices grow increasingly popular. It is thus imperative to develop communication tools and applications that work well in opportunistic networks. Group chat and instant messaging applications, in particular, are lacking for such networks today.
In this paper, we present ONEChat, a group chat and instant messaging program that works in such opportunistic networks. ONEChat uses message multicasting on top of service discovery protocols in order to support group chat and reduce bandwidth consumption in opportunistic networks. ONEChat does not require any pre-configuration, a fixed network infrastructure or a client-server architecture in order to operate. In addition, it supports features such as group chat, private rooms, line-by-line or character-by-character messaging, file transfer, etc.
We also present our quantitative analysis of ONEChat, which we believe indicates that the ONEChat architecture is an efficient group collaboration platform for opportunistic networks.

We present a throughput-driven partitioning and a
throughput-preserving merging algorithm for the high-level physical
synthesis of latency-insensitive (LI) systems. These two algorithms
are integrated along with a published floorplanner in a
new iterative physical synthesis flow to optimize system throughput
and reduce area occupation. The synthesis flow iterates a
floorplanning-partitioning-floorplanning-merging sequence of
operations to improve the system topology and the physical locations
of cores. The partitioning algorithm performs bottom-up clustering of
the internal logic of a given IP core to divide it into smaller ones,
each of which has no combinational path from input to output and thus
is legal for LI-interface encapsulation. Applying this algorithm to
cores on critical feedback loops optimizes their length and in turn
enables throughput optimization via the subsequent floorplanning.
The merging algorithm reduces the number of cores on non-critical
loops, lowering the overall area taken by LI interfaces without
hurting the system throughput. Experimental results on a large
system-on-chip design show a 16.7% speedup in system throughput and
a 2.1% reduction in area occupation.

Many software security vulnerabilities only reveal themselves under certain conditions, i.e., particular configurations of the software and certain inputs together with its particular runtime environment. One approach to detecting these vulnerabilities is fuzz testing, which feeds a range of randomly modified inputs to a software application while monitoring it for failures. However, typical fuzz testing makes no guarantees regarding the syntactic and semantic validity of the input, or of how much of the input space will be explored. To address these problems, in this proposal we present a new testing methodology called Configuration Fuzzing. Configuration Fuzzing is a technique whereby the configuration of the running application is mutated at certain execution points, in order to check for vulnerabilities that only arise in certain conditions. As the application runs in the deployment environment, this testing technique continuously fuzzes the configuration and checks "security invariants" that, if violated, indicate a vulnerability; however, the fuzzing is performed in a duplicated copy of the original process, so that it does not affect the state of the running application. Configuration Fuzzing uses a covering array algorithm when fuzzing the configuration which guarantees a certain degree of coverage of the configuration space in the lifetime of the program-under-test. In addition, Configuration Fuzzing tests that are run after the software is released ensure representative real-world user inputs to test with. In addition to discussing the approach and describing a prototype framework for implementation, we also present the results of case studies to prove the approach's feasibility and evaluate its performance.
In this thesis, we will continue developing the framework called ConFu (CONfiguration FUzzing framework), which supports the generation of test functions, parallel sandboxed execution, and vulnerability detection. Given the initial ConFu, we will optimize the way configurations are mutated, define more security invariants, and conduct additional empirical studies of ConFu's effectiveness in detecting vulnerabilities.
At the conclusion of this work, we aim to show that ConFu is efficient and effective in detecting common vulnerabilities, and that the tests executed by ConFu ensure a reasonable degree of coverage of both the configuration and user input spaces over the lifetime of the software.

Software bugs that occur in production are often difficult to reproduce in the lab due to subtle differences in the application
environment and nondeterminism. Toward addressing this problem, we present Transplay, a system that captures application software bugs as they occur in production and deterministically reproduces them in a completely different environment, potentially running a different operating system, where the application, its binaries and other support data do not exist. Transplay introduces partial checkpointing, a new mechanism that provides two key properties. It efficiently captures the minimal state necessary to reexecute just the last few moments of the application before it encountered a failure. The recorded state, which typically consists of a few megabytes of data, is used to replay the application without requiring the specific application binaries or the original execution environment. Transplay integrates with existing debuggers to provide facilities such as breakpoints and single-stepping to allow the user to examine the contents of variables and other program state at each source line of the application’s replayed execution. We have implemented a Transplay prototype that can record unmodified Linux applications and replay them on different versions of Linux as well as Windows. Experiments with server applications such as the Apache web server show that Transplay can be used in production with modest recording overhead.

SIP server overload management has attracted interest recently as SIP becomes the core signaling protocol for Next Generation Networks. Yet virtually all existing SIP overload control work is focused on SIP-over-UDP, despite the fact that TCP is increasingly seen as the more viable choice of SIP transport. This paper answers the following questions: is the existing TCP flow control capable of handling the SIP overload problem? If not, why and how can we make it work? We provide a comprehensive explanation of the default SIP-over-TCP overload behavior through server instrumentation. We also propose and implement novel but
simple overload control algorithms without any kernel or protocol-level modification. Experimental evaluation shows that with our mechanism the overload performance improves from its original zero throughput to nearly full capacity. Our work also leads to the important high-level insight that the traditional notion of TCP flow control alone is incapable of managing overload for time-critical, session-based applications, which is applicable not only to SIP, but also to
a wide range of other common applications such as database servers.

We present a signaling architecture for network traffic authorization, Permission-Based Sending (PBS). This architecture aims to prevent Denial-of-Service (DoS) attacks and other forms of unauthorized traffic. Towards this goal, PBS takes a hybrid approach: a proactive approach of explicit permissions and a reactive approach of monitoring and countering attacks. On-path signaling is used to configure the permission state stored in routers for a data flow. The signaling approach enables easy installation and management of the permission state, and its use of soft-state improves robustness of the system. For secure permission state setup, PBS provides security for signaling in two ways: signaling messages are encrypted end-to-end using public key encryption and TLS provides hop-by-hop encryption of signaling paths.
In addition, PBS uses IPsec for data packet authentication. Our analysis and performance evaluation show that PBS is an effective and scalable solution for preventing various kinds of attack scenarios, including Byzantine attacks.

Thanks to its low product-promotion cost and its efficiency, targeted online advertising has become very popular. Unfortunately, being profile-based, online advertising methods violate consumers' privacy, which has engendered resistance to the ads. However, protecting privacy
through anonymity seems to encourage click-fraud. In this paper, we define consumers' privacy and present a privacy-preserving, targeted ad system (PPOAd) which is resistant to click fraud. Our scheme is structured to provide financial incentives to all entities involved.

In online applications such as Yahoo! Personals and Trulia.com users define structured profiles in order to find potentially interesting matches. Typically, profiles are evaluated against large datasets and
produce thousands of matches. In addition to filtering, users also specify ranking in their profile, and matches are returned in the form
of a ranked list. Top results in ranked lists are typically homogeneous, which hinders data exploration. For example, a user
looking for 1- or 2-bedroom apartments sorted by price will see a
large number of cheap 1-bedrooms in undesirable neighborhoods before
seeing any apartment with different characteristics. An alternative to
ranking is to group matches on common attribute values (e.g., cheap
1-bedrooms in good neighborhoods, 2-bedrooms with 2 baths). However,
not all groups will be of interest to the user given the ranking
criteria.
We argue here that neither single-list ranking nor attribute-based
grouping is adequate for effective exploration of ranked datasets. We
formalize rank-aware clustering and develop a novel rank-aware
bottom-up subspace clustering algorithm. We evaluate the performance
of our algorithm over large datasets from a leading online dating
site, and present an experimental evaluation of its effectiveness.

Challenges arise in assuring the quality of applications that do not have test oracles, i.e., for which it is impossible to know what the correct output should be for arbitrary input. Metamorphic testing has been shown to be a simple yet effective technique in addressing the quality assurance of these "non-testable programs". In metamorphic testing, if test input x produces output f(x), specified "metamorphic properties" are used to create a transformation function t, which can be applied to the input to produce t(x); this transformation then allows the output f(t(x)) to be predicted based on the already-known value of f(x). If the output is not as expected, then a defect must exist.
Previously we investigated the effectiveness of testing based on metamorphic properties of the entire application. Here, we improve upon that work by presenting a new technique called Metamorphic Runtime Checking, a testing approach that automatically conducts metamorphic testing of individual functions during the program's execution. We also describe an implementation framework called Columbus, and discuss the results of empirical studies that demonstrate that checking the metamorphic properties of individual functions increases the effectiveness of the approach in detecting defects, with minimal performance impact.
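The per-function checking idea can be sketched as a decorator. The names and mechanism here are our illustration only, not Columbus's actual implementation:

```python
import functools
import random

def metamorphic_check(transform, relate, rate=1.0, rng=random.Random(0)):
    """With some probability, re-run the wrapped function on the transformed
    input t(x) during normal execution and check the expected relation
    between f(x) and f(t(x)); a violated relation signals a defect."""
    def wrap(f):
        @functools.wraps(f)
        def checked(x):
            out = f(x)
            if rng.random() < rate:
                assert relate(out, f(transform(x))), \
                    f"metamorphic property violated for {f.__name__}"
            return out
        return checked
    return wrap

# Property of this function: doubling every element of the input doubles the sum.
@metamorphic_check(transform=lambda x: [2 * v for v in x],
                   relate=lambda fx, ftx: abs(ftx - 2 * fx) < 1e-9)
def total(xs):
    return sum(xs)

value = total([1.5, 2.5, 4.0])
```

The `rate` parameter models the performance trade-off the abstract mentions: sampling checks rather than performing them on every call keeps the runtime impact minimal.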

Many software security vulnerabilities only reveal themselves under certain conditions, i.e., particular configurations of the software together with its particular runtime environment. One approach to detecting these vulnerabilities is fuzz testing, which feeds a range of randomly modified inputs to a software application while monitoring it for failures. However, fuzz testing makes no guarantees regarding the syntactic and semantic validity of the input, or of how much of the input space will be explored. To address these problems, in this paper we present a new testing methodology called configuration fuzzing. Configuration fuzzing is a technique whereby the configuration of the running application is randomly modified at certain execution points, in order to check for vulnerabilities that only arise in certain conditions. As the application runs in the deployment environment, this testing technique continuously fuzzes the configuration and checks "security invariants" that, if violated, indicate a vulnerability; however, the fuzzing is performed in a duplicated copy of the original process, so that it does not affect the state of the running application. In addition to discussing the approach and describing a prototype framework for implementation, we also present the results of a case study to demonstrate the approach’s efficiency.

The purpose of this work is to lower the average number of features that are evaluated by an online algorithm. This is achieved by merging Sequential Analysis and Online Learning.
Many online algorithms use the example's margin to decide whether the model should be updated. Usually, the algorithm's model is updated when the margin is smaller than a certain threshold. The evaluation of the margin for each example requires the algorithm to evaluate all the model's features. Sequential Analysis allows us to early stop the computation of the margin when uninformative examples are encountered. It is desirable to save computation on uninformative examples since they will have very little impact on the final model.
We show the successful speedup of Online Boosting while maintaining accuracy on a synthetic and the MNIST data sets.
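A minimal sketch of the early-stopping idea for a linear model follows. This is our simplified deterministic version; the paper's actual sequential-analysis stopping rule is statistical:

```python
def early_margin(weights, features, threshold):
    """Evaluate features one at a time and stop as soon as the partial margin,
    minus an upper bound on the magnitude of the remaining terms, already
    exceeds the update threshold. Assumes features are bounded in [-1, 1]."""
    partial = 0.0
    remaining = sum(abs(w) for w in weights)   # max possible leftover contribution
    evaluated = 0
    for w, x in zip(weights, features):
        partial += w * x
        remaining -= abs(w)
        evaluated += 1
        if partial - remaining > threshold:    # margin is certainly large:
            return partial, evaluated, False   # uninformative, skip the update
    return partial, evaluated, partial <= threshold

margin, used, update = early_margin(
    weights=[5.0, 4.0, 0.1, 0.1, 0.1],
    features=[1.0, 1.0, 1.0, 1.0, 1.0],
    threshold=1.0,
)
```

Here only two of the five features are evaluated before the example is certified uninformative, which is the computational saving the abstract describes: uninformative examples are exactly the ones whose margin decision can be settled early.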

Hard real-time systems use worst-case execution
time (WCET) estimates to ensure that timing requirements
are met. The typical approach for obtaining WCET estimates
is to employ static program analysis methods. While these
approaches provide WCET bounds, they struggle to analyze
programs with loops whose iteration counts depend on input data.
Such programs mandate user-guided annotations. We propose a
hybrid approach by augmenting static program analysis with
model-checking to analyze such programs and derive the loop
bounds automatically. In addition, we use model-checking to
guarantee repeatable timing behaviors from segments of program
code. Our target platform is a precision timed architecture: a
SPARC-based architecture promising predictable and repeatable
timing behaviors. We use CBMC and illustrate our approach
on Euclidean greatest common divisor algorithm (for WCET
analysis) and a VGA controller (for repeatable timing validation).
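The essence of deriving a loop bound automatically can be shown in miniature. The Python below is our stand-in for what the CBMC-based flow establishes on the C program: exhaustive enumeration over an 8-bit input space plays the role of bounded model checking:

```python
def gcd_iterations(a, b):
    """Euclid's algorithm, instrumented with a loop-iteration counter."""
    count = 0
    while b != 0:
        a, b = b, a % b
        count += 1
    return count

def derive_loop_bound(limit):
    """Explore the entire bounded input space, the way a bounded model checker
    would, and report the tight bound on the loop's iteration count."""
    return max(gcd_iterations(a, b)
               for a in range(1, limit) for b in range(limit))

bound = derive_loop_bound(256)
```

For 8-bit operands the worst case arises from consecutive Fibonacci numbers (233 and 144, plus one extra step when the arguments arrive swapped), so the derived bound is 12 iterations, a fact no static interval analysis of the loop condition alone would yield.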

Smashing the Stack with Hydra: The Many Heads of Advanced Polymorphic Shellcode

Pratap V. Prabhu, Yingbo Song, Salvatore J. Stolfo

2009-08-31

Recent work on the analysis of polymorphic shellcode engines suggests that modern obfuscation methods will soon eliminate the usefulness of signature-based network intrusion detection methods, and supports growing views that the new generation of shellcode cannot be accurately and efficiently represented by the string signatures on which current IDS and AV scanners rely. In this paper, we expand on this area of study by demonstrating never-before-seen concepts in advanced shellcode polymorphism with a proof-of-concept engine which we call Hydra. Hydra distinguishes itself by integrating an array of obfuscation techniques, such as recursive NOP sleds and multi-layer ciphering, into one system while offering multiple improvements upon existing strategies. We also introduce never-before-seen attack methods such as byte-splicing statistical mimicry, safe-returns with forking shellcode, and syscall-time-locking. In total, Hydra simultaneously attacks signature, statistical, disassembly, behavioral, and emulation-based sensors, and frustrates offline forensics. This engine was developed to present an updated view of the frontier of modern polymorphic shellcode and to provide an effective tool for the evaluation of IDS systems, cyber test ranges, and other related security technologies.

A longstanding lacuna in the field of computational learning theory is the learnability of succinctly representable monotone Boolean functions, i.e., functions whose output can only increase when input bits flip from 0 to 1. This thesis makes significant progress towards understanding both the
possibilities and the limitations of learning various classes of monotone functions by carefully considering the complexity measures used to evaluate them.
We show that Boolean functions computed by polynomial-size monotone circuits are hard to learn assuming the existence of one-way functions. Having shown the hardness of learning general polynomial-size monotone circuits, we show that the class of Boolean functions computed by polynomial-size depth-3 monotone circuits are hard to learn using statistical queries. As a counterpoint, we give a statistical query learning algorithm that can learn random polynomial-size depth-2 monotone circuits (i.e., monotone DNF formulas).
As a preliminary step towards a fully polynomial-time, proper learning algorithm for learning polynomial-size monotone decision trees, we also show the relationship between the average depth of a monotone decision tree, its average sensitivity, and its variance.
Finally, we return to monotone DNF formulas, and we show that they are teachable (a different model of learning) in the average case. We also show that non-monotone DNF formulas, juntas, and sparse GF2 formulas are teachable in the average case.

Most popular instant messaging clients now offer Voice-over-IP (VoIP)
technology. The many options running on similar platforms,
implementing common audio codecs and encryption algorithms, offer the
opportunity to identify what factors affect call quality. We measure
call quality objectively based on mouth-to-ear latency. Based on our
analysis, we determine that mouth-to-ear latency can be influenced by
the operating system (process priority and interrupt handling), the
VoIP client implementation, and network quality.

Desktop computers are often compromised by the interaction of untrusted data and buggy software. To address this problem, we present Apiary, a system that provides transparent application fault containment while retaining the ease of use of a traditional integrated desktop environment. Apiary accomplishes this with three key mechanisms. It isolates applications in containers that integrate in a controlled manner at the display and file system. It introduces ephemeral containers that are quickly instantiated for single application execution and then removed, to prevent any exploit that occurs from persisting and to protect user privacy. It introduces the virtual layered file system to make instantiating containers fast and space efficient, and to make managing many containers no more complex than having a single traditional desktop. We have implemented Apiary on Linux without any application or operating system kernel changes. Our results from running real applications, known exploits, and a 24-person user study show that Apiary has modest performance overhead, is effective in limiting the damage from real vulnerabilities to enable quick recovery, and is as easy to use as a traditional desktop while improving desktop computer security and privacy.

Traditional firewalls have the ability to allow or block traffic based on source address as well as destination address and port number. Our original ROFL scheme implements firewalling by layering it on top of routing; however, the original proposal focused just on destination address and port number. Doing route selection based in part on source addresses is a form of policy routing, which has started to receive increased amounts of attention. In this paper, we extend the original ROFL (ROuting as the Firewall Layer) scheme by including source prefix constraints in route announcement. We present algorithms for route propagation and packet forwarding, and demonstrate the correctness of these algorithms using rigorous proofs. The new scheme not only accomplishes the complete set of filtering functionality provided by traditional firewalls, but also introduces a new direction for policy routing.

Semantic Ranking and Result Visualization for Life Sciences Publications

Julia Stoyanovich, William Mee, Kenneth A. Ross

2009-06-23

An ever-increasing amount of data and semantic knowledge in the domain
of life sciences is bringing about new data management challenges. In
this paper we focus on adding the semantic dimension to literature
search, a central task in scientific research. We focus our attention
on PubMed, the most significant bibliographic source in life sciences,
and explore ways to use high-quality semantic annotations from the
MeSH vocabulary to rank search results. We start by developing
several families of ranking functions that relate a search query to a
document's annotations. We then propose an efficient adaptive ranking
mechanism for each of the families. We also describe a
two-dimensional Skyline-based visualization that can be used in
conjunction with the ranking to further improve the user's interaction
with the system, and demonstrate how such Skylines can be computed
adaptively and efficiently. Finally, we present a user study that
demonstrates the effectiveness of our ranking. We use the full PubMed
dataset and the complete MeSH ontology in our experimental evaluation.
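A two-dimensional skyline is the Pareto-optimal subset of points: those not dominated in both dimensions by any other point. The following is a generic sketch of its computation (ours, not the paper's adaptive algorithm), assuming lower scores are better in both dimensions:

```python
def skyline_2d(points):
    """Sort by the first dimension, then sweep while remembering the best
    second-dimension value seen so far; a point survives only if it improves
    on that value, i.e., no earlier point dominates it."""
    best_y = float("inf")
    result = []
    for x, y in sorted(points):
        if y < best_y:             # not dominated by any point seen so far
            result.append((x, y))
            best_y = y
    return result

# Each point: (rank score under family 1, rank score under family 2).
docs = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]
front = skyline_2d(docs)
```

The sort makes the whole computation O(n log n), and the sweep produces skyline points in order of the first dimension, which suits incremental rendering in a visualization.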

Complexity and heterogeneity of the deployed software
applications often result in a wide range of dynamic states
at runtime. Corner-case software failures during execution
often slip through traditional software checking.
If the software checking infrastructure supports the
transparent checkpoint and resume of the live application
states, the checking system can preserve and replay the live
states in which the software failures occur. We introduce
a novel software checking framework that enables application
states including program behaviors and execution
contexts to be cloned and resumed on a computing cloud.
It employs (1) EXPLODE’s model checking engine for
lightweight, general-purpose software checking, (2) the ZAP
system for a fast, low-overhead, and transparent checkpoint
and resume mechanism through virtualized PODs (PrOcess
Domains), each a collection of host-independent processes,
and (3) a scalable, distributed checking infrastructure
based on Distributed EXPLODE. The efficient and portable
checkpoint/resume and replay mechanism employed in this
framework enables scalable software checking in order to
improve the reliability of software products. Our evaluation
shows the framework's feasibility, efficiency, and applicability.

A limitation of existing P2P VoD services is their inability to support
efficient streamed access to niche content that has relatively small demand. This limitation stems
from the poor performance of P2P when the number of peers sharing the content is small. In this paper, we propose a new provider-managed P2P VoD framework for efficient delivery of niche content based on two principles: reserving small portions of peers' storage and upload resources, and using novel, weighted caching techniques. We demonstrate through analysis, simulations, and experiments on PlanetLab that our architecture can provide high streaming quality for niche content. In particular, we show that our architecture increases the catalog size by up to $40\%$ compared to standard P2P VoD systems, and that a weighted cache policy can reduce the startup delay for niche content by a factor of more than three.

Stream processing is a promising paradigm for programming multi-core systems for high-performance embedded applications. We propose flexible filters as a technique that combines static mapping of the stream program tasks with dynamic load balancing of their execution. The goal is to improve the system-level processing throughput of the program when it is executed on a distributed-memory multi-core system as well as the local (core-level) memory utilization. Our technique is distributed and scalable because it is based on point-to-point handshake signals exchanged between neighboring cores. Load balancing with flexible filters can be applied to stream applications that present large dynamic variations in the computational load of their tasks and the dimension of the stream data tokens. In order to demonstrate the practicality of our technique, we present the performance improvements for the case study of a JPEG encoder running
on the IBM Cell multi-core processor.

The deployment and use of Anomaly Detection (AD) sensors often require the intervention of a human expert to manually calibrate and optimize their performance. Depending on the site and the type of traffic it receives, the operators might have to provide recent and sanitized training data sets, the characteristics of expected traffic (i.e., outlier ratio), and exceptions or even expected future modifications of the system's behavior. In this paper, we study the potential performance issues that stem from fully automating the AD sensors' day-to-day maintenance and calibration. Our goal is to remove the dependence on a human operator, using an unlabeled, and thus potentially dirty, sample of incoming traffic. To that end, we propose to enhance the training phase of AD sensors with a self-calibration phase, leading to the automatic determination of the optimal AD parameters. We show how this novel calibration phase can be employed in conjunction with previously proposed methods for training data sanitization, resulting in a fully automated AD maintenance cycle. Our approach is completely agnostic to the underlying AD sensor algorithm. Furthermore, the self-calibration can be applied in an online fashion to ensure that the resulting AD models reflect changes in the system's behavior which would otherwise render the sensor's internal state inconsistent. We verify the validity of our approach through a series of experiments where we compare the manually obtained optimal parameters with the ones computed from the self-calibration phase. Modeling traffic from two different sources, the fully automated calibration shows a 7.08% reduction in detection
rate and a 0.06% increase in false positives, in the worst case, when compared to the optimal selection of parameters. Finally, our adaptive models outperform the statically generated ones, retaining the gains in performance from the sanitization process over time.

Masquerade attacks are unfortunately a familiar security problem that is a consequence of
identity theft. Detecting masqueraders is very hard. Prior work has focused on user command
modeling to identify abnormal behavior indicative of impersonation. This paper extends prior
work by presenting one-class Hellinger distance-based and one-class SVM modeling techniques
that use a set of novel features to reveal user intent. The specific objective is to model user
search profiles and detect deviations indicating a masquerade attack. We hypothesize that
each individual user knows their own file system well enough to search in a limited, targeted
and unique fashion in order to find information germane to their current task. Masqueraders,
on the other hand, will likely not know the file system and layout of another user's desktop,
and would likely search more extensively and broadly in a manner that is different from the
victim user being impersonated. We extend prior research that uses UNIX command sequences
issued by users as the audit source by relying upon an abstraction of commands. We devise
taxonomies of UNIX commands and Windows applications that are used to abstract sequences
of user commands and actions. We also gathered our own normal and masquerader data sets
captured in a Windows environment for evaluation. The datasets are publicly available for
other researchers who wish to study masquerade attacks rather than author identification as in
much of the prior reported work. The experimental results show that modeling search behavior
reliably detects all masqueraders with a very low false positive rate of 0.1%, far better than prior
published results. The limited set of features used for search behavior modeling also results in
huge performance gains over the same modeling techniques that use larger sets of features.
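To make the one-class, Hellinger distance-based idea concrete, here is a minimal sketch (not taken from the paper): a user's baseline search-action frequency profile is compared against a test window of activity, and a large Hellinger distance flags a possible masquerader. The event names, counts, and the 0.4 threshold are invented for illustration; in practice the threshold would come from calibration.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions,
    given as dicts mapping event -> probability (missing keys count as 0)."""
    keys = set(p) | set(q)
    s = sum((math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys)
    return math.sqrt(s) / math.sqrt(2.0)

def normalize(counts):
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

# Baseline: the legitimate user's search-related action frequencies.
baseline = normalize({"search": 40, "open_doc": 50, "list_dir": 10})
# Test window: broad, extensive searching, as hypothesized for a masquerader.
window = normalize({"search": 10, "list_dir": 70, "copy_file": 20})

d = hellinger(baseline, window)
alarm = d > 0.4   # illustrative threshold; chosen by calibration in practice
```

The distance lies in [0, 1], so a single threshold suffices regardless of how many event types the profiles contain.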

Many different monitoring systems have been created to identify system state conditions to detect or prevent a myriad of deliberate attacks, or arbitrary faults inherent in any complex system. Monitoring systems are also vulnerable to attack. A stealthy attacker can simply turn off or disable these monitoring systems without being detected; he would thus be able to perpetrate the very attacks that these systems were designed to stop. For example, many examples of virus attacks against antivirus scanners have appeared in the wild. In this paper, we present a novel technique to “monitor the monitors” in such a way that (a) unauthorized shutdowns of critical monitors are detected with high probability, (b) authorized shutdowns raise no alarm, and (c) the proper shutdown sequence for authorized shutdowns cannot be inferred from reading memory. The techniques proposed to prevent unauthorized shut down (turning off) of monitoring systems was inspired by the duality of safety technology devised to prevent unauthorized discharge (turning on) of nuclear weapons.

Embedding malcode within documents provides a convenient means of attacking systems. Such attacks can be very targeted and difficult to detect and stop due to the multitude of document-exchange vectors and
the vulnerabilities in modern document processing applications. Detecting malcode embedded in a document is difficult owing to the complexity of modern document formats that provide ample opportunity to embed code in a myriad of ways. We focus on Microsoft Word documents as malcode carriers as a case study in this paper. To detect stealthy embedded malcode in documents, we develop an arbitrary data transformation technique that changes the value of data segments in documents in such a way as to purposely damage any hidden malcode that may be embedded in those sections. Consequently, the embedded malcode will not only fail but also introduce a system exception that would be easily detected. The method is intended to be applied in a safe sandbox, the transformation is reversible after testing a document, and does not require any learning phase. The method depends upon knowledge of the structure of the document binary format to parse a document and identify the specific sectors to which the method can be safely applied for malcode detection. The method can be implemented in MS Word as a security feature to enhance the safety of Word documents.
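As an illustrative toy (not the paper's actual transformation), the idea of a reversible change that damages hidden machine code can be sketched as a byte-wise XOR over a data sector: data parsed as data survives the round trip, while any code hidden in the sector is corrupted and faults when executed in the sandbox.

```python
def transform_sector(data: bytes, key: int = 0x5A) -> bytes:
    """Reversibly perturb a data sector: any machine code hidden in it is
    damaged, while applying the same transform again restores the original."""
    return bytes(b ^ key for b in data)

# Stand-in bytes for a sector the parser identifies as pure data.
sector = b"\x90\x90\xcc\x41\x42"
damaged = transform_sector(sector)
restored = transform_sector(damaged)
assert restored == sector   # the transformation is its own inverse
```

The key point is reversibility: after testing, the document can be restored bit-for-bit, so no learning phase or permanent modification is needed.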

Recommender systems have become increasingly popular.
Most of the research on recommender systems has focused on recommendation algorithms.
There has been relatively little research, however, in the area of generalized system architectures for recommendation systems.
In this paper, we introduce \textit{weHelp}: a reference architecture for social recommender systems - systems where recommendations are derived automatically from the aggregate of logged activities conducted by the system's users.
Our architecture is designed to be application and domain agnostic.
We feel that a good reference architecture will make designing a recommendation system easier; in particular, weHelp aims to provide a practical design template to help developers design their own well-modularized systems.

Zodiac (Zero Outage Dynamic Intrinsically Assurable
Communities) is an implementation of a high-security
MANET, resistant to multiple types of attacks, including
Byzantine faults. The Zodiac architecture poses a set of unique
system security, performance, and usability requirements to
its policy-based management system (PBMS). In this paper,
we identify these requirements and present the design and
implementation of the Zodiac Policy Subsystem (ZPS), which
allows administrators to securely specify, distribute and evaluate
network control and system security policies to customize
Zodiac behaviors. ZPS uses the KeyNote language for specifying
all authorization policies, with a simple extension to support
obligation policies.

This report studies the performance impact of using TLS as a transport protocol for SIP servers. We evaluate the cost of TLS
experimentally using a testbed with OpenSIPS, OpenSSL, and Linux running on an Intel-based server. We analyze TLS costs
using application, library, and kernel profiling, and use the profiles to illustrate when and how different costs are incurred, such
as bulk data encryption, public key encryption, private key decryption, and MAC-based verification. We show that using TLS can reduce performance by up to a factor of 20 compared to the typical case of SIP over UDP. The primary factor in determining performance is whether and how TLS connection establishment is performed, due to the heavy
costs of RSA operations used for session negotiation. This depends both on how the SIP proxy is deployed (e.g., as an inbound or
outbound proxy) and what TLS options are used (e.g., mutual authentication, session reuse). The cost of symmetric key operations
such as AES or 3DES, in contrast, tends to be small.
Network operators deploying SIP over TLS should attempt to maximize the persistence of secure connections, and will need
to assess the server resources required. To aid them, we provide a measurement-driven cost model for use in provisioning SIP
servers using TLS. Our cost model predicts performance within 15 percent on average.
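In the spirit of the measurement-driven cost model (the coefficients below are hypothetical, not the paper's measurements), per-call CPU cost can be modeled as an amortized handshake term plus a per-message symmetric term; the fraction of calls that trigger a full RSA handshake dominates the result, which is why connection persistence and session reuse matter.

```python
def calls_per_second(c_handshake_ms, c_sym_ms, msgs_per_call,
                     p_new_conn, cpu_budget_ms=1000.0):
    """Estimate the sustainable SIP call rate for one server core.
    p_new_conn is the fraction of calls incurring a full TLS handshake
    (lower with persistent connections and session reuse)."""
    cost = p_new_conn * c_handshake_ms + msgs_per_call * c_sym_ms
    return cpu_budget_ms / cost

# Hypothetical coefficients: 4 ms per RSA handshake, 0.05 ms per symmetric op.
no_reuse = calls_per_second(4.0, 0.05, msgs_per_call=6, p_new_conn=1.0)
reuse    = calls_per_second(4.0, 0.05, msgs_per_call=6, p_new_conn=0.1)
```

With these made-up numbers, cutting handshakes to one call in ten raises throughput by roughly a factor of six, mirroring the paper's observation that connection establishment, not bulk encryption, is the primary cost.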

The widespread adoption of multicores has renewed the
emphasis on the use of parallelism to improve performance.
The present and growing diversity in hardware architectures
and software environments, however, continues to
pose difficulties in the effective use of parallelism thus delaying
a quick and smooth transition to the concurrency era.
In this paper, we describe the research being conducted at
Columbia University on a system called COMPASS that aims
to simplify this transition by providing advice to programmers
while they reengineer their code for parallelism. The
advice proffered to the programmer is based on the wisdom
collected from programmers who have already parallelized
some similar code. The utility of COMPASS rests, not only
on its ability to collect the wisdom unintrusively but also on
its ability to automatically seek, find and synthesize this wisdom
into advice that is tailored to the task at hand, i.e., the
code the user is considering parallelizing and the environment
in which the optimized program is planned to execute.
COMPASS provides a platform and an extensible framework
for sharing human expertise about code parallelization –
widely, and on diverse hardware and software. By leveraging
the “wisdom of crowds” model [26], which has been
conjectured to scale exponentially and which has successfully
worked for wikis, COMPASS aims to enable rapid propagation
of knowledge about code parallelization in the context
of the actual parallelization reengineering, and thus
continue to extend the benefits of Moore’s law scaling to
science and society.

Most legitimate calls are from persons or organizations with
strong social ties such as friends. Some legitimate calls, however, are from those with weak social ties, such as a restaurant at which the callee booked a table online. Since a callee's contact list usually contains only the addresses of persons or organizations with strong social ties, filtering out unsolicited calls using the contact list is prone to false positives. To reduce these false positives, we first analyzed call logs and found that legitimate calls from persons or organizations with weak social ties are initiated through transactions over
the web or email exchanges. This paper proposes two approaches to label incoming calls by using cross-media relations to previous contact mechanisms which initiate the calls. One approach is that potential callers offer the callee their contact addresses which might be used in future correspondence. Another is that a callee provides potential callers with weakly-secret information that the callers should use in
future correspondence in order to identify them as someone the callee has contacted before through other means. Depending on previous contact mechanisms, the callers use either customized contact addresses or message identifiers. The latter approach enables a callee to label incoming calls even without caller identifiers. Reducing false positives during filtering using our proposed approaches will contribute
to the reduction in SPIT (SPam over Internet Telephony).
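One way to picture the weakly-secret-information approach (a hypothetical sketch, not the paper's protocol): the callee derives a short token during the web transaction and hands it to the potential caller; an incoming call presenting that token can be labeled as coming from a previously contacted party, even without a caller identifier. The key and transaction names below are invented.

```python
import hashlib
import hmac

def make_token(callee_secret: bytes, transaction: str) -> str:
    """Derive a weakly-secret identifier the callee hands out during a web
    transaction; a later call presenting it is labeled as a known contact."""
    return hmac.new(callee_secret, transaction.encode(),
                    hashlib.sha256).hexdigest()[:12]

def label_call(presented_token: str, known_tokens: set) -> str:
    return "known-contact" if presented_token in known_tokens else "unverified"

secret = b"callee-private-key"                      # hypothetical secret
issued = make_token(secret, "restaurant-booking")   # given out during booking

assert label_call(issued, {issued}) == "known-contact"
assert label_call("deadbeefcafe", {issued}) == "unverified"
```

Because the token is only weakly secret, leaking one exposes a single correspondence rather than the callee's identity, which keeps the scheme lightweight.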

The frequency and severity of recent intrusions involving data theft
and leakages has shown that online users' trust, voluntary or not, in
the ability of third parties to protect their sensitive data is often
unfounded. Data may be exposed anywhere along a corporation's web
pipeline, from the outward-facing web servers to the back-end
databases. Additionally, in service-oriented architectures (SOAs),
data may also be exposed as they transit between SOAs. For example,
credit card numbers may be leaked during transmission to or handling
by transaction-clearing intermediaries.
We present F3ildCrypt, a system that provides end-to-end protection of
data across a web pipeline and between SOAs. Sensitive data are
protected from their origin (the user's browser) to their legitimate
final destination. To that end, F3ildCrypt exploits browser scripting
to enable application- and merchant-aware handling of sensitive data.
Such techniques have traditionally been considered a security risk; to
our knowledge, this is one of the first uses of web scripting that
enhances overall security. F3ildCrypt uses proxy re-encryption
to re-target messages as they enter and cross SOA boundaries, and uses
XACML, the XML-based access control language, to define protection
policies. Our approach scales well in the number of public key
operations required for web clients and does not reveal proprietary
details of the logical enterprise network (because of the application
of proxy re-encryption). We evaluate F3ildCrypt and show an
additional cost of 40 to 150 ms when making sensitive transactions
from the web browser, and a processing rate of 100 to 140 XML
fields/second on the server. We believe such costs to be a reasonable
tradeoff for increased sensitive-data confidentiality.

The insider threat remains one of the most vexing problems in computer security. A number of approaches have been proposed to detect nefarious insider actions, including user modeling and profiling techniques, policy and access enforcement techniques, and misuse detection. In this work we propose trap-based defense mechanisms for the case where insiders attempt to exfiltrate and use sensitive information. Our goal is to confuse and confound the attacker, requiring far more effort to distinguish real information from bogus information, and to provide a means of detecting when an inside attacker attempts to exploit sensitive information. ``Decoy Documents" are automatically generated and stored on a file system with the aim of enticing a malicious insider to open and review the contents of the documents. The decoy documents contain several different types of bogus credentials that, when used, trigger an alert. We also embed ``stealthy beacons" inside the documents that cause a signal to be emitted to a server indicating when and where the particular decoy was opened. We evaluate decoy documents on honeypots penetrated by attackers, demonstrating the feasibility of the method.

Challenges arise in assuring the quality of applications that do not have test oracles, i.e., for which it is difficult or impossible to know that the correct output should be for arbitrary input. Recently, metamorphic testing has been shown to be a simple yet effective technique in addressing the quality assurance of these so-called "non-testable programs". In metamorphic testing, existing test case input is modified to produce new test cases in such a manner that, when given the new input, the function should produce an output that can easily be computed based on the original output. That is, if input x produces output f(x), then we create input x' such that we can predict f(x') based on f(x); if the application does not produce the expected output, then a defect must exist, and either f(x) or f(x') (or both) is wrong.
Previously we have presented an approach called "Automated Metamorphic System Testing", in which metamorphic testing is conducted automatically as the program executes. In the approach, metamorphic properties of the entire application are specified, and then checked after execution is complete. Here, we improve upon that work by presenting a technique in which the metamorphic properties of individual functions are used, allowing for the specification of more complex properties and enabling finer-grained runtime checking. Our goal is to demonstrate that such an approach will be more effective than one based on specifying metamorphic properties at the system level, and is also feasible for use in the deployment environment.
This technique, called Metamorphic Runtime Checking, is a system testing approach in which the metamorphic properties of individual functions are automatically checked during the program's execution. The tester is able to easily specify the functions' properties so that metamorphic testing can be conducted in a running application, allowing the tests to execute using real input data and in the context of real system states, without affecting those states. We also describe an implementation framework called Columbus, and present the results of empirical studies that demonstrate that checking the metamorphic properties of individual functions increases the effectiveness of the approach in detecting defects, with minimal performance impact.
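A minimal sketch of function-level metamorphic runtime checking (illustrative only; Columbus's actual mechanism is more sophisticated and avoids perturbing application state): a wrapper intercepts each real call, runs a follow-up call on a transformed input, and checks the expected relation between the two outputs.

```python
import functools
import math

def metamorphic(transform_input, check_outputs):
    """Wrap a single-argument function so that each real call also runs a
    follow-up call on a transformed input and verifies the expected
    relation between the two outputs."""
    def deco(f):
        @functools.wraps(f)
        def wrapper(x):
            out = f(x)
            follow = f(transform_input(x))
            if not check_outputs(out, follow):
                raise AssertionError(
                    f"metamorphic property violated for {f.__name__}({x!r})")
            return out
        return wrapper
    return deco

# Property of sine: sin(x) == sin(pi - x), checked on every invocation.
@metamorphic(lambda x: math.pi - x, lambda a, b: abs(a - b) < 1e-9)
def sine(x):
    return math.sin(x)

sine(0.7)   # passes silently; a buggy implementation would raise
```

The property acts as a built-in oracle: no expected output is needed for any particular input, only the relation between paired outputs.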

Credit cards have many important benefits; however, these same benefits often carry with them many privacy concerns. In particular, the need for users to be able to monitor their own transactions, as well as a bank's need to justify its payment requests from cardholders, entitles the bank to maintain a detailed log of all transactions its credit card customers were involved in. A bank can thus build a profile of each cardholder even without the latter's consent. In this technical report, we present a practical and accountable anonymous credit system based on e-cash, with a privacy-preserving mechanism for error correction and expense reporting.

As interactive voice response systems spread at a rapid pace, providing
an increasingly more complex functionality, it is becoming clear that
the challenges of such systems are not solely associated with their
synthesis and recognition capabilities. Rather, issues such as the
coordination of turn exchanges between system and user, or the correct
generation and understanding of words that may convey multiple meanings,
appear to play an important role in system usability. This thesis
explores those two issues in the Columbia Games Corpus, a collection
of spontaneous task-oriented dialogues in Standard American English.
We provide evidence of the existence of seven turn-yielding cues --
prosodic, acoustic and syntactic events strongly associated with
conversational turn endings -- and show that the likelihood of a
turn-taking attempt from the interlocutor increases linearly with the
number of cues conjointly displayed by the speaker. We present similar
results related to six backchannel-inviting cues -- events that invite
the interlocutor to produce a short utterance conveying continued
attention.
Additionally, we describe a series of studies of affirmative cue words
-- a family of cue words such as 'okay' or 'alright' that speakers use
frequently in conversation for several purposes: for acknowledging what
the interlocutor has said, or for cueing the start of a new topic,
among others. We find differences in the acoustic/prosodic realization
of such functions, but observe that contextual information figures
prominently in human disambiguation of these words. We also conduct
machine learning experiments to explore the automatic classification of
affirmative cue words. Finally, we examine a novel measure of speaker
entrainment related to the usage of these words, showing its association
with task success and dialogue coordination.

Metamorphic testing has been shown to be a simple yet effective technique in addressing the quality assurance of applications that do not have test oracles, i.e., for which it is difficult or impossible to know what the correct output should be for arbitrary input. In metamorphic testing, existing test case input is modified to produce new test cases in such a manner that, when given the new input, the application should produce an output that can easily be computed based on the original output. That is, if input x produces output f(x), then we create input x' such that we can predict f(x') based on f(x); if the application does not produce the expected output, then a defect must exist, and either f(x) or f(x') (or both) is wrong.
In practice, however, metamorphic testing can be a manually intensive technique for all but the simplest cases. The transformation of input data can be laborious for large data sets, or practically impossible for input that is not in human-readable format. Similarly, comparing the outputs can be error-prone for large result sets, especially when slight variations in the results are not actually indicative of errors (i.e., are false positives), for instance when there is non-determinism in the application and multiple outputs can be considered correct.
In this paper, we present an approach called Automated Metamorphic System Testing. This involves the automation of metamorphic testing at the system level by checking that the metamorphic properties of the entire application hold after its execution. The tester is able to easily set up and conduct metamorphic tests with little manual intervention, and testing can continue in the field with minimal impact
on the user. Additionally, we present an approach called Heuristic Metamorphic Testing which seeks to reduce false positives and address some cases of non-determinism. We also describe an implementation framework called Amsterdam, and present the results of empirical studies in which we demonstrate the effectiveness of the technique on real-world programs without test oracles.
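A toy system-level sketch (illustrative, not Amsterdam itself): the application is treated as a black box, a follow-up run is made on transformed input, and the outputs are compared within a tolerance, a simple stand-in for the heuristic handling of non-determinism described above. The tolerance value and the stand-in program are invented.

```python
import random

def system_under_test(values):
    """Black-box stand-in for an application without an oracle: an
    averaging pipeline with a little injected non-determinism."""
    noise = random.uniform(-1e-6, 1e-6)
    return sum(values) / len(values) + noise

def metamorphic_system_test(program, test_input, tolerance=1e-3):
    """System-level metamorphic property: permuting the input must not
    change the output beyond a tolerance (heuristic comparison avoids
    false positives from benign non-determinism)."""
    original = program(test_input)
    shuffled = list(test_input)
    random.shuffle(shuffled)
    follow_up = program(shuffled)
    return abs(original - follow_up) <= tolerance

assert metamorphic_system_test(system_under_test, [3.0, 1.0, 4.0, 1.0, 5.0])
```

An exact equality check here would intermittently fail on the injected noise; the tolerance turns those spurious failures into passes while still catching real defects that shift the output materially.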

The PRET philosophy proposes that temporal characteristics be made
predictable. For various applications, however, the PRET processor
will have to interact with a non-predictable environment. In this
paper we consider an example of one such environment, a
MultiMediaCard (MMC), and illustrate a method to make the response
of the MMC predictable.

Many applications in the field of scientific computing - such as computational biology, computational linguistics, and others - depend on Machine Learning algorithms to provide important core functionality to support solutions in the particular problem domains. However, it is difficult to test such applications because often there is no "test oracle" to indicate what the correct output should be for arbitrary input. To help address the quality of scientific computing software, in this paper we present a technique for testing the implementations of machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called
"metamorphic testing", which has been shown to be effective in such cases. In addition to presenting our technique, we describe a case study we performed on a real-world machine learning application framework, and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also discuss how our findings can be of use to other areas of computational science and engineering.
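A concrete flavor of a metamorphic property for a classifier (an illustrative example in the spirit of the technique, not a property from the case study): for a k-nearest-neighbor classifier, permuting the training set must not change the prediction. The tiny 1-D classifier below is written from scratch so the check is self-contained.

```python
def knn_predict(train, query, k=3):
    """Minimal 1-D k-nearest-neighbor classifier.
    train is a list of (feature, label) pairs."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    labels = [lbl for _, lbl in nearest]
    return max(set(labels), key=labels.count)

train = [(1.0, "a"), (1.2, "a"), (0.9, "a"),
         (5.0, "b"), (5.3, "b"), (4.8, "b")]
query = 1.1

# Metamorphic property: the order of training examples is irrelevant,
# so reversing (or otherwise permuting) the set must not change the output.
assert knn_predict(train, query) == knn_predict(list(reversed(train)), query)
```

No oracle is needed to state what the "correct" label for the query is; the test only demands consistency across the permuted runs, which is exactly what makes the approach suitable for oracle-free scientific software.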

Quantifying the efficacy of self-healing systems is a challenging
but important task, which has implications for increasing
designer, operator and end-user confidence in
these systems. During design system architects benefit from
tools and techniques that enhance their understanding of
the system, allowing them to reason about the tradeoffs
of proposed or existing self-healing mechanisms and the
overall effectiveness of the system as a result of different
mechanism-compositions. At deployment time, system integrators
and operators need to understand how the selfhealing
mechanisms work and how their operation impacts
the system’s reliability, availability and serviceability (RAS)
in order to cope with any limitations of these mechanisms
when the system is placed into production.
In this paper we construct an evaluation framework for selfhealing
systems around simple, yet powerful, probabilistic
models that capture the behavior of the system’s selfhealing
mechanisms from multiple perspectives (designer,
operator, and end-user). We combine these analytical models
with runtime fault-injection to study the operation of
VM-Rejuv – a virtual machine based rejuvenation scheme
for web-application servers. We use the results from the
fault-injection experiments and model-analysis to reason
about the efficacy of VM-Rejuv, its limitations and strategies
for managing/mitigating these limitations in systemdeployments.
Whereas we use VM-Rejuv as the subject of
our evaluation in this paper, our main contribution is a
practical evaluation approach that can be generalized to
other self-healing systems.
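As a deliberately simplified taste of the kind of probabilistic RAS reasoning described above (the paper's models are richer, and the numbers below are invented), steady-state availability can be computed from mean time to failure and mean time to repair, letting one compare a deployment with and without an automatic rejuvenation mechanism:

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability from mean time to failure and repair."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical numbers: rejuvenation does not prevent failures here, but it
# shortens recovery from a manual restart (1 h) to an automatic VM failover.
without_rejuv = availability(mttf_hours=720.0, mttr_hours=1.0)
with_rejuv    = availability(mttf_hours=720.0, mttr_hours=0.01)
```

Even this two-parameter model exposes the tradeoff the framework quantifies: a mechanism that only shrinks repair time can still move a system across an availability target, which is the kind of conclusion the fault-injection experiments are used to validate.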

A Case Study in Distributed Deployment of Embedded Software for Camera Networks

Francesco Leonardi, Alessandro Pinto, Luca P. Carloni

2009-01-15

We present an embedded software application for the real-time estimation of building occupancy using a network of video cameras. We analyze a series of alternative decompositions of the main application tasks and profile each of them by running the corresponding embedded software on three different processors. Based on the profiling measures, we build various alternative embedded platforms by combining different embedded processors, memory modules and network interfaces.
In particular, we consider the choice of two possible network technologies: ARCnet and Ethernet. After deriving an analytical model of the network costs, we use it to complete an exploration of the design space as we scale the number of video cameras in a hypothetical building. We compare our results with those obtained for two real buildings with different characteristics. We conclude by discussing the results of our case study in the broader context of other camera-network applications.

Managing many computers is difficult. Recent virtualization trends exacerbate this problem by making it easy to create and deploy multiple virtual appliances per physical machine, each of which can be configured with different applications and utilities. This results in a huge scaling problem for large organizations as management overhead grows linearly with the number of appliances.
To address this problem, we present Strata, a system that introduces the Virtual Layered File System (VLFS) and integrates it with virtual appliances to simplify system management. Unlike a traditional file system, which is a monolithic entity, a VLFS is a collection of individual software layers composed together to provide the traditional file system view. Individual layers are maintained in a central repository and shared across all VLFSs that use them. Layer changes and upgrades only need to be done once in the repository and are then automatically propagated to all VLFSs, resulting in management overhead independent of the number of virtual appliances. We have implemented a Strata Linux prototype without any application or operating system kernel changes. Using this prototype, we demonstrate how Strata enables fast system provisioning, simplifies system maintenance and upgrades, speeds system recovery from security exploits, and incurs only modest performance overhead.
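The layer-composition idea can be sketched in a few lines (a conceptual analogy, not Strata's implementation): each layer maps paths to content, a VLFS is an ordered composition in which later layers shadow earlier ones, and because layers are shared by reference, an upgrade in the repository is visible to every VLFS that uses the layer.

```python
from collections import ChainMap

# Each layer maps file path -> content; shared layers live in a repository.
base_os   = {"/bin/sh": "shell v1", "/etc/hosts": "localhost"}
webserver = {"/usr/sbin/httpd": "httpd v2"}
site_conf = {"/etc/hosts": "localhost web01"}   # overrides the base layer

# A VLFS composes layers; leftmost (topmost) layers shadow the ones below.
vlfs = ChainMap(site_conf, webserver, base_os)
assert vlfs["/etc/hosts"] == "localhost web01"

# Upgrading a shared layer once in the repository propagates to every
# VLFS composed from it -- no per-appliance maintenance.
base_os["/bin/sh"] = "shell v2"
assert vlfs["/bin/sh"] == "shell v2"
```

This is why management overhead stays independent of the number of appliances: the upgrade touches one repository copy, not N appliance file systems.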

The emergence of world-wide standards for video compression has created a demand for design tools and simulation resources to support algorithm research and new product development. Because of the need for subjective study in the design of video compression algorithms it is essential that flexible yet computationally efficient tools be developed.
For this project, we plan to implement an MPEG standard using the SHIM programming language. SHIM is a software/hardware integration language whose aim is to provide communication between hardware and software while providing deterministic concurrency.
The focus of this project will be to emphasize the efficiency of the SHIM language in embedded applications as compared to other existing implementations.

Using Metamorphic Testing at Runtime to Detect Defects in Applications without Test Oracles

Christian Murphy

2008-12-22

First, we will present an approach called Automated Metamorphic System Testing. This will involve automating system-level metamorphic testing by treating the application as a black box and checking that the metamorphic properties of the entire application hold after execution. This will allow for metamorphic testing to be conducted in the production environment without affecting the user, and will not require the tester to have access to the source code. The tests do not require an oracle upon their creation; rather, the metamorphic properties act as built-in test oracles. We will also introduce an implementation framework called Amsterdam.
Second, we will present a new type of testing called Metamorphic Runtime Checking. This involves the execution of metamorphic tests from within the application, i.e., the application launches its own tests, within its current context. The tests execute within the application’s current state, and in particular check a function’s metamorphic properties. We will also present a system called Columbus that supports the execution of the Metamorphic Runtime Checking from within the context of the running application. Like Amsterdam, it will conduct the tests with acceptable performance overhead, and will ensure that the execution of the tests does not affect the state of the original application process from the users’ perspective; however, the implementation of Columbus will be more challenging in that it will require more sophisticated mechanisms for conducting the tests without pre-empting the rest of the application, and for comparing the results
which may conceivably be in different processes or environments.
Third, we will describe a set of metamorphic testing guidelines that can be followed to assist in the formulation and specification of metamorphic properties that can be used with the above approaches. These will categorize the different types of properties exhibited by many applications in the domain of machine learning and data mining in particular (as a result of the types of applications we will investigate), but we will demonstrate that they are also generalizable to other domains as well. This set of guidelines will also correlate to the different types of defects that we expect the approaches will be able to find.

Static Deadlock Detection in SHIM with an Automata Type Checking System

Dave Aaron Smith, Nalini Vasudevan, Stephen Edwards

2008-12-21

With the advent of multicores, concurrent programming languages are becoming more prevalent. Data races and deadlocks are two major problems with concurrent programs. SHIM is a concurrent programming language that guarantees the absence of data races through its semantics. However, a program written in SHIM can deadlock if not carefully written.
In this paper, we present a divide-and-merge technique to statically detect deadlocks in SHIM. SHIM is asynchronous, but we can greatly reduce its state space without losing precision because of its semantics.

The SHIM compiler for the IBM Cell processor generates distinct code for the two processing units, the PPE (Power Processor Element) and the SPE (Synergistic Processor Element). The SPE is specialized to give high throughput for computation-intensive applications operating on dense data. We propose a mechanism to tune the code generated by the SHIM compiler to enable optimizing compilers to generate structured code.
Although the discussion here relates to optimizing SHIM IR (Intermediate Representation) code, the techniques discussed can be incorporated into compilers to convert unstructured loops consisting of goto statements into structured loops such as while and do-while statements, easing back-end compiler optimizations.
Our research SHIM compiler takes code written in the SHIM language, performs various static analyses, and finally transforms it into C code. This generated code is compiled to binary using standard compilers available for the IBM Cell processor, such as GCC and the IBM XL compiler.

This technical report provides an introduction on how to compile and run uClinux and third-party programs on a Nios II CPU core instantiated within the FPGA on the Altera DE2 board. It is based on experiences working with the OS and development board while teaching the Embedded Systems course during the springs of 2007 and 2008.

In a processor design, the premier issues with memory are (1) main
memory allocation and (2) interprocess communication; these two
chiefly affect the performance of the memory system. The goal of
this paper is to formulate a deterministic model for the memory
system of PRET, taking into account all the intertwined parallelism
of modern memory chips. Studying existing memory models is necessary
to understand the implications of these factors in realizing a
perfectly time-predictable memory system.

Clocks are a mechanism for providing synchronization barriers in
concurrent programming languages. They are usually implemented using
primitive communication mechanisms and thus spare the programmer from
reasoning about low-level implementation details such as remote
procedure calls and error conditions.
Clocks provide flexibility, but programs often use them in specific
ways that do not require their full implementation. In this paper, we
describe a tool that mitigates the overhead of general-purpose clocks
by statically analyzing how programs use them and choosing optimized
implementations when available.
We tackle the clock implementation in the standard library of the X10
programming language---a parallel, distributed object-oriented
language. We report our findings for a small set of analyses and
benchmarks. Our tool only adds a few seconds to analysis time, making
it practical to use as part of a compilation chain.

Classifying High-Dimensional Text and Web Data using Very Short Patterns

Hassan Malik, John Kender

2008-12-17

In this paper, we propose the "Democratic Classifier", a simple, democracy-inspired pattern-based classification algorithm that uses very short patterns for classification and does not rely on a minimum support threshold. Borrowing ideas from democracy, our training phase allows each training instance to vote for an equal number of candidate size-2 patterns. Similar to the usual democratic election process, where voters select candidates by considering their qualifications, their prior contributions at the constituency and territory levels, and their own perceptions of the candidates, the training instances select patterns by effectively balancing the local, class, and global significance of patterns. In addition, we respect "each voter's opinion" by simultaneously adding shared patterns to all applicable classes, and then apply a novel power-law-based weighting scheme instead of making binary decisions on these patterns.
Results of experiments performed on 121 common text and web datasets show that our algorithm almost always outperforms state-of-the-art classification algorithms, without requiring any dataset-specific parameter tuning. On 100 real-life, noisy web datasets, the average absolute classification accuracy improvement was as great as 10% over SVM, Harmony, C4.5, and kNN. In addition, our algorithm ran about 3.5 times faster than the fastest existing pattern-based classification algorithm.
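The training-phase voting can be sketched as follows; the pattern selection, the fixed vote budget, and the uniform vote weights below are illustrative simplifications of the balancing and power-law weighting scheme described above, not the authors' exact formulation.

```python
from itertools import combinations
from collections import defaultdict

def train(instances, votes_per_instance=3):
    """Each training instance casts an equal number of votes for
    candidate size-2 term patterns; votes accumulate per (pattern, class)."""
    weights = defaultdict(float)
    for terms, label in instances:
        pairs = list(combinations(sorted(set(terms)), 2))[:votes_per_instance]
        for p in pairs:
            weights[(p, label)] += 1.0 / max(len(pairs), 1)
    return weights

def classify(weights, terms):
    """Score each class by the total weight of its patterns present in the document."""
    present = set(combinations(sorted(set(terms)), 2))
    scores = defaultdict(float)
    for (pattern, label), w in weights.items():
        if pattern in present:
            scores[label] += w
    return max(scores, key=scores.get) if scores else None

docs = [(["cheap", "pills", "now"], "spam"),
        (["meeting", "agenda", "now"], "ham")]
model = train(docs)
print(classify(model, ["cheap", "pills"]))  # → spam
```

Note that no minimum support threshold appears anywhere: every instance gets the same vote budget regardless of how frequent its patterns are.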

Model checking the state space (all possible behaviors) of software systems is a promising technique for verification and validation. Bugs such as security vulnerabilities, file storage issues, deadlocks, and data races can occur anywhere in the state space and are often triggered by corner cases; it is therefore important to explore and model check all runtime choices. However, large and complex software systems generate huge numbers of behaviors, leading to "state explosion". eXplode is a lightweight, deterministic, depth-bound model checker that explores all dynamic choices at runtime. Given an application-specific test harness, eXplode performs its state search in a serialized fashion, which limits its scalability and performance. This paper proposes a distributed eXplode engine that uses multiple host machines concurrently to achieve more state-space coverage in less time, helping to scale up the software verification and validation effort. Test results show that Distributed eXplode runs several times faster and covers more state space than the standalone eXplode.

We propose the concept of a generalized assorted pixel (GAP) camera, which enables the user to
capture a single image of a scene and, after the fact, control the trade-off between spatial resolution,
dynamic range and spectral detail. The GAP camera uses a complex array (or mosaic) of color filters.
A major problem with using such an array is that the captured image is severely under-sampled for at
least some of the filter types. This leads to reconstructed images with strong aliasing. We make three
contributions in this paper: (a) We present a comprehensive optimization method to arrive at the spatial
and spectral layout of the color filter array of a GAP camera. (b) We develop a novel anti-aliasing
algorithm for reconstructing the under-sampled channels of the image with minimal aliasing. (c) We
demonstrate how the user can capture a single image and then control the trade-off of spatial resolution
to generate a variety of images, including monochrome, high dynamic range (HDR) monochrome, RGB,
HDR RGB, and multispectral images. Finally, the performance of our GAP camera has been verified using extensive simulations that use multispectral images of real world scenes. A large database of these multispectral images is being made publicly available for use by the research community.

Measurements of Multicast Service Discovery in a Campus Wireless Network

Se Gi Hong, Suman Srinivasan, Henning Schulzrinne

2008-11-14

Applications utilizing multicast service discovery protocols, such as iTunes, have become increasingly popular. However, multicast service discovery protocols are considered to generate network traffic overhead, especially in wireless networks. It is therefore important to evaluate the traffic and overhead caused by multicast service discovery packets in real-world networks. We measure and analyze the traffic of one of the most widely deployed multicast service discovery protocols, multicast DNS (mDNS) service discovery, in a campus wireless network that forms a single multicast domain with a large number of users. We also analyze different service discovery models in terms of packet overhead and service discovery delay under different network sizes and churn rates. Our measurements show that mDNS traffic consumes about 13 percent of the total bandwidth.

As machine learning (ML) applications become prevalent in various aspects of everyday life, their dependability takes on
increasing importance. It is challenging to test such applications, however, because they are intended to learn properties of data sets
where the correct answers are not already known. Our work is not concerned with testing how well an ML algorithm learns, but rather
seeks to ensure that an application using the algorithm implements the specification correctly and fulfills the users' expectations. These
are critical to ensuring the application's dependability. This paper presents three approaches to testing these types of applications.
In the first, we create a set of limited test cases for which it is, in fact, possible to predict what the correct output should be. In
the second approach, we use random testing to generate large data sets according to parameterization based on the application’s
equivalence classes. Our third approach is based on metamorphic testing, in which properties of the application are exploited to define
transformation functions on the input, such that the new output can easily be predicted based on the original output. Here we discuss
these approaches, and our findings from testing the dependability of three real-world ML applications.

Currently deployed IEEE 802.11 WLANs (Wi-Fi networks) share access point (AP) bandwidth on a per-packet basis. However, the various stations communicating with the AP often have different signal qualities, resulting in different transmission rates. This induces a phenomenon known as the rate anomaly problem, in which stations with lower signal quality transmit at lower rates and consume a significant majority of airtime, thereby dramatically reducing the throughput of stations transmitting at high rates. We propose a practical, deployable system, called SoftRepeater, in which stations cooperatively address the rate anomaly problem. Specifically, higher-rate Wi-Fi stations opportunistically transform themselves into repeaters for stations with low data rates when transmitting to/from the AP. The key challenge is to determine when it is beneficial to enable the repeater functionality. In this paper, we propose an initiation protocol that ensures that repeater functionality is enabled only when appropriate. Our system can run directly on top of today's 802.11 infrastructure networks. We also describe a novel, zero-overhead network coding scheme that further alleviates undesirable symptoms of the rate anomaly problem. We evaluate our system using simulation and a testbed implementation, and find that SoftRepeater can improve cumulative throughput by up to 200%.

Renewed interest in developing computing systems that meet additional non-functional requirements such as reliability, high availability and ease-of-management/self-management (serviceability) has fueled research into developing systems that exhibit enhanced reliability,
availability and serviceability (RAS) capabilities. This research focus on enhancing the RAS capabilities of computing systems impacts not only the legacy/existing systems we have today, but also has implications for the design and development of next generation (self-
managing/self-*) systems, which are expected to meet these non-functional requirements with minimal human intervention.
To reason about the RAS capabilities of the systems of today or the self-* systems of tomorrow, there are three evaluation-related challenges to address. First, developing (or identifying) practical fault-injection tools that can be used to study the failure behavior of
computing systems and exercise any (remediation) mechanisms the system has available for mitigating or resolving problems. Second, identifying techniques that can be used to quantify RAS deficiencies in computing systems and reason about the efficacy of individual or combined RAS-enhancing mechanisms (at design-time or after system deployment).
Third, developing an evaluation methodology that can be used to objectively compare systems based on the (expected or actual) benefits of RAS-enhancing mechanisms.
This thesis addresses these three challenges by introducing the 7U Evaluation Methodology, a complementary approach to traditional performance-centric evaluations that identifies criteria for comparing and analyzing existing (or yet-to-be-added) RAS-enhancing mechanisms,
is able to evaluate and reason about combinations of mechanisms, exposes under-performing mechanisms and highlights the lack of mechanisms in a rigorous, objective and quantitative
manner.
The development of the 7U Evaluation Methodology is based on the following three hypotheses. First, that runtime adaptation provides a platform for implementing efficient and flexible fault-injection tools capable of in-situ and in-vivo interactions with computing systems. Second, that mathematical models such as Markov chains, Markov reward networks and Control theory models can successfully be used to create simple, reusable templates for describing specific failure scenarios and scoring the system’s responses, i.e., studying the failure-behavior of systems, and the various facets of its remediation mechanisms and
their impact on system operation. Third, that combining practical fault-injection tools with mathematical modeling techniques based on Markov Chains, Markov Reward Networks and Control Theory can be used to develop a benchmarking methodology for evaluating and comparing the reliability, availability and serviceability (RAS) characteristics of computing systems.
This thesis demonstrates how the 7U Evaluation Method can be used to evaluate the RAS capabilities of real-world computing systems and in so doing makes three contributions. First, a suite of runtime fault-injection tools (Kheiron tools) able to work in a variety
of execution environments is developed. Second, analytical tools that can be used to construct mathematical models (RAS models) to evaluate and quantify RAS capabilities using appropriate metrics are discussed. Finally, the results and insights gained from conducting fault-injection experiments on real-world systems and modeling the system
responses (or lack thereof) using RAS models are presented. In conducting 7U Evaluations of real-world systems, this thesis highlights the similarities and differences between traditional performance-oriented evaluations and RAS-oriented evaluations and outlines a general
framework for conducting RAS evaluations.

Quality Assurance of Software Applications using the In Vivo Testing Approach

Christian Murphy, Gail Kaiser, Ian Vo, Matt Chu

2008-10-02

Software products released into the field typically have
some number of residual defects that either were not detected
or could not have been detected during testing. This
may be the result of flaws in the test cases themselves, incorrect
assumptions made during the creation of test cases,
or the infeasibility of testing the sheer number of possible
configurations for a complex system; these defects may also
be due to application states that were not considered during
lab testing, or corrupted states that could arise due to
a security violation. One approach to this problem is to
continue to test these applications even after deployment,
in hopes of finding any remaining flaws. In this paper, we
present a testing methodology we call in vivo testing, in
which tests are continuously executed in the deployment
environment. We also describe a type of test we call in
vivo tests that are specifically designed for use with such
an approach: these tests execute within the current state of
the program (rather than by creating a clean slate) without
affecting or altering that state from the perspective of the
end-user. We discuss the approach and the prototype testing
framework for Java applications called Invite. We also
provide the results of case studies that demonstrate Invite’s
effectiveness and efficiency.

It is challenging to test applications and functions for which the correct output for arbitrary input cannot be known in advance, e.g., some computational science or machine learning applications. In the absence of a test oracle, one approach to testing these applications is to use metamorphic testing: existing test case input is modified to produce new test cases in such a manner that, when given the new input, the application should produce an output that can easily be computed based on the original output. That is, if input x produces output f(x), then we create input x' such that we can predict f(x') based on f(x); if the application or function does not produce the expected output, then a defect must exist, and either f(x) or f(x') (or both) is wrong. By using metamorphic testing, we are able to provide built-in "pseudo-oracles" for these so-called "nontestable programs" that have no test oracles. In this paper, we describe an approach in which a function's
metamorphic properties are specified using an extension to the Java Modeling Language (JML), a behavioral interface specification language that is used to support the "design by contract" paradigm in Java applications. Our implementation, called Corduroy, pre-processes these specifications and generates test code that can be executed using JML runtime assertion checking, for ensuring that the specifications hold during program execution. In addition to presenting our approach and implementation, we also describe our findings from case studies in which we apply our technique to applications without test oracles.
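Independent of the JML/Corduroy machinery, a metamorphic property reduces to a runtime check of f(x') against f(x). As an illustration (the function and its properties below are our own toy example, not one from the paper), consider the arithmetic mean, which must be invariant under permutation of its input and must scale linearly with it:

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def check_metamorphic(xs, k=3.0, tol=1e-9):
    """Runtime assertions on two metamorphic properties of mean():
    permutation invariance and scaling.  A violation implies a defect,
    even though the 'correct' mean of arbitrary input is unknown."""
    original = mean(xs)
    shuffled = xs[:]
    random.shuffle(shuffled)
    assert abs(mean(shuffled) - original) < tol              # f(perm(x)) == f(x)
    assert abs(mean([k * x for x in xs]) - k * original) < tol  # f(k*x) == k*f(x)
    return True

print(check_metamorphic([1.0, 2.0, 3.0, 4.0]))  # → True
```

In the approach described above, such properties would be written as specifications rather than hand-coded assertions, with the checking code generated automatically.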

Extending VoIP beyond Internet telephony, we present a case study of applying the technology outside its intended domain to solve a real-world problem. This work is an attempt to understand an analog, hardwired communication system of the U.S. Federal Aviation Administration (FAA) and to effectively translate it into a generic, standards-based VoIP system that runs on the FAA's existing data network. We develop insights into air traffic training and weigh the design choices for building a soft real-time data communication system. We also share our real-world deployment and maintenance experiences: the FAA Academy has been successfully using this VoIP system in five training rooms since 2006 to train the future air traffic controllers of the U.S. and the world.

We are facing the exhaustion of newly assignable IPv4 addresses. Unfortunately, IPv6 is not yet deployed widely enough to fully replace IPv4, and it is unrealistic to expect that this will change before we run out of IPv4 addresses. Letting hosts seamlessly communicate in an IPv4 world without assigning a unique, globally routable IPv4 address to each of them is a challenging problem for which many solutions have been proposed. Some prominent ones rely on carrier-grade NATs (CGNs), which we feel is a bad idea. Instead, we propose using specialized NATs at the edge that treat some of the port-number bits as part of the address.
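A minimal sketch of this addressing idea (the bit layout and parameter k below are illustrative assumptions, not the proposal's exact scheme): if the top k bits of the 16-bit port number identify the host, then 2^k hosts can share one public IPv4 address, each confined to a disjoint port range that the edge NAT uses for routing.

```python
def port_range(host_index, k=4):
    """With k high-order port bits identifying the host, each of the
    2**k hosts sharing one public IPv4 address gets a disjoint slice
    of the 65536-port space."""
    slice_size = 65536 >> k            # ports per host
    lo = host_index * slice_size
    return lo, lo + slice_size - 1

def host_for_port(port, k=4):
    """The edge NAT routes an inbound packet by reading the top k port bits."""
    return port >> (16 - k)

print(port_range(5))         # → (20480, 24575)
print(host_for_port(21000))  # → 5
```

The trade-off is that each host sees a smaller usable port space, which is exactly the cost the proposal accepts in exchange for avoiding CGNs.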

Spectrogram: A Mixture-of-Markov-Chains Model for Anomaly Detection in Web Traffic

Yingbo Song, Angelos D. Keromytis, Salvatore J. Stolfo

2008-09-15

We present Spectrogram, a mixture-of-Markov-chains sensor for anomaly detection (AD) against web-layer (port 80) code-injection attacks such as PHP file inclusion, SQL injection, and cross-site scripting, as well as memory-layer buffer overflows. Port 80 is the gateway to many application-level services, and a large array of attacks are channeled through this vector; servers cannot easily firewall this port. Signature-based sensors are effective in filtering known exploits but cannot detect 0-day vulnerabilities or deal with polymorphism, and statistical AD approaches have mostly been limited to network-layer, protocol-agnostic modeling, weakening their effectiveness. N-gram-based modeling approaches have recently demonstrated success, but the ill-posed nature of modeling large grams has thus far prevented exploration of higher-order statistical models. In this paper, we provide a solution to this problem based on a factorization into Markov chains, aiming to model higher-order structure as well as the content of web requests. Spectrogram is implemented in a protocol-aware, passive, network-situated, but CGI-layered AD architecture, and our evaluation shows that this model achieves significant detection results on an array of real-world web-layer attacks, with at least 97% detection rates on all but one dataset, comparing favorably against other AD sensors.
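As a feel for the building block behind the factorization, even a single first-order character-level Markov chain can separate well-formed requests from injected code by likelihood; the smoothing, alphabet size, and toy data below are illustrative assumptions (Spectrogram itself learns a mixture of such chains capturing higher-order n-gram structure).

```python
from collections import defaultdict
import math

def train_chain(samples):
    """Count first-order character transitions over normal requests."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in samples:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1.0
    return counts

def avg_log_likelihood(counts, s, alpha=1.0, vocab=256):
    """Length-normalized log-likelihood with add-alpha smoothing."""
    ll = 0.0
    for a, b in zip(s, s[1:]):
        row = counts[a]
        total = sum(row.values()) + alpha * vocab
        ll += math.log((row[b] + alpha) / total)
    return ll / max(len(s) - 1, 1)

normal = ["GET /index.html HTTP/1.1", "GET /images/logo.png HTTP/1.1"]
chain = train_chain(normal)
benign = avg_log_likelihood(chain, "GET /about.html HTTP/1.1")
attack = avg_log_likelihood(chain, "id=1' UNION SELECT passwd--")
print(benign > attack)  # benign requests score higher under the learned chain
```

A threshold on this score turns the likelihood into an anomaly decision; the real sensor operates on parsed CGI-layer content rather than raw request strings.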

Retina: Helping Students and Instructors Based on Observed Programming Activities

Christian Murphy, Gail Kaiser, Kristin Loveland, Sahar Hasan

2008-08-28

It is difficult for instructors of CS1 and CS2 courses to get accurate answers to such critical questions as "how long are students spending on programming assignments?" or "what sorts of errors are they making?" At the same time, students often have no idea of where they stand with respect to the rest of the class in terms of time spent on an assignment or the number or types of errors that they encounter. In this paper, we present a tool called Retina, which collects information about students' programming activities, and then provides useful and informative reports to both students and instructors based on the aggregation of that data. Retina can also make real-time recommendations to students, in order to help them quickly address some of the errors they make. In addition to describing Retina and its features, we also present some of our initial findings during two trials of the tool in a real classroom setting.

We present a novel, practical, and effective mechanism for
identifying the IP address of Tor clients. We approximate
an almost-global passive adversary (GPA) capable of eavesdropping
anywhere in the network by using LinkWidth, a novel bandwidth-estimation technique. LinkWidth allows network edge-attached entities to estimate the available bandwidth in an arbitrary Internet link without a cooperating peer host, router, or ISP. By modulating the bandwidth of an anonymous connection (e.g., when the destination server
or its router is under our control), we can observe these fluctuations
as they propagate through the Tor network and the Internet to the end-user’s IP address. Our technique exploits one of the design criteria for Tor (trading off GPA-resistance for improved latency/bandwidth over MIXes) by allowing well-provisioned (in terms of bandwidth) adversaries to effectively become GPAs.
Although timing-based attacks have been demonstrated
against non-timing-preserving anonymity networks, they have
depended either on a global passive adversary or on the compromise
of a substantial number of Tor nodes. Our technique
does not require compromise of any Tor nodes or collaboration
of the end-server (for some scenarios). We demonstrate
the effectiveness of our approach in tracking the IP address
of Tor users in a series of experiments. Even for an underprovisioned
adversary with only two network vantage points, we can identify the end user (IP address) in many cases.

Operating system upgrades and patches sometimes
break applications that worked fine on the older version.
We present an autonomic approach to testing of OS updates
while minimizing downtime, usable without local regression
suites or IT expertise. Deux utilizes a dual-layer virtual
machine architecture, with lightweight application process
checkpoint and resume across OS versions, enabling simultaneous
execution of the same applications on both OS versions
in different VMs. Inputs provided by ordinary users
to the production old version are also fed to the new version.
The old OS acts as a pseudo-oracle for the update,
and application state is automatically re-cloned to continue
testing after any output discrepancies (intercepted at system
call level) - all transparently to users. If all differences
are deemed inconsequential, then the VM roles are
switched with the application state already in place. Our
empirical evaluation with both LAMP and standalone applications
demonstrates Deux’s efficiency and effectiveness.

The regulation of gene expression plays a central role in the development and function of a living cell. A complex network of interacting regulatory proteins bind specific sequence elements in the genome to control the amount and timing of gene expression. The abundance of genome-scale datasets from different organisms provides an opportunity to accelerate our understanding of the mechanisms of gene regulation. Developing computational tools to infer gene regulation programs from high-throughput genomic data is one of the central problems in computational biology.
In this thesis, we present a new predictive modeling framework for studying gene regulation. We formulate the problem of learning regulatory programs as a binary classification task: to accurately predict the condition-specific activation (up-regulation) and repression (down-regulation) of gene expression. The gene expression response is measured by microarray expression data. Genes are represented by various genomic regulatory sequence features. Experimental conditions are represented by the gene expression levels of various regulatory proteins. We use this combination of features to learn a prediction function for the regulatory response of genes under different experimental conditions. The core computational approach is based on boosting. Boosting algorithms allow us to learn high-accuracy, large-margin classifiers and avoid overfitting. We describe three applications of our framework to study gene regulation:
- In the GeneClass algorithm, we use a compendium of known transcription factor binding sites and gene expression data to learn a global context-specific regulation program that accurately predicts differential expression. GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. We introduce a novel robust variant of boosting that improves stability and biological interpretability in the presence of correlated features. We also show how to incorporate genome-wide protein-DNA binding data from ChIP-chip experiments into the framework.
- In several organisms, the DNA binding sites of many transcription factors are unknown. Hence, automatic discovery of regulatory sequence motifs is required. In the MEDUSA algorithm, we integrate raw promoter sequence data and gene expression data to simultaneously discover cis regulatory motifs ab initio and learn predictive regulatory programs. MEDUSA automatically learns probabilistic representations of motifs and their corresponding target genes. We show that we are able to accurately learn the binding sites of most known transcription factors in yeast.
- We also design new techniques for extracting biologically and statistically significant information from the learned regulatory models. We use a margin-based score to extract global condition-specific regulomes as well as cluster-specific and gene-specific regulation programs. We develop a post-processing framework for interpreting and visualizing biological information encapsulated in our models.
We show the utility of our framework in analyzing several interesting biological contexts (environmental stress responses, DNA-damage response and hypoxia-response) in the budding yeast Saccharomyces cerevisiae. We also show that our methods can learn regulatory programs and cis regulatory motifs in higher eukaryotes such as worms and humans. Several hypotheses generated by our methods are validated by our collaborators using biochemical experiments. Experimental results demonstrate that our framework is quantitatively and qualitatively predictive. We are able to achieve high prediction accuracy on test data and also generate specific, testable hypotheses.

Using Runtime Testing to Detect Defects in Applications without Test Oracles

Christian Murphy, Gail Kaiser

2008-08-07

It is typically infeasible to test a large, complex software system in all its possible configurations and system states prior to deployment. Moreover, some such applications have no test oracles to indicate their correctness. In this thesis, we address these problems in two ways. First, we suggest that executing tests within the context of an application running in the field can reveal defects that would not otherwise be found. Second, we believe that this approach can be further extended to applications for which there is no test oracle by using a variant of metamorphic testing at runtime.

Towards the Quality of Service for VoIP traffic in IEEE 802.11 Wireless Networks

Sangho Shin, Henning Schulzrinne

2008-07-09

The usage of voice over IP (VoIP) traffic in IEEE 802.11 wireless networks is expected to increase in the near future due to widely deployed 802.11 wireless networks and VoIP services on fixed lines. However, the quality of service (QoS) of VoIP traffic in wireless networks is still unsatisfactory. In this thesis, I identify several sources for the QoS problems of VoIP traffic in IEEE 802.11 wireless networks and propose solutions for these problems.
The QoS problems discussed can be divided into three categories, namely, user mobility, VoIP capacity, and call admission control. User mobility causes network disruptions during handoffs. In order to reduce the handoff time between Access Points (APs), I propose a new handoff algorithm, Selective Scanning and Caching, which finds available APs by scanning a minimum number of channels and furthermore allows clients to perform handoffs without scanning, by caching AP information. I also describe a new architecture for the client and server side for seamless IP layer handoffs, which are caused when mobile clients change the subnet due to layer 2 handoffs.
I also present two methods to improve VoIP capacity for 802.11 networks, Adaptive Priority Control (APC) and Dynamic Point Coordination Function (DPCF). APC is a new packet scheduling algorithm at the AP that improves capacity by balancing the uplink and downlink delay of VoIP traffic; DPCF uses a polling-based protocol and minimizes the bandwidth wasted on unnecessary polling by using a dynamic polling list. Additionally, I estimate the capacity for VoIP traffic in IEEE 802.11 wireless networks via theoretical analysis, simulations, and experiments in a wireless test-bed, and show how to avoid mistakes in the measurements and comparisons.
Finally, to protect the QoS of existing VoIP calls while maximizing channel utilization, I propose a novel admission control algorithm called QP-CAT (Queue size Prediction using Computation of Additional Transmission), which accurately predicts the impact of new voice calls by virtually transmitting the traffic of prospective new calls.

genSpace: Exploring Social Networking Metaphors for Knowledge Sharing and Scientific Collaborative Work

Christian Murphy, Swapneel Sheth, Gail Kaiser, Lauren Wilcox

2008-06-13

Many collaborative applications, especially in scientific
research, focus only on the sharing of tools or the sharing
of data. We seek to introduce an approach to scientific collaboration
that is based on knowledge sharing. We do this
by automatically building organizational memory and enabling
knowledge sharing by observing what users do with
a particular tool or set of tools in the domain, through the
addition of activity and usage monitoring facilities to standalone
applications. Once this knowledge has been gathered,
we apply social networking models to provide collaborative
features to users, such as suggestions on tools to use,
and automatically-generated sequences of actions based on
past usage amongst the members of a social network or
the entire community. In this work, we investigate social
networking models as an approach to scientific knowledge
sharing, and present an implementation called genSpace,
which is built as an extension to the geWorkbench platform
for computational biologists. Last, we discuss the approach
from the viewpoint of social software engineering.

A SIP server may be overloaded by emergency-induced call volume,
"American Idol" style flash crowd effects or denial of service
attacks. The SIP server overload problem is interesting especially
because the costs of serving or rejecting a SIP session can be
similar. For this reason, the built-in SIP
overload control mechanism based on generating rejection messages
cannot prevent the server from entering congestion collapse under
heavy load. The SIP overload problem calls for a pushback control
solution in which the potentially overloaded receiving server may
notify its upstream sending servers to have them send only the
amount of load within the receiving server's processing capacity.
The pushback framework can be achieved by SIP application layer rate-based
feedback or window-based feedback. The centerpiece of the
feedback mechanism is the algorithm used to generate load
regulation information. We propose three new window-based feedback
algorithms and evaluate them together with two existing rate-based
feedback algorithms. We compare the different algorithms in terms
of the number of tuning parameters and performance under both steady
and variable load. Furthermore, we identify two categories of
fairness requirements for SIP overload control, namely,
user-centric and provider-centric fairness. With the introduction
of a new double-feed SIP overload control architecture, we show
how the algorithms meet those fairness criteria.
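The window-based flavor of the feedback loop can be sketched as follows; the class and parameter names are hypothetical, and the paper's actual algorithms differ in how the receiving server computes the window it advertises.

```python
class WindowFeedbackSender:
    """Sketch of window-based SIP overload control: the upstream
    sending server may have at most `window` sessions outstanding at
    the receiving server, and the receiver advertises a fresh window
    with each response, shrinking it as its own load builds up."""
    def __init__(self, window):
        self.window = window
        self.in_flight = 0

    def try_send(self):
        """Forward a new session only if it fits in the current window."""
        if self.in_flight < self.window:
            self.in_flight += 1
            return True
        return False  # hold or reroute instead of overloading the receiver

    def on_response(self, advertised_window):
        """A response both completes a session and carries new feedback."""
        self.in_flight -= 1
        self.window = advertised_window  # receiver-driven regulation

s = WindowFeedbackSender(window=2)
print([s.try_send() for _ in range(3)])  # → [True, True, False]
s.on_response(advertised_window=1)
print(s.try_send())                      # → False (window shrank to 1, 1 still in flight)
```

Because load is refused at the sending server rather than rejected by the overloaded receiver, the receiver spends no processing capacity on generating rejection messages, which is precisely what makes pushback preferable to the built-in rejection-based mechanism.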

Developers of novel scientific computing systems are often eager to
make their algorithms and databases available for community use, but
their own computational resources may be inadequate to fulfill
external user demand -- yet the system's footprint is far too large
for prospective user organizations to download and run locally. Some
heavyweight systems have become part of designated "centers"
providing remote access to supercomputers and/or clusters supported by
substantial government funding; others use virtual supercomputers
dispersed across grids formed by massive numbers of volunteer
Internet-connected computers. But public funds are limited and not all
systems are amenable to huge-scale divisibility into independent
computation units. We have identified a class of scientific computing
systems where ``utility'' sub-jobs can be offloaded to any of several
alternative providers, thereby freeing up local cycles for the main
proprietary jobs, implemented a proof-of-concept framework enabling
such deployments, and analyzed its expected throughput and
response-time impact on a real-world bioinformatics system (Columbia's
PredictProtein) whose present users endure long wait queues.

The lack of fair bandwidth allocation in Peer-to-Peer systems causes
many performance problems, including users being disincentivized from
contributing upload bandwidth, free riders taking as much from the
system as possible while contributing as little as possible, and a
lack of quality-of-service guarantees to support streaming
applications. We present FairTorrent, a simple distributed scheduling
algorithm for Peer-to-Peer systems that fosters fair bandwidth
allocation among peers. For each peer, FairTorrent maintains a deficit
counter which represents the number of bytes uploaded to a peer minus
the number of bytes downloaded from it. It then uploads to the peer
with the lowest deficit counter. FairTorrent automatically adjusts to
variations in bandwidth among peers and is resilient to exploitation
by free-riding peers. We have implemented FairTorrent inside a BitTorrent client without modifications to the BitTorrent protocol, and compared its performance on PlanetLab against other widely-used BitTorrent clients. Our results show that FairTorrent can provide up to two orders of magnitude better fairness and up to five times better download performance for high-contributing peers. It thereby gives users an incentive to contribute more bandwidth and improves overall system performance.
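The deficit-counter rule at the core of FairTorrent can be sketched in a few lines (helper names are assumptions for illustration):

```python
def pick_peer(deficits):
    """FairTorrent's scheduling rule: upload the next block to the peer with
    the lowest deficit counter (bytes uploaded to it minus bytes downloaded
    from it). Peers that reciprocate drive their counter negative and are
    served first; free riders accumulate positive deficits and starve."""
    return min(deficits, key=deficits.get)

def record_upload(deficits, peer, nbytes):
    # Uploading to a peer raises its deficit.
    deficits[peer] = deficits.get(peer, 0) + nbytes

def record_download(deficits, peer, nbytes):
    # Receiving from a peer lowers its deficit, earning it priority.
    deficits[peer] = deficits.get(peer, 0) - nbytes
```

Because the counter adjusts continuously, the rule tracks bandwidth variations without any tuning parameters.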

We observed wireless network traffic at the 65th IETF Meeting in Dallas, Texas in March of 2006, attended by approximately 1200 engineers. The event was supported by a very large number of 802.11a and 802.11b access points, often seeing hundreds of simultaneous users. We were particularly interested in the stability of wireless connectivity, load balancing and loss behavior, rather than just traffic. We observed distinct differences among client implementations and saw a number of factors that made the overall system less than optimal, pointing to the need for better design tools and automated adaptation mechanisms.

We propose a firewall architecture that treats port numbers as part of the IP address. Hosts permit connectivity to a service by advertising the IPaddr:port/48 address; they block connectivity by ensuring that there is no route to it. This design, which is especially well-suited to MANETs, provides greater protection against insider attacks than do conventional firewalls, but drops unwanted traffic far earlier than distributed firewalls do.

Peer-to-peer (P2P) networks exist on the Internet today
as a popular means of data distribution. However, conventional
uses of P2P networking involve distributing stored
files for use after the entire file has been downloaded. In
this work, we investigate whether P2P networking can be
used to provide real-time playback capabilities for stored
media. For real-time playback, users should be able to start
playback immediately, or almost immediately, after requesting
the media and have uninterrupted playback during the
download. To achieve this goal, it is critical to efficiently
schedule the order in which pieces of the desired media
are downloaded. Simply downloading pieces in sequential
(earliest-first) order is prone to bottlenecks. Consequently,
we propose a hybrid of earliest-first and rarest-first scheduling,
ensuring high piece diversity while at the same time
prioritizing pieces needed to maintain uninterrupted playback.
We consider an approach to peer-assisted streaming
that is based on BitTorrent. In particular, we show that dynamic
adjustment of the probabilities of earliest-first and
rarest-first strategies along with utilization of coding techniques
promoting higher data diversity, can offer noticeable
improvements for real-time playback.
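The hybrid selection policy can be sketched as a probabilistic choice between the two strategies (a minimal sketch with assumed names; the paper adjusts the probability dynamically):

```python
import random

def next_piece(needed, availability, p_earliest, rng=random):
    """Hybrid piece selection: with probability p_earliest pick the earliest
    missing piece (to sustain playback); otherwise pick the rarest missing
    piece (to preserve piece diversity in the swarm).

    needed       -- set of piece indices still missing
    availability -- dict mapping piece index to copies seen in the swarm
    """
    if not needed:
        return None
    if rng.random() < p_earliest:
        return min(needed)                                  # earliest-first
    return min(needed, key=lambda i: availability[i])       # rarest-first
```

Raising p_earliest as the playback deadline approaches trades swarm diversity for deadline safety, which is the dynamic adjustment the abstract refers to.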

The task of query optimization in modern relational database systems
is important but can be computationally expensive. Parametric query
optimization (PQO) has as its goal the prediction of optimal query
execution plans based on historical results, without consulting the
query optimizer. We develop machine learning techniques that can
accurately model the output of a query optimizer. Our algorithms
handle non-linear boundaries in plan space and achieve high prediction
accuracy even when a limited amount of data is available for training.
We use both predicted and actual query execution times for learning,
and are the first to demonstrate a total net win of a PQO method over
a state-of-the-art query optimizer for some workloads. ReoptSMART
realizes savings not only in optimization time, but also in query
execution time, for an overall improvement by more than an order of
magnitude in some cases.

This paper presents one-class Hellinger distance-based and one-class SVM modeling techniques that use a set of features to reveal user intent. The specific objective is to model user command profiles and detect deviations indicating a masquerade attack. The approach aims to model user intent, rather than only modeling sequences of user-issued commands. We hypothesize that each individual user will search in a targeted and limited fashion in order to find information germane to their current task. Masqueraders, on the other hand, will likely not know the file system and layout of another user's desktop, and would likely search more extensively and broadly. Hence, modeling a user's search behavior to detect deviations may more accurately detect masqueraders. To that end, we extend prior research that uses UNIX command sequences issued by users as the audit source by relying upon an abstraction of commands. We devised a taxonomy of UNIX commands that is used to abstract command sequences. The experimental results show that the approach does not lose information and performs comparably to or slightly better than the modeling approach based on simple UNIX command frequencies.
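The distance underlying the first technique compares two command-category frequency profiles; it can be computed as follows (an illustrative sketch with assumed names; the paper's detector builds one-class models on top of such features):

```python
from math import sqrt

def hellinger(p, q):
    """Hellinger distance between two discrete distributions, given as dicts
    mapping command categories to relative frequencies. A large distance
    between a user's historical profile and a recent window of commands is
    evidence of a possible masquerader. Ranges from 0 (identical) to 1
    (disjoint support)."""
    keys = set(p) | set(q)
    return sqrt(0.5 * sum((sqrt(p.get(k, 0.0)) - sqrt(q.get(k, 0.0))) ** 2
                          for k in keys))
```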

This work describes a method of approximating matrix permanents efficiently using belief propagation. We formulate a probability distribution whose partition function is exactly the permanent, then use the Bethe free energy to approximate this partition function. After deriving some speedups to standard belief propagation, the resulting algorithm requires $O(n^2)$ time per iteration. Finally, we demonstrate the advantages of using this approximation.
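For reference, the quantity being approximated is the permanent itself, which is straightforward but exponential to compute exactly (function name assumed); the belief-propagation scheme replaces this $O(n \cdot n!)$ sum with message passing costing $O(n^2)$ per iteration:

```python
from itertools import permutations
from math import prod

def permanent(m):
    """Exact permanent of a square matrix: like the determinant but with all
    signs positive. Brute force over all n! permutations, feasible only for
    small n; this is the partition function the BP method approximates."""
    n = len(m)
    return sum(prod(m[i][s[i]] for i in range(n))
               for s in permutations(range(n)))
```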

Current NAC technologies implement a pre-connect phase where the status of a device is checked against a set of policies before being granted access to a network, and a post-connect phase that examines whether the device complies with the policies that correspond to its role in the network. In order to enhance current NAC technologies, we propose a new architecture based on behaviors rather than roles or identity, where the policies are automatically learned and updated over time by the members of the network in order to adapt to behavioral
changes of the devices. Behavior profiles may be presented as identity cards that can change over time. By incorporating an Anomaly Detector (AD) to the NAC server or to each of the hosts, their behavior profile is
modeled and used to determine the type of behaviors that should be accepted within the network. These models constitute behavior-based policies. In our enhanced NAC architecture, global decisions are made using a group voting process. Each host’s behavior profile is used to compute a partial decision for or against the acceptance of a new profile or traffic. The aggregation of these partial votes amounts to the model-group decision. This voting process makes the architecture more resilient to attacks. Even after accepting a certain percentage of malicious devices, the enhanced NAC is able to compute an adequate decision. We provide proof-of-concept experiments of our architecture using web traffic from our department network. Our results show that the model-group decision approach based on behavior profiles has a 99% detection rate of anomalous traffic with a false positive rate of
only 0.005%. Furthermore, the architecture achieves short latencies for both the pre- and post-connect phases.

Enterprise networks are ubiquitous and increasingly complex. The
mechanisms for defining security policies in these networks have not
kept up with the advancements in networking technology. In most
cases, system administrators must define policies on a per-application
basis, and subsequently, these policies do not interact. For example,
there is no mechanism that allows a firewall to communicate decisions
based on its ruleset to a web server behind it, even though decisions
being made at the firewall may be relevant to decisions made at the
web server. In this paper, we describe a path-based access control
system which allows applications in a network to pass
access-control-related information to neighboring applications, as the
applications process requests from outsiders and from each other.
This system defends networks against a class of attacks wherein
individual applications may make correct access control decisions but
the resulting network behavior is incorrect. We demonstrate the
system on service-oriented architecture (SOA)-style networks, in two
forms, using graph-based policies, and leveraging the KeyNote trust
management system.

We study $d$-variate approximation for a weighted unanchored Sobolev
space having smoothness $m\ge1$. Folk wisdom would lead us to believe
that this problem should become easier as its smoothness increases. This
is true if we are only concerned with asymptotic analysis: the $n$th
minimal error is of order~$n^{-(m-\delta)}$ for any $\delta>0$. However,
it is unclear how long we need to wait before this asymptotic behavior
kicks in. How does this waiting period depend on $d$ and~$m$? We prove
that no matter how the weights are chosen, the waiting period is at
least~$m^d$, even if the error demand~$\varepsilon$ is arbitrarily close
to~$1$. Hence, for $m\ge2$, this waiting period is exponential in~$d$,
so that the problem suffers from the curse of dimensionality and is
intractable. In other words, the fact that the asymptotic behavior
improves with~$m$ is irrelevant when $d$~is large. So, we will be unable
to vanquish the curse of dimensionality unless $m=1$, i.e., unless the
smoothness is minimal. We then show that our problem \emph{can} be
tractable if $m=1$. That is, we can find an $\varepsilon$-approximation
using polynomially-many (in $d$ and~$\varepsilon^{-1}$) information
operations, even if only function values are permitted. When $m=1$, it
is even possible for the problem to be \emph{strongly} tractable, i.e.,
we can find an $\varepsilon$-approximation using polynomially-many
(in~$\varepsilon^{-1}$) information operations, independent of~$d$.
These positive results hold when the weights of the Sobolev space decay
sufficiently quickly or are bounded finite-order weights, i.e., the
$d$-variate functions we wish to approximate can be decomposed as sums of
functions depending on at most~$\omega$ variables, where $\omega$ is
independent of~$d$.

A Spreadable Connected Autonomic Network (SCAN) is a mobile network that automatically maintains its own connectivity as nodes move. We envision SCANs to enable a diverse set of applications such as self-spreading mesh networks and robotic search and rescue systems. This paper describes our experiences developing a prototype robotic SCAN built from commercial, off-the-shelf hardware, to support such applications. A major contribution of our work is the development of a protocol, called SCAN1, which maintains network connectivity by enabling individual nodes to determine when they must constrain their mobility in order to avoid disconnecting the network. SCAN1 achieves its goal through an entirely distributed process in which individual nodes utilize only local (2-hop) knowledge of the network's topology to periodically make a simple decision: move, or freeze in place. Along with experimental results from our hardware testbed, we model SCAN1's performance, providing both supporting analysis and simulation for the efficacy of SCAN1 as a solution to enable SCANs. While our evaluation of SCAN1 in this paper is limited to systems whose capabilities match those of our testbed, SCAN1 can be utilized in conjunction with a wide range of potential applications and environments, as either a primary or backup connectivity maintenance mechanism.

Leveraging Local Intra-Core Information to Increase Global Performance in Block-Based Design of Systems-on-Chip

Cheng-Hong Li, Luca P. Carloni

2008-03-18

Latency-insensitive design is a methodology for system-on-chip (SoC) design that simplifies the reuse of intellectual property cores and the implementation of the communication among them. This simplification is based on a system-level protocol that decouples the intra-core logic design from the design of the inter-core communication channels. Each core is encapsulated within a shell, a synthesized logic block that dynamically controls its operation to interface it with the rest of the SoC and to absorb any latency variations on its I/O signals. In particular, a shell stalls a core whenever new valid data are not available on the input channels or a down-link core has requested a delay in the data production on the output channels.
We study how knowledge about the internal logic structure of a core can be applied to the design of its shell to improve the overall system-level performance by avoiding unnecessary local stalling. We introduce the notion of functional independence conditions (FIC) and present a novel circuit design of a generic shell template that can leverage FIC. We propose a procedure for the logic synthesis of a FIC-shell instance that is only based on the analysis of the intra-core logic and does not require any input from the designers. Finally, we present a comprehensive experimental analysis that shows the performance benefits and limited design overhead of the proposed technique. This includes the semi-custom design of an SoC, an ultra-wideband baseband transmitter, using a 90nm industrial standard cell library.

TCP has been traditionally considered unfriendly for real-time
applications. Nonetheless, popular applications such as Skype use
TCP due to the deployment of NATs and firewalls that prevent UDP
traffic. Motivated by this observation we study the delay
performance of TCP for real-time media flows. We develop an
analytical performance model for the delay of TCP. We use
extensive experiments to validate the model and to evaluate the
impact of various TCP mechanisms on its delay performance. Based
on our results, we derive the working region for VoIP and live
video streaming applications and provide guidelines for
delay-friendly TCP settings. Our research indicates that simple
application-level schemes, such as packet splitting and parallel
connections, can reduce the delay of real-time TCP flows by as
much as 30\% and 90\%, respectively.

Properties of Machine Learning Applications for Use in Metamorphic Testing

Christian Murphy, Gail Kaiser, Lifeng Hu

2008-02-28

It is challenging to test machine learning (ML) applications, which are intended to learn properties of data sets where the correct answers are not already known. In the absence of a test oracle, one approach to testing these applications is to use metamorphic testing, in which properties of the application are exploited to define transformation
functions on the input, such that the new output will be unchanged or can easily be predicted based on the original output; if the output is not as expected, then a defect must exist in the application. Here, we seek to enumerate and classify the metamorphic properties of some machine learning algorithms, and demonstrate how these can be applied
to reveal defects in the applications of interest. In addition to the results of our testing, we present a set of properties that can be used to define these metamorphic relationships so that metamorphic testing can be used as a general approach to testing machine learning applications.
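One such metamorphic property, illustrated here on a toy 1-nearest-neighbor classifier (an assumed example, not one of the paper's subject applications), is that permuting the training data must not change the prediction; a violation indicates a defect:

```python
def nearest_label(train, x):
    """Toy 1-nearest-neighbor classifier over (value, label) pairs.
    Metamorphic property: the output is invariant under any permutation of
    the training set (assuming distinct distances, so ties cannot occur)."""
    return min(train, key=lambda ex: abs(ex[0] - x))[1]

# Metamorphic test: no oracle for the "correct" label is needed; we only
# check that the transformed input yields the same output as the original.
train = [(1.0, 'a'), (5.0, 'b'), (9.0, 'c')]
assert nearest_label(train, 4.2) == nearest_label(list(reversed(train)), 4.2)
```

Other properties of the same flavor include scaling all numeric features by a positive constant or duplicating every training example.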

The Stream Control Transmission Protocol (SCTP) is a newer transport
protocol with additional features beyond those of TCP. Although SCTP is
an alternative transport protocol for the Session Initiation Protocol
(SIP), we do not know how SCTP features influence SIP server
scalability and performance. To estimate this, we measured the
scalability and performance of two servers, an echo server and a
simplified SIP server on Linux, comparing them to TCP.
Our measurements found that using SCTP does not significantly
affect data latency: the handshake takes approximately 0.3 ms longer than
for TCP. However, server scalability in terms of the number
of sustainable associations drops to 17-21% of that of TCP, or to 43% if we
adjust the acceptable gap size of unordered data delivery.

Partitioning is an important step in several database algorithms, including sorting, aggregation, and joins. Partitioning is also fundamental for dividing work into equal-sized (or balanced) parallel subtasks. In this paper, we aim to find, materialize, and maintain a set of partitioning elements (splitters) for a data set. Unlike traditional partitioning elements, our splitters define both inequality and equality partitions, which allows us to bound the size of the inequality partitions. We provide an algorithm for determining an optimal set of splitters from a sorted data set and show that it has time complexity O(k log_2 N), where k is the number of splitters requested and N is the size of the data set. We show how the algorithm can be extended to pairs of tables, so that joins can be partitioned into work units that have balanced cost. We demonstrate experimentally (a) that finding the optimal set of splitters can be done efficiently, and (b) that using the precomputed splitters can improve the time to sort a data set by up to 76%, with particular benefits in the presence of a few heavy hitters.
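The interplay of inequality and equality partitions can be sketched as follows (a simplified illustration with assumed names and a naive heaviness test; the paper's algorithm chooses the optimal splitter set in O(k log_2 N) time):

```python
import bisect

def splitters(sorted_data, k):
    """Pick k splitters at equal-depth boundaries of a sorted list. For each
    splitter value, report whether it should form an equality partition: if
    the value is a heavy hitter straddling the boundary, isolating it into
    its own partition keeps the remaining inequality partitions bounded.
    Returns a list of (value, is_equality_partition) pairs."""
    n = len(sorted_data)
    target = n // (k + 1)                      # ideal partition size
    out = []
    for j in range(1, k + 1):
        v = sorted_data[j * n // (k + 1)]
        lo = bisect.bisect_left(sorted_data, v)
        hi = bisect.bisect_right(sorted_data, v)
        out.append((v, hi - lo > target))      # heavy hitter -> equality partition
    return out
```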

The transport protocol for SIP can be chosen based on the requirements
of services and network conditions. How does the choice of TCP affect scalability and performance compared to UDP? We experimentally analyze the impact of using TCP as a transport protocol for a SIP server. We first investigate the scalability of a TCP echo server, then compare the performance of a SIP server for three TCP connection lifetimes: transaction, dialog, and persistent. Our results show that a Linux machine can establish 450,000+ TCP connections and that maintaining connections does not affect the transaction response time. Additionally, the transaction response times using the three TCP connection lifetimes and UDP show no significant difference at 2,500 registration requests/second and at 500 call requests/second. However, the sustainable request rate is lower for TCP than for UDP, since using TCP requires more message processing. More message processing causes longer delays at the thread queue for a server implementing a thread-pool model. Finally, we suggest how to reduce the impact of TCP on a scalable SIP server, especially under overload control. This is applicable to other servers with very large connection counts.

Internet applications are being used for more and more important business and personal purposes. Despite efforts to lock down web servers and isolate databases, there is an inherent problem in the web application architecture that leaves databases necessarily exposed to possible attack from the Internet. We propose a new design that removes the web server as a trusted component of the architecture and provides an extra layer of protection against database attacks. We have created a prototype system that demonstrates the feasibility of the new design.

Modern society is irreversibly dependent on computers and,
consequently, on software. However, as the complexity of programs
increases, so does the number of defects within them. To
alleviate the problem, automated techniques are constantly used to
improve software quality. Static analysis is one such approach in
which violations of correctness properties are searched and
reported. Static analysis has many advantages, but it is necessarily
conservative because it symbolically executes the program instead of
using real inputs, and it considers all possible executions
simultaneously. Being conservative often means issuing false alarms,
or missing real program errors.
Pointer variables are a challenging aspect of many languages that can
force static analysis tools to be overly conservative. It is often
unclear what variables are affected by pointer-manipulating
expressions, and aliasing between variables is one of the banes of
program analysis. To alleviate that, a common solution is to allow
the programmer to provide annotations such as declaring a variable
as unaliased in a given scope, or providing special constructs
such as the ``never-null'' pointer of Cyclone. However,
programmers rarely keep these annotations up-to-date.
The solution is to provide some form of pointer analysis, which
derives useful information about pointer variables in the program. An
appropriate pointer analysis equips the static tool so that it is
capable of reporting more errors without risking too many false alarms.
This dissertation proposes a methodology for pointer analysis that is
specially tailored for ``modular bug finding.'' It presents a new
analysis space for pointer analysis, defined by finer-grain ``dimensions
of precision,'' which allows us to explore and evaluate a variety of
different algorithms to achieve better trade-offs between analysis
precision and efficiency. This framework is developed around a new
abstraction for computing points-to sets, the Assign-Fetch Graph, that
has many interesting features. Empirical evaluation shows promising
results, as some unknown errors in well-known applications were
discovered.

Embedding malcode within documents provides a convenient means of penetrating systems which may be unreachable by network-level service attacks. Such attacks can be very targeted and difficult to detect compared to the typical network worm threat due to the multitude of document-exchange vectors. Detecting malcode embedded in a document is difficult owing to the complexity of modern document formats that provide ample opportunity to embed code in a myriad of ways. We focus on Microsoft Word documents as malcode carriers as a case study in this paper. We introduce a hybrid system that integrates static and dynamic techniques to detect the presence and location of malware embedded in documents. The system is designed to automatically update its detection models to improve accuracy over time. The overall hybrid detection system with a learning feedback loop is demonstrated to achieve a 99.27% detection rate and 3.16% false positive rate on a corpus of 6228 Word documents.

Software products released into the field typically have some number of residual bugs that either were not detected or could not have been detected during testing. This may be the result of flaws in the test cases themselves, assumptions made during the creation of test cases, or the infeasibility of testing the sheer number of possible configurations for a complex system. Testing approaches such as perpetual testing or continuous testing seek to continue to test these applications even after deployment, in hopes of finding any remaining flaws. In this paper, we present our initial work towards a testing methodology we call in vivo testing, in which unit tests are continuously executed inside a running application in the deployment environment. These tests execute within the current state of the program (rather than by creating a clean slate) without affecting or altering that state. Our approach can reveal defects both in the applications of interest and in the unit tests themselves. It can also be used for detecting concurrency or robustness issues that may not have appeared in a testing lab. Here we describe the approach and the testing framework called Invite that we have developed for Java applications. We also enumerate the classes of bugs our approach can discover, and provide the results of a case study on a publicly-available application, as well as the results of experiments to measure the added overhead.

Mitigating the Effect of Free-Riders in BitTorrent using Trusted Agents

Alex Sherman, Angelos Stavrou, Jason Nieh, Cliff Stein

2008-01-25

Even though Peer-to-Peer (P2P) systems present a cost-effective and
scalable solution to content distribution, most entertainment, media,
and software content providers continue to rely on expensive,
centralized solutions such as Content Delivery Networks. One of the
main reasons is that the current P2P systems cannot guarantee
reasonable performance as they depend on the willingness of users to
contribute bandwidth. Moreover, even systems like BitTorrent, which
employ a tit-for-tat protocol to encourage fair bandwidth exchange
between users, are prone to free-riding (i.e. peers that do not
upload). Our experiments on PlanetLab extend previous research
(e.g. LargeViewExploit, BitTyrant) demonstrating that
such selfish behavior can seriously degrade the performance of regular
users in many more scenarios beyond simple free-riding: we observed an
overhead of up to 430\% for 80\% of free-riding identities easily
generated by a small set of selfish users.
To mitigate the effects of selfish users, we propose a new P2P
architecture that classifies peers with the help of a small number of
{\em trusted nodes} that we call Trusted Auditors (TAs). TAs
participate in P2P download like regular clients and detect
free-riding identities by observing their neighbors' behavior. Using
TAs, we can separate compliant users into a separate service pool
resulting in better performance. Furthermore, we show that TAs are
more effective at ensuring the performance of the system than a mere
increase in bandwidth capacity: for 80\% of free-riding identities a
single-TA system has a 6\% download time overhead, while without a
TA, even with three times the bandwidth capacity, we measure a 100\%
overhead.

As university-level distance learning programs become more and more popular, and software engineering courses incorporate eXtreme Programming (XP) into their curricula, certain challenges arise when teaching XP to students who are not physically co-located. In this paper, we present the results of a three-year study of such an online software engineering course targeted to graduate students, and describe some of the specific challenges faced, such as students’ aversion to aspects of XP and difficulties in scheduling. We discuss our findings in terms of the course’s educational objectives, and present suggestions to other educators who may face similar situations.

Latency-insensitive protocols allow system-on-chip engineers to decouple the design of the computing cores from the design of the inter-core communication channels while following the synchronous design paradigm.
In a latency-insensitive system (LIS) each core is encapsulated within a shell, a synthesized interface module that dynamically controls its operation. At each clock period, if new data has not arrived on an input channel or a stalling request has arrived on an output channel, the shell stalls the core and buffers other incoming valid data for future processing. The combination of finite buffers and backpressure from stalling can cause throughput degradation. Previous works addressed this problem by increasing buffer space to reduce backpressure requests or by inserting extra buffering to balance the channel latencies around a LIS. We explore the theoretical complexity of these approaches and propose a heuristic algorithm for efficient queue sizing. We also empirically characterize several LIS topologies, showing that the topology of a LIS affects not only how much throughput degradation occurs but also the difficulty of finding optimal queue-sizing solutions.

LinkWidth: A Method to Measure Link Capacity and Available Bandwidth using Single-End Probes

Sambuddho Chakravarty, Angelos Stavrou, Angelos D. Keromytis

2008-01-05

We introduce LinkWidth, a method for estimating capacity and available
bandwidth using single-end controlled TCP packet probes. To estimate
capacity, we generate a train of TCP RST packets ``sandwiched''
between trains of TCP SYN packets. Capacity is computed from the
end-to-end packet dispersion of the received TCP RST/ACK packets
corresponding to the TCP SYN packets going to closed ports. Our
technique is significantly different from the rest of the packet-pair
based measurement techniques, such as {\em CapProbe,} {\em pathchar}
and {\em pathrate,} because the long packet trains minimize errors due
to bursty cross-traffic. Additionally, TCP RST packets do not generate
additional ICMP replies, thus avoiding cross-traffic due to such
packets from interfering with our probes. In addition, we use TCP
packets for all our probes to prevent QoS-related traffic shaping
(based on packet types) from affecting our measurements (e.g., Cisco
routers by default are known to have very high latency when
generating ICMP TTL-expired replies).
We extend the {\it Train of Packet Pairs} technique to approximate the
available link capacity. We use a train of TCP packet pairs with
variable intra-pair delays and sizes. This is the first attempt to
implement this technique using single-end TCP probes, tested on a
range of networks with different bottleneck capacities and cross
traffic rates. The method we use for measuring from a single point of
control uses TCP RST packets between a train of TCP SYN packets. The
idea is quite similar to the technique for measuring the bottleneck
capacity. We compare our prototype with {\em pathchirp,} {\em
pathload,} {\em IPERF,} which require control of both ends as well as
another single-end controlled technique, {\em abget}, and demonstrate
that in most cases our method gives approximately the same results,
if not better.
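The dispersion arithmetic behind capacity estimation is simple (an illustrative sketch with assumed names, not LinkWidth's full probing logic): back-to-back packets of size L leaving the bottleneck spaced Δ seconds apart imply a capacity of roughly L/Δ, and taking the median gap over a long train damps bursty cross-traffic.

```python
def capacity_from_dispersion(packet_size_bytes, arrival_times):
    """Estimate bottleneck capacity (bits/second) from the inter-arrival
    dispersion of a probe train. Uses the median gap rather than a single
    packet-pair gap, which is the robustness long trains buy over simple
    packet-pair methods."""
    gaps = sorted(b - a for a, b in zip(arrival_times, arrival_times[1:]))
    delta = gaps[len(gaps) // 2]              # median inter-arrival gap (s)
    return packet_size_bytes * 8 / delta      # bits per second
```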

Text search on 3D models has traditionally worked poorly, as text annotations on 3D models are often unreliable or incomplete. In this paper we attempt to improve the recall of text search by automatically assigning appropriate tags to models. Our algorithm finds relevant tags by appealing to a large corpus of partially labeled example models, which does not have to be preclassified or otherwise prepared. For this purpose we use a copy of Google 3DWarehouse, a database of user contributed models which is publicly available on the Internet. Given a model to tag, we find geometrically similar models in the corpus, based on distances in a reduced dimensional space derived from Zernike descriptors. The labels of these neighbors are used as tag candidates for the model with probabilities proportional to the degree of geometric similarity. We show experimentally that text based search for 3D models using our computed tags can work as well as geometry based search. Finally, we demonstrate our 3D model search engine that uses this algorithm and discuss some implementation issues.
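The tag-propagation step can be sketched as weighted voting among geometric neighbors (names and the 1/(1+d) weighting kernel are assumptions for illustration; the paper derives its distances from Zernike descriptors in a reduced-dimensional space):

```python
from collections import defaultdict

def suggest_tags(neighbors, top=3):
    """Weighted tag voting: each geometrically similar model votes for its
    tags with a weight that decreases with shape distance. `neighbors` is a
    list of (distance, tags) pairs; returns the `top` highest-scoring tags,
    best first."""
    votes = defaultdict(float)
    for d, tags in neighbors:
        for t in tags:
            votes[t] += 1.0 / (1.0 + d)       # assumed similarity kernel
    ranked = sorted(votes.items(), key=lambda kv: -kv[1])
    return [t for t, _ in ranked[:top]]
```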

Conceptual complexity is emerging as a new bottleneck as database
developers, application developers, and database administrators
struggle to design and comprehend large, complex schemas. The
simplicity and conciseness of a schema depends critically on the
idioms available to express the schema. We propose a formal
conceptual schema representation language that combines different
design formalisms, and allows schema manipulation that exposes the
strengths of each of these formalisms. We demonstrate how the schema
factorization framework can be used to generate relational,
object-oriented, and faceted physical schemas, allowing a wider
exploration of physical schema alternatives than traditional
methodologies. We illustrate the potential practical benefits of
schema factorization by showing that simple heuristics can
significantly reduce the size of a real-world schema description. We
also propose the use of schema polynomials to model and derive
alternative representations for complex relationships with
constraints.

In this paper, we propose a method to program divide-and-conquer problems on multicore systems that is based on a data-driven recursive programming model. Data-intensive programs are difficult to program on multicore architectures because they require efficient utilization of inter-core communication. Models for programming multicore systems available today generally lack the ability to automatically extract concurrency from a sequential-style program and map concurrent tasks to efficiently leverage data and temporal locality. For divide-and-conquer algorithms, a recursive programming model can address both of these problems. Furthermore, since a recursive function has the same behavior patterns at all granularities of a problem, the same recursive model can be used to implement a multicore program at all of its levels: (1) the operations of a single core, (2) the distribution of tasks among several cores, and (3) the order in which to schedule tasks when not all of them can be scheduled at once. We present a novel selective execution technique that can enable automatic parallelization and task mapping of a recursive program onto a multicore system. To verify the practicality of this approach, we perform a case study of bitonic sort on the Cell BE processor.
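
The recursive structure of the bitonic-sort case study can be sketched as follows. This is a sequential toy version, not the Cell BE implementation: in the paper's model, the same recursion would simply be cut at different depths to choose per-core work, cross-core distribution, and scheduling order.

```python
def bitonic_merge(xs, ascending):
    # Compare-exchange across the two halves, then merge each half recursively.
    n = len(xs)
    if n <= 1:
        return list(xs)
    half = n // 2
    xs = list(xs)
    for i in range(half):
        if (xs[i] > xs[i + half]) == ascending:
            xs[i], xs[i + half] = xs[i + half], xs[i]
    return bitonic_merge(xs[:half], ascending) + bitonic_merge(xs[half:], ascending)

def bitonic_sort(xs, ascending=True):
    # Sort one half up and the other down, producing a bitonic sequence,
    # then merge. The input length must be a power of two.
    n = len(xs)
    if n <= 1:
        return list(xs)
    half = n // 2
    return bitonic_merge(bitonic_sort(xs[:half], True)
                         + bitonic_sort(xs[half:], False), ascending)
```

Every call has the same shape regardless of granularity, which is exactly the property the recursive programming model exploits.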

This paper presents a complete framework for creating speech-enabled
2D and 3D avatars from a single image of a person. Our approach uses a
generic facial motion model which represents deformations of the prototype face during speech.
We have developed an HMM-based facial
animation algorithm which takes into account both lexical stress and
coarticulation. This algorithm produces realistic animations of the
prototype facial surface from either text or speech. The generic facial motion model is transformed to
a novel face geometry using a set of corresponding points between the generic mesh and the novel face.
In the case of a 2D avatar, a single photograph of the person is used as input. We manually select a small number of features on the photograph and these are used to deform the prototype surface. The deformed surface is then used to animate the photograph. In the case of a 3D avatar, we use a single stereo image of the person as input. The sparse geometry of the face is computed from this image and used to warp the prototype surface to obtain the complete 3D surface of the person's face. This surface is etched into a glass cube using sub-surface laser engraving (SSLE) technology. Synthesized facial animation videos are then projected onto the etched glass cube. Even though the etched surface is static, the projection of facial animation onto it results in a compelling experience for the viewer. We show several examples of 2D and 3D avatars that are driven by text and speech inputs.

Partial evaluation has been applied to compiler optimization and
generation for decades. Most of the successful partial evaluators have
been designed for general-purpose languages. Our observation is that
domain-specific languages are also suitable targets for partial
evaluation. The unusual computational models in many DSLs bring
challenges as well as optimization opportunities to the compiler.
To enable aggressive optimization, partial evaluation
has to be specialized to fit the specific paradigm of a DSL. In this
dissertation, we present three such specialized partial evaluation
techniques designed for specific languages that address a variety of
compilation concerns. The first algorithm provides a low-cost solution
for simulating concurrency on a single-threaded processor. The second
enables a compiler to compile modest-sized synchronous programs in
pieces that involve communication cycles. The third statically
elaborates recursive function calls that enable programmers to
dynamically create a system's concurrent components in a convenient
and algorithmic way. Our goal is to demonstrate the potential of
partial evaluation to solve challenging issues in code generation for
domain-specific languages.
Naturally, we do not cover all DSL compilation issues. We hope our
work will enlighten and encourage future research on the application
of partial evaluation to this area.

The in vivo software testing approach focuses on testing
live applications by executing unit tests throughout the
lifecycle, including after deployment. The motivation is that
the “known state” approach of traditional unit testing is unrealistic;
deployed applications rarely operate under such
conditions, and it may be more informative to perform the
testing in live environments. One of the limitations of this
approach is the high performance cost it incurs, as the unit
tests are executed in parallel with the application. Here we
present distributed in vivo testing, which focuses on easing
the burden by sharing the load across multiple instances of
the application of interest. That is, we elevate the scope
of in vivo testing from a single instance to a community of
instances, all participating in the testing process. Our approach
is different from prior work in that we are actively
testing during execution, as opposed to passively monitoring
the application or conducting tests in the user environment
prior to execution. We discuss new extensions to
the existing in vivo testing framework (called Invite) and
present empirical results showing that the per-instance
performance overhead decreases linearly with the number of clients.
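
The load-sharing arithmetic behind that result can be sketched as follows. This is a toy illustration of the idea only, not the actual Invite protocol: each instance fires a given in vivo test with probability 1/n, so expected per-instance overhead falls linearly with community size while the community as a whole still exercises every test.

```python
import random

def should_run_test(community_size, rng=random.random):
    # Each of the n participating instances runs a given in vivo test with
    # probability 1/n; collectively the test still runs about once per
    # triggering point, but each instance pays only 1/n of the cost.
    return rng() < 1.0 / community_size
```

Injecting `rng` keeps the decision testable and lets a deployment swap in a coordinated or seeded source of randomness.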

Tractability of the Helmholtz equation with non-homogeneous Neumann boundary conditions: Relation to $L_2$-approximation

Arthur G. Werschulz

2007-11-08

We want to compute a worst case $\varepsilon$-approximation to the solution
of the Helmholtz equation $-\Delta u+qu=f$ over the unit $d$-cube~$I^d$,
subject to Neumann boundary conditions $\partial_\nu u=g$ on~$\partial
I^d$. Let $\mathop{\rm card}(\varepsilon,d)$ denote the minimal number of
evaluations of $f$, $g$, and~$q$ needed to compute an absolute or
normalized $\varepsilon$-approximation, assuming that $f$, $g$, and~$q$
vary over balls of weighted reproducing kernel Hilbert spaces. This
problem is said to be weakly tractable if $\mathop{\rm
card}(\varepsilon,d)$ grows subexponentially in~$\varepsilon^{-1}$ and
$d$. It is said to be polynomially tractable if $\mathop{\rm
card}(\varepsilon,d)$ is polynomial in~$\varepsilon^{-1}$ and~$d$, and
strongly polynomially tractable if this polynomial is independent of~$d$.
We have previously studied tractability for the homogeneous version $g=0$
of this problem. In this paper, we investigate the tractability of the
non-homogeneous problem, with general~$g$. First, suppose that we use
product weights, in which the role of any variable is moderated by its
particular weight. We then find that if the weight sum is sublinearly
bounded, then the problem is weakly tractable; moreover, this condition is
more or less necessary. We then show that the problem is polynomially
tractable if the weight sum is logarithmically or uniformly bounded, and we
estimate the exponents of tractability for these two cases. Next, we turn
to finite-order weights of fixed order~$\omega$, in which a $d$-variate
function can be decomposed as a sum of terms, each depending on at most
$\omega$~variables. We show that the problem is always polynomially
tractable for finite-order weights, and we give estimates for the exponents
of tractability. Since our results so far have established nothing
stronger than polynomial tractability, we look more closely at whether
strong polynomial tractability is possible. We show that our problem is
never strongly polynomially tractable for the absolute error criterion.
Moreover, we believe that the same is true for the normalized error
criterion, but we have been able to prove this lack of strong tractability
only when certain conditions hold on the weights. Finally, we use the
Korobov- and min-kernels, along with product weights, to illustrate our
results.

Packet processing is an essential function of state-of-the-art network
routers and switches. Implementing packet processors in pipelined
architectures is a well-known, established technique, although different
approaches have been proposed.
The design of packet processing pipelines is a delicate trade-off
between the desire for abstract specifications, short development time,
and design maintainability on one hand and very aggressive performance
requirements on the other.
This thesis proposes a coherent design flow for packet processing
pipelines. Like the design process itself, I start by introducing a
novel domain-specific language that provides a high-level
specification of the pipeline. Next, I address synthesizing this
model and calculating its worst-case throughput. Finally, I address
some specific circuit optimization issues.
I claim, based on experimental results, that my proposed technique can
dramatically improve the design process of these pipelines, while the
resulting performance matches the expectations of hand-crafted design.
The considered pipelines exhibit a pseudo-linear topology, which can
be too restrictive in the general case. However, especially due to
its high performance, such an architecture may be suitable for
applications outside packet processing, in which case some of my
proposed techniques could be easily adapted.
Since I ran my experiments on FPGAs, this work has an inherent bias
towards that technology; however, most results are
technology-independent.

\usepackage{amssymb}
\begin{document}
We continue the study of generalized tractability initiated in our
previous paper ``Generalized tractability for multivariate problems,
Part I: Linear tensor product problems and linear information'', J.
Complexity, 23, 262-295 (2007). We study linear tensor product
problems for which we can compute linear information which is given by
arbitrary continuous linear functionals. We want to approximate an
operator $S_d$ given as the $d$-fold tensor product of a compact
linear operator $S_1$ for $d=1,2,\dots\,$, with $\|S_1\|=1$ and $S_1$
has at least two positive singular values.
Let $n(\varepsilon,S_d)$ be the minimal number of information
evaluations needed to approximate $S_d$ to within
$\varepsilon\in[0,1]$. We study \emph{generalized tractability} by
verifying when $n(\varepsilon,S_d)$ can be bounded by a multiple of a
power of $T(\varepsilon^{-1},d)$ for all
$(\varepsilon^{-1},d)\in\Omega \subseteq[1,\infty)\times \mathbb{N}$. Here,
$T$ is a \emph{tractability} function which is non-decreasing in both
variables and grows slower than exponentially to infinity. We study
the \emph{exponent of tractability} which is the smallest power of
$T(\varepsilon^{-1},d)$ whose multiple bounds $n(\varepsilon,S_d)$.
We also study \emph{weak tractability}, i.e., when
$\lim_{\varepsilon^{-1}+d\to\infty,(\varepsilon^{-1},d)\in\Omega}
\ln\,n(\varepsilon,S_d)/(\varepsilon^{-1}+d)=0$.
In our previous paper, we studied generalized tractability for proper
subsets $\Omega$ of $[1,\infty)\times\mathbb{N}$, whereas in this paper we
take the unrestricted domain $\Omega^{\rm unr}=[1,\infty)\times\mathbb{N}$.
We consider the three cases for which we have only finitely many
positive singular values of $S_1$, or they decay exponentially or
polynomially fast. Weak tractability holds for these three cases, and
for all linear tensor product problems for which the singular values
of $S_1$ decay slightly faster than logarithmically. We provide
necessary and sufficient conditions on the function~$T$ such that
generalized tractability holds. These conditions are obtained in terms
of the singular values of $S_1$ and mostly limiting properties of $T$.
The tractability conditions tell us how fast $T$ must go to infinity.
It is known that $T$ must go to infinity faster than polynomially. We
show that generalized tractability is obtained for
$T(x,y)=x^{1+\ln\,y}$. We also study tractability functions $T$ of
product form, $T(x,y)=f_1(x)\,f_2(y)$. Assume that
$a_i=\liminf_{x\to\infty}(\ln\,\ln f_i(x))/(\ln\,\ln\,x)$ is finite
for $i=1,2$. Then generalized tractability takes place iff
$$a_i>1 \ \ \mbox{and}\ \ (a_1-1)(a_2-1)\ge1,$$
and if $(a_1-1)(a_2-1)=1$ then we need to assume one more condition
given in the paper. If $(a_1-1)(a_2-1)>1$ then the exponent of
tractability is zero, and if $(a_1-1)(a_2-1)=1$ then the exponent of
tractability is finite. It is interesting to add that for $T$ being of
the product form, the tractability conditions as well as the exponent
of tractability depend only on the second largest singular value of $S_1$
and do \emph{not} depend on the rate of decay of the singular values.
Finally, we compare the results obtained in this paper for the
unrestricted domain $\Omega^{\rm unr}$ with the results from our
previous paper obtained for the restricted domain
$\Omega^{\rm res}=
[1,\infty)\times\{1,2,\dots,d^*\}\,\cup\,[1,\varepsilon_0^{-1})\times\mathbb{N}$
with $d^*\ge1$ and $\varepsilon_0\in(0,1)$. In general, the tractability
results are quite different. We may have generalized tractability
for the restricted domain and no generalized tractability for the
unrestricted domain which is the case, for instance,
for polynomial tractability $T(x,y)=xy$. We may also have generalized
tractability for both domains with different or with the same
exponents of tractability.
\end{document}

Data mining algorithms use various Trie and bitmap-based representations to optimize the support (i.e., frequency) counting performance. In this paper, we compare the memory requirements and support counting performance of FP Tree, and Compressed Patricia Trie against several novel variants of vertical bit vectors. First, borrowing ideas from the VLDB domain, we compress vertical bit vectors using WAH encoding. Second, we evaluate the Gray code rank-based transaction reordering scheme, and show that in practice, simple lexicographic ordering, obtained by applying LSB Radix sort, outperforms this scheme.
Led by these results, we propose HDO, a novel Hamming-distance-based greedy transaction reordering scheme, and aHDO, a linear-time approximation to HDO. We present results of experiments performed on 15 common datasets with varying degrees of sparseness, and show that HDO-reordered, WAH encoded bit vectors can take as little as 5% of the uncompressed space, while aHDO achieves similar compression on sparse datasets. Finally, with results from over a billion database- and data mining-style frequency query executions, we show that bitmap-based approaches result in up to hundreds of times faster support counting, and HDO-WAH encoded bitmaps offer the best space-time tradeoff.
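
The greedy idea behind HDO can be sketched as follows. This is an illustrative nearest-neighbor ordering, hedged against the paper's precise formulation, with transactions encoded as integer bitmasks of the items they contain: placing Hamming-close transactions adjacently makes consecutive bits in each vertical bit vector agree more often, which run-length schemes such as WAH compress well.

```python
def hamming(a, b):
    # Number of bit positions where the two transaction bitmasks differ.
    return bin(a ^ b).count("1")

def hdo_order(rows):
    # Greedy Hamming-distance ordering: start from the first transaction and
    # repeatedly append the remaining transaction closest to the last one.
    remaining = list(rows)
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda r: hamming(last, r))
        remaining.remove(nxt)
        order.append(nxt)
    return order
```

This greedy pass is quadratic in the number of transactions; aHDO exists precisely because a linear-time approximation is needed at scale.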

We present our work on automatically extracting social hierarchies
from electronic communication data. Data mining based on user behavior
can be leveraged to analyze and catalog patterns of communications
between entities to rank relationships. The advantage is that the
analysis can be done in an automatic fashion and can adapt itself to
organizational changes over time.
We illustrate the algorithms on real-world data using the Enron
corporation's email archive. The results show great promise when
compared to the corporation's organizational chart and to the judicial
proceedings analyzing the major players.

This paper presents a new framework for the unsupervised discovery of semantic information, using a divide-and-conquer approach to take advantage of contextual regularities and to avoid problems of polysemy and sublanguages. Multiple sets of documents are formed and analyzed to create multiple sets of frames. The overall procedure is wholly unsupervised and domain independent. The end result will be a collection of sets of semantic frames that will be useful in a wide range of applications, including question-answering, information extraction, summarization and text generation.

Software products released into the field typically have some number of residual bugs that either were not detected or could not have been detected during testing. This may be the result of flaws in the test cases themselves, assumptions made during the creation of test cases, or the infeasibility of testing the sheer number of possible configurations for a complex system. Testing approaches such as perpetual testing or continuous testing seek to continue to test these applications even after deployment, in hopes of finding any remaining flaws. In this paper, we present our initial work towards a testing methodology we call “in vivo testing”, in which unit tests are continuously executed inside a running application in the deployment environment. In this novel approach, unit tests execute within the current state of the program (rather than by creating a clean slate) without affecting or altering that state. Our approach has been shown to reveal defects both in the applications of interest and in the unit tests themselves. It can also be used for detecting concurrency or robustness issues that may not have appeared in a testing lab. Here we describe the approach, the testing framework we have developed for Java applications, classes of bugs our approach can discover, and the results of experiments to measure the added overhead.
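
The core contract above, running a unit test within the current program state without altering it, can be sketched as follows. This is a simplified stand-in, not the Invite implementation: a deep copy of the relevant state plays the role of the process-level isolation an in vivo framework would provide.

```python
import copy

def run_in_vivo(test, state):
    # Execute a unit test against the *current* program state (rather than a
    # clean slate) without mutating that state. The snapshot isolates the live
    # state from anything the test does to it.
    snapshot = copy.deepcopy(state)
    try:
        passed = bool(test(snapshot))
    except Exception:
        passed = False  # an uncaught exception counts as a failed in vivo test
    return passed
```

Because the test sees real field state, it can expose defects, including concurrency and robustness issues, that never arise from a freshly constructed fixture in a testing lab.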

Experiences in Teaching eXtreme Programming in a Distance Learning Program

Christian Murphy, Dan Phung, Gail Kaiser

2007-10-12

As university-level distance learning programs become more and more popular, and software engineering courses incorporate eXtreme Programming (XP) into their curricula, certain challenges arise when teaching XP to students who are not physically co-located. In this paper, we present our experiences and observations from managing such an online software engineering course, and describe some of the specific challenges we faced, such as students’ aversion to using XP and difficulties in scheduling. We also present some suggestions to other educators who may face similar situations.

BARTER: Profile Model Exchange for Behavior-Based Access Control and Communication Security in MANETs

Vanessa Frias-Martinez, Salvatore J. Stolfo, Angelos D. Keromytis

2007-10-10

There is a considerable body of literature and technology that
provides access control and security of communication for Mobile Ad-hoc Networks (MANETs) based on cryptographic authentication technologies
and protocols. We introduce a new method of granting access and securing communication in a MANET environment to augment, not replace, existing techniques. Previous approaches grant access to the MANET, or to its services, merely by means of an authenticated identity or a qualified role. We present BARTER, a framework that, in addition, requires nodes to exchange a model of their behavior to grant access to the MANET
and to assess the legitimacy of their subsequent communication. This framework forces the nodes not only to say who
or what they are, but also how they behave. BARTER will continuously
run membership acceptance and update protocols to give access to and accept traffic only from nodes whose behavior model is considered ``normal'' according to the behavior model of the nodes in the MANET.
We implement and experimentally evaluate the integration of BARTER with other cryptographic technologies and show that BARTER can implement
fully distributed, automatic access control and updates with
small cryptographic costs. Although the methods proposed involve the use of content-based anomaly detection models, the generic infrastructure
implementing the methodology may utilize any behavior model.
Even though the experiments are implemented for MANETs, the idea
of model exchange for access control can be applied to any type of network.

Applying patches, although a disruptive activity, remains a vital part
of software maintenance and defense. When host-based anomaly detection
(AD) sensors monitor an application, patching the application requires
a corresponding update of the sensor's behavioral model. Otherwise,
the sensor may incorrectly classify new behavior as malicious (a false
positive) or assert that old, incorrect behavior is normal (a false
negative). Although the problem of ``model drift'' is an almost
universally acknowledged hazard for AD sensors, relatively little work
has been done to understand the process of re-training a ``live'' AD
model --- especially in response to legal behavioral updates like
vendor patches or repairs produced by a self-healing system.
We investigate the feasibility of automatically deriving and applying
a ``model patch'' that describes the changes necessary to update a
``reasonable'' host-based AD behavioral model ({\it i.e.,} a model
whose structure follows the core design principles of existing
host-based anomaly models). We aim to avoid extensive retraining and
regeneration of the entire AD model when only parts may have changed
--- a task that seems especially undesirable after the exhaustive
testing necessary to deploy a patch.

It is often necessary for two or more parties that do not
fully trust each other to share data selectively. For example,
one intelligence agency might be willing to turn over certain
documents to another such agency, but only if the second agency
requests the specific documents. The problem, of course, is finding
out that such documents exist when access to the database is
restricted.
We propose a search scheme based on Bloom filters and group ciphers
such as Pohlig-Hellman encryption. A semi-trusted third party can
transform one party's search queries to a form suitable for querying
the other party's database, in such a way that neither the third
party nor the database owner can see the original query. Furthermore,
the encryption keys used to construct the Bloom filters are not
shared with this third party. Multiple providers and queriers are
supported; provision can be made for third-party ``warrant servers'',
as well as ``censorship sets'' that limit the data to be shared.
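
The group-cipher property the scheme relies on can be illustrated with a Pohlig-Hellman-style exponentiation cipher, $E_k(m) = m^k \bmod p$. Because exponents commute, a semi-trusted third party holding a re-encryption key can transform one party's encrypted query into a form matching the other party's encrypted index without ever seeing the plaintext. The sketch below shows only this commutativity; the Bloom filter construction, parameter choices, and key hygiene of a real deployment are omitted.

```python
# A Mersenne prime modulus, chosen here purely for illustration; a real
# deployment would select the group and keys with far more care.
P = 2**127 - 1

def enc(m, k):
    # Pohlig-Hellman-style exponentiation cipher: E_k(m) = m^k mod p.
    return pow(m, k, P)
```

The commutativity `enc(enc(m, a), b) == enc(enc(m, b), a)` follows from $(m^a)^b = (m^b)^a = m^{ab} \bmod p$, and is what lets queries be re-blinded rather than decrypted in transit.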

GloServ is a global service discovery system which
aggregates information about different types of services in a
globally distributed network. GloServ classifies services in an
ontology and maps knowledge obtained by the ontology onto a
scalable hybrid hierarchical peer-to-peer network. The network
mirrors the semantic relationships of service classes and as a
result, reduces the number of message hops across the global
network due to the domain-specific way services are distributed.
Also, since services are described in greater detail, due to
the ontology representation, greater reasoning is applied when
querying and registering services. In this paper, we describe an
enhancement to the GloServ querying mechanism which allows
GloServ servers to process and issue subqueries between servers
of different classes. Thus, information about different service
classes may be queried for in a single query and issued directly
from the front end, creating an extensible platform for service
composition. The results are then aggregated and presented to the
user such that services which share an attribute are categorized
together. We have built and evaluated a location-based web
service discovery prototype which demonstrates the flexibility
of service composition in GloServ, and we discuss the design and
evaluation of this system.

Keywords: service discovery, ontologies, OWL, CAN, peer-to-peer, web service composition

The problem: Much of finance theory is based on the efficient market hypothesis. According to this hypothesis, the prices of financial assets, such as stocks, incorporate all information that may affect their future performance. However, the translation of publicly
available information into predictions of future performance is far from trivial. Making such predictions is the livelihood of stock traders, market analysts, and the like. Clearly, the efficient market hypothesis is only an approximation which ignores the cost of producing accurate predictions.
Markets are becoming more efficient and more accessible because of the use of ever faster methods for communicating and analyzing financial data. Algorithms developed in machine learning can be used to automate parts of this translation process. In other words, we can now use machine learning algorithms to analyze vast amounts of information and compile them to predict the performance of companies, stocks, or even market analysts. In financial terms, we would say that such algorithms discover inefficiencies in the current market. These discoveries can be used to make a profit and, in turn, reduce the market inefficiencies or support strategic planning processes.
Relevance: Currently, the major stock exchanges such as NYSE and NASDAQ are transforming their markets into electronic financial markets. Players in these markets must process large amounts of information and make instantaneous investment decisions.
Machine learning techniques help investors and corporations recognize new business opportunities or potential corporate problems in these markets. With time, these techniques help the financial market become better regulated and more stable. Also, corporations could save a significant amount of resources if they automate certain corporate finance functions such as planning and trading.
Results: This dissertation offers a novel approach to using boosting as a predictive and interpretative tool for problems in finance. Moreover, we demonstrate how boosting can support the automation of strategic planning and trading functions.
Many of the recent bankruptcy scandals in publicly held US companies such as Enron and WorldCom are inextricably linked to the conflict of interest between shareholders (principals) and managers (agents). We evaluate this conflict in the case of Latin American and US companies. In the first part of this dissertation, we use Adaboost to analyze the impact of corporate governance variables on performance. In this respect, we present an algorithm that calculates alternating decision trees (ADTs), ranks variables according to their level of importance, and generates representative ADTs. We develop a board Balanced Scorecard (BSC) based on these representative ADTs which is part of the process to automate the planning functions.
In the second part of this dissertation we present three main algorithms to improve forecasting and automated trading. First, we introduce a link mining algorithm using a mixture of economic and social network indicators to forecast earnings surprises, and cumulative abnormal return. Second, we propose a trading algorithm for short-term technical trading. The algorithm was tested in the context of the Penn-Lehman Automated Trading Project (PLAT) competition using the Microsoft stock. The algorithm was profitable during the competition. Third, we present a multi-stock automated trading system that includes a machine learning algorithm that makes the prediction, a weighting algorithm that combines the experts, and a risk management layer that selects only the strongest prediction and avoids trading when there is a history of negative performance. This algorithm was tested with 100 randomly selected S&P 500 stocks. We find that even an efficient learning algorithm, such as boosting, still requires powerful control mechanisms in order to reduce unnecessary and unprofitable trades that increase transaction costs.
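
The boosting machinery underlying the work above can be sketched with a minimal AdaBoost over one-dimensional threshold stumps. This is a toy stand-in for the alternating-decision-tree learning used in the dissertation; the data, thresholds, and round count are illustrative.

```python
import math

def adaboost_stumps(xs, ys, rounds=10):
    # Minimal AdaBoost on 1-D data with labels in {-1, +1}. Each round picks
    # the threshold stump with lowest weighted error, then reweights examples
    # so that misclassified points matter more in the next round.
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = None
        for t in xs:                      # candidate thresholds at data points
            for sign in (1, -1):
                preds = [sign if x > t else -sign for x in xs]
                err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, sign, preds)
        err, t, sign, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)     # guard the log below
        alpha = 0.5 * math.log((1 - err) / err)   # stump weight
        ensemble.append((alpha, t, sign))
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        z = sum(w)
        w = [wi / z for wi in w]
    def predict(x):
        s = sum(a * (sgn if x > t else -sgn) for a, t, sgn in ensemble)
        return 1 if s >= 0 else -1
    return predict
```

The per-round weights `alpha` are also what make boosting interpretable: ranking the chosen stumps (or, in the dissertation, ADT nodes) by `alpha` ranks the variables by importance.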

We present the problem of Oblivious Image Matching, where two parties want to determine whether they have images of the same object or scene, without revealing any additional information. While image matching has attracted a great deal of attention in the computer vision community, it has never been treated in a cryptographic setting.
In this paper we study the private version of the problem, oblivious image matching, and provide an efficient protocol for it. In doing so, we design a novel image matching algorithm, and a few private protocols that may be of independent interest. Specifically, we first show how to reduce the image matching problem to a two-level version of the fuzzy set matching problem, and then present a novel protocol to privately compute this (and several other) matching problems.

Despite the growth of the Internet and the increasing concern for
privacy of online communications, current deployments of
anonymization networks depend on a very small set of nodes that
volunteer their bandwidth. We believe that the main reason is not
disbelief in their ability to protect anonymity, but rather the
practical limitations in bandwidth and latency that stem from
limited participation. This limited participation, in turn, is due
to a lack of incentives. We propose providing economic incentives,
which historically have worked very well.
In this technical report, we demonstrate a payment scheme that can
be used to compensate nodes which provide anonymity in Tor, an
existing onion routing, anonymizing network. We show that current
anonymous payment schemes are not suitable and introduce a hybrid
payment system based on a combination of the Peppercoin Micropayment
system and a new type of ``one use'' electronic cash. Our system
maintains users' anonymity, whereas the payment techniques
mentioned previously --- when adopted individually --- provably
fail to do so.

We present a reputation scheme for a pseudonymous peer-to-peer (P2P) system in an anonymous network.
Misbehavior is one of the biggest problems in pseudonymous P2P systems, where there is little incentive for
proper behavior. In our scheme, using ecash for reputation points, the reputation of each user is closely related to
his real identity rather than to his current pseudonym. Thus, our scheme allows an honest user to switch to a new
pseudonym and keep his good reputation, while hindering a malicious user from erasing his trail of evil deeds with
a new pseudonym.

By exploiting the object-oriented dynamic composability of
modern document applications and formats, malcode hidden in otherwise
inconspicuous documents can reach third-party applications that may
harbor exploitable vulnerabilities otherwise unreachable by network-level service attacks. Such attacks can be very selective and difficult to detect compared to the typical network worm threat, owing to the complexity of these applications and data formats, as well as the multitude of document-exchange vectors. As a case study, this paper focuses on Microsoft Word documents as malcode carriers. We investigate the possibility of detecting embedded malcode in Word documents using two techniques: static content analysis using statistical models of typical document content, and run-time dynamic tests on diverse platforms. The experiments demonstrate these approaches can not only detect known malware, but also most zero-day attacks. We identify several problems with both approaches, representing both challenges in addressing the problem and opportunities for future research.

The errors that Java programmers are likely to encounter can roughly be categorized into three groups: compile-time (semantic and syntactic), logical, and runtime (exceptions). While much work has focused on the first two, there are very few tools that exist for interpreting the sometimes cryptic messages that result from runtime errors. Novice programmers in particular have difficulty dealing with uncaught exceptions in their code and the resulting stack traces, which are by no means easy to understand. We present Backstop, a tool for debugging runtime errors in Java applications. This tool provides more user-friendly error messages when an uncaught exception occurs, but also provides debugging support by allowing users to watch the execution of the program and the changes to the values of variables. We also present the results of two studies conducted on introductory-level programmers using the two different features of the tool.

To evaluate the efficacy of self-healing systems a rigorous, objective, quantitative benchmarking methodology is needed. However, developing such a benchmark is a non-trivial task given the many evaluation issues to be resolved, including but not limited to: quantifying the impacts
of faults, analyzing various styles of healing (reactive, preventative, proactive), accounting for partially automated healing, and accounting for incomplete/imperfect healing. We posit, however, that it is possible to realize a self-healing benchmark using a collection of analytical techniques and practical tools as building blocks. This paper highlights the flexibility of one analytical tool, the Reliability, Availability and Serviceability (RAS) model, and illustrates its power and relevance
to the problem of evaluating self-healing mechanisms/systems, when combined with practical tools for fault-injection.

The ability to interactively edit BRDFs in their final placement within a computer graphics scene is vital to making informed choices for material properties. We significantly extend previous work on BRDF editing for static scenes (with fixed lighting and view), by developing a precomputed polynomial representation that enables interactive BRDF editing with global illumination. Unlike previous recomputation based rendering techniques, the image is not linear in the BRDF when considering interreflections. We introduce a framework for precomputing a multi-bounce tensor of polynomial coefficients, that encapsulates the nonlinear nature of the task. Significant reductions in complexity are achieved by leveraging the low-frequency nature of indirect light. We use a high-quality representation for the BRDFs at the first bounce from the eye, and lower-frequency (often diffuse) versions for further bounces. This approximation correctly captures the general global illumination in a scene, including color-bleeding, near-field object reflections, and even caustics. We adapt Monte Carlo path tracing for precomputing the tensor of coefficients for BRDF basis functions. At runtime, the high-dimensional tensors can be reduced to a simple dot product at each pixel for rendering. We present a number of examples of editing BRDFs in complex scenes, with interactive feedback rendered with global illumination.

We are concerned with the problem of detecting bugs in machine learning applications. In the absence of sufficient real-world data, creating suitably large data sets for testing can be a difficult task. Random testing is one solution, but may have limited effectiveness in cases in which a reliable test oracle does not exist, as is the case of the machine learning applications of interest. To address this problem, we have developed an approach to creating data sets called “parameterized random data generation”. Our data generation framework allows us to isolate or combine different equivalence classes as desired, and then randomly generate large data sets using the properties of those equivalence classes as parameters. This allows us to take advantage of randomness but still have control over test case selection at the system testing level. We present our findings from using the approach to test two different machine learning ranking applications.
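As a sketch of the approach (a minimal illustration, not the authors' actual framework: the class names, the generator signatures, and the ranking-feature framing are all assumptions), equivalence classes can be modeled as value generators and used as the parameters of random data generation:

```python
import random

# Hypothetical equivalence classes for a ranking data set's feature values.
# Each class is a generator function; the names are illustrative only.
EQUIVALENCE_CLASSES = {
    "negative": lambda rng: rng.uniform(-1000, -1),
    "zero": lambda rng: 0.0,
    "positive": lambda rng: rng.uniform(1, 1000),
    "missing": lambda rng: None,
}

def generate_data_set(classes, n_rows, n_cols, seed=0):
    """Randomly generate a large data set whose values are drawn only from
    the selected equivalence classes (the parameters of the generation)."""
    rng = random.Random(seed)
    generators = [EQUIVALENCE_CLASSES[c] for c in classes]
    return [[rng.choice(generators)(rng) for _ in range(n_cols)]
            for _ in range(n_rows)]

# Isolate a single class, or combine several, as desired:
only_positive = generate_data_set(["positive"], n_rows=1000, n_cols=5)
mixed = generate_data_set(["positive", "zero", "missing"], 1000, 5, seed=42)
```

Seeding the generator keeps the randomly produced test inputs reproducible, which preserves control over test case selection at the system level.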

Traditionally, TCP has been considered unfriendly for real-time
applications. Nonetheless, popular applications such as Skype use
TCP due to the deployment of NATs and firewalls that prevent UDP
traffic. This observation motivated us to study the delay
performance of TCP for real-time media flows using an analytical
model and experiments. The results obtained yield the working region
for VoIP and live video streaming applications and guidelines for
delay-friendly TCP settings. Further, our research indicates that
simple application-level schemes, such as packet splitting and
parallel connections, can significantly improve the delay
performance of real-time TCP flows.

The efficacy of Anomaly Detection (AD) sensors depends
heavily on the quality of the data used to train them. Artificial
or contrived training data may not provide a realistic
view of the deployment environment. Most realistic data
sets are dirty; that is, they contain a number of attacks
or anomalous events. The size of these high-quality training
data sets makes manual removal or labeling of attack
data infeasible. As a result, sensors trained on this data can
miss attacks and their variations. We propose extending the
training phase of AD sensors (in a manner agnostic to the
underlying AD algorithm) to include a sanitization phase.
This phase generates multiple models conditioned on small
slices of the training data. We use these “micro-models”
to produce provisional labels for each training input, and
we combine the micro-models in a voting scheme to determine
which parts of the training data may represent attacks.
Our results suggest that this phase automatically and significantly
improves the quality of unlabeled training data
by making it as “attack-free” and “regular” as possible in
the absence of absolute ground truth. We also show how a
collaborative approach that combines models from different
networks or domains can further refine the sanitization process
to thwart targeted training or mimicry attacks against
a single site.
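A minimal sketch of the micro-model voting idea described above, assuming an abstract `train_model` factory and an `is_anomalous` test on each micro-model (both names are hypothetical; the actual sensors are content-based anomaly detectors):

```python
def sanitize(training_data, train_model, n_slices=10, threshold=0.5):
    """Split the training data into small slices, train one "micro-model"
    per slice, and let the micro-models vote on every training input.
    Items flagged as anomalous by more than `threshold` of the other
    micro-models are treated as likely attacks and removed.
    `train_model(slice)` must return an object with is_anomalous(item);
    both names are assumptions for this sketch."""
    size = max(1, len(training_data) // n_slices)
    slices = [training_data[i:i + size]
              for i in range(0, len(training_data), size)]
    models = [(i, train_model(s)) for i, s in enumerate(slices)]

    clean = []
    for idx, item in enumerate(training_data):
        owner = min(idx // size, len(slices) - 1)  # slice the item came from
        voters = [m for i, m in models if i != owner]
        votes = sum(m.is_anomalous(item) for m in voters)
        if votes / max(1, len(voters)) <= threshold:
            clean.append(item)
    return clean
```

Excluding the item's own slice from the vote keeps a localized attack from vouching for itself, which is what lets the scheme work without ground-truth labels.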

The Role of Reliability, Availability and Serviceability (RAS) Models in the Design and Evaluation of Self-Healing Systems

Rean Griffith, Ritika Virmani, Gail Kaiser

2007-04-10

In an idealized scenario, self-healing systems predict,
prevent or diagnose problems and take the appropriate actions
to mitigate their impact with minimal human intervention.
To determine how close we are to reaching this goal
we require analytical techniques and practical approaches
that allow us to quantify the effectiveness of a system’s remediations
mechanisms. In this paper we apply analytical
techniques based on Reliability, Availability and Serviceability
(RAS) models to evaluate individual remediation
mechanisms of select system components and their combined
effects on the system. We demonstrate the applicability
of RAS-models to the evaluation of self-healing systems
by using them to analyze various styles of remediations (reactive,
preventative etc.), quantify the impact of imperfect
remediations, identify sub-optimal (less effective) remediations
and quantify the combined effects of all the activated
remediations on the system as a whole.
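As a toy illustration of what such a RAS model can quantify (a sketch under simplifying assumptions: a three-state continuous-time Markov chain with exponential rates, far simpler than the models used in the paper):

```python
def availability(mttf_hours, coverage, auto_mttr_hours, manual_mttr_hours):
    """Steady-state availability of a three-state RAS model (UP,
    AUTO-REPAIR, MANUAL-REPAIR): a failure is handled by the self-healing
    mechanism with probability `coverage` (its success rate), otherwise it
    falls back to slow manual repair. Derived from the balance equations
    of the continuous-time Markov chain:
        A = 1 / (1 + lambda * (c * MTTR_auto + (1 - c) * MTTR_manual))."""
    lam = 1.0 / mttf_hours  # failure rate
    expected_downtime = (coverage * auto_mttr_hours
                         + (1 - coverage) * manual_mttr_hours)
    return 1.0 / (1.0 + lam * expected_downtime)

# Quantifying the impact of an imperfect (90%-effective) remediation:
perfect = availability(mttf_hours=1000, coverage=1.0,
                       auto_mttr_hours=0.1, manual_mttr_hours=8)
imperfect = availability(1000, 0.9, 0.1, 8)
```

Comparing the two numbers is exactly the kind of remediation-success-rate analysis the model supports: the availability loss from imperfect coverage can be read off directly.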

P2P file-sharing has been recognized as a powerful and efficient
distribution model due to its ability to leverage users' upload
bandwidth. However, companies
that sell digital content on-line are hesitant to rely on P2P
models for paid content distribution due to the free file-sharing
inherent in P2P models.
In this paper we present Aequitas, a P2P system in which users
share paid content anonymously via a layer of intermediate nodes.
We argue that with the extra anonymity in Aequitas, vendors
could leverage P2P bandwidth while effectively maintaining
the same level of trust towards their customers as in traditional
models of paid content distribution. As a result, a
content provider could reduce its infrastructure costs and
subsequently lower the costs for the end-users.
The intermediate nodes are
incentivized to contribute their bandwidth via electronic micropayments.
We also introduce techniques that prevent the intermediate nodes
from learning the content of the files they help transmit.
In this paper we present the design of our system, an analysis of its
properties and an implementation and experimental evaluation. We
quantify the value of the intermediate nodes, both in terms of
efficiency and their effect on anonymity. We argue in support of the
economic and technological merits of the system.

While peer-to-peer (P2P) file-sharing is a powerful and cost-effective
content distribution model, most paid-for digital-content providers
(CPs) rely on direct download to deliver their content. CPs such as
Apple iTunes that command a large base of paying users are hesitant to
use a P2P model that could easily degrade their user base into yet
another free file-sharing community.
We present TP2, a system that makes P2P file sharing a viable
delivery mechanism for paid digital content by providing the same
security properties as the currently used direct-download model. TP2
introduces the novel notion of trusted auditors (TAs) -- P2P
peers that are controlled by the system operator. TAs monitor the
behavior of other peers and help detect and prevent formation of
illegal file-sharing clusters among the CP's user base. TAs both
complement and exploit the strong authentication and authorization
mechanisms that are used in TP2 to control access to content. It
is important to note that TP2 does not attempt to solve the
out-of-band file-sharing or DRM problems, which also exist in the
direct-download systems currently in use.
We analyze TP2 by modeling it as a novel game between misbehaving
users who try to form unauthorized file-sharing clusters and TAs who
curb the growth of such clusters. Our analysis shows that a small
fraction of TAs is sufficient to protect the P2P system against
unauthorized file sharing. In a system with as many as 60% of
misbehaving users, even a small fraction of TAs can detect 99% of
unauthorized cluster formation. We developed a simple economic model
to show that even with such a large fraction of malicious nodes,
TP2 can improve CP's profits (which could translate to user
savings) by 62 to 122%, even while assuming conservative estimates of
content and bandwidth costs. We implemented TP2 as a layer on top
of BitTorrent and demonstrated experimentally using PlanetLab that our
system provides trusted P2P file sharing with negligible performance
overhead.

Firewalls are an effective means of protecting a local system or network of systems from network-based security threats. In this paper, we propose a policy algebra framework for security policy enforcement in hybrid firewalls, ones that exist both in the network and on end systems. To preserve the security semantics, the policy algebras provide a formalism to compute addition, conjunction, subtraction, and summation on rule sets; it also defines the cost and risk functions associated with policy enforcement. Policy outsourcing triggers global cost minimization. We show that our framework can easily be extended to support packet filter firewall policies. Finally, we discuss special challenges and requirements for applying the policy algebra framework to MANETs.
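A minimal sketch of how such a rule-set algebra might look, assuming a policy is abstracted as the set of packet classes it permits (the paper's formalism, including its cost and risk functions, is richer than this illustration; all names here are hypothetical):

```python
class Policy:
    """A firewall policy abstracted as the set of packet classes it permits."""
    def __init__(self, allowed):
        self.allowed = frozenset(allowed)

    def __add__(self, other):           # addition: permit what either permits
        return Policy(self.allowed | other.allowed)

    def __and__(self, other):           # conjunction: both policies must permit
        return Policy(self.allowed & other.allowed)

    def __sub__(self, other):           # subtraction: revoke other's permissions
        return Policy(self.allowed - other.allowed)

    def cost(self, per_rule_cost=1.0):  # toy enforcement-cost function
        return len(self.allowed) * per_rule_cost

def summation(policies):
    """Summation over a collection of policies (iterated addition)."""
    total = Policy([])
    for p in policies:
        total = total + p
    return total

# A hybrid deployment: a network firewall composed with a host firewall.
network_fw = Policy({("any", "web", 80), ("any", "web", 443)})
host_fw    = Policy({("any", "web", 443), ("any", "ssh", 22)})
combined   = network_fw & host_fw       # traffic both firewalls permit
```

Conjunction models the hybrid case directly: traffic must be permitted by both the in-network and the end-system enforcement points to pass.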

In this report we analyze a configurable blind scheduler
containing a continuous, tunable parameter.
After the definition of this policy, we prove the
property of no surprising interruption, the property of no permanent
starvation, and two theorems about monotonicity of this policy.
This technical report contains supplemental materials for the following publication: Hanhua Feng, Vishal Misra, and Dan Rubenstein, "PBS: A unified priority-based scheduler", Proceedings of ACM SIGMETRICS '07, 2007.

Some machine learning applications are intended to learn properties of data sets where the correct answers are not already known to human users. It is challenging to test such ML software, because there is no reliable test oracle. We describe a software testing approach aimed at addressing this problem. We present our findings from testing implementations of two different ML ranking algorithms: Support Vector Machines and MartiRank.

Design, Implementation, and Validation of a New Class of Interface Circuits for Latency-Insensitive Design

Cheng-Hong Li, Rebecca Collins, Sampada Sonalkar, Luca P. Carloni

2007-03-05

With the arrival of nanometer technologies wire
delays are no longer negligible with respect to gate delays, and
timing-closure becomes a major challenge to System-on-Chip
designers. Latency-insensitive design (LID) has been proposed as
a "correct-by-construction" design methodology to cope with this
problem. In this paper we present the design and implementation
of a new and more efficient class of interface circuits to support
LID. Our design offers substantial improvements in logic delay over
the design originally proposed by Carloni et al. [1], as well as in
both logic delay and processing throughput over the synchronous
elastic architecture (SELF) recently proposed by Cortadella et al. [2]. These claims are supported by the
experimental results that we obtained completing semi-custom
implementations of the three designs with a 90nm industrial
standard-cell library. We also report on the formal verification
of our design: using the NuSMV model checker we verified that
the RTL synthesizable implementations of our LID interface
circuits (relay stations and shells) are correct refinements of the corresponding abstract specifications according to the theory of
LID [3].

The most common and well-understood way to evaluate and compare computing systems is via performance-oriented benchmarks. However, numerous other demands are placed on computing systems besides speed. Current generation and next generation computing systems are expected
to be reliable, highly available, easy to manage and able to repair faults and recover from failures with minimal human intervention.
The extra-functional requirements concerned with reliability, high availability, and serviceability (manageability, repair and recovery) represent an additional set of high-level goals the system is expected to meet or exceed. These goals govern the system’s operation and are codified using policies and service level agreements (SLAs).
To satisfy these extra-functional requirements, system-designers explore or employ a number of mechanisms geared towards improving the system’s reliability, availability and serviceability (RAS) characteristics. However, to evaluate these mechanisms and their impact, we need something more than performance metrics.
Performance measures are suitable for studying the feasibility of the mechanisms, i.e., they can be used to conclude that the level of performance delivered by the system with these mechanisms
active does not preclude its usage. However, performance numbers convey little about the efficacy of the system's RAS-enhancing mechanisms. Further, they do not allow us to analyze the (expected or actual) impact of individual mechanisms or make comparisons/discuss tradeoffs
between mechanisms.
What is needed is an evaluation methodology that is able to analyze the details of the RAS-enhancing mechanisms – the micro-view as well as the high-level goals, expressed as policies, SLAs etc., governing the system’s operation – the macro-view. Further, we must establish a link
between the details of the mechanisms and their impact on the high-level goals. This thesis is concerned with developing the tools and applying analytical techniques to enable this kind of evaluation. We make three contributions.
First, we contribute to a suite of runtime fault-injection tools with Kheiron. Kheiron demonstrates a feasible, low-overhead, transparent approach to performing system-adaptations in a variety of execution environments at runtime. We use Kheiron’s runtime-adaptation capability to inject faults into running programs. We present three implementations of Kheiron, each targeting a different execution environment. Kheiron/C manipulates compiled C-programs running in an unmanaged execution environment – comprised of the operating system and the underlying
processor. Kheiron/CLR manipulates programs running in Microsoft’s Common Language Runtime (CLR) and Kheiron/JVM manipulates programs running in Sun Microsystems’ Java Virtual Machine (JVM). Kheiron’s operation is transparent to both the application and the execution
environment. Further, the overheads imposed by Kheiron on the application and the execution environment are negligible, <5%, when no faults are being injected.
Second, we describe analytical techniques based on RAS-models, represented as Markov chains and Markov reward models, to demonstrate their power in evaluating RAS-mechanisms and their impact on the high-level goals governing system-operation. We demonstrate the flexibility of these models in evaluating reactive, proactive and preventative mechanisms as well as their ability to explore the feasibility of yet-to-be-implemented mechanisms. Our analytical techniques focus on remediations rather than observed mean time to failures (MTTF). Unlike hardware, where the laws of physics govern the failure rates of mechanical and electrical parts, there
are no such guarantees for software failure rates. Software failure-rates can however be influenced using fault-injection, which we employ in our experiments. In our analysis we consider a number
of facets of remediations, which include, but go beyond mean time to recovery (MTTR). For example we consider remediation success rates, the (expected) impact of preventative-maintenance and the degradation-impact of remediations in our efforts to establish a framework for reasoning
about the tradeoffs (the costs versus the benefits) of various remediation mechanisms.
Finally, we distill our experiences developing runtime fault-injection tools, performing fault-injection experiments and constructing and analyzing RAS-models into a 7-step process for evaluating
computing systems – the 7U-evaluation methodology. Our evaluation method succeeds in establishing the link between the details of the low-level mechanisms and the high-level goals governing the system’s operation. It also highlights the role of environmental constraints and
policies in establishing meaningful criteria for scoring and comparing these systems and their RAS-enhancing mechanisms.

Event correlation is a widely-used data processing methodology for a broad variety of applications, and is especially useful in the context of distributed monitoring for software faults and vulnerabilities. However, most existing solutions have typically been focused on "intra-organizational" correlation; organizations typically employ privacy policies that prohibit the exchange of information outside of the organization. At the same time, the promise of "inter-organizational" correlation is significant given the broad availability of Internet-scale communications, and its potential role in both software fault maintenance and software vulnerability detection.
In this thesis, I present a framework for reconciling these opposing forces via the use of privacy preservation integrated into the event processing framework. I introduce the notion of event corroboration, a reduced yet flexible form of correlation that enables collaborative verification, without revealing sensitive information. By accommodating privacy policies, we enable the corroboration of data across different organizations without actually releasing sensitive information. The framework supports both source anonymity and data privacy, yet allows for temporal corroboration of a broad variety of data. The framework is designed as a lightweight collection of components to enable integration with existing COTS platforms and distributed systems. I also present an implementation of this framework: Worminator, a collaborative Intrusion Detection System, based on an earlier platform, XUES (XML Universal Event Service), an event processor used as part of a software monitoring platform called KX (Kinesthetics eXtreme).
KX comprised a series of components, connected together with a publish-subscribe content-based routing event subsystem, for the autonomic software monitoring, reconfiguration, and repair of complex distributed systems. Sensors were installed in legacy systems; XUES' two modules then performed event processing on sensor data: information was collected and processed by the Event Packager, and correlated using the Event Distiller. While XUES itself was not privacy-preserving, it laid the groundwork for this thesis by supporting event typing, the use of publish-subscribe and extensibility support via pluggable event transformation modules. I also describe techniques by which corroboration and privacy preservation could optionally be "retrofitted" onto XUES without breaking the correlation applications and scenarios described.
Worminator is a ground-up rewrite of the XUES platform to fully support privacy-preserving event types and algorithms in the context of a Collaborative Intrusion Detection System (CIDS), whereby sensor alerts can be exchanged and corroborated without revealing sensitive information about a contributor's network, services, or even external sources, as required by privacy policies. Worminator also fully anonymizes source information, allowing contributors to decide their preferred level of information disclosure. Worminator is implemented as a monitoring framework on top of a collection of non-collaborative COTS and in-house IDS sensors, and demonstrably enables the detection of not only worms but also "broad and stealthy" scans; traditional single-network sensors either bury such scans in large volumes or miss them entirely. Worminator supports corroboration for packet and flow headers (metadata), packet content, and even aggregate models of network traffic using a variety of techniques.
The contributions of this thesis include the development of a cross-application-domain event processing framework with native privacy-preserving types, the use and validation of privacy-preserving corroboration, and the establishment of a practical deployed collaborative security system. The thesis also quantifies Worminator's effectiveness at attack detection, the overhead of privacy preservation and the effectiveness of our approach against adversaries, be they "honest-but-curious" or actively malicious.

To proactively prevent intruders from readily jeopardizing single-path data sessions, we propose a distributed secure multipath solution to route data across multiple paths so that intruders require much more resources to mount successful attacks. Our work exhibits several important properties that include: (1) routing decisions are made locally by network nodes without the centralized information of the entire network topology, (2) routing decisions minimize throughput loss under a single-link attack with respect to
different session models, and (3) routing decisions address multiple link attacks via lexicographic optimization. We devise two algorithms,
termed the Bound-Control algorithm and the Lex-Control algorithm, both of which provide provably optimal solutions. Experiments show that the Bound-Control algorithm is more effective at preventing the worst-case single-link attack than the single-path approach, and that the Lex-Control algorithm further enhances the Bound-Control algorithm by countering severe single-link attacks and various types of multi-link attacks. Moreover, the Lex-Control algorithm offers prominent protection after only a few execution rounds, implying that we can sacrifice minimal routing
protection for significantly improved algorithm performance. Finally, we examine the applicability of our proposed algorithms in a specialized
defensive network architecture called the attack-resistant network and analyze how the algorithms address resiliency and security in different network settings.
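The single-link attack objective can be illustrated with a toy example, assuming link-disjoint paths so that attacking one link destroys exactly one path's share (the paper's algorithms operate distributedly on general graphs; the function names here are only illustrative):

```python
def worst_case_loss(split):
    """Fraction of the session lost under the best single-link attack:
    the adversary hits the link carrying the largest traffic share."""
    return max(split)

def lex_sorted_losses(split):
    """Loss vector for lexicographic comparison: possible attacks ordered
    from most to least damaging. A lexicographic objective first minimizes
    the worst attack, then the second worst, and so on."""
    return sorted(split, reverse=True)

single_path = [1.0]                  # all traffic on one path
three_even  = [1/3, 1/3, 1/3]        # spread evenly over three disjoint paths
three_skew  = [0.6, 0.3, 0.1]        # uneven spread over the same paths
```

Even in this toy setting, the even split dominates: it minimizes the worst-case loss and is also lexicographically smallest, which is the intuition behind optimizing against multiple link attacks.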

MutaGeneSys: Making Diagnostic Predictions Based on Genome-Wide Genotype Data in Association Studies

Julia Stoyanovich, Itsik Pe'er

2007-02-16

Summary: We present MutaGeneSys, a system that uses genome-wide
genotype data for disease prediction. Our system integrates
three data sources: the International HapMap project, whole-genome
marker correlation data and the Online Mendelian Inheritance in Man
(OMIM) database. It accepts SNP data of individuals as query input
and delivers disease susceptibility hypotheses even if the original set
of typed SNPs is incomplete. Our system is scalable and flexible: it
operates in real time and can be configured on the fly to produce
population, technology, and confidence-specific predictions.
Availability: Efforts are underway to deploy our system as part of the
NCBI Reference Assembly. Meanwhile, the system may be obtained
from the authors.
Contact: jds1@cs.columbia.edu

Anomaly Detection (AD) sensors have become an invaluable
tool for forensic analysis and intrusion detection.
Unfortunately, the detection performance of all
learning-based ADs depends heavily on the quality of the
training data. In this paper, we extend the training phase
of an AD to include a sanitization phase. This phase significantly
improves the quality of unlabeled training data
by making them as “attack-free” as possible in the absence
of absolute ground truth. Our approach is agnostic
to the underlying AD, boosting its performance based
solely on training-data sanitization. Our approach is to
generate multiple AD models for content-based AD sensors
trained on small slices of the training data. These
AD “micro-models” are used to test the training data,
producing alerts for each training input. We employ voting
techniques to determine which of these training items
are likely attacks. Our preliminary results show that sanitization
increases 0-day attack detection while in most
cases reducing the false positive rate. We analyze the performance
gains when we deploy sanitized versus unsanitized
AD systems in combination with expensive host-based
attack-detection systems. Finally, we show that
our system incurs only an initial modest cost, which can
be amortized over time during online operation.

Zero Configuration Networking (Zeroconf) assigns IP addresses and host names, and discovers services without a central server. Zeroconf can be used in wireless mobile ad-hoc networks which are based on IEEE 802.11 and IP. However, Zeroconf has problems in mobile ad-hoc networks as it cannot detect changes in the network topology. In highly mobile networks, Zeroconf causes network overhead while discovering new services. In this paper, we propose an algorithm to accelerate service discovery for mobile ad-hoc networks. Our algorithm involves the monitoring of network interface changes that occur when a device with IEEE 802.11 enabled joins a new network area. This algorithm allows users to discover network topology changes and new services in real-time while minimizing network overhead.
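A minimal polling sketch of the interface-monitoring step (assumptions: the Unix-only stdlib call `socket.if_nameindex`, and polling in place of the OS event mechanisms, such as a netlink route socket, that a real implementation would use):

```python
import socket
import time

def watch_interfaces(on_change, poll_seconds=1.0, rounds=None):
    """Invoke on_change(old, new) whenever the set of network interfaces
    changes, e.g. when an 802.11 device joins a new network area. In the
    proposed algorithm, such a change would trigger immediate
    re-discovery of services instead of waiting for periodic Zeroconf
    announcements."""
    previous = {name for _, name in socket.if_nameindex()}
    i = 0
    while rounds is None or i < rounds:
        current = {name for _, name in socket.if_nameindex()}
        if current != previous:
            on_change(previous, current)
            previous = current
        time.sleep(poll_seconds)
        i += 1
```

Reacting to the interface event rather than rediscovering on a timer is what keeps the overhead low: discovery traffic is generated only when the topology has actually changed.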

Most computer defense systems crash the process that they protect as
part of their response to an attack. In contrast, self-healing
software recovers from an attack by automatically repairing the
underlying vulnerability. Although recent research explores the
feasibility of the basic concept, self-healing faces four major
obstacles before it can protect legacy applications and COTS
software. Besides the practical issues involved in applying the system
to such software (e.g., not modifying source code), self-healing
has encountered a number of problems: knowing when to engage, knowing
how to repair, and handling communication with external entities.
Our previous work on a self-healing system, STEM, left these
challenges as future work. STEM provides self-healing by
speculatively executing “slices” of a process. This paper improves
STEM's capabilities along three lines: (1) applicability of the
system to COTS software (STEM does not require source code, and it
imposes a roughly 73% performance penalty on Apache's normal
operation), (2) semantic correctness of the repair (we introduce
virtual proxies and repair policy to assist the healing
process), and (3) creating a behavior profile based on aspects of
data and control flow.

Topology-Based Optimization of Maximal Sustainable Throughput in a Latency-Insensitive System

Rebecca Collins, Luca Carloni

2007-02-06

We consider the problem of optimizing the performance of a latency-insensitive system (LIS) where the addition of backpressure has caused throughput degradation. Previous works have addressed the problem of LIS performance in different ways. In particular, the insertion of relay stations and the sizing of the input queues in the shells are the two main optimization techniques that have been proposed.
We provide a unifying framework for this problem by outlining which approaches work for different system topologies, and highlighting counterexamples where some solutions do not work. We also observe that in the most difficult class of topologies, instances with the greatest throughput degradation are typically very amenable to simplifications. The contributions of this paper include a characterization of topologies that maintain optimal throughput with fixed-size queues and a heuristic for sizing queues that produces solutions close to optimal in a fraction of the time.

Polymorphic malcode remains one of the most troubling threats for information security and intrusion defense systems. The ability for malcode to be automatically transformed into a semantically equivalent variant frustrates attempts to construct a single, simple, easily verifiable representation. We present a quantitative analysis of the strengths and limitations of shellcode polymorphism and consider the impact of this analysis on current practices in intrusion detection.
Our examination focuses on the nature of shellcode “decoding routines”, and the empirical evidence we gather illustrates our main result: that the challenge of modeling the class of self-modifying code is likely intractable, even when the size of the instruction sequence (i.e., the decoder) is relatively small. We develop metrics to gauge the power of polymorphic engines and use them to provide insight into the strengths and weaknesses of some popular engines. We believe this analysis supplies a novel and useful way to understand the limitations of the current generation of signature-based techniques. We analyze some contemporary polymorphic techniques, explore ways to improve them in order to forecast the nature of future threats, and present our suggestions for countermeasures. Our results indicate that the class of polymorphic behavior is too greatly spread and varied to model effectively. We conclude that modeling normal content is ultimately a more promising defense mechanism than modeling malicious or abnormal content.

We present a querying mechanism for service discovery which combines
ontology queries with text search. The underlying service discovery
architecture used is GloServ. GloServ uses the Web Ontology Language
(OWL) to classify services in an ontology and map knowledge obtained
by the ontology onto a hierarchical peer-to-peer network. Initially,
an ontology-based first order predicate logic query is issued in order
to route the query to the appropriate server and to obtain exact and
related service data. Text search further enhances querying by
allowing services to be described not only with ontology attributes,
but with plain text so that users can query for them using key
words. Currently, querying is limited to either simple attribute-value
pair searches, ontology queries or text search. Combining ontology
queries with text search enhances current service discovery
mechanisms.

Many users value applications that continue execution in the face of
attacks. Current software protection techniques typically abort a
process after an intrusion attempt (e.g., a code injection
attack). We explore ways in which the security property of integrity
can support availability. We extend the Clark-Wilson Integrity Model
to provide primitives and rules for specifying and enforcing repair
mechanisms and validation of those repairs. Users or administrators
can use this model to write or automatically synthesize repair
policy. The policy can help customize an application's response to
attack. We describe two prototype implementations for transparently
applying these policies without modifying source code.

Using Functional Independence Conditions to Optimize the Performance of Latency-Insensitive Systems

Cheng-Hong Li, Luca Carloni

2007-01-11

In latency-insensitive design, shell modules are used to encapsulate system components (pearls) in order to interface them with the given latency-insensitive protocol and dynamically control their operations. In particular, a shell stalls a pearl whenever new valid data are not available on its input channels. We study how functional independence
conditions (FICs) can be applied to the performance optimization of a latency-insensitive system by avoiding unnecessary stalling of its pearls. We present a novel circuit design of a generic shell template that can exploit FICs. We also provide an automatic procedure for the logic synthesis of a shell instance that is based only on the particular local characteristics of its corresponding pearl and does not require any input from the designers. We conclude by reporting on a set of experimental results that illustrate the benefits and overhead of the proposed technique.

Microsoft's enterprise customers are demanding better ways to modularize their software systems. They look to the Java community, where these needs are being met with language enhancements, improved developer tools and middleware, and better runtime support. We present a business case for why Microsoft should give priority to supporting better modularization techniques, also known as advanced separation of concerns (ASOC), for the .NET platform, and we provide a roadmap for how to do so.

An Implementation of a Renesas H8/300 Microprocessor with a Cycle-Level Timing Extension

Chen-Chun Huang, Javier Coca, Yashket Gupta, Stephen A. Edwards

2006-12-30

We describe an implementation of the Renesas H8/300 16-bit processor
in VHDL suitable for synthesis on an FPGA. We extended the ISA
slightly to accommodate cycle-accurate timers accessible from the
instruction set, designed to provide more precise real-time control.
We describe the architecture of our implementation in detail, describe
our testing strategy, and finally show how to build a cross-compilation
toolchain under Linux.

SHIM is a concurrent deterministic language focused on embedded
systems. Although SHIM has undergone substantial evolution, it
currently does not have a code generator for a true embedded
environment.
In this project, we built an embedded environment that we intend to
use as a target for the SHIM compiler. We add the uClinux operating
system between hardware devices and software programs. Our long-term
goal is to have the SHIM compiler generate both user-space and
kernel/module programs for this environment. This project is a first
step: we manually explored what sort of code we ultimately want the
SHIM compiler to produce.
In this report, we provide instructions on how to build and install
uClinux into an Altera DE2 board and example programs, including a
user-space program, a kernel module, and a simple device driver for
the buttons on the DE2 board that includes an interrupt handler.

Image compression plays an important role in multimedia systems,
digital systems, handheld systems and various other devices.
Efficient image processing techniques are needed to make images
suitable for use in embedded systems.
This paper describes an implementation of a JPEG decoder in the SHIM
programming language. SHIM is a software/hardware integration language
whose aim is to provide communication between hardware and software
while providing deterministic concurrency.
The paper shows that a JPEG decoder is a good application and
reasonable test case for the SHIM language and illustrates the ease
with which conventional sequential decoders can be modified to achieve
concurrency.

The use of multiprocessor configurations over uniprocessors is rapidly
increasing, exploiting parallelism instead of frequency scaling for
better compute capacity. The multiprocessor architectures being
developed will have a major impact on existing software. Current
languages provide facilities for concurrent and distributed
programming, but are prone to races and non-determinism. SHIM, a
deterministic concurrent language, guarantees that the behavior of its
programs is independent of the scheduling of concurrent
operations. The language currently supports only atomic arrays, i.e.,
parts of arrays cannot be sent to concurrent processes for evaluation
(and modification). In this report, we propose a way to add non-atomic
arrays to SHIM and describe the semantics that should be considered
while allowing concurrent processes to edit parts of the same array.
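The determinism argument behind non-atomic arrays can be sketched outside SHIM. The following is illustrative Python, not SHIM syntax: because each concurrent worker owns a disjoint slice of the array, the final contents are independent of how the workers are scheduled:

```python
# Illustrative sketch (not SHIM): concurrent processes may edit *disjoint*
# slices of one array, so the result is independent of scheduling.
from concurrent.futures import ThreadPoolExecutor

def scale_slice(data, start, stop, factor):
    # Each worker owns [start, stop) exclusively; no two workers overlap.
    for i in range(start, stop):
        data[i] *= factor

data = list(range(8))
bounds = [(0, 4), (4, 8)]            # disjoint halves of the array
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(scale_slice, data, a, b, 10) for a, b in bounds]
    for f in futures:
        f.result()

print(data)  # [0, 10, 20, 30, 40, 50, 60, 70] regardless of schedule
```

The semantics question the report addresses is precisely how the language can check and enforce this disjointness when slices are passed to concurrent processes.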

High dimensionality remains a significant challenge for document clustering. Recent approaches used frequent itemsets and closed frequent itemsets to reduce dimensionality, and to improve the efficiency of hierarchical document clustering. In this paper, we introduce the notion of “closed interesting” itemsets (i.e. closed itemsets with high interestingness). We provide heuristics such as “super item” to efficiently mine these itemsets and show that they provide significant dimensionality reduction over closed frequent itemsets.
Using “closed interesting” itemsets, we propose a new hierarchical document clustering method that outperforms state-of-the-art agglomerative, partitioning and frequent-itemset-based methods both in terms of FScore and Entropy, without requiring dataset-specific parameter tuning. We evaluate twenty interestingness measures on nine standard datasets and show that when used to generate “closed interesting” itemsets and to select parent nodes, Mutual Information, Added Value, Yule’s Q and Chi-Square offer the best clustering performance, regardless of the characteristics of the underlying dataset. We also show that our method is more scalable and achieves better run-time performance than leading approaches. On a dual-processor machine, our method scaled sub-linearly and was able to cluster 200K documents in about 40 seconds.
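One of the measures named above, Added Value, has a standard definition on association rules, AV(A → B) = P(B | A) − P(B); itemsets that score highly under such a measure (and are closed in the dataset) are candidates for the "closed interesting" set. A minimal sketch over toy transactions (the data and helper names are illustrative, not the paper's):

```python
# Added Value for a rule A -> B over a toy transaction set:
#   AV(A -> B) = P(B | A) - P(B)
transactions = [
    {"milk", "bread"}, {"milk", "bread", "butter"},
    {"bread"}, {"milk", "butter"}, {"bread", "butter"},
]

def support(itemset):
    # Fraction of transactions containing every item in `itemset`.
    return sum(itemset <= t for t in transactions) / len(transactions)

def added_value(antecedent, consequent):
    return (support(antecedent | consequent) / support(antecedent)
            - support(consequent))

print(round(added_value({"milk"}, {"butter"}), 2))  # -> 0.07
```

A positive Added Value means seeing the antecedent raises the probability of the consequent above its base rate; measures like this let the method keep itemsets that are informative rather than merely frequent.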

LinkWidth: A Method to measure Link Capacity and Available Bandwidth using Single-End Probes

Sambuddho Chakravarty, Angelos Stavrou, Angelos D. Keromytis

2006-12-15

We introduce LinkWidth, a method for estimating capacity and
available bandwidth using single-end controlled TCP packet probes.
To estimate capacity, we generate a train of TCP RST packets “sandwiched” between two TCP SYN packets. Capacity is obtained from the
end-to-end packet dispersion of the received TCP RST/ACK packets corresponding to the TCP SYN packets. Our technique is significantly different from the rest of the packet-pair-based measurement techniques, such as CapProbe, pathchar and pathrate, because the long packet trains minimize errors due to bursty cross-traffic. TCP RST packets do not generate additional ICMP replies preventing cross-traffic interference with our probes. In addition, we use TCP packets for all our probes to prevent some types of QoS-related traffic shaping from affecting our measurements. We extend the Train of Packet Pairs technique to approximate the available link capacity. We use pairs of TCP packets with variable intra-pair delays and sizes. This is the first attempt to implement this technique using single-end TCP probes, tested on a wide range of real networks with variable cross-traffic. We compare our prototype with pathchirp and pathload, which require control of both
ends, and demonstrate that in most cases our method gives approximately the same results.
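The dispersion principle behind the capacity estimate can be stated as a back-of-the-envelope calculation. This is a simplification with made-up numbers, not LinkWidth's actual measurement code (which works from the spacing of the returned RST/ACK probes): a back-to-back train leaves the bottleneck spaced by the link's per-packet serialization time, so

```python
# Simplified dispersion arithmetic:
#   capacity ≈ (N - 1) * packet_size_bits / dispersion
# where dispersion is the received spacing between first and last packet.
def capacity_from_dispersion(n_packets, packet_size_bytes, dispersion_s):
    return (n_packets - 1) * packet_size_bytes * 8 / dispersion_s

# 50 packets of 40 bytes (TCP RST-sized), spread over 1.568 ms on arrival:
bps = capacity_from_dispersion(50, 40, 1.568e-3)
print(f"{bps / 1e6:.1f} Mbit/s")  # -> 10.0 Mbit/s
```

The long train is what makes this robust: a burst of cross-traffic perturbs a single gap badly, but shifts the first-to-last dispersion of a 50-packet train only slightly.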

Autonomic systems, specifically self-healing systems, currently lack an objective and relevant evaluation methodology. Because such systems focus on problem detection, diagnosis and remediation, any evaluation methodology should facilitate an objective evaluation and/or comparison of these activities. Measures of “raw” performance are easily quantified and hence facilitate measurement and comparison on the basis of numbers. However, classifying a system as better at problem detection, diagnosis and remediation purely on the basis of performance measures is not useful. The evaluation methodology we propose differs from traditional benchmarks, which are primarily concerned with measures of performance. To develop this methodology, we rely on a set of experiments that enable us to compare the self-healing capabilities of one system against another. Since “real” self-healing systems are not currently available, we simulate the behavior of some target self-healing systems, system faults, and the operational and repair activities of target systems. Further, we use the results derived from the simulation experiments to answer questions relevant to the utility of a benchmark report.

In this project, we measured the stability of DNS servers based on the most popular 500 domains. In the first part of the project, DNS server replica counts and maximum DNS server separation are found for each domain. In the second part, these domains are queried for a one-month period in order to find their uptime percentages.

In a wireless network, mobile nodes (MNs) repeatedly
perform tasks such as layer 2 (L2) handoff, layer 3 (L3)
handoff and authentication. These tasks are critical, particularly
for real-time applications such as VoIP. We propose a novel
approach, namely Cooperative Roaming (CR), in which MNs
can collaborate with each other and share useful information
about the network in which they move.
We show how we can achieve seamless L2 and L3 handoffs
regardless of the authentication mechanism used and without any
changes to either the infrastructure or the protocol. In particular,
we provide a working implementation of CR and show how, with
CR, MNs can achieve a total L2+L3 handoff time of less than
16 ms in an open network and of about 21 ms in an IEEE 802.11i
network. We consider behaviors typical of IEEE 802.11 networks,
although many of the concepts and problems addressed here
apply to any kind of mobile network.

While physical layer capture has been observed in real implementations of wireless devices accessing the channel, such as 802.11, log-utility fair allocation algorithms based on accurate channel models describing the phenomenon have not been developed. In this paper, using a general physical channel model, we develop an allocation algorithm for log-utility fairness. To maximize the aggregate utility, our algorithm determines the channel access attempt probabilities of nodes using partial derivatives of the utility. Our algorithm is verified through extensive simulations. The results indicate that our algorithm quickly
achieves allocations close to the optimum, with an average accuracy error of 8.6%.
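The derivative-driven update can be sketched on a much simpler channel than the paper's. The model below is an assumption for illustration, a slotted-ALOHA-style channel where node i succeeds with probability p_i · Π_{j≠i}(1 − p_j), not the general physical capture model the paper uses. Under it, the aggregate log utility is U(p) = Σ_i [log p_i + (n − 1) log(1 − p_i)], maximized at p_i = 1/n:

```python
# Gradient ascent on aggregate log utility for a toy ALOHA-style channel
# (illustrative stand-in for the paper's model; optimum is p_i = 1/n).
def log_utility_ascent(n, step=1e-3, iters=5000):
    p = [0.5] * n
    for _ in range(iters):
        for i in range(n):
            grad = 1 / p[i] - (n - 1) / (1 - p[i])   # dU/dp_i
            p[i] = min(0.99, max(0.01, p[i] + step * grad))
    return p

p = log_utility_ascent(4)
print([round(x, 3) for x in p])  # each entry close to 1/4
```

Each node adjusts its attempt probability along its own partial derivative of the utility, which is the structure of the algorithm described above, even though the real gradient terms come from the capture-aware channel model.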

P2P file sharing provides a powerful content distribution model
by leveraging users' computing and bandwidth resources. However,
companies have been reluctant to rely on P2P systems for paid content
distribution due to their inability to limit the exploitation of these
systems for free file sharing. We present \sname, a system that
combines the more cost-effective and scalable distribution capabilities of
P2P systems with a level of trust and control over content
distribution similar to direct download content delivery networks. \sname\
uses two key mechanisms that can be layered on top of existing P2P
systems. First, it provides strong authentication to prevent free
file sharing in the system. Second, it introduces a new notion of trusted
auditors to detect and limit malicious attempts to gain information
about participants in the system to facilitate additional out-of-band
free file sharing. We analyze \sname\ by modeling it as a novel game
between malicious users who try to form free file sharing clusters
and trusted auditors who curb the growth of such clusters. Our analysis
shows that a small fraction of trusted auditors
is sufficient to protect the P2P system against unauthorized file
sharing. Using a simple economic model, we further show that
\sname\ provides a more cost-effective content distribution solution,
resulting in higher profits for a content provider even in the
presence of a large percentage of malicious users. Finally, we
implemented \sname\ on top of BitTorrent and used PlanetLab to show
that our system can provide trusted P2P file sharing.

With the growth of presence-based services, it is important to provision the network to support high traffic and load generated by presence services. Presence event distribution systems amplify a single incoming PUBLISH message into possibly numerous outgoing NOTIFY messages from the server. This can increase the network load on inter-domain links and can potentially disrupt other QoS-sensitive applications. In this document, we present existing as well as new techniques that can be used to reduce presence traffic both in inter-domain and intra-domain scenarios. Specifically, we propose two new techniques: sending common NOTIFY for multiple watchers and batched notifications. We also propose some generic heuristics that can be used to reduce network traffic due to presence.

This document defines DHT-independent and DHT-dependent features of DHT algorithms and presents a comparison of Chord, Pastry and Kademlia. It then describes key DHT operations and their information requirements.
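The key DHT operation shared by all three algorithms is mapping node identifiers and keys onto one identifier space and routing each key to its owner. A minimal sketch using Chord's successor rule (Pastry routes by shared prefix and Kademlia by XOR distance, but the put/get interface is the same); the node names and ring size here are arbitrary illustration:

```python
# Consistent-hashing sketch of Chord's ownership rule: the owner of a key is
# the first node ID clockwise from hash(key) on the identifier ring.
import hashlib
from bisect import bisect_right

def ring_id(name, bits=16):
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

nodes = sorted(ring_id(f"node-{i}") for i in range(8))

def successor(key):
    k = ring_id(key)
    idx = bisect_right(nodes, k)
    return nodes[idx % len(nodes)]   # wrap around the ring

owner = successor("alice@example.com")
assert owner in nodes
```

The DHT-dependent features the document compares (finger tables, leaf sets, k-buckets) are different strategies for finding this owner in O(log n) hops rather than by scanning the whole ring.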

We conducted a survey on end users’ willingness and ability to create their desired communication services. The survey is based on the graphical service creation tool we implemented for the Language for End System Services (LESS). We call the tool CUTE, which stands for Columbia University Telecommunication service Editor. This report presents our survey results and shows that relatively inexperienced users are willing and able to create their desired communication services, and that CUTE fits their needs.

A VoIP Privacy Mechanism and its Application in VoIP Peering for Voice Service Provider Topology and Identity Hiding

Charles Shen, Henning Schulzrinne

2006-10-03

Voice Service Providers (VSPs) participating in VoIP peering frequently want to withhold their
identity and related privacy-sensitive information from other parties during the VoIP communication.
A number of documents on VoIP privacy exist, but most of them focus on end user privacy.
By summarizing and extending existing work, we present a unified privacy mechanism for both VoIP
users and service providers. We also show a case study on how VSPs can use this mechanism for
identity and topology hiding in VoIP peering.

ENUM is a protocol standard developed by the Internet Engineering Task
Force (IETF) for translating the E.164 phone numbers into Internet
Universal Resource Identifiers (URIs). It plays an increasingly
important role as the bridge between Internet and traditional
telecommunications services. ENUM is based on the Domain Name System
(DNS), but places unique performance requirements on DNS servers. In
particular, an ENUM server needs to host a huge number of records,
provide high query throughput for both existing and non-existing
records in the server, maintain high query performance under update
load, and answer queries within a tight latency budget. In this
report, we evaluate and compare performance of serving ENUM queries by
three servers, namely BIND, PDNS and Navitas. Our objective is to
answer whether and how these servers can meet the unique performance
requirements of ENUM. Test results show that the ENUM query response
time on our platform has always been on the order of a few
milliseconds or less, so this is likely not a concern. Throughput then
becomes the key. The throughput of BIND degrades linearly as the
record set size grows, so BIND is not suitable for ENUM. PDNS delivers
higher performance than BIND in most cases, while the commercial
Navitas server presents even better ENUM performance than PDNS. Under
our 5M-record set test, Navitas server with its default configuration
consumes one tenth to one sixth the memory of PDNS, achieves six times
higher throughput for existing records and two orders of magnitude
higher throughput for non-existing records than the baseline PDNS
server without caching. The throughput of Navitas is also the highest
among the tested servers when the database is being updated in the
background. We investigated ways to improve PDNS performance. For
example, doubling CPU processing power by putting PDNS and its backend
database in two separate machines can increase PDNS throughput for
existing records by 45% and that for nonexisting records by 40%. Since
PDNS is open source, we also instrumented the source code to obtain a
detailed profile of contributions of various systems components to the
overall latency. We found that when the server is within its normal
load range, the main component of server processing latency is caused
by backend database lookup operations. Excessive number of backend
database lookups is the reason that makes PDNS throughput for
non-existing records its key weakness. We studied using PDNS caching
to reduce the number of database lookups. With a full packet cache and
a modified cache maintenance mechanism, the PDNS throughput for
existing records can be improved by 100%. This brings the value to one
third of its Navitas counterpart. After enabling the PDNS negative
query cache, we improved PDNS throughput for non-existing records to
the level comparable to its throughput for existing records, but this
result is still an order of magnitude lower than the corresponding
value in Navitas. Further improvements of PDNS throughput for
non-existing records will require optimization of related processing
mechanism in its implementation.
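The queries being benchmarked above are ENUM lookups, i.e. E.164 telephone numbers rewritten into DNS domain names. The rewriting rule is standardized (RFC 6116): strip non-digits, reverse the digits, separate them with dots, and append the e164.arpa suffix:

```python
# RFC 6116 mapping from an E.164 number to its ENUM query domain.
def e164_to_enum(number, suffix="e164.arpa"):
    digits = [c for c in number if c.isdigit()]
    return ".".join(reversed(digits)) + "." + suffix

print(e164_to_enum("+1-212-555-1234"))
# -> 4.3.2.1.5.5.5.2.1.2.1.e164.arpa
```

Because every digit becomes its own DNS label, a 5M-number record set produces a very deep, very wide zone, which is what drives the unusual memory and lookup-throughput demands measured in the report.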

We address the problem of specifying concurrent processes that can make local
nondeterministic decisions without affecting global system
behavior---the sequence of events communicated along each
inter-process communication channel. Such nondeterminism can be used
to cope with unpredictable execution rates and communication delays.
Our model
resembles Kahn's, but does not include unbounded buffered
communication, so it is much simpler to reason about and implement.
After formally characterizing these so-called confluent processes, we
propose a collection of operators, including sequencing, parallel, and
our own creation, confluent choice, that guarantee confluence by
construction.
The result is a set of primitive constructs that form the formal basis
of a concurrent programming language for both hardware and software
systems that gives deterministic behavior regardless of the relative
execution rates of the processes. Such a language greatly simplifies
the verification task because any correct implementation of such a
system is guaranteed to have the same behavior, a property rarely
found in concurrent programming environments.

Nondeterminism is a central challenge in most concurrent models of
computation. That programmers must worry about races and other
timing-dependent behavior is a key reason that parallel programming
has not been widely adopted. The SHIM concurrent language,
intended for hardware/software codesign applications, avoids this
problem by providing deterministic (race-free) concurrency, but does
not support automatic parallelization of sequential algorithms.
In this paper, we present a compiler able to parallelize a simple
MATLAB-like language into concurrent SHIM processes. From a
user-provided partitioning of arrays to processes, our compiler
divides the program into coarse-grained processes and schedules and
synthesizes inter-process communication. We demonstrate the
effectiveness of our approach on some image-processing algorithms.

Concurrent programming languages should be a good fit for embedded
systems because they match the intrinsic parallelism of their
architectures and environments. Unfortunately, most concurrent
programming formalisms are prone to races and nondeterminism, despite
the presence of mechanisms such as monitors.
In this paper, we propose SHIM, the core of a concurrent language with
disciplined shared variables that remains deterministic, meaning the
behavior of a program is independent of the scheduling of concurrent
operations. SHIM does not sacrifice power or flexibility to achieve
this determinism. It supports both synchronous and asynchronous
paradigms---loosely and tightly synchronized threads---the dynamic
creation of threads and shared variables, recursive procedures, and
exceptions.
We illustrate our programming model with examples including
breadth-first-search algorithms and pipelines. By construction, they
are race-free. We provide the formal semantics of SHIM and a
preliminary implementation.

The ability to debug woven programs is critical to the adoption of Aspect Oriented Programming (AOP). Nevertheless, many AOP systems lack adequate support for debugging, making it difficult to diagnose faults and understand the program's structure and control flow. We discuss why debugging aspect behavior is hard and how harvesting results from related research on debugging optimized code can make the problem more tractable. We also specify general debugging criteria that we feel all AOP systems should support.
We present a novel solution to the problem of debugging aspect-enabled programs. Our Wicca system is the first dynamic AOP system to support full source-level debugging of woven code. It introduces a new weaving strategy that combines source weaving with online byte-code patching. Changes to the aspect rules, or base or aspect source code are rewoven and recompiled on-the-fly. We present the results of an experiment that show how these features provide the programmer with a powerful interactive debugging experience with relatively little overhead.

Some machine learning applications are intended to learn properties
of data sets where the correct answers are not already known to
human users. It is challenging to test and debug such ML software,
because there is no reliable test oracle. We describe a framework
and collection of tools aimed to assist with this problem. We
present our findings from using the testing framework with three
implementations of an ML ranking algorithm (all of which had bugs).

This paper presents a throughput analysis of log-utility and max-min fairness. Assuming all nodes interfere with each other, completely or partially, log-utility fairness significantly improves total throughput over max-min fairness, since max-min fairness forces all nodes to the same throughput. The improvement is especially large when the effect of cumulative interference from multiple senders cannot be ignored.

While packet capture has been observed in real implementations of wireless devices randomly accessing shared channels, fair rate control algorithms based on accurate channel models that describe the phenomenon have not been developed. In this paper, using a general physical channel model, we develop the equation for the optimal attempt rate that maximizes the aggregate log utility. We use the least squares method to approximate this equation by a linear function of the attempt rate. Our analysis of the approximation error shows that the linear function obtained is close to the original, with a coefficient of determination above 0.9.
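The least-squares step can be illustrated generically. The curve below is an arbitrary stand-in, not the paper's attempt-rate equation: fit a line to a nonlinear function of the attempt rate and check the quality of the fit with the coefficient of determination R²:

```python
# Closed-form simple linear regression plus R^2, stdlib only.
import math

def linear_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

xs = [i / 20 for i in range(1, 20)]       # attempt rates in (0, 1)
ys = [math.log(1 + x) for x in xs]        # a stand-in nonlinear response
slope, intercept, r2 = linear_fit(xs, ys)
print(round(r2, 2))
```

A near-linear response yields R² close to 1, which is the sense in which the paper's linear approximation is "close enough" to the original equation.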

In a previous paper, we developed a general framework for establishing
tractability and strong tractability for quasilinear multivariate
problems in the worst case setting. One important example of such a
problem is the solution of the heat equation $u_t = \Delta u - qu$ in
$I^d\times(0,T)$, where $I$ is the unit interval and $T$ is a maximum
time value. This problem is to be solved subject to homogeneous
Dirichlet boundary conditions, along with the initial conditions
$u(\cdot,0)=f$ over~$I^d$. The solution~$u$ depends linearly on~$f$,
but nonlinearly on~$q$. Here, both $f$ and~$q$ are $d$-variate
functions from a reproducing kernel Hilbert space with finite-order
weights of order~$\omega$. This means that, although~$d$ can be
arbitrary large, $f$ and~$q$ can be decomposed as sums of functions of
at most $\omega$~variables, with $\omega$ independent of~$d$.
In this paper, we apply our previous general results to the heat
equation. We study both the absolute and normalized error criteria.
For either error criterion, we show that the problem is \emph{tractable}. That
is, the number of evaluations of $f$ and~$q$ needed to obtain an
$\varepsilon$-approximation is polynomial in~$\varepsilon$ and~$d$,
with the degree of the polynomial depending linearly on~$\omega$. In
addition, we want to know when the problem is \emph{strongly
tractable}, meaning that the dependence is polynomial only
in~$\varepsilon$, independently of~$d$. We show that if the sum of
the weights defining the weighted reproducing kernel Hilbert space is
uniformly bounded in~$d$ and the integral of the univariate kernel is
positive, then the heat equation is strongly tractable.

In this paper, we present a new class of volumetric displays that can
be used to display 3D objects. The basic approach is to trade-off the
spatial resolution of a digital projector (or any light engine) to gain
resolution in the third dimension. Rather than projecting an image
onto a 2D screen, a depth-coded image is projected onto a 3D cloud
of passive optical scatterers. The 3D point cloud is realized using a
technique called Laser Induced Damage (LID), where each scatterer
is a physical crack embedded in a block of glass or plastic. We show
that when the point cloud is randomized in a specific manner, a very
large fraction of the points are visible to the viewer irrespective of
his/her viewing direction. We have developed an orthographic projection
system that serves as the light engine for our volumetric displays.
We have implemented several types of point clouds, each one
designed to display a specific class of objects. These include a cloud
with uniquely indexable points for the display of true 3D objects, a
cloud with an independently indexable top layer and a dense extrusion
volume to display extruded objects with arbitrarily textured top
planes and a dense cloud for the display of purely extruded objects.
In addition, we show how our approach can be used to extend simple
video games to 3D. Finally, we have developed a 3D avatar in which
videos of a face with expression changes are projected onto a static
surface point cloud of the face.

User-defined preferences allow personalized ranking of query results.
A user provides a declarative specification of his/her preferences, and
the system is expected to use that specification to give more
prominence to preferred answers. We study constraint formalisms for
expressing user preferences as base facts in a partial order. We
consider a language that allows comparison and a limited form of
arithmetic, and show that the transitive closure computation required
to complete the partial order terminates. We consider various ways of
composing partial orders from smaller pieces, and provide results on
the size of the resulting transitive closures. We introduce the
notion of ``covering composition,'' which solves some semantic problems
apparent in previous notions of composition. Finally, we show how
preference queries within our language can be supported by suitable
index structures for efficient evaluation over large data sets. Our
results provide guidance about when complex preferences can be
efficiently evaluated, and when they cannot.
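The completion step discussed above, turning base preference facts into a partial order, is a transitive closure computation. A minimal sketch (a naive Warshall-style fixpoint, fine for small orders; the constraint language and index structures in the paper go well beyond this):

```python
# Complete a set of base preference facts (a, b) meaning "a preferred to b"
# into its transitive closure by repeated joining until a fixpoint.
def transitive_closure(pairs):
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

prefs = {("economy", "business"), ("business", "first")}
closure = transitive_closure(prefs)
print(sorted(closure))  # includes ("economy", "first") by transitivity
```

The paper's termination and size results concern exactly this closure when the base facts involve comparisons and limited arithmetic rather than plain constants.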

VoIP (Voice over IP) services are using the Internet infrastructure to
enable new forms of communication and collaboration. A growing number
of VoIP service providers such as Skype, Vonage, Broadvoice, as well
as many cable services are using the Internet to offer telephone
services at much lower costs. However, VoIP services rely on the
user's Internet connection, and this can often translate into lower
quality communication. Overlay networks offer a potential solution to
this problem by improving the default Internet routing and overcome
failures.
To assess the feasibility of using overlays to improve VoIP
on the Internet, we have conducted a detailed experimental study to
evaluate the benefits of using an overlay on PlanetLab nodes for
improving voice communication connectivity and performance around the
world. Our measurements demonstrate that an overlay architecture
can significantly improve VoIP communication across most regions
and provide its greatest benefit for locations with poorer default
Internet connectivity. We explore overlay topologies and show that a
small number of well-connected intermediate nodes is sufficient to
improve VoIP performance. We show that there is significant
variation over time in the best overlay routing paths and argue for
the need for adaptive routing to account for this variation to deliver
the best performance.

Precomputed radiance transfer (PRT) generates impressive images with complex illumination, materials and shadows with real-time interactivity. These methods separate the
scene’s static and dynamic components allowing the static portion to be computed as a
preprocess. In this work, we hold geometry static and allow either the lighting or BRDF
to be dynamic. To achieve real-time performance, both static and dynamic components
are compressed by exploiting spatial and angular coherence. Temporal coherence of the
dynamic component from frame to frame is an important, but unexplored additional form
of coherence. In this thesis, we explore temporal coherence of two forms of all-frequency
PRT: BRDF material editing and lighting design. We develop incremental methods for
approximating the differences in the dynamic component between consecutive frames. For
BRDF editing, we find that a pure incremental approach allows quick convergence to an
exact solution with smooth real-time response.
For relighting, we observe vastly differing degrees of temporal coherence across levels of
the lighting’s wavelet hierarchy. To address this, we develop an algorithm that treats each
level separately, adapting to available coherence. The proposed methods are orthogonal to
other forms of coherence, and can be added to almost any PRT algorithm with minimal
implementation, computation, or memory overhead. We demonstrate our technique within
existing codes for nonlinear wavelet approximation, changing view with BRDF factorization,
and clustered PCA. Exploiting temporal coherence of dynamic lighting yields a 3×–4× performance improvement, e.g., all-frequency effects are achieved with 30 wavelet coefficients,
about the same as low-frequency spherical harmonic methods. Distinctly, our algorithm
smoothly converges to the exact result within a few frames of the lighting becoming static.

Software faults and vulnerabilities continue to present significant
obstacles to achieving reliable and secure software. In an effort to
overcome these obstacles, systems often incorporate self-monitoring
and self-healing functionality. Our hypothesis is that internal
monitoring is not an effective long-term strategy. However,
monitoring mechanisms that are completely external lose the advantage
of application-specific knowledge available to an inline monitor. To
balance these tradeoffs, we present the design of VxF, an environment
where both supervision and automatic remediation can take place by
speculatively executing "slices" of an application. VxF introduces
the concept of an endolithic kernel by providing execution as
an operating system service: execution of a process slice takes place
inside a kernel thread rather than directly on the system
microprocessor.

With the increased use of botnets and other techniques to obfuscate attackers' command-and-control centers, Distributed Intrusion Detection Systems (DIDS) that focus on attack source IP addresses or other header information can only portray a limited view of distributed scans and attacks. Packet payload sharing techniques hold far more promise, as they can convey exploit vectors and/or malcode used upon successful exploit of a target system, irrespective of obfuscated source addresses. However, payload sharing has had minimal success due to regulatory or business-based privacy concerns of transmitting raw or even sanitized payloads. The currently accepted form of content exchange has been limited to the exchange of known-suspicious content, e.g., packets captured by honeypots; however, signature generation assumes that each site receives enough traffic in order to correlate a meaningful set of payloads from which common content can be derived, and places fundamental and computationally stressful requirements on signature generators that may miss particularly stealthy or carefully-crafted polymorphic malcode.
Instead, we propose a new approach to enable the sharing of suspicious payloads via privacy-preserving technologies. We detail the work we have done with two example payload anomaly detectors, PAYL and Anagram, to support generalized payload correlation and signature generation without releasing identifiable payload data and without relying on single-site signature generation. We present preliminary results of our approaches and suggest how such deployments may practically be used for not only cross-site, but also cross-domain alert sharing and its implications for profiling threats.

A novel CPU scheduling policy is designed and implemented. It is a configurable policy in the sense that a tunable parameter is provided to change its behavior. With different settings of the parameter, this policy can emulate the first-come first-serve, processor sharing, or feedback policies, as well as different levels of their mixtures. This policy is implemented in the Linux kernel as a replacement for the default scheduler. The drastic changes in behavior as the parameter varies are analyzed and simulated. Its performance is measured on real systems using workload generators and benchmarks.
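The abstract does not specify the tunable parameter, so the sketch below is an assumption rather than the paper's design: it uses a time quantum as the knob, where an infinite quantum degenerates to first-come first-serve and a small quantum approximates processor sharing via round-robin.

```python
from collections import deque

def schedule(jobs, quantum):
    """Simulate a single-CPU run queue with a tunable time quantum.

    jobs: list of (name, service_time); all assumed to arrive at time 0.
    quantum: slice length; float('inf') gives FCFS, while a small value
    approximates processor sharing (round-robin).
    Returns {name: completion_time}.
    """
    queue = deque(jobs)
    clock = 0.0
    done = {}
    while queue:
        name, need = queue.popleft()
        run = min(need, quantum)
        clock += run
        need -= run
        if need > 1e-12:
            queue.append((name, need))   # preempted: back of the queue
        else:
            done[name] = clock
    return done
```

With `quantum=float('inf')` jobs finish strictly in arrival order; shrinking the quantum lets short jobs overtake long ones, mimicking processor sharing.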

The shading in a scene depends on a combination of
many factors---how the lighting varies spatially across a surface, how
it varies along different directions, the geometric curvature and
reflectance properties of objects, and the locations of soft shadows.
In this paper, we conduct a complete first order or gradient analysis of
lighting, shading and shadows, showing how each factor separately
contributes to scene appearance, and when it is important. Gradients
are well suited for analyzing the intricate combination of appearance
effects, since each gradient term corresponds directly to variation in
a specific factor. First, we show how the spatial and
directional gradients of the light field change as light interacts
with curved objects. This extends the recent frequency analysis of
Durand et al. to gradients, and has many advantages for operations,
like bump-mapping, that are difficult to analyze in the
Fourier domain. Second, we consider the individual terms responsible
for shading gradients, such as lighting variation, convolution with
the surface BRDF, and the object's curvature. This analysis indicates
the relative importance of various terms, and shows precisely how they
combine in shading. As one practical application, our theoretical
framework can be used to adaptively sample images in high-gradient
regions for efficient rendering. Third, we understand the effects of
soft shadows, computing accurate visibility gradients. We generalize
previous work to arbitrary curved occluders, and develop a
local framework that is easy to integrate with conventional
ray-tracing methods. Our visibility gradients can be directly used in
practical gradient interpolation methods for efficient rendering.
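As a toy illustration of the adaptive-sampling application, the sketch below spends a fixed sample budget where the image gradient magnitude is largest. It uses plain finite differences on a rendered image as a stand-in; the paper derives the shading and visibility gradients analytically.

```python
def adaptive_samples(image, budget):
    """Pick `budget` sample locations where the finite-difference
    gradient magnitude is largest. `image` is a 2D list of scalars."""
    h, w = len(image), len(image[0])

    def grad_mag(y, x):
        # Central differences, clamped at the borders.
        gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
        gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
        return (gx * gx + gy * gy) ** 0.5

    cells = [(grad_mag(y, x), (y, x)) for y in range(h) for x in range(w)]
    cells.sort(reverse=True)               # highest-gradient cells first
    return [pos for _, pos in cells[:budget]]
```

On an image containing a sharp edge, the returned positions cluster along the edge, which is exactly where extra shading samples pay off.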

The increasing sophistication of software attacks has created the need
for increasingly finer-grained intrusion and anomaly detection
systems, both at the network and the host level. We believe that the
next generation of defense mechanisms will require a much more
detailed dynamic analysis of application behavior than is currently
done. We also note that the same type of behavior analysis is needed
by the current embryonic attempts at self-healing systems. Because
such mechanisms are currently perceived as too expensive in terms of
their performance impact, questions relating to the feasibility and
value of such analysis remain unexplored and unanswered.
We present a new mechanism for profiling the behavior space of an
application by analyzing all function calls made by the process,
including regular functions and library calls, as well as system
calls. We derive behavior from aspects of both control and data flow.
We show how to build and check profiles that contain this information
at the binary level -- that is, without making changes to the
application's source, the operating system, or the compiler. This
capability makes our system, Lugrind, applicable to a variety of
software, including COTS applications. Profiles built for the
applications we tested can predict behavior with 97% accuracy given a
context window of 15 functions. Lugrind demonstrates the
feasibility of combining binary-level behavior profiling with
detection and automated repair.

In this paper, we present Anagram, a content anomaly detector that
models a mixture of high-order n-grams (n > 1) designed to detect
anomalous and "suspicious" network packet payloads. By using
higher-order n-grams, Anagram can detect significant anomalous byte
sequences and generate robust signatures of validated malicious
packet content. The Anagram content models are implemented using
highly efficient Bloom filters, reducing space requirements and
enabling privacy-preserving cross-site correlation. The sensor models
the distinct content flow of a network or host using a
semi-supervised training regimen. Previously known exploits, extracted
from the signatures of an IDS, are likewise modeled in a Bloom filter
and are used during training as well as detection time. We
demonstrate that Anagram can identify anomalous traffic with high
accuracy and low false positive rates. Anagram's high-order n-gram
analysis technique is also resilient against simple mimicry attacks
that blend exploits with "normal"-appearing byte padding, such as the
blended polymorphic attack recently demonstrated in [1]. We discuss
randomized n-gram models, which further raise the bar and make it
more difficult for attackers to build precise packet structures to
evade Anagram even if they know the distribution of the local site
content flow. Finally, Anagram's speed and high detection rate make
it valuable not only as a standalone sensor, but also as a network
anomaly flow classifier in an instrumented fault-tolerant host-based
environment; this enables significant cost amortization and the
possibility of a "symbiotic" feedback loop that can improve accuracy
and reduce false positive rates over time.
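A minimal sketch of the Bloom-filter n-gram model: hash every byte n-gram of training traffic into a bit array, then score a packet by the fraction of its n-grams never seen in training. The sizes and two-position hashing here are illustrative assumptions, not Anagram's actual parameters.

```python
import hashlib

class NGramModel:
    def __init__(self, n=5, bits=1 << 20):
        self.n, self.bits = n, bits
        self.filter = bytearray(bits // 8)   # bit array acting as the filter

    def _positions(self, gram):
        # Two positions per n-gram, standing in for a real Bloom
        # filter's k independent hash functions.
        digest = hashlib.sha256(gram).digest()
        for off in (0, 8):
            yield int.from_bytes(digest[off:off + 8], "big") % self.bits

    def train(self, payload):
        for i in range(len(payload) - self.n + 1):
            for p in self._positions(payload[i:i + self.n]):
                self.filter[p // 8] |= 1 << (p % 8)

    def score(self, payload):
        """0.0 = every n-gram was seen in training; 1.0 = none were."""
        total = unseen = 0
        for i in range(len(payload) - self.n + 1):
            total += 1
            seen = all((self.filter[p // 8] >> (p % 8)) & 1
                       for p in self._positions(payload[i:i + self.n]))
            unseen += not seen
        return unseen / total if total else 0.0
```

A payload identical to the training traffic scores 0.0, while a byte sequence never observed (e.g. a NOP sled) scores near 1.0, modulo the filter's small false-positive rate.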

Many current systems security research efforts focus on mechanisms for
Intrusion Prevention and Self-Healing Software. Unfortunately, such
systems find it difficult to gain traction in many deployment
scenarios. For self-healing techniques to be realistically employed,
system owners and administrators must have enough confidence in the
quality of a generated fix that they are willing to allow its
automatic deployment.
In order to increase the level of confidence in these systems, the efficacy of a 'fix' must be tested and validated after it
has been automatically developed, but before it is actually
deployed. Due to the nature of attacks, such verification must proceed
automatically. We call this problem Automatic Repair Validation
(ARV). As a way to illustrate the difficulties faced by ARV, we
propose the design of a system, Bloodhound, that tracks and
stores malicious network flows for later replay in the validation
phase for self-healing software.

This report presents an approach to automation of a protein
crystallography task called streak seeding. The approach is based
on novel and unique custom-designed silicon microtools, which we
experimentally verified to produce results similar to the results
from traditionally used boar bristles. The advantage to using
silicon is that it allows the employment of state-of-the-art
micro-electro-mechanical-systems (MEMS) technology to produce
microtools of various shapes and sizes, and that it is rigid and can
be easily adopted as an accurately calibrated end-effector on a
microrobotic system. A working prototype of an automatic streak
seeding system is presented, which has been successfully applied
for protein crystallization.

Collaborative security is a promising solution to many types of security
problems. Organizations and individuals often have a limited amount of
resources to detect and respond to the threat of automated attacks.
Enabling them to take advantage of the resources of their peers by sharing information related to such threats is a major step towards automating defense systems.
In particular, comment spam posted on blogs as a way for attackers to
do Search Engine Optimization (SEO) is a major annoyance. Many measures
have been proposed to thwart such spam, but all such measures are currently enacted and operate within one administrative domain. We propose and implement a system for cross-domain information sharing to improve the quality and speed of defense against such spam.

In this paper, we consider using angle of arrival information
(bearing) for network localization and control in two different
fields of multi-agent systems: (i) wireless sensor networks; (ii)
robot networks. The essential property we require in this paper is
that a node can infer heading information from its neighbors. We
address the uniqueness of network localization solutions by the
theory of globally rigid graphs. We show that while the parallel
rigidity problem for formations with bearings is isomorphic to the
distance case, the global rigidity of the formation is simpler (in
fact identical to the simpler rigidity case) for a network with
bearings, compared to formations with distances. We provide the
conditions of localization for networks in which the neighbor
relationship is not necessarily symmetric.

A Theory of Spherical Harmonic Identities for BRDF/Lighting Transfer and Image Consistency

Dhruv Mahajan, Ravi Ramamoorthi, Brian Curless

2006-03-17

We develop new mathematical results based on the spherical harmonic
convolution framework for reflection from a curved surface. We derive novel identities, which are the angular frequency domain analogs to common spatial domain invariants such as reflectance ratios. They apply in a number of canonical cases, including single and multiple images of objects under the same and different lighting conditions. One important case we consider is two different glossy objects in two different lighting environments. While this paper is primarily theoretical,
it has the potential to lay the mathematical foundations for two important practical applications. First, we can develop more general algorithms for inverse rendering problems, which can directly relight and change material properties by transferring the BRDF or lighting from another object or illumination. Second, we can check the consistency of an image, to detect tampering or image splicing.

During a layer-3 handoff, address acquisition via
DHCP is often the dominant source of handoff delay, duplicate
address detection (DAD) being responsible for most of the delay.
We propose a new DAD algorithm, passive DAD (pDAD), which
we show to be effective yet introduces only a few milliseconds of
delay. Unlike traditional DAD, pDAD also detects the unauthorized
use of an IP address before it is assigned to a DHCP client.
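The core idea can be sketched as a table maintained by passively observing ARP traffic, so that the duplicate check at lease time is a lookup rather than an active probe. The class and method names below are illustrative, not the paper's design.

```python
class PassiveDAD:
    """Sketch of passive duplicate-address detection: a listener on the
    subnet learns which addresses are in use from ARP traffic, so no
    probe (and no probe delay) is needed when a lease is granted."""

    def __init__(self):
        self.in_use = {}                    # ip -> mac last seen claiming it

    def observe_arp(self, ip, mac):
        # Called for every ARP request/reply sniffed on the subnet.
        self.in_use[ip] = mac

    def safe_to_assign(self, ip, mac):
        # Safe if unused, or already bound to the requesting client itself.
        owner = self.in_use.get(ip)
        return owner is None or owner == mac

    def unauthorized_use(self, ip, leased_to):
        # Flags use of an address by any claimant other than its lessee,
        # including addresses that were never leased at all.
        owner = self.in_use.get(ip)
        return owner is not None and owner != leased_to
```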

A pyramid evaluation dataset was created for DUC 2003 in order to
compare results with DUC 2005, and to provide an independent test
of the evaluation metric. The main differences between
DUC 2003 and 2005 datasets pertain to the document length, cluster
sizes, and model summary length. For five of the DUC 2003 document sets,
two pyramids each were
constructed by annotators working independently. Scores of the same
peer using different pyramids were highly correlated. Sixteen systems
were evaluated on eight document sets.
Analysis of variance using Tukey's Honest Significant Difference
method showed significant differences among all eight document sets,
and more significant differences among the sixteen systems than for DUC 2005.

This paper is concerned with information structures used in rigid
formations of autonomous agents that have leader-follower
architecture. The focus of the paper is on sensor/network
topologies to secure control of rigidity. This paper extends the
previous rigidity-based approaches for formations with symmetric
neighbor relations to include formations with leader-follower
architecture. We provide necessary and sufficient conditions for
rigidity of directed formations, with or without cycles. We
present the directed Henneberg constructions as a sequential
process for all guide rigid digraphs. We refine those results for
acyclic formations, where guide rigid formations have a simple
construction. The analysis in this paper confirms that acyclicity
is not a necessary condition for stable rigidity. Cycles are
not the real problem; rather, the lack of guide freedom is the
reason why cycles have been seen as a problematic topology.
Topologies that have cycles within a larger architecture can be
stably rigid, and we conjecture that all guide rigid formations
are stably rigid for internal control. We analyze how the external
control of guide agents can be integrated into stable rigidity of
a larger formation. The analysis in the paper also confirms the
inconsistencies that result from noisy measurements in redundantly
rigid formations. An algorithm given in the paper establishes a
sequential way of determining the directions of links from a given
undirected rigid formation so that the necessary and sufficient
conditions are fulfilled.

Peer-to-peer Internet telephony using the Session Initiation Protocol
(P2P-SIP) can exhibit two different architectures: an existing P2P
network can be used as a replacement for lookup and updates, or a P2P
algorithm can be implemented using SIP messages. In this paper, we
explore the first architecture using the OpenDHT service as an
externally managed P2P network. We provide design details such as
encryption and signing using pseudo-code and examples to provide
P2P-SIP for various deployment components such as P2P client, proxy
and adaptor, based on our implementation. The design can also be
used with other distributed hash tables (DHTs).
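A minimal sketch of storing a signed location record under a hash of the address-of-record, the basic put/get pattern such a design rests on. The field names and the HMAC shared-secret signing are assumptions for illustration, not the paper's wire format or key scheme.

```python
import hashlib
import hmac
import json

def dht_key(aor):
    # Registrations are stored under a hash of the address-of-record,
    # e.g. "sip:alice@example.com" -> 160-bit DHT key.
    return hashlib.sha1(aor.encode()).digest()

def make_record(aor, contact, secret):
    """Build an illustrative signed location record for a P2P-SIP
    registration: canonical JSON body plus an HMAC over it."""
    body = json.dumps({"aor": aor, "contact": contact},
                      sort_keys=True).encode()
    sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return {"key": dht_key(aor).hex(), "body": body.decode(), "sig": sig}

def verify_record(record, secret):
    # A watcher or proxy recomputes the HMAC before trusting the contact.
    expect = hmac.new(secret, record["body"].encode(),
                      hashlib.sha256).hexdigest()
    return hmac.compare_digest(expect, record["sig"])
```

Any tampering with the stored body, say rewriting the contact, makes verification fail, which is the property the signing step is there to provide.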

Packet-switched networks-on-chip (NOC) have been advocated as the solution to the challenge of organizing efficient and reliable communication structures among the components of a system-on-chip (SOC). A critical issue in designing a NOC is to determine its topology given the set of point-to-point communication requirements among these
components. We present a novel approach to on-chip communication synthesis that is based on the iterative combination of two efficient computational steps: (1) an application of the k-Median algorithm to coarsely determine the global communication structure (which may turn out not to be a network after all), and (2) a variation of the shortest-path algorithm to finely tune the data flows on the communication channels. The application of our method to case
studies taken from the literature shows that we can automatically synthesize optimal NOC topologies for multi-core on-chip processors; it also offers new insights into why NOCs are not necessarily a value proposition for some classes of application-specific SOCs.

Routing protocols rely on the cooperation of nodes in the network to
both forward packets and to select the forwarding routes.
There have been several instances in which an entire
network's routing collapsed simply because a seemingly insignificant
set of nodes reported erroneous routing information to their
neighbors. It may have been possible for other nodes to trigger an automated response
and prevent the problem by analyzing received routing information for
inconsistencies that revealed the errors. Our theoretical study seeks to
understand when nodes can detect the existence of errors in the
implementation of route
selection elsewhere in the network through monitoring their own
routing states for inconsistencies. We start by constructing a
methodology, called Strong-Detection, that helps answer the
question. We then apply Strong-Detection to three classes of routing
protocols: distance-vector, path-vector, and link-state. For each
class, we derive low-complexity, self-monitoring algorithms that use
the routing state created by these routing protocols to identify any
detectable anomalies. These algorithms are then used
to compare and contrast the self-monitoring power these various
classes of protocols possess. We also study the trade-off between
their state-information complexity and ability to identify routing anomalies.
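For the distance-vector case, one natural self-monitoring invariant is the Bellman optimality condition: a node's own distance to each destination must equal the minimum over its neighbors of link cost plus the neighbor's advertised distance. The check below illustrates that invariant only; it is not a reproduction of the paper's Strong-Detection algorithms.

```python
def dv_anomalies(my_dist, neighbors):
    """Return destinations whose local distance-vector state violates
    d(dest) = min over neighbors n of cost(n) + d_n(dest).

    my_dist:   {dest: this node's advertised distance}
    neighbors: {nbr: (link_cost, {dest: nbr's advertised distance})}
    """
    bad = []
    for dest, d in my_dist.items():
        if d == 0:          # the node itself; nothing to check
            continue
        best = min(
            (cost + vec[dest]
             for cost, vec in neighbors.values() if dest in vec),
            default=float("inf"),
        )
        if d != best:       # inconsistency: someone fed us bad state
            bad.append(dest)
    return sorted(bad)
```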

With the growth of presence based services, it is important to securely manage and distribute sensitive presence information such as user location. We survey techniques that are used for security and privacy of presence information. In particular, we describe the SIMPLE based presence specific authentication, integrity and confidentiality. We also discuss the IETF’s common policy for geo-privacy, presence authorization for presence information privacy and distribution of different levels of presence information to different watchers. Additionally, we describe an open problem of getting the aggregated presence from the trusted server without the server knowing the presence information, and propose a solution. Finally, we discuss denial of service attacks on the presence system and ways to mitigate them.

Presence is an important enabler for communication in Internet telephony systems. Presence-based services depend on accurate and timely delivery of presence information. Hence, presence systems need to be appropriately dimensioned to meet the growing number of users, varying number of devices as presence sources, the rate at which they update presence information to the network and the rate at which network distributes the user’s presence information to the watchers. SIMPLEstone is a set of metrics for benchmarking the performance of presence systems based on SIMPLE. SIMPLEstone benchmarks a presence server by generating requests based on a work load specification. It measures server capacity in terms of request handling capacity as an aggregate of all types of requests as well as individual request types. The benchmark treats different configuration modes in which presence server interoperates with the Session Initiation protocol (SIP) server as one block.

We present Grouped Distributed Queues (GDQ), the first proportional share scheduler for multiprocessor systems that, by using a distributed queue architecture, scales well with a large number of processors and processes. GDQ achieves accurate proportional fairness scheduling with only O(1) scheduling overhead.
GDQ takes a novel approach to distributed queuing: instead of creating per-processor queues that need to be constantly balanced to achieve any measure of proportional sharing fairness, GDQ uses a simple grouping strategy to organize processes into groups based on similar processor time allocation rights, and then assigns processors to groups based on aggregate group shares. Group membership of processes is static, and fairness is achieved by dynamically migrating processors among groups. The set of processors working on a group use simple, low-overhead round-robin queues, while processor reallocation among groups is achieved using a new multiprocessor adaptation of the well-known Weighted Fair Queuing algorithm. By commoditizing processors and decoupling their allocation from process scheduling, GDQ provides, with only constant scheduling cost, fairness within a constant of the ideal generalized processor sharing model for process weights with a fixed upper bound.
We have implemented GDQ in Linux and measured its performance. Our experimental results show that GDQ has low overhead and scales well with the number of processors.
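The processor-to-group step can be pictured with a much simpler stand-in than GDQ's WFQ adaptation: apportion whole processors to groups in proportion to aggregate share by largest remainder. This is an illustration of the commoditized-processor idea, not the paper's algorithm.

```python
def allocate_processors(group_shares, nprocs):
    """Apportion `nprocs` whole processors among groups in proportion to
    their aggregate shares, using the largest-remainder method.

    group_shares: {group: aggregate share weight}
    Returns {group: processor count}; the counts sum to nprocs.
    """
    total = sum(group_shares.values())
    quotas = {g: nprocs * s / total for g, s in group_shares.items()}
    alloc = {g: int(q) for g, q in quotas.items()}        # floor of quota
    leftover = nprocs - sum(alloc.values())
    # Hand remaining processors to the largest fractional remainders.
    for g in sorted(quotas, key=lambda g: quotas[g] - alloc[g],
                    reverse=True)[:leftover]:
        alloc[g] += 1
    return alloc
```

Each group then runs its own cheap round-robin queue over the processors it was granted, which is where the O(1) scheduling cost comes from.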

While web communications are increasingly protected by transport
layer cryptographic operations (SSL/TLS), there are many
situations where even the communications infrastructure provider
cannot be trusted. The end-to-end (E2E) encryption of data
becomes increasingly important in these trust models to protect
the confidentiality and integrity of the data against snooping
and modification by the communications provider.
We introduce W3Bcrypt, an extension to the Mozilla Firefox web
platform that enables application-level cryptographic protection
for web content. In effect, we view cryptographic operations as
a type of style to be applied to web content along with layout
and coloring operations. Among the main benefits of using
encryption as a stylesheet are (a) reduced workload on a web
server, (b) targeted content publication, and (c) greatly
increased privacy. This paper discusses our implementation for
Firefox, but the core ideas are applicable to most current
browsers.

The need for self-healing software to respond with a
reactive, proactive or preventative action as a result of
changes in its environment has added the non-functional
requirement of adaptation to the list of facilities expected
in self-managing systems. The adaptations we are concerned
with assist with problem detection, diagnosis and
remediation. Many existing computing systems do not include
such adaptation mechanisms; as a result, these systems
either need to be re-designed to include them or there
needs to be a mechanism for retrofitting these mechanisms.
The purpose of the adaptation mechanisms is to ease the
job of the system administrator with respect to managing
software systems. This paper introduces Kheiron, a framework
for facilitating adaptations in running programs in a
variety of execution environments without requiring the redesign
of the application. Kheiron manipulates compiled
C programs running in an unmanaged execution environment
as well as programs running in Microsoft’s Common
Language Runtime and Sun Microsystems’ Java Virtual Machine.
We present case-studies and experiments that demonstrate
the feasibility of using Kheiron to support self-healing
systems. We also describe the concepts and techniques used
to retro-fit adaptations onto existing systems in the various
execution environments.

Most current approaches to self-healing software (SHS) suffer
from semantic incorrectness of the response mechanism. To
support SHS, we propose Smart Error Virtualization (SEV),
which treats functions as transactions but provides a way to
guide the program state and remediation to be a more correct
value than previous work.
We perform runtime binary-level profiling on unmodified
applications to learn both good return values and error return
values (produced when the program encounters "bad" input).
The goal is to "learn from mistakes" by converting malicious
input to the program's notion of "bad" input.
We introduce two implementations of this system that support
three major uses: function profiling for regression testing,
function profiling for host-based anomaly detection
(environment-specialized fault detection), and function profiling
for automatic attack remediation via SEV. Our systems do not
require access to the source code of the application to enact
a fix. Finally, this paper is, in part, a critical examination of
error virtualization in order to shed light on how to approach
semantic correctness.

We extend the kernel based learning framework to learning from linear
functionals, such as partial derivatives.
The learning problem is formulated as
a generalized regularized risk minimization problem, possibly
involving several different functionals.
We show how to reduce this to
conventional kernel based learning methods
and explore a specific application in Computational
Condensed Matter Physics.

A Lower Bound for the Sturm-Liouville Eigenvalue Problem on a Quantum Computer

Arvid J. Bessen

2005-12-14

We study the complexity of approximating the smallest eigenvalue of a univariate Sturm-Liouville problem on a quantum computer. This general problem includes the special case of solving a one-dimensional Schroedinger equation with a given potential for the ground state energy.
The Sturm-Liouville problem depends on a function q, which, in the case of the Schroedinger equation, can be identified with the potential function V. Recently Papageorgiou and Wozniakowski proved that quantum computers achieve an exponential reduction in the number of queries over the number needed in the classical worst-case and randomized settings for smooth functions q. Their method uses the (discretized) unitary propagator and arbitrary powers of it as a query ("power queries"). They showed that the Sturm-Liouville equation can be solved with O(log(1/e)) power queries, while the number of queries in the worst-case and randomized settings on a classical computer is polynomial in 1/e. This proves that a quantum computer with power queries achieves an exponential reduction in the number of queries compared to a classical computer.
In this paper we show that the number of queries in Papageorgiou's and Wozniakowski's algorithm is asymptotically optimal. In particular we prove a matching lower bound of log(1/e) power queries, therefore showing that log(1/e) power queries are sufficient and necessary. Our proof is based on a frequency analysis technique, which examines the probability distribution of the final state of a quantum algorithm and the dependence of its Fourier transform on the input.

Temporal event correlation is essential to realizing self-managing distributed systems. Autonomic controllers often require
that events be correlated across multiple components using rule patterns with timer-based transitions, e.g., to detect denial
of service attacks and to warn of staging problems with business critical applications. This short paper discusses automatic
adjustment of timer values for event correlation rules, in particular compensating for the variability of event propagation
delays due to factors such as contention for network and server resources. We describe a corresponding Management Station
architecture and present experimental studies on a testbed system that suggest that this approach can produce results at least
as good as an optimal fixed setting of timer values.

The number of qubits used by a quantum algorithm will be a crucial computational resource
for the foreseeable future. We show how to obtain the classical query complexity for continuous
problems. We then establish a simple formula for a lower bound on the qubit complexity in terms
of the classical query complexity.

Large organizations are deploying ever-increasing numbers of networked compute devices, from utilities installing smart controllers on electricity distribution cables, to the military giving PDAs to soldiers, to corporations putting PCs on the desks of employees. These computers are often far more capable than is needed to accomplish their primary task, whether it be guarding a circuit breaker, displaying a map, or running a word processor. These devices would be far more useful if they had some awareness of the world around them: a controller that resists tripping a switch, knowing that it would set off a cascade failure, a PDA that warns its owner of imminent danger, a PC that exchanges reports of suspicious network activity to its peers to identify stealthy computer crackers.
In order to provide these higher-level services, the devices need a model of their environment. The controller needs a model of the distribution grid, the PDA needs a model of the battlespace, and the PC needs a model of the network and of normal network and user behavior. Unfortunately, not only might models such as these require substantial computational resources, but generating and updating them is even more demanding. Model-building algorithms tend to be bad in three ways: requiring large amounts of CPU and memory to run, needing large amounts of data from the outside to stay up to date, and running so slowly that they can’t keep up with any fast changes in the environment that might occur.
We can solve these problems by reducing the scope of the model to the immediate locale of the device, since reducing the size of the model makes the problem of model generation much more tractable. But such models are also much less useful, having no knowledge of the wider system.
This thesis proposes a better solution to this problem called Level of Detail, after the computer graphics technique of the same name. Instead of simplifying the representation of distant objects, however, we simplify less-important data. Compute devices in the system receive streams of data that is a mixture of detailed data from devices that directly affect them and data summaries (aggregated data) from less directly influential devices. The degree to which the data is aggregated (i.e., how much it is reduced) is determined by calculating an influence metric between the target device and the remote device. The smart controller thus receives a continuous stream of raw data from the adjacent transformer, but only an occasional small status report summarizing all the equipment in a neighborhood in another part of the city.
This thesis describes the data distribution system, the aggregation functions, and the influence metrics that can be used to implement such a system. I also describe my current progress towards establishing a test environment and validating the concepts, and describe the next steps in the research plan.
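The influence-driven mix of raw and aggregated data can be sketched very compactly. The influence function and threshold below are placeholders; the thesis defines its own metrics and aggregation functions.

```python
def build_feed(target, devices, influence, raw_threshold=0.5):
    """Assemble `target`'s data feed: full raw streams from devices whose
    influence on the target exceeds a threshold, and compact summaries
    from everyone else.

    influence: callable (target, device) -> score in [0, 1].
    Returns {device: "raw" | "summary"}.
    """
    return {dev: ("raw" if influence(target, dev) >= raw_threshold
                  else "summary")
            for dev in devices}
```

The smart controller thus gets raw data from the adjacent transformer but only a summary of a distant neighborhood, keeping its local model small enough to build and update on modest hardware.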

We view a dataset of points or samples as having an underlying, yet
unspecified, tree structure and exploit this assumption in learning
problems. Such a tree structure assumption is equivalent to treating a
dataset as being tree dependent identically distributed or tdid and
preserves exchangeability. This extends traditional iid assumptions
on data since each datum can be sampled sequentially after being
conditioned on a parent. Instead of hypothesizing a single best tree
structure, we infer a richer Bayesian posterior distribution over tree
structures from a given dataset. We compute this posterior over
(directed or undirected) trees via the Laplacian of conditional
distributions between pairs of input data points. This posterior
distribution is efficiently normalized by the Laplacian's
determinant and also facilitates novel maximum likelihood estimators,
efficient expectations and other useful inference computations. In a
classification setting, tdid assumptions yield a criterion that
maximizes the determinant of a matrix of conditional distributions
between pairs of input and output points. This leads to a novel
classification algorithm we call the Maximum Determinant
Machine. Unsupervised and supervised experiments are shown.
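The Laplacian-determinant normalization rests on Kirchhoff's matrix-tree theorem; the unweighted special case can be shown in a few lines of exact arithmetic. This is a generic illustration of that theorem, not the paper's weighted posterior computation.

```python
from fractions import Fraction

def spanning_tree_count(adj):
    """Kirchhoff's matrix-tree theorem: the number of spanning trees of
    an undirected graph equals any cofactor of its Laplacian L = D - A.
    `adj` is a symmetric 0/1 adjacency matrix given as nested lists."""
    n = len(adj)
    # Laplacian minor: delete row and column 0 (any choice works).
    m = [[Fraction(sum(adj[i]) if i == j else -adj[i][j])
          for j in range(1, n)] for i in range(1, n)]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        # Exact fractions keep the elimination free of rounding error.
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0                       # singular: graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                     # row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)
```

On the complete graph K4 this gives 16, matching Cayley's formula n^(n-2); the same determinant machinery, applied to a matrix of conditional distributions, is what normalizes the posterior over trees.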

Micro-speculation, Micro-sandboxing, and Self-Correcting Assertions: Support for Self-Healing Software and Application Communities

Michael Locasto

2005-12-05

Software faults and vulnerabilities continue to present significant
obstacles to achieving reliable and secure software. The critical
problem is that systems currently lack the capability to respond
intelligently and automatically to attacks -- especially attacks
that exploit previously unknown vulnerabilities or are delivered by
previously unseen inputs. Therefore, the goal of this thesis is to
provide an environment where both supervision and automatic
remediation can take place. Also provided is a mechanism to guide
the supervision environment in detection and repair activities.
This thesis supports the notion of Self-Healing Software by
introducing three novel techniques: micro-sandboxing,
micro-speculation, and self-correcting assertions. These
techniques are combined in a kernel-level emulation framework to
speculatively execute code that may contain faults or
vulnerabilities and automatically repair such faults or exploited
vulnerabilities. The framework, VPUF, introduces the concept of
computation as an operating system service by providing control
for an array of virtual processors in the Linux kernel (creating
the concept of an endolithic kernel). This thesis introduces
ROAR (Recognize, Orient, Adapt, Respond) as a conceptual
workflow for Self-healing Software systems.
This thesis proposal outlines a 17 month program for developing the
major components of the proposed system, implementing them on a
COTS operating system and programming language, subjecting them to
a battery of evaluations for performance and efficacy, and publishing
the results. In addition, this proposal looks forward to several
areas of follow-on work, including implementing some of the proposed
techniques in hardware and leveraging the general kernel-level
framework to support Application Communities.

The high cost of operating large computing installations has motivated
a broad interest in reducing the need for human intervention by making
systems self-managing. This paper explores the extent to which control
theory can provide an architectural and analytic foundation for
building self-managing systems. Control theory provides a rich set of
methodologies for building automated self-diagnosis and self-repairing
systems with properties such as stability, short settling times, and
accurate regulation. However, there are challenges in applying control
theory to computing systems, such as developing effective resource
models, handling sensor delays, and addressing lead times in effector
actions. We propose a deployable testbed for autonomic computing
(DTAC) that we believe will reduce the barriers to addressing research
problems in applying control theory to computing systems. The initial
DTAC architecture is described along with several problems that it can
be used to investigate.

Routing protocols in most ad hoc networks use path length as the routing metric. Recent findings have revealed that the minimum-hop metric cannot achieve the maximum throughput, because it reduces the number of hops by including long-range links, over which packets must be transmitted at the lowest transmission rate. In this paper, we investigate the tradeoff between transmission rates and throughput and show that in dense networks with uniformly distributed traffic there exists an optimal rate, which may not be the lowest rate. Based on this observation, we propose a new routing metric that measures the expected capability of a path assuming per-node fairness. We develop a routing protocol based on DSDV and demonstrate that the new metric enhances system throughput by 20% compared to the original DSDV.
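A minimal sketch of the intuition, under an assumed model (not the paper's exact metric): if each node on a path transmits fairly, a path's effective capability is bounded by the total per-packet airtime of its links, so a multi-hop path of high-rate links can beat a shorter path containing one slow long-range link.

```python
# Illustrative sketch only: we assume a path's expected capability is
# 1 / sum(1/rate_i), i.e., limited by total per-hop transmission time.

def path_capability(link_rates_mbps):
    """Effective end-to-end capability under per-node fairness (assumed model)."""
    return 1.0 / sum(1.0 / r for r in link_rates_mbps)

two_hop_slow = [1.0, 1.0]    # two long-range links at the lowest rate
four_hop_fast = [11.0] * 4   # four short links at a high rate

# Fewer hops is not always better: the high-rate multi-hop path wins.
assert path_capability(four_hop_fast) > path_capability(two_hop_slow)
```

The specific rates and the harmonic-sum form are hypothetical; they serve only to show why a minimum-hop metric can pick the lower-throughput path.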

Managed execution environments such as Microsoft’s Common Language Runtime (CLR) and Sun Microsystems’ Java Virtual Machine (JVM) provide a number of services – including but not limited to application isolation, security sandboxing, garbage collection and structured exception handling – that are aimed primarily at enhancing the robustness of managed applications. However, none of these services directly enables performing reconfigurations, repairs or diagnostics on the managed
applications and/or its constituent subsystems and components.
In this paper we examine how the facilities of a managed execution environment can be leveraged to support runtime system adaptations, such as reconfigurations and repairs. We describe an adaptation framework we have developed, which uses these facilities to dynamically attach/detach an engine capable of performing reconfigurations and repairs on a target system while it executes. Our adaptation framework is lightweight, and transparent to the application and the managed execution environment: it does not require recompilation of the application nor specially compiled
versions of the managed execution runtime. Our prototype was implemented for the CLR. To evaluate our framework beyond toy examples, we searched on SourceForge for potential target systems already implemented on the CLR that might benefit from runtime adaptation. We report on our experience using our prototype to effect runtime reconfigurations in a system that was developed and is in use by others: the Alchemi enterprise Grid Computing System developed at the University of Melbourne, Australia.

The increasing popularity of online courses has highlighted the need for collaborative learning tools for student groups. In addition, the introduction of lecture videos into the online curriculum has drawn attention to the disparity in the network resources available to students. We present an e-Learning architecture and adaptation model called AI2TV (Adaptive Interactive Internet Team Video), which allows groups of students to collaboratively view a video in synchrony. AI2TV upholds the invariant that each student will view semantically equivalent content at all times. A semantic compression model is developed to provide instructional videos at different levels of detail to accommodate dynamic network conditions and users’ system requirements. We take advantage of the semantic compression algorithm’s ability to provide different layers of semantically equivalent video by adapting the client to play at the appropriate layer that provides the client with the richest possible viewing experience. Video player actions, like play, pause and stop, can be initiated by any group member, and the results of those actions are synchronized with all the other students. These features allow students to review a lecture video in tandem, facilitating the learning process. Experimental trials show that AI2TV successfully synchronizes instructional videos for distributed students while concurrently optimizing the video quality, even under conditions of fluctuating bandwidth, by adaptively adjusting the quality level for each student while still maintaining the invariant.
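The per-client adaptation step can be sketched as follows. The layer bitrates and function names are illustrative assumptions, not AI2TV's API; the point is that each client picks the richest semantically equivalent layer its measured bandwidth sustains.

```python
# Hypothetical level-of-detail bitrates for semantically equivalent layers.
LAYERS_KBPS = [64, 128, 256, 512]

def choose_layer(bandwidth_kbps, layers=LAYERS_KBPS):
    """Pick the highest layer whose bitrate fits the available bandwidth."""
    feasible = [i for i, rate in enumerate(layers) if rate <= bandwidth_kbps]
    return feasible[-1] if feasible else 0   # fall back to the base layer

assert choose_layer(300) == 2   # 256 kbps fits, 512 does not
assert choose_layer(50) == 0    # below all rates: stay on the base layer
```

Because every layer carries semantically equivalent content, clients on different layers still satisfy the synchrony invariant.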

The content of a webpage is usually contained within a small
body of text and images, or perhaps several articles on the same page;
however, the content may be lost in the clutter (defined as cosmetic
features such as animations, menus, sidebars, obtrusive banners).
Automatic content extraction has many applications, including browsing on small cell phone and PDA screens, speech rendering for the visually
impaired, and reducing noise for information retrieval systems. We have
developed a framework, Crunch, which employs various heuristics for
content extraction in the form of filters applied to the webpage's DOM
tree; the filters aim to prune or transform the clutter, leaving only the content. Crunch allows users to tune what we call "settings", consisting of thresholds for applying a particular filter and/or for toggling a filter on/off, because the HTML components that characterize clutter can vary significantly from website to website. However, we have found that the same settings tend to work well across different websites of the same genre, e.g., news or shopping, since the designers often employ similar page layouts. In particular, Crunch could obtain the settings for a previously unknown website by automatically classifying it as sufficiently similar to a cluster of known websites with previously adjusted settings. We present our approach to clustering a large corpus of websites into genres, using their pre-extraction textual material augmented by the snippets generated by searching for the website's domain name in web search engines. Including these snippets increases the frequency of function words needed for clustering. We use the existing Manhattan distance measure and hierarchical clustering techniques, with some modifications, to pre-classify the corpus into genres offline. Our method does not require prior knowledge of the set of genres that websites fit into, but to be useful a priori settings must be available for some member of each cluster or a nearby cluster (otherwise defaults are used). Crunch classifies newly encountered websites online in linear time, and then applies the corresponding filter settings, with no noticeable delay added by our content-extracting web proxy.
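The online classification step can be illustrated with a toy example: represent each site by function-word frequencies and assign a new site to the genre of its nearest neighbor under Manhattan (L1) distance. The word vectors and nearest-neighbor shortcut below are our simplification, not Crunch's actual pipeline.

```python
def manhattan(u, v):
    """L1 distance between two word-frequency vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Hypothetical function-word frequency vectors for four known sites.
sites = {
    "news1": [9, 2, 7], "news2": [8, 3, 6],   # similar layouts cluster
    "shop1": [1, 9, 2], "shop2": [2, 8, 1],
}

def nearest_site(vec, corpus):
    """Classify a new site by its nearest already-clustered neighbor."""
    return min(corpus, key=lambda name: manhattan(vec, corpus[name]))

# A previously unknown news-like site inherits the news cluster's settings.
assert nearest_site([9, 3, 7], sites) == "news1"
```

In Crunch the offline step builds the genre clusters hierarchically; online, a new site only needs a linear scan like the one above before its cluster's filter settings are applied.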

Event correlation is a widely-used data processing methodology for a broad variety of applications, and is especially useful in the context of distributed monitoring for software faults and vulnerabilities. However, most existing solutions have typically been focused on "intra-organizational" correlation; organizations typically employ privacy policies that prohibit the exchange of information outside of the organization. At the same time, the promise of "inter-organizational" correlation is significant given the broad availability of Internet-scale communications, and its potential role in both software maintenance and software vulnerability exploits.
In this proposal, I present a framework for reconciling these opposing forces in event correlation via the use of privacy preservation integrated into the event processing framework. By integrating flexible privacy policies, we enable the correlation of organizations' data without actually releasing sensitive information. The framework supports both source anonymity and data privacy, yet allows for the time-based correlation of a broad variety of data. The framework is designed as a lightweight collection of components to enable integration with existing COTS platforms and distributed systems. I also present two different implementations of this framework: XUES (XML Universal Event Service), an event processor used as part of a software monitoring platform called KX (Kinesthetics eXtreme), and Worminator, a collaborative Intrusion Detection System.
KX comprised a series of components, connected together with a publish-subscribe content-based routing event subsystem, for the autonomic software monitoring of complex distributed systems. Sensors were installed in legacy systems. XUES' two modules then performed event processing on sensor data: information was collected and processed by the Event Packager, and correlated using the Event Distiller. While XUES itself was not privacy-preserving, it laid the groundwork for this thesis by supporting event typing, the use of publish-subscribe and extensibility support via pluggable event transformation modules.
Worminator, the second implementation, extends the XUES platform to fully support privacy-preserving event types and algorithms in the context of a Collaborative Intrusion Detection System (CIDS), whereby sensor alerts can be exchanged and corroborated--a reduced form of correlation that enables collaborative verification--without revealing sensitive information about a contributor's network, services, or even external sources as required. Worminator also fully anonymizes source information, allowing contributors to decide their preferred level of information disclosure. Worminator is implemented as a monitoring framework on top of a COTS IDS sensor, and demonstrably enables the detection of not only worms but also "broad and stealthy" scans; traditional single-network sensors either bury such scans in large volumes or miss them entirely. Worminator has been successfully deployed at 5 collaborating sites and work is under way to scale it up further.
The contributions of this thesis include the development of a cross-application-domain event correlation framework with native privacy-preserving types, the use and validation of privacy-preserving corroboration, and the establishment of a practical deployed collaborative security system. I also outline the next steps in the thesis research plan, including the development of evaluation metrics to quantify Worminator's effectiveness at long-term scan detection, the overhead of privacy preservation and the effectiveness of our approach against adversaries, be they "honest-but-curious" or actively malicious. This thesis has broad future work implications, including privacy-preserving signature detection and distribution, distributed stealthy attacker profiling, and "application community"-based software vulnerability detection.

In a previous paper, we developed a general framework for establishing
tractability and strong tractability for quasilinear multivariate
problems in the worst case setting. One important example of such a
problem is the solution of the Poisson equation $-\Delta u + qu = f$
in the $d$-dimensional unit cube, in which $u$ depends linearly
on~$f$, but nonlinearly on~$q$. Here, both $f$ and~$q$ are
$d$-variate functions from a reproducing kernel Hilbert space with
finite-order weights of order~$\omega$. This means that, although~$d$
can be arbitrarily large, $f$ and~$q$ can be decomposed as sums of
functions of at most $\omega$~variables, with $\omega$ independent
of~$d$.
In this paper, we apply our previous general results to the Poisson
equation, subject to either Dirichlet or Neumann homogeneous boundary
conditions. We study both the absolute and normalized error criteria.
For all four possible combinations of boundary conditions and error
criteria, we show that the problem is \emph{tractable}. That is, the
number of evaluations of $f$ and~$q$ needed to obtain an
$\e$-approximation is polynomial in~$\e^{-1}$ and~$d$, with the degree
of the polynomial depending linearly on~$\omega$. In addition, we
want to know when the problem is \emph{strongly tractable}, meaning
that the dependence is polynomial only in~$\e^{-1}$, independently
of~$d$. We show that if the sum of the weights defining the weighted
reproducing kernel Hilbert space is uniformly bounded in~$d$ and the
integral of the univariate kernel is positive, then the Poisson
equation is strongly tractable for three of the four possible
combinations of boundary conditions and error criterion, the only
exception being the Dirichlet boundary condition under the normalized
error criterion.

TCP-Friendly Rate Control with Token Bucket for VoIP Congestion Control

Miguel Maldonado, Salman Abdul Baset, Henning Schulzrinne

2005-10-17

TCP Friendly Rate Control (TFRC) is a congestion control algorithm that provides a smooth transmission rate for real-time
network applications. TFRC refrains from halving the sending rate on every packet drop; instead, the rate is adjusted as a function of
the loss rate during a single round-trip time. TFRC has been proven to be fair when competing with TCP flows over congested
links, but it lacks quality-of-service parameters to improve the performance of real-time traffic. A problem with TFRC is that it
uses additive increase to adjust the sending rate during periods with no congestion. This leads to short term congestion that can
degrade the quality of voice applications.
We propose two changes to TFRC that improve the performance of VoIP applications. Our implementation, TFRC with Token
Bucket (TFRC-TB), uses discrete calculated bit rates based on audio codec bandwidth usage to increase the sending rate. Also, it
uses a token bucket to control the sending rate during congestion periods. We have used ns2, the network simulator, to compare
our implementation to TFRC in a wide range of network conditions. Our results suggest that TFRC-TB can provide a quality of
service (QoS) mechanism to voice applications while competing fairly with other traffic over congested links.
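The token-bucket component can be sketched as below. The rate and bucket-size parameters are illustrative; TFRC-TB's actual coupling of the bucket to the TFRC loss-rate equation and codec bit rates is not reproduced here.

```python
# Minimal token bucket: tokens accumulate at a fixed rate up to a cap,
# and a packet may be sent only if enough tokens are available, which
# bounds bursts while permitting the configured average rate.

class TokenBucket:
    def __init__(self, rate_tokens_per_s, capacity):
        self.rate = rate_tokens_per_s
        self.capacity = capacity
        self.tokens = capacity

    def refill(self, elapsed_s):
        """Accrue tokens for elapsed time, never exceeding the bucket size."""
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed_s)

    def try_send(self, packet_cost):
        """Send only if enough tokens have accumulated."""
        if self.tokens >= packet_cost:
            self.tokens -= packet_cost
            return True
        return False

bucket = TokenBucket(rate_tokens_per_s=10, capacity=5)
sent = [bucket.try_send(1) for _ in range(8)]   # a burst drains the bucket
assert sent == [True] * 5 + [False] * 3
bucket.refill(0.5)                              # 0.5 s at 10 tokens/s
assert bucket.try_send(1)
```

During congestion, pacing sends through such a bucket smooths the short-term bursts that additive increase would otherwise inject, which is what degrades voice quality.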

Performance and Usability Analysis of Varying Web Service Architectures

Michael Lenner, Henning Schulzrinne

2005-10-14

We tested the performance of four web
application architectures, namely CGI, PHP, Java
servlets, and Apache Axis SOAP. All four
architectures implemented a series of typical web
application tasks. Our findings indicated that PHP
produced the smallest delay, while the SOAP
implementation produced the largest.

We propose a message propagation scheme for numerically stable inference in Gaussian graphical models, which can otherwise be susceptible to errors caused by finite numerical precision. We adapt square root algorithms, popular in Kalman filtering, to graphs with arbitrary topologies. The method consists of maintaining potentials and generating messages that involve the square root of precision matrices. Combining this with the machinery of the junction tree algorithm leads to an efficient and numerically stable algorithm. Experiments are presented to demonstrate the robustness of the method to numerical errors that can arise in complex learning and inference problems.

Approximating the Reflection Integral as a Summation: Where did the delta go?

Aner Ben-Artzi

2005-10-07

In this note, I explore why the common approximation of the reflection integral is not written with a delta omega-in (Δω_i) replacing the differential omega-in (dω_i). After that, I go on to discover what really happens when the sum over all directions is reduced to a sum over a small number of directions. In the final section, I make recommendations for correctly approximating the reflection sum, and briefly suggest a possible framework for multiple importance sampling over both lighting and BRDF.
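To make the question concrete, the discretization in question can be written as follows (the notation and sampling density are our assumptions, not necessarily the note's):

```latex
% Reflection integral over incoming directions:
L_o(\omega_o) \;=\; \int_{\Omega} f_r(\omega_i, \omega_o)\, L_i(\omega_i)\,
                    \cos\theta_i \; d\omega_i
% Common Monte Carlo approximation over N sampled directions \omega_k drawn
% with density p; the "missing" \Delta\omega_i survives as 1/(N\,p(\omega_k)):
L_o(\omega_o) \;\approx\; \frac{1}{N} \sum_{k=1}^{N}
    \frac{f_r(\omega_k, \omega_o)\, L_i(\omega_k)\, \cos\theta_k}{p(\omega_k)}
```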

Scalability poses a significant challenge for today's web applications,
mainly due to the large population of potential users. To effectively
address the problem of short-term dramatic load spikes caused by web
hotspots, we developed a self-configuring and scalable rescue system
called DotSlash. The primary goal of our system is to provide dynamic
scalability to web applications by enabling a web site to obtain
resources dynamically, and use them autonomically without any
administrative intervention. To address the database server bottleneck,
DotSlash allows a web site to set up on-demand distributed query result
caching, which greatly reduces the database workload for read-mostly
databases and thus increases the request rate supported at a
DotSlash-enabled web site. The novelty of our work is that our query
result caching is on demand, operating based on load conditions.
The caching remains inactive as long as the load is normal, but is
activated once the load is heavy. This approach offers good data
consistency during normal load situations, and good scalability with
relaxed data consistency for heavy load periods. We have built a
prototype system for the widely used LAMP configuration, and evaluated
our system using the RUBBoS bulletin board benchmark. Experiments show
that a DotSlash-enhanced web site can improve the maximum request rate
supported by a factor of 5 using 8 rescue servers for the RUBBoS
submission mix, and by a factor of 10 using 15 rescue servers for
the RUBBoS read-only mix.
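The on-demand activation policy can be sketched as follows. The load threshold and query interface are illustrative assumptions, not DotSlash's implementation; the sketch only shows how bypassing the cache under normal load preserves consistency while activating it under heavy load absorbs repeated reads.

```python
HEAVY_LOAD = 0.8   # assumed load threshold for activating the cache

class QueryCache:
    def __init__(self, database):
        self.db = database          # callable: sql -> rows
        self.cache = {}
        self.db_hits = 0

    def query(self, sql, load):
        if load < HEAVY_LOAD:       # normal load: bypass the cache entirely
            self.db_hits += 1
            return self.db(sql)
        if sql not in self.cache:   # heavy load: cache read-mostly results
            self.db_hits += 1
            self.cache[sql] = self.db(sql)
        return self.cache[sql]

cache = QueryCache(lambda sql: ["row"])
cache.query("SELECT 1", load=0.2); cache.query("SELECT 1", load=0.2)
assert cache.db_hits == 2           # cache inactive: both queries hit the DB
cache.query("SELECT 1", load=0.9); cache.query("SELECT 1", load=0.9)
assert cache.db_hits == 3           # cache active: the repeat is absorbed
```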

We investigate elastic block ciphers, a method for constructing
variable-length block ciphers, from a theoretical perspective. We
view the underlying structure of an elastic block cipher as a network,
which we refer to as an elastic network, and analyze the network in a manner similar to the analysis performed by Luby and Rackoff on Feistel networks. We prove that a three-round elastic network is a pseudorandom permutation and a four-round network is a strong pseudorandom permutation when the round functions are pseudorandom permutations.

We analyze the security of elastic block ciphers in general to
show that an attack on an elastic version of block cipher implies
a polynomial time related attack on the fixed-length version of
the block cipher. We relate the security of the elastic version
of a block cipher to the fixed-length version by forming a reduction
between the versions. Our method is independent of the specific
block cipher used. The results imply that if the fixed-length version
of a block cipher is secure against attacks which attempt key recovery then the elastic version is also secure against such attacks.

On Elastic Block Ciphers and Their Differential and Linear Cryptanalyses

Debra Cook, Moti Yung, Angelos Keromytis

2005-09-28

Motivated by applications such as databases with nonuniform field lengths,
we introduce the concept of an elastic block cipher, a new
approach to variable length block ciphers which incorporates fixed
sized cipher components into a new network structure. Our scheme
allows us to dynamically "stretch" the supported block size of
a block cipher up to a length double the original block size, while
increasing the computational workload proportionally to the
block size. We show that traditional attacks against an elastic block
cipher are impractical if the original cipher is secure. In this paper
we focus on differential and linear attacks. Specifically, we employ
an elastic version of Rijndael supporting block sizes of 128 to 256
bits as an example, and show it is resistant to both differential and
linear attacks. In particular, employing a different method from that
used in the design of Rijndael, we show that the probability of any
differential characteristic for the elastic version of Rijndael is
<= 2^-(block size). We further prove that both linear and
nonlinear attacks are computationally infeasible for any elastic block
cipher if the original cipher is not subject to such an attack and
involves a block size for which an exhaustive plaintext search is
computationally infeasible (as is the case for Rijndael).

Many websites are driven by web applications that deliver dynamic content stored in SQL databases. Such systems take input directly from the client via HTML forms. Without proper input validation, these systems are vulnerable to SQL injection attacks.
The predominant defense against such attacks is to implement better input validation. This strategy is unlikely to succeed on its own. A better approach is to protect systems against SQL injection automatically and not rely on manual supervision or testing strategies (which are incomplete by nature). SQL randomization is a technique that defeats SQL injection attacks by transforming the language of SQL statements in a web application such that an attacker needs to guess the transformation in order to successfully inject his code.
We present PachyRand, an extension to the PostgreSQL JDBC driver that performs SQL randomization. Our system is easily portable to most other JDBC drivers, has a small performance impact, and makes SQL injection attacks infeasible.
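The randomization idea can be illustrated with a toy sketch. The keyword list, the digit-string key, and the detection-by-rejection step are our assumptions for illustration; PachyRand itself operates transparently inside the JDBC driver.

```python
import re

KEYWORDS = ["SELECT", "FROM", "WHERE", "OR", "AND", "UNION"]

def randomize(sql, key):
    """Rewrite a trusted query template to use randomized keywords."""
    for kw in KEYWORDS:
        sql = re.sub(rf"\b{kw}\b", f"{kw}{key}", sql)
    return sql

def check_and_derandomize(sql, key):
    """Restore keywords; any bare standard keyword is attacker-supplied SQL.

    Assumes the key consists of word characters (e.g. digits), so a
    randomized keyword like OR921 never matches the bare \\bOR\\b pattern.
    """
    for kw in KEYWORDS:
        if re.search(rf"\b{kw}\b", sql):
            raise ValueError("possible SQL injection")
    for kw in KEYWORDS:
        sql = sql.replace(kw + key, kw)
    return sql

key = "921"   # per-application random key (digits assumed)
template = randomize("SELECT name FROM users WHERE id = ", key)
assert check_and_derandomize(template + "42", key) == \
    "SELECT name FROM users WHERE id = 42"
try:
    check_and_derandomize(template + "1 OR 1=1", key)   # injected bare OR
    raise AssertionError("injection not caught")
except ValueError:
    pass
```

An attacker who cannot guess the key can only inject standard keywords, which the derandomization step rejects before they reach the database.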

In this paper we present the theoretical foundation of the search
space for learning a class of constraint-based grammars, which
preserve the parsing of representative examples. We prove that under
several assumptions the search space is a complete grammar lattice,
and the lattice top element is a grammar that can always be learned
from a set of representative examples and a sublanguage used to
reduce the grammar semantics. This complete grammar lattice
guarantees convergence of solutions of any learning algorithm that
obeys the given assumptions.

In the network community different mobility management techniques have
been proposed over the years. However, many of these techniques share a
surprisingly high number of similarities. In this technical report we analyze and
evaluate the most relevant mobility management techniques, pointing out
differences and similarities. For macro-mobility we consider Mobile IP (MIP),
the Session Initiation Protocol (SIP) and mobility management techniques
typical of a GSM network; for micro-mobility we describe and analyze several
protocols, such as Hierarchical MIP, TeleMIP, IDMP, Cellular IP and HAWAII.

We present a pointer analysis algorithm designed for source-to-source transformations. Existing techniques for pointer analysis apply a collection of inference rules to a dismantled intermediate form of the source program, making them difficult to apply to source-to-source tools that generally work on abstract syntax trees to preserve details of the source program. Our pointer analysis algorithm operates directly on the abstract syntax tree of a C program and uses a form of standard dataflow analysis to compute the desired points-to information. We have implemented our algorithm in a source-to-source translation framework and experimental results show that it is practical on real-world examples.

The increasing popularity of distance learning and online courses has highlighted the lack of collaborative tools for student groups. In addition, the introduction of lecture videos into the online curriculum has drawn attention to the disparity in the network resources used by students. We present an e-Learning architecture and adaptation model called AI2TV (Adaptive Internet Interactive Team Video), a system that allows borderless, virtual students, possibly some or all disadvantaged in network resources, to collaboratively view a video in synchrony. AI2TV upholds the invariant that each student will view semantically equivalent content at all times. Video player actions, like play, pause and stop, can be initiated by any of the students and the results of those actions are seen by all the other students. These features allow group members to review a lecture video in tandem to facilitate the learning process. We show in experimental trials that our system can successfully synchronize video for distributed students while, at the same time, optimizing the video quality given actual (fluctuating) bandwidth by adaptively adjusting the quality level for each student.

The tractability of multivariate problems has usually been studied only for the approximation of linear operators. In this paper we study the tractability of quasilinear multivariate problems. That is, we wish to approximate nonlinear operators~$S_d(\cdot,\cdot)$ that depend linearly on the first argument and satisfy a Lipschitz condition with respect to both arguments. Here, both arguments are functions of $d$~variables. Many computational problems of practical importance have this form. Examples include the solution of specific Dirichlet, Neumann, and Schr\"odinger problems. We show, under appropriate assumptions, that quasilinear problems, whose domain spaces are equipped with product or finite-order weights, are tractable or strongly tractable in the worst case setting.
This paper is the first part in a series of papers. Here, we present tractability results for quasilinear problems under general assumptions on quasilinear operators and weights. In future papers, we shall verify these assumptions for quasilinear problems such as the solution of specific Dirichlet, Neumann, and Schr\"odinger problems.

We consider the problem of learning a halfspace in the agnostic framework
of Kearns et al., where a learner is given access to a distribution on labelled examples but the labelling may be arbitrary. The learner's goal is to output a hypothesis which performs almost as well as the optimal halfspace with respect to future draws from this distribution. Although the agnostic learning framework does not explicitly deal with noise, it is closely related to learning in worst-case noise models such as malicious noise.
We give the first polynomial-time algorithm for agnostically learning halfspaces with respect to several distributions, such as the uniform distribution over the $n$-dimensional Boolean cube {0,1}^n or unit sphere in n-dimensional Euclidean space, as well as any log-concave distribution in n-dimensional Euclidean space. Given any constant additive factor eps>0, our algorithm runs in poly(n) time and constructs a hypothesis whose error rate is within an additive eps of the optimal halfspace. We also show this algorithm agnostically learns Boolean disjunctions in time roughly 2^{\sqrt{n}} with respect to any distribution; this is the first subexponential-time algorithm for this problem. Finally, we obtain a new algorithm for PAC learning halfspaces under the uniform distribution on the unit sphere which can tolerate the highest level of malicious noise of any algorithm to date.
Our main tool is a polynomial regression algorithm which finds a polynomial that best fits a set of points with respect to a particular metric. We show that, in fact, this algorithm is an arbitrary-distribution generalization of the well known ``low-degree'' Fourier algorithm of Linial, Mansour, & Nisan and has excellent noise tolerance properties when minimizing with respect to the L_1 norm. We apply this algorithm in conjunction with a non-standard Fourier transform (which does not use the traditional parity basis) for learning halfspaces over the uniform distribution on the unit sphere; we believe this technique is of independent interest.
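A toy one-dimensional version of the polynomial regression hypothesis class can be sketched as follows: fit a low-degree polynomial to ±1 labels and predict the sign of the polynomial. For simplicity the sketch fits by least squares, whereas the paper's noise tolerance comes from minimizing the L_1 norm; the data, degree, and dimension are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
labels = np.sign(x - 0.25)              # a 1-D "halfspace": threshold at 0.25
labels[labels == 0] = 1

# Degree-3 polynomial features: [1, x, x^2, x^3].
features = np.vander(x, N=4, increasing=True)
coeffs, *_ = np.linalg.lstsq(features, labels, rcond=None)

# Hypothesis: the sign of the fitted polynomial.
predictions = np.sign(features @ coeffs)
error = np.mean(predictions != labels)
assert error < 0.15                     # the polynomial's sign tracks the threshold
```

The high-dimensional analogue fits a degree-d polynomial over all monomials, which is where the connection to low-degree Fourier algorithms and the choice of basis becomes essential.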

We consider the problem of learning mixtures of product distributions
over discrete domains in the distribution learning framework
introduced by Kearns et al. We give a $\poly(n/\eps)$ time algorithm
for learning a mixture of $k$ arbitrary product distributions over the
$n$-dimensional Boolean cube $\{0,1\}^n$ to accuracy $\eps$, for any
constant $k$. Previous polynomial time algorithms could only achieve
this for $k = 2$ product distributions; our result answers an open
question stated independently by Cryan and by Freund and Mansour. We
further give evidence that no polynomial time algorithm can succeed
when $k$ is superconstant, by reduction from a notorious open problem
in PAC learning. Finally, we generalize our $\poly(n/\eps)$ time
algorithm to learn any mixture of $k = O(1)$ product distributions
over $\{0,1, \dots, b\}^n$, for any $b = O(1)$.

Automaton-based static program analysis has proved to be an effective tool for bug finding. Current tools generally
re-analyze a program from scratch in response to a change in the code, which can result in much duplicated
effort. We present an inter-procedural algorithm that analyzes a program incrementally in response to changes, and
present experiments for a null-pointer dereference analysis. The incremental analysis shows a substantial speed-up over re-analysis from
scratch, with a manageable amount of disk space used to store information between analysis runs.

This paper presents the theoretical foundation of a new type of
constraint-based grammars, Lexicalized Well-Founded Grammars, which are
adequate for modeling human language and are learnable. These features
make the grammars suitable for developing robust and scalable natural language understanding systems. Our grammars capture both syntax and semantics and have two types of constraints at the rule level: one for semantic composition and one for ontology-based semantic interpretation. We prove that these grammars can always be learned from a small set of
semantically annotated, ordered representative examples, using a
relational learning algorithm. We introduce a new semantic
representation for natural language, which is suitable for an
ontology-based interpretation and allows us to learn the compositional
constraints together with the grammar rules. Besides the learnability
results, we give a principle for grammar merging. The experiments
presented in this paper show promising results for the adequacy of
these grammars in learning natural language. Relatively simple
linguistic knowledge is needed to build the small set of semantically
annotated examples required for the grammar induction.

Most general-purpose work towards autonomic or self-managing systems
has emphasized the front end of the feedback control loop, with some
also concerned with controlling the back end enactment of runtime
adaptations -- but usually employing an effector technology peculiar
to one type of target system. While completely generic "one size fits
all" effector technologies seem implausible, we propose a general-purpose
programming model and interaction layer that abstracts away
from the peculiarities of target-specific effectors, enabling a uniform
approach to controlling and coordinating the low-level execution of
reconfigurations, repairs, micro-reboots, etc.

Skin is the outermost tissue of the human body. As a result, people
are very aware of, and very sensitive to, the appearance of their
skin. Consequently, skin appearance has been a subject of great
interest in various fields of science and technology.
Research on skin appearance has been
intensely pursued in the fields of medicine, cosmetology, computer
graphics and computer vision. Since the goals of these fields are
very different, each field has tended to focus on specific aspects of
the appearance of skin. The goal of this work is to present a
comprehensive survey that includes the most prominent results related
to skin in these different fields and show how these seemingly
disconnected studies are related to one another.

Essentially all computer graphics rendering assumes that the
reflectance and texture of surfaces is a static phenomenon. Yet,
there is an abundance of materials in nature whose appearance varies
dramatically with time, such as cracking paint, growing grass, or
ripening banana skins. In this paper, we take a significant step
towards addressing this problem, investigating a new class of
time-varying textures. We make three contributions. First, we
describe the carefully controlled acquisition of datasets of a variety
of natural processes including the growth of grass, the accumulation
of snow, and the oxidation of copper. Second, we show how to adapt
quilting-based methods to time-varying texture synthesis, addressing
the important challenges of maintaining temporal coherence, efficient
synthesis on large time-varying datasets, and reducing visual
artifacts specific to time-varying textures. Finally, we show how
simple procedural techniques can be used to control the evolution of
the results, such as allowing for a faster growth of grass in well lit
(as opposed to shadowed) areas.

This paper is concerned with merging globally rigid formations
of mobile autonomous agents. A key element in all
future multi-agent systems will be the role of sensor and
communication networks as an integral part of coordination.
Network topologies are critically important for autonomous
systems involving mobile underwater, ground and
air vehicles and for sensor networks. This paper focuses on
developing techniques and strategies for the analysis and
design of sensor and network topologies required to merge
globally rigid formations for cooperative tasks. Central to
the development of these techniques and strategies will be
the use of tools from rigidity theory and graph theory.

We consider a cluster of heterogeneous servers, modeled as $M/G/1$ queues with different processing speeds. The scheduling policies for these servers can be either processor-sharing or first-come first-served. Furthermore, a dispatcher that assigns jobs to the servers takes as input only the size of the arriving job and the overall job-size distribution.
This general model captures the behavior of a variety of real systems, such as web server clusters. Our goal is to identify assignment strategies that the dispatcher can perform to minimize expected completion time and waiting time. We show that there exist optimal strategies that are deterministic, fixing the server to which jobs of particular sizes are always sent. We prove that the optimal strategy for systems with identical servers assigns a non-overlapping interval range of job sizes to each server. We then prove that when server processing speeds differ, it is necessary to assign each server a distinct set of intervals of job sizes in order to minimize expected waiting or response times. We explore some of the practical challenges of identifying the optimal strategy, and also study a related problem that uses our model of how to provision server processing speeds to minimize waiting and completion time given a job size distribution and fixed aggregate processing power.
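
The deterministic size-interval policy described above can be sketched in a few lines; the class name and cutoff values below are invented for illustration and are not from the paper:

```python
import bisect

class IntervalDispatcher:
    """Assigns each job to the server owning its size interval
    (illustrative sketch of the deterministic interval policy)."""

    def __init__(self, cutoffs):
        # cutoffs[i] is the upper size bound handled by server i;
        # jobs larger than cutoffs[-1] go to the last server.
        self.cutoffs = cutoffs

    def assign(self, job_size):
        # index of the first server whose cutoff covers this size
        return bisect.bisect_left(self.cutoffs, job_size)

# three servers: sizes (0, 10], (10, 100], (100, inf)
d = IntervalDispatcher([10.0, 100.0])
print(d.assign(5.0), d.assign(50.0), d.assign(500.0))  # 0 1 2
```

Because the mapping from size to server is fixed, jobs of a given size always land on the same server, which is the deterministic property the paper proves sufficient for optimality.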

We present a hybrid method for localizing a mobile robot in a complex environment. The method combines the use of multiresolution histograms with a signal strength analysis of existing wireless networks. We tested this localization procedure on the campus of Columbia University with our mobile robot, the Autonomous Vehicle for Exploration and Navigation of Urban Environments. Our results indicate that localization accuracy is significantly improved when five levels of resolution are used instead of one in color histogramming. We also find that incorporating wireless signal strengths into the method further improves reliability and helps to resolve ambiguities which arise when different regions have similar visual appearances.

Classical and Quantum Complexity of the Sturm-Liouville Eigenvalue Problem

A. Papageorgiou, H. Wozniakowski

2005-04-22

We study the approximation of the smallest eigenvalue of a Sturm-Liouville problem in the classical and quantum settings. We consider a univariate Sturm-Liouville eigenvalue problem with a nonnegative function $q$ from the class $C^2([0,1])$ and study the minimal number $n(\e)$ of function evaluations or queries that are necessary to compute an $\e$-approximation of the smallest eigenvalue. We prove that $n(\e)=\Theta(\e^{-1/2})$ in the (deterministic) worst case setting, and $n(\e)=\Theta(\e^{-2/5})$ in the randomized setting. The quantum setting offers a polynomial speedup with {\it bit} queries and an exponential speedup with {\it power} queries. Bit queries are similar to the oracle calls used in Grover's algorithm appropriately extended to real valued functions. Power queries are used for a number of problems including phase estimation. They are obtained by considering the propagator of the discretized system at a number of different time moments. They allow us to use powers of the unitary matrix $\exp(\tfrac12 {\rm i}M)$, where $M$ is an $n\times n$ matrix obtained from the standard discretization of the Sturm-Liouville differential operator. The quantum implementation of power queries by a number of elementary quantum gates that is polylog in $n$ is an open issue.

Simultaneous multithreading (SMT) allows multiple threads to supply instructions to the instruction pipeline of a superscalar processor. Because threads share processor resources, an SMT system is inherently different from a multiprocessor system and, therefore, utilizing multiple threads on an SMT processor creates new challenges for database implementers.
We investigate three thread-based techniques to exploit SMT architectures on memory-resident data. First, we consider running independent operations in separate threads, a technique applied to conventional multiprocessor systems. Second, we describe a novel implementation strategy in which individual operators are implemented in a multi-threaded fashion. Finally, we introduce a new data structure called the work-ahead set that allows us to use one of the threads to aggressively preload data into the cache for use by the other thread.
We evaluate each method with respect to its performance, implementation complexity, and other measures. We also provide guidance regarding when and how to best utilize the various threading techniques. Our experimental results show that by taking advantage of SMT technology we achieve a 30\% to 70\% improvement in throughput over single threaded implementations on in-memory database operations.
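
As a rough illustration of the work-ahead set idea, one thread can announce soon-to-be-needed pages while a helper thread touches them in advance. This toy sketch only mimics the structure (a dict stands in for the CPU cache; all names are invented):

```python
import threading
import queue

# fake buffer pool of pages keyed by page id
PAGES = {i: f"page-{i}" for i in range(100)}
cache = {}                    # stands in for the CPU cache
work_ahead = queue.Queue()    # the "work-ahead set"

def preloader():
    # helper thread: "warm" pages posted by the main operator
    while True:
        pid = work_ahead.get()
        if pid is None:       # sentinel: stop
            break
        cache[pid] = PAGES[pid]

t = threading.Thread(target=preloader)
t.start()

needed = [3, 17, 42]
for pid in needed:            # announce pages ahead of time
    work_ahead.put(pid)
work_ahead.put(None)
t.join()

# the operator now finds its pages already "cached"
print(all(pid in cache for pid in needed))  # True
```

In the actual SMT setting the helper thread shares the physical cache with the operator thread, so merely reading the data ahead of time is what produces the speedup; a dict is only a stand-in here.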

The thesis contains an analysis of two computational problems. The first problem is discrete quantum Boolean summation. This problem is a building block of quantum algorithms for many continuous problems, such as integration, approximation, differential equations and path integration. The second problem is continuous multivariate
Feynman-Kac path integration, which is a special case of path integration.
The quantum Boolean summation problem can be solved by the quantum summation (QS) algorithm of Brassard, Høyer, Mosca and Tapp, which approximates the arithmetic mean of a Boolean function. We improve the error bound of Brassard et al. for the worst-probabilistic setting. Our error bound is sharp. We also present new sharp error bounds in the average-probabilistic and worst-average settings. Our average-probabilistic error bounds prove the optimality of the QS algorithm for a certain choice of its parameters. The study of the worst-average error shows that the QS algorithm is not optimal in this setting; we need to use a certain number of repetitions to regain its optimality.
The multivariate Feynman-Kac path integration problem for smooth multivariate functions suffers from the provable curse of dimensionality in the worst-case deterministic setting, i.e., the minimal number of function evaluations needed to compute an approximation depends exponentially on the number of variables. We show that in
both the randomized and quantum settings the curse of dimensionality is vanquished, i.e., the minimal number of function evaluations and/or quantum queries required to compute an approximation depends only polynomially on the reciprocal of the desired accuracy and has a bound independent of the number of variables. The exponents
of these polynomials are 2 in the randomized setting and 1 in the quantum setting. These exponents can be lowered at the expense of the dependence on the number of variables. Hence, the quantum setting yields exponential speedup over the worst-case
deterministic setting, and quadratic speedup over the randomized setting.

A Hybrid Hierarchical and Peer-to-Peer Ontology-based Global Service Discovery System

Knarig Arabshian, Henning Schulzrinne

2005-04-06

Current service discovery systems fail to span the globe and use simple attribute-value pair or interface matching for service description and querying. We propose a global service discovery system, GloServ, that uses the description logic Web Ontology Language (OWL DL). The GloServ architecture spans both local and wide area networks. It maps knowledge obtained from the service classification ontology to a structured peer-to-peer network such as a Content Addressable Network (CAN). GloServ also performs automated and intelligent registration and querying by exploiting the logical relationships within the service ontologies.

We present an Edit-and-Continue implementation that allows regular source files to be treated like interactively updatable, compiled scripts, coupling the speed of compiled native machine code with the ability to make changes without restarting. Our implementation is based on the Microsoft .NET Framework and allows applications written in any .NET language to be dynamically updatable. Our solution works with the standard version of the Microsoft Common Language Runtime, and does not require a custom compiler or runtime. Because no application changes are needed, it is transparent to the application developer. The runtime overhead of our implementation is low enough to support updating real-time applications (e.g., interactive 3D graphics applications).

We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine translated documents. A manual evaluation shows that 68\% of the sentence replacements improve the summary, and the overall summarization approach outperforms first-sentence extraction baselines in automatic ROUGE-based evaluations.
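
The replacement step can be approximated with a simple bag-of-words similarity; the sentences, threshold, and function names below are illustrative, not the paper's actual similarity measure:

```python
import math
import re
from collections import Counter

def cosine(a, b):
    # bag-of-words cosine similarity over lowercase tokens
    va = Counter(re.findall(r"[a-z']+", a.lower()))
    vb = Counter(re.findall(r"[a-z']+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def replace_sentence(mt_sentence, english_sentences, threshold=0.5):
    # keep the MT sentence unless a sufficiently similar
    # English sentence exists in the comparable documents
    best = max(english_sentences, key=lambda s: cosine(mt_sentence, s))
    return best if cosine(mt_sentence, best) >= threshold else mt_sentence

mt = "the earthquake strike the city on monday morning"
english = [
    "The stock market closed higher on Monday.",
    "The earthquake struck the city on Monday morning.",
]
print(replace_sentence(mt, english))
# The earthquake struck the city on Monday morning.
```

The threshold guards against replacing an MT sentence with an English sentence that merely shares a few common words, which is the failure mode a replacement scheme must avoid.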

IEEE 802.11 MAC is a CSMA/CA protocol and uses RTS/CTS exchanges to avoid the hidden terminal problem. Recent findings have revealed that the carrier-sensing range set in current major implementations does not detect and prevent all interference signals, even when the RTS/CTS access method is used. In this paper, we investigate the effect of interference and develop a mathematical model for it. We demonstrate that the 802.11 DCF does not properly act on the interference channel due to the small size and the exponential increment of backoff windows. The accuracy of our model is verified via simulations. Based on an insight from our model, we present a simple protocol that operates on top of the 802.11 MAC layer and achieves more throughput than rate-adjustment schemes.
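
For reference, the backoff-window growth the model analyzes follows the standard DCF doubling rule; the sketch below assumes the 802.11b DSSS parameters (CWmin = 31, CWmax = 1023):

```python
# DCF binary exponential backoff: the contention window doubles on
# each failed transmission attempt, capped at CW_MAX.
CW_MIN, CW_MAX = 31, 1023  # 802.11b DSSS values (assumed here)

def backoff_windows(retries):
    cw, windows = CW_MIN, []
    for _ in range(retries):
        windows.append(cw)
        cw = min(2 * (cw + 1) - 1, CW_MAX)  # double, capped
    return windows

print(backoff_windows(7))  # [31, 63, 127, 255, 511, 1023, 1023]
```

The small initial window relative to the cap is the property the abstract points to: several collisions must occur before the window grows large enough to ride out persistent interference.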

We obtain a query lower bound for quantum algorithms solving the phase estimation problem. Our analysis generalizes existing lower bound approaches to the case where the oracle Q is given by controlled powers Q^p of Q, as it is for example in Shor's order finding algorithm. In this setting we will prove a log (1/epsilon) lower bound for the number of applications of Q^p1, Q^p2, ... This bound is tight due to a matching upper bound. We obtain the lower bound using a new technique based on frequency analysis.

The computation of combinatorial and numerical problems on quantum computers is often much faster than on a classical computer in terms of the number of queries. A query is a procedure by which the quantum computer gains information about the specific problem. Different query definitions have been given, and our aim is to review them and to show that these definitions are not equivalent. To achieve this result we study the simulation and approximation of one query type by another. While approximation is easy in one direction, we show that it is hard in the other direction by proving a lower bound on the number of queries needed in the simulation. The main tool in this lower bound proof is a relationship between quantum algorithms and trigonometric polynomials that we establish.

This paper is concerned with information
structures used in rigid formations of autonomous agents that
have leader-follower architecture. The focus of this paper is
on sensor/network topologies to secure control of rigidity. We
extend our previous approach for formations with symmetric
neighbor relations to include formations with leader-follower
architecture. Necessary and sufficient conditions for stably
rigid directed formations are given including both cyclic and
acyclic directed formations. Some useful steps for creating
topologies of directed rigid formations are developed. An
algorithm to determine the directions of links to create
stably rigid directed formations from rigid undirected
formations is presented. It is shown that k-cycles (k > 2) do
not cause inconsistencies when measurements are noisy, while
2-cycles do. Simulation results are presented for (i) a rigid
acyclic formation, (ii) a flexible formation, and (iii) a rigid
formation with cycles.

We have previously developed a collaborative virtual environment (CVE) for small-group virtual classrooms, intended for distance learning by geographically dispersed students. The CVE employs a peer-to-peer approach to the frequent real-time updates to the 3D virtual worlds required by avatar movements (fellow students in the same room are depicted by avatars). This paper focuses on our extension to the P2P model to support group viewing of lecture videos, called VECTORS, for Video Enhanced Collaboration for Team Oriented Remote Synchronization. VECTORS supports synchronized viewing of lecture videos, so the students all see "the same thing at the same time", and can pause, rewind, etc. in synchrony while discussing the lecture material via "chat". We are particularly concerned with the needs of the technologically disenfranchised, e.g., those whose only Web/Internet access is via dialup or other relatively low-bandwidth networking. Thus VECTORS employs semantically compressed videos with meager bandwidth requirements. Further, the videos are displayed as a sequence of JPEGs on the walls of a 3D virtual room, requiring fewer local multimedia resources than full motion MPEGs.

Design of Next Step In Signaling (NSIS) protocol and IP routing interaction requires a good understanding of today's Internet routing behavior. In this report we present a routing measurement experiment to characterize current Internet dynamics, including routing pathology, routing prevalence and routing persistence. The focus of our study is route change. We look at the types, duration and likely causes of different route changes and discuss their impact on the design of NSIS. We also review common route change detection methods and investigate rules for determining whether a route change that happens in a node's forward-looking or backward-looking direction is detectable. We introduce typical NSIS deployment models and discuss specific categories of route changes that should be considered in each of these models. With the NSIS deployment models in mind, we further give experimental evaluation of two route change detection methods - the packet TTL monitoring method and a new delay variation monitoring method.

Self-healing systems require that repair mechanisms are available to resolve problems that arise while the system executes. Managed execution environments such as the Common Language Runtime (CLR) and Java Virtual Machine (JVM) provide a number of application services (application isolation, security sandboxing, garbage collection and structured exception handling) which are geared primarily at making managed applications more robust. However, none of these services directly enables applications to perform repairs or consistency checks of their components. From a design and implementation standpoint, the preferred way to enable repair in a self-healing system is to use an externalized repair/adaptation architecture rather than hardwiring adaptation logic inside the system where it is harder to analyze, reuse and extend. We present a framework that allows a repair engine to dynamically attach and detach to/from a managed application while it executes essentially adding repair mechanisms as another application service provided in the execution environment.

Web pages often contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Automatic extraction of "useful and relevant" content from web pages has many applications, including browsing on small cell phone and PDA screens, speech rendering for the visually impaired, and reducing noise for information retrieval systems. Prior work has led to the development of Crunch, a framework which employs various heuristics in the form of filters and filter settings for content extraction. Crunch allows users to tune these settings, essentially the thresholds for applying each filter. However, in order to reduce human involvement in selecting these heuristic settings, we have extended this work to utilize a website's classification, defined by its genre and physical layout. In particular, Crunch would then obtain the settings for a previously unknown website by automatically classifying it as sufficiently similar to a cluster of known websites with previously adjusted settings - which in practice produces better content extraction results than a single one-size-fits-all set of setting defaults. In this paper, we present our approach to clustering a large corpus of websites by their genre, utilizing the snippets generated by sending the website's domain name to search engines as well as the website's own text. We find that exploiting these snippets not only increases the frequency of function words that directly assist in detecting the genre of a website, but also allows for easier clustering of websites. We use existing techniques such as the Manhattan distance measure and hierarchical clustering, with some modifications, to pre-classify websites into genres. Our clustering method does not require prior knowledge of the set of genres that websites fit into, but instead discovers these relationships among websites.
Subsequently, we are able to classify newly encountered websites in linear-time, and then apply the corresponding filter settings, with no noticeable delay introduced for the content-extracting web proxy.
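
A minimal sketch of the pre-classification step, assuming Manhattan distance over small function-word frequency vectors and single-linkage agglomerative clustering (the data and threshold are invented for illustration):

```python
def manhattan(a, b):
    # L1 distance between two feature vectors
    return sum(abs(x - y) for x, y in zip(a, b))

def single_linkage(points, threshold):
    # agglomerative clustering: merge any two clusters whose
    # closest pair of members is within the threshold
    clusters = [[i] for i in range(len(points))]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(manhattan(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if d <= threshold:
                    clusters[i] += clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters

# feature vectors, e.g. frequencies of a few function words
sites = [(0.9, 0.1), (0.85, 0.15), (0.1, 0.9), (0.12, 0.88)]
print(single_linkage(sites, threshold=0.2))  # [[0, 1], [2, 3]]
```

No genre labels are supplied up front; the clusters emerge from the distances alone, mirroring the abstract's claim that the method discovers genre relationships rather than requiring them.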

Most general-purpose work towards autonomic or self-managing systems has emphasized the front end of the feedback control loop, with some also concerned with controlling the back end enactment of runtime adaptations -- but usually employing an effector technology peculiar to one type of target system. While completely generic "one size fits all" effector technologies seem implausible, we propose a general-purpose programming model and interaction layer that abstracts away from the peculiarities of target-specific effectors, enabling a uniform approach to controlling and coordinating the low-level execution of reconfigurations, repairs, micro-reboots, etc.

Event correlation is essential to realizing self-managing distributed systems. For example, distributed systems often require that events be correlated from multiple systems using temporal patterns to detect denial of service attacks and to warn of problems with business critical applications that run on multiple servers. This paper addresses how to specify timer values for temporal patterns so as to manage the trade-off between false alarms and undetected alarms. A central concern is addressing the variability of event propagation delays due to factors such as contention for network and server resources. To this end, we develop an architecture and an adaptive control algorithm that dynamically compensate for variations in propagation delays. Our approach makes Management Stations more autonomic by avoiding the need for manual adjustments of timer values in temporal rules. Further, studies we conducted of a testbed system suggest that our approach produces results that are at least as good as an optimal fixed setting of timer values.
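
One way to realize such dynamic compensation is a Jacobson-style estimator that tracks both the smoothed propagation delay and its deviation. This is a hedged sketch, not the paper's control algorithm; the gain constants are the usual TCP RTO values, assumed here for illustration:

```python
class AdaptiveTimer:
    """Sets a correlation timer from a smoothed delay estimate
    plus a multiple of its smoothed deviation."""

    def __init__(self, alpha=0.125, beta=0.25, k=4.0):
        self.alpha, self.beta, self.k = alpha, beta, k
        self.srtt = None   # smoothed propagation delay
        self.rttvar = 0.0  # smoothed deviation

    def observe(self, delay):
        if self.srtt is None:
            self.srtt, self.rttvar = delay, delay / 2.0
        else:
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - delay))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * delay

    def timeout(self):
        # wait long enough to avoid false alarms, but not so long
        # that genuine alarms go undetected
        return self.srtt + self.k * self.rttvar

t = AdaptiveTimer()
for d in [100, 110, 90, 250, 105]:   # delays in ms, with one spike
    t.observe(d)
print(round(t.timeout()))  # 317
```

The deviation term is what manages the false-alarm/missed-alarm trade-off: when propagation delays become more variable, the timer automatically lengthens, and it tightens again once delays stabilize.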

We present a location-based, ubiquitous service architecture, based on the Session Initiation Protocol (SIP) and a service discovery protocol that enables users to enhance the multimedia communications services available on their mobile devices by discovering other local devices, and including them in their active sessions, creating a "virtual device." We have implemented our concept based on Columbia University's multimedia environment and we show its feasibility by a performance analysis.

We present an autonomic controller for quality collaborative video viewing, which allows groups of geographically dispersed users with different network and computer resources to view a video in synchrony while optimizing the video quality experienced. The autonomic controller is used within a tool for enhancing distance learning with synchronous group review of online multimedia material. The autonomic controller monitors video state at the clients' end, and adapts the quality of the video according to the resources of each client in (soft) real time. Experimental results show that the autonomic controller successfully synchronizes video for small groups of distributed clients and, at the same time, enhances the video quality experienced by users, in conditions of fluctuating bandwidth and variable frame rate.

State assignment is a formidable task. As designs written in a hardware description language such as Esterel inherently carry more high-level information than a register-transfer-level model, such information can be used to guide the encoding process. A question arises whether the high-level information alone is strong enough to suggest an efficient state assignment, allowing low-level details to be ignored.
This report suggests that, given Esterel's flexibility, most optimization potential is not within the high-level structure. It appears effective state assignment cannot rely solely on high-level information.

Porting software usually requires understanding what library functions the program being ported uses since this functionality must be either found or reproduced in the ported program's new environment. This is usually done manually through code inspections. We propose a type inference algorithm able to infer basic information about the library functions a particular C program uses in the absence of declaration information for the library (e.g., without header files). Based on a simple but efficient inference algorithm, we were able to infer declarations for much of the PalmOS API from the source of a twenty-seven-thousand-line C program. Such a tool will aid in the problem of program understanding when porting programs, especially from poorly-documented or lost legacy environments.
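
As a toy illustration of inferring library declarations from call sites alone, the sketch below recovers only the arity of each undeclared function from a C fragment. The real algorithm performs type inference; the PalmOS-style snippet is invented for this sketch:

```python
import re

# C fragment calling library functions with no header/declarations
C_SOURCE = """
    FrmGotoForm(MainForm);
    StrCopy(buf, src);
    MemSet(buf, 0, 64);
"""

def infer_arities(src):
    # match identifier(args) call sites and count the arguments;
    # a crude stand-in for declaration inference
    sigs = {}
    for name, args in re.findall(r"(\w+)\s*\(([^()]*)\)", src):
        nargs = len([a for a in args.split(",") if a.strip()])
        sigs[name] = nargs
    return sigs

print(infer_arities(C_SOURCE))
# {'FrmGotoForm': 1, 'StrCopy': 2, 'MemSet': 3}
```

Even this crude pass shows why call sites are informative: each use constrains the unknown function's signature, and a real type-inference pass can propagate argument and return types the same way.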

Software that covertly monitors user actions, also known as {\it
spyware,} has become a first-level security threat due to its ubiquity
and the difficulty of detecting and removing it. Such software may be
inadvertently installed by a user who is casually browsing the web,
or may be purposely installed by an attacker or even the owner of a
system. This is particularly problematic in the case of utility
computing, early manifestations of which are Internet cafes and
thin-client computing. Traditional trusted computing approaches offer
a partial solution to this by significantly increasing the size of the
trusted computing base (TCB) to include the operating system and other
software.
We examine the problem of protecting a user accessing specific
services in such an environment. We focus on secure video broadcasts
and remote desktop access when using any convenient, and often
untrusted, terminal as two example applications. We posit that, at
least for such applications, the TCB can be confined to a suitably
modified graphics processing unit (GPU). Specifically, to prevent
spyware on untrusted clients from accessing the user's data, we
restrict the boundary of trust to the client's GPU by moving image
decryption into GPUs. We use the GPU in order to leverage existing
capabilities as opposed to designing a new component from scratch.
We discuss the applicability of GPU-based
decryption in these two sample scenarios and identify the limitations
of the current generation of GPUs. We propose straightforward
modifications to future GPUs that will allow the realization of the
full approach.

This paper proposes an exploration method for robots equipped with a set of sonar sensors that does not provide complete coverage of the robot's close surroundings. In such cases, there is a high risk of collision with possibly undetected obstacles. The proposed method, adapted for use in outdoor urban environments, minimizes such risks while guiding the robot towards a predefined target location. During the process, a compact and accurate representation of the environment can be obtained.

This technical report investigates services suitable for end systems. We look into ITU Q.1211 services, AT&T 5ESS switch services, services defined in CSTA Phase III, and new services integrating other Internet services, such as presence information. We also explore how to use the Language for End System Services (LESS) to program the services.

We present WebPod, a portable device for managing web browsing sessions. WebPod leverages capacity improvements in portable solid state memory devices to provide a consistent environment to access the web. WebPod provides a thin virtualization layer that decouples a user's web session from any particular end-user device, allowing users freedom to move their work environments around. We have implemented a prototype in Linux that works with existing unmodified applications and operating system kernels. Our experimental results demonstrate that WebPod has very low virtualization overhead and can provide a full featured web browsing experience, including support for all helper applications and plug-ins one expects. WebPod is able to efficiently migrate a user's web session. This enables improved user mobility while maintaining a consistent work environment.

After a few decades of research and experimentation, register-transfer dialects of two standard languages---Verilog and VHDL---have emerged as the industry standard starting point for automatic large-scale digital integrated circuit synthesis. Writing RTL descriptions of hardware remains a largely human process and hence the clarity, precision, and ease with which such descriptions can be coded correctly has a profound impact on the quality of the final product and the speed with which the design can be created.
While the efficiency of a design (e.g., the speed at which it can run or the power it consumes) is obviously important, its correctness is usually the paramount issue, consuming the majority of the time (and hence money) spent during the design process. In response to this challenge, a number of so-called verification languages have arisen. These have been designed to assist in a simulation-based or formal verification process by providing mechanisms for checking temporal properties, generating pseudorandom test cases, and for checking how much of a design's behavior has been exercised by the test cases.
Through examples and discussion, this report describes the two main design languages---VHDL and Verilog---as well as SystemC, a language currently used to build large simulation models; SystemVerilog, a substantial extension of Verilog; and OpenVera, e, and PSL, the three leading contenders for becoming the main verification language.

Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "useful and relevant" content from web pages has many applications, including cell phone and PDA browsing, speech rendering for the visually impaired, reducing noise for information retrieval systems and generally improving the web browsing experience. In our previous work [16], we developed a framework that employed an easily extensible set of techniques that incorporated results from our earlier work on content extraction [16]. Our insight was to work with DOM trees, rather than raw HTML markup. We present here filters that reduce human involvement in applying heuristic settings for websites and instead automate the job by detecting and utilizing the physical layout and content genre of a given website. We also present work we have done towards improving the usability and performance of our content extraction proxy as well as the quality and accuracy of the heuristics that act as filters for inferring the context of a webpage.

P2P systems inherently have high scalability, robustness and fault tolerance because there is no centralized server and the network self-organizes itself. This is achieved at the cost of higher latency for locating the resources of interest in the P2P overlay network. Internet telephony can be viewed as an application of P2P architecture where the participants form a self-organizing P2P overlay network to locate and communicate with other participants. We propose a pure P2P architecture for the Session Initiation Protocol (SIP)-based IP telephony systems. Our P2P-SIP architecture supports basic user registration and call setup as well as advanced services such as offline message delivery, voice/video mails and multi-party conferencing. We also provide an overview of practical challenges for P2P-SIP such as firewall, Network Address Translator (NAT) traversal and security.

Internet telephony can introduce many novel communication services; this novelty, however, places a learning burden on users. Users would benefit greatly if their desired services could be created automatically. We developed an intelligent communication service creation environment that handles automatic service creation by learning from users' daily communication behavior. The environment models communication services as decision trees and uses the Incremental Tree Induction (ITI) algorithm for decision tree learning. We use Language for End System Services (LESS) scripts to represent the learned results and implemented a simulation environment to verify the learning algorithm. We also observed that when users get their desired services, they may not be aware of unexpected behaviors those services could introduce, for example, mistakenly rejecting expected calls. In this paper, we therefore also present a comprehensive analysis of fail-safe handling for communication services and propose several approaches to creating fail-safe services.

We present a microrobotic system for protein crystal micromanipulation
tasks. The focus in this report is on a task called streak seeding,
which is used by crystallographers to entice certain protein crystals
to grow. Our system features a set of custom designed micropositioner
end-effectors we call microshovels to replace traditional tools used
by crystallographers for this task. We have used micro-electrical
mechanical system (MEMS) techniques to design and manufacture various
shapes and quantities of microshovels. Visual feedback from a camera
mounted on the microscope is used to control the micropositioner as it
lowers a microshovel into the liquid containing the crystals for poking
and streaking. We present experimental results that illustrate the
applicability of our approach.

As IP telephony becomes more widely deployed and used, telemarketers and other spammers are bound to start using SIP-based calls and instant messages as a medium for sending spam. As is evident from the fate of email, protection against spam has to be built into SIP systems; otherwise, they too are bound to fall prey to spam. Traditional approaches to preventing email spam, such as content-based filtering and access lists, are not applicable to SIP calls and instant messages in their present form. We propose Domain-based Authentication and Policy-Enforced for SIP (DAPES): a system that can be easily implemented and deployed in existing SIP networks. Our system can determine in real time whether an incoming call or instant message is likely to be spam, while at the same time supporting communication between both known and unknown parties. DAPES includes the deployment of reputation systems in SIP networks to enable the real-time transfer of reputation information between parties, allowing communication between entities unknown to each other.

Conferencing services for Internet telephony and multimedia can be enhanced by integration with other Internet services, such as instant messaging, presence notification, directory lookups, location sensing, email, and the web. These services require a service programming architecture that can easily incorporate new Internet services into existing conferencing functionality, such as voice-enabled conference control. The W3C has defined the Call Control eXtensible Markup Language (CCXML), along with VoiceXML, for telephony call control services in point-to-point calls. However, CCXML cannot handle other Internet service events, such as those required for presence-enabled conferences. In this paper, we propose an architecture combining VoiceXML with our Language for End System Services (LESS) and the Common Gateway Interface (CGI) for programming multi-party conference services that integrate existing Internet services. VoiceXML provides the voice interface to LESS and CGI scripts. Our architecture enables many novel services, such as conference setup based on participant location and presence status. We give examples of the new services and describe our ongoing implementation.

Skype is a peer-to-peer VoIP client developed by KaZaa in 2003. Skype claims that it can work almost seamlessly across NATs and firewalls and has better voice quality than the MSN and Yahoo IM applications. It encrypts calls end-to-end, and stores user information in a decentralized fashion. Skype also supports instant messaging and conferencing. This report analyzes key Skype functions such as login, NAT and firewall traversal, call establishment, media transfer, codecs, and conferencing under three different network setups. Analysis is performed by careful study of Skype network traffic.

We propose a new approach for reacting to a wide variety of software failures, ranging from remotely exploitable vulnerabilities to more mundane bugs that cause abnormal program termination (e.g., an illegal memory dereference). Our emphasis is on creating "self-healing" software that can protect itself against a recurring fault until a more comprehensive fix is applied.
Our system consists of a set of sensors that monitor applications for various types of failure and an instruction-level emulator that is invoked for selected parts of a program's code. Use of such an emulator allows us to predict recurrences of faults, and recover program execution to a safe control flow. Using the emulator for small pieces of code, as directed by the sensors, allows us to minimize the performance impact on the immunized application.
We discuss the overall system architecture and a prototype implementation for the x86 platform. We evaluate the efficacy of our approach against a range of attacks and other software failures and investigate its performance impact on several server-type applications. We conclude that our system is effective in preventing the recurrence of a wide variety of software failures at a small performance cost.

In this paper, we present a performance comparison of two Linux live CD distributions, Knoppix (v3.3) and Quantian (v0.4.96). The library used for the performance evaluation is the Parallel Image Processing Toolkit (PIPT), a software library that contains several parallel image processing routines. A set of images was chosen, and a batch job of PIPT routines was run and timed under both live CD distributions. The point of comparison between the two live CDs was the total time the batch job required for completion.

This paper is concerned with rigid formations of mobile autonomous agents using a leader-follower structure. A formation is a group of agents moving in real 2- or 3-dimensional space. A formation is called rigid if the distance between each pair of agents does not change over time under ideal conditions. Sensing/communication links between agents are used to maintain a rigid formation. Two agents connected by a sensing/communication link are called neighbors. There are two types of neighbor relations in rigid formations: in the first, the neighbor relation is symmetric; in the second, it is asymmetric. Rigid formations with a leader-follower structure have the asymmetric neighbor relation. A framework to analyze rigid formations with symmetric neighbor relations is given in our previous work. This paper suggests an approach to analyze rigid formations that have a leader-follower structure.

This paper explores new-information detection, describing a strategy for filtering a stream of documents to present only information that is fresh. We focus on multi-document summarization and seek to efficiently use more linguistic information than is often seen in such systems. We experimented with our linguistic system and with a more traditional sentence-based, vector-space system and found that a combination of the two approaches boosted performance over each one alone.

This paper explores a combination of machine learning, approximate text segmentation and a vector-space model to distinguish novel information from repeated information. In experiments with the data from the Novelty Track at the Text Retrieval Conference, we show improvements over a variety of approaches, in particular in raising precision scores on this data, while maintaining a reasonable amount of recall.
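A minimal sketch of the vector-space side of such a system: a sentence counts as novel only if its bag-of-words vector is sufficiently dissimilar from everything already seen. The threshold and tokenization are illustrative assumptions; the actual system additionally combines machine learning and approximate text segmentation.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def novel_sentences(sentences, threshold=0.7):
    """Keep a sentence only if it is dissimilar from every sentence kept so far."""
    kept, vectors = [], []
    for s in sentences:
        v = Counter(s.lower().split())
        if all(cosine(v, w) < threshold for w in vectors):
            kept.append(s)
            vectors.append(v)
    return kept

stream = [
    "The storm hit the coast on Monday",
    "The storm hit the coast on Monday morning",  # near-duplicate: filtered out
    "Relief agencies began distributing supplies",
]
print(novel_sentences(stream))
```

Raising the threshold trades precision for recall, which matches the tuning question studied in the Novelty Track experiments.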

We compare UDP and TCP for transmitting voice data using PlanetLab, which lets us run experiments globally. For TCP, we also run experiments with the TCP_NODELAY option, which sends out data immediately. We compare the performance of the different protocols by their 90th-percentile delay and jitter. UDP performs better than TCP with TCP_NODELAY, which in turn performs better than plain TCP. We also explore the relation between the TCP delay in excess of the transmission time and the packet loss rate, and find a linear relationship between them.
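The TCP NODELAY setting referred to above is the standard TCP_NODELAY socket option, which disables Nagle's batching of small writes so that each small voice packet is sent immediately. A minimal sketch of enabling it:

```python
import socket

# Create a TCP socket and disable Nagle's algorithm, so small voice
# packets are sent immediately instead of being coalesced.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back to confirm it took effect.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(bool(nodelay))  # True
sock.close()
```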

We examine the problem of containing buffer overflow attacks in a safe and efficient manner. Briefly, we automatically augment source code to dynamically catch stack- and heap-based buffer overflow and underflow attacks, and recover from them by allowing the program to continue execution. Our hypothesis is that we can treat each code function as a transaction that can be aborted when an attack is detected, without affecting the application's ability to execute correctly. Our approach allows us to selectively enable or disable components of this defensive mechanism in response to external events, allowing a direct tradeoff between security and performance. We combine our defensive mechanism with a honeypot-like configuration to detect previously unknown attacks, automatically adapt an application's defensive posture at negligible performance cost, and help determine a worm's signature. The main benefits of our scheme are its low impact on application performance, its ability to respond to attacks without human intervention, its capacity to handle previously unknown vulnerabilities, and the preservation of service availability. We implemented a stand-alone tool, DYBOC, which we use to instrument a number of vulnerable applications. Our performance benchmarks indicate a slowdown of 20% for Apache in full-protection mode, and 1.2% with partial protection. We validate our transactional hypothesis via two experiments: first, by applying our scheme to 17 vulnerable applications, successfully fixing 14 of them; second, by examining the behavior of Apache when each of 154 potentially vulnerable routines is made to fail, resulting in correct behavior in 139 of the cases.

A Theoretical Analysis of the Conditions for Unambiguous Node Localization in Sensor Networks

Tolga Eren, Walter Whiteley, Peter N. Belhumeur

2004-09-13

In this paper we provide a theoretical foundation for the problem of network localization in which some nodes know their locations and other nodes determine their locations by measuring distances or bearings to their neighbors. Distance information is the separation between two nodes connected by a sensing/communication link. Bearing is the angle between a sensing/communication link and the x-axis of a node's local coordinate system. We construct grounded graphs to model network localization and apply graph rigidity theory and parallel drawings to test the conditions for unique localizability and to construct uniquely localizable networks. We further investigate partially localizable networks.

Large amounts of (often valuable) information are stored in web-accessible text databases. ``Metasearchers'' provide unified interfaces to query multiple such databases at once. For efficiency, metasearchers rely on succinct statistical summaries of the database contents to select the best databases for each query. So far, database selection research has largely assumed that databases are static, so the associated statistical summaries do not need to change over time. However, databases are rarely static and the statistical summaries that describe their contents need to be updated periodically to reflect content changes. In this paper, we first report the results of a study showing how the content summaries of 152 real web databases evolved over a period of 52 weeks. Then, we show how to use ``survival analysis'' techniques in general, and Cox's proportional hazards regression in particular, to model database changes over time and predict when we should update each content summary. Finally, we exploit our change model to devise update schedules that keep the summaries up to date by contacting databases only when needed, and then we evaluate the quality of our schedules experimentally over real web databases.

We present a set of interaction techniques for a hybrid user interface that integrates existing 2D and 3D visualization and interaction devices. Our approach is built around one- and two-handed gestures that support the seamless transition of data between co-located 2D and 3D contexts. Our testbed environment combines a 2D multi-user, multi-touch, projection surface with 3D head-tracked, see-through, head-worn displays and 3D tracked gloves to form a multi-display augmented reality. We also address some of the ways in which we can interact with private data in a collaborative, heterogeneous workspace.

Proportional share resource management provides a flexible and useful abstraction for multiplexing time-shared resources. We present Group Ratio Round-Robin ($GR^3$), the first proportional share scheduler that combines accurate proportional fairness scheduling behavior with $O(1)$ scheduling overhead on both uniprocessor and multiprocessor systems. $GR^3$ uses a novel client grouping strategy to organize clients into groups of similar processor allocations which can be more easily scheduled. Using this grouping strategy, $GR^3$ combines the benefits of low overhead round-robin execution with a novel ratio-based scheduling algorithm. $GR^3$ can provide fairness within a constant factor of the ideal generalized processor sharing model for client weights with a fixed upper bound and preserves its fairness properties on multiprocessor systems. We have implemented $GR^3$ in Linux and measured its performance against other schedulers commonly used in research and practice, including the standard Linux scheduler, Weighted Fair Queueing, Virtual-Time Round-Robin, and Smoothed Round-Robin. Our experimental results demonstrate that $GR^3$ can provide much lower scheduling overhead and much better scheduling accuracy in practice than these other approaches.
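The client grouping strategy can be illustrated with a small sketch. Here we assume, purely for illustration, that clients whose weights fall within the same power-of-two interval land in the same group; within a group, weights differ by less than a factor of two, so cheap round-robin is nearly fair, and the ratio-based scheduling is needed only between groups.

```python
import math
from collections import defaultdict

def group_clients(weights):
    """Place each client into group floor(log2(weight)), so every group
    holds clients whose weights differ by less than a factor of two."""
    groups = defaultdict(list)
    for client, w in weights.items():
        groups[int(math.log2(w))].append(client)
    return dict(groups)

weights = {"A": 1, "B": 3, "C": 5, "D": 7, "E": 12}
print(group_clients(weights))  # {0: ['A'], 1: ['B'], 2: ['C', 'D'], 3: ['E']}
# group g covers weights in [2**g, 2**(g+1)): the number of groups grows
# only logarithmically with the maximum weight, keeping overhead O(1)-ish.
```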

Rapid improvements in network bandwidth, cost, and ubiquity combined with the security hazards and high total cost of ownership of personal computers have created a growing market for thin-client computing. We introduce THINC, a remote display system architecture for high-performance thin-client computing in both LAN and WAN environments. THINC transparently maps high-level application display calls to a few simple low-level commands which can be implemented easily and efficiently. THINC introduces a number of novel latency-sensitive optimization techniques, including offscreen drawing awareness, command buffering and scheduling, non-blocking display operation, native video support, and server-side screen scaling. We have implemented THINC in an XFree86/Linux environment and compared its performance with other popular approaches, including Citrix MetaFrame, Microsoft Terminal Services, SunRay, VNC, and X. Our experimental results on web and video applications demonstrate that THINC can be as much as five times faster than traditional thin-client systems in high latency network environments and is capable of playing full-screen video at full frame rate.

This paper addresses the problem of efficiently calculating shadows from environment maps. Since accurate rendering of shadows from environment maps requires hundreds of lights, the expensive computation is determining visibility from each pixel to each light direction, such as by ray-tracing. We show that coherence in both spatial and angular domains can be used to reduce the number of shadow rays that need to be traced. Specifically, we use a coarse-to-fine evaluation of the image, predicting visibility by reusing visibility calculations from four nearby pixels that have already been evaluated. This simple method allows us to explicitly mark regions of uncertainty in the prediction. By only tracing rays in these and neighboring directions, we are able to reduce the number of shadow rays traced by up to a factor of 20 while maintaining error rates below 0.01\%. For many scenes, our algorithm can add shadowing from hundreds of lights at twice the cost of rendering without shadows.
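The reuse-or-trace decision can be sketched as follows; the predicate, data layout, and fallback below are illustrative simplifications of the coarse-to-fine algorithm, not its actual implementation.

```python
def predict_visibility(neigh):
    """neigh: visibility booleans toward one light from four nearby,
    already-evaluated pixels. Reuse the answer when all four agree;
    otherwise mark the direction as uncertain."""
    if all(neigh):
        return True, True
    if not any(neigh):
        return False, True
    return None, False  # disagreement: uncertain region

def visibility(pixel, light, neighbors, trace_ray):
    """Return visibility of `light` from `pixel`, tracing a shadow ray
    only when the neighbor-based prediction is uncertain."""
    pred, certain = predict_visibility([n[light] for n in neighbors])
    if certain:
        return pred
    return trace_ray(pixel, light)

traced = []
def trace_ray(pixel, light):
    traced.append((pixel, light))  # stand-in for an actual ray trace
    return True

neighbors = [{0: True, 1: True}, {0: True, 1: False},
             {0: True, 1: True}, {0: True, 1: False}]
v0 = visibility("p", 0, neighbors, trace_ray)  # all neighbors agree: no ray
v1 = visibility("p", 1, neighbors, trace_ray)  # disagreement: one ray traced
print(len(traced))  # 1
```

With hundreds of lights per pixel, skipping the agreed-upon directions is what yields the reported factor-of-20 reduction in shadow rays.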

Asynchronous (or ``clock-less'') digital circuit design has received much attention over the past few years, including its introduction into consumer products. One major bottleneck to the further advancement of clock-less design is the lack of optimizing CAD (computer-aided design) algorithms and tools. In synchronous design, CAD packages have been crucial to the advancement of the microelectronics industry. In fact, automated methods seem to be even more crucial for asynchronous design, which is widely considered as being much more error-prone.
This thesis proposes several new efficient CAD techniques for the design of asynchronous control circuits. The contributions include: (i) two new and very efficient algorithms for hazard-free two-level logic minimization, including a heuristic algorithm, ESPRESSO-HF, and an exact algorithm based on implicit data structures, IMPYMIN; and (ii) a new synthesis and optimization method for large-scale asynchronous systems, which starts from a Control-Dataflow Graph (CDFG), and produces highly-optimized distributed control.
As a case study, this latter method is applied to a differential equation solver; the resulting synthesized circuit is comparable in quality to a highly-optimized manual design.

In this note we prove that a monotone boolean function computable by a decision tree of size $s$ has average sensitivity at most $\sqrt{\log_2 s}$. As a consequence we show that monotone functions are learnable to constant accuracy under the uniform distribution in time polynomial in their decision tree size.
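The bound can be checked exhaustively on a small example. Here average sensitivity is computed directly from its definition, $\mathrm{AS}(f) = \mathbf{E}_x[\#\{i : f(x) \neq f(x^{\oplus i})\}]$, and we take decision tree size to mean the number of leaves; 3-bit majority is monotone and has a 6-leaf decision tree.

```python
from itertools import product
import math

def avg_sensitivity(f, n):
    """Exact average sensitivity: E_x[# coordinates i with f(x) != f(x^i)]."""
    total = 0
    for x in product([0, 1], repeat=n):
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # flip coordinate i
            total += f(tuple(x)) != f(tuple(y))
    return total / 2 ** n

maj3 = lambda x: int(sum(x) >= 2)  # monotone: 3-bit majority
print(avg_sensitivity(maj3, 3))    # 1.5
# MAJ3 has a decision tree with 6 leaves, and sqrt(log2 6) ~ 1.61 >= 1.5,
# consistent with the sqrt(log2 s) bound.
print(avg_sensitivity(maj3, 3) <= math.sqrt(math.log2(6)))  # True
```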

Orchestrating the Dynamic Adaptation of Distributed Software with Process Technology

Giuseppe Valetto

2004-05-24

Software systems are becoming increasingly complex to develop, understand, analyze, validate, deploy, configure, manage and maintain. Much of that complexity is related to ensuring adequate quality levels to services provided by software systems after they are deployed in the field, in particular when those systems are built from and operated as a mix of proprietary and non-proprietary components. That translates to increasing costs and difficulties when trying to operate large-scale distributed software ensembles in a way that continuously guarantees satisfactory levels of service.
A solution can be to exert some form of dynamic adaptation upon running software systems: dynamic adaptation can be defined as a set of automated and coordinated actions that aim at modifying the structure, behavior and performance of a target software system, at run time and without service interruption, typically in response to the occurrence of some condition(s). To achieve dynamic adaptation upon a given target software system, a set of capabilities, including monitoring, diagnostics, decision, actuation and coordination, must be put in place.
This research addresses the automation of decision and coordination in the context of an end-to-end, externalized approach to dynamic adaptation, which makes it possible to target legacy and component-based systems as well as new systems developed from scratch. In this approach, adaptation provisions are superimposed by a separate software platform, which operates from outside of, and orthogonally to, the target application as a whole; furthermore, a single adaptation may span concerted interventions on a multiplicity of target components. To properly orchestrate those interventions, decentralized process technology is employed for describing, activating, and coordinating the work of a cohort of software actuators toward the intended end-to-end dynamic adaptation.
The approach outlined above has been implemented in a prototype, code-named Workflakes, within the Kinesthetics eXtreme project on externalized dynamic adaptation carried out by the Programming Systems Laboratory of Columbia University, and has been employed in a set of diverse case studies. This dissertation discusses and evaluates the concept of process-based orchestration of dynamic adaptation and the Workflakes prototype on the basis of the results of those case studies.

We discuss the elastic versions of block ciphers whose round function processes subsets of bits from the data block differently, such as occurs in a Feistel network and in MISTY1. We focus on how specific bits are selected to be swapped after each round when forming the elastic version, using an elastic version of MISTY1 and differential cryptanalysis to illustrate why this swap step must be carefully designed. We also discuss the benefit of adding initial and final key dependent permutations in all elastic block ciphers. The implementation of the elastic version of MISTY1 is analyzed from a performance perspective.

Peer to Peer (P2P) systems that utilize Distributed Hash Tables (DHTs) provide a scalable means to distribute the handling of lookups. However, this scalability comes at the expense of increased vulnerability to specific types of attacks. In this paper, we focus on insider denial of service (DoS) attacks on such systems. In these attacks, nodes that are part of the DHT system are compromised and used to flood other nodes in the DHT with excessive request traffic.
We devise a distributed, lightweight protocol that detects such attacks, implemented solely within nodes that participate in the DHT. Our approach exploits inherent structural invariants of DHTs to ferret out attacking nodes whose request patterns deviate from ``normal'' behavior. We evaluate our protocol's ability to detect attackers via simulation within a Chord network. The results show that our system can detect a simple attacker whose attack traffic deviates by as little as 5\% from normal request traffic. We also demonstrate the resiliency of our protocol to coordinated attacks by as many as 25\% of the nodes. Our work shows that DHTs can protect themselves from insider flooding attacks, eliminating an important roadblock to their deployment and use in untrusted environments.

We describe an anomaly detector, called FWRAP, for a host-based intrusion detection system that monitors file system calls to detect anomalous accesses. The system is intended to be used not as a standalone detector but as one of a correlated set of host-based sensors. The detector has two parts: a sensor that audits file system accesses, and an unsupervised machine learning system that computes normal models of those accesses. We report on the architecture of the file system sensor, implemented on Linux using the FiST file wrapper technology, and on results from applying the anomaly detector to experimental data acquired with this sensor. FWRAP employs the Probabilistic Anomaly Detection (PAD) algorithm previously reported in our work on Windows registry anomaly detection. The detector is first trained by operating the host computer for some amount of time; a model specific to the target machine is automatically computed by PAD and is intended to be deployed to a real-time detector. In this paper we describe the feature set used to model file system accesses and the performance results of a set of experiments using the sensor while attacking a Linux host with a variety of malware exploits. The PAD detector achieved impressive detection rates, in some cases over 95\%, with about a 2\% false positive rate when alarming on anomalous processes.

The high cost of ownership of computing systems has resulted in a number of industry initiatives to reduce the burden of operations and management. Examples include IBM's Autonomic Computing, HP's Adaptive Infrastructure, and Microsoft's Dynamic Systems Initiative. All of these efforts seek to reduce operations costs by increased automation, ideally to have systems be self-managing without any human intervention (since operator error has been identified as a major source of system failures). While the concept of automated operations has existed for two decades, as a way to adapt to changing workloads, failures and (more recently) attacks, the scope of automation remains limited. We believe this is in part due to the absence of a fundamental understanding of how automated actions affect system behavior, especially system stability.
Other disciplines such as mechanical, electrical, and aeronautical engineering make use of control theory to design feedback systems. This paper uses control theory as a way to identify a number of requirements for and challenges in building self-managing systems,
either from new components or layering on top of existing components.

Blurring of Light due to Multiple Scattering by the Medium, a Path Integral Approach

Michael Ashikhmin, Simon Premoze, Ravi R, Shree Nayar

2004-03-31

Volumetric light transport effects are significant for many materials, such as skin, smoke, clouds, and water. In particular, one must consider the multiple scattering of light within the volume. Recently, we presented a path-integral-based approach to this problem that identifies the most probable path light takes through the medium and approximates energy transport over all paths by only those surrounding this most probable one. In this report we use the same approach to derive useful expressions for the amount of spatial and angular blurring light experiences as it travels through a medium.

Video cameras must produce images at a reasonable frame rate and with a reasonable depth of field. These requirements impose fundamental physical limits on the spatial resolution of the image detector. As a result, current cameras produce videos with a very low resolution. The resolution of videos can be computationally enhanced by moving the camera and applying super-resolution reconstruction algorithms. However, a moving camera introduces motion blur, which limits super-resolution quality. We analyze this effect and derive a theoretical result showing that motion blur has a substantial degrading effect on the performance of super-resolution. The conclusion is that, to achieve the highest resolution, motion blur should be avoided.
Motion blur can be minimized by sampling the space-time volume of the video in a specific manner. We have developed a novel camera, called the jitter camera, that achieves this sampling. By applying an adaptive super-resolution algorithm to the video produced by the jitter camera, we show that resolution can be notably enhanced for stationary or slowly moving objects, while it is improved slightly or left unchanged for objects with fast and complex motions. The end result is a video that has a significantly higher resolution than the captured one.

We present a new procedure for automatically synthesizing controllers from high-level Esterel specifications. Unlike existing \textsc{rtl} synthesis approaches, this approach frees the designer from tedious bit-level state encoding and certain types of inter-machine communication. Experimental results suggest that even with a fairly primitive state assignment heuristic, our compiler consistently produces smaller, slightly faster circuits than the existing Esterel compiler. We mainly attribute this to a different style of distributing state bits throughout the circuit. Initial results are encouraging, but some hand-optimized encodings suggest room for a better state assignment algorithm. We are confident that such improvements will make our technique even more practical.

We present MobiDesk, a mobile virtual desktop computing hosting infrastructure that leverages continued improvements in network speed, cost, and ubiquity to address the complexity, cost, and mobility limitations of today's personal computing infrastructure. MobiDesk transparently virtualizes a user's computing session by abstracting underlying system resources in three key areas: display, operating system and network. MobiDesk provides a thin virtualization layer that decouples a user's computing session from any particular end user device and moves all application logic from end user devices to hosting providers. MobiDesk virtualization decouples a user's computing session from the underlying operating system and server instance, enabling high availability service by transparently migrating sessions from one server to another during server maintenance or upgrades. We have implemented a MobiDesk prototype in Linux that works with existing unmodified applications and operating system kernels. Our experimental results demonstrate that MobiDesk has very low virtualization overhead, can provide a full-featured desktop experience including full-motion video support, and is able to migrate users' sessions efficiently and reliably for high availability, while maintaining existing network connections.

When one Sample is not Enough: Improving Text Database Selection Using Shrinkage

Panagiotis G. Ipeirotis, Luis Gravano

2004-03-17

Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which are not typically exported by databases. Previous research has developed algorithms for constructing an approximate content summary of a text database from a small document sample extracted via querying. Unfortunately, Zipf's law practically guarantees that content summaries built this way for any relatively large database will fail to cover many low-frequency words. Incomplete content summaries might negatively affect the database selection process, especially for short queries with infrequent words. To improve the coverage of approximate content summaries, we build on the observation that topically similar databases tend to have related vocabularies. Therefore, the approximate content summaries of topically related databases can complement each other and increase their coverage. Specifically, we exploit a (given or derived) hierarchical categorization of the databases and adapt the notion of ``shrinkage'' --a form of smoothing that has been used successfully for document classification-- to the content summary construction task. A thorough evaluation over 315 real web databases as well as over TREC data suggests that the shrinkage-based content summaries are substantially more complete than their ``unshrunk'' counterparts. We also describe how to modify existing database selection algorithms to adaptively decide --at run-time-- whether to apply shrinkage for a query. Our experiments, which rely on TREC data sets, queries, and the associated ``relevance judgments,'' show that our shrinkage-based approach is significantly more accurate than state-of-the-art database selection algorithms, including a recently proposed hierarchical strategy that also exploits database classification.
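The shrinkage step can be sketched as a mixture of word distributions along a database's category path: each word's smoothed probability blends the sampled database summary with its ancestor categories' summaries. The databases, category, and mixture weights below are hypothetical illustrations, not data from the evaluation.

```python
def shrink(db_summary, category_summaries, lambdas):
    """Mix a database's sampled word probabilities with those of its
    ancestor categories; lambdas weight each level and should sum to 1."""
    vocab = set(db_summary)
    for c in category_summaries:
        vocab |= set(c)
    levels = [db_summary] + category_summaries
    return {
        w: sum(l * lvl.get(w, 0.0) for l, lvl in zip(lambdas, levels))
        for w in vocab
    }

# Hypothetical sampled summary of a small health database: the low-frequency
# word "metastasis" was never sampled, but the parent "Health" category
# summary covers it, so shrinkage gives it nonzero probability.
db = {"cancer": 0.6, "treatment": 0.4}
health = {"cancer": 0.3, "treatment": 0.3, "metastasis": 0.4}
smoothed = shrink(db, [health], lambdas=[0.7, 0.3])
print(round(smoothed["metastasis"], 2))  # 0.12: nonzero after shrinkage
print(round(smoothed["cancer"], 2))      # 0.51
```

This is exactly the effect described above: topically related databases lend vocabulary coverage to an incomplete sample, at the cost of slightly flattening the database's own word frequencies, which is why the adaptive, per-query decision of whether to apply shrinkage matters.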

The rapidly increasing array of Internet-scale threats is a pressing problem for every organization that utilizes the network. Organizations often have limited resources to detect and respond to these threats. The sharing of information related to probes and attacks is a facet of an emerging trend toward "collaborative security." Collaborative security mechanisms provide network administrators with a valuable tool in this increasingly hostile environment.
The perceived benefit of a collaborative approach to intrusion detection is threefold: greater clarity about attacker intent, precise models of adversarial behavior, and a better view of global network attack activity. While many organizations see value in adopting such a collaborative approach, several critical problems must be addressed before intrusion detection can be performed on an inter-organizational scale. These obstacles to collaborative intrusion detection often go beyond the merely technical; the relationships between cooperating organizations impose additional constraints on the amount and type of information to be shared.
We propose a completely decentralized system that can efficiently distribute alerts to each collaborating peer. The system is composed of two major components that embody the main contribution of our research. The first component, named Worminator, is a tool for extracting relevant information from alert streams and encoding it in Bloom filters. The second component, Whirlpool, is a software system for scheduling correlation relationships between peer nodes. The combination of these systems accomplishes alert distribution in a scalable manner and without violating the privacy of each administrative domain.
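A minimal Bloom filter of the kind Worminator could use might look as follows (a generic sketch, not Worminator's actual encoding; the parameters `m` and `k` are illustrative). Because the filter stores only hashed bit positions, a peer can test whether an alert signature has been seen elsewhere without the sending domain revealing raw addresses:

```python
import hashlib

class BloomFilter:
    """Compact set representation: false positives are possible,
    false negatives are not."""
    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, item):
        # derive k bit positions from independent salted hashes
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A collaborating peer receiving only `bits` can check candidate alert keys against it; the one-way hashing is what preserves each domain's privacy in this style of exchange.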

We apply some of the existing web server redundancy techniques for high service availability and scalability to the relatively new IP telephony context. The paper compares various failover and load sharing methods for registration and call routing servers based on the Session Initiation Protocol (SIP). In particular, we consider the SIP server failover techniques based on the clients, DNS (Domain Name Service), database replication and IP address takeover, and the load sharing techniques using DNS, SIP identifiers, network address translators and servers with same IP addresses. Additionally, we present an overview of the failover mechanism we implemented in our test-bed using our SIP proxy and registration server and the open source MySQL database.

We present a 3D collaborative virtual environment, CHIME, in which geographically dispersed students can meet together in study groups or to work on team projects. Conventional educational materials from heterogeneous backend data sources are reflected in the virtual world through an automated metadata extraction and projection process that structurally organizes container materials into rooms and interconnecting doors, with atomic objects within containers depicted as furnishings and decorations. A novel in-world authoring tool makes it easy for instructors to design environments, with additional in-world modification afforded to the students themselves, in both cases without programming. Specialized educational services can also be added to virtual environments via programmed plugins. We present an example plugin that supports synchronized viewing of lecture videos by groups of students with widely varying bandwidths.

The increasing popularity of distance learning and online courses has highlighted the lack of collaborative tools for student groups. In addition, the introduction of lecture videos into the online curriculum has drawn attention to the disparity in the network resources used by the students. We present an architecture and adaptation model called AI2TV (Adaptive Internet Interactive Team Video), a system that allows geographically dispersed participants, possibly some or all disadvantaged in network resources, to collaboratively view a video in synchrony. AI2TV upholds the invariant that each participant will view semantically equivalent content at all times. Video player actions, like play, pause and stop, can be initiated by any of the participants and the results of those actions are seen by all the members. These features allow group members to review a lecture video in tandem to facilitate the learning process. We employ an autonomic (feedback loop) controller that monitors clients' video status and adjusts the quality of the video according to the resources of each client. We show in experimental trials that our system can successfully synchronize video for distributed clients while, at the same time, optimizing the video quality given actual (fluctuating) bandwidth by adaptively adjusting the quality level for each participant.
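The per-client adaptation step can be illustrated with a minimal sketch (the function name, level names, and kbps units are assumptions, not AI2TV's actual interface): the controller periodically measures a client's bandwidth and selects the richest quality level that client can sustain, falling back to the lowest level so the client never drops out of the shared session:

```python
def pick_quality(levels, bandwidth_kbps):
    """levels: quality name -> required bandwidth (kbps).
    Returns the highest-bitrate level affordable at the measured
    bandwidth, or the cheapest level if none is affordable."""
    ordered = sorted(levels.items(), key=lambda kv: kv[1])
    choice = ordered[0][0]  # guaranteed fallback: lowest-bitrate level
    for name, rate in ordered:
        if rate <= bandwidth_kbps:
            choice = name
    return choice
```

In a feedback loop, each client would re-run this selection as bandwidth fluctuates while all clients stay synchronized on the same semantic position in the video.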

We introduce a new concept of elastic block ciphers, symmetric-key encryption algorithms that for a variable size input do not expand the plaintext, (i.e., do not require plaintext padding), while maintaining the diffusion property of traditional block ciphers and adjusting their computational load proportionally to the size increase. Elastic block ciphers are ideal for applications where length-preserving encryption is most beneficial, such as protecting variable-length database entries or network packets.
We present a general algorithmic schema for converting a traditional block cipher, such as AES, to its elastic version, and analyze the security of the resulting cipher. Our approach allows us to ``stretch'' the supported block size of a block cipher up to twice the original length, while increasing the computational load proportionally to the block size. Our approach does not allow us to use the original cipher as a ``black box'' (i.e., as an ideal cipher or a pseudorandom permutation, as is done when constructing modes of encryption). Nevertheless, under some reasonable conditions on the cipher's structure and its key schedule, we reduce the security of the elastic version to that of the fixed-size block cipher. This schema and the security reduction enable us to capitalize on secure ciphers and their already established security properties in elastic designs. We are not aware of previous ``reduction type'' proofs of security in the area of concrete (i.e., non-``black-box'') block cipher design. Our implementation of the elastic version of AES, which accepts blocks of all sizes in the range 128 to 255 bits, was measured to be almost twice as fast as the traditional ``pad and block-encrypt'' approach when encrypting plaintext that is only a few bits longer than a full 128-bit block.

DotSlash: A Scalable and Efficient Rescue System for Handling Web Hotspots

Weibin Zhao, Henning Schulzrinne

2004-02-07

This paper describes DotSlash, a scalable and efficient rescue system for handling web hotspots. DotSlash allows different web sites to form a mutual-aid community, and use spare capacity in the community to relieve web hotspots experienced by any individual site. As a rescue system, DotSlash intervenes when a web site becomes heavily loaded, and is phased out once the workload returns to normal. It aims to complement existing web server infrastructure such as CDNs to handle short-term load spikes effectively, but is not intended to support a request load constantly higher than a web site's planned capacity. DotSlash is scalable, cost-effective, easy to use, self-configuring, and transparent to clients. It targets small web sites, although large web sites can also benefit from it. We have implemented a prototype of DotSlash on top of Apache. Experiments show that DotSlash can provide an order of magnitude improvement for a web server in terms of the request rate supported and the data rate delivered to clients, even if only HTTP redirection is used. Parts of this work may be applicable to other services such as Grid computational services and media streaming.

Existing applications often contain security holes that are not patched until after the system has already been compromised. Even when software updates are applied to address security issues, they often result in system services being unavailable for some time. To address these system security and availability issues, we have developed peas and pods. A pea provides a least privilege environment that can restrict processes to the minimal subset of system resources needed to run. This mechanism enables the creation of environments for privileged program execution that can help with intrusion prevention and containment. A pod provides a group of processes and associated users with a consistent, machine-independent virtualized environment. Pods are coupled with a novel checkpoint-restart mechanism which allows processes to be migrated across minor operating system kernel versions with different security patches. This mechanism gives system administrators the flexibility to patch their operating systems immediately without worrying about potential loss of data or needing to schedule system downtime. We have implemented peas and pods in Linux without requiring any application or operating system kernel changes. Our measurements on real-world desktop and server applications demonstrate that peas and pods impose little overhead and enable secure isolation and migration of untrusted applications.

Internet telephony end systems can offer many services. Different services may interfere with each other, a problem known as feature interaction. The feature interaction problem has existed in telecommunication systems for many years. The introduction of Internet telephony helps to solve some interaction problems due to the richness of its signaling information. However, many new feature interaction problems are also introduced in Internet telephony systems, especially in end systems, which are usually dumb in PSTN systems but highly functional in Internet telephony systems. Internet telephony end systems, such as SIP soft-agents, can run on personal computers. The soft-agents can then perform call control and many other functions, such as presence information handling, instant messaging, and network appliance control. These new functionalities make end system feature interaction problems more complicated. In this paper, we investigate the ways in which features interact in Internet telephony end systems and propose potential solutions for detecting and avoiding feature interactions. Our solutions are based on the Session Initiation Protocol (SIP) and the Language for End System Services (LESS), a markup language designed specifically for end system service creation.

The Complexity of Fredholm Equations of the Second Kind: Noisy Information About Everything

Arthur G. Werschulz

2004-01-21

We study the complexity of Fredholm problems of the second kind $u - \int_\Omega k(\cdot,y)u(y)\,dy = f$. Previous work on the complexity of this problem has assumed that $\Omega$ was the unit cube~$I^d$. In this paper, we allow~$\Omega$ to be part of the data specifying an instance of the problem, along with~$k$ and~$f$. More precisely, we assume that $\Omega$ is the diffeomorphic image of the unit $d$-cube under a $C^{r_1}$ mapping~$\rho\colon I^d\to I^l$. In addition, we assume that $k\in C^{r_2}(I^{2l})$ and $f\in W^{r_3,p}(I^l)$ with $r_3>l/p$. Our information about the problem data is contaminated by $\delta$-bounded noise. Error is measured in the $L_p$-sense. We find that the $n$th minimal error is bounded from below by $\Theta(n^{-\mu_1}+\delta)$ and from above by $\Theta(n^{-\mu_2}+\delta)$, where $$\mu_1 = \min\left\{\frac{r_1}{d}, \frac{r_2}{2d}, \frac{r_3-(d-l)/p}d\right\} \qquad\text{and}\qquad \mu_2 = \min\left\{\frac{r_1-\nu}d, \frac{r_2}{2d}, \frac{r_3-(l-d)/p}d\right\},$$ with $$\nu = \begin{cases} \displaystyle\frac{d}p & \text{if $r_1\ge 2$, $r_2\ge2$, and $d\le p$},\\ & \\ 1 & \text{otherwise}. \end{cases}$$ In particular, the $n$th minimal error is $\Theta(n^{-\mu_1}+\delta)$ when $p=\infty$. The upper bound is attained by a noisy modified Galerkin method, which can be efficiently implemented using multigrid techniques. We thus find bounds on the $\varepsilon$-complexity of the problem, these bounds depending on the cost $\mathbf{c}(\delta)$ of calculating a $\delta$-noisy function value. As an example, if $\mathbf{c}(\delta)=\delta^{-b}$, we find that the $\varepsilon$-complexity is between $(1/\varepsilon)^{b+1/\mu_1}$ and $(1/\varepsilon)^{b+1/\mu_2}$.

One frequently cited reason for the lack of wide deployment of cryptographic protocols is the (perceived) poor performance of the algorithms they employ and their impact on the rest of the system. Although high-performance dedicated cryptographic accelerator cards have been commercially available for some time, market penetration remains low. We take a different approach, seeking to exploit {\it existing system resources,} such as Graphics Processing Units (GPUs) to accelerate cryptographic processing.
We exploit the ability of GPUs to simultaneously process large quantities of pixels to offload cryptographic processing from the main processor. We demonstrate the use of GPUs for stream ciphers, which can achieve 75\% of the performance of a fast CPU. We also investigate the use of GPUs for block ciphers, discuss operations that make certain ciphers unsuitable for use with a GPU, and compare the performance of an OpenGL-based implementation of AES with implementations utilizing general-purpose CPUs. In addition to offloading the main processor, the ability to perform encryption and decryption within the GPU has potential applications in image processing by limiting exposure of the plaintext to within the GPU.

Web pages often contain clutter (such as unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction of "useful and relevant" content from web pages has many applications, including cell phone and PDA browsing, speech rendering for the visually impaired, and text summarization. Most approaches to making content more readable involve changing font size or removing HTML and data components such as images, which takes away from a webpage's inherent look and feel. Unlike "Content Reformatting", which aims to reproduce the entire webpage in a more convenient form, our solution directly addresses "Content Extraction". We have developed a framework that employs an easily extensible set of techniques. It incorporates advantages of previous work on content extraction. Our key insight is to work with DOM trees, a W3C-specified interface that allows programs to dynamically access document structure, rather than with raw HTML markup. We have implemented our approach in a publicly available Web proxy to extract content from HTML web pages. This proxy can be used both centrally, administered for groups of users, as well as by individuals for personal browsers. After receiving feedback from users of the proxy, we also created a revised version designed for improved performance and accessibility.
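The DOM-based approach can be sketched with a toy link-density filter, one common content-extraction heuristic (the framework described above combines several such techniques; all names below are invented, and a real implementation would use a full W3C DOM rather than this minimal tree):

```python
from html.parser import HTMLParser

VOID = {"br", "img", "hr", "meta", "link", "input"}  # tags with no close tag

class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent = tag, parent
        self.children, self.text = [], ""

class DomBuilder(HTMLParser):
    """Build a minimal DOM-like tree from (well-formed) HTML."""
    def __init__(self):
        super().__init__()
        self.root = self.cur = Node("#root")

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.cur)
        self.cur.children.append(node)
        if tag not in VOID:
            self.cur = node

    def handle_endtag(self, tag):
        n = self.cur
        while n is not self.root and n.tag != tag:
            n = n.parent          # tolerate unclosed children
        if n is not self.root:
            self.cur = n.parent

    def handle_data(self, data):
        self.cur.text += data

def stats(node, in_link=False):
    # (total characters, characters inside <a> elements) for a subtree
    in_link = in_link or node.tag == "a"
    total = len(node.text.strip())
    link = total if in_link else 0
    for c in node.children:
        t, l = stats(c, in_link)
        total, link = total + t, link + l
    return total, link

def extract(node, out, max_link_ratio=0.6):
    # prune containers dominated by link text (navigation bars, menus)
    total, link = stats(node)
    if total and link / total > max_link_ratio:
        return
    if node.text.strip():
        out.append(node.text.strip())
    for c in node.children:
        extract(c, out, max_link_ratio)

def content_text(html):
    b = DomBuilder()
    b.feed(html)
    out = []
    extract(b.root, out)
    return " ".join(out)
```

On a page with a link-heavy navigation block and a text-heavy article block, the filter keeps the article and drops the navigation, which is the essence of structure-aware (rather than markup-stripping) extraction.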

AIM Encrypt: A Case Study of the Dangers of Cryptographic Urban Legends

Michael E. Locasto

2003-11-26

Like e-mail, instant messaging (IM) has become an integral part of life in a networked society. Until recently, IM software has been lax about providing confidentiality and integrity of these conversations.
With the introduction of AOL's version 5.2.3211 of the AIM client, users can optionally encrypt and protect the integrity of their conversation. Taking advantage of the encryption capabilities of the AIM client requires that signed certificates for both parties be available. AIM (through VeriSign) makes such certificates available for purchase. However, in a ``public service'' effort to defray the cost of purchasing personal certificates to protect IM conversations, a website (www.aimencrypt.com) is offering a certificate free of cost for download. Unfortunately, the provided certificate is the same for everyone; this mistake reveals the dangers of a public undereducated about computer security, especially public key cryptography.

The ability of worms to spread at rates that effectively preclude human-directed reaction has elevated them to a first-class security threat to distributed systems. We propose an architecture for automatically repairing software flaws that are exploited by zero-day worms. Our approach relies on source code transformations to quickly apply automatically-created (and tested) localized patches to vulnerable segments of the targeted application. To determine these susceptible portions, we use a sandboxed instance of the application as a ``clean room'' laboratory that runs in parallel with the production system and exploit the fact that a worm must reveal its infection vector to achieve its goal (i.e., further infection). We believe our approach to be the first end-point solution to the problem of malicious self-replicating code. The primary benefits of our approach are (a) its low impact on application performance, (b) its ability to respond to attacks without human intervention, and (c) its capacity to deal with zero-day worms (for which no known patches exist). Furthermore, our approach does not depend on a centralized update repository, which can be the target of a concerted attack similar to the Blaster worm. Finally, our approach can also be used to protect against lower intensity attacks, such as intrusion (``hack-in'') attempts. To experimentally evaluate the efficacy of our approach, we use our prototype implementation to test a number of applications with known vulnerabilities. Our preliminary results indicate a success rate of 82\%, and a maximum repair time of 8.5 seconds.

We propose dynamical systems trees (DSTs) as a flexible model for describing multiple processes that interact via a hierarchy of aggregating processes. DSTs extend nonlinear dynamical systems to an interactive group scenario. Various individual processes interact as communities and sub-communities in a tree structure that is unrolled in time. To accommodate nonlinear temporal activity, each individual leaf process is modeled as a dynamical system containing discrete and/or continuous hidden states with discrete and/or Gaussian emissions. Higher-level parent processes then act like hidden Markov models that mediate the interaction between leaf processes or between other parent processes in the hierarchy. Aggregator chains are parents of the child processes they combine and mediate, yielding a compact overall parameterization. We provide tractable inference and learning algorithms for arbitrary DST topologies via structured mean field. Experiments are shown for real trajectory data of tracked American football plays, where a DST tracks players as dynamical systems mediated by their team processes, which are in turn mediated by a top-level game process.

We describe the architecture and implementation of our comprehensive multi-platform collaboration framework known as Columbia InterNet Extensible Multimedia Architecture (CINEMA). It provides a distributed architecture for collaboration using synchronous communications like multimedia conferencing, instant messaging, shared web-browsing, and asynchronous communications like discussion forums, shared files, voice and video mails. It allows seamless integration with various communication means like telephones, IP phones, web and electronic mail. In addition, it provides value-added services such as call handling based on location information and presence status. The paper discusses the media services needed for collaborative environment, the components provided by CINEMA and the interaction among those components.

Autonomic computing - self-configuring, self-healing, self-optimizing applications, systems and networks - is a promising solution to ever-increasing system complexity and the spiraling costs of human management as systems scale to global proportions. Most results to date, however, suggest ways to architect new software constructed from the ground up as autonomic systems, whereas in the real world organizations continue to use stovepipe legacy systems and/or build "systems of systems" that draw from a gamut of disparate technologies from numerous vendors. Our goal is to retrofit autonomic computing onto such systems, externally, without any need to understand, modify or even recompile the target system's code. We present an autonomic infrastructure that operates similarly to active middleware, to explicitly add autonomic services to pre-existing systems via continual monitoring and a feedback loop that performs, as needed, reconfiguration and/or repair. Our lightweight design and separation of concerns enables easy adoption of individual components, independent of the rest of the full infrastructure, for use with a large variety of target systems. This work has been validated by several case studies spanning multiple application domains.

From the outset of automated generation of summaries, the difficulty of evaluation has been widely discussed. Despite many promising attempts, we believe it remains an unsolved problem. Here we present a method for scoring the content of summaries of any length against a weighted inventory of content units, which we refer to as a pyramid. Our method is derived from empirical analysis of human-generated summaries, and provides an informative metric for human or machine-generated summaries.
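The scoring idea can be sketched in a few lines (a hedged illustration with invented names: the pyramid maps each content unit to its weight, i.e., the number of model summaries expressing it, and a peer summary earns the weights of the units it expresses, normalized by the best achievable total). One simplification to note: here the ideal-summary size defaults to the peer's own unit count, whereas the actual method derives it from the model summaries:

```python
def pyramid_score(pyramid, peer_scus, x=None):
    """pyramid: content-unit id -> weight (count of model summaries
    expressing that unit).  peer_scus: ids of units found in the summary
    being scored.  x: number of units in an ideally informative summary
    (defaults to len(peer_scus) in this sketch)."""
    if x is None:
        x = len(peer_scus)
    earned = sum(pyramid.get(s, 0) for s in peer_scus)
    # an optimal summary of x units takes the x heaviest units
    optimal = sum(sorted(pyramid.values(), reverse=True)[:x])
    return earned / optimal if optimal else 0.0
```

A summary that expresses high-weight (widely agreed-upon) content scores near 1, while one that expresses only idiosyncratic units scores near 0, which is what makes the metric informative for both human and machine summaries.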

The Session Initiation Protocol (SIP) is a signaling protocol for Internet telephony, multimedia conferencing and instant messaging. Although SIP implementations have not yet been widely deployed, the product portfolio is expanding rapidly. We describe a method for assessing the robustness of SIP implementations, along with a tool for finding vulnerabilities. We prepared the test material and carried out tests against a sample set of existing implementations. Results were reported to the vendors and the test suite was made publicly available. Many of the implementations available for evaluation failed to perform in a robust manner under the test. Some failures had information security implications, and should be considered vulnerabilities.

Recently, there have been significant advances in several areas of language technology, including clustering, text categorization, and summarization. However, efforts to combine technology from these areas in a practical system for information access have been limited. In this paper, we present a system that integrates cutting-edge technology in these areas to automatically collect news articles from multiple sources, organize them and present them in both hierarchical and text summary form. Our system is publicly available and runs daily over real data. Through a sizable user evaluation, we show that users strongly prefer using the advanced features incorporated in our system, and that these features help users achieve more efficient browsing of news.

This document enumerates some of the major opportunities and challenges for providing emergency call (9-1-1) services using IP technology. In particular, all VoIP devices are effectively mobile. The same IP telephony device works anywhere in the Internet, keeping the same external identifier such as an E.164 number or URL. (Note: This was also submitted as an ex-parte filing to the Federal Communications Commission.)

We present SABER (Survivability Architecture: Block, Evade, React), a proposed survivability architecture that blocks, evades and reacts to a variety of attacks by using several security and survivability mechanisms in an automated and coordinated fashion. Contrary to the ad hoc manner in which contemporary survivable systems are built--using isolated, independent security mechanisms such as firewalls, intrusion detection systems and software sandboxes--SABER integrates several different technologies in an attempt to provide a unified framework for responding to the wide range of attacks malicious insiders and outsiders can launch.
This coordinated multi-layer approach will be capable of defending against attacks targeted at various levels of the network stack, such as congestion-based DoS attacks, software-based DoS or code-injection attacks, and others. Our fundamental insight is that while multiple lines of defense are useful, most conventional, uncoordinated approaches fail to exploit the full range of available responses to incidents. By coordinating the response, the ability to survive even in the face of successful security breaches increases substantially.
We discuss the key components of SABER, how they will be integrated, and how we can leverage the promising results of the individual components to improve survivability in a variety of coordinated attack scenarios. SABER is currently in the prototyping stages, with several interesting open research topics.

Using Process Technology to Control and Coordinate Software Adaptation

Giuseppe Valetto, Gail Kaiser

2003-07-09

We have developed an infrastructure for end-to-end run-time monitoring, behavior/performance analysis, and dynamic adaptation of distributed software. This infrastructure is primarily targeted to pre-existing systems and thus operates outside the target application, without making assumptions about the target's implementation, internal communication/computation mechanisms, source code availability, etc. This paper assumes the existence of the monitoring and analysis components, presented elsewhere, and focuses on the mechanisms used to control and coordinate possibly complex repairs/reconfigurations to the target system. These mechanisms require lower level effectors somehow attached to the target system, so we briefly sketch one such facility (elaborated elsewhere). Our main contribution is the model, architecture, and implementation of Workflakes, the decentralized process engine we use to tailor, control, coordinate, etc. a cohort of such effectors. We have validated the Workflakes approach with case studies in several application domains. Due to space restrictions we concentrate primarily on one case study, briefly discuss a second, and only sketch others.

Autonomic computing - self-configuring, self-healing, self-optimizing applications, systems and networks - is widely believed to be a promising solution to ever-increasing system complexity and the spiraling costs of human system management as systems scale to global proportions. Most results to date, however, suggest ways to architect new software constructed from the ground up as autonomic systems, whereas in the real world organizations continue to use stovepipe legacy systems and/or build "systems of systems" that draw from a gamut of new and legacy components involving disparate technologies from numerous vendors. Our goal is to retrofit autonomic computing onto such systems, externally, without any need to understand or modify the code, and in many cases even when it is impossible to recompile. We present a meta-architecture implemented as active middleware infrastructure to explicitly add autonomic services via an attached feedback loop that provides continual monitoring and, as needed, reconfiguration and/or repair. Our lightweight design and separation of concerns enables easy adoption of individual components, as well as the full infrastructure, for use with a large variety of legacy, new systems, and systems of systems. We summarize several experiments spanning multiple domains.

Group Round Robin: Improving the Fairness and Complexity of Packet Scheduling

Bogdan Caprita, Wong Chun Chan, Jason Nieh

2003-07-02

We introduce Group Round-Robin (GRR) scheduling, a hybrid scheduling framework based on a novel grouping strategy that narrows down the traditional tradeoff between fairness and computational complexity. GRR combines its grouping strategy with a specialized round-robin scheduling algorithm that utilizes the properties of GRR groups to schedule flows within groups in a manner that provides O(1) bounds on fairness with only O(1) time complexity. Under the practical assumption that GRR employs a small constant number of groups, we apply GRR to popular fair queueing scheduling algorithms and show how GRR can be used to achieve constant bounds on fairness and time complexity for these algorithms.

A General Framework for Designing Catadioptric Imaging and Projection Systems

Rahul Swaminathan, Michael D. Grossberg, Shree K. Nayar

2003-07-01

New vision applications have been made possible and old ones improved through the creation and design of novel catadioptric systems. Critical to the design of catadioptric imaging is determining the shape of one or more mirrors in the system. Almost all mirrors previously designed for catadioptric systems required case-specific tools and considerable effort on the part of the designer. Recently, some general methods have been proposed to automate the design process. However, all the methods presented so far determine the mirror shape by optimizing its geometric properties, such as surface normal orientations. A more principled approach is to determine a mirror that reduces image errors.
In this paper we present a method for finding mirror shapes which meet user-determined specifications while minimizing image error. We accomplish this by deriving a first-order approximation of the image error. This permits us to compute the mirror shape using a linear approach that provides good results efficiently while avoiding the numerical problems associated with non-linear optimization. Since the design of mirrors can also be applied to projection systems, we also provide a method to approximate projection errors in the scene. We demonstrate our approach on various catadioptric systems and show that it provides much more accurate imaging characteristics. In some cases we achieved reductions in image error of up to 80 percent.

Using Prosodic Features of Speech and Audio Localization in Graphical User Interfaces

Alex Olwal, Steven Feiner

2003-06-26

We describe several approaches for using prosodic features of speech and audio localization to control interactive applications. This information can be used for parameter control, as well as for disambiguating speech recognition. We discuss how characteristics of the spoken sentences can be exploited in the user interface; for example, by considering the speed with which the sentence was spoken and the presence of extraneous utterances. We also show how coarse audio localization can be used for low-fidelity gesture tracking, by inferring the speaker's head position.

A Natural Language Generation system produces text from semantic data given as input. One of its very first tasks is to decide which pieces of information to convey in the output. This task, called Content Selection, is quite domain dependent, requiring considerable re-engineering to port the system from one scenario to another. In (Duboue and McKeown, 2003), we presented a method to acquire content selection rules automatically from a corpus of text and associated semantics. Our proposed technique was evaluated by comparing its output with information selected by human authors in unseen texts, where we were able to filter half the input data set without loss of recall. This report contains additional technical information about our system.

We have developed a multilingual version of Columbia Newsblaster as a testbed for multilingual multi-document summarization. The system collects, clusters, and summarizes news documents from sources all over the world daily. It crawls news sites in many different countries, written in different languages, extracts the news text from the HTML pages, uses a variety of methods to translate the documents for clustering and summarization, and produces an English summary for each cluster. The system is robust, running daily over real-world data. The multilingual version of Columbia Newsblaster provides a platform for testing different strategies for multilingual document clustering, and approaches for multilingual multi-document summarization.

Faceted classification allows one to model applications with complex classification hierarchies using orthogonal dimensions. Recent work has examined the use of faceted classification for browsing and search. In this paper, we go further by developing a general query language, called the entity algebra, for hierarchically classified data. The entity algebra is compositional, with query inputs and outputs being sets of entities. Our language has linear data complexity in terms of space and quadratic data complexity in terms of time. We compare the entity algebra with the relational algebra in terms of expressiveness. We also describe an implementation of the language in the context of two application domains, one for an archeological database, and another for a human anatomy database.

Proportional share resource management provides a flexible and useful abstraction for multiplexing time-shared resources. However, previous proportional share mechanisms have either weak proportional sharing accuracy or high scheduling overhead. We present Group Ratio Round-Robin (GR3), a proportional share scheduler that can provide high proportional sharing accuracy with O(1) scheduling overhead. Unlike many other schedulers, a low-overhead GR3 implementation is easy to build using simple data structures. We have implemented GR3 in Linux and measured its performance against other schedulers commonly used in research and practice, including the standard Linux scheduler, Weighted Fair Queueing, Virtual-Time Round-Robin, and Smoothed Round-Robin. Our experimental results demonstrate that GR3 can provide much lower scheduling overhead and better scheduling accuracy in practice than these other approaches for large numbers of clients.
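The grouping idea behind GR3 can be sketched as follows. This is a deliberately simplified illustration with invented names: clients whose weights lie within a factor of two share a group, groups are selected by a deficit-style credit rule (not GR3's actual ratio rule), and clients within a group are served plain round-robin:

```python
import math
from collections import deque

def make_groups(weights):
    # place a client of weight w into group floor(log2(w)); all clients
    # in a group then have weights within a factor of two of each other
    groups = {}
    for client, w in weights.items():
        groups.setdefault(int(math.log2(w)), deque()).append(client)
    return groups

def schedule(weights, slots):
    groups = make_groups(weights)
    group_weight = {g: sum(weights[c] for c in q) for g, q in groups.items()}
    total = sum(group_weight.values())
    credit = {g: 0.0 for g in groups}
    order = []
    for _ in range(slots):
        for g in credit:                    # credit each group its share
            credit[g] += group_weight[g] / total
        g = max(credit, key=credit.get)     # serve the most-credited group
        credit[g] -= 1.0
        q = groups[g]                       # round-robin within the group
        order.append(q[0])
        q.rotate(-1)
    return order
```

Because inter-group selection operates on a small, constant number of groups and intra-group selection is round-robin, each scheduling decision costs O(1) under the assumption stated in the abstract; the factor-of-two weight spread inside a group is what bounds the intra-group unfairness.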

The Natural Language Group is developing a multi-language version of Columbia Newsblaster, a program that generates summaries of news articles collected from web sites. Newsblaster currently processes articles in Arabic, Japanese, Portuguese, Spanish, and Russian, as well as English. This report outlines the Russian language processing software, focusing on machine translation and document clustering. Russian-English clustering results are analyzed and indicate encouraging inter-language and intra-language performance.

Embedded systems are application-specific computers that interact with the physical world. Each has a diverse set of tasks to perform, and although a single, very flexible language might be able to handle all of them, a variety of problem-domain-specific languages have instead evolved that are easier to write, analyze, and compile.
This paper surveys some of the more important languages, introducing their central ideas quickly without going into detail. A small example of each is included.

A top-k query specifies a set of preferred values for the attributes of a relation and expects as a result the k objects that are closest to the given preferences according to some distance function. In many web applications, the relation attributes are only available via probes to autonomous web-accessible sources. Probing these sources sequentially to process a top-k query is inefficient, since web accesses exhibit high and variable latency. Fortunately, web sources can be probed in parallel, and each source can typically process concurrent requests, although sources may impose some restrictions on the type and number of probes that they are willing to accept. These characteristics of web sources motivate the introduction of parallel top-k query processing strategies, which are the focus of this paper. We present efficient techniques that maximize source-access parallelism to minimize query response time, while satisfying source access constraints. A thorough experimental evaluation over both synthetic and real web sources shows that our techniques can be significantly more efficient than previously proposed sequential strategies. In addition, we adapt our parallel algorithms for the alternate optimization goal of minimizing source load while still exploiting source-access parallelism.
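The parallel probing idea can be sketched as follows, with hypothetical in-memory "sources" that each return one attribute value per probe. Real web sources add high, variable latency and per-source access constraints, which the paper's strategies are designed around; this sketch only shows probes to independent sources being issued concurrently and the results ranked by distance to the preferences:

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_topk(objects, sources, prefs, k):
    """Probe every source for every candidate object concurrently, then
    return the k objects closest to the preferred attribute values under
    a simple sum-of-absolute-differences distance (an assumed distance
    function for illustration)."""
    def probe(obj):
        # One probe per (object, source) pair; the pool overlaps them.
        return obj, sum(abs(src(obj) - pref)
                        for src, pref in zip(sources, prefs))
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        scored = list(pool.map(probe, objects))
    best = heapq.nsmallest(k, scored, key=lambda pair: pair[1])
    return [obj for obj, dist in best]
```

Here each "source" is just a function extracting one attribute; in the web setting each probe would be a network round-trip, which is exactly why issuing them in parallel reduces query response time.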

We propose a new storage model called MBSM (Multi-resolution Block Storage Model) for laying out tables on disks. MBSM is intended to speed up operations such as scans that are typical of data warehouse workloads. Disk blocks are grouped into ``super-blocks,'' with a single record stored in a partitioned fashion among the blocks in a super-block. The intention is that a scan operation that needs to consult only a small number of attributes can access just those blocks of each super-block that contain the desired attributes. To achieve good performance given the physical characteristics of modern disks, we organize super-blocks on the disk into fixed-size ``mega-blocks.'' Within a mega-block, blocks of the same type (from various super-blocks) are stored contiguously. We describe the changes needed in a conventional database system to manage tables using such a disk organization. We demonstrate experimentally that MBSM outperforms competing approaches such as NSM (N-ary Storage Model), DSM (Decomposition Storage Model) and PAX (Partition Attributes Across), for I/O bound decision-support workloads consisting of scans in which not all attributes are required. This improved performance comes at the expense of single-record insert and delete performance; we quantify the trade-offs involved. Unlike DSM, the cost of reconstructing a record from its partitions is small. MBSM stores attributes in a vertically partitioned manner similar to PAX, and thus shares PAX's good CPU cache behavior. We describe methods for mapping attributes to blocks within super-blocks in order to optimize overall performance, and show how to tune the super-block and mega-block sizes.
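A toy sketch of the super-block idea follows, with a hypothetical attribute-to-block mapping (`attr_groups`). It shows why a scan that needs only a few attributes touches only some blocks of each super-block; the mega-block disk placement and the tuning of block sizes, which the paper addresses, are not modeled:

```python
def build_superblocks(records, attr_groups):
    """Store each record partitioned across the blocks of one
    super-block, one block per attribute group (the assumed mapping)."""
    return [[{a: rec[a] for a in group} for group in attr_groups]
            for rec in records]

def scan(superblocks, wanted):
    """Read only the blocks whose attribute group intersects `wanted`,
    counting block reads to show the I/O saving."""
    blocks_read = 0
    rows = []
    for sb in superblocks:
        row = {}
        for block in sb:
            if any(a in wanted for a in block):
                blocks_read += 1
                row.update({a: v for a, v in block.items() if a in wanted})
        rows.append(row)
    return rows, blocks_read
```

Scanning one attribute out of four, with attributes split two-per-block, reads half the blocks, which is the effect MBSM exploits for scan-heavy decision-support workloads.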

We present DefScriber, a fully implemented system that combines knowledge-based and statistical methods in forming multi-sentence answers to open-ended definitional questions of the form, ``What is X?'' We show how a set of definitional predicates proposed as the knowledge-based side of our approach can be used to guide the selection of definitional sentences. Finally, we present results of an evaluation of definitions generated by DefScriber from Internet documents.

Cooperating processes are increasingly used to structure modern applications in common client-server computing environments. This cooperation among processes often results in dependencies such that a certain process cannot proceed until other processes finish some tasks. Despite the popularity of using cooperating processes in application design, operating systems typically ignore process dependencies and schedule processes independently. This can result in poor system performance due to the actual scheduling behavior contradicting the desired scheduling policy.
To address this problem, we have developed SWAP, a system that automatically detects process dependencies and accounts for such dependencies in scheduling. SWAP uses system call history to determine possible resource dependencies among processes in an automatic and fully transparent fashion. Because some dependencies cannot be precisely determined, SWAP associates confidence levels with dependency information that are dynamically adjusted using feedback from process blocking behavior. SWAP can schedule processes using this imprecise dependency information in a manner that is compatible with existing scheduling mechanisms and ensures that actual scheduling behavior corresponds to the desired scheduling policy in the presence of process dependencies. We have implemented SWAP in Linux and measured its effectiveness on microbenchmarks and real applications. Our experimental results show that SWAP has low overhead and can provide substantial improvements in system performance in scheduling processes with dependencies.

XQuery is not only useful for querying XML in databases, but also for applications that must process XML documents as files or streams. These applications suffer from the limitations of current main-memory XQuery processors, which break even for relatively small documents. In this paper we propose techniques, based on a notion of projection for XML, which can be used to drastically reduce memory requirements in XQuery processors. The main contribution of the paper is a static analysis technique that can identify at compile time which parts of the input document are needed to answer an arbitrary XQuery. We present a loading algorithm that takes the resulting information to build a projected document, which is smaller than the original document, and on which the query yields the same result. We implemented projection in the Galax XQuery processor. Our experiments show that projection reduces memory requirements by a factor of 20 on average, and is effective for a wide variety of queries. In addition, projection results in some speedup during query evaluation.
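The projection idea can be illustrated on a small XML tree. Here the set of needed paths is assumed to be given, whereas the paper's static analysis derives it from an arbitrary XQuery at compile time:

```python
import xml.etree.ElementTree as ET

def project(elem, paths, prefix=""):
    """Return a copy of the tree containing only elements that lie on
    one of the needed paths.  Queries that touch only those paths yield
    the same result on the (smaller) projected document."""
    out = ET.Element(elem.tag, elem.attrib)
    out.text = elem.text
    for child in elem:
        path = f"{prefix}/{child.tag}"
        # Keep the child if some needed path ends at or passes through it.
        if any(p == path or p.startswith(path + "/") for p in paths):
            out.append(project(child, paths, path))
    return out
```

Projecting `<doc><a><x>1</x><y>2</y></a><b>3</b></doc>` onto the single path `/a/x` keeps only `a` and its `x` child, discarding `y` and `b`.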

We describe a system for personalizing a set of medical journal articles (possibly created as the output of a search engine) by selecting those documents that specifically match a patient under care. A key element in our approach is the use of targeted parts of the electronic patient record as a readily available user model for the personalization task. We discuss several enhancements to a TF*IDF-based approach for measuring the similarity between articles and the patient record. We also present the results of an experiment involving almost 3,000 relevance judgments by medical doctors. Our evaluation establishes that the automated system outperforms alternative methods for personalizing the set of articles, including keyword-based queries manually constructed by medical experts for this purpose.
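As a rough illustration of the TF*IDF baseline that such a system builds on (the paper's enhancements and its use of the patient record are not reproduced here), a minimal cosine-similarity ranker over bag-of-words documents:

```python
import math
from collections import Counter

def tfidf_vector(terms, df, n_docs):
    """TF*IDF weights with a smoothed IDF to avoid division by zero."""
    tf = Counter(terms)
    return {t: c * math.log((1 + n_docs) / (1 + df[t]))
            for t, c in tf.items()}

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm = (math.sqrt(sum(x * x for x in u.values())) *
            math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def rank_articles(patient_terms, articles):
    """Order article indices by similarity to the patient-record terms."""
    n = len(articles)
    df = Counter(t for doc in articles for t in set(doc))
    pvec = tfidf_vector(patient_terms, df, n)
    scored = [(cosine(pvec, tfidf_vector(doc, df, n)), i)
              for i, doc in enumerate(articles)]
    return [i for score, i in sorted(scored, reverse=True)]
```

An article sharing the patient's terms ranks first; one with no overlap gets similarity zero and ranks last.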

Pushbroom cameras produce one-dimensional images of a scene with high resolution at a high frame-rate. As a result, they provide superior data compared to conventional two-dimensional cameras in cases where the scene of interest can be temporally scanned. In this paper, we consider the problem of recovering the structure of a scene using a set of pushbroom cameras. Although pushbroom cameras have been used to recover scene structure in the past, the algorithms for recovery were developed separately for different camera motions such as translation and rotation. In this paper, we present a general framework of structure recovery for pushbroom cameras with six-degree-of-freedom motion. We analyze the translation and rotation cases using our framework and demonstrate that several previous results are special cases of our result. Using this framework, we also show that three or more pushbroom cameras can be used to compute scene structure as well as motion of translation or rotation. We conclude with a set of experiments that demonstrate the use of pushbroom imaging to recover structure from unknown motion.

Improving the Coherence of Multi-document Summaries: a Corpus Study for Modeling the Syntactic Realization of Entities

Ani Nenkova, Kathleen McKeown

2003-03-04

References included in multi-document summaries are often problematic. In this paper, we present a corpus study performed to derive statistical models for the syntactic realization of referential expressions. Our work shows how the syntactic realization of entities can influence the coherence of the text and provides a model for rewriting references in multi-document summaries to smooth disfluencies.