Projects

Of interest

Latest blog posts

Publications

Funding for the publications listed below has come in large part from the NSF through AitF-1637598, CNS-1518941, CPS-154471, CNS-1319820, EPAS-1307794, CCF-1101470, CNS-0846025, and CCF-0830511. Additional funding has come from ARPA-E through the GENI and NODES programs. Other sponsors have included the Okawa Foundation, Microsoft, IBM, Southern California Edison, Yahoo, and Facebook.

Heavy-tails are a continual source of excitement and confusion across disciplines as they are repeatedly `discovered' in new contexts. This is especially true within computer systems, where heavy-tails seemingly pop up everywhere -- from degree distributions in the internet and social networks to file sizes and interarrival times of workloads. However, despite decades of work on heavy-tails they are still treated as mysterious, surprising, and even controversial. The goal of this book is to show that heavy-tailed distributions need not be mysterious and should not be surprising or controversial. In particular, we demystify heavy-tailed distributions by showing how to reason formally about their counter-intuitive properties; we highlight that their emergence should be expected (not surprising) by showing that a wide variety of general processes lead to heavy-tailed distributions; and we highlight that most of the controversy surrounding heavy-tails is the result of bad statistics, and can be avoided by using the proper tools.
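One counter-intuitive property alluded to above can be seen in a few lines of simulation: under a heavy-tailed (Pareto) distribution, a single sample can account for a visible fraction of the entire sum, which never happens for light-tailed distributions at this scale. A minimal sketch, with illustrative parameter values that are not taken from the book:

```python
import random

def pareto_sample(alpha, xm=1.0):
    # Inverse-CDF sampling: P(X > x) = (xm / x)^alpha for x >= xm.
    u = 1.0 - random.random()  # u in (0, 1], avoids division by zero
    return xm / u ** (1.0 / alpha)

random.seed(0)
n = 100_000
samples = [pareto_sample(1.5) for _ in range(n)]

# Share of the total carried by the single largest sample. For a
# heavy-tailed distribution this share is non-negligible even for
# large n; for an exponential it would vanish as n grows.
top_share = max(samples) / sum(samples)
print(f"largest sample / total sum = {top_share:.4f}")
```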

This paper initiates the study of the testable implications of choice data in settings where agents have privacy preferences. We adapt the standard conceptualization of consumer choice theory in economics to a situation where the consumer is aware of, and has preferences over, the information revealed by her choices. The main message of the paper is that nothing can be inferred about consumers' preferences once we introduce the possibility that the consumer has concerns about privacy. This holds even when consumers' privacy preferences are assumed to be monotonic and separable. This motivates the consideration of stronger assumptions, and to that end we introduce an additive model for privacy preferences that does have testable implications.

In this paper, we consider the problem of capacity provisioning for an online service supported by advertising. We analyze the strategic interaction between the service provider and the user base in this setting, modeling positive network effects, as well as congestion sensitivity in the user base. We focus specifically on the influence of positive network effects, as well as the impact of non-cooperative behavior in the user base on the firm's capacity provisioning decision and its profit. Our analysis reveals that stronger positive network effects, as well as non-cooperation in the user base, drive the service into a more congested state and lead to increased profit for the service provider. However, the impact of non-cooperation, or 'anarchy', in the user base strongly dominates the impact of network effects.

This paper surveys the opportunities and challenges in an emerging area of research that has the potential to significantly ease the incorporation of renewable energy into the grid as well as electric power peak-load shaving: data center demand response. Data center demand response sits at the intersection of two growing fields: energy efficient data centers and demand response in the smart grid. As such, the literature related to data center demand response is sprinkled across multiple areas and worked on by diverse groups. Our goal in this survey is to demonstrate the potential of the field while also summarizing the progress that has been made and the challenges that remain.

We study the role of a market maker (or system operator) in a transmission constrained electricity market. We model the market as a one-shot networked Cournot competition where generators supply quantity bids and load serving entities provide downward sloping inverse demand functions. This mimics the operation of a spot market in a deregulated market structure. In this paper, we focus on possible mechanisms employed by the market maker to balance demand and supply. In particular, we consider three candidate objective functions that the market maker optimizes -- social welfare, residual social welfare, and consumer surplus. We characterize the existence of Generalized Nash Equilibrium (GNE) in this setting and demonstrate that market outcomes at equilibrium can be very different under the candidate objective functions.

Traditionally, research focusing on the design of routing and staffing policies for service systems has modeled servers as having fixed (possibly heterogeneous) service rates. However, service systems are generally staffed by people. Furthermore, people respond to workload incentives; that is, how hard a person works can depend both on how much work there is, and how the work is divided between the people responsible for it. In a service system, the routing and staffing policies control such workload incentives; and so the rate servers work will be impacted by the system's routing and staffing policies. This observation has consequences when modeling service system performance, and our objective in this paper is to investigate those consequences.

We do this in the context of the M/M/N queue, which is the canonical model for large service systems. First, we present a model for ``strategic'' servers that choose their service rate in order to maximize a trade-off between an ``effort cost'', which captures the idea that servers exert more effort when working at a faster rate, and a ``value of idleness'', which assumes that servers value having idle time. Next, we characterize the symmetric Nash equilibrium service rate under any routing policy that routes based on the server idle time (such as the longest idle server first policy). This allows us to (asymptotically) solve the problem of minimizing the total cost, when there are linear staffing costs and linear waiting costs. We find that an asymptotically optimal staffing policy staffs strictly more than the common square-root staffing policy. Finally, we end by exploring the question of whether routing policies that are based on the service rate, instead of the server idle time, can improve system performance.
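The square-root staffing rule referenced above has a simple closed form: staff the offered load R = lambda/mu plus a safety factor proportional to sqrt(R). A minimal sketch (the parameter values below are illustrative, not from the paper; the paper's point is that strategic servers require staffing strictly above this level):

```python
import math

def square_root_staffing(arrival_rate, service_rate, beta=1.0):
    """Classical square-root staffing for an M/M/N queue:
    staff R + beta * sqrt(R) servers, where R = lambda / mu
    is the offered load and beta trades off staffing cost
    against waiting cost."""
    R = arrival_rate / service_rate
    return math.ceil(R + beta * math.sqrt(R))

# e.g. lambda = 100 jobs/min, mu = 1 job/min per server:
print(square_root_staffing(100, 1))  # 100 + 1 * 10 = 110 servers
```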

Demand response is a crucial tool for the incorporation of renewable energy into the grid. In this paper, we focus on a particularly promising industry for demand response: data centers. We use simulations to show that, not only are data centers large loads, but they can provide as much (or possibly more) flexibility as large scale storage, if given the proper incentives. However, due to the market power most data centers maintain, it is difficult to design programs that are efficient for data center demand response. To that end, we propose that prediction-based pricing is an appealing market design, and show that it outperforms more traditional supply-function bidding mechanisms in situations where market power is an issue. However, prediction-based pricing may be inefficient when predictions are not accurate, and so we provide analytic, worst-case bounds on the impact of prediction accuracy on the efficiency of prediction-based pricing. These bounds hold even when network constraints are considered, and highlight that prediction-based pricing is surprisingly robust to prediction error.

In this paper we initiate the study of the computational complexity of Nash equilibria in bimatrix games that are specified via data. This direction is motivated by an attempt to connect the emerging work on the computational complexity of Nash equilibria with the perspective of revealed preference theory, where inputs are data about observed behavior, rather than explicit payoffs. Our results draw such connections for large classes of data sets, and provide a formal basis for studying these connections more generally. In particular, we derive three structural conditions that are sufficient to ensure that a data set is both consistent with Nash equilibria and that the observed equilibria could have been computed efficiently: (i) a small dimensionality of the observed strategies, (ii) a small support size of the observed strategies, and (iii) a small chromatic number of the data set. Key to these results is a connection between data sets and the player rank of a game, defined to be the minimum rank of the payoff matrices of the players. We complement our results by constructing data sets that require rationalizing games to have high player rank, which suggests that computational constraints may be important empirically as well.

In big data analytics, timely results, even if based on only part of the data, are often good enough. For this reason, approximation jobs, which have deadline or error bounds and require only a subset of their tasks to complete, are projected to dominate big data workloads. In this paper, we present GRASS, which carefully uses speculation to mitigate the impact of stragglers in approximation jobs. The design of GRASS is based on first principles analysis of the impact of speculative copies. GRASS delicately balances immediacy of improving the approximation goal with the long term implications of using extra resources for speculation. Evaluations with production workloads from Facebook and Microsoft Bing in an EC2 cluster of 200 nodes show that GRASS increases accuracy of deadline-bound jobs by 47% and speeds up error-bound jobs by 38%. GRASS's design also speeds up exact computations, making it a unified solution for straggler mitigation.

We consider algorithms for `smoothed online convex optimization' (SOCO) problems. SOCO is a variant of the class of `online convex optimization' (OCO) problems that is strongly related to the class of `metrical task systems' (MTS). Both of these have a rich history and a wide range of applications. Prior literature on these problems has focused on two performance metrics: regret and the competitive ratio. There exist known algorithms with sublinear regret and known algorithms with constant competitive ratios; however, no known algorithm achieves both simultaneously. In this paper, we show that this is due to a fundamental incompatibility between these two metrics -- no algorithm (deterministic or randomized) can achieve sublinear regret and a constant competitive ratio, even in the case when the objective functions are linear. However, we also exhibit an algorithm that, for the important special case of one-dimensional decision spaces, provides sublinear regret while maintaining a competitive ratio that grows arbitrarily slowly.

A common approach for dynamic capacity provisioning is ``receding horizon control (RHC)'', which computes the provisioning for the current time by optimizing over a given window of predictions about the future arrivals. In this work, we provide new results characterizing the performance of RHC. We prove that RHC performs well when servers are homogeneous, i.e., it has performance that quickly tends toward optimality as the prediction window increases. However, we also prove that RHC performs badly when servers are heterogeneous, regardless of the length of the prediction window. This is problematic given that in practice data centers are heterogeneous. To address this issue, we introduce a variant of RHC that performs well in the heterogeneous setting. In fact, we prove that, under both homogeneous and heterogeneous servers, this variant matches the competitive ratio that RHC achieves in the homogeneous setting.
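The RHC loop described above can be sketched in a few lines: at each time step, optimize a capacity trajectory over the prediction window, commit only to the first decision, then slide the window forward. The toy version below brute-forces a small integer capacity grid; the cost model (linear operating cost plus a cost on capacity increases) and the numbers are assumptions for illustration, not the paper's model:

```python
def rhc_step(current_capacity, predicted_loads, switch_cost, op_cost):
    """One receding-horizon step: minimize operating + switching cost
    over the prediction window, then commit only the first action.
    Brute-force search over a small integer grid, purely illustrative."""
    best, best_plan = float("inf"), None
    max_cap = max(predicted_loads) + 2

    def search(prev, t, cost, plan):
        nonlocal best, best_plan
        if cost >= best:
            return  # prune: already worse than the best plan found
        if t == len(predicted_loads):
            best, best_plan = cost, plan
            return
        for x in range(predicted_loads[t], max_cap + 1):  # must cover load
            c = cost + op_cost * x + switch_cost * max(0, x - prev)
            search(x, t + 1, c, plan + [x])

    search(current_capacity, 0, 0.0, [])
    return best_plan[0]  # commit only the first decision

loads = [3, 5, 2, 6]
cap, plan = 3, []
for t in range(len(loads)):
    window = loads[t:t + 3]  # sliding window of predictions
    cap = rhc_step(cap, window, switch_cost=2.0, op_cost=1.0)
    plan.append(cap)
print(plan)  # [3, 5, 5, 6]
```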

This paper focuses on the competitive analysis of scheduling disciplines in a large deviations setting. Though there are policies that are known to optimize the sojourn time tail under a large class of heavy-tailed job sizes (e.g. Processor Sharing and Shortest Remaining Processing Time) and there are policies known to optimize the sojourn time tail in the case of light-tailed job sizes (e.g. First Come First Served), no policies are known that can optimize the sojourn time tail across both light-tailed and heavy-tailed job size distributions. We prove that no such work-conserving, non-anticipatory, non-learning policy exists and thus that a policy must learn (or know) the job size distribution in order to optimize the sojourn time tail.

Power consumption imposes a significant cost for data centers implementing cloud services, yet much of that power is used to maintain excess service capacity during periods of predictably low load. This paper investigates how much can be saved by dynamically `right-sizing' the data center by turning off servers during such periods, and how to achieve that saving via an online algorithm. We prove that the optimal offline algorithm for dynamic right-sizing has a simple structure when viewed in reverse time, and this structure is exploited to develop a new `lazy' online algorithm, which is proven to be 3-competitive. We validate the algorithm using traces from two real data center workloads and show that significant cost-savings are possible.
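The ``lazy'' idea above can be illustrated with a toy hysteresis controller: keep the current number of active servers unchanged as long as it covers the load and is not too far above it, and move only when forced. This is not the paper's algorithm (which derives its laziness from the reverse-time structure of the offline optimum); the `slack` parameter and numbers are invented for illustration:

```python
def lazy_capacity(loads, slack=2):
    """Toy lazy right-sizing: move capacity up only when the load
    forces it, and down only when capacity exceeds the load by more
    than `slack`. Laziness limits costly server on/off switching at
    the price of some excess capacity."""
    x, trajectory = 0, []
    for load in loads:
        if x < load:            # forced up: must cover the load
            x = load
        elif x > load + slack:  # lazily down: only when far above load
            x = load + slack
        trajectory.append(x)
    return trajectory

print(lazy_capacity([5, 3, 2, 6, 1]))  # [5, 5, 4, 6, 3]
```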

Award Recipients

Opportunities for price manipulation by aggregators in electricity markets

A competitive deregulated electricity market with increasingly active market players is foreseen to be the future of the electricity industry. In such settings, market power assessment is a primary concern. In this paper, we propose a novel functional approach for measuring long-term market power that unifies a variety of popular market power indices. Specifically, the new measure, termed transmission-constrained network flow (TCNF), unifies three large classes of market power measures: residual supply based, network flow based, and minimal generation based. Further, TCNF provides valuable information about market power not captured by prior indices. We derive its analytic properties and test its efficacy on IEEE test systems.

Performance Evaluation, 2013. 70(10):720-735. This work also appeared in the Proceedings of IFIP Performance, 2013, where it received the ``Best Student Paper'' award. It was one of the ten most downloaded papers of Performance Evaluation during Fall and Winter 2013.

MapReduce is a scalable parallel computing framework for big data processing. It exhibits multiple processing phases, and thus an efficient job scheduling mechanism is crucial for ensuring efficient resource utilization. This paper studies the scheduling challenge that results from the overlapping of the ``map'' and ``shuffle'' phases in MapReduce. We propose a new, general model for this scheduling problem, and validate this model using cluster experiments. Further, we prove that scheduling to minimize average response time in this model is strongly NP-hard in the offline case and that no online algorithm can be constant-competitive. However, we provide two online algorithms that match the performance of the offline optimal when given a slightly faster service rate (i.e., in the resource augmentation framework). Finally, we validate the algorithms using a workload trace from a Google cluster and show that the algorithms are near optimal in practical settings.

Performance Evaluation, 2013. 70(10):770-791. This work was recognized as one of the most cited papers in Performance Evaluation during 2014-2016. It also appeared in the Proceedings of IFIP Performance, 2013; and as a poster at ACM Sigmetrics, 2013.

Demand response is a crucial aspect of the future smart grid. It has the potential to provide significant peak demand reduction and to ease the incorporation of renewable energy into the grid. Data centers' participation in demand response is becoming increasingly important given their high and increasing energy consumption and their flexibility in demand management compared to conventional industrial facilities. In this paper, we study two demand response schemes to reduce a data center's peak loads and energy expenditure: workload shifting and the use of local power generation. We conduct a detailed characterization study of coincident peak data over two decades from Fort Collins Utilities, Colorado and then develop two optimization-based algorithms by combining workload scheduling and local power generation to avoid the coincident peak and reduce the energy expenditure. The first algorithm optimizes the expected cost and the second one provides the optimal worst-case guarantee. We evaluate these algorithms via trace-based simulations. The results show that using workload shifting in combination with local generation can provide significant cost savings compared to either alone.

Power consumption imposes a significant cost for data centers implementing cloud services, yet much of that power is used to maintain excess service capacity during periods of predictably low load. This paper investigates how much can be saved by dynamically `right-sizing' the data center by turning off servers during such periods, and how to achieve that saving via an online algorithm. We prove that the optimal offline algorithm for dynamic right-sizing has a simple structure when viewed in reverse time, and this structure is exploited to develop a new `lazy' online algorithm, which is proven to be 3-competitive. We validate the algorithm using traces from two real data center workloads and show that significant cost-savings are possible.

Proceedings of ACM Sigmetrics, 2012. Sigmetrics was held jointly with IFIP Performance. An extension of this work is used in HP's Net-zero Data Center Architecture, which was named a 2013 Computerworld Honors Laureate. It was one of the ten most downloaded papers of ACM SIGMETRICS in the Summer, Fall, and Winter of 2013.

The demand for data center computing has increased significantly in recent years, resulting in huge energy consumption. Data centers typically comprise three main subsystems: IT equipment provides services to customers; power infrastructure supports the IT and cooling equipment; and the cooling infrastructure removes the generated heat. This work presents a novel approach to model the energy flows in a data center and optimize its holistic operation. Traditionally, supply-side constraints such as energy or cooling availability were largely treated independently from IT workload management. This work reduces cost and environmental impact using a holistic approach that integrates energy supply (e.g., renewable supply, dynamic pricing) and cooling supply (e.g., chiller, outside air cooling) with IT workload planning to improve the overall attainability of data center operations. Specifically, we predict renewable energy as well as IT demand and design an IT workload management plan that schedules IT workload and allocates IT resources within a data center according to time varying power supply and cooling efficiency. We have implemented and evaluated our approach using traces from real data centers and production systems. The results demonstrate that our approach can reduce the recurring power costs and the use of non-renewable energy by as much as 60 percent compared to existing, non-integrated techniques, while still meeting operational goals and SLAs.



Given the significant energy consumption of data centers, improving their energy efficiency is an important social problem. However, energy efficiency is necessary but not sufficient for sustainability, which demands reduced usage of energy from fossil fuels. This paper investigates the feasibility of powering internet-scale systems using (nearly) entirely renewable energy. We perform a trace-based study to evaluate three issues related to achieving this goal: the impact of geographical load balancing, the role of storage, and the optimal mix of renewables. Our results highlight that geographical load balancing can significantly reduce the required capacity of renewable energy by using the energy more efficiently with ``follow the renewables'' routing. Further, our results show that small-scale storage can be useful, especially in combination with geographical load balancing, and that an optimal mix of renewables includes significantly more wind than photovoltaic solar.
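The ``follow the renewables'' routing idea above can be sketched as a greedy assignment: send load to the sites with the most available renewable energy first, subject to per-site capacity. A minimal sketch; the site names, supplies, and capacities are hypothetical, and the paper's algorithms solve a richer optimization than this greedy rule:

```python
def follow_the_renewables(total_load, renewable_supply, capacity):
    """Greedy 'follow the renewables' routing: fill the sites with
    the most available renewable energy first, up to each site's
    capacity. Returns a per-site load assignment."""
    sites = sorted(renewable_supply, key=renewable_supply.get, reverse=True)
    assignment, remaining = {}, total_load
    for s in sites:
        x = min(capacity[s], remaining)
        assignment[s] = x
        remaining -= x
    if remaining > 0:
        raise ValueError("total load exceeds total capacity")
    return assignment

supply = {"west": 80, "central": 20, "east": 50}  # renewable MW available
cap = {"west": 60, "central": 60, "east": 60}     # per-site capacity (MW)
print(follow_the_renewables(100, supply, cap))
# {'west': 60, 'east': 40, 'central': 0}
```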

Energy expenditure has become a significant fraction of data center operating costs. Recently, `geographical load balancing' has been suggested as an approach for taking advantage of the geographical diversity of Internet-scale distributed systems in order to reduce energy expenditures by exploiting the electricity price differences across regions. However, the fact that such designs reduce energy costs does not imply that they reduce energy usage. In fact, such designs often increase energy usage.

This paper explores whether the geographical diversity of Internet-scale systems can be used to provide environmental gains in addition to reducing data center costs. Specifically, we explore whether geographical load balancing can encourage usage of `green' energy from renewable sources and reduce usage of `brown' energy from fossil fuels. We make two contributions. First, we derive three algorithms, with varying degrees of distributed computation, for achieving optimal geographical load balancing. Second, using these algorithms, we show that if dynamic pricing of electricity is done in proportion to the fraction of the total energy that is brown at each time, then geographical load balancing provides significant reductions in brown energy usage. However, the benefits depend strongly on the degree to which systems accept dynamic energy pricing and the form of pricing used.

From a rare events perspective, scheduling disciplines that work well under light (exponential) tailed workload distributions do not perform well under heavy (power) tailed workload distributions, and vice-versa, leading to fundamental problems in designing schedulers that are robust to distributional assumptions on the job sizes. This paper shows how to exploit partial workload information (system load) to design a scheduler that provides robust performance across heavy-tailed and light-tailed workloads. Specifically, we derive new asymptotics for the tail of the stationary sojourn time under Limited Processor Sharing (LPS) scheduling for both heavy-tailed and light-tailed job size distributions, and show that LPS can be robust to the tail of the job size distribution if the multiprogramming level is chosen carefully as a function of the load.

It is common to evaluate scheduling policies based on their mean response times. Another important, but sometimes opposing, performance metric is a scheduling policy's fairness. For example, a policy that biases towards small job sizes so as to minimize mean response time may end up being unfair to large job sizes. In this paper we define three types of unfairness and demonstrate large classes of scheduling policies that fall into each type. We end with a discussion on which jobs are the ones being treated unfairly.


In this paper we quantify the total cost of an epidemic spreading through a social network, accounting for both the immunization and disease costs. Previous research has typically focused on determining the optimal strategy to limit the lifetime of a disease, without considering the cost of such strategies. In the large graph limit, we calculate the exact expected disease cost for a general random graph, and we illustrate it for the specific example of an Erdos-Renyi network. We also give an upper bound on the expected disease cost for finite-size graphs. Finally, we study how to optimally perform a one-shot immunization to minimize the social cost of a disease, including both the cost of the disease and the cost of immunization.

Consumption theory assumes that consumers possess infinite computational abilities. Proponents of bounded rationality want to instead require that any model of consumer behavior incorporate computational constraints. In this paper, we establish that this requirement has no testable implications. Any consumption data set that is compatible with a rational consumer is also compatible with a rational and computationally bounded consumer. The result extends to data on multiple agents interacting in a market economy. We present sufficient conditions on observed market outcomes such that they are compatible with an instance of the model for which Walrasian equilibrium is easy to compute. Our result motivates a general approach for posing questions about the empirical content of computational constraints: the revealed preference approach to computational complexity. The approach complements the conventional worst-case view of computational complexity in important ways, and is methodologically close to mainstream economics.

Data centers have diverse options to procure electricity. However, the current literature on exploiting these options is very fractured. Specifically, it is still not clear how utilizing one energy option may affect selecting other energy options. To address this open problem, we propose a unified energy portfolio optimization framework that takes into consideration a broad range of energy choices for data centers. Despite the complexity and nonlinearity of the original models, the proposed analysis boils down to solving tractable linear mixed-integer stochastic programs. Using experimental electricity market and Internet workload data, various insightful numerical observations are reported. It is shown that the key to linking different energy options with different short- and long-term profit characteristics is to conduct risk management at different time horizons. Also, there is a direct relationship between data centers' service-level agreement parameters and their ability to exploit certain energy options. The use of on-site storage and the deployment of geographical workload distribution can particularly help data centers in utilizing high-risk energy choices, such as offering ancillary services or participating in wholesale electricity markets.

A Market Approach for Handling Power Emergencies in Multi-Tenant Data Center

Power oversubscription in data centers may occasionally trigger an emergency when the aggregate power demand exceeds the capacity. Handling such an emergency requires a graceful power capping solution that minimizes the performance loss. In this paper, we study power capping in a multi-tenant data center where the operator supplies power to multiple tenants that manage their own servers. Unlike owner-operated data centers, the operator lacks control over tenants' servers. To address this challenge, we propose a novel market mechanism based on supply function bidding, called COOP, to financially incentivize and coordinate tenants' power reduction for minimizing total performance loss (quantified in performance cost) while satisfying multiple power capping constraints. We build a prototype to show that COOP is efficient in terms of minimizing the total performance cost, even compared to the ideal but infeasible case that assumes the operator has full control over tenants' servers. We also demonstrate that COOP is ``win-win'', increasing the operator's profit (through oversubscription) and reducing tenants' cost (through financial compensation for their power reduction during emergencies).
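Supply-function bidding, the mechanism family COOP builds on, can be illustrated with a linear-supply toy example: each tenant offers a power reduction that scales with the price, and the operator picks the clearing price at which the offered reductions sum to the required cut. This is a simplified sketch, not COOP's actual mechanism; the tenant names and numbers are invented:

```python
def clear_supply_function_market(slopes, required_reduction):
    """Linear supply-function bidding sketch: tenant i offers power
    reduction s_i(p) = a_i * p at price p. The operator picks the
    clearing price p* with sum_i a_i * p* = required_reduction, and
    each tenant is paid p* per unit of reduction."""
    total_slope = sum(slopes.values())
    p_star = required_reduction / total_slope
    reductions = {t: a * p_star for t, a in slopes.items()}
    return p_star, reductions

# Hypothetical bids: slope = kW of reduction offered per unit price.
slopes = {"tenant_A": 2.0, "tenant_B": 1.0, "tenant_C": 1.0}
p, cuts = clear_supply_function_market(slopes, required_reduction=40.0)
print(p, cuts)  # 10.0 {'tenant_A': 20.0, 'tenant_B': 10.0, 'tenant_C': 10.0}
```

Tenants with flatter performance-cost curves naturally bid steeper supply functions, so the mechanism concentrates the power cut where it hurts least.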

The empirical implications of privacy-aware choice

Rachel Cummings, Federico Echenique and Adam Wierman

INFORMS Operations Research, 2016. An extended abstract of this work appeared at ACM EC, 2014.



When Heavy-Tailed and Light-Tailed Flows Compete: The Response Time Tail Under Generalized Max-Weight Scheduling

Jayakrishnan Nair, Krishna Jagannathan and Adam Wierman

IEEE Transactions on Networking, 2016. This is an extension of a paper that appeared in IEEE Infocom, 2013.

This paper focuses on the design and analysis of scheduling policies for multi-class queues, such as those found in wireless networks and high-speed switches. In this context, we study the response time tail under generalized max-weight policies in settings where the traffic flows are highly asymmetric. Specifically, we study an extreme setting with two traffic flows, one heavy-tailed, and one light-tailed. In this setting, we prove that classical max-weight scheduling, which is known to be throughput optimal, results in the light-tailed flow having heavy-tailed response times. However, we show that via a careful design of inter-queue scheduling policy (from the class of generalized max-weight policies) and intra-queue scheduling policies, it is possible to maintain throughput optimality, and guarantee light-tailed delays for the light-tailed flow, without affecting the response time tail for the heavy-tailed flow.
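For intuition, a generalized max-weight policy serves, at each time slot, the queue with the largest weight, e.g., (queue length)^alpha times the service rate. The toy sketch below is our own illustration, not the paper's model; the weight function, the simulation parameters, and the two-flow setup (Pareto vs. exponential job sizes) are assumptions chosen only to show the mechanic:

```python
import random

def max_weight_pick(queues, rates, alpha=1.0):
    """Pick the queue index maximizing the weight q_i^alpha * mu_i.

    alpha parameterizes the 'generalized' max-weight family;
    alpha = 1 recovers classical max-weight scheduling.
    """
    weights = [len(q) ** alpha * mu for q, mu in zip(queues, rates)]
    return max(range(len(queues)), key=lambda i: weights[i])

# Toy two-queue simulation: queue 0 receives heavy-tailed (Pareto) job
# sizes, queue 1 receives light-tailed (exponential) job sizes.
random.seed(0)
queues = [[], []]          # each queue holds remaining work per job
rates = [1.0, 1.0]
for t in range(1000):
    if random.random() < 0.3:
        queues[0].append(random.paretovariate(1.5))   # heavy-tailed
    if random.random() < 0.3:
        queues[1].append(random.expovariate(1.0))     # light-tailed
    i = max_weight_pick(queues, rates)
    if queues[i]:
        queues[i][0] -= rates[i]                      # serve one unit of work
        if queues[i][0] <= 0:
            queues[i].pop(0)
```

Under this classical (alpha = 1) weight, long heavy-tailed jobs inflate queue 0's backlog and starve queue 1, which is exactly the effect the paper analyzes; the generalized family changes the weight function to avoid it.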

2015

Greening multi-tenant data center demand response

Niangjun Chen, Xiaoqi Ren, Shaolei Ren and Adam Wierman

Performance Evaluation, 2015. 229-254. It also appeared in the Proceedings of IFIP Performance, 2015.

Economic dispatch and frequency regulation are typically viewed as fundamentally different problems in power systems, and hence are typically studied separately. In this paper, we frame and study a joint problem that optimizes both slow timescale economic dispatch resources and fast timescale frequency regulation resources. We provide sufficient conditions under which the joint problem can be decomposed without loss of optimality into slow and fast timescale problems. These slow and fast timescale problems have appealing interpretations as the economic dispatch and frequency regulation problems respectively. Moreover, the fast timescale problem can be solved using a distributed algorithm that preserves the stability of the network during transients. We also apply this optimal decomposition to propose an efficient market mechanism for economic dispatch that coordinates with frequency regulation.

We study a novel variation of online convex optimization where the algorithm is subject to ramp constraints limiting the distance between consecutive actions. Our main contribution is a set of results providing asymptotically tight bounds on the worst-case performance, as measured by the competitive difference, of a variant of Model Predictive Control termed Averaging Fixed Horizon Control (AFHC). Additionally, we prove that AFHC achieves the asymptotically optimal competitive difference within a general class of ``forward looking'' online algorithms. Furthermore, we illustrate that the performance of AFHC in practice is often much better than indicated by the (worst-case) competitive difference, using a case study in the context of the economic dispatch problem.
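As a rough illustration of the AFHC mechanic (several phase-shifted fixed-horizon controllers whose actions are averaged), here is a toy sketch. The greedy clipped-tracking planner stands in for the exact fixed-horizon cost minimization, and all names, targets, and the per-step tracking cost are our own assumptions, not the paper's formulation:

```python
def fhc_plan(x0, targets, r):
    """Toy stand-in for the fixed-horizon optimization: greedily track
    each target, moving at most r per step (the real FHC solves the
    window cost minimization exactly)."""
    plan, x = [], x0
    for y in targets:
        x = min(max(y, x - r), x + r)   # ramp-constrained step toward y
        plan.append(x)
    return plan

def afhc(targets, window, r, x0=0.0):
    """Averaging Fixed Horizon Control: run `window` phase-shifted FHC
    copies and output the average of their actions at each time."""
    T = len(targets)
    copies = []
    for k in range(window):             # copy k replans when (t - k) % window == 0
        x, acts, t = x0, [], 0
        while t < T:
            end = t + 1
            while end < T and (end - k) % window != 0:
                end += 1
            plan = fhc_plan(x, targets[t:end], r)
            acts.extend(plan)
            x, t = plan[-1], end
        copies.append(acts)
    # AFHC action = average over the phase-shifted copies
    return [sum(c[t] for c in copies) / window for t in range(T)]
```

Because each copy's trajectory satisfies the ramp constraint and the constraint set is convex, the averaged trajectory satisfies it as well, which is the key structural property AFHC exploits.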

Making use of predictions is a crucial, but under-explored, area of online algorithms. This paper studies a class of online optimization problems where we have external noisy predictions available. We propose a stochastic prediction error model that generalizes prior models in the learning and stochastic control communities, incorporates correlation among prediction errors, and captures the fact that predictions improve as time passes. We prove that achieving sublinear regret and constant competitive ratio for online algorithms requires the use of an unbounded prediction window in adversarial settings, but that under more realistic stochastic prediction error models it is possible to use Averaging Fixed Horizon Control (AFHC) to simultaneously achieve sublinear regret and constant competitive ratio using only a constant-sized prediction window. Furthermore, we show that typical performance of AFHC is tightly concentrated around its mean.

A unifying market power measure for deregulated transmission-constrained electricity markets

Market power assessment is a prime concern when designing a deregulated electricity market. In this paper, we propose a new functional market power measure, termed transmission constrained network flow (TCNF), that takes into account an AC model of the network. The measure unifies three large classes of long-term transmission constrained market power indices in the literature: residual supply based, network flow based, and minimal generation based. Furthermore, it builds on recent advances in semidefinite relaxations of AC power flow equations to model the underlying power network. Previous market power measures that took the network into account did so via DC approximations of power flow models. Our results highlight that using the more accurate AC model can yield fundamentally different conclusions, both about whether market power exists and about which generators can exploit market power.

Characterizing the impact of the workload on the value of dynamic resizing in data centers

Kai Wang, Minghong Lin, Florin Ciucu, Adam Wierman and Chuang Lin

Performance Evaluation, 2015. This is an extension of a paper that appeared in IEEE Infocom, 2013.

Energy consumption imposes a significant cost for data centers; yet much of that energy is used to maintain excess service capacity during periods of predictably low load. As a result, there has recently been interest in developing designs that allow the service capacity to be dynamically resized to match the current workload. However, there is still much debate about the value of such approaches in real settings. In this paper, we show that the value of dynamic resizing is highly dependent on statistics of the workload process. In particular, both slow time-scale non-stationarities of the workload (e.g., the peak-to-mean ratio) and the fast time-scale stochasticity (e.g., the burstiness of arrivals) play key roles. To illustrate the impact of these factors, we combine optimization-based modeling of the slow time-scale with stochastic modeling of the fast time-scale. Within this framework, we provide both analytic and numerical results characterizing when dynamic resizing does (and does not) provide benefits.

2014

Motivated by cloud services, we consider the interplay of network effects, congestion, and competition in ad-supported services. We study the strategic interactions between competing service providers and a user base, modeling congestion sensitivity and two forms of positive network effects: network effects that are either 'firm-specific' or 'industry-wide.' Our analysis reveals that users are generally no better off due to the competition in a marketplace with positive network effects. Further, our analysis highlights an important contrast between firm-specific and industry-wide network effects: Firms can coexist in a marketplace with industry-wide network effects, but near-monopolies tend to emerge in marketplaces with firm-specific network effects.

This paper surveys the opportunities and challenges in an emerging area of research that has the potential to significantly ease the incorporation of renewable energy into the grid as well as electric power peak-load shaving: data center demand response. Data center demand response sits at the intersection of two growing fields: energy efficient data centers and demand response in the smart grid. As such, the literature related to data center demand response is sprinkled across multiple areas and worked on by diverse groups. Our goal in this survey is to demonstrate the potential of the field while also summarizing the progress that has been made and the challenges that remain.

Deferrable load control is an essential tool for handling the uncertainties associated with the increasing penetration of renewable generation. Model predictive control has emerged as an effective approach for managing deferrable loads, and has received considerable attention. In particular, previous work has derived tight bounds on the average-case performance of model predictive deferrable load control. However, to this point, distributional analysis of model predictive deferrable load control has been elusive. In this paper, we adapt the martingale bounded-difference approach in order to prove strong concentration results on the distribution of the load variance that results from model predictive deferrable load control. These concentration results highlight, among other things, the impact of short-range and long-range dependencies in the prediction errors.

We study the role of a market maker (or system operator) in a transmission constrained electricity market. We model the market as a one-shot networked Cournot competition where generators supply quantity bids and load serving entities provide downward sloping inverse demand functions. This mimics the operation of a spot market in a deregulated market structure. In this paper, we focus on possible mechanisms employed by the market maker to balance demand and supply. In particular, we consider three candidate objective functions that the market maker optimizes -- social welfare, residual social welfare, and consumer surplus. We characterize the existence of Generalized Nash Equilibrium (GNE) in this setting and demonstrate that market outcomes at equilibrium can be very different under the candidate objective functions.

Traditionally, research focusing on the design of routing and staffing policies for service systems has modeled servers as having fixed (possibly heterogeneous) service rates. However, service systems are generally staffed by people. Furthermore, people respond to workload incentives; that is, how hard a person works can depend both on how much work there is, and how the work is divided between the people responsible for it. In a service system, the routing and staffing policies control such workload incentives; and so the rate servers work will be impacted by the system's routing and staffing policies. This observation has consequences when modeling service system performance, and our objective in this paper is to investigate those consequences.

We do this in the context of the M/M/N queue, which is the canonical model for large service systems. First, we present a model for ``strategic'' servers that choose their service rate in order to maximize a trade-off between an ``effort cost'', which captures the idea that servers exert more effort when working at a faster rate, and a ``value of idleness'', which assumes that servers value having idle time. Next, we characterize the symmetric Nash equilibrium service rate under any routing policy that routes based on the server idle time (such as the longest idle server first policy). This allows us to (asymptotically) solve the problem of minimizing the total cost, when there are linear staffing costs and linear waiting costs. We find that an asymptotically optimal staffing policy staffs strictly more than the common square-root staffing policy. Finally, we end by exploring the question of whether routing policies that are based on the service rate, instead of the server idle time, can improve system performance.

This paper initiates the study of the testable implications of choice data in settings where agents have privacy preferences. We adapt the standard conceptualization of consumer choice theory in economics to a situation where the consumer is aware of, and has preferences over, the information revealed by her choices. The main message of the paper is that nothing can be inferred about consumers' preferences once we introduce the possibility that the consumer has concerns about privacy. This holds even when consumers' privacy preferences are assumed to be monotonic and separable. This motivates the consideration of stronger assumptions, and to that end we introduce an additive model for privacy preferences that does have testable implications.

Optimal power procurement for data centers in day-ahead and real-time electricity markets

With the growing amount of power consumed by data centers, finding ways to cut electricity bills has become an important and challenging problem. In this paper, we seek to understand the cost reductions that data centers may achieve by exploiting the diversity in the price of electricity in the day-ahead and real-time electricity markets. Based on a stochastic optimization framework, we propose to jointly select the data centers' service rates and their demand bids to the day-ahead and real-time electricity markets. In our analysis, we take into account service-level agreements, risk management constraints, and also the statistical characteristics of the workload and the electricity prices. Using empirical electricity price and Internet workload data, our numerical studies show that directly participating in the day-ahead and real-time electricity markets can significantly help data centers to reduce their energy expenditure.

Demand response is a crucial tool for the incorporation of renewable energy into the grid. In this paper, we focus on a particularly promising industry for demand response: data centers. We use simulations to show that, not only are data centers large loads, but they can provide as much flexibility as (and possibly more than) large-scale storage, if given the proper incentives. However, due to the market power most data centers maintain, it is difficult to design programs that are efficient for data center demand response. To that end, we propose that prediction-based pricing is an appealing market design, and show that it outperforms more traditional supply-function bidding mechanisms in situations where market power is an issue. However, prediction-based pricing may be inefficient when predictions are not accurate, and so we provide analytic, worst-case bounds on the impact of prediction accuracy on the efficiency of prediction-based pricing. These bounds hold even when network constraints are considered, and highlight that prediction-based pricing is surprisingly robust to prediction error.

The increasing penetration of intermittent, unpredictable renewable energy sources, such as wind energy, poses significant challenges for utility companies trying to incorporate renewable energy in their portfolios. In this talk, we discuss inventory management issues that arise in the presence of intermittent renewable resources. We model the problem as a three-stage newsvendor problem with uncertain supply and model the estimates of wind using a martingale model of forecast evolution. We describe the optimal procurement strategy and use it to study the impact of proposed market changes and of increased renewable penetration. A key insight from our results is a separation between the impact of the structure of electricity markets and the impact of increased penetration. In particular, the effect of market structure on the optimal procurement policy is independent of the level of wind penetration. Additionally, we study two proposed changes to the market structure: the addition and the placement of an intermediate market. Importantly, we show that the addition of an intermediate market does not necessarily reduce the total amount of energy procured by the utility company.

Motivated by cloud services, we consider the interplay of network effects, congestion, and competition in ad-supported services. We study the strategic interactions between competing service providers and a user base, modeling congestion sensitivity and two forms of positive network effects: network effects that are either 'firm-specific' or 'industry-wide.' Our analysis reveals that users are generally no better off due to the competition in a marketplace with positive network effects. Further, our analysis highlights an important contrast between firm-specific and industry-wide network effects: Firms can coexist in a marketplace with industry-wide network effects, but near-monopolies tend to emerge in marketplaces with firm-specific network effects.

In this paper we initiate the study of the computational complexity of Nash equilibria in bimatrix games that are specified via data. This direction is motivated by an attempt to connect the emerging work on the computational complexity of Nash equilibria with the perspective of revealed preference theory, where inputs are data about observed behavior, rather than explicit payoffs. Our results draw such connections for large classes of data sets, and provide a formal basis for studying these connections more generally. In particular, we derive three structural conditions that are sufficient to ensure that a data set is both consistent with Nash equilibria and that the observed equilibria could have been computed efficiently: (i) a small dimensionality of the observed strategies, (ii) a small support size of the observed strategies, and (iii) a small chromatic number of the data set. Key to these results is a connection between data sets and the player rank of a game, defined to be the minimum rank of the payoff matrices of the players. We complement our results by constructing data sets that require rationalizing games to have high player rank, which suggests that computational constraints may be important empirically as well.

In big data analytics, timely results, even if based on only part of the data, are often good enough. For this reason, approximation jobs, which have deadline or error bounds and require only a subset of their tasks to complete, are projected to dominate big data workloads. In this paper, we present GRASS, which carefully uses speculation to mitigate the impact of stragglers in approximation jobs. The design of GRASS is based on first-principles analysis of the impact of speculative copies. GRASS delicately balances immediacy of improving the approximation goal with the long-term implications of using extra resources for speculation. Evaluations with production workloads from Facebook and Microsoft Bing on an EC2 cluster of 200 nodes show that GRASS increases the accuracy of deadline-bound jobs by 47% and speeds up error-bound jobs by 38%. GRASS's design also speeds up exact computations, making it a unified solution for straggler mitigation.

We consider the problem of designing a distribution rule to share `welfare' (cost or revenue) among individually strategic agents. There are many known distribution rules that guarantee the existence of a (pure) Nash equilibrium in this setting, e.g., the Shapley value and its weighted variants; however, a characterization of the space of distribution rules that guarantee the existence of a Nash equilibrium is unknown. Our work provides an exact characterization of this space for a specific class of scalable and separable games, which includes a variety of applications such as facility location, routing, network formation, and coverage games. Given an arbitrary welfare function $W$, we prove that a distribution rule guarantees equilibrium existence for all games (i.e., all possible sets of resources, agent action sets, etc.) if and only if it is equivalent to a generalized weighted Shapley value on some `ground' welfare function $W'$, which can be distinct from $W$. However, if budget-balance is required in addition to the existence of a Nash equilibrium, then $W'$ must be the same as $W$. We also provide an alternate characterization of this space in terms of `generalized' marginal contributions, which is more appealing from the point of view of computational tractability. A possibly surprising consequence of our result is that, in order to guarantee equilibrium existence in all games with any fixed welfare function, it is necessary to work within the class of potential games.
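For context, the (unweighted) Shapley value distributes welfare by averaging each agent's marginal contribution over all arrival orders. The brute-force sketch below is our own illustration (exponential in the number of players, and covering only the unweighted case, not the generalized weighted family the paper characterizes):

```python
from itertools import permutations

def shapley(players, welfare):
    """Exact Shapley value: average each player's marginal contribution
    to `welfare` over every ordering of the players (O(n!), so only
    suitable for a handful of players)."""
    perms = list(permutations(players))
    value = {p: 0.0 for p in players}
    for order in perms:
        coalition = set()
        for p in order:
            before = welfare(frozenset(coalition))
            coalition.add(p)
            value[p] += welfare(frozenset(coalition)) - before
    return {p: v / len(perms) for p, v in value.items()}

# Symmetric example: welfare |S|^2 gives W(N) = 9 for 3 players,
# split equally, so each player's value is 3.0.
vals = shapley(['a', 'b', 'c'], lambda S: len(S) ** 2)
```

Budget balance holds by construction here: the values always sum to `welfare` of the grand coalition, which is the property the paper shows forces $W' = W$.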

Energy expenditure has become a significant fraction of data center operating costs. Recently, `geographical load balancing' has been suggested as an approach for taking advantage of the geographical diversity of Internet-scale distributed systems in order to reduce energy expenditures by exploiting the electricity price differences across regions. However, the fact that such designs reduce energy costs does not imply that they reduce energy usage. In fact, such designs often increase energy usage.

This paper explores whether the geographical diversity of Internet-scale systems can be used to provide environmental gains in addition to reducing data center costs. Specifically, we explore whether geographical load balancing can encourage usage of `green' energy from renewable sources and reduce usage of `brown' energy from fossil fuels. We make two contributions. First, we derive three algorithms, with varying degrees of distributed computation, for achieving optimal geographical load balancing. Second, using these algorithms, we show that if dynamic pricing of electricity is done in proportion to the fraction of the total energy that is brown at each time, then geographical load balancing provides significant reductions in brown energy usage. However, the benefits depend strongly on the degree to which systems accept dynamic energy pricing and the form of pricing used.

We consider algorithms for `smoothed online convex optimization' (SOCO) problems. SOCO is a variant of the class of `online convex optimization' (OCO) problems that is strongly related to the class of `metrical task systems' (MTS). Both of these have a rich history and a wide range of applications. Prior literature on these problems has focused on two performance metrics: regret and the competitive ratio. There exist known algorithms with sublinear regret and known algorithms with constant competitive ratios; however no known algorithm achieves both simultaneously. In this paper, we show that this is due to a fundamental incompatibility between these two metrics -- no algorithm (deterministic or randomized) can achieve sublinear regret and a constant competitive ratio, even in the case when the objective functions are linear. However, we also exhibit an algorithm that, for the important special case of one dimensional decision spaces, provides sublinear regret while maintaining a competitive ratio that grows arbitrarily slowly.

The growth of renewable resources will introduce significant variability and uncertainty into the grid. It is likely that ``peaker'' plants will be a crucial dispatchable resource for compensating for the variations in renewable supply. Thus, it is important to understand the strategic incentives of peaker plants and their potential for exploiting market power due to having responsive supply. To this end, we study an oligopolistic two-settlement market comprising two types of generation (baseloads and peakers) where there is perfect foresight. We characterize symmetric equilibria in this context via closed-form expressions. However, we also show that, when the system is capacity-constrained, there may not exist equilibria in which baseloads and peakers play symmetric strategies. This happens because of opportunities for both types of generation to exploit market power to increase prices.

A competitive deregulated electricity market with increasingly active market players is foreseen to be the future of the electricity industry. In such settings, market power assessment is a primary concern. In this paper, we propose a novel functional approach for measuring long-term market power that unifies a variety of popular market power indices. Specifically, the new measure, termed transmission constrained network flow (TCNF), unifies three large classes of market power measures: residual supply based, network flow based, and minimal generation based. Further, TCNF provides valuable information about market power not captured by prior indices. We derive its analytic properties and test its efficacy on IEEE test systems.

Performance Evaluation, 2013. 70(10):720-735. This work also appeared in the Proceedings of IFIP Performance, 2013, where it received the ``Best Student Paper'' award. It was one of the ten most downloaded papers of Performance Evaluation during Fall and Winter 2013.

MapReduce is a scalable parallel computing framework for big data processing. It exhibits multiple processing phases, and thus an efficient job scheduling mechanism is crucial for ensuring efficient resource utilization. This paper studies the scheduling challenge that results from the overlapping of the ``map'' and ``shuffle'' phases in MapReduce. We propose a new, general model for this scheduling problem, and validate this model using cluster experiments. Further, we prove that scheduling to minimize average response time in this model is strongly NP-hard in the offline case and that no online algorithm can be constant-competitive. However, we provide two online algorithms that match the performance of the offline optimal when given a slightly faster service rate (i.e., in the resource augmentation framework). Finally, we validate the algorithms using a workload trace from a Google cluster and show that the algorithms are near optimal in practical settings.

Performance Evaluation, 2013. 70(10):770-791. This work was recognized as one of the most cited papers in Performance Evaluation during 2014-2016. It also appeared in the Proceedings of IFIP Performance, 2013; and as a poster at ACM Sigmetrics, 2013.

Demand response is a crucial aspect of the future smart grid. It has the potential to provide significant peak demand reduction and to ease the incorporation of renewable energy into the grid. Data centers' participation in demand response is becoming increasingly important given their high and increasing energy consumption and their flexibility in demand management compared to conventional industrial facilities. In this paper, we study two demand response schemes to reduce a data center's peak loads and energy expenditure: workload shifting and the use of local power generation. We conduct a detailed characterization study of coincident peak data over two decades from Fort Collins Utilities, Colorado, and then develop two optimization-based algorithms that combine workload scheduling and local power generation to avoid the coincident peak and reduce the energy expenditure. The first algorithm optimizes the expected cost and the second one provides the optimal worst-case guarantee. We evaluate these algorithms via trace-based simulations. The results show that using workload shifting in combination with local generation can provide significant cost savings compared to either alone.

We consider the problem of designing a distribution rule to share `welfare' (cost or revenue) among individually strategic agents. There are many known distribution rules that guarantee the existence of a (pure) Nash equilibrium in this setting, e.g., the Shapley value and its weighted variants; however, a characterization of the space of distribution rules that guarantee the existence of a Nash equilibrium is unknown. Our work provides an exact characterization of this space for a specific class of scalable and separable games, which includes a variety of applications such as facility location, routing, network formation, and coverage games. Given an arbitrary welfare function $W$, we prove that a distribution rule guarantees equilibrium existence for all games (i.e., all possible sets of resources, agent action sets, etc.) if and only if it is equivalent to a generalized weighted Shapley value on some `ground' welfare function $W'$, which can be distinct from $W$. However, if budget-balance is required in addition to the existence of a Nash equilibrium, then $W'$ must be the same as $W$. We also provide an alternate characterization of this space in terms of `generalized' marginal contributions, which is more appealing from the point of view of computational tractability. A possibly surprising consequence of our result is that, in order to guarantee equilibrium existence in all games with any fixed welfare function, it is necessary to work within the class of potential games.

This paper proposes a model to study the interaction of price competition and congestion in the cloud computing marketplace. Specifically, we propose a three-tier market model that captures a marketplace with users purchasing services from Software-as-a-Service (SaaS) providers, which in turn purchase computing resources from either Platform-as-a-Service (PaaS) providers or Infrastructure-as-a-Service (IaaS) providers. Within each level, we define and characterize competitive equilibria. Further, we use these characterizations to understand the relative profitability of SaaSs and PaaSs/IaaSs, and to understand the impact of price competition on the user-experienced performance, i.e., the `price of anarchy' of the cloud marketplace. Our results highlight that both of these depend fundamentally on the degree to which congestion results from shared or dedicated resources in the cloud.

We study the structural complexity of bimatrix games, formalized via rank, from an empirical perspective. We consider a setting where we have data on player behavior in diverse strategic situations, but where we do not observe the relevant payoff functions. We prove that high complexity (high rank) has empirical consequences when arbitrary data is considered. Additionally, we prove that, in more restrictive classes of data (termed laminar), any observation is rationalizable using a low-rank game: specifically a zero-sum game. Hence complexity as a structural property of a game is not always testable. Finally, we prove a general result connecting the structure of the feasible data sets with the highest rank that may be needed to rationalize a set of observations.

The need for new measures to assess market power in deregulated electricity markets

Real-time demand response is an essential tool for handling the uncertainties associated with the increasing penetration of renewable generation. Traditionally, demand response has focused on large industrial and commercial loads; however, it is expected that a large number of small residential loads, such as air conditioners, dish washers, and electric vehicles, will also participate in the coming years. The electricity consumption of these smaller loads, which we call deferrable loads, can be shifted over time, and can thus be used (in aggregate) to compensate for the random fluctuations in renewable generation. In this paper, we propose a real-time distributed deferrable load control algorithm to reduce the variance of aggregate load (load minus renewable generation) by shifting the power consumption of deferrable loads to periods with high renewable generation. At every time step, the algorithm minimizes the expected aggregate load variance with updated predictions. We prove that the suboptimality of the algorithm vanishes quickly as the time horizon expands. Further, we evaluate the algorithm via trace-based simulations.
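The receding-horizon structure described above can be sketched with a toy valley-filling planner. This is our own simplification, not the paper's algorithm: the paper minimizes expected aggregate load variance under stochastic predictions, while here predictions are point forecasts and the per-step optimization is plain water-filling over the remaining horizon:

```python
def waterfill(base, energy, iters=60):
    """Allocate `energy` across slots to flatten base + allocation
    (valley-filling), via bisection on the fill level."""
    lo, hi = min(base), max(base) + energy
    for _ in range(iters):
        mid = (lo + hi) / 2
        used = sum(max(0.0, mid - b) for b in base)
        if used < energy:
            lo = mid          # level too low: not all energy placed
        else:
            hi = mid
    return [max(0.0, lo - b) for b in base]

def mpc_deferrable(forecasts, energy):
    """Receding-horizon control: at each step t, valley-fill the remaining
    deferrable energy against the current forecast of future base load
    (forecasts[t] covers slots t..T-1), commit only the first slot's
    allocation, then re-plan with updated information."""
    schedule = []
    for fc in forecasts:
        plan = waterfill(fc, energy)
        schedule.append(plan[0])
        energy -= plan[0]
    return schedule
```

With perfect forecasts of a flat base load, re-planning changes nothing and the energy is spread evenly; the interesting behavior, and the subject of the paper's analysis, appears when `forecasts[t]` is revised between steps.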

Demand response is a crucial aspect of the future smart grid. It has the potential to provide significant peak demand reduction and to ease the incorporation of renewable energy into the grid. Data centers' participation in demand response is becoming increasingly important given their high and increasing energy consumption and their flexibility in demand management compared to conventional industrial facilities. In this paper, we study two demand response schemes to reduce a data center's peak loads and energy expenditure: workload shifting and the use of local power generation. We conduct a detailed characterization study of coincident peak data over two decades from Fort Collins Utilities, Colorado, and then develop two optimization-based algorithms that combine workload scheduling and local power generation to avoid the coincident peak and reduce the energy expenditure. The first algorithm optimizes the expected cost and the second one provides the optimal worst-case guarantee. We evaluate these algorithms via trace-based simulations. The results show that using workload shifting in combination with local generation can provide significant cost savings compared to either alone.

A tale of two metrics: Simultaneous bounds on competitiveness and regret

We consider algorithms for `smoothed online convex optimization' (SOCO) problems. SOCO is a variant of the class of `online convex optimization' (OCO) problems that is strongly related to the class of `metrical task systems' (MTS). Both of these have a rich history and a wide range of applications. Prior literature on these problems has focused on two performance metrics: regret and the competitive ratio. There exist known algorithms with sublinear regret and known algorithms with constant competitive ratios; however, no known algorithm achieves both simultaneously. In this paper, we show that this is due to a fundamental incompatibility between these two metrics -- no algorithm (deterministic or randomized) can achieve sublinear regret and a constant competitive ratio, even in the case when the objective functions are linear. However, we also exhibit an algorithm that, for the important special case of one-dimensional decision spaces, provides sublinear regret while maintaining a competitive ratio that grows arbitrarily slowly.

This paper focuses on the design and analysis of scheduling policies for multi-class queues, such as those found in wireless networks and high-speed switches. In this context, we study the response time tail under generalized max-weight policies in settings where the traffic flows are highly asymmetric. Specifically, we study an extreme setting with two traffic flows, one heavy-tailed and one light-tailed. In this setting, we prove that classical max-weight scheduling, which is known to be throughput optimal, results in the light-tailed flow having heavy-tailed response times. However, we show that via a careful design of the inter-queue scheduling policy (from the class of generalized max-weight policies) and the intra-queue scheduling policies, it is possible to maintain throughput optimality and guarantee light-tailed delays for the light-tailed flow, without affecting the response time tail for the heavy-tailed flow.

Energy consumption imposes a significant cost for data centers; yet much of that energy is used to maintain excess service capacity during periods of predictably low load. As a result, there has recently been interest in developing designs that allow the service capacity to be dynamically resized to match the current workload. However, there is still much debate about the value of such approaches in real settings. In this paper, we show that the value of dynamic resizing is highly dependent on statistics of the workload process. In particular, both slow time-scale non-stationarities of the workload (e.g., the peak-to-mean ratio) and the fast time-scale stochasticity (e.g., the burstiness of arrivals) play key roles. To illustrate the impact of these factors, we combine optimization-based modeling of the slow time-scale with stochastic modeling of the fast time-scale. Within this framework, we provide both analytic and numerical results characterizing when dynamic resizing does (and does not) provide benefits.

Power consumption imposes a significant cost for data centers implementing cloud services, yet much of that power is used to maintain excess service capacity during periods of predictably low load. This paper investigates how much can be saved by dynamically `right-sizing' the data center by turning off servers during such periods, and how to achieve that saving via an online algorithm. We prove that the optimal offline algorithm for dynamic right-sizing has a simple structure when viewed in reverse time, and this structure is exploited to develop a new `lazy' online algorithm, which is proven to be 3-competitive. We validate the algorithm using traces from two real data center workloads and show that significant cost-savings are possible.
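
The hysteresis at the heart of dynamic right-sizing can be illustrated with a simpler break-even ("ski-rental" style) rule. This sketch is a stand-in for the paper's 3-competitive `lazy' algorithm, not its actual construction; the function name and the unit idle-power model are assumptions.

```python
def right_size(loads, beta, idle_power=1.0):
    """Break-even heuristic for dynamic right-sizing: scale up
    instantly to meet demand, but delay turning servers off until the
    energy they would have wasted idling matches the switching cost
    beta. Illustrates the hysteresis idea behind lazy capacity
    provisioning."""
    servers, idle_time, plan = 0, 0.0, []
    for load in loads:
        if load >= servers:
            servers = load            # power up immediately
            idle_time = 0.0
        else:
            idle_time += idle_power   # excess capacity burns idle energy
            if idle_time >= beta:     # idle cost has reached switch cost
                servers = load        # now shedding servers pays off
                idle_time = 0.0
        plan.append(servers)
    return plan

plan = right_size([3, 1, 1, 1, 1], beta=3)
```

With a switching cost of 3, the excess servers idle for three slots before being shed, so the plan holds capacity at 3 until the break-even point.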

Cooperative control focuses on deriving desirable collective behavior in multiagent systems through the design of local control algorithms. Game theory is beginning to emerge as a valuable set of tools for achieving this objective. A central component of this game theoretic approach is the assignment of utility functions to the individual agents. Here, the goal is to assign utility functions within an `admissible' design space such that the resulting game possesses desirable properties. Our first set of results illustrates the complexity associated with such a task. In particular, we prove that if we restrict the class of utility functions to be local, scalable, and budget-balanced then (i) ensuring that the resulting game possesses a pure Nash equilibrium requires computing a Shapley value, which can be computationally prohibitive for large-scale systems, and (ii) ensuring that the allocation which optimizes the system level objective is a pure Nash equilibrium is impossible. The last part of this paper demonstrates that both limitations can be overcome by introducing an underlying state space into the potential game structure.

We consider a variation of the resource allocation problem. In the traditional problem, there is a global planner who would like to assign a set of players to a set of resources so as to maximize welfare. We consider the situation where the global planner does not have the authority to assign players to resources; rather, players are self-interested. The question that emerges is: how can the global planner entice the players to settle on a desirable allocation with respect to the social welfare? To study this question, we focus on a class of games that we refer to as distributed welfare games. Within this context, we investigate how the global planner should distribute the welfare to the players. We measure the efficacy of a distribution rule in two ways: (i) Does a pure Nash equilibrium exist? (ii) How does the welfare associated with a pure Nash equilibrium compare to the social welfare associated with the optimal allocation? In this paper we explore the applicability of cost sharing methodologies for distributing welfare in such resource allocation problems. We demonstrate that obtaining desirable distribution rules, such as distribution rules that are budget balanced and guarantee the existence of a pure Nash equilibrium, often comes at a significant informational and computational cost. In light of this, we derive a systematic procedure for designing desirable distribution rules with minimal informational and computational cost for a special class of distributed welfare games. Furthermore, we derive a bound on the price of anarchy for distributed welfare games in a variety of settings. Lastly, we highlight the implications of these results using the problem of sensor coverage.

2012

As the market adoption of distributed energy resources (DER) reaches regional scale, it will create significant challenges in the management of the distribution system related to existing protection and control systems. This is likely to lead to problems with power quality and reliability. In this paper, we describe a framework for the development of a class of pricing mechanisms that both induce deep customer participation and enable efficient management of their end-use devices to provide both distribution and transmission side support. The basic challenge resides in reliably extracting the desired response from customers on short time-scales. Thus, new pricing mechanisms are needed to create effective closed loop systems that are tightly coupled with distribution control systems to ensure reliability and power quality.

Online optimization with switching costs

Minghong Lin, Alan Roytman, Adam Meyerson and Lachlan L. H. Andrew

Proceedings of ACM MAMA, 2012. Extensions of this abstract appeared as a poster at ACM Sigmetrics 2013 and as a full paper in COLT, 2013.

Proceedings of ACM Sigmetrics, 2012. Sigmetrics held jointly with IFIP Performance. An extension of this work is used in HP's Net-zero Data Center Architecture, which was named a 2013 Computerworld Honors Laureate. It was one of the ten most downloaded papers of ACM SIGMETRICS in the Summer, Fall, and Winter of 2013.

The demand for data center computing has increased significantly in recent years, resulting in huge energy consumption. Data centers typically comprise three main subsystems: IT equipment, which provides services to customers; power infrastructure, which supports the IT and cooling equipment; and cooling infrastructure, which removes the generated heat. This work presents a novel approach to model the energy flows in a data center and optimize its holistic operation. Traditionally, supply-side constraints such as energy or cooling availability were largely treated independently from IT workload management. This work reduces cost and environmental impact using a holistic approach that integrates energy supply (e.g., renewable supply, dynamic pricing) and cooling supply (e.g., chiller, outside air cooling) with IT workload planning to improve the overall sustainability of data center operations. Specifically, we predict renewable energy as well as IT demand and design an IT workload management plan that schedules IT workload and allocates IT resources within a data center according to time-varying power supply and cooling efficiency. We have implemented and evaluated our approach using traces from real data centers and production systems. The results demonstrate that our approach can reduce the recurring power costs and the use of non-renewable energy by as much as 60 percent compared to existing, non-integrated techniques, while still meeting operational goals and SLAs.

The rapid growth of content distribution on the Internet has brought with it proportional increases in the costs of distributing content. Adding to distribution costs is the fact that digital content is easily duplicable, and hence can be shared in an illicit peer-to-peer (P2P) manner that generates no revenue for the content provider. In this paper, we study whether the content provider can recover lost revenue through a more innovative approach to distribution. In particular, we evaluate the benefits of a hybrid revenue-sharing system that combines a legitimate P2P swarm and a centralized client-server approach. We show how the revenue recovered by the content provider using a server-supported legitimate P2P swarm can exceed that of the monopolistic scheme by an order of magnitude. Our analytical results are obtained in a fluid model, and supported by stochastic simulations.

Energy consumption imposes a significant cost for data centers; yet much of that energy is used to maintain excess service capacity during periods of predictably low load. As a result, there has recently been interest in developing designs that allow the service capacity to be dynamically resized to match the current workload. However, there is still much debate about the value of such approaches in real settings. In this paper, we show that the value of dynamic resizing is highly dependent on statistics of the workload process. In particular, both slow time-scale non-stationarities of the workload (e.g., the peak-to-mean ratio) and the fast time-scale stochasticity (e.g., the burstiness of arrivals) play key roles. To illustrate the impact of these factors, we combine optimization-based modeling of the slow time-scale with stochastic modeling of the fast time-scale. Within this framework, we provide both analytic and numerical results characterizing when dynamic resizing does (and does not) provide benefits.

In this work, we study the complexity of finding a Walrasian equilibrium. Our main result gives an algorithm which can compute an approximate Walrasian equilibrium in an exchange economy with general, but well-behaved, utility functions in time that is polynomial in the number of goods when the number of agents is held constant. This result has applications to macroeconomics and finance, where applications of Walrasian equilibrium theory tend to deal with many goods but a fixed number of agents.

Data center power consumption can be reduced by switching off servers during low load. However, excess switching is wasteful. This paper reviews online algorithms for optimizing this tradeoff, including the benefits of shifting load between geographically distant data centers. These algorithms can also adjust a link's number of parallel lightpaths.

A common approach for dynamic capacity provisioning is ``receding horizon control (RHC)'', which computes the provisioning for the current time by optimizing over a given window of predictions about the future arrivals. In this work, we provide new results characterizing the performance of RHC. We prove that RHC performs well when servers are homogeneous, i.e., it has performance that quickly tends toward optimality as the prediction window increases. However, we also prove that RHC performs badly when servers are heterogeneous, regardless of the length of the prediction window. This is problematic given that in practice data centers are heterogeneous. To address this issue, we introduce a variant of RHC that performs well in the heterogeneous setting. In fact, we prove that it matches the competitive ratio of RHC in the homogeneous setting under both the homogeneous and heterogeneous settings.
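
The RHC template described above, optimizing over a window of predictions and committing only the first decision, can be sketched with a brute-force window solver. The discrete action set, the cost function, and the function name are assumptions made for illustration, not the paper's model.

```python
from itertools import product

def rhc_step(x_prev, preds, actions, switch, cost):
    """One step of receding horizon control: enumerate capacity
    sequences over the prediction window, score each by operating cost
    plus switching cost for capacity increases, and commit only the
    first action of the best sequence."""
    best_cost, best_first = float("inf"), x_prev
    for seq in product(actions, repeat=len(preds)):
        total, prev = 0.0, x_prev
        for x, demand in zip(seq, preds):
            total += cost(x, demand) + switch * max(0, x - prev)
            prev = x
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first

# track predicted demand [2, 2] starting from zero capacity
choice = rhc_step(0, [2, 2], actions=[0, 1, 2], switch=1.0,
                  cost=lambda x, d: abs(x - d))
```

In a full RHC loop this step would be repeated each time slot with a freshly shifted prediction window.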

This paper focuses on the competitive analysis of scheduling disciplines in a large deviations setting. Though there are policies that are known to optimize the sojourn time tail under a large class of heavy-tailed job sizes (e.g. Processor Sharing and Shortest Remaining Processing Time) and there are policies known to optimize the sojourn time tail in the case of light-tailed job sizes (e.g. First Come First Served), no policies are known that can optimize the sojourn time tail across both light-tailed and heavy-tailed job size distributions. We prove that no such work-conserving, non-anticipatory, non-learning policy exists and thus that a policy must learn (or know) the job size distribution in order to optimize the sojourn time tail.

We study a polling model where we want to achieve a balance between the fairness of the waiting times and the efficiency of the system. For this purpose, we introduce the k-gated service discipline. It is a hybrid of the classical gated and exhaustive disciplines, and consists of using k_i gated service phases at queue i before the server switches to the next queue. We derive the distributions and means of the waiting times, a pseudo conservation law for the weighted sum of the mean waiting times, and the fluid limits of the waiting times. Our goal is to optimize the k_i's so as to minimize the differences in the mean waiting times, i.e. to achieve maximal fairness, without giving up too much on the efficiency of the system. From the fluid limits we derive a heuristic rule for setting the k_i's. In a numerical study the heuristic is shown to perform well.

Scheduling policies that are biased toward small jobs have received growing attention due to their superior mean delay performance. Such policies include Shortest-Remaining-Processing-Time (SRPT), Preemptive-Shortest-Job-First (PSJF), and Least-Attained-Service (LAS). In this paper, we study the delay distribution of LAS and of the class of scheduling policies called SMART-LD (SMAll-Response-Time for Large-Deviations), which includes SRPT, PSJF, and their variants, in order to understand policies that prioritize short jobs. We study the delay distribution (rate function) of the SMART-LD class and LAS in a discrete-time queueing system under the many-sources large deviations regime. We prove that all SMART-LD policies have the same asymptotic delay distribution as SRPT and illustrate the improvements SMART-LD policies and LAS make over First-Come-First-Served (FCFS). Furthermore, we show that the delay distribution of SMART-LD policies stochastically improves upon the delay distribution of LAS across all job sizes.

Speed scaling has long been used as a power-saving mechanism at a chip level. However, in recent years, speed scaling has begun to be used as an approach for trading off energy usage and performance throughout all levels of computer systems. This wide-spread use of speed scaling has motivated significant research on the topic, but many fundamental questions about speed scaling are only beginning to be understood. In this chapter, we focus on a simple, but general, model of speed scaling and provide an algorithmic perspective on four fundamental questions: (i) What is the structure of the optimal speed scaling algorithm? (ii) How does speed scaling interact with scheduling? (iii) What is the impact of the sophistication of speed scaling algorithms? and (iv) Does speed scaling have any unintended consequences? For each question we provide a summary of insights from recent work on the topic in both worst-case and stochastic analysis as well as a discussion of interesting open questions that remain.

Adapting the speed of a processor is an effective method to reduce energy consumption. This paper studies the optimal way to scale speed to balance response time and energy consumption under processor sharing scheduling. It is shown that using a static rate while the system is busy provides nearly optimal performance, but having more available speeds increases robustness to different traffic loads. In particular, the dynamic speed scaling optimal for Poisson arrivals is also constant-competitive in the worst case. The scheme which equates power consumption with queue occupancy is shown to be 10-competitive when power is cubic in speed.
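
The occupancy rule mentioned above, run at the speed whose power draw equals the number of jobs in the queue, can be illustrated with a small fluid simulation of Processor Sharing. This is a numerical sketch under assumed names and a simple discrete-time model, not the paper's competitive analysis.

```python
def simulate_ps(arrivals, sizes, alpha=3, dt=0.001, t_end=100.0):
    """Fluid simulation of Processor Sharing with occupancy-based speed
    scaling: with power P(s) = s**alpha, setting P(s) = n (the number
    of jobs) gives speed s = n**(1/alpha). Returns response time per
    job index."""
    remaining, response, t, i = {}, {}, 0.0, 0
    while t < t_end and (i < len(arrivals) or remaining):
        while i < len(arrivals) and arrivals[i] <= t:
            remaining[i] = sizes[i]
            i += 1
        n = len(remaining)
        if n:
            share = (n ** (1.0 / alpha)) * dt / n  # speed split equally
            for j in list(remaining):
                remaining[j] -= share
                if remaining[j] <= 0:
                    response[j] = t + dt - arrivals[j]
                    del remaining[j]
        t += dt
    return response

r = simulate_ps([0.0, 0.0], [1.0, 1.0])
```

With two unit-size jobs arriving together, the server runs at speed 2^(1/3), so each finishes at about 2^(2/3) ≈ 1.59; slowing down under light occupancy is exactly the energy/response-time tradeoff the abstract studies.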

Power consumption imposes a significant cost for data centers implementing cloud services, yet much of that power is used to maintain excess service capacity during periods of predictably low load. This paper investigates how much can be saved by dynamically `right-sizing' the data center by turning off servers during such periods, and how to achieve that saving via an online algorithm. We prove that the optimal offline algorithm for dynamic right-sizing has a simple structure when viewed in reverse time, and this structure is exploited to develop a new `lazy' online algorithm, which is proven to be 3-competitive. We validate the algorithm using traces from two real data center workloads and show that significant cost-savings are possible.

Given the significant energy consumption of data centers, improving their energy efficiency is an important social problem. However, energy efficiency is necessary but not sufficient for sustainability, which demands reduced usage of energy from fossil fuels. This paper investigates the feasibility of powering internet-scale systems using (nearly) entirely renewable energy. We perform a trace-based study to evaluate three issues related to achieving this goal: the impact of geographical load balancing, the role of storage, and the optimal mix of renewables. Our results highlight that geographical load balancing can significantly reduce the required capacity of renewable energy by using the energy more efficiently with ``follow the renewables'' routing. Further, our results show that small-scale storage can be useful, especially in combination with geographical load balancing, and that an optimal mix of renewables includes significantly more wind than photovoltaic solar.

One of the main building blocks of economics is the theory of the consumer, which postulates that consumers are utility maximizing. However, from a computational perspective, this model is called into question because the task of utility maximization subject to a budget constraint is computationally hard in the worst-case under reasonable assumptions. In this paper, we study the empirical consequences of strengthening consumer choice theory to enforce that utilities are computationally easy to maximize. We prove the possibly surprising result that computational constraints have no empirical consequences whatsoever for consumer choice theory. That is, a data set is consistent with a utility maximizing consumer if and only if a data set is consistent with a utility maximizing consumer having a utility function that can be maximized in strongly polynomial time.

Our result motivates a general approach for posing questions about the empirical content of computational constraints: the revealed preference approach to computational complexity. The approach complements the conventional worst-case view of computational complexity in important ways, and is methodologically close to mainstream economics.
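
The "testable implications" in consumer choice theory are usually operationalized via the Generalized Axiom of Revealed Preference (GARP): a finite data set of prices and chosen bundles is consistent with utility maximization iff it satisfies GARP. The following sketch checks GARP for illustration; the function name is an assumption, and this is the standard test, not the paper's polynomial-time construction.

```python
def satisfies_garp(prices, bundles):
    """Check whether choice data (one price vector and one chosen
    bundle per observation) satisfies GARP: build the direct revealed-
    preference relation, take its transitive closure, and look for a
    strict reversal."""
    n = len(bundles)
    # cost[i][j]: expenditure on bundle j at observation i's prices
    cost = [[sum(p * q for p, q in zip(prices[i], bundles[j]))
             for j in range(n)] for i in range(n)]
    # R[i][j]: bundle i directly revealed preferred to bundle j
    R = [[cost[i][i] >= cost[i][j] for j in range(n)] for i in range(n)]
    for k in range(n):                      # transitive closure
        for i in range(n):
            for j in range(n):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    for i in range(n):
        for j in range(n):
            if R[i][j] and cost[j][j] > cost[j][i]:  # strict reversal
                return False
    return True
```

For example, choosing (1,1) at prices (1,2) when (2,0) was cheaper, and then choosing (2,0) at prices (2,1) when (1,1) was strictly cheaper, violates GARP.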

Many-to-one matching markets exist in numerous different forms, such as college admissions, matching medical interns to hospitals for residencies, assigning housing to college students, and the classic firms and workers market. In these markets, externalities such as complementarities and peer effects severely complicate the preference ordering of each agent. Further, research has shown that externalities lead to serious problems for market stability and for developing efficient algorithms for finding stable matchings.

In this paper we make the observation that peer effects are often the result of underlying social connections, and we explore a formulation of the many-to-one matching market where peer effects are derived from an underlying social network. Under this assumption, we prove that stable matchings always exist and characterize the set of stable matchings. Further, we prove bounds on how far welfare of the worst-case stable matching can be from the welfare of the optimal matching, and find that the structure of the clustering of the social network plays a large role.

Power consumption imposes a significant cost for implementing cloud services, yet much of that power is used to maintain excess service capacity during periods of low load. In this work, we study how to avoid such waste via online dynamic capacity provisioning. We overview recent results showing that the optimal offline algorithm for dynamic capacity provisioning has a simple structure when viewed in reverse time, and this structure can be exploited to develop a new `lazy' online algorithm which is 3-competitive. Additionally, we analyze the performance of the more traditional approach of receding horizon control and introduce a new variant with a significantly improved worst-case performance guarantee.

Online services today are characterized by a highly congestion sensitive user base, that also experiences strong positive network effects. A majority of these services are supported by advertising and are offered for free to the end user. We study the problem of optimal capacity provisioning for a profit maximizing firm operating such an online service in the asymptotic regime of a large market size. We show that network effects heavily influence the optimal capacity provisioning strategy, as well as the profit of the firm. In particular, strong positive network effects allow the firm to operate the service with fewer servers, which translates to increased profit.

We study a nonatomic congestion game with N parallel links, each under the control of a profit-maximizing provider. Within this `load balancing game', each provider has the freedom to set a price, or toll, for access to the link and seeks to maximize its own profit. Given fixed prices, a Wardrop equilibrium among users is assumed, under which users all choose paths of minimal and identical effective cost. Within this model we have oligopolistic price competition which, in equilibrium, gives rise to situations where neither providers nor users have incentives to adjust their prices or routes, respectively. In this context, we provide new results about the existence and efficiency of oligopolistic equilibria. Our main theorem shows that, when the number of providers is small, oligopolistic equilibria can be extremely inefficient; however, as the number of providers N grows, the oligopolistic equilibria become increasingly efficient (at a rate of 1/N) and approach the socially optimal allocation.

We consider the problem of designing the distribution rule used to share welfare (cost or revenue) among individually strategic agents. There are many distribution rules known to guarantee the existence of a Nash equilibrium in this setting, e.g., the Shapley value and its weighted variants; however a characterization of the space of distribution rules that yield the existence of a Nash equilibrium is unknown. Our work provides a step towards such a characterization. We prove that when the welfare function is a submodular function, a budget-balanced distribution rule guarantees the existence of an equilibrium for all games (set of resources, agent action sets, etc.) if and only if it is a weighted Shapley value.
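
The (unweighted) Shapley value at the center of this characterization can be computed directly as an average marginal contribution. A minimal sketch, assuming a toy set-function welfare; the brute-force enumeration over arrival orders is exponential, which is precisely why such rules can be computationally prohibitive at scale.

```python
from itertools import permutations
from math import factorial

def shapley(players, welfare):
    """Shapley-value distribution rule: each player's share is her
    marginal contribution to the welfare, averaged over all orders of
    arrival. The rule is budget-balanced: the shares sum to the welfare
    of the full player set."""
    shares = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = set()
        for p in order:
            before = welfare(frozenset(coalition))
            coalition.add(p)
            shares[p] += welfare(frozenset(coalition)) - before
    n_fact = factorial(len(players))
    return {p: s / n_fact for p, s in shares.items()}

# toy coverage welfare: both players cover the same single resource
w = lambda S: 1.0 if S else 0.0
s = shapley(["a", "b"], w)
```

Here each player gets half the welfare, since each is the first (and thus marginal) contributor in half of the orders.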

Energy expenditure has become a significant fraction of data center operating costs. Recently, `geographical load balancing' has been suggested as an approach for taking advantage of the geographical diversity of Internet-scale distributed systems in order to reduce energy expenditures by exploiting the electricity price differences across regions. However, the fact that such designs reduce energy costs does not imply that they reduce energy usage. In fact, such designs often increase energy usage.

This paper explores whether the geographical diversity of Internet-scale systems can be used to provide environmental gains in addition to reducing data center costs. Specifically, we explore whether geographical load balancing can encourage usage of `green' energy from renewable sources and reduce usage of `brown' energy from fossil fuels. We make two contributions. First, we derive three algorithms, with varying degrees of distributed computation, for achieving optimal geographical load balancing. Second, using these algorithms, we show that if dynamic pricing of electricity is done in proportion to the fraction of the total energy that is brown at each time, then geographical load balancing provides significant reductions in brown energy usage. However, the benefits depend strongly on the degree to which systems accept dynamic energy pricing and the form of pricing used.

Recent results in complexity theory suggest that various economic theories require agents to solve intractable problems. However, such results assume the agents are optimizing explicit utility functions, whereas the economic theories merely assume the agents are rational, where rational behavior is defined via some optimization problem. Might making rational choices be easier than solving the corresponding optimization problem? For at least one major economic theory, the theory of the consumer, we find this is indeed the case.

In this paper we quantify the total cost of an epidemic spreading through a social network, accounting for both the immunization and disease costs. Previous research has typically focused on determining the optimal strategy to limit the lifetime of a disease, without considering the cost of such strategies. In the large graph limit, we calculate the exact expected disease cost for a general random graph, and we illustrate it for the specific example of an Erdos-Renyi network. We also give an upper bound on the expected disease cost for finite-size graphs. Finally, we study how to optimally perform a one-shot immunization to minimize the social cost of a disease, including both the cost of the disease and the cost of immunization.

SRPT has long been known to optimize the queue length distribution and the mean response time (a.k.a. sojourn time). As such, it has been the focus of a wide body of analysis. However, results about the heavy-traffic behavior of SRPT have only recently started to emerge and hold only under a few specific workloads. In this work, we tightly characterize the growth rate of the mean response time under SRPT in the M/GI/1 system under general job size distributions. Our results illustrate the relationship between the job size tail and the heavy traffic growth rate of mean response time. Further, we show that the heavy traffic growth rate can be used to provide an accurate approximation for mean response time outside of heavy traffic.

Traditionally, the study of scheduling policies has focused on performance metrics such as response time, queue length, and throughput. However, the more vague notion of `fairness' is often equally or more important than these traditional performance metrics. But, the concept of fairness is difficult to define and so it has been studied only sporadically. This has changed over the last decade and a growing literature providing an analytic framework for studying fairness has emerged. This article surveys recent developments, which include a rich variety of fairness metrics as well as a growing understanding of the fairness of common scheduling policies.

2010

System design must strike a balance between energy and performance by carefully selecting the speed at which the system will run. In this work, we examine fundamental tradeoffs incurred when designing a speed scaler to minimize a weighted sum of expected response time and energy use per job. We prove that a popular dynamic speed scaling algorithm is 2-competitive for this objective and that no ``natural'' speed scaler can improve on this. Further, we prove that energy-proportional speed scaling works well across two common scheduling policies: Shortest Remaining Processing Time (SRPT) and Processor Sharing (PS). Third, we show that under SRPT and PS, gated-static speed scaling is nearly optimal when the mean workload is known, but that dynamic speed scaling provides robustness against uncertain workloads. Finally, we prove that speed scaling magnifies unfairness, notably SRPT's bias against large jobs and the bias against short jobs in non-preemptive policies. However, PS remains fair under speed scaling. Together, these results show that the speed scalers studied here can achieve any two, but only two, of optimality, fairness, and robustness.

When scheduling to minimize the response time tail, the goals of optimality and robustness are seemingly at odds. Over the last decade, results have emerged which show that scheduling disciplines that are near-optimal under light (exponential) tailed workload distributions do not perform well under heavy (power) tailed workload distributions, and vice-versa. Very recently, it has been shown that this conflict between optimality and robustness is fundamental, i.e., no policy that does not learn information about the workload can be optimal across both light-tailed and heavy-tailed workloads. In this paper we show that one can exploit very limited workload information (the system load) in order to design a scheduler that provides robust performance across heavy-tailed and light-tailed workloads.

From a rare events perspective, scheduling disciplines that work well under light (exponential) tailed workload distributions do not perform well under heavy (power) tailed workload distributions, and vice-versa, leading to fundamental problems in designing schedulers that are robust to distributional assumptions on the job sizes. This paper shows how to exploit partial workload information (system load) to design a scheduler that provides robust performance across heavy-tailed and light-tailed workloads. Specifically, we derive new asymptotics for the tail of the stationary sojourn time under Limited Processor Sharing (LPS) scheduling for both heavy-tailed and light-tailed job size distributions, and show that LPS can be robust to the tail of the job size distribution if the multiprogramming level is chosen carefully as a function of the load.

This paper focuses on a generalization of stochastic Kronecker graphs, introducing a Kronecker-like operator and defining a family of generator matrices H dependent on distances between nodes in a specified graph embedding. We prove that any lattice-based network model with sufficiently small distance-dependent connection probability will have a Poisson degree distribution and provide a general framework to prove searchability for such a network. Using this framework, we focus on a specific example of an expanding hypercube and discuss the similarities and differences of such a model with recently proposed network models based on a hidden metric space. We also prove that a greedy forwarding algorithm can find very short paths of length O((log log n)^2) on the hypercube with n nodes, demonstrating that distance-dependent Kronecker graphs can generate searchable network models.

Shortest Remaining Processing Time first (SRPT) has long been known to optimize the queue length distribution and the mean response time (a.k.a. flow time, sojourn time). As such, it has been the focus of a wide body of analysis. However, results about the heavy-traffic behavior of SRPT have only recently started to emerge. In this work, we characterize the growth rate of the mean response time under SRPT in the M/GI/1 system under general job size distributions. Our results illustrate the relationship between the job size tail and the heavy traffic growth rate of mean response time. Further, we show that the heavy traffic growth rate can be used to provide an accurate approximation for mean response time outside of heavy traffic.

Game-theoretic control is a promising new approach for distributed resource allocation. In this paper, we describe how game-theoretic control can be viewed as having an intrinsic layered architecture, which provides a modularization that simplifies the control design. We illustrate this architectural view by presenting details about one particular instantiation using potential games as an interface. This example serves to highlight the strengths and limitations of the proposed architecture while also illustrating the relationship between game-theoretic control and other existing approaches to distributed resource allocation.

2009

This paper describes an extension to stochastic Kronecker graphs that provides the special structure required for searchability, by defining a 'distance'-dependent Kronecker operator. We show how this extension of Kronecker graphs can generate several existing social network models, such as the Watts-Strogatz small-world model and Kleinberg's lattice-based model. We focus on a specific example of an expanding hypercube, reminiscent of recently proposed social network models based on a hidden hyperbolic metric space, and prove that a greedy forwarding algorithm can find very short paths of length O((log log n)^2) for graphs with n nodes.
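The O((log log n)^2) bound relies on the long-range edges added by the Kronecker construction; as a baseline for comparison, plain greedy bit-fixing on the hypercube alone already finds paths whose length equals the Hamming distance between the endpoints (at most log2 n). A minimal sketch of that baseline (function names are illustrative, not from the paper):

```python
def greedy_route(src, dst):
    """Greedy bit-fixing on the hypercube: repeatedly flip the lowest
    differing bit, moving one edge closer to the destination per step."""
    path, cur = [src], src
    while cur != dst:
        diff = cur ^ dst
        cur ^= diff & -diff        # flip the lowest set bit of the difference
        path.append(cur)
    return path

path = greedy_route(0b10110, 0b00011)
# Path length equals the Hamming distance between the endpoints (<= log2 n).
assert len(path) - 1 == bin(0b10110 ^ 0b00011).count("1") == 3
```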

TD learning and its refinements are powerful tools for approximating the solution to dynamic programming problems. However, the techniques provide the approximate solution only within a prescribed finite-dimensional function class. Thus, the question that always arises is how should the function class be chosen? The goal of this paper is to propose an approach for TD learning based on choosing the function class using the solutions to associated fluid and diffusion approximations. In order to illustrate this new approach, the paper focuses on an application to dynamic speed scaling for power management.

Recently, game theory has been proposed as a tool for cooperative control. Specifically, the interactions of a multi-agent distributed system are modeled as a non-cooperative game where agents are self-interested. In this work, we prove that this approach of non-cooperative control has limitations with respect to engineering multi-agent systems. In particular, we prove that it is not possible to design budget balanced agent utilities that also guarantee that the optimal control is a Nash equilibrium. However, it is important to realize that game-theoretic designs are not restricted to the framework of non-cooperative games. In particular, we demonstrate that these limitations can be overcome by conditioning each player's utility on additional information, i.e., a state. This utility design fits into the framework of a particular form of stochastic games termed state-based games and is applicable in many application domains.

Load balancing is a common approach to task assignment in distributed architectures. In this paper, we show that the degree of inefficiency in load balancing designs is highly dependent on the scheduling discipline used at each of the back-end servers. Traditionally, the back-end scheduler can be modeled as Processor Sharing (PS), in which case the degree of inefficiency grows linearly with the number of servers. However, if the back-end scheduler is changed to Shortest Remaining Processing Time (SRPT), the degree of inefficiency can be independent of the number of servers, instead depending only on the heterogeneity of the speeds of the servers. Further, switching the back-end scheduler to SRPT can provide significant improvements in the overall mean response time of the system as long as the heterogeneity of the server speeds is small.

This paper investigates the performance of online dynamic speed scaling algorithms for the objective of minimizing a linear combination of energy and response time. We prove that (SRPT, $P^{-1}(n)$), which uses Shortest Remaining Processing Time (SRPT) scheduling and runs at the speed such that the power used equals the queue length, is 2-competitive for a very wide class of power-speed tradeoff functions. Further, we prove that there exist tradeoff functions such that no algorithm can attain a competitive ratio less than 2.
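The speed rule has a closed form when the power curve is polynomial: if $P(s) = s^\alpha$, then running so that power equals the queue length $n$ means $s = P^{-1}(n) = n^{1/\alpha}$. A sketch under that assumed (not paper-specified) power function:

```python
def speed_for_queue(n, alpha=2.0):
    """Speed s = P^{-1}(n): with power curve P(s) = s**alpha (an assumed
    polynomial form), the power consumed exactly matches queue length n."""
    return n ** (1.0 / alpha)

assert abs(speed_for_queue(9) - 3.0) < 1e-12   # P(3) = 9 = queue length
assert abs(speed_for_queue(4) - 2.0) < 1e-12
```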

Energy usage of computer communications systems has quickly become a vital design consideration. One effective method for reducing energy consumption is dynamic speed scaling, which adapts the processing speed to the current load. This paper studies how to optimally scale speed to balance mean response time and mean energy consumption under processor sharing scheduling. Both bounds and asymptotics for the optimal speed scaling scheme are provided. These results show that a simple scheme that halts when the system is idle and uses a static rate while the system is busy provides nearly the same performance as the optimal dynamic speed scaling. However, the results also highlight that dynamic speed scaling provides at least one key benefit -- significantly improved robustness to bursty traffic and mis-estimation of workload parameters.

2008

Recently, the class of SMART scheduling policies (disciplines) has been introduced in order to formalize the common heuristic of ``biasing toward small jobs.'' We study the tail of the sojourn-time (response-time) distribution under both SMART policies and the Foreground-Background policy (FB) in the GI/GI/1 queue. We prove that these policies behave very well under heavy-tailed service times, but behave poorly under light-tailed service times. Specifically, for heavy-tailed service times, we show that the sojourn-time tails under FB and under all SMART policies are equal to the service-time tail, up to a constant. In contrast, for light-tailed service times with no mass in the endpoint of the distribution, we show that, on a logarithmic scale, the sojourn-time tail of FB and all SMART policies is as large as possible.

Recently, computer systems researchers have begun to apply the Foreground-Background (FB) scheduling discipline to a variety of applications, and as a result, there has been a resurgence in theoretical research studying the discipline as well. In this paper, we will bring together results from both of these research streams and provide a survey of state-of-the-art theoretical results characterizing the performance of FB with emphasis on the impact of these results to computer systems.

Motivated by the optimality of Shortest Remaining Processing Time (SRPT) for mean response time, in recent years many computer systems have used the heuristic of favoring small jobs in order to dramatically reduce user response times. However, rarely do computer systems have knowledge of exact remaining sizes. In this paper, we introduce the class of epsilon-SMART policies, which formalizes the heuristic of favoring small jobs in a way that includes a wide range of policies that schedule using inexact job-size information. Examples of epsilon-SMART policies include (i) policies that use exact size information, e.g., SRPT and PSJF, (ii) policies that use job-size estimates, and (iii) policies that use a finite number of size-based priority levels. For many epsilon-SMART policies, e.g., SRPT with inexact job-size information, there are no analytic results available in the literature. In this work, we prove four main results: we derive upper and lower bounds on the mean response time, the mean slowdown, the response-time tail, and the conditional response time of epsilon-SMART policies. In each case, the results explicitly characterize the tradeoff between the accuracy of the job-size information used to prioritize and the performance of the resulting policy. Thus, the results provide designers an understanding of how accurate job-size information must be in order to achieve desired performance guarantees.

Load balancing is a common approach to distributed task assignment. In this paper we study the degree of inefficiency in load balancing designs. We show that the degree of inefficiency is highly dependent on the back-end scheduler.

We consider a variation of the resource allocation problem. In the traditional problem, there is a global planner who would like to assign a set of players to a set of resources so as to maximize social welfare. We consider the situation where the global planner does not have the authority to assign players to resources; rather, players are self-interested. The question that emerges is how can the global planner entice the players to settle on a desirable allocation with respect to the social welfare? To study this question, we focus on a class of games that we refer to as distributed welfare games. Within this context, we investigate how the global planner should distribute the social welfare among the players. We measure the efficacy of a distribution rule in two ways: (i) Does a pure Nash equilibrium exist? (ii) How does the social welfare of a pure Nash equilibrium compare to the social welfare of the optimal allocation? This comparison is referred to as the price of anarchy. To answer these questions, we derive sufficient conditions on the distribution rule that ensure the existence of a pure Nash equilibrium in any single-selection distributed welfare game. Furthermore, we derive the price of anarchy for distributed welfare games in a variety of settings. Lastly, we highlight the implications of these results using the sensor coverage problem.

Energy consumption in a computer system can be reduced by dynamic speed scaling, which adapts the processing speed to the current load. This paper studies the optimal way to adjust speed to balance mean response time and mean energy consumption, when jobs arrive as a Poisson process and processor sharing scheduling is used. Both bounds and asymptotics for the optimal speeds are provided. Interestingly, a simple scheme that halts when the system is idle and uses a static rate while the system is busy provides nearly the same performance as the optimal dynamic speed scaling. However, dynamic speed scaling which allocates a higher speed when more jobs are present significantly improves robustness to bursty traffic and mis-estimation of workload parameters.
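For intuition, a toy M/M/1 version of this trade-off can be optimized numerically. The sketch below assumes unit-mean exponential jobs, a polynomial power curve P(s) = s^alpha paid only while the server is busy, and a per-job cost of mean response time plus energy; all of these modeling choices are illustrative simplifications, not the paper's exact setup:

```python
def cost(s, lam=1.0, alpha=2.0):
    """Per-job cost for a gated-static M/M/1 server with unit-mean jobs:
    mean response time 1/(s - lam), plus energy s**alpha paid over the
    busy fraction lam/s, normalized per job."""
    if s <= lam:
        return float("inf")                  # unstable: server too slow
    return 1.0 / (s - lam) + (s ** alpha) * (lam / s) / lam

# Grid search for the optimal static speed (analytically s* = 2, cost 3 here).
best_cost, best_s = min((cost(1.1 + 0.01 * i), 1.1 + 0.01 * i) for i in range(400))
assert abs(best_s - 2.0) < 0.05 and abs(best_cost - 3.0) < 0.01
```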

We evaluated an automated Reading Tutor that let children pick stories to read, and listened to them read aloud. All 72 children in three classrooms (grades 2, 4, 5) were independently tested on the nationally normed Word Attack, Word Identification, and Passage Comprehension subtests of the Woodcock Reading Mastery Test (where they averaged nearly 2 standard deviations below national norms), and on oral reading fluency. We split each class into 3 matched treatment groups: Reading Tutor, commercial reading software, or other activities. In 4 months, the Reading Tutor group gained significantly more in Passage Comprehension than the control group (effect size = 1.2, p=.002) - even though actual usage was a fraction of the planned daily 20-25 minutes. To help explain these results, we analyzed relationships among gains in Word Attack, Word Identification, Passage Comprehension, and fluency by 108 additional children who used the Reading Tutor in 7 other classrooms (grades 1-4). Gains in Word Identification predicted Passage Comprehension gains only for Reading Tutor users, both in the controlled study (n=21, p=.042, regression coefficient B=.495+/- s.e. .227) and in the other classrooms (n=108, p=.005, B=.331+/-.115), where grade was also a significant predictor (p=.024, B=2.575+/-1.127).

2007

The growing trend in computer systems towards using scheduling policies that prioritize jobs with small service requirements has resulted in a new focus on the fairness of such policies. In particular, researchers have been interested in whether prioritizing small job sizes results in large jobs being treated unfairly.

We present a simple mean value analysis (MVA) framework for analyzing the effect of scheduling within queues in classical asymmetric polling systems with gated or exhaustive service. Scheduling in polling systems finds many applications in computer and communication systems. Our framework leads not only to unification but also to extension of the literature studying scheduling in polling systems. It illustrates that a large class of scheduling policies behaves similarly in the exhaustive polling model and the standard M/GI/1 model, whereas scheduling policies in the gated polling model behave very differently than in an M/GI/1.

In recent years, the response times experienced by large job sizes have been the focus of a growing number of papers. Though results about many scheduling disciplines have appeared, to this point, results characterizing the response time of large job sizes have been limited to either mean value analysis or law of large numbers scalings. We will present a novel framework that unifies these results and provides new results characterizing the distributional behavior of large job sizes. Also, we will illustrate the impact these new results have (i) for the analysis of the response time tail, (ii) for the analysis of busy periods, and (iii) for predicting response times of customers upon arrival.

We ask the question, ``for minimizing mean response time, which is preferable: one fast server of speed 1, or k slow servers each of speed 1/k?'' Our setting is the M/PH/k system with two priority classes of customers, high priority and low priority, where PH is a phase-type distribution. We find that multiple slow servers are often preferable, and we demonstrate exactly how many servers are preferable as a function of the load and service time distribution. In addition, we find that the optimal number of servers with respect to the high priority jobs may be very different from that preferred by low priority jobs, and we characterize these preferences. We also study the optimal number of servers with respect to overall mean response time, averaged over high and low priority jobs. Lastly, we ascertain the effect of the service demand variability of high priority jobs on low priority jobs.

Workload generators may be classified as based on a closed system model, where new job arrivals are only triggered by job completions (followed by think time), or an open system model, where new jobs arrive independently of job completions. In general, system designers pay little attention to whether a workload generator is closed or open. Using a combination of implementation and simulation experiments involving static and dynamic workloads, this paper illustrates that there is a vast difference in behavior between these open and closed models in real-world settings. In addition, this work synthesizes many of the differences in behavior between the models into eight simple principles.
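The gap between the two models shows up even in the simplest possible setting. The sketch below (our illustration, not from the paper) drives a single FCFS server first with open Poisson arrivals via the Lindley recursion, then as a closed system with one user and zero think time, where each new job is triggered only by the previous completion:

```python
import random
random.seed(0)

def open_mean_resp(lam, mu, n=100_000):
    """Open model: Poisson arrivals, FCFS; waiting time via the Lindley recursion."""
    w, total = 0.0, 0.0
    for _ in range(n):
        s = random.expovariate(mu)
        total += w + s                                  # response = wait + service
        w = max(0.0, w + s - random.expovariate(lam))
    return total / n

def closed_mean_resp(mu, n=100_000):
    """Closed model, one user, zero think time: a new job is submitted only
    after the previous one completes, so response time is just service time."""
    return sum(random.expovariate(mu) for _ in range(n)) / n

# Near saturation the open model's mean response time explodes (E[T] = 20 here),
# while the closed model stays pinned at E[S] = 1 regardless of apparent load.
high = open_mean_resp(lam=0.95, mu=1.0)
low = closed_mean_resp(mu=1.0)
assert high > 5 * low
```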

Scheduling/prioritization of DBMS transactions is important for many applications that rely on database backends. A convenient way to achieve scheduling is to limit the number of transactions within the database, maintaining most of the transactions in an external queue, which can be ordered as desired by the application. While external scheduling has many advantages in that it doesn't require changes to internal resources, it is also difficult to get right, in that its performance depends critically on the particular multiprogramming limit used (the MPL), i.e. the number of transactions allowed into the database. If the MPL is too low, throughput will suffer, since not all DBMS resources will be utilized. On the other hand, if the MPL is too high, there is insufficient control on scheduling. The question of how to adjust the MPL to achieve both goals simultaneously is an open problem, not just for databases but in system design in general. Herein we study this problem in the context of transactional workloads, both via extensive experimentation and queueing theoretic analysis. We find that the two most critical factors in adjusting the MPL are the number of resources that the workload utilizes and the variability of the transactions' service demands. We develop a feedback based controller, augmented by queueing theoretic models for automatically adjusting the MPL. Finally, we apply our methods to the specific problem of external prioritization of transactions. We find that external prioritization can be nearly as effective as internal prioritization, without any negative consequences, when the MPL is set appropriately.
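The tension the paper identifies -- MPL too low starves throughput, MPL too high forfeits scheduling control -- suggests a feedback loop as the simplest control structure. The sketch below is a hypothetical additive-step controller of our own, far simpler than the queueing-theoretic controller developed in the paper:

```python
def adjust_mpl(mpl, throughput, target, step=1, lo=1, hi=200):
    """One step of a hypothetical additive feedback loop: raise the MPL while
    throughput lags the target, lower it once the target is met (returning
    transactions to the external queue, where they can be prioritized)."""
    if throughput < target:
        return min(hi, mpl + step)
    return max(lo, mpl - step)

mpl = adjust_mpl(10, throughput=80.0, target=100.0)
assert mpl == 11        # throughput below target: admit one more transaction
```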

Scheduling policies that prioritize short jobs have received growing attention in recent years. The class of SMART policies includes many such disciplines, e.g. Shortest Remaining Processing Time (SRPT) and Preemptive Shortest Job First (PSJF). In this work, we study the delay distribution of SMART policies and contrast this distribution with that of the Least-Attained-Service (LAS) policy, which indirectly favors short jobs by prioritizing jobs with the least attained service (age). We study the delay distribution (rate function) of LAS and the SMART class in a discrete-time queueing system under the many flows regime. Our analysis in this regime (large capacity and large number of flows) is enabled by the introduction of a two dimensional queue representation, which creates tie-break rules. These additional rules do not alter the policies, but greatly simplify their analysis. We demonstrate that the queue evolution of all the above policies can be described under this single two dimensional framework. We prove that all SMART policies have the same delay distribution as SRPT and illustrate the improvements SMART policies make over First-Come-First-Served (FCFS). Furthermore, we show that the delay distribution of SMART policies stochastically improves upon the delay distribution of LAS. However, the delay distribution under LAS is not too bad -- the distribution of delay under LAS for most jobs sizes still provides improvement over FCFS. Our results are complementary to prior work that studies delay-tail behavior in the large buffer regime under a single flow.

2005

We define the class of SMART scheduling policies. These are policies that bias towards jobs with small remaining service times, jobs with small original sizes, or both, with the motivation of minimizing mean response time and/or mean slowdown. Examples of SMART policies include PSJF, SRPT, and hybrid policies such as RS (which biases according to the product of the remaining size and the original size of a job). For many policies in the SMART class, the mean response time and mean slowdown are not known or have complex representations involving multiple nested integrals, making evaluation difficult. In this work, we prove three main results. First, for all policies in the SMART class, we prove simple upper and lower bounds on mean response time. Second, we show that all policies in the SMART class, surprisingly, have very similar mean response times. Third, we show that the response times of SMART policies are largely insensitive to the variability of the job size distribution. In particular, we focus on the SRPT and PSJF policies and prove insensitive bounds in these cases.

In addition to providing small mean response times, modern applications seek to provide users predictable service and, in some cases, Quality of Service (QoS) guarantees. In order to understand the predictability of response times under a range of scheduling policies, we study the conditional variance in response times seen by jobs of different sizes. We define a metric and a criterion that distinguish between contrasting functional behaviors of conditional variance, and we then classify large groups of scheduling policies. In addition to studying the conditional variance of response times, we also derive metrics appropriate for comparing higher conditional moments of response time across job sizes. We illustrate that common statistics such as raw and central moments are not appropriate when comparing higher conditional moments of response time. Instead, we find that cumulant moments should be used.

We present the first near-exact analysis of an M/PH/k queue with m > 2 preemptive-resume priority classes. Our analysis introduces a new technique, which we refer to as Recursive Dimensionality Reduction (RDR). The key idea in RDR is that the m-dimensionally infinite Markov chain, representing the m class state space, is recursively reduced to a 1-dimensionally infinite Markov chain, that is easily and quickly solved. RDR involves no truncation and results in only small inaccuracy when compared with simulation, for a wide range of loads and variability in the job size distribution. Our analytic methods are then used to derive insights on how multi-server systems with prioritization compare with their single server counterparts with respect to response time. Multi-server systems are also compared with single server systems with respect to the effect of different prioritization schemes -- ``smart'' prioritization (giving priority to the smaller jobs) versus ``stupid'' prioritization (giving priority to the larger jobs). We also study the effect of approximating m class performance by collapsing the m classes into just two classes.

2004

We compare the overall mean response time (a.k.a. sojourn time) of the Processor Sharing (PS) and Feedback (FB) queues under an M/GI/1 system. We show that FB outperforms PS under service distributions having decreasing failure rates; whereas PS outperforms FB under service distributions having increasing failure rates.
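The dividing line here is the failure (hazard) rate h(x) = f(x)/(1 - F(x)) of the service distribution: decreasing h favors FB, increasing h favors PS. A quick numeric check using a Weibull distribution with shape 0.5, a standard decreasing-failure-rate example (the distribution choice is ours, for illustration):

```python
import math

def failure_rate(pdf, cdf, x):
    """Hazard rate h(x) = f(x) / (1 - F(x))."""
    return pdf(x) / (1.0 - cdf(x))

def weibull_pdf(x, k, lam=1.0):
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-(x / lam) ** k)

def weibull_cdf(x, k, lam=1.0):
    return 1.0 - math.exp(-(x / lam) ** k)

h1 = failure_rate(lambda x: weibull_pdf(x, 0.5), lambda x: weibull_cdf(x, 0.5), 1.0)
h2 = failure_rate(lambda x: weibull_pdf(x, 0.5), lambda x: weibull_cdf(x, 0.5), 2.0)
assert h2 < h1   # shape-0.5 Weibull has a decreasing failure rate: FB wins
```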

Given a configuration of t indistinguishable pebbles on the n vertices of a graph G, we say that a vertex v can be reached if a pebble can be placed on it in a finite number of ``moves,'' where a move removes two pebbles from a vertex and places one pebble on a neighboring vertex. G is said to be pebbleable if all its vertices can be thus reached. Now given the n-path P_n, how large (resp. small) must t be so as to be able to pebble the path almost surely (resp. almost never)? It was known that the threshold th(P_n) for pebbling satisfies $n 2^{c\sqrt{\lg n}} \leq th(P_n) \leq n 2^{\sqrt{2\lg n}}$, where $c < 1/\sqrt{2}$ is arbitrary. We improve the upper bound for the threshold function to $th(P_n) \leq n 2^{d\sqrt{\lg n}}$, where $d > 1$ is arbitrary.
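On a path, the reachability question for a single target has a simple greedy answer: push pebbles toward the target from each side, halving at every edge, since a move consumes two pebbles to advance one. The sketch below illustrates the pebbling move itself (our own illustration, not the paper's threshold argument):

```python
def pushable(pebbles):
    """Pebbles deliverable across a path segment to the vertex just past its
    near end. `pebbles` is ordered farthest-from-target first; each step
    halves, since a move spends 2 pebbles to advance 1 edge."""
    q = 0
    for p in pebbles:
        q = (q + p) // 2
    return q

def reachable(config, v):
    """Can vertex v of the path receive a pebble under configuration `config`?"""
    if config[v] >= 1:
        return True
    left = pushable(config[:v])             # vertices 0..v-1, farthest first
    right = pushable(config[v + 1:][::-1])  # vertices n-1..v+1, farthest first
    return left + right >= 1

assert reachable([2, 0], 1)          # 2 pebbles at distance 1 deliver one
assert not reachable([1, 0, 1], 1)   # lone pebbles on each side cannot move
assert reachable([0, 4, 0, 0], 3)    # 4 pebbles at distance 2 deliver one
```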

We present a general analytical framework for the modeling and analysis of TCP variations. The framework allows the modeling of multiple variations of TCP, including TCP-Vegas, TCP-SACK, and TCP-Reno, under general network situations. In particular, the framework allows us to propose the first analytical model of TCP-Vegas for arbitrary on-off traffic that is able to predict the operating point of the network. The analysis provided by our framework leads to many interesting observations with respect to both the behavior of bottleneck links that are shared by TCP sources and the effectiveness of the design decisions in TCP-SACK and TCP-Vegas.

It is common to evaluate scheduling policies based on their mean response times. Another important, but sometimes opposing, performance metric is a scheduling policy's fairness. For example, a policy that biases towards small job sizes so as to minimize mean response time may end up being unfair to large job sizes. In this paper we define three types of unfairness and demonstrate large classes of scheduling policies that fall into each type. We end with a discussion on which jobs are the ones being treated unfairly.

2002

We propose a framework for comparing the performance of two queueing policies. Our framework is motivated by the notion of competitive analysis, widely used by the computer science community to analyze the performance of online algorithms. We apply our framework to compare M/GI/1/FB and M/GI/1/SJF with M/GI/1/SRPT, and obtain new results about the performance of M/GI/1/FB and M/GI/1/SJF.

We explore the performance of an M/GI/1 queue under various scheduling policies from the perspective of a new metric: the slowdown experienced by largest jobs. We consider scheduling policies that bias against large jobs, towards large jobs, and those that are fair, e.g., Processor-Sharing. We prove that as job size increases to infinity, all work conserving policies converge almost surely with respect to this metric to no more than $1/(1 - \rho)$, where $\rho$ denotes load. We also find that the expected slowdown under any work conserving policy can be made arbitrarily close to that under Processor-Sharing, for all job sizes that are sufficiently large.