ABSTRACT1: We study adaptive policies for dynamic inventory and price control when the distribution of the random demand for discrete nonperishable items is unknown. Pure inventory control is achieved by targeting newsvendor ordering quantities that correspond to empirical demand distributions learned over time. On this basis we take up the more complex joint inventory-price control, in which the demand-affecting prices must be evaluated as well. We identify policies that balance exploration and exploitation, and measure their performance via regret, i.e., the price paid for not knowing the demand distributions {\em a priori} over a given horizon. Multiple bounds are derived on the growth rates of regret; they vary with how thoroughly unknown the demand distributions are and whether nonperishability has been accounted for. Our simulation study illustrates order-of-magnitude differences between pure inventory and joint inventory-price controls in terms of regret growth rates; it also affirms nonperishability as a serious factor to be reckoned with in dynamic inventory and price controls involving unknown discrete demand.
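
As an illustration of what "targeting newsvendor ordering quantities that correspond to empirical demand distributions" can look like, the following minimal sketch computes the smallest integer order quantity whose empirical CDF reaches a critical ratio. The function name `empirical_newsvendor_quantity`, the sample data, and the critical ratio b/(h+b) (with hypothetical unit backlog cost b and holding cost h) are assumptions made for illustration; the abstract does not specify the cost structure or the exact estimator.

```python
from collections import Counter
from fractions import Fraction


def empirical_newsvendor_quantity(demands, ratio):
    """Return the smallest observed demand d whose empirical CDF is >= ratio.

    `demands` is a list of non-negative integer observations (assumed
    uncensored); `ratio` is the newsvendor critical ratio in (0, 1].
    """
    if not demands:
        raise ValueError("need at least one demand observation")
    counts = Counter(demands)
    n = len(demands)
    cumulative = 0
    for d in sorted(counts):
        cumulative += counts[d]
        # Exact rational comparison avoids floating-point ties.
        if Fraction(cumulative, n) >= ratio:
            return d
    return max(counts)


# Hypothetical costs: backlog b = 3, holding h = 1, so ratio = 3/4.
sample = [0, 1, 1, 2, 3, 3, 3, 5]
quantity = empirical_newsvendor_quantity(sample, Fraction(3, 4))
print(quantity)
```

As more demand observations arrive, the empirical CDF sharpens and the targeted quantity can be recomputed from the enlarged sample.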

ABSTRACT2: We study a dynamic inventory control problem involving fixed setup costs and random demand. With an infinite planning horizon, model primitives including costs and demand distributions are taken to be stationary. Under a given demand distribution, an (s,S) policy is known to minimize the long-run per-period average cost. However, we depart from the traditional literature by allowing the stationary demand distribution to be largely unknown. Our goal is to rein in the long-run growth of the regret resulting from applying a policy that strives to learn the underlying demand while simultaneously making ordering decisions based on what it has learned. We propose a policy that controls the pace at which a traditional (s,S)-computing algorithm is applied to the empirical distribution of the demand learned over time. The regret incurred by the policy is shown to grow over time at a rate of O(T^{1/2}\cdot(\ln T)^{1/2}).
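
To make the (s,S) rule concrete, here is a minimal sketch that evaluates the per-period average cost of a fixed (s,S) policy along a demand path. It is not the learning policy proposed in the abstract. The function name `average_cost_sS`, the backlogging of unmet demand, the ordering trigger "inventory at or below s", the zero lead time, and the omission of a per-unit ordering cost are all assumptions made for illustration.

```python
def average_cost_sS(demands, s, S, setup_cost, holding_cost, backlog_cost, start=0):
    """Average per-period cost of a fixed (s, S) policy on a demand path.

    Assumptions (not taken from the abstract): unmet demand is backlogged,
    orders arrive instantly, and an order is placed whenever the inventory
    level is at or below s, raising it to S.
    """
    if s >= S:
        raise ValueError("need s < S")
    if not demands:
        raise ValueError("need at least one period of demand")
    level = start
    total = 0.0
    for d in demands:
        if level <= s:
            total += setup_cost  # fixed cost paid once per order
            level = S
        level -= d  # demand is realized after ordering
        total += holding_cost * max(level, 0) + backlog_cost * max(-level, 0)
    return total / len(demands)


# Deterministic unit demand, hypothetical costs K = 10, h = 1, b = 5.
cost = average_cost_sS([1, 1, 1, 1], s=0, S=2,
                       setup_cost=10, holding_cost=1, backlog_cost=5)
print(cost)
```

A learning policy of the kind described above would periodically recompute (s,S) from the empirical demand distribution; how often it does so is exactly the pace that the abstract says must be controlled.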

ABSTRACT3: We conduct regret analysis for inventory control involving unknown demand for discrete items. Like Huh et al. (2011), we focus on the censored-demand case where the portion of demand exceeding available inventory is lost and unobservable. Instead of learning and doing simultaneously around the fairly sophisticated Kaplan-Meier estimator, our policy separates learning from doing and relies on the plainer empirical distribution. The latter was used by Besbes and Muharremoglu (2013) in their regret analyses, including those for the discrete-item censored-demand case. We do not need a certain separability assumption involving the underlying demand distribution and the newsvendor parameter, despite its proven usefulness in the aforementioned work.

Although it probably sacrifices some efficiency, the separation allows us to achieve a T^{2/3}-sized bound on the regret incurred by our policy. Beyond the repeated-newsvendor setting, we also allow nonperishable items to be carried over into future periods. Here, a careful adaptation of Proposition 2 of Katehakis, Yang, and Zhou (2019), originally developed for uncensored demand and a simultaneous learning-doing policy, proves useful. As long as the probability of zero demand is bounded away from one, the overall regret remains T^{2/3}-sized. We believe this rate to be nearly optimal for any policy based on learning-doing separation, a separation that is likely to be needed in other regret analyses involving demand censoring.
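
One generic way to picture learning-doing separation is an explore-then-commit schedule: spend roughly T^{2/3} periods ordering enough to observe demand without censoring, then commit to the newsvendor quantile of the resulting empirical distribution. The sketch below is a textbook-style illustration under stated assumptions, not the policy analyzed in the abstract. The names `exploration_length` and `separated_policy_orders`, the known demand upper bound `max_demand`, and the perishable (no carry-over) setting are all hypothetical.

```python
from fractions import Fraction


def exploration_length(horizon):
    """Smallest integer n with n**3 >= horizon**2, i.e. ceil(horizon^(2/3))."""
    n = max(1, round(horizon ** (2 / 3)))
    # Correct any floating-point drift with exact integer checks.
    while n ** 3 < horizon ** 2:
        n += 1
    while n > 1 and (n - 1) ** 3 >= horizon ** 2:
        n -= 1
    return n


def separated_policy_orders(demands, max_demand, ratio):
    """Order quantities of an explore-then-commit sketch on a demand path.

    Learning phase: order `max_demand` (assumed known upper bound) so that
    sales reveal demand exactly. Doing phase: order the empirical quantile
    at the critical `ratio`, without further learning.
    """
    horizon = len(demands)
    n_learn = min(exploration_length(horizon), horizon) if horizon else 0
    observed, orders, committed = [], [], None
    for t, d in enumerate(demands):
        if t < n_learn:
            q = max_demand
            observed.append(min(d, q))  # sales; equal to demand when q >= d
        else:
            if committed is None:
                ranked = sorted(observed)
                # First rank i with i / n >= ratio gives the empirical quantile.
                committed = next(v for i, v in enumerate(ranked, 1)
                                 if i >= ratio * len(ranked))
            q = committed
        orders.append(q)
    return orders


path = [2, 0, 1, 3, 2, 1, 0, 2]
orders = separated_policy_orders(path, max_demand=5, ratio=Fraction(1, 2))
print(orders)
```

Heuristically, a learning phase of length about T^{2/3} trades the over-ordering cost of exploration against the estimation error carried through the remaining periods; the abstract's actual analysis, including the nonperishable case, is more delicate than this sketch.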

ABSTRACT4: We propose a game-theoretic framework that incorporates both incomplete information and general ambiguity attitudes on factors external to all players. Our starting point is players' preferences over return-distribution vectors, essentially mappings from states of the world to distributions of returns to be received by players. There are two ways in which equilibria for this preference game can be defined. Moreover, as the preferences acquire more features, we can gradually add more structure to the game. The added structures include real-valued functions over return-distribution vectors, sets of probabilistic priors over states of the world, and eventually the traditional expected-utility framework involving a single prior. We establish equilibrium existence results, show the upper hemi-continuity of equilibrium sets over changing ambiguity attitudes, and uncover relations between the two versions of equilibria.

Some attention is paid to the enterprising game, in which players exhibit ambiguity-seeking attitudes and bet optimistically on the favorable resolution of ambiguities. The two solution concepts coincide at this game's pure equilibria, whose existence is guaranteed when strategic complementarities are present. The current framework can be applied to settings such as auctions involving ambiguity about competitors' assessments of the items' worth.