Sponsored Search Auction Design via Machine Learning∗

Maria-Florina Balcan†, Avrim Blum†, Jason D. Hartline‡, Yishay Mansour§

∗ This paper discusses results from Mechanism Design via Machine Learning, available as Technical Report CMU-CS-05-143, as they apply to auctions for sponsored search.
† Carnegie Mellon University. {ninamf,avrim}@cs.cmu.edu
‡ Microsoft Research, Mountain View, CA. hartline@microsoft.com
§ School of Computer Science, Tel-Aviv University. mansour@cs.tau.ac.il. The work was done while the author was a fellow in the Institute of Advanced Studies, Hebrew University. This work was supported in part by the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778, by grant no. 1079/04 from the Israel Science Foundation, and by an IBM faculty award. This publication reflects only the authors' views.

ABSTRACT

In this work we use techniques from the study of sample complexity in machine learning to reduce revenue-maximizing auction problems to standard algorithmic questions. These results are particularly relevant to designing good pricing mechanisms for sponsored search. In particular, we apply our results to two problems: profit-maximizing combinatorial auctions, and auctions for pricing semantically related goods. Auctions for sponsored search can be viewed as combinatorial auctions in that bidders have combinatorial preferences (in the search terms and the location of the ad on the search results page) for having ads placed. Furthermore, since the space of all searches is much larger than the set of advertisers, it is useful to use the semantic relationship of search terms within pricing algorithms. Our main results show how to take algorithms that solve these pricing problems and convert them into auctions with good game-theoretic properties and provably good performance.

1. INTRODUCTION

The typical approach to auctions for sponsored search is to run a separate auction for every search. This approach may perform suboptimally because it ignores implicit competition between advertisers bidding on semantically similar keywords. The effect is more pronounced when keywords have only a few advertisers bidding on them but the semantic space of similar keywords has many advertisers. In the case where the advertisers' preferences are all common knowledge, this motivates the algorithmic problem of pricing semantically related items. One of the main results of this paper is to show, when the advertisers' preferences are private, how to use semantic pricing algorithms to construct an auction that takes advantage of the available semantic information.¹

In this work, we use techniques from sample complexity in machine learning theory to reduce the design of revenue-maximizing incentive-compatible mechanisms to algorithmic pricing questions relevant to sponsored search. When the number of agents is sufficiently large as a function of an appropriate measure of complexity of the class of solutions being compared to, this reduction loses only a factor of \(1+\epsilon\) in solution quality; that is, an algorithm (or \(\beta\)-approximation) for the standard algorithmic problem can be converted to a \((1+\epsilon)\)-approximation (or \(\beta(1+\epsilon)\)-approximation) for the incentive-compatible design problem. We do this in a fairly general setting that includes the following as special cases:

Auction of digital goods to indistinguishable bidders. In this problem, studied in [7, 4], we have a digital good (a good of unlimited supply with zero marginal cost) and \(n\) bidders, where each bidder \(i\) has some valuation \(v_i\) between 1 and \(h\). Our goal is to sell our good so as to make profit comparable to the best fixed price: the price \(p\) maximizing \(p \times |\{i : v_i \ge p\}|\).

Attribute Auctions. Consider auctions for advertisements based on search keys. As mentioned above, a problem with having a separate auction for each key is that this might not produce enough competition to achieve good prices. Instead, we may want to group keys into categories, say having one auction for all keys related to sporting equipment, another for transportation, and so on. Given some
taxonomy (or just a collection of possible groupings of keywords), we model the problem of determining the best partitioning of keywords into markets as something we call an attribute auction. In this problem, bidders are not indistinguishable but instead have a set of publicly known attributes, such as the keywords they are interested in, and the goal is to achieve revenue comparable to the best pricing function over these attributes from some class \(G\). For example, [3] considers the special case of the attribute auction problem with 1-dimensional attributes and a comparison class \(G\) of functions that partition bidders into \(k\) contiguous “markets” and offer a separate price in each. In the case of advertisements, \(G\) might correspond to partitions of keywords in the taxonomy into \(k\) categories.

¹ This is a fundamentally different approach from what is known as “broad match” or “semantic match”, where advertisers are automatically entered into auctions for keywords that are semantically related to their desired keyword. In particular, we will never show an advertiser's ad with any keywords other than the ones they have explicitly selected.

Item-pricing in combinatorial auctions. Profit-maximizing combinatorial auctions are another generalization of the digital good auction problem [8, 9]. In this setting we have \(m\) different items, each in unlimited supply (like a supermarket), and bidders have valuations on subsets of items. Our goal is to achieve revenue nearly as large as the best auction that uses item prices (assigns a separate price to each item), which is a natural comparison class. Our results imply that \(\tilde{O}(mh/\epsilon^2)\) bidders are sufficient to achieve revenue close to the optimum item pricing (assuming the algorithmic problem can be solved for the given bidders), no matter how complicated those bidders' valuations are. In fact, our bounds only require that the optimal revenue be large compared to \(mh/\epsilon^2\), which improves by roughly a factor of \(m\) over the results of [8]. Auctions for sponsored search can be viewed as a
special case of this problem where the items on which the bidders have combinatorial preferences are the different positions at which ads can be shown on the result page of a web search.

The generic type of reduction used in these settings is as follows: given an algorithm \(A\) (exact or approximate) for the non-incentive-compatible optimization problem and a set of bidders \(S\), we split the bidders randomly into two sets \(S_1\) and \(S_2\), run the algorithm separately on each set (perhaps adding a penalty term to the objective to penalize solutions that are too “complex” according to some measure), and then apply the solution found on \(S_1\) to \(S_2\) and the solution found on \(S_2\) to \(S_1\). Sample-complexity results from machine learning theory can then give a guarantee on the quality of the results if the number of bidders is sufficiently large compared to some notion of the complexity of the comparison class or proposed solution. However, from a learning perspective, these mechanism-design settings present a number of technical challenges: in particular, the loss function is discontinuous and asymmetric, and the range of bid values may be large.

2. DEFINITIONS

We will be considering mechanism design problems of the following general form. We have a set \(S\) of \(n\) bidders, and we assume that each bidder \(i\) has some private information \(\mathrm{priv}_i\) (like how much they are willing to pay for a digital good), as well as public information \(\mathrm{pub}_i\) (such as their location in a network). The game itself will be defined by an abstract space of legal offers (like an offer to sell a good at $17) together with a mapping \(\rho\) that defines how much profit a given offer yields from a given bidder. For example, in the case of auctioning a digital good, \(\rho(\text{“offer \$17”}, \mathrm{priv}_i) = 17\) if \(\mathrm{priv}_i \ge 17\) and 0 otherwise. We can think of \(\rho\) as defining the assumption about how agents behave as a function of their private values.

Definition 1. A comparison class \(G\) of pricing functions is a set of functions \(g\) that map the public information of a bidder to an offer. The profit of a
function \(g\) is \(\sum_i \rho(g(\mathrm{pub}_i), \mathrm{priv}_i)\). Note that we are implicitly considering only unlimited-supply mechanism design problems, because the profit from bidder \(i\) does not depend on whether \(g\) received profit from other bidders \(j\).

Given a comparison class \(G\), the algorithm design problem is: given both the public and private information in \(S\), find the \(g \in G\) of highest total profit \(\mathrm{OPT}_G\). In our reductions, we may also want to perform “structural risk minimization”, which adds additional artificial penalties to different functions \(g\) based on some measure of their complexity, in which case we will need to assume we have an algorithm that optimizes revenue minus penalty. These penalties help to prevent the algorithm from “over-fitting” to its input: this will be important when, in our reduction, we run an algorithm on some set \(S_1\) and apply its results to a different set of bidders \(S_2\).

We now need to define what we mean by an incentive-compatible mechanism. An incentive-compatible mechanism is a function that takes in the public information of all the bidders, plus the private information of all bidders except the given bidder \(i\), and outputs an offer for bidder \(i\). Our goal will be to design such a mechanism whose total profit is nearly as large as the profit of the best function in comparison class \(G\).

While we look to compare our profit to the profit of the best function from some class, our auction's outcome will not typically be representable as the result of using such a function. Since the auction is based on randomly partitioning the bids into two sets, the function used for each set will generally be different. This observation is not a drawback of the technique we propose nor of our performance measures.²

One final point at this level of generality: we will assume that we are given an upper bound \(h\) on the value of \(\rho\); that is, no individual bidder can influence profit by more than \(h\). This term will then come into our sample-complexity bounds.

2.1 Examples

² In the special case of digital-good auctions, Goldberg et al. [6] give substantial
justification for comparing auctions that can use multiple prices (analogously, pricing functions) to the optimal single-price profit: in a large class of natural auctions for profit maximization, none can beat the profit of the optimal single sale price. Furthermore, as shown by Goldberg and Hartline [5], multiple prices are inherently necessary for profit-maximizing auctions: there is no truthful auction that always uses a single pricing function for all bidders and obtains a profit comparable to the optimal single-price profit in the worst case.

Auction of digital goods to indistinguishable bidders: As described in the introduction, in this setting the bidders have no public information (equivalently, all the bidders have the same public information \(\mathrm{pub}\)) and the private information of bidder \(i\) is exactly its valuation \(v_i\) for the digital good, which is a real number between 1 and \(h\). Here, a natural comparison class \(G = \{g_p\}\) is the class of all functions that offer a fixed price \(p\), and \(\rho\) is defined by \(\rho(p, \mathrm{priv}_i) = p\) if \(p \le \mathrm{priv}_i\) and \(\rho(p, \mathrm{priv}_i) = 0\) otherwise.

Attribute Auctions: This is the same as the setting above except that now each bidder \(i\) is associated with a public attribute \(\mathrm{pub}_i \in X\), where \(X\) is the attribute space. We view \(X\) as an abstract space, but one can envision it as \(\mathbb{R}^d\), for example. \(G\) is then a class of pricing functions from \(X\) to \(\mathbb{R}^+\), such as all linear functions or all functions that partition \(X\) into \(k\) markets (say, based on distance to \(k\) cluster centers) and offer a different price in each. The mapping \(\rho\) is a function from \(\mathbb{R}^+ \times [1,h]\) to \([0,h]\) defined (as in the case of indistinguishable bidders) by \(\rho(p, \mathrm{priv}_i) = p\) if \(p \le \mathrm{priv}_i\) and \(\rho(p, \mathrm{priv}_i) = 0\) otherwise. We will give analyses of several interesting classes of comparison functions in Section 4.

Combinatorial Auctions: Here we have a set \(J\) of \(m\) distinct items, each in unlimited supply. Each consumer has a valuation \(v_i(s)\) for each bundle \(s \subseteq J\) of items, which measures how much receiving bundle \(s\) would be worth to consumer \(i\). The private information of bidder \(i\) is
given by the vector of all its valuations on subsets of \(J\) (typically bidders are assumed to be indistinguishable, with no public information). A natural class of comparison functions \(G\) (studied in [9]) is the class of functions that assign a separate price to each item, such that the price of a bundle is just the sum of the prices of the items in it (called item-pricing). The mapping \(\rho\) is then defined by assuming bidders will buy the bundle (if any) with the largest positive gap between its value to them and its cost.

3. GENERIC REDUCTIONS

We are interested in reducing incentive-compatible mechanism design to the standard algorithm design problem. Our reductions will be based on random sampling. Let \(A\) be an algorithm for the (non-incentive-compatible) algorithmic problem. The simplest mechanism that we consider, which we call \(\mathrm{RSOPF}_{(G,A)}\) (Random Sampling Optimal Pricing Function), is the following generalization of the random sampling digital-goods auction from [7]:

1. Randomly split the bidders into two groups \(S_1\) and \(S_2\), flipping a fair coin for each.
2. Run \(A\) to determine the best (or approximately best) function \(g_1 \in G\) over \(S_1\), and similarly the best (or approximately best) \(g_2 \in G\) over \(S_2\).
3. Finally, apply \(g_1\) over \(S_2\) and \(g_2\) over \(S_1\).

We will also consider variants of \(\mathrm{RSOPF}_{(G,A)}\) that discretize \(G\) or perform some type of SRM (in which case we will need to assume \(A\) can optimize over the given class).

Now, fix a setting (defined by \(\rho\) and \(G\)). In order to simplify notation, for a given pricing function \(g\) and bidder \(i\), define \(g(i)\) to be the profit made by \(g\) from bidder \(i\), i.e., \(g(i) = \rho(g(\mathrm{pub}_i), \mathrm{priv}_i)\). Similarly, for a set of bidders \(S' \subseteq S\), let \(g(S') = \sum_{i \in S'} g(i)\). So \(\mathrm{OPT}_G = \max_{g \in G} g(S)\). The following lemma is key to our analysis.

Lemma 1. Consider a fixed pricing function \(g\) and a profit level \(p\). If we randomly partition \(S\) into \(S_1\) and \(S_2\), then the probability that \(|g(S_1) - g(S_2)| \ge \epsilon \max[g(S), p]\) is at most \(2e^{-\epsilon^2 p/(2h)}\).

We can now give our simplest generic reduction, for the case that \(G\) is finite. Note that for particular settings, such as the basic auction of a digital good (see [2]), we can get stronger guarantees by a more refined analysis.

Theorem 2. Given a comparison class \(G\) and a \(\beta\)-approximation algorithm \(A\) for optimizing over \(G\), so long as \(\mathrm{OPT}_G \ge \beta n\) and the number of bidders \(n\) satisfies
\[ n \ge \frac{8h}{\epsilon^2} \ln(2|G|/\delta), \]
then with probability at least \(1-\delta\), the profit of \(\mathrm{RSOPF}_{(G,A)}\) is at least \((1-\epsilon)\,\mathrm{OPT}_G/\beta\).

In many natural cases, \(G\) consists of functions at different “levels of complexity” \(k\), such as partitioning bidders into \(k\) markets. One natural approach to such a setting is to perform structural risk minimization (SRM), that is, to assign a penalty term to functions based on their complexity and then to run a version of \(\mathrm{RSOPF}_{(G,A)}\) in which \(A\) optimizes profit minus penalty. Specifically, let \(\bar{G}\) be a series of pricing function classes \(G_1 \subseteq G_2 \subseteq \ldots\), and let \(\mathrm{pen}\) be a penalty function defined over these classes. Also, for simplicity, assume \(\beta = 1\) (we have an exact algorithm for the underlying problem). We then define the procedure \(\mathrm{RSOPF\text{-}SRM}_{(\bar{G},\mathrm{pen})}\) as follows:

1. Randomly partition the bidders into two sets, \(S_1\) and \(S_2\), flipping a fair coin for each.
2. Compute \(g_1\) to attain \(\max_k \max_{g \in G_k} [g(S_1) - \mathrm{pen}(G_k)]\), and similarly compute \(g_2\) from \(S_2\).
3. Use price function \(g_1\) for bidders in \(S_2\) and \(g_2\) for bidders in \(S_1\).

A straightforward extension of Theorem 2 to this case would introduce a quadratic dependence on \(h\), but we will be able to reduce this to nearly linear. Define \(\mathrm{OPT}_k = \mathrm{OPT}_{G_k}\).

Theorem 3. Assuming that we have an exact algorithm for solving the optimization problem required by \(\mathrm{RSOPF\text{-}SRM}_{(\bar{G},\mathrm{pen})}\), then for any given value of \(n\), \(\epsilon\), and \(\delta\), with probability at least \(1-\delta\), the revenue of \(\mathrm{RSOPF\text{-}SRM}_{(\bar{G},\mathrm{pen})}\) for \(\mathrm{pen}(G_k)\)
\(= \frac{6}{(1-\epsilon)^2} \cdot \frac{72h}{\epsilon^2} \ln(8k^2|G_k|/\delta)\) is at least \(\max_k \big((1-\epsilon)\,\mathrm{OPT}_k - \mathrm{pen}(G_k)\big)\).

Finally, in some cases, \(|G|\) is not a very good measure of the true complexity of the class \(G\) (e.g., even for the simplest case of fixed-price functions, if we do not discretize then \(G\) is infinite). In that case we can use the notion of \(\epsilon\)-covers, for which we need one more technical definition. For \(g \in G\) let \(\rho_g\) be the profit function induced by \(g\), and let \(\rho(G) = \{\rho_g : g \in G\}\). That is, while \(g\) outputs an offer, \(\rho_g\) outputs the profit made from the given bidder using that offer. An \(\epsilon\)-cover of \(\rho(G)\) with respect to \(L_\infty\) is a set of functions \(\mathrm{Cov}(\epsilon, \rho(G))\) such that for every \(\rho_g \in \rho(G)\) there exists \(f\) in the cover such that for every bidder \(i\), \(|\rho_g(i) - f(i)| \le \epsilon\). Let \(N(\epsilon, \rho(G))\) denote the size of the smallest \(\epsilon\)-cover. Now one can prove:

Theorem 4. If we randomly partition \(S\) into \(S_1\) and \(S_2\), then
\[ n \ge \frac{8h^2}{\epsilon^2} \left[ \ln\frac{2}{\delta} + \ln N(\epsilon/2, \rho(G)) \right] \]
bidders are sufficient so that, with probability at least \(1-\delta\), for all functions \(g \in G\) we have \(|g(S_1) - g(S_2)| \le \epsilon n\).

Using standard results from learning theory [1], one can bound the size of the \(\epsilon\)-cover using notions such as the fat-shattering dimension. However, for the special case of attribute auctions, we will get better bounds; see Section 4.2.

4. ATTRIBUTE AUCTIONS

We begin by instantiating the results in Section 3 for market pricing auctions, and then we give an analysis for general pricing functions over the attribute space that improves on the bounds of Section 3.

4.1 Market Pricing

For attribute auctions, one natural class of comparison functions consists of those that partition bidders into markets in some simple way and then apply a separate price in each market. For example, suppose we define \(G_k\) to be the set of functions that choose \(k\) bidders \(b_1, \ldots, b_k\), use these as cluster centers to partition the entire set \(S\) into \(k\) markets based on distance in attribute space to the nearest center, and then offer a fixed price in each market. In that case, if we discretize prices to powers of \((1+\epsilon)\), then clearly the number of functions in \(G_k\) is at most \(n^k (\log_{1+\epsilon} h)^k\), so Theorem 2 implies that so long as
\[ n \ge \frac{8h}{\epsilon^2} \left[ \ln(2/\delta) + k \ln n + k \ln\left(\log_{1+\epsilon} h\right) \right] \]
and we can solve the algorithmic problem, then with probability at least \(1-\delta\) we can get profit at least \((1-\epsilon)\,\mathrm{OPT}_{G_k}\).

Another interesting and general way to do market pricing is the following. Let \(C\) be a class of subsets of \(X\), which we will call feasible markets. For \(k\) a positive integer, we consider \(F_{k+1}(C)\) to be the set of all pricing functions of the following form: pick \(k\) disjoint subsets \(s_1, \ldots, s_k\) from \(C\), and \(k+1\) prices \(p_0, \ldots, p_k\) discretized to powers of \(1+\epsilon\). Assign price \(p_i\) to bidders in \(s_i\), and price \(p_0\) to bidders not in any of \(s_1, \ldots, s_k\). For example, if \(X = \mathbb{R}^d\), a natural \(C\) might be the set of axis-parallel rectangles in \(\mathbb{R}^d\). The specific case of \(d = 1\) was studied in [3].

We can apply the results in Section 3 by using the machinery of VC-dimension to count the number of distinct such functions over any given set of bidders \(S\). In particular, let \(D = \mathrm{VCdim}(C)\) be the VC-dimension of \(C\) and assume \(D < \infty\). Define \(C[S]\) to be the number of distinct subsets of \(S\) induced by \(C\). Then Sauer's Lemma [1] states that
\[ C[S] \le \left(\frac{en}{D}\right)^{D}, \]
and therefore the number of different pricing functions in \(F_k(C)\) over \(S\) is at most
\[ \left(\log_{1+\epsilon} h\right)^{k} \left(\frac{en}{D}\right)^{kD}. \]
Thus, applying Theorem 2 here, we get:

Corollary 5. Given a \(\beta\)-approximation algorithm \(A\) for optimizing over \(G = F_k(C)\), so long as \(\mathrm{OPT}_G \ge \beta n\) and the number of bidders \(n\) satisfies
\[ n \ge \frac{16h}{\epsilon^2} \left[ \ln\frac{2}{\delta} + k \ln\left(\frac{1}{\epsilon}\ln h\right) + kD \ln\left(\frac{4kh}{\epsilon^2}\right) \right] \]
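then with probability at least \(1-\delta\), the profit of \(\mathrm{RSOPF}_{(G,A)}\) is at least \((1-\epsilon)\,\mathrm{OPT}_G/\beta\).

Corollary 5 gives a guarantee on the revenue of \(\mathrm{RSOPF}_{(F_k(C),A)}\) so long as we have enough bidders \(n\). In the following, for \(k \ge 0\), denote \(\mathrm{OPT}_k = \mathrm{OPT}_{F_k(C)}\). We can also show a bound that holds for all \(n\), but with an additive loss term, as follows (we assume for simplicity here that \(\beta = 1\)):

Theorem 6. For any given value of \(n\), \(k\), \(\epsilon\), and \(\delta\), with probability at least \(1-\delta\), the revenue of \(\mathrm{RSOPF}_{(F_k(C),A)}\) is at least
\[ (1-\epsilon)\,\mathrm{OPT}_k - h \cdot r_F(k, D, h, \epsilon, \delta), \]
where \(r_F(k, D, h, \epsilon, \delta) = O\left(\frac{kD}{\epsilon^2} \ln\left(\frac{kDh}{\epsilon\delta}\right)\right)\).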
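Finally, we can extend our results to the setting of structural risk minimization, where we want the algorithm to optimize over \(k\), by viewing the additive loss term as a penalty function.

Theorem 7. Let \(\bar{G}\) be the sequence of pricing function classes \(F_1(C), F_2(C), \ldots, F_n(C)\), and let \(\mathrm{pen}(F_k(C))\) be defined appropriately. Then for any value of \(n\), with probability at least \(1-\delta\), the revenue of \(\mathrm{RSOPF\text{-}SRM}_{(\bar{G},\mathrm{pen})}\) is at least
\[ \max_k \Big( (1-\epsilon)\,\mathrm{OPT}_k - h \cdot r'_F(k, D, h, \epsilon, \delta) \Big), \]
where
\[ r'_F(k, D, h, \epsilon, \delta) = O\left(\frac{kD}{\epsilon^2} \ln\left(\frac{kDh}{\epsilon\delta}\right)\right) \]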
4.2 General Pricing Functions over the Attribute Space

In this section we generalize the results of Section 4.1 in two ways: first, to general classes of pricing functions (not just functions defined over markets), and second, by removing the need for discretization (note that we could use the results in Section 3, but by using the structure of the problem we show here how to get better bounds). For example, we might want to consider a comparison class of linear functions over the attributes, or quadratic functions, or perhaps functions that divide the space into markets and are linear (rather than constant) in each market.

Assume that \(X \subseteq \mathbb{R}^d\), and let \(G\) be a class of pricing functions over the attribute space \(X\). For \(g \in G\) let \(\rho_g : X \times [1,h] \to \mathbb{R}\) be its associated profit function, and let \(\rho(G)\) denote the class of profit functions corresponding to \(G\). Let \(\mathrm{OPT}_G = \mathrm{OPT}(S, G)\) be the profit of the optimal pricing function in \(G\) over \(S\). Now, let \(G_d\) be the class of decision surfaces (in \(\mathbb{R}^{d+1}\)) induced by \(G\): that is, to each \(g \in G\) we associate the set of all \((x, v) \in X \times [1,h]\) such that \(g(x) \le v\). Finally, let \(D = \mathrm{VCdim}(G_d)\), and assume in the following that \(D < \infty\). Then one can prove the following (see [2]):

Theorem 8. Given a class \(G\) and a \(\beta\)-approximation algorithm \(A\) for optimizing over \(G\), so long as \(\mathrm{OPT}_G \ge \beta n\) and the number of bidders \(n\) satisfies
\[ n \ge \frac{64h}{\epsilon^2} \left[ \ln\frac{2}{\delta} + D \ln\left( \frac{64h}{\epsilon^2} \left( \frac{16}{\epsilon} \ln h + 1 \right) \right) \right] \]
then with probability at least \(1-\delta\), the profit of \(\mathrm{RSOPF}_{(G,A)}\) is at least \((1-\epsilon)\,\mathrm{OPT}_G/\beta\).

5. COMBINATORIAL AUCTIONS

For the case of combinatorial auctions described in Section 2.1, where we want to achieve revenue nearly as high as the best set of item prices, we can directly apply Theorem 2. Specifically, let \(G\) be the class of item prices, discretized to powers of \((1+\epsilon)\). Then we have:

Corollary 9. Given a \(\beta\)-approximation algorithm \(A\) for optimizing over \(G\), so long as \(\mathrm{OPT}_G \ge \beta n\) and the number of bidders \(n\) satisfies
\[ n \ge \frac{8h}{\epsilon^2} \left[ m \ln\left(\log_{1+\epsilon} h\right) + \ln(2/\delta) \right] \]
then with probability at least \(1-\delta\), the profit of \(\mathrm{RSOPF}_{(G,A)}\) is at least \((1-\epsilon)\,\mathrm{OPT}_G/\beta\).

Auctions for sponsored search are combinatorial in nature. Often several advertisements are shown with the outcome of a search, and advertisers may have a preference over the relative position of their ad. Furthermore, an advertiser might also have their ad shown on searches for several different keywords and may have a preference over the keywords. Item pricing is natural for these settings and the results above apply.

6. CONCLUSIONS

In this work we have made the connection between machine learning and mechanism design explicit. In doing so, we obtain a unified approach to considering a variety of profit-maximizing mechanism design problems, including many that have been previously considered in the literature. These results are particularly relevant to designing good pricing mechanisms for sponsored search.

7. REFERENCES

[1] M. Anthony and P. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.
[2] M.-F. Balcan, A. Blum, J. Hartline, and Y. Mansour. Mechanism design via machine learning. Technical Report CMU-CS-05-143, 2005.
[3] A. Blum and J. Hartline. Near-Optimal Online Auctions. In Proc. 16th Symp. on Discrete Alg. ACM/SIAM, 2005.
[4] A. Fiat, A. Goldberg, J. Hartline, and A. Karlin. Competitive Generalized Auctions. In Proc. 34th ACM Symposium on the Theory of Computing. ACM Press, New York, 2002.
[5] A. Goldberg and J. Hartline. Envy-Free Auction for Digital Goods. In Proc. of 4th ACM Conference on Electronic Commerce. ACM Press, New York, 2003.
[6] A. Goldberg, J. Hartline, A. Karlin, M. Saks, and A. Wright. Competitive auctions and digital goods. Games and Economic Behavior, 2002. Submitted for publication. An earlier version available as InterTrust Technical Report STAR-TR-99.09.01.
[7] A. Goldberg, J. Hartline, and A. Wright. Competitive Auctions and Digital Goods. In Proc. 12th Symp. on Discrete Algorithms, pages 735–744. ACM/SIAM, 2001.
[8] J. Hartline and A. Goldberg. Competitive auctions for multiple digital goods. In ESA, 2001.
[9] V. Guruswami, J. Hartline, A. Karlin, D. Kempe, C. Kenyon, and F. McSherry. On Profit-Maximizing Envy-Free Pricing. In Proc. 16th Symp. on Discrete Alg. ACM/SIAM, 2005.
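
As a rough numerical companion to Theorem 2 and Corollary 9, the following Python sketch evaluates the stated lower bounds on the number of bidders. It only transcribes the two formulas; the function names and the choice to round up with `math.ceil` are assumptions made for illustration.

```python
import math


def theorem2_bidders(h, eps, delta, class_size):
    """Smallest integer n with n >= (8h / eps^2) * ln(2|G| / delta)."""
    return math.ceil(8 * h / eps ** 2 * math.log(2 * class_size / delta))


def corollary9_bidders(h, eps, delta, m):
    """Smallest integer n with
    n >= (8h / eps^2) * (m * ln(log_{1+eps} h) + ln(2 / delta)),
    for item pricing over m items with prices in powers of (1 + eps).
    """
    levels = math.log(h) / math.log(1 + eps)  # log base (1 + eps) of h
    return math.ceil(
        8 * h / eps ** 2 * (m * math.log(levels) + math.log(2 / delta))
    )
```

The two functions agree when `class_size` is set to `levels ** m`, which reflects how Corollary 9 follows from Theorem 2 by counting discretized item-price vectors. The bound grows linearly in \(h\) and in \(m\), and as \(1/\epsilon^2\) up to logarithmic factors.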