April 5, 2013

On the eve of the Programmatic IO and Ad Tech conferences in SF, I want to share my idea for a new design feature for Exchanges/SSPs, one with the potential to significantly impact our industry. This feature is the bid auction rule.

The bid auction rule is known to be central to Search Engine Marketing. Google’s success story underscores how important a role it can play in shaping the process and dynamics of a marketplace. There is reason to believe it has similar potential for the RTB Exchange industry.

The auction model currently implemented in Ad Exchanges and SSPs is commonly known as the Vickrey auction, or second-price auction. It goes like this: upon receiving a set of bids, the exchange decides on a winner based on the highest bid amount, and sets the price to the second-highest bid amount. In the RTB Bid Process diagram below, this auction rule is indicated by the green arrow #7:
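In code, the plain second-price rule is tiny. A minimal sketch (the function and bidder names are mine, not any exchange’s actual API):

```python
# Minimal sketch of a second-price (Vickrey) auction:
# the highest bid wins, but pays the second-highest bid amount.

def second_price_auction(bids):
    """bids: list of (bidder_id, bid_amount); returns (winner_id, clearing_price)."""
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids, key=lambda b: b[1], reverse=True)
    winner_id = ranked[0][0]
    clearing_price = ranked[1][1]  # the runner-up's bid sets the price
    return winner_id, clearing_price

winner, price = second_price_auction([("dsp_a", 2.50), ("dsp_b", 4.00), ("dsp_c", 1.75)])
# dsp_b wins at a price of 2.50
```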

The new auction I’d like to propose is a familiar one: a modified Vickrey auction with a quality score. Here, the bid quality score is defined as the quality of an ad to the publisher, aside from the bid price. It essentially captures everything a publisher may care about in an ad. I can think of a few factors:

Ad transparency and related data availability

Ad quality (adware, design)

Ad content relevancy

Advertiser and product brand reputation

User response

Certainly, bid quality scores are going to be publisher specific. In fact, they can be made site-section specific or page specific. For example, a publisher may have reason to treat the Home Page of their site differently than other pages. The scores can also vary by user attributes if the publisher likes.

Given that, the Exchange/SSP will no longer be able to carry out the auction all by itself – as the rule no longer depends only on bid amounts. We need a new processing component, as shown in the diagram below.

Now, #7 is replaced with this new component, called Publisher Decider. Owned by the publisher, the decider works through the following steps:

takes in multiple bids

calculates the bid quality scores

for each bid, calculates the Total Bid Score (TBS) by multiplying bid amount and quality score

ranks the set of bids by TBS

makes the bid with the highest TBS the winner

sets the bid price based on the formula below, as made famous by Google

The formula is P1 = (B2 × Q2) / Q1. Here, P1 is the price set for the winning bid and Q1 is its bid quality score; B2 and Q2 are the bid amount and bid quality score of the bid with the second-highest TBS.
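Putting the steps and the pricing formula together, here is a minimal sketch of the Publisher Decider (the names and numbers are illustrative, not any vendor’s actual API):

```python
# Sketch of the Publisher Decider: rank bids by Total Bid Score
# (TBS = bid amount x quality score), then price the winner just high
# enough to beat the runner-up's TBS: P1 = B2 * Q2 / Q1.

def publisher_decider(bids, quality):
    """bids: {bidder: bid amount}; quality: {bidder: quality score}."""
    tbs = {b: amount * quality[b] for b, amount in bids.items()}
    ranked = sorted(tbs, key=tbs.get, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    price = bids[runner_up] * quality[runner_up] / quality[winner]
    return winner, price

# Toy example: a $1 bid with quality 10 beats a $5 bid with quality 1,
# and the winner pays 5 * 1 / 10 = $0.50 CPM.
winner, price = publisher_decider({"brand": 1.00, "adware": 5.00},
                                  {"brand": 10.0, "adware": 1.0})
```

Note the pricing keeps the second-price incentive structure: the winner pays the minimum it would have needed to bid to still win, given its own quality score.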

This is not a surprise and it’s not much of a change. So, why is this so important?

Well, with this new auction rule in place, we can anticipate some natural impacts:

Named brands will have an advantage on bid price, because they tend to have better quality scores. A premium publisher may be more willing to take $1 CPM from Apple than $5 CPM from a potential adware ad. This is achieved by Apple having a quality score five times (or more) higher than that of the low-quality ad.

Advertisers will have an incentive to be more transparent. Named brands will be better off being transparent, to distinguish themselves from others. This will drive the quality scores of non-transparent ads lower, starting a virtuous cycle.

DSPs and bidders will have no reason not to submit multiple bids, for they won’t be able to know beforehand which ad will win.

Premium publishers will have more incentive to put their inventory into the exchange, now that they have transparency and a finer level of control.

The Ad Tech eco-system will respond with new players, such as ad-centric data companies serving publisher needs, similar to the contextual companies serving advertisers.

You may see missing links in the process I described here. That is expected, because a complete picture is not the focus of this writing. I hope you are convinced that the bid quality score / Publisher Decider is interesting, and can have significant impact by pushing the Ad Tech space toward more unified technologies and a consistent framework.

April 11, 2012

Few topics are as near and dear to my heart as Attribution Modeling. I first bumped into it more than four years ago; my first written piece on attribution was a LinkedIn Q&A answer to a question from Kevin Lee on duplication rate (in August 2007). Since then, my interest in attribution got seriously real, resulting in a dozen attribution-related blog posts. The interest never died, although I have not written anything in the last three years.

I am back on it with a vengeance! Consider this as my first one back.

I want to start on a gentle note though. I am amused that people are still debating First Touch vs. Last Touch attribution as viable attribution models – a debate that, in my opinion, should be out of the question. I want to share some funny analogies for what could go wrong with them.

Starting with the Last Touch attribution model, a football analogy goes like this: “relying solely on a last click attribution model may lead a manager to sack his midfielder for not scoring any goals. Despite creating countless opportunities he gets no credit as his name isn’t on the score-sheet. Similarly a first click attribution model may lead the manager to drop his striker for not creating any goals, despite finishing them.” – BrightonSEO presentation slides

There are a lot of good analogies like this derived from team sports. This one applies not only to Last Touch, but to all single-touch-point attribution models. The funniest one I heard is about First Touch attribution, from none other than the prolific Avinash Kaushik: “first click attribution is like giving his first girlfriend credit for his current marriage.” – Avinash quote

An analogy is just an analogy; it does not do full justice to what’s being discussed. However, we should learn at least this much: if your attribution model is based solely on the sequencing order of touch points, you are wrong. Those who propose Last, First, Even, Linear or whatever attribution models, watch out!

A good attribution model needs a disciplined development process – better yet, a data-driven one. The fewer the assumptions made about the values of touch points, the better; we should learn to let empirical evidence speak for itself.

April 28, 2009

Who was the first to report the Mexico City earthquake? I remember watching Twitter second by second, and @cjserrato was the first to report the earthquake (the tweet id is 1630381373):

Mining Twitter data is a huge challenge. So far I have not seen much interesting data/text mining and analytics around Twitter data. I have been playing with the data lately, and here’s a thematic/topic graph I made – a visualization of all tweets from the last eight hours related to “mexico city”:

You can tell that “Swine Flu” is still at the center of all topics, whereas the earthquake is clustered alone to the side.

Have you seen any interesting Twitter analytics (by the way, I do not mean the Twitter metrics or counters, etc.)?

For marketing and advertising, the attribution problem normally starts at the macro level: we have total sales/conversions and marketing spend. Marketing Mix Modeling (MMM) is the commonly used analytics tool, providing a solution using time series data of these macro metrics.

The MMM solution has many limitations that are intrinsically linked to the nature of the (macro level) data being used. Micro attribution analytics, when micro-level touch point and conversion tracking are available, provides a better attribution solution. Sadly, MMM is more often practiced even when the data for micro attribution is available; this is primarily due to the lack of development and understanding of micro attribution analytics, particularly the model-based approach.
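To make the contrast concrete, here is a minimal sketch of the macro-level MMM idea: a one-channel ordinary-least-squares fit of weekly sales against weekly spend (the numbers are made up for illustration):

```python
# Toy MMM: regress weekly total sales on weekly spend (pure-Python OLS).
spend = [10.0, 20.0, 15.0, 30.0, 25.0]       # weekly marketing spend
sales = [130.0, 160.0, 145.0, 190.0, 175.0]  # weekly total sales

n = len(spend)
mean_x = sum(spend) / n
mean_y = sum(sales) / n
beta = (sum((x - mean_x) * (y - mean_y) for x, y in zip(spend, sales))
        / sum((x - mean_x) ** 2 for x in spend))
alpha = mean_y - beta * mean_x
# beta: incremental sales per unit of spend; alpha: baseline sales
```

Because the fit sees only aggregate spend and sales, it can say nothing about individual touch-point sequences – which is exactly the gap micro attribution fills.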

There have been three types, or better yet, three generations of micro analytics over the years: the tracking-based solution, the order-based solution and the model-based solution.

The tracking-based solution has been popular in the multi-channel marketing world. The main challenge here is to figure out through which channel a sale or conversion event happens. The book Multichannel Marketing – Metrics and Methods for On and Offline Success by Akin Arikan is an excellent source of information on the most often used methodologies – covering customized URLs, unique 1-800 numbers and many other cross-channel tracking techniques. Tracking is normally implemented at the channel level, not at the individual event level. Without a tracking solution, the sales numbers by channel are inferred through MMM or other analytics. With proper tracking, the numbers are directly observed.

The tracking solution is essentially a single-touch attribution approach to a multi-touch attribution problem. It does not deal with the customer-level multi-touch experience. This single-touch approach leads naturally to the last-touch-point rule when viewed from a multi-touch attribution perspective. Another drawback is that it is simply a data-based solution without much analytical sophistication behind it – it provides relationship numbers without a strong argument for causal interpretation.

The order-based solution explicitly recognizes the multi-touch nature of an individual consumer’s experience with brands and products. With the availability of micro-level touch point and conversion data, order-based attribution generally seeks attribution rules in the form of a weighting scheme based on the order of events. For example, when all weights are zero except the last touch point, it simply reduces to last-touch attribution. Many such rules have been discussed, with constant debate about the virtues and drawbacks of each and every one of them. There are also derived metrics based on these low-level order-based rules, such as the appropriate attribution ratio (Eric Peterson).

Despite the many advantages of the order-based multi-touch attribution approach, there are still methodological limitations. One of them, as many already know, is that there is no weighting scheme that is generally applicable, or appropriate for all businesses under all circumstances. There is no point in arguing which rule is best without the specifics of the business and data context. The proper rule should differ depending on the context; however, there is no provision or general methodology for how the rule should be developed.

Another limitation of the order-based weighting scheme is that, for any given rule, the weight of an event is determined solely by the order of events and not the type of event. For example, one rule may specify the first click getting 20% attribution – when it may be more appropriate to give the first click 40% attribution if it is a “search” and 10% if it is a “banner click-through”.
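Here is a sketch of how a rule could combine position and event type; the specific weights and multipliers are hypothetical, just to make the mechanics concrete:

```python
# Order-based attribution with an event-type adjustment: position
# weights are scaled by a per-type multiplier, then renormalized so
# the credit shares sum to 1.

def attribute(path, position_weights, type_multiplier):
    """path: ordered list of event types on a converting path
    (len(path) must equal len(position_weights))."""
    raw = [position_weights[i] * type_multiplier[event]
           for i, event in enumerate(path)]
    total = sum(raw)
    return [w / total for w in raw]

# 40/20/40 position weights; "search" touches count double
shares = attribute(["search", "banner", "banner"],
                   [0.4, 0.2, 0.4],
                   {"search": 2.0, "banner": 1.0})
# the first touch (a search) now earns the largest share of credit
```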

Because its rule development process is intuition-based, it lacks a rigorous methodology to support any causal interpretation – which is central to correct attribution and operational optimization.

Here comes the third generation of attribution analytics: model-based attribution. It promises a sound modeling process for rule development, and provides the analytical rigor for finding relationships that can bear causal interpretation.

More details to come. Please come back to read the next post: a deep-dive example of model-based attribution.

(and these do not include: attribution theory, performance attribution etc. – less related to marketing and advertising.)

It is also interesting to read all the different rules and heuristics that have been proposed as solutions for attributing conversions to prior events by their order: first touch, last touch, mean attribution, equal attribution, exponentially weighted attribution, using engagement metrics as a proxy, looking at the ratio of first to last attribution, etc.

What about all the articles and posts talking about it? There are perhaps over 100 of them just making the point about how important it is to think about the multi-touch attribution problem and do some analysis – very interesting indeed.

I am sure that I missed some of the names and rule variations in the lists above. Please add whatever I missed in the comments to help me out.

The use of different terminologies creates some confusion – making it difficult to stay focused on the core methodology issue.

Please come back to read the next post on the three generations of attribution analytics.

Well, what could be wrong with the question? A standard setup for a conversion model is to use conversion as the dependent variable, with banner and search as predictors. The problem here is that we only have converter cases but no non-converter cases. We simply cannot fit a model at all. We need more data, such as users who clicked on the banner but did not convert.
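In data terms, the point is that the modeling table needs rows with a 0 label, not just converters. A toy illustration (the rows are made up):

```python
# Each row: (saw_banner, used_search, converted). A conversion model needs
# both 1-labeled and 0-labeled rows; with converters only, there is no
# variation in the dependent variable and nothing to fit.

rows = [
    (1, 1, 1),
    (1, 0, 0),   # clicked the banner but did not convert: the missing cases
    (0, 1, 1),
    (0, 0, 0),
    (1, 0, 1),
]

# With negatives present, we can at least estimate P(convert | saw banner):
banner_rows = [r for r in rows if r[0] == 1]
banner_cvr = sum(r[2] for r in banner_rows) / len(banner_rows)
```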

The sampling bias issue actually runs deeper than this. We want to know whether the coverage of banner and search is “biased” in the data we are using – for example, when banners ran nationally while search ran regionally. We also need to ask whether future campaigns will be run in ways similar to what happened before – the requirement that the modeling setup mimic the application context.

2) Encoding sequential patterns

The data for micro attribution naturally takes the form of a collection of events/transactions:

User1: banner_id time1.1, search_id time1.2, search_id time1.3, conversion time1.4
User2: banner_id time2.1
User3: search_id time3.1, conversion time3.2
Some may think this form of data makes predictive modeling infeasible. This is not the case. Much predictive modeling is done with transaction/event data – fraud detection and survival models, to name a couple. In fact, there are sophisticated practices in mining and modeling sequential patterns that go way beyond what is typically considered in attribution discussions. The simple message is: this area is well researched and practiced, and a great amount of related knowledge and expertise already exists.
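One common way to make such event streams model-ready is to flatten each user’s stream into fixed features; a minimal sketch (the feature names are my own):

```python
from collections import Counter

def featurize(events):
    """events: list of (event_type, timestamp) in time order for one user."""
    counts = Counter(event for event, _ in events)
    return {
        "n_banner": counts["banner"],
        "n_search": counts["search"],
        "first_is_banner": int(bool(events) and events[0][0] == "banner"),
        "converted": int(counts["conversion"] > 0),
    }

user1 = [("banner", 1.1), ("search", 1.2), ("search", 1.3), ("conversion", 1.4)]
features = featurize(user1)
# {'n_banner': 1, 'n_search': 2, 'first_is_banner': 1, 'converted': 1}
```

More sophisticated encodings (n-grams of event types, inter-event gaps, recency decay) follow the same pattern of reducing a variable-length sequence to model inputs.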

3) Separating model development from implementation processes

Again, common sense from the predictive modeling world can shed some light on how our web analytics industry should approach the attribution problem. All WA vendors are trying to figure out this crucial question: how should we provide data/tool services to help clients solve their attribution problem? Should we provide data, should we provide attribution rules, or should we provide flexible tools so that clients can specify their own attribution rules?

The modeling perspective says there is no generic conversion model that is right for all clients, very much like in Direct Marketing, where we all know there is no one right response model for all clients – even for clients in the same industry. Discover Card will have a different response model than American Express, partly because of differences in their targeted populations and services, and partly because of the availability of data. Web Analytics vendors should provide data sufficient for clients to build their own conversion models, but not build ONE standard model for all clients (of course, they can provide separate modeling services, which is a different story). Web Analytics vendors should also provide tools so that a client’s model can be specified and implemented once it has been developed. Given the parametric nature of conversion models, none of the tools from the current major Web Analytics vendors seems sufficient for this task.

That is all for today. Please come back to read the next post: conversion model – not what you want but what you need.

February 20, 2009

and to do things that could last forever .. the desire that used to be a synonym for ego is perhaps one of the most important, and subtle, forces behind why we do not see reality as it is when it is right in our face. It is perhaps the one force that comes so naturally to us in preventing us from going with the flow of nature.

I used to see this when it comes to spiritual matters, not knowing that this is so applicable to business as well.

December 18, 2008

Following up on one of my earlier posts about Google’s Achilles’ heel, I’d like to add that Google’s latest SearchWiki seems to be an interesting response to what I mentioned earlier — I know, I know, it is not quite like that 🙂

I love to count the many different ways of ranking stuff in response to a search query. The objects can be text, links, documents, images, videos, etc. The ranking principle is essentially a rule of relevance and/or similarity. I count four main types:

1) by content and link similarity; the algorithms could be PageRank, HITS, etc. For images and videos, this can prove very difficult because it involves not only hard-core technology such as pattern recognition for images, but also large stocks of prior knowledge about object categorization.

2) by similarity of user behavior, applying some kind of collective intelligence or collaborative filtering algorithm. User behavior can serve as implicit voting; with the algorithms’ help, the complexity of the ranking operation can be dramatically reduced.
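A toy sketch of this principle – item-based collaborative filtering over implicit “votes” (the users and items are invented):

```python
from math import sqrt

# users -> set of items they interacted with (implicit votes)
history = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "c"},
}

def cosine(item_x, item_y):
    """Cosine similarity between two items over binary user vectors."""
    users_x = {u for u, items in history.items() if item_x in items}
    users_y = {u for u, items in history.items() if item_y in items}
    if not users_x or not users_y:
        return 0.0
    return len(users_x & users_y) / sqrt(len(users_x) * len(users_y))

# rank candidate items for a user whose last interaction was with "a"
ranking = sorted(["b", "c"], key=lambda item: cosine("a", item), reverse=True)
# "b" ranks first: more users co-interacted with "a" and "b"
```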

3) by similarity of users’ explicit ratings – search phrases crossed with explicit ratings (ratings/reviews on Amazon, as well as Google’s latest SearchWiki, which interestingly only affects what that user sees next time, not what anyone else sees). Some type of social/collective intelligence algorithm has to be applied to solve the complexity issue, as well as the sparse data problem that arises when crossing search queries with user ratings.

4) of course, there is always the money logic.

If you know of more ranking logic than what’s posted here, I’d love to hear about it ..

August 9, 2008

the two seem on a collision course lately – this is really unfortunate! Boneheaded privacy advocates throw the baby out with the bathwater, whereas companies using BT fail to see the golden opportunity in better analytics technologies .. time for an “infocrypt behavioral profile”?

July 29, 2008

just a thought 🙂 There have been three search engine ranking principles at work: 1) by content match with the search query, 2) by user feedback (or social search) data on the query or similar queries, and 3) by bidding price. The logic used by Google AdWords is a complex combination of all three (relevancy, CTR and bid price).

For example, Amazon and Netflix represent the pure form of 2).

All three principles have their own merit and – here’s why this is important – many times one pure logic may match users’ intent better than a complicated mix.

Google’s ranking logic for AdWords has evolved over time, keeping a careful balance so far. But how far can it go? Will a dynamic logic that mixes the three in a significantly different way be a disruptive technology one day?