Sunday, December 2, 12
Beef jerky underwear is reasonably popular, apparently. We’re on track to sell between $800MM and $900MM in goods this year. This makes us about as big as Hot Topic.

OCTOBER 2012
1.5 billion page views
55 million unique visitors
USD $83 million in transactions
4.2 million items sold
http://www.etsy.com/blog/news/?s=weather+report
We had about 1.5B page views in October, which makes us a reasonably large website.

Tons of active A/B tests and rampups.
Here’s a screenshot of an internal view of the various tests and config rampups running on just one of our pages. As you can see, there are a whole lot of them.
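Rampups like these are commonly implemented by hashing a user ID into a stable bucket and comparing it against the configured percentage. The sketch below illustrates that general technique only; it is not Etsy’s actual implementation, and the function names, the SHA-256 choice, and the test name `new_tabs` are all assumptions made for the example.

```python
import hashlib


def bucket(user_id: str, test_name: str) -> float:
    """Map a (test, user) pair to a stable number in [0, 1).

    Hypothetical helper: hashing the test name together with the user ID
    keeps assignments independent from one test to the next.
    """
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    # The first 8 hex digits form a 32-bit integer; dividing by 2**32
    # scales it into [0, 1).
    return int(digest[:8], 16) / 2**32


def is_enabled(user_id: str, test_name: str, rampup_percent: float) -> bool:
    """Return True if this user falls inside the current ramp-up percentage."""
    return bucket(user_id, test_name) * 100 < rampup_percent


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    for percent in (1, 10, 50):
        share = sum(is_enabled(u, "new_tabs", percent) for u in users) / len(users)
        print(f"{percent}% ramp-up -> {share:.1%} of users enabled")
```

Because each user’s bucket is stable, raising the percentage only adds users: nobody who already sees the change loses it as the ramp-up grows.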

We’ve invested plenty of time and effort into tooling to support this work. This is a screenshot of our A/B analyzer, which automatically generates a dashboard with important business metrics for every configured test.

We’ve built tools that protect us from some gnarly statistics. This wizard does the math for you and lets you know how long an experiment will need to run in order to have a significant result.
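The talk does not say what math the wizard runs. As a rough sketch of the kind of calculation involved, the code below uses the standard normal-approximation formula for comparing two conversion rates. The function names, the default significance level and power, and the even traffic split are assumptions for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline: float, expected: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Visitors needed in each group to detect a change from baseline to expected.

    Normal approximation for a two-sided test comparing two proportions.
    """
    if baseline == expected:
        raise ValueError("baseline and expected rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = baseline * (1 - baseline) + expected * (1 - expected)
    effect = expected - baseline
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def days_to_run(baseline: float, expected: float,
                daily_visitors_in_test: int, **kwargs) -> int:
    """Days needed, assuming traffic is split evenly between two groups."""
    total = 2 * sample_size_per_group(baseline, expected, **kwargs)
    return math.ceil(total / daily_visitors_in_test)


if __name__ == "__main__":
    # Hypothetical numbers: a 10% baseline rate and a hoped-for 12%.
    print(sample_size_per_group(0.10, 0.12), "visitors per group")
    print(days_to_run(0.10, 0.12, daily_visitors_in_test=1_000), "days")
```

The required sample grows with the inverse square of the effect size, so halving the lift you hope to detect roughly quadruples how long the experiment has to run.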

Continuous Experimentation
Small, measurable changes.
Keeps us honest.
Prevents us from breaking things.
I’m going to call what we do “continuous experimentation,” for lack of a better term. We try to make small changes as much as possible, and we measure those changes so that we stay honest and don’t break the site.

Seung yun Yoo
Seoul, South Korea
knifeinthewater.etsy.com
http://www.etsy.com/blog/en/2012/featured-shop-knife-in-the-water/
So what do I mean by “breaking the site”? Well, behind every Etsy shop is a person who depends on it and counts on us not to push changes that hurt their business. So we would be remiss not to measure our changes.

Etsy Sales: Two Scenarios
Good product release
Awful product release
The second reason we measure product releases is so that we stay honest. Most of Etsy’s sales are seller-driven, so our graphs currently tend to go up no matter what. Obviously that can’t continue forever. So we have to use A/B testing to tell whether we’ve made things worse or better.

“When I am comparison shopping, I open items in new tabs. We should do that on Etsy.”
- Typical know-it-all Etsy employee
Let me give you an example. A few years ago there was controversy internally at Etsy over whether or not items should open up in new tabs. Some Etsy employees do this themselves when they’re digging through search results, and they wished that it happened by default. They thought that the average user would be happier if this were the case.

So we eventually stopped arguing about this and just tried it. We ran an A/B test that opened up items in new tabs.

The Horrible Sound of Epic Failure
credit: EmbroideryEverywhere.etsy.com
When we tried that, 70% more people gave up and left the site after getting a new tab. Maybe some Etsy employees know how to use tabs in a browser, but my grandmother doesn’t. We’ve replicated this result more than once.

One big thing we’ve learned from experiments:
We’ve been at this for a while, and one of the main things we’ve learned, which is the main thing I want to talk about today,

Design and product process must change to accommodate experimentation.
is that process has to change to accommodate data and experimentation. If you follow a waterfall process and try to bolt A/B testing onto it, you will fail.

Infinite Scroll                   Removing the Search Dropdown
Monolithic release.               Multi-stage release.
Effort up front.                  Iterative.
Changes many things at once.      One thing at a time.
A/B test as a hurdle.             A/B testing integral to process.
Assumptions.                      Hypotheses.
These were two projects done largely by the same team. Infinite scroll was poorly managed, and a release removing a dropdown in our site header was well managed.

Woah
If anyone doesn’t know what I mean by infinite scroll: I mean that we changed search results so that as you scroll down, more items load in, indefinitely.

Seeing more items faster is presumed to be a better experience.
The reason we did this was that we thought it obvious that more items, faster, was a better experience. There’s a lot of web lore out there to that effect, based mostly on some findings Google has made in their own search.

Infinite Scroll: Release Plan (Implied)
1. Build infinite scroll.
2. Fix some bugs.
3. A/B to measure obvious big improvement.
4. Rent warehouse.
5. Hold release party in warehouse.
So when we decided to do this we just went for it. We designed and built the feature, and then we figured we’d release it and it’d be great.

Initial reaction: “something’s broken.”
The first thing that occurred to us was that there must have been bugs in the product that we missed. So we spent a month trying to figure out if that was the case. We sliced results by browser and geographic location. We sent a guy to a public library to try using an ancient computer. We did find some bugs, but none of them changed the overall results.

Gradual, horrible realization: “we changed many things at once.”
Eventually we came to terms with the fact that infinite scroll had made the product worse, and we had changed too many things in the process to have any clue which was the culprit.

Premise-validating Experiments
Or: “things we should have done in the first place.”
So, we were in a situation where we weren’t sure if we should continue working on this or not. Even if we had issues in IE or something, the behavior of people using Chrome wasn’t way better; it was also worse. How do we know if it’s a good idea to finish this or not?
So we went back and tried to verify that the premises that made us do this were right.

Are more items in search results better?
Barely, maybe: more people get to an item page as the result count increases.
Absolutely no change in purchases.
And the answer was yes, maybe a little bit, but only barely. There was a very slight improvement in the number of people who ever got to an item page. But the effect is very slight, and purchases aren’t sensitive to this. There’s no increase in purchases when we increase the number of search results.
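The talk doesn’t say how these comparisons were evaluated. One common way to check whether a difference in purchase rate between two variants is more than noise is a two-proportion z-test, sketched below. The function name and every count in the example are invented for illustration and are not Etsy data.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_* are conversion counts, n_* are visitor counts. Assumes the pooled
    rate is strictly between 0 and 1; otherwise the standard error is zero.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: control shows fewer search results, variant shows more.
    z, p = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=505, n_b=10_000)
    print(f"z = {z:.2f}, p = {p:.2f}")
```

A large p-value here means the data are consistent with no difference between the variants, which is how a result like “no change in purchases” would show up.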

Are faster results better?
Meh
Absolutely nothing happened. Which isn’t to say that performance is pointless, but people buying items don’t seem to be sensitive to performance at all.

Infinite Scroll: Release Plan (Implied)
1. Build infinite scroll. (Lots of work)
2. Fix some bugs.
3. A/B to measure obvious big improvement.
4. Rent warehouse.
5. Hold release party in warehouse. (Didn’t happen)
So if we go back to our “product plan,” we see a couple of major things wrong with it. We did a lot of work, and it was pointless.

A Slightly Better Infinite Scroll Release Plan
1. Validate premise: more items is better (easy)
2. Validate premise: faster is better (easy)
3. Either:
   A. Abort! (easy)
   B. Build infinite scroll (hard).
A better way to have done this would have been to validate those premises ahead of time and then make the call. But we didn’t do that.

Throwing out work sucks.
Throwing out work feels really horrible. Most of the time this is a really difficult choice to make, and without a lot of honesty and discipline, most teams aren’t going to do it. We are not very rational creatures in the face of sunk costs.

So, the first thing we had to address was the fact that the dropdown was used to cut the marketplace by different item types.

HYPOTHESIS: Most users of the site don’t know anything about this.
We were working from a hypothesis that most people using Etsy don’t even notice this. But again, we had to verify this.

First we introduced this faceting on the left side of search results, and made it more obvious. This was relatively simple, and it was an improvement over the old design that nobody used.

But still, relatively few people noticed that. So we also built faceting into our autosuggest. We made it possible to drill down into categories as you typed.

Sales of Vintage Items: +3.7%
After we did this, sales of vintage items without the dropdown in place increased almost 4%. So we increased the ability of buyers on Etsy to find vintage goods; we didn’t decrease it. Which is a great thing to be able to tell our community.

VERIFIED HYPOTHESIS: Casual users of the site don’t know anything about this.
So we were right. Most people using the site in fact did not know how to use the dropdown for this.

...plus five or ten other things on the same scale.
So you more or less get the idea here. We had a big goal, which could have been unmanageable as a single release. We did it as ten or fifteen small releases.

Infinite Scroll
Design, Develop, Measure: ಠ_ಠ
Dropdown Redesign
Design, Develop, Measure: ✓ ✓ ✗ ✓ ...
Contrasting the two release plans, infinite scroll was a big bet that didn’t work out. The dropdown redesign was a series of small bets: some worked and some didn’t, but we didn’t have to throw out everything when things didn’t work.

This will not always work.
Occasionally, you may need to make big bets on redesigns.
This is not always going to work: you may still have to make big bets on big redesigns sometimes.