<h2>CDN RUM and Eggnog</h2>
<p>Kyle Young and Dr Chris Bildfell, 2014-12-22</p>
<div><p>Welcome to the Great Mobify CDN Shootout. We have pitted several major CDN providers against each other in a RUM test for global supremacy, a Royal RUMble if you will. We want to show you how you can get the most value from what is otherwise an expensive and opaque world of complexity and hand-waving.
<br><br><a href="http://www.mobify.com/blog/cdn-rum-2014/#results">TL;DR? Jump straight to the results.</a></p>
<h3>A Little Background</h3>
<p>&#8220;CD-what-now&#8221; you ask? A CDN, short for Content Distribution Network, is a fleet of servers set up at key points all over the globe to act as proxies for web traffic. Most major websites are configured to relay data through CDNs to take advantage of their caching ability, which helps content get to you faster and more reliably.</p>
<p>How does that work? Well, when a sysadmin loves a server, she will want to tell the world how to talk to it by giving it a name and publishing it as a DNS (Domain Name System) record, thereby letting everyone find it. DNS is basically the phone book of the internet. By way of a convoluted example, let&#8217;s say you bought a virtual server from Digital Ocean and called it &#8220;snowflake&#8221;. Aside from your half-baked web experiments, snowflake is also running grandma&#8217;s blog on some poorly-patched-PHP-backed-WYSIWYG-platform like WordPress, and her rum-frosting-brownie recipe just went viral. Congratulations! You&#8217;re going to end up with all of Reddit making requests, and poor snowflake is going to get DDoSed with love.</p>
<p>As it happens, CDNs can help. When you bought gran her domain name, you pointed it to snowflake. To keep poor snowflake alive, you stick the CDN between the DNS record and the server. Users now look up the IP address for gran&#8217;s blog and get the CDN&#8217;s IP instead of snowflake&#8217;s. The CDN in turn is configured to look back to snowflake for its content. (NB This is usually called &#8220;pulling&#8221; or &#8220;customer origin&#8221;, as opposed to &#8220;pushing&#8221; or just uploading the content directly to the CDN.) This way you can tell the CDN to remember (aka cache) your pages: &#8220;/rumfrosting&#8221; can be cached for, say, an hour, and all the JS and CSS files can be cached a full day. Instead of millions of requests per second crashing your server, the CDN will only occasionally poll for the up-to-date version, and serve the end-users itself, usually from a server much closer to their location. This shields snowflake from the great unwashed masses, and prevents your bandwidth quota from exploding.</p>
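<p>To make the caching rule concrete, here is a minimal Python sketch of an origin deciding which <code>Cache-Control</code> value to send. The function name and the file-extension check are assumptions for illustration only; the lifetimes are the one-hour and one-day figures from the example above.</p>

```python
# Hypothetical cache policy mirroring the example above:
# pages for an hour, JS and CSS for a full day.
ONE_HOUR = 60 * 60
ONE_DAY = 24 * ONE_HOUR


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for a request path."""
    if path.endswith((".js", ".css")):
        return f"public, max-age={ONE_DAY}"
    return f"public, max-age={ONE_HOUR}"


print(cache_control_for("/rumfrosting"))      # public, max-age=3600
print(cache_control_for("/static/site.css"))  # public, max-age=86400
```

<p>A pull-configured CDN that honours these headers would then re-fetch each URL from snowflake at most once per lifetime, per edge.</p>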
<p>For a discussion on Cache Headers and how to set them, I recommend the excellent <a href="http://www.mobify.com/blog/beginners-guide-to-http-cache-headers/">Beginner&#8217;s Guide to HTTP Cache Headers</a> reference article; I hear tell the author is both a gentleman and a scholar.</p>
<p>So, what&#8217;s our angle, and why does Mobify care so much about CDN performance? Well, in case this is the first you&#8217;ve heard of us, some of our core technology is served over CDNs and needs to be delivered as fast as possible. None of our techno-wizardry would matter if we couldn&#8217;t serve our assets quickly. Click around <a href="http://www.mobify.com">our site</a> to learn more!</p>
<h3>The Problem of Choice</h3>
<p>So now that you know why you&#8217;d want a CDN, and have some idea of how they work, let&#8217;s go about choosing one. Obviously, you want the cheapest provider with the performance fit for your needs, right? So, how exactly do you figure out who that is? Step into our lab and explore a few methods.</p>
<h4>The Hard Way: Modeling</h4>
<p>It would be good if you knew the distribution of your traffic&#8217;s asset sizes and frequency of their download. It would be impressive if, based on that, you were able to derive a cost model for your data profile. You&#8217;d get some kind of gold star if you were able to convince potential CDNs to give you the data on their edge performance and fit it to your model. Now wash, rinse, and repeat; 50 times.</p>
<p>Even if they knew, I&#8217;m not sure that game theory allows a CDN&#8217;s sales team to give you complete disclosure on all of these points, aside from maybe, &#8220;Australia? Oh yeah, we&#8217;re great in Australia!&#8221;. (Until very recently, this would have been a bald-faced lie; Australian CDN performance was laughable almost across the board.)</p>
<h4>The Sloppy Way: Reviews</h4>
<p>So what is one to do? There are many helpful review sites out there, such as <a href="http://www.cdnplanet.com/">CDNPlanet</a>, of which I suppose this article is now one. But truth be told, it can be difficult to find a review that fully exposes its methodology, or the types of assets and configurations it was considering. Also, new CDN vendors pop up every day. Colocation and Virtual Servers basically mean anyone can stand up a few instances of Nginx across the globe and slap a &#8220;CDN&#8221; sticker on the side. If you were feeling particularly audacious, a rack of Raspberry Pis powered by potato batteries could probably pass.</p>
<figure>
<img src="http://www.mobify.com/static/blog/2014/12/potato-matrix-cdn.jpg" alt="Potato Matrix CDN">
</figure>
<figcaption>
Fig 1. A boiled potato can provide 1.3V&#8212;we shall use four cells comprised of banks of parallel potatoes. They may reject the programme without the illusion of choice.
</figcaption>
<p><br></p>
<p>This means that review sites tend to be trailing indicators, and have a hard time staying relevant.</p>
<h4>The &#8220;Almost There&#8221; Way: Monitoring</h4>
<p>A good first instinct is to test all of these providers, and to see just how well they each stack up. Using tools like <a href="https://www.thousandeyes.com">ThousandEyes</a> or Dyn&#8217;s new <a href="http://dyn.com/dyn-internet-intelligence/">Internet Intelligence</a> suite you can do all sorts of poking and prodding to test network health all over the globe. Of course, in most cases, you&#8217;re not really interested in networks, hops, fibre links, or coming up with some integrated model of how it all balances out&#8212;unless you are, in which case, congratulations, this article has nothing more for you. What we really need to know is how all these parts combine to shape overall performance.</p>
<h4>The Best Way: RUM</h4>
<p>What we really want is Real User Monitoring, or RUM. Skipping over the pirate and drinking jokes, RUM tests let you see what actually happens when all of the variables are weighed and summed in the greatest lab of all: Reality. There&#8217;s no need to guess at the relative importance of this or that, just blackbox the whole thing, sit back, and pour yourself a glass of aged cane-sugar distillate.</p>
<p>The beauty of RUM is that it matches the distribution of your traffic exactly because it is your traffic. You can either measure everyone or just a sample, depending on your needs. Most CDNs will allow you to test their service by setting up a couple of assets for benchmarking. All you need is real users! Of course, if you don&#8217;t have those yet, I might humbly suggest that CDN selection is not your first priority.</p>
<h3>Methodology</h3>
<p>At Mobify, our greatest CDN questions are &#8220;how long does it take to download our JavaScript payload&#8221;, followed by &#8220;how long does it take to download resized images&#8221;. To test these parameters, we&#8217;re going to download a bundle-sized JavaScript file and an image on a small legion of devices all over the world, and measure their performance.</p>
<p>We&#8217;re only really interested in assets that reflect the median file sizes our users request; be sure to look at the distribution of your asset sizes when selecting appropriate test assets. For JavaScript, make sure it&#8217;s something that will be both inert and have the same compression ratio as your real payloads, since gzip handling is one of those things we&#8217;re interested in testing. Also, be sure to account for SSL by either indicating which bucket the content is downloaded under, or only downloading under HTTP or HTTPS.</p>
<p>A basic sketch of the test procedure on a client device is as follows:</p>
<ul>
<li>Step 1: Pick a CDN from the test pool.</li>
<li>Step 2: Record the start time.</li>
<li>Step 3: Download the test JavaScript asset.</li>
<li>Step 4: Record the time it took to download the JavaScript.</li>
<li>Step 5: Download the test image asset.</li>
<li>Step 6: Record the time it took to download the image.</li>
<li>Step 7: Fire off a tracking pixel reporting the timing data and the CDN chosen.</li>
</ul>
<p>Yes, I know, DNS lookup will happen for the JS asset and not the image. Owing to how this test actually gets analyzed, that&#8217;s not entirely relevant, for hand-wavy reasons that don&#8217;t really matter here.</p>
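<p>The steps above can be sketched as follows. The real test runs in the visitor&#8217;s browser; this Python version only mirrors the control flow. The hostnames, the asset paths, and the injectable <code>fetch</code> and <code>clock</code> parameters are all assumptions made for illustration.</p>

```python
import random
import time
from typing import Callable, Dict, Sequence


def run_trial(
    cdns: Sequence[str],
    fetch: Callable[[str], object],
    clock: Callable[[], float] = time.perf_counter,
    rng: random.Random = random.Random(),
) -> Dict[str, object]:
    """Run one timing trial against a randomly chosen CDN hostname."""
    cdn = rng.choice(list(cdns))                # Step 1: pick a CDN
    start = clock()                             # Step 2: record start time
    fetch(f"https://{cdn}/test-bundle.js")      # Step 3: hypothetical JS asset
    js_ms = (clock() - start) * 1000            # Step 4: JS download time
    start = clock()
    fetch(f"https://{cdn}/test-image.jpg")      # Step 5: hypothetical image
    img_ms = (clock() - start) * 1000           # Step 6: image download time
    # Step 7 (reporting) is left to the caller.
    return {"cdn": cdn, "js_ms": round(js_ms), "img_ms": round(img_ms)}
```

<p>Passing the fetcher and the clock in as arguments keeps the sketch testable without touching the network; in a browser you would use the platform&#8217;s own timers and resource loading instead.</p>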
<h4>Tracking Tracking Pixels</h4>
<p>So, we have data on our clients&#8217; devices that we want to report back, and we need a way to send it. The problem is that we have a tsunami of bits all trying to report in, and we may not have an endpoint that can handle all of that traffic; remember why we wanted a CDN in the first place?</p>
<p>The simplest, least fun, and most reliable way today is the humble tracking pixel. We wish it weren&#8217;t so; we really wanted to build some giant UDP-hacked-DNS system, or maybe something obscenely fast written in Golang with fleets of Docker instances all rising and terminating at our whim, but alas, &#8217;twas not to be. Not only does using a tracking pixel avoid the thousand compatibility issues of different web browsers, it also lets you leverage the awesome might of AWS as a cheap data warehouse and aggregation tool. We don&#8217;t even need to write any setup code.</p>
<p>First off, make a single-pixel transparent gif (or download one <a href="http://cdn.mobify.com/1x1.gif">here</a>), and host it on an S3 bucket. Next, serve that pixel through CloudFront, with logging enabled back to S3. You can request that tracking pixel with all the query parameters you want, like say timing data and test conditions: CloudFront will dutifully serve you back your 1x1.gif in a timely manner, and within an hour or so, you should have a log file in your S3 bucket reporting all of your added data. As a bonus, CloudFront will also tell you which of its edge nodes served the content, which will come in handy for rough geo-grouping later.</p>
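<p>As a rough illustration, the timing data can be packed into the pixel&#8217;s query string like so. The pixel hostname and the parameter names (<code>cdn</code>, <code>js_ms</code>, <code>img_ms</code>) are made up for this sketch; use whatever your own analysis expects.</p>

```python
from urllib.parse import urlencode

# Hypothetical CloudFront hostname serving the 1x1.gif from S3.
PIXEL_URL = "https://pixel.example.com/1x1.gif"


def pixel_url(result: dict) -> str:
    """Build the tracking-pixel URL that carries one trial's results."""
    # Sorting the keys keeps the URL stable, which makes log lines easier to diff.
    return f"{PIXEL_URL}?{urlencode(sorted(result.items()))}"


print(pixel_url({"cdn": "edgecast", "js_ms": 512, "img_ms": 640}))
# https://pixel.example.com/1x1.gif?cdn=edgecast&img_ms=640&js_ms=512
```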
<p>All that&#8217;s left to do is write a tool to download the logs, unzip them, aggregate the files, and derive meaningful metrics from the data. We leave this as an exercise for the reader.</p>
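<p>For readers who want a head start on that exercise, here is a minimal sketch of the parsing step. It assumes CloudFront&#8217;s standard access-log layout (gzipped, tab-separated, with a <code>#Fields:</code> header naming the columns) and the hypothetical <code>cdn</code>, <code>js_ms</code>, and <code>img_ms</code> query parameters; check both against your own logs before trusting it.</p>

```python
import gzip
from typing import Dict, Iterable, Iterator
from urllib.parse import parse_qs


def parse_log(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one record per pixel request that carries timing data."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Column names are space-separated after the "#Fields:" marker.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split("\t")))
        query = parse_qs(row.get("cs-uri-query", ""))
        if not all(key in query for key in ("cdn", "js_ms", "img_ms")):
            continue  # not one of our test beacons
        yield {
            "edge": row.get("x-edge-location"),
            "cdn": query["cdn"][0],
            "js_ms": int(query["js_ms"][0]),
            "img_ms": int(query["img_ms"][0]),
        }


def parse_gzip_log(path: str) -> Iterator[Dict[str, object]]:
    """Parse one downloaded, still-gzipped log file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from parse_log(handle)
```

<p>The <code>edge</code> value is the edge-location code mentioned above, which is what makes the rough geo-grouping possible.</p>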
<h3>Let&#8217;s Meet our Contestants</h3>
<p>First off, we&#8217;re sorry, but we probably didn&#8217;t test the CDN you wanted. There are lots and lots of CDNs out there, so we only run this test on the ones that have shown up on our radar.</p>
<p>Edgecast, Cloudfront, and MaxCDN are the vendors Mobify works with to deliver most of our services today, so we always include them in our tests to get a good baseline. Fastly is a promising up-and-comer in these tests, and has been making headlines all over the place for their strategic partnerships and integrations. You&#8217;ll also note the inclusion of ChinaCache and MileWeb; both of these providers operate within the Great Firewall of China. If you want to serve from edges on the mainland, you&#8217;ll need to go through one of them, or someone like them. More on that below.</p>
<h3>And the Winner Is&#8230;</h3>
<p>It should be noted that Mobify makes use of a user&#8217;s location to resolve DNS records. This lets us pick and choose CDNs based on where they perform the best, so we aren&#8217;t forced to rely on just one service the world over. Since most CDNs target European and North American performance, leaving markets like Brazil, India, Japan, Korea, and Australia with less coverage, we can use this trick to compensate. Being able to mix and match services helps keep performance up all over the globe. To take advantage of this, we test each of the regions separately. Here are some of the results for one of our North American tests.</p>
<h4>Median delivery times. How fast is fast?</h4>
<p>When selecting a winning provider in any given region we are looking for a service with consistently fast delivery times. To get an idea of the typical delivery times along with the consistency of these results we plot the median time and how it varies over the course of a week (figures 2 &amp; 3). We chose to show results for delivery times in North America, but you can also <a href="http://www.mobify.com/blog/cdn-rum-2014/#results">see our results for all regions here</a>. The dates shown are labelled at midnight (UTC).</p>
<figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_median_n_america.png" alt="North American Median JS Timing">
</figure>
<figcaption>
Fig 2. North American Median JS Timing<br>
</figcaption>
<p><br>
<figure>
<img src="http://www.mobify.com/static/blog/2014/12/img_median_n_america.png" alt="North American Median Image Timing">
</figure>
<figcaption>
Fig 3. North American Median Image Timing
</figcaption>
<br></p>
<p>In case you haven&#8217;t seen a figure like this before, let us explain what we&#8217;re looking at. Each of the heavy lines in the figure traces the median delivery time of our test asset as it varies over the course of a week. Each of the candidate CDN providers is represented by a different colour. The shaded regions around the thick lines show the 68% confidence interval in our ability to measure the median delivery time. The narrower the shaded region, the more confident we are in our measurement of the median.</p>
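<p>One common way to put an interval around a median is the bootstrap: resample the measurements with replacement many times and look at how much the median moves. The sketch below does that in plain Python. The bootstrap is our assumption for illustration; the figures may well have been produced with a different estimator.</p>

```python
import random
import statistics
from typing import Sequence, Tuple


def median_with_interval(
    samples: Sequence[float],
    level: float = 0.68,
    resamples: int = 1000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (median, low, high) using a percentile bootstrap."""
    rng = random.Random(seed)
    n = len(samples)
    # Median of each resample (drawn with replacement), sorted ascending.
    medians = sorted(
        statistics.median(rng.choices(samples, k=n)) for _ in range(resamples)
    )
    tail = (1.0 - level) / 2.0
    low = medians[int(tail * resamples)]
    high = medians[min(resamples - 1, int((1.0 - tail) * resamples))]
    return statistics.median(samples), low, high
```

<p>More samples in a time bucket means the resampled medians cluster more tightly, which is why busy hours give a narrower shaded band than quiet ones.</p>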
<p>For the set of providers tested here we see median delivery times for JavaScript (figure 2) ranging from about 500 ms up to about 1500 ms. We also notice that in North America, Edgecast is usually the fastest provider tested, with Fastly, Cloudfront, and Chinacache showing the slowest results. It&#8217;s worth noting that Fastly has improved its service in this region since the last time we ran this test (spring 2014).</p>
<p>It&#8217;s pretty interesting to see how much the median delivery times can fluctuate over the course of a week (about &#177;200 ms). We also see that the median delivery times rise and fall in a correlated manner across different providers. This is probably in response to daily fluctuations in general network congestion across the interwebs, as millions of people flock to the warm glow of their Netflix machines and their Facebooks, then abruptly shut it all down at bedtime. This gives rise to the correlated sawtooth shape that we see across all providers, which you can think of as the heartbeat of the internet. Earlier tests using smaller, more compressible, test files showed much smaller fluctuations (&#177;50 ms), which suggests that smaller files are not as sensitive to general network congestion problems.</p>
<p>The North American results for JavaScript files are fairly conclusive. Edgecast is the obvious winner, with MaxCDN in second place. Unfortunately, for the case of image files, the picture is not as clear. For our test image we see typical median delivery times that vary from about 600 ms up to about 1,000 ms. The image results also show the correlated, daily fluctuations of about &#177;200 ms.</p>
<p>When it comes to picking the best CDN in North America for image delivery we can probably rule out MaxCDN, Chinacache, and Cloudfront because these providers are often significantly slower than their top tier competitors. Edgecast and Fastly, however, are in a much closer race, often overlapping in median delivery times. There is perhaps a hint of Edgecast showing superior performance but we probably need to investigate in more detail to be sure.</p>
<h4>No client left behind</h4>
<p>Now we have a better idea of the typical performance of different CDN providers, along with the stability of their service. But questions remain about who&#8217;s the best image provider, and we still don&#8217;t know what this means for delivering viral cupcake recipes to little Timmy on a fibre connection in Oklahoma versus Grandma Jo on a 2G connection in Timbuktu. What we really need is a little more information on the distribution of delivery times for each provider (figures 4 &amp; 5).</p>
<figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_distribution_n_america.png" alt="North American JS Timing Distribution">
</figure>
<figcaption>
Fig 4. North American JS Timing Distribution
</figcaption>
<figure>
<img src="http://www.mobify.com/static/blog/2014/12/img_distribution_n_america.png" alt="North American Image Timing Distribution">
</figure>
<p><br>
<figcaption>
Fig 5. North American Image Timing Distribution
</figcaption>
<br></p>
<p>Our first instinct is to look at the distribution of delivery times in the form of a histogram. We show this visualization for our JavaScript test in the top panel of figure 4. The distribution has been normalized so that it integrates to unity, which lets us give it the fancy name &#8220;Probability Density Distribution&#8221;. It shows the probability of obtaining a delivery time in a given time-interval. The distribution rises rapidly from 0 ms (impossibly fast), peaks at around 300 ms (ludicrous speed), and has a long tail that extends out beyond 7,000 ms (molasses). This is great! We now know how the delivery times are distributed. We can see that Cloudfront, Chinacache, and Fastly have a low probability of delivery at ludicrous speeds compared to the top providers, but it is still somewhat difficult to interpret these results and distinguish the superior provider.</p>
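<p>Normalizing a histogram so that it integrates to unity just means dividing each bin count by the total count times the bin width. A minimal sketch follows; the 100 ms bin width and 7,000 ms cut-off are arbitrary choices for illustration, not the values used for the figures.</p>

```python
from typing import List, Sequence


def density_histogram(
    samples_ms: Sequence[float], bin_ms: int = 100, max_ms: int = 7000
) -> List[float]:
    """Return per-bin probability density (per ms) over [0, max_ms)."""
    counts = [0] * (max_ms // bin_ms)
    for t in samples_ms:
        if 0 <= t < max_ms:  # samples beyond the cut-off are dropped
            counts[int(t // bin_ms)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    # Dividing by total * bin width makes sum(density * bin_ms) equal 1.
    return [c / (total * bin_ms) for c in counts]
```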
<p>We need another view. Let&#8217;s consider the Relative Probability Density distribution (figure 4, 2nd panel from the top). This clearly shows the differences between service providers. We use Fastly as the baseline and compare all the other results to them. Again we can see that Chinacache lags behind at the earliest delivery times but after about 300 ms they begin to serve even more customers than Fastly. We can see this from the excess power in the Chinacache distribution between 300 ms and 1,000 ms. How can we know which part of the distribution is more important? We need to know how to balance one step forward against two steps back. What we really want is the integrated, or Cumulative Distribution.</p>
<p>The 3rd panel from the top in figure 4 shows the Cumulative Distribution of delivery times. This is nice because we can easily see that about 50% of clients are served within the first 1,000 ms and about 75% of clients are served within the first 2,000 ms. The disadvantage of this view is that we only get an absolute sense of cumulative performance, but we are actually more interested in the small differences between CDN providers, which are difficult to see on these scales. To address this problem, we turn to the ultimate view, the Relative Cumulative Distribution (RCD)!</p>
<p>We show the RCD in the bottom panel of figure 4 (again we use Fastly as the baseline). The RCD highlights the small differences between the Cumulative Distributions by allowing us to use a magnified scale. We often think of CDN resource delivery as a sprint, but it&#8217;s really a marathon. Providers need to be performant on slow networks as well as fast networks. The RCD shows who&#8217;s winning the race over the full duration, from the first delivery to the last.</p>
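<p>For the curious, here is a sketch of how a cumulative distribution and an RCD can be computed from raw timings. We assume the RCD is the simple difference between a provider&#8217;s cumulative distribution and the baseline&#8217;s, which is consistent with reading it in &#8220;percent of clients served&#8221;; the function names are our own.</p>

```python
from bisect import bisect_right
from typing import List, Sequence


def cumulative(samples_ms: Sequence[float], grid_ms: Sequence[float]) -> List[float]:
    """Fraction of clients served within each time on the grid."""
    ordered = sorted(samples_ms)
    return [bisect_right(ordered, t) / len(ordered) for t in grid_ms]


def relative_cumulative(
    samples_ms: Sequence[float],
    baseline_ms: Sequence[float],
    grid_ms: Sequence[float],
) -> List[float]:
    """Provider's cumulative distribution minus the baseline's."""
    ours = cumulative(samples_ms, grid_ms)
    base = cumulative(baseline_ms, grid_ms)
    return [o - b for o, b in zip(ours, base)]


grid = [500, 1000, 2000]
print(relative_cumulative([400, 600, 900, 1500], [600, 1100, 1800, 2500], grid))
# [0.25, 0.5, 0.25]
```

<p>A positive value at a given time means the provider has served a larger share of its clients by then than the baseline has; a curve that stays above zero the whole way is winning the marathon.</p>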
<p>For JavaScript in North America, the RCD of delivery times clearly shows that Edgecast is the best. Edgecast gets out to an early lead and maintains it to the finish. If you&#8217;re looking for the best, go with Edgecast!</p>
<p>Edgecast is also the winner in image delivery for North America (figure 5, bottom panel). The differences are smaller in the realm of images, with a peak difference of only about 5% of clients served (compared to about 15% in JS-land). We can also see that the race for second place tells a different story. Where MaxCDN was the runner up for JavaScript, we see that Fastly is the next best for image delivery. Fastly is so close to Edgecast in this case (about 1.5% peak difference), that we may even consider switching to Fastly if the price difference is tempting enough.</p>
<p>We have only discussed the results for North America, but now that you know how to interpret the data, <a href="http://www.mobify.com/blog/cdn-rum-2014/#results">here are the results for all of the regions tested</a>. We also show a summary of the winners in each region in the table below.</p>
<figure>
<table>
<thead>
<tr>
<th>Region</th>
<th>JavaScript</th>
<th>Images</th>
</tr>
</thead>
<tbody>
<tr>
<td>Australia</td>
<td>Edgecast</td>
<td>Fastly</td>
</tr>
<tr>
<td>East Asia</td>
<td>Edgecast</td>
<td>Fastly *</td>
</tr>
<tr>
<td>Europe</td>
<td>Edgecast or MaxCDN **</td>
<td>Fastly</td>
</tr>
<tr>
<td>North America</td>
<td>Edgecast</td>
<td>Edgecast</td>
</tr>
<tr>
<td>South America</td>
<td>Edgecast</td>
<td>Fastly or Edgecast **</td>
</tr>
<tr>
<td>West Asia</td>
<td>Edgecast</td>
<td>Chinacache</td>
</tr>
</tbody>
</table>
</figure>
<figcaption>
Tab 1. This table shows a summary of the best CDN provider in each region. Note that the answer to "who is the best CDN?" sometimes depends on which type of connection we want to optimize. Some CDNs are better optimized for slower connections and some for faster connections. You should carefully examine the included figures for your preferred region before making a final decision. * indicates a close contest in the region. ** indicates that there is no clear winner; we must consider which clients have top priority (i.e. prioritize faster connections versus prioritize slower connections).
</figcaption>
<p><br></p>
<p>So far we have only considered pure performance, but any final decisions on providers must consider both your performance budget and your budget budget.</p>
<h4>Dolla Dolla Bills, Y'All</h4>
<p>One thing you may note from all of this is that many of the CDN providers offer very similar performance levels. If you knew the relative costs, you might be shocked. Premium CDNs can cost up to five or ten times what a basic provider does, and costs for secure SSL traffic will vary wildly. All of which is to say that if you don&#8217;t absolutely need that ten millisecond median improvement, you&#8217;re probably fine without the premium provider, as basic services are still better than just exposing your naked server. Most premium services differentiate themselves on having more exotic content and control options, and basic web content doesn&#8217;t really need any of that.</p>
<h3>A Word on China</h3>
<p>We hope no one is surprised to learn that the anarcho-cyber-atopia we all love to call &#8220;internet&#8221; doesn&#8217;t exactly look the same if you live in mainland China: It&#8217;s more like a giant national LAN party, with some conflicting hardware on the network. External traffic is tightly controlled, and the state is running the mother of all nanny-filters. China is also home to an exploding population of affluent buyers wanting to partake in the global marketplace, and e-commerce sites are eager to sell, requiring some interesting considerations.</p>
<p>In case you missed the epic China v. Google unfriending, Google-associated resources are inaccessible on the mainland; hopefully you proxy your AppEngine applications, web fonts, hosted libraries, YouTube links, and any other resources that smell like Mountain View. Twitter and Facebook enjoy a similar status, so it&#8217;s sad pandas for social linking. The take-away is that China will, if needed, block a web presence as massive as Google if it feels it is in the national interest, so you shouldn&#8217;t assume that any provider will be immune to future Chinese actions. In fact, shortly after this round of testing was completed, one of the CDN providers, EdgeCast, landed on the no-no list for the Great Firewall.</p>
<p>China&#8217;s Ministry of Industry and Information Technology (MIIT) issues licenses to anyone wishing to publish web content within the Great Firewall. This is serious business as you are politically liable for content published on your domain. This means that you will have to &#8220;harmonize&#8221; any content published by your users to help with China&#8217;s &#8220;guided public opinion&#8221; program for stopping the spread of &#8220;poisonous rumours&#8221;. Alternatively, you can simply ban user-published content on your domain. Your Chinese CDN partners can help you navigate these tricky waters.</p>
<h4>Chinese Results</h4>
<p>The following results show the performance of CDN traffic in China (identified by GeoIP matching, rather than CloudFront bucket location code).</p>
<figure>
<img src="http://www.mobify.com/static/blog/2014/12/china_cdn_js_distribution.png" alt="Chinese CDN JS Distribution">
</figure>
<figcaption>
Fig 6. JavaScript delivery time results for mainland China, filtered using a shady mapping of IP address to geo-location.
</figcaption>
<p><br>
<figure>
<img src="http://www.mobify.com/static/blog/2014/12/china_cdn_image_distribution.png" alt="Chinese CDN image Distribution">
</figure>
<figcaption>
Fig 7. Image delivery time results for mainland China, filtered using a shady mapping of IP address to geo-location.
</figcaption>
<br></p>
<p>As you can see, ChinaCache is the clear winner, which is no surprise, given the fact that their edge locations are actually in China. What is surprising is that even with the Great Firewall in effect, MaxCDN and Fastly still serve traffic fairly quickly. (As noted above, EdgeCast was blocked shortly after this test was concluded.)</p>
<h3>The Future?</h3>
<p>Of course, these results will be stale by this time next year. Thankfully, Mobify prioritizes network performance, and to that end, here are some of the tests we have slated for our next CDN shootout.</p>
<h4>Mobile-tuned CDNs</h4>
<p>There are some exciting new CDN players out there who are tuning their edges to better serve the network profile of mobile devices. As you may have guessed from our name, Mobify is pretty interested in Mobile.</p>
<h4>Comparison of Header Support</h4>
<p>If you&#8217;re going to find any use for a CDN in your stack, it&#8217;s going to need to play well with the standards. This means proper support for ETags and standard Cache-Control headers. Hopefully there won&#8217;t be any gotchas, but you never know until you check.</p>
<h4>SSL, SPDY/HTTP2, and Other Variations</h4>
<p>We also want to know if these edge nodes are using the best-practice SSL schemes, or if some of them are rolling out support for SPDY/HTTP2 or some other variant that we care about. These will be increasingly important considerations as the trend seems to be moving towards encryption everywhere.</p>
<h3>Concluding Remarks</h3>
<p>The goal of this article is to share our CDN RUM story, and hopefully inspire you to try out some of these tests for yourself. We&#8217;d encourage you to tweet us <a href="https://twitter.com/MobifyWebdev">@mobifywebdev</a>, or comment below with your own stories and results.</p>
<p>Good luck, thanks for visiting, and happy testing!</p>
<h3 id="results">Result Gravy</h3>
<p>Pick a region to see all the gory details! If you're not sure how to read these graphs, scroll up and learn you some statistics for great good.</p>
<ul>
<li><a href="http://www.mobify.com/blog/cdn-rum-2014/#namerica-results">North America</a></li>
<li><a href="http://www.mobify.com/blog/cdn-rum-2014/#europe-results">Europe</a></li>
<li><a href="http://www.mobify.com/blog/cdn-rum-2014/#easia-results">East Asia</a></li>
<li><a href="http://www.mobify.com/blog/cdn-rum-2014/#australia-results">Australia</a></li>
<li><a href="http://www.mobify.com/blog/cdn-rum-2014/#wasia-results">West Asia</a></li>
<li><a href="http://www.mobify.com/blog/cdn-rum-2014/#samerica-results">South America</a></li>
</ul>
<h4 id="namerica-results">North America</h4>
<p><figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_median_n_america.png" alt="North American Median JS Timing">
<img src="http://www.mobify.com/static/blog/2014/12/img_median_n_america.png" alt="North American Median Image Timing">
<img src="http://www.mobify.com/static/blog/2014/12/js_distribution_n_america.png" alt="North American JS Timing Distribution">
<img src="http://www.mobify.com/static/blog/2014/12/img_distribution_n_america.png" alt="North American Image Timing Distribution">
</figure></p>
<h4 id="europe-results">Europe</h4>
<p><figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_median_europe.png" alt="European Median JS Timing">
<img src="http://www.mobify.com/static/blog/2014/12/img_median_europe.png" alt="European Median Image Timing">
<img src="http://www.mobify.com/static/blog/2014/12/js_distribution_europe.png" alt="European JS Timing Distribution">
<img src="http://www.mobify.com/static/blog/2014/12/img_distribution_europe.png" alt="European Image Timing Distribution">
</figure></p>
<h4 id="easia-results">East Asia</h4>
<p><figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_median_e_asia.png" alt="East Asian Median JS Timing">
<img src="http://www.mobify.com/static/blog/2014/12/img_median_e_asia.png" alt="East Asian Median Image Timing">
<img src="http://www.mobify.com/static/blog/2014/12/js_distribution_e_asia.png" alt="East Asian JS Timing Distribution">
<img src="http://www.mobify.com/static/blog/2014/12/img_distribution_e_asia.png" alt="East Asian Image Timing Distribution">
</figure></p>
<h4 id="australia-results">Australia</h4>
<p><figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_median_australia.png" alt="Australian Median JS Timing">
<img src="http://www.mobify.com/static/blog/2014/12/img_median_australia.png" alt="Australian Median Image Timing">
<img src="http://www.mobify.com/static/blog/2014/12/js_distribution_australia.png" alt="Australian JS Timing Distribution">
<img src="http://www.mobify.com/static/blog/2014/12/img_distribution_australia.png" alt="Australian Image Timing Distribution">
</figure></p>
<h4 id="wasia-results">West Asia</h4>
<p><figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_median_w_asia.png" alt="West Asian Median JS Timing">
<img src="http://www.mobify.com/static/blog/2014/12/img_median_w_asia.png" alt="West Asian Median Image Timing">
<img src="http://www.mobify.com/static/blog/2014/12/js_distribution_w_asia.png" alt="West Asian JS Timing Distribution">
<img src="http://www.mobify.com/static/blog/2014/12/img_distribution_w_asia.png" alt="West Asian Image Timing Distribution">
</figure></p>
<h4 id="samerica-results">South America</h4>
<p><figure>
<img src="http://www.mobify.com/static/blog/2014/12/js_distribution_s_america.png" alt="South American JS Timing Distribution">
<img src="http://www.mobify.com/static/blog/2014/12/img_distribution_s_america.png" alt="South American Image Timing Distribution">
</figure>
<figcaption>
It turns out that even with all of the traffic we could muster, we couldn't find enough South Americans to make continuous graphs. Maybe next time! Hence no Median Timing Graphs above.
</figcaption></p>
<p><br>
<br></p></div>
<h2>Why Meetings Are A Waste Of Time And How To Run Them More Efficiently</h2>
<p>Candice Percy, 2014-11-14</p>
<div><p><img alt="Meetings are a waste of time" src="http://www.mobify.com/static/blog/2014/11/meeting-top.jpg"></p>
<p>Do you ever get a meeting invite and dread going, or wonder &#8220;why am I here?&#8221; when you
get there? I&#8217;ve made it a practice not to accept meetings that don&#8217;t have an agenda
or clear objective.</p>
<p>One rule of thumb I use is the G.A.S. factor (Give a Shit factor): Why do you
need me there? Am I a collaborator? A subject matter expert? A decision maker?
Or do I need to be informed? If a meeting ticks any of these boxes, I feel much
better accepting it, and it will most likely not be a waste of my time.</p>
<p>Meetings have a tendency to start filling up your work day, week by week, if not
managed carefully. There are a number of articles with statistics on how much of your
work day is spent in meetings that turn out to be a waste of time. Here are a couple
I&#8217;d like to share:</p>
<ul>
<li>For a Harvard Business Review study, <a href="http://www.theguardian.com/news/oliver-burkeman-s-blog/2014/may/01/meetings-soul-sucking-waste-time-you-thought">three consultants</a> examined the Outlook calendars of multiple workers at a large company and found that one weekly executive meeting consumed 300,000 hours each year.</li>
<li>Atlassian&#8217;s infographic <a href="https://www.atlassian.com/time-wasting-at-work-infographic">You waste a lot of time at work</a> illustrates (below) that 31 hours a month are spent in unproductive meetings, with &#189; of them considered a waste of time!
<img alt="Atlassian's Time Wasted infographic" src="http://www.mobify.com/static/blog/2014/11/time-wasted.png"></li>
</ul>
<p>There&#8217;s also a formula I came across for working out how much money a meeting is costing,
given the number of people in the room. <a href="http://www.ecyrd.com/timeismoney/">Check this out</a>!</p>
<h2>What can you do about it?</h2>
<p>Ask yourself the following 4 questions to help frame the purpose of the meeting
and who needs to be there, and for how long:</p>
<p><img alt="Make a list for the meeting" src="http://www.mobify.com/static/blog/2014/11/list.jpg"></p>
<ol>
<li><strong>What is the reason for your meeting?</strong> Need a decision? Brainstorming? Informing project status / demo / presentation?</li>
<li><strong>What's your agenda?</strong> Once you determine the reason for the meeting, you can start laying out what you want to go through in the meeting.</li>
<li><strong>Who needs to be there?</strong> As you finish fleshing out the agenda topics, they will help determine who needs to be there (key decision maker, subject matter expert, full team or just team leads?) Who is absolutely necessary for the meeting? Only invite those who are absolutely necessary, and send the meeting minutes by email to the others who need to know the results.</li>
<li><strong>How long should the meeting be?</strong> Try not to start by booking the length of the meeting. Most people book an hour by default and then waste time because they think they need to fill the whole hour. I highly recommend booking your meeting time based on your agenda. Break down the items and put a timebox on them. This will help save time and money when you&#8217;re in meetings. As a rule of thumb, I try to keep my meetings to 25 minutes; you&#8217;d be surprised how much you can get done in a shorter amount of time!</li>
</ol>
<h2>Ready for the meeting?</h2>
<p>Now that you have finished planning, you&#8217;re ready to have the meeting! Follow
these 3 guidelines to ensure that you are running the most effective meeting:</p>
<p><img alt="One of Mobify's design meetings" src="http://www.mobify.com/static/blog/2014/11/meeting.jpg"></p>
<ol>
<li><strong>Have a meeting facilitator.</strong> Most of the time the meeting facilitator is yourself, as you created the meeting. If it isn&#8217;t you, ensure that the facilitator is introduced and reviews the agenda items with the team when the meeting starts. If there are any changes or additions to the agenda, you can update it. Check out this article to dive deeper into the roles and responsibilities of a Meeting Facilitator.</li>
<li><strong>Start and finish the meeting on time.</strong> Start the meeting on time! Nominate a timekeeper (they will help keep you on schedule!) to review the agenda items and inform the attendees when you are nearing the end of the time estimate and when you have reached the end. The Time Keeper will ask the Facilitator if they would like to continue on the same topic or to make a decision, and move to the next item to keep the meeting on track. The Facilitator can track items that are going on too long or are not on the agenda in the parking lot on a whiteboard.</li>
<li><strong>Keep meeting minutes.</strong> Structured meeting minutes help save time and ensure that you capture all the important information. Nominate a team member to take notes if you are facilitating. One idea would be to nominate the last person seated as the minute taker. Some key things to include in your meeting minutes:<ul>
<li><strong>Attendees.</strong> Who showed up.</li>
<li><strong>Date and time of your meeting.</strong> Helps when you need to keep track of when you had the meeting so that anyone who didn&#8217;t come, who needs to be informed, has a point of reference for when the meeting occurred.</li>
<li><strong>Capture action items with owners and due dates.</strong> Ensure that any action items captured in the meeting minutes have clear owners and due dates. It&#8217;s important to know who to follow up with and when.</li>
<li><strong>Capture decisions.</strong> If there are any decisions, ensure they are captured in the meeting minutes and any background notes to help support it. (What is the decision?, Who made the decision?, Why?)</li>
<li><strong><a href="http://meetingking.com/importance-of-parking-lot-for-meetings/">Parking lot</a>.</strong> The parking lot is useful for any discussions that pass the set time limit for the topic, or are not on the agenda.</li>
</ul>
</li>
</ol>
<h2>7 Guidelines that Mobify Uses</h2>
<ol>
<li>Show up on time.</li>
<li>No cell phones or laptops that pose a distraction during the meeting.</li>
<li>Keep the discussion on track. Use a 5 minute rule if topics are going too long. A key job of the meeting facilitator is to keep bringing people back to the issue. Stick to the items on the agenda and don&#8217;t allow discussion to stray or wander.</li>
<li>No side conversations or interrupting people.</li>
<li>Be civil. Differences of opinion need to be heard. The facilitator can help guide the discussion. Look for data, not assertions. If there is no data to prove the point, then you can use the meeting to determine what data is needed and who can gather and validate it for a follow-up.</li>
<li>Have clear action items with owners and due dates before you walk out of the meeting.</li>
<li>End the meeting on time.</li>
</ol>
<h2>Where do you start?</h2>
<p>At Mobify we value small, sustainable changes to be able to see results and tweak if necessary. This also helps to not overwhelm the team.</p>
<p>Fear not! One step at a time. Below is a tool you could try to evaluate pain points and select one to focus on.</p>
<p>Meeting Scorecard created by <a href="http://www.slideshare.net/Line-of-Sight/top-10-tips-for-effective-meetings-090813-line-of-sight">Line of Sight</a>.</p>
<p><img alt="Meeting scorecard by Line of Sight" src="http://www.mobify.com/static/blog/2014/11/scorecard.png"></p>
<p>And there you have it, some great tips and tricks to make your next meeting the most effective and productive. I promise you, your team and company will appreciate it!</p></div>A Python guide to handling HTTP request failures2014-10-07T09:30:00-07:00John Boxalltag:www.mobify.com,2014-10-07:blog/http-requests-are-hard/<div><p>A lot of things can go wrong when requesting information over HTTP from a remote
web server: requests time out, servers fail, government operatives cut undersea
cables. You get the picture.</p>
<p>Identifying and handling failures helps build fault tolerant systems that stay
up even when services they rely on are down. A nice side effect is your phone is
less likely to beep in the middle of the night with a message from your coworkers
talking in all caps.</p>
<p>This guide will introduce you to the common ways HTTP requests fail and how to
handle the failures.</p>
<p>The examples use Python's fantastic <a href="http://docs.python-requests.org/"><code>requests</code></a>
library, but the principles shown work across all languages. You can follow
along on your computer by grabbing <code>requests</code> off PyPI.</p>
<p>The <a href="http://docs.python-requests.org/en/latest/api/#requests.get"><code>requests.get(url)</code></a>
method is the cornerstone for all the examples. It makes a synchronous HTTP
<code>GET</code> request to fetch the content from <code>url</code>:</p>
<div class="codehilite"><pre><span class="c"># Importing `requests` is omitted from here on for brevity. If you are coding</span>
<span class="c"># along with the article, make sure to include it before trying the examples!</span>
<span class="kn">import</span> <span class="nn">requests</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s">"https://www.mobify.com/"</span><span class="p">)</span>
</pre></div>
<p>Where possible, the examples use <a href="https://httpbin.org"><code>httpbin</code></a> to illustrate
the specific failure scenarios. It's a great service for testing how your code
will react in a hostile world!</p>
<p>The guide assumes familiarity with making HTTP requests and uses the following
terminology:</p>
<ul>
<li><strong>Client</strong>: The code making the HTTP requests and the server it lives on.</li>
<li><strong>Server</strong>: The box that delivers the HTTP response we requested.</li>
<li><strong>Caller</strong>: The code which instantiates the client and tells it to make a request.</li>
</ul>
<p>Ready to make some requests? Let's go!</p>
<h3>DNS lookup failures</h3>
<p>HTTP requests can fail before the client can even make a connection to the
server. If the URL specified by the caller has a domain name, the client must
look up its IP address before making the request. If the domain name doesn't
resolve it's possible that it isn't configured correctly or doesn't exist.</p>
<div class="codehilite"><pre><span class="c"># This domain name doesn't exist!</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"http://www.definitivelydoesnotexist.com/"</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
<span class="k">except</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">ConnectionError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"These aren't the domains we're looking for."</span><span class="p">)</span>
</pre></div>
<p>It's important to let the caller know they may have entered the wrong domain!</p>
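<p>Because <code>requests</code> reports a DNS failure and a refused connection with the same <code>ConnectionError</code>, one way to tell the caller specifically that the domain did not resolve is to look it up before making the request. The following is a minimal sketch using only the standard library, written for Python 3. The helper name <code>hostname_resolves</code> and its <code>resolve</code> parameter are hypothetical, introduced here so the check can be exercised without a network:</p>

```python
import socket
from urllib.parse import urlparse


def hostname_resolves(url, resolve=socket.getaddrinfo):
    """Return True if the hostname in `url` can be resolved to an address.

    `resolve` is a hypothetical injection point; by default it performs a
    real lookup with `socket.getaddrinfo`.
    """
    hostname = urlparse(url).hostname
    if hostname is None:
        # No hostname at all, e.g. the caller passed something that isn't a URL.
        return False
    try:
        resolve(hostname, None)
    except socket.gaierror:
        # The resolver could not find an address for this name.
        return False
    return True
```

<p>A caller could run this check first and report "unknown domain" separately from "server unreachable". The lookup result can still change before the real request is made, so the <code>ConnectionError</code> handler is still needed.</p>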
<h3>Errors connecting to the server</h3>
<p>Even if the hostname of the URL correctly resolves, we might not always succeed
in connecting to the server. If someone tripped on its power cord and took it
down, it's unlikely it will accept our connection!</p>
<p>Errors of this nature often block the client, tying it up waiting for a server
that will never respond. For this reason, it's a good idea to add timeouts to
the client. That way, if the server takes too long to respond, the client can
move on to do something else rather than waiting
indefinitely. <a href="http://docs.python-requests.org/en/latest/user/advanced/#timeouts"><code>requests</code> provides both <code>connect</code> and <code>read</code> timeouts</a>.
<code>connect</code> is the amount of time the client should wait to establish a
connection to the server:</p>
<div class="codehilite"><pre><span class="c"># Using a very short `connect_timeout` gives us a feel for what happens when the</span>
<span class="c"># server is slow to pick up the connection.</span>
<span class="n">connect_timeout</span> <span class="o">=</span> <span class="mf">0.0001</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s">"https://httpbin.org/delay/5"</span><span class="p">,</span>
                            <span class="n">timeout</span><span class="o">=</span><span class="p">(</span><span class="n">connect_timeout</span><span class="p">,</span> <span class="mf">10.0</span><span class="p">))</span>
<span class="k">except</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">ConnectTimeout</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Too slow Mojo!"</span><span class="p">)</span>
</pre></div>
<p><code>read</code> is the amount of time it should wait between bytes from the server:</p>
<div class="codehilite"><pre><span class="c"># Our resource takes longer than `read_timeout` to send a byte.</span>
<span class="n">read_timeout</span> <span class="o">=</span> <span class="mf">1.0</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s">"https://httpbin.org/delay/5"</span><span class="p">,</span>
                            <span class="n">timeout</span><span class="o">=</span><span class="p">(</span><span class="mf">10.0</span><span class="p">,</span> <span class="n">read_timeout</span><span class="p">))</span>
<span class="k">except</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">ReadTimeout</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Waited too long between bytes."</span><span class="p">)</span>
</pre></div>
<p>The exact values used for the timeout are usually less important than just
setting one. You don't want the client to be blocked forever on a slowpoke
server. Start with 10 seconds and watch your logs.</p>
<blockquote>
<p>Extra Credit: Depending on the profile of the system you're building, you may
want to implement dynamic timeouts that use historical data to wait longer for
servers that are known to be slow. You may want to <a href="http://martinfowler.com/bliki/CircuitBreaker.html">ban your client</a>
from even trying to connect to servers that always time out.</p>
</blockquote>
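<p>To make the "ban your client" idea concrete, here is a minimal sketch of a circuit breaker in plain Python. It is not part of <code>requests</code>; the class name, its thresholds and the injectable <code>clock</code> are all assumptions chosen for illustration. The caller asks <code>allow(host)</code> before each request and reports the outcome afterwards:</p>

```python
import time


class CircuitBreaker(object):
    """Stop calling a host after repeated failures, then retry after a cooldown.

    Hypothetical helper: the thresholds and method names are illustrative.
    """

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.time):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = {}   # host -> consecutive failure count
        self.opened_at = {}  # host -> time the circuit was (re)opened

    def allow(self, host):
        """Return True if a request to `host` should be attempted."""
        opened = self.opened_at.get(host)
        if opened is None:
            return True
        # Once the cooldown has passed, let a trial request through.
        return self.clock() - opened >= self.reset_after

    def record_success(self, host):
        """Forget past failures for `host` and close its circuit."""
        self.failures.pop(host, None)
        self.opened_at.pop(host, None)

    def record_failure(self, host):
        """Count a failure and open the circuit once the limit is reached."""
        count = self.failures.get(host, 0) + 1
        self.failures[host] = count
        if count >= self.max_failures:
            self.opened_at[host] = self.clock()
```

<p>In a client, <code>record_failure</code> would typically be called from the timeout handlers shown above, and <code>record_success</code> after any response arrives. A production version would also need to be safe to use from multiple threads.</p>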
<h3>HTTP errors</h3>
<p>What if something goes sideways while the server is preparing our response?
Maybe its database is unresponsive or it was switched into maintenance mode.
Whatever the reason, if the server is able to detect that it isn't functioning
correctly, it should respond with an <a href="http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#5xx_Server_Error">HTTP server error code</a>.</p>
<p>Alternatively, if the client is incorrectly constructing the request, the server
may respond with an <a href="http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_Error">HTTP client error code</a>.</p>
<p>In most cases we'll want to identify these bad response status codes and let the
caller handle them. With <code>requests</code>, this is as easy as calling the <a href="http://docs.python-requests.org/en/latest/api/#requests.Response.raise_for_status"><code>response.raise_for_status()</code></a>
method on the <code>response</code> object:</p>
<div class="codehilite"><pre><span class="c"># This URL returns an HTTP 500 Server Error.</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"https://httpbin.org/status/500"</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">response</span><span class="o">.</span><span class="n">raise_for_status</span><span class="p">()</span>
<span class="k">except</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">HTTPError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"And you get an HTTPError: %s"</span> <span class="o">%</span> <span class="n">e</span><span class="p">)</span>
</pre></div>
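<p>The caller usually wants to react differently to the two families of error codes: a server error may clear up on its own, while a client error means the request needs fixing. A small sketch of that distinction follows; <code>classify_status</code> and its category names are hypothetical, and with <code>requests</code> the input would be <code>response.status_code</code>:</p>

```python
def classify_status(status_code):
    """Map an HTTP status code to a coarse category the caller can act on."""
    if 500 <= status_code <= 599:
        # The server failed; trying again later may help.
        return "server_error"
    if 400 <= status_code <= 499:
        # The request itself was rejected; fix it before trying again.
        return "client_error"
    # Anything else is not an error status.
    return "no_error"
```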
<h3>Responses that aren't what we expect</h3>
<p>It's possible that the caller could request a resource that our client wasn't
designed to handle. For example, what if someone uses our RSS reader to request
an MKV file of the last episode of Game of Thrones?</p>
<p>We can assert that the <code>Content-Type</code> response header matches what we expect. Our
RSS reader example might look for the following:</p>
<div class="codehilite"><pre><span class="k">class</span> <span class="nc">WrongContent</span><span class="p">(</span><span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">RequestException</span><span class="p">):</span>
    <span class="sd">"""The response has the wrong content."""</span>

<span class="c"># This URL sets the `Content-Type` to `text/plain`.</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"https://httpbin.org/response-headers?Content-Type=text/plain"</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
<span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">headers</span><span class="p">[</span><span class="s">"content-type"</span><span class="p">]</span> <span class="o">!=</span> <span class="s">"application/rss+xml"</span><span class="p">:</span>
    <span class="k">raise</span> <span class="n">WrongContent</span><span class="p">(</span><span class="n">response</span><span class="o">=</span><span class="n">response</span><span class="p">)</span>
</pre></div>
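<p>One caveat with an exact string comparison: servers often append parameters to the header, for example <code>application/rss+xml; charset=utf-8</code>, and may vary the letter case. A slightly more forgiving sketch compares only the media type. The helper names <code>media_type</code> and <code>is_rss</code> are hypothetical:</p>

```python
def media_type(content_type):
    """Return the lowercased media type from a Content-Type header value.

    Parameters after the first ";" (such as charset) are ignored.
    """
    return content_type.split(";", 1)[0].strip().lower()


def is_rss(content_type):
    """Return True if the header value names the RSS media type."""
    return media_type(content_type) == "application/rss+xml"
```

<p>With <code>requests</code>, the value passed in would be <code>response.headers["content-type"]</code>.</p>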
<p>Note that even if the <code>Content-Type</code> header does match what we are expecting,
there is no guarantee that the response's body will. Calling code should account
for this. For example, if we're expecting JSON and we don't get back JSON, that's
a problem. In <code>requests</code>, the <code>response.json()</code> method tries to convert the
response body into a Python object from JSON:</p>
<div class="codehilite"><pre><span class="c"># This URL returns an XML document.</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"https://httpbin.org/xml"</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">json</span><span class="p">()</span>
<span class="k">except</span> <span class="ne">ValueError</span><span class="p">:</span>
    <span class="k">raise</span> <span class="n">WrongContent</span><span class="p">(</span><span class="n">response</span><span class="o">=</span><span class="n">response</span><span class="p">)</span>
</pre></div>
<blockquote>
<p>Extra Credit: If we're processing text data like HTML, don't forget to
detect its charset and correctly decode it. You'll need to check the
<code>response</code>'s <code>Content-Type</code> header as well as potentially the content itself
to avoid decoding errors.</p>
</blockquote>
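<p>As a rough illustration of the header half of that advice, the sketch below reads the declared charset from a <code>Content-Type</code> value using the standard library's <code>email.message.Message</code> and decodes the body with it, falling back to a default when the charset is missing, unknown or wrong. The function names and the UTF-8 fallback are assumptions, and sniffing the content itself (for example an HTML <code>meta</code> tag) is not covered:</p>

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    message = Message()
    message["Content-Type"] = content_type
    return message.get_content_charset()


def decode_body(body, content_type, fallback="utf-8"):
    """Decode `body` bytes using the declared charset, falling back if needed."""
    charset = declared_charset(content_type) or fallback
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or bytes that don't match it:
        # decode leniently rather than crash.
        return body.decode(fallback, errors="replace")
```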
<h3>Responses that are too large</h3>
<p>Let's go back to our movie example. Not only is the movie not the content type
our RSS reader expects, it's also really big. If we're not careful, these kinds
of responses could exhaust our client's resources.</p>
<p>To ensure our client hasn't been asked to download the entire internet, we must
track how much content we've received. With <code>requests</code>, <a href="http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow">this takes a little more code</a>:</p>
<div class="codehilite"><pre><span class="kn">from</span> <span class="nn">contextlib</span> <span class="kn">import</span> <span class="n">closing</span>

<span class="k">class</span> <span class="nc">TooBig</span><span class="p">(</span><span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">RequestException</span><span class="p">):</span>
    <span class="sd">"""The response was way too big."""</span>

<span class="n">TOO_BIG</span> <span class="o">=</span> <span class="mi">1024</span> <span class="o">*</span> <span class="mi">1024</span> <span class="o">*</span> <span class="mi">10</span> <span class="c"># 10MB</span>
<span class="n">CHUNK_SIZE</span> <span class="o">=</span> <span class="mi">1024</span> <span class="o">*</span> <span class="mi">128</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"https://path-to-a-huge-resource/"</span>
<span class="k">with</span> <span class="n">closing</span><span class="p">(</span><span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">stream</span><span class="o">=</span><span class="bp">True</span><span class="p">))</span> <span class="k">as</span> <span class="n">response</span><span class="p">:</span>
    <span class="n">content_length</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">chunk</span> <span class="ow">in</span> <span class="n">response</span><span class="o">.</span><span class="n">iter_content</span><span class="p">(</span><span class="n">chunk_size</span><span class="o">=</span><span class="n">CHUNK_SIZE</span><span class="p">):</span>
        <span class="c"># Count the bytes actually received; the last chunk may be short.</span>
        <span class="n">content_length</span> <span class="o">=</span> <span class="n">content_length</span> <span class="o">+</span> <span class="nb">len</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">content_length</span> <span class="o">&gt;</span> <span class="n">TOO_BIG</span><span class="p">:</span>
            <span class="k">raise</span> <span class="n">TooBig</span><span class="p">(</span><span class="n">response</span><span class="o">=</span><span class="n">response</span><span class="p">)</span>
</pre></div>
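<p>The same idea can be packaged as a small function that also keeps the content and rejects early when the server announces an oversized body. This is a sketch under assumptions: <code>read_limited</code> and its parameters are hypothetical, <code>chunks</code> stands in for <code>response.iter_content(...)</code>, and <code>declared_length</code> for an integer parsed from the <code>Content-Length</code> header when one is present. The exception here is a plain <code>Exception</code> so the example runs without <code>requests</code>:</p>

```python
class TooBig(Exception):
    """The response exceeded the size limit."""


def read_limited(chunks, limit, declared_length=None):
    """Join `chunks` into one bytes object, refusing more than `limit` bytes."""
    # Reject early when the server announces an oversized body.
    if declared_length is not None and declared_length > limit:
        raise TooBig("declared %d bytes, limit is %d" % (declared_length, limit))
    received = []
    total = 0
    for chunk in chunks:
        total += len(chunk)
        # The header may be absent or wrong, so keep counting actual bytes.
        if total > limit:
            raise TooBig("received more than %d bytes" % limit)
        received.append(chunk)
    return b"".join(received)
```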
<h3>Requests to unexpected URLs</h3>
<p>If the client is located inside your network it may have privileged access to
internal servers not addressable from the public internet. For example, what if
the caller requests <code>http://127.0.0.1/admin/</code>?</p>
<p>If we're letting callers request arbitrary URLs, we need to check that they are
allowed to request what they are asking for.</p>
<p>One strategy is to prevent callers from requesting sensitive hosts using a
blacklist. A blacklist checks whether the requested domain is present in a set
of restricted domains. If it is, the request is rejected before it's even made.
At a minimum, we'll want to blacklist internal IP addresses.</p>
<p>Python 3.3 added the <a href="https://docs.python.org/3/howto/ipaddress.html"><code>ipaddress</code></a>
module to the standard library, and in Python 2 we can install its backport
<a href="https://pypi.python.org/pypi/py2-ipaddress/"><code>py2-ipaddress</code> from PyPI</a>. Here
we use it to filter requests for internal IP addresses:</p>
<div class="codehilite"><pre><span class="kn">import</span> <span class="nn">ipaddress</span>
<span class="c"># Python 2 module; on Python 3 use `from urllib import parse as urlparse`.</span>
<span class="kn">import</span> <span class="nn">urlparse</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"http://127.0.0.1/admin/"</span>
<span class="n">hostname</span> <span class="o">=</span> <span class="n">urlparse</span><span class="o">.</span><span class="n">urlparse</span><span class="p">(</span><span class="n">url</span><span class="p">)</span><span class="o">.</span><span class="n">hostname</span>
<span class="c"># `localhost` isn't an IP address, but we probably don't want callers hitting it.</span>
<span class="k">if</span> <span class="n">hostname</span> <span class="o">==</span> <span class="s">'localhost'</span><span class="p">:</span>
    <span class="k">raise</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">InvalidURL</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
<span class="c"># If `hostname` quacks like an IP address, make sure it isn't internal.</span>
<span class="k">try</span><span class="p">:</span>
    <span class="n">ip</span> <span class="o">=</span> <span class="n">ipaddress</span><span class="o">.</span><span class="n">ip_address</span><span class="p">(</span><span class="n">hostname</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">ValueError</span><span class="p">:</span>
    <span class="k">pass</span>
<span class="k">else</span><span class="p">:</span>
    <span class="k">if</span> <span class="n">ip</span><span class="o">.</span><span class="n">is_loopback</span> <span class="ow">or</span> <span class="n">ip</span><span class="o">.</span><span class="n">is_reserved</span> <span class="ow">or</span> <span class="n">ip</span><span class="o">.</span><span class="n">is_private</span><span class="p">:</span>
        <span class="k">raise</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">InvalidURL</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
</pre></div>
<p>We might extend our blacklist to include internal hostnames or other sensitive
servers. Maybe we also don't want callers to call the server doing the calling.
Otherwise it could be turtles all the way down.</p>
<blockquote>
<p>Extra Credit: If you want to get serious you'll need to resolve the domain
name of the requested resource and check whether it maps to a local IP
address.</p>
</blockquote>
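<p>A rough sketch of that extra step, for Python 3: resolve the hostname with <code>socket.getaddrinfo</code> and run every returned address through <code>ipaddress</code>. The function names and the injectable <code>resolve</code> parameter are hypothetical, and the link-local check is an addition beyond the flags used above:</p>

```python
import ipaddress
import socket


def is_internal_address(address):
    """Return True for loopback, private, reserved or link-local IP addresses."""
    # Drop any IPv6 zone id such as "%eth0" before parsing.
    ip = ipaddress.ip_address(address.split("%", 1)[0])
    return ip.is_loopback or ip.is_private or ip.is_reserved or ip.is_link_local


def resolves_to_internal(hostname, resolve=socket.getaddrinfo):
    """Return True if any address `hostname` resolves to is internal."""
    # Each result is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] is the IP address as a string.
    results = resolve(hostname, None)
    return any(is_internal_address(result[4][0]) for result in results)
```

<p>Note that the answer can change between this check and the real request if the DNS record is updated in between, so a stricter client would connect to the exact address it validated.</p>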
<p>Alternatively, if callers should only be able to request from a narrow set of
servers it may be easier to use a whitelist to reject requests which aren't
directed at a known host:</p>
<div class="codehilite"><pre><span class="kn">import</span> <span class="nn">urlparse</span>
<span class="n">WHITELISTED_HOSTS</span> <span class="o">=</span> <span class="p">{</span><span class="s">"rainbows.com"</span><span class="p">,</span> <span class="s">"magic.com"</span><span class="p">}</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"https://unicorns.com/"</span>
<span class="k">if</span> <span class="n">urlparse</span><span class="o">.</span><span class="n">urlparse</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">url</span><span class="p">)</span><span class="o">.</span><span class="n">hostname</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">WHITELISTED_HOSTS</span><span class="p">:</span>
    <span class="k">raise</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">InvalidURL</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
</pre></div>
<blockquote>
<p>Extra Credit: Depending on your needs, you might also want to restrict other
parts of the HTTP request, including the protocol used, or the ports.
Additionally, if you find a caller abusing the system, you might want to build a
mechanism to ban them!</p>
</blockquote>
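<p>Restricting the protocol and port can follow the same whitelist pattern. The sketch below is for Python 3; the allowed sets and the name <code>is_allowed_url</code> are assumptions to adapt to your own system:</p>

```python
from urllib.parse import urlparse

# Hypothetical policy: only plain web traffic on the standard ports.
ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_PORTS = {80, 443}
DEFAULT_PORTS = {"http": 80, "https": 443}


def is_allowed_url(url):
    """Return True if `url` uses an allowed scheme and port."""
    try:
        parts = urlparse(url)
        port = parts.port  # accessing this raises ValueError for a malformed port
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    if port is None:
        # No explicit port in the URL: use the default for the scheme.
        port = DEFAULT_PORTS[parts.scheme]
    return port in ALLOWED_PORTS
```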
<h2>Handling errors</h2>
<p>So now that we've identified all these errors, what the heck should we do with
them?</p>
<h3>Logging</h3>
<p>What broke? When? Where? Logging failures creates a trail that you can search
for patterns. Logs will often give you insight about how you can further tweak
your configuration to best suit your system or whether someone is abusing the
system.</p>
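<p>A minimal sketch of what such a log entry might look like with Python's standard <code>logging</code> module. The logger name, the field layout and the <code>log_failure</code> helper are assumptions; timestamps (the "when") are normally added by the logging configuration rather than by the message itself:</p>

```python
import logging

# Hypothetical logger name for the HTTP client.
logger = logging.getLogger("http_client")


def log_failure(url, error, attempt=1):
    """Record which URL failed, how it failed, and on which attempt."""
    logger.warning(
        "request failed url=%s error=%s attempt=%d detail=%s",
        url, type(error).__name__, attempt, error,
    )
```

<p>Keeping the fields in a consistent <code>key=value</code> layout makes it easier to search the logs later for patterns such as one host timing out repeatedly.</p>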
<h3>Retrying</h3>
<p>When you're firing bits around the world sometimes you just get unlucky.
Depending on what you're doing, it may make sense to just retry the
request if you think the error was intermittent. <code>requests</code> provides an interface
for creating custom <a href="http://docs.python-requests.org/en/latest/user/advanced/#transport-adapters"><code>adapters</code></a>
that can be used to implement retries:</p>
<div class="codehilite"><pre><span class="c"># Use a `Session` instance to customize how `requests` handles making HTTP requests.</span>
<span class="n">session</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span>
<span class="c"># `mount` a custom adapter that retries failed connections for HTTP and HTTPS requests.</span>
<span class="n">session</span><span class="o">.</span><span class="n">mount</span><span class="p">(</span><span class="s">"http://"</span><span class="p">,</span> <span class="n">requests</span><span class="o">.</span><span class="n">adapters</span><span class="o">.</span><span class="n">HTTPAdapter</span><span class="p">(</span><span class="n">max_retries</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
<span class="n">session</span><span class="o">.</span><span class="n">mount</span><span class="p">(</span><span class="s">"https://"</span><span class="p">,</span> <span class="n">requests</span><span class="o">.</span><span class="n">adapters</span><span class="o">.</span><span class="n">HTTPAdapter</span><span class="p">(</span><span class="n">max_retries</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
<span class="c"># Rejoice with new fault tolerant behaviour!</span>
<span class="n">session</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s">"https://www.service-that-drops-every-odd-request.com/"</span><span class="p">)</span>
</pre></div>
<p>Just make sure you only retry requests that are idempotent!</p>
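<p>To illustrate that rule, here is a sketch of a retry wrapper in plain Python that only retries idempotent methods and waits a little longer after each failure. Everything in it is an assumption for illustration: the name <code>call_with_retries</code>, the <code>send</code> callable standing in for the actual request, and the backoff values. It catches <code>IOError</code>, which is the base class of the <code>requests</code> exceptions:</p>

```python
import time

# Methods that HTTP defines as idempotent, so repeating them is safe.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}


def call_with_retries(send, method, retries=2, backoff=0.5, sleep=time.sleep):
    """Call `send()`, retrying on IOError only when `method` is idempotent."""
    attempts = 1 + (retries if method.upper() in IDEMPOTENT_METHODS else 0)
    for attempt in range(attempts):
        try:
            return send()
        except IOError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Exponential backoff: wait backoff, then 2x, then 4x, ...
            sleep(backoff * (2 ** attempt))
```

<p>A non-idempotent request such as a <code>POST</code> gets exactly one attempt, so a payment or a form submission is never silently sent twice.</p>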
<h3>Notification</h3>
<p>Finally, you'll need to raise the error to the caller. You'll want to do it in a
way that makes it easy for the caller to handle all possible exceptions, but
also in a way that makes it clear why the exception was raised. This is especially
important if you will be displaying the error to a non-technical user and you
want to provide clear instructions about whether they've mistyped the domain or
the server they are trying to connect to is down. In Python, this is a great
chance to <a href="http://blog.ionelmc.ro/2014/08/03/the-most-underrated-feature-in-python-3/">read up on properly re-raising exceptions</a>!</p>
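<p>One possible shape for this, sketched for Python 3: a small exception hierarchy with a single base class, plus <code>raise ... from</code> so the original error stays attached. The class names, the messages and the <code>fetch</code> wrapper are all hypothetical, and <code>send</code> stands in for whatever actually performs the request:</p>

```python
import socket


class FetchError(Exception):
    """Base class: callers can catch this to handle every failure."""


class DomainNotFound(FetchError):
    """The hostname could not be resolved; the caller may have mistyped it."""


class ServerUnavailable(FetchError):
    """The server could not be reached or did not answer in time."""


def fetch(url, send):
    """Call `send(url)` and translate low-level errors into FetchError subclasses."""
    try:
        return send(url)
    except socket.gaierror as error:
        # `from error` keeps the original exception available as __cause__.
        raise DomainNotFound("Could not find %s. Check the spelling." % url) from error
    except (socket.timeout, ConnectionError) as error:
        raise ServerUnavailable("%s is not responding. Try again later." % url) from error
```

<p>The caller can catch <code>FetchError</code> to handle everything at once, or a subclass to show a specific message, while the chained <code>__cause__</code> preserves the technical detail for the logs.</p>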
<h2>For Further Consideration</h2>
<h3>SSL</h3>
<p>SSL is pretty cool and we should do more of it. The <code>requests</code> library
<a href="http://docs.python-requests.org/en/latest/user/advanced/#ssl-cert-verification">verifies certificates by default</a>.
If you're using a different library or language, be sure to check that your client
is checking that certificates are valid. You don't want someone <a href="http://en.wikipedia.org/wiki/Man-in-the-middle_attack">MITMing</a>
your connection!</p>
<div class="codehilite"><pre><span class="k">try</span><span class="p">:</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s">"https://www.super-sketchy-website.com/"</span><span class="p">,</span>
                            <span class="n">verify</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="k">except</span> <span class="n">requests</span><span class="o">.</span><span class="n">exceptions</span><span class="o">.</span><span class="n">SSLError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"That domain looks super sketchy."</span><span class="p">)</span>
</pre></div>
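<p>If your client is built on the standard library rather than <code>requests</code>, a check along these lines may help. This is only a sketch: <code>make_verifying_context</code> is a hypothetical helper name, and it relies on <code>ssl.create_default_context()</code>, which enables certificate and hostname checks by default.</p>

```python
import ssl


def make_verifying_context():
    """Return an SSL context, refusing to continue if verification is off."""
    context = ssl.create_default_context()
    if context.verify_mode != ssl.CERT_REQUIRED or not context.check_hostname:
        raise RuntimeError("TLS certificate verification is not fully enabled")
    return context
```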
<h3>Internationalized Domain Names</h3>
<p><a href="http://en.wikipedia.org/wiki/Internationalized_domain_name">Internationalized Domain Names</a>
are a thing. Many libraries will handle these by default now, but you probably
want to throw a test case in there that makes sure the <a href="http://unicodesnowmanforyou.com/">snowman</a> works:</p>
<div class="codehilite"><pre><span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="s">u"http://&#10145;.ws/mobify"</span><span class="p">)</span>
</pre></div>
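<p>Under the hood, a Unicode hostname is converted to an ASCII form before it is looked up. A minimal sketch of that conversion, using Python's built-in <code>idna</code> codec (the helper name <code>to_ascii_hostname</code> and the snowman domain are just illustrative choices):</p>

```python
def to_ascii_hostname(hostname):
    """Convert a (possibly Unicode) hostname to its ASCII-compatible form."""
    return hostname.encode("idna").decode("ascii")


snowman = u"\u2603.net"  # U+2603 is the snowman character
ascii_name = to_ascii_hostname(snowman)
print(ascii_name)
```

<p>A test case like this is cheap to add and catches clients that choke on non-ASCII hostnames before a user does.</p>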
<h3>Performance</h3>
<p>Depending on how you've built your client, there are a variety of ways you might
be able to improve its performance:</p>
<ul>
<li>Consider requesting the compressed response content by setting the
header <code>Accept-Encoding: gzip</code>. You'll need to make sure your <a href="http://docs.python-requests.org/en/latest/community/faq/#encoded-data">client can handle
decompressing</a>
the content.</li>
<li>Consider having your client connect through an HTTP proxy
like <a href="http://www.squid-cache.org/">Squid</a> or <a href="https://www.varnish-cache.org/">Varnish</a>.
If you expect to be requesting the same resources again and again, the proxy's
cache may considerably reduce response times for <a href="http://www.mobify.com/blog/beginners-guide-to-http-cache-headers/">cacheable resources</a>.</li>
<li><a href="http://docs.python-requests.org/en/latest/user/advanced/#blocking-or-non-blocking"><code>requests</code> is blocking</a>.
That means that your client will only be able to process one request at a time.
If your system needs to support many concurrent requests, you might consider
going async using libraries like <a href="https://github.com/ross/requests-futures"><code>requests-futures</code></a>
or <a href="https://github.com/kennethreitz/grequests"><code>grequests</code></a>.</li>
</ul>
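<p>To give a feel for the concurrency point, here is a small sketch that fans requests out over a thread pool from the standard library (Python 3). <code>fetch_all</code> and its <code>fetch</code> argument are hypothetical names: <code>fetch</code> is any callable taking a URL, for example a wrapper around <code>session.get</code>.</p>

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL on a thread pool and return {url: result}.

    Exceptions are returned as values so one failure doesn't hide the others.
    """
    urls = list(urls)

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as e:
            return e

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

<p>Threads suit this workload because each worker spends most of its time waiting on the network rather than running Python code.</p>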
<h3>Tooling</h3>
<p>There are a number of tools out there that can help simplify putting this all
together:</p>
<ul>
<li><a href="http://httpbin.org">httpbin.org</a> allows you to quickly test a number of different
HTTP response scenarios. Its <a href="http://httpbin.org/delay/3"><code>delay</code></a> and <a href="http://httpbin.org/drip?numbytes=5&amp;duration=5&amp;code=200"><code>drip</code></a>
endpoints are especially useful for testing weird edge conditions.</li>
<li><a href="https://github.com/gabrielfalcao/HTTPretty">HTTPretty</a> is a mocking library
that can make cranking out unit tests for all these errors relatively simple.</li>
</ul>
<h2>Wrapping it all up</h2>
<p>Wow, there are a lot of ways HTTP requests can fail. TLDR, when making a request:</p>
<ul>
<li>Account for DNS lookup failures</li>
<li>Set a connection and read timeout</li>
<li>Be sure to handle HTTP errors</li>
<li>Check that the response has the content type you expect</li>
<li>Limit the maximum response size</li>
<li>Ensure that private URLs are not requestable</li>
<li>Always. Be. SSLing.</li>
</ul>
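<p>Two of the items above, blocking private addresses and capping the response size, can be sketched with the standard library alone (Python 3). The helper names are hypothetical, and a real client would need to resolve the hostname first and run the check on the resolved address.</p>

```python
import ipaddress


class ResponseTooLarge(Exception):
    """Raised when a response body exceeds the configured limit."""


def is_public_ip(address):
    """Return True only for addresses that are safe to request from a server."""
    ip = ipaddress.ip_address(address)
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast)


def read_capped(stream, max_bytes, chunk_size=8192):
    """Read a file-like object in chunks, giving up once max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ResponseTooLarge("body exceeds %d bytes" % max_bytes)
        chunks.append(chunk)
    return b"".join(chunks)
```

<p>Reading in chunks matters: checking the size only after the whole body has been downloaded would defeat the purpose of the limit.</p>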
<p><strong>Now it's your turn!</strong></p>
<p>Go forth and write fault tolerant services that request data using HTTP!</p>
<blockquote>
<p>Did we miss anything? Let us know in the comments below.</p>
</blockquote></div>How to Use SQLAlchemy Magic to Cut Peak Memory and Server Costs in Half2014-08-15T16:00:00-07:00Ben Lasttag:www.mobify.com,2014-08-15:blog/sqlalchemy-memory-magic/<div><p>We do a <strong>lot</strong> of A/B testing at Mobify, and that means a lot of analysis of
the results. We collect and store data from many websites and need to be able to
run analyses multiple times as we improve and modify our techniques.</p>
<p>Much of this work is done in Python, using <a href="http://www.sqlalchemy.org/">SQLAlchemy</a> for
database access and <a href="http://pandas.pydata.org/pandas-docs/stable/index.html">Pandas</a>
for analysis. This post is about how we took an initial script that used a simple approach
to reading data and reduced the peak memory requirements.</p>
<p>Pandas is a powerful tool, but it has one weakness: all the data being analyzed
(in DataFrames) is held in memory. We have some <strong>very</strong> large data sets, and so
it's important to try to use the smallest amount of memory we can, to
minimize the number of big, expensive EC2 instances that we need to spin up.</p>
<p>The figures in this post are taken from a real data analysis run
spanning eleven days of traffic on one medium-size website. In practice, we
often need to cover a month or more, or very high-traffic sites, so the values
are correspondingly higher: peak memory requirements of many tens of Gb.</p>
<h3>The Original Code</h3>
<p>Here's the key fragment of the data reading code as it was first written, with
comments added:</p>
<div class="codehilite"><pre><span class="n">engine</span> <span class="o">=</span> <span class="n">create_engine</span><span class="p">(</span><span class="n">ENGINE_STRING</span><span class="p">)</span>
<span class="c"># Connect to the PostgreSQL database</span>
<span class="k">with</span> <span class="n">engine</span><span class="o">.</span><span class="n">connect</span><span class="p">()</span> <span class="k">as</span> <span class="n">connection</span><span class="p">:</span>
    <span class="c"># Execute the query against the database</span>
    <span class="n">results</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="c"># Fetch all the results of the query</span>
    <span class="n">fetchall</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="n">fetchall</span><span class="p">()</span>
    <span class="c"># Build a DataFrame with the results</span>
    <span class="n">dataframe</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">fetchall</span><span class="p">)</span>
</pre></div>
<p>To see how much memory is being used at each step, I added some diagnostic code.
I'm using <a href="https://pypi.python.org/pypi/Pympler">Pympler</a> to look at the Python
heap and track what objects exist on it, and
<a href="https://github.com/giampaolo/psutil"><code>psutil</code></a> to get the VM size of the
process. The VM size is <em>the total amount of virtual memory used including all
code, data and shared libraries plus pages that have been swapped out</em> (from the
<em>top</em> manpage). Don't confuse VM size with <em>the actual amount of memory that the
process is using for data</em>. Measuring <strong>that</strong> is a complex task, and a full
discussion of Python memory usage is worth an entire series of blogposts.
However, a working summary is:</p>
<ul>
<li>
<p>There is a <em>Python heap</em> which holds Python objects. The <code>muppy</code> module from
<code>pympler</code> can scan the Python heap and return object counts and sizes.</p>
</li>
<li>
<p>There is a <em>C heap</em> which holds non-Python objects. We can't easily check what
objects are on the C heap, but we can see when it increases in size by looking at
the process VM size.</p>
</li>
<li>
<p>The VM size of the process includes the C heap, which in turn includes the Python heap.</p>
</li>
<li>
<p>The VM size grows when the process uses more memory.</p>
</li>
</ul>
<p>Here's the diagnostic code:</p>
<div class="codehilite"><pre><span class="kn">from</span> <span class="nn">pympler</span> <span class="kn">import</span> <span class="n">summary</span><span class="p">,</span> <span class="n">muppy</span>
<span class="kn">import</span> <span class="nn">psutil</span>

<span class="k">def</span> <span class="nf">get_virtual_memory_usage_kb</span><span class="p">():</span>
    <span class="sd">"""</span>
<span class="sd">    The process's current virtual memory size in Kb, as a float.</span>
<span class="sd">    """</span>
    <span class="c"># memory_info() replaces memory_info_ex(), which newer psutil releases deprecate</span>
    <span class="k">return</span> <span class="nb">float</span><span class="p">(</span><span class="n">psutil</span><span class="o">.</span><span class="n">Process</span><span class="p">()</span><span class="o">.</span><span class="n">memory_info</span><span class="p">()</span><span class="o">.</span><span class="n">vms</span><span class="p">)</span> <span class="o">/</span> <span class="mf">1024.0</span>

<span class="k">def</span> <span class="nf">memory_usage</span><span class="p">(</span><span class="n">where</span><span class="p">):</span>
    <span class="sd">"""</span>
<span class="sd">    Print out a basic summary of memory usage.</span>
<span class="sd">    """</span>
    <span class="n">mem_summary</span> <span class="o">=</span> <span class="n">summary</span><span class="o">.</span><span class="n">summarize</span><span class="p">(</span><span class="n">muppy</span><span class="o">.</span><span class="n">get_objects</span><span class="p">())</span>
    <span class="k">print</span> <span class="s">"Memory summary:"</span><span class="p">,</span> <span class="n">where</span>
    <span class="n">summary</span><span class="o">.</span><span class="n">print_</span><span class="p">(</span><span class="n">mem_summary</span><span class="p">,</span> <span class="n">limit</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
    <span class="k">print</span> <span class="s">"VM: </span><span class="si">%.2f</span><span class="s">Mb"</span> <span class="o">%</span> <span class="p">(</span><span class="n">get_virtual_memory_usage_kb</span><span class="p">()</span> <span class="o">/</span> <span class="mf">1024.0</span><span class="p">)</span>
</pre></div>
<p>When it's called, <code>memory_usage</code> will print out (using <code>summary</code>) the top two
types of objects that are taking up space in the Python heap, together with
their counts and total sizes. It'll also print out the total VM size in Mb.</p>
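<p>If Pympler and <code>psutil</code> aren't available, Python 3's standard library offers a partial substitute. The sketch below uses <code>tracemalloc</code>, which is a different technique from the one in this post: it only tracks memory allocated through Python's own allocator, so it says nothing about the VM size or the rest of the C heap. The <code>traced</code> helper and the <code>report</code> dict are hypothetical names.</p>

```python
import tracemalloc
from contextlib import contextmanager


@contextmanager
def traced(label, report):
    """Record current and peak Python-level allocations (in Mb) for a block."""
    tracemalloc.start()
    try:
        yield
    finally:
        current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        report[label] = {
            "current_mb": current / 2.0 ** 20,
            "peak_mb": peak / 2.0 ** 20,
        }


report = {}
with traced("build list", report):
    data = [bytes(1000) for _ in range(2000)]  # roughly 2Mb of small objects
print(report)
```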
<p>Next, I added calls to <code>memory_usage</code> at various steps in the code:</p>
<div class="codehilite"><pre><span class="n">engine</span> <span class="o">=</span> <span class="n">create_engine</span><span class="p">(</span><span class="n">ENGINE_STRING</span><span class="p">)</span>
<span class="k">with</span> <span class="n">engine</span><span class="o">.</span><span class="n">connect</span><span class="p">()</span> <span class="k">as</span> <span class="n">connection</span><span class="p">:</span>
    <span class="n">memory_usage</span><span class="p">(</span><span class="s">"1 - before executing query"</span><span class="p">)</span>
    <span class="n">results</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="n">memory_usage</span><span class="p">(</span><span class="s">"2 - after query, before fetchall"</span><span class="p">)</span>
    <span class="n">fetched</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="n">fetchall</span><span class="p">()</span>
    <span class="n">memory_usage</span><span class="p">(</span><span class="s">"3 - after fetchall, before creating DataFrame"</span><span class="p">)</span>
    <span class="n">dataframe</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">fetched</span><span class="p">)</span>
    <span class="n">memory_usage</span><span class="p">(</span><span class="s">"4 - after creating DataFrame"</span><span class="p">)</span>
</pre></div>
<p>Running the code gives this output:</p>
<div class="codehilite"><pre>Memory summary: 1 - before executing query
types | # objects | total size
str | 37310 | 7.06 MB
dict | 5066 | 6.39 MB
VM: 444.86Mb
Memory summary: 2 - after query, before fetchall
types | # objects | total size
str | 37351 | 7.07 MB
dict | 5084 | 6.40 MB
VM: 831.24Mb
Memory summary: 3 - after fetchall, before creating DataFrame
types | # objects | total size
&lt;class 'sqlalchemy.engine.result.RowProxy | 554378 | 42.30 MB
str | 37351 | 7.07 MB
dict | 5084 | 6.39 MB
VM: 1782.64Mb
Memory summary: 4 - after creating DataFrame
types | # objects | total size
&lt;class 'sqlalchemy.engine.result.RowProxy | 554378 | 42.30 MB
str | 37351 | 7.07 MB
dict | 5087 | 6.40 MB
VM: 1876.64Mb
</pre></div>
<p>Let's look at each step in more detail.</p>
<ol>
<li>
<p>Before executing the query, the memory footprint is the 'base level' of the
process. The initial VM size of 444.86Mb includes all the code, shared
libraries and so on, and we don't need to worry about reducing it. What we're
interested in is how much this figure grows as each step is executed.</p>
</li>
<li>
<p>After the query is executed, the VM size has increased by 386.38Mb. This is
because by default, SQLAlchemy (via the <code>psycopg2</code> library it uses to talk to
PostgreSQL) executes the query, fetches all the results and buffers
them in memory. We know that this buffering is on the C heap, because the VM
size has increased, but there's no significant change in the top Python
objects.</p>
</li>
<li>
<p>The call to <code>fetchall</code> copies the buffered query results from the C heap into
<code>fetched</code>, a Python list. We don't know if SQLAlchemy/psycopg2 discards the
buffered results once that's done, but we can see that the VM size has
increased by an additional 951.4Mb. We can also see that the Python heap is
holding 554,378 <code>sqlalchemy.engine.result.RowProxy</code> objects - these are the
query results, one object per row.</p>
</li>
<li>
<p>The DataFrame is created, and this pushes up memory by another 94Mb.</p>
</li>
</ol>
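<p>The per-step increases quoted above come straight from the printed VM sizes. As a quick check, this sketch recomputes them; the figures are copied from the output above, and <code>step_increases</code> is just an illustrative helper name.</p>

```python
# VM sizes in Mb, copied from the diagnostic output above.
vm_mb = [
    ("1 - before executing query", 444.86),
    ("2 - after query, before fetchall", 831.24),
    ("3 - after fetchall, before creating DataFrame", 1782.64),
    ("4 - after creating DataFrame", 1876.64),
]


def step_increases(readings):
    """Return (label, growth since the previous reading) for each later step."""
    return [(label, round(size - prev_size, 2))
            for (_, prev_size), (label, size) in zip(readings, readings[1:])]


increases = step_increases(vm_mb)
total_growth = round(vm_mb[-1][1] - vm_mb[0][1], 2)

for label, growth in increases:
    print("%-48s +%.2fMb" % (label, growth))
print("Total growth over the base level: %.2fMb" % total_growth)
```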
<p>Overall, peak memory use is 1431.78Mb (the final VM size of 1876.64 minus the
initial size of 444.86Mb). It's this peak use that matters to us, for an
important reason: <strong>when a process releases memory (for example, when a big
Python structure goes out of scope and is garbage-collected), that doesn't
reduce the amount of virtual memory used</strong>.</p>
<p>That applies at step 4: If you look at the code, you'll see that <code>fetched</code> is
still in scope after the call to <code>DataFrame()</code>, so it's still taking up memory.
However, even if we deleted it at that point, we wouldn't get that virtual
memory back.</p>
<p>If you need to understand exactly how memory works in Python, <a href="http://revista.python.org.ar/2/en/html/memory-fragmentation.html">start here</a>,
or Google for "heap fragmentation" and be prepared to spend a lot of time
reading. Alternatively, just keep this rule in mind: <em>it's far, far better to
avoid allocating memory in the first place, rather than allocate it and then
free it.</em></p>
<h3>First Pass</h3>
<p>There's a very simple first step we can take: eliminate the buffering of the
query results. SQLAlchemy supports <a href="http://docs.sqlalchemy.org/en/rel_0_9/core/connections.html?highlight=stream_results#sqlalchemy.engine.Connection.execution_options"><em>streaming</em> of the results</a>
from some databases (including PostgreSQL) so that the result rows are not
buffered, but fetched as they're needed. Since the data needs to travel over the
network from the database whether it's streamed or not, this doesn't add a huge
overhead, but we'll see that it reduces memory requirements.</p>
<p>To do this, we replace the <code>connection.execute</code> line with:</p>
<div class="codehilite"><pre><span class="n">results</span> <span class="o">=</span> <span class="p">(</span><span class="n">connection</span>
<span class="o">.</span><span class="n">execution_options</span><span class="p">(</span><span class="n">stream_results</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="c"># Added this line</span>
<span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="n">query</span><span class="p">))</span>
</pre></div>
<p>Now we can run the code again and see what the effect is:</p>
<div class="codehilite"><pre>Memory summary: 1 - before executing query
types | # objects | total size
str | 37310 | 7.06 MB
dict | 5066 | 6.39 MB
VM: 444.87Mb
Memory summary: 2 - after query, before fetchall
types | # objects | total size
str | 37354 | 7.07 MB
dict | 5087 | 6.40 MB
VM: 455.76Mb
Memory summary: 3 - after fetchall, before creating DataFrame
types | # objects | total size
&lt;class 'sqlalchemy.engine.result.RowProxy | 554378 | 42.30 MB
str | 37333 | 7.07 MB
dict | 5086 | 6.40 MB
VM: 1760.65Mb
Memory summary: 4 - after creating DataFrame
types | # objects | total size
&lt;class 'sqlalchemy.engine.result.RowProxy | 554378 | 42.30 MB
str | 37333 | 7.07 MB
dict | 5089 | 6.40 MB
VM: 1760.65Mb
</pre></div>
<p>Look at the results at step 2 - where previously memory increased by 386.38Mb,
now we've only used an additional 11Mb. We can see the
<code>sqlalchemy.engine.result.RowProxy</code> objects on the Python heap after steps 3 and
4, as they're returned from <code>fetchall</code>, so we're still paying the cost of
holding them all in memory.</p>
<p>We've reduced peak usage a little to 1315.78Mb: about 92% of the original value.</p>
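<p>The fetch-as-needed idea can be illustrated without PostgreSQL. The sketch below is not what <code>stream_results</code> does internally; it uses the standard DB-API <code>fetchmany</code> call against an in-memory SQLite database (an assumption made so the example runs anywhere) to show rows being pulled in small batches instead of all at once. <code>iter_rows</code> is a hypothetical helper.</p>

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time, fetching them from the cursor in batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        for row in batch:
            yield row


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE hits (id INTEGER, url TEXT)")
connection.executemany(
    "INSERT INTO hits VALUES (?, ?)",
    [(i, "http://www.mobify.com/blog") for i in range(2500)],
)

cursor = connection.execute("SELECT id, url FROM hits ORDER BY id")
row_count = sum(1 for _ in iter_rows(cursor, batch_size=1000))
print(row_count)
```

<p>At no point does the calling code hold more than one batch of rows, which is the property that keeps peak memory down.</p>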
<h3>Second Pass</h3>
<p>There isn't really any need to fetch all the data before building the DataFrame.
Pandas accepts a range of different types as the data source in the
<code>DataFrame()</code> constructor, including any iterable or iterator. The result of the
SQLAlchemy <code>execute</code> call is a <code>ResultProxy</code>, which can be used as an iterator,
so we can avoid fetching the data and pass the results directly to <code>DataFrame()</code>.
This eliminates step 3, and the code now looks like this:</p>
<div class="codehilite"><pre><span class="n">memory_usage</span><span class="p">(</span><span class="s">"1 - before executing query"</span><span class="p">)</span>
<span class="n">results</span> <span class="o">=</span> <span class="p">(</span><span class="n">connection</span>
<span class="o">.</span><span class="n">execution_options</span><span class="p">(</span><span class="n">stream_results</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="n">query</span><span class="p">))</span>
<span class="n">memory_usage</span><span class="p">(</span><span class="s">"2 - after query, before creating DataFrame"</span><span class="p">)</span>
<span class="n">dataframe</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="nb">iter</span><span class="p">(</span><span class="n">results</span><span class="p">))</span> <span class="c"># Pass results as an iterator</span>
<span class="n">dataframe</span><span class="o">.</span><span class="n">columns</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span>
<span class="n">memory_usage</span><span class="p">(</span><span class="s">"4 - after creating DataFrame"</span><span class="p">)</span>
</pre></div>
<p>We need to wrap <code>results</code> in an <code>iter()</code> so that <code>pandas</code> recognises it as
an iterator and extracts the rows to build the DataFrame. We're effectively
streaming straight to <code>pandas.DataFrame()</code>. We set the columns afterwards, from
the columns information in <code>results</code>.</p>
<p>This gives:</p>
<div class="codehilite"><pre>Memory summary: 1 - before executing query
types | # objects | total size
str | 37310 | 7.06 MB
dict | 5066 | 6.39 MB
VM: 445.24Mb
Memory summary: 2 - after query, before creating DataFrame
types | # objects | total size
str | 37354 | 7.07 MB
dict | 5087 | 6.40 MB
VM: 456.36Mb
Memory summary: 4 - after creating DataFrame
types | # objects | total size
str | 37333 | 7.07 MB
dict | 5089 | 6.40 MB
VM: 1483.34Mb
</pre></div>
<p>Look at the Python heap after step 4: there are no <code>sqlalchemy.engine.result.RowProxy</code>
objects. That's because they were each consumed by <code>DataFrame()</code> and then
discarded, as the results were streamed.</p>
<p>We're down to a peak VM increase of 1038.1Mb, 72.5% of the original peak. But that's
still a fairly large amount of memory used for only eleven days of data. We can
tell that most of it is on the C heap, since the largest Python heap occupier is
7Mb of strings. That tells us that Pandas is doing most of its data allocation
on the C heap.</p>
<h3>Third Pass</h3>
<p>In order to explain why this third pass saves memory, I need to take a small
detour into the world of Python string handling.</p>
<p>Python creates new string objects when new strings are created. That's
reasonable, but we need to remember that two strings may have the same <em>value</em>,
but be different <em>objects</em>:</p>
<div class="codehilite"><pre><span class="n">Python</span> <span class="mf">2.7</span><span class="o">.</span><span class="mi">3</span> <span class="p">(</span><span class="n">default</span><span class="p">,</span> <span class="n">Feb</span> <span class="mi">27</span> <span class="mi">2014</span><span class="p">,</span> <span class="mi">19</span><span class="p">:</span><span class="mi">58</span><span class="p">:</span><span class="mi">35</span><span class="p">)</span>
<span class="p">[</span><span class="n">GCC</span> <span class="mf">4.6</span><span class="o">.</span><span class="mi">3</span><span class="p">]</span> <span class="n">on</span> <span class="n">linux2</span>
<span class="n">sys</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">extend</span><span class="p">([</span><span class="s">'/vagrant'</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># We'll create two identical strings</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="o">=</span> <span class="s">'It was the best of times, it was the worst of times. It was Hammer Time.'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s2</span> <span class="o">=</span> <span class="s">'It was the best of times, it was the worst of times. It was Hammer Time.'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># Let's check that these strings are equal</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="o">==</span> <span class="n">s2</span>
<span class="bp">True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># Let's check if they are the same string object</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="ow">is</span> <span class="n">s2</span>
<span class="bp">False</span>
</pre></div>
<p>Strings <code>s1</code> and <code>s2</code> have the same value but are different objects, and each
has a separate copy of its string data. This is the usual behaviour when strings
are created. However, Python can automatically <a href="https://docs.python.org/2/library/functions.html?highlight=intern#intern"><em>intern</em></a>
some strings, such as short identifier-like literals - re-using existing string objects rather than allocating new ones:</p>
<div class="codehilite"><pre><span class="o">&gt;&gt;&gt;</span> <span class="c"># Try this with some short strings</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="o">=</span> <span class="s">'abc'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s2</span> <span class="o">=</span> <span class="s">'abc'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="o">==</span> <span class="n">s2</span>
<span class="bp">True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="ow">is</span> <span class="n">s2</span>
<span class="bp">True</span>
</pre></div>
<p>Interning was originally intended to avoid string duplication for internal
strings like method and class names. In Python 2.7, according to the docs,
<em>...the names used in Python programs are automatically interned, and the
dictionaries used to hold module, class or instance attributes have interned
keys</em>. Python automatically interned <code>s1</code> and <code>s2</code> because they are short
literals that look like names, but we can request that Python intern any string by using <code>intern()</code>:</p>
<div class="codehilite"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="o">=</span> <span class="nb">intern</span><span class="p">(</span><span class="s">"It was the third of September, that day I'll always remember"</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s2</span> <span class="o">=</span> <span class="nb">intern</span><span class="p">(</span><span class="s">"It was the third of September, that day I'll always remember"</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="o">==</span> <span class="n">s2</span>
<span class="bp">True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">s1</span> <span class="ow">is</span> <span class="n">s2</span>
<span class="bp">True</span>
</pre></div>
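<p>In Python 3 the same function lives in the <code>sys</code> module as <code>sys.intern()</code>. A small sketch, in which <code>build_url</code> is a hypothetical helper that builds strings at runtime (the way a database driver would) rather than from literals:</p>

```python
import sys


def build_url(path):
    # Built at runtime, so CPython typically creates a new string object per call.
    return "".join(["http://www.mobify.com/", path])


a = build_url("blog")
b = build_url("blog")

# Interning maps equal values onto a single shared object.
a_interned = sys.intern(a)
b_interned = sys.intern(b)

print(a == b)
print(a_interned is b_interned)
```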
<p>But why is any of this a memory issue? In our processing example, the rows of
result data being returned by SQLAlchemy contain many repeated string <em>values</em>
(such as URLs and user agents), but each one is a different string <em>object</em>
(technically they're Unicode objects, but we can think of them as strings).
When these are passed to Pandas, it stores a copy of the data for each string on
the C heap, and we end up with many copies of the same string value taking up
memory. What we want is a single shared string object for any one
value. For example, if we have many rows that all refer to a URL "http://www.mobify.com/blog",
then we want there to be one string with that value, and all the rows to refer
to it.</p>
<p>This is sometimes called string "folding". Folding does just what we want:
combines strings so that instead of code holding references to multiple string
objects that have the same value, all the references for any given value are to
the <strong>same</strong> string object. Remember that Python strings are <em>immutable</em>, so
it's completely safe to have many different parts of the code holding references
to the same string object: there's no way for the object's value to be changed.</p>
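<p>The core of folding fits in a few lines. This is a minimal sketch using a plain dict rather than <code>intern</code>; the names <code>fold_values</code> and <code>distinct_objects</code> are made up for the example, and the user-agent strings stand in for real query results.</p>

```python
def fold_values(values):
    """Return a list in which equal values are replaced by one shared object."""
    seen = {}
    # setdefault returns the object already stored for an equal key,
    # or stores (and returns) this one if the value is new.
    return [seen.setdefault(v, v) for v in values]


def distinct_objects(values):
    """Count how many separate objects the list refers to."""
    return len(set(id(v) for v in values))


# 1000 equal strings, each built at runtime as a database driver would.
raw = ["".join(["Mozilla/", str(5)]) for _ in range(1000)]
folded = fold_values(raw)

print(distinct_objects(raw), distinct_objects(folded))
```

<p>The folded list has the same values as the raw one, but every entry points at the same string, so the duplicates can be garbage-collected as soon as <code>raw</code> goes away.</p>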
<p>The <code>fold_string</code> function below does the work. Since <code>intern</code> won't accept
<code>unicode</code> objects, and SQLAlchemy will usually return any CHAR-based type as
<code>unicode</code>, <code>fold_string</code> attempts to coerce a unicode object to a string object
with the same value, and calls <code>intern</code> on the result. If the unicode object
can't be coerced, then it's stored in a separate map which does the same job as
<code>intern</code>.</p>
<div class="codehilite"><pre><span class="k">class</span> <span class="nc">StringFolder</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="sd">"""</span>
<span class="sd">    Class that will fold strings. See 'fold_string'.</span>
<span class="sd">    This object may be safely deleted or go out of scope when</span>
<span class="sd">    strings have been folded.</span>
<span class="sd">    """</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">unicode_map</span> <span class="o">=</span> <span class="p">{}</span>

    <span class="k">def</span> <span class="nf">fold_string</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">s</span><span class="p">):</span>
        <span class="sd">"""</span>
<span class="sd">        Given a string (or unicode) parameter s, return a string object</span>
<span class="sd">        that has the same value as s (and may be s). For all objects</span>
<span class="sd">        with a given value, the same object will be returned. For unicode</span>
<span class="sd">        objects that can be coerced to a string with the same value, a</span>
<span class="sd">        string object will be returned.</span>
<span class="sd">        If s is not a string or unicode object, it is returned unchanged.</span>
<span class="sd">        :param s: a string or unicode object.</span>
<span class="sd">        :return: a string or unicode object.</span>
<span class="sd">        """</span>
        <span class="c"># If s is not a string or unicode object, return it unchanged</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="nb">basestring</span><span class="p">):</span>
            <span class="k">return</span> <span class="n">s</span>
        <span class="c"># If s is already a string, then str() has no effect.</span>
        <span class="c"># If s is Unicode, try and encode as a string and use intern.</span>
        <span class="c"># If s is Unicode and can't be encoded as a string, this try</span>
        <span class="c"># will raise a UnicodeEncodeError.</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="k">return</span> <span class="nb">intern</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">s</span><span class="p">))</span>
        <span class="k">except</span> <span class="ne">UnicodeEncodeError</span><span class="p">:</span>
            <span class="c"># Fall through and handle s as Unicode</span>
            <span class="k">pass</span>
        <span class="c"># Look up the unicode value in the map and return</span>
        <span class="c"># the object from the map. If there is no matching entry,</span>
        <span class="c"># store this unicode object in the map and return it.</span>
        <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">unicode_map</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">t</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="c"># Put s in the map</span>
            <span class="n">t</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">unicode_map</span><span class="p">[</span><span class="n">s</span><span class="p">]</span> <span class="o">=</span> <span class="n">s</span>
        <span class="k">return</span> <span class="n">t</span>
</pre></div>
<p>Finally, we use <code>fold_string</code> within a generator function that will take each
row of the data, fold all the strings and yield tuples that Pandas can use to
build the DataFrame, one tuple for each row. Generators are very useful for
processing rows of data from SQLAlchemy, since a generator is itself an
iterator, and can be used anywhere that you might want to iterate over the data.</p>
<div class="codehilite"><pre><span class="k">def</span> <span class="nf">string_folding_wrapper</span><span class="p">(</span><span class="n">results</span><span class="p">):</span>
    <span class="sd">"""</span>
<span class="sd">    This generator yields rows from the results as tuples,</span>
<span class="sd">    with all string values folded.</span>
<span class="sd">    """</span>
    <span class="c"># Get the list of keys so that we build tuples with all</span>
    <span class="c"># the values in key order.</span>
    <span class="n">keys</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span>
    <span class="n">folder</span> <span class="o">=</span> <span class="n">StringFolder</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">results</span><span class="p">:</span>
        <span class="k">yield</span> <span class="nb">tuple</span><span class="p">(</span>
            <span class="n">folder</span><span class="o">.</span><span class="n">fold_string</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="n">key</span><span class="p">])</span>
            <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">keys</span>
        <span class="p">)</span>
</pre></div>
<p>The query code changes just one line, to:</p>
<div class="codehilite"><pre><span class="n">dataframe</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">string_folding_wrapper</span><span class="p">(</span><span class="n">results</span><span class="p">))</span>
</pre></div>
<p>Running this gives:</p>
<div class="codehilite"><pre>Memory summary: 1 - before executing query
types | # objects | total size
str | 37314 | 7.06 MB
dict | 5066 | 6.39 MB
VM: 444.84Mb
Memory summary: 2 - after query, before creating DataFrame
types | # objects | total size
str | 37358 | 7.07 MB
dict | 5087 | 6.40 MB
VM: 456.61Mb
Memory summary: 4 - after creating DataFrame
types | # objects | total size
str | 37337 | 7.07 MB
dict | 5089 | 6.40 MB
VM: 901.54Mb
</pre></div>
<p>Now we're down to a peak of just 901.54Mb - a fraction under 39% of the original peak. We've saved 1419.94Mb!</p>
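<p>To see why folding helps, here is a minimal Python 3 sketch of the same idea, separate from the <code>StringFolder</code> class above. The <code>fold_rows</code> helper and the sample rows are hypothetical, made up for illustration; it only shows that equal strings end up sharing a single object once they pass through the generator.</p>

```python
def fold_rows(rows):
    """Yield each row as a tuple, reusing one object per distinct string."""
    seen = {}  # maps a string value to the first object seen with that value
    for row in rows:
        yield tuple(
            seen.setdefault(value, value) if isinstance(value, str) else value
            for value in row
        )


# Hypothetical sample data: build equal-but-separate string objects at
# runtime, roughly the way a database driver hands back repeated values.
raw_rows = [("".join(["pro", "duct"]), n) for n in range(3)]
folded = list(fold_rows(raw_rows))

# Count how many distinct string objects each version keeps alive.
distinct_before = len({id(row[0]) for row in raw_rows})
distinct_after = len({id(row[0]) for row in folded})
print("string objects before folding:", distinct_before)
print("string objects after folding:", distinct_after)
```

<p>The values are unchanged; only the number of string objects held in memory drops, which is where the saving comes from when a column repeats the same few values across many rows.</p>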
<h3>Conclusion</h3>
<p>These techniques let us change from 60Gb instances to instances with 30Gb or
even less, saving us at least 50% in EC2 costs. As we expand to collect and
analyze more data, those savings will also increase.</p>
<p>You can use the same techniques in your own code, but here are some key points
to take away:</p>
<ol>
<li>
<p>Don't allocate memory if you can sensibly avoid it. Here the key word is
<em>sensibly</em>: as Donald Knuth said, <em>premature optimization is the root of all
evil</em>. In this case, though, we want to avoid peak memory scaling wildly with
data volume, and the optimization directly affects our processing costs, so
it's well worth doing.</p>
</li>
<li>
<p>If you can process the results of database queries iteratively (and very
often you can), stream the results, assuming you're using a library that
supports it. SQLAlchemy will do that for PostgreSQL with the <code>stream_results</code>
execution option, and <a href="https://github.com/farcepest/MySQLdb1">MySQLdb</a> supports
it using <code>SSCursor</code> (server-side cursors).</p>
</li>
<li>
<p>If you're iterating over data that contains many repeated strings <strong>that are
kept in memory</strong>, consider using a wrapping generator to fold them.</p>
</li>
</ol></div>DevOps 101: Best Practices for Optimizing and Automating Your Infrastructure2013-11-07T10:00:00-08:00Kyle Youngtag:www.mobify.com,2013-11-07:blog/devops-101-best-practices/<div><p><a href="http://www.mobify.com/blog/devops-101-best-practices/">
<img src="http://www.mobify.com/static/blog/2013/11/devops.png" alt="DevOps 101: Best Practices for Optimizing and Automating Your Infrastructure">
</a></p>
<p>Whether you're coming across the term for the first time, or you've been listing it on your LinkedIn profile for a couple of years now, it's likely that you are still faced with the question: What exactly do we mean by&#160;DevOps?</p>
<p>Whatever your situation, the good news is that if you like automating things and want to see what's going on with those tech hipsters and their fancy conference T-Shirts, you've come to the right place to learn&#160;more!</p>
<p>In this post I'll attempt to outline some of the common themes across the best environments practising DevOps, and communicate some of the top DevOps practices you can bring to your own dev&#160;environment.</p>
<h3>DevOps&#160;Defined</h3>
<p>As a term, 'DevOps' is only slightly less ambiguous than the word 'Cloud'. When pressed to give a concise definition, the best I can come up with is that <strong>DevOps is the marrying of process, infrastructure, and&#160;product</strong>.</p>
<p><em>Actually that's a terrible definition, but it sounds nice when you say it, so I'll try to do better as we go&#160;along.</em></p>
<p>Put another way, DevOps is basically the cool-kids way of stringing stuff together with shell scripts. It exists because we can now wrap things in shell scripts that we used to only dream of; the world is now programmable at a much larger scale, and we have many new tools and techniques for taking advantage of&#160;it.</p>
<p>So, why is that important? With this newfound power of continuous automation and integration, the exciting promise of DevOps is that it can let a small team of developers multiply their effectiveness and compete with much larger teams who find themselves more encumbered with traditional&#160;processes.</p>
<p>And that, in and of itself, has extraordinary&#160;value.</p>
<h3>1. The DevOps Philosophy: Use Things You Can Program, and Program the Things You&#160;Use</h3>
<p>Once upon a time, expanding capacity meant buying new servers, racking these servers, potentially reconfiguring the networking to accommodate the new servers, and then, if you were ahead of the curve, installing an image onto these servers and making it fit into your existing&#160;fleet.</p>
<p>For a surprising number of people, these are still the realities of life. Needless to say, all of this meant that to be passably efficient, you had to do all of this well, without disrupting your existing&#160;systems.</p>
<p>Of course, the simplest system to manage and maintain is the one you don't, and if you're running in an agile environment, this is easily one of your biggest&#160;advantages.</p>
<p>With the emergence of the "as-a-Service" family (can we call them aaSes? are we there yet?), you can choose the tradeoff between the level of customization you need and the time and capacity you have for this kind of system&#160;management.</p>
<p>This of course brings us to one of my favourite things to have come out of the DevOps movement: We're now at the point where we need to seriously talk about competitive advantage and opportunity&#160;cost.</p>
<p>The core reasoning comes down to this: You're in a business that hopefully does something more efficiently than everyone else, or at least better than most. That is the core of your business&#160;model.</p>
<p>Mobify, for example, is amazing at adapting web experiences for mobile devices; that's what we focus on. The reason we have clients and customers is that they've realized that they're better at what they're doing than they would be at what we&#160;do.</p>
<p>Odds are you are not better at cleaning your office than you are at carrying out your core business model, so you hire cleaning services&#160;&#8212; economists will rightfully object that there are nuances to relative opportunity costs, but I'm simplifying here for&#160;illustration.</p>
<p>Likewise, you are probably less efficient at managing the complex and demanding tasks that go along with network and datacenter maintenance than a typical IaaS&#160;provider.</p>
<p>What this means for those of us who are selling something other than our incredible ability to rack and provision servers is that there is some logical trade-off where it makes more sense for us to leverage the services of those who are best at&#160;it.</p>
<p>LAMP-style stacks are boring, but Heroku solved that problem for all of us. Email has been a nightmare to configure for as long as I can remember, but Google was kind enough to free us from that drudgery. Moving servers around in data-centres at 2AM is some kind of Sisyphean punishment for abhorrent behaviour and offence given to the gods of Olympus, and I for one am happy to pass that off to the Rackspaces and&#160;Amazons.</p>
<p>What's more important is that all of these services come with APIs, and we can write those small shell scripts to bring up new capacity at the click of a button, whenever we want&#160;it.</p>
<p>The advantages here go well beyond rapid scaling, though that's nice too. We're now at a point where we deal with problems by burning things to the ground and rebuilding. It doesn't matter if there's a weird behaviour on server X-761331-b7, just destroy it and rebuild with the configuration manager of your&#160;choice.</p>
<p>"Ah", you say, "you could certainly move everything over, but it would cost you a fortune!". For some of you that's certainly true, but for most I would argue that the time you're saving more than makes up for the&#160;cost.</p>
<p>If your business model is truly competitive, then hopefully you have a way to convert your time to money; even if you're not on the core product or service team, I'd wager that your time could likely better be spent automating and tooling something new for&#160;them.</p>
<p>Netflix is a great example of a team of mad DevOps gurus leveraging Amazon to the hilt, and then going one step further and sharing a bunch of their tools on <a href="http://github.com/Netflix">GitHub</a>.</p>
<p>Or maybe you're thinking: "Ah, but my servers are special. I do [insert whatever voodoo you&#8217;ve got going], and I couldn't possibly do that on a rented VM". I understand. Your boxes are special, and I don't understand your problem&#160;domain.</p>
<p>That being said, the menagerie of infrastructure available is growing in diversity all the time. I would really encourage you to take another&#160;look.</p>
<p><em>Best practice takeaway #1: Rent wherever you can and get on with your business. They do it better, and ain't nobody got time for&#160;that.</em></p>
<h3>2. Backups, Version Control, and Clusters, oh&#160;my...</h3>
<p>Alright, so you're hopefully convinced that having an infrastructure that you can program is superior to whatever else you're doing. Now we can start talking about all of the wonderful things you get for free because of&#160;it.</p>
<p>Let's start with backups. If you're not making backups already, you're either a purely functional programmer and knight of the lambda calculus, or you know your shame and are only reading this to avoid confronting the horror of the imminent catastrophe in your&#160;future.</p>
<p>Servers can now be expressed as configuration files, which means they can go into version control, which is a value proposition I hope you can&#160;appreciate.</p>
<p>This means new servers can be programmed to come online with preconfigured state, as defined from a central, version-controlled authority; they can download whatever state you have in the rest of the cluster and hit the ground running. Beyond that, we're now in a world where the cost of having standby capacity is a&#160;non-issue.</p>
<p>You want to up the number of servers behind your load-balancer? Do it. There's no requisitioning process. If you're on Amazon, let them do it for you. For once, you can sleep through the&#160;emergency.</p>
<p>What's better is that now you can architect&#160;greedily.</p>
<p>By way of example, when you were young and naive, you probably had a hard drive somewhere and experienced the joy of losing all your data when the thing failed. Then you learned, and made a backup, swearing that would never happen again. If you were a keener, maybe you had a job that would run those backups for you automatically. Then you learned about RAID. RAID was cool, wasn't it? Until they were all&#160;stolen.</p>
<p>Eventually, a stable solution would have to look like a storage cluster with geographically separated nodes, much like what many corporate environments run on. A node failure can now be routine; in fact, on many systems, you can have the servers tell the vendor that a component failed and they'll send a technician to replace it without you having to do a thing.&#160;Neato.</p>
<p>Guess what? You can do the same thing with your&#160;services.</p>
<p><a href="http://awsofa.info/">One of the most beautiful network diagrams</a> I've ever seen was the one put out by the Obama re-election campaign. That was a paragon of DevOps if I've ever seen one. A key takeaway was the clustered nature of the system, and this is a pattern seen in other pioneers of modern architecture (again, think&#160;Netflix).</p>
<p>Even in smaller environments where you only have a handful of servers, looking at your architecture from a cluster perspective will give you plenty more flexibility than you would otherwise have, and will seriously redefine what your backup policies will have to look&#160;like.</p>
<p>If any of this is new to you, you're probably wondering at the necessary level of complication introduced by all of these changes, perhaps even thinking that it might outweigh the benefits. True, you'd have to learn something new, but like with most things, the benefits outweigh the pain (the obvious exception being Esperanto; no one ever benefitted from knowing&#160;that).</p>
<p><em>Best practice takeaway #2: Treat your server configuration like developers treat code. Clusters are the new black. Rapid scaling isn't just about service&#160;spikes.</em></p>
<h3>3. Control Your&#160;Environment</h3>
<p>Speaking of configuration management, let's have a quick word about&#160;that.</p>
<p>This is a vein of DevOps that can be traced back to shortly after the beginning of the Unix Epoch, when we had shell scripts to take a fresh install and move it into something usable. People have been doing this for ages, often with tools they wrote themselves or were maintained within the company or&#160;university.</p>
<p>Things have advanced. Though there are a handful of options to choose from, tools like <a href="http://puppetlabs.com/">Puppet</a> and <a href="http://www.opscode.com/chef/">Chef</a> are rapidly becoming integrated into platform and infrastructure vendors, and it would certainly be to your advantage to look into&#160;them.</p>
<p>There's a second strand, which we'll call the school of the Golden Image, which favours clones of a single, well-maintained system image. This branch also has a long history, but with the advent of virtual machines has pretty much exploded, making Star Wars references obligatory in nearly any DevOps&#160;talk.</p>
<p>Generally, people look down on Golden Images, as the approach usually results in a terrible mutant horde intent on unleashing the demons of entropy on their creator, or to be more precise, the state of the machine tends to change over time and become an unmanageable&#160;mess.</p>
<p>That being said, I've seen it pulled off successfully, and if that's what you have to do to enjoy rapid scaling, it's better than abusing your poor systems team with caffeine and sleep deprivation. Simply scripting the components that need to be customized per-environment is often&#160;enough.</p>
<p>What's important is that you have some automated way of rapidly bringing up more of the systems you need, and of making sure that they're configured appropriately. Plenty of teams have had to create strange and unusual processes to manage this in the past, but now is a great time to take a second look at some of those old systems, and then do the right thing and kill them with&#160;fire.</p>
<p><em>Best practice takeaway #3: Configuration management is cool, you should use it, but if you're stuck with a Golden Image model, we can still be&#160;friends.</em></p>
<h3>4. Oh Right, the&#160;Developers</h3>
<p>So yes, systems are fun, but that's only half the story, and this would really just be an operations story if it weren't for the tools you need to start thinking about for your&#160;developers.</p>
<p>Going along with our theme of replacing process and procedure with shell scripts, let&#8217;s take a quick look at automated testing, continuous integration, and continuous deployment. These are among the practices at the heart of modern agile development, and are really the reason we have DevOps to begin&#160;with.</p>
<p>Testing is something we all do, right? I'll just pretend you said yes, because now you can do things like have unit tests run as part of your version control system (pro tip: GitHub makes this super&#160;easy).</p>
<p>Automated test runs are a huge boon to the workflow, and can be used with a continuous integration suite to push out changes to your staging and production environments. Aside from catching fat-fingered errors in your code commits, your developers can take a huge load off of your QA department, and fold a lot of those responsibilities back into their&#160;process.</p>
<p>A test suite or series of test suites, combined with tools like Fabric, lets you start to automate and wrap a lot of your process into commands that can do most of the heavy lifting for you, like some kind of steroid-infused version of bash&#160;aliases.</p>
<p>Having operations-interfacing tools as part of the development process also lets you tie any other sort of big-picture automation into the build. If you have a web component, maybe you want to run a cache purge. Or maybe your build should require a database backup before running, or send some kind of notification. For the sheer novelty of it, maybe you've connected an Arduino up to put on some kind of Rube-Goldberg display when certain parameters are&#160;met.</p>
<p>Let your inner-geek shine. The point is that you code your process and implement it, whatever that looks&#160;like.</p>
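<p>By way of a rough sketch, here is what "coding your process" might look like in plain Python. Everything in it is an assumption for illustration: the step names are made up, the step functions are stand-ins that would really shell out to your own tooling, and it deliberately avoids Fabric's API so as not to misrepresent it.</p>

```python
def run_pipeline(steps):
    """Run each (name, action) step in order; stop at the first failure."""
    completed = []
    for name, action in steps:
        if not action():
            print("step failed, stopping:", name)
            return completed
        completed.append(name)
    return completed


# Hypothetical stand-ins: real steps would call your test runner,
# backup job, deploy script, and CDN purge, returning True on success.
def run_tests():
    return True


def backup_database():
    return True


def deploy():
    return True


def purge_cache():
    return True


steps = [
    ("tests", run_tests),
    ("backup", backup_database),
    ("deploy", deploy),
    ("cache purge", purge_cache),
]
done = run_pipeline(steps)
print("completed:", ", ".join(done))
```

<p>The ordering is the point: nothing ships unless the tests pass and the backup succeeds, and nobody has to remember that at 2AM.</p>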
<p><em>Best practice takeaway #4: You've heard this before&#160;&#8212; automated testing, continuous integration, and continuous deployment are at the heart of DevOps. Start&#160;here.</em></p>
<h3>5. Huff and Puff, and Blow the House&#160;Down</h3>
<p>Of course, what would be the point of all of these lovely tools, redundancies, and procedures if you didn't have confidence in&#160;them?</p>
<p>It's important that everyone involved trust the tools to do exactly what they're supposed to do, so that when push comes to shove, you're not afraid to pull the switch. This is where fire drills come&#160;in.</p>
<p><strong>You wouldn't ship your code without testing, (ahem), and you shouldn't build out your systems without first making sure they'll perform under their edge cases as&#160;planned.</strong></p>
<p>This one is of course a harder sell to whoever it is that conducts your performance reviews; wanting to pull the rug out from under your main systems, cry havoc, and let loose the chaos monkeys of war sounds like something for which you press charges rather than&#160;applaud.</p>
<p>However, these are systems designed to fail and recover, and if they can't do that then trying to pretend that it's not an issue is only compounding the&#160;problem.</p>
<p>This is not the Tao of DevOps. Be brave. Pull the plug. sudo rm -rf /*. See what happens. Blame the intern. Then, like a majestic phoenix, watch your system rise from the ashes and restore itself to all of its former glory and&#160;smile.</p>
<p><strong>Simulating failure, or causing it, is the best way to identify brittle components and to purge&#160;them.</strong></p>
<p>Do the same with security. Plenty of us have had the joy of working with pen-testers&#160;&#8212; those beautiful deviants who get paid to do unspeakable things to your precious servers. Even if you aren't prepared to budget for the cost of one of these digital dominatrices, you have an idea of what would happen if they had their way with you, and you can simulate that easily enough.
What's nice about these tests is that you get the chance to channel all of that malevolent angst you've built up against your uncooperative systems and imagine what terrible evils you could inflict. It's therapy and responsible administration all in&#160;one.</p>
<p><em>Best practice takeaway #5: Plan for failure, then fail, and get back up again. Repeat. Whip it&#160;good.</em></p>
<h3>Kind Words of&#160;Parting</h3>
<p>Again, I need to stress that despite O'Reilly's remarkable turnaround time on publications, DevOps is not yet a practice with a dogma, but rather an emerging and exciting collection of practices being embraced by some of the smartest and most capable tech shops&#160;around.</p>
<p><strong>The fundamental spirit is one of flexibility, agility, and&#160;automation.</strong></p>
<p>There are definitely practical ways to bring these tools into your current environment that don't require some kind of intervening&#160;cataclysm.</p>
<p>The old adage of automating anything you've had to do more than twice can start you in the right direction; just keep in mind that there's an end goal and nudge it all towards an integrated&#160;framework.</p>
<p>If there's nothing in the shop approaching automated testing, consider starting at the top with integration tests and slowly pushing the code back down the ladder. Maybe buy the developers drinks to lull them into a false sense of confidence, then expense them as cost of&#160;infrastructure.</p>
<p><em>Such are the burdens of DevOps. Now go forth and make&#160;gizmos!</em></p></div>5 Advanced Mobile Web Design Techniques You’ve Probably Never Seen Before2013-10-16T10:00:00-07:00David Faytag:www.mobify.com,2013-10-16:blog/5-advanced-mobile-web-design-techniques-style-com/<div><p><a href="http://www.mobify.com/blog/5-advanced-mobile-web-design-techniques-style-com/">
<img src="http://www.mobify.com/static/blog/2013/10/mobile-web-design-techniques.png" alt="5 Advanced Mobile Web Design Techniques You&#8217;ve Probably Never Seen Before">
</a></p>
<p>Until recently, creating mobile web designs that look and feel like native apps
has pretty much been an impossible dream. There are plenty of
<a href="http://www.mobify.com/blog/beginners-guide-to-perceived-performance/">creative workarounds</a> to try
and bring that native &#8216;feel&#8217; to mobile web browsing, but so far we've struggled
to bridge the gap between native and the&#160;web.</p>
<p>However, a slew of new, high-powered smartphones is allowing designers to
finally unleash complex, performant, native-feeling smartphone UI patterns&#160;&#8212;
designed and built for the web. These patterns are blurring the lines between
native apps and the&#160;web.</p>
<p>Take <a href="http://www.style.com">Style.com</a>&#160;&#8212; their new adaptive website is an
interesting example of how to provide a great user experience at different screen-sizes,
<em>while also targeting different device capabilities</em>. This has led to some truly
advanced mobile designs that work extremely well on recent&#160;devices.</p>
<p>In this post, you&#8217;ll find five awesome design patterns from Style.com that
you&#8217;ve likely never seen before while browsing the web on a&#160;smartphone.</p>
<h3>1. A New Kind of Calendar</h3>
<p>Web-based calendars on mobile devices have always been a pretty terrible
experience. This is why most designers avoid calendars like the plague, opting
instead for a different design pattern like a simple list of&#160;dates.</p>
<p>However, by getting a little creative with carousels and some JavaScript magic,
you can create very usable calendars for users on&#160;mobile.</p>
<figure class="image-centered">
<img style="max-width:300px; border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/calendar10.gif" alt="Double carousel calendars">
</figure>
<p>On Style.com, the fashion show calendar was created by binding two carousels
together to help users easily browse through dates and events. That's right:
double&#160;carousels.</p>
<h3>2. Dynamic Content in Off-Canvas Flyouts</h3>
<p>Off-canvas flyouts are areas of the page that live out of the viewport until a
user taps or swipes the appropriate area. They have become a primary navigation
pattern for mobile in both native and web apps, and you can even find them on a
handful of desktop websites&#160;too.</p>
<p>Most web pages use off-canvas flyouts to simply hide menus and other static
content, but it&#8217;s possible to use them to display a whole host of other dynamic
content&#160;too.</p>
<p>On Style.com, both static and dynamic content are placed off-canvas. Primary
navigation is hidden within an off-canvas menu on the right-hand side, while on
the left-hand side of the header a secondary off-canvas flyout reveals a user&#8217;s
recent history on the&#160;site.</p>
<div class="grid break-480 before-after-showcase">
<div class="span4">
<img style="border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/offcanvas3.png" alt="A user's history on style.com">
<h5>
A user's history on Style.com
</h5>
</div>
<div class="span4">
<img style="border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/offcanvas2.png" alt="style.com with no menus open">
<h5>
Off-canvas flyouts closed
</h5>
</div>
<div class="span4">
<img style="border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/offcanvas1.png" alt="static navigation menu">
<h5>
Static navigation menu
</h5>
</div>
</div>
<p>The recent history includes pages, as well as individual slides from fashion
shows. Seeing as Style.com is a fairly
<a href="http://www.mobify.com/blog/mobile-web-design-for-content-heavy-websites/">content-heavy website</a>,
it&#8217;s a great feature for users who want to review their favourite content after
browsing through the&#160;site.</p>
<p>This technique uses the WebKit
<a href="http://www.mobify.com/blog/smartphone-localstorage-outperforms-browser-cache/">localStorage</a> paired
with the off-canvas system Mobify developed in-house, called
<a href="http://mobify.github.io/pikabu/">Pikabu</a>. There are a bunch of off-canvas libraries
available; you can also check out <a href="http://twitter.com/dbushell">David Bushell</a>'s
<a href="http://github.com/dbushell/Responsive-Off-Canvas-Menu">Responsive Off-Canvas Menu</a>,
and <a href="http://christopheryee.ca/">Christopher Yee</a>'s <a href="http://christopheryee.ca/pushy/">Pushy</a>.</p>
<h3>3. Pinch to Zoom in Galleries</h3>
<p>Pinching to zoom text is generally regarded as the sign of poorly formatted
content&#160;&#8212; but images are a different case entirely.</p>
<p>Just like with native maps and images in apps, users frequently want to zoom
into pictures on the web to see more detail.</p>
<p>Style.com has progressively enhanced their many image galleries so that users
can zoom into any slide and view the image in finer detail. However, they&#8217;ve
done so in a way that does not zoom into the viewport&#160;&#8212; just the
container that the image is&#160;in.</p>
<div class="grid break-480 before-after-showcase">
<div class="span6">
<img style="border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/pinch1.png" alt="style.com inline pinch to zoom for images">
</div>
<div class="span6">
<img style="border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/pinch2.png" alt="style.com zoomed in images">
</div>
</div>
<p>Since this feature requires a fair amount of processing power, it has only been
enabled for users with Retina iOS smartphones. Remember, it's important to
treat performance as a design feature, so make sure that complex features are
only turned on for devices that can support them in a performant&#160;way!</p>
<p>At Mobify, we use a device's pixel density (among other factors) to estimate how
powerful the device is. This allows us to ensure that we provide an experience
that works wonderfully on devices that can handle it&#160;&#8212; without worrying
about breaking the experience on older&#160;devices.</p>
<p>To replicate this viewport-constrained, pinch to zoom image functionality, check
out the popular <a href="http://eightmedia.github.io/hammer.js/">hammer.js</a>&#160;library.</p>
<h3>4. Huge Image Carousels</h3>
<p>There are two main reasons why large image carousels are a pain to implement on
mobile: performance and&#160;navigation.</p>
<p>But if you overcome both of these challenges, you can create a very native-feeling
image browsing experience to really take advantage of all those wonderful high
DPI screens out&#160;there.</p>
<h4>Challenge #1: Performance</h4>
<p>The first challenge is a performance one: devices are not usually powerful enough
to render many objects in a&#160;row.</p>
<p>Imagine an image that is as big as a device&#8217;s screen (or twice as big if it&#8217;s a
Retina device). Now, since it&#8217;s an image carousel, imagine a few dozen or even
hundreds of those kinds of images next to each other in a row. How wide will
it&#160;be?</p>
<p>On Style.com, one particular image carousel came in at more than 80,000px
wide.&#160;Yikes.</p>
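<p>As a back-of-the-envelope check in Python, assuming a 320px-wide screen at a 2x pixel ratio (both figures are assumptions, not measurements from Style.com), it only takes 125 full-screen slides to reach that width:</p>

```python
def carousel_width_px(slide_count, screen_width_px=320, pixel_ratio=2):
    """Total width of a single-row carousel of full-screen slides."""
    return slide_count * screen_width_px * pixel_ratio


# Assumed numbers: 125 slides, each 320px wide at a 2x pixel ratio.
width = carousel_width_px(125)
print(width)
```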
<p>Most mobile browsers crash just trying to render that many elements on a page&#160;&#8212;
even if they are just empty placeholders with no media&#160;content.</p>
<p>To get around this, Style.com has optimized image carousels so that inactive
slides take no space on the page. Using the DOM rewriting properties of
<a href="http://www.mobify.com/mobifyjs/">Mobify.js</a>, images are requested on-demand only, and thus the
amount of CPU resources required for their rendering is significantly&#160;lower.</p>
<p>In some cases this made the mobile gallery up to 10x smaller (from 6.2MB down
to 650kB on one&#160;page).</p>
<h4>Challenge #2: Navigation</h4>
<p>The second challenge was around navigation. If you have all of these beautiful
images in a long row, how can you quickly move between&#160;them?</p>
<figure class="image-centered">
<a href="http://www.mobify.com/static/blog/2013/10/gallery.jpg">
<img src="http://www.mobify.com/static/blog/2013/10/gallery.jpg" alt="Native alphabet scroller vs. Style.com alphabet scroller">
</a>
</figure>
<p>The answer lay in creating a grid view from the carousel HTML. This can be done
relatively easily by changing the image sources and&#160;CSS.</p>
<h3>5. Native-Like Alphabetical Lists</h3>
<p>An alphabetical index is a great way to help users scroll through long lists of
items. Apple provides one to help users select contacts and music in iOS, but
it&#8217;s proven to be pretty difficult to bring this functionality to the web in a
way that works as well as its native&#160;counterparts.</p>
<div class="grid break-480 before-after-showcase">
<div class="span6">
<img style="border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/native-list.png" alt="Native alphabet scroller">
<h5>
Native alphabet scroller
</h5>
</div>
<div class="span6">
<img style="border: 1px solid #505353" src="http://www.mobify.com/static/blog/2013/10/style-list.png" alt="Style.com alphabet scroller">
<h5>
Style.com alphabet scroller
</h5>
</div>
</div>
<p>On Style.com, this design technique has been achieved using a double carousel (just
like the calendar!) and <a href="http://github.com/cubiq/iscroll">iscroll.js</a>. And, with a
new spin on it, the alphabetical list only exists within the list's container,
so other content on the page isn&#8217;t&#160;affected!</p>
<h3>Conclusion</h3>
<p>Until recently, the difference between browsing the web and using a native app
has been clear to anyone who uses a&#160;smartphone.</p>
<p>But as devices become more powerful, and responsive and adaptive techniques
become more sophisticated, it's increasingly possible to blur the boundaries
between native and web. Style.com is one such example of how you can use
adaptive techniques to create some really interesting features that otherwise
wouldn't have been possible to bring to users on&#160;mobile.</p>
<p>So with more and more people progressively enhancing the web to take advantage
of the capabilities of modern devices in this manner, it's likely that we&#8217;re
about to see an explosion of truly advanced and noteworthy mobile web
designs.&#160;Exciting!</p>
<p><em>The techniques in this post were developed by the Style.com team and Anton
Bielousov, one of Mobify&#8217;s engineers. If you have any questions about them,
feel free to leave a comment or reach out to Anton
directly&#160;<a href="http://www.twitter.com/bielousov">@bielousov</a>.</em></p>
<p><em>And if you've recently used a mobile or responsive website featuring some cool
design patterns that felt completely native, let me know in the&#160;comments!</em></p></div>12 Must-Read RWD Resources (September Digest)2013-10-02T11:00:00-07:00Mike Abasovtag:www.mobify.com,2013-10-02:blog/rwd-links-september-2013/<div><p><a href="http://www.mobify.com/blog/rwd-links-september-2013/">
<img src="http://www.mobify.com/static/blog/2013/10/september-digest.png" alt="12 Must-Read RWD Resources (September Digest)">
</a></p>
<p>Here at Mobify, we spend a fair share of our time collecting and sharing the most interesting resources on how to create amazing responsive web experiences. Hell, you might even call it a passion of&#160;ours!</p>
<p>In this post, I wanted to highlight some of the best RWD stories that got us excited this past month.</p>
<p><em>(If I missed any cool ones, let me know in the comments. Or if you've written something that you think should be included, post it as&#160;well!)</em></p>
<p>Let's dive&#160;in!</p>
<hr>
<h4><a href="http://bradfrostweb.com/blog/post/7-habits-of-highly-effective-media-queries/">1. Seven Habits of Highly Effective Media&#160;Queries</a></h4>
<p><a href="http://twitter.com/brad_frost">Brad Frost</a> shared some considerations for crafting high-quality media queries. <a href="http://bradfrostweb.com/blog/post/7-habits-of-highly-effective-media-queries/">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://webdesign.tutsplus.com/tutorials/htmlcss-tutorials/build-a-freshly-squeezed-responsive-grid-system/">2. Build a Freshly Squeezed Responsive Grid&#160;System</a></h4>
<p><a href="http://twitter.com/joericho">Joe Richardson</a> shares how he built his responsive grid system, so that you can build your own. <a href="http://webdesign.tutsplus.com/tutorials/htmlcss-tutorials/build-a-freshly-squeezed-responsive-grid-system/">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://gist.github.com/phamann/5844442">3. Lesson's learnt building the&#160;Guardian</a></h4>
<p><a href="http://twitter.com/patrickhamann">Patrick Hamann</a> asked Guardian engineers about their experience "starting from scratch and building a mobile-first next generation web platform". <a href="http://gist.github.com/phamann/5844442">Read this article&#160;&#187;</a></p>
<p><em>Related: Luke Wroblewski shares notes from <a href="http://www.lukew.com/ff/entry.asp?1792">Andy Hume's Smashing Conf talk</a> about redesigning The&#160;Guardian.</em></p>
<hr>
<h4><a href="http://www.dtelepathy.com/blog/design/5-steps-to-make-your-website-more-accessible">4. Five Steps to Make Your Website More Accessible</a></h4>
<p>At its core, responsive/adaptive design is about being accessible to any user, in any context and on any device. In this article, <a href="http://plus.google.com/109538937796217329338/">Julia Larson</a> focuses on making websites more accessible. <a href="http://www.dtelepathy.com/blog/design/5-steps-to-make-your-website-more-accessible">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://www.lukew.com/ff/entry.asp?1788">5. Smashing Conf: Responsive Web Design is&#160;Easy/Hard</a></h4>
<p><a href="http://twitter.com/lukew">Luke Wroblewski</a> shares detailed notes from Dan Mall's Smashing Conf talk about rethinking certain RWD processes. <a href="http://www.lukew.com/ff/entry.asp?1788">Read Luke's notes&#160;&#187;</a></p>
<hr>
<h4><a href="http://mobile.smashingmagazine.com/2013/09/11/responsive-navigation-on-complex-websites/">6. Responsive Navigation On Complex&#160;Websites</a></h4>
<p><a href="http://twitter.com/jonrundle">Jon Rundle</a> talks about his experience "launching two large institutional websites with complex navigation systems" while trying to maintain simplicity and sanity. <a href="http://mobile.smashingmagazine.com/2013/09/11/responsive-navigation-on-complex-websites/">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://www.techrepublic.com/blog/web-designer/responsive-web-design-tool-review-embed-responsively/">7. Responsive web design tool review: Embed&#160;Responsively</a></h4>
<p>"Embed Responsively allows you to transform fixed width embedded media content into flexible and fluid responsive objects." <a href="http://plus.google.com/105422432853481832984">Ryan Boudreaux</a> puts Embed Responsively to the test. <a href="http://www.techrepublic.com/blog/web-designer/responsive-web-design-tool-review-embed-responsively/">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://bradfrostweb.com/blog/post/page-height-scrolling-and-responsive-web-design/">8. Page Height, Scrolling and Responsive Web&#160;Design</a></h4>
<p><a href="http://twitter.com/brad_frost">Brad Frost</a> describes three different scrolling behaviour patterns on mobile phones, and why you should keep them in mind when designing your responsive site. <a href="http://bradfrostweb.com/blog/post/page-height-scrolling-and-responsive-web-design/">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://meetcontent.com/blog/planning-for-content-beyond-the-web/">9. Planning for Content Beyond the&#160;Web</a></h4>
<p><a href="http://twitter.com/dmolsen">Dave Olsen</a> thinks beyond phones and tablets. The content of the future will have to adapt to a wide range of devices that don't even exist yet. He argues that if we build a strong foundation now, we can avoid disaster later on. <a href="http://meetcontent.com/blog/planning-for-content-beyond-the-web/">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://speckyboy.com/2013/09/11/responsive-design-is-not-about-screen-sizes-any-more/">10. Responsive Design is Not About Screen Sizes Any&#160;More</a></h4>
<p><a href="http://twitter.com/gorkamolero">Gorka Molero</a> makes a case for why performance should be treated as a design feature, and how to use progressive enhancement and other advanced techniques to achieve optimal page speed. <a href="http://speckyboy.com/2013/09/11/responsive-design-is-not-about-screen-sizes-any-more/">Read this article&#160;&#187;</a></p>
<p><em>BTW: We've built our own set of performance tools to <a href="http://cloud.mobify.com/mps/">make your responsive site&#160;lightweight</a>.</em></p>
<hr>
<h4><a href="http://flippinawesome.org/2013/09/16/break-the-wrist-and-walk-away-responsive-design-and-bootstrap-3/">11. Break The Wrist And Walk Away: Responsive Design And Bootstrap&#160;3</a></h4>
<p><a href="http://twitter.com/burkeholland">Burke Holland</a> takes a close look at Bootstrap 3. This resource is great for both beginners and seasoned developers looking to learn more. <a href="http://flippinawesome.org/2013/09/16/break-the-wrist-and-walk-away-responsive-design-and-bootstrap-3/">Read this article&#160;&#187;</a></p>
<hr>
<h4><a href="http://www.creativebloq.com/web-design/create-flexible-grids-using-sass-9134524">12. Create flexible grids using&#160;Sass</a></h4>
<p>"<a href="http://twitter.com/stevehickeydsgn">Steve Hickey</a> explains how to design your own flexible grid system using CSS and Sass &#8212; and ditch presentational markup for good." <a href="http://www.creativebloq.com/web-design/create-flexible-grids-using-sass-9134524">Read this article&#160;&#187;</a></p>
<hr>
<h3>Your Turn</h3>
<p>What articles, tools or resources did you enjoy reading this September? Share them in the comments&#160;below!</p></div>A Beginner's Guide to Perceived Performance: 4 Ways to Make Your Mobile Site Feel Like a Native App2013-09-18T11:00:00-07:00Kyle Peatttag:www.mobify.com,2013-09-18:blog/beginners-guide-to-perceived-performance/<div><p><em>Editor's note: This post is &#8776;3,000 words. It covers many different aspects of perceived performance of mobile websites as well as practical solutions to speeding up your site. TL;DR: it's not about how fast your site is; it's about how fast your users think it is.</em></p>
<p><a href="http://www.mobify.com/blog/beginners-guide-to-perceived-performance/">
<img src="http://www.mobify.com/static/blog/2013/09/perceived-performance.png" alt="A Beginner's Guide to Perceived Performance">
</a></p>
<p>Building well-designed websites on mobile devices is slowly becoming easier and easier. Whatever the method (responsive, adaptive, etc.), if you know what you're doing, crafting a good-looking site is not a problem.</p>
<p>But your clients, just like ours, may still be asking for that app-like experience. And creating such experiences remains a challenge.</p>
<p>Most of the time, when people say something is <em>&#8216;app-like&#8217;</em> or that it feels <em>&#8216;native&#8217;</em>, they&#8217;re not talking about the way a site looks. Instead, they&#8217;re talking about the way the interface responds to their actions and the way it performs when they make those actions.</p>
<p>Native apps are fast. Animations are rendered smoothly; buttons respond immediately when you press them, and there&#8217;s never any question as to whether something is loading.</p>
<p><strong>Getting your site to feel native means doing everything you can to get your site to perform as quickly as possible.</strong></p>
<p>Improving performance is a really hot topic right now, and for good reason. Until very recently, the web has been trending towards slower and heavier sites. And there&#8217;s an argument that it&#8217;s impossible to make a performant web app.</p>
<p>This was the reason <a href="http://www.infoq.com/news/2012/09/Facebook-HTML5-Native">Facebook said they had to move to a native app</a>. They just couldn&#8217;t get the speed or interactions where they wanted them, at least with the resources they had.</p>
<p>Despite what Facebook thinks, it&#8217;s not impossible to build performant websites. It may not be easy, but it&#8217;s not out of our grasp. We just have to work a little bit harder to make it happen. Technically, we&#8217;ve got the power to make our sites feel faster, more modern, and overall better experiences.</p>
<h3>Perceived Performance vs. Actual Performance</h3>
<p>While improving actual performance is important, it turns out it doesn&#8217;t really mean that much to the end user unless they can actually sense the improvement.</p>
<p>At An Event Apart in Seattle earlier this year, Luke Wroblewski told a <a href="http://www.lukew.com/ff/entry.asp?1797">story about his mobile app, Polar</a>. He explained that his team worked hard to improve the amount of time it takes to load new polls.</p>
<p>At the same time, they introduced a short spinner to show users when polls were loading. Feedback immediately started pouring in about how loading new polls felt so much slower than before, despite the fact that it was actually much faster. Polar quickly released a patch removing the spinner, and people were impressed by how much faster polls loaded.</p>
<p>This is a great example of the importance of perceived performance. It doesn&#8217;t matter how fast your site is if it doesn&#8217;t feel fast. In the case of the spinner, it just drew the user&#8217;s attention to the fact that they were waiting instead of distracting them from it.</p>
<p><strong>As designers and developers, our goal shouldn&#8217;t just be to create the fastest site mathematically; it should also be to create the fastest site experientially.</strong></p>
<p>All that matters is how the user perceives the speed of your site. Any actual speed increase is just a topper on an already well-decorated cake. I would argue that perceived performance is more important than actual performance gains. But that doesn&#8217;t mean you shouldn&#8217;t be working on actual performance as well.</p>
<p>So, enough with the explanations. What can you do to improve perceived performance on your site right now?</p>
<p><strong>Here are four techniques you can start implementing immediately.</strong></p>
<h3>1. Add Touch States to Your Buttons</h3>
<p>One of the easiest ways to improve perceived performance on your site is by enabling the active state on mobile.</p>
<p><strong>You see, any time a visitor taps on a button on your site, she has to wait 300ms before it even looks like anything happened</strong>.</p>
<p>Browsers put that timeout in there so they can make sure the user didn&#8217;t mean to do something else (a double tap, to be precise). So they wait just under one third of a second to see what else the user does and, if it&#8217;s nothing, they act on that initial tap. And when the action finally does happen, it just highlights with a grey overlay and then goes through.</p>
<p>This is a terrible experience. The Nielsen Norman Group has reported that anything taking more than <a href="http://www.nngroup.com/articles/response-times-3-important-limits/">100ms to respond makes the user feel like they&#8217;re waiting</a>&#160;&#8212; and all they&#8217;re trying to do is move through your site.</p>
<p>However, the majority of mobile sites, including some of the ones I've built, don't address this perceived performance issue. Designers usually leave the default touch state as-is on links or buttons.</p>
<p><strong>To make your site feel faster, you need to make your buttons respond immediately to a user's touch and give that user a great visual indication that something is happening.</strong></p>
<p>There is a great pseudo-class on the web for showing when a button or link is being pressed: the <code>:active</code> state. We use it all the time for desktop browsers.</p>
<p>Unfortunately, neither iOS nor Android applies this state by default when a link or button is tapped on mobile. To enable those active states, you need to add a simple event listener to the page with JavaScript:</p>
<div class="codehilite"><pre><span class="nb">document</span><span class="p">.</span><span class="nx">addEventListener</span><span class="p">(</span><span class="s2">"touchstart"</span><span class="p">,</span> <span class="kd">function</span><span class="p">(){},</span> <span class="kc">true</span><span class="p">)</span>
</pre></div>
<p>Then, you'll want to use CSS to add active states to your buttons and remove the tap highlight:</p>
<div class="codehilite"><pre><span class="o">-</span><span class="n">webkit</span><span class="o">-</span><span class="n">tap</span><span class="o">-</span><span class="n">highlight</span><span class="o">-</span><span class="n">color</span><span class="o">:</span> <span class="n">rgba</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">);</span>
</pre></div>
<p>With both of those properties in place and active states set on your buttons, they&#8217;ll immediately feel like they respond faster even though they actually respond at the same speed. You&#8217;re just giving your users immediate feedback for their actions instead of making them wait 300ms to see if anything happened.</p>
<div class="grid break-480 before-after-showcase">
<div class="span6">
<img src="http://www.mobify.com/static/blog/2013/09/notapstate.gif" alt="Without Touch States">
<h5>
Without Touch State
(<a href="http://codepen.io/mobify/full/ulwqf">code</a>)
</h5>
</div>
<div class="span6">
<img src="http://www.mobify.com/static/blog/2013/09/tapstate.gif" alt="With Touch States">
<h5>
With Touch State
(<a href="http://codepen.io/mobify/full/ulwqf">code</a>)
</h5>
</div>
</div>
<p><strong>However, if you want to make them actually respond immediately, you can go one step further.</strong></p>
<p>Using a JavaScript function called <code>fasttap</code> or <code>fastclick</code>, you can fully remove that 300ms delay from your buttons. This, along with the active states, will make your websites feel speedy as heck.</p>
<p>For more information on <code>fasttap</code>, read <a href="http://developers.google.com/mobile/articles/fast_buttons">this article by Google</a> or, for a ready-made implementation, check out <a href="http://github.com/ftlabs/fastclick">this repo on GitHub</a>.</p>
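<p>To make the idea concrete, here is a minimal sketch of the technique, not the FastClick API itself. The helper names (<code>isTap</code>, <code>attachFastTap</code>) and the 10px movement tolerance are assumptions made for illustration. It fires your handler on <code>touchend</code> and calls <code>preventDefault()</code>, which in most mobile browsers suppresses the delayed click that would otherwise follow.</p>

```javascript
// Minimal fast-tap sketch (hypothetical helpers, not the FastClick API).
const MAX_MOVE_PX = 10; // assumed tolerance before a touch counts as a scroll

// True when the finger lifted close to where it landed.
function isTap(start, end, maxMove = MAX_MOVE_PX) {
  return Math.abs(end.x - start.x) <= maxMove &&
         Math.abs(end.y - start.y) <= maxMove;
}

// Calls `handler` on touchend instead of waiting for the delayed click.
function attachFastTap(element, handler) {
  let start = null;

  element.addEventListener('touchstart', function (event) {
    const touch = event.touches[0];
    start = { x: touch.clientX, y: touch.clientY };
  });

  element.addEventListener('touchend', function (event) {
    if (start === null) return;
    const touch = event.changedTouches[0];
    const end = { x: touch.clientX, y: touch.clientY };
    const began = start;
    start = null;
    if (isTap(began, end)) {
      event.preventDefault(); // suppress the synthesized click that would follow
      handler(event);
    }
  });
}

// Browser usage (assumes a button with this hypothetical id exists):
// attachFastTap(document.getElementById('buy-button'), function () { /* ... */ });
```

<p>A production library handles many more edge cases (form fields, focus, multi-touch), so treat this as a way to understand the approach rather than a replacement for one.</p>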
<h3>2. Use Momentum Scrolling</h3>
<p>Have you tried creating a scrollable container on your mobile site and been stuck with the slow, non-responsive scrolling that&#8217;s enabled by default?</p>
<p>You&#8217;re in luck! Android 3+ and iOS 5+ implemented a new property called <code>-webkit-overflow-scrolling</code> that enables momentum scrolling. And it works beautifully.</p>
<div class="grid break-480 before-after-showcase">
<div class="span6">
<img src="http://www.mobify.com/static/blog/2013/09/nooverflowscroll.gif" alt="Without Momentum Scrolling">
<h5>
No Momentum Scrolling
(<a href="http://codepen.io/mobify/full/LueFn">code</a>)
</h5>
</div>
<div class="span6">
<img src="http://www.mobify.com/static/blog/2013/09/overflowscroll.gif" alt="With Momentum Scrolling">
<h5>
With Momentum Scrolling
(<a href="http://codepen.io/mobify/full/vLcky">code</a>)
</h5>
</div>
</div>
<p>This momentum scrolling <em>feels</em> native because it <em>is</em> native. All you need to do is add this property to your scrolling containers:</p>
<div class="codehilite"><pre><span class="o">-</span><span class="n">webkit</span><span class="o">-</span><span class="n">overflow</span><span class="o">-</span><span class="n">scrolling</span><span class="o">:</span> <span class="n">touch</span><span class="p">;</span>
</pre></div>
<p>There is one issue with this property, however. It will disable the ability to tap the status bar on the top of your iPhone to scroll back to the top of the page. This bug has been around for a while, and it doesn&#8217;t seem to be fixed even in the <a href="http://www.mobify.com/blog/designing-for-ios-7/">latest version of Mobile Safari on iOS 7</a>.</p>
<p>One way to get around this is to create a class that adds <code>-webkit-overflow-scrolling: touch</code> to the container and then use JavaScript to only apply that class when that container is visible.</p>
<p>On Android 4, you don&#8217;t even need this property. Every scrollable container has momentum scrolling.</p>
<p>On older versions of Android, you have a couple of options. My favourite one is to detect momentum scrolling support with Modernizr and alter your layout to have the container overflow visibly. If that&#8217;s not an option, there are a few JavaScript libraries you can try instead. Filament Group&#8217;s <a href="http://filamentgroup.github.io/Overthrow/">Overthrow</a> and <a href="http://cubiq.org/iscroll-4">iScroll</a> work pretty well.</p>
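<p>The two ideas above, applying the class only while the container is visible and checking for support first, can be sketched roughly as follows. This is a hand-rolled check rather than Modernizr, and the class name <code>momentum-scroll</code> and both function names are assumptions made for illustration.</p>

```javascript
// Hypothetical class name. The matching CSS rule would be:
//   .momentum-scroll { -webkit-overflow-scrolling: touch; }
const MOMENTUM_CLASS = 'momentum-scroll';

// Hand-rolled feature check (not Modernizr): the browser exposes the
// property on style objects when it understands it.
function supportsMomentumScrolling(style) {
  return 'WebkitOverflowScrolling' in style ||
         'webkitOverflowScrolling' in style;
}

// Adds the class only when the container is visible and the browser
// supports the property; removes it otherwise. Returns the new state.
function updateMomentumScrolling(container, isVisible, supported) {
  const enable = supported && isVisible;
  if (enable) {
    container.classList.add(MOMENTUM_CLASS);
  } else {
    container.classList.remove(MOMENTUM_CLASS);
  }
  return enable;
}

// Browser usage (assumes a container with this hypothetical id exists):
// const supported = supportsMomentumScrolling(document.documentElement.style);
// updateMomentumScrolling(document.getElementById('drawer'), true, supported);
```

<p>When the check returns <code>false</code>, that is the point where you would fall back to a visibly overflowing layout or one of the libraries mentioned above.</p>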
<h3>3. Create Performant Animations</h3>
<p>One of the most noticeable differences between websites and apps is the use of animations.</p>
<p>Apps have been able to take full advantage of the hardware graphics acceleration present in all modern devices for years now. On the web, developers have been making do with JavaScript-based animations, which are slow on weaker mobile CPUs.</p>
<p>But now, with mobile browser support being what it is, you can make great use of hardware-accelerated CSS3 animations in your projects.</p>
<p><strong>This is an awesome way to add some of that visual pizazz that all of our favourite apps have been flaunting for years.</strong></p>
<p>Not so fast, though! For animations to feel native, you have to make sure that your animations aren't slow or choppy, which can be quite difficult.</p>
<p>Allen Pike of Steamclock Software wrote <a href="http://www.allenpike.com/2011/providing-joy-at-60-fps/">a great article in 2011</a> about the joy animations provide users and how non-performant animations can have a huge impact on how an app feels.</p>
<p>Interestingly, he wrote this article about native app development. This makes it a perfect article for us to follow in our mission to create native-feeling animations on the web.</p>
<p>In the article, he lays out what he calls a "timeline of perception":</p>
<p><strong>1. Animations Should Move at 60fps.</strong> This means that each frame should take <a href="http://speakerdeck.com/jaffathecake/rendering-without-lumps">~16ms to complete</a>. That&#8217;s the bare minimum to feel native and smooth. 60fps is the speed that all iOS&#8217;s built-in animations run at; it&#8217;s why scrolling on an iPhone feels so much better than on an Android device (although Google has made great improvements in this space recently). You should attempt to get all animations that are directly related to a user&#8217;s interaction moving at this speed.</p>
<p><strong>2. Everything Else Should Respond in Under 100ms.</strong> That number is the mental barrier after which things feel slow. Anything under 100ms feels essentially instantaneous to the user.</p>
<p><strong>3. If it Absolutely has to Take Longer than 100ms, it Should Definitely Respond Within 1000ms.</strong> Allen suggests that anything that takes this amount of time should give the user some indication that something is happening, such as a spinner or a progress bar.</p>
<p>But, as we learned earlier with the Polar example, focusing the user&#8217;s attention on the wait may actually do more harm than good. I&#8217;ll cover a different method for dealing with this in a bit.</p>
<p><strong>4. Anything That Takes Longer than a Second</strong> is probably bad, and you should feel bad.</p>
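<p>The numbers in this timeline are easy to turn into a quick sanity check. The sketch below works out the per-frame budget (1000ms divided by 60 frames is roughly 16.7ms), sorts a measured duration into the four buckets above, and counts slow frames in a list of timestamps such as those passed to <code>requestAnimationFrame</code> callbacks. The function names, bucket labels, and the 1.5&#215; slack factor are assumptions made for illustration.</p>

```javascript
// Time available to render one frame at a given frame rate.
function frameBudgetMs(fps) {
  return 1000 / fps;
}

// Sorts a duration (in ms) into the four buckets of the timeline.
// The labels are hypothetical names chosen for this sketch.
function classifyResponse(ms) {
  if (ms <= frameBudgetMs(60)) return 'smooth';   // fits inside one 60fps frame
  if (ms <= 100) return 'instant';                // feels instantaneous
  if (ms <= 1000) return 'needs-feedback';        // show that something is happening
  return 'too-slow';                              // longer than a second
}

// Counts frames that took noticeably longer than the budget.
// `timestamps` is a list of frame times in ms, e.g. collected from
// requestAnimationFrame callbacks. The 1.5x slack is an assumption.
function countSlowFrames(timestamps, fps) {
  const limit = frameBudgetMs(fps) * 1.5;
  let slow = 0;
  for (let i = 1; i < timestamps.length; i += 1) {
    if (timestamps[i] - timestamps[i - 1] > limit) slow += 1;
  }
  return slow;
}
```

<p>In a browser you could push each <code>requestAnimationFrame</code> timestamp into an array while an animation runs, then pass it to <code>countSlowFrames</code> to see whether the animation held 60fps.</p>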
<p>Okay, so knowing all of that, you may be wearing your keyboard as a hat and contemplating a career change&#160;&#8212; since when did building for the web mean you had to worry about things like animation timing?</p>
<p><strong>Don&#8217;t worry, there are some great resources that make this stuff a lot easier!</strong></p>
<p>The first one is a new library by the fine folks at HTML5 Boilerplate, <a href="http://github.com/h5bp/Effeckt.css">Effeckt.css</a>. Their goal is to create a library of common transitions and animations that perform at 60fps. While it&#8217;s not quite done yet, a lot of them are working pretty well, and I highly recommend using them and tweaking them to suit your project.</p>
<p>Another great library is <a href="http://topcoat.io/">Topcoat</a>. Built by the web team at Adobe, this is a CSS component library built with performance in mind. This library is filled to the brim with excellent components that run extremely smoothly. Because performance is their main goal, every single one of the components is <a href="http://bench.topcoat.io/">benchmarked so you can see exactly how it performs</a>.</p>
<p>Topcoat and Effeckt.css actually go hand in hand. Topcoat contributes directly to the Effeckt.css repo, and both play very nicely together.</p>
<p><strong>Next, I mentioned earlier that I&#8217;d talk about a method for avoiding spinners when possible.</strong></p>
<p>One of my preferred methods for avoiding spinners on waits longer than 100ms but shorter than, say, 250ms&#160;&#8212; where a spinner will actually do more harm than good&#160;&#8212; is to hide it behind an animation.</p>
<p>For instance, if you&#8217;re Ajaxing in content, try animating the content&#8217;s container so that it shrinks up and then grows back to fit the new content. A short animation like this will distract the user from the wait&#160;&#8212; instead of staring at a spinner they&#8217;re simply waiting for a short animation to finish. They may not even realize the new content wasn&#8217;t there to begin with.</p>
<p>Of course, repetitive animations that take a long time to complete can be annoying too, so make sure you use these techniques sparingly. That&#8217;s good advice for most animations.</p>
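<p>One way to sketch the "hide the wait behind an animation" idea is to start the request and the animation at the same moment and only swap the content in once both have finished. The function name and both callbacks here are hypothetical: <code>loadContent</code> stands in for your Ajax call and <code>playAnimation</code> for whatever shrinks the container, each returning a Promise.</p>

```javascript
// Starts loading and animating together, then resolves with the loaded
// content once BOTH have finished. Hypothetical helper for illustration.
function loadBehindAnimation(loadContent, playAnimation) {
  return Promise.all([loadContent(), playAnimation()])
    .then(function (results) {
      return results[0]; // the loaded content; the animation result is ignored
    });
}

// Browser usage (all names below are assumptions):
// loadBehindAnimation(
//   function () { return fetch('/more-items').then(function (r) { return r.text(); }); },
//   function () { return shrinkContainer(container); } // resolves on transitionend
// ).then(function (html) { container.innerHTML = html; growContainer(container); });
```

<p>Because the two run in parallel, a 150ms request behind a 200ms animation costs the user 200ms in total, and they never see a spinner.</p>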
<h3>4. Take Advantage of Natural Gestures</h3>
<p>One thing apps seem to have over the web is their ability to harness gestures that seem to come naturally to people on touch devices.</p>
<p>App creators have recognized the power that gestures hold and are quickly capitalizing on it.</p>
<p>Look at <a href="http://mailboxapp.com">Mailbox</a> or <a href="http://www.realmacsoftware.com/clear/">Clear</a> for great examples of this. These apps use simple gestures that take advantage of the biggest differentiator of the mobile device, the ability to directly touch objects on screen.</p>
<p>Most websites, however, only rely on tapping objects. Designers avoid implementing other gestures, leaving users feeling like second-class citizens.</p>
<p><strong>We need to start thinking about developing websites directly for the device. If a user's device allows for gestures, why not use them?</strong></p>
<p>Of course, there&#8217;s just one little problem: mobile operating systems have this nasty little habit of hijacking gestures in the browser for their own 'nefarious' means.</p>
<p>For example, apps like Facebook have been using edge swipes on the left and right edges of the screen to open up navigation. However, on the web, this interaction is out of bounds as Chrome uses it to switch between tabs and the new version of Mobile Safari in iOS 7 <a href="http://www.mobify.com/blog/designing-for-ios-7/">uses it to go forward and back through history</a>.</p>
<p>Okay, so those gestures are pretty much off limits. Which ones can and should you use? Here are four:</p>
<h4>Gesture #1. Side-to-Side Swiping</h4>
<p>Even though the edges are out, side-to-side swiping is still a great gesture. You just have to be careful not to have it too close to the edge of the screen.</p>
<p>This gesture is best used to move through a set of objects, in something like a photo carousel or a list of tabs.</p>
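<p>Here is a rough sketch of a swipe check that stays out of the browser's way by ignoring touches that begin too close to either edge. The function name, the 20px edge guard, and the 30px minimum travel are assumptions picked for illustration; tune them for your own layout.</p>

```javascript
const EDGE_GUARD_PX = 20; // assumed width of the zone the OS or browser may claim
const MIN_SWIPE_PX = 30;  // assumed minimum horizontal travel to count as a swipe

// Returns 'left', 'right', or null. `start` and `end` are {x, y} points
// taken from touchstart and touchend; `viewportWidth` is window.innerWidth.
function detectSwipe(start, end, viewportWidth) {
  const nearEdge = start.x < EDGE_GUARD_PX ||
                   start.x > viewportWidth - EDGE_GUARD_PX;
  if (nearEdge) return null; // leave edge swipes to the browser

  const dx = end.x - start.x;
  const dy = end.y - start.y;
  // Ignore short movements and mostly vertical ones (those are scrolls).
  if (Math.abs(dx) < MIN_SWIPE_PX || Math.abs(dx) <= Math.abs(dy)) return null;

  return dx < 0 ? 'left' : 'right';
}
```

<p>In a carousel you would call this from a <code>touchend</code> handler and advance to the next or previous slide based on the result.</p>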
<h4>Gesture #2. Pull-to-Refresh</h4>
<p>Pull-to-refresh is another gesture that people just seem to <em>get</em>. There are a bunch of great JavaScript libraries that make it easy to integrate this feature. The one I&#8217;ve used before is <a href="http://usehook.com/">Hook.js</a>.</p>
<h4>Gesture #3. Long Press</h4>
<p>There&#8217;s a useful property called <code>-webkit-touch-callout: none;</code> that will disable the default long press action in Mobile Safari. If you want to disable it on Android, however, it&#8217;s a <a href="http://stackoverflow.com/questions/3413683/disabling-the-context-menu-on-long-taps-on-android">bit more work</a>.</p>
<p>The long press gesture can be used for actions such as picking up an item (e.g. reordering items in a list) or showing more options (e.g. social sharing).</p>
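<p>Browsers don't fire a dedicated long press event, so you typically build one from a timer. The sketch below is one simple way to do that; the function name and the 500ms threshold are assumptions, not values from any particular library.</p>

```javascript
const LONG_PRESS_MS = 500; // assumed threshold for a "long" press

// Returns a small controller: call start() on touchstart and cancel() on
// touchend, touchmove, or touchcancel. `onLongPress` runs only if the
// press lasts the full delay without being cancelled.
function createLongPress(onLongPress, delayMs = LONG_PRESS_MS) {
  let timer = null;

  function cancel() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  }

  function start() {
    cancel(); // restart cleanly if a press is already pending
    timer = setTimeout(function () {
      timer = null;
      onLongPress();
    }, delayMs);
  }

  function isPending() {
    return timer !== null;
  }

  return { start: start, cancel: cancel, isPending: isPending };
}

// Browser usage (assumes `item` is a list element you want to reorder):
// const press = createLongPress(function () { item.classList.add('is-dragging'); });
// item.addEventListener('touchstart', press.start);
// ['touchend', 'touchmove', 'touchcancel'].forEach(function (type) {
//   item.addEventListener(type, press.cancel);
// });
```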
<h4>Gesture #4. Pinch-Zoom</h4>
<p>Everyone understands pinch-zoom. Most people when they encounter a photo on the web will try to pinch-zoom in to see it in greater detail.</p>
<p>This is another situation where the browser hijacks a gesture but, in this case, it&#8217;s actually not so bad.</p>
<p>Whether you&#8217;re locking the viewport from zooming or not, sometimes you don&#8217;t want the whole page to zoom when a user pinches. To take over those multitouch gestures, you can use the small but awesome <a href="http://eightmedia.github.io/hammer.js/">Hammer.js</a> library. This library has a bunch of built-in gestures you can use, or you can create your own.</p>
<p>A great example of this is the pinch-zoom for images on the <a href="http://imgur.com/gallery/U6HYTnm">imgur.com mobile website</a>. Check out what swiping does while you&#8217;re there.</p>
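<p>If you're curious what a pinch handler computes under the hood, the core of it is just the ratio between the current and starting distance of two fingers. The sketch below shows that arithmetic on its own. It is not Hammer.js's API, and the function names are assumptions made for illustration.</p>

```javascript
// Straight-line distance between two {x, y} touch points.
function touchDistance(a, b) {
  return Math.hypot(b.x - a.x, b.y - a.y);
}

// Scale factor of a pinch: above 1 means the fingers moved apart (zoom in),
// below 1 means they moved together (zoom out). Each argument is a
// two-element array of {x, y} points.
function pinchScale(startTouches, currentTouches) {
  const startDist = touchDistance(startTouches[0], startTouches[1]);
  if (startDist === 0) return 1; // avoid dividing by zero
  return touchDistance(currentTouches[0], currentTouches[1]) / startDist;
}

// Browser usage (assumes `img` is the element being zoomed):
// img.style.webkitTransform = 'scale(' + pinchScale(startPoints, currentPoints) + ')';
```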
<p>Just remember, if you&#8217;re using a gesture, make sure it&#8217;s one that either feels natural or makes sense for the user to do.</p>
<h3>Conclusion</h3>
<p>My hope is that one day there won&#8217;t need to be a distinction between native and web. We&#8217;re not there yet but the more we work to make all of our websites <em>feel</em> like they were designed for the user, the sooner that day will arrive.</p>
<p><strong>I think the recent trend towards focusing on performance is fantastic but we have to remember that our users aren&#8217;t machines.</strong></p>
<p>They don&#8217;t care about how many requests your website makes or how fast the screen repaints. But they do care about what those things impact&#160;&#8212; their perception of performance.</p>
<p><strong>Focus on making sure your site looks and acts like it&#8217;s as fast as possible. There&#8217;s no point in making the fastest site ever if the user doesn&#8217;t notice.</strong></p>
<p>If you have any more tips for improving perceived performance, please post in the comments.</p></div>Three Early iOS 7 Web Design Best Practices2013-09-10T12:00:00-07:00Mike Abasovtag:www.mobify.com,2013-09-10:blog/designing-for-ios-7/<div><p><a href="http://www.mobify.com/blog/designing-for-ios-7/">
<img src="http://www.mobify.com/static/blog/2013/09/ios7.jpg" alt="iOS 7 Web Design Early Best Practices">
</a></p>
<p>In just over a week, Apple will finally release iOS&#160;7.</p>
<p>The new version of the operating system that powers <a href="http://www.mobify.com/blog/ios-vs-android-q1-q2-2013/">56.5% of global mobile pageviews</a> is shipping with a whole range of new features and exciting design improvements.</p>
<p>Some of the most interesting changes to iOS can be found inside mobile Safari. These changes directly affect the design structure of the browser, and so, by extension, a user's experience of your site.</p>
<p>In this post, I want to share with you three core UX/UI changes in Safari on iOS 7 that you should keep in mind when optimizing your site for Apple devices.</p>
<h3>1. Edge Swipes Are Dead</h3>
<p>Previous versions of Safari allowed users to swipe down the Notification Center from the top edge of a device. However, in iOS 7, Safari takes full advantage of all four edges of a user's device.</p>
<p>Gestures have been incorporated that let you swipe from left to right to navigate back through the browser history, and from right to left to navigate forward through it. You can also swipe from the bottom up to access the new Control Center.</p>
<h4>How This Affects Your Site</h4>
<p>First, if you were thinking of using any kind of <a href="http://kenyarmosh.com/ios-pattern-slide-out-navigation/">slide-out navigation</a>, you'll need to reassess the implementation. To avoid breaking the user experience, you now need to trigger the navigation "on touch" rather than "on edge pull".</p>
<p>Next, you have to revisit the way you use <a href="http://mobify.github.io/mobify-modules/carousel/">image carousels</a> on your site. If the images are displayed with little or no padding, a visitor may start swiping them at the very edge of the screen, triggering the iOS back/forward gesture.</p>
<p><a href="http://www.mobify.com/static/blog/2013/09/ios-skinnyties.png">
<img src="http://www.mobify.com/static/blog/2013/09/ios-skinnyties.png" alt="Skinny Ties on iOS 6 and iOS 7">
</a></p>
<p>Left: Skinny Ties on iOS 6 &#8212; Carousels work as expected.<br>
Right: Skinny Ties on iOS 7 &#8212; Swiping the carousel now triggers a gesture to go back a page.</p>
<p><strong>UPDATE:</strong> As one of the commenters pointed out, the carousel isn't actually
broken. It still works if you don't swipe from the edge of the screen. What
I tried to illustrate is that your visitors will likely be swiping all over the
screen and may trigger a gesture they aren't expecting.</p>
<h3>2. Good Navigation Is More Important Than&#160;Ever</h3>
<p>Another thing you'll notice about the new Safari is that the interface <a href="http://www.imore.com/ios-7-preview-safari-amps-search-tabs-sharing-reading-and-more/">gets minimized</a> as you scroll through a website.</p>
<p>While the URL bar just gets smaller, the basic navigation disappears completely and can only be brought back up by quickly scrolling upwards <em>(note: a similar pattern is used in Google Chrome for iOS)</em>.</p>
<p><a href="http://www.mobify.com/static/blog/2013/09/ios-smashing.png">
<img src="http://www.mobify.com/static/blog/2013/09/ios-smashing.png" alt="Smashing Magazine on iOS 7">
</a></p>
<h4>How This Affects Your Site</h4>
<p>Well-designed websites will benefit from this more minimalist and immersive experience, but sites with confusing navigation will struggle to keep users on the page without the simplicity of the back/forward buttons.</p>
<p>That's why you need to make sure that your site's navigation is easy, intuitive, and touch-friendly.</p>
<h3>3. More Screen Real Estate. Different Fold.</h3>
<p>As mentioned above, Safari now sports a new, full-screen design where interface elements minimize or disappear during normal browsing.</p>
<p>What this also means is that the fold, which is an important consideration for many sites, may shift unexpectedly.</p>
<p>For example, in the screenshot below, you can see that on our own site, the customer logos are only half-visible on iOS 7 (vs. iOS 6).</p>
<p><a href="http://www.mobify.com/static/blog/2013/09/ios-mobify.png">
<img src="http://www.mobify.com/static/blog/2013/09/ios-mobify.png" alt="Mobify.com on iOS 6 and iOS 7">
</a></p>
<p>Social proof is an important factor for our company, so we may consider adjusting the layout to keep the logos visible on every device and screen size.</p>
<h4>How This Affects Your Site</h4>
<p>Simply put, you now have another environment to test for. And if the fold is critical to your mobile UX, you should make sure you look at how your site appears on the new iOS as soon as possible.</p>
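<p>To make "testing the fold" concrete, you can compute how much of a key element is visible before the visitor scrolls. The function below is a rough sketch under stated assumptions: the name <code>fractionAboveFold</code> and the pixel values in the usage note are hypothetical, not measurements from any particular device.</p>

```javascript
// Returns the fraction (0 to 1) of an element that sits above the fold.
// elementTop and elementHeight are in CSS pixels, measured from the top of
// the page; visibleHeight is the height of the viewport that remains after
// the browser's own interface is drawn.
function fractionAboveFold(elementTop, elementHeight, visibleHeight) {
  if (elementHeight <= 0) {
    return 0;
  }
  var visibleBottom = Math.min(elementTop + elementHeight, visibleHeight);
  var visibleTop = Math.max(elementTop, 0);
  var visible = visibleBottom - visibleTop;
  return Math.max(0, Math.min(1, visible / elementHeight));
}
```

<p>For instance, a 100-pixel-tall logo strip that starts 400 pixels down the page is half-visible when the visible height is 450 pixels: <code>fractionAboveFold(400, 100, 450)</code> gives <code>0.5</code>. In a browser, the inputs could come from <code>element.getBoundingClientRect()</code> and <code>window.innerHeight</code>.</p>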
<h3>Conclusion</h3>
<p>At Mobify, we're really excited by the release of iOS 7 and the new version of mobile Safari. It's going to be an interesting journey to see how the new direction chosen by Jony Ive's team will impact app, web, and interface design for years to come.</p>
<p>In the meantime, however, the best thing web designers and developers can do to make sure their sites work well on this OS is to get their hands on a device running iOS 7, or download Xcode 5 (currently offered as a <a href="http://developer.apple.com/xcode/">developer preview</a>).</p>
<h3>Your Turn</h3>
<p><em>Did we miss something?</em> Share your tips and discoveries about building sites for iOS 7 in the comments below!</p></div>iOS Still Drives More Pageviews than Android [NEW GLOBAL DATA]2013-09-04T14:00:00-07:00Philip Webbtag:www.mobify.com,2013-09-04:blog/ios-vs-android-q1-q2-2013/<div><p>When a colleague casually asked what smartphone she should buy, productivity in the Mobify office ground to a halt. Lines were drawn. Furtive glances were thrown across the room. Devices, absent a second ago, suddenly materialized in hands.</p>
<p><strong>The debate was on: iOS or Android</strong>?</p>
<p>After a series of increasingly spirited exchanges, we decided to see what the rest of the world thinks.</p>
<p>To do so, we sampled over <strong>300 million Mobify-powered mobile pageviews</strong> from the first six months of 2013 to see which operating systems device owners around the world were using. We broke them down by country and overlaid them on a map of the globe.</p>
<p>The result is a map of iOS and Android pageviews for Q1 and Q2 in 2013. You can hover over individual countries below to see exact percentages.</p>
<p><br><br>
</p><div id="chart_div" style="width: 100%; height: auto;"></div>
<div>
<script src="http://www.google.com/jsapi"></script>
<script>
(function() {
var drawRegionsMap = function() {
var data = google.visualization.arrayToDataTable([
['Country', '% iOS', '% Android'],
['Afghanistan', 26, 74],
['Albania', 62, 38],
['Algeria', 15, 85],
['Argentina', 32, 68],
['Armenia', 33, 67],
['Aruba', 50, 50],
['Australia', 72, 28],
['Austria', 70, 30],
['Azerbaijan', 28, 72],
['Bahamas', 13, 87],
['Bahrain', 45, 55],
['Bangladesh', 14, 86],
['Barbados', 38, 62],
['Belarus', 68, 32],
['Belgium', 73, 27],
['Belize', 83, 17],
['Bolivia', 55, 45],
['Bosnia and Herzegovina', 25, 75],
['Brazil', 48, 52],
['Brunei', 27, 73],
['Bulgaria', 22, 78],
['Cambodia', 43, 57],
['Canada', 72, 28],
['Chile', 30, 70],
['China', 54, 46],
['Colombia', 58, 42],
['Costa Rica', 61, 39],
['Croatia', 44, 56],
['Czech Republic', 52, 48],
['Denmark', 84, 16],
['Dominican Republic', 54, 46],
['Ecuador', 71, 29],
['Egypt', 49, 51],
['El Salvador', 20, 80],
['Estonia', 48, 52],
['Ethiopia', 37, 63],
['Fiji', 8, 92],
['Finland', 45, 55],
['France', 58, 42],
['Georgia', 56, 44],
['Germany', 52, 48],
['Ghana', 61, 39],
['Greece', 41, 59],
['Guam', 54, 46],
['Guatemala', 84, 16],
['Haiti', 50, 50],
['Honduras', 43, 57],
['Hong Kong', 55, 45],
['Hungary', 27, 73],
['Iceland', 54, 46],
['India', 25, 75],
['Indonesia', 33, 67],
['Iran', 39, 61],
['Iraq', 27, 73],
['Ireland', 57, 43],
['Isle of Man', 47, 53],
['Israel', 56, 44],
['Italy', 63, 37],
['Jamaica', 49, 51],
['Japan', 50, 50],
['Jersey', 79, 21],
['Jordan', 40, 60],
['Kazakhstan', 52, 48],
['Kenya', 15, 85],
['Kosovo', 79, 21],
['Kuwait', 46, 54],
['Kyrgyzstan', 37, 63],
['Laos', 50, 50],
['Latvia', 37, 63],
['Lebanon', 69, 31],
['Libya', 31, 69],
['Lithuania', 69, 31],
['Luxembourg', 82, 18],
['Macau', 59, 41],
['Macedonia [FYROM]', 71, 29],
['Malaysia', 30, 70],
['Maldives', 50, 50],
['Malta', 58, 42],
['Mauritius', 33, 67],
['Mexico', 61, 39],
['Moldova', 42, 58],
['Mongolia', 34, 66],
['Morocco', 41, 59],
['Myanmar [Burma]', 12, 88],
['Namibia', 36, 64],
['Nepal', 30, 70],
['Netherlands', 41, 59],
['New Zealand', 42, 58],
['Nicaragua', 70, 30],
['Nigeria', 34, 66],
['Norway', 46, 54],
['Oman', 39, 61],
['Pakistan', 29, 71],
['Palestine', 37, 63],
['Panama', 5, 95],
['Paraguay', 19, 81],
['Peru', 61, 39],
['Philippines', 43, 57],
['Poland', 47, 53],
['Portugal', 48, 52],
['Puerto Rico', 48, 52],
['Qatar', 52, 48],
['R&#233;union', 18, 82],
['Romania', 62, 38],
['Russia', 56, 44],
['Rwanda', 30, 70],
['Saint Vincent and the Grenadines', 39, 61],
['Saudi Arabia', 44, 56],
['Serbia', 21, 79],
['Singapore', 58, 42],
['Slovakia', 31, 69],
['Slovenia', 26, 74],
['South Africa', 46, 54],
['South Korea', 29, 71],
['Spain', 57, 43],
['Sri Lanka', 25, 75],
['Sudan', 22, 78],
['Sweden', 71, 29],
['Switzerland', 75, 25],
['Taiwan', 52, 48],
['Tanzania', 22, 78],
['Thailand', 47, 53],
['Trinidad and Tobago', 31, 69],
['Turkey', 59, 41],
['Turkmenistan', 23, 77],
['Turks and Caicos Islands', 56, 44],
['U.S. Virgin Islands', 42, 58],
['Uganda', 10, 90],
['Ukraine', 55, 45],
['United Arab Emirates', 54, 46],
['United Kingdom', 63, 37],
['United States', 59, 41],
['Uruguay', 37, 63],
['Uzbekistan', 31, 69],
['Venezuela', 41, 59],
['Vietnam', 47, 53],
['Yemen', 65, 35]
]);
var options = {
//resolution: 'provinces',
backgroundColor: 'white',
legend: 'none',
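// Color values of 50 or lower render in the first color (red); 51 or higher render in the second (blue).
colorAxis: {minValue: 50, maxValue: 51, colors: ['#cb0826', '#0083a5']}
};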
var chart = new google.visualization.GeoChart(document.getElementById('chart_div'));
chart.draw(data, options);
};
google.load('visualization', '1', {'packages': ['geochart']});
google.setOnLoadCallback(drawRegionsMap);
})();
</script>
</div>
<br><br>
<p>Blue countries have most of their pageviews coming from iOS, while red countries have most of their pageviews coming from Android.</p>
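<p>The same majority rule can be applied directly to the rows behind the map. The snippet below is a small sketch rather than the code that produced the chart: the function names are hypothetical, the three sample rows are copied from the table above, and treating a 50/50 split as a separate "tie" category is an assumption of this example.</p>

```javascript
// Returns which OS has the larger share of pageviews, or 'tie' when equal.
function majorityOs(iosPct, androidPct) {
  if (iosPct > androidPct) {
    return 'iOS';
  }
  if (androidPct > iosPct) {
    return 'Android';
  }
  return 'tie';
}

// Counts how many rows fall into each category.
// Each row has the shape [countryName, iosPct, androidPct].
function tallyMajorities(rows) {
  var counts = { iOS: 0, Android: 0, tie: 0 };
  rows.forEach(function (row) {
    counts[majorityOs(row[1], row[2])] += 1;
  });
  return counts;
}

// Three rows taken from the chart data above.
var sampleRows = [
  ['Canada', 72, 28],
  ['India', 25, 75],
  ['Japan', 50, 50]
];
var sampleCounts = tallyMajorities(sampleRows);
```

<p>With these three rows, <code>sampleCounts</code> holds one iOS-majority country, one Android-majority country, and one tie.</p>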
<p>This map is another reminder to spend extra time testing your website on iOS and Android devices, because together they make up the vast majority of mobile pageviews across the world!</p>
<p>And as for the Mobify employee who touched off the iOS vs Android debate, she ended up with an iPhone.</p>
<h3>Notes</h3>
<ul>
<li>BlackBerry, Symbian, Windows Phone and other operating systems are excluded from the map data as together they make up only 3.15% of pageviews.</li>
<li>Data is based on a sample of 319 million pageviews of Mobify-powered mobile websites.</li>
<li>Countries in grey don't have enough sample data to draw significant conclusions.</li>
<li>Global mobile pageview breakdown by OS:<ul>
<li>iOS: 56.5%</li>
<li>Android: 40.17%</li>
<li>Other: 3.15%</li>
</ul>
</li>
</ul></div>