A personal work journal

LA Times and ads

The LA Times is a good newspaper and is currently doing the best political coverage in California. They are also the most aggressive ad shoveling website I have ever seen. Their ad blocker blocker and paywall works, preventing me from reading articles. I even tried installing an ad blocker blocker blocker which doesn’t work.

So I open articles like this in incognito mode, and let it run its ads, and close the popups and mute the videos and try to ignore the visual distraction. But boy that page does not go quietly. Here’s how they reward their readers.

That’s a timeline of 30 seconds of page activity about 5 minutes after the article was opened. To be clear, this timeline should be empty. Nothing should be loading. Maybe one short ping, maybe loading one extra ad. Instead the page requested 2000 resources totalling 5 megabytes in 30 seconds. It will keep making those requests as long as I leave the page open. 14 gigabytes a day.

There’s no one offender in the network log, it’s a wide variety of different ad services. The big data consumer was a 2 megabyte video ad, but it’s all the other continuous requests that really worry me.

A lot has been written about the future of journalism, the importance of businesses like the LA Times being profitable as a way to protect American democracy. I agree with that in theory. But this sort of incompetence and contempt for readers makes me completely uninterested in helping their business.

Edit for pedantic nerds: for some unfortunate reason this blog post ended up on Hacker News where people raised eyebrows at my 14 gigabytes / day estimate. To be crystal clear: I did a half-assed measurement for 30 seconds and measured 5 megabytes. 5 * 2 * 60 * 24 = 14,400, or about 14GB. That’s all I did.

Yes, I know that extrapolation is possibly inaccurate. If you want to leave a browser window open for 24 hours to measure the true number, be my guest; I’ll even link to your report here. (Hope you have lots of RAM and bandwidth!) I’m a web professional with significant experience measuring bandwidth and performance, I know how to do this right. But that’s the LA Times webmaster’s job, not mine, and I don’t have a lot of confidence that they much care.

Edit 2: for a lot of people visiting my blog I think it is their first view of a network timeline graph like this. Generating your own timeline for any web page is easy if you use Chrome. It’s a standard view in the Network tab in Developer Tools. See the docs, but briefly you open dev tools and click on network and watch the waterfall graph. You want to open it before loading the page so it captures everything from the start. Beware a normal web page will end very soon and be boring; the LA Times is a very special website.

I use uBlock Origin, in the “advanced (I have read the warnings) mode.” I turn off 1st-party JavaScript to read them & I can read without all of their garbage. Amazing how fast it loads without all the script, as well.

Thanks for the kind works Marius! This visualization isn’t my work really, it’s just the standard Chrome Developer Tools view. I updated my blog post with some details (see “Edit 2”). You can easily make your own or demo it in real time, but here’s a higher resolution PNG for you: https://nelsonslog.files.wordpress.com/2017/03/capture4.png

With sites like these, it’s not just the amount of data that is mind-boggling, but also the number of different hosts they contact as part of the whole ad-networking, activity tracking and user metrics. See this: https://urlscan.io/result/8400ef9e-5c5d-4d82-a5cd-e7f1b64b296e The page you mentioned contacts 98 IPs in 8 countries across 60 domains to perform 399 HTTP transactions, and that’s just the initial load! There’s so many parties receiving juicy user information. I think it’s good if we begin naming and shaming ad-bloated sites like this.