Background

As a typical software industry person, I find that most of my searches land on stackoverflow. Besides software-related queries, I sometimes land on other stackexchange websites like bicycles, serverfault, superuser etc.

Last month I started observing a strange pattern. All of the stackexchange websites started opening popup ads on clicks. While I was looking for answers, a click anywhere on the page would open a popup ad. This was strange to me because:

It is not normal for such websites to serve ads in this way

I have an ad block plugin installed in my browser, but the ads were bypassing it somehow

No other person on my network was getting those ads

A few other websites were also serving these ads

The ads appeared only on some plain http websites and never on https websites

Based on the points above, I completely rejected the idea that stackexchange was serving these ads. My initial thought was that some kind of malware installed in my browser was injecting ads into my normal browsing.

Finding & losing the clue

This was very alarming for me, as I am very careful about what gets installed on my machine and where the software comes from. I thoroughly checked my system for any trace of malware. I checked installed applications, the registry, startup items, running processes and every other possible thing. Finally I ran Firefox without any plugins or extensions, but the ads were still being served.

Really annoyed by the situation, I pressed Ctrl+F5 (a hard reload that bypasses the browser cache) and the ads stopped appearing. No more popups. The likely explanation was a poisoned cache. Although the problem was solved, this was even more worrying: I was no longer getting the ads, but someone had managed to poison my browser cache and I had lost the clue.

Getting the ads once again

I never connect my system to public wifi. I use it only at home and in the office, and I have connected it via 3G a few times. So the possible culprit was one of these three ISPs.

I had almost forgotten about the incident until yesterday, when I was looking at a bicycle-related question on bicycles.stackexchange.com on my iPhone. Naturally, I tapped on the screen. The poor Safari browser went through several redirects and opened a popup. This time I was on 3G, Ufone 3G. This was exactly the same behaviour.

Identifying the culprit

I immediately opened my laptop, booted into Linux and connected to Ufone 3G via a hotspot connection. I opened a random stackoverflow question in Firefox private browsing mode (no cache) and the popup ad was there. I then connected to my home internet and tried the same steps: no popup. So it was clear. Ufone was injecting popup ad code into the stackoverflow website.

Postmortem

Why stackoverflow? I took dumps of the same question opened via Ufone 3G and via my home internet connection and diffed them. They were 100% the same. No difference. Not a single bit was modified. But wait a minute. What about the page resources? Let’s have a look at them. I took dumps of all the javascript files included in the page over my home connection and diffed them against the versions fetched over Ufone 3G. One of them was different: the Google Analytics javascript. What an intelligent choice. Just poison one JS file and you’ll cover the majority of the internet. Every second website will serve your ads. Here is the file:

http://www.google-analytics.com/analytics.js
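The comparison itself is easy to script. Here is a minimal sketch in Python, assuming the two copies of a script have already been saved as text. The names compare_dumps, home_copy and mobile_copy are made up for illustration, and the sample strings are stand-ins, not the real analytics.js content.

```python
import difflib
import hashlib


def sha256_of(text):
    """Return the SHA-256 hex digest of a text dump."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compare_dumps(clean, suspect):
    """Return unified-diff lines between two dumps (empty list if identical)."""
    # Cheap check first: identical hashes mean nothing was modified.
    if sha256_of(clean) == sha256_of(suspect):
        return []
    return list(
        difflib.unified_diff(
            clean.splitlines(),
            suspect.splitlines(),
            fromfile="home",
            tofile="ufone-3g",
            lineterm="",
        )
    )


# Hypothetical stand-ins for the two saved copies of one script.
home_copy = "var a = 1;\nvar b = 2;\n"
mobile_copy = "var a = 1;\nvar b = 2;\nvar injected = true;\n"

for line in compare_dumps(home_copy, mobile_copy):
    print(line)
```

Running the same check over every script a page loads, and not just the HTML, is what exposes a tampered resource like this one.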

So what was changed? First of all, it was obviously not Google that was serving the infected file. See the infected file’s response headers:

Who is doing this? Why?

Honestly, I am not sure. It could be an employee of Ufone, it could be malware infecting their servers, or it could be multiple people in their management getting $$$s for clicks. In any case this is dishonest and ethically wrong on their part. If they can hijack your browsing sessions, they can do anything they want.

What’s next?

I try to keep most of my browsing on https, but there are still a few websites on http. I also use Ghostery for Firefox, and I have blocked analytics.js and many other trackers from loading. Tunneling through Ufone 3G seems to be a good solution for now.

Update [December 21, 2015]:

This is not something new. Many people have already written about it, but there has been no official response from Ufone yet and no action taken by PTA.

Note: I am not a UI/UX expert. I am just sharing my feelings about
this design as a consumer with a little bit of design experience.

MCB Bank (one of Pakistan’s largest banks) recently introduced its
branchless banking product “MCB Lite”. As a consumer, I am not
satisfied with the design of the card, and I am going to share
my thoughts about it.

The designer tried to give the card the feel of a smartphone but
missed some very basic design principles. Smartphones, especially the
iPhone, have set very high standards of design, and anyone trying
to design something that looks like a smartphone has to be extra
careful. I don’t want to sound harsh, but it looks like the card was
designed by someone new to design. The printing quality is even worse.

I tried to find out what’s wrong with the design and here are my findings:

Spacing between icons is not consistent.

Text labels should NOT be within the icon. Rather, there should be no text in app icons at all.

None of the icons is designed properly. Each and every icon looks like resized clip art downloaded from google images.

It was a Saturday morning in November 2012 when I started seeing tweets
about the Google Pakistan and Microsoft Pakistan websites getting hacked. I
immediately checked both websites and they really were showing a message
from some Turkish hacker. I did an nslookup and found that the nameservers had
been changed to those of some free hosting service provider. Obviously, Google
and Microsoft were not hosting their websites on a free webhost. Actually they
were not the only ones affected: it was PKNIC that had been hacked. I quickly
did a reverse whois on those nameservers and randomly checked a few of the
domains. All of them were showing the same page. There were 284 domains pointing
to those specific nameservers. What? 284 domains hacked and people were talking
about just 2 domains. This must be mega news. I quickly tweeted this:

Not only this, the 284 figure was also published by print media. Here is a
news item from The News Pakistan (by Pakistan’s largest newspaper group):

So, as you can see, each and every news site and blog was after the news and
everyone was publishing it in their own words. What went wrong here? Did
anyone ask any of these blogs or news sites for a list of the 284 hacked
domains? Did they publish such a list?

The confession part

I tweeted and went for my breakfast. After breakfast I decided
to publish the list of these hacked domains. As I started reviewing the
list, I noticed that I had made a big mistake while counting the hacked
domains. There were 2 name servers pointing to that specific free hosting
provider, and I had counted all the domains pointing to either of those 2 name
servers. So there were actually just 142 domains, each one counted twice.
Now I was extra careful before publishing anything. I checked the name
server change history of all of those domains and noticed that only 110 had
been changed in the last 24 hours. What about the remaining 32 domains pointing
to that specific name server? All of them were showing real websites hosted by
that free hosting provider and they were not hacked. I verified twice and
published the list here. My blog was getting a huge traffic spike at that time.
A lot of news sites and blogs picked up the list immediately and updated their
news articles. This is how the online news world works. They pick up news
items from whatever source they can get them and publish them immediately
without verifying anything.
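The counting mistake, and the fix, can be sketched in a few lines of Python. The domain names and the summarize helper below are made up for illustration; this is not the real list, only the shape of the check I should have done before tweeting.

```python
def summarize(domains_by_nameserver, recently_changed):
    """Return (naive_count, unique_count, hacked_count)."""
    # Naive count: add up the per-nameserver lists (this double counts).
    naive_count = sum(len(d) for d in domains_by_nameserver.values())
    # Correct count: a domain listed under both nameservers counts once.
    unique = set().union(*domains_by_nameserver.values())
    # Only domains whose nameservers changed recently were actually hacked.
    hacked = unique & set(recently_changed)
    return naive_count, len(unique), len(hacked)


# Made-up toy data: every domain appears under both nameservers.
domains_by_nameserver = {
    "ns1.freehost.example": ["a.pk", "b.pk", "c.pk"],
    "ns2.freehost.example": ["a.pk", "b.pk", "c.pk"],
}
recently_changed = ["a.pk", "b.pk"]

print(summarize(domains_by_nameserver, recently_changed))  # (6, 3, 2)
```

With the real numbers the same three steps give 284 naive entries, 142 unique domains, and 142 - 32 = 110 hacked domains.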

As we are serving static content, there is no need to compress the content on
each and every request. We can generate gzipped copies along with the
other static content and serve them when requested. This approach, in my
opinion, is faster than the on-the-fly gzip compression used by nginx and
apache, because we save the CPU time spent compressing the content on each
request. I used the gzip_cache plugin to generate gzipped versions of all my
content. The next step was to serve this pre-compressed content when requested.
Static does not support this by default, so I had to modify it a little bit.
When a request that accepts gzipped content is received, it now tries to find
the gzipped copy of the content.
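The lookup logic is roughly the following. This is a minimal sketch and not the actual change I made to static: the function name pick_file and the ".gz" suffix convention are assumptions, and q-values in Accept-Encoding are ignored for simplicity.

```python
import os


def pick_file(path, accept_encoding, exists=os.path.exists):
    """Return (file_to_send, extra_headers) for a static file request."""
    # Split a header like "gzip, deflate;q=0.5" into bare encoding names.
    # Simplification: q-values (including q=0) are ignored here.
    encodings = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    gz_path = path + ".gz"
    if "gzip" in encodings and exists(gz_path):
        # Serve the pre-compressed copy; no CPU is spent compressing now.
        return gz_path, {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    # Fall back to the plain file when gzip is not accepted or not generated.
    return path, {}
```

The exists parameter is only there so the function can be tried without touching the disk, for example pick_file("index.html", "gzip", exists=lambda p: True).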

This is purely handled by the HTTP Server serving the content. Again I had to
make a few changes in static to enable caching. I tried to keep the
syntax similar to Apache’s ExpiresByType. Expire time can be specified in
seconds against each mime type.
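To give an idea of what this looks like, here is a minimal sketch in Python. The EXPIRES_BY_TYPE table, its values and the cache_headers function are made up for illustration; they are not the actual syntax or code I added to static.

```python
import mimetypes
from email.utils import formatdate

# Hypothetical settings: expire time in seconds against each mime type.
EXPIRES_BY_TYPE = {
    "text/css": 604800,     # 7 days
    "image/png": 2592000,   # 30 days
    "text/html": 3600,      # 1 hour
}


def cache_headers(filename, now):
    """Return caching headers for a file, or {} if its type is not listed.

    `now` is a Unix timestamp, e.g. time.time().
    """
    mime_type, _ = mimetypes.guess_type(filename)
    max_age = EXPIRES_BY_TYPE.get(mime_type)
    if max_age is None:
        return {}
    return {
        "Cache-Control": "max-age=%d" % max_age,
        # HTTP dates must be in GMT, hence usegmt=True.
        "Expires": formatdate(now + max_age, usegmt=True),
    }
```

A call like cache_headers("style.css", time.time()) would then produce the headers to attach to the response.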

Again this is purely handled by the HTTP Server and I had to make a few
changes in static to make it possible. Just like Expires headers, I
tried to keep the syntax similar to apache’s AddCharset. Charset can be
set for filename patterns.
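A minimal sketch of the idea in Python, assuming a made-up CHARSET_BY_PATTERN table and function name (again, not the actual code in static):

```python
from fnmatch import fnmatch

# Hypothetical settings: charset against filename patterns, first match wins.
CHARSET_BY_PATTERN = [
    ("*.html", "utf-8"),
    ("*.css", "utf-8"),
    ("*.js", "utf-8"),
]


def content_type_with_charset(filename, mime_type):
    """Append a charset to the Content-Type if the filename matches a pattern."""
    for pattern, charset in CHARSET_BY_PATTERN:
        if fnmatch(filename, pattern):
            return "%s; charset=%s" % (mime_type, charset)
    # No pattern matched (e.g. images): leave the Content-Type untouched.
    return mime_type
```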

This blog template was designed using twitter bootstrap and lots
of custom css. Even after combining and minifying, the size was 130KB. I
used mincss to find unused css and remove it. Now the CSS is just 14KB
(4KB gzipped). I had to re-add some styles which were used on other pages.
Once again, this is done offline and at design time only.

What’s still missing?

Specify image dimensions

As this is a responsive design, it is not possible to serve all images with
their dimensions specified. The images resize themselves according to the
screen size. Although we could use some javascript to determine the screen size
and resize images accordingly, this would have its own overhead.

Leverage browser caching for external resources

This blog uses only one external resource, ga.js, which is the javascript
file used by Google Analytics. It comes with an Expires header of 12 hours.
There has been a lot of discussion about caching it and serving it from one’s
own servers, but I guess anything like this would be overkill. ga.js is so
common that the visitor’s browser has probably downloaded it already from
some other website.

Using CDN for static content

This task is on my todo list and I am still looking for a good (preferably
free) CDN.

This blog post is a continuation of Part-I.
The sample has now been increased to 150K Pakistani tweeps.

Followers

Follower count is no longer a good measure of influence. On average, each
Pakistani tweep is followed by 129 users. The majority of Pakistanis
(about three quarters) have fewer than 50 followers. Half of Pakistani twitter
users have fewer than 10 followers. There are about 10,000 tweeps with
no followers and about 12,000 tweeps with a single follower. This is a
very strange trend. If you look closely at these accounts, you’ll notice
that most of them have the default DP and default background. It seems like
these are fake accounts, created by the social media cells of different
political parties to increase the follower counts of their leaders on twitter.
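The gap between the average (129) and the halfway point (fewer than 10) is what you get when a few huge accounts pull the mean up. A tiny sketch in Python with made-up follower counts, not the real sample, shows the effect:

```python
from statistics import mean, median

# Made-up follower counts: many tiny accounts and one very large one.
followers = [0, 1, 2, 3, 5, 8, 9, 40, 300, 60000]

# The mean is dominated by the single large account.
print("mean:", mean(followers))
# The median describes the typical account far better.
print("median:", median(followers))
```

Here the median is 6.5 while the mean is in the thousands, which is why the average alone says little about a typical tweep.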

On the other side, there are just 24 Pakistanis with more than 50,000
followers. Most of them are politicians and TV anchors. Just 2331
tweeps have more than 1000 followers.

Klout

Klout is a more reliable measure of social media influence. Out of 150,000
Pakistani tweeps, about 40,000 do not have any klout score. About 70,000 have a
klout score between 11 and 20. The average klout score is 16.72. About 12,000
have the minimum possible score of 10.