11 | Research Paper HTML5 Overview: A Look at HTML5 Attack Scenarios
Stage 4: Network Scanning
One of HTML5’s features is the ability to make a direct connection to any machine on
any port. Some restrictions have been put in place for this scenario but researchers
have shown that this can be successfully used not only in a port-scanning attack but
also as a full-blown vulnerability scanner.
Mr. H. Acker could order each of his initial 10 infected browsers to perform a
vulnerability scan of the network, particularly looking at what internal web servers
run on the intranet. After an hour or so, the attacker can have a detailed network map
of the organization. This map can list all of the company’s machines, what OSs these
run, what services these have installed, the machines’ individual patch levels, and
what vulnerabilities these have.
Stage 5: Spreading
As part of the vulnerability scan, Mr. H. Acker noticed that every user has a default
home page, which points to the intranet site, http://myhome.bravo.com. While the
company’s information security team has done a good job of hardening their external
sites, this internal site runs a version of a Web application with a known SQL injection
bug. The attacker can then order one of his 10 initial browsers to exploit this bug,
installing his attack script on the intranet site. Within hours, his number of infected
systems can rise from 10 to almost all of the company’s systems.
This stage helps the attacker overcome the big drawback of using a browser-based
botnet. While browser bots are incredibly stealthy and can bypass most traditional
security mechanisms, the attacker’s connection will be lost as soon as the victim closes
his browser. Attackers need to factor this in as part of their botnet design. Browser-based
botnets are used for tasks (e.g., spamming, Bitcoin mining, and conducting a
distributed denial-of-service [DDoS] attack) that do not rely on always being on. Note
though that the benefits these provide make the trade-offs more than acceptable.
Mr. H. Acker knows his bots will go offline and come back online, depending on his
victims’ system status. It is therefore good that he has established two persistent
reinfection vectors—the compromised vintage car forum site and the compromised
intranet site. Every time a victim visits either site, his system rejoins the botnet. The
attacker also used techniques such as social engineering, clickjacking, and tabnabbing
to extend the amount of time each bot remains online.
Completing this stage allowed Mr. H. Acker to fulfill two of the terms of his agreement
with Acme. He maximized the Bravo compromise and produced a very detailed network
map of the organization. His next step is to exfiltrate login and personal credentials.

The given code loads the attacker’s content and embeds it in a vulnerable site,
ultimately running the code on a victim’s system.
An extension of this attack known as “cross-site posting” is discussed in greater detail
in Kuppan’s paper. Cross-site posting is almost the reverse of remote file inclusion,
except that in this case, an attacker does not try to embed his own code in the page.
Instead, he tries to have sensitive data that is supposed to go to the legitimate
web server sent to his own server. Imagine a page that uses the same
XMLHttpRequest() style setup as those described above. This page asks a user to
enter his user name and password, along with some other confidential information.
Normally, the URL for such a page looks like the following:
http://www.example.com/#login.php
An attacker, however, can send the following link to the user instead:
http://www.example.com/#http://www.attacker.com/stealDetails.php
The vulnerable page now sends sensitive login data to the attacker’s server, something
that could not happen in the past due to the Same Origin Policy. It is likely that this
issue will continue to be a vulnerability until web developers realize they need to go
back and put extra security checks in place in their code.
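One form such an extra security check could take is refusing any fragment-supplied target that is not on a fixed list of the site's own pages. The following is a minimal sketch of that idea, written in Python only to keep the logic compact; a page like the one described above would apply the same test in JavaScript before calling XMLHttpRequest(). The function name, the ALLOWED_PAGES list, and its entries are assumptions made for illustration and do not come from Kuppan's paper.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit

# Hypothetical allowlist of pages the site is willing to load via the fragment.
ALLOWED_PAGES = {"/login.php", "/news.php"}


def resolve_fragment_target(page_url: str, fragment: str) -> Optional[str]:
    """Return the URL to request for a fragment, or None if it must be refused."""
    base = urlsplit(page_url)
    # Resolve the fragment the way a relative link would be resolved.
    target = urljoin(f"{base.scheme}://{base.netloc}/", fragment)
    parts = urlsplit(target)
    # Accept only targets on the page's own origin AND on the allowlist.
    same_origin = (parts.scheme, parts.netloc) == (base.scheme, base.netloc)
    if same_origin and parts.path in ALLOWED_PAGES:
        return target
    return None
```

With this check, the fragment "login.php" resolves to a page on www.example.com, while the fragment from the attacker's link resolves to a different origin and is refused.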
Sending Arbitrary Content
One of the assumptions the HTML5 specifications make is that COR should in no way
increase the attack surface of legacy servers that do not have knowledge of the COR
specifications. This also assumes that the new specifications do not grant additional
capabilities to JavaScript in terms of requests that can be made. As previously shown,
a number of issues may be present in legacy sites whose pages call XMLHttpRequest()
without validating whether the target site has the same origin.
One should also consider that there are no restrictions on the request part of an
XMLHttpRequest(). In other words, site A can request the content of any other site on
the Internet but can only read the response if the other site explicitly allows it to do so.
In a lot of cases though, merely requesting a page on another server is enough to have
an effect on that server’s web application. Take the following request to an imaginary
page as an example:
http://www.gamblingSite.com/placeBet.php?User=Robert&bet=1000&horse=1&race=10
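Because the requesting site cannot read the response, one common way for a server to tell its own pages' requests apart from a cross-site request like the one above is to require a secret token that only its own pages can obtain. The sketch below illustrates that idea under stated assumptions: the function names, the SERVER_SECRET value, and the rule that state-changing requests must use POST are all hypothetical and are not taken from the paper.

```python
import hashlib
import hmac

# Assumption: a per-deployment secret known only to the server.
SERVER_SECRET = b"replace-with-a-random-secret"


def issue_token(session_id: str) -> str:
    """Token the legitimate site embeds in its own pages for this session."""
    return hmac.new(SERVER_SECRET, session_id.encode(), hashlib.sha256).hexdigest()


def is_request_allowed(method: str, session_id: str, token: str) -> bool:
    """Allow a state-changing request only if it is a POST with a valid token."""
    if method.upper() != "POST":
        return False
    expected = issue_token(session_id).encode()
    # Constant-time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(expected, token.encode())
```

Under this scheme, a request such as the placeBet.php example would be rejected both because it arrives as a GET and because another site has no way to learn the token.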
To make things even more interesting, HTML5 enables a new scenario wherein the
post data sent by the requesting site is no longer restricted to the key=value format
found in web forms. Data can instead be sent in an arbitrary format. The configuration
of the web server may not be prepared to handle such an input, which can lead to
undesirable results.
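A server that expects only form-style input can guard against arbitrarily formatted bodies by checking the declared content type and the shape of the data before acting on it. The following is a rough sketch of such a check, not a complete input validator; the function name is hypothetical, and the EXPECTED_FIELDS set simply reuses the parameter names from the example request above.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs

# Field names reused from the example request; treated here as the full schema.
EXPECTED_FIELDS = {"User", "bet", "horse", "race"}


def parse_form_body(content_type: str, body: str) -> Optional[Dict[str, str]]:
    """Return the form fields, or None if the body is not well-formed form data."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/x-www-form-urlencoded":
        return None
    try:
        # strict_parsing makes parse_qs raise on anything that is not key=value.
        fields = parse_qs(body, keep_blank_values=True, strict_parsing=True)
    except ValueError:
        return None
    # Require exactly the expected keys, each supplied exactly once.
    if set(fields) != EXPECTED_FIELDS:
        return None
    if any(len(values) != 1 for values in fields.values()):
        return None
    return {name: values[0] for name, values in fields.items()}
```

Anything that is not a single, complete set of key=value pairs is refused rather than passed on to the application.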

Imagine this scenario: A victim is browsing the web using an unsecured wireless
network in a local cafe. The attacker is also in the cafe and can spoof any site the
victim browses. When the victim’s machine requests a page on site A, the attacker’s machine
sniffs the request and sends back a false page before the real site can respond. In
the scenarios described in the blog, the attacker tries to store a false login page for a
webmail service provider in the user’s cache so the victim continues to load the fake
login page even after they have both left the cafe.
One approach is to use the standard browser cache although this has an issue caused
by HTTPS, which will be best explained by the following example:
1. A user browses webmail.com.
2. An attacker responds to the user with a fake login page. The page is also stored in
the user’s browser cache.
3. If the user enters his login details, the attacker will now gain access to these.
However, the attacker’s goal is to continue making the user load the fake
login page.
4. The victim returns home and once more types “webmail.com” into his browser. In a
normal browser cache, only pages are cached (e.g., the attacker’s false
webmail.com/login.php page) but not the root of a domain. So, the browser will follow
these steps:
a. Ignore the browser cache and directly request http://webmail.com.
b. Webmail.com informs the browser to download webmail.com/login.php.
c. The browser will load the cached (i.e., false) version.
So, what is the issue? In most cases, login pages are served over HTTPS, which
complicates things. What will actually happen in step 4 is the following:
4. The victim returns home and once more types “webmail.com” in his browser. So,
the browser will follow these steps:
a. Ignore the browser cache and directly request http://webmail.com.
b. Webmail.com informs the browser that it only accepts HTTPS.
c. The browser requests https://webmail.com.
d. Webmail.com informs the browser to download https://webmail.com/login.php.
In this case, the attacker’s plan failed. He poisoned the HTTP login file but could not
poison the HTTPS one. So, how does the application cache get around this issue? It
allows the root file “/” of a site to be cached so that it will always be loaded from the
application cache. Let us see how this changes the attack:
1. A user browses webmail.com.
2. An attacker responds to the user with a fake login page. This page also includes
the manifest attribute in the HTML element so it is added to the application cache.
3. The victim returns home and once more types “webmail.com” in his browser. The
browser now checks to see if it has a cached entry for http://webmail.com, which it
does. It presents the false login page to the user.
In this scenario, because the application cache allows root caching for a site, the
false login page will successfully be loaded from the cache and the browser will never
attempt to make a connection to https://webmail.com.
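The difference between the two walkthroughs can be made concrete with a toy model of the lookup steps. This is only a sketch of the reasoning in the text, not of how any real browser implements its caches; the REAL_SITE table, the load() function, and the root_cacheable flag are all invented for illustration, and the model assumes, as the text does, that a normal cache never serves the root of a domain.

```python
from typing import Dict
from urllib.parse import urlsplit

# Toy model of the real site's answers ("-> X" stands for a redirect to X).
REAL_SITE: Dict[str, str] = {
    "http://webmail.com/": "-> https://webmail.com/",
    "https://webmail.com/": "-> https://webmail.com/login.php",
    "https://webmail.com/login.php": "real login page",
}


def load(url: str, cache: Dict[str, str], root_cacheable: bool) -> str:
    """Follow cache hits and redirects until a page body is reached."""
    while True:
        is_root = urlsplit(url).path in ("", "/")
        # A cached entry is used unless it is the root and roots are not cacheable.
        if url in cache and (root_cacheable or not is_root):
            return cache[url]
        answer = REAL_SITE[url]
        if not answer.startswith("-> "):
            return answer
        url = answer[3:]  # follow the redirect


# Normal cache: only the HTTP login page was poisoned, so HTTPS wins.
normal_cache = {"http://webmail.com/login.php": "fake login page"}
print(load("http://webmail.com/", normal_cache, root_cacheable=False))

# Application cache: the root itself was poisoned and is served directly.
app_cache = {"http://webmail.com/": "fake login page"}
print(load("http://webmail.com/", app_cache, root_cacheable=True))
```

In the first call the lookup is redirected to HTTPS and never touches the poisoned entry; in the second, the poisoned root entry is returned before any request is made.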
The Andlabs.org blog entry also describes how to make this a more persistent attack by
ensuring that the application cache for the targeted page does not get updated. It also
presents a POC attack.