World Wide Web

Introduction

The World Wide Web is a non-proprietary hypertext document system available
on the Internet; it is also referred to as W3, WWW, and 'the web'. The World
Wide Web was not the first hypertext system, but it was the first hypertext
system to be successfully interconnected with the protocols (TCP/IP) of the
Internet. Designed to give users unfettered access to information, and with
no centre, the web aims at a technological levelling of hierarchy
(horizontal); earlier information systems tended to be vertical, with a
strong power structure. This is due, in part, to the World Wide Web
supporting unidirectional links: a resource can be linked to without the
owner of that resource having to give permission. Berners-Lee managed to
'tie the knot' between hypertext and the Internet by inventing or
co-inventing three technologies: a markup language for writing hypertext
documents (HTML), a scheme for addressing resources (URIs/URLs), and a
protocol for transferring them (HTTP).

The World Wide Web is designed around a client-server model, which splits
the workload between a server (which provides data) and a client (which
requests it). While the server shares its resources with the client, the
client usually only accesses those resources and does not share any of its
own. The World Wide Web therefore consists of web servers (computers that
host files) and web browsers (client programs that retrieve those files).
It is called a 'web' because hypertext documents (webpages) are connected
to one another, in a system likened to a spider's web, through hyperlinks.
Hyperlinks are text or images embedded in hypertext documents that include
a URL. Uniform Resource Locators (URLs) are a type of Uniform Resource
Identifier (URI), and are used with the Hypertext Transfer Protocol (HTTP)
to locate the computer on which a webpage is stored.

The World Wide Web is not the only online document retrieval system: Gopher
is an example of another. The World Wide Web is also commonly conflated with
the Internet itself: in fact, the web is a service accessed over the
Internet; the Internet existed before the World Wide Web and could continue
to exist without it. That said, the World Wide Web is the most popular
service on the Internet, and is essential to many 'real world' civil and
business services.

History

Sir Tim Berners-Lee is credited with inventing the World Wide Web: from
1989 to 1991, he proposed and developed the software systems of the Web
while employed at CERN. Berners-Lee had attempted to build a document
system for CERN in the early 1980s: the system was named ENQUIRE, and its
purpose was to help CERN scientists share information, and to avoid losing
information. Tim Berners-Lee was in the 'right place at the right time' to
invent the World Wide Web: from 1983, the CERN Networking Group - whose
members included Brian Carpenter, Giorgio Heiman, François Flückiger, and
Jean-Michel Jouanigot - had begun to build internal and external networking
infrastructure. The group decided to implement TCP/IP instead of the ISO
networking standard, and by 1991 the CERN external network was a hub for
international Internet traffic, handling up to 80% of Europe's
international Internet traffic. CERN was therefore an ideal location from
which to launch a new Internet service.

Tim Berners-Lee: working at CERN

After the failure of ENQUIRE, Berners-Lee proposed a new hypertext project
- taking advantage of CERN's IP network - that would be available to
'everyone'. When Berners-Lee proposed the project to his boss, Mike
Sendall, he referred to the system as a "Mesh"; he settled on "World Wide
Web" while writing the code for the system in 1990. Berners-Lee's proposal
was submitted in March 1989 under the title "Information Management: A
Proposal"; Mike Sendall wrote on the paper, 'Vague, but exciting'. In May
1989, Berners-Lee appears to have circulated the same proposal described as
"a large hypertext database with typed links". This proposal was not
successful, but it led Berners-Lee to ask Robert Cailliau for help in
developing a more concrete proposal: the one they produced was published on
the 12th of November 1990 and was titled "WorldWideWeb: Proposal for a
HyperText Project". This proposal was green-lit, and Cailliau and
Berners-Lee began building a development team to launch their new hypertext
project. The project was originally referred to as the CERN WWW Project,
and its members included: Alain Favre, Arthur Secret, Bebo White, Bernd
Pollermann, Carl Barker, Dan Connolly, David Foster, Eelco van Asperen,
James Whitescarver, Jean-Francois Groff, Jonathan Streets, Nicola Pellow,
Peter Dobberstein, Paul Kunz, Pei Wei, Robert Cailliau, Tim Berners-Lee,
Tony Johnson, and Willem van Leeuwen.

(Pictured: Early www / web logo designed by Robert Cailliau)

Berners-Lee decided that the hypertext documents of the World Wide Web
would be read-only, and accessed through a client-server architecture
(browsers). The first World Wide Web server (used by Berners-Lee) was a
NeXT computer, and the first web server software was named CERN httpd,
designed by Ari Luotonen, Henrik Frystyk Nielsen, and Tim Berners-Lee.
Since then, a plethora of HTTP server software, such as Apache, has
simplified the process of hosting web servers and helped popularise the
World Wide Web. Berners-Lee also designed the first web browser,
unsurprisingly named WorldWideWeb. The second browser created - a version
of the WorldWideWeb browser ported to several operating systems - was the
Line Mode Browser, designed by Tim Berners-Lee, Henrik Frystyk Nielsen, and
Nicola Pellow. In 1992, Robert Cailliau and Nicola Pellow developed the
first browser for the Macintosh platform, named MacWWW. Pei Wei, another
member of the WWW Project, developed the ViolaWWW hypertext browser.

The first Web server (NeXT computer) for the World Wide Web

The World Wide Web first became available as a public Internet service on
the 6th of August 1991, when Berners-Lee released information about his
"Hypertext project" on the newsgroup alt.hypertext. Berners-Lee had,
however, launched the first web server on the 25th of December 1990; some
simple webpages were available for download, but only a select number of
people knew the project existed. On the 30th of April 1993, CERN made the
World Wide Web's software - such as its library of code - publicly
available, with the aim of increasing its popularity. CERN also announced
that the World Wide Web would be free to use, with no licence fee charged
to developers (unlike Gopher, whose developers had begun charging a licence
fee to host a server). Alongside CERN's decision to disclaim ownership of
the Web, another key factor in making the Web the Internet's most popular
information system was the Mosaic web browser: it was easy to use, stable,
and capable of displaying graphics and text on the same page. Where most
early web browsers were used by thousands of users, Mosaic was the first
browser to be used by millions.

The World Wide Web was fortunate to launch just as the Internet was
transitioning from a U.S. government-funded network to a commercial one. By
1995, the World Wide Web was the Internet's most popular service, and it
enabled the creation of large tech companies like Amazon, Yahoo!, PayPal,
and eBay. The early growth of the Internet and the World Wide Web led to
the dot-com bubble, in which the shares of Internet companies soared in
value (1997-2000) and then crashed. In 1994, Berners-Lee left CERN and
founded the World Wide Web Consortium (October 1994), whose purpose is to
create new web standards and to educate web developers.

HTTP and Internet Protocols

The World Wide Web is a service/application found on the Internet: the
Internet is a system of interconnected computer networks that uses TCP/IP
(the Internet protocol suite). HTTP (Hypertext Transfer Protocol) is the
protocol that the Web uses, and it sits in the application layer of the
Internet protocol suite. The World Wide Web could not function without
HTTP: HTTP is a 'request-response' protocol, which means that one computer
sends a request for data and another computer responds to that request.
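The request-response exchange can be sketched in a few lines. The sketch
below composes the text of a minimal HTTP/1.1 request and parses the status
line of a response; the host, path, and response shown are hypothetical
placeholders, not a real endpoint.

```python
# Illustrative sketch of HTTP's request-response exchange.
# "example.com" and "/index.html" are placeholder values.

def build_get_request(host: str, path: str) -> str:
    """Compose a minimal HTTP/1.1 GET request (the 'request' half)."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )

def parse_status_line(response: str) -> tuple[str, int, str]:
    """Split the first line of a response (the 'response' half)."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason

request = build_get_request("example.com", "/index.html")
version, code, reason = parse_status_line(
    "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
)
```

In practice a browser sends the request text over a TCP connection to the
server and reads the response back over the same connection.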

As with most application layer protocols, the World Wide Web is based upon
a client-server computing model: a client application program (browser),
residing on a user's computer, uses HTTP to send a request to retrieve data
from a web server connected to the Internet. Files are identified by a
Uniform Resource Identifier (URI); the World Wide Web uses a form of URI
named the Uniform Resource Locator (URL). URLs are embedded within
hyperlinks: hyperlinks, usually referred to simply as 'links', are embedded
within webpages (hypertext documents), and a user simply has to click on a
hyperlink for the browser to use HTTP to locate and retrieve the resource.
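Before issuing a request, a browser decomposes the URL into its parts: the
scheme (protocol), the host to contact, and the path of the resource on
that host. A sketch using Python's standard urllib.parse module (the URL
itself is a hypothetical example):

```python
# Decomposing a URL the way a browser does before sending an HTTP request.
from urllib.parse import urlparse

url = "http://example.com:80/path/page.html?id=42#section"  # hypothetical URL
parts = urlparse(url)

# parts.scheme   -> the protocol to use ("http")
# parts.hostname -> the server to contact ("example.com")
# parts.path     -> the resource to request ("/path/page.html")
# parts.query    -> extra parameters ("id=42")
```

The hostname is then resolved to an IP address, and the path is sent in the
HTTP request line to that server.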

When a browser uses HTTP to locate and retrieve a file, more than one
Internet protocol is involved. Through a process of encapsulation, HTTP
typically uses the Transmission Control Protocol (TCP) of the transport
layer of the Internet protocol suite: TCP ensures that application layer
data is reliably sent and received. The TCP data segments are then
encapsulated (enveloped) into Internet Protocol (IP) packets, which are in
turn encapsulated in link layer frames as the data 'hops' across the
Internet from host to host. The process is likened to a letter being placed
inside an envelope, which is placed inside another envelope, which is
placed within a final envelope.
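The envelope analogy can be made concrete with nested data structures. The
sketch below is illustrative only: the field names are simplified
assumptions, not the real TCP/IP header formats, and the addresses are
documentation placeholders.

```python
# Each layer wraps the payload of the layer above, like nested envelopes.
# Field names and addresses are simplified, hypothetical stand-ins.
http_request = "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"

tcp_segment = {"src_port": 49152, "dst_port": 80, "payload": http_request}
ip_packet = {"src_ip": "203.0.113.5", "dst_ip": "198.51.100.9",
             "payload": tcp_segment}
frame = {"src_mac": "aa:bb:cc:dd:ee:ff", "payload": ip_packet}

# The receiving host unwraps the envelopes in reverse order:
letter = frame["payload"]["payload"]["payload"]
```

Unwrapping the three layers recovers the original HTTP request, just as the
final recipient of the letter opens all three envelopes.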

Content on the World Wide Web

If users want to upload content to the World Wide Web, they need to upload
it to a web server: a computer system that processes requests via HTTP. The
most common files uploaded to a web server are image files (GIF, JPEG) and
HTML documents. The next issue is how users access the files/content
located on a web server: one option is via the web server's IP address, but
the most common way is through the Domain Name System (DNS). A domain name
is registered, such as example.com, and DNS records are then created that
tie the domain name to the web server: all users need to do is enter an
address built on the domain name, such as example.com/file.html, to locate
files/content uploaded to the web server.
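The role of a DNS record can be sketched as a simple lookup table. This is
a deliberate simplification with a hypothetical record: real resolution
queries a hierarchy of DNS servers (in Python, socket.getaddrinfo performs
it), but the essential mapping is the same.

```python
# A simplified sketch of DNS resolution: records map a domain name to the
# web server's IP address. The record below is hypothetical.
dns_records = {
    "example.com": "203.0.113.10",  # hypothetical A record
}

def resolve(domain: str) -> str:
    """Return the IP address recorded for a domain (sketch only)."""
    return dns_records[domain]

# The browser resolves the name, then sends its HTTP request to this IP.
server_ip = resolve("example.com")
```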

When a collection of webpages (typically HTML documents, including an
index.html file) is uploaded to a web server tied to a domain name, the
overall resource is termed a website. Users who wish to create their own
website need only set up or rent a web server, register a domain name, and
upload web content to the server. Visitors can then access the content by
entering its URL (which includes the domain name) into a browser, and HTTP
will retrieve it. The web server will usually have a monthly bandwidth
limit; if this bandwidth (download limit) is exceeded, the content becomes
unreachable and requests for it are answered with an error message.
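The publish-then-retrieve cycle described above can be demonstrated
entirely on one machine with Python's standard library: a tiny web server
shares a directory, and a client fetches a page from it by URL. This is a
local sketch only; a real website would sit on a server tied to a
registered domain name.

```python
# Serve a one-page "website" locally, then retrieve it over HTTP.
import http.server
import os
import tempfile
import threading
import urllib.request

# 1. "Upload" content: place an index.html in the directory to be served.
site_dir = tempfile.mkdtemp()
with open(os.path.join(site_dir, "index.html"), "w") as f:
    f.write("<html><body>Hello, Web</body></html>")

# 2. Start a web server on an ephemeral localhost port.
def handler(*args, **kwargs):
    return http.server.SimpleHTTPRequestHandler(
        *args, directory=site_dir, **kwargs)

server = http.server.HTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# 3. Retrieve the page by URL, as a browser would.
url = f"http://127.0.0.1:{server.server_port}/index.html"
page = urllib.request.urlopen(url).read().decode()
server.shutdown()
```

The `directory` parameter of SimpleHTTPRequestHandler (Python 3.7+) scopes
the server to the chosen folder, mirroring how hosting providers map a
website to a document root.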

Web content falls into two broad categories: commercial and non-commercial.
The World Wide Web has spawned many successful online commercial
businesses, referred to as e-tailers (e-tailing) and e-commerce businesses.
Some notable e-commerce businesses are Amazon, eBay, and PayPal - most of
these businesses are located in California, the state in which ARPANET (the
network that evolved into the Internet) was launched. Commerce on the Web
has also given rise to virtual currencies, such as Bitcoin. While the
world's largest technology corporations are typically located in a tech
centre named Silicon Valley (California), UK technology companies can be
found in Silicon Glen (Scotland) and Silicon Roundabout (East London Tech
City).

Accessing and finding Web content

When the World Wide Web was launched in the early 1990s, the general public
required three things to access the web: 1) a computer with an operating
system and hardware that supported TCP/IP network access; 2) a web browser
(client) that could retrieve content from web servers; and 3) an access
account with an Internet Service Provider (ISP). Due to the relatively high
cost of these requirements, cybercafes became a new business venture,
providing access for those curious about the newfangled 'information
superhighway'. The difficulty of accessing the Internet (TCP/IP) is
highlighted by the fact that Windows 95 did not originally ship with a
default TCP/IP network installation or a browser. In 1995, therefore, the
first edition of the world's most popular operating system did not come
Internet ready; later editions rectified this and packaged the Internet
Explorer browser with the operating system.

Due to the painfully slow download/upload speeds of dialup - the leading
access technology before the year 2000 - the content available to web users
was fairly basic, consisting mostly of text and images. From 1992 to 1994
the web was extremely small in comparison to the present day, and content
was found either through word of mouth or from lists of links (directories)
provided on newly founded websites like Yahoo!. As the web expanded in the
mid-1990s, it became clear that maintaining lists of links was not feasible
and that an automated alternative was required: thus the search engine was
born. AltaVista launched in 1995, and was the most popular search engine
before the creation of Google in 1998.

By the late 1990s, it had become easier to access the web: computer and
modem costs had fallen, Windows editions were now Internet ready, and new
UK ISPs, like Freeserve, were offering 'pay as you use' access accounts
with no expensive startup costs. When broadband (DSL technology) was
launched in the UK in 2000, the Internet and the World Wide Web started to
be viewed as more than just a novelty, or the preserve of boffins and
nerds. Broadband's potential (over 10 times faster than dialup) gave
content creators the ability to rival established media technologies such
as television and radio (YouTube, podcasts, etc.). From 2000 to 2005, the
World Wide Web established itself as a place to do business and access
media: content was increasing, search engine algorithms were becoming more
sophisticated, computer technology was expanding to offer voice and video
capabilities, and broadband was enabling this evolution.

The biggest difference between web access in 2005 and in 2017 is the access
device: in 2005 it was a desktop computer; in 2017 access is split fairly
evenly between computers, tablets, and smartphones. The software used is
still the same - a browser - though mobile browsers have had to be
developed and websites redesigned to be mobile friendly. Early websites
were often designed with frames, and tended to be unsuited to rendering on
a smartphone screen. Search engines are still the most popular way to find
web content: Google has indexed billions of webpages into its search
results, and the problem search engines now face is keeping up with the
amount of new content being created. Social media sites are beginning to
rival Google as a 'hub' from which to access and locate content, with most
individuals and businesses having created a Twitter or Facebook page. Video
content (primarily YouTube) is competing with text pages as the dominant
form of content on the web; prior to the launch of broadband, video content
was not feasible. Access accounts have largely remained the same in cost,
but download speeds and usage limits have improved considerably with the
launch of superfast broadband (fibre optic technology). Wireless access is
now the norm: through Wi-Fi in the home or business, and mobile network
access 'on the move'.

Efforts have been made to improve access for people with disabilities: the
World Wide Web Consortium (W3C) launched the Web Accessibility Initiative
(WAI) in 1997, whose goal is to improve Web accessibility for people with
auditory, cognitive/intellectual, visual, and motor/mobility disabilities,
and seizure disorders. Persuading webmasters not to use flashing or
strobing effects on their websites - which affect people suffering from
photosensitive epileptic seizures - is one example of how initiatives like
the WAI can help ensure that web content is correctly designed for the
disabled.

Privacy and the World Wide Web

While there is no requirement to record browsing on the World Wide Web, the
majority of browsing is recorded to some degree. The World Wide Web is
based upon a client-server model: a client program (browser) requests and
retrieves files (webpages, pictures) from a web server (a computer
connected to the Internet). Whenever a file is requested and retrieved on
the World Wide Web, the client and the server therefore usually both record
the session. Web servers are installed with software that usually logs the
IP address of every incoming request for data. Likewise, the user's browser
(client) will usually record the data transmission by keeping a copy of the
retrieved files in its cache (directory) and keeping a record in its
history feature. Internet Service Providers - the networks through which
users access the Internet - also keep records of each user's usage. The
internal policies of ISPs differ: it is difficult to know precisely what an
ISP will record and store, and for how long. ISPs will generally only share
their user logs when demanded by a legal entity.
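A server's access log typically records one line per request. The sketch
below parses a line in the Common Log Format used by Apache httpd and other
servers to pull out the client's IP address; the log line itself is a
hypothetical example.

```python
# Parsing one entry of a web server access log (Common Log Format).
# The log line is a hypothetical example, not real traffic.
import re

log_line = ('203.0.113.7 - - [10/Oct/2017:13:55:36 +0000] '
            '"GET /index.html HTTP/1.1" 200 2326')

pattern = (r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
           r'"(?P<request>[^"]+)" (?P<status>\d+) (?P<size>\d+)$')
entry = re.match(pattern, log_line).groupdict()

# entry["ip"] is the requesting client's IP address;
# entry["request"] shows which resource was fetched.
```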

Alongside server logs, websites can also record the browsing habits of
their users by using HTTP cookies (invented by Louis Montulli). Cookies are
small files, stored on the user's computer, that hold information such as
username and password authentication and past browsing history. If a user
returns to a commercial website - for example, eBay - the user will not be
required to enter their username and password again, and the website can
serve content related to what they viewed on their last visit. Users can
delete cookies whenever they wish, and the typical (first party) cookie
does not pose a serious privacy risk, especially if the user has not
provided personally identifiable information to the website. However,
tracking cookies, referred to as third party cookies, can compile a
long-term record of a user's browsing history - as they record browsing
habits across multiple websites - and are sometimes viewed as malware.
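The round trip a cookie makes can be sketched with Python's standard
http.cookies module: the server sets the cookie in a Set-Cookie response
header, and the browser stores it and replays it on later visits. The
cookie name and value below are hypothetical.

```python
# A sketch of how a cookie travels between server and browser.
from http.cookies import SimpleCookie

# Server side: create a cookie to send in a Set-Cookie response header.
server_cookie = SimpleCookie()
server_cookie["session_id"] = "abc123"  # hypothetical session token
server_cookie["session_id"]["path"] = "/"
set_cookie_header = server_cookie["session_id"].OutputString()

# Client side: the browser stores the header and parses it back on the
# next visit, sending the value with each request to the same site.
browser_cookie = SimpleCookie()
browser_cookie.load(set_cookie_header)
returned_value = browser_cookie["session_id"].value
```

This is how a site recognises a returning user without asking them to log
in again: the replayed value identifies the earlier session.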

Most websites have a 'privacy policy' that typically promises to keep
users' personal details and usage history secret; though sometimes they
will share this data with third parties, which should be disclosed in the
website's 'terms of use'. Social media websites (Twitter, Facebook,
LinkedIn, Google+) are by their nature more open when it comes to privacy,
with most users openly sharing information publicly. While social media
websites do include privacy options, many users are probably unaware that
the information they upload to these sites is often data mined to identify
patterns and establish relationships (usually for targeted adverts).
Additionally, due to the extensive amount of personal information shared on
social media, the long-term impact it may have upon a person's 'real life'
is far greater than with other types of websites.