
The Basics

If you're embarking on learning ColdFusion then you undoubtedly have an
interest in applications that are Web (shorthand for World Wide Web)
based. ColdFusion is built on top of the Internet (and the Web), so before
getting started, a good understanding of the Internet and related technologies
is a must.

There is no need to introduce you to the Internet and the Web. The fact that
you're reading this book is evidence enough that these are important to you
(as they should be). The Web is everywhereand Web site addresses appear on
everything from toothpaste commercials to movie trailers to cereal boxes to car
showrooms. In August 1981, 213 hosts (computers) were connected to the Internet.
By the turn of the millennium that number had grown to about 100 million! And
most of them are accessing the Web.

What has made the World Wide Web so popular? That, of course, depends on whom
you ask. But most will agree that these are the two primary reasons:

Ease of use. Publishing information on the Web and browsing for
information are relatively easy tasks.

Quantity of content. With millions of Web pages from which to
choose and thousands more being created each day, there are sites and pages to
cater to almost every surfer's tastes.

A massive potential audience awaits your Web site and the services it offers.
Of course, massive competition awaits you too. Most Web sites still primarily
consist of static information, sometimes dubbed brochureware. That's
rather sad, because the Web is a powerful medium and is capable of so much more. You
could, and should, be offering much more than just static text and images. You
need features like:

Dynamic, data-driven Web pages

Database connectivity

Intelligent, user-customized pages

Sophisticated data collection and processing

Email interaction

Rich and engaging user interfaces

ColdFusion enables you to do all thisand more.

But you need to take a step back before starting ColdFusion development. As I
mentioned, ColdFusion takes advantage of existing Internet technologies. As
such, a prerequisite to ColdFusion development is a good understanding of the
Internet, the World Wide Web, Web servers and browsers, and how all these pieces
fit together.

The Internet

Much ambiguity and confusion surround the Internet, so we'll start with
a definition. Simply put, the Internet is the world's largest network.

The networks found in most offices today are local area networks
(LANs), composed of a group of computers in relatively close proximity to
each other and linked by special hardware and cabling (see Figure 1.1). Some
computers are clients (more commonly known as workstations); others are
servers (also known as file servers). All these computers can communicate
with each other to share information.

Figure 1.1 A LAN is a group of computers in close proximity linked by
special cabling.

Now imagine a bigger networkone that spans multiple geographical
locations. This type of network is typically used by larger companies with
offices in multiple locations. Each location has its own LAN, which links the
local computers together. All these LANs in turn are linked to each other via
some communications medium. The linking can be anything from simple dial-up
modems to high-speed T1 or T3 connections and fiber-optic links. The complete
group of interconnected LANs, as shown in Figure 1.2, is called a wide area
network (WAN).

WANs are used to link multiple locations within a single company. Suppose you
need to create a massive network that links every computer everywhere. How would
you do this?

You'd start by running high-speed backbones, connections capable
of moving large amounts of data at once, between strategic
locationsperhaps large cities or different countries. These backbones
would be similar to high-speed, multilane, interstate highways connecting
various locations. You'd build in fault tolerance to make these backbones
fully redundant so that if any connection broke, at least one other way to reach
a specific destination would be available.

You'd then create thousands of local links that would connect every city
to the backbones over slower connectionslike state highways or city
streets. You'd allow corporate WANs, LANs, and even individual users with
dial-up modems to connect to these local access points. Some would stay
connected at all times, whereas others would connect as needed.

You'd create a common communications language so that every computer
connected to this network could communicate with every other computer.

Finally, you'd devise a scheme to uniquely identify every computer
connected to the network. This would ensure that information sent to a given
computer actually reached the correct destination.

Congratulations, you've just created the Internet!

Even though this is an oversimplification, it is essentially how the Internet
works.

The high-speed backbones do exist. Many are owned and operated by the large
telecommunications companies.

The local access points, more commonly known as points of presence
(POPs), are run by phone companies, online services, cable companies, and
local Internet service providers (also known as ISPs).

The common language is IP, the Internet protocol, except that the term
language is a misnomer. A protocol is a set of rules governing
behavior in certain situations. Foreign diplomats learn local protocol to ensure
that they behave correctly in another country. The protocols ensure that no
communication breakdowns or serious misunderstandings occur. Computers also need
protocols to ensure that they can communicate with each other correctly and that
data is exchanged correctly. IP is the protocol used to communicate across the
Internet, so every computer connected to the Internet must be running a copy of
IP.

The unique identifiers are IP addresses. Every computer, or host,
connected to the Internet has a unique IP address. These addresses are made up
of four sets of numbers separated by periods208.193.16.100, for example.
Some hosts have fixed (or static) IP addresses, whereas others
have dynamically assigned addresses (assigned from a pool each time a connection
is made). Regardless of how an IP address is obtained, no two hosts connected to
the Internet can use the same IP address at any given time. That would be like
two homes having the same phone number or street address. Information would end
up in the wrong place all the time.
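
To make the four-number notation concrete, here's a minimal sketch using Python's standard ipaddress module. It uses the example address from the text; the variable names are my own and purely illustrative.

```python
import ipaddress

# The example address used in the text.
address = ipaddress.ip_address("208.193.16.100")

# The dotted notation is four numbers, each in the range 0-255.
octets = [int(part) for part in str(address).split(".")]

# Underneath, the four numbers form a single 32-bit value.
as_integer = int(address)

print(octets)
print(as_integer)
```

The point of the sketch is simply that the familiar dotted form is a human-friendly way of writing one number, which is what lets every host be identified unambiguously.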

Internet Applications

The Internet itself is simply a massive communications network and offers
very little to most users, which is why it took 20 years for the Internet to
become the phenomenon it is today.

The Internet has been dubbed the Information Superhighway, and that analogy
is quite accurate. Highways themselves are not nearly as exciting as the places
you can get to by traveling themand the same is true of the Internet. What
makes the Internet so exciting are the applications that run over it and what
you can accomplish with them.

The most popular application now is the World Wide Web. It is the Web that
single-handedly transformed the Internet into a household word. In fact, many
people mistakenly think that the World Wide Web is the Internet. This is
definitely not the case, and Table 1.1 lists some of the more popular
Internet-based applications.

All these various applicationsand many othersuse IP to
communicate across the Internet. The information transmitted by these
applications is broken into packets, small blocks of data, which are sent
to a destination IP address. The application at the receiving end processes the
received information.
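
The idea of breaking information into packets can be sketched in a few lines of Python. This is a simplified illustration only, not the real IP packet format: the function names, the eight-byte block size, and the sequence numbers (added here so the receiver can put blocks back in order) are all assumptions of mine.

```python
def split_into_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Break data into numbered blocks of at most `size` bytes."""
    return [(seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the blocks back in order and join them."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"Hello from one host to another"
packets = split_into_packets(message, size=8)

# Blocks may arrive out of order; reversing the list simulates that.
received = reassemble(list(reversed(packets)))
print(received == message)
```

Real packets also carry the source and destination IP addresses and other header fields, which this sketch leaves out.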

Table 1.1 Some Internet-Based Applications

APPLICATION   DESCRIPTION

Email         Simple Mail Transfer Protocol (SMTP) is the most popular
              email transmission mechanism, and the Post Office Protocol
              (POP) is the most used mail access interface.

FTP           File Transfer Protocol is used to transfer files between
              hosts.

Gopher        This menu-driven document retrieval system was very
              popular before the creation of the World Wide Web.

VPN           Virtual Private Networks facilitate the secure access
              of private networks over the Internet.

WWW           The World Wide Web.

DNS

IP addresses are the only way to uniquely specify a host. When you want to
communicate with a hosta Web server, for exampleyou must specify the
IP address of the Web server you are trying to contact.

As you know from browsing the Web, you rarely specify IP addresses directly.
You do, however, specify a hostname, such as www.forta.com (my Web site). If
hosts are identified by IP addresses, how does your browser know which Web
server to contact if you specify a hostname?

The answer is the Domain Name System (DNS). DNS is a mechanism that maps
hostnames to IP addresses. When you specify the destination address
www.forta.com, your browser sends an address resolution request to a DNS server
asking for the IP address of that host. The DNS server returns an actual IP
address, in this case 208.193.16.100. Your browser can then use this address to
communicate with the host directly.

If you've ever mistyped a hostname, you've seen error messages
similar to the one seen in Figure 1.3, which tell you the host could not be
found, or that no DNS entry was found for the specified host. These error
messages mean the DNS server was unable to resolve the specified hostname.
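
The lookup just described, including the failure case, can be sketched as follows. This is a toy model under stated assumptions: DNS_RECORDS is a hypothetical in-memory table standing in for a DNS server, and the address is simply the example used in this chapter. A real program would typically ask the operating system instead, for example with Python's socket.gethostbyname(), which needs a working network connection.

```python
# Hypothetical in-memory table standing in for a DNS server's records.
DNS_RECORDS = {"www.forta.com": "208.193.16.100"}


def resolve(hostname: str) -> str:
    """Return the IP address for a hostname, as a DNS lookup would."""
    try:
        return DNS_RECORDS[hostname.lower()]
    except KeyError:
        # Comparable to the "host could not be found" error in a browser.
        raise LookupError(f"no DNS entry found for {hostname}") from None


print(resolve("www.forta.com"))
```

Calling resolve() with a mistyped name raises LookupError, which mirrors the browser error shown when a hostname cannot be resolved.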

DNS is never actually needed (well, usually; there is an exception that
I'll get to in a moment). Users can always specify a destination host
by its IP address to connect to it. There are, however,
some very good reasons not to:

IP addresses are hard to remember and easy to mistype. Users are
more likely to find www.forta.com than they are 208.193.16.100.

IP addresses are subject to change. For example, if you switch
service providers, you might be forced to use a new set of IP addresses for your
hosts. If users identified your site only by its IP address, they'd never
be able to reach your host if the IP address changed. Your DNS name, however,
stays the same even if your IP address switches. You need to change only the
mapping so the hostname maps to the new, correct IP address (the new service
provider usually handles that).

IP addresses must be unique, as already explained, but DNS names need
not. Multiple hosts, each with a unique IP address, can all share the same
DNS name. This enables load balancing between servers, as well as the
establishment of redundant servers (so that if a server goes down, another
server will still process requests).

A single host, with a single IP address, can have multiple DNS
names. This enables you to create aliases if needed. For example,
ftp.forta.com, www.forta.com, and even just plain forta.com might point to the
same IP address, and thus the same server.
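
The last two points, one name shared by several hosts and several names pointing at one host, can be sketched like this. Everything here is illustrative: POOL, ALIASES, and resolve() are hypothetical names, the simple rotation is only one possible way of spreading requests, and www.example.com with its 192.0.2.x addresses is a reserved documentation placeholder rather than a real site.

```python
from itertools import cycle

# One name mapped to several hosts (load balancing and redundancy).
POOL = {"www.example.com": ["192.0.2.10", "192.0.2.11", "192.0.2.12"]}

# Several names mapped to one host (aliases), as in the forta.com example.
ALIASES = {
    "forta.com": "208.193.16.100",
    "www.forta.com": "208.193.16.100",
    "ftp.forta.com": "208.193.16.100",
}

# An endless rotation over each pool of addresses.
_rotations = {name: cycle(addresses) for name, addresses in POOL.items()}


def resolve(hostname: str) -> str:
    """Rotate through pooled addresses; otherwise use the single mapping."""
    if hostname in _rotations:
        return next(_rotations[hostname])
    return ALIASES[hostname]


print([resolve("www.example.com") for _ in range(4)])
print({resolve(name) for name in ALIASES})
```

Successive lookups of the pooled name hand out different addresses in turn, while all three aliases come back with the same address.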

DNS servers are special software programs. Your ISP will often host your DNS
entries, so you don't need to install and maintain your own DNS server
software.

You can host your own DNS server and gain more control over the domain
mappings, but in doing so, you inherit the responsibility of maintaining the
server. If your DNS server is down, there won't be any way of resolving the
hostname to an IP address, and no one will be able to find your site.

Intranets and Extranets

Intranets and extranets were the big buzzwords a few years back, and while
some of the hype has worn off, intranets and extranets are still in use and
still of value. It was not too long ago that most people thought intranet
was a typo; but in a very short period of time, intranets and extranets
became recognized as legitimate and powerful new business tools.

An intranet is nothing more than a private Internet. In other words,
it is a private network, usually a LAN or WAN, that enables the use of
Internet-based applications in a secure and private environment. As on the
public Internet, intranets can host Web servers, FTP servers, and any other
IP-based services. Companies have been using private networks for years to share
information. Traditionally, office networks have not been information friendly.
Old private networks did not have consistent interfaces, standard ways to
publish information, or client applications that were capable of accessing
diverse data stores. The popularity of the public Internet has spawned a whole
new generation of inexpensive and easy-to-use client applications. These
applications are now making their way back into the private networks. The reason
intranets are now getting so much attention is that they are a new solution to
an old problem.

Extranets take this new communication mechanism one step further.
Extranets are intranet-style networks that link multiple sites or
organizations using intranet-related technologies. Many extranets actually use
the public Internet as their backbones and employ encryption techniques to
ensure the security of the data being moved over the network.

The two things that distinguish intranets and extranets from the Internet are
who can access them and from where they can be accessed. Don't be confused
by hype surrounding applications that claim to be intranet ready. If an
application can be used over the public Internet, it will work on private
intranets and extranets, too.

Web Servers

As mentioned earlier, the most commonly used Internet-based application is
now the World Wide Web. The recent growth of interest in the Internet is the
result of growing interest in the World Wide Web.

The World Wide Web is built on a protocol called the Hypertext Transfer
Protocol (HTTP). HTTP is designed to be a small, fast protocol that is well
suited for distributed, multimedia information systems and hypertext jumps
between sites.

The Web consists of pages of information on hosts running Web-server
software. The host is often referred to as the Web server, which is technically
inaccurate. The Web server is software, not the computer itself. Versions of Web
server software can run on almost all computers. There is nothing intrinsically
special about a computer that hosts a Web server, and no rules dictate what
hardware is appropriate for running a Web server.

The original World Wide Web development was all performed under various
flavors of Unix. The majority of Web servers still run on Unix boxes, but this
is changing. Now Web server versions are available for almost every major
operating system. Web servers hosted on high-performance operating systems, such
as Windows 2000 and Windows XP, are becoming more and more popular. This is
because Unix is still more expensive to run than Windows and is also more
difficult for the average user to use. Windows XP (built on top of Windows NT)
has proven itself to be an efficient, reliable, and cost-effective platform for
hosting Web servers. As a result, Windows' slice in the Web server
operating system pie is growing. At the same time, Linux (a flavor of Unix) is
growing in popularity as a Web platform thanks to its low cost, its robustness,
and the fact that it is slowly becoming more usable to less technical users.

What exactly is a Web server? A Web server is a program that serves
Web pages upon request. Web servers typically don't know or care what they
are serving. When a user at a specific IP address requests a specific file, the
Web server tries to retrieve that file and send it back to the user. The
requested file might be a Web page's HTML source code, a GIF image, a Flash
file, an XML document, or an AVI file. It is the Web browser that determines what
should be requested, not the Web server. The server simply processes that
request, as shown in Figure 1.4.

It is important to note that Web servers typically do not care about the
contents of these files. HTML code in a Web page, for example, is markup that
the Web browsernot the Web serverwill process. The Web server
returns the requested page as is, regardless of what the page is and what it
contains. If HTML syntax errors exist in the file, those errors will be returned
along with the rest of the page.
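
Here's a minimal sketch of that "return it as is" behavior. It is not a real Web server (there is no network or HTTP handling at all); serve() and the temporary document root are hypothetical names used only to show that the file's bytes are handed back untouched, syntax errors included.

```python
from pathlib import Path
import tempfile


def serve(document_root: Path, requested_path: str) -> bytes:
    """Return the requested file's bytes exactly as stored on disk."""
    target = (document_root / requested_path.lstrip("/")).resolve()
    # Refuse paths that escape the document root (for example "../secret").
    if document_root.resolve() not in target.parents:
        raise PermissionError(requested_path)
    return target.read_bytes()


with tempfile.TemporaryDirectory() as root:
    page = Path(root) / "index.html"
    # Deliberately broken HTML: the <b> tag is never closed.
    page.write_bytes(b"<html><body><b>Hello</body></html>")
    response = serve(Path(root), "/index.html")

print(response)
```

The function never inspects the content it returns, so the unclosed tag reaches the browser unchanged.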

Connections to Web servers are made on an as-needed basis. If you request a
page from a Web server, an IP connection is made over the Internet between your
host and the host running the Web server. The requested Web page is sent over
that connection, and the connection is broken as soon as the page is received.
If the received page contains references to additional information to be
downloaded (for example, GIF or JPG images), each would be retrieved using a new
connection. Therefore, it takes at least six requests, or hits, to
retrieve all of a Web page with five pictures in it.
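
The arithmetic behind that count can be sketched with Python's standard html.parser module. This is a rough illustration under simplifying assumptions: it counts only img tags, treats every image as a separate request, and ignores caching and any other kinds of referenced files. ImageCounter and hits_for_page() are names I made up for the example.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Count <img> tags, each of which triggers one extra request."""

    def __init__(self) -> None:
        super().__init__()
        self.images = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images += 1


def hits_for_page(html: str) -> int:
    """One request for the page itself plus one per referenced image."""
    counter = ImageCounter()
    counter.feed(html)
    return 1 + counter.images


page = "<html><body>" + "".join(
    f'<img src="photo{n}.gif">' for n in range(1, 6)
) + "</body></html>"

print(hits_for_page(page))  # prints 6
```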

NOTE

This is why the number of hits is such a misleading measure of Web server
activity. When you learn of Web servers that receive millions of hits in one
day, it might not mean that there were millions of visitors. Hits do not equal
the number of visitors or pages viewed. In fact, hits are a useful measure only
of changes in server activity.

Web servers often are not the only IP-based applications running on a single
host. In fact, aside from performance issues, there is no reason a single host
cannot run multiple services. For example, a Web server, an FTP server, a DNS
server, and an SMTP/POP3 mail server can run at the same time. Each server is
assigned a port address to ensure that each server application responds only to
requests and communications from appropriate clients. If IP addresses are like
street addresses, ports can be thought of as apartment or suite numbers. A total
of 65,536 ports are available on every host. Ports 0 through 1023 are the
Well Known Ports, ports reserved for special applications and protocols
(such as HTTP). Vendor-specific applications that communicate over the Internet
(such as America Online's Instant Messenger, Microsoft SQL Server, and the
Real Media player) typically use ports 1024 through 49151. No two applications can
share a port at the same time.
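
The port ranges can be expressed as a small lookup. This is a sketch only: port_category() is a hypothetical helper, the first two ranges come from the text, and the label for the remaining ports (49152 and up, commonly described as dynamic or private) is my addition rather than something covered above.

```python
def port_category(port: int) -> str:
    """Classify a port number using the ranges given in the text."""
    if not 0 <= port <= 65535:
        raise ValueError(f"not a valid port: {port}")
    if port <= 1023:
        return "well known"
    if port <= 49151:
        return "registered (vendor-specific)"
    return "dynamic/private"


# 80 is HTTP; 1433 is the usual Microsoft SQL Server port; 50000 is arbitrary.
for port in (80, 1433, 50000):
    print(port, port_category(port))
```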

Most servers use a standard set of port mappings, and some of the more common
ports are listed in Table 1.2.

Most Web servers use port 80, but you can change that. If desired, Web
servers can be installed on nonstandard ports to hide Web servers, as
well as host multiple Web servers on a single computer by mapping each one to a
different port. Remember that if you do use a nonstandard port mapping, users
will need to know the new port number.
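
In practice, "knowing the new port number" means putting it in the URL. The sketch below uses Python's standard urllib.parse module to show the difference; port_for() is a hypothetical helper, and port 8080 is just an example of a nonstandard port, not one the text prescribes.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80


def port_for(url: str) -> int:
    """Return the port a browser would contact for an http:// URL."""
    explicit = urlsplit(url).port  # None when the URL names no port
    return explicit if explicit is not None else DEFAULT_HTTP_PORT


print(port_for("http://www.forta.com/"))
print(port_for("http://www.forta.com:8080/"))
```

When no port appears in the URL, the default of 80 is assumed, which is why users never have to type it for a standard installation.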

NOTE

This discussion of port numbers is very important in ColdFusion MX;
we'll come back to it in a few pages.