It's Time for an Alternative to PHP

Aug 22 2015

PHP has had a good run, and is still the basis for a huge percentage of the
websites out there. But it looks like its popularity may be waning.
The thing is, PHP is meeting a very real need in the web development world. It's
time for a replacement. Something that meets the same needs, but is updated for
the modern web.

First I'll briefly explain why I think PHP needs a replacement, then I'll sketch
the features I think any suitable replacement must have.

Update: I changed Reason 1 from Apache to the more broadly descriptive Legacy,
and added a short paragraph to the end.

Why Not PHP?

I developed using PHP for many years, following it from its last days in the
3.x versions to the middle of the 5.x versions. I wrote it concurrently with
many other languages, including Java, Perl, Python, Ruby, and Go. I've had a
great opportunity to see PHP's strengths and weaknesses over the years.

Right now, the weaknesses are strong enough to suggest that it's time to move
to a new language.

Reason 1: Legacy

PHP is a legacy system. This shows in a few places, such as its continued
entrenchment with Apache.

Apache is a phenomenal web server, and the simple fact of the matter is that it
led the way for many emerging web technologies in the past. But it is now large
and complicated compared to the newer HTTP servers on the market.

There remain two reasons to run the mammoth Apache server:

You run a large website with highly sophisticated needs for supporting a
huge variety of new and aged web technologies from only one type of server, and
to do this, you need Apache's huge library of modules.

You run PHP.

Running the complicated, large, and slow Apache server just to get PHP language
support is suboptimal, to say the least. Yet PHP developers tend to have to
do this in production to get "first class" support for PHP. And in many cases,
they end up having to run a full Apache server locally for development.

Conversely, if you want to run PHP on a lighter web server, like nginx, be
prepared to learn a lot about 90s technologies like CGI and FastCGI. And
prepare to make some code changes.

Updated: PHP shows its age elsewhere, too. It still does not have even
rudimentary Unicode support. It retains quaint attachments to the CGI style of
programming. It remains page-request oriented, generally incapable of retaining
state across requests. It's a language for Web 1.0.

Reason 2: Security

Security remains the bane of PHP. As continued vulnerabilities in top-tier PHP
platforms like Drupal and WordPress show us, it's hard to prevent SQL injection
attacks in PHP. And it has been the platform for many other types of exploits,
including those involving remote code execution, cross-site scripting, and form
injection.

To be fair, it's not simply a matter of PHP having fewer security features than
other languages. Were Python to achieve PHP's success, and were it to be as
web-centric, doubtless we would see many similar errors.

But this is where a new alternative could shine. A language designed for
server-side web programming on the modern web could and should make it much,
much harder to make security mistakes.

Reason 3: Language

PHP is a huge gateway language. Many developers cite it as their first real
programming language (along with JavaScript). But as a programming language,
PHP is slapdash. There's the long-running joke about needles and haystacks (or
is it haystacks and needles?!). There's the ubiquitous non-array type called
array. There is a mishmash of C-style procedural functions and later
OO classes. I could go on and on, but the bottom line is that it is not a
clean and straightforward language. Some of the coding habits one acquires in
PHP development take considerable retraining to shed when learning other
languages.

Interlude: Why Others Have Failed To Oust PHP

Other languages have come and gone during PHP's reign. mod_perl. CFML.
Ruby on Rails. JSP. Node.js. Why can't even these decent technologies bump PHP off
the map? The reason is that PHP makes web development ridiculously easy. It is
100% web oriented.

Unlike CFML and JSP, it doesn't require a buy-in to some bigger enterprisey
framework, with supporting Java code to do anything interesting.

Unlike Ruby, Node, and Perl, it's not a web add-on for an existing language.

PHP is a highly successful domain specific language. And that is a Really Good
Thing (TM).

What Could Replace It?

If we were to build a new language to compete with PHP, what features would it
have?

Small Is Beautiful

From the ground up, the language would be built for running on small web
servers. PHP's inclusion of a built-in webserver was just a little too late,
but it is a fantastic idea. Any new language should be able to do that.

And configuring it to run on production servers should be easy, too.
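
To make the built-in server idea concrete, here is a minimal sketch using
Python's standard library as a stand-in. It is not the hypothetical new
language, and the names HelloHandler, start_dev_server, and fetch are made up
for illustration. The point is only that "start a web server" can be a few
lines with no external server to install or configure.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a small HTML page."""

    def do_GET(self):
        body = b"<h1>Hello from the built-in server</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real dev server would log requests


def start_dev_server(port=0):
    """Start the server on a background thread.

    Port 0 asks the operating system for any free port.
    """
    server = ThreadingHTTPServer(("127.0.0.1", port), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server, path="/"):
    """Issue one GET against the running server and return (status, body)."""
    host, port = server.server_address[:2]
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read().decode("utf-8")
    finally:
        conn.close()
```

Calling start_dev_server() and then fetch(server) should return the page;
server.shutdown() stops it again. A production setup would need more than
this, but the development experience is the part worth copying.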

First-Class Cloud Citizen

PHP nailed its market in the late '90s. People routinely ran their web apps
through hosting providers. And the huge hosting industry grew up around the
LAMP stack.

Today, hosting is fading into the background, and cloud is the new target
platform. Think virtual machines, containers, object storage, and microservices.

How easy could we make it to run a "cloud native" web language? Anything that
could fill PHP's philosophical shoes needs to be as easy as possible to run
in the cloud -- whether that's in containers, VMs, or even unikernels. (In
fact, there may be an argument to be made for writing a unikernel-first
language.)

Web Specific Libraries

The standard installation of the language must provide abundant support for
common web functionality. Above all else, this is to be the focal point of the
language.

Standard libraries can be broken into two categories: Basic web, and advanced
web/internet.

Basic Web:

HTML templating like PHP's Twig.

Strong support for CSS Selectors as a method for querying HTML.

Styles written in the language, and CSS generated as output.

Browser-side scripts written in the language, and JavaScript generated as
output. (But also an ability to use raw JavaScript.)

Forms written in the language, HTML forms generated as output.

Native support for sessions.

Abstraction for client-server out-of-band communications (e.g. web sockets or
HTTP/2 channels). This should be strongly linked to the script generator, but
be easy for pure JavaScript (etc.) to interact with.

An unambiguous Request/Response model.

Rich string processing, oriented toward working with common Web strings like
URLs.

Transparent HTTP/2 support.

A generic model library (CRUD-oriented) that makes it easy to define a model,
and easy to implement a model backend. (Diatribe: How many model layers does
every language actually need? Maybe we could build one in and start with
the premise that the answer is "we only need one.")
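
As a rough illustration of the last item, a single CRUD-oriented model layer
with pluggable backends might look something like the sketch below. This is
Python standing in for the imagined language, and every name here (Backend,
MemoryBackend, Article) is hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional, Protocol


class Backend(Protocol):
    """The whole contract a model backend would have to satisfy (assumed)."""

    def create(self, record: dict) -> int: ...
    def read(self, key: int) -> Optional[dict]: ...
    def update(self, key: int, record: dict) -> bool: ...
    def delete(self, key: int) -> bool: ...


class MemoryBackend:
    """Toy backend that keeps records in a dict; a SQL backend would
    implement the same four methods."""

    def __init__(self) -> None:
        self._rows: Dict[int, dict] = {}
        self._next_id = 1

    def create(self, record: dict) -> int:
        key = self._next_id
        self._next_id += 1
        self._rows[key] = dict(record)
        return key

    def read(self, key: int) -> Optional[dict]:
        row = self._rows.get(key)
        return dict(row) if row is not None else None

    def update(self, key: int, record: dict) -> bool:
        if key not in self._rows:
            return False
        self._rows[key] = dict(record)
        return True

    def delete(self, key: int) -> bool:
        return self._rows.pop(key, None) is not None


@dataclass
class Article:
    """Defining a model is just declaring its fields."""

    title: str
    body: str
```

With this shape, store.create(asdict(Article("Hi", "text"))) returns a key, and
the same four calls would work unchanged against any other backend.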

Advanced Web/Internet

Each request is handled on its own "thread" (concurrency unit), but a master
"context" for a program can be accessed to store and retrieve "globals" that
persist across requests. Other than initialization (start-up of app) and
shutdown (tear-down of app), code cannot be executed in the global context.

A full TCP/IP library for supporting other protocols.

A Codec trait, with several common encoders and decoders included. JSON, XML,
and YAML are obvious candidates.
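
The first item in this list, per-request concurrency with a shared context,
can be sketched roughly as follows. This is Python with a thread pool standing
in for whatever concurrency unit the language would actually use, and
AppContext, handle_request, and serve_all are made-up names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AppContext:
    """Holds values that persist across requests.

    A lock guards every access so concurrent requests cannot corrupt it.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def get(self, key, default=None):
        with self._lock:
            return self._values.get(key, default)

    def increment(self, key, amount=1):
        with self._lock:
            self._values[key] = self._values.get(key, 0) + amount
            return self._values[key]


def handle_request(context, path):
    """Request-scoped code: it only touches globals through the context."""
    hits = context.increment("hits:" + path)
    return "{} served {} time(s)".format(path, hits)


def serve_all(context, paths, workers=4):
    """Run each request on a pool thread and collect the responses in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: handle_request(context, p), paths))
```

The design choice being illustrated is that request code has no way to run in
the global scope; it can only read and write named values through the context.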

Security Starts In The Language

Most languages are constructed to give the programmer as much leeway as possible.
That is great for a general purpose or system programming language. But for a
web-specific language, we can list out a number of things that we would rather
be able to do brainlessly and quickly than trade off for deep functionality.

Here are some ready examples:

HTML/JS/CSS sanitization: By default, the language should make it easy to send
all HTML, JS, and CSS through a sanitization layer en route to printing. An
example of this is Go's html/template package. In contrast, it should be hard
(but not impossible) to send unfiltered HTML, CSS, and JS to the browser.

Preparing and sanitizing database statements: The easy first step is to require
all statements to be executed like prepared statements. String-building should
not be supported at all.

Using SSL/TLS: Enabling SSL in the engine should be dead simple, and accessing
security information inside the language should be even easier. For starters,
the program ought to be able to easily determine whether or not the current
request/response is protected by SSL.

Secure sessions and authentication: There is no reason that today's web
developer should have to write code to generate secure session tokens, or to
write basic auth handlers. The language should provide decent secure authentication
out of the box, and should continue to evolve along with emerging best practices.

Form processing: Input validation and protection from XSS and various injection
attacks are both features that should just come with the language engine. This
includes supporting secure file uploads.
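
Three of these examples can already be approximated with Python's standard
library, which gives a feel for what "secure by default" might look like. This
is a sketch only: the function names and the users table are invented for
illustration, and in the language I'm imagining these would be the default
path rather than something the developer has to remember to call.

```python
import html
import secrets
import sqlite3


def render_comment(author, text):
    """Escape user-supplied values before they are placed in HTML."""
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(text))


def find_user(conn, name):
    """Look up a user with a parameterized query.

    The value travels separately from the SQL text, so it is never
    interpreted as SQL. The users table is a hypothetical example schema.
    """
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchone()


def new_session_token():
    """Return a URL-safe random token from the OS's secure random source."""
    return secrets.token_urlsafe(32)
```

Passing "<script>alert(1)</script>" to render_comment yields escaped text
rather than a live script tag, and passing a classic injection string such as
"alice' OR '1'='1" to find_user simply matches no rows.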

A Gateway Language

Because web development is such a common entry to computer programming, this
language should be conducive to teaching good programming techniques and
terminology, while also being straightforward.

A few such features come readily to mind for me:

Unicode (with UTF-8) is default and is supported to the core.

A flexible type system. Python's type system is a good example of a loose
type system that is also flexible. Strongly typed languages are not great for
typical web development, and practically require generics and collections if
they are to make web development convenient. That trade-off is not worth it.

Strong list and map collections, decent extended collections. Today's web
development doesn't need low-level fixed arrays, nor is it imperative
that the language support the rich collections libraries you see in languages
like Java. But... really flexible list and map implementations form a CS
foundation
that is practical. For a first pass, I would even argue that a language like
this should have only TWO collections: A growable list and an ordered hash
map.

Syntax that is simple, but not whitespace-oriented. JavaScript, Ruby, Elixir,
and Go are all quite elegant (in different ways) in this regard.

Conceptually simple OO-like language. Procedural and functional programming
languages each have their strong points. But OO is a good common ground when
it comes to straddling the language's dual goals. It is good for web development,
and it is good as a gateway for introducing other languages. I think favoring
a Class/Object/Trait/Composition style language would be great.

Memory managed, garbage collected, and defaulting to pass-by-reference -- all
in the name of ease of use.

Built-in dependency management system that is similar to Ruby Bundler, NPM, or
Composer.

Concurrency via CSP. I love Elixir's concurrency model. Go's is great, too.
These make it easy to write concurrent programs, but avoid many of the
difficulties imposed by thread-based languages.

It should be trivially easy to write an entry point to the application that
simply contains HTML. This does not entail embedding the language inside
of HTML as PHP does. But achieving a similarly quick HTML page is desirable.
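
For readers who haven't met the channel-based concurrency style mentioned in
the list above, here is a small approximation in Python. It uses threads and
bounded queue.Queue objects in place of real channels, so treat it as a sketch
of the shape of such code rather than of how Go or Elixir actually implement
it. The names producer, squarer, and run_pipeline are made up.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of a stream


def producer(out_channel, items):
    """Send each item down the channel, then signal completion."""
    for item in items:
        out_channel.put(item)
    out_channel.put(DONE)


def squarer(in_channel, out_channel):
    """Receive numbers, send their squares, and pass the sentinel along."""
    while True:
        item = in_channel.get()
        if item is DONE:
            out_channel.put(DONE)
            return
        out_channel.put(item * item)


def run_pipeline(items):
    """Wire two stages together; they share no state, only channels."""
    numbers = queue.Queue(maxsize=1)
    squares = queue.Queue(maxsize=1)
    threads = [
        threading.Thread(target=producer, args=(numbers, items)),
        threading.Thread(target=squarer, args=(numbers, squares)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        value = squares.get()
        if value is DONE:
            break
        results.append(value)
    for t in threads:
        t.join()
    return results
```

Each stage communicates only by sending and receiving, which is the property
that makes this style easier to reason about than shared-memory threading.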

But should it be a scripting language? Honestly, I think people make too big
a deal of this point. The fact is that for a scripting language of
significant complexity (like Python, PHP, Ruby, and Perl), the distinction
between interpretation time and runtime introduces as much frustration as
it purports to ameliorate by not requiring compilation. And even in a
weakly typed language, many errors can be caught during compilation time.

Elixir and Go both have modes that are more script-like and modes that compile.
Perhaps this is the way of any future web language.

Non-Goals

The following things should not be goals for this language:

Providing a "general purpose" language, and a web add-on.

Focusing on high-performance, scientific, enterprise, or distributed uses of
the language.

Building a "pure" (academic) language.

Where To Start?

I can honestly say that I haven't seen any language that I think is a great
basis for such a project.

Elixir, Go, and Rust each have some desirable elements to be extracted, and
PHP itself should indeed be the bar which any such language must surpass. But
this may honestly be a case where a new language is necessary.