
Friday, January 09, 2009

JavaScript modules

When I see code like this,
an example I pulled from YUI,
I simply want to cry. Things need not be so ugly. I don't mean to call out
Yahoo here, it's just an example I found. Most of the non-trivial
'package-ized' JavaScript I see is like this.

Issues:

Every one of these files is a single anonymous top-level function that
is invoked at the bottom of the file. Icky. Why do we need this?

This is done because the way you composite functionality into your
application is by including the source of that functionality in the
big, single namespace known as your JavaScript environment. To
keep the source you are compositing into your app from "infecting"
it with additional global variables, you use the trick of putting all your
code in a function body and executing it. The function can be (and should
be) anonymous. No infection, or controlled infection, as long as you use var
on all your variables
(argh), since the variables at the top level of the function are local to the
function, and not to the "environment" (basically, globals).
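
As a minimal sketch of the wrapper trick described above (the names counterApi, count and increment are made up for illustration and are not taken from the YUI example):

```javascript
// A hypothetical library file wrapped in an anonymous function
// that is invoked immediately at the bottom.
var counterApi = (function () {
    var count = 0;              // local to the wrapper, not a global

    function increment() {      // also local to the wrapper
        count += 1;
        return count;
    }

    // Controlled "infection": exactly one object escapes the wrapper.
    return { increment: increment };
})();

counterApi.increment();
counterApi.increment();

// Neither `count` nor `increment` leaked into the surrounding namespace.
console.log(typeof count, typeof increment);
```

Forget the var on count and, in non-strict code, the assignment would create a global after all, which is the "argh" part.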

We have functions defined in functions defined in functions here,
two levels of which are asynchronous callbacks.

I don't have a big beef with nested functions, except when it gets
silly. Like in this case. One of the big offenders is the definition of the
loader function, whose purpose is to load the code, pre-reqs,
etc. Its pieces are defined as callbacks, presumably because the loading of such
files isn't necessarily provided as a synchronous capability by
the browser.

I bet the folks that wrote this had editor macros set for the text
YAHOO.example.app

Frankly, there's no defense for this; the code should probably be using
"shortcut" variables for the "package names", and even just some of the
functions, like YAHOO.log.

I assume there is some kind of taboo over using shortcut variables here;
or are people depending on fully qualified function names for code completion
or other source analysis tools? Yikes.
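
Here is a small sketch of what I mean by shortcut variables. The YAHOO object below is a hypothetical stand-in I made up so the snippet runs on its own; it is not the real YUI object, and title and start are invented names.

```javascript
// Hypothetical stand-in for a deeply nested library namespace.
var YAHOO = {
    log: function (message) {
        return "[log] " + message;
    },
    example: {
        app: {
            title: "demo",
            start: function () {
                return "started " + this.title;
            }
        }
    }
};

// Shortcut variables: spell out the long names once, use short ones after.
var app = YAHOO.example.app;
var log = YAHOO.log;

console.log(log(app.start()));
```

One caveat: aliasing a function that depends on this (say, var start = app.start) changes what this refers to when it is called, so shortcuts to the containing object, like app here, are the safer kind.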

How can we make this nicer looking?

I think we need a better way of packaging up the pieces of our composite
applications.

Packages. Modules. Bundles. Whatever.

Google Gears WorkerPools, again

I've previously blogged about
using Google Gears Worker Pools to build service frameworks.
For certain types of functionality, this makes a lot of sense.
It certainly has the characteristic of compositing function into your
application in a clean, infection-free manner. But it also
has the following characteristics:

All message sends are asynchronous. While this sort of programming
style might be comfortable to some people, and in fact might be the best
way to program in the end, it's not terribly friendly for most programmers
who have been using mostly/only synchronous function calls their entire programming
life.

Out of the box, you're always going to be 'sending messages'
instead of 'calling functions'. There's technically not much of a
distinction between sending a message and
a function invocation, you might say, besides the invocation style of the
two. But again, for most programmers, function invocation is
the norm. And it probably requires less syntax per invocation.
Shorter programs == good.

So while I don't have any problem with WorkerPools per se, and in fact,
I think they are a great pattern for handling asynchronous, parallel work,
they also aren't really going to provide the best pattern for modularity.

And here's the thing. I think we can add support for this fairly unobtrusively.

The basic idea is to define a new function, say loadModule(), which
is used to 'reference' another module by passing its name
as a parameter (URI to the name, prolly). A module is just a JavaScript file.
Only instead of working the way <script src=""> does,
it actually defines a new, separate, empty namespace and loads the
JavaScript into that namespace (just like the way Google Gears WorkerPools does).
The first time loadModule() is run on a module,
the JavaScript source is executed. The object returned is a 'module' object,
whose properties include all the global variables in the module's private
namespace. For loadModule() calls with the same module beyond the first,
the code is not
executed again, but the same 'module' object is returned.
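
To make those semantics concrete, here is a rough sketch. It assumes Node.js and its vm module as a stand-in for "a separate namespace", and an in-memory sources table as a stand-in for fetching a URI; both are my assumptions for illustration, as are the greeter.js module and its contents. It is not a browser API.

```javascript
const vm = require("vm");

// Hypothetical in-memory "files" so the sketch needs no network or disk.
const sources = {
    "greeter.js": [
        "var greeting = 'hello';",
        "function greet(name) { return greeting + ', ' + name; }"
    ].join("\n")
};

const cache = Object.create(null);  // module name -> module object
let executions = 0;                 // how many times module source was run

function loadModule(name) {
    if (name in cache) {
        return cache[name];         // later loads: no re-execution
    }
    if (!Object.prototype.hasOwnProperty.call(sources, name)) {
        throw new Error("unknown module: " + name);
    }
    // A fresh global object for this module. loadModule is injected so
    // that modules can load other modules.
    const moduleObject = vm.createContext({ loadModule: loadModule });
    cache[name] = moduleObject;     // cached before running, so cycles terminate
    vm.runInContext(sources[name], moduleObject, { filename: name });
    executions += 1;
    return moduleObject;
}

const first = loadModule("greeter.js");
const second = loadModule("greeter.js");

console.log(first.greet("world"));  // calls into the module's namespace
console.log(first === second);      // the same module object both times
console.log(typeof greeting);       // the caller's namespace is untouched
```

In this sketch only top-level var and function declarations show up as properties of the module object; let and const declarations would stay hidden inside the module's context.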

Or other varieties thereof. It's fairly simple to play with this kind
of stuff from within
Rhino,
though your brain will be hurting after getting all the prototype and
context linkages set up right.
I assume you can do this sort of multi-environment stuff in other
JavaScript implementations.

I want it in the browser.

This isn't the kind of code you can write in userland JavaScript,
because JavaScript doesn't give you low-level access to its innards.
It needs to be yet another function the browser injects into
the JavaScript environment, probably not even implemented in JavaScript,
but in C, C++, Java, etc.

What changes

This makes individual JavaScript files a little cleaner by:

Not requiring the anonymous function wrapper.

Letting you get away with making your namespace a mess without worrying
about infecting someone else's namespace.

Letting you use shorter names, because imports beyond the first
are crazy cheap, so every module would probably just import everything
they needed as one-level modules.

Sure would be nice if we could make that loadModule() function
synchronous, otherwise loadModule() would really have to be a function
which took the URI to the module and a callback, and invoked the callback
after the module load. Back into some ickys-ville. Is
<script src=""> synchronous?

It's not a lot. But it's a start.
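
For comparison, here is a sketch of the callback flavor of loadModule() mentioned above. The name loadModuleAsync, the registry table, and the use of setTimeout() as a stand-in for an asynchronous fetch are all assumptions made for illustration.

```javascript
// Hypothetical registry standing in for modules fetched over the network.
const registry = {
    "abc.js": { sayHello: function () { return "Hello from abc"; } },
    "def.js": { sayHello: function () { return "Hello from def"; } }
};

const events = [];

// Callback-style loader: the module is only available inside the callback.
function loadModuleAsync(uri, callback) {
    setTimeout(function () {        // stand-in for an asynchronous fetch
        callback(registry[uri]);
    }, 0);
}

// Two dependencies already mean two levels of nesting.
loadModuleAsync("abc.js", function (abc) {
    loadModuleAsync("def.js", function (def) {
        events.push(abc.sayHello());
        events.push(def.sayHello());
        console.log(events.join("\n"));
    });
});

// This line runs before either callback does.
events.push("loader returned, nothing loaded yet");
```

Every additional dependency either adds a nesting level or needs some helper to flatten things out, which is the icky part.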

Additional advantages

It's easy to imagine that the process of reloading a module which
has changed (you just edited it while you were debugging) would be a little
more straightforward; largely only the module itself is affected, though
presumably there are some imported object references that would also need to be
fixed up (using shortcut variables causes issues here - is that one of
the reasons the Yahoo example used fully-qualified names?). Some
lower-level VM help
could get even those references fixed up, I'm thinking.

Better than eval(). Yeah, you could code something up
to do this using eval(), I suppose. Or get close. The
problem with eval() is the code becomes disassociated from its source
location. This makes it difficult/impossible to debug. Or save, after
I make my changes in the debugger (some day). With an import story, the
original location of the source can be associated with the code, just like
all the files <script src="">'d into your page get
associated with their source location.

You could imagine keeping byte- or machine-code versions of those
modules, in their pre-executed state, cached in memory for future
interpreter invocations that import the module. And cached on disk.

Since it's a simple function, you can imagine having embellished versions that
handle things like version numbering, pre-reqs, etc.

Example

I coded up an implementation of loadModule() for Rhino tonight, along
with a simple example that
uses four modules:

Each module prints a line indicating it's being loaded; the environment
I set up defines the __FILE__ variable containing the module source file name
(my C roots are showing),
and a print() function which prints a line to stdout.

main.js loads two modules, abc.js and def.js.
It then calls
the sayHello() function in the abc module,
followed by
the sayHello() function in the def module.

abc.js and def.js are identical, except for the message
printed from the sayHello() function at the bottom of the file.
Both modules load the sayer.js module. They also both define
a function with the same name - sayHello() - but that's ok
because they live in separate namespaces and can be accessed separately
by code that imports them, like main.js does above.
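
What follows is a rough reconstruction of that arrangement, not my actual Rhino code. It assumes Node.js and its vm module for the separate namespaces and keeps the module sources in an in-memory table; the message strings and the say() function in sayer.js are invented for illustration.

```javascript
const vm = require("vm");

// Hypothetical module sources; messages and say() are invented here.
const sources = {
    "sayer.js": [
        "print('loading ' + __FILE__);",
        "function say(message) { print(message); }"
    ].join("\n"),
    "abc.js": [
        "print('loading ' + __FILE__);",
        "var sayer = loadModule('sayer.js');",
        "function sayHello() { sayer.say('Hello from abc'); }"
    ].join("\n"),
    "def.js": [
        "print('loading ' + __FILE__);",
        "var sayer = loadModule('sayer.js');",
        "function sayHello() { sayer.say('Hello from def'); }"
    ].join("\n"),
    "main.js": [
        "print('loading ' + __FILE__);",
        "var abc = loadModule('abc.js');",
        "var def = loadModule('def.js');",
        "abc.sayHello();",
        "def.sayHello();"
    ].join("\n")
};

const output = [];                  // every printed line, in order
function print(line) {
    output.push(line);
    console.log(line);
}

const cache = Object.create(null);  // module name -> module object
function loadModule(name) {
    if (!(name in cache)) {
        // Each module gets its own global object with the injected helpers.
        cache[name] = vm.createContext({
            loadModule: loadModule,
            print: print,
            __FILE__: name
        });
        vm.runInContext(sources[name], cache[name], { filename: name });
    }
    return cache[name];
}

loadModule("main.js");
```

Run this way, sayer.js announces itself only once even though both abc.js and def.js load it, and the two sayHello() functions never collide.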

Frankly, after spending a very small amount of time implementing the
basic functionality, I have to wonder why we don't have something like this
in the web browsers today? We have
XMLHttpRequest
to programmatically fetch data, why don't we have a way of programmatically
fetching and executing code? <script src=""> is a sorry excuse of
a version of this. Let me code it, dammit!

6 comments:

I believe the reason for making each JavaScript file an anonymous function is so that they can be packaged into one large file to be fetched once when the user requests it. By making an AJAX call for every module you load, you'll significantly increase the page's load time with the increased overhead in fetching multiple files as opposed to just one.

You should take a look at Dojo. It uses modules and a loader. It's very clean. http://api.dojotoolkit.org/jsdoc/dojo/1.2/dojo.require The async issue that Biao mentioned is handled by allowing you to create a "build" that contains all the modules you want to use in a single file.