Posts tagged ‘JavaScript’

JavaScript’s default mode of operation is to rely heavily on callbacks: functions invoked when a longer operation (such as network I/O) finishes and delivers result. This makes it asynchronous, which is a notably different programming style than using blocking operations contained within threads (or their equivalents, like goroutines in Go).

Callbacks have numerous problems, though, out of which the most severe one is probably the phenomenon of “marching to the right”:

doSomething(withTheseArgs,function(result){

theDoSomethingElse(function(){

nowDoThis(toThat,function(result){

andThen(function(){

stop(hammertime,function(result){

// ...and so on...

});

});

});

});

});

When using the (still) common style of providing a callback as the last argument to a function initiating an asynchronous operation, you get this annoying result of ever-increasing indentation as you chain those operations together. It feels like the language itself is telling you that it was not designed for such a complex stuff… Coincidence? ;)

But it gets worse. Operations may fail somewhere along the way, which is something you’d probably like to know about. Depending on the conventions your project, framework or environment uses, this could mean additional boilerplate inside the callbacks to distinguish success from error cases. This is typical in Node.js, where first argument of callback represents the error, if any:

Alternatively, you may be asked to provide the error handler separately; an “errback”, as it’s sometimes called. Splitting the code into small parts is great and everything, but here it means you’ll have two functions as arguments:

doSomething(withThis,function(result){

// ...

},function(error){

// ...

});

Giving them names and extracting somewhere outside may help readability a little, but will also prevent you from taking advantage of one of the JavaScript’s biggest benefits: superior support for anonymous functions and closures.

Many languages now include the concept of annotations that can be applied to definitions of functions, classes, or even variables and fields. The term ‘annotation’ comes from Java, while other languages use different names: attributes (C#), decorators (Python), tags (Go), etc.
Besides naming disparities, those features also tend to differ in terms of offered power and flexibility. The Python ones, for example, allow for almost arbitrary transformations, while Go tags are constrained to struct fields and can only consist of text labels. The archetypal annotations from Java lie somewhere in between: they are mostly for storing metadata, but they can also be (pre)processed during compilation or runtime.

Now, what about C++? We know the language has a long history of lacking several critical features (*cough* delegates *cough*), but the recent advent of C++11 fixed quite a few of them all at once.
And at first sight, the lack of annotation support seems to be among them. New standard introduces something called attributes, which appears to fall in into the same conceptual bucket:

[[dllexport]]void SomeFunction(int x);

That’s misleading, though. Attributes are nothing else than unified syntax for compiler extensions. Until C++11, the job of attributes were done by custom keywords, such as __attribute__ (GCC) or __declspec (Visual C++). Now they should be replaced by the new [[squareBracket]] syntax above.
So there is nothing really new about those attributes. At best, you could compare them to W3C deciding on common syntax for border-radius that forces both -webkit-border-radius and -moz-border-radius to adapt. Most importantly, there is no standard way to define your custom attributes and introspect them later.

That’s a shame. So I thought I could try to fix that, because the case didn’t look completely lost. In fact, there is a precedent for a mainstream language where some people implemented something-kinda-almost-like annotations. That language is JavaScript, where annotations can be realized as functions arranged into a pipeline. Here’s an example from Express framework:

Both loginRequired and cached(...) are functions, tied together by app.get with the actual request handler at the end. They “decorate” that handler, wrapping it inside a code which offers additional functionality.

But that’s JavaScript, with its dynamic typing and callback bonanza. Can we even try to translate the above code into C++?…
Well yes, we actually can! Two new features from C++11 allow us to attempt this: lambda functions and initializer lists. With those two – and a healthy dose of functional programming – we may achieve at least something comparable.

Flask is one of the countless web frameworks available for Python. It’s probably my favorite, because it’s rather minimal, simple and easy to use. All the expected features are there, too, although they might not be as powerful as in some more advanced tools.

As an example, here’s how you define some simple request handler, bound to a parametrized URL pattern:

from flask import abort, render_template

from myapplication import app, db_session, Post

@app.route('/post/<int:post_id>')

def blogpost(post_id):

post = db_session.query(Post).get(post_id)

ifnot post:

abort(404)

return render_template('post.html', post=post)

This handler responds to requests that go to /post/42 and similar paths. The syntax for those URL patterns is not very advanced: parameters can only be captured as path segments rather than arbitrary groups within a regular expression. (You can still use query string arguments, of course).

On the flip side, reversing the URL – building it from handler name and parameters – is always possible. There is a url_for function which does just that. It can be used both from Python code and, perhaps more usefully, from HTML (Jinja) templates:

Parameters can have types, too. We’ve seen, for example, that post_id was defined as int in the URL pattern for blogpost handler. These types are checked during the actual routing of HTTP requests, but also by the url_for function:

pattern = url_for('blogpost', post_id='<id>')# raises ValueError

Most of the time, this little bit of “static typing” is a nice feature. However, there are some cases where this behavior of url_for is a bit too strict. Anytime we don’t intend to invoke the resulting URL directly, we might want a little more flexibility.

Biggest case-in-point are various client-side templates, used by JavaScript code to update small pieces of HTML without reloading the whole page. If you, for example, wanted to rewrite the template above to use Underscore templates, you would still want url_for to format the blogpost URL pattern:

Assuming you don’t feel dizzy from seeing two templating languages at once, you will obviously notice that '< %= post.id %>' is not a valid int value. But it’s a correct value for post_id parameter, because the resulting URL (/post/< %= post.id %>) would not be used immediately. Instead, it would be just sent to the browser, where some JS code would pick it up and replace the Underscore placeholder with an actual ID.

Unfortunately, bypassing the default strictness of url_for is not exactly easy.

However, pondering the specific case of equality checks, I realized it’s not actually uncommon for programming languages to confuse the heck out of developers with their single, double or even triple “equals”. Among the popular ones, it seems to be a rule rather than exception.

Just consider that:

JavaScript has both ==and===, exactly like PHP does. And the former is just slightly less crazy than its PHP counterpart. For both languages, it just seems like a weak typing failure.

In C and C++, you may easily use = (assignment) in lieu of == (equality), because the former is perfectly allowed inside conditions for if, while or for statements.

Java is famously counterintuitive when it comes to comparing strings, requiring to use String.equals method rather than == (like in case of other fundamental data types). Many, many programmers have been bitten by that. (The fact that under certain conditions you can compare strings char-by-char with == doesn’t exactly help either).

C# complicates stuff even more by allowing to override Equals and overload == operator. It also introduces ReferenceEquals which usually works like ==, except when the latter is overloaded. Oh, and it also has two different kinds of types (value and reference types) which by default compare in two different ways… Joy!

The list could likely go on and include most of the mainstream languages but one of them would be curiously absent: Python.

You see, Python got the == operator right:

It tests for equality only, not identity (also known as “reference equality”). For that there is a separate is operator.

All basic objects – not only strings, but also lists or dictionaries (hash tables) – compare by value. Hence e.g. two lists are equal if they contain equal elements in the same order, whether or not they are the same objects.

Implicit conversions are applied judiciously. Different types of numbers (int, long, float) compare to each other just fine, but there is clear distinction between 42 (number) and "42" (string).

You can overload == but there are no magical tricks that instantly turn your class into wannabe fundamental type (like in C#). If you really want value semantics, you need to write that yourself.

In retrospect, all of this looks like basic sanity. Getting it right two decades ago, however… That’s work of genius, streak of luck – or likely both.

There is a specific technology I wanted to play around with for some time now; it’s called node.js. It also happens that I think the best way to get to know new stuff is to create something small, but complete and functional. Note that by ‘functional’ I don’t really mean ‘practical’; that distinction is pretty important, given what I’m about to present here.

Basically, I wrote a package manager for jQuery. The idea was to have a straightforward way to install jQuery plugins – a way that somewhat mirrors the experience of dozens of other package managers, from pip to cabal. End result looks pretty decent, in my opinion:

$ jqpm install flot

[jqpm] flot installed successfully

$ ls *.js

jquery.flot.js

The funny part? It doesn’t use any central, remote registry of plugins. What it does is searching GitHub and pulling code directly from there – provided it is able to find something relevant that looks like jQuery plugin. That seems to work well for quite a few popular ones, which is rather surprising given how silly and simplistic the underlying algorithm is. Certainly, there’s plenty of room for improvement, including support for jquery.json manifests – the future standard for the upcoming official plugin site.

As I said before, though, the main purpose of jqpm was educational one. After toying with underlying technologies for a couple of evenings, I definitely have better perspective to evaluate their usefulness. While the topic might warrant a follow-up posts in the future, I think I can briefly summarize my findings in few bullet points:

Node’s JavaScript is almost the same language you can find in your browser, with all of its wats, warts and shortcomings. That’s not a big problem if you already learned to deal with them, but I surely wouldn’t recommend it as starter language for novices. Additionally, it also turns out to be quite verbose language, with all the ubiquitous functions and loops, and without denser syntactic sugar such as list comprehensions.

By contrast, the standard library of Node is very nice mixture of usefulness and minimalism. It’s certainly not as rich as Python’s or Java’s, but it’s more than usable, despite sitting a bit on the low level side.

The canonical tool for managing dependencies, npm, is rather curious creature. Combined with the way Node resolves require() calls, it makes for an unusual system that resembles classic C/C++ #includes – but improved, of course. What stands out the most is the lack of virtualenv/rvm-style utilities; instead, an equivalent approach of local node_modules subfolders is used instead. (npm faq and npm help folders provide more elaborate explanation on how does it work exactly).

The callback-based, asynchronous computation is a big hindrance that doesn’t really seem worthwhile. Intriguingly, the hassles of async vs. sync feel strangely similar to issues with pure vs. impure code in functional languages such as Haskell; in both cases you need some serious refactoring of brainware to start coding effectively. In Haskell, however, you are gaining tremendous boons to correctness, modularization, parallelization and testability. In Node, it’s disputable whether you actually gain anything: the whole idea of I/O based on a single event loop sounds all too similar to what an operating system already does with threads sleeping on I/O calls and hardware interrupts that wake them. Granted, this incarnation of asynchronous I/O is much better than some older ones, but that’s mostly thanks to JavaScript being much better equipped to handle the callback bonanza than plain ol’ C.

The bottom line: node.js is definitely not a cancer and has many legitimate uses, mostly pertaining to rapid transfer of relatively small pieces of data over the Internet. API backends, single page web applications or certain game servers all fall easily into this category.

From developer’s point of view, it’s also quite fun platform to code in, despite the asynchronous PITA mentioned above (which is partially alleviated by libraries like async.js or frameworks providing futures/promises). On the overall abstraction ladder, I think it can be placed noticeably lower than Java and not very much higher than plain C. That place is an interesting one, and it’s also not densely populated by any similar technologies and languages (only Go and Objective-C come to mind). Occupying this mostly overlooked niche could very well be one of reasons for Node’s recent popularity.

It would be quite far-fetched to call JavaScript a functional language, for it lacks many more sophisticated features from the FP paradigm – like tail recursion or automatic currying. This puts it on par with many similar languages which incorporate just enough of FP to make it useful but not as much as to blur their fundamental, imperative nature (and confuse programmers in the process). C++, Python or Ruby are a few examples, and on the surface JavaScript seems to place itself in the same region as well.

Except that it doesn’t. The numerous different purposes that JavaScript code uses functions makes it very distinct, even though the functions themselves are of very typical sort, found in almost all imperative languages. Learning to recognize those different roles and the real meaning of function keyword is essential to becoming an effective JS coder.

So, let’s look into them one by one and see what the function might really mean.

A scope

If you’ve seen few good JavaScript libraries, you have surely stumbled upon the following idiom:

/* WonderfulThing.js

* A real-time, HTML5-enabled, MVC-compatible boilerplate

* for appifying robust prototypes... or something

*/

(function() {

// actual code goes here

})();

Any and all code is enclosed within an anonymous function. It’s not even stored in a variable; it’s just called immediately so its content is just executed, now.

This round-trip may easily be thought as if doing absolutely nothing but there is an important reason for keeping it that way. The point is that JavaScript has just one global object (window in case of web browsers) which is a fragile namespace, easily polluted by defining things directly at the script level.

We can prevent that by using “bracketing” technique presented above, and putting everything inside this big, anonymous function. It works because JavaScript has function scope and it’s the only type of non-global scope available to the programmer.

A module

So in the example above, the function is used to confine script’s code and all the symbols it defines. But sometimes we obviously want to let some things through, while restricting access to some others – a concept known as encapsulation and exposing an interface.

Perhaps unsurprisingly, in JavaScript this is also done with the help of a function:

var counter = (function() {

var value = 0;

return {

increment: function(by) {

value += by || 1;

},

getValue: function() {

return value;

},

};

})();

What we get here is normal JS object but it should be thought of more like a module. It offers some public interface in the form of increment and getValue functions. But underneath, it also has some internal data stored within a closure: the value variable. If you know few things about C or C++, you can easily see parallels with header files (.h, .hpp, …) which store declarations that are only implemented in the code files (.c, .cpp).

Or, alternatively, you may draw analogies to C# or Java with their public and private (or protected) members of a class. Incidentally, this leads us to another point…

Object factories (constructors)

Let’s assume that the counter object from the example above is practical enough to be useful in more than one place (a tall order, I know). The DRY principle of course prohibits blatant duplication of code such as this, so we’d like to make the piece more reusable.

Here’s how we typically tackle this problem – still using only vanilla functions:

var createCounter = function(initial) {

var value = initial || 0;

return {

increment: function(by) {

value += by || 1;

},

getValue: function() {

return value;

},

};

};

var counter = createCounter();

var counterFrom1000 = createCounter(1000);

Pretty straightforward, right? Instead of calling the function on a spot, we keep it around and use to create multiple objects. Hence the function becomes a constructor for them, while the whole mechanism is nothing else but a foundation for object-oriented programming.

We have now covered most (if not all) roles that functions play when it comes to structuring JavaScript code. What remains is to recognize how they interplay with each other to control the execution path of a program. Given the highly asynchronous nature of JavaScript (on both client and server side), it’s totally expected that we will see a lot of functions in any typical JS code.

Many caveats clutter the JavaScript language. Some of them are quite hilarious and relatively harmless, but few can get really nasty and lead to insidious bugs. Today, I’m gonna talk about something from the second group: the semantics of this keyword in JavaScript.

Why’s this?

It is worth noting why JS has the this keyword at all. Normally, we would expect it only in those languages which also have the corresponding class keyword. That’s what C++, Java and C# have taught us: that this represents the current object of a class when used inside one of its methods. It only makes sense, then, to use this keyword in a class scope, denoted by the class keyword – both of which JavaScript doesn’t seem to have. So, why’s this even there?

The most likely reason is that JavaScript actually has something that resembles traditional classes – but it does so very poorly. And like pretty much everything in JS, it is written as a function:

function Greeting(text) {

this.text = text

}

Greeting.prototype.greet = function(who) {

alert("Hello, " who + "! " + this.text);

}

var greeting = new Greeting("Nice to meet you!");

greeting.greet("Alice");

Here, the Greeting is technically a function and is defined as one, but semantically it works more like constructor for the Greeting “class”. As for this keyword, it refers to the object being created by such a constructor when invoked by new statement – another familiar construct, by the way. Additionally, this also appears inside greet method and does its expected job, allowing access to the text member of an object that the method was called upon.

So it would seem that everything with this keyword is actually fine and rather unsurprising. Have we maybe overlooked something here, looking only at half of the picture?…

Well yes, very much so. And not even a half but more like a quarter, with the remaining three parts being significantly less pretty – to say it mildly.