2007-09-28

I have come to the conclusion that programmers undergo an imprinting process when they learn and then heavily use a particular style of programming language syntax—especially in the case of programming languages they learn early in their career. Once they've been imprinted with a particular syntactical style, they tend to resist learning or using any language whose notational conventions don't closely conform to that style.

This resistance to "foreign" syntactic styles in programming languages goes much deeper than a native speaker's preference for his own spoken language versus some foreign language.

Many people are quite happy to learn multiple human languages. But people do tend to be more willing to learn a foreign language that uses essentially the same writing system as their native language. If one must learn a set of ideograms instead of using an alphabet, or learn an alphabet instead of a set of ideograms (Chinese, Japanese), or learn a very different alphabet (Roman, Hebrew, Arabic, Cyrillic, etc.), then there is usually far less willingness to make any attempt to learn such a language.

The aversion to learning a really foreign writing system is usually quite strong. People tend to be much less accepting of the differences between the writing systems of English, Hebrew and Chinese than they are of the differences between English, French and German as spoken/written languages. People just don't want to learn what they see as a strange, mysterious, inscrutable and weird system of writing.

The reaction of programmers to "foreign" syntactic conventions in "exotic" programming languages is a lot like the typical reaction to a really different writing system. This intolerance of the radically foreign is not just unfortunate, it's tragic. A lot of programmers are, in effect, crippling their programming abilities by doing the equivalent of refusing to switch from the use of Roman Numerals to Arabic Numerals simply because they've imprinted themselves to Roman Numerals.

Long division is much easier using Arabic Numerals than it is using Roman Numerals. Not all syntax is created equal, even when the semantics is the same. The first step to overcoming a phobia is to acknowledge you are suffering from it. The second step is to decide you want to be cured.

Real programmers should not be afraid of new syntactic conventions. New syntax is often invented for good reason--and in any case, the only way to know is to try it. No pain, no gain.

2007-09-23

I have written a Smalltalk primer/tutorial. It takes a different approach than other programming language primers or tutorials I have seen on the web.

Firstly, it does not assume that the reader knows the syntax of any particular programming language, although it does assume that the reader either knows how to program, or at least that he is comfortable with the relevant mathematics and foundational concepts.

Secondly, it first explains the computational model of Smalltalk, before presenting any Smalltalk syntax.

Thirdly, it takes a bottom-up approach to the presentation of Smalltalk syntax, starting with lexical tokens and ending with method declarations.

Finally, it stresses the importance of messages above all else, and shows how and why those who have attempted to copy Smalltalk in other programming languages failed to "get the message."

Constructive criticism, suggested rewordings to improve clarity, and requests that additional information be included or that additional topics be covered, are encouraged. Such requests should be sent to the author, Alan L. Lovejoy, at the following e-mail address: smalltalk-tutorial (at) alan-lovejoy (dot) net.

Enjoy!

Smalltalk: Getting The Message

The Essentials of Message-Oriented Programming with Smalltalk

Message passing: Almost all computation in Smalltalk happens via the sending of messages. The only way to invoke a method is to send a message—which necessarily involves dynamic binding (by name) of message to method at runtime (and never at compile time.) The internals of an object are not externally accessible, ever—the only way to access or modify an object's internal state is to send it a message. So function and data abstraction are both complete and universal. Pervasive message passing is Smalltalk's most important feature—a point that was lost on most of those who have tried to emulate Smalltalk when designing other programming languages.

Dynamic and strong typing: Although any object can be assigned to any variable, the only way to access or modify the internal state of an object is to send it a message—and the sending of any invalid message is detected and prevented at run time. So, even though Smalltalk's pervasive use of dynamic typing enables the programmer to define highly polymorphic abstractions with an extremely high degree of applicability and reusability, it is impossible to apply a function to a value for which there is no valid, defined behavior.

Reflection: In most programming languages, the specifications of types, classes, functions and subroutines exist only in the source code, and so are not accessible at runtime. But in Smalltalk, all specifications of all program constructs (classes, methods, etc.) are live objects that exist both at compile time and at runtime—and those objects are fully accessible to a running program, and can be queried or modified by sending them messages. So a Smalltalk program can not only fully introspect on itself, it has full power to change itself.
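The three features above can be approximated in Python, which is used here only as a rough analogue and not as Smalltalk itself. This is a minimal sketch under stated assumptions: the `Account` class, the `send` helper and the `withdraw` method are all invented for illustration, and Python, unlike Smalltalk, does not actually prevent outside code from touching an object's internals.

```python
class Account:
    """Toy class; its name and methods are invented for this sketch."""

    def __init__(self, balance):
        self._balance = balance

    def balance(self):
        return self._balance

    def deposit(self, amount):
        self._balance += amount


def send(receiver, selector, *args):
    """Bind the message name to a method at run time, then invoke it."""
    method = getattr(receiver, selector, None)
    if not callable(method):
        # An invalid message is detected and rejected at run time.
        raise AttributeError(
            f"{type(receiver).__name__} does not understand {selector!r}"
        )
    return method(*args)


account = Account(100)
send(account, "deposit", 50)

# Reflection: the class is a live object that can be queried...
selectors = sorted(
    name for name, value in vars(Account).items()
    if callable(value) and not name.startswith("_")
)

# ...and modified while the program runs.
def withdraw(self, amount):
    self._balance -= amount

setattr(Account, "withdraw", withdraw)

# The instance created earlier responds to the new message immediately.
send(account, "withdraw", 30)
```

The `send` helper only makes the by-name lookup explicit; an ordinary Python method call performs the same run-time lookup implicitly.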

Warning: Terms such as "object," "class," "type," "method" and hence "object-oriented programming" itself, as used in the context of Smalltalk, do not have the same meanings as they do when used in the context of other programming languages. The term object-oriented programming ("OOP") was coined by Dr. Alan Kay, the inventor of Smalltalk. He intended the term to describe the essential nature of Smalltalk. Unlike Smalltalk, most of the programming languages that market themselves as "object oriented" do not satisfy Dr. Kay's definition of object oriented programming:

"OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them."

The full import of "object-oriented programming" as originally defined by Dr. Kay, and how and why the meaning of "OOP" as it applies to Smalltalk differs from the meaning of "OOP" as commonly understood outside of a Smalltalk context, is fully explained in the sections that follow. In addition, Dr. Kay's article "The Early History of Smalltalk" is highly recommended reading for anyone who wants to gain an even deeper insight into why and how Smalltalk came to be what it is, and why it is so different from the mainstream programming languages.

Depending on your terminological tradition, objects have "instance variables," "members," "fields," "slots," "associations" or "references." But what matters isn't the terminology, but the semantics and mathematics.

In terms of structure and mathematics, objects are nodes (vertices, points) in a directed graph, and the references objects make to each other are directed arcs (edges) in the graph. The transitive closure of all the objects reachable by following the references from some root object is commonly called the object graph of the root object. Any object can be considered as the "root" of an object graph, although each object considered as a root node results in a different object graph.

This is one reason you typically don't find either a tree data structure or a graph data structure provided as a Collection class in a typical class library: An object already is a tree or graph node, and by defining the object's instance variables and methods, its class defines the name and semantics of the "arcs" that can be navigated from a node of that type.
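As a rough illustration of the object graph idea, the following Python sketch collects every object reachable from a chosen root by following its instance variables. The `Node` class and the `object_graph` function are hypothetical names introduced here, and the traversal follows only references to `Node` objects, either held directly or inside a list.

```python
class Node:
    """A bare object whose instance variables act as named arcs."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []


def object_graph(root):
    """Return the set of Node objects reachable from root (root included)."""
    seen = set()
    stack = [root]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        # Follow every instance variable that refers to a Node or a list of Nodes.
        for value in vars(obj).values():
            targets = value if isinstance(value, list) else [value]
            stack.extend(t for t in targets if isinstance(t, Node))
    return seen


a, b, c = Node("a"), Node("b"), Node("c")
a.children = [b]
b.parent = a
b.children = [c]  # c keeps no reference back to b

graph_from_a = object_graph(a)
graph_from_c = object_graph(c)
```

Starting from `a` reaches all three nodes, while starting from `c` reaches only `c` itself: the same objects yield a different object graph for each choice of root.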

Objects are graph-theoretic, in the same sense that relational databases are set-theoretic. Set theory is a subset of graph theory—which is why object-relational mapping is non-trivial.

But what is the semantics of a variable? In other words, what is the meaning of the fact that a particular variable (or any reference by name to an object) has any particular value?

Predicates

A named object reference, just like an arc in a graph, represents (or "models") an arity-2 predicate, where the two nodes are the arguments of the predicate. The existence of the arc connecting the two graph nodes is, by convention, interpreted as an assertion that the predicate represented ("modelled") by the arc that connects the two nodes is true.

For example, a variable named "parent" would normally be used to assert that whatever object is the value of that variable is the "parent" of some other object (or represents/models the parent of whatever entity is being modelled by that object). More formally, if there is an object A and an object B, and object A has an instance variable named "parent" whose value is object B, then that situation is normally intended to represent (model) the fact that the predicate hasParent(objectA, objectB) is true.
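One way to make the reading of a reference as a predicate concrete is to list the arity-2 predicates that an object's instance variables assert. The Python sketch below is only an illustration; `Person` and `asserted_predicates` are hypothetical names, and the "has" prefix simply mirrors the hasParent example above.

```python
class Person:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def asserted_predicates(obj):
    """Yield (predicate_name, referrer, value) for each non-empty instance variable."""
    for var_name, value in vars(obj).items():
        if value is not None:
            predicate = "has" + var_name[0].upper() + var_name[1:]
            yield (predicate, obj, value)


object_b = Person("B")
object_a = Person("A", parent=object_b)

facts = list(asserted_predicates(object_a))
```

Here `facts` contains the triple `("hasParent", object_a, object_b)`, which is the assertion hasParent(objectA, objectB) from the text.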

The name of a variable (or of any named reference to a value) should indicate the semantic role played by the variable—in other words, it should name the predicate modelled by the reference. Programmers often fail to do this, preferring instead to give names to variables based on the type of the values to be held by the variables. But that's a fundamental mistake: It's much easier to infer the type of a value from its semantic role, than it is to infer the semantic role from the type. And the semantic role played by variables is far more important to the meaning of code than is the type of their values.

Language designers typically have been making an analogous mistake, focusing considerable energy and effort on type constraint systems for variables, while almost completely ignoring the core semantics of variables (or other named references to values). Fortunately, that is beginning to change.

In addition to the semantic role played by a variable, the following aspects of (or perspectives on) the semantics of a reference to an object (value) are also of fundamental importance:

Cardinality: The minimum and maximum number of different individuals that can validly be members of the arity-2 relation defined by the arity-2 predicate represented by an instance variable.

Symmetry: Is the arity-2 predicate represented by an instance variable symmetric? If hasSibling(A, B), then is it necessarily true that hasSibling(B, A)?

Transitivity: Is the arity-2 predicate represented by an instance variable transitive? If hasAncestor(A, B) and hasAncestor(B, C), then is it necessarily true that hasAncestor(A, C)?

Equivalence Semantics: What matters about the object referenced by a variable: its identity as an object, or the value it represents?

Modality Semantics: Does the referenced value serve to identify the referrer, or is it simply an accidental (transient, variable) property of the referrer?
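The symmetry and transitivity questions in the list above can be checked mechanically once a predicate is written out as a set of (referrer, value) pairs. The sketch below is a minimal Python illustration; the function names and the sample relations are assumptions made for the example, not part of any library.

```python
def is_symmetric(relation):
    """True if (b, a) is present whenever (a, b) is."""
    return all((b, a) in relation for (a, b) in relation)


def is_transitive(relation):
    """True if (a, c) is present whenever (a, b) and (b, c) are."""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


# hasParent is neither symmetric nor transitive.
has_parent = {("C", "B"), ("B", "A")}

# hasAncestor adds the derived pair, which makes it transitive.
has_ancestor = {("C", "B"), ("B", "A"), ("C", "A")}
```

With these sample relations, `is_transitive(has_parent)` is false because the pair `("C", "A")` is missing, while `is_transitive(has_ancestor)` is true.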

The remainder of this post deals with the equivalence and modality semantics of inter-object references. For a more complete discussion of the cardinality, symmetry and transitivity of inter-object references, check out RDF and OWL.

Equivalence Semantics

The tradition in object modelling is to classify the properties (instance variables) of an object as being either attributes or associations. The usual explanation of the difference is that an instance variable whose type is one of the primitive types is an "attribute," but that an instance variable whose type is one of the object types is an "association." That explanation is misleading at best, and totally wrong at worst.

To see the problem with the conventional explanation, consider an instance variable whose type is java.lang.String. A Java String is an object, not a primitive type. But there can be no question that String-valued instance variables in Java are attributive, not associative. As another example, consider any instance variable of any Smalltalk object. In Smalltalk, all values are objects—there are no primitive types—and so all instance variables have objects as their values. The idea that the instance variables of a Smalltalk object can't be attributive, just because Smalltalk has no primitive types, is absurd on its face: Whether an instance variable is an attributive reference or an associative reference should not depend on which programming language one uses.

The valid explanation of the difference between an attribute and an association is that the equivalence semantics of the two are different. An attribute references the value of an object. An association references the identity of an object. The distinction between the value and the identity of an object only matters if the object is mutable. If an object is immutable, then its value can't be modified. Since an attribute references the value of an object, any change to the value of the referenced object violates the semantics of the attributive reference.

Primitive values (the values of primitive types) are immutable. That's why they are commonly used as the values of attributes. But any object that is immutable can easily serve as the value of an attributive reference—which is one reason that Smalltalk objects can, in fact, have instance variables that are attributes, even though all values are objects.

It can sometimes be the case that an instance variable needs to be an attributive reference, but there is no type or class available whose instances can be the value of such a variable, and which are (or can be made) immutable. Such cases require either that a different class or type be developed whose instances are immutable, or that the programmer take special care to prevent the objects that are referenced attributively from having their values modified in a way that (silently) breaks the intended equivalence semantics of any attributive references to those objects.

One commonly used technique for "safely" using a mutable object as the value of an attributive reference is to use an accessor method that returns a copy of the attributively-referenced mutable object, instead of answering the object itself. Another is to have "object copy" logic that substitutes a copy of the original (mutable) object as the value of any attributive instance variable in any copy of the original object that is created.

Put another way: Knowing both a) the equivalence semantics of a variable and b) the mutability of the objects that will be the values of an instance variable can be quite useful for safely and correctly implementing the code of a class (or generating safe and correct code for it from a model.)
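The two techniques described above for protecting an attributive reference to a mutable object, a copying accessor and copy logic that does not share the mutable value, might look like this in Python. This is a sketch under stated assumptions: `Polygon` and its `vertices` accessor are hypothetical, and the vertices themselves are assumed to be immutable tuples, so a shallow copy of the list is enough.

```python
import copy


class Polygon:
    def __init__(self, vertices):
        # Copy on the way in so the caller's list is not shared.
        self._vertices = list(vertices)

    def vertices(self):
        # Accessor answers a copy, so callers cannot mutate our state.
        return list(self._vertices)

    def __copy__(self):
        # "Object copy" logic: the copy gets its own list, not a shared one.
        return Polygon(self._vertices)


triangle = Polygon([(0, 0), (1, 0), (0, 1)])

# Mutating the answered list leaves the polygon untouched.
answered = triangle.vertices()
answered.append((9, 9))

# Mutating the copy's list leaves the original untouched.
duplicate = copy.copy(triangle)
duplicate._vertices.append((5, 5))
```

After both mutations, `triangle` still has exactly three vertices. If the elements of the list were themselves mutable, a deeper copy would be needed to preserve the attributive semantics.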

Modality Semantics

The modality semantics of an instance variable deals with the scope or context of the predicate represented by the variable. Does the predicate assert a necessary and universal truth, or does it represent an accidental, transient truth that may be valid only in a limited scope or context? Although the modality of a predicate can vary along a variety of perspectives, so far I've found it useful to classify instance variables into four different modality categories:

axiomatic variables

identity variables

state variables, and

transient variables

[Note: Both the names and the definitions of the terms "identity variable" and "state variable" were introduced to me by Bobby Woolf, in a conversation we had at some point in the mid-1990s.]

The predicate represented by an axiomatic variable asserts a universal and necessary truth—in other words, it asserts an axiom.

If the instance variable "guid" represents the assertion that the predicate "hasGuid(referrer, aValue)" must necessarily be true, for all time and in all contexts, then "guid" is an axiomatic variable, and the association between the referrer and the referenced value is axiomatic.

The predicate represented by an identity variable asserts a universal and necessary truth, and its value must be unique with respect to either a) any other object of the same class, or b) any entity in the real world modelled by one or more objects in the program or database. The latter case deals with situations where different objects of different types model different aspects of the same entity in the "real world." Note that all identity variables also qualify as axiomatic variables, but not all axiomatic variables qualify as identity variables.

If the instance variable "guid" represents the assertion that the predicate "hasGuid(referrer, aUniqueValue)" must necessarily be true, for all time and in all contexts, and if the value of the variable must be "unique" (as defined above), then "guid" is an identity variable, and the association between the referrer and the referenced value is an axiomatic identity relationship.

Note that there are cases where a variable by itself is not an identity variable, but it may participate in an identity relationship along with one or more other variables. In RDBMS terminology, such variables are elements of a multi-part key.

The predicate represented by a state variable asserts a non-universal, accidental truth which is not derivable from other predicates that have been asserted.

If the instance variable "location" represents the assertion that the predicate "hasLocation(referrer, aLocation)" happens to be true, as of a particular moment and/or in a particular context, and if that fact is not derivable from other asserted facts, then "location" is a state variable.

The predicate represented by a transient variable asserts a fact which is derivable from other predicates that have been asserted.

If the instance variable "area" represents the assertion that the predicate "hasArea(referrer, anArea)" is true, but that fact is derivable from other asserted facts (e.g., the width, height and shape of the referrer), then "area" is a transient variable.

Knowing the modality category of a variable aids in making optimal design decisions when implementing a class (or generating code for a class):

The binding of an axiomatic variable to its value should normally be immutable. If mutator methods are provided for such variables, they should not be useable once the instance has been properly initialized.

The binding of a state variable to its value should be mutable. It will probably be necessary to define mutator methods for such instance variables.

The value of a transient variable can be set to nil at any time, without any loss of information. No mutator methods should be provided for such variables; their value should be lazily computed as necessary.
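The three design guidelines above can be sketched in a single Python class. Everything here is an assumption made for illustration: the `Rectangle` class, its attribute names, and the choice of `uuid.uuid4()` as the source of a unique identity value.

```python
import uuid


class Rectangle:
    def __init__(self, width, height):
        self._guid = uuid.uuid4()  # identity (and therefore axiomatic) variable
        self._width = width        # state variable
        self._height = height      # state variable
        self._area = None          # transient variable, computed lazily

    @property
    def guid(self):
        # Read-only: no mutator exists once the instance is initialized.
        return self._guid

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        self._width = value
        self._area = None  # discard the derived value; nothing is lost

    @property
    def height(self):
        return self._height

    @height.setter
    def height(self, value):
        self._height = value
        self._area = None

    @property
    def area(self):
        # No mutator for the transient value; recompute it on demand.
        if self._area is None:
            self._area = self._width * self._height
        return self._area


rect = Rectangle(2, 3)
first_area = rect.area   # computed lazily on first access
rect.width = 4           # state change invalidates the cached area
second_area = rect.area  # recomputed from the current state
```

Attempting `rect.guid = ...` raises `AttributeError`, because the property defines no setter. That is one way to keep the binding of an axiomatic variable immutable after initialization.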

Conclusion

If you design or architect software, you need to know all of the above concepts. So do those who design software tests, programming languages, or object modelling languages/tools. The above concepts are also vital to the domain of model-driven architecture and model-driven design.

The author is also flat out wrong on a few major points, most notably the importance of good dev tools. I've found that the better the programmer, the more good tools help: They amplify whatever you've got to give. And I'd take a great programmer using Java over a poor one using any other language. The "dirty little secret" of IT is that quality of personnel matters far more than tools.

If you think of yourself as unique, think again. The days when physicists could ignore the concept of parallel universes may have come to an end. If that doesn't send a shudder down your spine, think of it this way: our world is just one of many. You are just one version of many.

David Deutsch at the University of Oxford and colleagues have shown that key equations of quantum mechanics arise from the mathematics of parallel universes. "This work will go down as one of the most important developments in the history of science," says Andy Albrecht, a physicist at the University of California at Davis. In one parallel universe, at least, it will - whether it does in our one remains to be seen.

The "many worlds" interpretation of quantum mechanics was proposed 50 years ago by Hugh Everett, a graduate student at Princeton University. Rather than ...

How much is a kilogram? It turns out that nobody can say for sure, at least not in a way that won’t change ever so slightly over time. The official kilogram – a cylinder cast 118 years ago from platinum and iridium and known as the International Prototype Kilogram or “Le Grand K” – has been losing mass, about 50 micrograms at last check. The change is occurring despite careful storage at a facility near Paris.

In science fiction movies, it happens all the time: A small device is briefly held against the skin of a sick crewmember and seconds later the monitor displays what ails him. This futuristic image could someday be real.

2007-09-19

Microorganisms may soon be efficiently and inexpensively producing novel pharmaceutical compounds, such as flavonoids, that fight aging, cancer or obesity, as well as high-value chemicals, as the result of research being conducted by University at Buffalo researchers.

2007-09-17

A team led by biophysicist Jeremy Smith of the University of Tennessee and Oak Ridge National Laboratory has taken a significant step toward unraveling the mystery of how proteins fold into unique, three-dimensional shapes.

Scientists from the University of Pennsylvania have developed nanowires capable of storing computer data for 100,000 years and retrieving that data a thousand times faster than existing portable memory devices such as Flash memory and micro-drives, all using less power and space than current memory technologies.

2007-09-15

Levitation has been elevated from being pure science fiction to science fact, according to a study reported today by physicists. In theory the discovery could be used to levitate a person.

In earlier work the same team of theoretical physicists showed that invisibility cloaks are feasible.

Now, in another report that sounds like it comes out of the pages of a Harry Potter book, the University of St Andrews team has created 'incredible levitation effects' by engineering the force of nature which normally causes objects to stick together.

The levitation is achieved using the Casimir Effect, which normally is an attractive force, but is turned into a repulsive force in this instance. The most likely immediate application of the new levitation effect is to eliminate friction in micromachines.

2007-09-13

If your blood glucose is out of whack, the problem may be in your bones. New research in mice shows that bone cells exert a surprising influence on how the body regulates sugar, energy, and fat.

The discovery could lead to new ways to treat type 2 diabetes, a disease involving poor regulation of blood glucose. It also means that skeletons act as endocrine organs, which affect other body tissues by releasing hormones into the bloodstream.... "This could also have important ramifications for cardiovascular disease because of the effect on metabolic syndrome," a condition related to diabetes, comments Dana T. Graves of Boston University. "The fact that bone cells regulate energy metabolism, and that they do it through osteocalcin, is a major finding," he says.