Interlanguage Migration: From Scripts to Programs

As scripts grow into full-fledged applications, programmers
should want to port portions of their programs from script-
ing languages to languages with sound and rich type sys-
tems. This form of interlanguage migration ensures type-
safety and provides minimal guarantees for reuse in other
applications, too.
In this paper, we present a framework for expressing this
form of interlanguage migration. Given a program that con-
sists of modules in the untyped lambda calculus, we prove
that rewriting one of them in a simply typed lambda calcu-
lus produces an equivalent program and adds the expected
amount of type safety, i.e., code in typed modules canâ€™t go
wrong. To ensure these guarantees, the migration process
infers constraints from the statically typed module and im-
poses them on the dynamically typed modules in the form
of behavioral contracts.

...a version of PLT Scheme's core with an "occurrence type system".
.. Next we will develop a cross-language refactoring tool that assists programmers with the task of translating one module in a program system from the untyped language into the typed one. [The paper above] lays out the theoretical framework for this research.

Comment viewing options

BeanShell is an interpreted scripting language which runs on the JVM and uses Java-like syntax, except without the need to declare variables, types, or perform casts. In order to save time I often prototype code in BeanShell first and then only convert it to real Java when it's done. If it's just throw-away single-use code and it's fast enough, then I won't bother converting it. BeanShell is nice because it's so similiar to Java that all I really have to do is add the type-related code. If I had programmed it in Python, Perl, Ruby, etc. then it would be much more work to convert it to Java (but probably less work to write it in the first place).

... And that a correct type system ensures "code in typed modules canâ€™t go wrong"? So what would be the best use of your time: writing more unit tests or converting your dynamic program to a statically typed language?
I'll stick with the former.

... And that a correct type system ensures "code in typed modules canâ€™t go wrong"?

This phrase is a refinement of a popular slogan commonly used for proofs of soundness for static type system that claims that "well-typed programs can't go wrong." The notion of "going wrong" is a pithy (if misleading) way of saying "cause runtime type errors." This is what static type systems are supposed to prove can't happen.

However, even in statically typed languages like ML or Haskell, there is always some interaction with the underlying system (via the standard library or FFI) that relies on something outside the system respecting its types. So, in face, the notion that "code in typed modules can't go wrong" is a more accurate reflection of the guarantee that you get from a statically typed language: you might get a runtime type error, but it won't be caused by well-typed code.

So what would be the best use of your time: writing more unit tests or converting your dynamic program to a statically typed language?

You sound like you're disagreeing with the paper, but you aren't; the paper is specifically describing systems in which you don't have to migrate an entire program from one language to another, but rather the language supports both dynamic and static typing and allows you to migrate code piecemeal.

While there are some approaches for testing concurrent programs, I believe testing distributed software is even more complicated (I am being puzzled by this problem, anyway).
OTOH, type systems also are not really helpful in distributed setting (yet).

And Alice ML also has type-safe pickling and distribution since its very first release, if I may add. But I fear that Andris alluded to something stronger than mere unmarshal-time type checking as found in these languages.

I thought more about timing issues, semidecidable failures, extra-lingual configuration management, security, conflicting incentives and similar problems, which are not unique to distributed computing, just manifest themselves stronger in that setting.
I believe that more heavy-weight, contract-like types are needed for handling these issues.
Something remotely similar to Combinators for contracts, maybe (of course leaving more lightweight and easy usable types for writing code inside units of distribution).

programmers should want to port portions of their programs from scripting languages to languages with sound and rich type systems

That's a heck of a leap to make to justify what was probably a very significant amount of work in an effort to ease said porting. So if I make a big application and I don't want to port parts of it "to languages with sound and rich type systems", then I'm wrong? 'Cause lemme tell ya, when I have a big project written in a scripting language working well, of course I'll want to reinven^H^H^H^H^H^H^Hrecode a bunch of it in another language and deal with language integration headaches. </sarcasm>

I'd rather see this amount of effort go into, say, static type analysis systems for popular scripting languages instead.

I'd rather see this amount of effort go into, say, static type analysis systems for popular scripting languages instead.

Well, if static type analysis systems can't be built for Scheme, then my conclusion would be that they are not technically feasible in any other dynamic scripting language (popular or not). The work in Scheme may not be visible to the typical user of Python, Ruby or Lua. But I'd bet that Guido, Matz, and Roberto don't have a problem with the Scheme community doing the foundational work. (And based on what Dave says above, I think the JavaScript developers would also be interested in this work).

'Cause lemme tell ya, when I have a big project written in a scripting language working well, of course I'll want to reinven^H^H^H^H^H^H^Hrecode a bunch of it in another language and deal with language integration headaches.

Sounds like you don't want to port to static type systems because of the costs involved in porting from one programming language to another. Well, what if going from one paradigm to another didn't involve (a) changing your PL? and (b) didn't require you to abandon working code? The whole point of the paper is about cost reduction in changing pieces of code from one to the other. Your analysis might make sense given the current cost curve, but what if the shape of that cost curve were radically different (for example, there were little to no costs associated with going from one to the other)? Has our allegiance in the static vs. dynamic debate made us blind to any possible advantage that might be afforded by the opposition?

Personally, I think that confronting this particular work with the old static vs. dynamic advocacy is misplaced. Indeed, if these guys are successful, that debate might well become an anachronism. I'm skeptical that the type systems will be as rigorous as we see in ML and Haskell. And I'm skeptical that the two paradigms can be blended seamlessly. But that's not the same as saying I think it shouldn't be attempted.

I'd rather see this amount of effort go into, say, static type analysis systems for popular scripting languages instead.

That's not the way how things work in the OSS communities these days. Guido decides to throw annotation syntax to function arguments and return values, leaving unspecified what these annotations shall actually mean. Now dozends of willing programmers are going to implement annotation handlers including type checkers with different kinds of policies and semantics. The best fits might be chosen any time in future to become part of the standard library. Even better than breeding articles/programs is breeding programmers. Not sure if anyone of the implementers reads scientific papers or is particularly interested in Scheme? Sociology matters as always.

BTW it is a common practice to factor certain perfomance critical parts of a script into a "real" programming language i.e. language which is just too painfull for writing scripts. So annotating types might be the cheapest way to cooperate with the compiler to do all the optimizations and produce native code - if this is required...

As a retired programmer who is playing with Scheme and who wrote a bunch of Ada code. I would like this even if it isn't quite as good a ML.
Add a tooltip that does typeing in the style of "success typing" and I would be happy forever.