Integrating Typed and Untyped Code in a Scripting Language

Tobias Wrigstad, Francesco Zappa Nardelli∗, Sylvain Lebresne, Johan Östlund, Jan Vitek
PURDUE UNIVERSITY   ∗INRIA

Abstract

Many large software systems originate from untyped scripting language code. While good for initial development, the lack of static type annotations can impact code quality and performance in the long run. We present an approach for integrating untyped code and typed code in the same system to allow an initial prototype to smoothly evolve into an efficient and robust program. We introduce like types, a novel intermediate point between dynamic and static typing. Occurrences of like typed variables are checked statically within their scope but, as they may be bound to dynamic values, their usage is checked dynamically. Thus like types provide some of the benefits of static typing without decreasing the expressiveness of the language. We provide a formal account of like types in a core object calculus and evaluate their applicability in the context of a new scripting language.

Categories and Subject Descriptors  D Software [D.3 Programming Languages]: D.3.1 Formal Definitions and Theory

General Terms  Theory

Keywords  Compilers, Object-orientation, Semantics, Types

1. Introduction

Scripting languages facilitate the rapid development of fully functional prototypes thanks to powerful features that are often inherently hard to type. Scripting languages pride themselves on “optimizing programmer time rather than machine time,” which is especially desirable in the early stages of program development before requirements stabilize or are properly understood. A lax view of what constitutes a valid program allows execution of incomplete programs, a requirement of test-driven development. The absence of types also obviates the need for early commitment to particular data structures and supports rapid evolution of systems. However, as programs stabilize and mature, e.g., a temporary data migration script finds itself juggling with the pension benefits of a
small country [31], the once liberating lack of types becomes a problem. Untyped code, or more precisely dynamically typed code, is hard to navigate, especially for maintenance programmers not involved in the original implementation. The effects of refactoring, bug fixes and enhancements are hard to trace. Moreover, performance is often not on par with more static languages. A common way of dealing with this situation is to rewrite the untyped program in a statically typed language such as C# or C++. Apart from being costly and far from guaranteed to succeed [35], a complete rewrite is likely to slow down future development as it is a snapshot of the dynamic system at one particular point in time. Not surprisingly, the idea of being able to gradually evolve a prototype into a full-fledged program within the same language has been a long standing challenge in the dynamic language community [3,8,22,28,32,33].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. POPL’10, January 17–23, 2009, Madrid, Spain. Copyright © 2009 ACM 978-1-60558-479-9/10/01...$10.00

An early attempt at bridging the gap between dynamic and static typing is the soft typing proposed by Cartwright and Fagan [12] and subsequently applied to a variety of languages [2,9,17,23,24]. Soft typing tries to transparently superimpose a type system on unannotated programs, inferring types for variables and functions. When an operation cannot be typed, a dynamic check is emitted and, possibly, a warning for the programmer. A compiler equipped with a soft type checker would never reject a program, thus preserving expressivity of the dynamically typed language. The main benefit of soft typing is the promise of more
efficient program execution and warnings for potentially dangerous constructs. Its drawback is the lack of guarantees that a given piece of code is free of errors. It is thus not possible for programmers to take key pieces of their system and “make” them safe, or fast. Furthermore, no guidance is given on how to refactor code that is not typable.

Incremental typing schemes have been explored by Bracha and Griswold in Strongtalk [8] which inspired pluggable types [7], in various gradual type systems [3,19,26,29,30], and recently Typed Scheme [32,33]. In these works, dynamically typed programs can be incrementally annotated with static type information and untyped values are allowed to cross the boundary between static and dynamic code. Run-time type checks are inserted at appropriate points to ensure that values conform to the annotations on the variables they are bound to. The strength of incremental approaches is that programmers can decide which parts of their program to annotate and will get understandable error messages when code does not type check. The drawback is that any operation, even one that is fully annotated, may fail due to a non-conforming value passed in from an untyped context. This has a direct consequence on performance as type information cannot be used for optimization. Even worse, program performance may decrease substantially when type annotations are added to an untyped program.

While our goal is related to this previous work, namely to explore practical techniques for evolving scripts to programs, we come from a different perspective which impacts some of our design decisions. Unlike most of the previous work which had its root in dynamically typed languages (Smalltalk, Scheme, Ruby and JavaScript) and tried to provide static checking, we would like to provide the flexibility of dynamic languages to static languages. At the language level, we are willing to forgo some of the most dynamic features of languages, such as run-time modification of object interfaces, in languages like
JavaScript or Ruby. At the implementation level, the addition of opcodes to support dynamic languages in Java virtual machines makes it possible to envision mixing typed and untyped code without sacrificing performance. The research question is thus how to integrate these different styles of programming within the same language. In particular, it would not be acceptable for statically typed code to either experience run-time failures or be compiled less efficiently in order to support dynamic values. Conversely, the expressiveness of dynamic parts of the system should not be restricted by the mere presence of static types in unrelated parts of the system.

We use, as a vehicle for our experiments, a new object-oriented scripting language called Thorn [6] which runs on a JVM and ought to support the integration of statically and dynamically typed code. The statically typed part of Thorn sports a conventional nominal type system with multiple subtyping akin to that of Java. Thorn also has a fully dynamic part, where every object is of type dyn and all operations performed on dyn objects are checked at run-time. We introduce a novel intermediate point, dubbed a “like type,” between dynamic and compile-time checked static types. For each type C, there is a type like C. Uses of variables of type like C are checked statically and must respect C’s interface. However, at run-time, any value can flow into a like C variable and its conformance to C is checked dynamically. Like types allow the same degree of incrementality as previous proposals for gradual typing, but we have chosen a design which favors efficiency. In contrast to inference-based systems, like types allow static checking of operations against an explicit, programmer-declared protocol. Notably, this allows catching spelling errors and argument type errors, which are simple and frequent mistakes. Furthermore, they make it possible to provide IDE support such as code completion.

To summarize, this paper makes the following contributions:

• A type system that incorporates
dynamic types, concrete types and like types to provide a way to integrate untyped and typed code. The separation of concrete and like types makes it possible to optimize concretely typed code, and retain flexibility in the rest of the program.
• A formalization of the type system in an imperative class-based object-oriented language; a proof of the standard theorems for typed subsets of the code; a formalization of a wrapper-less compilation scheme, and a proof of its adequacy.
• An implementation in the Thorn compiler that supports the type system and performs optimizations for concretely typed code.
• A report on an application of like types to evolve an untyped script into a partially typed program.

A technical report extended with proofs is available at http://moscova.inria.fr/~zappa/projects/liketypes.

2. Background and Motivating Example

This section introduces closely related work dealing with the integration of dynamically typed and statically typed code through a series of examples written in Thorn [6].

The Typing of a Point. In a language that supports rapid prototyping, it is sometimes convenient to start development without committing to a particular representation for data. Declaring a two-dimensional Point class with two mutable fields x and y and three methods (getX, getY, and move) can be done with every variable and method declaration having the (implicit) type dyn. Run-time checks are then emitted to ensure that methods are present before attempting to invoke them.

    class Point(var x, var y) {
      def getX() = x;
      def getY() = y;
      def move(pt) { x := pt.getX(); y := pt.getY() }
    }

As a first step toward assurance, the programmer may choose to annotate the coordinates with concrete types, say Int for integer, but leave the move method unchanged, allowing it to accept any object that understands getX() and getY(). The benefit of such a refactoring is that a compiler could emit efficient code for operations on the integer fields. As the argument to move is untyped, casts may be needed to ensure that values returned by
the getter methods are of the right type.

    class Point(var x:Int, var y:Int) {
      def getX():Int = x;
      def getY():Int = y;
      def move(pt) { x := (Int)pt.getX(); y := (Int)pt.getY() }
    }

Of course, this modification is disruptive to clients of the class: all places where Point is constructed must be changed to ensure that arguments have the proper static type. In the long run, the programmer may want more assurance for invocations of move(), e.g., by annotating the argument of the method as pt:Point. This has the benefit that the casts in the method’s body become superfluous. It has the drawback that all client code must (again) be revisited to add static type annotations on arguments, and it decreases the flexibility of the code, as clients may call move passing an Origin object.

    class Origin {
      def getX():Int = 0;
      def getY():Int = 0;
    }

While not a subclass of Point, and thus failing to type check, Origin has the interface required by the method. This is not unusual in dynamically typed programs. Part of the last issue could be somewhat mitigated by the adoption of structural subtyping [10]. This would lift the requirement that the argument of move be a declared subtype of Point and would accept any type with the same signature. Unfortunately, this is not enough here, as Origin is not a structural subtype either. The solution to this particular example is to invent a more general type, such as getXgetY, which has exactly the interface required by move.

    class getXgetY {
      def getX():Int;
      def getY():Int;
    }

This solution does not generalize: if it were applied systematically, it would give rise to many special purpose types with little meaning to the programmer. A combination of structural and intersection types is often the reasonable choice when starting with an existing untyped language such as Ruby, JavaScript or Scheme (see for example [17,33]), but such types add programmer burden, as a programmer must explicitly provide type declarations, and are brittle in the presence of small changes to the code. For these reasons, Typed Scheme is moving
from structural to nominal typing.¹

¹ Matthias Felleisen, presentation at the STOP’09 (Script to Program Evolution) workshop.

Soft Typing. A soft typing system in the tradition of Cartwright and Fagan [12] would infer a type such as getXgetY above without programmer intervention, thus obviating the need to litter the code with overly specific types. But soft typing is inherently brittle, as something as trivial as a spelling mistake in a method name will generate a constraint that will never be satisfied and is only caught when the method is actually used by client code. Also, inferred types can easily get unwieldy and hard to understand for a human programmer. Furthermore, the absence of type declarations means programmers will not have much help from their IDE. In terms of performance, run-time checks are eliminated when the compiler can show that an operation is safe. This makes the performance model opaque, as a small change in the code can have a large impact on performance simply because it prevents the compiler from optimizing an operation in a hotspot. The work on soft typing can be traced to early work by Cartwright [11] and directly influenced research on soft Scheme [36] and Lagorio et al.’s Just [2,23], bringing soft typing to Java.

Gradual typing. The gradual typing approach of Siek and Taha allows for typed and untyped values to commingle freely [26]. When an untyped value is coerced, or cast, to a typed value, a wrapper is inserted to verify that all further interactions through that particular reference behave according to the target type’s contract. At the simplest a wrapper is a cast T ⇐ R saying, intuitively, that the value was of type R and must behave as a value of type T. The number of wrappers is variable and can, in pathological cases, be substantial [19]. In practice, any program that has more than a single wrapper for any value is likely to be visibly slower. In the presence of aliasing and side-effects the wrappers typically cannot be discharged on the spot and have to be kept as long as the
value is live. The impact of this design choice is that any operation on a value may fail if that value is a dynamic type which does not abide by the contract imposed by its wrapper. Wrappers have to be manipulated at run-time and compiler optimizations are inhibited as the compiler has to emit code that assumes the presence of wrappers everywhere. Some of these problems may be avoided with program analysis, but there is currently no published work that demonstrates this.

To provide improved debugging support researchers have investigated the notion of blame control in the context of gradual typing [14,29,32,34]. The underlying notion is that concretely typed parts of a program should not be blamed for run-time type errors. As an example, let T be a type with a method m and x be a variable of type T. Now, if some object o, that does not understand m, is stored in x, blame tracking will not blame the call x.m() (which is correct as x has type T) for throwing a “message not understood” exception at run-time. Rather, it will identify the place in the code where o was cast to T. Fine-grained blame control requires that a reference “remembers” each cast it flows through, perhaps modulo optimizations on redundant casts. Storing such information in references and not in objects is key to achieving traceability, but incurs additional run-time overhead on top of the run-time type checks. The performance impact of blame tracking and its practical impact on the ability to debug gradually typed programs have not yet been investigated. We use the term gradual typing to refer to a family of approaches that includes hybrid typing [15] and that have their roots in the contract-based approach of [14,18].

3. A Type System for Program Evolution

In this paper we propose a type system for a class-based object-oriented programming language with three kinds of types. Dynamic types, denoted by the type dyn, represent values that are manipulated with no static checks. Dynamic types offer programmers maximal flexibility as any
operation is allowed, as long as the target object implements the requested method. However, dyn gives little aid to find bugs, to capture design intents, or to prove properties. At the other extreme, we depart from previous work on gradual typing by offering concrete types. Concrete types behave exactly how programmers steeped in statically typed languages would expect. A variable of concrete type C is guaranteed to refer to an instance of C or one of its subtypes. Concrete types drastically restrict the values that can be bound to a variable as they do not support the notion of wrapped values found in other gradual type systems. Concrete types are intended to facilitate optimizations such as unboxing and inlining as the compiler can rely on the static type information to emit efficient code. Finally, as an intermediate step between the two, we propose like types. Like types combine static and dynamic checking in a novel way. For any concrete type C, there is a corresponding like type, written like C, with an identical interface. Whenever a programmer uses a variable typed like C, all manipulations of that variable are checked statically against C’s interface, while, at run-time, all uses of the value bound to the variable are checked dynamically. Figure 1 shows the relations between types (dyn will be implicit in the code snippets). Full arrows indicate traditional subtype relations (so, for instance, if B is a subtype of A, then like B is a subtype of like A), dotted lines indicate implicit dyn casts, and finally, dashed lines show situations where like casts are needed.

In this paper, we have chosen a nominal type system, thus the subtype relation between concrete types must be explicitly declared by extends clauses. While we believe that our approach applies equally well to structural types, our choice is motivated by pragmatic reasons. Using class declarations to generate eponymous types is a compact and familiar (to most programmers) way to construct a type hierarchy. Moreover, techniques for generating efficient
field access and method dispatch code sequences for nominal languages are well known and supported by most virtual machines.

The first key property of like type annotations is that they are local. This is both a strength and a limitation. It is a strength because it enables purely local type checking. Returning to our example, like types allow us to type the parameter to move thus:

    def move(p:like Point) {
      x := p.getX();
      y := p.getY();
      p.hog();   #! Raises a compile time error!
    }

Declaring the variable p to be like a Point makes the compiler check all operations on that variable against the interface of Point. Thus, the call to hog would be statically rejected since there is no such method in Point. The annotation provides the static information necessary to enable IDE support commonly found in statically typed languages (but not in dynamic ones).

The second key property is that like types do not restrict flexibility of the code. Declaring a variable to be like C is a promise about how that variable is used, not about what values it can be bound to. For the client code, a like typed parameter is similar to a dyn. The question of when to test conformance between a variable’s type and the value it refers to is subtle. One of our goals was to ensure that the addition of like type annotations would not break working code. In particular, adding type annotations to a library class should not cause all of its clients to break. So instead of checking at invocation time, each use of a like typed variable is preceded by a check that the target object has the requested method. If the check fails, a run-time exception is thrown.

Figure 1. Type Relations. C and D are unrelated by inheritance. (Diagram relating B, like B, A, like A, dyn, C and D; edges are labelled as related by subtyping (<:), related by (dyn) cast, or related by (like) cast.)

Consider the Coordinate class, which is similar to Point, but lacks a move method:

    class Coordinate(var x:Int, var y:Int) {
      def getX():Int = x;
      def getY():Int = y;
    }

In our running example, if move expects a like Point, then calling move with a
Coordinate works exactly as in an untyped language. Even if Coordinate does not implement the entire Point protocol, it implements the relevant parts, the methods needed for move to run successfully. If it lacked a getY method, passing a Coordinate to move would compile fine, but result in an exception at run-time. More interestingly, move can also accept an untyped definition of Coordinate:

    class Coord(x, y) { def getX() = x; def getY() = y; }

Here, the run-time return values of getX and getY are tested against Int: invoking move with the argument Coord(1,2) would succeed, while Coord("a","b") would raise an exception. Observe that if Point used like Int, checking the return type would not be necessary as assigning to a like type always succeeds.

Interfacing typed and untyped code. Consider a call p1.move(p2) with different declared types for variables p1, p2 and pt (the type of the parameter in the move method). Depending on the static type information available on the receiver, different static checks are enabled, and different run-time checks are needed to preserve type-safety. We go through these in detail in Figure 2.

    p1           p2           pt           Result
    (any)        (any)        dyn          OK
    (any)        (any)        like Point   OK
    dyn          (any)        Point        OK
    Point        dyn          Point        ERR
    Point        like Point   Point        OK∗
    Point        Point        Point        OK
    like Point   dyn          Point        ERR
    like Point   like Point   Point        OK∗
    like Point   Point        Point        OK

Figure 2. Configurations of declared types. The column labeled Result indicates whether there will be a compile-time error. Note that the calculus is slightly more strict and requires explicit casts in cases labeled ∗.

Assume that the parameter pt in move has type dyn; then all configurations of receiver and argument are allowed and will compile successfully. In case the parameter has the type like Point, again, all configurations are statically valid. The last case to consider is when pt has the concrete type Point. In that case, there are several subcases that need to be looked at. If the receiver p1 is untyped, then, as expected, no static checks are possible. At run-time, we must consequently check that p1 understands the move method and if
so, that p2’s run-time type satisfies the type on the parameter in the move method. Since pt is Point, a subtype test will be performed at run-time. If the receiver p1 is a concrete type, the type of the argument p2 will be statically checked: if it is dyn, a compile-time error will be reported; if it is like Point, the compiler will accept the call and emit a run-time subtype test; if p2 is a Point, a straightforward typed invocation sequence can be emitted. Finally, the case where the receiver is declared like Point is similar to the previous case, with the exception that a run-time test is emitted to check for the presence of a move method in p1.

If move had some concrete return type C, invoking it on a like typed receiver would then check that the value returned from the method was indeed a (subtype of) C. If this cannot be determined statically, for instance if the actual method does not return a concrete type, then a type test is performed on the value returned. Calls with untyped receivers never need to type-check return values, as client code has no expectations that must be met. The concretely typed case follows from regular static checking.

Revisiting a previous example, consider a variant of move with a call to getY guarded by an if and assume that p is bound at run-time to an object that does not have a getY.

    def move(p:like Point) {
      x := p.getX();
      if (unlikely) y := p.getY();
    }

As the system only checks uses of p, the error triggers only if the condition is true. Some situations, which are hard to type in systems that perform eager subtype tests, e.g., at the start of the method call, work smoothly thanks to this lazy checking. As a result like types are not structural, but “semi-structural” since they only require the methods called to be present.

Code evolution. Like types provide an intermediate step between dynamic and concrete types. In some cases the programmer might want to replace like C annotations with concrete C annotations, but this is not always straightforward. The reason is the shift in notion of
subtype, from (a variant on) structural to nominal. Fortunately, studies of the use of dynamic features in practice in dynamically typed programs [4,20] suggest that many dynamic programs are really not that polymorphic. When this is the case, the transition is as simple as removing the like keyword. Changing a piece of code that is largely like typed to use concrete types imposes an additional level of strictness on the code. Consequently, stores from like typed (or dyn) variables into concretely typed variables must be guarded by type checks. The Thorn compiler inserts these checks automatically where needed and prints a warning to avoid suppressing actual compile-time errors. Notably, when accessing a concretely typed field or calling a method with concrete return type on a like typed receiver, the resulting value will be concretely typed. Subsequent operations on the returned value will enjoy the same strict type checking as all concrete values and can be compiled more efficiently than operations on like typed receivers.

In some cases, one can imagine going from typed code to untyped, for example to facilitate interaction with some larger untyped program, or to increase the flexibility in the code. Simply adding a like keyword in the relevant places, e.g., in front of types in the interface, or on key variables, immediately allows for a higher degree of flexibility without losing the local checking and still keeping the design intent in the code.

Compile-Time Optimizations. In Thorn, all method calls go through a dispatching function. With like types, three different dispatching functions are used to perform the necessary run-time checks described above. Every user-written method call is compiled down to one of those dispatching functions depending on the type information available at the call-site. The dispatching function used for untyped calls performs run-time type checks and unboxes boxed primitives. The like typed dispatching function checks that the intended method is actually present in the receiver and
has compatible types. The concretely typed dispatching function performs a simple and fast lookup, knowing that the method is present. Additionally, if the static type of the argument is a like type when some concrete type is expected, the Thorn compiler will insert a run-time type test and issue a warning.

Like types allow interaction with an untyped object through a typed interface and guarantee that operations that succeed satisfy the typing constraints specified in the interface. Consider the following code snippet that declares two cells, one for untyped content and one for integers:

    class Cell(var x) {
      def get() = x;
      def set(x') { x := x' }
    }
    class IntCell(var i:Int) {
      def get():Int = i;
      def set(j:Int) { i := j }
    }
    box = Cell(32);
    y = box.get();
    ibox:like IntCell = box;
    z:Int = ibox.get();
    ibox.set(z+10);

If ibox.get() succeeds, we statically know its return type to be an Int since the cell is accessed through a like typed interface. Subsequent operations on z enjoy static type checking and can be optimized, unlike uses of y. For example, the + operation on the last line can be compiled into machine instructions or equivalent, rather than a high-level method call on an integer object. Alternatively, the programmer might explicitly cast y to Int. However, typing the cell like IntCell type checks all interactions with the cell statically and gives static type information about what is put into and taken from it: this requires a single annotation at a declaration rather than casts spread all over the code.

Relating Like Types to Previous Work. Like types add local checking to code without restricting its use from untyped code. In contrast to gradual typing [3,19,26,29,30] and pluggable types [7], they introduce an intermediate step on the untyped–concretely typed spectrum and use nominal rather than structural subtyping. Furthermore, they only require operations to be present when actually used. As a result, operations on concrete types can be efficiently implemented and like types used where flexibility is
desired. Typed Scheme [32,33] uses contracts on a module level, rather than simple type annotations, and does not work with object structures. Soft typing [12] infers constraints from code, rather than letting programmers expressly encode design intent in the form of type annotations. Adding soft typing to Java [2,23] faces similar although fewer problems. An important difference between like types and gradual typing systems like Ob?<: [26] is that code completely annotated with like types can go wrong due to a run-time type error. On the other hand, code completely annotated with concrete types will not go wrong.

A perhaps unusual design decision is the lack of blame control. If a method fails, e.g., due to a missing method in an argument object, we cannot point to the place in the program that subsequently led to this problem. In this respect, the blame tracking support offered by like types is not much better than what is offered by a run-time typecast error. This is a design decision. Nothing prevents adding blame control to like types in accordance with previous work (e.g., [1,27]). The rationale for our design is to avoid performance penalties. Keeping like types blame-free allows for a wrapper-less implementation.

As part of the aborted ECMAScript 4 standard, Cormac Flanagan proposed a type system closely related to the one we present in this paper [16]. The Objective-C language has like types for objects and no concrete object types. Classes can be either dyn (called id) or like typed, and the compiler warns rather than rejects programs due to other language features that can make non-local changes to classes.

4. A Formalization of Like Types

To investigate the meta-theory of like types we define mini-Thorn, an imperative variant of FJ [21] extended with dyn and like types. Mini-Thorn is a language tailored to study the interaction between untyped and typed code. Compared to FJ, it lacks subexpressions but allows assignment, because aliasing, and understanding what happens when objects are accessed through
different views, is essential to our study. Mini-Thorn also lacks some features of the Thorn type system (like multiple inheritance or method overloading on arity). Extending the formalization would not be difficult but would take us away from the purpose of this section. The Thorn compiler checks source code without these restrictions.

Types. We denote class names by C, D, the dynamic type by dyn, and like types by like C where C is a class name.

    t ::=              types
        | C            class name
        | like C       like class C
        | dyn          dynamic

The distinguished class name Object is also the top of the subtype hierarchy. The function concr(t) tests if the type t is concrete, and returns true if t is a class name and false otherwise.

Programs. A program consists of a collection of class definitions plus a statement to be executed. A class definition

    class C extends D { fds; mds }

introduces a class named C with superclass D. The new class has fields fds and methods mds; a field is defined by a type annotation and a field name t f, while a method is defined by its name m, its signature, and its body: t m(t1 x1 .. tk xk) { s; return x }. Statements include object creation, field read, field update, method call, and cast. Fields are private to objects, and can be accessed only from an object’s scope. With an abuse of notation, we will consider lists of statements, rather than trees. We omit null-pointers: the only run-time errors of interest in this formalization are due to dynamic type-checks that fail. As a consequence, fields must be initialized at object creation.

    s ::=                          statements
        | skip                     skip
        | s1; s2                   sequence
        | this.f = x               field update
        | x = this.f               field read
        | x = y.m(y1 .. yn)        method call
        | x = new C(y1 .. yn)      object creation
        | x = y                    copy
        | x = (t) y                cast

Static semantics. Figure 3 defines the static semantics of mini-Thorn. Method invocation on an object accessed through a variable which has a dynamic type, e.g. x = y.m(y1 .. yn) where Γ ⊢ y : dyn, is trivially well-typed: all type-checks are postponed to run-time. On the contrary, as in FJ, if the variable y has a concrete type, e.g. Γ ⊢ y : C, then method
invocation can be statically type checked; the run-time guarantees that the objects actually accessed through y are instances of the class C (or of subclasses of C) and no run-time checks are needed. Type-checking method invocation boils down to ensuring that the method exists, that the actual arguments match the types expected by the method, and that the type of the result matches the type of the return variable. Observe that type-checking of values is performed only if the expected type is concrete, as in the hypothesis concr(ti) ⇒ Γ ⊢ yi <: ti; since any value can be stored in a like or dynamic typed variable, no static type-checking is required. Like types behave as contracts between variables and contexts: if a variable has a like type, e.g. Γ ⊢ y : like C, then a well-typed context uses it only as a variable pointing to an instance of the class C. Operations on such variables are then statically checked as if their type was concrete. However, in this case the run-time does not guarantee that the objects accessed are instances of the class C, and the conformance of the value actually accessed will be checked individually at each method invocation. These intuitions suggest that, even if it adds the overhead of redundant conformance checks, it is always safe to consider a variable of type C as a variable of type like C, as allowed by the subtyping rule C <: like C. Similarly, it is easy to see that the like constructor is covariant. Since fields are private to each object, the operations to read or update them are always made in a context where the type of this is known with great precision, and the type constraints can be checked statically. The other rules are unsurprising.

The subtyping relation <: is the reflexive and transitive relation closed under the rules below.

    class C extends D { fds; mds }
    ------------------------------
    C <: D

    C <: Object

    C1 <: C2
    --------------------
    like C1 <: like C2

    C <: like C

Method lookup functions are inherited from FJ.

    C = Object
    ------------------
    mtype(m, C) = ⊥

    class C extends D { fds; mds }    t m(t1 x1 .. tk xk) { s; return x } ∈ mds
    ---------------------------------------------------------------------------
    mtype(m, C) = t1..tk → t

    class C extends D { fds; mds }    m ∉ mds
    -----------------------------------------
    mtype(m, C) = mtype(m, D)

    class C extends D { fds; mds }    t m(t1 x1 .. tk xk) { s; return x } ∈ mds
    ---------------------------------------------------------------------------
    mbody(m, C) = x1..xk.s; return x

    class C extends D { fds; mds }    m ∉ mds
    -----------------------------------------
    mbody(m, C) = mbody(m, D)

The typing judgment for statements, denoted Γ ⊢ s, relies on the environment Γ to record the types of the local variables accessed by s. We write Γ ⊢ x <: t as a shorthand for Γ(x) = t

    ... nC md1 .. C mdn    class C extends D { C1 f1 .. Ck fk; md1 .. mdn }

Figure 3. The type system

Dynamic semantics. At run-time objects live in the heap and are referenced via pointers p. Different variables can have different views of the same object; for instance, the variables x : C, y : like D and z : dyn might be aliases and refer to the same object stored at location p. The dynamic semantics keeps track of a variable's view of an object using wrapped pointers (also called stack-values and denoted by sv). So the stack-value of z is (dyn) p while that of y is (like D) p. No wrapper is needed for x, whose stack-value is just the pointer p.

The dynamic semantics is then defined as a small-step operational semantics over configurations. A configuration consists of a heap H of locations p mapped to objects C(f1 = sv1; ..; fn = svn) and of a stack S of activation records F1 | s1 ... Fn | sn
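To make the notion of stack-values concrete, the following Python sketch models the three views x : C, y : like D and z : dyn of a single heap object. It is only an illustration under assumptions: the names Like, Dyn, StackValue and stack_value, and the dictionary encoding of the heap and the frame, are hypothetical and do not come from the Thorn implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Like:
    """The type `like C`; reused here as the (like C) wrapper."""
    cls: str


@dataclass(frozen=True)
class Dyn:
    """The type `dyn`; reused here as the (dyn) wrapper."""


# A static type is a concrete class name (str), Like(...) or Dyn().
Type = Union[str, Like, Dyn]


@dataclass(frozen=True)
class StackValue:
    """A pointer together with the view (wrapper) a variable has of it."""
    wrapper: Optional[Union[Like, Dyn]]  # None means an unwrapped pointer
    pointer: int


def stack_value(static_type: Type, pointer: int) -> StackValue:
    """Build the stack-value for a variable with the given static type."""
    if isinstance(static_type, (Like, Dyn)):
        return StackValue(static_type, pointer)
    return StackValue(None, pointer)  # concrete type: bare pointer


# Heap H: location p -> (class name, field values).
# One object of class C lives at location 0.
heap: Dict[int, Tuple[str, Dict[str, StackValue]]] = {0: ("C", {})}

# Frame F for the aliases x : C, y : like D and z : dyn from the text.
frame = {
    "x": stack_value("C", 0),
    "y": stack_value(Like("D"), 0),
    "z": stack_value(Dyn(), 0),
}

for name, sv in frame.items():
    print(name, sv)
```

All three variables hold the same pointer; only the wrapper differs. Note that y views the object as like D although the heap object is an instance of C: the wrapper records the view and says nothing about the object's actual class.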

)
H | F | x = y.m(y1..yn); s S −→ H | [] [x1 → sv1 .. xn → svn] [this → p] | s0; return x0 F | x = cast(t) ret; s S

[REDCALLDYN]
    F(y) = (dyn) p
    ptype(H, p) = C
    mbody(m, C) = x1..xn.s0; return x0
    mtype(m, C) = t1..tn → t
    F(y1) = w1 p1 .. F(yn) = wn pn
    ∀i. concr(ti) ⇒ svtype(H, wi pi) <: ti
    sv1 = [[t1]] p1 .. svn = [[tn]] pn
    ------------------------------------------------------------
    H | F | x = y.m(y1..yn); s S −→ H | [] [x1 → sv1 .. xn → svn] [this → p] | s0; return x0 F | x = (dyn) ret; s S

Figure 4. Dynamic semantics

1. if a variable x has a concrete type C, then its stack-value will always be an unwrapped pointer p and the pointer will always point in the heap to a valid object of type (or subtype of) C;
2. if a variable x has type like C, then its stack-value will always be a (like C) p wrapped pointer; there is no guarantee about the type of the object pointed to by p in the heap;
3. if a variable x has type dyn, then its stack-value will always be a (dyn) p wrapped pointer; there is no guarantee about the type of the object pointed to by p in the heap.

To preserve this invariant across reductions, operations on objects must perform different checks according to their view (that is, the wrapper stored in the stack-value) of the object.

Suppose that a stack-value w p must be stored in a local variable x (or in an object field). Let t be the static type of x. If t is some concrete type C, then the static semantics and the run-time invariant guarantee that the type of the stack-value w p is compatible with C: this implies that the wrapper w is empty and p points to an object of type D for D <: C. In this case, the link x → p can be safely stored in the stack, and the invariant is preserved. If t is like C (resp. dyn), then any pointer can be used to build a valid stack-value for x, provided that it is wrapped properly in a (like C) (resp. (dyn)) wrapper. The appropriate wrapper is built by the function [[t]] when the type t is known, or, in some cases, copied from the old stack-value of x.

For instance, when a new object is created (rule [REDNEW]), its fields are initialized with stack-values that are built by wrapping (if needed) the
actual arguments according to the field types. Field update goes along similar lines. Field read illustrates a subtlety: when executing x = this.f the static type of x is not easily accessible. However, since the variable x is in scope, its current stack-value already reflects the view that the variable has of objects. In particular, if the static type of x was like C (resp. dyn), then its current stack-value contains a (like C) (resp. (dyn)) wrapper. The semantics simply updates the pointer, bundling it with the older wrapper (a cast is built from a wrapper by the function w2c). Since fields are private to each object, the type of the enclosing object is known precisely (the variable this always has a concrete type), and no extra care is required to check the type constraints.

Invoking a method (say x = y.m(y1..yn)) requires more care, as different checks and actions must be performed according to the view that the variable y has of the object. The first condition of the three rules for method invocation tests exactly this: whether the object is accessed directly or via a like or dyn wrapper.

If it is accessed directly (rule [REDCALL]), that is if F(y) = p, then the run-time invariant guarantees that the object on which the method is called is an instance of the class statically checked. The static semantics guarantees that the method m exists. Let t1..tn → t be the type of the method; if some ti is a concrete type, then the static semantics also guarantees that the actual argument yi is an instance of ti, and no run-time type checks are needed. If ti is like or dyn, then the actual arguments are wrapped with [[ti]], and, again, no run-time checks are performed. The run-time allocates a new stack-frame to evaluate the body of the method in an environment where the actual arguments are bound to the method parameters and this points to the object itself. The return value is passed to the previous stack-frame via the distinguished variable ret. If the return value must be stored in a variable that has like or dyn type, then a cast (computed
from the previous stack-value of the return variable) around ret ensures that the new stack-value will be properly wrapped. Observe that this rule boils down to standard FJ method invocation if the type of the method m does not involve like or dyn types.

If the object is accessed via a (dyn) wrapper, that is if F(y) = (dyn) p (rule [REDCALLDYN]), then the run-time must verify that the method exists (contrary to the previous case, the condition mbody(m, D) = ... might fail), and that the actual arguments are compatible with the types expected by the method, via the condition svtype(H, wi pi) <: ti (this check is performed only if ti is a concrete type; otherwise the arguments are simply wrapped according to ti). Also, the returned pointer is bundled in a (dyn) wrapper via a cast, since there are no static guarantees about the use that the context makes of the returned pointer.

If the object is accessed through a (like C) wrapper, that is if F(y) = (like C) p (rule [REDCALLLIKE]), then the arguments of the method call have been statically type-checked against the type of the method m in class C (). The run-time must then check that a method of name m exists in the object actually accessed (which can be an instance of some class D not related to C), and that its type is compatible with the type of m in C (via the condition ∀i. ti <: t′i). This strict type matching is not required when t′i is of type dyn, as the argument will be wrapped with (dyn) anyway. The return value must be wrapped (via casts) not only to the type of the return variable, but also to the return type t of the method m in class C.

Mini-Thorn's dynamic semantics does not rely on chains of wrappers and every reference to an object goes through at most one wrapper. Upcast of class types is always allowed: when the cast is evaluated no new wrappers are added, and the previous one (if any) is discarded (rule [REDCASTCLASS]). A cast to a concrete type that is not a super-type of the type of the object fails. Casting to a like type like C (resp. to dyn) always succeeds (rule [REDCASTOTHER]); the run-time forgets the previous wrapper (if any) and inserts a (like C) (resp. a (dyn)) wrapper.

The rule for copy of a variable is straightforward, while the return x instruction simply deallocates the current stack-frame and stores the stack-value of x in the distinguished ret variable of the previous stack-frame.

4.1 Meta-theory

A configuration is well-typed if it satisfies the run-time invariant informally described above. The invariant relates the static type of each variable to the stack-value and heap object it can refer to, and we show that well-typed configurations are closed under reductions.

The environment Σ associates class names C to pointers p, and it records the concrete types of the objects in the heap. We then define a type relation for stack-values:

    Σ(p) = D    D <: t
    ------------------
    Σ ⊢ p : t

    Σ(p) = E    D <: C
    ---------------------------
    Σ ⊢ (like D) p : like C

    Σ(p) = C
    ---------------------
    Σ ⊢ (dyn) p : dyn

The key property of this relation is that either it reflects the wrapper of the stack-value (imposing no conditions on the actual object accessed), or, if no wrappers are present, the concrete type of the object actually accessed. A heap H is then well-typed in Σ if:

    Σ ⊢ H    p ∉ dom(H)    Σ(p) = C    fields(C) = t1 f1 .. tn fn    Σ ⊢ w1 p1 : t1 .. Σ ⊢ wn pn : tn
    -------------------------------------------------------------------------------------------------
    Σ ⊢ H [p → C(f1 = w1 p1; ..; fn = wn pn)]

and the definitions of well-typed stack and well-typed configuration (denoted Σ;Γ ⊢ H | S) follow accordingly:

    Γ(x) = t    Σ ⊢ w p : t    x ∉ dom(F)    Σ;Γ ⊢ F
    -------------------------------------------------
    Σ;Γ ⊢ F [x → w p]

    Σ;Γ ⊢ F    Γ ⊢ s    Σ;Γ ⊢ H | S
    --------------------------------
    Σ;Γ ⊢ H | F | s S

(we omit the trivial rules for empty heap and stack).

THEOREM 1 (Preservation). If Σ;Γ ⊢ H | S and H | S −→ H′ | S′, then there exist Σ′ and Γ′ such that Σ, Σ′; Γ, Γ′ ⊢ H′ | S′.

Given a well-typed program, it is easy to see that the initial configuration of the program is well-typed. This guarantees that variables with a concrete type C will only point to unwrapped objects which are instances of class C: it is safe to optimize accesses to such variables at compile time.

We can also show that no type-related run-time errors can arise from operations on variables which have a concrete type. We say that a configuration H | F | s S is stuck if s is non-empty and no reduction rule applies; stuck configurations capture the state just before a run-time error.

THEOREM 2 (Progress). If a well-typed configuration Σ;Γ ⊢ H | F | s S is stuck, that is H | F | s S ↛, then the statement s is of the form x = y.m(y1..yn); s

= [[Γ, F1]] | [[Γ, s1]] .. [[Γ, Fn]] | [[Γ, sn]]

Figure 6. Compilation of method invocation and of configurations

dynamic semantics relied on wrappers to determine the correct reduction rule for method invocation: since the wrapper information can be derived from the static types, it is possible to determine for each method invocation the right behavior statically.

We can then define the three different dispatchers for method invocation, identified by a dispatch label, denoted d, which is either empty, or (like C), or (dyn). With an abuse of notation, we use the same syntax for wrappers and dispatch labels, and rely on the function [[t]] described above to compute the appropriate dispatch label for a given type. These dispatchers implement the three reduction rules for method invocation. A method invocation can then be compiled down to the invocation of the right dispatcher for the given static type, and wrappers can be erased from stack-values altogether.

Consider the target language defined by the grammar below:

    s ::=                       statements
      | skip                    skip
      | s1; s2                  sequence
      | this.f = x              field update
      | x = this.f              field read
      | x = y @d m(y1..yn)      method dispatch
      | x = new C(y1..yn)       object creation
      | x = y                   copy
      | x = (t) y               cast

In the semantics of the target language, stack-values are always unwrapped pointers: the stack simply associates variables to pointers. The reduction rules for method dispatch are reported in Figure 5; the reduction rules for the other constructs are inherited from the source language simply by erasing all the wrappers.

The compile function, denoted [[−]], transforms well-typed source statements into target statements, and more generally well-typed source configurations into target configurations. The compile function, described in Figure 6, is the identity on statements except for method invocation. Method invocation is compiled into the invocation of the appropriate dispatch function, according to the static type of the variable pointing to the object. Compilation of configurations compiles all the statements in all the
stack-frames, and discards all the wrappers.

We can show a simulation result between the behavior of well-typed source configurations and the behavior of compiled configurations.

THEOREM 3 (Compilation). Let Σ;Γ ⊢ H | S be a well-typed source configuration:
1. if H | S −→ H′ | S′, then [[Γ, H | S]] −→ [[Γ, H′ | S′]];
2. conversely, if [[Γ, H | S]] −→ H′ | S′, then there exists a well-typed source configuration Σ′;Γ′ ⊢ H″ | S″ such that H | S −→ H″ | S″ and [[Γ′, H″ | S″]] = H′ | S′.

The Thorn implementation is built upon this wrapper-less compilation strategy.

Generics. Like types extend nicely to a language that features bounded parametric polymorphism. Following FGJ, let X range over type variables and let concrete types CN ::= C⟨T1..Tn⟩, non-variable types N ::= CN | like CN | dyn and types T ::= N | X. The key design decision is to restrict bounds in class definitions to concrete types:

    class C⟨X1 ◁ CN1 .. Xk ◁ CNk⟩ ◁ N { fds; mds }

where ◁ abbreviates extends. If the programmer specifies a bound different from Object, as in List⟨X ◁ Foo⟩, then the parameter can only be instantiated by concrete types and the usual FGJ type guarantees are recovered. If the type Object is instead specified as a bound, as in List⟨X ◁ Object⟩, then the parameter can be instantiated with any type, including dyn or like C. This guarantees ease of reuse of the List class.

5. Experience with Program Evolution

We report on our experience using the proposed type system. We implemented support for like types in our Thorn compiler, which runs on top of a standard JVM. Method calls go through a dispatching function that is used to implement dispatch in the presence of multiple inheritance, which the JVM does not support. Which dispatch function a call is routed through is determined by the type information available at the call-site. The concretely typed function simply handles lookup, as it assumes that the run-time types of arguments are correct. The untyped function additionally performs run-time type checks for arguments to any concretely typed parameters. A call to a method m on a like typed receiver x goes through a dispatching function that checks that x has an m with the correct types before proceeding. As a consequence of this design, calls with varying degrees of typing are handled differently. Type checks are performed not at the call-site but as part of the dispatching function in the receiver object. Consequently, adding type information to some class does not require recompilation of client code to take advantage of the type
checking. Run-time type checks will be carried out as part of the untyped dispatching function and, just as in Java for example, changing concrete types in the interface can break client code that was compiled assuming other types in the interface.

5.1 Types In Libraries

Thorn's libraries constitute a first interesting test case. To document design intents, the initial library implementation included comments that described the expected types of the functions. From these comments, we refactored libraries to use a mixture of like types and concrete types. The choice of appropriate type annotations for the interfaces of a class calls for a compromise between flexibility on one side and safety and performance on the other. Most of Thorn's libraries have like typed interfaces for maximal flexibility. Most return types are either like typed or dynamically typed. This is unsurprising since like types primarily provide local guarantees. Return values that were locally created generally have a known (concrete) type.

When adding types to untyped code, we found that it is key that the effects of adding the types do not propagate "too far." As an example, consider the following two classes, declared in separate files:

    class A { def p(x) = println(x); }
    class B { a = A(); def q(s) { a.p(s) } }

At a later date, the class A is replaced with a typed one, obtaining:

    class A { def p(x: String) = println(x); }

Despite the type annotation, the signatures of the untyped and like typed dispatching functions in A are unchanged. The first additionally performs a type test on the x argument to make sure it is a String. Thus, old bytecode generated from the untyped code in B will still work with A without recompilation. Notably, concretely typed and like typed code will call a dispatching function that does not need to test run-time types of arguments.

Thorn supports a notion of pure classes that create immutable objects. A pure class is a functional immutable data type and many of the standard library data types are pure, e.g. Int, Float and String. To increase the flexibility, we
could, although we have not implemented this, allow value classes to be automatically augmented with a parallel, like typed version, e.g.:

    class Int: Value {
      def +(x: Int): Int = ...
    }

can be compiled into

    class Int: Value {
      def +(x: Int): Int = ...
      def +(x: like Int): like Int = ...
    }

where the second method overloads the first to allow + etc. to be called on any argument. This facilitates interaction between typed and untyped code, and is safe since pure classes do not modify state. Notably, x + "foo" when x : Int is still rejected statically. Following this practice would allow us to annotate most basic data types in the standard library with concrete types for speed while enjoying the flexibility of like types.

5.2 Refactoring an Untyped Program

We ported a dynamic program to Thorn along with its libraries and gradually added type annotations to it. The application we chose was Pwyky [25], a simple wiki of about 1,000 lines of Python. Pwyky relies on a generic parser module that was also ported (another 1,000 lines). This allowed us to investigate the interaction between library and user code annotations. The application is representative of scripting language code and relies on patterns that are inherently hard to type, such as using the same variable for values of different types depending on some run-time test. The ported program, "Thyky," is about the same size as the initial Python program, including libraries (2,000 LOC). Once we had an untyped version of Thyky running, we set out to gradually add type annotations to it. To illustrate this process, consider the function upos in the ParserBase class. The function upos is called when moving from a position i in the parsed string to some position j to update the fields lineno and offset of the ParserBase:

    class ParserBase(var lineno, var offset, var rawdata) {
      def upos(i, j) {
        if (i >= j) return j;
        nlines = count(rawdata.slice(i, j), "\n");
        if (nlines != 0) {
          lineno := lineno + nlines;
          offset := j;
        } else offset := offset + j-i;
        return j;
      }
    }

As a first step, we added like type annotations in a
naive and straightforward way. The result was the following:

    class ParserBase(var lineno: like Int,
                     var offset: like Int,
                     var rawdata: like String) {
      def upos(i: like Int, j: like Int): like Int {
        if (i >= j) return j;
        nlines: Int = count(rawdata.slice(i, j), "\n");
        ... # identical
    }

Even if the types are simple data types, there is a rationale behind the choice of like types. ParserBase was the first class we annotated. This class is extended by the class HTMLParser and in turn by the class Wikify, which at that time were still untyped: concrete types would have caused a number of implicit type tests to be inserted and a large number of warnings, since methods in ParserBase were called with untyped arguments. With the like type annotations, the type checker is now able to verify that code in ParserBase respects the declared types for lineno, offset and rawdata. The variable nlines is concretely typed: the count function returns an Int and, since nlines is internal to upos, there is no extra need for flexibility here. We added like type annotations to HTMLParser and all the code in Thyky in the same fashion. At this point we had an untyped and a like-type annotated version of each file; it was possible to compile and link the annotated version of one file against the untyped version of the other, and all the eight possible combinations worked as expected.

Figure 7. Performance comparison between Typed Thorn, dynamic Thorn, Python 2.5.1 and Ruby 1.8.6 on the spectral-norm, mandelbrot and fannkuch benchmarks, normalized on the Python timings (vertical axis: running speed relative to Python 2.5.1). Typed Thorn notably runs the benchmarks between 2x and 4x the speed of Python.

Following the annotations above, we attempted to harden the type annotations of classes such as ParserBase by turning the like-type annotations into concrete types. This often amounted to removing the like keywords from field declarations, while we kept the like type annotations for the arguments of function upos
to allow calls to upos from untyped contexts without forcing run-time type tests and to retain the flexibility of dynamic typing. This meant that we had to rewrite the assignment to the, now concretely typed, offset field to make a type test on j, written as follows in Thorn:

    offset := (Int) j orelse Int(j); # cast or convert to int

This line of code tries to cast j to an Int and, if that fails, tries to create a new integer value from j. (The orelse keyword executes its RHS if the LHS throws an exception.)

Our exercise revealed a bug in the original Python program that had survived the port to Thorn. In the code below, the variable s always contains a string at run-time, but was used in the following test in Python:

    if (s < 10):
        area = s[0:pos+10]
    else:
        area = s[pos-10:pos+10]

The test (a comparison between a string and an integer) is nonsensical, but nevertheless valid Python code, and always returns false. As soon as we added a type annotation to s, our type-checker caught the problem.

5.3 Optimizing Thorn

To demonstrate the value of concrete types, we present timing data from running three simple benchmarks ported from [13]. The original code was ported straight from Python and thus did not have any type annotations (reported as dynamic Thorn). We subsequently added type annotations to the parameters of the critical functions. Running the programs side by side, we observed significant speed-ups in the typed version. To give some perspective to these numbers, we present our timed runs in relation to the C implementation of Python (2.5.1) and also include the C implementation of Ruby (1.8.6), two relatively similar class-based object-oriented scripting languages. The data is presented in Figure 7. It should be noted that all the library code used by our Thorn programs is untyped. Typed versions of the libraries are currently being written and should further improve performance. As shown in Figure 7, Typed Thorn runs the benchmarks between 2x and 4x faster than Python and between about 3x and 6x faster than dynamic Thorn. The Ruby implementation is the
slowest by far and is outperformed by a factor of 7x to 12x by Typed Thorn.

Dissecting Spectral-Norm. As demonstrated in Figure 7, adding type annotations to programs in Thorn can cause significant speed-ups. Let us look at the spectral-norm benchmark [13] in additional detail. We naively translated the existing Python implementation into Thorn. Inspecting the code, we found the following frequently executed function:

    def a(i, j) = 1.0/(((i + j) * (i + j + 1) >> 1) + i + 1);

With boxed Thorn primitives this line of code creates 8 new instances of Int or Float, causing the method to execute slowly. The compiled Thorn code for this function is 87 bytecode instructions, of which 8 are invokeinterface. Adding concrete type annotations to the method's arguments brought the number of objects created down to 1, the (untyped) return value:

    def a(i: Int, j: Int) = ...

Moreover, the produced bytecode for the calculation is equivalent to that of Java: 18 instructions. This speed-up should not come as a surprise. After all, we have added concrete type annotations to allow the compiler to generate optimized code for 32-bit integer values. But we note that with a traditional gradual typing system, it would not be possible to achieve this due to the need to account for wrappers (or structural subtyping). Now, let's examine what would happen if we typed the arguments with like types:

    def a(i: like Int, j: like Int) = 1.0/(((i + j) * (i + j + 1) >> 1) + i + 1);

As the methods + and >> in class Int are annotated to accept like Int and return Int (a reasonable choice with respect to interaction with mixed-typed code), the highlighted additions above would be method calls, and the rest would be optimized into operations on primitive values. Notably, no unboxing of primitive values and no new creation of boxed integer objects are needed. Clearly, the like typed approach produces more efficient bytecode than the untyped one.

6. Conclusions

We presented a type system designed to allow the gradual integration of values of dynamic and static types in the same
programming language. Our design departs from the majority of previous work, which takes an existing dynamic language as a starting point and insists that the type system be somehow backwards compatible with legacy untyped code. In our view, the drawback of those works is that the static type system is necessarily weak and fails to rule out run-time errors or permit compiler optimizations. Our proposal puts the dynamic and static parts of a program on equal footing. While dynamically typed code is unconstrained, we guarantee that statically typed code does not experience run-time type errors. By separating (semi-)structural like types from concrete types, we are able to treat the latter more strictly and as a consequence apply compiler optimizations to the generated code. Like types interact very well with untyped code; in particular, adding like type annotations will never cause working code to fail due to type errors.

Choosing nominal subtyping for the statically typed part of our design is in line with all modern object-oriented languages. But our decision to reuse type names without requiring structural subtyping for like types is more controversial. We argue that it is in line with the design philosophy of scripting languages, namely to minimize programmer effort. Like types do not require the scripting language programmer to declare new types while a program is being migrated from untyped to typed; instead they let them reuse existing types even if these types are only an approximation of the "right" ones. As such, a program with like type annotations already requires more "finger typing" (the pejorative term for inserting type information used by the scripting community) than completely untyped code. But the additional effort is small and optional, as it is always possible to interact with a like typed library without writing a single type annotation and the library will still enjoy some safety and speed-up. Like types are good for documentation and traceability. Although they impose weaker constraints on
behavior than in a language such as Java, like typed code will be forced to evolve as the referenced types evolve. A consequence of the definition of like types is that exactly what subset of the operations of a type is used within a method is not visible on the outside. From a dynamic typing perspective, this is positive as it decreases coupling and makes code more modular. This is similar to the Smalltalk pattern of encoding type information in variable names [5] and essentially the same reasoning that is used by programmers in object-oriented scripting languages such as Ruby and Python, except that like types give machine-checked hints to go on.

The proposed type system is being co-designed with a new programming language called Thorn. The advantage of co-designing the type system with the language is that we can focus on key issues without having to fight corner cases of the language definition, as would have been the case if we had picked Java or C# as a starting point. Some simplifications that we have allowed ourselves include ruling out method overloading on parameter types (typically supported in statically typed languages) as well as addition of fields and methods (typically supported in dynamically typed languages). Nevertheless we believe that one could add like types and dyn to other static languages and obtain much of the same benefits we have outlined in this paper.

Acknowledgments. We thank the entire Thorn team: Brian Burg, Gregor Richards, Bard Bloom, Nate Nystrom, and John Field. This work was partially supported by ONR grant N140910754 and ANR grant ANR-06-SETI-010-02.

References

[1] Amal Ahmed, Robert Bruce Findler, Jacob Matthews, and Philip Wadler. Blame for all. In Script to Program Evolution (STOP), 2009.
[2] Davide Ancona, Giovanni Lagorio, and Elena Zucca. Type inference for polymorphic methods in Java-like languages. In Italian Conference on Theoretical Computer Science (ICTCS), 2007.
[3] Christopher Anderson and Sophia Drossopoulou. BabyJ: From object based to class based programming via types. Electronic Notes in Theoretical Computer Science, 82(7), 2003.
[4] John Aycock. Aggressive type inference. In International Python Conference, 2000.
[5] Kent Beck. Smalltalk: Best Practice Patterns. Prentice-Hall, 1997.
[6] Bard Bloom, John Field, Nathaniel Nystrom, Johan Östlund, Gregor Richards, Rok Strniša, Jan Vitek, and Tobias Wrigstad. Thorn – robust, concurrent, extensible scripting on the JVM. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), 2009.
[7] Gilad Bracha. Pluggable type systems. OOPSLA04, Workshop on Revival of Dynamic Languages, 2004.
[8] Gilad Bracha and David Griswold. Strongtalk: Typechecking Smalltalk in a production environment. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), 1993.
[9] Patrick Camphuijsen, Jurriaan Hage, and Stefan Holdermans. Soft typing PHP. Technical report, Utrecht University, 2009.
[10] Luca Cardelli. Structural Subtyping and the Notion of Power Type. In Symposium on Principles of Programming Languages (POPL), 1988.
[11] Robert Cartwright. User-defined data types as an aid to verifying LISP programs. In International Colloquium on Automata, Languages and Programming (ICALP), pages 228–256, 1976.
[12] Robert Cartwright and Mike Fagan. Soft Typing. In Conference on Programming language design and implementation (PLDI), 1991.
[13] The Computer Language Benchmarks Game. http://shootout.alioth.debian.org/.
[14] Robert Bruce Findler and Matthias Felleisen. Contracts for higher-order functions. In International Conference on Functional Programming (ICFP), 2002.
[15] Cormac Flanagan. Hybrid type checking. In Symposium on Principles of Programming Languages (POPL), 2006.
[16] Cormac Flanagan. ValleyScript: It's like static typing. Technical report, UC Santa Cruz, 2007.
[17] Michael Furr, Jong hoon An, Jeffrey Foster, and Michael Hicks. Static type inference for ruby. In Symposium in Applied Computing (SAC), 2009.
[18] Kathryn E. Gray, Robert Bruce Findler, and Matthew Flatt. Fine-grained interoperability through mirrors and contracts. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), pages 231–245, 2005.
[19] David Herman, Aaron Tomb, and Cormac Flanagan. Space-efficient gradual typing. In Trends in Functional Programming (TFP), 2007.
[20] Alex Holkner and James Harland. Evaluating the dynamic behaviour of Python applications. In Australasian Computer Science Conference (ACSC), 2009.
[21] Atsushi Igarashi, Benjamin C. Pierce, and Philip Wadler. Featherweight Java: a minimal core calculus for Java and GJ. ACM Transactions on Programming Languages and Systems, 23(3), 2001.
[22] Adobe Systems Inc. ActionScript 3.0 Language and Components Reference, 2008.
[23] Giovanni Lagorio and Elena Zucca. Just: Safe unknown types in Java-like languages. Journal of Object Technology, 6(2), 2007.
[24] Sven-Olof Nyström. A soft-typing system for Erlang. In Erlang Workshop, 2003.
[25] Sean B. Palmer. Pwyky (A Python Wiki).
[26] Jeremy Siek and Walid Taha. Gradual typing for objects. In European Conference on Object Oriented Programming (ECOOP), 2007.
[27] Jeremy Siek and Philip Wadler. Threesomes, with and without blame. In Script to Program Evolution (STOP), 2009.
[28] Jeremy G. Siek. Gradual Typing for Functional Languages. In Scheme and Functional Programming Workshop, 2006.
[29] Jeremy G. Siek, Ronald Garcia, and Walid Taha. Exploring the design space of higher-order casts. In European Symposium on Programming (ESOP), 2009.
[30] Jeremy G. Siek and Manish Vachharajani. Gradual typing with unification-based inference. In Symposium on Dynamic languages (DLS), 2008.
[31] Ed Stephenson. Perl Runs Sweden's Pension System. O'Reilly Media, 2001.
[32] Sam Tobin-Hochstadt and Matthias Felleisen. Interlanguage migration: From scripts to programs. In Symposium on Dynamic languages (DLS), 2006.
[33] Sam Tobin-Hochstadt and Matthias Felleisen. The design and implementation of Typed Scheme. In Symposium on Principles of Programming Languages (POPL), 2008.
[34] Philip Wadler and Robert Bruce Findler. Well-typed programs can't be blamed. In European Symposium on Programming (ESOP), 2009.
[35] Ulf Wiger. Four-fold increase in productivity and quality. In Workshop on Formal Design of Safety Critical Embedded Systems, 2001.
[36] Andrew K. Wright and Robert Cartwright. A practical soft type system for Scheme. In Conference on LISP and Functional programming, pages 250–262, 1994.