Then I analyze the method d(x). From analyzing it by itself, we can infer that x must be some sort of number. We do this by what looks like simulating x + 2 and realizing that, for that expression to be valid, x must be a number. I'm not quite sure how to implement the type inference here, or whether it uses symbolic evaluation too.

But then we get to the function call a(1). To typecheck / infer types for d(x) in this case, we have to somehow traverse down the tree of functions, simulating how x is transformed along the way. The analysis finds that x is divided by 2 somewhere in there, so it goes from integer to float. Then we check that against our original assumption that d(x) takes a number, and agree that the call is valid.
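The per-function part of this idea can be sketched as constraint generation: walk the body of d(x), record what each operation demands of its operands, and thereby learn the type of x. The expression encoding, the toy type names, and the "division yields float" rule below are all hypothetical, not part of any real checker.

```python
def infer(expr, env, constraints):
    """Return the type of expr, recording constraints on variables as we go.
    Expressions are tuples: ("var", name), ("lit", value), or (op, a, b)."""
    kind = expr[0]
    if kind == "var":
        return env.setdefault(expr[1], "?")          # unknown until constrained
    if kind == "lit":
        return "int" if isinstance(expr[1], int) else "float"
    if kind in ("+", "/"):
        for sub in expr[1:]:
            infer(sub, env, constraints)
            if sub[0] == "var":
                constraints.append((sub[1], "number"))  # operand must be numeric
        return "float" if kind == "/" else "number"  # toy rule: / always floats
    raise ValueError(kind)

# Body of d(x) is x + 2: simulating it forces x to be a number.
env, cs = {}, []
t = infer(("+", ("var", "x"), ("lit", 2)), env, cs)
print(t)    # number
print(cs)   # [('x', 'number')]
```

A real inferencer would then solve the accumulated constraints rather than just collecting them, but this captures the "simulate x + 2, conclude x is a number" step from the question.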

That is just me roughly trying to figure out how to do type checking / type inference.

I'm wondering two things:

Whether you need to do some sort of symbolic evaluation to do type checking / inference. If so, any suggestions on resources or ways to better understand it?

Say we have a gigantic app with millions of lines of code, and between a(x) and d(x) there are 500 function calls doing all sorts of things to x. Do we have to simulate that entire process to figure out whether x ends up with a valid type, or can we limit the scope and take some sort of shortcut? If we had to traverse hundreds of functions for every variable, that would be a ton of evaluation and would be slow. So I'm wondering how to limit the scope of the search when doing type checking / inference.

Basically I am trying to figure out how to do type checking / inference. The resources I've found are mostly on the lambda calculus, which I am not too familiar with and which, I would imagine, works differently from an imperative program.

After each method has been converted into its intermediate
representation, zscript gradually builds a type graph for each
method called by the program.
The CPA is non-iterative: only the methods that could
potentially be called are processed, and (except for templates)
they are processed only once.
In our implementation of the algorithm, we use a work list. First,
the constructor of the class containing the main method, and the
main method itself, are added to the work list. Then, while the
work list is not empty, the methods on the work list are
processed. During the processing of a method, more methods
may be added to the work list. The algorithm terminates when no
more methods remain to be analyzed.

2 Answers

Symbolic execution has nothing to do with type inference

Symbolic execution is used for static analysis of programs, and is a special case of abstract interpretation. The symbolic executor supplies a symbolic input value and traces the program's execution on that symbolic input.

Type inference is a much simpler problem to state: it is the task of automatically deducing the types of a program's expressions. No symbolic execution of the program is involved.

Type inference is usually undecidable

For a dynamically typed language, type inference is not decidable. Moreover, even if we have a static type system, type inference is usually still undecidable.

The only (?) type system we know of with decidable type inference is the Hindley-Milner type system, i.e. System F restricted to Rank-1 types (plus a few conservative extensions). It is known that H-M can type imperative programs (such as Standard ML), as long as you give good typing rules. When I was an undergraduate, we did an exercise in which we implemented H-M type inference for a restricted, strongly typed version of Scheme (which was imperative). It was a lot of fun, and it also shows that type inference is possible for imperative programs.

I suggest that you start reading on Hindley-Milner and Algorithm W, as a first step (see, e.g., Chapter 22 of Types and Programming Languages by Benjamin C. Pierce). I also suggest that you do more basic reading and learning before asking on CS.SE, as that would make this site work much better. The theory of programming languages is not like coding and must be studied systematically if you want to have a solid grasp of it :-)
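The heart of Algorithm W is unification: making two types equal by solving for their type variables. Here is a minimal sketch, not a full H-M implementation; the encoding (lowercase strings as type variables, tuples like `("->", arg, ret)` for function types) is an assumption of this example.

```python
def resolve(t, subst):
    """Follow substitutions until t is no longer a bound type variable."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst):
    """Extend subst so that t1 and t2 become equal, or raise TypeError."""
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 == t2:
        return subst
    if isinstance(t1, str) and t1.islower():   # t1 is a free type variable
        subst[t1] = t2
        return subst
    if isinstance(t2, str) and t2.islower():
        subst[t2] = t1
        return subst
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):               # unify componentwise
            subst = unify(a, b, subst)
        return subst
    raise TypeError(f"cannot unify {t1} with {t2}")

# Applying a function of type a -> a to an Int: what is the result type b?
s = unify(("->", "a", "a"), ("->", "Int", "b"), {})
print(resolve("b", s))   # Int
```

A production implementation would add an occurs check and let-polymorphism, but unification is the step that answers questions like "what must x be?" without any execution.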

Secondly, suppose there is one and only one definition of the + function, and its type signature is (Number, Number) -> Number, i.e. + takes two Numbers as input and returns a Number as output.

Based on this assumption, whenever we see an invocation of +, say x + 2, we can correctly infer that x is of type Number, since there is only one definition of +.

However, if we have two definitions of +, as seen in the function table below:

In this case it would still be possible to identify the type of x in the expression x + 2, because the first argument of every definition of + is a Number.

But if the expression is 2 + x, it would be impossible to determine the type of x exactly, as it may be a Number or a String; if the language supports union types, the type of x can be inferred as Number|String.
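This overload lookup can be sketched directly. The second overload's signature, (Number, String) -> Number, is an assumption chosen to match the Number-or-String outcome described above; the result type of that overload is hypothetical.

```python
# Overload table for +: each entry is (parameter types, result type).
PLUS_OVERLOADS = [
    (("Number", "Number"), "Number"),
    (("Number", "String"), "Number"),   # hypothetical second overload
]

def infer_arg(position, known_other):
    """Collect the possible types of the argument at `position`, given the
    type of the argument at the other position, across all + overloads."""
    possible = set()
    other_pos = 1 - position
    for params, _result in PLUS_OVERLOADS:
        if params[other_pos] == known_other:
            possible.add(params[position])
    return possible

# x + 2: x is the first argument, 2 is a Number -> x must be Number.
print(infer_arg(0, "Number"))           # {'Number'}
# 2 + x: x is the second argument -> x may be Number or String.
print(sorted(infer_arg(1, "Number")))   # ['Number', 'String']
```

With union types, the second result would simply be reported as Number|String rather than being an error.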

Not sure why the downvote, but it is a bit off topic, since this is just describing one problem you face when typechecking, whereas I'm asking about typechecking in general.
– Lance Pollard Jul 19 '18 at 18:55