Note that gantt.ml uses project_id but contains no reference to
types.ml where that label is defined. When a bug occurred, it
was non-trivial to fix due to the difficulty of finding the
appropriate definition of project_id. This bug caused a lot of
confusion among less experienced OCaml programmers, and I think
that it is a very bad idea to allow it.

- It only works "in the presence of type information". This is an
unspecified criterion that most users will probably find
confusing.

- It doesn't work with GADTs. If the constructor in question is
declared using GADT syntax then it cannot be used out of scope.

- It can't work with exceptions.

- It won't work with "open types" (0005584) or "pattern synonyms"
which may be added to the language in future.

- It will confuse users by adding a small element of structural
typing to an otherwise nominative system. I think that most
users picture variant constructors as named values, and all
named values in OCaml obey the same scoping rules. To suddenly
allow these values to be used as if they were not named values,
but structural values (like the methods in an object) will
confuse people.

In return for these issues, all that we gain is the ability to
refer to out of scope constructors and labels without using a
module name, but only under certain difficult to define
conditions.

The last time this issue was discussed, only four people appeared
to take part. I think perhaps it would be a good idea
to get wider feedback before including a potentially risky
feature like this in a release.

It is also worth pointing out that it would be easier to add this
feature in the future than to remove it later if it causes
problems.

> It allows references to constructors to be used in files that contain no reference to the file defining those constructors.

This is not really new; the same happens, e.g., with objects. The problem, for me, is more general: ocamlc should not be allowed to open foo.cmi when compiling bar.ml if bar.ml does not somehow mention Foo. At LexiFi, we have a local patch to do that, and it ensures that "ocamldep" returns (by construction) a safe upper bound of the dependencies.

Also, I guess that proper tooling (based on -annot or -binannot) should mitigate the bad user experience you mention.

----

What will happen if we remove the feature? Assume we have two modules Ast and TypedAst which define two variant types with the same constructors. In the type checker, we map from one to the other, and it's quite nice to be able to write:
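A minimal sketch of the kind of mapping meant here (not the original snippet), assuming hypothetical constructors Int and Add in both modules and a hypothetical function name annotate. Warning 40 is silenced explicitly so that the out-of-scope selection is a visible, deliberate choice:

```ocaml
[@@@warning "-40"] (* allow out-of-scope constructors selected by type *)

module Ast = struct
  type expr = Int of int | Add of expr * expr
end

module TypedAst = struct
  type expr = Int of int | Add of expr * expr
end

(* Neither module is opened. The two annotations tell the type checker
   which Int and Add are meant on each side of the arrow. *)
let rec annotate (e : Ast.expr) : TypedAst.expr =
  match e with
  | Int n -> Int n
  | Add (a, b) -> Add (annotate a, annotate b)

let () =
  assert (annotate Ast.(Add (Int 1, Int 2)) = TypedAst.(Add (Int 1, Int 2)))
```

Without the annotations on annotate, neither Int nor Add could be resolved, since no constructor of that name is in scope.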

Also, if we turn warning 40 into an error, people will be strongly encouraged to use "open" more often. I consider "open" a dangerous (even though quite useful) feature in itself, and being able to avoid it as much as possible is an advantage. In particular, the "out of scope" style permits nice library APIs, like:

If we disallow the "out of scope" style, either library authors will be encouraged to use structural types (and then we don't gain any readability), or users will use "open" more often. The local open mitigates the problems with "open", but it is not a panacea.

I do agree with Alain on this point. The whole approach of doing constructor disambiguation that is based on having multiple conflicting constructors dumped into the same scope strikes me as an unscalable mess. Leo's point that this is all very hard to reason about is perhaps right. But without this feature, constructor disambiguation as a whole also strikes me as fairly low-value. From my perspective, it will change how we write programs at Jane Street almost not at all.

And, if this feature doesn't make it, we should figure out a way of doing a constructor-only local open. What I typically want is to get access to the constructors and only the constructors for a given type. There's no clean way of doing this now.

That isn't really the same. Objects are structurally typed and
methods are essentially constants. So for example:

let f o = o#m + 3

always means the same thing and always has a type. Whereas

let f r = r.m + 3

only has a type in the presence of a definition of the m label
with type int.
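The contrast can be checked in a single file. This is a sketch; the record type name r is an assumption made for the illustration:

```ocaml
(* Structural: this type-checks with no declaration in scope, and any
   object with a method m returning int is accepted. *)
let f_obj o = o#m + 3

(* Nominal: the definition of f_rec below only type-checks because a
   record type with a label m is declared, and in scope, first. *)
type r = { m : int }

let f_rec r = r.m + 3

let () =
  assert (f_obj (object method m = 1 end) = 4);
  assert (f_rec { m = 1 } = 4)
```

Moving the declaration of r below f_rec makes the file fail to compile, whereas f_obj needs no declaration at all.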

> Also, I guess that proper tooling (based on -annot or -binannot) should mitigate the bad user experience you mention.

Only if the bug has left the correct type information in place,
otherwise they will not be able to help. For instance, changing a
type annotation in one file can cause an "Unbound label" error in
another file, with no easy way to work out which definition the
label originally referred to.

> Also, if we turn warning 40 into an error, people will be strongly encouraged to use "open" more often. I consider "open" as a dangerous (even though quite useful) feature in itself, and being able to avoid it as much as possible is an advantage.

I certainly agree that "open" needs improvements to make it
safer (allowing explicit signatures on them would be a good
start). However, the problems with open should be addressed by
solutions that apply to all values, not to a subset of values
that (to the average user) will appear completely arbitrary.

> The whole approach of doing constructor disambiguation that is based on having multiple conflicting constructors dumped into the same scope strikes me as an unscalable mess.

This is essentially what we do with regular values. The only difference for labels and constructors is that you cannot rename them, hence the need for record disambiguation.

> And, if this feature doesn't make it, we should figure out a way of doing a constructor-only local open. What I typically want is to get access to the constructors and only the constructors for a given type. There's no clean way of doing this now.

I agree, allowing something like:

open Foo.t

to get local access to t and all of its constructors would be very convenient.
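Until something like that exists, a comparable effect can be approximated by isolating the type in its own submodule and opening only that. A sketch, where Foo, its submodule T, size and unrelated are all hypothetical names:

```ocaml
module Foo = struct
  (* The type lives in its own submodule so that it can be opened alone. *)
  module T = struct
    type t = Leaf | Node of t * t
  end
  include T

  let rec size = function
    | Leaf -> 1
    | Node (l, r) -> size l + size r

  (* A value that a plain open of Foo would bring into scope. *)
  let unrelated = 42
end

let unrelated = "local"

(* Opening Foo.T exposes t, Leaf and Node, and nothing else from Foo,
   so unrelated still refers to the local string. *)
let tree, tag =
  let open Foo.T in
  (Node (Leaf, Node (Leaf, Leaf)), unrelated)

let () =
  assert (Foo.size tree = 3);
  assert (tag = "local" && Foo.unrelated = 42)
```

The cost is that the library author has to anticipate this use and structure the module accordingly, which is the part a dedicated syntax would remove.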

In addition, providing:

open (M: S)

that only adds the elements of M mentioned in the signature S into the environment would make local opens much safer.
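The effect of such a restricted open can be sketched with existing constructs by first binding the module under the signature and then opening that binding. M, S, M_only_s and helper are hypothetical names used only for this illustration:

```ocaml
(* A module exporting more names than the client wants in scope. *)
module M = struct
  let x = 1
  let y = 2
  let helper = 100
end

(* The names the client actually wants. *)
module type S = sig
  val x : int
  val y : int
end

(* Constrain M by S, then open the constrained module. *)
module M_only_s = (M : S)

let helper = 0

(* x and y come from M; helper is still the local one. *)
let sum =
  let open M_only_s in
  x + y + helper

let () =
  assert (sum = 3);
  assert (M.helper = 100)
```

With a plain `let open M in`, helper would silently resolve to M.helper and sum would be 103.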

Leo, I don't understand your point.
The warning is already enabled by default.
It means that using this feature is a deliberate choice.
You may have to take extra care, but this is because you decided to do so.

If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.
Note that it is already pretty easy to obtain this information by adding a type annotation.

On the different points you mention, I see nothing very convincing.
* On your concrete example, why couldn't you just get the type of proj?
This is enough to know everything needed.
Was the warning disabled? If not, ignoring warnings is bad coding.
* "presence of type information" is well-defined if you use -principal; even otherwise it is relatively predictable.
Moreover this argument is against type-based ambiguity resolution as a whole, it is not directly related to scope.
* it doesn't work with GADT only by choice; it could work if needed
* it doesn't work for exceptions, but exceptions have a different semantics
* we already had the discussion on nominal vs. structural, and personally I don't buy it, but again nobody forces anybody to use this feature

> The warning is already enabled by default.
> It means that using this feature is a deliberate choice.

Programmers tend to just ignore warnings that they don't understand, which is what happened in the case I described.

> If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.

This would certainly help. The message should also be clearer.

> On your concrete example, why couldn't you just get the type of proj?
> This is enough to know everything needed.

It is unusual to have to solve an "unbound value" error by looking at the types. It did not occur to the programmers in question to even consider this. (It barely occurred to me).

This problem was also exacerbated because the program was being edited by some people using trunk and some others using 4.00.1. The improved error messages from trunk might have made the problem clearer. I think Hongbo mentioned similar problems in another bug report.

> "presence of type information" is well-defined if you use -principal; even
> otherwise it is relatively predictable.

Only to those of us who understand the type system. Currently, we are saying:

"If your value is a constructor and there is type information available and
it is not a GADT and it is not an exception, then you don't need to write
the module name"

I think that most programmers will just approximate this to

"Sometimes you don't need to write the module name"

> it doesn't work with GADT only by choice; it could work if needed

Then it definitely should. The type information criterion is confusing enough without taking into account the syntax used to define the constructor.

It feels surprising that a value of type Types.Project.project cannot be created without type annotations and/or explicit pathing of the record fields, but one can sometimes access record fields with no annotation or pathing.

The reasoning makes sense but the rules seem difficult to explain, particularly to someone new to the language:

(* ... but this does not work (no type information) even though it
looks very similar to the 'print_float' line above... *)
let f_not_ok p = p.x;;
(* ... except that we were missing an annotation for this 'p' *)
let f_annotation_ok (p : M.t) = p.x;;
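A self-contained version of the annotated case, as a sketch. The definition of M is not shown in this thread, so a hypothetical module M with a record type t holding a single float field x is assumed. Warning 40 is silenced explicitly to make the out-of-scope access a deliberate choice:

```ocaml
[@@@warning "-40"] (* allow out-of-scope labels selected by type *)

module M = struct
  type t = { x : float }
end

(* Rejected if uncommented: x is not in scope and nothing says p : M.t.

   let f_not_ok p = p.x
*)

(* Accepted: the annotation tells the type checker where to look for x. *)
let f_annotation_ok (p : M.t) = p.x

let () = assert (f_annotation_ok { M.x = 1.5 } = 1.5)
```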

- if out-of-scope label access was not possible, there would be no such confusion
- if out-of-scope label access is considered the correct, usual case (without the warning 40), then "p.x" in the second program below is ambiguous, and the type-checker should at least warn when selecting the candidate "t"; the presence of this warning would make it clear what the problem is
- currently out-of-scope label access has an intermediary status, so it doesn't really make sense to mark an ambiguity when there is only one choice in scope (and more outside), and that would result in too many warnings to pay for a feature that the user may not use

I would personally support marking the out-of-scope access warning (40) as an error by default, and enlarging the set of ambiguity candidates (for warning 41) with out-of-scope candidates when warning 40 is not an error.

$ ocaml -principal
....
# let f p = p.M.x +. p.y;;
Warning 40: y was selected from type M.t.
It is not visible in the current scope, and will not
be selected if the type becomes unknown.
Warning 18: this type-based field disambiguation is not principal.
val f : M.t -> float = <fun>

I don't see that much asymmetry between the two cases...
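The transcript can also be tried as a compiled file. The following is a sketch, assuming that M declares a record type t with float fields x and y (its definition is not shown here). It is written with an intermediate binding, which is not in the transcript, so that the qualified access is unambiguously checked before the unqualified one:

```ocaml
[@@@warning "-40"] (* the warning printed in the transcript, silenced here *)

module M = struct
  type t = { x : float; y : float }
end

(* p.M.x fixes the type of p to M.t. The later p.y then finds the label y
   through that type, even though y is not in scope. *)
let f p =
  let a = p.M.x in
  a +. p.y

let () = assert (f { M.x = 1.0; y = 2.0 } = 3.0)
```

Swapping the two accesses, so that the unqualified p.y comes first, leaves the type checker with no type information at that point and the label is reported as unbound.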

I may be repeating myself, but ignoring warnings, even for beginners,
is not the right attitude. I hope that the new message is self-explanatory.
Warnings are not errors because they are warnings. This does not mean
you should ignore them.

If you really want to use labels out of scope, the right options are
ocaml -principal -w -40 -warn-error +18

Your other comment about hard to understand type propagation makes
sense (yet your examples don't look hard to understand).
But then the only solution is to disable all use of inferred type
information. All your examples can be rewritten without out of scope
access, and exhibit exactly the same problem.

Does this mean that warning 42 should be enabled by default?
Or something limited to type disambiguation (42 also covers field set
disambiguation)? This could be useful as a reminder that -principal
is needed for a precise definition.

Does all this mean that we should have -principal as the default? And warning 40 as an error by default?

As for Gabriel's suggestion of making the behaviour of the type-checker dependent on whether a warning is error-enabled, that will be over my dead body. If you want to do something like that, make up a new option.

I still have some concerns about using constructors from files that are not explicitly referenced. Alain mentioned that Lexifi's version of OCaml does not allow an ml file to have an unmentioned module as one of its imports. Maybe we could provide a warning for such cases.

> Alain mentioned that Lexifi's version of OCaml does not allow an ml file to have an unmentioned module as one of its imports. Maybe we could provide a warning for such cases.

A warning detecting exactly those cases where our modification would fail is difficult to do. In many cases, disallowing a .cmi to be loaded (because its module is not mentioned in the source code being compiled) does not lead to an error: the compiler tries to expand a type abbreviation and since the .cmi cannot be loaded, it considers it as an abstract type, but this works fine (the code doesn't actually require the expansion to be type-checked). This is not to say that the warning would not be useful as well...

> The typing of string constants is not principal because of the overloading with formats. Is there an easy way to make the compiler output a warning for the first expression with -principal?

Yes, basically we just need to check that the expected type "format6" is generalizable.
This is done in trunk at revision 14523.

# fun b -> if b then format_of_string "x" else "y";;
Warning 18: this coercion to format6 is not principal.
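The overloading itself can be observed without -principal. A small sketch, where pick is a hypothetical name:

```ocaml
(* The literal "y" is typed as a format, not as a string, because the
   first branch has already fixed the result type to a format6. *)
let pick b = if b then format_of_string "x" else "y"

let () =
  assert (string_of_format (pick true) = "x");
  assert (string_of_format (pick false) = "y")
```

That pick false can be passed to string_of_format at all shows that the second branch was given a format type rather than string.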

Sorry for not answering earlier.

As for how important -principal is, the performance obstacle seems to be the main problem.
Alain tells me that he couldn't use it on the Lexifi code base because it was too slow.
In general I would say that you don't need to use it, but if you start to get unstable behavior in your code, it can help you to make it more robust, and give you a better intuition of where annotations are needed.