D has plenty of semantic annotations, but sometimes they are not
handled in a strict enough way, so their usefulness (in catching
bugs) is reduced.
Microsoft catches plenty of bugs statically with SAL2:
http://msdn.microsoft.com/en-us/library/hh916382.aspx
In this program D gives an error:
class Foo {
immutable int[] x;
this() {}
}
void main() {}
test.d(3,5): Error: constructor test.Foo.this missing initializer
for immutable field x
D could give a similar error here, for an out argument that is
not assigned inside the function "foo":
void foo(out int x) {}
void main() {}
A possible error message:
test.d(1,21): Error: uninitialised out argument of 'test3.foo'
function
This analysis is basic, so this code does not raise that error,
even though arr[1] is not initialized:
void foo(out int[2] arr) {
arr[0] = 1;
}
Another case where the compiler will not give that 'uninitialised
out argument' error, even though arr is not always initialized:
void foo(in bool b, out int[2] arr) {
if (b)
arr = [1, 2];
}
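What actually happens at runtime today: the out array is reset to .init on entry, and any element the function does not assign silently keeps that value. A small runnable sketch using the first foo above:

```d
void foo(out int[2] arr)
{
    arr[0] = 1; // arr[1] is never assigned
}

void main()
{
    int[2] a = [7, 7];
    foo(a);
    // The whole array was reset to .init (all zeros) on entry,
    // then arr[0] was set:
    assert(a == [1, 0]);
}
```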
When and if the D language and compiler add some form of flow
analysis, the analysis for out could become stricter and reject
the preceding two examples.
This is a breaking change, so if desired this error could be
phased in through a warning-deprecation-error trajectory.
I think that most existing D functions with out arguments that
are not initialized inside the function are already buggy, so I
think this change will not introduce a lot of breakage in
existing D code.
Bye,
bearophile

But it is not uninitialized. All out parameters are default
initialized to their .init value.<
I don't agree with this opinion, as they *are* initialized.<

The documentation of Microsoft SAL2, several discussions, and
some other things I've read tell me that if you want your
compiler to catch some of your bugs statically (and optimize
well) it needs to be strict. Being strict often means following a
narrow semantics that forbids some rarely useful but correct
programs and allows only the large percentage of remaining ones.
D has more built-in annotations than most other languages I know,
and I like to use them a lot, but their return on investment is
not always very high. If an annotation allows the compiler to
catch some bugs or optimize better, or helps the programmer
reason in a simpler way about the code, then the annotation is
giving something back.
I remember discussions about D "pure" not being very useful to
the GDC back-end to optimize the code.
Back to the topic of the "out" arguments discussed here, it's
true that the out argument gets initialized at function entry, so
the current implementation of out is correct. But we should look
at the sum of advantages and disadvantages. If I write a function
with an out argument I may want to initialize it inside the
function, or I may want just the default initialization;
currently both of the following functions are allowed and usable in D:
void foo(out int x) {}
void bar(out int x) { x = 10; }
How often do I need to write a function like foo? I think
it's uncommon. Most times I add an "out" argument I want to write
a function like bar that puts something into x.
On the other hand, sometimes I want to write a function like bar,
but by mistake I end up writing a function like foo: I forget to
initialize the out argument inside the function. I don't know how
common this kind of bug is (out arguments are quite rare in my
code, so I don't have many statistics), but it can be costly.
So is it better to catch the bugs caused by the programmer
forgetting to initialize an out argument, and forbid functions
like foo (which I think are not common), or is it better to allow
functions like foo because they are sufficiently common to
justify the reduced strictness of the out argument semantics?
Looking at the cost-benefit analysis, I am willing to restrict
the language and disallow functions like foo, because they are
not common in my code, and in turn catch statically some bugs
where I forget to initialize the out argument manually (though I
don't know how common such bugs are; that makes this analysis
built on hot air, so you are free to dismiss this whole thread
because of it).
If this little breaking change is introduced in D, I am then
forced to write foo like this:
void foo2(out int x) { x = 0; }
Is this a price small enough to pay for that increase in
strictness?
Bye,
bearophile

What if x is an "optional" out parameter, e.g. something you only
set if other conditions are met? Do you want to make sure that x
is at least assigned to once, or rather make it an error to have
a control path that *doesn't* assign anything to it?
Either way:
1. Making it an error to have a control path that doesn't assign
to x would be counterproductive, as the result would probably
end up being code that looks like:
void foo(out int x) {
    x = 0;
    ...
}
which would defeat the entire point of out.
2. Checking that the variable is at least used would be kind of
the same as checking for unused arguments. I think that'd be
fine, provided you could override the warning by not naming your
variable:
void foo(out int x) {} // Error, x is never assigned to (or used)
void foo(out int) {}   // OK!
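The "optional out" pattern described above can be sketched in D like this (tryParse is a hypothetical name, not a Phobos function):

```d
import std.conv : to;

// Hypothetical "optional out" function: x is assigned only on the
// success path; on failure the caller sees int.init (0), which D
// guarantees by resetting the out argument at function entry.
bool tryParse(string s, out int x)
{
    try
    {
        x = s.to!int; // assigned only on this path
        return true;
    }
    catch (Exception)
    {
        return false; // x keeps its default-initialized value
    }
}

void main()
{
    int value;
    assert(tryParse("42", value) && value == 42);

    int other = 99;
    assert(!tryParse("abc", other));
    assert(other == 0); // reset to .init at entry, never reassigned
}
```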


Today a function with a return value but without a return
statement is already an error. Why not return ReturnType.init
instead? It's absolutely the same.

But it is not uninitialized. All out parameters are default initialized to
their .init value.<
I don't agree with this opinion, as they *are* initialized.<

The documentation of Microsoft SAL2, several discussions, and some other things
I've read tell me that if you want your compiler to catch some of your bugs
statically (and optimize well) it needs to be strict.

Microsoft SAL2 is not about D, and D's 'out' has nothing to do with strictness.
Frankly, I think D's parameter classes and array slices are far
simpler and more effective than SAL2.

How often do I need to write a function like foo?
I think it's uncommon.

I suspect that is a guess.

If this little breaking change is introduced in D, I am then forced
to write foo like this:
void foo2(out int x) { x = 0; }
Is this a price small enough to pay for that increase in strictness?

I see it as annoying and nothing to do with 'strictness'. D default
initializes all variables that don't have an explicit initializer.
This is normal for D, and is a nice convenience. There's no reason
that 'out' should behave differently.
It's also a correctness feature - the default initializer is NOT
always 0, and if you're writing generic code you'd be making a
mistake to use 0. You'd have to initialize it to 'typeof(x).init',
which is a good way of punishing users :-)


Frankly, I think D's parameter classes and array slices are far
simpler and more effective than SAL2.<

I have not used SAL so I can't tell. It has non-null compile-time
tests, and tests that the _whole_ span of an array is copied or
written, which look nice and are missing in D.

How often do I need to write a function like foo?
I think it's uncommon.

I suspect that is a guess.

I've written a ton of D1/D2 code, and functions like foo are very
uncommon in my code. I could gather statistics on my code.

I see it as annoying and nothing to do with 'strictness'.<

"out" arguments are return values of a function, so the following
are logically the same:
int foo() { return 0; }
void foo(out int x) { x = 0; }
Here strictness means respecting that basic semantic meaning of
the function. So accepting a function like this, because we assume
x gets initialized at the entry point, is less strict:
void foo(out int x) {}
And it's not annoying for me, because I almost never write
functions with out arguments that I don't want to initialize
inside the function. (But perhaps other people write code
differently; I'd like to know what they do.)

D default initializes all variables that don't have an explicit
initializer. This is normal for D, and is a nice convenience.
There's no reason that 'out' should behave differently.<

Yes, this change in out semantics would break that more general
rule. But I think when you use an out argument you are asking for
a meaningful return value. Not initializing a variable because
you need a zero is common, but you usually don't call a function
with an out argument because you want an init value. So I think
the two cases are sufficiently different.

It's also a correctness feature - the default initializer is NOT
always 0, and if you're writing generic code you'd be making a
mistake to use 0. You'd have to initialize it to
'typeof(x).init', which is a good way of punishing users :-)<

Right. But how often do you want to initialize an out argument to
its .init inside a function?
--------------------
I think Tobias Pankrath is saying that accepting code like this:
void foo(out int x) {}
is just like the compiler accepting code like this:
int foo() {}
and rewriting it like this:
int foo() { return typeof(return).init; }
Currently the compiler doesn't perform that rewrite, and instead
it gives a missing return value error.
D even gives errors if you do this:
int foo(in bool b) {
if (b)
return 1;
}
So it asks for all cases to return something or assert(0).
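The standard way to satisfy that rule for the snippet above is to make the remaining path explicit, for example:

```d
int foo(in bool b)
{
    if (b)
        return 1;
    assert(0); // tells the compiler the fall-through path is unreachable
}

void main()
{
    assert(foo(true) == 1);
}
```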
Bye,
bearophile

I have not used SAL so I can't tell. It has nonnull compile-time
tests, and tests of copy or write of the _whole_ span of an
array, that look nice and are missing in D.

I have no idea what that would be 'nice' for, except some optimizations. Nor do
I think that, in general, such an attribute is mechanically checkable. If it is
not checkable, it may wind up being the source of obscure bugs.

I suspect that is a guess.

I've written a ton of D1/D2 code, and functions like foo are very
uncommon in my code. I can do a statistic on my code.

That would be your personal coding style (and there's nothing wrong with that),
but it is not evidence for correctness.

But I think when you use an out argument you are asking for
a meaningful return value. Not initializing a variable because
you need a zero is common, but you usually don't call a function
with an out argument because you want a init value. So I think
the two cases are sufficiently different.

The assumption that a .init value is not meaningful is completely arbitrary.

I think Tobias Pankrath is saying that accepting code like this:

You are correct that they are logically the same. But I don't believe that at
all implies that they are equivalent in error-proneness.

I have not used SAL so I can't tell. It has non-null compile-time
tests, and tests that the _whole_ span of an array is copied or
written, which look nice and are missing in D.

I have no idea what that would be 'nice' for, except some
optimizations.<

If your function is supposed to initialize or modify every item
of an array, with that annotation you can express this semantic
fact (with _Out_writes_all_(s); such long annotations are built
by concatenating different pieces almost freely). Then if by
mistake you don't write to all the array items (for example you
have missed the last item because of a wrong for-loop interval),
you have introduced a bug. If the static analysis is able to
catch this, you have avoided one bug.
This is an example of code that contains a bug that SAL2 is able
to spot:
http://msdn.microsoft.com/en-us/library/hh916383.aspx
wchar_t * wmemcpy(
    _Out_writes_all_(count) wchar_t *dest,
    _In_reads_(count) const wchar_t *src,
    size_t count)
{
    size_t i;
    for (i = 0; i <= count; i++) { // BUG: off-by-one error
        dest[i] = src[i];
    }
    return dest;
}
Once the LDC2/GDC2 compilers are able to optimize range-heavy D
code very well, we will be able to avoid many raw loops in D, and
avoid such silly off-by-one errors. (Currently they are not yet
at this level; this was shown by the recent contest based on the
K-Nearest Neighbour code, where Rust optimized range-like code
much better than D, for still unknown reasons. The Rust community
has also improved the Rust compiler a bit more on the basis of
the results of that benchmark, unlike us.)

Nor do I think that, in general, such an attribute is
mechanically checkable.<

I have not used SAL1/SAL2 so I don't know. I know that verifying
the SAL annotations statically takes quite a bit more time than
compiling the program. I have read that SAL is good and widely
used by Microsoft, and that it helps catch and avoid a large
number of bugs in device drivers and similar Windows software.
And I know SAL2 is the second version of this annotation language
(so any design mistakes of SAL1 are now fixed, and they have
probably removed useless parts), and I know that its designers
are intelligent and that Microsoft employs some of the best
computer scientists in the world. So I don't know how well SAL2
is able to statically check annotations like this one, but I
assume it can do so in many important cases.

The assumption that a .init value is not meaningful is
completely arbitrary.<

What I was trying to say is that I think most times you don't
want or need to return a .init value; you want to return
something different.

But I don't believe that at all implies that they are equivalent
in error-proneness.<

OK, and I agree on this.
I think the D "out" arguments are a bit of a failure: they are
rarely used, and I think returning a tuple (like in Python,
Haskell, Go, Rust, etc.) is usually safer and looks cleaner. So I
think that once D gets a built-in syntax to manage tuples well,
the usage of out arguments will become even less common.
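The tuple-returning style can already be sketched today with Phobos's std.typecons.Tuple (this is the library tuple, not the hoped-for built-in syntax; parseInt and the field names "ok" and "value" are illustrative):

```d
import std.conv : to;
import std.typecons : Tuple, tuple;

// Tuple-returning alternative to 'bool tryParse(string s, out int x)':
// success flag and value travel together in one return value.
Tuple!(bool, "ok", int, "value") parseInt(string s)
{
    try
    {
        return tuple!("ok", "value")(true, s.to!int);
    }
    catch (Exception)
    {
        return tuple!("ok", "value")(false, 0);
    }
}

void main()
{
    auto r = parseInt("123");
    assert(r.ok && r.value == 123);
    assert(!parseInt("xyz").ok);
}
```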
Bye,
bearophile

The article says a code analysis tool 'could' catch the bug, not that any
actually does. (SAL2 in itself is not a tool.) I suspect that such analysis
is not possible in general, as it runs into the halting problem.

we can avoid many raw loops in D, and avoid such
silly off-by-one errors.

By rewriting the loop as:
foreach (i,v; src)
dest[i] = v;
or even:
dest[] = src[];
such silly off-by-one problems disappear. I don't see that SAL2 has much to
offer D, and I think it tries to solve these issues in a clumsy, complex yet
insufficient manner.
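A runnable D rendering of that point (wmemcpyD is an illustrative name): the slice copy carries the length with it, so there is no index variable to get wrong, and a mismatched length fails loudly at runtime (with bounds checking on) instead of corrupting memory.

```d
wchar[] wmemcpyD(wchar[] dest, const(wchar)[] src)
{
    dest[] = src[]; // lengths must match; checked at runtime
    return dest;
}

void main()
{
    wchar[4] buf;
    assert(wmemcpyD(buf[], "abcd"w) == "abcd"w);
}
```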

I have not used SAL1/SAL2 so I don't know. I know that verifying the SAL
annotations statically takes quite a bit more time than compiling the
program. I have read that SAL is good and widely used by Microsoft, and
that it helps catch and avoid a large number of bugs in device drivers and
similar Windows software. And I know SAL2 is the second version of this
annotation language (so any design mistakes of SAL1 are now fixed, and they
have probably removed useless parts), and I know that its designers are
intelligent and that Microsoft employs some of the best computer scientists
in the world. So I don't know how well SAL2 is able to statically check
annotations like this one, but I assume it can do so in many important cases.

I've seen nothing yet that shows SAL2 is inherently superior to D's
mechanisms for avoiding bugs. Microsoft's engineers are undoubtedly smart
and competent, but saying their results are therefore superior is a logical
fallacy. Smart and competent people often devise faster horses rather than
inventing cars. (And I'm no exception.)

The assumption that a .init value is not meaningful is completely arbitrary.<

What I was trying to say is that I think most times you don't want or need
to return a .init value; you want to return something different.

That argument applies equally well to all initializations; it is not unique to
'out'. I don't see why 'out' should be special in this regard.

I think the D "out" arguments are a bit of a failure: they are rarely used,
and I think returning a tuple (like in Python, Haskell, Go, Rust, etc.) is
usually safer and looks cleaner. So I think that once D gets a built-in
syntax to manage tuples well, the usage of out arguments will become even
less common.

I see it as annoying and nothing to do with 'strictness'. D default
initializes all variables that don't have an explicit initializer.
This is normal for D, and is a nice convenience. There's no reason
that 'out' should behave differently.

I would hope that the compiler could and would optimize out setting the out
parameter to its init value when it's unnecessary - e.g. if the parameter is a
built-in type, or it's a struct without an opAssign (so that skipping setting
it to the init value won't change the semantics), and the compiler can
determine that it's guaranteed to be assigned to, then the compiler would skip
setting the out parameter to its init value. After all, why pay for setting
the out parameter on function entry when it's not necessary? But I would
expect that to be an optimization, and the compiler obviously wouldn't always
be able to make it. The same would ideally go for normal variables which are
default-initialized, but it wouldn't surprise me if not having much code flow
analysis in the compiler would make it so that such optimizations couldn't
happen very often.
- Jonathan M Davis

I would hope that the compiler could and would optimize out setting the out
parameter to its init value when it's unnecessary

This is called "dead assignment elimination" and is a bog standard data flow
analysis technique that has been in common use since the 1980's.
Compile some sample code and see!
Also:
grep -r "dead assignment" *.c
in the compiler source tree.
BTW, data flow analysis was not pioneered by the Clang folks, despite them
being very good compiler devs.


Oh, it doesn't surprise me in the least that this is a long-standing and known
compiler technique, particularly since it's a fairly obvious thing to do. It's
just that statements that you've made in the past with regards to code flow
analysis imply that dmd does very little of it, which would imply that
optimizations like this are done a lot less than they could be, because it
does require some level of flow analysis to figure out whether a variable is
guaranteed to be assigned to or not.
Regardless, I fully support how out works. I just think that we should try and
optimize out its costs when they're not necessary, and if the compiler can
already do that, that's great.
- Jonathan M Davis

Why? Do you often need to return .init from out arguments? Do you
think that adding logic to make sure you have initialized the
argument inside the function (as in Ada and C#) isn't going to
improve the safety of your D code? Do you think that both the Ada
and C# designers have made a design mistake? What's good about an
argument annotation that is used to return a value but lets you
leave that value unset, returning a default one?

I just think that we should try and optimize out its costs

This whole discussion was not about optimizations. It was about
safety and logic neatness of a language feature.
Bye,
bearophile


I _hate_ how Java and C# operate where they give you an error if you don't
initialize or assign a value to a variable. It's very annoying, particularly
when it's not necessary, but the compiler isn't smart enough to figure that
out. With D's solution, I don't have to do that, and the cost should be
exactly the same, because in the cases where the compiler can't figure out
whether the variable is assigned to or not, it just assigns the init value
for you, and in the cases where it is smart enough, it should optimize out
the extraneous assignment that out does.
I see no safety problems with out. It guarantees that you get a consistent
value for a parameter if you don't assign a value to it, and that's
_completely_ consistent with how default-initialization works in D in
general. The worst that can happen is that you don't assign a value to an out
parameter when you meant to. And that's not unsafe. That's just a logic bug -
and one that's trivially caught with unit tests at that. And in my experience,
it's not even a bug that's likely to happen. So, I much, much prefer D's
approach of default-initialization over having the compiler scream at me. It's
perfectly safe and is just plain less annoying. And in my experience, it's
never been bug-prone.
- Jonathan M Davis

C# forces you to set a default value for out parameters, and I
personally find it annoying. The very nature of out parameters is
often that you use them in a situation where there *may* be a
result. Again using a C# example: 'bool
Dictionary.TryGetValue(key, out foo)'. I don't care what the
value of foo is if that returns false, yet the method still has
to manually initialize it to whatever that particular
implementation feels like (which is not guaranteed to be any
particular value).
I prefer D's approach where you have a guaranteed value that it
gets default initialized to, and don't have to manually
initialize it to some value if it doesn't make sense to assign a
value.
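For comparison, D's built-in associative arrays sidestep the out parameter entirely: the in operator returns a pointer to the value, which is null on a miss, so no dummy value ever needs to be written (a small sketch):

```d
void main()
{
    int[string] ages = ["alice": 30];

    // 'in' yields a pointer to the value, or null when the key is absent:
    if (auto p = "alice" in ages)
        assert(*p == 30);

    assert(("bob" in ages) is null); // no dummy out value needed
}
```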


It seems to me the check should be: did the code assign a value
to this variable? When it happens and under what conditions isn't
important, but it should do it at least once. But such a check
isn't likely to have great value.