Feature #9428

Inline argument expressions and re-assignment

Just a random idea. Currently, Ruby allows you to use any arbitrary expression for setting default values for arguments. This can be really convenient and makes for clear code, which is especially handy for documentation. For example:

def fetch(id, cache = config[:cache])
# bleh
end
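As a runnable sketch of the current behaviour (the config method and its contents are stand-ins assumed for illustration, not part of any real API), the default expression is evaluated at call time, and only when the argument is omitted:

```ruby
# Hypothetical configuration lookup, assumed for this illustration.
def config
  { cache: :memory }
end

# The default expression runs at call time, and only when `cache` is omitted.
def fetch(id, cache = config[:cache])
  [id, cache]
end

omitted  = fetch(1)         # default expression is evaluated
supplied = fetch(1, :disk)  # default expression is skipped
```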

In the same vein, as well as setting a default value using an arbitrary expression, it's not uncommon to post-process an argument. Some common examples include:

arg = arg.upcase
arg = arg.to_sym
arg = arg.dup

It would be rather nice in my opinion to be able to do this inline when defining the argument:

def fetch(id.to_i, cache = config[:cache])
# bleh
end

This works well where the argument is the receiver of the method call, but what if you wanted to do Integer(id) in the above example instead of using String#to_i? There are two options. One could either fall back to processing the argument within the method/block body, or make the implementation a little bit clever by using inference.

Ruby could auto-assign the passed argument to the first variable encountered in the expression. So in the following example, as soon as the virtual machine encounters id, it recognises it as a variable and assigns the argument value before continuing. When encountering subsequent variables, Ruby would take the usual action and look for a corresponding method in self before throwing an error. You can always disambiguate by qualifying the receiver, e.g. self.id

def fetch(Integer(id), cache = config[:cache])
# bleh
end

Whatever the result of the expression, it's assigned as the final argument value. So in the case of id.to_i, the argument name of id is inferred. id is set to the supplied argument for the duration of the expression. The result of the expression is then re-assigned as the value of id. This technically allows expressions of arbitrary complexity, but like all things in Ruby, with great power comes great responsibility. One must use common sense when deciding whether to manipulate the argument inline, or within the method body. As long as the expression is of reasonable length and complexity, readability remains perfectly reasonable.
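For comparison, here is a rough sketch of what def fetch(Integer(id), cache = config[:cache]) would presumably expand to in today's Ruby, following the rule described above. The config method is a hypothetical stand-in, and the exact evaluation order relative to the cache default is an assumption:

```ruby
# Hypothetical configuration lookup, assumed for this illustration.
def config
  { cache: :memory }
end

def fetch(id, cache = config[:cache])
  # The raw argument is bound to `id`, the expression runs,
  # and its result is assigned back to `id`.
  id = Integer(id)
  [id, cache]
end

result = fetch("42")
```

Unlike String#to_i, Integer() raises ArgumentError for a string such as "abc", so the coercion doubles as validation.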

Interested to get some thoughts and opinions on this one. I sense the potential for controversy :)

History

Haha. I don't think any programming language exists that does anything even similar to this. Whether it's a good idea or not, it's going to provoke all the feelings that come with unfamiliarity. Everything is confusing until you learn it and get used to it. Plenty of things in Ruby confused the hell out of me as there were many new ideas and concepts; those things normally turn out to be the best features, mind you.

The first thing to keep in mind is that the behaviour is very well defined, and the logic itself is simple. I can't think of any edge cases except if no variable is used in the expression, but this can be picked up by the compiler, which could throw an error like: "Expecting local variable in argument expression at position 1".

Really, it's a question of "do we want this in Ruby". I don't think there's any denying the practicality, so really it's only a matter of aesthetics and readability. Keep in mind that just about everyone uses code highlighting, so any semi-decent editor would pick up the first local variable in the expression and highlight it somehow (make it bold, underline it, etc). If the expression is simple enough, like in the examples I've provided, it's very readable in my opinion.

Aesthetically, we must compare the current situation to the proposed. Here's some code I wrote today. Pretty common scenario:

ordered_values.map { |v|
v = v.dup
[v.delete(:media_type), v]
}.to_h

Rewritten using the proposal, we get it onto one line:

ordered_values.map { |v.dup| [v.delete(:media_type), v] }.to_h
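A self-contained version of the current form, using made-up sample data (the hash contents are assumptions for illustration), shows what the dup is for: the caller's hashes keep their :media_type key.

```ruby
# Hypothetical sample data, assumed for illustration.
ordered_values = [
  { media_type: "text/html", q: 1.0 },
  { media_type: "application/json", q: 0.8 }
]

pairs = ordered_values.map { |v|
  v = v.dup                    # shallow copy so the caller's hash is untouched
  [v.delete(:media_type), v]   # [media type, remaining parameters]
}.to_h
```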

You tell me which you prefer. Of course, like most features in Ruby, it can be abused, so looking at all the wrong you can do with it isn't relevant. Another use case would be to provide logical defaults. At the moment, default values for arguments are only applicable when that argument isn't supplied at all, but what if we want to set a default if the value is nil or false? Here's a comparison:

I find all these examples readable and aesthetically pleasing. Longer expressions with conditions are best avoided in favour of simply doing it within the body of the method or block, but shorter conditions work quite well:

def article('Unnamed' if title.empty?, body)
# bleh
end
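For reference, a sketch of how the same default is written in the method body today (the returned hash is just a placeholder body). One detail worth noting: as a standalone expression, 'Unnamed' if title.empty? evaluates to nil when the condition is false, so an inline form would presumably need an explicit fallback to the original value.

```ruby
def article(title, body)
  # The modifier-if guards the assignment, so a non-empty title is kept.
  title = 'Unnamed' if title.empty?
  { title: title, body: body }
end

blank = article('', 'text')
named = article('Ruby', 'text')

# As a bare expression, the same conditional yields nil when it does not match.
bare = ('Unnamed' if 'Ruby'.empty?)
```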

As in my previous example, this comes in most handy not when defining methods, but when defining procs, where it can in my opinion greatly improve readability.

Without syntax highlighting, the inferred argument name isn't always super obvious, but in simple cases (which are the main use case), like fetch(id.to_i), it is obvious enough.

Remember though, while there are aspects to this that are potentially unobvious, there are other aspects which make things more obvious, such as in the case of auto-documentation. If the transformation of #to_i or #to_sym is embedded within the method signature, it makes it obvious that the given argument must respond to that method (#to_i or #to_sym). If this logic is hidden away in the method/block body, the author must either document it explicitly, or the user must trawl through the method body unless they prefer to find out the hard way at runtime.

There are certainly benefits to be had here beyond merely reducing verbosity. The trade-offs are merely readability related, but as I've said, keeping the expressions simple and good code highlighting pretty much completely mitigate that problem.

If we can't get over the argument name inferencing, then perhaps someone can suggest a syntax for defining the argument name explicitly, e.g.

def fetch(id.to_i as id)
end

But I don't really find that example any more readable. The implicit assignment in the original proposal makes it easy to distinguish expression arguments from arguments with default values.

I certainly don't expect anyone to see this proposal and instantly fall in love. It's a pretty radical idea.

aesthetically: it puts some of the function's code outside the
function's body, which makes it harder to follow a function's execution
when reading code, and it makes the signature unnecessarily messy.

syntactically: none of the proposals you've given make enough sense, at
least for me personally to understand what they mean:

def foo( arg.to_i )
def foo( arg.to_i as arg )

Is the left-most 'arg' a local variable, or referring to self#arg, or
something else..?

Ruby could auto-assign the passed argument to the first variable
encountered in the expression.

According to my understanding of the parser, any heretofore unseen
"bareword" tokens are interpreted as function calls, so there is no "first
variable encountered." It works for optional positional parameters because
they have an equals sign in (and 'bareword = expression' is universally
lvar creation/assignment, in Ruby).
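A small sketch of that parsing rule in current Ruby (the method names arg and demo are made up for illustration): the same bareword is a method call until an assignment to it has been seen, and a local variable afterwards.

```ruby
def arg
  :from_method
end

def demo
  before = arg       # no local `arg` has been seen yet, so this is a method call
  arg = :from_local  # the assignment creates the local variable
  after = arg        # from here on, the bareword refers to the local
  [before, after]
end

result = demo
```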

debugability: the 'def' line is a single line, however there's no real
limit to the number of parameters you can include in that line. If each of
those parameters can include arbitrary expressions, well, I'd hate to have
to debug a "NoMethodError: undefined method ... for nil:NilClass" on that
line. And if the answer to that is to split the 'def' line over multiple
lines, then why not just put the expressions on those multiple lines anyway?

orthogonality: what about non-optional keyword arguments?

For what it's worth, I'm not entirely for allowing arbitrary expressions
in optional parameters either, but in that case I can't think of a better
representation. But if I ever see anything more than a #[] call in a
default value I consider it Bad Form™.

I know you said you're not a fan of allowing expressions when assigning default values to optional parameters, but the point about aesthetics applies equally to them also.

The rule is relatively simple. The first identifier (lvar/method) encountered is automatically assigned the value of the argument passed to the method or proc. That's the rule: the first identifier (valid variable name) is assigned the argument value. If you want to refer to self.id, you must use self.id to disambiguate, as you would have to in many other scenarios in Ruby. In the example you highlighted, def foo( arg.to_i ), the identifier arg is encountered and automatically assigned the argument value before the expression continues execution. The result of the expression is then assigned back to arg.

The same problem exists for expressions used as default values for optional arguments. Debugging is the same for each. If it's not clear where the error occurred, one could always temporarily break the argument definitions over multiple lines while debugging. I don't think debugging would be any worse than debugging a long method chain like Hash[var.select { |v| #bleh }.map { |v| # blah }]. The problem is universal. I don't think debugability can be used against this proposal.

Technically, for optional arguments, you can have an expression for when an argument is given, and an expression for when an argument is optional. It remains consistent in this respect.

I know you said you're not a fan of allowing expressions when assigning
default values to optional parameters, but the point about aesthetics
applies equally to them also.

That's partly why I'm not a fan. If I could think of a valid, useful
alternative I would strongly suggest it. I know it wouldn't be adopted
(backwards compatibility, if nothing else) but I'd propose it anyway. The
best I can come up with is another special method, along the lines of
method_given?, perhaps:

It's not great, obviously, but it removes arbitrary code from the 'def'
line.

The rule is relatively simple. The first identifier (lvar/method)
encountered is automatically assigned the value of the argument passed to
the method or proc. That's the rule, the first identifier (valid variable
name) is assigned the argument value. If you want to refer to self.id,
you must use self.id to disambiguate as you would have to in many other
scenarios in Ruby. In the example you highlighted def foo( arg.to_i ),
the identifier arg is encountered and automatically assigned the argument
value before the expression continues execution.

"First encountered" in regular left-to-right parsing order?

def foo( a[b] )
#=>
def foo a
a = a[b]
end

?

The same problem exists for expressions used as default values for
optional arguments. Debugging is the same for each. If it's not clear
where the error occurred, one could always temporarily break the argument
definitions over multiple lines while debugging. I don't think debugging
would be any worse than debugging a long method chain like Hash[var.select
{ |v| #bleh }.map { |v| # blah }]. I therefore don't think debugability
can be used against this proposal.

I agree that existing long/complex lines are hard to debug. But why add
the opportunity for more such lines? Especially in a place that is
traditionally free from such concerns? With my background as a C
programmer I instinctively see the 'def' line as free from execution; it's
a definition, something that informs the interpreter and the human reader
about the nature of the program/data/etc. I would be surprised if I
started seeing runtime exceptions raised from these traditionally
compile-time-only lines.

Again, I know it's already possible to achieve these errors using optional
args, but I concede that as a necessary evil in the absence of an
alternative. And, since we're stuck with them, I prefer a culture of
promoting the least amount of executable code possible in that line; thus
some of my opposition to this proposal.
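To illustrate the point about optional args in current Ruby, here is a minimal sketch (settings is a hypothetical method that returns nil, assumed for illustration) in which the default expression on the 'def' line raises at call time:

```ruby
# Hypothetical lookup that returns nil, assumed for illustration.
def settings
  nil
end

def fetch(id, cache = settings[:cache])
  [id, cache]
end

supplied = fetch(1, :disk)   # default expression is never evaluated

error = begin
  fetch(1)                   # nil[:cache] raises while the default is evaluated
  nil
rescue NoMethodError => e
  e
end
```

The method definition itself succeeds; the exception only appears when a call omits the second argument.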

Technically, for optional arguments, you can have an expression for
when an argument is given, and an expression for when an argument is
optional. It remains consistent in this respect.

def foo(id.to_i = config[:default_id])

This introduces some amount of confusion. Which of the following is
equivalent?

id = id.to_i // id = config[:default_id]

or:

id = id.to_i // id = config[:default_id].to_i

Either way, this is very confusing when, anywhere else in a Ruby script, it
would mean:

It would be first identifier encountered as per the order of execution. In the following example, the variable in the if statement would be the name of the argument.

def foo(id.to_i if String === bob )

You could rewrite this as...

def foo(bob)
bob = id.to_i if String === bob
end

A contrived and fairly nonsensical example, but it demonstrates that the variable furthest to the left isn't necessarily the argument name.

This introduces some amount of confusion. Which of the following is
equivalent?

The first case id = id.to_i // id = config[:default_id]. If an argument is given id = id.to_i, if the argument is omitted, the argument name would be inferred from the expression on the left (id.to_i) and the result of config[:default_id] would be assigned to it as in id = config[:default_id]. I'm not suggesting anyone would want to do this, but it's possible.
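For what it's worth, a sketch of how that first case could be written in current Ruby. The NOT_GIVEN sentinel and the config contents are assumptions made for illustration; the sentinel is only there to tell an omitted argument apart from a supplied one:

```ruby
# Hypothetical configuration lookup, assumed for this illustration.
def config
  { default_id: 7 }
end

# Sentinel used to detect an omitted argument (an illustrative device).
NOT_GIVEN = Object.new.freeze

def foo(id = NOT_GIVEN)
  # Omitted: take the configured default as-is. Given: coerce with to_i.
  id = id.equal?(NOT_GIVEN) ? config[:default_id] : id.to_i
  id
end

omitted = foo
given   = foo("12")
```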

As Matz has indicated though, it would be very difficult to parse. Fun to discuss though.

It would be rather nice in my opinion to be able to do this inline when defining the argument:

def fetch(id.to_i, cache = config[:cache])
# bleh
end

-1: I do not see why the person or machine that reads the first line of a method's definition (its interface) needs to know how the arguments will be post-processed before being further post-processed. Maybe some idempotent operations, like upcase, would make more sense in this context, but it looks too complicated to me.

I do not see why the person or machine that reads the first line of method's definition needs to know how the arguments will be post-processed before being further post-processed

Semantically, it just seems more appropriate to define the argument transformation as part of the method definition. It's common to have the first line or two of a method be argument transformation, the point of which is to check and coerce the arguments into their expected form. The actual logic of the method normally just uses that argument variable without ever reassigning it. It wouldn't likely be "further processed" (i.e. reassigned) within the method body as you suggest. Where the argument is only used once within the method, you can do the transformation inline; it'd be unnecessary to do it in the method signature unless you wanted to for documentation reasons.

In some respects, you could consider it type hinting for a dynamic language. While statically typed languages would have a type hint, in dynamically typed languages it's not uncommon for one to check and coerce the argument into something expected.

In some respects, you could consider it type hinting for a dynamic language. While statically typed languages would have a type hint, in dynamically typed languages it's not uncommon for one to check and coerce the argument into something expected.

It seems this is what I meant by an idempotent operation: something like #to_s, #to_i, #upcase. (A function f is idempotent if f(f(x)) = f(x).) I can't think of any syntax though, or how and if to restrict it to only idempotent operations.
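A minimal sketch of that definition in Ruby. The helper name idempotent_for? is made up for illustration, and it only checks a single sample value rather than proving idempotence in general:

```ruby
# Returns true when applying the method twice gives the same result
# as applying it once, for this particular value.
def idempotent_for?(value, method_name)
  once  = value.public_send(method_name)
  twice = once.public_send(method_name)
  once == twice
end

to_i_ok   = idempotent_for?("12abc", :to_i)   # "12abc" -> 12 -> 12
upcase_ok = idempotent_for?("abc", :upcase)   # "abc" -> "ABC" -> "ABC"
succ_ok   = idempotent_for?(1, :succ)         # 1 -> 2 -> 3
```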

Indeed, limiting it to method calls on the argument object (e.g. arg.to_i) would make it much easier to parse and more readable, but greatly limits the potential application. It seems the inferencing in the initial proposal is the cause of all the readability and parsing difficulties. The as syntax is certainly workable though. Perhaps <lvar> as <expression> would be better than <expression> as <lvar>. It makes it more natural as an assignment operation, and names the argument before using it in the expression.

The problem will come back to what to do with optional arguments and keyword arguments. Ambiguity goes through the roof while readability goes through the floor. I suppose method and block definitions are perhaps at the limit of their expressibility, with features like the grenade operator *args, hash grenade **hash, block capturing &block, keyword arguments, optional arguments with default values, etc. It's best to just accept that there's pretty much no room left for adding new functionality to method definitions whilst maintaining parsability and readability.