The idea is that glob, evaluated in scalar context, returns true if there is a file of the form dirN/f*.

However, it does not seem to behave that way: if dir1 contains N files beginning with 'f', then the first N directories are returned by grep, even if none of the others contain a file beginning with 'f'. It is as if glob is not being evaluated in scalar context. Note that even if I force scalar context using (scalar glob("$_/f*")), it still fails this way.
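For anyone who wants to reproduce this, here is a minimal sketch. The layout is an assumption made up for illustration (dir1 holding f1, f2 and f3, with dir2 and dir3 empty, all under a temporary directory), and the grep line is a guess at the shape of the original code rather than a copy of it.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Assumed layout: dir1 holds f1, f2, f3; dir2 and dir3 are empty.
my $root = tempdir(CLEANUP => 1);
my @dirs = map { File::Spec->catdir($root, "dir$_") } 1 .. 3;
for my $dir (@dirs) {
    mkdir $dir or die "mkdir $dir: $!";
}
for my $name (qw(f1 f2 f3)) {
    my $path = File::Spec->catfile($dirs[0], $name);
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

# grep evaluates its block in scalar context, so this single glob
# call site acts as an iterator over the matches found in dir1.
my @found = grep { glob("$_/f*") } @dirs;

# Prints all three directories, although only dir1 contains a match.
print "$_\n" for @found;
```

The three calls hand out dir1/f1, dir1/f2 and dir1/f3 in turn, so each directory in the list gets a true value regardless of its own contents.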

Any clue what is going wrong?
Any suggestions for alternative approaches?

Update: I found a problem. If there's more than one match, the code skips every second dir. The reason is that glob in scalar context is stateful and needs one extra cycle to reset: after handing out its last match, the next call returns undef, and only the call after that expands a new pattern. A possible fix is to evaluate the glob in list context and check whether the list has at least one element.
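The "one extra cycle" effect can be seen in isolation with a small sketch. The layout here is assumed for illustration: four directories, each containing exactly one file named f1.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumed layout: dir1 .. dir4, each with a single file f1.
my $root = tempdir(CLEANUP => 1);
my @dirs = map { "$root/dir$_" } 1 .. 4;
for my $dir (@dirs) {
    mkdir $dir or die "mkdir $dir: $!";
    open my $fh, '>', "$dir/f1" or die "open $dir/f1: $!";
    close $fh;
}

# Scalar-context glob: every directory has a match, yet only every
# other one is reported.
my @found = grep { glob("$_/f*") } @dirs;

# dir1: pattern expanded, f1 returned (true)
# dir2: iterator exhausted, undef returned (false), state reset
# dir3: pattern expanded, f1 returned (true)
# dir4: iterator exhausted, undef returned (false)
print "$_\n" for @found;   # prints the dir1 and dir3 paths
```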

The code seems to work with that change but I just want to check that there is no need to explicitly check "if the list has at least one element" since presumably evaluating the list (by grep) will determine if it is empty or not.

I'm still not really sure why scalar context doesn't work (indeed, I would have thought scalar context would be better than list context). The stateful carry-over part seems to be weird if not buggy. But as long as it works for me by forcing list context, then it's all good, even if it was far from obvious at first glance.

I just want to check that there is no need to explicitly check "if the list has at least one element" since presumably evaluating the list (by grep) will determine if it is empty or not.

Right, no need for an explicit check. Arrays in scalar context evaluate to the number of elements, so only empty arrays are false in boolean context.
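A minimal sketch of the list-context version, under an assumed layout (dir1 with three matching files, dir2 empty, dir3 with one matching file); the variable names are made up for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumed layout for illustration only.
my %layout = (dir1 => [qw(f1 f2 f3)], dir2 => [], dir3 => [qw(f9)]);

my $root = tempdir(CLEANUP => 1);
for my $dir (sort keys %layout) {
    mkdir "$root/$dir" or die "mkdir $root/$dir: $!";
    for my $file (@{ $layout{$dir} }) {
        open my $fh, '>', "$root/$dir/$file" or die "open $file: $!";
        close $fh;
    }
}
my @dirs = map { "$root/$_" } sort keys %layout;

my @found = grep {
    my @matches = glob("$_/f*");   # list context: full expansion on every call
    scalar @matches;               # element count; 0 (no match) is false
} @dirs;

print "$_\n" for @found;   # prints the dir1 and dir3 paths
```

Because the array assignment puts glob in list context, no iterator state survives from one directory to the next, and the array's element count is all grep needs for its truth test.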

The stateful carry over part seems to be weird if not buggy.

Well, glob needs to carry state for this useful idiom to work:
<code>
while (my $file = glob '*.txt') {
    # do something with $file
}
</code>
Which is more friendly to memory than using a list and iterating that.

But one could argue that glob

the behavior for example of glob used somewhere deep in a module function would vary depending on whether it was at some level called from something in a loop

I can't figure out an example of that. Edit -- okay, here is an example that shows how a function that relies on the behavior of glob inside the function can produce faulty results when the function is called in a loop:
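As a rough sketch of the kind of failure meant here (not necessarily the example originally posted), consider a hypothetical helper first_match() that is supposed to return the first file matching a pattern. The helper name, the file names a1, a2 and b1, and the temporary directory are all assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumed files: a1, a2 and b1 in one temporary directory.
my $root = tempdir(CLEANUP => 1);
for my $name (qw(a1 a2 b1)) {
    open my $fh, '>', "$root/$name" or die "open $root/$name: $!";
    close $fh;
}

# Hypothetical helper: intended to return the first file matching $pattern.
sub first_match {
    my ($pattern) = @_;
    my $file = glob($pattern);   # scalar context: this call site keeps iterator state
    return $file;
}

my @results;
for my $prefix (qw(a b)) {
    push @results, first_match("$root/$prefix*");
}

# The second call is handed the leftover a2 from the first pattern,
# not b1 as the caller expects.
print "$_\n" for @results;   # prints the a1 path, then the a2 path
```

Called once, the helper behaves as intended; the wrong answer only appears when it is called again before the previous pattern's matches have been exhausted.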

Thanks 7stud for all your patience and persistence in helping me figure out this strange/unexpected behavior.

It seems to me that this is a hidden and potentially significant time bomb, since glob is a core function and it's not inconceivable that people will bury it somewhere in a module where it is used in static context. Then it will lie there waiting until one day someone calls the module from a loop and gets wrong results.

This would be bad enough if the behavior were fully or even adequately documented. But currently, the documentation at best alludes rather obscurely to the behavior that can lead to an issue in such a context.

Do people agree this is a valid issue that needs addressing either in 'fixing' glob or at least in documenting and warning about the behavior?

I think the problem is most similar to the problem of keys (not) resetting the iterator over a hash. I guess that the best solution is to not call glob in scalar context at all.

Most likely, part of the documentation of keys can be adapted to be added to the glob documentation. I would open a bug report using the perlbug utility, best together with a proposed documentation patch that cautions against using glob in scalar context.
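For comparison, here is a small sketch of the hash iterator behavior this analogy presumably refers to: each keeps a per-hash position between calls, and calling keys resets it. The hash contents are made up for illustration.

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2, c => 3);

# Advance the hash's internal iterator by one entry and then walk away.
my ($first) = each %h;

# A later loop resumes from where the iterator was left, not from the start.
my @rest;
while (my ($key, $value) = each %h) {
    push @rest, $key;
}
print scalar(@rest), "\n";   # prints 2, not 3

# Advance by one again, then reset explicitly with keys in void context.
my ($again) = each %h;
keys %h;

my @all;
while (my ($key, $value) = each %h) {
    push @all, $key;
}
print scalar(@all), "\n";    # prints 3
```

One difference is that scalar-context glob has no equivalent of the keys reset short of calling it until it returns undef, which is one more reason to prefer list context.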

I thought grep will *only* return the elements of the array for which the first expression is true. Since the first expression glob("$_/f*") is only true for 'dir1', it should only return that element of the directory list (even though glob returns 3 files in that directory). The glob finds no elements in the other 2 directories, so it should be undefined which would evaluate as false.

I don't understand why the elements of glob("$_/f*") returned for the first directory entry seemingly spill over to subsequent directories in the (implicit) grep iteration.

This behavior is certainly not obvious and it is not (clearly) documented either under 'perldoc -f glob' or at perldoc.perl.org. The line saying "In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted" does not make it clear (at least to me) that such a state persists across calls to glob with a new argument!

I agree that the docs for glob (perldoc -f glob) are not clear. When I read the docs for glob, and I came upon the sentence:

In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.

I had no idea what that meant. However, I immediately recognized that that sentence did NOT mean what you claimed it meant, namely that in scalar context glob returns true if there were any matching files. I have no idea how you arrived at that interpretation. In my opinion, the literal interpretation would be that in scalar context, glob() sits for a few seconds as it spins through the list of matches, and then glob returns undef, i.e. glob always returns undef in scalar context.

In any case, after reading the docs I was prompted to try an experiment to see how glob() works in scalar context. So I set up this directory structure:

I agree with your final revision to the documentation string.
The doc text truly doesn't make much literal sense. And unless one has perfect Perl Monk karma, I don't see how one can easily intuit the difference between scalar context in a looped vs. non-looped context. The purpose of documentation is (presumably) to help those who are not yet experts. In this case, I humbly propose that the documentation fails to adequately and properly document the behavior in scalar context.

Also, what happens when glob is called in a function that is embedded in a loop? Either way I can imagine challenges. If it is still considered to be in a loop, then the behavior for example of glob used somewhere deep in a module function would vary depending on whether it was at some level called from something in a loop. On the other hand, if calling it from a function that is embedded in a loop behaves differently from calling it directly, then again you have an odd behavior where simply wrapping 'glob' in a function call would change its behavior.

To me, this still seems quite flaky and unpredictable. At a minimum, it deserves copious documentation to explain the behavior and potential issues.