# (3)
def spam(filename):
    with file(filename) as fh:
        for line in fh:
            do_something_with(line)

Mind you, (3) is almost as simple as (1) (only one additional line),
but somehow it lacks (1)'s direct simplicity. (And it adds one
more indentation level, which I find annoying.) Furthermore, I
don't recall ever coming across either (2) or (3) "in the wild",
even after reading a lot of high-quality Python code (e.g. standard
library modules).

Finally, I was under the impression that Python closed filehandles
automatically when they were garbage-collected. (In fact (3)
suggests as much, since it does not include an explicit call to
fh.close().) If so, the difference between (1) and (3) does not seem
very big. What am I missing here?


kj wrote:
> There's something wonderfully clear about code like this:
>
> # (1)
> def spam(filename):
>     for line in file(filename):
>         do_something_with(line)
>
> It is indeed pseudo-codely beautiful. But I gather that it is not
> correct to do this, and that instead one should do something like
>
> # (2)
> def spam(filename):
>     fh = file(filename)
>     try:
>         for line in fh:
>             do_something_with(line)
>     finally:
>         fh.close()
>
> ...or alternatively, if the with-statement is available:
>
> # (3)
> def spam(filename):
>     with file(filename) as fh:
>         for line in fh:
>             do_something_with(line)
>
> Mind you, (3) is almost as simple as (1) (only one additional line),
> but somehow it lacks (1)'s direct simplicity. (And it adds one
> more indentation level, which I find annoying.) Furthermore, I
> don't recall ever coming across either (2) or (3) "in the wild",
> even after reading a lot of high-quality Python code (e.g. standard
> library modules).
>
> Finally, I was under the impression that Python closed filehandles
> automatically when they were garbage-collected. (In fact (3)
> suggests as much, since it does not include an implicit call to
> fh.close.) If so, the difference between (1) and (3) does not seem
> very big. What am I missing here?
>
CPython uses reference counting, so an object is garbage collected as
soon as there are no references to it, but that's just an implementation
detail.

Other implementations, such as Jython and IronPython, don't use
reference counting, so you don't know when an object will be garbage
collected, which means that the file might remain open for an unknown
time afterwards in case 1 above.

Most people use CPython, so it's not surprising that case 1 is so
common.
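A sketch of how the with form sidesteps this implementation difference, using the modern open() builtin in place of the old file() constructor (do_something_with is the thread's hypothetical callback, here just collecting lines):

```python
import os
import tempfile

def spam(filename, do_something_with):
    # The with block closes fh deterministically on every Python
    # implementation, whether or not it uses reference counting.
    with open(filename) as fh:
        for line in fh:
            do_something_with(line)
        return fh  # returned only so the demo below can inspect fh.closed

# Demonstration with a throwaway file.
fd, path = tempfile.mkstemp()
os.write(fd, b"a\nb\n")
os.close(fd)

lines = []
fh = spam(path, lines.append)
assert fh.closed                 # closed the moment the with block was left
assert lines == ["a\n", "b\n"]
os.remove(path)
```

On CPython the refcount would close the handle anyway; the with statement makes that guarantee portable to Jython and IronPython as well.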


kj wrote:
> There's something wonderfully clear about code like this:
>
> # (1)
> def spam(filename):
>     for line in file(filename):
>         do_something_with(line)
>
> It is indeed pseudo-codely beautiful. But I gather that it is not
> correct to do this, and that instead one should do something like
>
> # (2)
> def spam(filename):
>     fh = file(filename)
>     try:
>         for line in fh:
>             do_something_with(line)
>     finally:
>         fh.close()
>
> ...or alternatively, if the with-statement is available:
>
> # (3)
> def spam(filename):
>     with file(filename) as fh:
>         for line in fh:
>             do_something_with(line)
>
> Mind you, (3) is almost as simple as (1) (only one additional line),
> but somehow it lacks (1)'s direct simplicity. (And it adds one
> more indentation level, which I find annoying.) Furthermore, I
> don't recall ever coming across either (2) or (3) "in the wild",
> even after reading a lot of high-quality Python code (e.g. standard
> library modules).
>
> Finally, I was under the impression that Python closed filehandles
> automatically when they were garbage-collected. (In fact (3)
> suggests as much, since it does not include an implicit call to
> fh.close.) If so, the difference between (1) and (3) does not seem
> very big. What am I missing here?
>
> kynn
We have to distinguish between reference counting and garbage
collection. As MRAB says, when the reference count goes to zero, the
file is immediately closed, in the CPython implementation. So all three
are equivalent on that platform.

But if you're not sure the code will run on CPython, then you need
something that explicitly handles the file object going out of scope.
Both your (2) and (3) do that, with different syntaxes.

On Sep 5, 1:17 pm, Dave Angel <> wrote:
> kj wrote:
> > There's something wonderfully clear about code like this:
>
> > # (1)
> > def spam(filename):
> >     for line in file(filename):
> >         do_something_with(line)
>
> > It is indeed pseudo-codely beautiful. But I gather that it is not
> > correct to do this, and that instead one should do something like
>
> > # (2)
> > def spam(filename):
> >     fh = file(filename)
> >     try:
> >         for line in fh:
> >             do_something_with(line)
> >     finally:
> >         fh.close()
>
> > ...or alternatively, if the with-statement is available:
>
> > # (3)
> > def spam(filename):
> >     with file(filename) as fh:
> >         for line in fh:
> >             do_something_with(line)
>
> > Mind you, (3) is almost as simple as (1) (only one additional line),
> > but somehow it lacks (1)'s direct simplicity. (And it adds one
> > more indentation level, which I find annoying.) Furthermore, I
> > don't recall ever coming across either (2) or (3) "in the wild",
> > even after reading a lot of high-quality Python code (e.g. standard
> > library modules).
>
> > Finally, I was under the impression that Python closed filehandles
> > automatically when they were garbage-collected. (In fact (3)
> > suggests as much, since it does not include an implicit call to
> > fh.close.) If so, the difference between (1) and (3) does not seem
> > very big. What am I missing here?
>
> > kynn
>
> We have to distinguish between reference counted and garbage collected.
> As MRAB says, when the reference count goes to zero, the file is
> immediately closed, in CPython implementation. So all three are
> equivalent on that platform.
>
> But if you're not sure the code will run on CPython, then you have to
> have something that explicitly catches the out-of-scopeness of the file
> object. Both your (2) and (3) do that, with different syntaxes.
>
> DaveA

Stop being lazy and close the file. You don't want open file objects
just floating around in memory. Even the docs say something like
"yes, Python will free the memory associated with a file object, but
you can never *really* be sure *when* this will happen, so just
explicitly close the damn thing!". Besides, you can't guarantee that
any data has been written without calling f.flush() or f.close()
first. What if your program crashes and no data is written? Huh?

I guess I could put my pants on by jumping into both legs at the same
time, thereby saving one step, but I might fall down and break my arm. I
would much rather just use the one-leg-at-a-time approach...
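The flushing point is easy to demonstrate. In this sketch, each step of the flush()/fsync()/close() ladder gives a stronger guarantee: write() may only fill a userspace buffer, flush() hands it to the OS, and fsync() asks the OS to put it on disk:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, "w")
f.write("important data")  # possibly still sitting in Python's buffer
f.flush()                  # push Python's buffer to the OS
os.fsync(f.fileno())       # ask the OS to push it to the disk as well
f.close()                  # close() also flushes, and releases the handle

with open(path) as g:
    content = g.read()
assert content == "important data"
os.remove(path)
```

A crash between write() and flush() can lose the buffered data, which is exactly the scenario being warned about.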

On Sat, 5 Sep 2009 16:14:02 +0000 (UTC), kj <>
declaimed the following in gmane.comp.python.general:
> ...or alternatively, if the with-statement is available:
>
> # (3)
> def spam(filename):
>     with file(filename) as fh:
>         for line in fh:
>             do_something_with(line)
>
<snip>
> Finally, I was under the impression that Python closed filehandles
> automatically when they were garbage-collected. (In fact (3)
> suggests as much, since it does not include an implicit call to
> fh.close.) If so, the difference between (1) and (3) does not seem
> very big. What am I missing here?

In the case of the with construct, in effect the with statement IS
equivalent to:

fh = file(filename)
for line in fh:
    do something...
fh.close()

with the proper safeguards to ensure the close() is called. Essentially,
anything "opened" by a with clause is "closed" when the block is left.
--
Wulfraed    Dennis Lee Bieber    KD6MOG    http://wlfraed.home.netcom.com/

> CPython uses reference counting, so an object is garbage collected as
> soon as there are no references to it, but that's just an implementation
> detail.
>
> Other implementations, such as Jython and IronPython, don't use
> reference counting, so you don't know when an object will be garbage
> collected, which means that the file might remain open for an unknown
> time afterwards in case 1 above.
>
> Most people use CPython, so it's not surprising that case 1 is so
> common.

Additionally, many scripts just use a small number of files (say,
1-5 files), so having a file handle open for the duration of the
run is minimal overhead.

On the other hand, when processing thousands of files, I always
explicitly close each file to make sure I don't exhaust some
file-handle limit the OS or interpreter may enforce.

On Sep 5, 2:47 pm, Dennis Lee Bieber <> wrote:
(snip)
> > Finally, I was under the impression that Python closed filehandles
> > automatically when they were garbage-collected. (In fact (3)
> > suggests as much, since it does not include an implicit call to
> > fh.close.) If so, the difference between (1) and (3) does not seem
> > very big. What am I missing here?

True, but I find the with statement (while quite useful in general
practice) is not a "cure-all" for situations that need an exception
caught. In that case the laborious, finger-wrecking syntax of
"f.close()" must be painstakingly typed, letter by painful letter.

On Sat, 05 Sep 2009 16:14:02 +0000, kj wrote:
> Finally, I was under the impression that Python closed filehandles
> automatically when they were garbage-collected. (In fact (3) suggests
> as much, since it does not include an implicit call to fh.close.) If so,
> the difference between (1) and (3) does not seem very big. What am I
> missing here?

(1) Python the language will close file handles, but doesn't guarantee
when. Some implementations (e.g. CPython) will close them as soon as the
file object goes out of scope. Others (e.g. Jython) will close them
"eventually", which may be when the program exits.

(2) If the file object never goes out of scope, say because you've stored
a reference to it somewhere, the file will never be closed and you will
leak file handles. Since the OS only provides a finite number of them,
any program which uses a large number of files is at risk of running out.

(3) For quick and dirty scripts, or programs that only use one or two
files, relying on the VM to close the file is sufficient (although lazy
in my opinion *wink*), but for long-running applications using many
files, or for debugging, you may want more control over what happens when.

In <02b2e6ca$0$17565$> Steven D'Aprano <> writes:
>(3) For quick and dirty scripts, or programs that only use one or two
>files, relying on the VM to close the file is sufficient (although lazy
>in my opinion *wink*)

It's not a matter of laziness or industriousness, but rather of
code readability. The real problem here is not the close() per
se, but rather all the additional machinery required to ensure that
the close happens. When the code is working with multiple file
handles simultaneously, one ends up with a thicket of try/finally's
that makes the code just *nasty* to look at. E.g., even with only
two files, namely an input and an output file, compare:
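The comparison this message leads into was lost from the archive; a sketch of the two shapes presumably being contrasted, in modern Python (munge is the thread's hypothetical transform, here just upper-casing):

```python
import os
import tempfile

def munge(line):
    # hypothetical stand-in for the thread's munge()
    return line.upper()

def naive_munge(from_, to_):
    # pseudo-code pretty, but neither file is guaranteed to be closed
    out = open(to_, "w")
    for line in open(from_):
        out.write(munge(line))

def safe_munge(from_, to_):
    # the "thicket of try/finally's" being complained about
    from_h = open(from_)
    try:
        to_h = open(to_, "w")
        try:
            for line in from_h:
                to_h.write(munge(line))
        finally:
            to_h.close()
    finally:
        from_h.close()

# quick check that the guarded version does the same work
fd, src = tempfile.mkstemp(); os.close(fd)
fd, dst = tempfile.mkstemp(); os.close(fd)
with open(src, "w") as f:
    f.write("spam\n")
safe_munge(src, dst)
with open(dst) as f:
    result = f.read()
assert result == "SPAM\n"
os.remove(src); os.remove(dst)
```

With two files the guarded version has doubled in length and gained two indent levels, which is the readability cost at issue.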

On Sun, 06 Sep 2009 01:51:50 +0000, kj wrote:
> In <02b2e6ca$0$17565$> Steven D'Aprano
> <> writes:
>
>>(3) For quick and dirty scripts, or programs that only use one or two
>>files, relying on the VM to close the file is sufficient (although lazy
>>in my opinion *wink*)
>
> It's not a matter of laziness or industriousness, but rather of code
> readability. The real problem here is not the close() per se, but
> rather all the additional machinery required to ensure that the close
> happens. When the code is working with multiple file handles
> simultaneously, one ends up with a thicket of try/finally's that makes
> the code just *nasty* to look at.

Yep, that's because dealing with the myriad of things that *might* (but
probably won't) go wrong when dealing with files is *horrible*. Real
world code is almost always much nastier than the nice elegant algorithms
we hope for.

Most people know they have to deal with errors when opening files. The
best programmers deal with errors when writing to files. But only a few
of the most pedantic coders even attempt to deal with errors when
*closing* the file. Yes, closing the file can fail. What are you going to
do about it? At the least, you should notify the user, then continue.
Dying with an uncaught exception in the middle of processing millions of
records is Not Cool. But close failures are so rare that we just hope
we'll never experience one.

It really boils down to this... do you want to write correct code, or
elegant code?
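The "notify the user, then continue" policy above is easy to package as a helper; a modern-Python sketch (careful_close is a name invented here):

```python
import sys
import tempfile

def careful_close(fh):
    # Closing can itself fail (e.g. a deferred write error surfacing on
    # a network filesystem). Notify the user and carry on rather than
    # dying with an uncaught exception mid-run.
    try:
        fh.close()
        return True
    except OSError as e:
        print("warning: close failed for %r: %s" % (fh, e), file=sys.stderr)
        return False

# Normal case: close succeeds and the handle is released.
fh = tempfile.TemporaryFile()
ok = careful_close(fh)
assert ok
assert fh.closed
```

Whether returning a flag or logging is the right recovery depends on the program; the point is only that the failure path exists and deserves a decision.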

Stephen Hansen wrote:
> This is precisely why the with statement exists; to provide a cleaner
> way to wrap a block in setup and teardown functions. Closing is one.
> Yeah, you get some extra indentation-- but you sorta have to live with
> it if you're worried about correct code. I think it's a good compromise
> between your examples of nasty and nice
>
> def compromise(from_, to_):
>     with file(to_, "w") as to_h:
>         with file(from_) as from_h:
>             for line in from_h:
>                 print >> to_h, munge(line)
>
> It's just too bad that 'with' doesn't support multiple separate "x as y"
> clauses.

The developers already agreed with you ;-).

"With more than one item, the context managers are processed as if
multiple with statements were nested:

    with A() as a, B() as b:
        suite

is equivalent to

    with A() as a:
        with B() as b:
            suite

Changed in version 3.1: Support for multiple context expressions."
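Under that 3.1 syntax the quoted compromise() flattens to a single with statement; a sketch using open(), since file() is gone in Python 3:

```python
import os
import tempfile

fd, src = tempfile.mkstemp(); os.close(fd)
fd, dst = tempfile.mkstemp(); os.close(fd)
with open(src, "w") as f:
    f.write("hello\n")

# Two context managers, one with statement, one extra indent level.
with open(src) as from_h, open(dst, "w") as to_h:
    for line in from_h:
        to_h.write(line.upper())

assert from_h.closed and to_h.closed   # both closed on exit
with open(dst) as f:
    result = f.read()
assert result == "HELLO\n"
os.remove(src); os.remove(dst)
```

The managers exit in reverse order of entry, so to_h is closed last, matching the nested form in the quoted docs.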

On Sep 9, 4:19 pm, Charles Yeomans <> wrote:
(snip)
> Unfortunately, both of these simple templates have the following
> problem -- if open fails, a NameError will be raised from the finally
> block.

(snip)
> I removed the except block because I prefer exceptions to error codes.

How will the caller know an exception has occurred? What if logic
depends on validating that a file *had* or *had not* been written
to, huh?
> In addition to fixing the latent bug in the second simple template, I
> took the opportunity to correct your heinous violation of
> command-query separation.
>
> Charles Yeomans

Oh I see! But what happens if the filename does not exist? What then?
"open" will blow chunks, that's what! Here is a version for the
paranoid-schizophrenic-sadomasochists out there...

On Thu, 10 Sep 2009 08:26:16 -0300, David C. Ullrich
<> wrote:
> On Wed, 9 Sep 2009 15:13:49 -0700 (PDT), r <> wrote:
>> On Sep 9, 4:19 pm, Charles Yeomans <> wrote:
>>> I removed the except block because I prefer exceptions to error codes.
>>
>> How will the caller know an exception has occurred? What if logic
>> depends on validating that a file *had* or *had not* been written
>> to, huh?
>>
>> Oh I see! But what happens if the filename does not exist? What then?
>> "open" will blow chunks, that's what! Here is a version for the
>> paranoid-schizophrenic-sadomasochists out there...
>
> Well first, we agree that putting the open() in the try part of a
> try-finally is wrong. try-finally is supposed to ensure that
> _allocated_ resources are cleaned up.
>
> What you do below may work. But it's essentially throwing
> out exception handling and using error codes instead. There
> are plenty of reasons why exceptions are preferred. The
> standard thing is this:
>
> def UseResource(rname):
>     r = get(rname)
>     try:
>         r.use()
>     finally:
>         r.cleanup()

And it is so widely used that it got its own syntax (the with statement)
and library support (contextlib, for creating custom context managers). In
any decent version of Python this idiom becomes:
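The rewritten idiom was dropped from the archive; it presumably looked like this sketch (the resource class and get() are stand-ins for the hypothetical names in the quoted code, with cleanup() renamed close() so contextlib.closing can adapt it):

```python
from contextlib import closing

# Stand-ins for the quoted UseResource's hypothetical resource API.
class _Resource:
    def __init__(self):
        self.used = False
        self.closed = False
    def use(self):
        self.used = True
    def close(self):          # plays the role of cleanup()
        self.closed = True

def get(rname):
    return _Resource()

def use_resource(rname):
    # The try/finally pattern rewritten with the with statement:
    # closing() turns any object with a close() method into a
    # context manager, so cleanup is guaranteed on exit.
    with closing(get(rname)) as r:
        r.use()
        return r

r = use_resource("demo")
assert r.used and r.closed
```

For resources with richer setup/teardown, contextlib's contextmanager decorator covers the general case.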

That's the standard argument showing why structured exceptions are a Good
Thing. I'd say everyone should be aware of that, given the ample usage of
exceptions in Python, but it looks like people require a reminder from time
to time.
