[ https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738134#comment-13738134
]
Vinay commented on HDFS-4504:
-----------------------------
{quote}In some cases, DFSOutputStream#closed and DFSOutputStream#lastException will be set
by the DataStreamer, prior to DFSOutputStream#close being called. In those cases, we need
to throw an exception from close prior to clearing the exception.{quote}
I assume these cases were never handled. Without handling pipeline failures, this patch
is incomplete.
Pipeline failures while writing data are also among the most likely failures to happen.
On a pipeline failure, {{closed}} is set to {{true}} by the DataStreamer thread itself
(as already mentioned in [~cmccabe]'s comment). The first call to close() throws the pipeline
failure exception, but subsequent calls to close() just return. *So the stream is never marked
as a zombie, and its resources are never released.*
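To make the sequence concrete, here is a minimal toy model of the behavior described above. It is only a sketch, not the actual {{DFSOutputStream}} or patch code: {{CloseSketch}}, {{ToyStream}}, {{streamerFails}}, {{closeThrows}} and {{leaseReleased}} are hypothetical names, and the lease is reduced to a boolean flag.

```java
import java.io.IOException;

// Toy model only: every name here is hypothetical, not taken from Hadoop.
public class CloseSketch {

    public static class ToyStream {
        private boolean closed = false;
        private IOException lastException = null;
        // Stands in for "lease released / resources freed".
        public boolean leaseReleased = false;

        // Models the DataStreamer thread hitting a pipeline failure:
        // it records the exception and marks the stream closed itself.
        public synchronized void streamerFails(IOException e) {
            lastException = e;
            closed = true;
        }

        public synchronized void close() throws IOException {
            if (closed) {
                IOException e = lastException;
                lastException = null; // the exception is reported once, then cleared
                if (e != null) {
                    throw e;          // first close() after the failure
                }
                return;               // later close() calls: nothing is cleaned up
            }
            closed = true;
            leaseReleased = true;     // normal path: resources are released
        }
    }

    // Returns true if close() threw an IOException.
    public static boolean closeThrows(ToyStream s) {
        try {
            s.close();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ToyStream s = new ToyStream();
        s.streamerFails(new IOException("pipeline failure"));
        System.out.println("first close threw:  " + closeThrows(s));
        System.out.println("second close threw: " + closeThrows(s));
        System.out.println("lease released:     " + s.leaseReleased);
    }
}
```

In this model the early-return branch is the problem: once the streamer has set {{closed}}, no call to close() ever reaches the cleanup path, so the flag standing in for the lease stays unset.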
You can verify this behavior by changing your test {{testCloseWithDatanodeDown}} from
{code}
+ out.write(100);
+ cluster.stopDataNode(0);
{code}
to
{code}
+ out.write(100);
+ out.hflush();
+ out.write(100);
+ cluster.stopDataNode(0);
{code}
Please check.
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>
> Key: HDFS-4504
> URL: https://issues.apache.org/jira/browse/HDFS-4504
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, HDFS-4504.007.patch, HDFS-4504.008.patch,
> HDFS-4504.009.patch, HDFS-4504.010.patch
>
>
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases. One example is
> if there is a pipeline error and then pipeline recovery fails. Unfortunately, in this case,
> some of the resources used by the {{DFSOutputStream}} are leaked. One particularly important
> resource is file leases.
> So it's possible for a long-lived HDFS client, such as Flume, to write many blocks to
> a file, but then fail to close it. Unfortunately, the {{LeaseRenewerThread}} inside the client
> will continue to renew the lease for the "undead" file. Future attempts to close the file
> will just rethrow the previous exception, and no progress can be made by the client.