[ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HDFS-941:
-----------------------------
Attachment: hdfs-941.txt
Updated patch to fix failing tests:
- TestDFSClientRetries works by setting the max xceiver count to something very low, like 2,
and then hammering the datanode with a lot of clients to make sure the randomized backoff lets
them all eventually succeed. With even a short keepalive on the datanode side, the xceivers
were occupied for too long. I set the DN keepalive config to 0 for this test case and modified
the DN code so that a config setting of 0 disables the behavior.
- TestNameNodeMetrics was looking at the cluster "load" (read: xceiver count) as one of the
metrics. This was therefore sensitive to timing, since it depended on whether the DN heartbeated
during the keepalive window or after it had expired. I removed this assert since the other
metrics already provide good coverage.
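
To illustrate the "0 disables the behavior" rule from the first item, here is a minimal sketch. It is not the code in the patch: the class KeepalivePolicy and its method names are hypothetical, and the real datanode may structure this check differently.

```java
// Hypothetical helper, not taken from the HDFS-941 patch: models how a
// datanode-side keepalive setting could decide whether an idle connection
// stays open to wait for another operation.
final class KeepalivePolicy {
    private final long keepaliveMillis;

    KeepalivePolicy(long keepaliveMillis) {
        if (keepaliveMillis < 0) {
            throw new IllegalArgumentException("keepalive must be >= 0");
        }
        this.keepaliveMillis = keepaliveMillis;
    }

    // A setting of 0 disables reuse: the connection closes after one operation,
    // so the xceiver slot is freed immediately.
    boolean reuseEnabled() {
        return keepaliveMillis > 0;
    }

    // True while the connection has been idle for less than the keepalive window.
    boolean shouldKeepOpen(long idleMillis) {
        return reuseEnabled() && idleMillis < keepaliveMillis;
    }
}
```

With a keepalive of 0, shouldKeepOpen is always false, which is the behavior a test with a very low xceiver limit needs.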
> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
> Key: HDFS-941
> URL: https://issues.apache.org/jira/browse/HDFS-941
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node, hdfs client
> Affects Versions: 0.22.0
> Reporter: Todd Lipcon
> Assignee: bc Wong
> Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch,
> HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.patch, HDFS-941-6.patch, HDFS-941-6.patch,
> hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one operation.
> In the case that an operation leaves the stream in a well-defined state (e.g., a client
> reads to the end of a block successfully), the same connection could be reused for a second
> operation. This should improve random read performance significantly.
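
The reuse idea in the description can be sketched as a small client-side cache of idle connections, keyed by datanode address. This is only an illustration under assumptions: ConnectionCache, acquire, and release are hypothetical names, the connection type is left generic, and the actual patch may organize reuse quite differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the HDFS client API: caches idle connections per
// datanode so a later operation can skip setting up a new connection.
final class ConnectionCache<C> {
    private final Map<String, Deque<C>> idleByDatanode = new HashMap<>();

    // Returns a cached idle connection for this datanode, or null if none exists
    // (the caller then opens a fresh connection).
    synchronized C acquire(String datanodeAddr) {
        Deque<C> idle = idleByDatanode.get(datanodeAddr);
        return (idle == null) ? null : idle.pollFirst();
    }

    // Offers a connection back for reuse. It is cached only if the previous
    // operation left the stream in a well-defined state, e.g. the client read
    // to the end of a block successfully. Returns false if the caller should
    // close the connection instead.
    synchronized boolean release(String datanodeAddr, C conn, boolean streamWellDefined) {
        if (!streamWellDefined) {
            return false;
        }
        idleByDatanode.computeIfAbsent(datanodeAddr, k -> new ArrayDeque<>()).addFirst(conn);
        return true;
    }
}
```

A reader would call acquire before opening a connection and release after finishing an operation, closing the connection whenever release returns false.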