Message-ID: <10866986.146901280874738314.JavaMail.jira@thor>
Date: Tue, 3 Aug 2010 18:32:18 -0400 (EDT)
From: "Hairong Kuang (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] Updated: (HADOOP-6889) Make RPC to have an option to timeout
In-Reply-To: <30758566.75561280434516176.JavaMail.jira@thor>
[ https://issues.apache.org/jira/browse/HADOOP-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HADOOP-6889:
----------------------------------
Attachment: ipcTimeout.patch
This patch passes an rpcTimeout parameter to the RPC#getProxy method. A non-positive rpcTimeout means that RPC does not time out, which is the default behavior. If rpcTimeout is positive, an RPC client throws a SocketTimeoutException if it has not received a response within the rpcTimeout period.
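The semantics described above can be illustrated with plain java.net sockets. This is only a sketch and not code from ipcTimeout.patch: the class RpcTimeoutSketch and the helpers toSoTimeout and timesOut are hypothetical names, and treating rpcTimeout as milliseconds mapped directly onto the socket read timeout is an assumption made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration; not part of Hadoop or of ipcTimeout.patch.
final class RpcTimeoutSketch {

    // Assumption: a non-positive rpcTimeout maps to 0, which java.net.Socket
    // interprets as "block indefinitely" (no read timeout).
    static int toSoTimeout(int rpcTimeoutMs) {
        return rpcTimeoutMs > 0 ? rpcTimeoutMs : 0;
    }

    // Connects to a local server that never replies and reports whether the
    // read ended with a SocketTimeoutException. Call this only with a
    // positive rpcTimeoutMs: with a non-positive value the read blocks forever.
    static boolean timesOut(int rpcTimeoutMs) {
        try (ServerSocket silentServer = new ServerSocket(0);
             Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                        silentServer.getLocalPort())) {
            client.setSoTimeout(toSoTimeout(rpcTimeoutMs));
            try {
                client.getInputStream().read(); // waits for a response byte
                return false;                   // data or end of stream arrived
            } catch (SocketTimeoutException e) {
                return true;                    // no response within the timeout
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, RpcTimeoutSketch.timesOut(200) gives up after roughly 200 ms against the silent server, which is the behavior a client would want when a DataNode stops responding.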
> Make RPC to have an option to timeout
> -------------------------------------
>
> Key: HADOOP-6889
> URL: https://issues.apache.org/jira/browse/HADOOP-6889
> Project: Hadoop Common
> Issue Type: New Feature
> Components: ipc
> Affects Versions: 0.22.0
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.22.0, 0.20-append
>
> Attachments: ipcTimeout.patch
>
>
> Currently, Hadoop RPC does not time out as long as the RPC server is alive. Whenever a socket timeout occurs, the RPC client sends a ping to the server; if the server is still alive, the client continues to wait instead of throwing a SocketTimeoutException. This keeps a client from retrying while a server is busy, which would make the server even busier. This works well when the RPC server is a NameNode.
> But Hadoop RPC is also used for some client-to-DataNode communication, for example, getting a replica's length. When a client encounters a problematic DataNode, it gets stuck and cannot switch to a different DataNode. In this case, it would be better for the client to receive a timeout exception.
> I plan to add a new configuration property, ipc.client.max.pings, that specifies the maximum number of pings a client may send. If no response is received after that many pings, a SocketTimeoutException is thrown. If the property is not set, the client keeps the current semantics and waits forever.
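The ping-limit proposal in the issue description could be sketched as the loop below. It is a simulation rather than Hadoop's IPC client: MaxPingsSketch and awaitResponse are hypothetical names, each call to the supplier stands in for one read attempt that either returns a response or hits a socket timeout, and representing an unset ipc.client.max.pings as a non-positive value is an assumption.

```java
import java.net.SocketTimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of the proposed ipc.client.max.pings semantics.
final class MaxPingsSketch {

    // readSucceeds: one simulated read attempt; true means a response arrived,
    //               false means the read hit a socket timeout.
    // maxPings:     assumption: a non-positive value means "not set", so the
    //               client keeps pinging and waiting forever.
    // Returns the number of pings sent before the response arrived.
    static int awaitResponse(BooleanSupplier readSucceeds, int maxPings)
            throws SocketTimeoutException {
        int pingsSent = 0;
        while (true) {
            if (readSucceeds.getAsBoolean()) {
                return pingsSent;
            }
            // A socket timeout occurred: give up once the ping budget is spent.
            if (maxPings > 0 && pingsSent >= maxPings) {
                throw new SocketTimeoutException(
                        "no response after " + pingsSent + " pings");
            }
            pingsSent++; // a real client would send a ping to the server here
        }
    }
}
```

With a limit of 3 and a server that never answers, the call throws after the third ping goes unanswered; with a non-positive limit it never returns, matching the current wait-forever behavior.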