First, if $r->read reads unchunked data then why is there a
Transfer-Encoding header saying that the content is chunked? Shouldn't
that header be removed? How does one know if the content is chunked or
not, otherwise?

Second, if there's no Content-Length header then how does one know how much
data to read using $r->read?

One answer is until $r->read returns zero bytes, of course. But is that
guaranteed to always be the case, even for, say, pipelined requests? My
guess is yes because whatever is de-chunking the request knows to stop
after reading the last chunk, trailer and empty line. Can anyone elaborate
on how Apache/mod_perl is doing this?

Perhaps I'm approaching this incorrectly, but this is all a bit untidy.

--001a11c39f08619fd304e088e5d0--
From modperl-return-63390-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Wed Jul 3 18:35:34 2013
Date: Wed, 3 Jul 2013 14:34:55 -0400 (EDT)
From: Jim Schueler
X-X-Sender: jschueler@server.tqis.com
To: Bill Moseley
cc: mod_perl list
Subject: Re: mod_perl and Transfer-Encoding: chunked
Content-Type: MULTIPART/MIXED; BOUNDARY="-733756761-961859947-1372876498=:25557"
---733756761-961859947-1372876498=:25557
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8BIT
I played around with chunking recently in the context of media streaming:
the client requests only a "chunk" of data. "Chunking" is how media
players perform a "seek". It was originally implemented for FTP
transfers, e.g., to transfer a large file in (say) 10K chunks. In the
case that you describe below, if no Content-Length is specified, that
indicates "send the remainder".

From what I know, a "chunk" request header is used this way to specify the
server response. It does not reflect anything about the data included in
the body of the request. So first, I would ask if you're confused about
this request information.

Hypothetically, some browsers might try to upload large files in small
chunks and the "chunk" header might reflect a push transfer. I don't know
if "chunk" is ever used for this purpose. But it would require the
following characteristics:

  1. The browser would need to originally inquire if the server is
     capable of this type of request.
  2. Each chunk of data will arrive in a separate and independent HTTP
     request, not necessarily in the order they were sent.
  3. Two or more requests may be handled simultaneously by separate
     processes that can't write into a single destination.
  4. Somehow the server needs to request a resend if a chunk is missing.
     Solving this problem requires an imaginative use of HTTP.

Sounds messy. But might be appropriate for 100M+ sized uploads. This
*may* reflect your situation. Can you please confirm?

For a single process, the incoming Content-Length is unnecessary. Buffered
I/O automatically knows when transmission is complete. The read()
argument is the buffer size, not the content length. Whether you spool
the buffer to disk or simply enlarge the buffer should be determined by
your hardware capabilities. This is standard I/O behavior that has nothing
to do with HTTP chunking. Without a "Content-Length" header, after looping
your read() operation, determine the length of the aggregate data and pass
that to Catalyst.
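
A minimal sketch of that loop, assuming a request object whose
read($buffer, $length) method returns the number of bytes read and 0 at
the end of the body, as $r->read is described in this thread. The
slurp_body helper and the MockRequest class are hypothetical names,
introduced only so the example runs outside Apache.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the request object so the sketch runs outside
# Apache. Its read() mimics the behaviour described in this thread: it fills
# the caller's buffer and returns the byte count, or 0 at end of body.
package MockRequest;

sub new {
    my ($class, $data) = @_;
    return bless { data => $data, pos => 0 }, $class;
}

sub read {
    my ($self, undef, $len) = @_;
    my $piece = substr($self->{data}, $self->{pos}, $len);
    $self->{pos} += length $piece;
    $_[1] = $piece;    # write into the caller's buffer through the alias
    return length $piece;
}

package main;

# Read until read() returns 0, then report the aggregate length, which can
# stand in for the missing Content-Length.
sub slurp_body {
    my ($r, $block_size) = @_;
    my $body = '';
    while ( $r->read( my $buffer, $block_size ) ) {
        $body .= $buffer;
    }
    return ( $body, length $body );
}

my ( $body, $length ) = slurp_body( MockRequest->new('{"id":42}'), 4 );
print "$length bytes: $body\n";
```

The block size only bounds how much is fetched per call; the loop ends on
the zero return regardless of how the body was framed on the wire.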

But if you're confident that the complete request spans several smaller
(chunked) HTTP requests, you'll need to address all the problems I've
described above, plus the problem of re-assembling the whole thing for
Catalyst. I don't know anything about Plack; maybe it can perform all
this required magic.

Otherwise, if the whole purpose of the Plack temporary file is to pass a
file handle, you can pass a buffer as a file handle. That used to require
IO::String, but now the functionality is built into the core.
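
The in-memory handle mentioned above can be sketched as follows. Opening
a filehandle on a scalar reference is core Perl (5.8 and later); the
variable names and sample body are only illustrative.

```perl
use strict;
use warnings;

# Body already collected in memory, for example by a read loop.
my $body = "{\"name\":\"example\"}\n";

# Core Perl can open a filehandle on a scalar reference, so neither a
# temp file nor IO::String is needed.
open( my $fh, '<', \$body ) or die "Cannot open in-memory handle: $!";

# The handle behaves like an ordinary read handle.
my $first_line = <$fh>;
close $fh;

print "length=", length($body), " first line=$first_line";
```

Whether this is preferable to a temp file depends on how large the bodies
can get, since the whole body stays in memory.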

By your last paragraph, I'm really lost. Since you're already passing the
request as a file handle, I'm guessing that Catalyst creates the
temporary file for the *response* body. Can you please clarify? Also,
what do you mean by "de-chunking"? Is that the same thing as
re-assembling?

Wish I could give a better answer. Let me know if this helps.

-Jim

On Tue, 2 Jul 2013, Bill Moseley wrote:
> For requests that are chunked (Transfer-Encoding: chunked and no
> Content-Length header) calling $r->read returns unchunked data from the
> socket.
> That's indeed handy. Is that mod_perl doing that un-chunking or is it
> Apache?
>
> But, it leads to some questions.
>
> First, if $r->read reads unchunked data then why is there a
> Transfer-Encoding header saying that the content is chunked? Shouldn't
> that header be removed? How does one know if the content is chunked or
> not, otherwise?
>
> Second, if there's no Content-Length header then how does one know how much
> data to read using $r->read?
>
> One answer is until $r->read returns zero bytes, of course. But, is
> that guaranteed to always be the case, even for, say, pipelined requests?
> My guess is yes because whatever is de-chunking the request knows to stop
> after reading the last chunk, trailer and empty line. Can anyone elaborate
> on how Apache/mod_perl is doing this?
>
>
> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>
> I'm using Catalyst and Catalyst needs a Content-Length. So, I have a Plack
> Middleware component that creates a temporary file writing the buffer from
> $r->read( my $buffer, 64 * 1024 ) until that returns zero bytes. I pass
> this file handle onto Catalyst.
>
> Then, for some content-types, Catalyst (via HTTP::Body) writes the body to
> another temp file. I don't know how Apache/mod_perl does its de-chunking,
> but I can call $r->read with a huge buffer length and Apache returns that.
> So, maybe Apache is buffering to disk, too.
>
> In other words, for each tiny chunked JSON POST or PUT I'm creating two (or
> three?) temp files which doesn't seem ideal.
>
>
> --
> Bill Moseley
> moseley@hank.org
>
>
---733756761-961859947-1372876498=:25557--
From modperl-return-63391-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Wed Jul 3 18:45:50 2013
From: Bill Moseley
Date: Wed, 3 Jul 2013 11:44:31 -0700
Subject: Re: mod_perl and Transfer-Encoding: chunked
To: Jim Schueler
Cc: mod_perl list
Content-Type: multipart/alternative; boundary=001a11c25daaf6fa6b04e09fdb35
--001a11c25daaf6fa6b04e09fdb35
Content-Type: text/plain; charset=ISO-8859-1
Hi Jim,
This is the Transfer-Encoding: chunked I was writing about:
http://tools.ietf.org/html/rfc2616#section-3.6.1
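
For readers unfamiliar with that section: a chunked body is a series of
chunks, each a hexadecimal size line followed by that many bytes and a
CRLF, ended by a zero-size chunk, optional trailer headers, and an empty
line. The sketch below decodes such a body from a string. It only
illustrates the wire format and is not how Apache implements it; dechunk
is a hypothetical helper name, chunk extensions are ignored, and trailers
are skipped.

```perl
use strict;
use warnings;

# Decode a complete chunked body held in a string. Returns the decoded
# payload and whatever bytes follow the terminating empty line (for
# example the start of a pipelined request).
sub dechunk {
    my ($wire) = @_;
    my $payload = '';
    my $pos     = 0;

    while (1) {
        # Chunk-size line: hex digits, optional ";extension", then CRLF.
        pos($wire) = $pos;
        $wire =~ /\G([0-9A-Fa-f]+)(?:;[^\r\n]*)?\r\n/gc
            or die "malformed chunk size at offset $pos";
        my $size = hex $1;
        $pos = pos($wire);

        last if $size == 0;    # the last chunk has size zero

        die "truncated chunk" if $pos + $size + 2 > length($wire);
        $payload .= substr( $wire, $pos, $size );
        substr( $wire, $pos + $size, 2 ) eq "\r\n"
            or die "missing CRLF after chunk data";
        $pos += $size + 2;
    }

    # Skip optional trailer header lines up to the empty line.
    pos($wire) = $pos;
    $wire =~ /\G(?:[^\r\n]+\r\n)*\r\n/gc
        or die "missing final CRLF";
    my $rest = substr( $wire, pos($wire) );

    return ( $payload, $rest );
}

my $wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\nGET /next HTTP/1.1\r\n";
my ( $payload, $rest ) = dechunk($wire);
print "payload=$payload\n";
```

Because the zero-size chunk and the empty line mark the end of the body,
the bytes of a following pipelined request are left untouched in $rest.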
--
Bill Moseley
moseley@hank.org
--001a11c25daaf6fa6b04e09fdb35--
From modperl-return-63392-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Wed Jul 3 18:54:30 2013
Content-Type: multipart/alternative;
boundary=Apple-Mail-A12031E3-C9FA-4758-86B1-6E0661129822
Content-Transfer-Encoding: 7bit
Message-Id: <3FF9F5A0-8805-49C0-843C-5014BDA87B44@yahoo.com>
Cc: Jim Schueler ,
mod_perl list
From: Joseph Schaefer
Subject: Re: mod_perl and Transfer-Encoding: chunked
Date: Wed, 3 Jul 2013 14:53:58 -0400
To: Bill Moseley
--Apple-Mail-A12031E3-C9FA-4758-86B1-6E0661129822
Content-Type: text/plain;
charset=us-ascii
Content-Transfer-Encoding: quoted-printable
When you read from the input filter chain as $r->read does, the HTTP input
filter automatically handles the protocol and passes the dechunked data up
to the caller. It does not spool the stream at all.

You'd have to look at how mod_perl implements read to see if it loops its
ap_get_brigade calls on the input filter chain to fill the passed buffer to
the desired length or not. But under no circumstances should you have to
deal with chunked data directly.

HTH
Sent from my iPhone
--Apple-Mail-A12031E3-C9FA-4758-86B1-6E0661129822
Content-Type: text/html;
charset=utf-8
Content-Transfer-Encoding: quoted-printable

When you read from the input filter ch=
ain as $r->read does, the http input filter automatically handles the pro=
tocol and passes the dechunked data up to the caller. It does not spool the s=
tream at all.

You'd have to look at how mod perl im=
plements read to see if it loops its ap_get_brigade calls on the input filte=
r chain to fill the passed buffer to the desired length or not. But un=
der no circumstances should you have to deal with chunked data directly.

I played around with chunking recently in the c=
ontext of media streaming: The client is only requesting a "chunk" of data. &=
nbsp;"Chunking" is how media players perform a "seek". It was original=
ly implemented for FTP transfers: E.g, to transfer a large file in (sa=
y 10K) chunks. In the case that you describe below, if no Content-Leng=
th is specified, that indicates "send the remainder".

=46rom what I know, a "chunk" request header is used this way to specify the=
server response. It does not reflect anything about the data included=
in the body of the request. So first, I would ask if you're confused a=
bout this request information.

Hypothetically, some browsers might try to upload large files in small chunk=
s and the "chunk" header might reflect a push transfer. I don't know i=
f "chunk" is ever used for this purpose. But it would require the foll=
owing characteristics:

1. The browser would need to originally inquire if the server is
   capable of this type of request.
2. Each chunk of data will arrive in a separate and independent HTTP
   request. Not necessarily in the order they were sent.
3. Two or more requests may be handled by separate processes
   simultaneously that can't be written into a single destination.
4. Somehow the server needs to request a resend if a chunk is missing.
   Solving this problem requires an imaginative use of HTTP.

Sounds messy. But might be appropriate for 100M+ sized uploads. This
*may* reflect your situation. Can you please confirm?

For a single process, the incoming content-length is unnecessary. Buffered
I/O automatically knows when transmission is complete. The read() argument
is the buffer size, not the content length. Whether you spool the buffer
to disk or simply enlarge the buffer should be determined by your hardware
capabilities. This is standard IO behavior that has nothing to do with
HTTP chunk. Without a "Content-Length" header, after looping your read()
operation, determine the length of the aggregate data and pass that to
Catalyst.

But if you're confident that the complete request spans several smaller
(chunked) HTTP requests, you'll need to address all the problems I've
described above, plus the problem of re-assembling the whole thing for
Catalyst. I don't know anything about Plack, maybe it can perform all
this required magic.

Otherwise, if the whole purpose of the Plack temporary file is to pass a
file handle, you can pass a buffer as a file handle. Used to be
IO::String, but now that functionality is built into the core.
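
The in-memory file handle mentioned above can be sketched as follows. Since Perl 5.8, open accepts a reference to a scalar in place of a file name, which gives a readable handle over a buffer without touching disk. body_handle is a hypothetical helper name, and whether Catalyst or Plack accepts such a handle in place of a temporary file is an assumption that this thread does not confirm.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap an already-read request body in a read-only
# in-memory file handle and report its length for use as Content-Length.
sub body_handle {
    my ($body) = @_;
    # Opening a scalar reference gives a file handle backed by the scalar.
    open(my $fh, '<', \$body)
        or die "cannot open in-memory handle: $!";
    return ($fh, length $body);
}

my ($fh, $content_length) = body_handle('{"name":"example"}');
read($fh, my $copy, $content_length);
close $fh;
print "read back $content_length bytes\n";
```

One caveat: an in-memory handle has no underlying file descriptor, so code that relies on fileno or on passing the handle to another process would still need a real file.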

By your last paragraph, I'm really lost. Since you're already passing the
request as a file handle, I'm guessing that Catalyst creates the temporary
file for the *response* body. Can you please clarify? Also, what do you
mean by "de-chunking"? Is that the same thing as re-assembling?

Wish I could give a better answer. Let me know if this helps.

-Jim

On Tue, 2 Jul 2013, Bill Moseley wrote:

For requests that are chunked (Transfer-Encoding: chunked and no
Content-Length header) calling $r->read returns unchunked data from the
socket.
That's indeed handy. Is that mod_perl doing that un-chunking or is it
Apache?

But, it leads to some questions.

First, if $r->read reads unchunked data then why is there a
Transfer-Encoding header saying that the content is chunked? Shouldn't
that header be removed? How does one know if the content is chunked or
not, otherwise?

Second, if there's no Content-Length header then how does one know how much
data to read using $r->read?

One answer is until $r->read returns zero bytes, of course. But, is
that guaranteed to always be the case, even for, say, pipelined requests?
My guess is yes because whatever is de-chunking the request knows to stop
after reading the last chunk, trailer and empty line. Can anyone elaborate
on how Apache/mod_perl is doing this?

Perhaps I'm approaching this incorrectly, but this is all a bit untidy.

I'm using Catalyst and Catalyst needs a Content-Length. So, I have a Plack
Middleware component that creates a temporary file writing the buffer from
$r->read( my $buffer, 64 * 1024 ) until that returns zero bytes. I pass
this file handle onto Catalyst.

Then, for some content-types, Catalyst (via HTTP::Body) writes the body to
another temp file. I don't know how Apache/mod_perl does its de-chunking,
but I can call $r->read with a huge buffer length and Apache returns that.
So, maybe Apache is buffering to disk, too.

In other words, for each tiny chunked JSON POST or PUT I'm creating two (or
three?) temp files which doesn't seem ideal.

--Apple-Mail-A12031E3-C9FA-4758-86B1-6E0661129822--
From modperl-return-63393-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Wed Jul 3 20:26:47 2013
Date: Wed, 3 Jul 2013 16:26:19 -0400 (EDT)
From: Jim Schueler
X-X-Sender: jschueler@server.tqis.com
To: Bill Moseley
cc: mod_perl list
Subject: Re: mod_perl and Transfer-Encoding: chunked
In-Reply-To:
Message-ID:
References:
User-Agent: Alpine 2.00 (LRH 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="-733756761-38241519-1372883180=:25557"
X-Virus-Checked: Checked by ClamAV on apache.org
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
---733756761-38241519-1372883180=:25557
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8BIT

Thanks for the prompt response, but this is your question, not mine. I
hardly need an RTFM for my trouble.

I drew my conclusions using a packet sniffer. And as far-fetched as my
answer may seem, it's more plausible than your theory that Apache or
modperl is decoding a raw socket stream.

The crux of your question seems to be how the request content gets
magically re-assembled. I don't think it was ever disassembled in the
first place. But if you don't like my answer, and you don't want to
ignore it either, then please restate the question. I can't find any
definition for unchunked, and Wiktionary's definition of de-chunk says to
"break apart a chunk", that is (counter-intuitively) chunk a chunk.

> Second, if there's no Content-Length header then how does one know how much
> data to read using $r->read?
>
> One answer is until $r->read returns zero bytes, of course. But, is
> that guaranteed to always be the case, even for, say, pipelined requests?
> My guess is yes because whatever is de-chunking the

read() is blocking. So it never returns 0, even in a pipeline request (if
no data is available, it simply waits). I don't wish to discuss the
merits here, but there is no technical imperative for a content-length
request in the request header.

-Jim

On Wed, 3 Jul 2013, Bill Moseley wrote:

> Hi Jim,
> This is the Transfer-Encoding: chunked I was writing about:
>
> http://tools.ietf.org/html/rfc2616#section-3.6.1
---733756761-38241519-1372883180=:25557--
From modperl-return-63394-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Wed Jul 3 20:32:20 2013
Date: Wed, 3 Jul 2013 16:31:51 -0400 (EDT)
From: Jim Schueler
X-X-Sender: jschueler@server.tqis.com
To: mod_perl list
Subject: Re: mod_perl and Transfer-Encoding: chunked
In-Reply-To:
Message-ID:
References:
User-Agent: Alpine 2.00 (LRH 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="-733756761-1755238508-1372883375=:25557"
Content-ID:
X-Virus-Checked: Checked by ClamAV on apache.org
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
---733756761-1755238508-1372883375=:25557
Content-Type: TEXT/PLAIN; CHARSET=ISO-8859-15; FORMAT=flowed
Content-Transfer-Encoding: 8BIT
Content-ID:
In light of Joe Schaefer's response, I appear to be outgunned. So, if
nothing else, can someone please clarify whether "de-chunked" means
re-assembled?
-Jim
---733756761-1755238508-1372883375=:25557--
From modperl-return-63395-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Wed Jul 3 20:41:35 2013
MIME-Version: 1.0
In-Reply-To:
References:
Date: Wed, 3 Jul 2013 16:41:08 -0400
Message-ID:
Subject: Re: mod_perl and Transfer-Encoding: chunked
From: Jeff Trawick
To: Jim Schueler
Cc: mod_perl list
Content-Type: multipart/alternative; boundary=14dae94ed6413c70fe04e0a17bf6
X-Virus-Checked: Checked by ClamAV on apache.org
--14dae94ed6413c70fe04e0a17bf6
Content-Type: text/plain; charset=ISO-8859-1

On Wed, Jul 3, 2013 at 4:31 PM, Jim Schueler wrote:

> In light of Joe Schaefer's response, I appear to be outgunned. So, if
> nothing else, can someone please clarify whether "de-chunked" means
> re-assembled?

Yes, where re-assembled means converting it back to the original data
stream without any sort of transfer encoding.
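
As an illustration of what "de-chunked" means on the wire, the sketch below decodes a chunked body (RFC 2616 section 3.6.1) back into the original byte stream. It is not how Apache implements its HTTP input filter; dechunk is a hypothetical helper that ignores chunk extensions and trailers and expects the whole encoded body in a single string.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a complete chunked-encoded body held in one
# string. Each chunk is "<hex size>[;extensions]CRLF<data>CRLF"; a chunk of
# size zero ends the body. Trailers after the last chunk are ignored.
sub dechunk {
    my ($raw) = @_;
    my $body = '';
    my $pos  = 0;
    while (1) {
        my $eol = index($raw, "\r\n", $pos);
        die "missing chunk-size line\n" if $eol < 0;
        my $size_line = substr($raw, $pos, $eol - $pos);
        my ($hex) = $size_line =~ /^([0-9A-Fa-f]+)/
            or die "bad chunk size\n";
        my $size = hex $hex;
        $pos = $eol + 2;                  # step past the size line's CRLF
        last if $size == 0;               # last-chunk marker
        die "truncated chunk\n" if $pos + $size + 2 > length $raw;
        $body .= substr($raw, $pos, $size);
        $pos  += $size + 2;               # skip the data and its CRLF
    }
    return $body;
}

my $wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n";
print dechunk($wire), "\n";
```

The chunk sizes and CRLF framing exist only for transport; once they are stripped, the handler sees the same bytes it would have seen with a Content-Length body.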
--
Born in Roswell... married an alien...
http://emptyhammock.com/
--14dae94ed6413c70fe04e0a17bf6--
From modperl-return-63396-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Wed Jul 3 20:42:36 2013
References:
In-Reply-To:
Mime-Version: 1.0 (1.0)
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
charset=us-ascii
Message-Id: <0BC39C9B-5830-499B-A4E7-33BC76C3759B@yahoo.com>
Cc: mod_perl list
X-Mailer: iPhone Mail (10B350)
From: Joseph Schaefer
Subject: Re: mod_perl and Transfer-Encoding: chunked
Date: Wed, 3 Jul 2013 16:42:04 -0400
To: Jim Schueler
X-Virus-Checked: Checked by ClamAV on apache.org
Dechunked means it strips out the lines containing metadata about the
next block of raw data. The metadata is just the length of the next
block of data. Imagine a chunked stream is like having partial content
length headers embedded in the data stream.

The http filter embedded in httpd takes care of the metadata so you
don't have to parse the stream yourself. $r->read will always provide
only the raw data in a blocking call, until the stream is finished in
which case it should return 0 or an error code. Check the mod perl
docs, or better the source, to see if the semantics are more like
perl's sysread or more like read.
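
To make the description above concrete, here is a minimal sketch of what "dechunking" amounts to. It is written in Python purely for illustration and is not httpd's or mod_perl's code; the function name read_chunked_body and the sample bytes are made up for this example. It strips the size lines, stops at the zero-length chunk, and leaves any bytes that follow (such as a pipelined request) unread.

```python
import io


def read_chunked_body(stream):
    """Decode one chunked body from a binary stream and return the raw payload."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the last chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # the zero-length chunk marks the end of the body
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        body += data
        if stream.readline() != b"\r\n":  # CRLF that terminates the chunk data
            raise ValueError("missing CRLF after chunk data")
    # Skip optional trailer fields up to and including the empty line.
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):
            break
    return bytes(body)


# A chunked JSON body followed directly by the start of a pipelined request.
wire = io.BytesIO(b'7\r\n{"a": 1\r\n1\r\n}\r\n0\r\n\r\nGET /next HTTP/1.1\r\n')
payload = read_chunked_body(wire)
print(payload)      # the body with the chunk framing removed
print(wire.read())  # the pipelined request is still waiting in the stream
```

Because the decoder knows where the body ends from the framing itself, a caller sitting above it can simply read until it is told there is nothing left, with or without a Content-Length header.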
Sent from my iPhone
On Jul 3, 2013, at 4:31 PM, Jim Schueler wrote:
> In light of Joe Schaefer's response, I appear to be outgunned. So, if
> nothing else, can someone please clarify whether "de-chunked" means
> re-assembled?
>
> -Jim
>
> On Wed, 3 Jul 2013, Jim Schueler wrote:
>
>> Thanks for the prompt response, but this is your question, not mine. I
>> hardly need an RTFM for my trouble.
>>
>> I drew my conclusions using a packet sniffer. And as far-fetched as my
>> answer may seem, it's more plausible than your theory that Apache or
>> modperl is decoding a raw socket stream.
>>
>> The crux of your question seems to be how the request content gets
>> magically re-assembled. I don't think it was ever disassembled in the
>> first place. But if you don't like my answer, and you don't want to
>> ignore it either, then please restate the question. I can't find any
>> definition for unchunked, and Wiktionary's definition of de-chunk says
>> to "break apart a chunk", that is (counter-intuitively) chunk a chunk.
>>
>>> Second, if there's no Content-Length header then how
>>> does one know how much
>>> data to read using $r->read?
>>>
>>> One answer is until $r->read returns zero bytes, of
>>> course. But, is
>>> that guaranteed to always be the case, even for,
>>> say, pipelined requests?
>>> My guess is yes because whatever is de-chunking the
>>
>> read() is blocking. So it never returns 0, even in a pipeline request
>> (if no data is available, it simply waits). I don't wish to discuss the
>> merits here, but there is no technical imperative for a content-length
>> request in the request header.
>>
>> -Jim
>>
>> On Wed, 3 Jul 2013, Bill Moseley wrote:
>>=20
>>> Hi Jim,
>>> This is the Transfer-Encoding: chunked I was writing about:
>>> http://tools.ietf.org/html/rfc2616#section-3.6.1
>>> On Wed, Jul 3, 2013 at 11:34 AM, Jim Schueler
>>> wrote:
>>> I played around with chunking recently in the context of media
>>> streaming: The client is only requesting a "chunk" of data.
>>> "Chunking" is how media players perform a "seek". It was
>>> originally implemented for FTP transfers: E.g, to transfer a
>>> large file in (say 10K) chunks. In the case that you describe
>>> below, if no Content-Length is specified, that indicates "send
>>> the remainder".
>>>=20
>>>> =46rom what I know, a "chunk" request header is used this way to
>>> specify the server response. It does not reflect anything about
>>> the data included in the body of the request. So first, I would
>>> ask if you're confused about this request information.
>>>=20
>>> Hypothetically, some browsers might try to upload large files in
>>> small chunks and the "chunk" header might reflect a push
>>> transfer. I don't know if "chunk" is ever used for this
>>> purpose. But it would require the following characteristics:
>>>=20
>>> 1. The browser would need to originally inquire if the server
>>> is
>>> capable of this type of request.
>>> 2. Each chunk of data will arrive in a separate and
>>> independent HTTP
>>> request. Not necessarily in the order they were sent.
>>> 3. Two or more requests may be handled by separate processes
>>> simultaneously that can't be written into a single
>>> destination.
>>> 4. Somehow the server needs to request a resend if a chunk is
>>> missing.
>>> Solving this problem requires an imaginitive use of HTTP.
>>>=20
>>> Sounds messy. But might be appropriate for 100M+ sized uploads.
>>> This *may* reflect your situation. Can you please confirm?
>>>=20
>>> For a single process, the incoming content-length is
>>> unnecessary. Buffered I/O automatically knows when transmission
>>> is complete. The read() argument is the buffer size, not the
>>> content length. Whether you spool the buffer to disk or simply
>>> enlarge the buffer should be determined by your hardware
>>> capabilities. This is standard IO behavior that has nothing to
>>> do with HTTP chunk. Without a "Content-Length" header, after
>>> looping your read() operation, determine the length of the
>>> aggregate data and pass that to Catalyst.
>>>=20
>>> But if you're confident that the complete request spans several
>>> smaller (chunked) HTTP requests, you'll need to address all the
>>> problems I've described above, plus the problem of re-assembling
>>> the whole thing for Catalyst. I don't know anything about
>>> Plack, maybe it can perform all this required magic.
>>>=20
>>> Otherwise, if the whole purpose of the Plack temporary file is
>>> to pass a file handle, you can pass a buffer as a file handle.
>>> Used to be IO::String, but now that functionality is built into
>>> the core.
>>>=20
>>> By your last paragraph, I'm really lost. Since you're already
>>> passing the request as a file handle, I'm guessing that Catalyst
>>> creates the tempororary file for the *response* body. Can you
>>> please clarify? Also, what do you mean by "de-chunking"? Is
>>> that the same think as re-assembling?
>>>=20
>>> Wish I could give a better answer. Let me know if this helps.
>>>=20
>>> -Jim
>>>=20
>>> On Tue, 2 Jul 2013, Bill Moseley wrote:
>>>=20
>>> For requests that are chunked (Transfer-Encoding:
>>> chunked and no
>>> Content-Length header) calling $r->read returns
>>> unchunked data from the
>>> socket.
>>> That's indeed handy. Is that mod_perl doing that
>>> un-chunking or is it
>>> Apache?
>>>=20
>>> But, it leads to some questions. =20
>>>=20
>>> First, if $r->read reads unchunked data then why is
>>> there a
>>> Transfer-Encoding header saying that the content is
>>> chunked? Shouldn't
>>> that header be removed? How does one know if the
>>> content is chunked or
>>> not, otherwise?
>>>=20
>>> Second, if there's no Content-Length header then how
>>> does one know how much
>>> data to read using $r->read? =20
>>>=20
>>> One answer is until $r->read returns zero bytes, of
>>> course. But, is
>>> that guaranteed to always be the case, even for,
>>> say, pipelined requests? =20
>>> My guess is yes because whatever is de-chunking the
>>> request knows to stop
>>> after reading the last chunk, trailer and empty
>>> line. Can anyone elaborate
>>> on how Apache/mod_perl is doing this?=20
>>>=20
>>> Perhaps I'm approaching this incorrectly, but this
>>> is all a bit untidy.
>>>=20
>>> I'm using Catalyst and Catalyst needs a
>>> Content-Length. So, I have a Plack
>>> Middleware component that creates a temporary file
>>> writing the buffer from
>>> $r->read( my $buffer, 64 * 1024 ) until that returns
>>> zero bytes. I pass
>>> this file handle onto Catalyst.
>>>=20
>>> Then, for some content-types, Catalyst (via
>>> HTTP::Body) writes the body to
>>> another temp file. I don't know how
>>> Apache/mod_perl does its de-chunking,
>>> but I can call $r->read with a huge buffer length
>>> and Apache returns that.
>>> So, maybe Apache is buffering to disk, too.
>>>=20
>>> In other words, for each tiny chunked JSON POST or
>>> PUT I'm creating two (or
>>> three?) temp files which doesn't seem ideal.
>>>=20
>>> --
>>> Bill Moseley
>>> moseley@hank.org
>>> --
>>> Bill Moseley
>>> moseley@hank.org
From modperl-return-63397-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Thu Jul 4 08:37:51 2013
Message-ID: <51D53430.9080201@beamartyr.net>
Date: Thu, 04 Jul 2013 11:37:04 +0300
From: Issac Goldstand
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20130620 Thunderbird/17.0.7
MIME-Version: 1.0
To: modperl@perl.apache.org
Subject: Re: mod_perl and Transfer-Encoding: chunked
References: <3FF9F5A0-8805-49C0-843C-5014BDA87B44@yahoo.com>
In-Reply-To: <3FF9F5A0-8805-49C0-843C-5014BDA87B44@yahoo.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
On 03/07/2013 21:53, Joseph Schaefer wrote:
> When you read from the input filter chain as $r->read does, the http
> input filter automatically handles the protocol and passes the dechunked
> data up to the caller. It does not spool the stream at all.
>
> You'd have to look at how mod perl implements read to see if it loops
> its ap_get_brigade calls on the input filter chain to fill the passed
> buffer to the desired length or not. But under no circumstances should
> you have to deal with chunked data directly.
I'm pretty sure that it's not even a mod_perl thing. IIRC, httpd itself
sticks a chunk/de-chunk filter near the respective ends of the filter
chain. So if you can't find the code in mod_perl land, you might want
to check httpd source.
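
On the quoted question of whether a read call loops to fill the requested length: the difference between the two behaviours can be sketched as below. This is Python, for illustration only, and every name in it (ShortReader, read_fully, slurp) is hypothetical; it does not describe what mod_perl actually does, which still has to be checked in its source.

```python
class ShortReader:
    """Toy source that returns at most `step` bytes per call, like a socket."""

    def __init__(self, data, step):
        self._data = data
        self._pos = 0
        self._step = step

    def read(self, n):
        take = min(n, self._step)
        piece = self._data[self._pos:self._pos + take]
        self._pos += len(piece)
        return piece  # empty bytes means end of stream


def read_fully(source, n):
    """Keep calling read() until n bytes have arrived or the stream ends."""
    parts = []
    remaining = n
    while remaining > 0:
        piece = source.read(remaining)
        if not piece:
            break
        parts.append(piece)
        remaining -= len(piece)
    return b"".join(parts)


def slurp(source, bufsize=64 * 1024):
    """Read to end of stream; the total length is known only afterwards."""
    parts = []
    while True:
        piece = read_fully(source, bufsize)
        if not piece:
            break
        parts.append(piece)
    body = b"".join(parts)
    return body, len(body)


# A single call may come back short; the looping helper does not.
print(ShortReader(b"abcdefghij", 3).read(8))
print(read_fully(ShortReader(b"abcdefghij", 3), 8))
print(slurp(ShortReader(b"abcdefghij", 3), bufsize=4))
```

Either way the caller's loop is the same: keep reading until an empty result comes back, then compute the length from what was collected.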
>
> HTH
>
> Sent from my iPhone
>
> On Jul 3, 2013, at 2:44 PM, Bill Moseley wrote:
>
>> Hi Jim,
>>
>> This is the Transfer-Encoding: chunked I was writing about:
>>
>> http://tools.ietf.org/html/rfc2616#section-3.6.1
>>
>>
>>
>> On Wed, Jul 3, 2013 at 11:34 AM, Jim Schueler wrote:
>>
>> I played around with chunking recently in the context of media
>> streaming: The client is only requesting a "chunk" of data.
>> "Chunking" is how media players perform a "seek". It was
>> originally implemented for FTP transfers: E.g, to transfer a
>> large file in (say 10K) chunks. In the case that you describe
>> below, if no Content-Length is specified, that indicates "send the
>> remainder".
>>
>> From what I know, a "chunk" request header is used this way to
>> specify the server response. It does not reflect anything about
>> the data included in the body of the request. So first, I would
>> ask if you're confused about this request information.
>>
>> Hypothetically, some browsers might try to upload large files in
>> small chunks and the "chunk" header might reflect a push transfer.
>> I don't know if "chunk" is ever used for this purpose. But it
>> would require the following characteristics:
>>
>> 1. The browser would need to originally inquire if the server is
>> capable of this type of request.
>> 2. Each chunk of data will arrive in a separate and independent
>> HTTP
>> request. Not necessarily in the order they were sent.
>> 3. Two or more requests may be handled by separate processes
>> simultaneously that can't be written into a single destination.
>> 4. Somehow the server needs to request a resend if a chunk is
>> missing.
>> Solving this problem requires an imaginative use of HTTP.
>>
>> Sounds messy. But might be appropriate for 100M+ sized uploads.
>> This *may* reflect your situation. Can you please confirm?
>>
>> For a single process, the incoming content-length is unnecessary.
>> Buffered I/O automatically knows when transmission is complete.
>> The read() argument is the buffer size, not the content length.
>> Whether you spool the buffer to disk or simply enlarge the buffer
>> should be determined by your hardware capabilities. This is
>> standard IO behavior that has nothing to do with HTTP chunk.
>> Without a "Content-Length" header, after looping your read()
>> operation, determine the length of the aggregate data and pass
>> that to Catalyst.
>>
>> But if you're confident that the complete request spans several
>> smaller (chunked) HTTP requests, you'll need to address all the
>> problems I've described above, plus the problem of re-assembling
>> the whole thing for Catalyst. I don't know anything about Plack,
>> maybe it can perform all this required magic.
>>
>> Otherwise, if the whole purpose of the Plack temporary file is to
>> pass a file handle, you can pass a buffer as a file handle. Used
>> to be IO::String, but now that functionality is built into the core.
>>
>> By your last paragraph, I'm really lost. Since you're already
>> passing the request as a file handle, I'm guessing that Catalyst
>> creates the temporary file for the *response* body. Can you
>> please clarify? Also, what do you mean by "de-chunking"? Is that
>> the same thing as re-assembling?
>>
>> Wish I could give a better answer. Let me know if this helps.
>>
>> -Jim
>>
>>
>>
>> On Tue, 2 Jul 2013, Bill Moseley wrote:
>>
>> For requests that are chunked (Transfer-Encoding: chunked and no
>> Content-Length header) calling $r->read returns unchunked data
>> from the
>> socket.
>> That's indeed handy. Is that mod_perl doing that un-chunking
>> or is it
>> Apache?
>>
>> But, it leads to some questions.
>>
>> First, if $r->read reads unchunked data then why is there a
>> Transfer-Encoding header saying that the content is chunked?
>> Shouldn't
>> that header be removed? How does one know if the content is
>> chunked or
>> not, otherwise?
>>
>> Second, if there's no Content-Length header then how does one
>> know how much
>> data to read using $r->read?
>>
>> One answer is until $r->read returns zero bytes, of course.
>> But, is
>> that guaranteed to always be the case, even for, say,
>> pipelined requests?
>> My guess is yes because whatever is de-chunking the request
>> knows to stop
>> after reading the last chunk, trailer and empty line. Can
>> anyone elaborate
>> on how Apache/mod_perl is doing this?
>>
>>
>> Perhaps I'm approaching this incorrectly, but this is all a
>> bit untidy.
>>
>> I'm using Catalyst and Catalyst needs a Content-Length. So, I
>> have a Plack
>> Middleware component that creates a temporary file writing the
>> buffer from
>> $r->read( my $buffer, 64 * 1024 ) until that returns zero
>> bytes. I pass
>> this file handle onto Catalyst.
>>
>> Then, for some content-types, Catalyst (via HTTP::Body) writes
>> the body to
>> another temp file. I don't know how Apache/mod_perl does
>> its de-chunking,
>> but I can call $r->read with a huge buffer length and Apache
>> returns that.
>> So, maybe Apache is buffering to disk, too.
>>
>> In other words, for each tiny chunked JSON POST or PUT I'm
>> creating two (or
>> three?) temp files which doesn't seem ideal.
>>
>>
>> --
>> Bill Moseley
>> moseley@hank.org
>>
>>
>>
>>
>> --
>> Bill Moseley
>> moseley@hank.org
From modperl-return-63398-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Thu Jul 4 08:40:51 2013
Message-ID: <51D534E9.8030600@beamartyr.net>
Date: Thu, 04 Jul 2013 11:40:09 +0300
From: Issac Goldstand
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20130620 Thunderbird/17.0.7
MIME-Version: 1.0
To: modperl@perl.apache.org
Subject: Re: mod_perl and Transfer-Encoding: chunked
References:
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
On 03/07/2013 23:26, Jim Schueler wrote:
>
>> Second, if there's no Content-Length header then how
>> does one know how much
>> data to read using $r->read?
>>
>> One answer is until $r->read returns zero bytes, of
>> course. But, is
>> that guaranteed to always be the case, even for,
>> say, pipelined requests?
>> My guess is yes because whatever is de-chunking the
>
> read() is blocking. So it never returns 0, even in a pipeline request
> (if no data is available, it simply waits). I don't wish to discuss the
> merits here, but there is no technical imperative for a content-length
> request in the request header.
>
> -Jim
Probably. If you, for some reason, were doing the chunking work
yourself, each chunk says how many bytes are in it (or in the next one
perhaps; I forget offhand), so you'd know what size read to do.
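
For reference, in the chunked coding of RFC 2616 section 3.6.1 each size line describes the chunk that immediately follows it. A small sketch of the sending side shows the framing; it is Python for illustration only, and encode_chunked is a made-up helper, not part of any library.

```python
def encode_chunked(payload, chunk_size):
    """Frame payload as a chunked body: hex size line, data, CRLF, repeated."""
    out = bytearray()
    for start in range(0, len(payload), chunk_size):
        piece = payload[start:start + chunk_size]
        out += b"%x\r\n" % len(piece)  # size of the chunk that follows, in hex
        out += piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk (size 0), no trailer, closing empty line
    return bytes(out)


framed = encode_chunked(b"hello world", 5)
print(framed)
```

The zero-size chunk at the end is what tells the receiver that the body is complete, so no Content-Length is needed to find the end.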
From modperl-return-63399-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Thu Jul 4 08:42:02 2013
Message-ID: <51D5352F.70208@beamartyr.net>
Date: Thu, 04 Jul 2013 11:41:19 +0300
From: Issac Goldstand
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20130620 Thunderbird/17.0.7
MIME-Version: 1.0
To: modperl@perl.apache.org
Subject: Re: mod_perl and Transfer-Encoding: chunked
References: <0BC39C9B-5830-499B-A4E7-33BC76C3759B@yahoo.com>
In-Reply-To: <0BC39C9B-5830-499B-A4E7-33BC76C3759B@yahoo.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
On 03/07/2013 23:42, Joseph Schaefer wrote:
> Dechunked means it strips out the lines containing metadata about the next block of raw data. The metadata is just the length of the next block of data. Imagine a chunked stream is like having partial content length headers embedded in the data stream.
>
> The http filter embedded in httpd takes care of the metadata so you don't have to parse the stream yourself. $r->read will always provide only the raw data in a blocking call, until the stream is finished in which case it should return 0 or an error code. Check the mod perl docs, or better the source, to see if the semantics are more like perl's sysread or more like read.
>
Yep. That makes sense to me too - it's just not what I read in your
previous email, but maybe I read it wrong :)
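To make the framing in the quoted explanation concrete, here is a minimal sketch. It is written in Python purely for illustration and is not mod_perl code; the helper name `encode_chunked` and the tiny chunk size are assumptions for the example. It shows the hexadecimal size lines that sit between blocks of raw data on the wire, which is the metadata the HTTP input filter removes before `$r->read` hands data to a handler.

```python
def encode_chunked(body: bytes, chunk_size: int = 5) -> bytes:
    """Frame body as an HTTP/1.1 chunked stream (RFC 2616, section 3.6.1)."""
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Each chunk is preceded by its length in hexadecimal plus CRLF.
        out += b"%x\r\n" % len(chunk)
        # The chunk data itself is followed by another CRLF.
        out += chunk + b"\r\n"
    # A zero-size chunk ends the body; this sketch sends no trailer headers.
    out += b"0\r\n\r\n"
    return bytes(out)


wire = encode_chunked(b"hello world")
print(wire)
```

"De-chunked" data is simply `b"hello world"` again: the same payload with the size lines and CRLF separators stripped.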
> Sent from my iPhone
>
> On Jul 3, 2013, at 4:31 PM, Jim Schueler wrote:
>
>> In light of Joe Schaefer's response, I appear to be outgunned. So, if nothing else, can someone please clarify whether "de-chunked" means re-assembled?
>>
>> -Jim
>>
>> On Wed, 3 Jul 2013, Jim Schueler wrote:
>>
>>> Thanks for the prompt response, but this is your question, not mine. I hardly need an RTFM for my trouble.
>>>
>>> I drew my conclusions using a packet sniffer. And as far-fetched as my answer may seem, it's more plausible than your theory that Apache or modperl is decoding a raw socket stream.
>>>
>>> The crux of your question seems to be how the request content gets
>>> magically re-assembled. I don't think it was ever disassembled in the first place. But if you don't like my answer, and you don't want to ignore it either, then please restate the question. I can't find any definition for unchunked, and Wiktionary's definition of de-chunk says to "break apart a chunk", that is (counter-intuitively) chunk a chunk.
>>>
>>>
>>>> Second, if there's no Content-Length header then how
>>>> does one know how much
>>>> data to read using $r->read?
>>>>
>>>> One answer is until $r->read returns zero bytes, of
>>>> course. But, is
>>>> that guaranteed to always be the case, even for,
>>>> say, pipelined requests?
>>>> My guess is yes because whatever is de-chunking the
>>>
>>> read() is blocking. So it never returns 0, even in a pipeline request (if no data is available, it simply waits). I don't wish to discuss the merits here, but there is no technical imperative for a content-length request in the request header.
>>>
>>> -Jim
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, 3 Jul 2013, Bill Moseley wrote:
>>>
>>>> Hi Jim,
>>>> This is the Transfer-Encoding: chunked I was writing about:
>>>> http://tools.ietf.org/html/rfc2616#section-3.6.1
>>>> On Wed, Jul 3, 2013 at 11:34 AM, Jim Schueler
>>>> wrote:
>>>> I played around with chunking recently in the context of media
>>>> streaming: The client is only requesting a "chunk" of data.
>>>> "Chunking" is how media players perform a "seek". It was
>>>> originally implemented for FTP transfers: E.g, to transfer a
>>>> large file in (say 10K) chunks. In the case that you describe
>>>> below, if no Content-Length is specified, that indicates "send
>>>> the remainder".
>>>>
>>>> From what I know, a "chunk" request header is used this way to
>>>> specify the server response. It does not reflect anything about
>>>> the data included in the body of the request. So first, I would
>>>> ask if you're confused about this request information.
>>>>
>>>> Hypothetically, some browsers might try to upload large files in
>>>> small chunks and the "chunk" header might reflect a push
>>>> transfer. I don't know if "chunk" is ever used for this
>>>> purpose. But it would require the following characteristics:
>>>>
>>>> 1. The browser would need to originally inquire if the server is
>>>>    capable of this type of request.
>>>> 2. Each chunk of data will arrive in a separate and independent HTTP
>>>>    request. Not necessarily in the order they were sent.
>>>> 3. Two or more requests may be handled by separate processes
>>>>    simultaneously that can't be written into a single destination.
>>>> 4. Somehow the server needs to request a resend if a chunk is
>>>>    missing.
>>>> Solving this problem requires an imaginative use of HTTP.
>>>>
>>>> Sounds messy. But might be appropriate for 100M+ sized uploads.
>>>> This *may* reflect your situation. Can you please confirm?
>>>>
>>>> For a single process, the incoming content-length is
>>>> unnecessary. Buffered I/O automatically knows when transmission
>>>> is complete. The read() argument is the buffer size, not the
>>>> content length. Whether you spool the buffer to disk or simply
>>>> enlarge the buffer should be determined by your hardware
>>>> capabilities. This is standard IO behavior that has nothing to
>>>> do with HTTP chunk. Without a "Content-Length" header, after
>>>> looping your read() operation, determine the length of the
>>>> aggregate data and pass that to Catalyst.
>>>>
>>>> But if you're confident that the complete request spans several
>>>> smaller (chunked) HTTP requests, you'll need to address all the
>>>> problems I've described above, plus the problem of re-assembling
>>>> the whole thing for Catalyst. I don't know anything about
>>>> Plack, maybe it can perform all this required magic.
>>>>
>>>> Otherwise, if the whole purpose of the Plack temporary file is
>>>> to pass a file handle, you can pass a buffer as a file handle.
>>>> Used to be IO::String, but now that functionality is built into
>>>> the core.
>>>>
>>>> By your last paragraph, I'm really lost. Since you're already
>>>> passing the request as a file handle, I'm guessing that Catalyst
>>>> creates the temporary file for the *response* body. Can you
>>>> please clarify? Also, what do you mean by "de-chunking"? Is
>>>> that the same thing as re-assembling?
>>>>
>>>> Wish I could give a better answer. Let me know if this helps.
>>>>
>>>> -Jim
>>>>
>>>> On Tue, 2 Jul 2013, Bill Moseley wrote:
>>>>
>>>> For requests that are chunked (Transfer-Encoding:
>>>> chunked and no
>>>> Content-Length header) calling $r->read returns
>>>> unchunked data from the
>>>> socket.
>>>> That's indeed handy. Is that mod_perl doing that
>>>> un-chunking or is it
>>>> Apache?
>>>>
>>>> But, it leads to some questions.
>>>>
>>>> First, if $r->read reads unchunked data then why is
>>>> there a
>>>> Transfer-Encoding header saying that the content is
>>>> chunked? Shouldn't
>>>> that header be removed? How does one know if the
>>>> content is chunked or
>>>> not, otherwise?
>>>>
>>>> Second, if there's no Content-Length header then how
>>>> does one know how much
>>>> data to read using $r->read?
>>>>
>>>> One answer is until $r->read returns zero bytes, of
>>>> course. But, is
>>>> that guaranteed to always be the case, even for,
>>>> say, pipelined requests?
>>>> My guess is yes because whatever is de-chunking the
>>>> request knows to stop
>>>> after reading the last chunk, trailer and empty
>>>> line. Can anyone elaborate
>>>> on how Apache/mod_perl is doing this?
>>>>
>>>> Perhaps I'm approaching this incorrectly, but this
>>>> is all a bit untidy.
>>>>
>>>> I'm using Catalyst and Catalyst needs a
>>>> Content-Length. So, I have a Plack
>>>> Middleware component that creates a temporary file
>>>> writing the buffer from
>>>> $r->read( my $buffer, 64 * 1024 ) until that returns
>>>> zero bytes. I pass
>>>> this file handle onto Catalyst.
>>>>
>>>> Then, for some content-types, Catalyst (via
>>>> HTTP::Body) writes the body to
>>>> another temp file. I don't know how
>>>> Apache/mod_perl does its de-chunking,
>>>> but I can call $r->read with a huge buffer length
>>>> and Apache returns that.
>>>> So, maybe Apache is buffering to disk, too.
>>>>
>>>> In other words, for each tiny chunked JSON POST or
>>>> PUT I'm creating two (or
>>>> three?) temp files which doesn't seem ideal.
>>>>
>>>> --
>>>> Bill Moseley
>>>> moseley@hank.org
>>>> --
>>>> Bill Moseley
>>>> moseley@hank.org
From modperl-return-63400-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Thu Jul 4 11:06:49 2013
Message-ID: <51D55728.20406@ice-sa.com>
Date: Thu, 04 Jul 2013 13:06:16 +0200
From: =?ISO-8859-1?Q?Andr=E9_Warnier?=
Reply-To: mod_perl list
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
To: mod_perl list
Subject: Re: mod_perl and Transfer-Encoding: chunked
References:
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Not disregarding the other answers to your questions, but I believe that maybe one aspect
has been neglected here.
Bill Moseley wrote:
> For requests that are chunked (Transfer-Encoding: chunked and no
> Content-Length header) calling $r->read returns *unchunked* data from the
> socket.
>
> That's indeed handy. Is that mod_perl doing that un-chunking or is it
> Apache?
>
> But, it leads to some questions.
>
> First, if $r->read reads unchunked data then why is there a
> Transfer-Encoding header saying that the content is chunked? Shouldn't
> that header be removed? How does one know if the content is chunked or
> not, otherwise?
The real question is : does one need to know ?
The transfer-coding is something that even an intermediate HTTP proxy may
be allowed to change, for reasons to do with transport of the request along a section of
the network path.
It should be entirely transparent to the application receiving the data.
>
> Second, if there's no Content-Length header then how does one know how much
> data to read using $r->read?
>
> One answer is until $r->read returns zero bytes, of course.
Indeed. That means that the end of *this* request body has been encountered.
> But, is
> that guaranteed to always be the case, even for, say, pipelined requests?
It should be, because $r concerns the present request being processed.
If there is another request pipelined onto that same connection, it is a separate request
and a different $r.
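The point about pipelining can be sketched as follows. This is Python for illustration only, not mod_perl; `ChunkedBody` and `slurp` are hypothetical names standing in for a per-request reader such as `$r->read`, and the request lines and headers are left out to keep the example short. The reader returns zero bytes once it has consumed the terminating chunk of its own request, even though more data for the next request is already waiting on the same connection.

```python
import io


class ChunkedBody:
    """Hypothetical per-request reader: de-chunked data, then b'' at the end."""

    def __init__(self, connection):
        self.connection = connection
        self.remaining = 0      # bytes left in the current chunk
        self.finished = False   # set once the zero-size chunk has been seen

    def read(self, size: int) -> bytes:
        if self.finished:
            return b""
        if self.remaining == 0:
            # Parse the hex chunk-size line, ignoring any chunk-extension.
            line = self.connection.readline()
            self.remaining = int(line.split(b";", 1)[0].strip(), 16)
            if self.remaining == 0:
                # Last chunk: skip trailer lines up to the empty line.
                while self.connection.readline() not in (b"\r\n", b"\n", b""):
                    pass
                self.finished = True
                return b""
        # Return at most one chunk's worth of data per call.
        data = self.connection.read(min(size, self.remaining))
        self.remaining -= len(data)
        if self.remaining == 0:
            self.connection.readline()  # consume the CRLF after chunk-data
        return data


def slurp(body) -> bytes:
    """Read until the reader returns zero bytes."""
    parts = []
    while True:
        data = body.read(64 * 1024)
        if not data:    # zero bytes: end of *this* request's body
            break
        parts.append(data)
    return b"".join(parts)


# Two chunked bodies back to back on one simulated connection.
connection = io.BytesIO(b"3\r\nabc\r\n2\r\nde\r\n0\r\n\r\n" b"2\r\nxy\r\n0\r\n\r\n")
first = slurp(ChunkedBody(connection))
second = slurp(ChunkedBody(connection))
print(first, second)
```

Each reader stops exactly at its own terminator, so the second body is left untouched for the next request object.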
> My guess is yes because whatever is de-chunking the request knows to stop
> after reading the last chunk, trailer and empty line. Can
> anyone elaborate on how Apache/mod_perl is doing this?
>
I can't really, but it should be done by something at some fairly low level. It should be
the *first* thing which happens to the request body, before any request-level body access
is allowed.
(Similarly, at the response level, "chunking" a response body should be the last thing
happening before the response is put out on the wire.)
>
> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>
> I'm using Catalyst and Catalyst needs a Content-Length.
I would posit then that Catalyst is wrong (or not compatible with HTTP 1.1 in that respect).
> So, I have a Plack
> Middleware component that creates a temporary file writing the buffer from
> $r->read( my $buffer, 64 * 1024 ) until that returns zero bytes. I pass
> this file handle onto Catalyst.
>
So what you wrote then is a patch to Catalyst.
> Then, for some content-types, Catalyst (via HTTP::Body) writes the body to
> *another* temp file. I don't know how Apache/mod_perl does its
> de-chunking, but I can call $r->read with a huge buffer length and Apache
> returns that. So, maybe Apache is buffering to disk, too.
>
> In other words, for each tiny chunked JSON POST or PUT I'm creating two (or
> three?) temp files which doesn't seem ideal.
>
>
I realise that my comments above don't really help you in your specific predicament, but I
just felt that it was good to put things back in their place, particularly that at the $r
(request) level, you should not have to know if the request came in chunked or not.
And that if a client sends a request with a chunked body, you are not necessarily getting
it that way on the server on which the application runs. And vice versa.
From modperl-return-63401-apmail-perl-modperl-archive=perl.apache.org@perl.apache.org Thu Jul 4 15:50:20 2013
MIME-Version: 1.0
In-Reply-To: <51D55728.20406@ice-sa.com>
References:
<51D55728.20406@ice-sa.com>
From: Bill Moseley
Date: Thu, 4 Jul 2013 08:48:59 -0700
Message-ID:
Subject: Re: mod_perl and Transfer-Encoding: chunked
To: mod_perl list
Content-Type: multipart/alternative; boundary=f46d04374a0d14e1ce04e0b1861e
--f46d04374a0d14e1ce04e0b1861e
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
André, thanks for the response:
On Thu, Jul 4, 2013 at 4:06 AM, André Warnier wrote:
>
> Bill Moseley wrote:
>
>>
>> First, if $r->read reads unchunked data then why is there a
>> Transfer-Encoding header saying that the content is chunked? Shouldn't
>> that header be removed?
>>
>
Looking at the RFC again the answer appears to be yes. Look at the last
line in this decoding example in
http://tools.ietf.org/html/rfc2616#section-19.4.6
A process for decoding the "chunked" transfer-coding (section 3.6)
can be represented in pseudo-code as:

    length := 0
    read chunk-size, chunk-extension (if any) and CRLF
    while (chunk-size > 0) {
       read chunk-data and CRLF
       append chunk-data to entity-body
       length := length + chunk-size
       read chunk-size and CRLF
    }
    read entity-header
    while (entity-header not empty) {
       append entity-header to existing header fields
       read entity-header
    }
    Content-Length := length
    Remove "chunked" from Transfer-Encoding

Apache/mod_perl is doing the first part but not updating the headers.
There's more on Content-Length and Transfer-Encoding here:
http://tools.ietf.org/html/rfc2616#section-4.4
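As a rough, runnable rendering of the RFC pseudo-code, including the two header updates in its final lines, here is a sketch in Python (for illustration only; it is not how Apache or mod_perl implement it). The function name `decode_chunked` and the plain dict of headers are assumptions made for the example, and header names are treated as case-sensitive for simplicity.

```python
import io


def decode_chunked(stream, headers: dict):
    """Decode one chunked body; return (entity_body, updated_headers)."""
    headers = dict(headers)     # work on a copy of the caller's headers
    body = bytearray()
    length = 0

    def read_size() -> int:
        # chunk-size is hexadecimal; anything after ';' is a chunk-extension.
        line = stream.readline()
        return int(line.split(b";", 1)[0].strip(), 16)

    size = read_size()
    while size > 0:
        body += stream.read(size)   # read chunk-data
        stream.readline()           # ... and the CRLF that follows it
        length += size
        size = read_size()

    # Trailer: append each entity-header until the empty line.
    line = stream.readline()
    while line not in (b"\r\n", b"\n", b""):
        name, _, value = line.decode("latin-1").partition(":")
        headers[name.strip()] = value.strip()
        line = stream.readline()

    # The last two steps of the pseudo-code: set Content-Length and
    # remove "chunked" from Transfer-Encoding.
    headers["Content-Length"] = str(length)
    codings = [c.strip() for c in headers.get("Transfer-Encoding", "").split(",")]
    codings = [c for c in codings if c and c.lower() != "chunked"]
    if codings:
        headers["Transfer-Encoding"] = ", ".join(codings)
    else:
        headers.pop("Transfer-Encoding", None)
    return bytes(body), headers


wire = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\nX-Checksum: abc\r\n\r\n")
body, headers = decode_chunked(wire, {"Transfer-Encoding": "chunked"})
print(body, headers)
```

After decoding, the headers describe what the reader will actually see: a body of known length with no chunked transfer-coding left on it.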
>> How does one know if the content is chunked or not, otherwise?
>>
>
> The real question is : does one need to know ?
>
Perhaps. That's an interesting question. Applications probably don't
need to care. They should receive the body -- so for mod_perl that means
reading data using $r->read until there's no more to read and then the app
should never need to look at the Transfer-Encoding header -- or
Content-Length header for that matter by that reasoning.
It's a bit less clear if you think about Plack. It sits between web
servers and applications. What should, say, a Plack Middleware component
see in the body if the headers say Transfer-Encoding: chunked? The
decoding probably should happen in the server, but the headers would need
to indicate that by removing the Transfer-Encoding header and adding in
the Content-Length.
>> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>>
>> I'm using Catalyst and Catalyst needs a Content-Length.
>>
>
> I would posit then that Catalyst is wrong (or not compatible with HTTP 1.1
> in that respect).
But, Catalyst is a web application (framework) and from your point above it
should not care about the encoding and just read the input stream by
calling ->read(). Really, if you think about Plack, Catalyst should never
make exceptions based on $ENV{MOD_PERL}.
So, the separation of concerns between the web server and the app is not
very clean.
>> So, I have a Plack
>> Middleware component that creates a temporary file writing the buffer from
>> $r->read( my $buffer, 64 * 1024 ) until that returns zero bytes. I pass
>> this file handle onto Catalyst.
>>
>>
> So what you wrote then is a patch to Catalyst.
>
No, the Middleware component should be usable for any application. And
likewise, for any web server. That's the point of Plack.
Obviously, there are differences between web servers and maybe we need code
that understands when running under mod_perl that the Transfer-Encoding:
chunked header should be ignored, but if that code must live in Catalyst
then that's really breaking the separation that Plack provides.
I think the sane thing here is for Apache/mod_perl not to provide a header
saying the body is chunked when it isn't. Otherwise, code (Plack, web
apps) that receives a set of headers and a handle to read from doesn't really
have any choice but to believe what it is told.
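For illustration, the kind of normalization discussed in this message might look like the sketch below. It is Python rather than Perl and is not the actual Plack middleware; `normalize_body` is a hypothetical name, and the dictionary keys (`psgi.input`, `CONTENT_LENGTH`, `HTTP_TRANSFER_ENCODING`) simply follow the PSGI naming convention. It assumes the server has already de-chunked the stream, as mod_perl does, and that "chunked" is the only transfer-coding present. A spooled temporary file keeps small bodies in memory, so a tiny JSON POST need not touch the disk.

```python
import io
import tempfile


def normalize_body(env: dict, max_in_memory: int = 64 * 1024) -> dict:
    """Return an env whose headers match the already de-chunked input."""
    if "chunked" not in env.get("HTTP_TRANSFER_ENCODING", "").lower():
        return env                      # nothing to fix
    source = env["psgi.input"]
    # Stays in memory up to max_in_memory bytes, then rolls over to disk.
    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    length = 0
    while True:
        data = source.read(64 * 1024)
        if not data:                    # zero bytes: end of the request body
            break
        spool.write(data)
        length += len(data)
    spool.seek(0)
    env = dict(env)                     # leave the caller's env untouched
    env["psgi.input"] = spool
    env["CONTENT_LENGTH"] = str(length)
    del env["HTTP_TRANSFER_ENCODING"]   # the body is no longer chunked
    return env


env = {
    "HTTP_TRANSFER_ENCODING": "chunked",
    "psgi.input": io.BytesIO(b'{"a": 1}'),
}
fixed = normalize_body(env)
print(fixed["CONTENT_LENGTH"])
```

Downstream code then sees a consistent picture: a Content-Length, no Transfer-Encoding header, and a seekable handle to read from.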
-- 
Bill Moseley
moseley@hank.org