Twisted: Ticket #886: Add an HTTP 1.1 client protocol implementation to twisted.webhttps://twistedmatrix.com/trac/ticket/886
<p>
Twisted Web only includes an HTTP 1.0 client protocol implementation. HTTP 1.1 includes numerous useful features which clients benefit from.
</p>
<p>
As a first step towards good HTTP 1.1 client support, add a low-level API for issuing HTTP 1.1 requests and parsing HTTP 1.1 responses (this excludes things like timeouts, following redirects, etc). Once this is done, higher-level APIs and more HTTP features can be added based on it.
</p>
en-usTwistedhttps://twistedmatrix.com/trac/chrome/common/trac_banner.pnghttps://twistedmatrix.com/trac/ticket/886
Trac 1.0.13exarkunSun, 13 Feb 2005 04:01:43 GMThttps://twistedmatrix.com/trac/ticket/886#comment:1
https://twistedmatrix.com/trac/ticket/886#comment:1
<pre>
getPage should remain as-is. It should be re-implemented in terms of a more
expressive API which allows streaming and non-body retrieval.
pop3client.py is an example of a reasonable way to do this, I think.
</pre>
TicketexarkunThu, 30 Oct 2008 21:17:57 GMTbranch, branch_author sethttps://twistedmatrix.com/trac/ticket/886#comment:2
https://twistedmatrix.com/trac/ticket/886#comment:2
<ul>
<li><strong>branch</strong>
set to <em>branches/expressive-http-client-886</em>
</li>
<li><strong>branch_author</strong>
set to <em>exarkun</em>
</li>
</ul>
<p>
(In <a class="missing changeset" title="No changeset 25208 in the repository">[25208]</a>) Branching to 'expressive-http-client-886'
</p>
TicketexarkunWed, 10 Dec 2008 22:53:06 GMTpriority changedhttps://twistedmatrix.com/trac/ticket/886#comment:3
https://twistedmatrix.com/trac/ticket/886#comment:3
<ul>
<li><strong>priority</strong>
changed from <em>low</em> to <em>high</em>
</li>
</ul>
<p>
A lot of stuff has happened in the branch. There are still a lot of minor things to take care of (mostly boring coding standard stuff) and I'd like to add <em>some</em> higher-level API to the branch before merging it (nothing too hard, just something that won't scare away newbies), but it'd be great to get some feedback about the low-level stuff before proceeding to anything high-level.
</p>
<p>
The intent is to implement persistent connection support in a subsequent ticket. The goal for this code is just to not make it impossible to do that while still supporting the APIs added in this branch.
</p>
<p>
James, your feedback in particular would be appreciated. :) At the very least,<a class="ext-link" href="http://twistedmatrix.com/trac/browser/branches/expressive-http-client-886/twisted/web/_newclient.py#L221"><span class="icon">​</span>connection control header detection</a> is wrong. Hints on how to make it (or anything else) right would be helpful.
</p>
TicketradixThu, 18 Dec 2008 21:23:23 GMThttps://twistedmatrix.com/trac/ticket/886#comment:4
https://twistedmatrix.com/trac/ticket/886#comment:4
<p>
Couple of pre-review comments:
</p>
<ol><li>The example doesn't log errors
</li><li>The second callback in the example doesn't need to take 'proto', and so the previous callback doesn't need to munge the callback result into (proto, response)
</li><li>The branch is getting huge (3300+ lines), and it'd be really good if it could be split up.
</li><li>I don't like 'client' as a method name in high-level-http-client.py (though I expect you were planning to find a better one anyway), but I do like the way these things are composable
</li><li>I think the fourth argument to Request should be optional. I really don't like seeing random "None"s in code using the API.
</li></ol>
TicketexarkunSat, 20 Dec 2008 19:49:43 GMTkeywords sethttps://twistedmatrix.com/trac/ticket/886#comment:5
https://twistedmatrix.com/trac/ticket/886#comment:5
<ul>
<li><strong>keywords</strong>
<em>httpclient</em> added
</li>
</ul>
TicketexarkunThu, 08 Jan 2009 23:13:00 GMTbranch changedhttps://twistedmatrix.com/trac/ticket/886#comment:6
https://twistedmatrix.com/trac/ticket/886#comment:6
<ul>
<li><strong>branch</strong>
changed from <em>branches/expressive-http-client-886</em> to <em>branches/expressive-http-client-886-2</em>
</li>
</ul>
<p>
(In <a class="missing changeset" title="No changeset 25912 in the repository">[25912]</a>) Branching to 'expressive-http-client-886-2'
</p>
TicketexarkunFri, 09 Jan 2009 13:48:02 GMTowner, status changedhttps://twistedmatrix.com/trac/ticket/886#comment:7
https://twistedmatrix.com/trac/ticket/886#comment:7
<ul>
<li><strong>owner</strong>
changed from <em>jknight</em> to <em>exarkun</em>
</li>
<li><strong>status</strong>
changed from <em>new</em> to <em>assigned</em>
</li>
</ul>
TicketexarkunMon, 12 Jan 2009 15:00:42 GMThttps://twistedmatrix.com/trac/ticket/886#comment:8
https://twistedmatrix.com/trac/ticket/886#comment:8
<p>
(In <a class="missing changeset" title="No changeset 25948 in the repository">[25948]</a>) Merge identity-transfer-encoding-3605
</p>
<p>
Author: exarkun, itamar
Reviewer: therve
Fixes: <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3605" title="enhancement: Factor &#34;identity&#34; transfer encoding support out of ... (closed: fixed)">#3605</a>
Refs: <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/886" title="enhancement: Add an HTTP 1.1 client protocol implementation to twisted.web (closed: fixed)">#886</a>
</p>
<p>
Add an identity transfer-encoding decoder, similar to the chunked transfer-encoding
decoder, to <tt>twisted.web.http</tt> and use it to simplify <tt>HTTPChannel</tt> slightly by
removing the knowledge of any particular transfer-encoding from its request body
handling code.
</p>
<p>
Also expand the transfer-encoding decoder interface so that it will be useful for
an HTTP client and rename the chunked decoder so that its name more closely reflects
what it does.
</p>
TicketexarkunTue, 13 Jan 2009 14:28:13 GMTbranch changedhttps://twistedmatrix.com/trac/ticket/886#comment:9
https://twistedmatrix.com/trac/ticket/886#comment:9
<ul>
<li><strong>branch</strong>
changed from <em>branches/expressive-http-client-886-2</em> to <em>branches/expressive-http-client-886-3</em>
</li>
</ul>
<p>
(In <a class="missing changeset" title="No changeset 25973 in the repository">[25973]</a>) Branching to 'expressive-http-client-886-3'
</p>
TicketexarkunTue, 13 Jan 2009 22:38:02 GMTdescription, summary changedhttps://twistedmatrix.com/trac/ticket/886#comment:10
https://twistedmatrix.com/trac/ticket/886#comment:10
<ul>
<li><strong>description</strong>
modified (<a href="/trac/ticket/886?action=diff&amp;version=10">diff</a>)
</li>
<li><strong>summary</strong>
changed from <em>Rewrite twisted.web.client.getPage to support streaming of result data and returning headers etc.</em> to <em>Add an HTTP 1.1 client protocol implementation to twisted.web</em>
</li>
</ul>
<p>
As the first comment says, <tt>getPage</tt> is not actually going to change. Updating the description (empty) and summary of this ticket to describe what change is actually being made.
</p>
TicketexarkunTue, 13 Jan 2009 22:44:16 GMTkeywords, status changed; owner deletedhttps://twistedmatrix.com/trac/ticket/886#comment:11
https://twistedmatrix.com/trac/ticket/886#comment:11
<ul>
<li><strong>keywords</strong>
<em>review</em> added
</li>
<li><strong>status</strong>
changed from <em>assigned</em> to <em>new</em>
</li>
<li><strong>owner</strong>
<em>exarkun</em> deleted
</li>
</ul>
<p>
Okay. Low-level protocol implementation done (ha ha).
</p>
<p>
It's all private still. The plan is to implement a higher-level API based on this as soon as this is OKed, then make whichever low-level APIs are suitable for end users public. This is so that we prove that the low-level APIs are good enough for the high-level APIs we want to provide before offering them for public consumption.
</p>
TicketivankTue, 13 Jan 2009 22:55:54 GMThttps://twistedmatrix.com/trac/ticket/886#comment:12
https://twistedmatrix.com/trac/ticket/886#comment:12
<p>
doc/web/examples/http_client.py parses the port, but does not send it as part of the Host header (required when port # does not match the default)
</p>
TicketivankTue, 13 Jan 2009 23:13:10 GMThttps://twistedmatrix.com/trac/ticket/886#comment:13
https://twistedmatrix.com/trac/ticket/886#comment:13
<p>
It would be nice if doc/web/examples/http_client.py supported https. The easy information on how to do this was lost when the high-level-http-client.py sketch was deleted.
</p>
TicketivankTue, 13 Jan 2009 23:32:21 GMTattachment sethttps://twistedmatrix.com/trac/ticket/886
https://twistedmatrix.com/trac/ticket/886
<ul>
<li><strong>attachment</strong>
set to <em>httpclient.py.https_and_host.patch</em>
</li>
</ul>
<p>
https support for httpclient.py example. Also, send non-default ports as part of Host header.
</p>
TicketglyphWed, 14 Jan 2009 02:36:00 GMTstatus changed; owner sethttps://twistedmatrix.com/trac/ticket/886#comment:14
https://twistedmatrix.com/trac/ticket/886#comment:14
<ul>
<li><strong>owner</strong>
set to <em>glyph</em>
</li>
<li><strong>status</strong>
changed from <em>new</em> to <em>assigned</em>
</li>
</ul>
TicketglyphWed, 14 Jan 2009 04:05:57 GMTstatus, owner, keywords changedhttps://twistedmatrix.com/trac/ticket/886#comment:15
https://twistedmatrix.com/trac/ticket/886#comment:15
<ul>
<li><strong>status</strong>
changed from <em>assigned</em> to <em>new</em>
</li>
<li><strong>owner</strong>
changed from <em>glyph</em> to <em>exarkun</em>
</li>
<li><strong>keywords</strong>
<em>review</em> removed
</li>
</ul>
<p>
OK. Awesome. Really glad to see this is finally in review!
</p>
<p>
Please note that the implementation itself really looks great, I like the idiom of create a request / issue the request / build a response. I think it's a solid low level interface. Most of my concerns are about process, documentation, and naming. So this might look like a lot of text, but I don't think it will actually be a lot of work.
</p>
<p>
Feedback actually relevant to this branch:
</p>
<ol><li>I know that the ticket description says "low level", but I like to think that when a ticket is filed against Twisted for some library functionality, it's implicit that <em>some</em> level of that functionality needs to be public before the ticket is closed. While I like new private implementation guts, I don't like that this change is entirely a new private API. The fact that lots of attributes and classes in here are double-private and triple-private suggests that you've given some thought to what should be exposed and what shouldn't.
<ol><li>I mean, <tt>twisted.web._newclient._LenthEnforcingConsumer._length</tt>? Sure you've got enough underscores on that? Every double-private top-level should drop its underscore. (I can understand the desire for private attributes of private classes though, so they can implement an interface and not "leak" anything else into the public API indirectly.)
</li><li>As a more concrete suggestion, I think the following things should be public (although I may have missed some necessary piece):
<ol><li>All the exceptions should be in <tt>twisted.web.error</tt>.
</li><li><tt>Request</tt> and <tt>Response</tt> should be moved to <tt>twisted.web.client</tt>.
</li><li><tt>HTTP11ClientProtocol</tt> should be moved to <tt>twisted.web.client</tt>. I'm inclined to suggest that it should have a better name, but I don't think that's strictly necessary, given that it is really supposed to be the lower level of something else.
</li></ol></li><li>Once these are public the example shouldn't be using any private APIs. Which is good, because we should really never have any examples that use private APIs. They are, almost by definintion, bad examples.
</li></ol></li><li>The changes in <tt>twisted.internet.abstract</tt> shouldn't be merged, for obvious reasons.
</li><li><tt>_makeFunction</tt> will create functions that don't appear in the API documentation. Also: worst name ever. May I suggest something more along the lines of (spelled python2.3-compatibly, of course):
<div class="code"><pre> <span class="nd">@stateful</span>
<span class="k">def</span> <span class="nf">connectionLost</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> reason<span class="p">):</span>
<span class="sd">"""
The underlying transport went away. If appropriate, notify the parser
object.
"""</span>
</pre></div>I believe this preserves the docstrings?
</li><li><tt>IStreamingProducer</tt> is a name that seems designed to be confusing. I don't know why <tt>IPushProducer</tt> wasn't named <tt>IStreamingProducer</tt> in the first place, given that the flag's name to indicate it is "<tt>streaming</tt>". Not that I think <em>that</em> is a great name either, but as usual consistency is more important than meaningfulness. A suggestion for this new interface: <tt>IEntityProducer</tt>. Maybe even <tt>IHTTPEntityBodyProducer</tt> since the <tt>length</tt> is kind of an HTTP feature?
</li><li>Why bother having <tt>UNKNOWN_LENGTH</tt> as a distinct value? Wouldn't <tt>None</tt> serve this purpose adequately? Or having the attribute be optional? You still have to explicitly check, after all. It seems to bloat the code without providing any real benefits. If we keep it, <tt>UNKNOWN_LENGTH</tt> should have a @var in the module docstring.
</li><li>The last line of <tt>newclient.py</tt>, i.e. <tt>Request._connectionLost_CONNECTION_LOST</tt>, appears to be uncovered by any test. Seems like possibly it should just be deleted, since that error condition is wacky enough that getting a "no such method" exception would be okay.
</li><li>This code is a bit more comment-heavy than I'm used to, which I was surprised to discover is pretty good. HTTP sucks a lot, and it's nice to have a guide to some of the more absurd pieces of the implementation inline. However:
<ol><li><tt>Proxy-Connection</tt> should be documented a bit more thoroughly than "undocumented HTTP 1.0 abomination". At least put in a link to someone else who implements it, or explain briefly what they use it for.
</li><li><tt>HTTPParser.isConnectionControlHeader</tt> should just include the comment in its docstring.
</li><li>The giant enumerated outline in the middle of <tt>Request._writeToContentLength</tt> should be broken up to put each bullet point closer to the responsible code, and converted into full sentences, or possibly just removed.
</li></ol></li><li>The tests sure do say <tt>Response(None, None, None, None, transport)</tt> a lot. If that's a valid thing to say, perhaps it should be more convenient? <tt>Response(transport=transport)</tt>? If it's a test-only thing, a function in the test module would be good (with a docstring explaining why it's OK to leave all those values as <tt>None</tt>, since I'm not sure I understand that bit).
</li><li>None of the <tt>_blah_BLAH</tt> state-machine methods are documented. Which I actually think I could be OK with, except... none of the <tt>BLAH</tt> states are actually documented anywhere either. For now, let's just stick to our existing standard and document all these methods. Notwithstanding the difficulty with constants mentioned below, I'd really like it if <tt>TRANSMITTING</tt>, for example, had a docstring of its own.
</li><li><tt>HTTP11ClientProtocol._finish</tt> is also undocumented, and the antecedent in its "XXX" comment is ambiguous. (Thanks to the link for <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3420" title="enhancement: twisted.web.client persistent connections (closed: fixed)">#3420</a> though.)
</li></ol><p>
Idle musings / reminders of persistent problems:
</p>
<ol><li>New examples are great. But every time I see a new example that uses <tt>reactor.run()</tt>, I feel it necessary to point out the thing that I pointed out on <a class="ext-link" href="http://twistedmatrix.com/trac/ticket/1490#comment:39"><span class="icon">​</span>my comment to #1490</a>. Nothing this branch can do, I guess, just a reminder.
</li><li>If we're keeping <tt>UNKNOWN_LENGTH</tt>, <tt>object()</tt> is an opaque way to make a constant value. I really wish we had some convenient idiom for making constants that would <tt>repr()</tt> (and ideally <tt>qual()</tt> as well) to their FQPN. Inspecting <tt>UNKNOWN_LENGTH</tt> in a debugger, or a failed test, like inspecting any other Python constant, is going to be mysterious and unpleasant. How would you feel about filing a ticket for a good <tt>NamedConstant</tt> class? It seems the more protocols the implement, the more constants like this that we have. Anyway, as a more practical solution I suggest just making it a string ("twisted.web.iweb.UNKNOWN_LENGTH") for the time being.
</li><li>The only <tt>pyflakes</tt> warning from any changed file is <tt>IUsernameDigestHash</tt>. Not your fault, I know, but could you perhaps just add a blank line that says "<tt>IUsernameDigestHash # export in __all__ below</tt>" and take it to zero?
</li><li>We should really have one good, public implementation of a state-machine helper somewhere. How many more times are we going to re-implement <tt>vertex.statemachine</tt> and <tt>epsilon.modal</tt> before we settle down and decide on one that works?
</li></ol>
TicketexarkunWed, 14 Jan 2009 16:43:03 GMTowner, keywords, branch_author changedhttps://twistedmatrix.com/trac/ticket/886#comment:16
https://twistedmatrix.com/trac/ticket/886#comment:16
<ul>
<li><strong>owner</strong>
changed from <em>exarkun</em> to <em>glyph</em>
</li>
<li><strong>keywords</strong>
<em>review</em> added
</li>
<li><strong>branch_author</strong>
changed from <em>exarkun</em> to <em>exarkun, itamar</em>
</li>
</ul>
<p>
Replying to <a class="ticket" href="https://twistedmatrix.com/trac/ticket/886#comment:15" title="Comment 15">glyph</a>:
</p>
<blockquote class="citation">
<p>
OK. Awesome. Really glad to see this is finally in review!
</p>
<p>
Please note that the implementation itself really looks great, I like the idiom of create a request / issue the request / build a response. I think it's a solid low level interface. Most of my concerns are about process, documentation, and naming. So this might look like a lot of text, but I don't think it will actually be a lot of work.
</p>
<p>
Feedback actually relevant to this branch:
</p>
<ol><li>I know that the ticket description says "low level", but I like to think that when a ticket is filed against Twisted for some library functionality, it's implicit that <em>some</em> level of that functionality needs to be public before the ticket is closed. While I like new private implementation guts, I don't like that this change is entirely a new private API. The fact that lots of attributes and classes in here are double-private and triple-private suggests that you've given some thought to what should be exposed and what shouldn't.
</li></ol></blockquote>
<p>
I agree with you in general. Some functionality needs to be public. I mentioned the specific motivation behind what is happening in this branch in the comment where I requested a review, though. Can you argue against that plan in particular?
</p>
<blockquote class="citation">
<ol><li>I mean, <tt>twisted.web._newclient._LenthEnforcingConsumer._length</tt>? Sure you've got enough underscores on that? Every double-private top-level should drop its underscore. (I can understand the desire for private attributes of private classes though, so they can implement an interface and not "leak" anything else into the public API indirectly.)
</li></ol></blockquote>
<p>
You're right. The middle underscore isn't adding anything. I fixed most (not <tt>_WrappedException</tt> (<em>yet</em>?) since I may end up moving its definition to another module, depending on what happens for some other points) instances of this pattern in <a class="missing changeset" title="No changeset 25999 in the repository">r25999</a>, <a class="missing changeset" title="No changeset 26000 in the repository">r26000</a>, and <a class="missing changeset" title="No changeset 26001 in the repository">r26001</a>.
</p>
<blockquote class="citation">
<ol><li>As a more concrete suggestion, I think the following things should be public (although I may have missed some necessary piece):
<ol><li>All the exceptions should be in <tt>twisted.web.error</tt>.
</li><li><tt>Request</tt> and <tt>Response</tt> should be moved to <tt>twisted.web.client</tt>.
</li><li><tt>HTTP11ClientProtocol</tt> should be moved to <tt>twisted.web.client</tt>. I'm inclined to suggest that it should have a better name, but I don't think that's strictly necessary, given that it is really supposed to be the lower level of something else.
</li></ol></li><li>Once these are public the example shouldn't be using any private APIs. Which is good, because we should really never have any examples that use private APIs. They are, almost by definintion, bad examples.
</li></ol></blockquote>
<p>
Deferring dealing with any of these points until you respond to my question about the first point.
</p>
<blockquote class="citation">
<ol><li>The changes in <tt>twisted.internet.abstract</tt> shouldn't be merged, for obvious reasons.
</li></ol></blockquote>
<p>
Oh, indeed. Fixed in <a class="missing changeset" title="No changeset 26002 in the repository">r26002</a>.
</p>
<blockquote class="citation">
<ol><li><tt>_makeFunction</tt> will create functions that don't appear in the API documentation. Also: worst name ever. May I suggest something more along the lines of (spelled python2.3-compatibly, of course):
<div class="code"><pre> <span class="nd">@stateful</span>
<span class="k">def</span> <span class="nf">connectionLost</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> reason<span class="p">):</span>
<span class="sd">"""
The underlying transport went away. If appropriate, notify the parser
object.
"""</span>
</pre></div>I believe this preserves the docstrings?
</li></ol></blockquote>
<p>
Fixed in <a class="missing changeset" title="No changeset 26003 in the repository">r26003</a>.
</p>
<blockquote class="citation">
<ol><li><tt>IStreamingProducer</tt> is a name that seems designed to be confusing. I don't know why <tt>IPushProducer</tt> wasn't named <tt>IStreamingProducer</tt> in the first place, given that the flag's name to indicate it is "<tt>streaming</tt>". Not that I think <em>that</em> is a great name either, but as usual consistency is more important than meaningfulness. A suggestion for this new interface: <tt>IEntityProducer</tt>. Maybe even <tt>IHTTPEntityBodyProducer</tt> since the <tt>length</tt> is kind of an HTTP feature?
</li></ol></blockquote>
<p>
I renamed it to <tt>IEntityBodyProducer</tt> in <a class="missing changeset" title="No changeset 26004 in the repository">r26004</a>. I think <em>HTTP</em> is already implied by its location (the other interfaces in <tt>iweb.py</tt> don't have <em>HTTP</em> in their names).
</p>
<blockquote class="citation">
<ol><li>Why bother having <tt>UNKNOWN_LENGTH</tt> as a distinct value? Wouldn't <tt>None</tt> serve this purpose adequately? Or having the attribute be optional? You still have to explicitly check, after all. It seems to bloat the code without providing any real benefits. If we keep it, <tt>UNKNOWN_LENGTH</tt> should have a @var in the module docstring.
</li></ol></blockquote>
<p>
Added a <tt>@var</tt> in <a class="missing changeset" title="No changeset 26005 in the repository">r26005</a>. We chose to use <tt>UNKNOWN_LENGTH</tt> rather than <tt>None</tt> in order to attempt to prevent bugs in applications of the form <tt>if response.length: ...</tt> (note that <tt>UNKNOWN_LENGTH</tt> is used both as a possible value for <tt>Response.length</tt> and for <tt>IEntityBodyProducer.length</tt>). A response with length of 0 is of course significantly different from a response with an unknown length.
</p>
<p>
This rather directly raises a question which I've been pondering all morning though - why is a request body produced with an <tt>IEntityBodyProducer</tt> but a response body is made available through the <tt>Response.deliverBody</tt> mechanism; it seems as though <tt>Response</tt> could have an attribute which is an <tt>IEntityBodyProducer</tt> provider. Maybe I'll remember the answer by the time I finish responding to review feedback.<sup>1</sup>
</p>
<blockquote class="citation">
<ol><li>The last line of <tt>newclient.py</tt>, i.e. <tt>Request._connectionLost_CONNECTION_LOST</tt>, appears to be uncovered by any test. Seems like possibly it should just be deleted, since that error condition is wacky enough that getting a "no such method" exception would be okay.
</li></ol></blockquote>
<p>
I didn't want to just have an <tt>AttributeError</tt> (particularly one exposing an irrelevant implementation detail like how <tt>connectionLost</tt> is made to do the right thing depending on what state the protocol is in), but your suggestion made me realize I could do better error handling in <tt>makeStateDispatcher</tt> (previously <tt>_makeFunction</tt>). So I added that and deleted the untested <tt>_connectionLost_CONNECTION_LOST</tt> in <a class="missing changeset" title="No changeset 26006 in the repository">r26006</a>.
</p>
<blockquote class="citation">
<ol><li>This code is a bit more comment-heavy than I'm used to, which I was surprised to discover is pretty good. HTTP sucks a lot, and it's nice to have a guide to some of the more absurd pieces of the implementation inline. However:
<ol><li><tt>Proxy-Connection</tt> should be documented a bit more thoroughly than "undocumented HTTP 1.0 abomination". At least put in a link to someone else who implements it, or explain briefly what they use it for.
</li><li><tt>HTTPParser.isConnectionControlHeader</tt> should just include the comment in its docstring.
</li><li>The giant enumerated outline in the middle of <tt>Request._writeToContentLength</tt> should be broken up to put each bullet point closer to the responsible code, and converted into full sentences, or possibly just removed.
</li></ol></li></ol></blockquote>
<p>
Fixed these issues in <a class="missing changeset" title="No changeset 26007 in the repository">r26007</a>.
</p>
<blockquote class="citation">
<ol><li>The tests sure do say <tt>Response(None, None, None, None, transport)</tt> a lot. If that's a valid thing to say, perhaps it should be more convenient? <tt>Response(transport=transport)</tt>? If it's a test-only thing, a function in the test module would be good (with a docstring explaining why it's OK to leave all those values as <tt>None</tt>, since I'm not sure I understand that bit).
</li></ol></blockquote>
<p>
Did something in <a class="missing changeset" title="No changeset 26008 in the repository">r26008</a>, let me know if that's sufficient.
</p>
<blockquote class="citation">
<ol><li>None of the <tt>_blah_BLAH</tt> state-machine methods are documented. Which I actually think I could be OK with, except... none of the <tt>BLAH</tt> states are actually documented anywhere either. For now, let's just stick to our existing standard and document all these methods. Notwithstanding the difficulty with constants mentioned below, I'd really like it if <tt>TRANSMITTING</tt>, for example, had a docstring of its own.
</li></ol></blockquote>
<p>
I documented the states that <tt>Response</tt> and <tt>HTTP11ClientProtocol</tt> can be in and I added docstrings for all the state methods in <a class="missing changeset" title="No changeset 26009 in the repository">r26009</a>. In doing this, I also noticed that none of <tt>HTTP11ClientProtocol</tt>'s attributes were documented, so I documented them in <a class="missing changeset" title="No changeset 26010 in the repository">r26010</a>.
</p>
<blockquote class="citation">
<ol><li><tt>HTTP11ClientProtocol._finish</tt> is also undocumented, and the antecedent in its "XXX" comment is ambiguous. (Thanks to the link for <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3420" title="enhancement: twisted.web.client persistent connections (closed: fixed)">#3420</a> though.)
</li></ol></blockquote>
<p>
Done in <a class="missing changeset" title="No changeset 26011 in the repository">r26011</a> and <a class="missing changeset" title="No changeset 26012 in the repository">r26012</a>.
</p>
<blockquote class="citation">
<p>
Idle musings / reminders of persistent problems:
</p>
<ol><li>New examples are great. But every time I see a new example that uses <tt>reactor.run()</tt>, I feel it necessary to point out the thing that I pointed out on <a class="ext-link" href="http://twistedmatrix.com/trac/ticket/1490#comment:39"><span class="icon">​</span>my comment to #1490</a>. Nothing this branch can do, I guess, just a reminder.
</li></ol></blockquote>
<p>
Yea. <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3270" title="defect: The XMPP example in Twisted.Words crashes if I stop it with Ctrl-C (closed: fixed)">#3270</a> is also somewhat related. Do you want to have a fight with Chris about it? <a class="ext-link" href="http://divmod.org/trac/browser/trunk/Epsilon/epsilon/react.py"><span class="icon">​</span>react.py</a> is as far as I could push that issue. Hm, actually, I see that Chris didn't voice his objections on the ticket, he must have done it in meatspace, which means I can pretend he didn't, and maybe push the <tt>react</tt> solution forward.
</p>
<blockquote class="citation">
<ol><li>If we're keeping <tt>UNKNOWN_LENGTH</tt>, <tt>object()</tt> is an opaque way to make a constant value. I really wish we had some convenient idiom for making constants that would <tt>repr()</tt> (and ideally <tt>qual()</tt> as well) to their FQPN. Inspecting <tt>UNKNOWN_LENGTH</tt> in a debugger, or a failed test, like inspecting any other Python constant, is going to be mysterious and unpleasant. How would you feel about filing a ticket for a good <tt>NamedConstant</tt> class? It seems the more protocols the implement, the more constants like this that we have. Anyway, as a more practical solution I suggest just making it a string ("twisted.web.iweb.UNKNOWN_LENGTH") for the time being.
</li></ol></blockquote>
<p>
I don't feel very good about adding this kind of thing to Twisted. Perhaps it will come to that (we probably should have hashed this out long ago, oh well), but it's such a simple thing, and has basically nothing to do with Twisted... Why isn't it in the stdlib, or why isn't there a widespread third-party package that provides the functionality?
</p>
<p>
Switching to a string (well, I'm going to use unicode actually, ha ha) for now seems like a good idea (<a class="missing changeset" title="No changeset 26013 in the repository">r26013</a>).
</p>
<blockquote class="citation">
<ol><li>The only <tt>pyflakes</tt> warning from any changed file is <tt>IUsernameDigestHash</tt>. Not your fault, I know, but could you perhaps just add a blank line that says "<tt>IUsernameDigestHash # export in __all__ below</tt>" and take it to zero?
</li></ol></blockquote>
<p>
I really dislike hacking around pyflakes shortcomings like that. I'd rather add <tt>__all__</tt> support to pyflakes.
</p>
<blockquote class="citation">
<ol><li>We should really have one good, public implementation of a state-machine helper somewhere. How many more times are we going to re-implement <tt>vertex.statemachine</tt> and <tt>epsilon.modal</tt> before we settle down and decide on one that works?
</li></ol></blockquote>
<p>
Good question. I'm getting close to the point where I want to write one and put it in Twisted. I'm not quite sure what exact features it'll have yet, though. Some stuff from <tt>statemachine</tt> is nice, some stuff from <tt>modal</tt> is nice, there's probably useful features that neither of them provides...
</p>
<p>
I'm re-assigning this to you so you can respond to my response to your point one. After that's dealt with, I wouldn't mind if it went back to the general review pool.
</p>
<p>
<sup>1</sup>: No, I didn't remember.
</p>
TicketglyphThu, 15 Jan 2009 04:50:55 GMTowner deletedhttps://twistedmatrix.com/trac/ticket/886#comment:17
https://twistedmatrix.com/trac/ticket/886#comment:17
<ul>
<li><strong>owner</strong>
<em>glyph</em> deleted
</li>
</ul>
<p>
Replying to <a class="ticket" href="https://twistedmatrix.com/trac/ticket/886#comment:16" title="Comment 16">exarkun</a>:
</p>
<blockquote class="citation">
<p>
I agree with you in general. Some functionality needs to be public. I mentioned the specific motivation behind what is happening in this branch in the comment where I requested a review, though. Can you argue against that plan in particular?
</p>
</blockquote>
<p>
So, I have a bunch of reasons why it's a bad idea. I'll list them here, but I've considered it and I think that (aside from removing the example from doc/examples, and putting it somewhere private itself) I am willing to get this merged as all private. If you find yourself persuaded by my arguments, then we can change the plan, but if not, I also think it would be valuable to try experimenting with this one-layer-at-a-time merging strategy. I'm not entirely convinced by my own arguments.
</p>
<p>
Trunk should always be in a releasable state. So I'm assuming that a release may happen at any time; specifically, after this branch is merged but before the high-level API is merged.
</p>
<p>
If we release private functionality which is better than the public functionality, (especially if it is replete with examples of how to use the private functionality), it strikes me as likely that users will disregard the underscore and start using it. And who could blame them? When python-core releases a non-functional public API with some juicy private bits that we want to use, we use them anyway and are upset when they break them. (Although granted, it would be nicer if they had a better definition of "private", so we could tell in advance.) I think that having a compatibility policy that works depends on having the kind of credibility with users that comes from not prevaricating too much.
</p>
<p>
In other words, let's not just replace making too much stuff public with making too much stuff private.
</p>
<p>
Your motivation for the low-level-then-high-level plan seems to be keeping the branches (and therefore the reviews) smaller, by breaking them into layers. All things being equal, that would certainly be better. But, all things are rarely equal: by splitting the feature up, the nature of the review changes. Large as this branch was, I would personally have preferred it be even larger but included a top-level starting point more polished than the example. When I do reviews of new features I like to start reading at the "top".
</p>
<p>
My perspective on the structure and necessity of the implementation details is directly influenced by the way in which they are used by the high-level API. For example, in its current form the behavior of <tt>HTTP11ClientProtocol.request</tt> when not in the <tt>QUIESCENT</tt> state makes sense. But my opinion might be influenced by seeing what the request-queuing implementation implemented above that might look like.
</p>
<p>
This might also be an argument for keeping the present structure private (rather than making <tt>HTTP11ClientProtocol</tt> public) when merging this branch, but it seems to make the shorter reviews a false economy. If I'm going to have to re-review these implementation details again later once I've seen them in more serious use, why bother having a separate branch?
</p>
<p>
Clearly this branch needed a high-level design to go with it, hence the structure in the example, but the high-level design is only partially included. The example is cheating. It has no tests, it has no documentation, and it's not clear how features that it leaves out (like handling redirects, or translating "error" status codes into errors) will be implemented.
</p>
<p>
Again, all that said, this is just a suggestion.
</p>
<p>
I think I might have some time later for a re-review, but I'll put this back into the general pool to avoid blocking it further.
</p>
TicketglyphThu, 15 Jan 2009 07:16:54 GMThttps://twistedmatrix.com/trac/ticket/886#comment:18
https://twistedmatrix.com/trac/ticket/886#comment:18
<p>
Well, one of my points seems to have been invalidated already. As heard on IRC:
</p>
<pre class="wiki">&lt;ivan&gt; it's already deployed to production
</pre><p>
so perhaps keeping it out of trunk isn't going to help :).
</p>
TicketglyphThu, 15 Jan 2009 07:25:56 GMTstatus changed; owner sethttps://twistedmatrix.com/trac/ticket/886#comment:19
https://twistedmatrix.com/trac/ticket/886#comment:19
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>assigned</em>
</li>
<li><strong>owner</strong>
set to <em>glyph</em>
</li>
</ul>
<p>
Reviewing again.
</p>
TicketglyphThu, 15 Jan 2009 09:36:08 GMTkeywords, owner, status changedhttps://twistedmatrix.com/trac/ticket/886#comment:20
https://twistedmatrix.com/trac/ticket/886#comment:20
<ul>
<li><strong>keywords</strong>
<em>review</em> removed
</li>
<li><strong>owner</strong>
changed from <em>glyph</em> to <em>exarkun</em>
</li>
<li><strong>status</strong>
changed from <em>assigned</em> to <em>new</em>
</li>
</ul>
<p>
Replying to <a class="ticket" href="https://twistedmatrix.com/trac/ticket/886#comment:16" title="Comment 16">exarkun</a>:
</p>
<p>
Since this is a pretty long comment, which is turning into a meandering discussion, I'm putting all the mandatory review stuff up here at the front. If you get bored you are free to ignore the rest.
</p>
<p>
These are all docstring or minor issues, so you can merge once they're dealt with.
</p>
<ol><li><tt>twisted.web._newclient.HTTPParser.statusReceived</tt> isn't valid epytext. I believe this is because of the "\r\n"; I've run into this problem before, I'd recommend spelling it "CRLF" in docstrings.
</li><li>the following reference (link) names couldn't be resolved by pydoctor:
<ol><li>in <tt>_newclient</tt>:
<ol><li><tt>IProtocol</tt>
</li><li><tt>IPushProducer</tt> (I think you mean <tt>IPushProducer</tt>)
</li><li><tt>IResponse</tt>
</li><li>There also seems to be a bug for mwh, not you: pydoctor can't find "twisted.web.iweb.UNKNOWN_LENGTH", despite the ivar declaration.
</li></ol></li><li>in <tt>iweb</tt>:
<ol><li><tt>IConsumer</tt>
</li><li><tt>Deferred</tt>
</li><li><tt>IProducer</tt>
</li><li><tt>Failure</tt>
</li></ol></li></ol></li><li><tt>_WrapperException</tt> needs a docstring, whether or not it remains double-private.
</li><li><tt>makeStateDispatcher</tt> should really have a docstring.
</li><li>undocumented in <tt>test_newclient</tt>:
<ol><li><tt>AccumulatingProtocol</tt> (isn't there one of these somewhere that you can use? let's not make <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3604" title="enhancement: Create a single definitive module for commonly used test helpers and ... (closed: duplicate)">#3604</a> harder to fix.)
</li><li><tt>SlowRequest</tt>
</li><li><tt>SimpleRequest</tt>
</li><li><tt>StringProducer</tt>
</li><li><tt>RequestTests</tt>
</li><li>"<tt>mumble</tt>". (Perhaps "lol" is not a useful comment, either.)
</li><li>Assertion methods. (It would be more consistent for these to be on a class, but I don't think that's too important.)
<ol><li><tt>assertResponseFailed</tt>
</li><li><tt>assertRequestGenerationFailed</tt>
</li><li><tt>assertRequestTransmissionFailed</tt>
</li><li><tt>HTTPClientParserTests._noBodyTest</tt>
</li></ol></li></ol></li></ol><hr />
<p>
That's it for the real meat of the review. Now for the optional stuff and further rumination on loosely-related ideas:
</p>
<blockquote class="citation">
<p>
Replying to <a class="ticket" href="https://twistedmatrix.com/trac/ticket/886#comment:15" title="Comment 15">glyph</a>:
</p>
<blockquote class="citation">
<p>
OK. Awesome. Really glad to see this is finally in review!
</p>
</blockquote>
<p>
I agree with you in general. Some functionality needs to be public. I mentioned the specific motivation behind what is happening in this branch in the comment where I requested a review, though. Can you argue against that plan in particular?
</p>
</blockquote>
<p>
The abovementioned comment on IRC was enough to convince me, at least, that the best course of action at this point is to complete the review cycle as quickly as possible, in whatever form.
</p>
<p>
I still stand by my earlier suggestion, in point 1.2 above, that some elements of this should be public — but this is not out of any procedural concern. I just have a fairly high level of confidence in the API as we've discussed it, and the implementation seems high-quality. The sketch adequately demonstrates to me that this module, or at least the parts that I propose become public, are suitable for a real substrate.
</p>
<p>
So feel free to make them public, or not.
</p>
<blockquote class="citation">
<p>
You're right. The middle underscore isn't adding anything. I fixed most (not <tt>_WrappedException</tt> (<em>yet</em>?) since I may end up moving its definition to another module, depending on what happens for some other points) instances of this pattern in <a class="missing changeset" title="No changeset 25999 in the repository">r25999</a>, <a class="missing changeset" title="No changeset 26000 in the repository">r26000</a>, and <a class="missing changeset" title="No changeset 26001 in the repository">r26001</a>.
</p>
</blockquote>
<p>
This begs an interesting question, which isn't really relevant to this review: if <tt>_WrappedException</tt> is private, then there's no documentation nor any supported way to create a <tt>ResponseFailed</tt>, <tt>RequestTransmissionFailed</tt>, etc. The duplicate documentation of <tt>@ivar reasons</tt> is a nice bit of attention to detail, but somewhat redundant.
</p>
<blockquote class="citation">
<blockquote class="citation">
<ol start="3"><li><tt>_makeFunction</tt> will create functions that don't appear in the API documentation. Also: worst name ever. May I suggest something more along the lines of (spelled python2.3-compatibly, of course):
</li></ol></blockquote>
</blockquote>
<blockquote class="citation">
<p>
Fixed in <a class="missing changeset" title="No changeset 26003 in the repository">r26003</a>.
</p>
</blockquote>
<p>
Better. I have some comments for using this more broadly, but these should be considered as comments on a potential public implementation of this; the ad-hoc version here can remain as-is:
</p>
<ol><li>The "name" argument is both redundant and misleading. It should really just inspect its function argument and use that name. Why does the name sometimes differ and sometimes not? As I was writing this I realized that it is written that way to consistently make the implementations private without making the names necessarily private, but it's a bit of a mystery at first. When I saw '<tt>def _connectionLost_QUIESCENT</tt>' my impulse was to grep for '<tt>def _connectionLost</tt>', after having seen '<tt>def _bodyDataReceived_INITIAL</tt>' and '<tt>def _bodyDataReceived</tt>'.
</li><li>Decorators should be written with an eye to @-syntax compatibility. We might not use it, but it would be annoying for an external user to have to define a new wrapper function just so that they could. Also, it's hard enough puzzling out the equivalence of '<tt>@foo def bar</tt>' and '<tt>def bar ... bar = foo(bar)</tt>'; adding extra arguments makes it even harder to read.
</li><li>I also can't help but note that <tt>epsilon.modal</tt> addresses all of these issues; the implementations don't need to be private, the names are less ugly, it has a really convenient syntax without being a decorator, etc. It bothers me that it's apparently easier to do something crappy over and over and over again than to roll in code that we've <em>already written</em> from somewhere else.
</li></ol><blockquote class="citation">
<blockquote class="citation">
<ol><li>Why bother having <tt>UNKNOWN_LENGTH</tt> as a distinct value? Wouldn't <tt>None</tt> serve this purpose adequately? Or having the attribute be optional? You still have to explicitly check, after all. It seems to bloat the code without providing any real benefits.
</li></ol></blockquote>
<blockquote>
<p>
We chose to use <tt>UNKNOWN_LENGTH</tt> rather than <tt>None</tt> in order to attempt to prevent bugs in applications of the form <tt>if response.length: ...</tt>
</p>
</blockquote>
</blockquote>
<p>
I am not sure what kind of bug you mean. I mean, I can come up with a class of bugs in applications like that, but in order to prevent them you'd need a <tt>__nonzero__</tt> method that raises an exception. (Although, actually, that's kind of a cool idea.) As it is, it just flips the polarity from silently failing your test to silently succeeding it. <tt>UNKNOWN_LENGTH</tt> can be 0-length too, after all, if all you get is one chunk.
</p>
<blockquote class="citation">
<p>
This rather directly raises a question which I've been pondering all morning though - why is a request body produced with an <tt>IEntityBodyProducer</tt> but a response body is made available through the <tt>Response.deliverBody</tt> mechanism; it seems as though <tt>Response</tt> could have an attribute which is an <tt>IEntityBodyProducer</tt> provider.
</p>
</blockquote>
<p>
Indeed, it would be good if these were consistent. Since you can't remember why it's this way, can you fix it?
</p>
<blockquote class="citation">
<blockquote class="citation">
<ol><li>The tests sure do say <tt>Response(None, None, None, None, transport)</tt> a lot.
</li></ol></blockquote>
<p>
Did something in <a class="missing changeset" title="No changeset 26008 in the repository">r26008</a>, let me know if that's sufficient.
</p>
</blockquote>
<p>
Better than sufficient. Reading the tests with that change I feel like I understand this code better :).
</p>
<blockquote class="citation">
<blockquote class="citation">
<p>
Idle musings / reminders of persistent problems:
</p>
<ol><li>New examples are great. But every time I see a new example that uses <tt>reactor.run()</tt>, I feel it necessary to point out the thing that I pointed out on <a class="ext-link" href="http://twistedmatrix.com/trac/ticket/1490#comment:39"><span class="icon">​</span>my comment to #1490</a>. Nothing this branch can do, I guess, just a reminder.
</li></ol></blockquote>
<p>
Yea. <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3270" title="defect: The XMPP example in Twisted.Words crashes if I stop it with Ctrl-C (closed: fixed)">#3270</a> is also somewhat related. Do you want to have a fight with Chris about it? <a class="ext-link" href="http://divmod.org/trac/browser/trunk/Epsilon/epsilon/react.py"><span class="icon">​</span>react.py</a> is as far as I could push that issue. Hm, actually, I see that Chris didn't voice his objections on the ticket, he must have done it in meatspace, which means I can pretend he didn't, and maybe push the <tt>react</tt> solution forward.
</p>
</blockquote>
<p>
I'd prefer <tt>twistd run</tt> to <tt>react</tt> for examples, since <tt>react</tt> will still produce misleading behavior where newbies think they need to do something special to call <tt>listenTCP</tt> twice. I'll probably take your side in a fight against Chris over that though, <tt>react</tt> is an improvement. (<tt>react()</tt> will probably be more useful for simple command-line tools or client applications, too).
</p>
<blockquote class="citation">
<blockquote class="citation">
<blockquote>
<p>
... I really wish we had some convenient idiom for making constants that would <tt>repr()</tt> (and ideally <tt>qual()</tt> as well) to their FQPN. ...
</p>
</blockquote>
</blockquote>
<p>
I don't feel very good about adding this kind of thing to Twisted.
</p>
</blockquote>
<p>
Why not? Twisted has <em>lots</em> of utilities in twisted.python that just do stdlib-type stuff better than the stdlib, and for the most part I love them. Sure, some of them were ill-advised, but they haven't really hurt us. (i.e. how much time have you spent maintaining <tt>twisted.python.hook</tt> this year?)
</p>
<blockquote class="citation">
<p>
Perhaps it will come to that (we probably should have hashed this out long ago, oh well), but it's such a simple thing, and has basically nothing to do with Twisted... Why isn't it in the stdlib, or why isn't there a widespread third-party package that provides the functionality?
</p>
</blockquote>
<p>
I'd certainly rather add a dozen more little utilities like this to Twisted than try to popularize a dozen widespread third-party packages, or live with the quality-control insanity of trying to integrate a dozen <em>versions</em> of a dozen third-party packages into our already onerous buildbot setup. (Not to mention that then we'd need to write a working, portable package manager for Python so that users can install Twisted's two-dozen dependencies without <tt>easy_install</tt> deleting their root partition or whatever.)
</p>
<blockquote class="citation">
<p>
Switching to a string (well, I'm going to use unicode actually, ha ha) for now seems like a good idea (<a class="missing changeset" title="No changeset 26013 in the repository">r26013</a>).
</p>
</blockquote>
<p>
Speaking of "ha ha", sure you don't want to switch to something that raises an exception on <tt>__nonzero__</tt>?
</p>
<blockquote class="citation">
<blockquote class="citation">
<ol><li>The only <tt>pyflakes</tt> warning from any changed file is <tt>IUsernameDigestHash</tt>. Not your fault, I know, but could you perhaps just add a blank line that says "<tt>IUsernameDigestHash # export in __all__ below</tt>" and take it to zero?
</li></ol></blockquote>
<p>
I really dislike hacking around pyflakes shortcomings like that. I'd rather add <tt>__all__</tt> support to pyflakes.
</p>
</blockquote>
<p>
Le mieux est l'ennemi du bien.
</p>
<p>
I noticed that there wasn't a Divmod ticket for this, so I filed one. <a class="ext-link" href="http://divmod.org/trac/ticket/2826"><span class="icon">​</span>Be my guest and fix pyflakes</a>, but in the meanwhile it's a lot easier to read pyflakes' output if I don't have to manually verify the accidental exports.
</p>
<p>
Actually, I just realized that one the main reason I haven't agitated for a pyflakes fix is that I actually <em>like</em> the pyflakes workaround, because it's also a Python workaround. It's easy to miss the <tt>__all__</tt> entry down below; nothing will warn you if the name <em>stops</em> being defined, and the API is thusly incompatibly changed. But if you delete the import and you're not using pyflakes, the name reference below will blow up in your tests, right next to a comment that tells you the name is being explicitly exported.
</p>
<p>
Of course you know how I <a class="ext-link" href="https://code.launchpad.net/~glyph/+junk/pyexport"><span class="icon">​</span>actually</a> wish this worked.
</p>
TicketitamarThu, 15 Jan 2009 13:53:04 GMThttps://twistedmatrix.com/trac/ticket/886#comment:21
https://twistedmatrix.com/trac/ticket/886#comment:21
<p>
Re why response body comes as protocol, not IEntityBodyProducer: IEntityBodyProducer is still kinda crap. Users would need to support yet another new, over-complex consumer API (see the fun code needed to deal with request producers!), which hopefully we will eventually replace anyway with better version... So why not use API that is equivalent, much simpler, and that users already know: a protocol with a transport? Bleh. And now I've used up half my daily typing allowance :)
</p>
TicketexarkunFri, 16 Jan 2009 22:20:43 GMThttps://twistedmatrix.com/trac/ticket/886#comment:22
https://twistedmatrix.com/trac/ticket/886#comment:22
<p>
Replying to <a class="ticket" href="https://twistedmatrix.com/trac/ticket/886#comment:20" title="Comment 20">glyph</a>:
</p>
<blockquote class="citation">
<ol><li><tt>twisted.web._newclient.HTTPParser.statusReceived</tt> isn't valid epytext. I believe this is because of the "\r\n"; I've run into this problem before, I'd recommend spelling it "CRLF" in docstrings.
</li></ol></blockquote>
<p>
Fixed in <a class="missing changeset" title="No changeset 26052 in the repository">r26052</a>.
</p>
<blockquote class="citation">
<ol><li>the following reference (link) names couldn't be resolved by pydoctor:
<ol><li>in <tt>_newclient</tt>:
<ol><li><tt>IProtocol</tt>
</li><li><tt>IPushProducer</tt> (I think you mean <tt>IPushProducer</tt>)
</li><li><tt>IResponse</tt>
</li><li>There also seems to be a bug for mwh, not you: pydoctor can't find "twisted.web.iweb.UNKNOWN_LENGTH", despite the ivar declaration.
</li></ol></li><li>in <tt>iweb</tt>:
<ol><li><tt>IConsumer</tt>
</li><li><tt>Deferred</tt>
</li><li><tt>IProducer</tt>
</li><li><tt>Failure</tt>
</li></ol></li></ol></li></ol></blockquote>
<p>
Hum. Indeed. I am rather dissatisfied with the proper spelling for these links. I think it makes the unprocessed docstrings rather harder to read. I'd like to talk to mwh about making pydoctor able to resolve them as-is or with some approach other than the one that epydoc/epytext requires. I've not done anything about them yet.
</p>
<blockquote class="citation">
<ol><li><tt>_WrapperException</tt> needs a docstring, whether or not it remains double-private.
</li></ol></blockquote>
<p>
Fixed in <a class="missing changeset" title="No changeset 26053 in the repository">r26053</a>
</p>
<blockquote class="citation">
<ol><li><tt>makeStateDispatcher</tt> should really have a docstring.
</li></ol></blockquote>
<p>
Fixed in <a class="missing changeset" title="No changeset 26054 in the repository">r26054</a>
</p>
<blockquote class="citation">
<ol><li>undocumented in <tt>test_newclient</tt>:
<ol><li><tt>AccumulatingProtocol</tt> (isn't there one of these somewhere that you can use? let's not make <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3604" title="enhancement: Create a single definitive module for commonly used test helpers and ... (closed: duplicate)">#3604</a> harder to fix.)
</li></ol></li></ol></blockquote>
<p>
Fixed in <a class="missing changeset" title="No changeset 26055 in the repository">r26055</a>
</p>
<blockquote class="citation">
<ol><li><tt>SlowRequest</tt>
</li><li><tt>SimpleRequest</tt>
</li><li><tt>StringProducer</tt>
</li></ol></blockquote>
<p>
Fixed in <a class="missing changeset" title="No changeset 26056 in the repository">r26056</a>
</p>
<blockquote class="citation">
<ol><li><tt>RequestTests</tt>
</li></ol></blockquote>
<p>
Erm? This one has a docstring (albeit a short one which reveals very little useful information...)
</p>
<blockquote class="citation">
<ol><li>"<tt>mumble</tt>". (Perhaps "lol" is not a useful comment, either.)
</li><li>Assertion methods. (It would be more consistent for these to be on a class, but I don't think that's too important.)
<ol><li><tt>assertResponseFailed</tt>
</li><li><tt>assertRequestGenerationFailed</tt>
</li><li><tt>assertRequestTransmissionFailed</tt>
</li><li><tt>HTTPClientParserTests._noBodyTest</tt>
</li></ol></li></ol></blockquote>
<p>
Fixed in <a class="missing changeset" title="No changeset 26057 in the repository">r26057</a>.
</p>
<p>
So this leaves only the pydoctor/epytext issues.
</p>
<blockquote class="citation">
<hr />
<p>
That's it for the real meat of the review. Now for the optional stuff and further rumination on loosely-related ideas:
</p>
<blockquote class="citation">
<p>
Replying to <a class="ticket" href="https://twistedmatrix.com/trac/ticket/886#comment:15" title="Comment 15">glyph</a>:
</p>
<blockquote class="citation">
<p>
OK. Awesome. Really glad to see this is finally in review!
</p>
</blockquote>
<p>
I agree with you in general. Some functionality needs to be public. I mentioned the specific motivation behind what is happening in this branch in the comment where I requested a review, though. Can you argue against that plan in particular?
</p>
</blockquote>
<p>
The abovementioned comment on IRC was enough to convince me, at least, that the best course of action at this point is to complete the review cycle as quickly as possible, in whatever form.
</p>
<p>
I still stand by my earlier suggestion, in point 1.2 above, that some elements of this should be public — but this is not out of any procedural concern. I just have a fairly high level of confidence in the API as we've discussed it, and the implementation seems high-quality. The sketch adequately demonstrates to me that this module, or at least the parts that I propose become public, are suitable for a real substrate.
</p>
<p>
So feel free to make them public, or not.
</p>
</blockquote>
<p>
Okay. I think actually I'm going to try the stacked branch thing you mentioned in a previous comment. After the higher-level stuff is done, then some of the lower-level stuff can probably become public. And then the two branches can be merged at almost the same time.
</p>
<blockquote class="citation">
<p>
[snip - there should be a good way to define state machines]
</p>
</blockquote>
<p>
Yep, I agree.
</p>
<blockquote class="citation">
<p>
[snip - UNKNOWN_LENGTH.<span class="underline">nonzero</span> should raise or something]
</p>
</blockquote>
<p>
Actually, even if the response body for an UNKNOWN_LENGTH response ends up being 0 bytes, you still have to treat the response object as you would for a known non-zero length response. In either case, you have to ask the response to produce the body to a protocol. So, while I wouldn't write it myself, <tt>if response.length:´ would let you detect the case where you need to call </tt>deliverBody<tt>. </tt>if response.length is UNKNOWN_LENGTH or response.length:<tt> is a nicer way to write it, but either works. So I ''could'' make </tt>UNKNOWN_LENGTH.<span class="underline">nonzero</span>` raise and force people to write the nicer version, but should I actually?
</p>
<blockquote class="citation">
<p>
[snip - replace protocol usage with another IEntityBodyProducer]
</p>
</blockquote>
<p>
I hope itamar's replied clarified this situation. Really, we should just add a non-shitty <tt>IProducer</tt>/<tt>IConsumer</tt> replacement. :/ I think I'm going to leave this alone for now, though.
</p>
<blockquote class="citation">
<p>
[snip - add a good const/enum/whatever thing to Twisted]
</p>
</blockquote>
<p>
Eh yea okay, I guess we'll have to.
</p>
<blockquote class="citation">
<p>
[snip - pyflakes workaround]
</p>
</blockquote>
<p>
Okay, added in <a class="missing changeset" title="No changeset 26058 in the repository">r26058</a>.
</p>
TicketivankSat, 24 Jan 2009 21:11:17 GMThttps://twistedmatrix.com/trac/ticket/886#comment:23
https://twistedmatrix.com/trac/ticket/886#comment:23
<p>
Some servers like <a class="ext-link" href="http://news.ycombinator.com/"><span class="icon">​</span>http://news.ycombinator.com/</a> use only \n to delimit status lines and headers, instead of \r\n. All not-ridiculously-obscure web browsers can parse this. _newclient.py revision 26111 is unable to parse any header from this page (why is raising AttributeError instead of BadHeaders? I would imagine that no headers are 'bad headers')
</p>
<p>
Failure: twisted.web._newclient.ResponseFailed: [&lt;twisted.python.failure.Failure &lt;type 'exceptions.AttributeError'&gt;&gt;]
</p>
<p>
My not-thoroughly-tested workaround is to have delimiter = '\n' in HTTPParser,
and this in lineReceived:
</p>
<p>
if line and len(line) &gt; 0 and line[-1] == '\r': line = line[:-1]
</p>
TicketjknightSat, 24 Jan 2009 21:51:41 GMThttps://twistedmatrix.com/trac/ticket/886#comment:24
https://twistedmatrix.com/trac/ticket/886#comment:24
<p>
Are there any servers not written by paul graham that exhibit this incredible brokenness?
</p>
TicketivankSat, 24 Jan 2009 23:22:33 GMThttps://twistedmatrix.com/trac/ticket/886#comment:25
https://twistedmatrix.com/trac/ticket/886#comment:25
<p>
I don't know. It could take a while to find one in the wild.
</p>
<p>
Also, I can't find a browser that doesn't allow the broken behavior, even though it violates the RFC: "This flexibility regarding line breaks applies only to text media in the entity-body; a bare CR or LF MUST NOT be substituted for CRLF within any of the HTTP control structures (such as header fields and multipart boundaries)."
</p>
TicketivankWed, 01 Apr 2009 18:58:35 GMTattachment sethttps://twistedmatrix.com/trac/ticket/886
https://twistedmatrix.com/trac/ticket/886
<ul>
<li><strong>attachment</strong>
set to <em>t.web._newclient_support_bare_LF.patch</em>
</li>
</ul>
<p>
support for bare LF newlines sent by broken servers
</p>
TicketivankWed, 01 Apr 2009 18:59:40 GMThttps://twistedmatrix.com/trac/ticket/886#comment:26
https://twistedmatrix.com/trac/ticket/886#comment:26
<p>
If Twisted chooses to support LF newlines, I'd be happy to manually rewrite the tests however they should be written.
</p>
TicketexarkunTue, 12 May 2009 20:42:05 GMTcc changedhttps://twistedmatrix.com/trac/ticket/886#comment:27
https://twistedmatrix.com/trac/ticket/886#comment:27
<ul>
<li><strong>cc</strong>
<em>ivank</em> added
</li>
</ul>
<p>
Any HTTP server that emits LF newlines is broken and written by someone who is either incompetent or malicious. Such people should be punished.
</p>
<p>
However, if someone (ie, ivank) wants to do the work to support this case, then that's fine with me. I do propose separating that out into a separate unit of work, though. ivank, want to file a ticket for implementing that once this code all makes it to trunk (which I am going to renew my efforts to achieve now)?
</p>
TicketivankWed, 13 May 2009 03:14:39 GMThttps://twistedmatrix.com/trac/ticket/886#comment:28
https://twistedmatrix.com/trac/ticket/886#comment:28
<p>
Okay, <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/3833" title="enhancement: _newclient does not support bare LF newlines, unlike web browsers (closed: fixed)">#3833</a> is for LF newline support.
</p>
TicketexarkunTue, 14 Jul 2009 20:56:45 GMThttps://twistedmatrix.com/trac/ticket/886#comment:29
https://twistedmatrix.com/trac/ticket/886#comment:29
<p>
I undid <a class="missing changeset" title="No changeset 26058 in the repository">r26058</a>, a fix for an earlier review comment, because pyflakes now understands <tt>__all__</tt>, so these workarounds are no longer necessary.
</p>
TickettruekonradsFri, 28 Aug 2009 09:31:55 GMTcc changedhttps://twistedmatrix.com/trac/ticket/886#comment:30
https://twistedmatrix.com/trac/ticket/886#comment:30
<ul>
<li><strong>cc</strong>
<em>truekonrads</em> added
</li>
</ul>
<p>
I think that a high level API should be part of this ticket. I think it is best to have a stateful client (one that tracks cookies), like HTTPClientFactory tries to do.
</p>
<p>
I typically will use such object as a page scraper or to interact with Web APIs. Therefore, the following would be useful:
</p>
<ul><li>Proxy support. Either by specifying proxy URL or by passing some sort of wrapper object. Proxy support should enable use of CONNECT method.
</li><li>Cookie handling in a cookie-jar (addCookie/getCookie/deleteCookie methods) so that repeat requests would contain cookies.
</li><li>A convenience function for posting application/x-www-form-urlencoded forms.
</li><li>Some sort of fairy dust for JSON (very common use-case) - POST with JSON encoded data in message body.
</li><li>Some convenience to enable LoggingTransport for debugging.
</li><li>Good docs so dummies like me can start using it fully immediately.
</li></ul>
TicketexarkunFri, 28 Aug 2009 13:37:51 GMThttps://twistedmatrix.com/trac/ticket/886#comment:31
https://twistedmatrix.com/trac/ticket/886#comment:31
<p>
The high-level API is not going to happen as part of this ticket. It will have documentation, and it will be such that the first three features you described will <em>not</em> <strong>have</strong> to be implemented in Twisted or by copying a bunch of code from Twisted and editing it. There probably will be proxy and cookie support in Twisted, though.
</p>
<p>
If you'd like to contribute other requirements or suggestions to the high-level API, please edit <a class="wiki" href="https://twistedmatrix.com/trac/wiki/TwistedWebClient">TwistedWebClient</a>.
</p>
TicketexarkunMon, 31 Aug 2009 21:43:37 GMTbranch, branch_author changedhttps://twistedmatrix.com/trac/ticket/886#comment:32
https://twistedmatrix.com/trac/ticket/886#comment:32
<ul>
<li><strong>branch</strong>
changed from <em>branches/expressive-http-client-886-3</em> to <em>branches/expressive-http-client-886-4</em>
</li>
<li><strong>branch_author</strong>
changed from <em>exarkun, itamar</em> to <em>itamar, exarkun</em>
</li>
</ul>
<p>
(In <a class="missing changeset" title="No changeset 27272 in the repository">[27272]</a>) Branching to 'expressive-http-client-886-4'
</p>
TicketexarkunSun, 08 Nov 2009 22:29:26 GMTstatus changed; resolution sethttps://twistedmatrix.com/trac/ticket/886#comment:33
https://twistedmatrix.com/trac/ticket/886#comment:33
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>closed</em>
</li>
<li><strong>resolution</strong>
set to <em>fixed</em>
</li>
</ul>
<p>
(In <a class="missing changeset" title="No changeset 27413 in the repository">[27413]</a>) Merge expressive-http-client-886-4
</p>
<p>
Author: exarkun, itamar
Reviewer: glyph
Fixes: <a class="closed ticket" href="https://twistedmatrix.com/trac/ticket/886" title="enhancement: Add an HTTP 1.1 client protocol implementation to twisted.web (closed: fixed)">#886</a>
</p>
<p>
Add a new HTTP 1.1 protocol implementation.
</p>
Ticket<automation>Mon, 14 Feb 2011 04:30:49 GMTowner deletedhttps://twistedmatrix.com/trac/ticket/886#comment:34
https://twistedmatrix.com/trac/ticket/886#comment:34
<ul>
<li><strong>owner</strong>
<em>exarkun</em> deleted
</li>
</ul>
Ticket