On Sep 12, 2006, at 4:29 PM, dtillman wrote:
> I'm sorry if I missed the definition of a *large* file - could you
> provide a figure in bytes please that seems to be the threshold for
> causing problems? Thanks.
>
> - Doug
Hi Doug,
Really, a "large file" is anything that takes a noticeable time to
upload. In my specific case, it's anything that takes longer than
three seconds, as that's the refresh rate I've set for the status
page that pops up once, then clocks. I suppose it'll depend on the
pipe between here and there.
I've seen it happen with as little as a 700kb file. The types of
files that my application is *supposed* to handle are from a few K up
to 10MB each.
-Gary

On 9/12/06, Dan Milstein <danmil@...> wrote:
> > Can you tell us what's different between your production and dev
> > environment? That would include how the process is launched as well.
> > And is there a difference in what modules get imported?
>
> Let me do some detective work -- I've got a bit of time now, so let
> me see what I can find. If I come up blank, I'll pass on everything
> I've got.
>
> > Also, even if you tapped into the cgi module, would you really be able
> > to give the user a progress bar? It seems that their window would be
> > tied up by the upload task being performed by the browser.
>
> It works by way of a Javascript function which polls the server every
> few seconds and updates the progress bar. The tricky part is getting
> some sort of identifier to the Javascript function so that it can
> tell the server which upload it's asking about. When I generate the
> page, I put in a new unique id which goes both into the URL of the
> form action and into the Javascript code. The tapped cgi module then
> gets that id from the URL and, as it fills the temp file, updates a
> globally-accessible dict with info about the progress. The JS
> function hits a servlet which just grabs the progress info from that
> dict.
That's neat. I should have presumed it was a Javascript technique.
Oops. I better say "AJAX" or I might not be employable. :-)
-Chuck

Dan Milstein wrote:
>> Can you tell us what's different between your production and dev
>> environment? That would include how the process is launched as well.
>> And is there a difference in what modules get imported?
>>
>
> Let me do some detective work -- I've got a bit of time now, so let
> me see what I can find. If I come up blank, I'll pass on everything
> I've got.
>
>
>> Also, even if you tapped into the cgi module, would you really be able
>> to give the user a progress bar? It seems that their window would be
>> tied up by the upload task being performed by the browser.
>>
>
> It works by way of a Javascript function which polls the server every
> few seconds and updates the progress bar. The tricky part is getting
> some sort of identifier to the Javascript function so that it can
> tell the server which upload it's asking about. When I generate the
> page, I put in a new unique id which goes both into the URL of the
> form action and into the Javascript code. The tapped cgi module then
> gets that id from the URL and, as it fills the temp file, updates a
> globally-accessible dict with info about the progress. The JS
> function hits a servlet which just grabs the progress info from that
> dict.
>
> It's a surprisingly small amount of code, actually -- it just touches
> on a lot of stages in subtle ways.
>
> -D
>
>
>
>
> -------------------------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Webware-discuss mailing list
> Webware-discuss@...
> https://lists.sourceforge.net/lists/listinfo/webware-discuss
>
I'm sorry if I missed the definition of a *large* file - could you
provide a figure in bytes please that seems to be the threshold for
causing problems? Thanks.
- Doug

> Can you tell us what's different between your production and dev
> environment? That would include how the process is launched as well.
> And is there a difference in what modules get imported?
Let me do some detective work -- I've got a bit of time now, so let
me see what I can find. If I come up blank, I'll pass on everything
I've got.
> Also, even if you tapped into the cgi module, would you really be able
> to give the user a progress bar? It seems that their window would be
> tied up by the upload task being performed by the browser.
It works by way of a Javascript function which polls the server every
few seconds and updates the progress bar. The tricky part is getting
some sort of identifier to the Javascript function so that it can
tell the server which upload it's asking about. When I generate the
page, I put in a new unique id which goes both into the URL of the
form action and into the Javascript code. The tapped cgi module then
gets that id from the URL and, as it fills the temp file, updates a
globally-accessible dict with info about the progress. The JS
function hits a servlet which just grabs the progress info from that
dict.
It's a surprisingly small amount of code, actually -- it just touches
on a lot of stages in subtle ways.
-D
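
A minimal sketch of the bookkeeping described above, written for modern Python 3 rather than the Python of this thread. It is not the code Dan refers to: every name here (new_upload_id, record_progress, get_progress, finish_upload) is hypothetical. It shows only the shared dict keyed by the unique id, with a lock so the upload thread and the status servlet thread can both touch it.

```python
import threading
import uuid

# Hypothetical names throughout; this is an illustration, not the thread's code.
_progress = {}                      # upload id -> (bytes_received, bytes_expected)
_progress_lock = threading.Lock()   # guards _progress across request threads


def new_upload_id():
    """Make the unique id that goes into both the form action URL and the JS."""
    return uuid.uuid4().hex


def record_progress(upload_id, received, expected):
    """Called by the upload-handling code as more of the request arrives."""
    with _progress_lock:
        _progress[upload_id] = (received, expected)


def get_progress(upload_id):
    """What a status servlet could return to the poller: a percentage or None."""
    with _progress_lock:
        entry = _progress.get(upload_id)
    if entry is None:
        return None                 # unknown id, or upload already finished
    received, expected = entry
    if expected <= 0:
        return 0
    return min(100, received * 100 // expected)


def finish_upload(upload_id):
    """Drop the entry once the upload is complete so the dict does not grow."""
    with _progress_lock:
        _progress.pop(upload_id, None)
```

The polling script would send the same id on each status request, and the status servlet would answer with whatever get_progress returns for it.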

On 9/12/06, Dan Milstein <danmil@...> wrote:
> I don't have any kind of answer, but I've had a similar thing come
> up, so I'll pass it on, in case it helps narrow the focus:
>
> - In my dev environment (Webware 0.8.1 / Apache 2 / Mac OS X 10.4),
> when I do large file uploads, it freezes the entire webware process
>
> Works fine in the production environment, so I've been lazy about
> tracking it down, but it's a particularly nasty freeze. I have to
> kill -9 the process to get it to die. Which does have the flavor of
> some sort of GIL issue.
Can you tell us what's different between your production and dev
environment? That would include how the process is launched as well.
And is there a difference in what modules get imported?
Also, even if you tapped into the cgi module, would you really be able
to give the user a progress bar? It seems that their window would be
tied up by the upload task being performed by the browser.
Gary, I'm sorry I don't have an answer. I guess I've been lucky enough
not to have this problem on production or development.
-Chuck

I don't have any kind of answer, but I've had a similar thing come
up, so I'll pass it on, in case it helps narrow the focus:
- In my dev environment (Webware 0.8.1 / Apache 2 / Mac OS X 10.4),
when I do large file uploads, it freezes the entire webware process
Works fine in the production environment, so I've been lazy about
tracking it down, but it's a particularly nasty freeze. I have to
kill -9 the process to get it to die. Which does have the flavor of
some sort of GIL issue.
"Large" seems to be "large enough that the browser doesn't send it as
a single POST" (which it seems to do for small files)
Also, you're not going to be able to get a meaningful progress check
the way you're trying to. The file upload lifecycle works as follows:
 - Apache gets the beginning of the huge request
 - It starts forwarding it to the Webware process
 - The webware proc uses the cgi library to parse out the values
 - The cgi library creates a temp file for the upload and fills it
   as the rest of the request is pulled from Apache
 - Once it's done getting *all* the data from Apache, Webware then
   hands control over to your servlet
 - At that point, when you do things like file.value, it gets read
   out of that temp file
The slow part is happening in the cgi.py library, before your servlet
is called. Your servlet is just copying from the temp file, which is
fast.
You *can* subclass cgi.py and play some games to get a progress check
as the temp file is filled (cgi.py is designed to let you override
the temp file creation, so you can instrument that, basically).
-Dan
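
One way to picture the "instrument the temp file" idea is a wrapper that counts bytes as they are written. This is a sketch under assumptions, not the subclass mentioned above: CountingTempFile and its on_write callback are hypothetical names. The hook in cgi.py is FieldStorage.make_file, which a subclass could override to return an object like this one. The sketch does not import cgi, because that module has been removed from recent Python releases.

```python
import tempfile


class CountingTempFile(object):
    """Wraps a temp file and reports how many bytes have been written so far."""

    def __init__(self, on_write):
        self._file = tempfile.TemporaryFile("w+b")
        self._on_write = on_write   # hypothetical callback: on_write(total_bytes)
        self.bytes_written = 0

    def write(self, data):
        # Write through to the real temp file, then report the running total.
        result = self._file.write(data)
        self.bytes_written += len(data)
        self._on_write(self.bytes_written)
        return result

    def seek(self, *args):
        return self._file.seek(*args)

    def read(self, *args):
        return self._file.read(*args)

    def close(self):
        self._file.close()
```

The callback is where the progress figure would be published, for example into a dict keyed by the upload id.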
On Sep 12, 2006, at 2:06 PM, Gary Perez wrote:
>
> On Sep 12, 2006, at 12:35 PM, Chuck Esterbrook wrote:
>
>> On 9/12/06, Gary Perez <gary.perez@...> wrote:
>>> On Sep 8, 2006, at 8:34 AM, sophana wrote:
>>>
>>>> Looking at the error messages, it seems that it is webware that
>>>> doesn't accept the connection.
>>>> I don't know why, and don't even know if I have the same problem.
>>>> Are you sure that your second request is not blocked by the first
>>>> one in webware?
>>>
>>> How would I be able to determine whether the second request is being
>>> blocked? If the simple answer is "because the second request doesn't
>>> get served", then yeah, it's being blocked... but why, and is
>>> there a way around this?
>>>
>>> I admit I don't know a lot about threads/threading, but I was under
>>> the impression that the AppServer (as config'd below) could handle
>>> multiple, simultaneous requests.
>>>
>>> Is there something completely obvious that I'm overlooking?
>>
>> Does anyone think this could be Python's GIL kicking in? Perhaps the
>> first thread has acquired the GIL and is not letting it go?
>
> GIL (global interpreter lock) - I'm *very* unqualified to address
> this.
>
>> Is the first thread using any extension modules (Python modules
>> written in C instead of Python)?
>
> No sir. From the first thread (form processor):
>
> from Template import Template
> import vtools, os.path
>
> ... where "vtools" is an external .py module that's written entirely
> in Python (no C).
>
> The 2nd request is also a python-only module that simply does a
> glob.glob('*') on the pwd.
>
>> What does it do with all the form
>> data being uploaded?
>> -Chuck
>
>
> At the moment, it simply does a bit of form checking (e.g., did the
> user select a file to upload), then writes the file.value from the
> form data, as such:
>
> filename, contents = file.filename, file.value
> open(os.path.join(PROJDIR + pdir, filename), 'wb').write(contents)
>
> As I previously stated, I was going to try to figure out a way to
> write the file data in chunks (?) so I could provide some type of
> status display to the user doing the uploading, but that's a problem
> for a later time...
>
> -Gary
>

On Sep 12, 2006, at 2:06 PM, Gary Perez wrote:
>
> On Sep 12, 2006, at 12:35 PM, Chuck Esterbrook wrote:
>
>> On 9/12/06, Gary Perez <gary.perez@...> wrote:
>>
>> Does anyone think this could be Python's GIL kicking in? Perhaps the
>> first thread has acquired the GIL and is not letting it go?
>
> GIL (global interpreter lock) - I'm *very* unqualified to address
> this.
After a bit of research
(http://ldp.paradoxical.co.uk/LDP/LGNET/107/pai.html), I've found:
"In order to support multi threaded Python programs the interpreter
regularly releases and reacquires the lock, by default every 10
bytecode instructions. This can however be changed using the
sys.setcheckinterval() function. The lock is also released and
reacquired around potentially blocking I/O operations like reading or
writing a file, so that other threads can run while the thread that
requests the I/O is waiting for the I/O operation to complete."
And, at first, I thought that the "writing a file" part pertained to
my situation--which pointed to AppServer's behavior contradicting the
above statement.
However, the rate-limiting step in the whole thing is the transfer of
1000s of KB worth of data across the ether. In between the moment
that the form's submit button is pressed & the actual file write is
called, that's all upload time - the form hasn't been fully submitted
until all that goes across the wire, right?
So now, Chuck, I'm thinking it might very well be a GIL problem. To
get around this, should I explicitly spin off a new thread in the
status module? I assume I cannot do it on the upload side, since it's
a standard HTTP POST?
TIA,
-Gary
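
The quoted claim that the lock is released around blocking I/O can be checked with a small experiment. This is a sketch for modern Python 3, and the names blocked_reader and demo are made up for the illustration. One thread blocks in os.read on an empty pipe while the main thread keeps counting; if the blocked thread held the GIL, the loop could not make progress.

```python
import os
import threading
import time


def blocked_reader(read_fd, result):
    # os.read blocks until data arrives; CPython releases the GIL while waiting.
    result.append(os.read(read_fd, 1024))


def demo(busy_seconds=0.2):
    """Return (ticks, data): loop iterations done while the reader was blocked."""
    read_fd, write_fd = os.pipe()
    result = []
    reader = threading.Thread(target=blocked_reader, args=(read_fd, result))
    reader.start()

    # The main thread keeps executing Python bytecode while the reader waits.
    ticks = 0
    deadline = time.time() + busy_seconds
    while time.time() < deadline:
        ticks += 1

    os.write(write_fd, b"done")     # unblock the reader
    reader.join()
    os.close(read_fd)
    os.close(write_fd)
    return ticks, result[0]
```

Calling demo() should return a large tick count together with the bytes the reader received, which is what one would expect if a thread waiting on I/O does not by itself starve the others.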

On Sep 12, 2006, at 12:35 PM, Chuck Esterbrook wrote:
> On 9/12/06, Gary Perez <gary.perez@...> wrote:
>> On Sep 8, 2006, at 8:34 AM, sophana wrote:
>>
>>> Looking at the error messages, it seems that it is webware that
>>> doesn't accept the connection.
>>> I don't know why, and don't even know if I have the same problem.
>>> Are you sure that your second request is not blocked by the first
>>> one in webware?
>>
>> How would I be able to determine whether the second request is being
>> blocked? If the simple answer is "because the second request doesn't
>> get served", then yeah, it's being blocked... but why, and is there a
>> way around this?
>>
>> I admit I don't know a lot about threads/threading, but I was under
>> the impression that the AppServer (as config'd below) could handle
>> multiple, simultaneous requests.
>>
>> Is there something completely obvious that I'm overlooking?
>
> Does anyone think this could be Python's GIL kicking in? Perhaps the
> first thread has acquired the GIL and is not letting it go?
GIL (global interpreter lock) - I'm *very* unqualified to address this.
> Is the first thread using any extension modules (Python modules
> written in C instead of Python)?
No sir. From the first thread (form processor):
    from Template import Template
    import vtools, os.path
... where "vtools" is an external .py module that's written entirely
in Python (no C).
The 2nd request is also a python-only module that simply does a
glob.glob('*') on the pwd.
> What does it do with all the form
> data being uploaded?
> -Chuck
At the moment, it simply does a bit of form checking (e.g., did the
user select a file to upload), then writes the file.value from the
form data, as such:
    filename, contents = file.filename, file.value
    open(os.path.join(PROJDIR + pdir, filename), 'wb').write(contents)
As I previously stated, I was going to try to figure out a way to
write the file data in chunks (?) so I could provide some type of
status display to the user doing the uploading, but that's a problem
for a later time...
-Gary
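
Writing the data in chunks could look roughly like the sketch below, in modern Python 3. The function name copy_in_chunks, the chunk size, and the on_progress callback are all hypothetical. It assumes the upload is available as a file-like object rather than as one big string (with cgi's FieldStorage that would be the field's file attribute instead of value). Note that it only reports progress for the copy on the server, not for the transfer from the browser.

```python
import io
import os
import tempfile


def copy_in_chunks(src, dest_path, chunk_size=64 * 1024, on_progress=None):
    """Copy a file-like object to dest_path chunk by chunk; return bytes copied."""
    copied = 0
    with open(dest_path, "wb") as dest:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:           # empty read means end of input
                break
            dest.write(chunk)
            copied += len(chunk)
            if on_progress is not None:
                on_progress(copied)
    return copied


if __name__ == "__main__":
    # Stand-in for an uploaded file: one megabyte of zero bytes in memory.
    upload = io.BytesIO(b"\0" * (1024 * 1024))
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "upload.bin")
        total = copy_in_chunks(upload, target, on_progress=lambda n: None)
        print("copied", total, "bytes")
```

Reading in fixed-size chunks also avoids holding the whole upload in memory, which matters more as files approach the 10MB end of the range.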

On 9/12/06, Gary Perez <gary.perez@...> wrote:
> On Sep 8, 2006, at 8:34 AM, sophana wrote:
>
> > Looking at the error messages, it seems that it is webware that doesn't
> > accept the connection.
> > I don't know why, and don't even know if I have the same problem.
> > Are you sure that your second request is not blocked by the first
> > one in webware?
>
> How would I be able to determine whether the second request is being
> blocked? If the simple answer is "because the second request doesn't
> get served", then yeah, it's being blocked... but why, and is there a
> way around this?
>
> I admit I don't know a lot about threads/threading, but I was under
> the impression that the AppServer (as config'd below) could handle
> multiple, simultaneous requests.
>
> Is there something completely obvious that I'm overlooking?
Does anyone think this could be Python's GIL kicking in? Perhaps the
first thread has acquired the GIL and is not letting it go?
Is the first thread using any extension modules (Python modules
written in C instead of Python)? What does it do with all the form
data being uploaded?
-Chuck

On Sep 8, 2006, at 8:34 AM, sophana wrote:
> Looking at the error messages, it seems that it is webware that doesn't
> accept the connection.
> I don't know why, and don't even know if I have the same problem.
> Are you sure that your second request is not blocked by the first
> one in webware?
How would I be able to determine whether the second request is being
blocked? If the simple answer is "because the second request doesn't
get served", then yeah, it's being blocked... but why, and is there a
way around this?
I admit I don't know a lot about threads/threading, but I was under
the impression that the AppServer (as config'd below) could handle
multiple, simultaneous requests.
Is there something completely obvious that I'm overlooking?
> Gary Perez wrote:
>> Hello all,
>>
>> [snip]
>>
>> I have two pages to be served *almost simultaneously* by WebKit: one
>> of them is a large-file upload form processor which--naturally--will
>> take a long time to return (a problem I will try to solve later), the
>> second is a status page that lists the contents of a directory which
>> self-refreshes once every three seconds.
>>
>> While the first is waiting for all the form-data to upload, the
>> second simply does not respond, and eventually throws an Apache
>> Internal Server Error (500).
>>
>> Apache error logs report ten of these:
>> [Wed Sep 06 17:38:36 2006] [error] Can not open socket connection to
>> WebKit AppServer
>> [Wed Sep 06 17:38:36 2006] [error] Couldn't connect to AppServer,
>> attempt 10 of 10, sleeping 1 second(s)
>> ... and finally:
>> [Wed Sep 06 17:38:37 2006] [error] timed out trying to connect to
>> appserver -- giving up.
>>
>> I have WebKit's AppServer running threaded (default
>> AppServer.config):
>> StartServerThreads = 10
>> MaxServerThreads = 20
>> MinServerThreads = 5
>>
>> Someone with greater knowledge of Apache stated: "The default mode
>> for Apache 2 is pre-forking, aka, 1.3 compatibility." ... which I
>> take to mean it forks child processes from the main control (root)
>> process - and this is not the same thing as threading... am I
>> mistaken in this assumption?
>>
>> Finally, is apache's inability to connect to the AppServer an
>> inherent limitation and/or problem with the AppServer, *or* is it
>> caused by running apache in pre-forking mode versus enabling threads
>> there? If so, do you think that running both in threaded mode will
>> resolve the problem?