Status

()

For bugs in Firefox Desktop, the Mozilla Foundation's web browser. For Firefox user interface issues in menus, developer tools, bookmarks, location bar, and preferences. Many Firefox bugs will either be filed here or in the Core product. (more info)

For issues dealing with helper applications, and guessing Content Types when they aren't specified/known (ftp:, file:, jar:, but generally not http:). This component does not cover: backend networking issues, such as those covered by Networking: FTP or Networking: File, nor does it cover the Download Manager which has its own component in the Toolkit product.

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
I just paid my credit card bill and wanted to save the receipt. So, I saved a print view page in a file, but when I viewed its content, I saw unusable garbage that wasn't related to what I saw in the browser. Source view of the page I wanted to save reflected what I saw, but that couldn't be saved in a file, for some reason. This is a data lost bug.
Reproducible: Always

I think you'd need to have a Firefox developer with a credit card account at Chase. Until one becomes available, perhaps, somebody should address existing bugs related Firefox re-requesting data from a server upon saving pages. Maybe when they are fixed, this bug will be eliminated as well. Core functionality such as this is much more important than the "awesome bar" and other things, in my opinion.
Having considered what I said above, let me know if I can provide more information.

Save as HTML or complete ?
That you see the page in the browser doesn't mean that the original is available if the page should not be cached. The images for example are only in the memory as decoded image but the original file is gone, the html is in the memory as parsed html but the original html is gone....
And discussing which bugs should be fixed and which not doesn't belong into this bug and we always accept patches from you.
Searching before filing your bug is a must and you should already found bug 84106, bug 288462.

Save as HTML.
What needs to be saved is whatever reflects what's being viewed in the browser window. If it's generated HTML, so be it. View source shows the correct data that should be saved... at least, in my case with Chase.
I will mark this bug as duplicate of bug 288462. The first one was filed before Firefox existed and is assigned to a person who is said not to be reading bug reports.

I seriously doubt this specific bug is a duplicate of the general mess that bug 288462 is.
That said, I can't make sense of what this bug is about. "print view page"? "source view"?
If you can describe the exact steps you took which led to the problem, that might get us closer to actually fixing it. I don't need to be able to repeat the steps on the site, but I _do_ need to have those steps to see which codepath you were following.

Bug 288462 is meta bug for tracking purpose. Why this bug can be a DUP of meta bug? Reopening, and set 288462 in Blocks: of this bug. Please find appropriate bug to DUP if DUP, after checking dependency tree for meta Bug 288462.

I will try to describe what happened with more detail. Hopefully, with this attention, the problem will get solved.
I went through a bill payment process that consisted of a two or three steps (different pages), and then I got to a transaction confirmation page (the final page that displayed the receipt, so to speak). It had a button/link to print the receipt. I clicked it and got a new page that was more suitable (content-wise) for printing. This page also launched a dialog to print it. I canceled the dialog and wanted to save the page. Because I was aware of the data refetching problem in Firefox, I opened the source view window (Ctrl+u) and then switched back the web page window and saved it (Ctrl+s to save HTML only). Then I looked at the content of the resulting file and saw that wasn't nearly what I saw in the browser. Not even similar. As Boris has said in one of bug reports, useless. However, the source view window contained exactly (as far as I could tell) the code that reflected what I saw in the page window -- the receipt. What I tried to say in this bug report is that the source view window could display the proper HTML code, etc., but the browser couldn't save the same thing in the file. Apparently, it had to refetch it from the server.
In 2 or 3 weeks I will be paying another bill and could watch for certain things if you tell me what to watch for. I could even let one of you do a remote admin session if this could help.

Roman, thank you very much for the clear description of the problem. Do you have any idea whether the page you're trying to save is a result of a POST request or a GET request? This could be bug 84106 if the former.
If you don't know, would you be willing to get an HTTP log following the directions at http://www.mozilla.org/projects/netlib/http/http-debugging.html the next time you pay the bill? So start the browser, pay the bill, quit the browser, and then e-mail me the log? (Probably better to not attach it to the bug, since it might have some sort of personal information in it.)

Boris, if I had to guess, I'd say it was a POST request.
I am certainly willing to get an HTTP log the next time I pay my bill. Will it be useful only to determine the type of request or will it provide more potentially useful data?
That page seems to say that having a nightly trunk build of Mozilla is a precondition for getting a log. Is that correct or can I produce a log with a release version of FF 3.0.5 too?

Roman, the HTTP log will provide various other data too (e.g. the headers the server sends, whether we hit the cache, etc).
You can produce a log with a release version. The "nightly trunk build" thing was true as of when the page was first written, right after the logging code landed. That was in 2003 or so. ;)
Thanks again for helping get to the bottom of this!

Roman sent me a log. Looks like the relevant parts for purposes of this bug are:
0[932140]: POST /EPay/EpayMakePayment.aspx?AI=134902742 HTTP/1.1
...
0[932140]: POST /EPay/epaymakepaymentverify.aspx HTTP/1.1
...
0[932140]: POST /EPay/epayconfirmpayment.aspx HTTP/1.1
and then later on in the same log:
0[932140]: GET /EPay/epayconfirmpayment.aspx HTTP/1.1
The server response to this last GET is 660 bytes long, which is exactly the size of the incorrect data saved on disk.
The server response to the POST to the same url is 19583 bytes long...
So this definitely looks like bug 84106, at least assuming the log includes an attempt to save the confirmation page.

Bug 84106 is about "Not correctly retrieving post data...", but the problem here and in other reports seems to stem from the fact the Firefox contacts the server to save a file representation of which a user is looking at. It should just save that representation without contacting the server.

Yes, but the point is that it doesn't have the original text in memory necessarily, and getting it from cache correctly requires having the right cache token and postdata.
In any case, I'll see what I can do about getting bug 84106 fixed, and then we should see whether this bug ends up fixed as a result.

Please keep in mind that the page source view shows the correct code. So, one part of FF has the right info, but another doesn't? That source code is what has to be saved in my situation. But sure, fix whatever you think needs to be fixed. As long as things are improving beyond superficial things such as adding a star to bookmark pages, things are good :)

OK, so I should have been a little more careful in comment 13. The key issue there seems to be that there's a GET at all. Bug 84106 would have caused a re-POST of the data; instead we have no POST data at all. The latest log Roman put together for me has several (4) GETs for epayconfirmpayment.aspx after the single POST, and all are responded to with that 660 byte response as far as I can tell. I have no idea where those GETs are coming from, but I assume the final save-as is reading one of those GETs from cache for whatever reason.

Boris, do you remember this from my earlier comment: "...I got to a transaction confirmation page (the final page that displayed the receipt, so to speak). It had a button/link to print the receipt. I clicked it and got a new page that was more suitable (content-wise) for printing..."?
Getting the page with printable content possibly made a GET request. I also viewed the source of the printable page, which might have made a GET request (is it possible?). And then came the Save As operation.

Oh, hmm... Indeed. The view-source shouldn't have been the thing that did that GET, since if it did it would have shown the error page.
But more interestingly, the view-source you did is not showing up in this log, as far as I can tell.
My best guess is that what happens is that the printable version is actually created via document.write(). That would show the written data in view-source, but trying to save would get something totally different, since save would use the URI of the page that made the write() call. This is the only scenario I can think of at this point consistent with what we're seeing.
If you're interested, I could create a special build that would log all data being passed to document.write() to standard output so that one could see it when running "firefox -console". Alternately, I could create a build that would alert and bail out when trying to save a page generated via document.write.
Let me know if either would be useful?

Boris, your special build saved what I wanted. There's one strange thing that I noticed, though: the HTML document specified charset=UTF-8, but the saved file is in "UTF-16LE with Signature" (according to EmEditor 7.00.3), and Notepad displayed empty vertical rectangles throughout the file that are usually displayed when unrecognized (or undisplayable?) characters are encountered.

Yeah, that's because the document.write document is created in UTF-16 directly; that's what all JS works with. Sounds like Notepad just ignores the BOM and ends up trying to render U+0000 for the zero bytes in UTF-16. It's really not much of a text editor... ;)
OK, sounds like we understand what the issue is here. Not really at all related to bug 288462.
Paolo, would you be interested in picking this up? I think the necessary changes here are one of:
1a) Change nsIWebBrowserPersist to take an originating principal as an argument
to saveURI and set it as the owner on the channel.
or
1b) Allow wyciwyg:// channels without an owner.
2) Something like the patch above in terms of detecting wyciwyg documents and
using that URI as the URI to save. We don't want to always use the
docshell's URI, because for error pages that would leave to weird behavior,
I think...
jst, do you have a preference between (1a) and (1b) above?

(In reply to comment #26)
> Paolo, would you be interested in picking this up? I think the necessary
> changes here are one of:
>
> 1a) Change nsIWebBrowserPersist to take an originating principal as an
> argument
> to saveURI and set it as the owner on the channel.
> or
> 1b) Allow wyciwyg:// channels without an owner.
>
> 2) Something like the patch above in terms of detecting wyciwyg documents and
> using that URI as the URI to save. We don't want to always use the
> docshell's URI, because for error pages that would leave to weird behavior,
> I think...
>
> jst, do you have a preference between (1a) and (1b) above?
It seems this is another case where the channel to be saved is set up
incorrectly; solution (1a) is probably the most correct.
I had a quick look at the docshell code, and if I understand it correctly
the code sets the right owner on wyciwyg:// channels, since the URIs have
the flag nsIProtocolHandler::URI_INHERITS_SECURITY_CONTEXT:
http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#7586
Probably the save code should do the same thing. But I'm wondering, since we
already have code that does everything that is needed for the channel setup,
why don't we find a way to reuse it and simply pass the prepared channel to
the saveChannel method of nsIWebBrowserPersist?
I filed bug 486921 to track this possibility, which is also an interesting
way to address issues like finding out the right cache key for subframes
(bug 485196), which is not trivial, and so on (the complete list of dependent
bugs is available in bug 486921).
What do you think about this approach? To me, it seems feasible, I'd only need
some help with understanding which operations performed by the docshell are
only required if the page is actually loaded in the browser window, rather
than saved separately (see last sentence of bug 486921 comment 0). Another
aspect would be designing an interface that could be used from JavaScript.

(In reply to comment #26)
> We don't want to always use the docshell's URI, because for error pages
> that would leave to weird behavior, I think...
This is an interesting, separate issue. Currently, when the user tries to
save error pages, they have the option of saving them as "complete XHTML"
or as "file only". If the page is saved as complete, then the contents of
the error page itself are stored on disk; if the page is saved as file
only, the browser tries to fetch the page again from the server. I don't
know what happens with your patch applied, when the page is saved as file
only.
Maybe, the "Save As" behavior when an error page is displayed could be
similar to the one triggered by a reload, with the difference that
errors would be conveyed through the Download Manager. Since we don't
have a "real" document at hand, we may also want to disallow saving an
error page as complete.
Another possibility (but I don't like it very much) is to disable the
"Save As" command completely for error pages, however we should still
handle the case where an error page is currently displayed in a subframe.
This change would probably require a separate UI review.

> I had a quick look at the docshell code, and if I understand it correctly
> the code sets the right owner on wyciwyg:// channels, since the URIs have
> the flag nsIProtocolHandler::URI_INHERITS_SECURITY_CONTEXT:
Yep.
However, the owner really is needed only when parsing into a document, so it doesn't necessarily make sense to assert on lack of owner in cases like save as, like we do now.
Reusing the docshell setup code from webbrowserpersist is not going to be very easy, nor necessarily desirable. For example, they want very different load flag setups....
> Currently, when the user tries to save error pages, they have the option of
> saving them as "complete XHTML" or as "file only".
Yeah. Honestly, "save" should just be disabled on error pages in the UI, I would think, as you describe in your last paragraph of comment 28. We should spin that off into a separate bug, as you say.

(In reply to comment #29)
> > I had a quick look at the docshell code, and if I understand it correctly
> > the code sets the right owner on wyciwyg:// channels, since the URIs have
> > the flag nsIProtocolHandler::URI_INHERITS_SECURITY_CONTEXT:
>
> Yep.
>
> However, the owner really is needed only when parsing into a document, so it
> doesn't necessarily make sense to assert on lack of owner in cases like save
> as, like we do now.
I didn't realize that. Are you suggesting to remove the owner check from
nsWyciwygChannel.cpp entirely?
> Reusing the docshell setup code from webbrowserpersist is not going to be very
> easy, nor necessarily desirable. For example, they want very different load
> flag setups....
Actually, I was thinking about replacing the saveURI call with saveChannel
in JavaScript, without modifying the nsIWebBrowserPersist interface or code.
In order to call saveChannel, the channel itself must be prepared in a way
that is similar to what is done by the docshell. My idea of reusing the code
is based on the consideration that most of the save-related bugs I saw are
caused by differences in the channel setup process.
I agree, however, that most of the docshell code mentioned in bug 486921
is unrelated to the actual channel setup. We may just reconstruct the logic
and reimplement it in JavaScript, to prepare the channel for saveChannel.
In both cases, I think a review of the docshell history loading process,
noting the differences with "Save As", may be beneficial. Boris, maybe
you're more indicated than me do this comparison; every time I look at
that code, I notice a new line that requires me to do a three- or four-level
deep MXRing in order to have a vague idea of what it does :-)
For example, what about channels supporting offline application caches?
"Save As" does not set them up in the same way the docshell does.
> > Currently, when the user tries to save error pages, they have the option of
> > saving them as "complete XHTML" or as "file only".
>
> Yeah. Honestly, "save" should just be disabled on error pages in the UI, I
> would think, as you describe in your last paragraph of comment 28. We should
> spin that off into a separate bug, as you say.
Filed bug 488339, with "uiwanted" keyword; you may want to add your
considerations there.

If we use nsIWebNavigation.currentURI.spec instead of document.location.href
for saving the original file, we must make sure not to use it to determine
the file name to use on save. Probably internalSave would need both URIs.
I'd prefer to write some regression tests (bug 486342) before going on with
fixing this. I'd have to write a specific test for the patch to be checked
in, anyway.

> Are you suggesting to remove the owner check from nsWyciwygChannel.cpp entirely?
Basically, yes. Maybe move it to somewhere in nsHTMLDocument or some such if we really care.
> Actually, I was thinking about replacing the saveURI call with saveChannel
> in JavaScript, without modifying the nsIWebBrowserPersist interface or code.
Ah, I see. That might be interesting, yes.
I have no idea what the right offline app cache behavior is here, to be honest. dcamp's the man for that.
Getting some tests for this first makes a lot of sense to me, yes.

(In reply to comment #32)
> Getting some tests for this first makes a lot of sense to me, yes.
I started writing some basic infrastructure for "Save As" testing on bug 486342,
maybe you want to have a look.