The XForms 1.0 last-call draft requires that the binary data be transmitted
in, at best, base64, and that it be inside the XML instance.
One of my proposed applications of XForms requires the transmission of XML
instance data and large amounts of binary data gathered by an <upload>
control.
The binary data in my application, and I suspect in other voice, video, and
image applications as well, will be tens of megabytes, whereas the remainder
of the instance data will be small (a few thousand bytes at most).
Having the binary data embedded in the XML instance data makes the data
harder to process and validate, because of the storage requirements, and
restricts the range of XML processing packages I can use to implement my
application.
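To put a number on the storage overhead: base64 emits four characters for
every three input bytes, and hex emits two characters per byte. The following
is a minimal Python sketch of that arithmetic; the function names and the
30 MiB sample size are illustrative assumptions only.

```python
import base64


def base64_encoded_size(n_bytes: int) -> int:
    """Characters in the padded base64 encoding of n_bytes (no line breaks)."""
    return 4 * ((n_bytes + 2) // 3)


def hex_encoded_size(n_bytes: int) -> int:
    """Characters in the hex encoding of n_bytes."""
    return 2 * n_bytes


# Cross-check the base64 formula against the standard library on a small input.
sample = b"x" * 1000
assert len(base64.b64encode(sample)) == base64_encoded_size(len(sample))

payload = 30 * 1024 * 1024  # hypothetical 30 MiB scanned image
print(base64_encoded_size(payload))  # 41943040 characters (40 MiB)
print(hex_encoded_size(payload))     # 62914560 characters (60 MiB)
```

An XML processor that holds the whole instance in memory has to hold all of
those characters, in addition to the few thousand bytes of real form data.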
The XForms 1.0 last-call draft allows the legacy POST of multipart/form-data
as in HTML 4.01 and XHTML 1.1 forms with <input type="file"... >. Although
this meets the requirement to move binary data out of the XML instance, the
mechanism is deprecated in the XForms 1.0 last-call draft, and furthermore
does not provide XML instance data.
Therefore, I propose a mechanism for separating the large binary data from
the XML instance: allow the XForms model to specify, by type, that the
XForms Processor refer to the <upload> data by URI instead of embedding it
in base64 or hex encoding.
An XForms Processor may choose to provide this URI by generating it locally
and offering a service such as HTTP to serve up the content; it may choose
to use <submitInfo mediaType="multipart/related"> and send the content along
with the text/xml instance data according to RFCs 2387, 2111, and 2557
([1] [2] [3] [4] [5]); or it may choose some other method of generating the
URI.
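As an illustration of the multipart/related option only, the following
Python sketch assembles a two-part body in the style of RFC 2387, with the
text/xml instance as the root part and the binary content as a second part
addressed by a cid: URI (RFC 2111). The function name, the Content-ID
values, and the sample boundary are hypothetical, and the sketch assumes the
boundary string does not occur inside either payload.

```python
import uuid

CRLF = b"\r\n"


def build_multipart_related(instance_xml: bytes, attachment: bytes,
                            attachment_type: str, attachment_cid: str,
                            boundary: str = ""):
    """Return (Content-Type header value, body bytes) for a two-part message."""
    boundary = boundary or uuid.uuid4().hex
    root_cid = "instance@example.invalid"  # hypothetical Content-ID of the root
    delimiter = b"--" + boundary.encode("ascii")

    def part(headers, payload: bytes) -> bytes:
        head = CRLF.join(f"{k}: {v}".encode("ascii") for k, v in headers)
        return delimiter + CRLF + head + CRLF + CRLF + payload + CRLF

    body = (
        part([("Content-Type", "text/xml; charset=utf-8"),
              ("Content-ID", f"<{root_cid}>")], instance_xml)
        + part([("Content-Type", attachment_type),
                ("Content-Transfer-Encoding", "binary"),
                ("Content-ID", f"<{attachment_cid}>")], attachment)
        + delimiter + b"--" + CRLF  # closing delimiter
    )
    content_type = (f'multipart/related; boundary="{boundary}"; '
                    f'type="text/xml"; start="<{root_cid}>"')
    return content_type, body


# The instance refers to the second part by its cid: URI.
xml = b"<document>cid:scan1@example.invalid</document>"
ctype, body = build_multipart_related(
    xml, b"II*\x00", "image/tiff", "scan1@example.invalid", boundary="b1")
```

The binary part travels as raw bytes, so the instance itself stays a few
thousand bytes regardless of the size of the upload.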
This proposal does not encompass such methods; it proposes changes to the
XForms specification so that an interoperable implementation of large binary
data form submission can be developed, and is independent of any
implementation of generation or interpretation of the URI.
I note that the xsd:base64Binary and xsd:hexBinary restriction on upload is
not enforced in the XForms Schema, so no changes are necessary in the XForms
Schema.
===============================================================
Proposed Changes
1. Section 8.5 "upload"
I propose that section 8.5 be changed to say the following:
Data Binding Restrictions: This form control can only be bound to
datatypes xsd:base64Binary, xsd:hexBinary,
or xsd:anyURI, or types derived by restriction from these.
2. Section 4.4.4.1 "Binary Content"
I propose that section 4.4.4.1 be changed to add the following:
Instance data nodes with values of the type xsd:anyURI are allowed, but
the mapping of URI to binary content resource
is not specified here.
===============================================================
Examples
With Mikko Honkala's proposed modifications for additional binding to
mediaType and fileName[6], the current specification would define the
following XForms fragment
<upload ref="document">
<mimeType ref="@mimetype"/>
<fileName ref="@fileName"/>
</upload>
<input ref="budgetCenter"/>
<input ref="docket"/>
resulting in the following instance
<xforms:instance>
<document mimetype="image/tiff"
fileName="document.tif"
xsi:type="xsd:base64Binary">
SUkqAKt4AAD//JuUKP///////8gO4iOyORgZHcgLKiKEeRhEcMgG8gKHRE82j+XjbMBgjgtn
ICRQiMRORHA0DZIDa0YRdEcUwBqC1ICFJF0bjAG0NOU2oy4EspTSyjRG2XAmBslMq0YRoBNA
rlMDmR0YA55HIzctoEA5BrkcQvEdH0Iy2XzQZgCYBwXBkI4UvF0cQiIy1wNG0bCEcUjgTQht
...
... and so on for 38 more megabytes...
...
</document>
<budgetCenter>7787</budgetCenter>
<docket>D10927</docket>
</xforms:instance>
With my proposed changes 1 and 2, the following would be allowed:
<upload ref="document">
<mimeType ref="@mimetype"/>
<fileName ref="@fileName"/>
</upload>
<input ref="budgetCenter"/>
<input ref="docket"/>
This could result in the following instance, where the URI is determined by
the XForms processor:
<xforms:instance>
<document mimetype="image/tiff"
fileName="document.tif"
xsi:type="xsd:anyURI">http://myscanner.example.com/file?id=1a7f2642</document>
<budgetCenter>7787</budgetCenter>
<docket>D10927</docket>
</xforms:instance>
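On the receiving side, a consumer of the instance could branch on the
declared type of the upload node. The Python sketch below is one hedged way
to do that; the read_upload name and the <data> wrapper element are
hypothetical, and for brevity it compares only the local part of the
xsi:type value rather than resolving the xsd prefix to its namespace.

```python
import base64
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def read_upload(node):
    """Return ("uri", str) or ("bytes", bytes) depending on the node's xsi:type."""
    declared = node.get(XSI_TYPE, "")
    local_type = declared.rpartition(":")[2]  # simplification: ignore the prefix
    text = (node.text or "").strip()
    if local_type == "anyURI":
        return ("uri", text)  # content is fetched separately, by whatever means
    if local_type == "base64Binary":
        return ("bytes", base64.b64decode(text))
    if local_type == "hexBinary":
        return ("bytes", bytes.fromhex(text))
    raise ValueError(f"unsupported upload type: {declared!r}")


# Hypothetical wrapper element so that the sample is a well-formed document.
SAMPLE = """<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <document mimetype="image/tiff" fileName="document.tif"
            xsi:type="xsd:anyURI">http://myscanner.example.com/file?id=1a7f2642</document>
  <budgetCenter>7787</budgetCenter>
  <docket>D10927</docket>
</data>"""

root = ET.fromstring(SAMPLE)
kind, value = read_upload(root.find("document"))
print(kind, value)
```

With the URI form, the XML can be parsed and validated without ever loading
the binary content; how the URI is dereferenced is left to the application.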
=========================================================================
[1] Methods For Sending Out of Band Binary with XForms
http://lists.w3.org/Archives/Member/w3c-forms/2000OctDec/att-0067
[2] SOAP Messages with Attachments
http://static.userland.com/weblogsCom/gems/soapweblogscom/soapMessagesWithAttachments.html
[3] The MIME Multipart/Related Content-type
ftp://ftp.isi.edu/in-notes/rfc2387.txt
[4] Content-ID and Message-ID Uniform Resource Locators
ftp://ftp.isi.edu/in-notes/rfc2111.txt
[5] MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
ftp://ftp.isi.edu/in-notes/rfc2557.txt
[6] Files sent with <upload> do not include mime type or file name
http://lists.w3.org/Archives/Public/www-forms-editor/2002Feb/0029.html