Graphics libraries use file uploading to control submissions and generate thumbnails.

ISP-hosted storefronts use file uploading to send product images.

Web-based file uploading is a vastly superior alternative to other means of transferring files to a central server over Internet protocols. Let's examine why.

HTTP vs. FTP

FTP has been the standard mechanism for sending files to a server since the earliest days of TCP/IP. It is reliable, accounts for text vs. binary formats across platforms, and functions regardless of the operating system the client is using. However, compared to the flexibility of HTTP, it is deeply lacking. Let's compare:

Authentication
With FTP uploads, you must either manage many user accounts or allow anonymous access. With uploads via a web application, the application can determine who is allowed to upload, without a large administrative burden.

Security
Uploads via HTTP can be sent over SSL so that the information is encrypted during transmission. There is no means of encrypting using standard FTP.

Ease of configuration
FTP uploads require the administrator to fine-tune NTFS permissions (if you use NTFS). With HTTP-based uploads, access is determined by your application and, if desired, by the administrator as well.

Flexibility
Want to save DOC files in one location and graphics in another? With FTP, your users have to know that. With a web application, you can enforce these policies in your application and change them without disrupting your users.

Power
With a web application, you can limit the size of the uploaded file dynamically every time it is invoked. You could even change the size depending on information contained in the same form. Additionally, you can filter uploads that match certain criteria, such as wrong MIME type or file contents.

Simplicity and friendliness
A pleasing web page can offer instructions, advice and on-line help. This is not possible with batch-based FTP. More importantly, when errors occur you can provide immediate feedback to the user and offer corrective action.

Firewall support
Many organizations do not allow out-bound FTP for security and intellectual-property reasons. While this is simply a configuration issue, most firewalls do allow HTTP uploads.

Supplemental Information
An HTTP upload (using RFC1867) renders accessible additional information about the upload, such as the user's original filename. This can be very useful in intranet scenarios.

Upload to a database
Server-side components, such as SA-FileUp, allow you to upload to an OLE DB database. Try that with FTP!

Performance
Both FTP and HTTP ultimately use the TCP protocol, which is the primary determinant of transfer performance.

Reliability and Restart
Both FTP and HTTP 1.1 allow for transfer restart. Unfortunately, many servers, including IIS, do not support restart of either protocol at this time. FTP restart is apparently coming in IIS5.

In short, as with the web itself, it is the programmability of the server that gives HTTP uploads their vast advantage over FTP.
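To make the Flexibility and Power points above concrete, here is a minimal sketch, written in Python rather than ASP, of the kind of policy a web application could enforce on every upload. The directory names, the accepted extensions, the 2 MB limit and the function name `choose_destination` are all hypothetical and do not come from any particular product.

```python
import os

# Hypothetical policy: none of these names or limits come from a real product.
MAX_BYTES = 2 * 1024 * 1024          # refuse anything over 2 MB
DOCUMENT_DIR = "uploads/documents"   # where DOC files are stored
GRAPHICS_DIR = "uploads/graphics"    # where images are stored
ROUTES = {
    ".doc": DOCUMENT_DIR,
    ".gif": GRAPHICS_DIR,
    ".jpg": GRAPHICS_DIR,
}


def choose_destination(client_filename, size_in_bytes, max_bytes=MAX_BYTES):
    """Return the path to save to, or raise ValueError if the upload is refused."""
    if size_in_bytes > max_bytes:
        raise ValueError("upload exceeds %d bytes" % max_bytes)
    # The browser may send a full client path (C:\pics\x.gif), so keep
    # only the last component before using it on the server.
    base = client_filename.replace("\\", "/").rsplit("/", 1)[-1]
    extension = os.path.splitext(base)[1].lower()
    if extension not in ROUTES:
        raise ValueError("file type %r is not accepted" % extension)
    return ROUTES[extension] + "/" + base
```

Because the rules live in one table on the server, they can be changed without telling users anything, which is the point the FTP comparison makes.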

Forms of HTTP upload

There are three mechanisms of file upload via HTTP: RFC1867, PUT and WebDAV.

HTTP Upload Method 1: RFC1867

RFC1867 (http://info.internet.isi.edu/in-notes/rfc/files/rfc1867.txt) stayed as a proposed standard within the IETF for a while before it received the blessing of the W3C ultimately in HTML 3.2. It was first implemented by Netscape in Navigator 2.0, followed by Microsoft as an add-on to IE 3.02 (32-bit) and native in IE 3.03 (16-bit). It is a very simple yet powerful idea: define a new type of form field

<INPUT TYPE= "FILE">

and add a different encoding scheme (ENCTYPE="multipart/form-data") to the form itself, rather than the typical default encoding.

This encoding scheme is much more efficient at transferring large amounts of data than the default "application/x-www-form-urlencoded" form-encoding scheme. As you may be aware, URL encoding has a very limited character set. Anything outside of the character set must be replaced by '%nn', where nn is the two-digit hexadecimal equivalent. For example, even the common space character is replaced by '%20'. If the browser had to encode entire files using this inefficient scheme, the transmitted size of the uploaded file could be 2-3 times larger than the original file! Instead, RFC1867 uses Multipart MIME encoding, as commonly found in e-mail messages, to transfer large amounts of data with no encoding, and just a few simple but useful headers around the data.
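The size difference is easy to see in a small experiment. The following is a sketch in Python, not part of any upload component: it percent-encodes a sample binary "file" with the standard library's `quote_from_bytes` and compares the result with a hand-built multipart body. The helper `build_multipart`, the field name, the filename and the boundary string are assumptions made up for the illustration.

```python
from urllib.parse import quote_from_bytes


def build_multipart(field_name, filename, data, boundary):
    """Wrap raw file bytes in a single-part multipart/form-data body."""
    head = (
        "--%s\r\n"
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n" % (boundary, field_name, filename)
    ).encode("ascii")
    tail = ("\r\n--%s--\r\n" % boundary).encode("ascii")
    return head + data + tail  # the file bytes travel unmodified


# Sample "file": every possible byte value, repeated.
file_bytes = bytes(range(256)) * 40

url_encoded = quote_from_bytes(file_bytes, safe="")
multipart = build_multipart("file1", "sample.bin", file_bytes, "xxBOUNDARYxx")

print(quote_from_bytes(b" "))               # percent-encoding of a space: %20
print(len(url_encoded) / len(file_bytes))   # growth factor under URL encoding
print(len(multipart) - len(file_bytes))     # bytes of multipart overhead
```

For this sample the URL-encoded form is roughly two and a half times the size of the file, while the multipart body adds only a fixed number of header bytes regardless of how large the file is.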

The result looks like a regular HTML form post, but rather than being, say, 4 KB of form data, it can be megabytes long! RFC1867 also proposed a number of attributes of the TYPE="FILE" tag that have yet to be adopted by the browser vendors. These include:

ACCEPT: to let the web site restrict the type of file to be uploaded before receiving the file.

SIZE: to set the size of a single filename text box or to allow multiple files with a single <INPUT> tag.

MAXLENGTH: to potentially set, on the client side, the maximum size of the file to be transferred.

Wildcards and directory uploads: neither IE nor Navigator supports wildcarded names or directories even though this is suggested in the RFC.

Fortunately, both browser vendors implemented the suggested "Browse..." button so the user can easily pick the file to be uploaded using the native "Open File..." dialog box. The use of the VALUE clause is interesting. Normally, it is intuitive to let the web site preset values of form fields for user convenience. However in this case, it could allow a nefarious web site to preset the name of the file to be uploaded, and coupled with a client-side form submit, "steal" files off a user's PC without their consent. In the summer of 1997, the CERT, in conjunction with an employee at Bell Labs, issued a security warning about this and both Netscape and Microsoft quickly issued patches that prevent presetting the file to be uploaded (see: http://www.microsoft.com/ie/security/bell.htm)

This is unfortunate, since the original RFC1867 clearly specified "it is important that a user agent not send any file that the user has not explicitly asked to be sent." So rather than disabling presetting the name entirely, the browser vendors could have simply issued an alert dialog box such as: "Do you want to transmit files x, y, z to the server?". As a final twist to this, yet another security hole was found in IE 4.01 in mid-October that allows a web site to circumvent IE's current security mechanism (see http://www.microsoft.com/windows/ie/security/paste.htm).

HTTP Upload Method 2: HTTP PUT

HTTP 1.1 introduced a new HTTP verb: PUT. When a web server receives an HTTP PUT and an object name ("/myweb/image/x.gif"), it will authenticate the user, take the content of the HTTP stream and store it directly on the web server. Since this could wreak havoc on a web site, it is not used frequently. It also takes away HTTP's greatest advantage: programmability of the server. In the case of PUT, the web server handles the request itself: there is no room for a CGI or ASP application to step in. The only way for your application to capture a PUT is to operate at the low-level ISAPI filter level. Most web developers have no interest in this, with good reason.
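For comparison with a form post, this is roughly what a PUT looks like on the wire. The sketch below is Python that only assembles the request bytes and does not send them; the helper name `build_put_request`, the host name and the content type are illustrative assumptions, and the path is the example used above.

```python
def build_put_request(host, path, data, content_type="image/gif"):
    """Return the bytes of a minimal HTTP/1.1 PUT request for `path`."""
    head = (
        "PUT %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Type: %s\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (path, host, content_type, len(data))
    ).encode("ascii")
    # No form fields and no multipart wrapper: the body is the file itself,
    # and the request line alone names where the server should store it.
    return head + data


request = build_put_request("www.example.com", "/myweb/image/x.gif", b"GIF89a...")
print(request.decode("ascii"))
```

Note that nothing in the request identifies an application page to run, which is why the server, not your code, ends up handling it.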

HTTP Upload Method 3: WebDAV

Think of WebDAV as a non-proprietary Configuration Management (e.g. SourceSafe) plus file transfer for the web. Microsoft has publicly announced that WebDAV will be supported in IIS5, Office 2000 and future versions of IE. ISPs will love it as a replacement for the low-level, often broken, mechanics of FrontPage server extensions. Note that it will not replace the FrontPage server extensions: it will simply offer low-level standard services to support the more sophisticated functions that the server extensions currently perform. It is via WebDAV that Office 2000 can do those nifty "Save to web" functions you may have seen at the October '98 PDC.

Sounds great, right? Well, if all you are interested in is uploading content, WebDAV is great. It solves many problems. However, if you need file uploading within your web application, WebDAV will do nothing for you. Like HTTP PUT, the WebDAV verbs are interpreted by the server, not your web application. You need to work at the ISAPI filter level to access the WebDAV verbs and interpret the content in your application.

HTTP Upload Mechanisms: Conclusion

RFC1867 still remains the most flexible means of uploading files to your web application. PUT has very limited use. WebDAV is great for content authors, such as FrontPage users, but will be of little use to web developers who want to add file upload to their web application.

ASP Implementation

So we've concluded that RFC1867 is the best way to add file upload capabilities to your web application. How is it actually implemented? What tools does Microsoft supply? What other tools are available?

Microsoft's Posting Acceptor

ASP does not understand the "multipart/form-data" encoding scheme. Instead, Microsoft provides the Posting Acceptor for free (http://www.microsoft.com/iis/support/iishelp/iis/htm/core/pareadme.htm). The Posting Acceptor is an ISAPI application that accepts the upload and then issues a REPOST to an ASP page after the upload is complete. (See also Scott Stanfield's article in the July '98 issue of MIND.)

SA-FileUp from Software Artisans

SA-FileUp (http://www.softartisans.com/softartisans/saf.html) was one of the first commercial Active Server Components. Version 1 shipped in May '97 and is currently in use on thousands of sites worldwide including microsoft.com. Early betas used a combination of ISAPI filter and Active Server component for integration with ASP. Microsoft then delivered ASP 1.0b (ASP.DLL 1.15.14.0) that provided a new method: Request.BinaryRead. The BinaryRead method made available the raw, unprocessed data from the browser to an Active Server component. Once that was available, SA-FileUp dropped the need for the ISAPI filter and now exists purely as an ASP component.

SA-FileUp takes advantage of Request.BinaryRead, which is mutually exclusive with Request.Form. This makes sense: how could you read the raw stream of data from the browser and concurrently parse it as if it were form information? To make life easier for the ASP developer, SA-FileUp reimplements all of the Request.Form functionality in its own .Form collection. This makes using SA-FileUp familiar to ASP coders who are used to using Request.Form.
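To show what rebuilding a form collection from the raw stream involves, here is a simplified sketch in Python. It is not how SA-FileUp is implemented: the function `parse_multipart`, the sample request body and the boundary value are all invented for the illustration, and a production parser would need to handle more header variations than this one does.

```python
import re


def parse_multipart(body, boundary):
    """Split a multipart/form-data body into ({field: text}, {field: (filename, bytes)})."""
    fields, files = {}, {}
    delimiter = b"\r\n--" + boundary.encode("ascii")
    # Prepend CRLF so the first delimiter matches like all the others.
    for part in (b"\r\n" + body).split(delimiter)[1:]:
        if part.startswith(b"--"):
            break  # "--boundary--" marks the end of the body
        headers, _, content = part[2:].partition(b"\r\n\r\n")
        name = re.search(rb'\bname="([^"]*)"', headers)
        if name is None:
            continue  # not a form-data part this sketch understands
        key = name.group(1).decode("ascii")
        filename = re.search(rb'\bfilename="([^"]*)"', headers)
        if filename:
            files[key] = (filename.group(1).decode("ascii"), content)
        else:
            fields[key] = content.decode("ascii")
    return fields, files


# Stand-in for the raw bytes the server would hand over to the component.
raw_post = (
    b"--AaB03x\r\n"
    b'Content-Disposition: form-data; name="description"\r\n'
    b"\r\n"
    b"Company logo\r\n"
    b"--AaB03x\r\n"
    b'Content-Disposition: form-data; name="file1"; filename="C:\\pics\\logo.gif"\r\n'
    b"Content-Type: image/gif\r\n"
    b"\r\n"
    b"GIF89a\x00\x01\r\n\xff"
    b"\r\n--AaB03x--\r\n"
)

form, uploads = parse_multipart(raw_post, "AaB03x")
print(form["description"])     # Company logo
print(uploads["file1"][0])     # C:\pics\logo.gif
```

Ordinary text fields and file fields arrive in the same stream, so whoever reads the raw bytes must also hand back the text fields; that is the role the .Form collection plays.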

Comparison of Posting Acceptor and SA-FileUp

Here is an objective comparison between PA and SA-FileUp:

ASP Integration
SA-FileUp is fully scriptable by Active Server Pages. Rather than existing as a separate ISAPI DLL, SA-FileUp integrates very smoothly with your ASP application.

Standards support
Uploads to PA from IE browsers use the proprietary WebPost API, rather than the standard RFC 1867, so by default you need different forms for Netscape and IE users.

Anonymous Connections
Since PA uses an ISAPI DLL, it must provide additional security protection outside of your ASP application. For this reason, PA disallows all anonymous connections by default. PA 1.1 can allow anonymous uploads, but since there is no programmatic control of the upload, there is a considerable security risk here. Since SA-FileUp is integrated with ASP, your application can decide the appropriate level of security, including anonymous.

Control of the Upload
PA does not allow any control of the upload as it is being sent. With SA-FileUp, you can limit the size of the upload, or decide at run-time to flush the upload. Best of all, you can change the location of the upload dynamically.

Processing
PA has a two-step upload-and-repost process. With SA-FileUp, everything can be accomplished in a single step, such as writing to a database depending on the status of the upload.

Uploading to a Database
PA can only upload to files. SA-FileUp can upload to files as well as databases.

"Spaces in filenames"
PA has a known issue when processing filenames that contain spaces. SA-FileUp has no such restriction.

Price
PA is bundled with NT Option Pack and free for download from Microsoft. SA-FileUp is not free: it is a supported commercial component.

Common Support Issues

By far, the most common support issues for file upload are security related. Typically, a site has locked down NTFS permissions too tightly, which prevents the anonymous user account from writing to the destination file location. Also, security is often misunderstood by even advanced server administrators.

Remember that IIS/ASP executes each ASP page in a specific security context. If no authentication mechanism is in place (no Basic, no NT Challenge/Response), each page is executed as the anonymous user. The NT account that corresponds to the anonymous user can be set by the web admin.

For IIS3, the default anonymous user is IUSR_<computername>.

For IIS4, the default anonymous user is IUSR_<computername> for all in-process web applications ("Run in a separate memory space" is not checked). The default anonymous user is IWAM_<computername> for all out of process web applications ("Run in a separate memory space" is checked).

When using SA-FileUp, you must ensure that the destination directory has Read, Write and Delete permissions by the appropriate user.

If authentication is in force, then IIS/ASP will impersonate the authenticated user during the execution of the ASP page. So, the authenticated user's NT login account must have Read, Write and Delete permissions to the destination directory. A complete discussion of IIS security is beyond the scope of this article. Please see the IIS 4 Resource Kit for a very good explanation.
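When diagnosing these permission problems, it can help to test the destination directory directly from the account the page runs as. The following is a generic sketch in Python, not an IIS or SA-FileUp feature: the function name `can_write_and_delete` is hypothetical, and it only reports on whichever account the script itself is running under.

```python
import os
import tempfile


def can_write_and_delete(directory):
    """Return True if the current account can create and remove a file in `directory`."""
    try:
        handle, probe_path = tempfile.mkstemp(dir=directory)
    except OSError:
        return False  # cannot create: missing directory or no Write permission
    os.close(handle)
    try:
        os.remove(probe_path)
    except OSError:
        return False  # created but cannot remove: no Delete permission
    return True
```

A False result from the account in question points at directory permissions rather than at the upload code.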

A Single File Upload

So enough theory, let's see what the ASP code looks like. Here is a simple HTML form that will upload a single file:

Any content after the 1000th byte will be discarded, so the web server's disks are not unnecessarily filled.
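The effect of such a limit can be sketched as follows. This is Python, not the ASP code for SA-FileUp: the function `save_capped`, its parameters and the in-memory streams are assumptions used only to illustrate keeping the first 1000 bytes and discarding the remainder.

```python
import io


def save_capped(source, destination, max_bytes=1000, chunk_size=256):
    """Copy at most `max_bytes` from `source` to `destination`; return (saved, discarded)."""
    saved = discarded = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        # Keep only as much of this chunk as still fits under the limit.
        keep = chunk[: max(0, max_bytes - saved)]
        destination.write(keep)
        saved += len(keep)
        discarded += len(chunk) - len(keep)
    return saved, discarded


upload = io.BytesIO(b"x" * 2500)     # stand-in for the incoming request stream
stored = io.BytesIO()                # stand-in for the file on the server's disk
print(save_capped(upload, stored))   # (1000, 1500)
```

The rest of the stream is still read so the request completes normally, but nothing beyond the limit ever reaches the disk.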

Conclusion

Uploading files to your web application is simple: it can be accomplished in as little as two lines of ASP code. HTTP/RFC1867 file upload is the preferred mechanism because of the rich programming environment offered by the server. SA-FileUp, as an Active Server component integrated with ASP, offers significant advantages over the free Posting Acceptor from Microsoft.