If you’ve been working with Extensible Markup Language (XML) lately, you’ve probably realized what a powerful language it can be. You can do all kinds of interesting perambulations to change your XML into HTML, filter and sort it, and, if you’re running Internet Explorer 5.0, use the XML to drive databases, create and populate HTML components, and even display it in relatively plain clothes with CSS. However, most Internet transactions aren’t just one way anymore. If you are able to send XML to the browser, you should reliably be able to send XML back to the server. The return trip, however, isn’t anywhere near as obvious as the paths from the server to the client. Fortunately, with a little understanding of the basics of HTTP, you can find your way back. In this article, I’m assuming that you have at least Microsoft Internet Information Server (IIS) 4.0 on your system, and that you’re running Internet Explorer 5 as your client. You should also have a basic understanding of the way that ASP works.
The first time I work with a new component, I set aside a few days to try out its APIs and see what breaks and what works in unexpected ways. When I began playing with Microsoft’s IE5 XML parser, one of the features that I noticed was the .save method. Reasoning that this was a way of saving XML to a file, I tried the offhand possibility that I could also pass a URL as an argument and save back to the server:
It was a valiant try, but even with server permissions set to allow writing, the save function didn’t work. I was a little disappointed; the interface would have been much cleaner and more consistent if you had that capability, since you’re already giving the client permission to write to the server. Still, this technology is new.
The second route back down the mountain involves using the POST ability of a form. This does work, although it’s more than a little inconvenient. You post a hidden form field to an ASP page on the server:
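A minimal sketch of the client side of such a form post follows. The form name, the hidden field name xmlData, and the target page receiveXML.asp are hypothetical choices made for illustration, not names taken from the article.

```javascript
// Copy serialized XML into a hidden form field and submit the form.
// Assumes markup along these lines (all names are hypothetical):
//   <form name="xmlForm" method="post" action="receiveXML.asp">
//     <input type="hidden" name="xmlData">
//   </form>
function postXmlThroughForm(form, xmlText) {
    form.elements["xmlData"].value = xmlText; // the hidden field carries the XML
    form.submit();                            // the browser then loads the ASP page's response
}

// In IE5 this might be called as:
//   postXmlThroughForm(document.forms["xmlForm"], xmlDoc.xml);
```

On the server, the ASP page would presumably read the posted field from the Request.Form collection and load it into its own XML parser.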
The biggest drawback to this technique is that the posting forces a refresh of the page based upon the ASP. In some cases, this effect may be desirable (for example, if the XML being posted modifies something on the server and necessitates a change to a new page). But to make a client-based application, it would be preferable not to have to recreate the page every time the client and the server needed to talk. Ideally, the process can take place in the background, perhaps even invisibly to the user, which is how processes such as chat and collaborative messaging work.
Hitting the Trail With XMLHTTP
One of the most exciting (and generally least promoted) aspects of the MSXML component is that you can directly communicate with the server through the XML Document Object Model. In fact, it turns out that Microsoft’s implementation of the latest HTTP standard (HTTP/1.1) actually makes use of XML to transmit general information back and forth between client and server (something I’ll cover in more detail later in this article). The agent for handling communication between client and server is the XMLHTTP request object (see Table 1). While it is part of MSXML.DLL, the request object’s services are implemented through a separate interface, “microsoft.xmlhttp”, rather than the normal XML parsing class of “microsoft.xmldom”.
The XMLHTTP object lets you create a mini-client, which can make requests to the server, and receive responses back from the server. The simplest use of the component is to get a file from a server: any file. For that matter, if the file being queried is an ASP page or CGI program that is continuously changing, the XMLHTTP object would let you repeatedly query the page transparently, making such things as a stock ticker, server status indicator, or any other stream-related element possible with DHTML, rather than having to rely on ActiveX controls or Java applets.
For example, one of the most basic advertising vehicles on the Internet is the ad rotator. This rotator displays a banner advertisement that changes every time a document is refreshed. In almost every case, this ad is a GIF, which provides the benefits of animation with a single downloaded image. Suppose, however, that instead of just pulling in a GIF, you could retrieve any combination of DHTML text and graphics (multiple links, games, quizzes, product demonstrations): anything that could be accomplished in normal HTML. Furthermore, suppose that this information could rotate automatically every 30 seconds or so, without forcing the entire page to refresh. Finally, imagine ads that could send information back to the server indicating when activity took place on the ad (for example, timers that would start whenever a potential customer moved over the ad with the mouse and report the elapsed time to the server when the customer moved off).
Okay, so maybe this is beginning to sound like a Ronco ad (“But Wait, There’s More!!”). However, this is all fairly easy to accomplish with the XMLHTTP request object. This script, for instance, will make a query against an ASP page called messages.asp, and then replace the contents of a DIV with the resulting downloaded text. Then, 15 seconds later it will call the function again.
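A script along those lines might be sketched as follows. The DIV id messageArea and the function names are assumptions made for illustration. The fetch is split into its own function so that it can be exercised without a browser.

```javascript
// Fetch a URL synchronously and return the response text.
// `http` is an XMLHTTP request object; in IE5 it would come from
//   new ActiveXObject("Microsoft.XMLHTTP")
function fetchText(http, url) {
    http.open("GET", url, false); // false = wait for the response
    http.send();
    return http.responseText;
}

// Replace the contents of a DIV with the latest message, then schedule
// the next refresh. The DIV id "messageArea" is hypothetical.
function showMessage() {
    var http = new ActiveXObject("Microsoft.XMLHTTP");
    document.all("messageArea").innerHTML = fetchText(http, "messages.asp");
    window.setTimeout("showMessage()", 15000); // call again in 15 seconds
}
```

Calling showMessage() once, for instance from the page's onload handler, would start the cycle.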
The .open method takes a number of arguments, although most of them are optional.
The first argument contains the HTTP protocol that you’re calling. The “GET” protocol is the simplest, and is precisely the same GET that a form uses for sending basic form data. It retrieves data from a file (or a stream, in the case of ASP, Java servlets, or CGI), as well as a header that provides some information about the file. I’ll be returning to this parameter in a bit; with the right server you can do considerably more than just retrieve a file. Note that GET is also the only protocol that is supported on all servers (it was the only command in the HTTP/0.9 protocol, the oldest standard still in regular use on the Internet).
With the current state of the art, the data that you retrieve probably won’t just be a static XML file. More likely, you’ll be working with a stream of data that’s generated dynamically through ASP, Java Server Pages (JSP), Perl, or some other source. For instance, take a look at my sample page of dynamic ad banners at http://18.104.22.168/CC/journeyHome/test.htm. Here, an ASP page called XMLSource.asp retains state information on the server about the last block of HTML sent (see Listing 1). The HTML itself is embedded within an XML document called banners.xml (see Listing 2). You should note that the HTML data contained therein is what’s called “well-formed”: the HTML is itself an XML document, with closing tags for all elements, even those that in typical HTML don’t have them (such as IMG tags).
The second argument contains the URL that you’re targeting, which would correspond to a form’s action attribute. If you are retrieving an ASP page or the results of a CGI query, you can also pass name-value pair arguments just as you would with a standard URL. For example, you could set up the Pithymessage.asp file so that it could take an explicit filename. The .open statement would then look something like:
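As a rough sketch, the URL with its name-value pair could be assembled like this. The parameter name filename and the sample file name are hypothetical, and the ASP page would have to be written to read that parameter from its query string.

```javascript
// Build a URL with one name-value pair appended as a query string.
// The parameter name "filename" is a hypothetical choice.
function messageUrl(page, filename) {
    // escape() encodes spaces and other unsafe characters in the value
    return page + "?filename=" + escape(filename);
}

// The .open call would then look something like:
//   http.open("GET", messageUrl("Pithymessage.asp", "quotes.xml"), false);
```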
Be aware that the onreadystatechange function may fire as many as four different times, depending upon the state of the download when the function is fired. You should always check that the .readyState has reached 4 (the “completed” state) prior to processing.
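A minimal sketch of an asynchronous request that guards against the intermediate firings is shown below. It relies on the XMLHTTP object reporting the completed state as the number 4. The function name and the callback parameter are assumptions made for illustration.

```javascript
// Start an asynchronous GET and call onDone(text) once the response
// has fully arrived. `http` is an XMLHTTP request object.
function fetchAsync(http, url, onDone) {
    http.open("GET", url, true); // true = asynchronous
    http.onreadystatechange = function () {
        // The handler fires for intermediate states as well;
        // only act on the final one (4 = completed).
        if (http.readyState == 4) {
            onDone(http.responseText);
        }
    };
    http.send();
}
```

A caller might pass a function that writes the text into a DIV, so the page stays responsive while the download is in progress.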
If you start an HTTP request and then decide to terminate it (something analogous to pressing the STOP button on a browser), you can call http.abort(). This method gives you a way of getting out of potential hanging situations with as minimal an impact on the program as possible.
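One way to wire this up, sketched under the assumption that readyState values 1 through 3 mean a request is still in flight, is a small helper that aborts only when there is something to abort. The helper name and the Stop button are hypothetical.

```javascript
// Abort the request if it has been opened but has not completed yet.
// Returns true when an abort was issued.
function abortIfPending(http) {
    if (http.readyState > 0 && http.readyState < 4) { // 4 = completed
        http.abort();
        return true;
    }
    return false;
}

// e.g. wired to a hypothetical Stop button:
//   <button onclick="abortIfPending(http)">Stop</button>
```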
Headers in the Right Direction
Whenever you send an HTML file, the browser sends some extra information that you’re probably not aware of. This information is transmitted in name/value pairs to the server to inform it about the nature of the data being sent to it. These sets (or headers, as they’re known in Internet circles) can specify file types, data length, expiration policies, or other relevant information, with each protocol supporting specific headers.
Of all the headers, the two that are most important are Content-Type and Content-Length. Content-Type gives the MIME type of the document being transmitted (for example, text/html for an HTML document, text/xml for an XML document, image/jpeg for a JPEG document, and so forth). In some cases (such as the GET protocol), the header is used by the server to determine how the output gets sent back. In this case, it’s actually a little superfluous, since the default case for the Content-Type value is text/html.
The Content-Length header is not used with the GET protocol, but it is used with most of the others. It returns the length of the body of the message being sent, not counting header information. It is much more useful with the POST protocol, in which you are explicitly sending a stream of information. Content-Length can also be used with the headers of the response document (covered later in this article), provided that their source was a file on the server. ASP pages, for example, can have variable lengths of output depending upon their processing instructions. As a consequence, they do not send back a Content-Length header.
The purpose of the open function is to identify the target URL to send an HTTP request, but it doesn’t actually send any information. You send an HTTP request using, not surprisingly, the .send() method. Send opens a Windows socket to the target and sends header information, possibly followed by a body (depending upon the HTTP protocol used). The server in turn sends back two streams of data, a set of headers followed (if so requested) by a body. The headers contain status information about the transaction (for example, the infamous 404 File Not Found Error includes the error code 404 in the header). This makes it possible for the browser to display error messages in its own way if so desired.
You can retrieve this information using the .status and .statusText properties respectively. The first property retrieves the status code, which is a three-digit number that has a specific meaning, while .statusText returns the string message that the server sends back (usually a human-readable description of the error encountered, although it could conceivably be anything that the server program so desires). In general, you should use the .status number to decipher the error message, rather than relying on some specific phrase within the .statusText for determining error causes.
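As an illustration of branching on the number rather than the text, a sketch might look like this. The category names it returns are arbitrary labels chosen for the example.

```javascript
// Classify a completed request by its numeric status code.
// statusText is used only for display, never for decisions.
function describeStatus(http) {
    var code = http.status;
    if (code >= 200 && code < 300) { return "ok"; }
    if (code == 404) { return "not found"; }
    if (code >= 500) { return "server error"; }
    return "other (" + code + " " + http.statusText + ")";
}
```

After a send, a script could call describeStatus(http) and only touch the response text when the result is "ok".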
The server returns a great deal more in the header information than just the error code, however. Normally, the headers get absorbed by the browser to determine how to handle the page, but you can see these values with the getAllResponseHeaders function:
The output will probably look something like this:
Content-Type: text/html
Content-Length: 2645
Last-Modified: Mon, 01 Mar 1999 20:12:18 GMT
The Content-Length header gives the size, in bytes, of the body that was returned. It’s worth remembering that even if the .open method points to an erroneous URL, the Content-Length will still be larger than zero, as the body usually contains the HTML that the server displays when an error takes place.
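Because getAllResponseHeaders hands back all of the headers as one string, with one header per line, it can be convenient to split that string into a lookup object. The helper below is a sketch of one way to do so. Its name and the decision to lowercase the header names are assumptions made for illustration.

```javascript
// Split the string returned by http.getAllResponseHeaders() into an
// object keyed by lowercased header name.
function parseHeaders(raw) {
    var headers = {};
    var lines = raw.split("\n");
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i].replace(/\r$/, "");  // drop a trailing carriage return
        var colon = line.indexOf(":");           // first colon separates name from value
        if (colon > 0) {
            var name = line.substring(0, colon).toLowerCase();
            var value = line.substring(colon + 1).replace(/^\s+|\s+$/g, "");
            headers[name] = value;
        }
    }
    return headers;
}
```

With the sample headers above, parseHeaders(http.getAllResponseHeaders())["content-length"] would yield the string "2645".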
You can pick out individual headers through the use of the .getResponseHeader method. For example, to retrieve just the last modified date, you’d use the expression:
document.write http.getResponseHeader("Last-Modified")
--> Mon, 01 Mar 1999 20:12:18 GMT
While you should be consistent for form's sake, the getResponseHeader() function is case insensitive: .getResponseHeader("last-modified") is the same as .getResponseHeader("Last-Modified").
The .responseText property is only one of four data formats available, and hints at the true power of the XMLHTTP request object. The .responseXML property returns the data sent back from the server as an XML document object. This can cut down significantly on using the XML document's .load method, because the XML will already have been downloaded and parsed with .responseXML:
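A minimal sketch of reading the parsed document is given below. It assumes the server delivers the file with an XML content type so that the parser has something to work with, and the function name is hypothetical.

```javascript
// Fetch an XML document and return the tag name of its root element.
// `http` is an XMLHTTP request object.
function fetchRootName(http, url) {
    http.open("GET", url, false);   // synchronous, for brevity
    http.send();
    var xmlDoc = http.responseXML;  // already parsed into an XML DOM
    return xmlDoc.documentElement.nodeName;
}

// e.g. fetchRootName(new ActiveXObject("Microsoft.XMLHTTP"), "banners.xml")
```

From there, the usual XML DOM calls can be used on xmlDoc without a separate .load step.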
This is similar to the .load method mentioned earlier, but it does give you the advantage of receiving the HTTP headers, and of loading components other than XML files onto the client.
The final two response properties, .responseBody and .responseStream, are used in more advanced contexts. The data that gets sent back from the server is frequently in encoded notation, both to conserve bandwidth and to ensure that dangerous characters (such as those used to delineate packet boundaries) aren't explicitly sent over the Internet. The .responseBody property contains this encoded data as a collection of bytes. For text data (.txt, .htm, .xml, and so on), the .responseText property converts that data back into a stream of text. The .responseStream property, on the other hand, converts that same data into a binary stream object that can in turn be loaded, saved, or manipulated on the client with the appropriate tools. A similar mechanism is used to transport SQL recordsets from the server to the client for Remote Data Services (or RDS).
The techniques shown here should give you a basic understanding of how to work with the XMLHTTP request object, although it is really only the beginning of what you can do with it. Next month, I'm going to focus on the HTTP/1.1 protocol and how you can leverage the request object to read directories, create, modify, and delete files on the server through HTTP, post images to the server, and get information about the server.