Telnet and REST Web Services?

December 15, 2004

"Telnet" usually refers to a terminal emulation program that uses the telnet protocol to talk to a server. It's been most popular for gaining shell access to remote computers, especially over dial-up lines in the old days, but you can also use it to submit commands to a web server if you can speak a little of the web server's language: HTTP.

Fans of the REST style of web services often point out that the four HTTP commands PUT, GET, POST, and DELETE let you perform the most important operations on a collection of data: Create new data, Read existing data, Update data, and Delete data—giving us the lovely acronym CRUD. I first heard of the correspondence between these four operations and SQL's four most basic operations in a table that Sun's Marc Hadley showed in a presentation on Web Services:

	HTTP	SQL
Create	PUT	INSERT
Read	GET	SELECT
Update	POST	UPDATE
Delete	DELETE	DELETE

As important new standards such as Atom build on these HTTP commands, I wanted to understand this foundation better. Web browsers, SOAP applications, and other applications that issue HTTP requests typically do so in the background as they perform their jobs; telnet lets you type these commands in directly and, after it passes them to the server, it displays the complete results immediately. It's unforgiving and unwieldy, but it provides valuable lessons in exactly what's going on with these important commands.

GET

When your web browser retrieves a web page, it sends the GET command to the HTTP server where the web page is stored. (I'm going to call it an "HTTP server" instead of a "web server" for the remainder of the article, because an HTTP server can do a lot more than deliver web pages—that's much of the point of web services.)

Follow this link to http://www.snee.com/xml/crud/sample1.html and do a View Source to see what this file looks like. Now let's try it the telnet way. A telnet program that you start up from the command shell is included with Windows, Linux, and OS X. At your command prompt, enter the following command to start a session with the HTTP port of the snee.com server:

telnet snee.com 80

The HTTP port is usually 80, but as we'll see, it doesn't have to be. Once it starts up, you may or may not see some text appear, depending on your telnet utility.

An HTTP request begins with a line consisting of the method name, the name of the resource to act on, and the version of HTTP being used. To ask the HTTP server to deliver the resource stored at /xml/crud/sample1.html, enter the following two lines:

GET /xml/crud/sample1.html HTTP/1.1
Host: snee.com

(You should be able to copy these lines from here and paste them into your telnet program. Your telnet program may not echo the text that you type or paste in, but the server still receives it.) The second line identifies the host part of the URI of the resource you're requesting; HTTP 1.1 requires it. This request is two lines long; as we'll see, an HTTP request can be longer than that. A blank line ends the request, so press the return key twice after entering that second line, and the server should serve up the document you requested, preceded by an HTTP header:

GET /xml/crud/sample1.html HTTP/1.1
Host: snee.com
 
HTTP/1.1 200 OK
Date: Wed, 20 Oct 2004 00:20:40 GMT
Server: Apache/1.3.29
Last-Modified: Tue, 19 Oct 2004 23:56:04 GMT
ETag: "50d923-69-4175a994"
Accept-Ranges: bytes
Content-Length: 105
Content-Type: text/html
 
<html>
  <head><title>sample1.html</title></head>
  <body><p>Here is sample1.html.</p></body>
</html>

The very first line of the header includes the response code 200, indicating that the request was successful. Other response codes include 404 for "Not Found," which you've probably seen when you mistyped a URL. We'll see other response codes as we try the other HTTP verbs with or without success. For example, the www.snee.com/xml/crud directory has no sample2.html file, so try the request above substituting "sample2.html" for "sample1.html" and watch what happens.
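The same exchange can be scripted. Here is a minimal Python sketch of what telnet is doing, not a definitive client. The function names build_request, fetch, and parse_status_line are made up for illustration, and the "Connection: close" header is an addition that the telnet session above did not use; it asks the server to hang up after responding so that the read loop knows when to stop.

```python
import socket


def build_request(method, path, host, extra_headers=None):
    """Return the bytes of a body-less HTTP/1.1 request."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; the empty line marks the end of the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host, path, port=80, timeout=10):
    """Send a GET over a plain TCP socket and return the raw response bytes."""
    # "Connection: close" is an assumption added for this sketch.
    request = build_request("GET", path, host, {"Connection": "close"})
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_status_line(raw_response):
    """Return (version, code, reason) from the first line of a response."""
    first_line = raw_response.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = first_line.split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason
```

Passing the result of fetch("snee.com", "/xml/crud/sample1.html") to parse_status_line should yield the code 200, and the sample2.html request should yield 404, assuming the server still behaves as described here.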

More advanced use of GET lets you specify a CGI script name in the URL along with parameters that the server will pass to the CGI script in a QUERY_STRING environment variable. For example, go to the form at http://www.snee.com/xml/crud/gettest.html, do a View Source, and note that the form's method is "get." Go back to the form itself, enter the strings "Barry" and "Wom" in the two fields, click the "go" button, and note the result and the URL displayed in the navigation bar above the result. A GET request with this URL doesn't ask for an existing disk file, like the sample1.html GET request above, but it does request a resource from the HTTP server.

To do this with telnet, enter the following lines into a telnet session. It should pull the same HTML that was generated for the page you saw when you clicked the gettest.html form's "go" button because it's requesting the same resource.

GET /xml/crud/gettest.cgi?fname=Barry&lname=Wom HTTP/1.1
Host: snee.com
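To see how the form fields end up after the question mark, and how a script can get them back out of QUERY_STRING, here is a small sketch using Python's standard urllib.parse module. It only illustrates the encoding; it is not how gettest.cgi itself is written, and the helper names build_target and split_target are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_target(script_path, params):
    """Join a script path and form fields into a request target."""
    # urlencode also percent-encodes characters that are unsafe in a URL.
    return script_path + "?" + urlencode(params)


def split_target(target):
    """Split a request target into its path and decoded query parameters."""
    parts = urlsplit(target)
    # parts.query is the text a CGI script would find in QUERY_STRING.
    return parts.path, parse_qs(parts.query)


target = build_target("/xml/crud/gettest.cgi", {"fname": "Barry", "lname": "Wom"})
path, fields = split_target(target)
print(target)  # /xml/crud/gettest.cgi?fname=Barry&lname=Wom
print(fields)  # {'fname': ['Barry'], 'lname': ['Wom']}
```

parse_qs returns a list for each field because the same name can legally appear more than once in a query string.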

This means that you can actually make calls to REST-oriented web services with telnet. For example, try starting the telnet program with "www.daml.org 80" as parameters and then making the following request to their web service. It returns RDF/XML about a company based on its NYSE stock ticker symbol:

GET /cgi-bin/nyse?IBM HTTP/1.1
Host: www.daml.org

POST

If you have any experience with CGI scripts, you probably already know something about both GET and POST. To a web user simply looking at web pages and filling out forms, the form at http://www.snee.com/xml/crud/posttest.html looks and acts very much like its GET equivalent that we saw above. Its interaction with the HTTP server, however, is very different.

Before trying it via telnet, use your web browser to follow the preceding paragraph's link to it. Next, do a View Source to verify that the form's method is "post," and then return to the form from the View Source display. Fill in the strings "Barry" and "Wom" again, and compare the results with the results of the gettest.html form. The REQUEST_URI displayed on the result page (and in your navigation bar) shows the parameters passed in the GET version but not in the POST version. In the GET version, the parameters appear in the QUERY_STRING variable that the gettest.cgi read from the host environment but not in the QUERY_STRING that posttest.cgi read in. On the other hand, the CONTENT_LENGTH variable read by gettest.cgi is empty, but the posttest.cgi version shows the length of the content passed via the standard input. You may want to do a View Source on the result of the posttest.cgi "posted data" page to get an idea of what it looks like. You'll be seeing it again soon, the telnet way.

How do we pass content to the HTTP server via standard input? Here's how to do it with telnet (if you don't use the same fname and lname values shown here, remember to adjust the Content-Length figure so that the posttest.cgi script knows how much to read in from the standard input):

POST /xml/crud/posttest.cgi HTTP/1.1
Host: snee.com
Content-Length: 21

fname=Barry&lname=Wom

Don't forget the blank line after the Content-Length line to indicate where the HTTP request ends. You shouldn't need to press Return after "Wom," though, because the CGI script knows exactly how much data to read in—you told it so in the Content-Length header.
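Counting characters by hand gets old quickly. As a rough sketch, here is how a program might assemble the same POST request and fill in Content-Length itself. The function name build_post_request is an assumption for illustration, and the request mirrors the telnet example exactly; a browser submitting the form would also send a Content-Type header.

```python
from urllib.parse import urlencode


def build_post_request(path, host, fields):
    """Return the bytes of a form POST with a computed Content-Length."""
    # The encoded form data is plain ASCII, so one character is one byte.
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line: end of the header, start of the content
    ).encode("ascii")
    return head + body


request = build_post_request(
    "/xml/crud/posttest.cgi", "snee.com", {"fname": "Barry", "lname": "Wom"}
)
print(request.decode("ascii"))
```

Content-Length counts bytes, not characters, which is why the sketch measures the body after encoding it.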

The server responds with the following header and the contents of the HTML file generated by posttest.cgi:

HTTP/1.1 200 OK
Date: Sat, 23 Oct 2004 17:13:33 GMT
Server: Apache/1.3.29
Transfer-Encoding: chunked
Content-Type: text/html

The HTML file will resemble what you saw when you did a View Source of the "posted data" page generated by posttest.cgi when you filled out the form on posttest.html.
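The "Transfer-Encoding: chunked" line in that header means the body arrives in pieces, each preceded by its length in hexadecimal, so a telnet session may show short hexadecimal numbers mixed in with the HTML. The following is a minimal sketch of how a client could reassemble such a body. The function name decode_chunked is hypothetical, and the sketch ignores any trailer fields after the last chunk.

```python
def decode_chunked(data):
    """Reassemble a chunked HTTP body (bytes) into the original content."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line is hexadecimal; anything after ";" is an extension.
        size_text = data[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_text.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:  # a zero-length chunk marks the end of the body
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(body)


print(decode_chunked(b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"))  # b'Hello, world'
```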

POST is considered the HTTP equivalent of the CRUD Update operation. Its most common use so far is to invoke CGI scripts, much like GET is used with CGI scripts: as a way to say "run this script, and here are some parameters to use when running it." As you learn more about technologies that build on the mapping of HTTP methods to CRUD operations, watch for POST to be used as more than just an alternative way to invoke CGIs that doesn't put parameters in the navigation bar.

DELETE

Commands that can add new files or delete existing files from an HTTP server obviously pose some danger, and most HTTP servers either don't allow the use of the HTTP PUT and DELETE methods or require password authentication. For example, try this:

DELETE /xml/crud/gettest.html HTTP/1.1
Host: snee.com

You should get a 405 ("Method Not Allowed") return code. (If not, I'll have some issues with my host provider!) The easiest way to experiment with the power of these two HTTP methods is to configure an HTTP server (for our purposes here, it's best to use one running on a local machine that's inaccessible to the outside world) to allow DELETE and PUT commands. When I did this, I installed Jakarta Tomcat and then added the following to the servlet element with a servlet-name value of "default" in Tomcat's conf/web.xml configuration file:

<init-param>
  <param-name>readonly</param-name>
  <param-value>false</param-value> 
</init-param>

I also changed the connector port value in the conf/server.xml file from Tomcat's default of 8080 to 8083 because another process on my machine was using port 8080, so I began a telnet session with the following line:

telnet localhost 8083

I also created a files subdirectory of the ROOT directory created by the Tomcat installation, so that to point a browser to a test1.html file in that directory, the URL was http://localhost:8083/files/test1.html. (I installed Tomcat and did all these configuration steps on both a Linux box and a Windows one to ensure that all the telnet HTTP interaction was the same from both, and it was.)

I didn't want to GET test1.html, though; I wanted to destroy it! I started up telnet and entered the following lines:

DELETE /files/test1.html HTTP/1.1
Host: localhost

After pressing return twice, I saw the following:

HTTP/1.1 204 No Content
Date: Wed, 20 Oct 2004 22:15:42 GMT
Server: Apache-Coyote/1.1

The 204 code indicates that the request was successful, and that "no content" needs to be returned for this particular request. I looked in the files directory, and the test1.html file was gone. I entered the same lines into a telnet session again, to see what would happen if I tried to delete a nonexistent file, and saw everyone's favorite HTTP error code: 404 Not Found.
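Here is a sketch of the same experiment using Python's standard http.client module instead of telnet. The function name delete_resource and the OUTCOMES table are assumptions for illustration, and the table lists only the codes mentioned in this article; other servers may answer a DELETE differently. The demo function assumes a local server configured as described above and is not called automatically.

```python
import http.client

# Outcomes seen in this article; this is not a complete list.
OUTCOMES = {
    204: "deleted",
    404: "no such resource",
    405: "method not allowed",
}


def delete_resource(conn, path):
    """Send DELETE for path over an open connection and describe the result."""
    conn.request("DELETE", path)
    response = conn.getresponse()
    response.read()  # drain any body so the connection can be reused
    return response.status, OUTCOMES.get(response.status, response.reason)


def demo():
    """Delete the same file twice on the local test server."""
    conn = http.client.HTTPConnection("localhost", 8083, timeout=10)
    try:
        print(delete_resource(conn, "/files/test1.html"))
        print(delete_resource(conn, "/files/test1.html"))
    finally:
        conn.close()
```

Against the setup described here, the first call in demo should report the 204 result and the second the 404.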

PUT

A PUT request lets you say "put the resource with this identifier and the following content onto the origin server." Its format closely resembles that of the POST command. In fact, entering the POST request above with "PUT" substituted for "POST" would be perfectly legal syntax. I didn't try that exact request for two reasons: first, the snee.com server is not set up to allow PUT commands, and second, even if it were, I didn't want to replace the existing /xml/crud/posttest.cgi file there with a file consisting of "fname=Barry&lname=Wom".

Let's look at the effect of a modified version of that command that I performed on my local machine. I changed the URI of the resource and did everything else the same way, using telnet to send the following to port 8083 of localhost:

PUT /files/test2.txt HTTP/1.1
Host: localhost
Content-Length: 21

fname=Barry&lname=Wom

As with the POST command, I had to know the exact length of the content I was going to send to the server so that I could specify it after "Content-Length:", and I had to skip a line before the content to show that I was finished entering the header. After I entered the "m" that was the twenty-first character in the data part of the command, the HTTP server displayed the following underneath it:

HTTP/1.1 201 Created
Content-Length: 0
Date: Thu, 21 Oct 2004 23:43:50 GMT
Server: Apache-Coyote/1.1

The 201 means that the resource I asked the server to create was successfully created. I looked in the files directory and saw the new one-line file that the server had created there, named test2.txt and consisting of the 21 characters "fname=Barry&lname=Wom".

The ampersands and equal signs entered there make more sense in a POST or GET command, though. For a better demonstration of a PUT request, let's send a more realistic file to the server. I put a small, simple HTML file into the files directory by pasting the following into a telnet session:

PUT /files/test3.html HTTP/1.1
Host: localhost
Content-Length: 138

<html><head><title>small, simple</title></head>
<h1>Let's Get Small</h1>
<body><p>This is a small, simple HTML file.</p></body>
</html>

After I did so, sending my web browser to the URL http://localhost:8083/files/test3.html displayed this little HTML file just as if I had put it there using an FTP program.

When I entered the same commands into a telnet session a second time, I got the same return code and saw the same response, and the new version replaced the old version. The server gave no warning that a file with that name already existed. This is why some descriptions of the HTTP PUT command describe it, and not POST, as the command that updates data: you can do a GET to read a file, make some changes, and then PUT it back where you found it, essentially updating it.
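That read, change, and write-back cycle might look like the following with Python's standard http.client module. This is only a sketch: the function name update_resource and its transform parameter are made up for illustration, and nothing here stops another client from changing the file between the GET and the PUT.

```python
import http.client


def update_resource(conn, path, transform):
    """GET path, apply transform to its bytes, and PUT the result back."""
    conn.request("GET", path)
    response = conn.getresponse()
    original = response.read()
    if response.status != 200:
        raise RuntimeError(f"GET {path} failed: {response.status} {response.reason}")

    updated = transform(original)
    # http.client works out the Content-Length header from the bytes body.
    conn.request("PUT", path, body=updated)
    response = conn.getresponse()
    response.read()  # drain any body so the connection can be reused
    return response.status


def demo():
    """Rewrite one value in test2.txt on the local test server."""
    conn = http.client.HTTPConnection("localhost", 8083, timeout=10)
    try:
        status = update_resource(
            conn, "/files/test2.txt", lambda data: data.replace(b"Barry", b"Larry")
        )
        print(status)
    finally:
        conn.close()
```

The demo function assumes the local server and the test2.txt file from the earlier PUT example, and it is not called automatically.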

Getting Practical Again

Executing these commands from telnet makes a nice party trick. (Well, maybe if you're at a very geeky party.) In a more practical situation, you perform these operations using a library or other API. Now that you've seen how these HTTP requests work, you're in a better position to judge what each API adds to your use of the base HTTP vocabulary and which APIs take best advantage of it (or, in the case of WebDAV, what gaps they're trying to fill among the four methods to enable more robust use of the four CRUD commands). For one thing, an API should measure the length of your POST and PUT content and put the right number after "Content-Length:" for you. The handling of passwords as a level of security should also be simple and straightforward when using an API. To learn more about some of these APIs and how they use HTTP, take a look at Mark Pilgrim's article from October of 2003, The Atom API.

Perhaps you could even write your own API in your favorite language now that you know how little is required of it. This knowledge has been a big factor in the growing success of REST-oriented web services, as developers realize that the thick layers of libraries and tools often associated with SOAP development can be a lot more than you need when developing distributed applications that take advantage of the ubiquitous HTTP protocol. (Vendors of SOAP tools and libraries may offer a different perspective on how badly you need them.)

My examples all sent and pulled down little HTML files, but you can do it with any files you like, including binary files. In the early days of web services and XML, many developers realized that XML was a great way to encode messages with arbitrarily complex structures for communication between different platforms and that HTTP was a great way to send them. This combination of XML and HTTP has always formed the basis of web services, and it became the key to XML's popularity beyond the web-based electronic publishing applications that its inventors originally pictured. Now that you know how simple it is to send and receive XML using HTTP, you can start building your own web services or at least understand what the tools that help you build them are really doing.