Simon Cozens
A long time ago, Kirrily Robert and I had a plan to write a book about internet protocols for the "intelligent amateur". One of the things it was going to contain was a description of how to speak the various protocols to a computer from a telnet session, as a real-life example of how each protocol works.
The book will probably never happen, but the need for such a set of examples has not gone away; when debugging various transactions, developing Perl modules, or simply sending and checking mail without the appropriate tools to hand, I've often wished for a handy reference guide to this or that protocol. Sometimes I've been lucky and hit on some web site with a quick description of the protocol. Sometimes, though, I've had to resort to seeking out the relevant RFC.
Unfortunately, RFCs are written more like legal documents than handy references. So the time had come to sit down and write a little phrasebook with a quick description of the important bits of each protocol. This is that phrasebook.
All the high-level protocols discussed here are designed in the Unix tradition: line-based, textual data, designed to be relatively easy for a human programmer to read and generate by hand. They're also almost entirely based on a command-response model - the client asks the server to do something, the server tells the client the result.
The handiest tool for speaking to a server using such a protocol is the telnet utility. Any hacker worth his salt will know how to use telnet as a web browser, mail application, newsreader, and much else besides, and this article will show how it's done.
telnet takes the name or IP address of the machine you want to talk to, the port which the server listens on, and opens a connection to the remote computer. Everything you then type will be relayed to the server, and the server's responses will appear on your screen. On a Unix computer, a sample telnet session will look like this:
% telnet 10.0.0.1 1234
Trying 10.0.0.1...
Connected to 10.0.0.1.
Escape character is '^]'.
Fictitious Server on 10.0.0.1 - Ready
QUIT
Leaving...
Connection closed by foreign host.
This is how we will represent conversations with the computer in this phrasebook: things you type will be in bold on a lighter background, and everything else will be the things the computer responds with. Things which are outside the scope of the network conversation are in green.
On a Windows computer, you will probably be using the Windows telnet client; when chatting with a server in the protocols in this phrasebook, you'll need to turn local echo on in order to see what you're typing.
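If you'd rather script a conversation than type it, the same relay can be sketched in a few lines of Python. This is only a rough sketch: the helper names read_line, send_line and talk are made up for illustration, and talk assumes the server sends a one-line greeting and answers every command with exactly one line, which is not true of every protocol below.

```python
import socket


def read_line(sock: socket.socket) -> str:
    """Read one line from the socket, without its CRLF or LF terminator."""
    buf = bytearray()
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)  # one byte at a time: slow, but simple
        if not chunk:  # the other end closed the connection
            break
        buf += chunk
    return buf.decode("ascii", errors="replace").rstrip("\r\n")


def send_line(sock: socket.socket, line: str) -> None:
    """Send one command, terminated by CRLF as these protocols expect."""
    sock.sendall(line.encode("ascii") + b"\r\n")


def talk(host: str, port: int, commands: list) -> list:
    """Connect, read the greeting, then send each command and read one reply.

    Assumption: one greeting line and one reply line per command.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        transcript = [read_line(sock)]
        for command in commands:
            send_line(sock, command)
            transcript.append(read_line(sock))
        return transcript
```

Calling talk("10.0.0.1", 1234, ["QUIT"]) against the fictitious server above would return the greeting and the reply to QUIT as a two-item list.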
Now we are ready to start talking to some servers.
The first set of protocols we're going to look at are those about sending and checking email; these are the protocols which are most often spoken "by hand".
Name: Simple Mail Transfer Protocol
Port: 25
RFCs: 2821
SMTP is the means by which one computer moves a piece of email to another. There are many servers which implement SMTP; sendmail, exim, postfix and qmail are the most widely used. When you connect to an SMTP server, it may well tell you which one it's running:
% telnet 10.0.0.1 25
Trying 10.0.0.1...
Connected to 10.0.0.1.
Escape character is '^]'.
220 ddtm.pad ESMTP Postfix
When you connect to a mail server, it's polite to say hello. You're supposed to say which mail domain you're in, but nobody really cares about that any more; however, if you don't say hello (or don't give something which looks like a domain, i.e. has a dot in it), you might not be allowed to send mail.
HELO sailor
250 ddtm.pad
This server isn't particularly chatty, but it responds with an OK status code (a number beginning with 2) and its own mail domain. The curious spelling of "hello" is there to keep all SMTP commands four letters long. An important thing to note here is that for most of the protocols we're going to look at, the computer only cares about the status code. Much of the text is for human consumption. So our server could just as well have responded:
HELO sailor
250 Hello, sailor, pleased to meet you.
And indeed some of them do.
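To make the point concrete, here is a minimal Python sketch of how a client might pull a reply apart. The function names parse_reply and is_ok are invented for this example; the only protocol fact relied on is that an SMTP reply line starts with a three-digit code followed by a space (or a hyphen, in multi-line replies).

```python
def parse_reply(line: str) -> tuple[int, str]:
    """Split an SMTP reply line into its numeric code and its free-form text."""
    code = int(line[:3])  # the part the computer cares about
    text = line[4:]       # the part meant for humans
    return code, text


def is_ok(line: str) -> bool:
    """True if the reply carries a 2xx (success) code."""
    code, _ = parse_reply(line)
    return 200 <= code < 300
```

Both "250 ddtm.pad" and "250 Hello, sailor, pleased to meet you." come out as code 250, so a client treats them identically.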
In a second, we'll demonstrate most of the useful parts of the protocol by sending a mail, but first, there's a couple of commands which can occasionally come in handy. EXPN and VRFY are both used to determine whether or not a server is likely to accept mail for a given address. Sometimes these are disabled to avoid information leakage, and even if not, only one or the other will work.
VRFY simon@ddtm.pad
252 simon@ddtm.pad
VRFY nothere@ddtm.pad
450 <nothere@ddtm.pad>: User unknown in local recipient table
The status codes beginning with 4 generally mean "no, not right now" (a temporary failure), whereas those beginning with 5 mean a permanent error:
EXPN simon@ddtm.pad
502 Error: command not implemented
And verifying only really works for addresses handled by the local server; for remote addresses, it may claim to be able to deliver them, but will actually refuse to relay mail if you're not authorised to send via this server.
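The first digit of a reply code is what tells a client how to react, so a lookup on that digit goes a long way. The sketch below is illustrative only: the table REPLY_CLASSES and the function classify are names made up here, and the labels are a loose paraphrase of the reply classes in RFC 2821.

```python
# First digit of an SMTP reply code -> rough meaning (paraphrased labels).
REPLY_CLASSES = {
    "2": "success",
    "3": "more input expected",
    "4": "temporary failure",
    "5": "permanent failure",
}


def classify(reply: str) -> str:
    """Classify a reply line by the first digit of its status code."""
    return REPLY_CLASSES.get(reply[:1], "unknown")
```

With the replies shown above, the 252 counts as success, the 450 as a temporary failure, and the 502 as a permanent one.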
Anyway, on to the main event: sending a mail via SMTP.
% telnet 10.0.0.1 25
Trying 10.0.0.1...
Connected to 10.0.0.1.
Escape character is '^]'.
220 ddtm.pad ESMTP Postfix
HELO sailor
250 ddtm.pad
MAIL From: simon@simon-cozens.org
250 Ok
RCPT To: simon@ddtm.pad
250 Ok
DATA
354 End data with <CR><LF>.<CR><LF>
Subject: Blah blah blah

Hello. This is a test message.
.
250 Ok: queued as 910D613B813
QUIT
221 Bye
Connection closed by foreign host.
You have new mail.
This is all you need to do to send a message: say hello, say where the mail is coming from, where it's going, and the data, followed by a dot on a line by itself.
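The client's half of that conversation can be sketched as a list of lines to send. Treat this as an illustration rather than a mail client: smtp_script is a hypothetical helper, it uses the RFC's MAIL FROM:<address> spelling rather than the looser form typed above, and it ignores the server's replies, which a real client must read after every command. It also shows one wrinkle the session doesn't: a message line that begins with a dot has to be sent with the dot doubled, so the server doesn't mistake it for the end of the data.

```python
def smtp_script(helo: str, sender: str, recipient: str, message: str) -> list:
    """Return, in order, the lines a client sends to submit one message."""
    lines = [
        f"HELO {helo}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    for message_line in message.splitlines():
        # Dot-stuffing: double a leading dot so it can't end the data early.
        if message_line.startswith("."):
            message_line = "." + message_line
        lines.append(message_line)
    lines.append(".")  # a dot on a line by itself ends the message
    lines.append("QUIT")
    return lines
```

Each returned line would go over the wire followed by CRLF.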
Name: Post Office Protocol, version 3
Port: 110
RFCs: 1939
There are five commands you need to know to check your mail via POP3: USER and PASS will log you on, LIST shows the messages you've got and how big each one is, RETR retrieves a message, and DELE deletes a message. There's also STAT, which reports the number of messages and their total size, but you don't really need that.
So here's a typical POP3 conversation:
% telnet popserver.myisp.net pop3
Trying 123.4.5.6
Connected to 123.4.5.6
Escape character is '^]'.
+OK Hello there.
USER simon
+OK Password required.
PASS xxx
+OK logged in.
STAT
+OK 8 2797628
LIST
+OK POP3 clients that break here, they violate STD53.
1 33618
2 2832
3 4213
4 3830
5 5429
6 2735418
7 9814
8 2474
.
This is a list of numbered email messages and their size in bytes. If there's a really big one, like number 6, that's clogging up our download and we want to get rid of it, we can delete it:
DELE 6
+OK
The really important thing to note here is that the messages won't get renumbered until you QUIT; so you can still RETR 8:
RETR 8
+OK
X-Delivered: at request of simon on mail
Return-Path: <someone@somewhere.com>
...
If you wanted to see the headers and the first 10 lines of message 6 before blowing it away, try TOP 6 10.
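Spotting the oversized message in a LIST response is easy to automate. The following Python sketch assumes the response has already been read into a list of lines; parse_list is a made-up name, and the only format it relies on is the one shown in the session above: a +OK line, then one "number size" pair per line, then a lone dot.

```python
def parse_list(response_lines: list) -> dict:
    """Turn a POP3 LIST response into a {message number: size in bytes} dict."""
    sizes = {}
    for line in response_lines[1:]:  # skip the +OK status line
        if line == ".":              # a lone dot ends the listing
            break
        number, size = line.split()
        sizes[int(number)] = int(size)
    return sizes


listing = [
    "+OK POP3 clients that break here, they violate STD53.",
    "1 33618", "2 2832", "3 4213", "4 3830",
    "5 5429", "6 2735418", "7 9814", "8 2474",
    ".",
]
sizes = parse_list(listing)
biggest = max(sizes, key=sizes.get)  # the message number with the largest size
```

Here biggest comes out as 6, and the sizes add up to the total that STAT reported.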
Name: Internet Message Access Protocol
Port: 143
RFCs: 3501
IMAP is slightly different from the various other protocols in this phrasebook. It requires each command to be preceded by a tag, to enable commands to be processed asynchronously. Since we're going to be typing into a telnet window, it's OK for us to give all our commands the same tag. Let's call it foo, in grand style. The server will echo back the tag to enable us to match up the command that we sent with the results we got back.
Now for the commands that you're going to want to know. First of all, we need to log in:
foo LOGIN simon foobar
Next, choose the mailbox you're going to look at. This is done with the SELECT command, and the mailbox is usually "Inbox". If it isn't, you know what it should be.
foo SELECT Inbox
Now you can look for various types of message, and count them. For instance, let's get a listing of message IDs for all the messages in the inbox:
foo SEARCH ALL
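A program, unlike a person at a telnet prompt, would normally give every command its own tag so that replies can be matched up. The Python sketch below shows one way that might look; tag_commands and split_response are hypothetical helpers, the a001-style tags are just a common habit, and the server lines in the usage example are invented for illustration. It relies on two facts about IMAP: data lines start with "* ", and the final status line starts with the command's tag.

```python
def tag_commands(commands: list, prefix: str = "a") -> list:
    """Prefix each command with a unique tag: a001, a002, ..."""
    return [
        f"{prefix}{number:03d} {command}"
        for number, command in enumerate(commands, start=1)
    ]


def split_response(tag: str, lines: list) -> tuple:
    """Separate untagged data lines from the tagged status line for `tag`."""
    untagged = [line[2:] for line in lines if line.startswith("* ")]
    status = None
    for line in lines:
        if line.startswith(tag + " "):
            status = line[len(tag) + 1:]
    return untagged, status


commands = tag_commands(["LOGIN simon foobar", "SELECT Inbox", "SEARCH ALL"])
# Illustrative server output for the third command (not from a real server):
data, status = split_response("a003", ["* SEARCH 1 2 3", "a003 OK SEARCH completed"])
```

The tagged line tells you the command finished and whether it worked; the untagged lines carry the actual results.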
Name: Hypertext Transfer Protocol
Port: 80
RFCs: 2616
HTTP is probably the most widely used Internet protocol, since it underpins the World Wide Web. Getting pages via HTTP is generally very simple, although there can be some complications. Here's the simplest version:
% telnet www.google.com 80
Trying 123.4.5.6
Connected to 123.4.5.6
Escape character is '^]'.
GET /
...
This is an old-style HTTP 0.9 request, but it'll be good enough in many cases, if you just want the data.
If it isn't - specifically, if the server is hosting multiple web sites and needs to know which "/" you're referring to, or if you want to look at the HTTP headers - then you'll need to send an HTTP 1.1 request, like so:
% telnet www.wecjapan.com 80
Trying 217.204.174.162...
Connected to www.wecjapan.com
Escape character is '^]'.
GET http://www.wecjapan.com/ HTTP/1.1
Host: www.wecjapan.com

Here we send an additional header, Host, which tells the server which host we're interested in. Note the extra line after the headers, to tell the server that we've finished sending headers.
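The shape of such a request, including the blank line that ends the headers, can be sketched in Python as follows. build_request is a name invented for this example, and it sends the path alone on the request line (the more usual form) rather than the full URL typed above.

```python
def build_request(host: str, path: str = "/", method: str = "GET") -> bytes:
    """Build a minimal HTTP/1.1 request: request line, Host header, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "",  # the empty line that tells the server the headers are finished
        "",  # so that the joined text ends with a final CRLF
    ]
    return "\r\n".join(lines).encode("ascii")
```

For example, build_request("www.wecjapan.com") produces the request line, the Host header, and the terminating blank line, each ended by CRLF.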
The response we get will be in two parts, headers and data (usually HTML) separated by a blank line:
HTTP/1.1 200 OK
Date: Sat, 26 Jul 2003 12:23:11 GMT
Server: Apache/1.3.27 (Darwin)
Last-Modified: Fri, 18 Jul 2003 12:24:29 GMT
ETag: "10d9bb-12ff-3f17e6fd"
Accept-Ranges: bytes
Content-Length: 4863
Content-Type: text/html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd">
...
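Splitting a response at that blank line is straightforward. This Python sketch is a simplification with an invented function name, parse_http_response: it assumes the whole response is already in memory, and it ignores complications such as repeated headers and chunked bodies.

```python
def parse_http_response(raw: bytes) -> tuple:
    """Split a raw HTTP response into (status line, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # first blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return status_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 4863\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<!DOCTYPE HTML>"
)
status_line, headers, body = parse_http_response(raw)
```

Header names are lower-cased on the way in, so headers["content-type"] finds the value however the server capitalised it.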
If you're just interested in the headers, send HEAD instead of GET.
The other method worth looking at is POST, used for submitting form data back to a web server.