What's the difference between a GET and a POST request? And HEAD?
A week or two ago I wrote a little rant about how we're having trouble finding a suitable contractor to do some work for us. During the rant, I mentioned one of the questions I asked to get a feel for how much knowledge the interviewees had.
What's the difference between a GET and POST request?
Over the last week, Google has started sending people to that post who are looking for an answer to that specific question. Since people want to know, and seem to expect me to tell them, I figured I'd better answer it *grin*. For the full description of it all, have a read of the HTTP/1.1 RFC (2616), which describes the specification. For older stuff, the HTTP/1.0 RFC (1945) is still worth reading too.
When a request is made to a web server, a message in a known format is sent. It consists of two parts - request headers, and a request body. Some headers are mandatory, and others are optional. A request body is only included if required - a GET normally doesn't have one, and a server is free to ignore one if it's sent. Each header line ends with a new line (CRLF), and the header section is marked complete with a blank line (that is, two CRLFs in a row).
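To make that layout concrete, here's a rough sketch in Python that builds a raw request as text and then pulls it apart again at the blank line. The helper names (build_request, split_request), the example.com host and the foo=bar body are all made up for illustration - this is not how any particular server does it.

```python
CRLF = "\r\n"

def build_request(method, target, headers, body=""):
    """Assemble a raw HTTP/1.1 request message as text (illustrative helper)."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line (CRLF CRLF) ends the header section; the body, if any, follows.
    return CRLF.join(lines) + CRLF + CRLF + body

def split_request(raw):
    """Split a raw request into (request line, headers dict, body)."""
    head, _, body = raw.partition(CRLF + CRLF)
    request_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return request_line, headers, body

# Hypothetical request: the host name and body content are assumptions.
raw = build_request(
    "POST",
    "/blogs/geoff.appleby/someentry.aspx",
    {"Host": "example.com", "Content-Length": "7"},
    "foo=bar",
)
request_line, headers, body = split_request(raw)
print(request_line)
print(headers)
print(repr(body))
```

A request built without a body simply ends at the blank line, so split_request gives back an empty string for it.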
There are only two things in the header section that are absolutely essential - the Request-Line, and a Host header (required in HTTP/1.1; it's not defined in the HTTP/1.0 specification). The Host header names the server you are attempting to contact so that a server hosting multiple sites can work out which one you require.
The Request-Line is the crux of everything. It consists of the format 'Request-Method Request-URI HTTP-Version'. Some example request lines are (the double dash starts a comment, only the part before it is the actual request line :):
GET /blogs/geoff.appleby/default.aspx HTTP/1.1 --to get my main page
POST /blogs/geoff.appleby/someentry.aspx HTTP/1.1 --to submit a comment on a post, say.
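If you wanted to pick a request line apart in code, a minimal sketch might look like the following. RequestLine and parse_request_line are names I've invented for the example, and real servers are a good deal fussier about validation than this.

```python
from typing import NamedTuple

class RequestLine(NamedTuple):
    method: str
    target: str
    version: str

def parse_request_line(line: str) -> RequestLine:
    """Split 'Request-Method Request-URI HTTP-Version' into its three parts."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    return RequestLine(*parts)

get_line = parse_request_line("GET /blogs/geoff.appleby/default.aspx HTTP/1.1")
post_line = parse_request_line("POST /blogs/geoff.appleby/someentry.aspx HTTP/1.1")
print(get_line.method, get_line.target, get_line.version)
print(post_line.method, post_line.target, post_line.version)
```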
In both cases we could include some query string parameters:
GET /blogs/geoff.appleby/default.aspx?foo=bar&fnord=cow HTTP/1.1
The querystring is supposed to be used to locate a specific resource, or dynamically locate a piece of data, that sort of thing. It's supposed to be used for discovery and identification. Again, it can be used for both GET and POST.
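As a quick sketch of how the querystring comes apart on the receiving end, Python's standard urllib.parse module will split the path from the query and then break the query into name/value pairs. The variable names here are just mine.

```python
from urllib.parse import parse_qsl, urlsplit

target = "/blogs/geoff.appleby/default.aspx?foo=bar&fnord=cow"

parts = urlsplit(target)
# Everything before the '?' identifies the resource...
print(parts.path)
# ...and everything after it is a list of name/value pairs joined by '&'.
params = parse_qsl(parts.query)
print(params)
```

The same call works no matter which request method the target came in on, since the querystring is part of the Request-URI rather than the body.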
So what's the difference? GET is supposed to be used to only retrieve a resource (static or dynamic). POST is supposed to be used to send data TO a resource.
So where does this data get sent? In a POST, as well as request headers you include a message body. Just like how the querystring is a set of name/value pairs (separated by &'s), the message body for a typical form POST is just a set of name/value pairs encoded the same way (also separated by &'s), sent after the blank line that ends the headers.
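Here's a small sketch of what a form POST could look like on the wire, assuming the usual application/x-www-form-urlencoded content type, where the pairs are joined with &'s just like a querystring. The field names, the example.com host and the values are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode

CRLF = "\r\n"

# Hypothetical form fields for a blog comment.
fields = {"name": "Geoff", "comment": "Nice post"}
body = urlencode(fields)

# The headers come first, then a blank line, then the body.
request = CRLF.join([
    "POST /blogs/geoff.appleby/someentry.aspx HTTP/1.1",
    "Host: example.com",
    "Content-Type: application/x-www-form-urlencoded",
    f"Content-Length: {len(body.encode('ascii'))}",
    "",
    body,
])
print(request)

# On the server side, the body decodes back into the same pairs.
received_body = request.partition(CRLF + CRLF)[2]
decoded = parse_qsl(received_body)
print(decoded)
```

Other content types (file uploads, for instance) lay the body out differently, so treat this as one common case rather than the only one.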
And this, essentially, is the only difference. It's the way in which name/value pairs are sent up to the server. There's a lot more involved in what you can do with either of them. For example, the response to a POST request can only be cached under certain conditions, depending on different optional headers. But these are usage rule differences rather than actual differences in what's sent, so generally it's safe to say that this is all it is. There's also a size constraint in place here. The specification doesn't put a limit on the length of a request Uri (including the querystring), but browsers and web servers each impose their own, so it's safest to keep it short - a thousand characters or so. The request body can be much larger (the limit is again chosen by each individual web server, not the specification), so you can upload a lot more data via POST.
Notice how I used the word 'supposed' several times above? The intent of these request methods is as I've stated. You can actually get around this however. In HTML, you can certainly do <FORM method="get">, and have the form params submitted only via the querystring, and update data as a result. It just depends on how each page has been implemented.
And what's a HEAD request? It's exactly the same as a GET request, but the web server must not return the requested resource. A response, just like a request, consists of headers and a body - the headers describe meta information about the resource requested (its size, its MIME type, if it couldn't be located, etc) and the body is the actual resource content. So the result of a GET request is some headers and a body. The result of a HEAD request is the exact same headers as if it was a GET request, but NO body. Easy huh?
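If you want to see that for yourself, here's a rough sketch that spins up a throwaway local server with Python's standard http.server module and hits it with both methods. DemoHandler, fetch and the page content are all invented for the demo, and it assumes you can open a socket on 127.0.0.1.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>hello</body></html>"  # hypothetical resource content

class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        # The same headers are sent for both GET and HEAD.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)  # GET: headers plus the body

    def do_HEAD(self):
        self._send_headers()    # HEAD: headers only, no body

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def fetch(method, port):
    """Return (status, Content-Length header, body bytes) for one request."""
    conn = HTTPConnection("127.0.0.1", port)
    conn.request(method, "/")
    response = conn.getresponse()
    result = (response.status, response.getheader("Content-Length"), response.read())
    conn.close()
    return result

# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_result = fetch("GET", port)
head_result = fetch("HEAD", port)
server.shutdown()
server.server_close()

print("GET :", get_result)
print("HEAD:", head_result)
```

Both results should carry the same status and Content-Length, with only the GET actually bringing the page back.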
There are also a few more request method types available for use, but I won't go into them here - read the RFCs to find out about them, or to find out more about what I've talked about. It is interesting though that with GET and POST being the two most common request methods out there, a web server only optionally has to implement a handler for POST requests. Only GET and HEAD are mandatory for implementation. Funny.
Side note: When asked in the interviews I mentioned at the start, I was certainly not expecting an answer as detailed as this. A simple sentence or two like 'POST is for submitting data, GET is for getting data' or 'POST data is sent after all the headers' would have been more than enough to keep me happy :)