A document sharing system such as the `world wide web` is composed of three important parts.
A first solution is to force the users to be authenticated. This was the solution used by `FTP` to control the files that each user could access. Initially, user names and passwords could be included inside URIs :rfc:`1738`. However, placing passwords in the clear in a potentially publicly visible URI is completely insecure and this usage has now been deprecated :rfc:`3986`. HTTP supports several extension headers :rfc:`2617` that a server can use to request that the client authenticate itself by providing the user's credentials. However, user names and passwords have not been popular on web servers as they force human users to remember one user name and one password per server. Remembering a password is acceptable when a user needs to access protected content, but users will not agree to remember a distinct user name and password for each web site that they visit.
a `header`, that contains additional information about the response. The response header ends with an empty line.
a `header`, that is used by the client to specify optional parameters for the request. An empty line is used to mark the end of the header
All status codes starting with digit `2` indicate a valid response. `200 OK` indicates that the HTTP request was successfully processed by the server and that the response is valid.
All status codes starting with digit `3` indicate that the client must take further action to obtain the requested document, typically because the document has moved or because a cached copy can be reused. `301 Moved Permanently` indicates that the requested document has been permanently moved to a new URI. A `Location:` header containing the new URI of the requested document is inserted in the HTTP response. `304 Not Modified` is used in response to an HTTP request containing the `If-Modified-Since:` header. This status line is used by the server if the document stored on the server is not more recent than the date indicated in the `If-Modified-Since:` header.
All status codes starting with digit `4` indicate that the server has detected an error in the HTTP request sent by the client. `400 Bad Request` indicates a syntax error in the HTTP request. `404 Not Found` indicates that the requested document does not exist on the server.
All status codes starting with digit `5` indicate an error on the server. `500 Internal Server Error` indicates that the server could not process the request due to an error on the server itself.
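Since the class of a status code depends only on its first digit, the classification described above can be sketched in a few lines of Python. The function name `status_class` and the labels it returns are illustrative assumptions; they do not come from any HTTP library.

```python
def status_class(code: int) -> str:
    """Return a coarse label for an HTTP status code based on its first digit."""
    classes = {
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 keeps only the first digit of a three-digit code.
    return classes.get(code // 100, "other")


# Status lines taken from the descriptions above.
for line in ("200 OK", "301 Moved Permanently", "404 Not Found",
             "500 Internal Server Error"):
    code = int(line.split(" ", 1)[0])
    print(code, "->", status_class(code))
```

A client can use such a coarse classification to decide how to react to a status code that it does not know in detail.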
A markup language is a structured way of adding annotations about the formatting of the document within the document itself. Example markup languages include troff_, which is used to write the Unix man pages, and Latex_. HTML uses markers to annotate text and a document is composed of `HTML elements`. Each element is usually composed of three parts: a start tag that potentially includes some specific attributes, some text (often including other elements), and an end tag. An HTML tag is a keyword enclosed in angle brackets. The generic form of an HTML element is ::
a `method`, that indicates the type of request, a URI, and the version of the HTTP protocol used by the client
a MIME document
An example of a non-retrievable URI is `urn:isbn:0-380-81593-1`, which is a unique identifier for a book, expressed with the `urn` scheme (see :rfc:`3187`). Of course, any URI can be made retrievable via a dedicated server or a new protocol, but this one has no explicit protocol. The same applies to the `tag` scheme (see :rfc:`4151`), often used in Web syndication (see :rfc:`4287` about the Atom syndication format). Even when the scheme is retrievable (for instance with `http`), the URI is often used only as an identifier, not as a way to get a resource. See http://norman.walsh.name/2006/07/25/namesAndAddresses for a good explanation.
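The difference between the two kinds of URI shows up when they are split into components. The sketch below uses `urlsplit` from Python's standard `urllib.parse` module; the helper name `split_uri` is an assumption made for this illustration.

```python
from urllib.parse import urlsplit


def split_uri(uri: str) -> tuple:
    """Return the (scheme, authority, path) components of a URI."""
    parts = urlsplit(uri)
    return (parts.scheme, parts.netloc, parts.path)


# The urn URI has a scheme but no authority (server) component,
# whereas the http URI names the server that can be contacted.
for uri in ("urn:isbn:0-380-81593-1", "http://www.ietf.org"):
    print(uri, "->", split_uri(uri))
```

The empty authority component of the `urn` URI reflects the fact that nothing in the identifier says which server, if any, could be contacted to retrieve the book.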
`anonymous` : in this mode, clients supply the `anonymous` user identifier and their email address as password. These clients are granted access to a special zone of the file system that only contains public files.
an optional MIME document attached to the request
As an illustration of HTTP/1.0, the transcript below shows an HTTP request for `http://www.ietf.org <http://www.ietf.org>`_ and the corresponding HTTP response. The HTTP request was sent using the curl_ command line tool. The `User-Agent:` header line contains more information about this client software. There is no MIME document attached to this HTTP request, and it ends with a blank line.
A second solution to allow servers to tune their content to the needs and capabilities of the user is to rely on the different types of `Accept-*` HTTP headers. For example, the `Accept-Language:` header can be used by the client to indicate its preferred languages. Unfortunately, in practice this header is usually set based on the default language of the browser, and it is difficult for users to indicate the language they prefer by selecting options for each visited web server.
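To make the role of `Accept-Language:` more concrete, here is a minimal sketch of how a server might order the languages listed in this header. It assumes the usual comma-separated syntax with optional `q=` weights and ignores wildcards and most malformed input; the function name `parse_accept_language` and the sample header value are assumptions made for this illustration.

```python
def parse_accept_language(value: str) -> list:
    """Return (language, weight) pairs sorted from most to least preferred."""
    prefs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        weight = 1.0  # a language without an explicit weight is fully preferred
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat an unreadable weight as "not wanted"
        prefs.append((tag.strip(), weight))
    # sorted() is stable, so languages with equal weights keep the header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


print(parse_accept_language("fr, en;q=0.8, de;q=0.5"))
```

A server that stores a document in several languages could walk through this list and return the first language that it has available.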
As illustrated above, a client can send several HTTP requests over the same persistent TCP connection. However, it is important to note that all of these HTTP requests are considered to be independent by the server. Each HTTP request must be self-contained. This implies that each request must include all the header lines that are required by the server to understand the request. The independence of these requests is one of the key design choices of HTTP. As a consequence of this design choice, when a server processes an HTTP request, it does not use any other information than what is contained in the request itself. This explains why the client adds its `User-Agent:` header in all of the HTTP requests that it sends over the persistent TCP connection.
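The self-contained nature of each request can be sketched as follows. The helper `build_request`, the host name `www.example.org` and the `sketch/0.1` user agent string are hypothetical values chosen for this illustration; the code only builds the bytes of the requests and does not open any connection.

```python
def build_request(path: str, host: str, user_agent: str) -> bytes:
    """Build one self-contained HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "",  # the empty line that ends the header
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Two requests that could be sent back to back over one persistent connection.
# Every header line is repeated in each of them, since the server does not
# remember anything from the previous request.
requests = [
    build_request(path, "www.example.org", "sketch/0.1")
    for path in ("/", "/index.html")
]
for request in requests:
    print(request.decode("ascii"))
```

Repeating the headers costs a few bytes per request, but it lets the server handle each request without keeping any per-connection state.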
A simple HTML page
A standard document format : the `HyperText Markup Language <http://www.w3.org/MarkUp>`_
A standardized addressing scheme that unambiguously identifies documents