1. Introduction
1.1 Purpose
The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed needed for distributed, collaborative hypermedia information systems. It is a generic, stateless, object-oriented protocol that can be used for many tasks, such as name servers and distributed object management systems. A feature of HTTP is the typing of data representation, which allows systems to be built independently of the data being transferred.
HTTP has been in use on the World Wide Web since 1990. This specification reflects common usage of the protocol referred to as "HTTP/1.0".
This specification describes the features that are implemented consistently in most HTTP/1.0 clients and servers. It is split into two parts: features with consistent implementations are described in the main body of this document, and features with few or inconsistent implementations are listed in Appendix D.
Practical information systems require more functionality than simple retrieval, including search, front-end update, and annotation. HTTP allows an open-ended set of methods to indicate the purpose of a request. It builds on the discipline of reference provided by the Uniform Resource Identifier (URI) [2], as a location (URL) [4] or a name (URN) [16], to indicate the resource on which a method is to be applied. Messages are passed in a format similar to that used by Internet Mail [7] and the Multipurpose Internet Mail Extensions (MIME) [5].
HTTP is also used as a generic protocol for communication between user agents and proxies or gateways to other Internet protocols, such as SMTP [12], NNTP [11], FTP [14], Gopher [1], and WAIS [8]. This allows basic hypermedia access to resources available from diverse applications and simplifies the implementation of user agents.
1.2 Terminology
This specification uses a number of terms to refer to the roles played by participants in, and objects of, HTTP communication:
Connection
A transport-layer virtual circuit established between two application programs for the purpose of communication.
Message
The basic unit of HTTP communication, consisting of a structured sequence of octets matching the syntax defined in Section 4.
Berners-Lee, et al information [Page 4]
Request
An HTTP request message (as defined in Section 5).
Response
An HTTP response message (as defined in Section 6).
Resource
A network data object or service that can be identified by a URI (see Section 3.2).
Entity
A particular representation of a data resource, or a reply from a service resource. An entity consists of metainformation in the form of entity headers and content in the form of an entity body.
Client
An application program that establishes connections for the purpose of sending requests.
User Agent
The client that initiates a request. User agents are often browsers, editors, spiders (web crawlers), or other end-user tools.
Server
An application program that accepts connections in order to service requests by sending back responses.
Origin Server
The server on which a given resource resides or is to be created.
Proxy
An intermediary program that acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or passed on, with possible translation, to other servers. A proxy must interpret a request message and, if necessary, rewrite it before forwarding it. Proxies are often used as client-side portals through network firewalls and as helper applications for handling requests on behalf of the user agent.
Gateway
A server that acts as an intermediary for some other server. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the requesting client may not be aware that it is communicating with a gateway. Gateways are often used as server-side portals through network firewalls and as protocol translators for access to resources stored on non-HTTP systems.
Tunnel
An intermediary program that acts as a blind relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, though it may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are used when a portal is necessary and the intermediary cannot, or should not, interpret the relayed communication.
Cache
A program's local store of response messages, together with the subsystem that controls message storage, retrieval, and deletion.
A cache stores responses in order to reduce the response time and network bandwidth consumption of future, equivalent requests. Any client or server may include a cache, though a server cannot use a cache while it is acting as a tunnel.
Any given program may be capable of being both a client and a server. These terms refer only to the role the program performs for a particular connection, not to the program's capabilities in general. Likewise, any server may act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.
1.3 Overall Operation
The HTTP protocol is based on a request/response paradigm. A client establishes a connection with a server and sends a request consisting of a request method, a URI, and a protocol version, followed by a MIME-like message containing request modifiers, client information, and possibly body content.
The server responds with a status line, which includes the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity metainformation, and possibly body content.
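The request and response layout described above can be illustrated with a minimal sketch. The helper names `build_request` and `parse_response`, the example URI, and the header values are hypothetical and chosen only for illustration; the sketch assumes well-formed messages with CRLF line endings and does not cover every detail of the message syntax.

```python
# Hypothetical helpers for illustration; not part of the specification.

def build_request(method, uri, headers=None, body=b""):
    """Serialize a request: request line, header lines, blank line, body."""
    lines = [f"{method} {uri} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("latin-1") + body


def parse_response(raw):
    """Split a response into (version, status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("latin-1").split("\r\n")
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body


request = build_request("GET", "/index.html", {"User-Agent": "example/0.1"})
response = parse_response(
    b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hi</p>"
)
```

Here `request` holds the request line and one header followed by a blank line, and `response` separates the status line fields from the headers and the entity body.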
Most HTTP communication is initiated by a user agent and consists of a request to be applied to a resource on some origin server. In the simplest case, this is accomplished over a single connection (V) between the user agent (UA) and the origin server (O).
Request chain ------------------------>
UA ------------------- V ------------------ O
<--------------------- Response Chain
A more complicated situation occurs when one or more intermediaries are present in the request/response chain. There are three common forms of intermediary: proxy, gateway, and tunnel.
A proxy is a forwarding agent: it receives requests for a URI in its absolute form, rewrites all or part of the message, and forwards the reformatted request toward the server identified by the URI.
A gateway is a receiving agent: it acts as a layer above some other server and, if necessary, translates requests into the protocol of the underlying server.
A tunnel acts as a relay point between two connections without changing the messages. Tunnels are used when communication must pass through an intermediary (such as a firewall), even when the intermediary cannot interpret the contents of the messages.
Request chain -------------------------------------->
UA ----- V ----- A ----- V ----- B ----- V ----- C ----- V ----- O
<------------------------------------- Response Chain
The figure above shows three intermediaries (A, B, and C) between the user agent and the origin server. A request or response message that travels the whole chain must pass through four separate connections. This distinction is important because some HTTP communication options may apply only to the connection with the nearest non-tunnel neighbor, only to the endpoints of the chain, or to all connections along the chain. Although the figure is linear, each participant may be engaged in multiple, simultaneous communications. For example, B may be receiving requests from clients other than A, or forwarding requests to servers other than C, at the same time that it is handling A's request.
Any party to the communication that is not acting as a tunnel may employ an internal cache for handling requests. If a participant along the chain holds a cached response applicable to a request, the request/response chain is shortened accordingly. The following figure illustrates the resulting chain when B has a cached copy of an earlier response from O (via C) for a request that has not been cached by UA or A:
Request chain ---------->
UA ----- V ----- A ----- V ----- B - - - - - C - - - - - O
<--------- Response Chain
Not all responses can be cached, and some requests may contain modifiers that place special requirements on cache behavior. Some HTTP/1.0 applications use heuristics to decide which responses can be cached and which cannot, but these rules are not standardized.
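The shortened chain can be modeled with a small sketch. The classes `OriginServer` and `Intermediary` and the resource `/index.html` are hypothetical, and the sketch assumes every response is cachable, which the preceding paragraph notes is not true in general.

```python
# Hypothetical model of a request/response chain; not a real proxy.

class OriginServer:
    """Holds the resources and counts every request that reaches it."""

    def __init__(self, resources):
        self.resources = resources
        self.hits = 0

    def handle(self, uri):
        self.hits += 1
        return self.resources.get(uri)


class Intermediary:
    """A non-tunnel hop that forwards requests and may keep a cache."""

    def __init__(self, name, upstream, use_cache=False):
        self.name = name
        self.upstream = upstream
        self.cache = {} if use_cache else None
        self.forwarded = 0

    def handle(self, uri):
        if self.cache is not None and uri in self.cache:
            return self.cache[uri]  # the chain is shortened at this hop
        self.forwarded += 1
        response = self.upstream.handle(uri)
        # Assumption: every successful response is treated as cachable.
        if self.cache is not None and response is not None:
            self.cache[uri] = response
        return response


origin = OriginServer({"/index.html": "<p>hello</p>"})
c = Intermediary("C", origin)
b = Intermediary("B", c, use_cache=True)
a = Intermediary("A", b)

first = a.handle("/index.html")   # travels A -> B -> C -> O
second = a.handle("/index.html")  # answered by B; C and O are not contacted
```

After the second request, A has forwarded twice while B, C, and O have each been involved only once, which mirrors the shortened chain in the figure.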
On the Internet, HTTP communication generally takes place over TCP/IP connections. The default port is TCP 80 [15], but other ports can be used. This does not preclude HTTP from being implemented on top of any other protocol on the Internet or on other networks. HTTP presumes only a reliable transport, so any protocol that provides such a guarantee can be used. The mapping of HTTP/1.0 request and response structures onto the transport data units of a given protocol is outside the scope of this specification.
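Because HTTP presumes only a reliable byte stream, a simple exchange can be sketched over any such stream. The sketch below uses Python's `socket.socketpair()` as a stand-in for a TCP connection so that it runs without a network; the function names, the URI `/greeting`, and the response content are hypothetical. It follows the common HTTP/1.0 practice of one request per connection, with the server closing the connection after sending the response.

```python
import socket

# Hypothetical helpers for illustration; a socket pair stands in for TCP.


def serve_one_request(conn):
    """Read one request, send one response, then close the connection."""
    request = b""
    while b"\r\n\r\n" not in request:
        chunk = conn.recv(1024)
        if not chunk:  # peer closed before the request was complete
            break
        request += chunk
    conn.sendall(
        b"HTTP/1.0 200 OK\r\n"
        b"Content-Type: text/plain\r\n"
        b"\r\n"
        b"hello"
    )
    conn.close()  # closing the connection marks the end of the response
    return request


def read_until_closed(conn):
    """Collect bytes until the peer closes the connection."""
    data = b""
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            return data
        data += chunk


client, server = socket.socketpair()
client.sendall(b"GET /greeting HTTP/1.0\r\n\r\n")
seen_by_server = serve_one_request(server)
response = read_until_closed(client)
client.close()
```

The client here learns that the response is complete only because the connection has been closed, so a robust client would also need to handle a connection that closes before the full response arrives.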
Except for experimental applications, current practice requires that the client establish a connection before each request and that the server close it after sending the response. Both clients and servers should be prepared to handle a premature close, because either party may close the connection as a result of user action, an automated timeout, or a program failure. In any case, the closing of the connection by either or both parties always terminates the current request, regardless of its status.
1.4 HTTP and MIME
HTTP/1.0 uses many of the constructs defined for MIME; see RFC 1521 [5] for details. Appendix C describes how the use of Internet media types in HTTP differs from their typical use in Internet mail, and gives the rationale for those differences.
2. Notational Conventions and Generic Grammar
2.1 Augmented BNF
All of the mechanisms specified in this document are described in both prose and an augmented Backus-Naur Form (BNF) similar to that used by RFC 822 [7]. Implementors will need to be familiar with this notation in order to understand the specification. The augmented BNF includes the following constructs: