Network Working Group T. Berners-Lee
Request for Comments: 1945 MIT/LCS
Category: Informational R. Fielding
UC Irvine
H. Frystyk
MIT/LCS
May 1996
Hypertext Transfer Protocol -- HTTP/1.0
Status of This Memo
This memo provides information for the Internet community. This memo does not specify an Internet standard of any kind. Distribution of this memo is unlimited.
IESG Note:
The IESG has concerns about this protocol, and expects this document to be replaced relatively soon by a standards track document.
Abstract
The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems. A feature of HTTP is the typing of data representation, allowing systems to be built independently of the data being transferred.
HTTP has been in use by the World-Wide Web since 1990. This specification reflects common usage of the protocol referred to as "HTTP/1.0".
Table of Contents
1. Introduction
1.1 Purpose
1.2 Terminology
1.3 Overall Operation
1.4 HTTP and MIME
2. Notational Conventions and Generic Grammar
2.1 Augmented BNF
2.2 Basic Rules
3. Protocol Parameters
3.1 HTTP Version
3.2 Uniform Resource Identifiers
3.2.1 General Syntax
3.2.2 http URL
Berners-Lee, et al informational [Page 1]
3.3 Date/Time Formats
3.4 Character Sets
3.5 Content Codings
3.6 Media Types
3.6.1 Canonicalization and Text Defaults
3.6.2 Multipart Types
3.7 Product Tokens
4. HTTP Message
4.1 Message Types
4.2 Message Headers
4.3 General Header Fields
5. Request
5.1 Request-Line
5.1.1 Method
5.1.2 Request-URI
5.2 Request Header Fields
6. Response
6.1 Status-Line
6.1.1 Status Code and Reason Phrase
6.2 Response Header Fields
7. Entity
7.1 Entity Header Fields
7.2 Entity Body
7.2.1 Type
8. Method Definitions
8.1 GET
8.2 HEAD
8.3 POST
9. Status Code Definitions
9.1 Informational 1xx
9.2 Successful 2xx
9.3 Redirection 3xx
9.4 Client Error 4xx
9.5 Server Error 5xx
10. Header Field Definitions
10.1 Allow
10.2 Authorization
10.3 Content-Encoding
10.4 Content-Length
10.5 Content-Type
10.6 Date
10.7 Expires
10.8 From
10.9 If-Modified-Since
10.10 Last-Modified
10.11 Location
10.12 Pragma
10.13 Referer
10.14 Server
10.15 User-Agent
10.16 WWW-Authenticate
11. Access Authentication
11.1 Basic Authentication Scheme
12. Security Considerations
12.1 Authentication of Clients
12.2 Safe Methods
12.3 Abuse of Server Log Information
12.4 Transfer of Sensitive Information
12.5 Attacks Based On File and Path Names
13. Acknowledgments
14. References
15. Authors' Addresses
Appendix A. Internet Media Type message/http
Appendix B. Tolerant Applications
Appendix C. Relationship to MIME
C.1 Conversion to Canonical Form
C.2 Conversion of Date Formats
C.3 Introduction of Content-Encoding
C.4 No Content-Transfer-Encoding
C.5 HTTP Header Fields in Multipart Body-Parts
Appendix D. Additional Features
D.1 Additional Request Methods
D.1.1 PUT
D.1.2 DELETE
D.1.3 LINK
D.1.4 UNLINK
D.2 Additional Header Field Definitions
D.2.1 Accept
D.2.2 Accept-Charset
D.2.3 Accept-Encoding
D.2.4 Accept-Language
D.2.5 Content-Language
D.2.6 Link
D.2.7 MIME-Version
D.2.8 Retry-After
D.2.9 Title
D.2.10 URI
1. Introduction
1.1 Purpose
The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems. A feature of HTTP is the typing of data representation, allowing systems to be built independently of the data being transferred.
HTTP has been in use by the World-Wide Web since 1990. This specification reflects common usage of the protocol referred to as "HTTP/1.0".
This specification describes the features that seem to be consistently implemented in most HTTP/1.0 clients and servers. The specification is split into two parts: features for which implementations are usually consistent are described in the main body of this document, and features with few or inconsistent implementations are listed in Appendix D.
Practical information systems require more functionality than simple retrieval, including search, front-end update, and annotation. HTTP allows an open-ended set of methods to be used to indicate the purpose of a request. It builds on the discipline of reference provided by the Uniform Resource Identifier (URI) [2], as a location (URL) [4] or name (URN) [16], for indicating the resource on which a method is to be applied. Messages are passed in a format similar to that used by Internet Mail [7] and the Multipurpose Internet Mail Extensions (MIME) [5].
HTTP is also used as a generic protocol for communication between user agents and proxies/gateways to other Internet protocols, such as SMTP [12], NNTP [11], FTP [14], Gopher [1], and WAIS [8]. This allows basic hypermedia access to resources available from diverse applications and simplifies the implementation of user agents.
1.2 Terminology
This specification uses a number of terms to refer to the roles played by participants in, and objects of, the HTTP communication.
Connection
A transport layer virtual circuit established between two application programs for the purpose of communication.
Message
The basic unit of HTTP communication, consisting of a structured sequence of octets matching the syntax defined in Section 4 and transmitted via the connection.
Request
An HTTP request message (as defined in Section 5).
Response
An HTTP response message (as defined in Section 6).
Resource
A network data object or service which can be identified by a URI (Section 3.2).
Entity
A particular representation or rendition of a data resource, or reply from a service resource, that may be enclosed within a request or response message. An entity consists of metainformation in the form of entity headers and content in the form of an entity body.
Client
An application program that establishes connections for the purpose of sending requests.
User Agent
The client which initiates a request. These are often browsers, editors, spiders (web-traversing robots), or other end user tools.
Server
An application program that accepts connections in order to service requests by sending back responses.
Origin Server
The server on which a given resource resides or is to be created.
Proxy
An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or by passing them, with possible translation, on to other servers. A proxy must interpret and, if necessary, rewrite a request message before forwarding it. Proxies are often used as client-side portals through network firewalls and as helper applications for handling requests via protocols not implemented by the user agent.
Gateway
A server which acts as an intermediary for some other server. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the requesting client may not be aware that it is communicating with a gateway. Gateways are often used as server-side portals through network firewalls and as protocol translators for access to resources stored on non-HTTP systems.
Tunnel
An intermediary program which acts as a blind relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, though the tunnel may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connections are closed. Tunnels are used when a portal is necessary and the intermediary cannot, or should not, interpret the relayed communication.
Cache
A program's local store of response messages and the subsystem that controls its message storage, retrieval, and deletion.
A cache stores cachable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests. Any client or server may include a cache, though a cache cannot be used by a server while it is acting as a tunnel.
Any given program may be capable of being both a client and a server; our use of these terms refers only to the role being performed by the program for a particular connection, rather than to the program's capabilities in general. Likewise, any server may act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.
1.3 Overall Operation
The HTTP protocol is based on a request/response paradigm. A client establishes a connection with a server and sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possible body content.
The server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity metainformation, and possible body content.
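As a rough illustration of the exchange just described, the following Python sketch assembles a request (method, URI, and protocol version, then header lines and an optional body) and splits a status line into its parts. The helper names `build_request` and `parse_status_line` are hypothetical and are not defined by this specification; the exact Request-Line and Status-Line grammars are given in Sections 5 and 6.

```python
def build_request(method, uri, headers=None, body=b""):
    """Assemble a request: request line, header lines, empty line, body."""
    lines = [f"{method} {uri} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # CRLF ends each line; an empty line ends the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("iso-8859-1") + body


def parse_status_line(line):
    """Split a status line into (protocol version, status code, reason text)."""
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) < 2:
        raise ValueError("not a status line: %r" % line)
    reason = parts[2] if len(parts) == 3 else ""
    return parts[0], int(parts[1]), reason
```

For example, `parse_status_line("HTTP/1.0 404 Not Found")` yields the version string, the integer code 404, and the reason text as separate values.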
Most HTTP communication is initiated by a user agent and consists of a request to be applied to a resource on some origin server. In the simplest case, this may be accomplished via a single connection (V) between the user agent (UA) and the origin server (O).
Request chain ------------------------>
UA ------------------- V ------------------ O
<----------------------- Response Chain
A more complicated situation occurs when one or more intermediaries are present in the request/response chain. There are three common forms of intermediary: proxy, gateway, and tunnel.
A proxy is a forwarding agent, receiving requests for a URI in its absolute form, rewriting all or parts of the message, and forwarding the reformatted request toward the server identified by the URI.
A gateway is a receiving agent, acting as a layer above some other server(s) and, if necessary, translating the requests to the underlying server's protocol.
A tunnel acts as a relay point between two connections without changing the messages. Tunnels are used when the communication needs to pass through an intermediary (such as a firewall) even when the intermediary cannot understand the contents of the messages.
Request chain -------------------------------------->
UA ----- V ----- A ----- V ----- B ----- V ----- C ----- V ----- O
<------------------------------------- Response Chain
The figure above shows three intermediaries (A, B, and C) between the user agent and the origin server. A request or response message that travels the whole chain must pass through four separate connections. This distinction is important because some HTTP communication options may apply only to the connection with the nearest non-tunnel neighbor, only to the end-points of the chain, or to all connections along the chain. Although the diagram is linear, each participant may be engaged in multiple, simultaneous communications. For example, B may be receiving requests from many clients other than A, and/or forwarding requests to servers other than C, at the same time that it is handling A's request.
Any party to the communication which is not acting as a tunnel may employ an internal cache for handling requests. The effect of a cache is that the request/response chain is shortened if one of the participants along the chain has a cached response applicable to that request. The following illustrates the resulting chain if B has a cached copy of an earlier response from O (via C) for a request which has not been cached by UA or A.
Request chain ---------->
UA ----- V ----- A ----- V ----- B - - - - - C - - - - - O
<--------- Response Chain
Not all responses are cachable, and some requests may contain modifiers which place special requirements on cache behavior. Some HTTP/1.0 applications use heuristics to decide what is or is not a "cachable" response, but these rules are not standardized.
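The shortened chain can be pictured with a small simulation. The sketch below is only a toy model under stated assumptions: the `Node` class is hypothetical, every response is treated as cachable (which, as noted above, is not true in general), and cache keys are bare URIs.

```python
class Node:
    """One participant in a request/response chain (hypothetical model)."""

    def __init__(self, name, upstream=None, caching=False):
        self.name = name
        self.upstream = upstream                # next hop toward the origin server
        self.cache = {} if caching else None    # None means this node has no cache

    def handle(self, uri, visited=None):
        visited = (visited or []) + [self.name]
        if self.upstream is None:               # origin server: produce the response
            return f"body of {uri}", visited
        if self.cache is not None and uri in self.cache:
            return self.cache[uri], visited     # the chain is shortened here
        body, visited = self.upstream.handle(uri, visited)
        if self.cache is not None:
            self.cache[uri] = body              # simplification: cache everything
        return body, visited


# Build UA -> A -> B -> C -> O, where only B has a cache.
origin = Node("O")
c = Node("C", upstream=origin)
b = Node("B", upstream=c, caching=True)
a = Node("A", upstream=b)
ua = Node("UA", upstream=a)

first = ua.handle("/doc")    # travels the whole chain
second = ua.handle("/doc")   # answered from B's cache
```

The second call stops at B, which corresponds to the last diagram: C and O are never contacted.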
On the Internet, HTTP communication generally takes place over TCP/IP connections. The default port is TCP 80 [15], but other ports can be used. This does not preclude HTTP from being implemented on top of any other protocol on the Internet, or on other networks. HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used. The mapping of the HTTP/1.0 request and response structures onto the transport data units of the protocol in question is outside the scope of this specification.
Except for experimental applications, current practice requires that the connection be established by the client prior to each request and closed by the server after sending the response. Both clients and servers should be aware that either party may close the connection prematurely, due to user action, automated time-out, or program failure, and should handle such closing in a predictable fashion. In any case, the closing of the connection by either or both parties always terminates the current request, regardless of its status.
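The connection-per-request practice described above can be sketched with ordinary TCP sockets. This is a minimal illustration, not a conforming implementation: the function names are hypothetical, the toy server answers every request with a fixed response, and the whole exchange runs over the loopback interface on a port chosen by the operating system.

```python
import socket
import threading


def serve_once(listener):
    """Accept one connection, answer one request, then close it."""
    conn, _addr = listener.accept()
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:      # read until the end of the headers
            chunk = conn.recv(1024)
            if not chunk:                   # client closed early: the request ends
                return
            data += chunk
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nhello")
    # Leaving the with-block closes the connection, which ends the response.


def fetch(host, port, path):
    """Open a fresh connection, send one request, read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:                   # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo():
    """Run the toy server on a loopback port and fetch one response."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.settimeout(5)              # do not wait forever if no client connects
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
        worker.start()
        try:
            return fetch("127.0.0.1", port, "/")
        finally:
            worker.join()
```

Note that the client learns where the response ends only from the server closing the connection, so an early close by either side is indistinguishable from a terminated request unless other information (such as a length header) is available.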
1.4 HTTP and MIME
HTTP/1.0 uses many of the constructs defined for MIME, as defined in RFC 1521 [5]. Appendix C describes the ways in which the context of HTTP allows for different use of Internet Media Types than is typically found in Internet mail, and gives the rationale for those differences.
2. Notational Conventions and Generic Grammar
2.1 Augmented BNF
All of the mechanisms specified in this document are described in both prose and an augmented Backus-Naur Form (BNF) similar to that used by RFC 822 [7]. Implementors will need to be familiar with the notation in order to understand this specification. The augmented BNF includes the following constructs:
name = definition
The name of a rule is simply the name itself (without any enclosing "<" and ">") and is separated from its definition by the equal character "=". Whitespace is only significant in that indentation of continuation lines is used to indicate a rule definition that spans more than one line. Certain basic rules are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle brackets are used within definitions whenever their presence will facilitate discerning the use of rule names.
"literal"
Quotation marks surround literal text. Unless stated otherwise, the text is case-insensitive.
rule1 | rule2
Elements separated by a bar ("|") are alternatives; for example, "yes | no" will accept yes or no.
(rule1 rule2)
Elements enclosed in parentheses are treated as a single element. Thus, "(element1 (element2 | element3) element4)" allows the sequences "element1 element2 element4" and "element1 element3 element4".
*rule
The character "*" preceding an element indicates repetition. The full form is "<n>*<m>element", indicating at least <n> and at most <m> occurrences of element. Default values are 0 and infinity, so that "*(element)" allows any number, including zero; "1*element" requires at least one; and "1*2element" allows one or two.
[rule]
Square brackets enclose optional elements; "[element1 element2]" is equivalent to "*1(element1 element2)".
N rule
Specific repetition: "<n>(element)" is equivalent to "<n>*<n>(element)"; that is, exactly <n> occurrences of (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three alphabetic characters.
#rule
A construct "#" is defined, similar to "*", for defining lists of elements. The full form is "<n>#<m>element", indicating at least <n> and at most <m> elements, each separated by one or more commas (",") and optional linear whitespace (LWS).
Null elements are allowed wherever this construct is used, but they do not contribute to the count of elements present. That is, "(element), , (element)" is permitted, but counts as only two elements. Therefore, where at least one element is required, at least one non-null element must be present. Default values are 0 and infinity, so that "#(element)" allows any number, including zero; "1#element" requires at least one; and "1#2element" allows one or two.
; comment
A semicolon, set off some distance to the right of rule text, starts a comment that continues to the end of the line.
implied *LWS
The grammar described by this specification is word-based. Except where noted otherwise, linear whitespace (LWS) can be included between any two adjacent words (token or quoted-string), and between adjacent tokens and delimiters (tspecials), without changing the interpretation of a field. At least one delimiter (tspecials) must exist between any two tokens, since they would otherwise be interpreted as a single token. However, applications should attempt to follow "common form" when generating HTTP constructs, since there exist some implementations that fail to accept anything beyond the common forms.
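To make the "#" list construct concrete, here is a small Python sketch of how a recipient might split such a list and apply the element counts. The function name `parse_list` and its parameters are hypothetical. The sketch assumes elements contain no commas of their own (it does not handle a comma inside a quoted-string).

```python
def parse_list(value, minimum=0, maximum=None):
    """Split a '#'-style list: comma-separated, optional LWS, null elements ignored."""
    # Strip the optional linear whitespace around each element.
    elements = [item.strip(" \t\r\n") for item in value.split(",")]
    # Null elements are allowed but do not count toward the total.
    elements = [item for item in elements if item]
    if len(elements) < minimum:
        raise ValueError(f"expected at least {minimum} element(s)")
    if maximum is not None and len(elements) > maximum:
        raise ValueError(f"expected at most {maximum} element(s)")
    return elements
```

With the defaults, the function behaves like "#element" (any number, including zero); `parse_list(value, 1)` corresponds to "1#element", and `parse_list(value, 1, 2)` to "1#2element".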
2.2 Basic Rules
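The following rules are used throughout this specification to describe basic parsing constructs. The US-ASCII coded character set is defined by [17].
OCTET   = <any 8-bit sequence of data>
CHAR    = <any US-ASCII character (octets 0 - 127)>
UPALPHA = <any US-ASCII uppercase letter "A".."Z">
LOALPHA = <any US-ASCII lowercase letter "a".."z">
ALPHA   = UPALPHA | LOALPHA
DIGIT   = <any US-ASCII digit "0".."9">
CTL     = <any US-ASCII control character (octets 0 - 31) and DEL (127)>
CR      = <US-ASCII CR, carriage return (13)>
LF      = <US-ASCII LF, linefeed (10)>
SP      = <US-ASCII SP, space (32)>
HT      = <US-ASCII HT, horizontal-tab (9)>
<">     = <US-ASCII double-quote mark (34)>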
HTTP/1.0 defines the octet sequence CR LF as the end-of-line marker for all protocol elements except the Entity-Body (see Appendix B for tolerant applications). The end-of-line marker within an Entity-Body is defined by its associated media type, as described in Section 3.6.
CRLF = CR LF
HTTP/1.0 headers may be folded onto multiple lines if each continuation line begins with a space or horizontal tab. All linear whitespace (LWS), including folding, has the same semantics as SP.
LWS = [CRLF] 1*( SP | HT )
However, folding of header lines is not expected by some applications, and should not be generated by HTTP/1.0 applications.
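A recipient that wants to tolerate folded headers can undo the folding before splitting the header block into lines. The following Python sketch is one possible approach; the name `unfold` is hypothetical, and the input is assumed to be the header section already decoded to text, with CRLF line endings.

```python
import re

# CRLF followed by at least one SP or HT marks a folded continuation line.
_FOLD = re.compile(r"\r\n[ \t]+")


def unfold(header_block):
    """Replace each fold with a single SP, then split into header lines."""
    unfolded = _FOLD.sub(" ", header_block)
    return [line for line in unfolded.split("\r\n") if line]
```

Because folding has the same semantics as SP, replacing each fold with one space leaves the field value unchanged in meaning.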
The TEXT rule is only used for descriptive field contents and values that are not intended to be interpreted by the message parser. Words of *TEXT may contain octets from character sets other than US-ASCII.
TEXT = <any OCTET except CTLs, but including LWS>
Recipients of header field TEXT containing octets outside the US-ASCII character set may assume that they represent ISO-8859-1 characters.
Hexadecimal numeric characters are used in several protocol elements.
HEX = "A" | "B" | "C" | "D" | "E" | "F"
    | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT
Many HTTP/1.0 header field values consist of words separated by LWS or special characters. These special characters must be in a quoted string to be used within a parameter value.
word = token | quoted-string
token     = 1*<any CHAR except CTLs or tspecials>
tspecials = "(" | ")" | "<" | ">" | "@"
          | "," | ";" | ":" | "\" | <">
          | "/" | "[" | "]" | "?" | "="
          | "{" | "}" | SP | HT
Comments may be included in some HTTP header fields by surrounding the comment text with parentheses. Comments are only allowed in fields containing "comment" as part of their field value definition. In all other fields, parentheses are considered part of the field value.
comment = "(" *( ctext | comment ) ")"
ctext   = <any TEXT excluding "(" and ")">
A string of text is parsed as a single word if it is quoted using double-quote marks.
quoted-string = ( <"> *(qdtext) <"> )
qdtext        = <any CHAR except <"> and CTLs, but including LWS>
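The token and quoted-string rules can be approximated with regular expressions. The sketch below is illustrative only: the names `TOKEN`, `QUOTED`, `is_token`, and `split_words` are hypothetical, and the quoted-string pattern assumes the value has already been unfolded, so the only whitespace it accepts inside quotes is SP and HT.

```python
import re

# token: one or more US-ASCII characters that are neither CTLs nor tspecials.
TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"
# quoted-string: double quotes around US-ASCII text other than <"> and CTLs
# (HT and SP are accepted as the LWS that qdtext allows).
QUOTED = r'"[\t\x20\x21\x23-\x7e]*"'
_WORD = re.compile("(?:" + TOKEN + "|" + QUOTED + ")")


def is_token(text):
    """Return True if the whole string is a single token."""
    return re.fullmatch(TOKEN, text) is not None


def split_words(value):
    """Return the words (tokens and quoted-strings) found in a field value."""
    return _WORD.findall(value)
```

In `split_words('attr="a value; with specials"')`, the quoted part comes back as one word even though it contains a space and a semicolon, which is the reason special characters must be quoted inside parameter values.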
3. Protocol Parameters
3.1 HTTP Version
HTTP uses a "<major>.<minor>" numbering scheme to indicate versions of the protocol.
The version of an HTTP message is indicated by an HTTP-Version field in the first line of the message. If the protocol version is not specified, the recipient must assume that the message is in the simple HTTP/0.9 format.
HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
Note that the major and minor numbers should be treated as separate integers and that each may be incremented higher than a single digit. Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is lower than HTTP/12.3. Leading zeros should be ignored by recipients and never generated by senders.
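The comparison rule is easy to get wrong if the version is treated as a decimal fraction or compared as a string. A minimal Python sketch, assuming a hypothetical helper named `parse_http_version`, parses the two numbers as separate integers and lets tuple comparison do the rest:

```python
import re

_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)")


def parse_http_version(text):
    """Return (major, minor) as integers for an HTTP-Version string."""
    match = _VERSION.fullmatch(text)
    if match is None:
        raise ValueError("not an HTTP-Version: %r" % text)
    # int() also discards leading zeros, as recipients are asked to do.
    return int(match.group(1)), int(match.group(2))
```

Tuples compare element by element, so `parse_http_version("HTTP/2.4") < parse_http_version("HTTP/2.13")` holds, whereas comparing 2.4 and 2.13 as floating-point numbers would give the opposite answer.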
This document defines both the 0.9 and 1.0 versions of the HTTP protocol. Applications sending Full-Request or Full-Response messages, as defined by this specification, must include an HTTP-Version of "HTTP/1.0".
HTTP/1.0 servers must:
o recognize the format of the Request-Line for HTTP/0.9 and HTTP/1.0 requests;
o understand any valid request in the format of HTTP/0.9 or HTTP/1.0;
o respond appropriately with a message in the same protocol version used by the client.
HTTP/1.0 clients must:
o recognize the format of the Status-Line for HTTP/1.0 responses;
o understand any valid response in the format of HTTP/0.9 or HTTP/1.0.
Proxy and gateway applications must be careful in forwarding requests that are received in a format different than that of the application's native HTTP version. Since the protocol version indicates the protocol capability of the sender, a proxy/gateway must never send a message with a version indicator which is greater than its native version; if a higher version request is received, the proxy/gateway must either downgrade the request version or respond with an error. Requests with a version lower than that of the application's native format may be upgraded before being forwarded.
The proxy/gateway's response to that request must follow the server requirements listed above.
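The forwarding rule for proxies and gateways can be summarized as a small decision function. The sketch below is an interpretation under stated assumptions, not a prescribed algorithm: versions are (major, minor) integer tuples, and the function name `forward_version` and its two policy parameters are hypothetical.

```python
def forward_version(native, received, on_higher="downgrade", upgrade_lower=True):
    """Choose the version a proxy/gateway puts on a forwarded request."""
    if received > native:
        # Never forward with a version above the native one:
        # either downgrade the request or refuse it with an error.
        if on_higher == "error":
            raise ValueError("request version is higher than the native version")
        return native
    if received < native and not upgrade_lower:
        return received          # upgrading a lower version is optional
    return native
```

For a proxy whose native version is (1, 0), a request marked (1, 1) is forwarded as (1, 0) under the default policy, and a (0, 9) request is forwarded as (1, 0) unless upgrading is disabled.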
3.2 Uniform Resource Identifiers
URIs have been known by many names: WWW addresses, Universal Document Identifiers, Universal Resource Identifiers [2], and finally the combination of Uniform Resource Locators (URL) [4] and Names (URN) [16]. As far as HTTP is concerned, Uniform Resource Identifiers are simply formatted strings which identify a network resource by name, location, or any other characteristic.
3.2.1 General Syntax
URIs in HTTP can be represented in absolute form or relative to some known base URI [9], depending upon the context of their use. The two forms are differentiated by the fact that absolute URIs always begin with a scheme name followed by a colon.
URI = ( absoluteURI | relativeURI ) [ "#" fragment ]
absoluteURI = scheme ":" *( uchar | reserved )
relativeURI = net_path | abs_path | rel_path
net_path = "//" net_loc [ abs_path ]
abs_path = "/" rel_path
rel_path = [ path ] [ ";" params ] [ "?" query ]
path = fsegment *( "/" segment )
fsegment = 1*pchar
segment = *pchar
params = param *( ";" param )
param = *( pchar | "/" )
scheme = 1*( ALPHA | DIGIT | "+" | "-" | "." )
net_loc = *( pchar | ";" | "?" )
query = *( uchar | reserved )
fragment = *( uchar | reserved )
pchar = uchar | ":" | "@" | "&" | "=" | "+"
uchar = unreserved | escape
unreserved = ALPHA | DIGIT | safe | extra | national
escape = "%" HEX HEX
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+"
extra = "!" | "*" | "'" | "(" | ")" | ","
safe = "$" | "-" | "_" | "."
unsafe = CTL | SP | <"> | "#" | "%" | "<" | ">"
national = <any OCTET excluding ALPHA, DIGIT, reserved, extra, safe, and unsafe>

For definitive information on URL syntax and semantics, see RFC 1738 [4] and RFC 1808 [9]. The BNF above includes national characters not allowed in valid URLs as specified by RFC 1738, since HTTP servers are not restricted in the set of unreserved characters allowed to represent the rel_path part of addresses, and HTTP proxies may receive requests for URIs not defined by RFC 1738.

3.2.2 http URL

The "http" scheme is used to locate network resources via the HTTP protocol. This section defines the scheme-specific syntax and semantics for http URLs.

http_URL = "http:" "//" host [ ":" port ] [ abs_path ]
host = <a legal Internet host domain name or IP address, as defined by Section 2.1 of RFC 1123>
port = *DIGIT

If the port is empty or not given, port 80 is assumed. The semantics are that the identified resource is located at the server listening for TCP connections on that port of that host, and the Request-URI for the resource is abs_path. If the abs_path is not present in the URL, it must be given as "/" when used as a Request-URI (Section 5.1.2).

Note: Although the HTTP protocol is independent of the transport layer protocol, the http URL only identifies resources by their TCP location, and thus non-TCP resources must be identified by some other URI scheme.

The canonical form for "http" URLs is obtained by converting any uppercase characters in host to their lowercase equivalent (hostnames are case-insensitive), eliding the [ ":" port ] if the port is 80, and replacing an empty abs_path with "/".

3.3 Date/Time Formats

HTTP/1.0 applications have historically allowed three different formats for the representation of date/time stamps:

Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov  6 08:49:37 1994 ; ANSI C's asctime() format

The first format is preferred as an Internet standard and represents a fixed-length subset of that defined by RFC 1123 [6].
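As an illustration of the three timestamp layouts shown above, the following Python sketch tries each layout in turn when parsing and always emits the first (RFC 1123) layout when generating. The names parse_http_date and format_http_date are hypothetical, and the sketch assumes the default C/English locale so that day and month names match.

```python
from datetime import datetime

# One strptime pattern per layout; order follows the listing above.
_HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850 (two-digit year)
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime()
)


def parse_http_date(text):
    """Return a naive datetime, to be interpreted as GMT (hypothetical helper)."""
    for pattern in _HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(text, pattern)
        except ValueError:
            continue  # try the next layout
    raise ValueError("unrecognized HTTP-date: %r" % (text,))


def format_http_date(moment):
    """Generate only the preferred RFC 1123 layout (moment must be in GMT)."""
    return moment.strftime(_HTTP_DATE_FORMATS[0])
```

A recipient written this way accepts all three layouts, while a sender never produces the asctime() layout.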
The second format is in common use, but is based on the obsolete RFC 850 [10] date format and lacks a four-digit year. HTTP/1.0 clients and servers that parse a date value should accept all three formats, though they must never generate the third (asctime) format.

Note: Recipients of date values are encouraged to be robust in accepting date values that may have been generated by non-HTTP applications, as is sometimes the case when retrieving or posting messages via proxies or gateways to SMTP or NNTP.

All HTTP/1.0 date/time stamps must be represented in Universal Time (UT), also known as Greenwich Mean Time (GMT), without exception. This is indicated in the first two formats by the inclusion of "GMT" as the three-letter abbreviation for time zone, and should be assumed when reading the asctime format.

HTTP-date = rfc1123-date | rfc850-date | asctime-date
rfc1123-date = wkday "," SP date1 SP time SP "GMT"
rfc850-date = weekday "," SP date2 SP time SP "GMT"
asctime-date = wkday SP date3 SP time SP 4DIGIT
date1 = 2DIGIT SP month SP 4DIGIT ; day month year (e.g., 02 Jun 1982)
date2 = 2DIGIT "-" month "-" 2DIGIT ; day-month-year (e.g., 02-Jun-82)
date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) ; month day (e.g., Jun  2)
time = 2DIGIT ":" 2DIGIT ":" 2DIGIT ; 00:00:00 - 23:59:59
wkday = "Mon" | "Tue" | "Wed" | "Thu" | "Fri" | "Sat" | "Sun"
weekday = "Monday" | "Tuesday" | "Wednesday" | "Thursday" | "Friday" | "Saturday" | "Sunday"
month = "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" | "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"

Note: HTTP requirements for the date/time stamp format apply only to their usage within the protocol stream. Clients and servers are not required to use these formats for user presentation, request logging, etc.

3.4 Character Sets

HTTP uses the same definition of the term "character set" as that described for MIME:

The term "character set" is used in this document to refer to a method used with one or more tables to convert a sequence of octets into a sequence of characters.
Note that unconditional conversion in the other direction is not required, in that not all characters may be available in a given character set and a character set may provide more than one sequence of octets to represent a particular character. This definition is intended to allow various kinds of character encodings, from simple single-table mappings such as US-ASCII to complex table switching methods such as those that use ISO 2022's techniques. However, the definition associated with a MIME character set name must fully specify the mapping to be performed from octets to characters. In particular, use of external profiling information to determine the exact mapping is not permitted.

Note: This use of the term "character set" is more commonly referred to as a "character encoding." However, since HTTP and MIME share the same registry, it is important that the terminology also be shared.

HTTP character sets are identified by case-insensitive tokens. The complete set of tokens is defined by the IANA Character Set registry [15]. However, because that registry does not define a single, consistent token for each character set, we define here the preferred names for those character sets most likely to be used with HTTP entities. These character sets include those registered by RFC 1521 [5], namely the US-ASCII [17] and ISO-8859 [18] character sets, and other names specifically recommended for use within MIME charset parameters.

charset = "US-ASCII"
        | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
        | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
        | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
        | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
        | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
        | token

Although HTTP allows an arbitrary token to be used as a charset value, any token that has a predefined value within the IANA Character Set registry [15] must represent the character set defined by that registry. Applications should limit their use of character sets to those defined by the IANA registry.
The character set of an entity body should be labelled as the lowest common denominator of the character codes used within that body, with the exception that no label is preferred over the labels US-ASCII or ISO-8859-1.

3.5 Content Codings

Content coding values are used to indicate an encoding transformation that has been applied to a resource. Content codings are primarily used to allow a document to be compressed or encrypted without losing the identity of its underlying media type. Typically, the resource is stored in this encoding and only decoded before rendering or analogous usage.

content-coding = "x-gzip" | "x-compress" | token

Note: For future compatibility, HTTP/1.0 applications should consider "gzip" and "compress" to be equivalent to "x-gzip" and "x-compress", respectively.

All content-coding values are case-insensitive. HTTP/1.0 uses content-coding values in the Content-Encoding (Section 10.3) header field. Although the value describes the content-coding, what is more important is that it indicates what decoding mechanism will be required to remove the encoding. Note that a single program may be capable of decoding multiple content-coding formats. Two values are defined by this specification:

x-gzip
An encoding format produced by the file compression program "gzip" (GNU zip) developed by Jean-loup Gailly. This format is typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC.

x-compress
The encoding format produced by the file compression program "compress". This format is an adaptive Lempel-Ziv-Welch coding (LZW).

Note: Use of program names for the identification of encoding formats is not desirable and should be discouraged for future encodings. Their use here is representative of historical practice, not good design.

3.6 Media Types

HTTP uses Internet Media Types [13] in the Content-Type header field (Section 10.5) in order to provide open and extensible data typing.

media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token

Parameters may follow the type/subtype in the form of attribute/value pairs.
parameter = attribute "=" value
attribute = token
value = token | quoted-string

The type, subtype, and parameter attribute names are case-insensitive. Parameter values may or may not be case-sensitive, depending on the semantics of the parameter name. LWS must not be generated between the type and subtype, nor between an attribute and its value. Upon receipt of a media type with an unrecognized parameter, a user agent should treat the media type as if the unrecognized parameter and its value were not present.

Some older HTTP applications do not recognize media type parameters. HTTP/1.0 applications should only use media type parameters when they are necessary to define the content of a message.

Media-type values are registered with the Internet Assigned Number Authority (IANA [15]). The media type registration process is outlined in RFC 1590 [13]. Use of non-registered media types is discouraged.

3.6.1 Canonicalization and Text Defaults

Internet media types are registered with a canonical form. In general, an Entity-Body transferred via HTTP must be represented in the appropriate canonical form prior to its transmission. If the body has been encoded with a Content-Encoding, the underlying data should be in canonical form prior to being encoded.

Media subtypes of the "text" type use CRLF as the text line break when in canonical form. However, HTTP allows the transport of text media with plain CR or LF alone representing a line break when used consistently within the Entity-Body. HTTP applications must accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP.

In addition, if the text media is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks. This flexibility regarding line breaks applies only to text media in the Entity-Body.
A bare CR or LF should not be substituted for CRLF within any of the HTTP control structures (such as header fields and multipart boundaries).

The "charset" parameter is used with some media types to define the character set (Section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets must be labelled with an appropriate charset value in order to be consistently interpreted by the recipient.

Note: Many current HTTP servers provide data using charsets other than "ISO-8859-1" without proper labelling. This situation reduces interoperability and is not recommended. To compensate for this, some HTTP user agents provide a configuration option to allow the user to change the default interpretation of the media type character set when no charset parameter is given.

3.6.2 Multipart Types

MIME provides for a number of "multipart" types, which are encapsulations of several entities within a single message's Entity-Body. The multipart types registered by IANA [15] do not have any special meaning for HTTP/1.0, though user agents may need to understand each type in order to correctly interpret the purpose of each body-part. An HTTP user agent should follow the same or similar behavior as a MIME user agent does upon receipt of a multipart type. HTTP servers should not assume that all HTTP clients are prepared to handle multipart types.

All multipart types share a common syntax and must include a boundary parameter as part of the media type value. The message body is itself a protocol element and must therefore use only CRLF to represent line breaks between body-parts. Multipart body-parts may contain HTTP header fields which are significant to the meaning of that part.
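Returning to the media-type grammar of Section 3.6 and the text default of Section 3.6.1, the following Python sketch splits a media type into its parts and applies the "ISO-8859-1" default for "text" subtypes. The function names are hypothetical, and the sketch makes a simplifying assumption: it splits on every ";" and therefore does not handle a quoted-string value that itself contains a semicolon.

```python
def parse_media_type(value):
    """Return (type, subtype, params) for 'type/subtype; attr=value ...'.

    Hypothetical helper; quoted-strings containing ';' are not supported.
    """
    head, *rest = value.split(";")
    type_, slash, subtype = head.strip().partition("/")
    if not slash or not type_ or not subtype:
        raise ValueError("not a media-type: %r" % (value,))
    params = {}
    for item in rest:
        attribute, equals, param_value = item.strip().partition("=")
        if not equals:
            continue  # skip fragments that are not attribute=value
        param_value = param_value.strip()
        if len(param_value) >= 2 and param_value[0] == param_value[-1] == '"':
            param_value = param_value[1:-1]  # strip quoted-string quotes
        # Attribute names are case-insensitive; values are kept as sent.
        params[attribute.strip().lower()] = param_value
    return type_.lower(), subtype.lower(), params


def effective_charset(value):
    """Explicit charset if given, else ISO-8859-1 for text/*, else None."""
    type_, _subtype, params = parse_media_type(value)
    if "charset" in params:
        return params["charset"]
    return "ISO-8859-1" if type_ == "text" else None
```

Lowercasing the type, subtype, and attribute names reflects their case-insensitivity, while parameter values are left untouched because their case sensitivity depends on the parameter.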
3.7 Product Tokens

Product tokens are used to allow communicating applications to identify themselves via a simple product token, with an optional slash and version designator. Most fields using product tokens also allow subproducts which form a significant part of the application to be listed, separated by whitespace. By convention, the products are listed in order of their significance for identifying the application.

product = token ["/" product-version]
product-version = token

Examples:

User-Agent: CERN-LineMode/2.15 libwww/2.17b3
Server: Apache/0.8.4

Product tokens should be short and to the point; use of them for advertising or other non-essential information is explicitly forbidden. Although any token character may appear in a product-version, this token should only be used for a version identifier (i.e., successive versions of the same product should only differ in the product-version portion of the product value).

4. HTTP Message

4.1 Message Types

HTTP messages consist of requests from client to server and responses from server to client.

HTTP-message = Simple-Request ; HTTP/0.9 messages
             | Simple-Response
             | Full-Request ; HTTP/1.0 messages
             | Full-Response

Full-Request and Full-Response use the generic message format of RFC 822 [7] for transferring entities. Both messages may include optional header fields (also known as "headers") and an entity body. The entity body is separated from the headers by a null line (i.e., a line with nothing preceding the CRLF).
Full-Request = Request-Line ; Section 5.1
               *( General-Header ; Section 4.3
                | Request-Header ; Section 5.2
                | Entity-Header ) ; Section 7.1
               CRLF
               [ Entity-Body ] ; Section 7.2

Full-Response = Status-Line ; Section 6.1
                *( General-Header ; Section 4.3
                 | Response-Header ; Section 6.2
                 | Entity-Header ) ; Section 7.1
                CRLF
                [ Entity-Body ] ; Section 7.2

Simple-Request and Simple-Response do not allow the use of any header information and are limited to a single request method (GET).

Simple-Request = "GET" SP Request-URI CRLF
Simple-Response = [ Entity-Body ]

Use of the Simple-Request format is discouraged because it prevents the server from identifying the media type of the returned entity.

4.2 Message Headers

HTTP header fields, which include General-Header (Section 4.3), Request-Header (Section 5.2), Response-Header (Section 6.2), and Entity-Header (Section 7.1) fields, follow the same generic format as that given in Section 3.1 of RFC 822 [7]. Each header field consists of a name followed immediately by a colon (":"), a single space (SP) character, and the field value. Field names are case-insensitive. Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT, though this is not recommended.

HTTP-header = field-name ":" [ field-value ] CRLF
field-name = token
field-value = *( field-content | LWS )
field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, tspecials, and quoted-string>

The order in which header fields are received is not significant. However, it is good practice to send General-Header fields first, followed by Request-Header or Response-Header fields, prior to the Entity-Header fields.

Multiple HTTP-header fields with the same field-name may be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list, i.e., #(values).
It must be possible to combine such multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma.

4.3 General Header Fields

There are a few header fields which have general applicability for both request and response messages, but which do not apply to the entity being transferred. These headers apply only to the message being transmitted.

General-Header = Date ; Section 10.6
               | Pragma ; Section 10.12

General header field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields may be given the semantics of general header fields if all parties in the communication recognize them to be general header fields. Unrecognized header fields are treated as Entity-Header fields.

5. Request

A request message from a client to a server includes, within the first line of that message, the method to be applied to the resource, the identifier of the resource, and the protocol version in use. For backwards compatibility with the more limited HTTP/0.9 protocol, there are two valid formats for an HTTP request:

Request = Simple-Request | Full-Request

Simple-Request = "GET" SP Request-URI CRLF

Full-Request = Request-Line ; Section 5.1
               *( General-Header ; Section 4.3
                | Request-Header ; Section 5.2
                | Entity-Header ) ; Section 7.1
               CRLF
               [ Entity-Body ] ; Section 7.2

If an HTTP/1.0 server receives a Simple-Request, it must respond with an HTTP/0.9 Simple-Response. An HTTP/1.0 client capable of receiving a Full-Response should never generate a Simple-Request.

5.1 Request-Line

The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ends with CRLF. The elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
Request-Line = Method SP Request-URI SP HTTP-Version CRLF

Note that the difference between a Simple-Request and the Request-Line of a Full-Request is the presence of the HTTP-Version field and the availability of methods other than GET.

5.1.1 Method

The Method token indicates the method to be performed on the resource identified by the Request-URI. The method is case-sensitive.

Method = "GET" ; Section 8.1
       | "HEAD" ; Section 8.2
       | "POST" ; Section 8.3
       | extension-method
extension-method = token

The list of methods acceptable by a specific resource can change dynamically; the client is notified through the return code of the response if a method is not allowed on a resource. Servers should return the status code 501 (not implemented) if the method is unrecognized or not implemented.

The methods commonly used by HTTP/1.0 applications are fully defined in Section 8.

5.1.2 Request-URI

The Request-URI is a Uniform Resource Identifier (Section 3.2) and identifies the resource upon which to apply the request.

Request-URI = absoluteURI | abs_path

The two options for Request-URI are dependent on the nature of the request.

The absoluteURI form is only allowed when the request is being made to a proxy. The proxy is requested to forward the request and return the response. If the request is GET or HEAD and a prior response is cached, the proxy may use the cached message if it passes any restrictions in the Expires header field. Note that the proxy may forward the request on to another proxy or directly to the server specified by the absoluteURI. In order to avoid request loops, a proxy must be able to recognize all of its server names, including any aliases, local variations, and the numeric IP address.
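The Request-Line grammar of Section 5.1 and the two Request-URI forms can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a complete parser: the function names are hypothetical, the line is assumed to have been decoded to a string already, and the Request-URI is only classified by its leading characters rather than validated against the full grammar of Section 3.2.1.

```python
import re

# scheme = 1*( ALPHA | DIGIT | "+" | "-" | "." ), followed by ":"
_SCHEME_PREFIX = re.compile(r"[A-Za-z0-9+.-]+:")


def parse_request_line(line):
    """Return (method, request_uri, http_version).

    http_version is None for an HTTP/0.9 Simple-Request.
    """
    if not line.endswith("\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    body = line[:-2]
    if "\r" in body or "\n" in body:
        raise ValueError("bare CR or LF inside Request-Line")
    parts = body.split(" ")  # elements are separated by single SP
    if "" in parts or len(parts) not in (2, 3):
        raise ValueError("malformed Request-Line")
    if len(parts) == 2:
        if parts[0] != "GET":
            raise ValueError("Simple-Request allows only GET")
        return parts[0], parts[1], None
    return parts[0], parts[1], parts[2]


def request_uri_form(uri):
    """Classify a Request-URI as 'abs_path' or 'absoluteURI'."""
    if uri.startswith("/"):
        return "abs_path"
    if _SCHEME_PREFIX.match(uri):
        return "absoluteURI"
    raise ValueError("neither absoluteURI nor abs_path")
```

A proxy would expect the absoluteURI form, while an origin server or gateway would normally see only abs_path.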
An example Request-Line would be:

GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.0

The most common form of Request-URI is that used to identify a resource on an origin server or gateway. In this case, only the absolute path of the URI is transmitted (see Section 3.2.1, abs_path). For example, a client wishing to retrieve the resource above directly from the origin server would create a TCP connection to port 80 of the host "www.w3.org" and send the line:

GET /pub/WWW/TheProject.html HTTP/1.0

followed by the remainder of the Full-Request. Note that the absolute path cannot be empty; if none is present in the original URI, it must be given as "/" (the server root).

The Request-URI is transmitted as an encoded string, where some characters may be escaped using the "% HEX HEX" encoding defined by RFC 1738 [4]. The origin server must decode the Request-URI in order to properly interpret the request.

5.2 Request Header Fields

The request header fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method (procedure) invocation.

Request-Header = Authorization ; Section 10.2
               | From ; Section 10.8
               | If-Modified-Since ; Section 10.9
               | Referer ; Section 10.13
               | User-Agent ; Section 10.15

Request-Header field names can be extended reliably only in combination with a change in the protocol version. However, new or experimental header fields may be given the semantics of request header fields if all parties in the communication recognize them to be request header fields. Unrecognized header fields are treated as Entity-Header fields.

6. Response

After receiving and interpreting a request message, a server responds in the form of an HTTP response message.
Response = Simple-Response | Full-Response

Simple-Response = [ Entity-Body ]

Full-Response = Status-Line ; Section 6.1
                *( General-Header ; Section 4.3
                 | Response-Header ; Section 6.2
                 | Entity-Header ) ; Section 7.1
                CRLF
                [ Entity-Body ] ; Section 7.2

A Simple-Response should only be sent in response to an HTTP/0.9 Simple-Request or if the server only supports the more limited HTTP/0.9 protocol. If a client sends an HTTP/1.0 Full-Request and receives a response that does not begin with a Status-Line, it should assume that the response is a Simple-Response and parse it accordingly. Note that the Simple-Response consists only of the entity body and is terminated by the server closing the connection.

6.1 Status-Line

The first line of a Full-Response message is the Status-Line, consisting of the protocol version followed by a numeric status code and its associated textual phrase, with each element separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

Since a status line always begins with the protocol version and status code

"HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP

(e.g., "HTTP/1.0 200 "), the presence of that expression is sufficient to differentiate a Full-Response from a Simple-Response. Although the Simple-Response format may allow such an expression to occur at the beginning of an entity body, and thus cause a misinterpretation of the message if it was given in response to a Full-Request, most HTTP/0.9 servers are limited to responses of type "text/html" and therefore would never generate such a response.

6.1.1 Status Code and Reason Phrase

The Status-Code element is a 3-digit integer result code of the attempt to understand and satisfy the request. The Reason-Phrase is intended to give a short textual description of the Status-Code. The Status-Code is intended for use by automata and the Reason-Phrase is intended for the human user. The client is not required to examine or display the Reason-Phrase.
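The test for telling a Full-Response from a Simple-Response, and the Status-Line grammar of Section 6.1, can be sketched in Python as follows. The function names are hypothetical, and decoding the Reason-Phrase as ISO-8859-1 is an assumption of this sketch.

```python
import re

# "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP
_STATUS_PREFIX = re.compile(rb"HTTP/[0-9]+\.[0-9]+ [0-9]{3} ")

# Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
_STATUS_LINE = re.compile(rb"(HTTP/[0-9]+\.[0-9]+) ([0-9]{3}) ([^\r\n]*)\r\n")


def is_full_response(first_bytes):
    """True if the response starts like a Status-Line, else Simple-Response."""
    return _STATUS_PREFIX.match(first_bytes) is not None


def parse_status_line(line):
    """Return (http_version, status_code, reason_phrase)."""
    match = _STATUS_LINE.fullmatch(line)
    if match is None:
        raise ValueError("not a Status-Line: %r" % (line,))
    version, code, reason = match.groups()
    # The numeric code drives behavior; the phrase is for human readers.
    return version.decode("ascii"), int(code), reason.decode("iso-8859-1")
```

A client can call is_full_response on the first bytes received and fall back to treating the whole stream as an entity body when it returns False.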
The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role. There are 5 values for the first digit:

o 1xx: Informational - Not used, but reserved for future use.
o 2xx: Success - The action was successfully received, understood, and accepted.
o 3xx: Redirection - Further action must be taken in order to complete the request.
o 4xx: Client Error - The request contains bad syntax or cannot be fulfilled.
o 5xx: Server Error - The server failed to fulfill an apparently valid request.

The individual values of the numeric status codes defined for HTTP/1.0, and an example set of corresponding Reason-Phrases, are presented below. The reason phrases listed here are only recommended; they may be replaced by local equivalents without affecting the protocol. These codes are fully defined in Section 9.

Status-Code = "200" ; OK
            | "201" ; Created
            | "202" ; Accepted
            | "204" ; No Content
            | "301" ; Moved Permanently

8.1 GET

The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI. If the Request-URI refers to a data-producing process, it is the produced data which shall be returned as the entity in the response and not the source text of the process, unless that text happens to be the output of the process.

The semantics of the GET method changes to a "conditional GET" if the request message includes an If-Modified-Since header field. A conditional GET method requests that the identified resource be transferred only if it has been modified since the date given by the If-Modified-Since header, as described in Section 10.9. The conditional GET method is intended to reduce network usage by allowing cached entities to be refreshed without requiring multiple requests or transferring unnecessary data.

8.2 HEAD

The HEAD method is identical to GET except that the server must not return any Entity-Body in the response. The metainformation contained in the HTTP headers in response to a HEAD request should be identical to the information sent in response to a GET request.
This method can be used for obtaining metainformation about the resource identified by the Request-URI without transferring the Entity-Body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

There is no "conditional HEAD" request analogous to the conditional GET. If an If-Modified-Since header field is included with a HEAD request, it should be ignored.

8.3 POST

The POST method is used to request that the destination server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions:

o Annotation of existing resources;
o Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;
o Providing a block of data, such as the result of submitting a form [3], to a data-handling process;
o Extending a database through an append operation.

The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI. The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database.

A successful POST does not require that the entity be created as a resource on the origin server or made accessible for future reference. That is, the action performed by the POST method might not result in a resource that can be identified by a URI. In this case, either 200 (ok) or 204 (no content) is the appropriate response status, depending on whether or not the response includes an entity that describes the result.
If a resource has been created on the origin server, the response should be 201 (created) and contain an entity (preferably of type "text/html") which describes the status of the request and refers to the new resource.

A valid Content-Length is required on all HTTP/1.0 POST requests. An HTTP/1.0 server should respond with a 400 (bad request) message if it cannot determine the length of the request message's content.

Applications must not cache responses to a POST request because the application has no way of knowing that the server would return an equivalent response on some future request.

9. Status Code Definitions

Each Status-Code is described below, including a description of which method(s) it can follow and any metainformation required in the response.

9.1 Informational 1xx

This class of status code indicates a provisional response, consisting only of the Status-Line and optional headers, and is terminated by an empty line. HTTP/1.0 does not define any 1xx status codes and they are not a valid response to an HTTP/1.0 request. However, they may be useful for experimental applications which are outside the scope of this specification.

9.2 Successful 2xx

This class of status code indicates that the client's request was successfully received, understood, and accepted.

200 OK

The request has succeeded. The information returned with the response is dependent on the method used in the request, as follows:

GET an entity corresponding to the requested resource is sent in the response;
HEAD the response must only contain the header information and no Entity-Body;
POST an entity describing or containing the result of the action.

201 Created

The request has been fulfilled and resulted in a new resource being created. The newly created resource can be referenced by the URI(s) returned in the entity of the response. The origin server should create the resource before using this Status-Code.
If the action cannot be carried out immediately, the server must include in the response body a description of when the resource will be available; otherwise, the server should respond with 202 (accepted).

Of the methods defined by this specification, only POST can create a resource.

202 Accepted

The request has been accepted for processing, but the processing has not been completed. The request may or may not eventually be acted upon, as it may be disallowed when processing actually takes place. There is no facility for re-sending a status code from an asynchronous operation such as this.

The 202 response is intentionally non-committal. Its purpose is to allow a server to accept a request for some other process (perhaps a batch-oriented process that is only run once per day) without requiring that the user agent's connection to the server persist until the process is completed. The entity returned with this response should include an indication of the request's current status and either a pointer to a status monitor or some estimate of when the user can expect the request to be fulfilled.

204 No Content

The server has fulfilled the request but there is no new information to send back. If the client is a user agent, it should not change its document view from that which caused the request to be generated. This response is primarily intended to allow input for scripts or other actions to take place without causing a change to the user agent's active document view. The response may include new metainformation in the form of entity headers, which should apply to the document currently in the user agent's active view.

9.3 Redirection 3xx

This class of status code indicates that further action needs to be taken by the user agent in order to fulfill the request. The action required may be carried out by the user agent without interaction with the user if and only if the method used in the subsequent request is GET or HEAD. A user agent should never automatically redirect a request more than 5 times, since such redirections usually indicate an infinite loop.

300 Multiple Choices

This response code is not directly used by HTTP/1.0 applications, but serves as the default for interpreting the 3xx class of responses.
The requested resource is available at one or more locations. Unless it was a HEAD request, the response should include an entity containing a list of resource characteristics and locations from which the user or user agent can choose the one most appropriate. If the server has a preferred choice, it should include the URL in a Location field; user agents may use this field value for automatic redirection.

301 Moved Permanently

The requested resource has been assigned a new permanent URL and any future references to this resource should be done using that URL. Clients with link editing capabilities should automatically relink references to the Request-URI to the new reference returned by the server, where possible.

The new URL must be given by the Location field in the response. Unless it was a HEAD request, the Entity-Body of the response should contain a short note with a hyperlink to the new URL.

If the 301 status code is received in response to a request using the POST method, the user agent must not automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

Note: When automatically redirecting a POST request after receiving a 301 status code, some existing user agents will erroneously change it into a GET request.

302 Moved Temporarily

The requested resource resides temporarily under a different URL. Since the redirection may be altered on occasion, the client should continue to use the Request-URI for future requests.

The new URL must be given by the Location field in the response. Unless it was a HEAD request, the Entity-Body of the response should contain a short note with a hyperlink to the new URL.

A 302 status code may also be received in response to a request using the POST method.
In that case, the user agent must not automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

Note: When automatically redirecting a POST request after receiving a 302 status code, some existing user agents will erroneously change it into a GET request.

304 Not Modified

If the client has performed a conditional GET request and access is allowed, but the document has not been modified since the date and time specified in the If-Modified-Since field, the server must respond with this status code and not send an Entity-Body to the client. Header fields contained in the response should only include information which is relevant to cache managers or which may have changed independently of the entity's Last-Modified date. Examples of relevant header fields include: Date, Server, and Expires. A cache should update its cached entity to reflect any new field values given in the 304 response.

9.4 Client Error 4xx

The 4xx class of status code is intended for cases in which the client seems to have erred. If the client has not completed the request when a 4xx code is received, it should immediately cease sending data to the server. Except when responding to a HEAD request, the server should include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These status codes are applicable to any request method.

Note: If the client is sending data, server implementations on TCP should be careful to ensure that the client acknowledges receipt of the packet(s) containing the response prior to closing the input connection. If the client continues sending data to the server after the close, the server's controller will send a reset packet to the client, which may erase the client's unacknowledged input buffers before they can be read and interpreted by the HTTP application.

400 Bad Request

The request could not be understood by the server due to malformed syntax.
The client should not repeat the request without modifications.

401 Unauthorized

The request requires user authentication. The response must include a WWW-Authenticate header field (Section 10.16) containing a challenge applicable to the requested resource. The client may repeat the request with a suitable Authorization header field (Section 10.2). If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user should be presented the entity that was given in the response, since that entity may include relevant diagnostic information. HTTP access authentication is explained in Section 11.

403 Forbidden

The server understood the request, but is refusing to fulfill it. Authorization will not help and the request should not be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it should describe the reason for the refusal in the entity body. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.

404 Not Found

The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. If the server does not wish to make this information available to the client, the status code 403 (forbidden) can be used instead.

9.5 Server Error 5xx

Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request. If the client has not completed the request when a 5xx code is received, it should immediately cease sending data to the server.
Except when responding to a HEAD request, the server should include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These response codes are applicable to any request method and there are no required header fields.

500 Internal Server Error

The server encountered an unexpected condition which prevented it from fulfilling the request.

501 Not Implemented

The server does not support the functionality required to fulfill the request. This is the appropriate response when the server does not recognize the request method and is not capable of supporting it for any resource.

502 Bad Gateway

The server, while acting as a gateway or proxy, received an invalid response from the upstream server it accessed in attempting to fulfill the request.

503 Service Unavailable

The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay.

Note: The existence of the 503 status code does not imply that a server must use it when becoming overloaded. Some servers may wish to simply refuse the connection.

10. Header Field Definitions

This section defines the syntax and semantics of all commonly used HTTP/1.0 header fields. For general and entity header fields, both sender and recipient refer to either the client or the server, depending on who sends and who receives the message.

10.1 Allow

The Allow entity-header field lists the set of methods supported by the resource identified by the Request-URI. The purpose of this field is strictly to inform the recipient of valid methods associated with the resource. The Allow header field is not permitted in a request using the POST method, and thus should be ignored if it is received as part of a POST entity.
    Allow = "Allow" ":" 1#method

Example of use:

    Allow: GET, HEAD

This field cannot prevent a client from trying other methods. However, the indications given by the Allow header field value should be followed. The actual set of allowed methods is defined by the origin server at the time of each request.

A proxy must not modify the Allow header field even if it does not understand all the methods specified, since the user agent may have other means of communicating with the origin server.

The Allow header field does not indicate what methods are implemented by the server.

10.2 Authorization

10.7 Expires

The Expires entity-header field gives the date/time after which the entity should be considered stale. This allows information providers to suggest the volatility of the resource, or a date after which the information may no longer be valid. Applications must not cache this entity beyond the date given. The presence of an Expires field does not imply that the original resource will change or cease to exist at, before, or after that time. However, information providers that know or even suspect that a resource will change by a certain date should include an Expires header with that date. The format is an absolute date and time as defined by HTTP-date in Section 3.3.

    Expires = "Expires" ":" HTTP-date

An example of its use is

    Expires: Thu, 01 Dec 1994 16:00:00 GMT

If the date given is equal to or earlier than the value of the Date header, the recipient must not cache the enclosed entity. If a resource is dynamic by nature, as is the case with many data-producing processes, entities from that resource should be given an appropriate Expires value which reflects that dynamism.

The Expires field cannot be used to force a user agent to refresh its display or reload a resource; its semantics apply only to caching mechanisms, and such mechanisms need only check a resource's expiration status when a new request for that resource is initiated.

User agents often have history mechanisms, such as "Back" buttons and history lists, which can be used to redisplay an entity retrieved earlier in a session.
By default, the Expires field does not apply to history mechanisms. If the entity is still in storage, a history mechanism should display it even if the entity has expired, unless the user has specifically configured the agent to refresh expired history documents.

Note: Applications are encouraged to be tolerant of bad or misinformed implementations of the Expires header. A value of zero (0) or an invalid date format should be considered equivalent to an "expires immediately." Although these values are not legitimate for HTTP/1.0, a robust implementation is always desirable.

10.8 From

The From request-header field, if given, should contain an Internet e-mail address for the human user who controls the requesting user agent. The address should be machine-usable, as defined by mailbox in RFC 822 [7] (as updated by RFC 1123 [6]):

    From = "From" ":" mailbox

An example is:

    From: webmaster@w3.org

This header field may be used for logging purposes and as a means for identifying the source of invalid or unwanted requests. It should not be used as an insecure form of access protection. The interpretation of this field is that the request is being performed on behalf of the person given, who accepts responsibility for the method performed. In particular, robot agents should include this header so that the person responsible for running the robot can be contacted if problems occur on the receiving end.

The Internet e-mail address in this field may be separate from the Internet host which issued the request. For example, when a request is passed through a proxy, the original issuer's address should be used.

Note: The client should not send the From header field without the user's approval, as it may conflict with the user's privacy interests or their site's security policy. It is strongly recommended that the user be able to disable, enable, and modify the value of this field at any time prior to a request.
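As an illustration of the Expires rules in Section 10.7, the following sketch shows one way a cache might decide whether an entity may be stored. It is a simplification under stated assumptions: the function names are hypothetical, only the RFC 1123 date form used in the Expires example is parsed, and weekday and month names are assumed to be matched in an English (C) locale.

```python
# Illustrative sketch; parse_expires and may_cache are hypothetical helper names.
from datetime import datetime
from typing import Optional

HTTP_DATE = "%a, %d %b %Y %H:%M:%S GMT"  # RFC 1123 form, as in the Expires example


def parse_expires(value: str) -> Optional[datetime]:
    """Parse an Expires value; return None for "0" or any other invalid date."""
    try:
        return datetime.strptime(value.strip(), HTTP_DATE)
    except ValueError:
        return None


def may_cache(expires: str, date: datetime) -> bool:
    """Return False when Expires is invalid, or equal to or earlier than Date."""
    parsed = parse_expires(expires)
    if parsed is None:
        return False  # tolerant handling: treat as "expires immediately"
    return parsed > date
```

Treating every unparseable value as already expired follows the tolerance note in Section 10.7; a fuller implementation would also accept the other date forms of Section 3.3.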
10.9 If-Modified-Since

The If-Modified-Since request-header field is used with the GET method to make it conditional: if the requested resource has not been modified since the time specified in this field, a copy of the resource will not be returned from the server; instead, a 304 (not modified) response will be returned without any Entity-Body.

    If-Modified-Since = "If-Modified-Since" ":" HTTP-date

An example of the field is:

    If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT

A conditional GET method requests that the identified resource be transferred only if it has been modified since the date given by the If-Modified-Since header. The algorithm for determining this includes the following cases:

a) If the request would normally result in anything other than a 200 (ok) status, or if the passed If-Modified-Since date is invalid, the response is exactly the same as for a normal GET. A date which is later than the server's current time is invalid.

b) If the resource has been modified since the If-Modified-Since date, the response is exactly the same as for a normal GET.

c) If the resource has not been modified since a valid If-Modified-Since date, the server shall return a 304 (not modified) response.

The purpose of this feature is to allow efficient updates of cached information with a minimum amount of transaction overhead.

10.10 Last-Modified

The Last-Modified entity-header field indicates the date and time at which the sender believes the resource was last modified. The exact semantics of this field are defined in terms of how the recipient should interpret it: if the recipient has a copy of this resource which is older than the date given by the Last-Modified field, that copy should be considered stale.

    Last-Modified = "Last-Modified" ":" HTTP-date

An example of its use is

    Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT

The exact meaning of this header field depends on the implementation of the sender and the nature of the original resource.
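Cases (a) through (c) of the conditional GET in Section 10.9 compare the If-Modified-Since date with the resource's last modification time. A minimal sketch of that decision follows, assuming the dates have already been parsed into datetime values (with None standing for a missing or unparseable date); the function and parameter names are hypothetical.

```python
# Illustrative sketch; conditional_get_status is a hypothetical helper name.
from datetime import datetime
from typing import Optional


def conditional_get_status(normal_status: int,
                           if_modified_since: Optional[datetime],
                           last_modified: datetime,
                           now: datetime) -> int:
    """Return the status code for a GET that carries If-Modified-Since.

    normal_status     -- status an unconditional GET would produce
    if_modified_since -- parsed field value, or None if absent or unparseable
    last_modified     -- last modification time of the resource
    now               -- the server's current time
    """
    # (a) Not a 200 anyway, or the date is invalid (a date later than the
    #     server's current time counts as invalid): behave as a normal GET.
    if normal_status != 200 or if_modified_since is None or if_modified_since > now:
        return normal_status
    # (b) Modified since the given date: behave as a normal GET.
    if last_modified > if_modified_since:
        return normal_status
    # (c) Not modified since a valid date.
    return 304
```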
For files, the Last-Modified value may be just the file system last-modified time. For entities with dynamically included parts, it may be the most recent of the set of last-modify times for its component parts. For database gateways, it may be the last-update timestamp of the record. For virtual objects, it may be the last time the internal state changed.

An origin server must not send a Last-Modified date which is later than the server's time of message origination. In such cases, where the resource's last modification would indicate some time in the future, the server must replace that date with the message origination date.

10.11 Location

The Location response-header field defines the exact location of the resource that was identified by the Request-URI. For 3xx responses, the location must indicate the server's preferred URL for automatic redirection to the resource. Only one absolute URL is allowed.

    Location = "Location" ":" absoluteURI

An example is

    Location: http://www.w3.org/hypertext/www/newlocation.html

10.12 Pragma

The Pragma general-header field is used to include implementation-specific directives that may apply to any recipient along the request/response chain. All pragma directives specify optional behavior from the viewpoint of the protocol; however, some systems may require that behavior be consistent with the directives.

    Pragma           = "Pragma" ":" 1#pragma-directive
    pragma-directive = "no-cache" | extension-pragma
    extension-pragma = token [ "=" word ]

When the "no-cache" directive is present in a request message, an application should forward the request toward the origin server even if it has a cached copy of what is being requested. This allows a client to insist upon receiving an authoritative response to its request. It also allows a client to refresh a cached copy which is known to be corrupted or stale.

Pragma directives must be passed through by a proxy or gateway application, regardless of their significance to that application, since the directives may be applicable to all recipients along the request/response chain.
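The Pragma grammar and the "no-cache" rule can be sketched as follows. This is an approximation under stated assumptions: the function names are hypothetical, directive names are lower-cased for comparison, and a comma inside a quoted extension value is not handled.

```python
# Illustrative sketch; parse_pragma and must_forward_to_origin are hypothetical names.
def parse_pragma(value: str) -> dict:
    """Split a Pragma field value into {directive-name: argument or None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty list elements
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives


def must_forward_to_origin(request_headers: dict) -> bool:
    """True if the request carries Pragma: no-cache, so a cached copy is bypassed."""
    return "no-cache" in parse_pragma(request_headers.get("Pragma", ""))
```

A caching proxy using such a check would still pass the Pragma field itself downstream unchanged.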
Any pragma directive that is not relevant to a recipient should be ignored by that recipient.

10.13 Referer

The Referer request-header field allows the client to specify, for the server's benefit, the address (URI) of the resource from which the Request-URI was obtained. This allows a server to generate lists of back-links to resources for maintenance, logging, optimized caching, etc.

    Referer = "Referer" ":" ( absoluteURI | relativeURI )

Example:

    Referer: http://www.w3.org/hypertext/datasources/overview.html

If a partial URI is given, it should be interpreted relative to the Request-URI. The URI must not include a fragment.

Note: Because the source of a link may be private information or may reveal an otherwise private information source, it is strongly recommended that the user be able to select whether or not the Referer field is sent. For example, a browser client could have a toggle switch for browsing openly/anonymously, which would respectively enable/disable the sending of Referer and From information.

10.14 Server

The Server response-header field contains information about the software used by the origin server to handle the request. The field can contain multiple product tokens (Section 3.7) and comments identifying the server and any significant subproducts. By convention, the product tokens are listed in order of their significance for identifying the application.

    Server = "Server" ":" 1*( product | comment )

Example:

    Server: CERN/3.0 libwww/2.17

If the response is being forwarded through a proxy, the proxy application must not add its data to the product list.

Note: Revealing the specific software version of the server may allow the server machine to become more vulnerable to attacks against software that is known to contain security holes. Server implementors are encouraged to make this field a configurable option.
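A field value of the form 1*( product | comment ) can be split into its parts with a short routine. The sketch below is a simplification under stated assumptions: the function name is hypothetical, a product is taken to be a name with an optional "/" version, comments are parenthesised text, and nested comments are not handled.

```python
# Illustrative sketch; parse_products is a hypothetical helper name.
import re

# Either a parenthesised comment, or a product name with an optional /version.
_ITEM = re.compile(r"\(([^()]*)\)|([^\s()/]+)(?:/([^\s()/]+))?")


def parse_products(value: str) -> list:
    """Return ("product", name, version) and ("comment", text) tuples in order."""
    items = []
    for match in _ITEM.finditer(value):
        if match.group(1) is not None:
            items.append(("comment", match.group(1)))
        else:
            items.append(("product", match.group(2), match.group(3)))
    return items
```

Since real field values do not always follow the product token syntax, a parser like this is best treated as advisory rather than as a validator.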
Note: Some existing servers fail to restrict themselves to the product token syntax within the Server field.

10.15 User-Agent

The User-Agent request-header field contains information about the user agent originating the request. This is for statistical purposes and for automated recognition of user agents, so that responses can be tailored to avoid particular user agent limitations. Although it is not required, user agents should include this field with requests. The field can contain multiple product tokens (Section 3.7) and comments identifying the agent and any subproducts which form a significant part of the user agent. By convention, the product tokens are listed in order of their significance for identifying the application.

    User-Agent = "User-Agent" ":" 1*( product | comment )

Example:

    User-Agent: CERN-LineMode/2.15 libwww/2.17b3

Note: Some current proxy applications append their product information to the list in the User-Agent field. This is not recommended, since it makes machine interpretation of these fields ambiguous.

Note: Some existing clients fail to restrict themselves to the product token syntax within the User-Agent field.

10.16 WWW-Authenticate

The WWW-Authenticate response-header field must be included in 401 (unauthorized) response messages. The field value consists of at least one challenge that indicates the authentication scheme(s) and parameters applicable to the Request-URI.

    WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge

The HTTP access authentication process is described in Section 11. User agents must take special care in parsing the WWW-Authenticate field value if it contains more than one challenge, or if more than one WWW-Authenticate header field is provided, since the contents of a challenge may itself contain a comma-separated list of authentication parameters.
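Because commas separate both challenges and the parameters inside a challenge, a naive split on "," is not enough. The sketch below shows one possible approach, assuming the challenge form of Section 11 (an auth-scheme, then realm and further parameters as name="quoted value" pairs). The function name is hypothetical, and backslash escapes inside quoted values are kept as-is rather than decoded.

```python
# Illustrative sketch; parse_challenges is a hypothetical helper name.
import re

# One list element: an optional leading auth-scheme, then name="quoted value".
_ELEMENT = re.compile(
    r'\s*(?:([^\s,="]+)\s+)?([^\s,="]+)\s*=\s*"((?:[^"\\]|\\.)*)"\s*(?:,|$)'
)


def parse_challenges(value: str) -> list:
    """Return [(scheme, {param: value}), ...] from a WWW-Authenticate field value."""
    challenges = []
    pos = 0
    while pos < len(value):
        match = _ELEMENT.match(value, pos)
        if match is None:
            break  # stop at anything this sketch does not understand
        scheme, name, param_value = match.groups()
        if scheme is not None:
            challenges.append((scheme, {}))  # a scheme token starts a new challenge
        if challenges:
            challenges[-1][1][name.lower()] = param_value
        pos = match.end()
    return challenges
```

An element that begins with a scheme token opens a new challenge; any other name="value" element is attached to the most recent one, which is how a comma inside a realm value and a comma between challenges are told apart.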
11. Access Authentication

HTTP provides a simple challenge-response authentication mechanism which may be used by a server to challenge a client request and by a client to provide authentication information. It uses an extensible, case-insensitive token to identify the authentication scheme, followed by a comma-separated list of attribute-value pairs which carry the parameters necessary for achieving authentication via that scheme.

    auth-scheme = token
    auth-param  = token "=" quoted-string

The 401 (unauthorized) response message is used by an origin server to challenge the authorization of a user agent. This response must include a WWW-Authenticate header field containing at least one challenge applicable to the requested resource.

    challenge   = auth-scheme 1*SP realm *( "," auth-param )
    realm       = "realm" "=" realm-value
    realm-value = quoted-string

The realm attribute (case-insensitive) is required for all authentication schemes which issue a challenge. The realm value (case-sensitive), in combination with the canonical root URL of the server being accessed, defines the protection space. These realms allow the protected resources on a server to be partitioned into a set of protection spaces, each with its own authentication scheme and/or authorization database. The realm value is a string, generally assigned by the origin server, which may have additional semantics specific to the authentication scheme.

A user agent that wishes to authenticate itself with a server (usually, but not necessarily, after receiving a 401 response) may do so by including an Authorization header field with the request. The Authorization field value consists of credentials containing the authentication information of the user agent for the realm of the resource being requested.

    credentials = basic-credentials | ( auth-scheme #auth-param )

The domain over which credentials can be automatically applied by a user agent is determined by the protection space.
If a prior request has been authorized, the same credentials may be reused for all other requests within that protection space for a period of time determined by the authentication scheme, parameters, and/or user preference. Unless otherwise defined by the authentication scheme, a single protection space cannot extend outside the scope of its server.

If the server does not wish to accept the credentials sent with a request, it should return a 403 (forbidden) response.

The HTTP protocol does not restrict applications to this simple challenge-response mechanism for access authentication. Additional mechanisms may be used, such as encryption at the transport level or via message encapsulation, and with additional header fields specifying authentication information. However, these additional mechanisms are not defined by this specification.

Proxies must be completely transparent regarding user agent authentication. That is, they must forward the WWW-Authenticate and Authorization headers untouched, and must not cache the response to a request containing Authorization. HTTP/1.0 does not provide a means for a client to be authenticated with a proxy.

11.1 Basic Authentication Scheme

The "basic" authentication scheme is based on the model that the user agent must authenticate itself with a user-ID and a password for each realm. The realm value should be considered an opaque string which can only be compared for equality with other realms on that server. The server will authorize the request only if it can validate the user-ID and password for the protection space of the Request-URI. There are no optional authentication parameters.

Upon receipt of an unauthorized request for a URI within the protection space, the server should respond with a challenge like the following:

    WWW-Authenticate: Basic realm="WallyWorld"

where "WallyWorld" is the string assigned by the server to identify the protection space of the Request-URI.

To receive authorization, the client sends the user-ID and password, separated by a single colon (":") character, within a base64 [5] encoded string in the credentials.
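The construction of Basic credentials can be sketched in a few lines. The function name is hypothetical, and encoding the user-ID and password as ISO-8859-1 before base64 is an assumption of this sketch, not something stated in this section.

```python
# Illustrative sketch; basic_credentials is a hypothetical helper name.
import base64


def basic_credentials(user_id: str, password: str) -> str:
    """Build an Authorization field value for the Basic scheme."""
    # user-ID and password joined by a single colon (assumed ISO-8859-1 octets)
    userid_password = f"{user_id}:{password}".encode("latin-1")
    # b64encode emits one unbroken line, i.e. no 76-character line wrapping
    cookie = base64.b64encode(userid_password).decode("ascii")
    return "Basic " + cookie
```

For the user-ID "Aladdin" and the password "open sesame" this yields "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==". Because base64 is trivially reversible, the result offers no confidentiality on its own.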
    basic-credentials = "Basic" SP basic-cookie
    basic-cookie      = <base64 [5] encoding of userid-password, except not limited to 76 char/line>
    userid-password   = [ token ] ":" *TEXT

If the user agent wishes to send the user-ID "Aladdin" and password "open sesame", it would use the following header field:

    Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

The basic authentication scheme is a non-secure method of filtering unauthorized access to resources on an HTTP server. It is based on the assumption that the connection between the client and the server can be regarded as a trusted carrier. As this is not generally true on an open network, the basic authentication scheme should be used accordingly. In spite of this, clients should implement the scheme in order to communicate with servers that use it.

12. Security Considerations

This section is meant to inform application developers, information providers, and users of the security limitations in HTTP/1.0 as described by this document. The discussion does not include definitive solutions to the problems revealed, though it does make some suggestions for reducing security risks.

12.1 Authentication of Clients

As mentioned in Section 11.1, the Basic authentication scheme is not a secure method of user authentication, nor does it prevent the Entity-Body from being transmitted in clear text across the physical network used as the carrier. HTTP/1.0 does not prevent additional authentication schemes and encryption mechanisms from being employed to increase security.

12.2 Safe Methods

The writers of client software should be aware that the software represents the user in their interactions over the Internet, and should be careful to allow the user to be aware of any actions they may take which may have an unexpected significance to themselves or others.
In particular, the convention has been established that the GET and HEAD methods should never have the significance of taking an action other than retrieval. These methods should be considered "safe." This allows user agents to represent other methods, such as POST, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.

Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.

12.3 Abuse of Server Log Information

A server is in the position to save personal data about a user's requests which may identify their reading patterns or subjects of interest. This information is clearly confidential in nature and its handling may be constrained by law in certain countries. People using the HTTP protocol to provide data are responsible for ensuring that such material is not distributed without the permission of any individuals that are identifiable by the published results.

12.4 Transfer of Sensitive Information

Like any generic data transfer protocol, HTTP cannot regulate the content of the data that is transferred, nor is there any a priori method of determining the sensitivity of any particular piece of information within the context of any given request. Therefore, applications should supply as much control over this information as possible to the provider of that information. Three header fields are worth special mention in this context: Server, Referer and From.

Revealing the specific software version of the server may allow the server machine to become more vulnerable to attacks against software that is known to contain security holes. Implementors should make the Server header field a configurable option.

The Referer field allows reading patterns to be studied and reverse links drawn.
Although it can be very useful, its power can be abused if user details are not separated from the information contained in the Referer. Even when the personal information has been removed, the Referer field may indicate a private document's URI whose publication would be inappropriate.

The information sent in the From field might conflict with the user's privacy interests or their site's security policy, and hence it should not be transmitted without the user being able to disable, enable, and modify the contents of the field. The user must be able to set the contents of this field within a user preference or application defaults configuration.

We suggest, though do not require, that a convenient toggle interface be provided for the user to enable or disable the sending of From and Referer information.

12.5 Attacks Based On File and Path Names

Implementations of HTTP origin servers should be careful to restrict the documents returned by HTTP requests to be only those that were intended by the server administrators. If an HTTP server translates HTTP URIs directly into file system calls, the server must take special care not to serve files that were not intended to be delivered to HTTP clients. For example, Unix, Microsoft Windows, and other operating systems use ".." as a path component to indicate a directory level above the current one. On such a system, an HTTP server must disallow any such construct in the Request-URI if it would otherwise allow access to a resource outside those intended to be accessible via the HTTP server. Similarly, files intended for reference only internally to the server (such as access control files, configuration files, and script code) must be protected from inappropriate retrieval, since they might contain sensitive information. Experience has shown that minor bugs in such HTTP server implementations have turned into security risks.
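The ".." concern in Section 12.5 can be illustrated with a deliberately conservative check. This sketch makes several assumptions: the function name and the document root are hypothetical, every ".." path component is rejected outright (stricter than the text requires), and it does not address symbolic links or server-internal files, which need separate protection.

```python
# Illustrative sketch; resolve_request_path is a hypothetical helper name.
import posixpath
from typing import Optional
from urllib.parse import unquote


def resolve_request_path(document_root: str, request_path: str) -> Optional[str]:
    """Map a Request-URI path to a file under document_root, or None if refused."""
    decoded = unquote(request_path)  # decode %xx escapes before inspecting the path
    # Treat "\" like "/" so Windows-style separators cannot hide a ".." component.
    components = decoded.replace("\\", "/").split("/")
    if ".." in components:
        return None
    return posixpath.join(document_root, decoded.lstrip("/"))
```

Decoding before the check matters: a request such as "/a/%2e%2e/b" contains no literal ".." until the escapes are resolved.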
13. Acknowledgments

This specification makes heavy use of the augmented BNF and generic constructs defined by David H. Crocker for RFC 822 [7]. Similarly, it reuses many of the definitions provided by Nathaniel Borenstein and Ned Freed for MIME [5]. We hope that their inclusion in this specification will help reduce past confusion over the relationship between HTTP/1.0 and Internet mail message formats.

The HTTP protocol has evolved considerably over the past four years. It has benefited from a large and active developer community, the many people who have participated on the www-talk mailing list, and it is that community which has been most responsible for the success of HTTP and of the World-Wide Web in general. Marc Andreessen, Robert Cailliau, Daniel W. Connolly, Bob Denny, Jean-Francois Groff, Phillip M. Hallam-Baker, Hakon W. Lie, Ari Luotonen, Rob McCool, Lou Montulli, Dave Raggett, Tony Sanders, and Marc VanHeyningen deserve special recognition for their efforts in defining aspects of the protocol for early versions of this specification.

Paul Hoffman contributed sections regarding the informational status of this document and Appendices C and D.

This document has benefited greatly from the comments of all those participating in the HTTP-WG. In addition to those already mentioned, the following individuals have contributed to this specification:

    Gary Adams                  Harald Tveit Alvestrand
    Keith Ball                  Brian Behlendorf
    Paul Burchard               Maurizio Codogno
    Mike Cowlishaw              Roman Czyborra
    Michael A. Dolan            John Franks
    Jim Gettys                  Marc Hedlund
    Koen Holtman                Alex Hopmann
    Bob Jernigan                Shel Kaphan
    Martijn Koster              Dave Kristol
    Daniel LaLiberte            Paul Leach
    Albert Lunde                John C. Mallery
    Larry Masinter              Mitra
    Jeffrey Mogul               Gavin Nicol
    Bill Perry                  Jeffrey Perry
    Owen Rees                   Luigi Rizzo
    David Robinson              Marc Salomon
    Rich Salz                   Jim Seidman
    Chuck Shotton               Eric W. Sink
    Simon E. Spero              Robert S. Thau
    Francois Yergeau            Mary Ellen Zurko
    Jean-Philippe Martin-Flatin

14. References

[1] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., Torrey, D., and B. Alberti, "The Internet Gopher Protocol: A Distributed Document Search and Retrieval Protocol", RFC 1436, University of Minnesota, March 1993.
[2] Berners-Lee, T., "Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web", RFC 1630, CERN, June 1994.

[3] Berners-Lee, T., and D. Connolly, "Hypertext Markup Language - 2.0", RFC 1866, MIT/W3C, November 1995.

[4] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform Resource Locators (URL)", RFC 1738, CERN, Xerox PARC, University of Minnesota, December 1994.

[5] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, Bellcore, Innosoft, September 1993.

[6] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, IETF, October 1989.

[7] Crocker, D., "Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, UDEL, August 1982.

[8] F. Davis, B. Kahle, H. Morris, J. Salem, T. Shen, R. Wang, J. Sui, and M. Grinbaum, "WAIS Interface Protocol Prototype Functional Specification" (v1.5), Thinking Machines Corporation, April 1990.

[9] Fielding, R., "Relative Uniform Resource Locators", RFC 1808, UC Irvine, June 1995.

[10] Horton, M., and R. Adams, "Standard for Interchange of USENET Messages", RFC 1036 (Obsoletes RFC 850), AT&T Bell Laboratories, Center for Seismic Studies, December 1987.

[11] Kantor, B., and P. Lapsley, "Network News Transfer Protocol: A Proposed Standard for the Stream-Based Transmission of News", RFC 977, UC San Diego, UC Berkeley, February 1986.

[12] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, USC/ISI, August 1982.

[13] Postel, J., "Media Type Registration Procedure", RFC 1590, USC/ISI, March 1994.

[14] Postel, J., and J. Reynolds, "File Transfer Protocol (FTP)", STD 9, RFC 959, USC/ISI, October 1985.

[15] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC 1700, USC/ISI, October 1994.
[16] Sollins, K., and L. Masinter, "Functional Requirements for Uniform Resource Names", RFC 1737, MIT/LCS, Xerox Corporation, December 1994.

[17] US-ASCII. Coded Character Set - 7-Bit American Standard Code for Information Interchange. Standard ANSI X3.4-1986, ANSI, 1986.

[18] ISO-8859. International Standard - Information Processing - 8-bit Single-Byte Coded Graphic Character Sets -
     Part 1: Latin alphabet No. 1, ISO 8859-1:1987.
     Part 2: Latin alphabet No. 2, ISO 8859-2, 1987.
     Part 3: Latin alphabet No. 3, ISO 8859-3, 1988.
     Part 4: Latin alphabet No. 4, ISO 8859-4, 1988.
     Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988.
     Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987.
     Part 7: Latin/Greek alphabet, ISO 8859-7, 1987.
     Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988.
     Part 9: Latin alphabet No. 5, ISO 8859-9, 1990.

15. Authors' Addresses

    Tim Berners-Lee
    Director, W3 Consortium
    MIT Laboratory for Computer Science
    545 Technology Square
    Cambridge, MA 02139, U.S.A.
    Fax: +1 (617) 258 8682
    EMail: timbl@w3.org

    Roy T. Fielding
    Department of Information and Computer Science
    University of California
    Irvine, CA 92717-3425, U.S.A.
    Fax: +1 (714) 824-4056
    EMail: fielding@ics.uci.edu

    Henrik Frystyk Nielsen
    W3 Consortium
    MIT Laboratory for Computer Science
    545 Technology Square
    Cambridge, MA 02139, U.S.A.
    Fax: +1 (617) 258 8682
    EMail: frystyk@w3.org

Appendices

These appendices are provided for informational reasons only; they do not form a part of the HTTP/1.0 specification.

A. Internet Media Type message/http

In addition to defining the HTTP/1.0 protocol, this document serves as the specification for the Internet media type "message/http". The following is to be registered with IANA [13].
Media Type name: message
Media subtype name: http
Required parameters: none
Optional parameters: version, msgtype
version: The HTTP-Version number of the enclosed message (e.g., "1.0"). If not present, the version can be determined from the first line of the body.
msgtype: The message type, either "request" or "response". If not present, the type can be determined from the first line of the body.
Encoding considerations: only "7bit", "8bit", or "binary" are permitted
Security considerations: none
B. Tolerant Applications
Although this document specifies the requirements for the generation of HTTP/1.0 messages, not all applications will be correct in their implementation. We therefore recommend that operational applications be tolerant of deviations whenever those deviations can be interpreted unambiguously.
Clients should be tolerant in parsing the Status-Line and servers tolerant when parsing the Request-Line. In particular, they should accept any amount of SP or HT characters between fields, even though only a single SP is required.
The line terminator for HTTP-header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore the leading CR.
C. Relationship to MIME
HTTP/1.0 uses many of the constructs defined for Internet Mail (RFC 822 [7]) and the Multipurpose Internet Mail Extensions (MIME [5]) to allow entities to be transmitted in an open variety of representations and with extensible mechanisms. However, RFC 1521 discusses mail, and HTTP has a few features that differ from those described in RFC 1521. These differences were chosen to optimize performance over binary connections, to allow greater freedom in the use of new media types, to make date comparisons easier, and to acknowledge the practice of some early HTTP servers and clients.
At the time of this writing, it is expected that RFC 1521 will be revised. The revisions may include some of the practices found in HTTP/1.0 but not in the current RFC 1521.
This appendix describes specific areas where HTTP differs from RFC 1521. Proxies and gateways to strict MIME environments should be aware of these differences and provide the appropriate conversions where necessary. Proxies and gateways from MIME environments to HTTP also need to be aware of the differences because some conversions may be required.
C.1 Conversion to Canonical Form
RFC 1521 requires that an Internet mail entity be converted to canonical form prior to being transferred, as described in Appendix G of RFC 1521 [5]. Section 3.6.1 of this document describes the forms allowed for subtypes of the "text" media type when transmitted over HTTP.
RFC 1521 requires that content with a Content-Type of "text" represent line breaks as CRLF and forbids the use of CR or LF outside of line break sequences. HTTP allows CRLF, bare CR, and bare LF to indicate a line break within text content when a message is transmitted over HTTP.
Where it is possible, a proxy or gateway from HTTP to a strict RFC 1521 environment should translate all line breaks within the text media types described in Section 3.6.1 of this document to the RFC 1521 canonical form of CRLF. Note, however, that this may be complicated by the presence of a Content-Encoding and by the fact that HTTP allows the use of some character sets, such as certain multi-byte character sets, that do not use the CR and LF octets to represent line breaks.
C.2 Conversion of Date Formats
HTTP/1.0 uses a restricted set of date formats (Section 3.3) to simplify the process of date comparison. Proxies and gateways from other protocols should ensure that any Date header field present in a message conforms to one of the HTTP/1.0 formats and rewrite the date if necessary.
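
The two gateway conversions described in C.1 and C.2 can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, the line-break rewrite assumes an unencoded body in a character set where octets 13 and 10 really are CR and LF, and the date rewrite assumes the incoming value is expressed in GMT, as HTTP dates are.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate


def canonicalize_line_breaks(body: bytes) -> bytes:
    """Rewrite CRLF, bare CR and bare LF as CRLF (C.1).

    Assumption: no Content-Encoding is applied and the character set
    uses octets 13 and 10 for CR and LF.
    """
    # CRLF is listed first so an existing CRLF pair is matched as a unit.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", body)


def to_rfc1123_date(value: str) -> str:
    """Rewrite a date header value in the RFC 1123 format (C.2).

    Assumption: the value is in GMT; any zone information is ignored.
    """
    fields = parsedate(value)  # accepts RFC 822/1123, RFC 850 and asctime forms
    if fields is None:
        raise ValueError("unrecognized date: %r" % value)
    moment = datetime(*fields[:6], tzinfo=timezone.utc)
    return format_datetime(moment, usegmt=True)
```

For example, `to_rfc1123_date("Sunday, 06-Nov-94 08:49:37 GMT")` yields `Sun, 06 Nov 1994 08:49:37 GMT`.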
C.3 Introduction of Content-Encoding
RFC 1521 does not include any concept equivalent to HTTP/1.0's Content-Encoding header field. Since this field acts as a modifier on the media type, proxies and gateways from HTTP to MIME-compliant protocols must either change the value of the Content-Type header field or decode the Entity-Body before forwarding the message. (Some experimental applications of Content-Type for Internet mail have used a media-type parameter of ";conversions=<content-coding>" to perform an equivalent function as Content-Encoding. However, this parameter is not part of RFC 1521.)
C.4 No Content-Transfer-Encoding
HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC 1521. Proxies and gateways from MIME-compliant protocols to HTTP must remove any non-identity CTE ("quoted-printable" or "base64") encoding prior to delivering the response message to an HTTP client.
Proxies and gateways from HTTP to MIME-compliant protocols are responsible for ensuring that the message is in the correct format and encoding for safe transport on that protocol, where "safe transport" is defined by the limitations of the protocol being used. Such a proxy or gateway should label the data with an appropriate Content-Transfer-Encoding if doing so will improve the likelihood of safe transport over the destination protocol.
C.5 HTTP Header Fields in Multipart Body-Parts
In RFC 1521, most header fields in multipart body-parts are generally ignored unless the field name begins with "Content-". In HTTP/1.0, multipart body-parts may contain any HTTP header fields which are significant to the meaning of that part.
D. Additional Features
This appendix documents protocol elements used by some existing HTTP implementations, but not consistently and correctly across most HTTP/1.0 applications. Implementors should be aware of these features, but cannot rely upon their presence in, or interoperability with, other HTTP/1.0 applications.
D.1 Additional Request Methods
D.1.1 PUT
The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity should be considered as a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI.
The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity as data to be processed. That resource may be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request: the user agent knows what URI is intended, and the server should not apply the request to some other resource.
D.1.2 DELETE
The DELETE method requests that the origin server delete the resource identified by the Request-URI.
D.1.3 LINK
The LINK method establishes one or more Link relationships between the existing resource identified by the Request-URI and other existing resources.
D.1.4 UNLINK
The UNLINK method removes one or more Link relationships from the existing resource identified by the Request-URI.
D.2 Additional Header Field Definitions
D.2.1 Accept
The Accept request-header field can be used to indicate a list of media ranges which are acceptable as a response to the request. The asterisk "*" character is used to group media types into ranges, with "*/*" indicating all media types and "type/*" indicating all subtypes of that type. The set of ranges given by the client should represent what types are acceptable given the context of the request.
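
The media-range matching described for Accept can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the header value is assumed to be a comma-separated list of ranges, and any parameters following a ";" are simply discarded rather than interpreted.

```python
def parse_accept(header):
    """Split an Accept value into (type, subtype) media ranges."""
    ranges = []
    for item in header.split(","):
        # Assumption: anything after ";" is a parameter and is ignored here.
        media_range = item.split(";", 1)[0].strip().lower()
        if "/" not in media_range:
            continue  # skip empty or malformed list elements
        main_type, subtype = media_range.split("/", 1)
        ranges.append((main_type, subtype))
    return ranges


def is_acceptable(media_type, header):
    """Return True if media_type falls within any listed media range."""
    main_type, _, subtype = media_type.strip().lower().partition("/")
    for range_type, range_subtype in parse_accept(header):
        if range_type == "*" and range_subtype == "*":
            return True  # "*/*" covers every media type
        if range_type == main_type and range_subtype in ("*", subtype):
            return True  # "type/*" or an exact "type/subtype" match
    return False
```

With the header value `text/*, image/gif`, this sketch accepts `text/html` and `image/gif` but not `image/png`.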
D.2.2 Accept-Charset
The Accept-Charset request-header field can be used to indicate a list of preferred character sets other than the default US-ASCII and ISO-8859-1. This field allows clients capable of understanding more comprehensive or special-purpose character sets to signal that capability to a server which is capable of representing documents in those character sets.
D.2.3 Accept-Encoding
The Accept-Encoding request-header field is similar to Accept, but restricts the content-coding values which are acceptable in the response.
D.2.4 Accept-Language
The Accept-Language request-header field is similar to Accept, but restricts the set of natural languages that are preferred as a response to the request.
D.2.5 Content-Language
The Content-Language entity-header field describes the natural language(s) of the intended audience for the enclosed entity. Note that this may not be equivalent to all the languages used within the entity.
D.2.6 Link
The Link entity-header field provides a means for describing a relationship between the entity and some other resource. An entity may include multiple Link values. Links at the metainformation level typically indicate relationships like hierarchical structure and navigation paths.
D.2.7 MIME-Version
HTTP messages may include a single MIME-Version general-header field to indicate what version of the MIME protocol was used to construct the message. Use of the MIME-Version header field, as defined by RFC 1521 [5], should indicate that the message is MIME-conformant. Unfortunately, some older HTTP/1.0 servers send it indiscriminately, and thus this field should be ignored.
D.2.8 Retry-After
The Retry-After response-header field can be used with a 503 (Service Unavailable) response to indicate how long the service is expected to be unavailable to the requesting client.
The value of this field can be either an HTTP-date or an integer number of seconds (in decimal) after the time of the response.
D.2.9 Title
The Title entity-header field indicates the title of the entity.
D.2.10 URI
The URI entity-header field may contain some or all of the Uniform Resource Identifiers (Section 3.2) by which the Request-URI resource can be identified. There is no guarantee that the resource can be accessed using the URI(s) specified.
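
As an illustration of the two Retry-After forms described in D.2.8, a client might resolve either one to an absolute time as sketched below. The function name is hypothetical, and the sketch assumes that an HTTP-date value is expressed in GMT and that the caller supplies the time of the response as a timezone-aware value.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate


def retry_time(value, response_time):
    """Resolve a Retry-After value to the earliest time to retry."""
    text = value.strip()
    if text.isascii() and text.isdigit():
        # Integer form: a number of seconds after the time of the response.
        return response_time + timedelta(seconds=int(text))
    fields = parsedate(text)  # date form; zone information is ignored
    if fields is None:
        raise ValueError("unrecognized Retry-After value: %r" % value)
    # Assumption: HTTP dates are always expressed in GMT.
    return datetime(*fields[:6], tzinfo=timezone.utc)
```

A value of `120` received with a response timed at 12:00:00 GMT resolves to 12:02:00 GMT on the same day.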