1 |
INTERNET-DRAFT Ken A L Coar |
2 |
draft-coar-cgi-v11-03.{html,txt} IBM Corporation |
3 |
D.R.T. Robinson |
4 |
E*TRADE UK Ltd. |
5 |
25 June 1999 |
6 |
|
7 |
The WWW Common Gateway Interface |
8 |
Version 1.1 |
9 |
|
10 |
Status of this Memo |
11 |
|
12 |
This document is an Internet-Draft and is in full |
13 |
conformance with all provisions of Section 10 of RFC2026. |
14 |
|
15 |
Internet-Drafts are working documents of the Internet |
16 |
Engineering Task Force (IETF), its areas, and its working |
17 |
groups. Note that other groups may also distribute working |
18 |
documents as Internet-Drafts. |
19 |
|
20 |
Internet-Drafts are draft documents valid for a maximum of |
21 |
six months and may be updated, replaced, or obsoleted by |
22 |
other documents at any time. It is inappropriate to use |
23 |
Internet-Drafts as reference material or to cite them other |
24 |
than as "work in progress." |
25 |
|
26 |
The list of current Internet-Drafts can be accessed at |
27 |
<http://www.ietf.org/ietf/1id-abstracts.txt>. |
28 |
|
29 |
The list of Internet-Draft Shadow Directories can be |
30 |
accessed at <http://www.ietf.org/shadow.html>. |
31 |
|
32 |
Abstract |
33 |
|
34 |
The Common Gateway Interface (CGI) is a simple interface for |
35 |
running external programs, software or gateways under an |
36 |
information server in a platform-independent manner. |
37 |
Currently, the supported information servers are HTTP servers. |
38 |
|
39 |
The interface has been in use by the World-Wide Web since |
40 |
1993. This specification defines the "current practice" |
41 |
parameters of the 'CGI/1.1' interface developed and documented |
42 |
at the U.S. National Center for Supercomputing Applications |
43 |
[NCSA-CGI]. This document also defines the use of the CGI/1.1 |
44 |
interface on the Unix and AmigaDOS(tm) systems. |
45 |
|
46 |
Discussion of this draft occurs on the CGI-WG mailing list; |
47 |
see the project Web page at |
48 |
<URL:http://CGI-Spec.Golux.Com/> for details on the |
49 |
mailing list and the status of the project. |
50 |
|
51 |
Table of Contents |
52 |
|
53 |
1 Introduction..............................................3 |
54 |
1.1 Purpose................................................3 |
55 |
1.2 Requirements...........................................3 |
56 |
1.3 Specifications.........................................4 |
57 |
|
58 |
Coar, et al. INTERNET-DRAFT [Page 1] |
59 |
|
60 |
CGI/1.1 Expires: 31 December, 1999 |
61 |
|
62 |
1.4 Terminology............................................4 |
63 |
2 Notational Conventions and Generic Grammar................4 |
64 |
2.1 Augmented BNF..........................................5 |
65 |
2.2 Basic Rules............................................5 |
66 |
3 Protocol Parameters.......................................6 |
67 |
3.1 URL Encoding...........................................6 |
68 |
3.2 The Script-URI.........................................6 |
69 |
4 Invoking the Script.......................................7 |
70 |
5 The CGI Script Command Line...............................7 |
71 |
6 Data Input to the CGI Script..............................8 |
72 |
6.1 Request Metadata (Metavariables).......................8 |
73 |
6.1.1 AUTH_TYPE...........................................9 |
74 |
6.1.2 CONTENT_LENGTH......................................9 |
75 |
6.1.3 CONTENT_TYPE........................................9 |
76 |
6.1.4 GATEWAY_INTERFACE...................................10 |
77 |
6.1.5 Protocol-Specific Metavariables.....................11 |
78 |
6.1.6 PATH_INFO...........................................11 |
79 |
6.1.7 PATH_TRANSLATED.....................................12 |
80 |
6.1.8 QUERY_STRING........................................13 |
81 |
6.1.9 REMOTE_ADDR.........................................13 |
82 |
6.1.10 REMOTE_HOST........................................13 |
83 |
6.1.11 REMOTE_IDENT.......................................13 |
84 |
6.1.12 REMOTE_USER........................................14 |
85 |
6.1.13 REQUEST_METHOD.....................................14 |
86 |
6.1.14 SCRIPT_NAME........................................14 |
87 |
6.1.15 SERVER_NAME........................................15 |
88 |
6.1.16 SERVER_PORT........................................15 |
89 |
6.1.17 SERVER_PROTOCOL....................................15 |
90 |
6.1.18 SERVER_SOFTWARE....................................16 |
91 |
6.2 Request Message-Bodies................................16 |
92 |
7 Data Output from the CGI Script...........................16 |
93 |
7.1 Non-Parsed Header Output...............................17 |
94 |
7.2 Parsed Header Output...................................17 |
95 |
7.2.1 CGI header fields...................................17 |
96 |
7.2.1.1 Content-Type.....................................17 |
97 |
7.2.1.2 Location.........................................18 |
98 |
7.2.1.3 Status...........................................18 |
99 |
7.2.1.4 Extension header fields..........................19 |
100 |
7.2.2 HTTP header fields..................................19 |
101 |
8 Server Implementation.....................................19 |
102 |
8.1 Requirements for Servers...............................19 |
103 |
8.1.1 Script-URI..........................................19 |
104 |
8.1.2 Request Message-body Handling.......................20 |
105 |
8.1.3 Required Metavariables..............................20 |
106 |
8.1.4 Response Compliance.................................20 |
107 |
8.2 Recommendations for Servers............................20 |
108 |
8.3 Summary of Metavariables...............................21 |
109 |
9 Script Implementation.....................................22 |
110 |
9.1 Requirements for Scripts...............................22 |
111 |
9.2 Recommendations for Scripts............................23 |
112 |
10 System Specifications....................................23 |
113 |
10.1 AmigaDOS..............................................23 |
114 |
10.2 Unix..................................................24 |
115 |
11 Security Considerations..................................24 |
121 |
11.1 Safe Methods..........................................24 |
122 |
11.2 HTTP Header Fields Containing Sensitive Information...25 |
123 |
11.3 Script Interference with the Server...................25 |
124 |
11.4 Data Length and Buffering Considerations..............25 |
125 |
11.5 Stateless Processing..................................25 |
126 |
12 Acknowledgments..........................................26 |
127 |
13 References...............................................26 |
128 |
14 Authors' Addresses.......................................27 |
129 |
|
130 |
1. Introduction |
131 |
|
132 |
1.1. Purpose |
133 |
|
134 |
Together the HTTP [3,8] server and the CGI script are |
135 |
responsible for servicing a client request by sending back |
136 |
responses. The client request comprises a Universal Resource |
137 |
Identifier (URI) [1], a request method, and various ancillary |
138 |
information about the request provided by the transport |
139 |
mechanism. |
140 |
|
141 |
The CGI defines the abstract parameters, known as |
142 |
metavariables, which describe the client's request. Together |
143 |
with a concrete programmer interface this specifies a |
144 |
platform-independent interface between the script and the HTTP |
145 |
server. |
146 |
|
147 |
1.2. Requirements |
148 |
|
149 |
This specification uses the same words as RFC 1123 [5] to |
150 |
define the significance of each particular requirement. These |
151 |
are: |
152 |
|
153 |
MUST |
154 |
This word or the adjective 'required' means that the |
155 |
item is an absolute requirement of the specification. |
156 |
|
157 |
SHOULD |
158 |
This word or the adjective 'recommended' means that |
159 |
there may exist valid reasons in particular |
160 |
circumstances to ignore this item, but the full |
161 |
implications should be understood and the case |
162 |
carefully weighed before choosing a different course. |
163 |
|
164 |
MAY |
165 |
This word or the adjective 'optional' means that this |
166 |
item is truly optional. One vendor may choose to |
167 |
include the item because a particular marketplace |
168 |
requires it or because it enhances the product, for |
169 |
example; another vendor may omit the same item. |
170 |
|
171 |
An implementation is not compliant if it fails to satisfy one |
172 |
or more of the 'must' requirements for the protocols it |
173 |
implements. An implementation that satisfies all of the 'must' |
174 |
and all of the 'should' requirements for its features is said |
180 |
to be 'unconditionally compliant'; one that satisfies all of |
181 |
the 'must' requirements but not all of the 'should' |
182 |
requirements for its features is said to be 'conditionally |
183 |
compliant.' |
184 |
|
185 |
1.3. Specifications |
186 |
|
187 |
Not all of the functions and features of the CGI are defined |
188 |
in the main part of this specification. The following phrases |
189 |
are used to describe the features which are not specified: |
190 |
|
191 |
system defined |
192 |
The feature may differ between systems, but must be the |
193 |
same for different implementations using the same |
194 |
system. A system will usually identify a class of |
195 |
operating-systems. Some systems are defined in section |
196 |
10 of this document. New systems may be defined by new |
197 |
specifications without revision of this document. |
198 |
|
199 |
implementation defined |
200 |
The behaviour of the feature may vary from |
201 |
implementation to implementation, but a particular |
202 |
implementation must document its behaviour. |
203 |
|
204 |
1.4. Terminology |
205 |
|
206 |
This specification uses many terms defined in the HTTP/1.1 |
207 |
specification [8]; however, the following terms are used here |
208 |
in a sense which may not accord with their definitions in that |
209 |
document, or with their common meaning. |
210 |
|
211 |
metavariable |
212 |
A named parameter that carries information from the |
213 |
server to the script. It is not necessarily a variable |
214 |
in the operating-system's environment, although that is |
215 |
the most common implementation. |
216 |
|
217 |
script |
218 |
The software which is invoked by the server via this |
219 |
interface. It need not be a standalone program, but |
220 |
could be a dynamically-loaded or shared library, or |
221 |
even a subroutine in the server. It may be a set of |
222 |
statements interpreted at run-time, as the term |
223 |
'script' is frequently understood, but that is not a |
224 |
requirement and within the context of this |
225 |
specification the term has the broader definition |
226 |
stated. |
227 |
|
228 |
server |
229 |
The application program which invokes the script in |
230 |
order to service requests. |
231 |
|
232 |
2. Notational Conventions and Generic Grammar |
233 |
|
239 |
2.1. Augmented BNF |
240 |
|
241 |
All of the mechanisms specified in this document are described |
242 |
in both prose and an augmented Backus-Naur Form (BNF) similar |
243 |
to that used by RFC 822 [6]. This augmented BNF contains the |
244 |
following constructs: |
245 |
|
246 |
name = definition |
247 |
The name of a rule is separated from its definition by the equal character ("="). Whitespace |
248 |
is only significant in that continuation lines of a |
249 |
definition are indented. |
250 |
|
251 |
"literal" |
252 |
Quotation marks (") surround literal text, except for a |
253 |
literal quotation mark, which is surrounded by |
254 |
angle-brackets ("<" and ">"). Unless stated otherwise, |
255 |
the text is case-sensitive. |
256 |
|
257 |
rule1 | rule2 |
258 |
Alternative rules are separated by a vertical bar |
259 |
("|"). |
260 |
|
261 |
(rule1 rule2 rule3) |
262 |
Elements enclosed in parentheses are treated as a |
263 |
single element. |
264 |
|
265 |
*rule |
266 |
A rule preceded by an asterisk ("*") may have zero or |
267 |
more occurrences. A rule preceded by an integer |
268 |
followed by an asterisk must occur at least the |
269 |
specified number of times. |
270 |
|
271 |
[rule] |
272 |
An element enclosed in square brackets ("[" and "]") is |
273 |
optional. |
274 |
|
275 |
2.2. Basic Rules |
276 |
|
277 |
The following rules are used throughout this specification to |
278 |
describe basic parsing constructs. |
279 |
|
280 |
alpha = lowalpha | hialpha |
281 |
alphanum = alpha | digit |
282 |
lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
283 |
| "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
284 |
| "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
285 |
| "y" | "z" |
286 |
hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
287 |
| "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
288 |
| "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
289 |
| "Y" | "Z" |
290 |
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
291 |
| "8" | "9" |
292 |
hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" |
298 |
| "b" | "c" | "d" | "e" | "f" |
299 |
escaped = "%" hex hex |
300 |
OCTET = <any 8-bit sequence of data> |
301 |
CHAR = <any US-ASCII character (octets 0 - 127)> |
302 |
CTL = <any US-ASCII control character |
303 |
(octets 0 - 31) and DEL (127)> |
304 |
CR = <US-ASCII CR, carriage return (13)> |
305 |
LF = <US-ASCII LF, linefeed (10)> |
306 |
SP = <US-ASCII SP, space (32)> |
307 |
HT = <US-ASCII HT, horizontal tab (9)> |
308 |
NL = CR | LF |
309 |
LWSP = SP | HT | NL |
310 |
tspecial = "(" | ")" | "@" | "," | ";" | ":" | "\" | <"> |
311 |
| "/" | "[" | "]" | "?" | "<" | ">" | "{" | "}" |
312 |
| SP | HT | NL |
313 |
token = 1*<any CHAR except CTLs or tspecials> |
314 |
quoted-string = ( <"> *qdtext <"> ) | ( "<" *qatext ">") |
315 |
qdtext = <any CHAR except <"> and CTLs but including LWSP> |
316 |
qatext = <any CHAR except "<", ">" and CTLs but |
317 |
including LWSP> |
318 |
mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")" |
319 |
unreserved = alphanum | mark |
320 |
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | |
321 |
"$" | "," |
322 |
uric = reserved | unreserved | escaped |
323 |
|
324 |
Note that newline (NL) need not be a single character, but can |
325 |
be a character sequence. |
326 |
|
327 |
3. Protocol Parameters |
328 |
|
329 |
3.1. URL Encoding |
330 |
|
331 |
Some variables and constructs used here are described as being |
332 |
'URL-encoded'. This encoding is described in section 2 of RFC |
333 |
2396 [4]. |
334 |
|
335 |
An alternate "shortcut" encoding for representing the space |
336 |
character exists and is in common use. Scripts MUST be |
337 |
prepared to recognise both '+' and '%20' as an encoded space |
338 |
in a URL-encoded value. |
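
As a rough illustration of the rule above, the following Python sketch decodes a URL-encoded value while accepting both forms of an encoded space. It relies on the standard urllib.parse.unquote_plus function; the helper name decode_form_value is hypothetical and is not defined by this specification.

```python
from urllib.parse import unquote_plus


def decode_form_value(encoded: str) -> str:
    """Decode a URL-encoded value, treating both '+' and '%20' as a space."""
    # unquote_plus first maps '+' to ' ' and then resolves %XX escapes, so a
    # literal plus sign must arrive encoded as '%2B' to survive decoding.
    return unquote_plus(encoded)
```

A literal "+" in the data therefore has to be sent as "%2B"; otherwise it is read as a space.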
339 |
|
340 |
Note that some unsafe characters may have different semantics |
341 |
if they are encoded. The definition of which characters are |
342 |
unsafe depends on the context. For example, the following two |
343 |
URLs do not necessarily refer to the same resource: |
344 |
|
345 |
http://somehost.com/somedir%2Fvalue |
346 |
http://somehost.com/somedir/value |
347 |
|
348 |
See section 2 of RFC 2396 [4] for authoritative treatment of |
349 |
this issue. |
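
One way to see why the two URLs above can differ is to split the path on unencoded "/" characters before decoding anything. The sketch below does this with the Python standard library; path_segments is a hypothetical helper, and the treatment of empty segments (they are dropped) is an assumption made only for this illustration.

```python
from urllib.parse import unquote, urlsplit


def path_segments(url: str) -> list[str]:
    """Split a URL path on unencoded "/" first, then decode each segment."""
    raw_path = urlsplit(url).path  # urlsplit leaves %XX escapes untouched
    # Decoding happens per segment, so an encoded "%2F" stays inside its
    # segment instead of becoming a path separator.
    return [unquote(segment) for segment in raw_path.split("/") if segment]
```

With this ordering the first URL yields a single segment containing a slash, while the second yields two segments.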
350 |
|
351 |
3.2. The Script-URI |
352 |
|
358 |
The 'Script-URI' is defined as the URI of the resource |
359 |
identified by the metavariables. Often, this URI will be the |
360 |
same as the URI requested by the client (the 'Client-URI'); |
361 |
however, it need not be. Instead, it could be a URI invented |
362 |
by the server, and so it can only be used in the context of |
363 |
the server and its CGI interface. |
364 |
|
365 |
The Script-URI has the syntax of generic-RL as defined in |
366 |
section 2.1 of RFC 1808 [7], with the exception that object |
367 |
parameters and fragment identifiers are not permitted: |
368 |
|
369 |
<scheme>://<host><port>/<path>?<query> |
370 |
|
371 |
The various components of the Script-URI are defined by some |
372 |
of the metavariables (see section 6.1 below): |
373 |
|
374 |
script-uri = protocol "://" SERVER_NAME ":" SERVER_PORT enc-script |
375 |
enc-path-info "?" QUERY_STRING |
376 |
|
377 |
where 'protocol' is obtained from SERVER_PROTOCOL, |
378 |
'enc-script' is a URL-encoded version of SCRIPT_NAME and |
379 |
'enc-path-info' is a URL-encoded version of PATH_INFO. See |
380 |
section 6.1.6 for more information about the PATH_INFO |
381 |
metavariable. |
382 |
|
383 |
Note that the scheme and the protocol are not identical; for |
384 |
instance, a resource accessed via an SSL mechanism may have a |
385 |
Client-URI with a scheme of "https" rather than "http". |
386 |
CGI/1.1 provides no means for the script to reconstruct this, |
387 |
and therefore the Script-URI includes the base protocol used. |
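
The composition rule for the Script-URI can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name build_script_uri is hypothetical, the protocol name is lower-cased, and the "?" is omitted when the query is empty; none of these details is mandated by the text above.

```python
from urllib.parse import quote


def build_script_uri(meta: dict[str, str]) -> str:
    """Assemble a Script-URI from CGI metavariables (illustrative only)."""
    # 'protocol' is obtained from SERVER_PROTOCOL, e.g. "HTTP/1.1" -> "http".
    # Lower-casing it is an assumption of this sketch.
    protocol = meta["SERVER_PROTOCOL"].split("/", 1)[0].lower()
    # SCRIPT_NAME and PATH_INFO are re-encoded; quote() keeps "/" by default.
    enc_script = quote(meta.get("SCRIPT_NAME", ""))
    enc_path_info = quote(meta.get("PATH_INFO", ""))
    host = f"{meta['SERVER_NAME']}:{meta['SERVER_PORT']}"
    uri = f"{protocol}://{host}{enc_script}{enc_path_info}"
    query = meta.get("QUERY_STRING", "")  # already URL-encoded
    # Assumption: the "?" is left out when there is no query component.
    return f"{uri}?{query}" if query else uri
```

Because the protocol rather than the scheme is used, a request that arrived over "https" would still produce a URI beginning with "http" here.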
388 |
|
389 |
4. Invoking the Script |
390 |
|
391 |
The script is invoked in a system defined manner. Unless |
392 |
specified otherwise, the file containing the script will be |
393 |
invoked as an executable program. |
394 |
|
395 |
5. The CGI Script Command Line |
396 |
|
397 |
Some systems support a method for supplying an array of |
398 |
strings to the CGI script. This is only used in the case of an |
399 |
'indexed' query. This is identified by a "GET" or "HEAD" HTTP |
400 |
request with a URL query string not containing any unencoded |
401 |
"=" characters. For such a request, servers SHOULD parse the |
402 |
search string into words, using the following rules: |
403 |
|
404 |
search-string = search-word *( "+" search-word ) |
405 |
search-word = 1*schar |
406 |
schar = xunreserved | escaped | xreserved |
407 |
xunreserved = alpha | digit | xsafe | extra |
408 |
xsafe = "$" | "-" | "_" | "." |
409 |
xreserved = ";" | "/" | "?" | ":" | "@" | "&" |
410 |
|
416 |
After parsing, each word is URL-decoded, optionally encoded in |
417 |
a system defined manner, and then the argument list is set to |
418 |
the list of words. |
419 |
|
420 |
If the server cannot create any part of the argument list, |
421 |
then the server SHOULD NOT generate any command line |
422 |
information. For example, the number of arguments may be |
423 |
greater than operating system or server limitations permit, or |
424 |
one of the words may not be representable as an argument. |
425 |
|
426 |
Scripts SHOULD check to see if the QUERY_STRING value contains |
427 |
an unencoded "=" character, and SHOULD NOT use the command |
428 |
line arguments if it does. |
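
The handling of an 'indexed' query described in this section might be sketched as below. The function indexed_query_args is hypothetical; it checks only for an unencoded "=" and for empty words rather than validating every character against the grammar, and it treats an empty query string as "no arguments", which is an assumption of this sketch.

```python
from typing import Optional
from urllib.parse import unquote


def indexed_query_args(method: str, query_string: str) -> Optional[list[str]]:
    """Return the argument list for an 'indexed' query, or None otherwise."""
    if method not in ("GET", "HEAD"):
        return None
    # An unencoded "=" means a form-style query, not an indexed one.
    if not query_string or "=" in query_string:
        return None
    # Words are separated by "+"; each word is URL-decoded afterwards, so an
    # encoded "%2B" or "%3D" inside a word survives as a literal "+" or "=".
    words = query_string.split("+")
    if any(word == "" for word in words):
        return None  # search-word = 1*schar, so empty words are not valid
    return [unquote(word) for word in words]
```

Returning None for the whole list, rather than a partial list, mirrors the rule that no command line information is generated when any part of it cannot be created.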
429 |
|
430 |
6. Data Input to the CGI Script |
431 |
|
432 |
Information about a request comes from two different sources: |
433 |
the request header, and any associated message-body. Servers |
434 |
MUST make portions of this information available to scripts. |
435 |
|
436 |
6.1. Request Metadata (Metavariables) |
437 |
|
438 |
Each CGI server implementation MUST define a mechanism to pass |
439 |
data about the request from the server to the script. The |
440 |
metavariables containing these data are accessed by the script |
441 |
in a system defined manner. The representation of the |
442 |
characters in the metavariables is system defined. |
443 |
|
444 |
This specification does not distinguish between the |
445 |
representation of null values and missing ones. Whether null |
446 |
or missing values (such as a query component of "?" or "", |
447 |
respectively) are represented by undefined metavariables or by |
448 |
metavariables with values of "" is implementation-defined. |
449 |
|
450 |
Case is not significant in the metavariable names, in that |
451 |
there cannot be two different variables whose names differ in |
452 |
case only. Here they are shown using a canonical |
453 |
representation of capitals plus underscore ("_"). The actual |
454 |
representation of the names is system defined; for a |
455 |
particular system the representation MAY be defined |
456 |
differently than this. |
457 |
|
458 |
Metavariable values MUST be considered case-sensitive except |
459 |
as noted otherwise. |
460 |
|
461 |
The canonical metavariables defined by this specification are: |
462 |
|
463 |
AUTH_TYPE |
464 |
CONTENT_LENGTH |
465 |
CONTENT_TYPE |
466 |
GATEWAY_INTERFACE |
467 |
PATH_INFO |
468 |
PATH_TRANSLATED |
469 |
QUERY_STRING |
475 |
REMOTE_ADDR |
476 |
REMOTE_HOST |
477 |
REMOTE_IDENT |
478 |
REMOTE_USER |
479 |
REQUEST_METHOD |
480 |
SCRIPT_NAME |
481 |
SERVER_NAME |
482 |
SERVER_PORT |
483 |
SERVER_PROTOCOL |
484 |
SERVER_SOFTWARE |
485 |
|
486 |
Metavariables with names beginning with the protocol name |
487 |
(e.g., "HTTP_ACCEPT") are also canonical in their description |
488 |
of request header fields. The number and meaning of these |
489 |
fields may change independently of this specification. (See |
490 |
also section 6.1.5.) |
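
On systems where metavariables are passed as operating-system environment variables (an assumption; the access mechanism is system defined), a script might read them as in the following Python sketch. The helper names metavariable and http_metavariables are hypothetical.

```python
import os
from typing import Mapping


def metavariable(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return a metavariable's value, treating a missing one as ""."""
    # The specification does not distinguish null values from missing ones,
    # so both are normalised to the empty string here.
    return environ.get(name, "")


def http_metavariables(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the protocol-specific metavariables (names starting "HTTP_")."""
    return {k: v for k, v in environ.items() if k.startswith("HTTP_")}
```

Passing the mapping as a parameter keeps the functions usable on systems that expose metavariables in some other way.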
491 |
|
492 |
6.1.1. AUTH_TYPE |
493 |
|
494 |
This variable is specific to requests made via the "http" |
495 |
scheme. |
496 |
|
497 |
If the Script-URI required access authentication for external |
498 |
access, then the server MUST set the value of this variable |
499 |
from the 'auth-scheme' token in the request's "Authorization" |
500 |
header field. Otherwise it is set to NULL. |
501 |
|
502 |
AUTH_TYPE = "" | auth-scheme |
503 |
auth-scheme = "Basic" | "Digest" | token |
504 |
|
505 |
HTTP access authentication schemes are described in section 11 |
506 |
of the HTTP/1.1 specification [8]. The auth-scheme is not |
507 |
case-sensitive. |
508 |
|
509 |
Servers MUST provide this metavariable to scripts if the |
510 |
request header included an "Authorization" field that was |
511 |
authenticated. |
512 |
|
513 |
6.1.2. CONTENT_LENGTH |
514 |
|
515 |
This metavariable is set to the size of the message-body |
516 |
entity attached to the request, if any, in decimal number of |
517 |
octets. If no data are attached, then this metavariable is |
518 |
either NULL or not defined. The syntax is the same as for the |
519 |
HTTP "Content-Length" header field (section 14.14, HTTP/1.1 |
520 |
specification [8]). |
521 |
|
522 |
CONTENT_LENGTH = "" | 1*digit |
523 |
|
524 |
Servers MUST provide this metavariable to scripts if the |
525 |
request was accompanied by a message-body entity. |
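
A script could use CONTENT_LENGTH to bound how much of the message-body it reads, roughly as follows. This is a sketch only: read_request_body is a hypothetical name, and the in-memory stream stands in for whatever input channel the system actually provides.

```python
import io
from typing import BinaryIO, Mapping


def read_request_body(environ: Mapping[str, str], stream: BinaryIO) -> bytes:
    """Read at most CONTENT_LENGTH octets of message-body from the stream."""
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        return b""  # NULL or not defined: no data attached
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"malformed CONTENT_LENGTH: {raw!r}")
    # read() may return fewer octets if the stream ends early; a real script
    # would need to decide how to treat a short body.
    return stream.read(int(raw))


# Example with an in-memory stream standing in for the script's input:
body = read_request_body({"CONTENT_LENGTH": "5"}, io.BytesIO(b"hello, world"))
```

The script stops after the declared number of octets even though more data is present in the stream.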
526 |
|
527 |
6.1.3. CONTENT_TYPE |
528 |
|
534 |
If the request includes a message-body, CONTENT_TYPE is set to |
535 |
the Internet Media Type [9] of the attached entity if the type |
536 |
was provided via a "Content-Type" field in the request header, |
537 |
or if the server can determine it in the absence of a supplied |
538 |
"Content-Type" field. The syntax is the same as for the HTTP |
539 |
"Content-Type" header field. |
540 |
|
541 |
CONTENT_TYPE = "" | media-type |
542 |
media-type = type "/" subtype *( ";" parameter) |
543 |
type = token |
544 |
subtype = token |
545 |
parameter = attribute "=" value |
546 |
attribute = token |
547 |
value = token | quoted-string |
548 |
|
549 |
The type, subtype, and parameter attribute names are not |
550 |
case-sensitive. Parameter values MAY be case sensitive. Media |
551 |
types and their use in HTTP are described in section 3.7 of |
552 |
the HTTP/1.1 specification [8]. |
553 |
|
554 |
Example: |
555 |
|
556 |
application/x-www-form-urlencoded |
557 |
|
558 |
There is no default value for this variable. If and only if it |
559 |
is unset, then the script MAY attempt to determine the media |
560 |
type from the data received. If the type remains unknown, then |
561 |
the script MAY choose to either assume a content-type of |
562 |
application/octet-stream or reject the request with a 415 |
563 |
("Unsupported Media Type") error. See section 7.2.1.3 for more |
564 |
information about returning error status values. |
565 |
|
566 |
Servers MUST provide this metavariable to scripts if a |
567 |
"Content-Type" field was present in the original request |
568 |
header. If the server receives a request with an attached |
569 |
entity but no "Content-Type" header field, it MAY attempt to |
570 |
determine the correct datatype, or it MAY omit this |
571 |
metavariable when communicating the request information to the |
572 |
script. |
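
The media-type syntax and the fallback behaviour described above might be handled by a script along these lines. The sketch borrows the header parser from Python's standard email package; parse_content_type is a hypothetical name, and choosing application/octet-stream for an unset value is just one of the options the text permits.

```python
from email.message import Message


def parse_content_type(value: str) -> tuple[str, dict[str, str]]:
    """Split a CONTENT_TYPE value into (media type, parameters)."""
    if not value:
        # Unset: one of the fallbacks the text permits for scripts.
        return "application/octet-stream", {}
    msg = Message()
    msg["Content-Type"] = value
    # Type, subtype and parameter names are not case-sensitive and come
    # back lower-cased; parameter values keep their original case.
    params = {name.lower(): val for name, val in msg.get_params()[1:]}
    return msg.get_content_type(), params
```

A script that prefers to reject untyped data would return a 415 status in the first branch instead.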
573 |
|
574 |
6.1.4. GATEWAY_INTERFACE |
575 |
|
576 |
This metavariable is set to the dialect of CGI being used by |
577 |
the server to communicate with the script. Syntax: |
578 |
|
579 |
GATEWAY_INTERFACE = "CGI" "/" major "." minor |
580 |
major = 1*digit |
581 |
minor = 1*digit |
582 |
|
583 |
Note that the major and minor numbers are treated as separate |
584 |
integers and hence each may be more than a single digit. Thus |
585 |
CGI/2.4 is a lower version than CGI/2.13 which in turn is |
586 |
lower than CGI/12.3. Leading zeros in either the major or the |
587 |
minor number MUST be ignored by scripts and SHOULD NOT be |
593 |
generated by servers. |
594 |
|
595 |
This document defines the 1.1 version of the CGI interface |
596 |
("CGI/1.1"). |
597 |
|
598 |
Servers MUST provide this metavariable to scripts. |
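
The version-comparison rule for GATEWAY_INTERFACE can be illustrated with a short Python sketch that parses the major and minor numbers as separate integers. The function name parse_gateway_interface is hypothetical.

```python
def parse_gateway_interface(value: str) -> tuple[int, int]:
    """Parse "CGI/major.minor" into a (major, minor) tuple of integers."""
    name, _, version = value.partition("/")
    major, _, minor = version.partition(".")
    parts_ok = all(p.isascii() and p.isdigit() for p in (major, minor))
    if name != "CGI" or not parts_ok:
        raise ValueError(f"malformed GATEWAY_INTERFACE: {value!r}")
    # int() discards leading zeros, as the text requires of scripts.
    return int(major), int(minor)
```

Comparing the resulting tuples orders CGI/2.4 below CGI/2.13 and CGI/2.13 below CGI/12.3, which a plain string comparison would not.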
599 |
|
600 |
6.1.5. Protocol-Specific Metavariables |
601 |
|
602 |
These metavariables are specific to the protocol via which the |
603 |
request is made. Interpretation of these variables depends on |
604 |
the value of the SERVER_PROTOCOL metavariable (see section |
605 |
6.1.17). |
606 |
|
607 |
Metavariables with names beginning with "HTTP_" contain values |
608 |
from the request header, if the scheme used was HTTP. Each |
609 |
HTTP header field name is converted to upper case, has all |
610 |
occurrences of "-" replaced with "_", and has "HTTP_" |
611 |
prepended to form the metavariable name. Similar |
612 |
transformations are applied for other protocols. The header |
613 |
data MAY be presented as sent by the client, or MAY be |
614 |
rewritten in ways which do not change its semantics. If |
615 |
multiple header fields with the same field-name are received |
616 |
then the server MUST rewrite them as though they had been |
617 |
received as a single header field having the same semantics |
618 |
before being represented in a metavariable. Similarly, a |
619 |
header field that is received on more than one line MUST be |
620 |
merged into a single line. The server MUST, if necessary, |
621 |
change the representation of the data (for example, the |
622 |
character set) to be appropriate for a CGI metavariable. |
623 |
|
624 |
Servers are not required to create metavariables for all the |
625 |
request header fields that they receive. In particular, they |
626 |
MAY decline to make available any header fields carrying |
627 |
authentication information, such as "Authorization", or which |
628 |
are available to the script via other metavariables, such as |
629 |
"Content-Length" and "Content-Type". |
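
The name transformation and the merging rules above might look as follows on the server side. This is a sketch under assumptions: both function names are hypothetical, repeated fields are joined with ", " (appropriate only for HTTP fields defined as comma-separated lists), and folded lines are merged by collapsing runs of whitespace, which would also alter repeated spaces inside quoted strings.

```python
def header_to_metavariable(field_name: str) -> str:
    """Map an HTTP header field name to its metavariable name."""
    return "HTTP_" + field_name.upper().replace("-", "_")


def headers_to_metavariables(headers: list[tuple[str, str]]) -> dict[str, str]:
    """Build HTTP_* metavariables from (name, value) header pairs."""
    meta: dict[str, str] = {}
    for name, value in headers:
        key = header_to_metavariable(name)
        # Merge a value folded across several lines into a single line.
        value = " ".join(value.split())
        # Assumption: repeated fields are combined with ", ".
        meta[key] = f"{meta[key]}, {value}" if key in meta else value
    return meta
```

A server that withholds sensitive fields such as "Authorization" would filter the input list before calling the second function.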
630 |
|
631 |
6.1.6. PATH_INFO |
632 |
|
633 |
The PATH_INFO metavariable specifies a path to be interpreted |
634 |
by the CGI script. It identifies the resource or sub-resource |
635 |
to be returned by the CGI script, and it is derived from the |
636 |
portion of the URI path following the script name but |
637 |
preceding any query data. The syntax and semantics are similar |
638 |
to a decoded HTTP URL 'path' token (defined in RFC 2396 [4]), |
639 |
with the exception that a PATH_INFO of "/" represents a single |
640 |
void path segment. |
641 |
|
642 |
PATH_INFO = "" | ( "/" path ) |
643 |
path = segment *( "/" segment ) |
644 |
segment = *pchar |
645 |
pchar = <any CHAR except "/"> |
646 |
|
652 |
The PATH_INFO string is the trailing part of the <path> |
653 |
component of the Script-URI (see section 3.2) that follows the |
654 |
SCRIPT_NAME portion of the path. |
655 |
|
656 |
Servers MAY impose their own restrictions and limitations on |
657 |
what values they will accept for PATH_INFO, and MAY reject or |
658 |
edit any values they consider objectionable before passing |
659 |
them to the script. |
660 |
|
661 |
Servers MUST make this URI component available to CGI scripts. |
662 |
The PATH_INFO value is case-sensitive, and the server MUST |
663 |
preserve the case of the PATH_INFO element of the URI when |
664 |
making it available to scripts. |
665 |
|
666 |
6.1.7. PATH_TRANSLATED |
667 |
|
668 |
PATH_TRANSLATED is derived by taking any path-info component |
669 |
of the request URI (see section 6.1.6), decoding it (see |
670 |
section 3.1), parsing it as a URI in its own right, and |
671 |
performing any virtual-to-physical translation appropriate to |
672 |
map it onto the server's document repository structure. If the |
673 |
request URI includes no path-info component, the |
674 |
PATH_TRANSLATED metavariable SHOULD NOT be defined. |
675 |
|
676 |
PATH_TRANSLATED = *CHAR |
677 |
|
678 |
For a request such as the following: |
679 |
|
680 |
http://somehost.com/cgi-bin/somescript/this%2eis%2ethe%2epath%2einfo |
681 |
|
682 |
|
683 |
the PATH_INFO component would be decoded, and the result |
684 |
parsed as though it were a request for the following: |
685 |
|
686 |
http://somehost.com/this.is.the.path.info |
687 |
|
688 |
This would then be translated to a location in the server's |
689 |
document repository, perhaps a filesystem path something like |
690 |
this: |
691 |
|
692 |
/usr/local/www/htdocs/this.is.the.path.info |
693 |
|
694 |
The result of the translation is the value of PATH_TRANSLATED. |
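
Since the translation algorithm is implementation defined, the following Python sketch shows only one simple possibility: decoding the path-info and mapping it beneath a single document root. The function name path_translated and the document_root parameter are hypothetical, and real servers typically apply aliases and other mappings as well.

```python
import posixpath
from typing import Optional
from urllib.parse import unquote


def path_translated(encoded_path_info: str, document_root: str) -> Optional[str]:
    """Map the path-info part of a request URI onto a filesystem path."""
    if not encoded_path_info:
        return None  # no path-info: PATH_TRANSLATED should not be defined
    decoded = unquote(encoded_path_info)
    # Normalise "." and ".." segments so the result cannot climb above the
    # document root (normpath on a rooted path never rises above "/").
    normalised = posixpath.normpath("/" + decoded.lstrip("/"))
    return posixpath.join(document_root, normalised.lstrip("/"))
```

With a document root of /usr/local/www/htdocs, a path-info that decodes to /this.is.the.path.info gives the filesystem path shown in the example above.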
695 |
|
696 |
The value of PATH_TRANSLATED may or may not map to a valid |
697 |
repository location. Servers MUST preserve the case of the |
698 |
path-info segment if and only if the underlying repository |
699 |
supports case-sensitive names. If the repository is only |
700 |
case-aware, case-preserving, or case-blind with regard to |
701 |
document names, servers are not required to preserve the case |
702 |
of the original segment through the translation. |
703 |
|
704 |
The translation algorithm the server uses to derive |
705 |
PATH_TRANSLATED is implementation defined; CGI scripts which |
711 |
use this variable may suffer limited portability. |
712 |
|
713 |
Servers SHOULD provide this metavariable to scripts if and |
714 |
only if the request URI includes a path-info component. |
715 |
|
716 |
6.1.8. QUERY_STRING |
717 |
|
718 |
A URL-encoded string; the <query> part of the Script-URI. (See |
719 |
section 3.2.) |
720 |
|
721 |
QUERY_STRING = query-string |
722 |
query-string = *uric |
723 |
|
724 |
The URL syntax for a query string is described in section 3 of |
725 |
RFC 2396 [4]. |
726 |
|
727 |
Servers MUST supply this value to scripts. The QUERY_STRING |
728 |
value is case-sensitive. If the Script-URI does not include a |
729 |
query component, the QUERY_STRING metavariable MUST be defined |
730 |
as an empty string (""). |
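
Because QUERY_STRING is delivered URL-encoded, decoding is left to the script. The sketch below assumes the common "name=value&name=value" form convention, which this section does not itself mandate, and uses the standard urllib.parse.parse_qsl function; query_pairs is a hypothetical name.

```python
from urllib.parse import parse_qsl


def query_pairs(query_string: str) -> list[tuple[str, str]]:
    """Decode a form-style QUERY_STRING into (name, value) pairs."""
    # keep_blank_values=True preserves fields such as "flag=" that would
    # otherwise be dropped; names and values stay case-sensitive.
    return parse_qsl(query_string, keep_blank_values=True)
```

An empty QUERY_STRING, which servers supply when there is no query component, simply yields an empty list.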
731 |
|
732 |
6.1.9. REMOTE_ADDR |
733 |
|
734 |
The IP address of the client sending the request to the |
735 |
server. This is not necessarily the address of the user agent |
368 |
(for example, when the request arrives through a proxy). |
737 |
|
738 |
REMOTE_ADDR = hostnumber |
739 |
hostnumber = ipv4-address | ipv6-address |
740 |
|
741 |
The definitions of ipv4-address and ipv6-address are provided |
742 |
in Appendix B of RFC 2373 [13]. |
743 |
|
744 |
Servers MUST supply this value to scripts. |
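A script that needs to distinguish the two hostnumber forms could, for
example, rely on Python's standard ipaddress module. The function name
parse_remote_addr is an assumption made for this sketch only.

```python
import ipaddress


def parse_remote_addr(value):
    """Parse a REMOTE_ADDR value into an IPv4Address or IPv6Address.

    Raises ValueError if the value is neither form of hostnumber.
    """
    return ipaddress.ip_address(value)
```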
745 |
|
746 |
6.1.10. REMOTE_HOST |
747 |
|
748 |
The fully qualified domain name of the client sending the |
749 |
request to the server, if available, otherwise NULL. (See |
750 |
section 6.1.9.) Fully qualified domain names take the form |
751 |
described in section 3.5 of RFC 1034 [10] and section 2.1 of |
752 |
RFC 1123 [5]. Domain names are not case sensitive. |
753 |
|
754 |
Servers SHOULD provide this information to scripts. |
755 |
|
756 |
6.1.11. REMOTE_IDENT |
757 |
|
758 |
The identity information reported about the connection by an |
759 |
RFC 1413 [11] request to the remote agent, if available. |
760 |
Servers MAY choose not to support this feature, or not to |
761 |
request the data for efficiency reasons. |
762 |
|
763 |
REMOTE_IDENT = *CHAR |
764 |
|
770 |
The data returned may be used for authentication purposes, but |
771 |
the level of trust reposed in them should be minimal. |
772 |
|
773 |
Servers MAY supply this information to scripts if the RFC 1413 |
774 |
[11] lookup is performed. |
775 |
|
776 |
6.1.12. REMOTE_USER |
777 |
|
778 |
If the request required authentication using the "Basic" |
779 |
mechanism (i.e., the AUTH_TYPE metavariable is set to |
780 |
"Basic"), then the value of the REMOTE_USER metavariable is |
781 |
set to the user-ID supplied. In all other cases the value of |
782 |
this metavariable is undefined. |
783 |
|
784 |
REMOTE_USER = *OCTET |
785 |
|
786 |
This variable is specific to requests made via the HTTP |
787 |
protocol. |
788 |
|
789 |
Servers SHOULD provide this metavariable to scripts. |
790 |
|
791 |
6.1.13. REQUEST_METHOD |
792 |
|
793 |
The REQUEST_METHOD metavariable is set to the method with |
794 |
which the request was made, as described in section 5.1.1 of |
795 |
the HTTP/1.0 specification [3] and section 5.1.1 of the |
796 |
HTTP/1.1 specification [8]. |
797 |
|
798 |
REQUEST_METHOD = http-method |
799 |
http-method = "GET" | "HEAD" | "POST" | "PUT" | "DELETE" |
800 |
| "OPTIONS" | "TRACE" | extension-method |
801 |
extension-method = token |
802 |
|
803 |
The method is case sensitive. CGI/1.1 servers MAY choose to |
804 |
process some methods directly rather than passing them to |
805 |
scripts. |
806 |
|
807 |
This variable is specific to requests made with HTTP. |
808 |
|
809 |
Servers MUST provide this metavariable to scripts. |
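The grammar and the case-sensitivity rule above can be checked with a
short sketch such as the following. It assumes that 'token' has the
same meaning as in the HTTP/1.1 specification; the function names are
hypothetical.

```python
import re

# 'token' as defined by HTTP/1.1: one or more US-ASCII characters other
# than control characters and separators (assumed to match this document).
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_method(method):
    """Return True if 'method' fits the http-method grammar."""
    return _TOKEN.fullmatch(method) is not None


def is_retrieval(method):
    """Case-sensitive test: only the exact strings "GET" and "HEAD" match."""
    return method in ("GET", "HEAD")
```

Because the method is case sensitive, a value of "get" is a valid
extension-method but is not the GET method.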
810 |
|
811 |
6.1.14. SCRIPT_NAME |
812 |
|
813 |
The SCRIPT_NAME metavariable is set to a URL path that could |
814 |
identify the CGI script (rather than the script's output). The |
815 |
syntax and semantics are identical to a decoded HTTP URL |
816 |
'path' token (see RFC 2396 [4]). |
817 |
|
818 |
SCRIPT_NAME = "" | ( "/" [ path ] ) |
819 |
|
820 |
The SCRIPT_NAME string is some leading part of the <path> |
821 |
component of the Script-URI derived in some implementation |
822 |
defined manner. No PATH_INFO or QUERY_STRING segments (see |
823 |
sections 6.1.6 and 6.1.8) are included in the SCRIPT_NAME |
829 |
value. |
830 |
|
831 |
Servers MUST provide this metavariable to scripts. |
832 |
|
833 |
6.1.15. SERVER_NAME |
834 |
|
835 |
The SERVER_NAME metavariable is set to the name of the server, |
836 |
as derived from the <host> part of the Script-URI (see section |
837 |
3.2). |
838 |
|
839 |
SERVER_NAME = hostname | hostnumber |
840 |
|
841 |
Servers MUST provide this metavariable to scripts. |
842 |
|
843 |
6.1.16. SERVER_PORT |
844 |
|
845 |
The SERVER_PORT metavariable is set to the port on which the |
846 |
request was received, as used in the <port> part of the |
847 |
Script-URI. |
848 |
|
849 |
SERVER_PORT = 1*digit |
850 |
|
851 |
If the <port> portion of the script-URI is blank, the actual |
852 |
port number upon which the request was received MUST be |
853 |
supplied. |
854 |
|
855 |
Servers MUST provide this metavariable to scripts. |
856 |
|
857 |
6.1.17. SERVER_PROTOCOL |
858 |
|
859 |
The SERVER_PROTOCOL metavariable is set to the name and |
860 |
revision of the information protocol with which the request |
861 |
arrived. This is not necessarily the same as the protocol |
862 |
version used by the server in its response to the client. |
863 |
|
864 |
SERVER_PROTOCOL = HTTP-Version | extension-version |
865 |
| extension-token |
866 |
HTTP-Version = "HTTP" "/" 1*digit "." 1*digit |
867 |
extension-version = protocol "/" 1*digit "." 1*digit |
868 |
protocol = 1*( alpha | digit | "+" | "-" | "." ) |
869 |
extension-token = token |
870 |
|
871 |
'protocol' is a version of the <scheme> part of the |
872 |
Script-URI, but is not identical to it. For example, the |
873 |
scheme of a request may be "https" while the protocol remains |
874 |
"http". The protocol is not case sensitive, but by convention, |
875 |
'protocol' is in upper case. |
876 |
|
877 |
A well-known extension token value is "INCLUDED", which |
878 |
signals that the current document is being included as part of |
879 |
a composite document, rather than being the direct target of |
880 |
the client request. |
881 |
|
882 |
Servers MUST provide this metavariable to scripts. |
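For illustration only, a script might split the SERVER_PROTOCOL value
along the lines of the grammar above. The name parse_server_protocol
and the tuple it returns are choices made for this sketch.

```python
import re

# protocol "/" 1*digit "." 1*digit, following the grammar above.
_VERSIONED = re.compile(r"([A-Za-z0-9+.-]+)/([0-9]+)\.([0-9]+)")


def parse_server_protocol(value):
    """Split SERVER_PROTOCOL into (protocol, major, minor).

    Extension tokens such as "INCLUDED" carry no version, so they are
    returned unchanged as (token, None, None).
    """
    match = _VERSIONED.fullmatch(value)
    if match is None:
        return (value, None, None)
    protocol, major, minor = match.groups()
    # The protocol name is not case sensitive; upper case is the convention.
    return (protocol.upper(), int(major), int(minor))
```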
883 |
|
889 |
6.1.18. SERVER_SOFTWARE |
890 |
|
891 |
The SERVER_SOFTWARE metavariable is set to the name and |
892 |
version of the information server software answering the |
893 |
request (and running the gateway). |
894 |
|
895 |
SERVER_SOFTWARE = 1*product |
896 |
product = token [ "/" product-version ] |
897 |
product-version = token |
898 |
|
899 |
Servers MUST provide this metavariable to scripts. |
900 |
|
901 |
6.2. Request Message-Bodies |
902 |
|
903 |
As there may be a data entity attached to the request, there |
904 |
MUST be a system defined method for the script to read these |
905 |
data. Unless defined otherwise, this will be via the 'standard |
906 |
input' file descriptor. |
907 |
|
908 |
If the CONTENT_LENGTH value (see section 6.1.2) is non-NULL, |
909 |
the server MUST supply at least that many bytes to scripts on |
910 |
the standard input stream. Scripts are not obliged to read the |
911 |
data. Servers MAY signal an EOF condition after CONTENT_LENGTH |
912 |
bytes have been read, but are not obligated to do so. |
913 |
Therefore, scripts MUST NOT attempt to read more than |
914 |
CONTENT_LENGTH bytes, even if more data are available. |
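The reading rule above might be implemented roughly as follows on a
system where the body arrives on standard input. This is a sketch under
that assumption; read_request_body is a hypothetical helper, and the
'stream' parameter exists only so the function can be exercised without
a server.

```python
import os
import sys


def read_request_body(environ=os.environ, stream=None):
    """Read at most CONTENT_LENGTH bytes of the request body."""
    if stream is None:
        stream = sys.stdin.buffer
    length_text = environ.get("CONTENT_LENGTH", "")
    if not length_text:
        return b""              # NULL value: no message-body is attached
    remaining = int(length_text)
    chunks = []
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:           # the stream ended before CONTENT_LENGTH bytes
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The loop never asks for more than the bytes still owed, so it does not
block waiting for an EOF that the server is not obliged to send.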
915 |
|
916 |
For non-parsed header (NPH) scripts (see section 7.1 below), |
917 |
servers SHOULD attempt to ensure that the data supplied to the |
918 |
script are precisely as supplied by the client and unaltered |
919 |
by the server. |
920 |
|
921 |
Section 8.1.2 describes the requirements of servers with |
922 |
regard to requests that include message-bodies. |
923 |
|
924 |
7. Data Output from the CGI Script |
925 |
|
926 |
There MUST be a system defined method for the script to send |
927 |
data back to the server or client; a script MUST always return |
928 |
some data. Unless defined otherwise, this will be via the |
929 |
'standard output' file descriptor. |
930 |
|
931 |
There are two forms of output that scripts can supply to |
932 |
servers: non-parsed header (NPH) output, and parsed header |
933 |
output. Servers MUST support parsed header output and MAY |
934 |
support NPH output. The method of distinguishing between the |
935 |
two types of output (or scripts) is implementation defined. |
936 |
|
937 |
Servers MAY implement a timeout period within which data must |
938 |
be received from scripts. If a server implementation defines |
939 |
such a timeout and receives no data from a script within the |
940 |
timeout period, the server MAY terminate the script process |
941 |
and SHOULD abort the client request with either a '504 Gateway |
947 |
Timeout' or a '500 Internal Server Error' response. |
948 |
|
949 |
7.1. Non-Parsed Header Output |
950 |
|
951 |
Scripts using the NPH output form MUST return a complete HTTP |
952 |
response message, as described in Section 6 of the HTTP |
953 |
specifications [3,8]. NPH scripts MUST use the SERVER_PROTOCOL |
954 |
variable to determine the appropriate format for a response. |
955 |
|
956 |
Servers SHOULD attempt to ensure that the script output is |
957 |
sent directly to the client, with minimal internal and no |
958 |
transport-visible buffering. |
959 |
|
960 |
7.2. Parsed Header Output |
961 |
|
962 |
Scripts using the parsed header output form MUST supply a CGI |
963 |
response message to the server as follows: |
964 |
|
965 |
CGI-Response = *optional-field CGI-Field *optional-field NL |
966 |
[ Message-Body ] |
967 |
optional-field = ( CGI-Field | HTTP-Field ) |
968 |
CGI-Field = Content-type |
969 |
| Location |
970 |
| Status |
971 |
| extension-header |
972 |
|
973 |
The response comprises a header and a body, separated by a |
974 |
blank line. The body may be NULL. The header fields are either |
975 |
CGI header fields to be interpreted by the server, or HTTP |
976 |
header fields to be included in the response returned to the |
977 |
client if the request method is HTTP. At least one CGI-Field |
978 |
MUST be supplied, but no CGI field name may be used more than |
979 |
once in a response. If a body is supplied, then a |
980 |
"Content-type" header field MUST be supplied by the script, |
981 |
otherwise the script MUST send a "Location" or "Status" header |
982 |
field. If a Location CGI-Field is returned, then the script |
983 |
MUST NOT supply any HTTP-Fields. |
984 |
|
985 |
Each header field in a CGI-Response MUST be specified on a |
986 |
single line; CGI/1.1 does not support continuation lines. |
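The rules above can be expressed as a small builder. The sketch below
is illustrative: build_cgi_response is a hypothetical name, NL is
assumed to be LF, and the duplicate check covers only the three named
CGI fields, since extension fields cannot be told apart from HTTP
fields by name alone.

```python
CGI_FIELD_NAMES = {"content-type", "location", "status"}


def build_cgi_response(fields, body=b""):
    """Serialise a parsed-header CGI response as bytes.

    'fields' is a list of (name, value) string pairs.
    """
    names = [name.lower() for name, _ in fields]
    cgi_names = [name for name in names if name in CGI_FIELD_NAMES]
    if len(cgi_names) != len(set(cgi_names)):
        raise ValueError("a CGI field name may be used only once")
    if body and "content-type" not in names:
        raise ValueError("a body requires a Content-type field")
    if not body and not {"location", "status"} & set(names):
        raise ValueError("a response without a body needs Location or Status")
    lines = []
    for name, value in fields:
        if "\n" in value or "\r" in value:
            raise ValueError("continuation lines are not supported")
        lines.append(f"{name}: {value}\n")
    # A blank line separates the header from the (possibly empty) body.
    return "".join(lines).encode("ascii") + b"\n" + body
```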
987 |
|
988 |
7.2.1. CGI header fields |
989 |
|
990 |
The CGI header fields have the generic syntax: |
991 |
|
992 |
generic-field = field-name ":" [ field-value ] NL |
993 |
field-name = token |
994 |
field-value = *( field-content | LWSP ) |
995 |
field-content = *( token | tspecial | quoted-string ) |
996 |
|
997 |
The field-name is not case sensitive; a NULL field value is |
998 |
equivalent to the header field not being sent. |
999 |
|
1000 |
7.2.1.1. Content-Type |
1001 |
|
1007 |
The Internet Media Type [9] of the entity body, which is to be |
1008 |
sent unmodified to the client. |
1009 |
|
1010 |
Content-Type = "Content-Type" ":" media-type NL |
1011 |
|
1012 |
This is actually an HTTP-Field rather than a CGI-Field, but it |
1013 |
is listed here because of its importance in the CGI dialogue |
1014 |
as a member of the "one of these is required" set of header |
1015 |
fields. |
1016 |
|
1017 |
7.2.1.2. Location |
1018 |
|
1019 |
This is used to specify to the server that the script is |
1020 |
returning a reference to a document rather than an actual |
1021 |
document. |
1022 |
|
1023 |
Location = "Location" ":" |
1024 |
( fragment-URI | rel-URL-abs-path ) NL |
1025 |
fragment-URI = URI [ "#" fragmentid ] |
1026 |
URI = scheme ":" *qchar |
1027 |
fragmentid = *qchar |
1028 |
rel-URL-abs-path = "/" [ hpath ] [ "?" query-string ] |
1029 |
hpath = fpsegment *( "/" psegment ) |
1030 |
fpsegment = 1*hchar |
1031 |
psegment = *hchar |
1032 |
hchar = alpha | digit | safe | extra |
1033 |
| ":" | "@" | "&" | "=" |
1034 |
|
1035 |
The Location value is either an absolute URI with optional |
1036 |
fragment, as defined in RFC 1630 [1], or an absolute path |
1037 |
within the server's URI space (i.e., omitting the scheme and |
1038 |
network-related fields) and optional query-string. If an |
1039 |
absolute URI is returned by the script, then the server MUST |
1040 |
generate a '302 redirect' HTTP response message unless the |
1041 |
script has supplied an explicit Status response header field. |
1042 |
Scripts returning an absolute URI MAY choose to provide a |
1043 |
message-body. Servers MUST make any appropriate modifications |
1044 |
to the script's output to ensure the response to the |
1045 |
user-agent complies with the response protocol version. If the |
1046 |
Location value is a path, then the server MUST generate the |
1047 |
response that it would have produced in response to a request |
1048 |
containing the URL |
1049 |
|
1050 |
scheme "://" SERVER_NAME ":" SERVER_PORT rel-URL-abs-path |
1051 |
|
1052 |
Note: If the request was accompanied by a message-body (such |
1053 |
as for a POST request), and the script redirects the request |
1054 |
with a Location field, the message-body may not be available |
1055 |
to the resource that is the target of the redirect. |
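A server might distinguish the two Location forms as sketched below.
The function names and the labels "internal" and "client-redirect" are
inventions of this example, and the default scheme of "http" is an
assumption.

```python
import re

# scheme ":" at the start of the value marks an absolute URI.
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def classify_location(value):
    """Return "client-redirect" for an absolute URI, "internal" for a path."""
    if value.startswith("/"):
        return "internal"
    if _SCHEME.match(value):
        return "client-redirect"
    raise ValueError("Location must be an absolute URI or an absolute path")


def internal_request_url(path, environ, scheme="http"):
    """Build the URL whose response replaces the script's own output."""
    return f"{scheme}://{environ['SERVER_NAME']}:{environ['SERVER_PORT']}{path}"
```

For a "client-redirect" result the server would send a 302 response
unless the script supplied its own Status field.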
1056 |
|
1057 |
7.2.1.3. Status |
1058 |
|
1059 |
The "Status" header field is used to indicate to the server |
1065 |
what status code the server MUST use in the response message. |
1066 |
|
1067 |
Status = "Status" ":" digit digit digit SP reason-phrase NL |
1068 |
reason-phrase = *<CHAR, excluding CTLs, NL> |
1069 |
|
1070 |
The valid status codes are listed in section 6.1.1 of the |
1071 |
HTTP/1.0 specification [3]. If the SERVER_PROTOCOL is |
1072 |
"HTTP/1.1", then the status codes defined in the HTTP/1.1 |
1073 |
specification [8] may be used. If the script does not return a |
1074 |
"Status" header field, then "200 OK" SHOULD be assumed by the |
1075 |
server. |
1076 |
|
1077 |
If a script is being used to handle a particular error or |
1078 |
condition encountered by the server, such as a '404 Not Found' |
1079 |
error, the script SHOULD use the "Status" CGI header field to |
1080 |
propagate the error condition back to the client. E.g., in the |
1081 |
example mentioned it SHOULD include a "Status: 404 Not Found" |
1082 |
in the header data returned to the server. |
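On the server side, the Status value could be interpreted roughly as
follows. parse_status is a hypothetical helper; passing None stands
for a response in which the script sent no Status field.

```python
import re

# digit digit digit SP reason-phrase, where the phrase excludes CTLs.
_STATUS = re.compile(r"([0-9]{3}) ([^\x00-\x1f\x7f]*)")


def parse_status(value=None):
    """Return (code, reason) from a Status field value.

    A missing field is treated as "200 OK".
    """
    if value is None:
        return (200, "OK")
    # Leading white space after the ":" is not part of the status code.
    match = _STATUS.fullmatch(value.lstrip(" \t"))
    if match is None:
        raise ValueError("malformed Status value")
    return (int(match.group(1)), match.group(2))
```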
1083 |
|
1084 |
7.2.1.4. Extension header fields |
1085 |
|
1086 |
Scripts MAY include in their CGI response header additional |
1087 |
fields not defined in this or the HTTP specification. These |
1088 |
are called "extension" fields, and have the syntax of a |
1089 |
generic-field as defined in section 7.2.1. The name of an |
1090 |
extension field MUST NOT conflict with a field name defined in |
1091 |
this or any other specification; extension field names SHOULD |
1092 |
begin with "X-CGI-" to ensure uniqueness. |
1093 |
|
1094 |
7.2.2. HTTP header fields |
1095 |
|
1096 |
The script MAY return any other header fields defined by the |
1097 |
specification for the SERVER_PROTOCOL (HTTP/1.0 [3] or |
1098 |
HTTP/1.1 [8]). Servers MUST resolve conflicts between CGI |
1099 |
header and HTTP header formats or names (see section 8). |
1100 |
|
1101 |
8. Server Implementation |
1102 |
|
1103 |
This section defines the requirements that must be met by HTTP |
1104 |
servers in order to provide a coherent and correct CGI/1.1 |
1105 |
environment in which scripts may function. It is intended |
1106 |
primarily for server implementors, but it is useful for script |
1107 |
authors to be familiar with the information as well. |
1108 |
|
1109 |
8.1. Requirements for Servers |
1110 |
|
1111 |
In order to be considered CGI/1.1-compliant, a server must |
1112 |
meet certain basic criteria and provide certain minimal |
1113 |
functionality. The details of these requirements are described |
1114 |
in the following sections. |
1115 |
|
1116 |
8.1.1. Script-URI |
1117 |
|
1118 |
Servers MUST support the standard mechanism (described below) |
1124 |
which allows script authors to determine what URL to use in |
1125 |
documents which reference the script; specifically, what URL |
1126 |
to use in order to achieve particular settings of the |
1127 |
metavariables. This mechanism is as follows: |
1128 |
|
1129 |
The server MUST translate the header data from the CGI header |
1130 |
field syntax to the HTTP header field syntax if these differ. |
1131 |
For example, the character sequence for newline (such as |
1132 |
Unix's ASCII NL) used by CGI scripts may not be the same as |
1133 |
that used by HTTP (ASCII CR followed by LF). The server MUST |
1134 |
also resolve any conflicts between header fields returned by |
1135 |
the script and header fields that it would otherwise send |
1136 |
itself. |
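The newline translation described above might look like the following
sketch. It covers only the newline sequence, not the resolution of
conflicting fields, and cgi_header_to_http is a hypothetical name. The
input is assumed to be the header block alone, including its
terminating blank line.

```python
def cgi_header_to_http(header_block):
    """Rewrite a CGI header block so every line ends in CR LF.

    Lines that already end in CR LF are accepted as well.
    """
    lines = header_block.replace(b"\r\n", b"\n").split(b"\n")
    return b"\r\n".join(lines)
```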
1137 |
|
1138 |
8.1.2. Request Message-body Handling |
1139 |
|
1140 |
These are the requirements for server handling of |
1141 |
message-bodies directed to CGI/1.1 resources: |
1142 |
|
1143 |
1. The message-body the server provides to the CGI script |
1144 |
MUST have any transfer encodings removed. |
1145 |
2. The server MUST derive and provide a value for the |
1146 |
CONTENT_LENGTH metavariable that reflects the length of |
1147 |
the message-body after any transfer decoding. |
1148 |
3. The server MUST leave intact any content-encodings of the |
1149 |
message-body. |
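As an example of requirements 1 and 2, a server receiving a body with
the HTTP/1.1 "chunked" transfer coding could decode it before invoking
the script. The sketch below is simplified: it ignores chunk
extensions, does not process trailer fields, and assumes the stream
returns each chunk in full. Both function names are hypothetical.

```python
def decode_chunked(stream):
    """Read an HTTP/1.1 chunked body from 'stream' and return the payload."""
    payload = bytearray()
    while True:
        size_line = stream.readline()
        # The chunk size is hexadecimal; anything after ";" is an extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        payload += stream.read(size)
        stream.readline()       # consume the CR LF that ends the chunk
    return bytes(payload)


def body_metavariables(payload):
    """CONTENT_LENGTH reflects the length after transfer decoding."""
    return {"CONTENT_LENGTH": str(len(payload))}
```

Any content-encoding (requirement 3) is left untouched; the payload is
passed to the script exactly as decoded here.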
1150 |
|
1151 |
8.1.3. Required Metavariables |
1152 |
|
1153 |
Servers MUST provide scripts with certain information and |
1154 |
metavariables as described in section 8.3. |
1155 |
|
1156 |
8.1.4. Response Compliance |
1157 |
|
1158 |
Servers MUST ensure that responses sent to the user-agent meet |
1159 |
all requirements of the protocol level in effect. This may |
1160 |
involve modifying, deleting, or augmenting any header fields |
1161 |
and/or message-body supplied by the script. |
1162 |
|
1163 |
8.2. Recommendations for Servers |
1164 |
|
1165 |
Servers SHOULD provide the "query" component of the script-URI |
1166 |
as command-line arguments to scripts if it does not contain |
1167 |
any unencoded '=' characters and the command-line arguments |
1168 |
can be generated in an unambiguous manner. (See section 5.) |
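One possible reading of this recommendation is sketched below. It
assumes the indexed-query convention in which search words are
separated by "+" and each word is URL-decoded; section 5 remains the
authoritative description, and command_line_arguments is a hypothetical
name.

```python
from urllib.parse import unquote


def command_line_arguments(query):
    """Return the argument list for a query, or [] if it must not be used."""
    # An unencoded "=" marks a form-style query, which is not passed as
    # command-line arguments.
    if not query or "=" in query:
        return []
    return [unquote(word) for word in query.split("+")]
```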
1169 |
|
1170 |
Servers SHOULD set the AUTH_TYPE metavariable to the value of |
1171 |
the 'auth-scheme' token of the "Authorization" field if it was |
1172 |
supplied as part of the request header. (See section 6.1.1.) |
1173 |
|
1174 |
Where applicable, servers SHOULD set the current working |
1175 |
directory to the directory in which the script is located |
1176 |
before invoking it. |
1177 |
|
1183 |
Servers MAY reject with error '404 Not Found' any requests |
1184 |
that would result in an encoded "/" being decoded into |
1185 |
PATH_INFO or SCRIPT_NAME, as this might represent a loss of |
1186 |
information to the script. |
1187 |
|
1188 |
Although the server and the CGI script need not be consistent |
1189 |
in their handling of URL paths (client URLs and the PATH_INFO |
1190 |
data, respectively), server authors may wish to impose |
1191 |
consistency. So the server implementation SHOULD define its |
1192 |
behaviour for the following cases: |
1193 |
|
1194 |
1. define any restrictions on allowed characters, in |
1195 |
particular whether ASCII NUL is permitted; |
1196 |
2. define any restrictions on allowed path segments, in |
1197 |
particular whether non-terminal NULL segments are |
1198 |
permitted; |
1199 |
3. define the behaviour for "." or ".." path segments; i.e., |
1200 |
whether they are prohibited, treated as ordinary path |
1201 |
segments or interpreted in accordance with the relative |
1202 |
URL specification [7]; |
1203 |
4. define any limits of the implementation, including limits |
1204 |
on path or search string lengths, and limits on the volume |
1205 |
of header data the server will parse. |
1206 |
|
1207 |
Servers MAY generate the Script-URI in any way from the client |
1208 |
URI, or from any other data (but the behaviour SHOULD be |
1209 |
documented). |
1210 |
|
1211 |
For non-parsed header (NPH) scripts (see section 7.1), servers |
1212 |
SHOULD attempt to ensure that the script input comes directly |
1213 |
from the client, with minimal buffering. For all scripts the |
1214 |
data will be as supplied by the client. |
1215 |
|
1216 |
8.3. Summary of Metavariables |
1217 |
|
1218 |
Servers MUST provide the following metavariables to scripts. |
1219 |
See the individual descriptions for exceptions and semantics. |
1220 |
|
1221 |
CONTENT_LENGTH (section 6.1.2) |
1222 |
CONTENT_TYPE (section 6.1.3) |
1223 |
GATEWAY_INTERFACE (section 6.1.4) |
1224 |
PATH_INFO (section 6.1.6) |
1225 |
QUERY_STRING (section 6.1.8) |
1226 |
REMOTE_ADDR (section 6.1.9) |
1227 |
REQUEST_METHOD (section 6.1.13) |
1228 |
SCRIPT_NAME (section 6.1.14) |
1229 |
SERVER_NAME (section 6.1.15) |
1230 |
SERVER_PORT (section 6.1.16) |
1231 |
SERVER_PROTOCOL (section 6.1.17) |
1232 |
SERVER_SOFTWARE (section 6.1.18) |
1233 |
|
1234 |
Servers SHOULD define the following metavariables for scripts. |
1235 |
See the individual descriptions for exceptions and semantics. |
1236 |
|
1242 |
AUTH_TYPE (section 6.1.1) |
1243 |
REMOTE_HOST (section 6.1.10) |
1244 |
|
1245 |
|
1246 |
In addition, servers SHOULD provide metavariables for all |
1247 |
fields present in the HTTP request header, with the exception |
1248 |
of those involved with access control. Servers MAY at their |
1249 |
discretion provide metavariables for access control fields. |
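The mapping from request header fields to metavariables could be
sketched as follows. It assumes the naming pattern seen in names such
as HTTP_AUTHORIZATION (upper case, "-" replaced by "_", prefixed with
"HTTP_"). The function names and the default list of withheld fields
are choices made for this example.

```python
def header_to_metavariable(field_name):
    """Derive a metavariable name from an HTTP header field name."""
    return "HTTP_" + field_name.upper().replace("-", "_")


def request_metavariables(headers,
                          withheld=("authorization", "proxy-authorization")):
    """Map header fields to metavariables, omitting access control fields."""
    return {
        header_to_metavariable(name): value
        for name, value in headers.items()
        if name.lower() not in withheld
    }
```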
1250 |
|
1251 |
Servers MAY define the following metavariables. See the |
1252 |
individual descriptions for exceptions and semantics. |
1253 |
|
1254 |
PATH_TRANSLATED (section 6.1.7) |
1255 |
REMOTE_IDENT (section 6.1.11) |
1256 |
REMOTE_USER (section 6.1.12) |
1257 |
|
1258 |
Servers MAY at their discretion define additional |
1259 |
implementation-specific extension metavariables provided their |
1260 |
names do not conflict with defined header field names. |
1261 |
Implementation-specific metavariable names SHOULD be prefixed |
1262 |
with "X_" (e.g., "X_DBA") to avoid the potential for such |
1263 |
conflicts. |
1264 |
|
1265 |
9. Script Implementation |
1266 |
|
1267 |
This section defines the requirements and recommendations for |
1268 |
scripts that are intended to function in a CGI/1.1 |
1269 |
environment. It is intended primarily as a reference for |
1270 |
script authors, but server implementors should be familiar |
1271 |
with these issues as well. |
1272 |
|
1273 |
9.1. Requirements for Scripts |
1274 |
|
1275 |
Scripts using the parsed-header method to communicate with |
1276 |
servers MUST supply a response header to the server. (See |
1277 |
section 7.) |
1278 |
|
1279 |
Scripts using the NPH method to communicate with servers MUST |
1280 |
provide complete HTTP responses, and MUST use the value of the |
1281 |
SERVER_PROTOCOL metavariable to determine the appropriate |
1282 |
format. (See section 7.1.) |
1283 |
|
1284 |
Scripts MUST check the value of the REQUEST_METHOD |
1285 |
metavariable in order to provide an appropriate response. (See |
1286 |
section 6.1.13.) |
1287 |
|
1288 |
Scripts MUST be prepared to handle URL-encoded values in |
1289 |
metavariables. In addition, they MUST recognise both "+" and |
1290 |
"%20" in URL-encoded quantities as representing the space |
1291 |
character. (See section 3.1.) |
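In Python, for instance, the standard library already treats both
forms as a space. The wrapper name decode_url_encoded is used here
only for illustration.

```python
from urllib.parse import unquote_plus


def decode_url_encoded(value):
    """Decode %XX escapes and treat "+" as a space."""
    return unquote_plus(value)
```

A literal plus sign therefore has to arrive encoded as "%2B".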
1292 |
|
1293 |
Scripts MUST ignore leading zeros in the major and minor |
1294 |
version numbers in the GATEWAY_INTERFACE metavariable value. |
1295 |
(See section 6.1.4.) |
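Comparing the version numerically rather than as text satisfies this
requirement. The sketch assumes the value has the form "CGI" "/"
1*digit "." 1*digit; gateway_version is a hypothetical helper.

```python
import re

_GATEWAY = re.compile(r"CGI/([0-9]+)\.([0-9]+)")


def gateway_version(value):
    """Return (major, minor) as integers from a GATEWAY_INTERFACE value."""
    match = _GATEWAY.fullmatch(value)
    if match is None:
        raise ValueError("not a CGI version string")
    # int() discards leading zeros, so "CGI/01.001" equals "CGI/1.1".
    return (int(match.group(1)), int(match.group(2)))
```

Treating the two numbers as separate integers also keeps "CGI/1.10"
distinct from "CGI/1.1".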
1296 |
|
1302 |
When processing requests that include a message-body, scripts |
1303 |
MUST NOT read more than CONTENT_LENGTH bytes from the input |
1304 |
stream. (See sections 6.1.2 and 6.2.) |
1305 |
|
1306 |
9.2. Recommendations for Scripts |
1307 |
|
1308 |
Servers may interrupt or terminate script execution at any |
1309 |
time and without warning, so scripts SHOULD be prepared to |
1310 |
deal with abnormal termination. |
1311 |
|
1312 |
Scripts MUST reject with error '405 Method Not Allowed' |
1313 |
requests made using methods that they do not support. If the |
1314 |
script does not intend to process the PATH_INFO data, then it |
1315 |
SHOULD reject the request with '404 Not Found' if PATH_INFO is |
1316 |
not NULL. |
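Both checks can be made before any other processing, as in this
sketch. The function name, its parameters, and the default list of
supported methods are assumptions of the example.

```python
def early_rejection(environ, supported=("GET", "HEAD"), uses_path_info=False):
    """Return a Status value for requests the script will not serve.

    Returns None when the request may proceed.
    """
    if environ.get("REQUEST_METHOD") not in supported:
        return "405 Method Not Allowed"
    if not uses_path_info and environ.get("PATH_INFO", ""):
        return "404 Not Found"
    return None
```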
1317 |
|
1318 |
If a script is processing the output of a form, it SHOULD |
1319 |
verify that the CONTENT_TYPE is |
1320 |
"application/x-www-form-urlencoded" [2] or whatever other |
1321 |
media type is expected. |
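A minimal check of this kind is sketched below; is_form_submission is
a hypothetical name. Any parameters following the media type are
ignored for the comparison.

```python
def is_form_submission(environ):
    """Return True if CONTENT_TYPE names the URL-encoded form media type."""
    content_type = environ.get("CONTENT_TYPE", "")
    # Media type names are case-insensitive; parameters follow a ";".
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/x-www-form-urlencoded"
```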
1322 |
|
1323 |
Scripts parsing PATH_INFO, PATH_TRANSLATED, or SCRIPT_NAME |
1324 |
SHOULD be careful of void path segments ("//") and special |
1325 |
path segments ("." and ".."). They SHOULD either be removed |
1326 |
from the path before use in OS system calls, or the request |
1327 |
SHOULD be rejected with '404 Not Found'. |
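A script taking the rejection option might test the path as follows
before passing it to any system call. This is a sketch;
has_unsafe_segments is a hypothetical name.

```python
def has_unsafe_segments(path):
    """Return True if 'path' contains void, "." or ".." segments."""
    segments = path.split("/")
    # A leading "/" yields an empty first item and a trailing "/" an empty
    # last item; only empty items between them indicate "//".
    inner = segments[1:-1] if len(segments) > 2 else []
    if any(segment == "" for segment in inner):
        return True
    return any(segment in (".", "..") for segment in segments)
```

A request for which this returns True would be answered with
'404 Not Found'.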
1328 |
|
1329 |
As it is impossible for scripts to determine the client URI |
1330 |
that initiated a request without knowledge of the specific |
1331 |
server in use, the script SHOULD NOT return "text/html" |
1332 |
documents containing relative URL links without including a |
1333 |
"<BASE>" tag in the document. |
1334 |
|
1335 |
When returning header fields, scripts SHOULD try to send the |
1336 |
CGI header fields (see section 7.2) as soon as possible, and |
1337 |
SHOULD send them before any HTTP header fields. This may help |
1338 |
reduce the server's memory requirements. |
1339 |
|
1340 |
10. System Specifications |
1341 |
|
1342 |
10.1. AmigaDOS |
1343 |
|
1344 |
The implementation of the CGI on an AmigaDOS operating system |
1345 |
platform SHOULD use environment variables as the mechanism of |
1346 |
providing request metadata to CGI scripts. |
1347 |
|
1348 |
Environment variables |
1349 |
These are accessed by the DOS library routine GetVar. |
1350 |
The flags argument SHOULD be 0. Case is ignored, but |
1351 |
upper case is recommended for compatibility with |
1352 |
case-sensitive systems. |
1353 |
|
1354 |
The current working directory |
1360 |
The current working directory for the script is set to |
1361 |
the directory containing the script. |
1362 |
|
1363 |
Character set |
1364 |
The US-ASCII character set is used for the definition |
1365 |
of environment variable names and header field names; |
1366 |
the newline (NL) sequence is LF; servers SHOULD also |
1367 |
accept CR LF as a newline. |
1368 |
|
1369 |
10.2. Unix |
1370 |
|
1371 |
The implementation of the CGI on a Unix operating system |
1372 |
platform SHOULD use environment variables as the mechanism of |
1373 |
providing request metadata to CGI scripts. |
1374 |
|
1375 |
For Unix compatible operating systems, the following are |
1376 |
defined: |
1377 |
|
1378 |
Environment variables |
1379 |
These are accessed by the C library routine getenv. |
1380 |
|
1381 |
The command line |
1382 |
This is accessed using the argc and argv arguments to |
1383 |
main(). The words have any characters that are 'active' |
1384 |
in the Bourne shell escaped with a backslash. If the |
1385 |
value of the QUERY_STRING metavariable contains an |
1386 |
unencoded equals-sign '=', then the command line SHOULD |
1387 |
NOT be used by the script. |
1388 |
|
1389 |
The current working directory |
1390 |
The current working directory for the script SHOULD be |
1391 |
set to the directory containing the script. |
1392 |
|
1393 |
Character set |
1394 |
The US-ASCII character set is used for the definition |
1395 |
of environment variable names and header field names; |
1396 |
the newline (NL) sequence is LF; servers SHOULD also |
1397 |
accept CR LF as a newline. |
1398 |
|
1399 |
11. Security Considerations |
1400 |
|
1401 |
11.1. Safe Methods |
1402 |
|
1403 |
As discussed in the security considerations of the HTTP |
1404 |
specifications [3,8], the convention has been established that |
1405 |
the GET and HEAD methods should be 'safe'; they should cause |
1406 |
no side-effects and only have the significance of resource |
1407 |
retrieval. |
1408 |
|
1409 |
CGI scripts are responsible for enforcing any HTTP security |
1410 |
considerations [3,8] with respect to the protocol version |
1411 |
level of the request and any side effects generated by the |
1412 |
scripts on behalf of the server. Primary among these are the |
1413 |
considerations of safe and idempotent methods. Idempotent |
1419 |
requests are those that may be repeated an arbitrary number of |
1420 |
times and produce side effects identical to a single request. |
1421 |
|
1422 |
11.2. HTTP Header Fields Containing Sensitive Information |
1423 |
|
1424 |
Some HTTP header fields may carry sensitive information which |
1425 |
the server SHOULD NOT pass on to the script unless explicitly |
1426 |
configured to do so. For example, if the server protects the |
1427 |
script using the "Basic" authentication scheme, then the |
1428 |
client will send an "Authorization" header field containing a |
1429 |
username and password. If the server, rather than the script, |
1430 |
validates this information then the password SHOULD NOT be |
1431 |
passed on to the script via the HTTP_AUTHORIZATION |
1432 |
metavariable without careful consideration. This also applies |
1433 |
to the Proxy-Authorization header field and the corresponding |
1434 |
HTTP_PROXY_AUTHORIZATION metavariable. |
1435 |
|
1436 |
11.3. Script Interference with the Server |
1437 |
|
1438 |
The most common implementation of CGI invokes the script as a |
1439 |
child process using the same user and group as the server |
1440 |
process. It SHOULD therefore be ensured that the script cannot |
1441 |
interfere with the server process, its configuration, or |
1442 |
documents. |
1443 |
|
1444 |
If the script is executed by calling a function linked in to |
1445 |
the server software (either at compile-time or run-time) then |
1446 |
precautions SHOULD be taken to protect the core memory of the |
1447 |
server, or to ensure that untrusted code cannot be executed. |
1448 |
|
1449 |
11.4. Data Length and Buffering Considerations |
1450 |
|
1451 |
This specification places no limits on the length of |
1452 |
message-bodies presented to the script. Scripts should not |
1453 |
assume that statically allocated buffers of any size are |
1454 |
sufficient to contain the entire submission at one time. Use |
1455 |
of a fixed length buffer without careful overflow checking may |
1456 |
result in an attacker exploiting 'stack-smashing' or |
1457 |
'stack-overflow' vulnerabilities of the operating system. |
1458 |
Scripts may spool large submissions to disk or other buffering |
1459 |
media, but a rapid succession of large submissions may result |
1460 |
in denial of service conditions. If the CONTENT_LENGTH of a |
1461 |
message-body is larger than resource considerations allow, |
1462 |
scripts should respond with an error status appropriate for |
1463 |
the protocol version; potentially applicable status codes |
1464 |
include '503 Service Unavailable' (HTTP/1.0 and HTTP/1.1), |
1465 |
'413 Request Entity Too Large' (HTTP/1.1), and '414 |
1466 |
Request-URI Too Long' (HTTP/1.1). |
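A script could apply such a limit before reading any input, as in the
sketch below. The limit of 1,000,000 bytes and the name
check_body_size are arbitrary choices for illustration.

```python
def check_body_size(environ, limit=1_000_000):
    """Return a Status value if the declared body is too large, else None."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    if length <= limit:
        return None
    # 413 is defined only by HTTP/1.1; fall back to 503 otherwise.
    if environ.get("SERVER_PROTOCOL", "").upper() == "HTTP/1.1":
        return "413 Request Entity Too Large"
    return "503 Service Unavailable"
```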
1467 |
|
1468 |
11.5. Stateless Processing |
1469 |
|
1470 |
The stateless nature of the Web makes each script execution |
1471 |
and resource retrieval independent of all others even when |
1472 |
multiple requests constitute a single conceptual Web |
1478 |
transaction. Because of this, a script should not make any |
1479 |
assumptions about the context of the user-agent submitting a |
1480 |
request. In particular, scripts should examine data obtained |
1481 |
from the client and verify that they are valid, both in form |
1482 |
and content, before allowing them to be used for sensitive |
1483 |
purposes such as input to other applications, commands, or |
1484 |
operating system services. These uses include, but are not |
1485 |
limited to: system call arguments, database writes, |
1486 |
dynamically evaluated source code, and input to billing or |
1487 |
other secure processes. It is important that applications be |
1488 |
protected from invalid input regardless of whether the |
1489 |
invalidity is the result of user error, logic error, or |
1490 |
malicious action. |
1491 |
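
As an illustration of checking form before use, the sketch below
validates a client-supplied value against an allow-list pattern
and only then passes it to a child process. The field rule
(USERNAME_RE), the function names, and the child command are all
hypothetical; a real script would substitute its own rules and
would also need to check content, not just form.

```python
import re
import subprocess
import sys

# Hypothetical rule: 1 to 32 ASCII letters, digits, '_' or '-', and no
# leading '-' so the value cannot be mistaken for a command-line option.
USERNAME_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_-]{0,31}")


def validated_username(value):
    """Return value if it matches the allow-list pattern, else raise ValueError."""
    if not isinstance(value, str) or USERNAME_RE.fullmatch(value) is None:
        raise ValueError("invalid username")
    return value


def lookup(value):
    """Pass the validated value to a child process as a single argument."""
    name = validated_username(value)
    # An argument list (no shell) means the value is never parsed as
    # shell syntax. The child command here is a stand-in for a real tool.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Accepting only what is known to be valid, rather than trying to
strip what is known to be dangerous, covers user error, logic
error, and malicious input with the same check.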
|
Authors of scripts involved in multi-request transactions
should be particularly careful to validate the state
information; undesirable effects may result from the
substitution of dangerous values for portions of the
submission which might otherwise be presumed safe. Subversion
of this type occurs when alterations are made to data from a
prior stage of the transaction that were not meant to be
controlled by the client (e.g., hidden HTML form elements,
cookies, or embedded URLs).
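
One possible way to detect such alterations, which this
specification does not itself prescribe, is to attach a keyed
message authentication code to any state handed to the client and
to verify it when the state comes back. The sketch below uses
HMAC-SHA256; SECRET_KEY, sign_state and verify_state are
hypothetical names, and the key is assumed to be known only to the
server.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it must never be sent to the client.
SECRET_KEY = b"replace-with-a-server-side-secret"


def _mac(state):
    """Return the hex HMAC-SHA256 of state under SECRET_KEY."""
    return hmac.new(SECRET_KEY, state.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_state(state):
    """Return 'state|mac', suitable for a hidden form field or cookie value."""
    return f"{state}|{_mac(state)}"


def verify_state(token):
    """Return the original state, or raise ValueError if it was altered."""
    state, sep, mac = token.rpartition("|")
    if not sep:
        raise ValueError("missing signature")
    # compare_digest avoids leaking, through timing, how much of the MAC matched.
    if not hmac.compare_digest(mac.encode("utf-8"), _mac(state).encode("ascii")):
        raise ValueError("state was modified")
    return state
```

A signature of this kind shows only that the value was issued by
the server; it does not stop a client from replaying an older,
validly signed value, so the state should still be checked for
plausibility at each stage.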
|
12. Acknowledgements

This work is based on a draft published in 1997 by David R.
Robinson, which in turn was based on the original CGI
interface that arose out of discussions on the www-talk
mailing list. In particular, Rob McCool, John Franks, Ari
Luotonen, George Phillips and Tony Sanders deserve special
recognition for their efforts in defining and implementing the
early versions of this interface.

This document has also greatly benefited from the comments and
suggestions made by Chris Adie, Dave Kristol, Mike Meyer,
David Morris, Jeremy Madea, Patrick McManus, Adam Donahue,
Ross Patterson, and Harald Alvestrand.
|
13. References

[1]  Berners-Lee, T., 'Universal Resource Identifiers in
     WWW: A Unifying Syntax for the Expression of Names and
     Addresses of Objects on the Network as used in the
     World-Wide Web', RFC 1630, CERN, June 1994.
[2]  Berners-Lee, T. and Connolly, D., 'Hypertext Markup
     Language - 2.0', RFC 1866, MIT/W3C, November 1995.
[3]  Berners-Lee, T., Fielding, R. T. and Frystyk, H.,
     'Hypertext Transfer Protocol -- HTTP/1.0', RFC 1945,
     MIT/LCS, UC Irvine, May 1996.
[4]  Berners-Lee, T., Fielding, R., and Masinter, L.,
     Editors, 'Uniform Resource Identifiers (URI): Generic
     Syntax', RFC 2396, MIT, U.C. Irvine, Xerox Corporation,
     August 1998.
[5]  Braden, R., Editor, 'Requirements for Internet Hosts --
     Application and Support', STD 3, RFC 1123, IETF,
     October 1989.
[6]  Crocker, D.H., 'Standard for the Format of ARPA
     Internet Text Messages', STD 11, RFC 822, University of
     Delaware, August 1982.
[7]  Fielding, R., 'Relative Uniform Resource Locators', RFC
     1808, UC Irvine, June 1995.
[8]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and
     Berners-Lee, T., 'Hypertext Transfer Protocol --
     HTTP/1.1', RFC 2068, UC Irvine, DEC, MIT/LCS, January
     1997.
[9]  Freed, N. and Borenstein, N., 'Multipurpose Internet
     Mail Extensions (MIME) Part Two: Media Types', RFC
     2046, Innosoft, First Virtual, November 1996.
[10] Mockapetris, P., 'Domain Names - Concepts and
     Facilities', STD 13, RFC 1034, ISI, November 1987.
[11] St. Johns, M., 'Identification Protocol', RFC 1413, US
     Department of Defense, February 1993.
[12] 'Coded Character Set -- 7-bit American Standard Code
     for Information Interchange', ANSI X3.4-1986.
[13] Hinden, R. and Deering, S., 'IP Version 6 Addressing
     Architecture', RFC 2373, Nokia, Cisco Systems, July
     1998.
|
14. Authors' Addresses

Ken A L Coar
MeepZor Consulting
7824 Mayfaire Crest Lane, Suite 202
Raleigh, NC 27615-4875
U.S.A.
Tel: +1 (919) 254.4237
Fax: +1 (919) 254.5250
Email: Ken.Coar@Golux.Com

David Robinson
E*TRADE UK Ltd
Mount Pleasant House
2 Mount Pleasant
Huntingdon Road
Cambridge CB3 0RN
UK
Tel: +44 (1223) 566926
Fax: +44 (1223) 506288
Email: drtr@etrade.co.uk