RFC 2141
                                                                   Moats
Request for Comments: 2141                                          AT&T
Category: Standards Track                                       May 1997
URN Syntax
Abstract
1. Introduction
2. Syntax
All URNs have the following syntax (phrases enclosed in quotes are
REQUIRED):
<URN> ::= "urn:" <NID> ":" <NSS>
RFC 1630 [2] and RFC 1737 [3] each present additional considerations
for URN encoding, which have the effect of limiting the syntax.
On the other hand, the requirement to support existing legacy naming
systems has the effect of broadening syntax. Thus, we discuss the
acceptable syntax for both the Namespace Identifier and the Namespace
Specific String separately.
<upper> ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
"I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
"Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
"Y" | "Z"
<lower> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
"i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
"q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
"y" | "z"
<number> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
"8" | "9"
The "%" character is reserved in the URN syntax for introducing the
escape sequence for an octet. Literal use of the "%" character in a
namespace must be encoded using "%25" in URNs for that namespace.
A "%" character in a URN MUST be followed by two characters from the
<hex> character set.
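The "%" rule above can be checked mechanically. The following is a
minimal sketch, not part of this specification: the helper names are
hypothetical, and <hex> is assumed to mean the digits "0"-"9" and the
letters "A"-"F" in either case.

```python
import re

# Assumption: <hex> is 0-9, A-F, a-f. The pattern matches any "%" that is
# NOT followed by two such characters, i.e. a malformed escape.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_valid_escapes(urn: str) -> bool:
    """Return True if every "%" in `urn` introduces a two-digit hex escape."""
    return _BAD_ESCAPE.search(urn) is None


def escape_literal_percent(name: str) -> str:
    """Encode each literal "%" taken from a namespace as "%25"."""
    return name.replace("%", "%25")
```

A namespace value containing a literal "%" would be passed through
escape_literal_percent() before being placed in a URN; the result then
satisfies has_valid_escapes().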
RFC 1630 [2] reserves the characters "/", "?", and "#" for particular
purposes. The URN-WG has not yet debated the applicability and
precise semantics of those purposes as applied to URNs. Therefore,
these characters are RESERVED for future developments. Namespace
developers SHOULD NOT use these characters in unencoded form, but
rather use the appropriate %-encoding for each character.
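As an illustration of the recommendation above, a namespace developer
might %-encode the three reserved characters before building a URN.
This is a sketch only; the function name is hypothetical, and the
choice of upper-case hex digits in the escapes is an assumption.

```python
# The characters reserved for future developments, per the text above.
RESERVED = "/?#"


def encode_reserved(nss: str) -> str:
    """Replace "/", "?", and "#" with their %-encoded forms."""
    return "".join(
        f"%{ord(ch):02X}" if ch in RESERVED else ch
        for ch in nss
    )
```

For example, encode_reserved("a/b?c#d") yields "a%2Fb%3Fc%23d", while
strings without reserved characters pass through unchanged.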
<excluded> ::= octets 1-32 (1-20 hex) | "\" | """ | "&" | "<"
| ">" | "[" | "]" | "^" | "`" | "{" | "|" | "}" | "~"
| octets 127-255 (7F-FF hex)
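The <excluded> production can be expressed as a simple octet test.
The sketch below assumes that a string is first converted to octets
using UTF-8 and that excluded octets are then written as %-escapes;
both the helper names and the UTF-8 choice are assumptions made for
illustration.

```python
# The thirteen printable characters listed in <excluded>.
EXCLUDED_PUNCT = set('\\"&<>[]^`{|}~')


def is_excluded(octet: int) -> bool:
    """Return True if `octet` (0-255) falls in the <excluded> set."""
    return 1 <= octet <= 32 or octet >= 127 or chr(octet) in EXCLUDED_PUNCT


def encode_excluded(text: str) -> str:
    """%-encode every excluded octet of `text` (assumed UTF-8)."""
    out = []
    for octet in text.encode("utf-8"):
        out.append(f"%{octet:02X}" if is_excluded(octet) else chr(octet))
    return "".join(out)
```

Note that this helper does not touch "%" itself or the reserved
characters "/", "?", and "#"; those are governed by separate rules.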
The URN syntax defines the canonical format for URNs, and all URN
transport and interchange MUST take place in this format. Further,
all URN-aware applications MUST offer the option of displaying URNs
in this canonical form to allow for direct transcription (for example
by cut and paste techniques). Such applications MAY support display
of URNs in a more human-friendly form and may use a character set
that includes characters that aren't permitted in URN syntax as
defined in this RFC (that is, they may replace %-notation by
characters in some extended character set in display to humans).
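One way an application might offer both forms is sketched below. It
assumes that %-escapes carry UTF-8 octets, which this document does not
state here, and uses the Python standard library function
urllib.parse.unquote to produce the human-friendly form; the function
name display_forms is hypothetical.

```python
from urllib.parse import unquote


def display_forms(canonical_urn: str) -> dict:
    """Return the canonical URN alongside a human-friendly rendering.

    The canonical string is kept untouched so it can be transcribed or
    transported as-is; only the "friendly" value has %-notation replaced.
    """
    return {
        "canonical": canonical_urn,
        # Assumption: escaped octets are UTF-8.
        "friendly": unquote(canonical_urn),
    }
```

The friendly form is for display only; transport and interchange would
continue to use the canonical value.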
1- URN:foo:a123,456
2- urn:foo:a123,456
3- urn:FOO:a123,456
4- urn:foo:A123,456
5- urn:foo:a123%2C456
6- URN:FOO:a123%2c456
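The six examples above differ only in the case of the leading "urn"
token, the case of the namespace identifier, the case of a letter in
the Namespace Specific String, and the form of the %-escape. The
following sketch compares them, assuming the lexical equivalence rules
of this RFC: the "urn" token, the namespace identifier, and the hex
digits of %-escapes are case-insensitive, and everything else is
compared exactly. The function name is hypothetical.

```python
import re


def normalize(urn: str) -> str:
    """Return a comparison key for `urn` under the assumed rules."""
    # Split into the "urn" token, the NID, and the NSS (which may itself
    # contain ":" characters, hence maxsplit=2).
    prefix, nid, nss = urn.split(":", 2)
    if prefix.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    # Lower-case only the %-escapes; the rest of the NSS is case-sensitive.
    nss = re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).lower(), nss)
    return f"urn:{nid.lower()}:{nss}"


examples = [
    "URN:foo:a123,456",
    "urn:foo:a123,456",
    "urn:FOO:a123,456",
    "urn:foo:A123,456",
    "urn:foo:a123%2C456",
    "URN:FOO:a123%2c456",
]
keys = [normalize(u) for u in examples]
```

Under these rules, examples 1, 2, and 3 share one key, example 4 has a
key of its own, and examples 5 and 6 share a key that differs from the
first three because "%2C" is not decoded to "," during comparison.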
8. Security considerations
This document specifies the syntax for URNs. While some namespace
resolvers may assign special meaning to certain characters of
the Namespace Specific String, any security considerations resulting
from such assignment are outside the scope of this document. It is
strongly recommended that the process of registering a namespace
identifier include any such considerations.
9. Acknowledgments
10. References
Request For Comments (RFC) and Internet Draft documents are available
from <URL:ftp://ftp.internic.net> and numerous mirror sites.
Ryan Moats
AT&T
15621 Drexel Circle
Omaha, NE 68135-2358
USA
The URN syntax has been defined so that URNs can be used in places
where URLs are expected. A resolver that conforms to the current URL
syntax specification [3] will extract a scheme value of "urn:" rather
than a scheme value of "urn:<nid>".
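This behavior can be observed with a generic URL parser. The sketch
below uses urllib.parse.urlsplit from the Python standard library as a
stand-in for such a resolver; it is not claimed to implement the exact
URL specification cited above, and it reports the scheme without the
trailing colon. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def scheme_and_rest(urn: str) -> tuple:
    """Split `urn` the way a generic URL parser would."""
    parts = urlsplit(urn)
    # The namespace identifier is not part of the scheme; it stays in
    # the remainder, together with the Namespace Specific String.
    return parts.scheme, parts.path
```

For "urn:foo:a123,456" the parser sees the scheme "urn" and leaves
"foo:a123,456" as the remainder, so any namespace-specific handling has
to be done by URN-aware code after the generic split.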