Tony Garnock-Jones <you.can.find.me.through@google.easily> wrote:
+---------------
| Geoffrey Summerhayes wrote:
| > RFC 822:
|
| Does 2822 make things any simpler?
+---------------
Possibly a little bit, but not in any significant way in the area
I think you're asking about. RFC 2822 introduced the "dot-atom"
production which simplified the description of when periods were
allowed in unquoted local-parts:

    Some of the structured header field bodies also allow the period
    character (".", ASCII value 46) within runs of atext. An additional
    "dot-atom" token is defined for those purposes.
    ...
    atom            =   [CFWS] 1*atext [CFWS]
    dot-atom        =   [CFWS] dot-atom-text [CFWS]
    dot-atom-text   =   1*atext *("." 1*atext)

    Both atom and dot-atom are interpreted as a single unit, comprised of
    the string of characters that make it up. Semantically, the optional
    comments and FWS surrounding the rest of the characters are not part
    of the atom; the atom is only the run of atext characters in an atom,
    or the atext and "." characters in a dot-atom.

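For what it's worth, here is a minimal Python sketch of the
"dot-atom-text" production quoted above. The names ATEXT,
DOT_ATOM_TEXT, and is_dot_atom_text are mine, not the RFC's, and the
sketch deliberately ignores the optional CFWS that "dot-atom" allows
on either side:

```python
import re

# atext per RFC 2822 section 3.2.4: ASCII letters, digits, and a
# fixed set of printable specials. Note that "." is NOT in atext.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# dot-atom-text = 1*atext *("." 1*atext)
DOT_ATOM_TEXT = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_dot_atom_text(s: str) -> bool:
    """Return True if s matches dot-atom-text (no surrounding CFWS)."""
    return DOT_ATOM_TEXT.fullmatch(s) is not None
```

Under this grammar a period is only legal between two non-empty runs
of atext, so "john.doe" passes while "john..doe", ".john", and
"john." do not.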
Then "addr-spec" was tweaked to use "dot-atom":

    3.4.1. Addr-spec specification

    An addr-spec is a specific Internet identifier that contains a
    locally interpreted string followed by the at-sign character ("@",
    ASCII value 64) followed by an Internet domain. The locally
    interpreted string is either a quoted-string or a dot-atom. If the
    string can be represented as a dot-atom (that is, it contains no
    characters other than atext characters or "." surrounded by atext
    characters), then the dot-atom form SHOULD be used and the
    quoted-string form SHOULD NOT be used. Comments and folding white
    space SHOULD NOT be used around the "@" in the addr-spec.

    addr-spec       =   local-part "@" domain
    local-part      =   dot-atom / quoted-string / obs-local-part
    domain          =   dot-atom / domain-literal / obs-domain
    domain-literal  =   [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS]
    dcontent        =   dtext / quoted-pair
    dtext           =   NO-WS-CTL /     ; Non white space controls
                        %d33-90 /       ; The rest of the US-ASCII
                        %d94-126        ;  characters not including "[",
                                        ;  "]", or "\"

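As a rough illustration of the *generation* side of the "addr-spec"
grammar above, here is a Python sketch. It is a simplification under
stated assumptions, not a conforming parser: it leaves out CFWS,
NO-WS-CTL, line folding, and every "obs-" alternative, and the names
(ADDR_SPEC, split_addr_spec, etc.) are made up for the example:

```python
import re
from typing import Optional, Tuple

ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"

# quoted-string, simplified: qtext (%d33, %d35-91, %d93-126), a plain
# space or tab, or a quoted-pair (backslash plus one character).
QUOTED_STRING = (
    r'"(?:[\x21\x23-\x5b\x5d-\x7e \t]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*"'
)

# domain-literal, simplified: dtext (%d33-90, %d94-126), a plain space
# or tab, or a quoted-pair, all inside square brackets.
DOMAIN_LITERAL = (
    r"\[(?:[\x21-\x5a\x5e-\x7e \t]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*\]"
)

# addr-spec = local-part "@" domain   (non-obsolete alternatives only)
ADDR_SPEC = re.compile(
    rf"(?P<local>{DOT_ATOM_TEXT}|{QUOTED_STRING})"
    rf"@"
    rf"(?P<domain>{DOT_ATOM_TEXT}|{DOMAIN_LITERAL})"
)


def split_addr_spec(s: str) -> Optional[Tuple[str, str]]:
    """Return (local-part, domain), or None if s is not a match."""
    m = ADDR_SPEC.fullmatch(s)
    return (m.group("local"), m.group("domain")) if m else None
```

For example, split_addr_spec("user@[192.0.2.1]") yields the local-part
and the bracketed domain-literal, whereas "john . doe@example.com" is
rejected because this sketch allows no white space around the dots.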
Also, the "route" syntax before an "addr-spec" was deprecated; see
"4.4 Obsolete Addressing" and the "obs-angle-addr" production.
Finally, "CFWS" [comments and/or folding white space] is no longer
allowed around the dots within a "local-part" or a "domain", and is
discouraged (SHOULD NOT) around the "@" between them. [I think. If
I'm reading "4.4" correctly.]
Unfortunately, while these simplifications apply to what you may *send*,
they do *NOT* apply to what you must still be prepared to *receive*:

    3.1. Introduction
    ...
    In some of the definitions, there will be nonterminals whose names
    start with "obs-". These "obs-" elements refer to tokens defined in
    the obsolete syntax in section 4. In all cases, these productions
    are to be ignored for the purposes of generating legal Internet
    messages and MUST NOT be used as part of such a message. However,
    when interpreting messages, these tokens MUST be honored as part of
    the legal syntax. In this sense, section 3 defines a grammar for
    generation of messages, with "obs-" elements that are to be ignored,
    while section 4 adds grammar for interpretation of messages.

This means that you must still be prepared to *parse* all the
old, ugly syntax, which means that it's really no simplification
at all, practically speaking. (*sigh*)
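
To make the send/receive asymmetry concrete, here is a hedged Python
sketch of a *lenient* reader that accepts one piece of the obsolete
syntax: white space around the dots and the "@" (as in
obs-local-part = word *("." word)). It is only an illustration. It
handles plain white space but not parenthesized comments, routes, or
domain-literals, and parse_lenient and the other names are invented
for the example:

```python
import re
from typing import Optional, Tuple

ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# word = atom / quoted-string (quoted-string simplified here)
WORD = rf'(?:{ATEXT}+|"(?:[^"\\\r\n]|\\.)*")'
WS = r"[ \t\r\n]*"  # stand-in for CFWS; comments are NOT handled

OBS_LOCAL = rf"{WORD}(?:{WS}\.{WS}{WORD})*"
OBS_DOMAIN = rf"{ATEXT}+(?:{WS}\.{WS}{ATEXT}+)*"
LENIENT = re.compile(
    rf"{WS}(?P<local>{OBS_LOCAL}){WS}@{WS}(?P<domain>{OBS_DOMAIN}){WS}"
)


def parse_lenient(s: str) -> Optional[Tuple[str, str]]:
    """Accept white space around "." and "@"; return a normalized pair."""
    m = LENIENT.fullmatch(s)
    if m is None:
        return None
    # Re-join the individual words with bare dots, dropping the white
    # space between them but leaving quoted-string contents untouched.
    local = ".".join(re.findall(WORD, m.group("local")))
    domain = ".".join(re.findall(rf"{ATEXT}+", m.group("domain")))
    return local, domain
```

With this, parse_lenient("john . doe @ example . com") normalizes to
the pair ("john.doe", "example.com"), a form you would never be
allowed to generate but still have to be able to read.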
-Rob
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607