
rfc822.Message.readheaders bug (PR#3)
Full_Name: Skip Montanaro
Version: 1.5.2
OS: Unix
Submission from: eric.cnri.reston.va.us (132.151.1.38)
Submitted by: guido


[resubmitted by GvR]

I think there's a bug in rfc822.Message.readheaders (v. 1.5.2). In that
method it splits a header into name and value and assigns to a dict:

    headerseen = self.isheader(line)
    if headerseen:
        # It's a legal header line, save it.
        list.append(line)
        self.dict[headerseen] = string.strip(line[len(headerseen)+2:])
        continue

See the "len(headerseen)+2" as the starting index of the slice? I think
that should be "len(headerseen)+1". The code appears to assume that a space
always follows the colon separating the name from the value, so a header
written as "Subject:foo" loses the first character of its value. Since
string.strip already removes any leading whitespace, +1 handles both forms.
My reading of the relevant section of RFC 822 suggests that a single colon
is the only separator between a field name and its value:

    3.2.  HEADER FIELD DEFINITIONS

         These rules show a field meta-syntax, without regard for the
    particular type or internal syntax.  Their purpose is to permit
    detection of fields; also, they present to higher-level parsers
    an image of each field as fitting on one line.

    field       =  field-name ":" [ field-body ] CRLF

    field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">

    field-body  =  field-body-contents
                   [CRLF LWSP-char field-body]

    field-body-contents =
                  <the ASCII characters making up the field-body, as
                   defined in the following sections, and consisting
                   of combinations of atom, quoted-string, and
                   specials tokens, or else consisting of texts>

I got the above from http://www.faqs.org/rfcs/rfc822.html.
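
To make the off-by-one concrete, here is a minimal sketch of the slice in isolation. It is not the rfc822 module's code: split_header and its offset parameter are hypothetical names made up for illustration, and it uses the str.strip method where the 1.5.2 source calls string.strip. It assumes, as in readheaders, that the name passed in is everything before the colon.

```python
def split_header(line, name, offset):
    """Return the stripped value of a header line.

    Hypothetical helper for illustration only: `name` is the field name
    (everything before the colon) and `offset` is how many characters
    past the name the value is assumed to start.
    """
    return line[len(name) + offset:].strip()


with_space = "Subject: hello\r\n"
no_space = "Subject:hello\r\n"

# offset=2 assumes "name: value"; offset=1 skips only the colon.
print(split_header(with_space, "subject", 2))  # hello
print(split_header(with_space, "subject", 1))  # hello
print(split_header(no_space, "subject", 2))    # ello
print(split_header(no_space, "subject", 1))    # hello
```

With an offset of 1 the optional space is left for strip to remove, so both spellings of the header give the same value.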