Mailing List SIMS@mail.stalker.com Message #15032
From: Bill Cole <listbill@scconsult.com>
Subject: Re: = at line ends
Date: Fri, 3 Jun 2005 20:31:58 -0400
To: SIMS Discussions <SIMS@mail.stalker.com>
At 3:45 PM -0700 6/3/05, Warren Michelsen imposed structure on a stream of electrons, yielding:
What is it that causes equals signs to appear at the ends of long
lines? Sometimes it's =2D and other times, just =.

Quoted-printable encoding. It's an encoding mechanism for text that leaves most characters intact and uses =<hexcode> to encode anything that risks getting clobbered by transport agents. A '=' at the end of a line indicates a 'soft' linebreak, usually with protected whitespace before it so that anything re-wrapping the text will not clobber the whitespace at the end of the line. A '=2D' should be translated into a '-' so you'll see it at the end of a line with a hyphenated word.

See http://www.rfc-editor.org/rfc/rfc2045.txt section 6.7 for the full spec.
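To make the two cases concrete, here is a minimal sketch in Python using the standard library's quopri module. The sample text is made up for illustration; it just contains one soft line break and one =2D.

```python
import quopri

# Hypothetical sample: one soft line break ("=" at end of line)
# and one encoded hyphen ("=2D").
encoded = b"a soft =\nline break, well=2Dknown.\n"

# Decoding removes the soft break (joining the two lines) and
# turns =2D back into "-".
decoded = quopri.decodestring(encoded)
print(decoded.decode("ascii"))
```

A mail client that handles QP does the equivalent of this before displaying the body, which is why the equals signs are normally never seen.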

When something limits line length like this by forcing line breaks,
shouldn't it be smart enough not to add breaks -- equals and a
return -- in the middle of a html tag or URL?

Yes. And any mail user agent in the modern world should decode QP so that you don't see any of that.

The case which got me wondering is a html message with
Content-Transfer-Encoding: quoted-printable. None of the encoded
characters were non-ASCII so I see no reason why it was
quoted-printable in the first place.

QP is about more than protecting high-bit characters. It provides a way of doing 'soft' line breaks and assures that ASCII characters outside of the 64 that are reliably 'mail safe' can be transmitted unmolested. It is designed to be mostly human-readable even without decoding and to fatten the encoded text minimally (compared to base64 and uuencode, both of which encode all characters, increase size by roughly 33%, and result in data that must be decoded to be read at all).
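As a rough illustration of the readability and size point, the sketch below encodes the same short text both ways with Python's standard quopri and base64 modules. The sample sentence and the choice of Latin-1 are assumptions made for the example, not anything from the message under discussion.

```python
import base64
import quopri

# Hypothetical sample: mostly ASCII, with one "=" and one high-bit
# character (e-acute, 0xE9 in Latin-1).
text = "Price = 5 euros at the caf\u00e9\n".encode("latin-1")

qp = quopri.encodestring(text)
b64 = base64.b64encode(text)

# QP only touches the "=" and the high-bit byte; the rest stays readable.
print(qp)
# base64 re-encodes every byte and is unreadable without decoding.
print(b64)
print(len(text), len(qp), len(b64))
```

For text like this, the QP form grows by only a few bytes, while base64 always grows by about a third regardless of content.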

In any case, something along the line truncated many lines and broke
URLs and links in the process by causing malformed tags and URLs,
inasmuch as =[carriage-return] and another character is not valid
quoted printable and cannot be decoded.

Not so. =[carriage return] is specifically defined in the QP spec and anything failing to decode it is not doing QP decoding.
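A minimal sketch of the two decoding rules, written by hand in Python rather than taken from any particular mail client, shows how a URL split by a soft break comes back together. The function name decode_qp and the sample URL are hypothetical, and the sketch is deliberately simplified: it passes malformed sequences through untouched instead of handling them as the RFC suggests.

```python
import re

# "=" followed by optional trailing whitespace and a line break is a soft
# break; "=" followed by two hex digits is an encoded byte.
_QP_TOKEN = re.compile(rb"=(?:[ \t]*\r?\n|([0-9A-Fa-f]{2}))")

def decode_qp(data: bytes) -> bytes:
    """Simplified quoted-printable decoder (illustration only)."""
    def replace(match):
        hex_pair = match.group(1)
        if hex_pair is None:
            # Soft line break: drop the "=" and the line ending.
            return b""
        # Encoded byte: convert the two hex digits back to one byte.
        return bytes([int(hex_pair, 16)])
    return _QP_TOKEN.sub(replace, data)

# Hypothetical URL that was wrapped mid-path by the encoder.
wrapped = b"http://example.com/some/long/pa=\r\nth?id=3D42\r\n"
print(decode_qp(wrapped))
```

After decoding, the soft break is gone and the URL is intact; it only looks broken when the raw, undecoded body is displayed.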

So the end result is lots of
lines ending with equals signs.

Is the culprit in these situations typically the sending server?

No. Few modern servers touch encoding at all, and those that do are most likely to convert any message with risky bytes to base64 in toto rather than try to fiddle with the text as QP. QP encoding and decoding is almost always a user agent operation.


--
Bill Cole
bill@scconsult.com
