Mailing List SIMS@mail.stalker.com Message #15034
From: Bill Cole <listbill@scconsult.com>
Subject: Re: = at line ends
Date: Sat, 4 Jun 2005 12:28:56 -0400
To: SIMS Discussions <SIMS@mail.stalker.com>
At 10:05 PM -0700 6/3/05, Warren Michelsen  imposed structure on a stream of electrons, yielding:
At 8:31 PM -0400 6/3/05, Bill Cole sent email containing:
At 3:45 PM -0700 6/3/05, Warren Michelsen  imposed structure on a
stream of electrons, yielding:
What is it that causes equals signs to appear at the ends of long
lines? Sometimes it's =2D and other times, just =.

Quoted-printable encoding. It's an encoding mechanism for text that
leaves most characters intact and uses =<hexcode> to encode
anything that risks getting clobbered by transport agents. A '=' at
the end of a line indicates a 'soft' linebreak, usually with
protected whitespace before it so that anything re-wrapping the
text will not clobber the whitespace at the end of the line.

I can't imagine any case in which there's whitespace, protected or
not, in the middle of words and tags. Example:

<IMG SRC=3D"http://images=2Ehideaways=2Ecom/newsletters/auctio=
n2=5Fheader=5F01=2Ejpg" WIDTH=3D164 HEIGHT=3D100 BORDER=3D0>

In that case the soft line break is being done by the encoder so that it can be rejoined correctly. (Note that I did say 'usually.')
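
As a minimal sketch of that rejoining, the example tag can be run through the quoted-printable decoder in Python's standard library. The `quopri` module and the variable names are an illustration only, not anything the original sender or recipient used; the two encoded lines are assumed to be separated by a CRLF, as they would be on the wire.

```python
import quopri

# The two encoded lines from the example above, joined by a CRLF.
encoded = (
    b'<IMG SRC=3D"http://images=2Ehideaways=2Ecom/newsletters/auctio=\r\n'
    b'n2=5Fheader=5F01=2Ejpg" WIDTH=3D164 HEIGHT=3D100 BORDER=3D0>'
)

# The decoder drops the '=' plus line ending (the soft break) and turns
# each =XX escape back into the byte with that hex value.
decoded = quopri.decodestring(encoded)
print(decoded.decode("ascii"))
```

The decoded tag comes back on a single line with the URL intact, which is what a QP-aware reader should be handing to its HTML renderer.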


A '=2D' should be translated into a '-' so you'll see it at the end
of a line with a hyphenated word.

I should have said 0D. My error.

That would indicate a hard line break. 0x0D=carriage return.


See http://www.rfc-editor.org/rfc/rfc2045.txt section 6.7 for the full spec.

When something limits line length like this by forcing line breaks,
shouldn't it be smart enough not to add breaks -- equals and a
return -- in the middle of a html tag or URL?

Yes. And any mail user agent in the modern world should decode QP so
that you don't see any of that.

The problem message sent to me in source form was received by Apple
mail on 10.2.8.

The case which got me wondering is a html message with
Content-Transfer-Encoding: quoted-printable. None of the encoded
characters were non-ASCII so I see no reason why it was
quoted-printable in the first place.

QP is about more than protecting high-bit characters. It provides a
way of doing 'soft' line breaks and assures that ASCII characters
outside of the 64 that are reliably 'mail safe' can be transmitted
unmolested.

By 'soft', you mean that they should be removed by the receiving
MUA? Or, if not removed, at least the lines between which the '='
sits should be concatenated, or something like that.

Yes. A MUA capable of decoding QP should also re-join any lines ending in '=' and re-wrap them based on local presentation needs. A QP 'soft' break is done to make sure that nothing in transit ever sees a need to break lines. For example, whatever MUA you use composes your messages as line-per-paragraph (a common GUI approach) and sends them that way raw. That is only a minor nuisance today for people quoting your mail, but back in 1992, when the MIME framework and QP were designed, mail sent that way would have been useless.
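
Roughly, the join-then-rewrap behaviour described above could look like the sketch below. It is hand-rolled for illustration and is not how any particular MUA does it: the function name `rejoin_and_rewrap`, the sample text, and the width of 40 are all made up, and the escape handling assumes the encoded bytes are plain ASCII.

```python
import re
import textwrap

# A soft break is '=' at the end of a line; trailing spaces or tabs
# after the '=' are tolerated as transport padding.
SOFT_BREAK = re.compile(r"=[ \t]*\r?\n")
# Any other '=' is followed by two hex digits naming one byte.
HEX_ESCAPE = re.compile(r"=([0-9A-Fa-f]{2})")


def rejoin_and_rewrap(qp_text: str, width: int) -> str:
    """Undo QP soft breaks and escapes, then wrap for local display."""
    # Soft breaks must go first: decoding first could turn '=3D' before
    # a line ending into something that looks like a soft break.
    joined = SOFT_BREAK.sub("", qp_text)
    decoded = HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), joined)
    # Remaining line endings are hard breaks; wrap each logical line.
    return "\n".join(
        textwrap.fill(line, width=width) for line in decoded.splitlines()
    )


sample = (
    "This paragraph left the sender as one long line, so the encoder =\r\n"
    "split it with soft breaks=2E The reader is free to wrap it again =\r\n"
    "at any width=2E\r\n"
)
rewrapped = rejoin_and_rewrap(sample, width=40)
print(rewrapped)
```

Note that the space before each trailing '=' survives the join, which is the 'protected whitespace' case: without the '=' a transport could strip it and run two words together.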

You might or might not notice that I use a different approach to the wrapping issue, the MIME 'flowed' format (RFC2646) which was developed in part as a response to the readability and fragility issues of QP: it doesn't solve everything QP solves, but it handles the problem which QP is most often used to address more readably and robustly.


In any case, something along the line truncated many lines and broke
URLs and links in the process by causing malformed tags and URLs,
inasmuch as =[carriage-return] and another character is not valid
quoted printable and cannot be decoded.

Not so. =[carriage return] is specifically defined in the QP spec
and anything failing to decode it is not doing QP decoding.

OK, I was not aware that the lone '=' at the end of a line is valid
QP for a soft break.

So the end result is lots of
lines ending with equals signs.

Is the culprit in these situations typically the sending server?

No. Few modern servers touch encoding at all, and those that do are
most likely to convert any message with risky bytes to base64 in
toto rather than try to fiddle with the text as QP. QP encoding and
decoding is almost always a user agent operation.

The user in this case must have a faulty mail.app then, yes?

Maybe.

There are a lot of ways to break QP encoding without working very hard at it. The most reliable way is for the originating MUA or something chewing on the message in transit (rarely an MTA, but more commonly a mailing list exploder) to mangle the MIME headers. Sometimes you will also see QP-encoded text pasted into a message unwittingly so that there is another layer of QP encoding added. That is a user error unattributable to software.
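
The double-encoding case is easy to reproduce. This is only a sketch using Python's standard `quopri` module with a made-up fragment; it is not a claim about what happened to the message in question.

```python
import quopri

original = b"WIDTH=164"

# First pass: the literal '=' becomes '=3D'.
once = quopri.encodestring(original)
# Second pass (e.g. encoded text pasted into a new message): the '='
# of '=3D' is itself encoded again.
twice = quopri.encodestring(once)
print(once, twice)

# A reader that decodes one layer still shows QP debris.
print(quopri.decodestring(twice))
# Only a second decode gets back to the original text.
print(quopri.decodestring(quopri.decodestring(twice)))
```

A correctly working MUA decodes exactly one layer, so the recipient sees '=3D' and trailing '=' signs even though nothing on the receiving side is broken.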

--
Bill Cole
bill@scconsult.com
