[Esd-l] Has anyone tried removing HTML "code" via a sanitizer process??
daniel lance herrick
dan.herrick at pbs.proquest.com
Thu Oct 23 13:22:14 PDT 2003
On Thu, 23 Oct 2003, Jim Bucks wrote:
> Hello All,
>
> I was wondering if anyone has tried removing HTML code via a sanitizer
> process. I know the resulting text is going to be extremely ugly - and
> probably unreadable.
Your message is multipart/alternative and the browser is
choosing the web-imitation alternative. If you
change the "Content-Type: text/html" to something
like "Content-Type: text/petunia" then the browser
won't know how to display the "text/petunia" part
and will choose the other (text/plain) part. (Then you
could distribute a mailcap that says to use lynx to
display text/petunia and really send them in
circles.)
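
A rough sketch of that relabeling, in Python rather than as a
Sanitizer or procmail setting (neither is assumed to offer this as
an option). The function name hide_html_alternative and the
fake_type parameter are made up for illustration. It only touches
a text/html part that sits inside a multipart/alternative next to
a text/plain sibling, so an HTML-only message is left alone rather
than made unreadable:

```python
import email
import re


def hide_html_alternative(raw, fake_type="text/petunia"):
    """Relabel text/html parts of multipart/alternative as fake_type.

    Hypothetical helper: takes the raw message text and returns the
    rewritten message text. Parts are only relabeled when a text/plain
    sibling exists for the mail client to fall back on.
    """
    msg = email.message_from_string(raw)
    for part in msg.walk():
        if part.get_content_type() != "multipart/alternative":
            continue
        if not part.is_multipart():
            # Header claims multipart but the body did not parse as one.
            continue
        children = part.get_payload()
        if not any(c.get_content_type() == "text/plain" for c in children):
            continue
        for child in children:
            if child.get_content_type() != "text/html":
                continue
            old = child["Content-Type"]
            # Swap only the type; charset and other parameters are kept.
            new = re.sub(r"text/html", fake_type, old, count=1,
                         flags=re.IGNORECASE)
            child.replace_header("Content-Type", new)
    return msg.as_string()
```

To try it on real mail you would still need something that feeds
each message through the function (for instance a small script
reading the message on standard input and writing the result to
standard output); that wiring is not shown here.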
dan
> Here's more details on the issue:
> 1) Using Redhat 6.x
> Sendmail 8.12.9
> Procmail 3.21
> Sanitizer 1.139
>
> 2) Mail clients are predominantly Netscape Communicator 4.79
>
> 3) The defanging process is working GREAT! However, I have
> a small group of users that are having problems (they
> refuse to view the message source) with HTML email they
> have received. The Sanitizer is properly modifying the
> html tags, but then the message just comes up as blank.
>
> 4) Most of the time, the original recipient (R) can see
> the HTML message. When this person (R) then forwards
> or replies to that message, the message body disappears
> from the forwarded message.
>
> See below for the message source from one of the
> "interesting" messages.
>
> 5) My thoughts on fixing this are:
> - Turn off html sanitizing. I'm fighting this.
>
> - Find a way to strip the html tags from the
> messages, leaving just the ascii text.
> I'm not sure if this is even reasonably doable.
>
> - Find a way to strip just the meta and style sections,
> leaving the remaining part of the message intact.
> I'm sure this will still be pretty ugly.
>
> - Look into an alternative to Sanitizer.
> Probably end up being big $$$$$.
>
> Do y'all have any lessons learned / words of wisdom I can use as
> guidance for this? Other than shoot the users?
>
> Jim
>
>
>
>
> Source of "interesting" message............................
>
> Return-Path: <ccccc at ddddd.xxx>
> Received: from eeeee.ddddd.xxx ([88.88.88.88])
> by gw1.bbbbb.xxx (8.12.9/8.11.6) with ESMTP id
> h9MLNfG1025024
> for <aaaaa at bbbbb.xxx>; Wed, 22 Oct 2003 15:23:43
> -0600
> X-MimeOLE: Produced By Microsoft Exchange V6.0.6375.0
> content-class: urn:content-classes:message
> MIME-Version: 1.0
> X-Security: MIME headers sanitized on fffff
> See http://www.impsec.org/email-tools/sanitizer-intro.html
> for details. $Revision: 1.139 $Date: 2003-09-07 10:14:23-07
> Content-Type: multipart/alternative;
> boundary="----_=_NextPart_001_01C398E2.B9D528E1"
> Subject: Why Me
> Date: Wed, 22 Oct 2003 16:23:24 -0500
> Message-ID:
> <6A7AD98CA7919B45A429AA7AC5D88D37175613 at eeeeee.ddddd.xxx>
> X-MS-Has-Attach:
> X-MS-TNEF-Correlator:
> Thread-Topic: Why Me
> Thread-Index: AcOY4rnTfLE/hfwFSmGufHKejINLyw==
> From: "Bill Smith" <ccccc at ddddd.com>
> To: <aaaaa at bbbbb.com>
>
> This is a multi-part message in MIME format.
>
> ------_=_NextPart_001_01C398E2.B9D528E1
> Content-Type: text/plain; charset="us-ascii"
> Content-Transfer-Encoding: quoted-printable
>
> blah blah blah blah blah blah blah blah blah blah blah blah blah blah bl
> ah blah blah blah blah bla.
>
> =20
>
> blah bla blah blah blah
>
> 111111 1111 1111111111
>
> 111111 1111 1111111111
>
> 111111 1111 1111111111
>
> 11111 1111 1111111111
>
> 1111111 1111 1111111111
>
> 1111111 1111 1111111111
>
>
> ------_=_NextPart_001_01C398E2.B9D528E1
> Content-Type: text/html; charset="us-ascii"
> Content-Transfer-Encoding: quoted-printable
>
> <html>
>
> <head>
> <DEFANGED_META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
> charset=3Dus-ascii">
>
>
> <DEFANGED_meta name=3DGenerator content=3D"Microsoft Word 10
> (filtered)">
>
> <!-- <DEFANGED_STYLE>
> <!--
> /* Style Definitions */
> p.MsoNormal, li.MsoNormal, div.MsoNormal
> {margin:0in;
> margin-bottom:.0001pt;
> font-size:12.0pt;
> font-family:"Times New Roman";}
> a:link, span.MsoHyperlink
> {color:blue;
> text-decoration:underline;}
> a:visited, span.MsoHyperlinkFollowed
> {color:purple;
> text-decoration:underline;}
> span.EmailStyle17
> {font-family:Arial;
> color:windowtext;}
> @page Section1
> {size:8.5in 11.0in;
> margin:1.0in 1.25in 1.0in 1.25in;}
> div.Section1
> {page:Section1;}
> -->
> --> </DEFANGED_STYLE>
>
> </head>
>
> <body lang=3DEN-US link=3Dblue vlink=3Dpurple>
>
> <div class=3DSection1>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>blah blah blah blah blah blah blah blah blah blah bla
> h blah ? blah blah blah blah blah blah blah b</sman></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'> </span></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>blah bla blah =
> blah blah</span></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>111111</span></font><font size=3D2 =
> face=3DArial><span
>
> style=3D'font-size:10.0pt;font-family:Arial'> &nbs=
>
> p; 1111 1111111111</span></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>111111</span></font><font size=3D2 =
> face=3DArial><span
>
> style=3D'font-size:10.0pt;font-family:Arial'> &nbs=
>
> p; 1111 1111111111</span></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>111111</span></font><font size=3D2 =
> face=3DArial><span
>
> style=3D'font-size:10.0pt;font-family:Arial'> &nbs=
>
> p; 1111 1111111111</span></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>111111</span></font><font size=3D2 =
> face=3DArial><span
>
> style=3D'font-size:10.0pt;font-family:Arial'> &nbs=
>
> p; 1111 1111111111</span></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>1111111</span></font><font size=3D2 =
> face=3DArial><span
>
> style=3D'font-size:10.0pt;font-family:Arial'> =
> 1111 1111111111</span></font></p>
>
> <p class=3DMsoNormal><font size=3D2 face=3DArial><span =
> style=3D'font-size:10.0pt;
>
> font-family:Arial'>1111111</span></font><font size=3D2 =
> face=3DArial><span
>
> style=3D'font-size:10.0pt;font-family:Arial'> =
> 1111 1111111111</span></font></p>
>
> </div>
>
> </body>
>
> </html>
> =00
> ------_=_NextPart_001_01C398E2.B9D528E1--
>
>
> --
> Jim Bucks - IT/IS Support www.coloradostudios.com
> 2400 N. Ulster St. Denver, Co. 80238
> jbucks at coloradostudios.com 303-388-8500