source: trunk/phpgwapi/doc/kses-0.2.1/README @ 2

Revision 2, 7.1 KB checked in by niltonneto, 17 years ago

Removed all tags used by CVS ($Id, $Source).
First version in the external CVS.

kses 0.2.1 README  [kses strips evil scripts!]
=================


* INTRODUCTION *


Welcome to kses - an HTML/XHTML filter written in PHP. It removes all unwanted
HTML elements and attributes, no matter how malformed the HTML input you give
it. It also does several checks on attribute values. kses can be used to avoid
Cross-Site Scripting (XSS), Buffer Overflows and Denial of Service attacks,
among other things.

The program is released under the terms of the GNU General Public License. You
should look into what that means, before using kses in your programs. You can
find the full text of the license in the file COPYING.


* FEATURES *


Some of kses' current features are:

* It will only allow the HTML elements and attributes that it was explicitly
told to allow.

* Element and attribute names are case-insensitive (a href vs A HREF).

* It will understand and process whitespace correctly.

* Attribute values can be surrounded with quotes, apostrophes or nothing.

* It will accept valueless attributes with just names and no values (selected).

* It will accept XHTML's closing " /" marks.

* Attribute values that are surrounded with nothing will get quotes, to avoid
producing HTML that doesn't conform to the W3C specifications
(<a href=http://sourceforge.net/projects/kses> works but isn't valid HTML).

* It handles lots of types of malformed HTML, by interpreting the existing
code the best it can and then rebuilding new code from it. That's a better
approach than trying to patch up the existing code in place, as you're bound
to forget about some weird special case somewhere. It handles problems like
never-ending quotes and tags gracefully.

* It will remove additional "<" and ">" characters that people may try to
sneak in somewhere.

* It supports checking attribute values for minimum/maximum length and
minimum/maximum value, to protect against Buffer Overflows and Denial of
Service attacks against WWW clients and various servers. You can stop
<iframe src= width= height=> from having too high values for width and height,
for instance.

* It has a system for whitelisting URL protocols. You can say that
attribute values may only start with http:, https:, ftp: and gopher:, but no
other URL protocols (javascript:, java:, about:, telnet:..). The functions that
do this work handle whitespace, upper/lower case, HTML entities
("jav&#97;script:") and repeated entries ("javascript:javascript:alert(57)").
It also normalizes HTML entities as a nice side effect.

* It removes Netscape 4's JavaScript entities ("&{alert(57)};").

* It handles NULL bytes and Opera's chr(173) whitespace characters.

* There is both a procedural version and an object-oriented version of kses.
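
The protocol whitelisting described above can be sketched in a few lines. This
is a simplified illustration, not kses' own code: the function name
sketch_protocol_allowed() is made up for this example, and it ignores several
tricks that the real filter copes with (Opera's chr(173), entities without a
trailing semicolon and so on), so don't use it as a security filter.

```php
<?php
// Hypothetical helper, NOT part of kses: returns true if $value has no URL
// protocol at all, or only protocols listed in $allowed_protocols.
function sketch_protocol_allowed($value, $allowed_protocols)
{
    // Decode entities first, so "jav&#97;script:" becomes "javascript:".
    $value = html_entity_decode($value, ENT_QUOTES, 'UTF-8');

    // Drop NULL bytes, other control characters and whitespace.
    $value = preg_replace('/[\x00-\x20]+/', '', $value);

    // Peel off leading "protocol:" prefixes one at a time, so repeated
    // entries such as "javascript:javascript:" are all checked.
    while (preg_match('/^([a-z][a-z0-9+.\-]*):/i', $value, $m)) {
        if (!in_array(strtolower($m[1]), $allowed_protocols, true)) {
            return false;
        }
        $value = substr($value, strlen($m[0]));
    }
    return true;
}

$protocols = array('http', 'https', 'ftp', 'gopher');

var_dump(sketch_protocol_allowed('http://example.com:8080/', $protocols));  // bool(true)
var_dump(sketch_protocol_allowed('jav&#97;script:alert(57)', $protocols));  // bool(false)
var_dump(sketch_protocol_allowed('javascript:javascript:alert(57)', $protocols));  // bool(false)
```

Decoding and stripping before the comparison is the important part: checking
the raw string would let the obfuscated forms through.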


* USE IT *


It's very easy to use kses in your own PHP web application! Basic usage looks
like this:


<?php

include 'kses.php';

# Whitelist: element name => array of allowed attribute names.
$allowed = array('b' => array(),
                 'i' => array(),
                 'a' => array('href' => 1, 'title' => 1),
                 'p' => array('align' => 1),
                 'br' => array());

$val = isset($_POST['val']) ? $_POST['val'] : '';

# You must strip slashes from magic quotes, or kses will get confused.
# The function_exists() guard keeps this working on PHP 8, where
# get_magic_quotes_gpc() no longer exists.
if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc())
  $val = stripslashes($val);

$val = kses($val, $allowed); # The filtering takes place here.

# Do something with $val.

?>


This definition of $allowed means that only the elements B, I, A, P and BR are
allowed (along with their closing tags /B, /I, /A, /P and /BR). B, I and BR
may not have any attributes. A may only have the attributes HREF and TITLE,
while P may only have the attribute ALIGN. You can list the elements and
attributes in the array in any mixture of upper and lower case. kses will also
recognize HTML code that uses both lower and upper case.
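
The attribute value checks listed under FEATURES go into the same array. The
following is only a sketch: it assumes that an attribute's entry may be an
array of checks instead of 1, and that the checks are called 'maxlen' and
'maxval'. Please verify both assumptions against the files in docs/ before
relying on them. The limits themselves are arbitrary.

```php
<?php
// Sketch only: 'maxlen' and 'maxval' are ASSUMED check names, and the
// numbers are arbitrary illustration values. Check docs/ before use.
$allowed_strict = array(
    'b'   => array(),
    'i'   => array(),
    'a'   => array('href'  => array('maxlen' => 200),
                   'title' => array('maxlen' => 80)),
    'p'   => array('align' => 1),
    'br'  => array(),
    'img' => array('src'    => 1,
                   'width'  => array('maxval' => 800),
                   'height' => array('maxval' => 600)),
);

// With kses.php included, the array would be passed in as usual:
// $val = kses($val, $allowed_strict);
```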

It's important to select the right allowed attributes, so you won't open up
an XSS hole by mistake. Some important attributes that you mustn't allow
include but are not limited to: 1) style, and 2) all intrinsic event
attributes (onMouseOver and so on, on* really). I'll write more about this in
the documentation that will be distributed with future versions of kses.
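
One way to catch such mistakes early is to audit the $allowed array before
handing it to kses. The helper below is hypothetical and not part of kses; it
only looks for the two groups of attributes named above, so a clean result
does not prove that a whitelist is safe.

```php
<?php
// Hypothetical helper, NOT part of kses: lists "element/attribute" pairs in
// an $allowed array that match the two risky groups named in the README.
function sketch_risky_attributes($allowed)
{
    $risky = array();
    foreach ($allowed as $element => $attributes) {
        foreach (array_keys($attributes) as $attr) {
            $name = strtolower($attr);
            // "style" and every intrinsic event attribute (on*) are flagged.
            if ($name === 'style' || strncmp($name, 'on', 2) === 0) {
                $risky[] = strtolower($element) . '/' . $name;
            }
        }
    }
    return $risky;
}

$suspect = array('a'  => array('href' => 1, 'onMouseOver' => 1),
                 'p'  => array('align' => 1, 'STYLE' => 1),
                 'br' => array());

print_r(sketch_risky_attributes($suspect));
// Lists a/onmouseover and p/style.
```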

It's also important to note that kses' HTML input must be cleaned of all
slashes coming from magic quotes. If the rest of your code requires these
slashes to be present, you can always add them again after calling kses with
a simple addslashes() call.
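
The strip, filter, re-add round trip can be wrapped in one place. The wrapper
below is a hypothetical sketch, not part of kses; it takes the filter as a
callable, and strip_tags() stands in for kses so that the snippet runs on its
own.

```php
<?php
// Hypothetical wrapper, NOT part of kses. $filter is any callable that takes
// a string and returns a string, for example a small function calling kses().
function sketch_filter_slashed($slashed, $filter)
{
    // Remove the magic-quotes slashes, filter, then put slashes back.
    $clean = call_user_func($filter, stripslashes($slashed));
    return addslashes($clean);
}

// strip_tags() is only a stand-in here; it is not a replacement for kses.
echo sketch_filter_slashed("<b>it\\'s</b>", 'strip_tags');  // prints: it\'s
```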

You should take a look at the documentation in the docs/ directory and the
examples in the examples/ directory, to get more information on how to use
kses. The object-oriented version of kses is also worth checking out, and it's
included in the oop/ directory.


* UPGRADING FROM 0.1.0 OR 0.2.0 TO 0.2.1 *


kses 0.2.1 is backwards compatible with 0.1.0 and 0.2.0, so upgrading should
just be a matter of using a new version of kses.php instead of an old one!

When you're ready to start using 0.2.1's new features, you can read about them
in the files in the docs/ directory. The ChangeLog also summarizes the new
features in this release.


* NEW VERSIONS, MAILING LISTS AND BUG REPORTS *


If you want to download new versions, subscribe to the kses-general mailing
list or even take part in the development of kses, we refer you to its
homepage at  http://sourceforge.net/projects/kses . New developers and beta
testers are more than welcome!

If you have any bug reports, suggestions for improvement or simply want to tell
us that you use kses for some project, feel free to post to the kses-general
mailing list. If you have found any security problems (particularly XSS,
naturally) in kses, please contact Ulf privately at  metaur at users dot
sourceforge dot net  so he can correct them before you or someone else tells
the public about them.

(No, it's not a security problem in kses if some program that uses it allows a
bad attribute, silly. If kses is told to accept the element body with the
attributes style and onLoad, it will accept them, even if that's a really bad
idea, security-wise.)


* OTHER HTML FILTERS *


Here are the other stand-alone, open-source HTML filters that we currently know
of:

* XSS filter for PHP4 - the filter from Squirrelmail
  PHP
  Konstantin Riabitsev
  http://www.mricon.com/html/phpfilter.html

* HTML::StripScripts and related CPAN modules
  Perl
  Nick Cleaton
  http://search.cpan.org/perldoc?HTML%3A%3AStripScripts

There are also a lot of HTML filters that were written specifically for some
program. Some of them are better than others.

Please write to the kses-general mailing list if you know of any other
stand-alone, open-source filters.


* DEDICATION *


kses 0.2.1 is dedicated to Mischa the cat.


* MISC *


The kses code is based on an HTML filter that Ulf wrote on his own back in 2002
for the open-source project Gnuheter
( http://savannah.nongnu.org/projects/gnuheter ). Gnuheter is a fork of
PHP-Nuke. The HTML filter has been improved a lot since then.

To stop people from having sleepless nights, we feel the urgent need to state
that kses doesn't have anything to do with the KDE project, despite having a
name that starts with a K.

In case someone was wondering, Ulf is available for kses-related consulting.

Finally, the name kses comes from the terms XSS and access. It's also a
recursive acronym (every open-source project should have one!) for "kses
strips evil scripts".


// Ulf and the kses gang, September 2003