kses attribute value checks
===========================

As you've probably already read in the README file, an $allowed_html array
normally looks like this:

$allowed = array('b' => array(),
                 'i' => array(),
                 'a' => array('href' => 1,
                              'title' => 1),
                 'p' => array('align' => 1),
                 'br' => array());

This sets what elements and attributes are allowed.

From kses 0.2.0, you can also perform some checks on the attribute values. You
do it like this:

$allowed = array('b' => array(),
                 'i' => array(),
                 'a' => array('href' =>
                                array('maxlen' => 100),
                              'title' => 1),
                 'p' => array('align' => 1),
                 'font' => array('size' =>
                                array('maxval' => 20)),
                 'br' => array());

This means that kses should perform the maxlen check with the value 100 on the
<a href=> value, as well as the maxval check with the value 20 on the <font
size=> value.

The currently implemented checks (with more to come) are 'maxlen', 'maxval',
'minlen', 'minval' and 'valueless'.

'maxlen' checks that the length of the attribute value is not greater than the
given value. It is helpful against Buffer Overflows in WWW clients and various
servers on the Internet. In my example above, it would mean that
"<a href='ftp://ftp.v1ct1m.com/AAAA..thousands_of_A's...'>" wouldn't be
accepted.

Of course, this problem is even worse if you put that long URL in a <frame>
tag instead, so the WWW client will fetch it automatically without a user
having to click it.

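As a rough illustration of the 'maxlen' idea, here is a minimal sketch in PHP.
The function name sketch_check_maxlen() is made up for this example and is not
part of kses. It assumes that the length is counted in bytes with strlen().

```php
<?php
// Hypothetical helper, not the kses implementation: returns true when the
// attribute value is no longer than $max bytes.
function sketch_check_maxlen($value, $max)
{
    return strlen($value) <= $max;
}

// A short URL passes a limit of 100.
var_dump(sketch_check_maxlen('ftp://ftp.v1ct1m.com/', 100));   // bool(true)

// A URL padded with thousands of A's does not.
$long_url = 'ftp://ftp.v1ct1m.com/' . str_repeat('A', 5000);
var_dump(sketch_check_maxlen($long_url, 100));                 // bool(false)
```

A value whose length is exactly equal to the limit still passes, since the
check only rejects values that are greater than the given value.
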
'maxval' checks that the attribute value is an integer greater than or equal to
zero, that it doesn't have an unreasonable number of zeroes or amount of
whitespace (to avoid Buffer Overflows), and that it is not greater than the
given value. In my example above, it would mean that "<font size='20'>" is
accepted but "<font size='21'>" is not. This check helps against Denial of
Service attacks against WWW clients.

One example of this DoS problem is <iframe src="http://some.web.server/"
width="20000" height="2000">, which completely overloads some client machines.

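The three conditions of the 'maxval' check can be sketched roughly as follows.
This is only an illustration: sketch_check_maxval() is a made-up name, and the
limits of six digits and six whitespace characters on each side are assumptions
chosen for the example, not values taken from this document.

```php
<?php
// Hypothetical helper, not the kses implementation.
function sketch_check_maxval($value, $max)
{
    // Digits only (so no sign and no fraction), with an assumed cap of six
    // digits and six whitespace characters on either side.
    if (!preg_match('/^\s{0,6}[0-9]{1,6}\s{0,6}$/', $value)) {
        return false;
    }
    // The numeric value must not be greater than the given maximum.
    return (int) trim($value) <= $max;
}

var_dump(sketch_check_maxval('20', 20));          // bool(true)
var_dump(sketch_check_maxval('21', 20));          // bool(false)
var_dump(sketch_check_maxval('-5', 20));          // bool(false), not >= 0
var_dump(sketch_check_maxval('0000000001', 20));  // bool(false), too many digits
var_dump(sketch_check_maxval('20000', 1000));     // bool(false), like the iframe width
```

Doing the pattern test before the numeric comparison means that oversized or
malformed input is rejected without ever being converted to a number.
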
'minlen' and 'minval' work the same way as 'maxlen' and 'maxval', except that
they check for minimum lengths and values instead of maximum ones.

'valueless' checks whether an attribute has a value (like <a href="blah">) or
not (<option selected>). If the given value is a "y" or a "Y", the attribute
must not have a value to be accepted. If the given value is an "n" or an "N",
the attribute must have a value. Note that <a href=""> is considered to have a
value, so there's a difference between valueless attributes and attribute
values of length zero.

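The 'valueless' rule can be pictured with the small sketch below. It is not the
kses code: sketch_check_valueless() is a made-up name, and it assumes that the
caller already knows whether the attribute was written with a value or not and
passes that in as a boolean.

```php
<?php
// Hypothetical helper, not the kses implementation.
// $has_value is false for <option selected> and true for <a href="">,
// because an empty value still counts as a value.
// $setting is the value given for 'valueless' in the inner array. This
// sketch treats anything other than "y" or "Y" like "n".
function sketch_check_valueless($has_value, $setting)
{
    $must_be_valueless = (strtolower($setting) === 'y');
    return $must_be_valueless ? !$has_value : $has_value;
}

var_dump(sketch_check_valueless(false, 'Y'));  // bool(true):  <option selected>
var_dump(sketch_check_valueless(true, 'y'));   // bool(false): selected="x"
var_dump(sketch_check_valueless(true, 'n'));   // bool(true):  <a href="">
var_dump(sketch_check_valueless(false, 'N'));  // bool(false): <a href>
```
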
You can combine more than one check by putting one after the other in the
inner array.

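For instance, an array that combines several checks per attribute might look
like the sketch below. The particular limits (4, 100, 1, 20) and the choice of
elements are only illustrative assumptions.

```php
<?php
// Illustrative values only. Each inner array lists every check that should
// be applied to that attribute.
$allowed = array('b' => array(),
                 'a' => array('href' =>
                                array('minlen' => 4,
                                      'maxlen' => 100),
                              'title' => 1),
                 'font' => array('size' =>
                                array('minval' => 1,
                                      'maxval' => 20)),
                 'option' => array('selected' =>
                                array('valueless' => 'y')),
                 'br' => array());
```

Here <a href=> would have to be between 4 and 100 characters long, <font size=>
between 1 and 20, and <option selected> would only be accepted without a value.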