kses attribute value checks
===========================

As you've probably already read in the README file, an $allowed_html array
normally looks like this:

$allowed = array('b' => array(),
                 'i' => array(),
                 'a' => array('href' => 1,
                              'title' => 1),
                 'p' => array('align' => 1),
                 'br' => array());

This sets which elements and attributes are allowed.

From kses 0.2.0 onwards, you can also perform some checks on the attribute
values. You do it like this:

$allowed = array('b' => array(),
                 'i' => array(),
                 'a' => array('href' =>
                                array('maxlen' => 100),
                              'title' => 1),
                 'p' => array('align' => 1),
                 'font' => array('size' =>
                                   array('maxval' => 20)),
                 'br' => array());

This means that kses should perform the maxlen check with the value 100 on the
<a href=> value, as well as the maxval check with the value 20 on the <font
size=> value.

The currently implemented checks (with more to come) are 'maxlen', 'maxval',
'minlen', 'minval' and 'valueless'.

'maxlen' checks that the length of the attribute value is not greater than the
given value. It helps against buffer overflows in WWW clients and in various
servers on the Internet. In my example above, it would mean that
"<a href='ftp://ftp.v1ct1m.com/AAAA..thousands_of_A's...'>" wouldn't be
accepted, because the href value is longer than 100 characters.

Of course, this problem is even worse if you put that long URL in a <frame>
tag instead, because the WWW client will then fetch it automatically without
the user having to click it.

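As a rough illustration of the 'maxlen' idea, a length check could be sketched
like this. The function name sketch_check_maxlen and the example URLs are made
up for this sketch; this is not the code kses itself uses.

```php
<?php
// Hypothetical helper, not part of kses: accept a value only if its
// length in bytes does not exceed $limit.
function sketch_check_maxlen($value, $limit)
{
    return strlen($value) <= $limit;
}

$short = 'ftp://ftp.example.com/file.txt';
$long  = 'ftp://ftp.example.com/' . str_repeat('A', 5000);

var_dump(sketch_check_maxlen($short, 100)); // bool(true)
var_dump(sketch_check_maxlen($long, 100));  // bool(false)
```
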
'maxval' checks that the attribute value is an integer greater than or equal to
zero, that it doesn't have an unreasonable number of zeroes or whitespace
characters (to avoid buffer overflows), and that it is not greater than the
given value. In my example above, it would mean that "<font size='20'>" is
accepted but "<font size='21'>" is not. This check helps against Denial of
Service (DoS) attacks against WWW clients.

One example of this DoS problem is <iframe src="http://some.web.server/"
width="20000" height="2000">, which completely overloads some client machines.

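To make the three conditions of 'maxval' concrete, here is a minimal sketch.
The function name sketch_check_maxval is hypothetical, and the limits of six
digits and six whitespace characters are assumptions chosen for illustration;
the text above doesn't say what counts as "unreasonable".

```php
<?php
// Hypothetical helper, not part of kses. The limits of six digits and six
// whitespace characters on either side are assumptions for illustration.
function sketch_check_maxval($value, $limit)
{
    // Only digits, optionally surrounded by a little whitespace. This
    // rejects negative numbers, non-numeric text and overly long input.
    if (!preg_match('/^\s{0,6}[0-9]{1,6}\s{0,6}$/', $value)) {
        return false;
    }
    // The value is now known to be a small non-negative integer.
    return (int) trim($value) <= $limit;
}

var_dump(sketch_check_maxval('20', 20));      // bool(true)
var_dump(sketch_check_maxval('21', 20));      // bool(false)
var_dump(sketch_check_maxval('-5', 20));      // bool(false)
var_dump(sketch_check_maxval('20000', 2000)); // bool(false)
```
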
'minlen' and 'minval' work the same way as 'maxlen' and 'maxval', except that
they check for minimum lengths and values instead of maximum ones.

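By analogy with the array syntax shown further up, minimum checks would
presumably be written like this. The limits 10 and 1 are made-up values for
illustration only.

```php
<?php
// Illustrative values only: require an href of at least 10 characters
// and a font size of at least 1.
$allowed = array('a' => array('href' => array('minlen' => 10)),
                 'font' => array('size' => array('minval' => 1)));
```

Going by the description above, "<font size='0'>" would then not be accepted,
since 0 is less than the given minimum of 1.
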
'valueless' checks if an attribute has a value (like <a href="blah">) or not
(<option selected>). If the given value is a "y" or a "Y", the attribute must
not have a value to be accepted. If the given value is an "n" or an "N", the
attribute must have a value. Note that <a href=""> is considered to have a
value, so there's a difference between valueless attributes and attribute
values of length zero.

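The 'valueless' rule can be sketched as follows. The function name
sketch_check_valueless and its $has_value parameter are hypothetical, and how
any setting other than "y" or "n" is treated is an assumption, since the text
above doesn't cover it. The $allowed array shows how the check might be written
for the two examples mentioned.

```php
<?php
// Hypothetical helper, not part of kses. $has_value says whether the
// attribute was written with a value (true even for an empty one like
// href=""); $setting is the "y" or "n" given in the array.
function sketch_check_valueless($has_value, $setting)
{
    switch (strtolower($setting)) {
        case 'y':
            return !$has_value; // must be valueless, like <option selected>
        case 'n':
            return $has_value;  // must have a value, even an empty one
        default:
            return true;        // assumption: other settings are not checked
    }
}

// Illustrative configuration using the array syntax from this document.
$allowed = array('option' => array('selected' => array('valueless' => 'y')),
                 'a' => array('href' => array('valueless' => 'n')));

var_dump(sketch_check_valueless(false, 'Y')); // bool(true)
var_dump(sketch_check_valueless(true, 'y'));  // bool(false)
var_dump(sketch_check_valueless(true, 'n'));  // bool(true)
```
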
You can combine several checks by putting them one after the other in the
inner array.
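
For instance, a combination of minimum and maximum checks could look like
this. The limits are illustrative values, not recommendations.

```php
<?php
// Illustrative limits only: each inner array lists several checks for
// the same attribute, one after the other.
$allowed = array('a' => array('href' => array('minlen' => 5,
                                              'maxlen' => 100)),
                 'font' => array('size' => array('minval' => 1,
                                                 'maxval' => 20)));
```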