Sanitising user-contributed markup

At ApacheCon, I once again encountered the argument that sanitising markup is difficult, with an explanation of how easy it is to evade pattern-matching filters with tricks like reordered attributes, stray whitespace, and embedded comments. I protested that this kind of difficulty comes from using the wrong tools, and that the problem largely goes away if you use markup-aware tools.
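
As a rough illustration of the markup-aware approach, here is a minimal sketch in Python using the standard library's html.parser. It is not the method from the technical note, and the names ALLOWED, SAFE_SCHEMES, Sanitiser and sanitise are hypothetical, chosen only for this example. The idea is that the parser normalises case, whitespace and comments before any policy is applied, so the policy is a simple allowlist of tags and attributes rather than a pattern match on raw text.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy for illustration: tag name -> attributes we permit.
ALLOWED = {"p": set(), "em": set(), "strong": set(), "a": {"href"}}
# Hypothetical allowlist of URL schemes; anything else (including
# relative URLs, in this simple sketch) is dropped.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class Sanitiser(HTMLParser):
    """Re-serialise only allowlisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip = 0  # non-zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        # The parser has already lower-cased tag and attribute names.
        if tag in ("script", "style"):
            self.skip += 1
            return
        if self.skip or tag not in ALLOWED:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self.skip:
                self.skip -= 1
            return
        if not self.skip and tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped on output; script/style content is discarded.
        if not self.skip:
            self.out.append(escape(data, quote=False))

    # Comments are dropped: the default handle_comment does nothing.


def sanitise(markup):
    parser = Sanitiser()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitise('<p onclick="x()">Hi <!-- hidden --><em>there</em></p>'))
    print(sanitise('<a  HREF = "javascript:alert(1)">link</a>'))
```

The sketch does not balance tags or validate nesting, so a production filter would want a full HTML parser and a stricter serialiser. The point is only that tricks aimed at regular expressions have nothing to bite on once the input has been parsed.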

On April 10th I promised a note on this (though that promise came from a separate conversation at ApacheCon, and in a different context to the security issue). Today I’ve delivered on that promise, with a brief technical note. I expect to use it in future when the subject arises.


Posted on April 20, 2008, in apache, html, security.

