When developing for the web, a time will come when you’ll need to sanitize HTML. If you need to do this in Python, check out Bleach.
Bleach is an HTML sanitizing library that escapes or strips markup and attributes based on a whitelist. It can also linkify text safely, applying filters that Django’s urlize filter cannot, and optionally set rel attributes, even on links already in the text.
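Here is a minimal sketch of what that looks like in practice. It assumes Bleach is installed (`pip install bleach`) and uses its `clean` and `linkify` functions; the sample comment and the particular whitelist are made up for illustration, and exact output can vary a little between Bleach versions.

```python
import bleach

# Hypothetical user-submitted comment (illustrative only).
comment = 'Nice <b>post</b>! <script>alert("x")</script> See http://example.com'

# Illustrative whitelist: only these tags and attributes survive.
allowed_tags = {"a", "b", "em", "strong"}
allowed_attrs = {"a": ["href", "title", "rel"]}

# Default behaviour: markup outside the whitelist is escaped.
cleaned = bleach.clean(comment, tags=allowed_tags, attributes=allowed_attrs)

# With strip=True, disallowed tags are removed instead of escaped
# (the text inside them is kept).
stripped = bleach.clean(
    comment, tags=allowed_tags, attributes=allowed_attrs, strip=True
)

# Turn bare URLs in the already-cleaned text into links.
linked = bleach.linkify(cleaned)

print(cleaned)
print(stripped)
print(linked)
```

Whether to escape or strip is a judgment call: escaping shows users exactly what they typed, while stripping keeps the rendered output tidier.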
Even if all you want to do is apply rel='nofollow' to the links in user-generated content, Bleach has you covered. So, check it out the next time you need to clean some HTML.
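As a rough sketch of the nofollow case: this assumes Bleach's default `linkify` callbacks, which in the versions I'm aware of add `rel="nofollow"` to links that are already present in the markup. The sample HTML is hypothetical.

```python
import bleach

# Hypothetical user content that already contains a link.
html = 'Visit <a href="http://example.com">my site</a>.'

# linkify's default callbacks should add rel="nofollow" to the existing link.
result = bleach.linkify(html)

print(result)
```

Note that `linkify` does not sanitize its input, so untrusted content would still need to go through `bleach.clean` as well.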