Merry Christmas, Internets! My gift to you this year is Sanitize, a whitelist-based HTML sanitizer written in Ruby. Given a list of acceptable elements and attributes, Sanitize will remove all unacceptable HTML from a string.

Using a simple configuration syntax, you can tell Sanitize to allow certain elements, certain attributes within those elements, and even certain URL protocols within attributes that contain URLs. Any HTML elements or attributes that you don’t explicitly allow will be removed.
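To make that concrete, here is a minimal sketch of what such a configuration might look like. The particular elements, attributes, and protocols are illustrative choices, not a recommended policy, and the constant name `CONFIG` is hypothetical. The `:elements`, `:attributes`, and `:protocols` keys are Sanitize's configuration keys. The entry point has changed between releases (`Sanitize.clean` in older versions, `Sanitize.fragment` in 3.x and later), so the sketch checks which one is available.

```ruby
# Illustrative whitelist: anything not listed here gets removed.
CONFIG = {
  # Elements that are allowed to survive sanitization.
  :elements   => %w[a b em strong],
  # Attributes allowed on each element.
  :attributes => { 'a' => %w[href title] },
  # URL protocols allowed inside specific attributes.
  :protocols  => { 'a' => { 'href' => %w[http https mailto] } }
}

begin
  require 'sanitize'

  html = '<b><a href="http://example.com/" onclick="evil()">link</a></b>' \
         '<script>alert(1)</script>'

  # Assumption: newer releases expose Sanitize.fragment, older ones Sanitize.clean.
  clean = if Sanitize.respond_to?(:fragment)
            Sanitize.fragment(html, CONFIG)
          else
            Sanitize.clean(html, CONFIG)
          end

  puts clean
rescue LoadError
  warn 'The sanitize gem is not installed; skipping the demo call.'
end
```

With this configuration, the `onclick` attribute and the `script` element are not on the whitelist, so neither should appear in the output.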

Because it’s based on Nokogiri, a full-fledged HTML parser, rather than a bunch of fragile regular expressions, Sanitize has no trouble dealing with malformed or maliciously formed HTML. When in doubt, Sanitize always errs on the side of caution.