HTML Sanitizer API

The HTML Sanitizer API allows developers to take strings of HTML and filter out unwanted elements, attributes, and other HTML entities when they are inserted into the DOM or a shadow DOM.

Concepts and usage

Web applications often need to work with untrusted HTML on the client side, for example, as part of a client-side templating solution, when rendering user generated content, or if including data in a frame from another site.

Injecting untrusted HTML can make a site vulnerable to various types of attacks. In particular, cross-site scripting (XSS) attacks work by injecting untrusted HTML into the DOM that then executes JavaScript in the context of the current origin — allowing malicious code to run as though it was served from the site's origin. These attacks can be mitigated by removing unsafe HTML elements and attributes before they are injected into the DOM.

The HTML Sanitizer API provides a number of methods for removing unwanted HTML entities from HTML input before it is injected into the DOM. These come in XSS-safe versions that enforce removal of all unsafe elements and attributes, and potentially unsafe versions that give developers full control over the HTML entities that are allowed.

Sanitization methods

The HTML Sanitizer API provides XSS-safe and XSS-unsafe methods for injecting HTML strings into an Element or a ShadowRoot, and for parsing HTML into a Document.

Safe methods: Element.setHTML(), ShadowRoot.setHTML(), and Document.parseHTML().
Unsafe methods: Element.setHTMLUnsafe(), ShadowRoot.setHTMLUnsafe(), and Document.parseHTMLUnsafe().

All the methods take the HTML to be injected and an optional sanitizer configuration as arguments. The configuration defines the HTML entities that will be filtered out of the input before it is injected. The Element methods are context aware, and will additionally drop any elements that the HTML specification does not allow in the target element.

The safe methods always remove XSS-unsafe elements and attributes. If no sanitizer is passed as a parameter they will use the default sanitizer configuration, which allows all elements and attributes except those that are known to be unsafe, such as