Security Safe HTML

Cross-Site-Scripting (XSS) vulnerabilities are a class of web application security bugs that allow an attacker to execute arbitrary malicious JavaScript in the context of a victim's browser session. In turn, such malicious script can for example steal the user's session credentials (resulting in hijacking and full compromise of the user's session), extract and leak sensitive or confidential data from the victim's account, or execute transactions chosen by the attacker in the name of the victim.

In GWT applications, many aspects of a web-application's UI are expressed in terms of abstractions (such as Widgets) that do not expose a potential for untrusted data to be interpreted as HTML markup or script. As such, GWT apps are inherently less prone to XSS vulnerabilities than applications built on top of frameworks where UI is rendered directly as HTML (such as server-side templating systems, with the exception of templating systems that automatically escape template variables according to the HTML context the variable appears in).

However, GWT applications are not inherently safe from XSS vulnerabilities.

A large class of potential XSS vulnerabilities in GWT applications arises from the use of methods that cause the browser to evaluate their argument as HTML, for example, setInnerHTML(String), setHTML(String), as well as the constructors of HTML-containing widgets such as HTML. If an application passes a string to such a method where the string is even partially derived from untrusted input, the application is vulnerable to XSS. In this context, untrusted input includes immediate user input, data the client app has received from the server and which may have been provided to the server by a different, malicious user, as well as values obtained from a history token read from a URL fragment.

This document introduces a new security package and accompanying coding guidelines that help developers avoid this class of XSS vulnerabilities, while minimizing overhead in run-time and development effort. A primary goal of the coding guidelines is to facilitate high-confidence code-reviews of applications for the absence of this class of XSS bugs.

Note that this document does not address other classes of XSS vulnerabilities that GWT applications may be vulnerable to, such as server-side XSS, as well as other classes of client-side XSS (for example, calling eval() on an untrusted string in native JavaScript.)

Coding Guidelines

The goals of these guidelines are two-fold:

A GWT application whose code-base these guidelines have been consistently and comprehensively applied to should be free of the class of XSS vulnerabilities due to attacker-controlled strings being evaluated as HTML in the browser.
The code should be structured such that it is easily reviewable for absence of this class of XSS — for each use of a potentially XSS-prone method such as Element.setInnerHTML it should be "obvious" that this use can't result in an XSS vulnerability.

The second goal is based on the desire to achieve a high degree of confidence in the absence of this class of bugs. When reviewing code, it is often difficult and error-prone to determine whether or not a value passed to some method may be controlled by an attacker, especially if the value is received via a long chain of assignments and method calls.

Hence, the central idea behind these guidelines is to use a type to encapsulate strings that are safe to use in HTML context, construct safe HTML into instances of this type, and use this type to "transport" strings as close as possible to a code site where they are used as HTML. Such a use is then easily apparent to be free of XSS vulnerabilities, given the type's contract (as well as Java's type-safety) as an assumption.

Coding Guidelines for Developers of Widget Client Code

The following guidelines are aimed at developers of client-code that uses existing widget libraries, in particular the core widget library that is distributed with GWT.

Prefer Plain-Text Widgets

The best way to avoid XSS bugs, and to write code that is easily seen to be free of XSS vulnerabilities, is to simply not use API methods and widgets that interpret parameters as HTML unless strictly necessary.

For example, it is not uncommon to see GWT application code such as:

HTML widget = new HTML("Some text in the widget");

widget.setHTML(someText);

In the first example, it's obvious that the value passed to the HTML constructor cannot result in an XSS vulnerability: The value doesn't contain HTML markup, and furthermore is a compile-time constant and hence cannot possibly be manipulated by an attacker. In the second example, it may be obvious to a reviewer that the call is safe if the variable someText is assigned to "nearby" in the code; however if the supplied value is passed in via a parameter through a few layers of method calls this is much less obvious. Also, such a scenario may well result in a bug in a future code iteration if calling code is changed by a developer who doesn't realize that the value will be used in an HTML context.

In such situations, it is preferred to use the non-HTML equivalent, that is, the Label widget and the setText method, respectively, both of which are always safe from XSS even if the string passed to the Label constructor or the setText method is under the control of an attacker. Similarly, use setInnerText instead of setInnerHTML on DOM elements.

Use UiBinder for Declarative Layout

Using GWT UiBinder is the preferred approach to declaratively creating Widget and DOM structures in GWT applications. In addition to the primary benefits of UiBinder (clean separation of code and layout, performance, built-in internationalization support), using UiBinder also typically results in code that is much less prone to XSS vulnerabilities than code that uses an ad-hoc approach to assembling HTML markup to, say, be placed into a HTML widget.

Often, the "leaf nodes" in the UiBinder-declared layout will be widgets whose data content is plain text rather than HTML markup. As such, they are naturally populated via setText rather than setHTML or equivalent, which completely avoids the potential for an XSS vulnerability.

Use the SafeHtml Type to Represent XSS-Safe HTML

There will be occasions where using UiBinder or similar approaches is not practical or too inconvenient. The com.google.gwt.safehtml package provides types and classes that can be used to write such code and yet have confidence that it is free of XSS.

The package provides an interface, SafeHtml, to represent the subset of strings that are safe to use in an HTML context, in the sense that evaluating the string as HTML in a browser will not result in script execution. More specifically, all implementations of this interface must adhere to the type contract that invoking the asString() method on an instance will always return a string that is HTML-safe in the above sense. In addition, the type's contract requires that the concatenation of any two SafeHtml-wrapped strings must itself be safe to use in an HTML context.

With the introduction of the com.google.gwt.safehtml package, all of the core GWT library's widgets that take String arguments that are interpreted as HTML have been augmented with corresponding methods that take a SafeHtml-typed value. In particular, all widgets that implement the HasHTML (or HasDirectionalHtml) interface also implement the HasSafeHtml (or HasDirectionalSafeHtml, respectively) interface. These interfaces define:

public void setHTML(SafeHtml html);
public void setHTML(SafeHtml html, Direction dir);

as safe alternatives to setHTML(String) and setHTML(String, Direction).

For example, the HTML widget has been augmented with the following constructors and methods:

public class HTML extends Label 
    implements HasDirectionalHtml, HasDirectionalSafeHtml {
  // ...
  public HTML(SafeHtml html);
  public HTML(SafeHtml html, Direction dir);
  @Override
  public void setHTML(SafeHtml html);
  @Override
  public void setHTML(SafeHtml html, Direction dir);
}

A central aspect of these coding guidelines is that developers of GWT applications should not use constructors and methods with String-typed parameters whose values are interpreted as HTML, and instead use the SafeHtml equivalent.

Creating SafeHtml Values

The safehtml package provides a number of tools to create instances of SafeHtml that cover many common scenarios in which GWT applications typically manipulate strings containing HTML markup:

A builder class that facilitates the creation of SafeHtml values by safely combining developer-controlled snippets of HTML markup with (possibly attacker-controlled) values.
A templating mechanism that allows the definition of snippets of structured HTML markup, into which values are safely interpolated at run-time.
I18N Messages can return localized messages in the form of a SafeHtml.
A number of convenience methods are available to create SafeHtml values from strings.
A simple HTML sanitizer that accepts a limited subset of HTML markup in its input, and HTML-escapes any HTML markup not within that subset.

Each of the above mechanisms has been carefully reviewed with respect to adherence to the SafeHtml type contract.

SafeHtmlBuilder

In many scenarios, the strings that will be used in an HTML context are concatenated partially from trusted strings (for example, snippets of HTML markup defined within the application's source) and untrusted strings that may be under the control of a potential attacker. The SafeHtmlBuilder class provides a builder interface that supports this use-case while ensuring that the untrusted parts of the string are appropriately escaped.

Consider this usage example:

public void showItems(List items) {
  SafeHtmlBuilder builder = new SafeHtmlBuilder();
  for (String item : items) {
    builder.appendEscaped(item).appendHtmlConstant("
");
  }
  itemsListHtml.setHTML(builder.toSafeHtml());
}

SafeHtmlBuilder's appendHtmlConstant method is used to append a constant snippet of HTML to the builder, without escaping the argument. The appendEscaped method in contrast will HTML-escape its string argument before appending.

To allow SafeHtmlBuilder to adhere to the SafeHtml contract, code using it must in turn adhere to the following rules:

The argument of appendHtmlConstant must be a string literal (or, more generally, must be fully determined at compile time).
The string provided must not end within an HTML tag. For example, the following use would be illegal because the argument of the first appendHtmlConstant contains an incomplete tag; the string ends in the context of the value of the href attribute of that tag:

builder.appendHtmlConstant("")

The first rule is necessary to ensure that strings passed to appendHtmlConstant cannot possibly be under the control of an attacker. The second rule is necessary because untrusted strings used inside an HTML tag attribute can result in script execution even if they are HTML-escaped. In the above example, script execution could occur if the value of url is javascript:evil_js_code().

When executing client-side in hosted mode, or server-side with assertions enabled, appendHtmlConstant parses its argument and checks that it satisfies the second constraint. For performance reasons, this check is not performed in production mode in client code, and with assertions disabled on the server.

SafeHtmlBuilder also provides the append(SafeHtml) method. The contents of the provided SafeHtml will be appended to the builder without prior escaping (due to the SafeHtml contract, it can be assumed to be HTML-safe). This method allows HTML snippets wrapped as SafeHtml to be composed into larger SafeHtml snippets.

Creating HTML using the SafeHtmlTemplates Interface

To facilitate the creation of SafeHtml instances containing more complex HTML markup, the safehtml package provides a compile-time bound template mechanism which can be used as in this example:

public class MyWidget ... {
// ...
  public interface MyTemplates extends SafeHtmlTemplates {
    @Template("{0}: {2}")
    SafeHtml messageWithLink(SafeHtml message, String url, String linkText,
        String style);
  }
 
  private static final MyTemplates TEMPLATES =
      GWT.create(MyTemplates.class);
 
  public void useTemplate(...) {
    SafeHtml message;
    String url;
    String linkText;
    String style;
    // ...
    InlineHTML messageWithLinkInlineHTML = new InlineHTML(
        TEMPLATES.messageWithLink(message, url, linkText, style));
    // ...
  }

Instantiating a SafeHtmlTemplates interface with GWT.create() returns an instance of an implementation that is generated at compile time. The code generator parses the value of each template method's @Template annotation as an (X)HTML template, with template variables denoted by curly-brace placeholders that refer by index to the corresponding template method parameter.

All methods in SafeHtmlTemplates interfaces must have a return type of SafeHtml. The compile-time generated implementations of such methods are constructed such that they return instances of SafeHtml that indeed honor the SafeHtml type contract. The code generator accomplishes this guarantee by a combination of compile-time checks and run-time checks in the generated code (see however the note below with regards to a limitation of the current implementation):

The template is parsed with a lenient HTML stream parser that accepts HTML similar to what would typically be accepted by a web browser. The parser does not require that templates consist of balanced HTML tags. However, the parser and template code generator enforce the following constraints on input templates: Template parameters may not appear in HTML comments, parameters may not appear in a Javascript context (e.g., inside a