Home » 2010 » February » 4 » Browser Security Handbook, part 1.2
3:39 PM
Browser Security Handbook, part 1.2

HTML entity encoding

HTML features a special encoding scheme called HTML entities. The purpose of this scheme is to make it possible to safely render certain reserved HTML characters (e.g., < > &) within documents, as well as to carry high bit characters safely over 7-bit media. The scheme nominally permits three types of notation:

  • One of predefined, named entities, in the format of &<name>; - for example &lt; for <, &gt; for >, &rarr; for , etc,
  • Decimal entities, &#<nn>;, with a number corresponding to the desired Unicode character value - for example &#60; for <, &#8594; for ,
  • Hexadecimal entities, &#x<nn>;, likewise - for example &#x3c; for <, &#x2192; for .

In every browser, HTML entities are decoded only in parameter values and stray text between tags. Entities have no effect on how the general structure of a document is understood, and no special meaning in sections such as <SCRIPT>. The ability to understand and parse the syntax is still critical to properly understanding the value of a particular HTML parameter, however. For example, as hinted in one of the earlier sections, <A HREF="javascript&#09;:alert(1)"> may need to be parsed as an absolute reference to javascript<TAB>:alert(1), as opposed to a link to something called javascript& with a local URL hash string part of #09;alert(1).

Unfortunately, various browsers follow different parsing rules to these HTML entity notations; all rendering engines recognize entities with no proper ; terminator, and all permit entities with excessively long, zero-padded notation, but with various thresholds:

Test description MSIE6 MSIE7 MSIE8 FF2 FF3 Safari Opera Chrome Android
Maximum length of a correctly terminated decimal entity 7 7 7 8* 8* 8*
Maximum length of an incorrectly terminated decimal entity 7 7 7 8* 8* 8*
Maximum length of a correctly terminated hex entity 6 6 6 8* 8* 8*
Maximum length of an incorrectly terminated hex entity 0 0 0 8* 8* 8*
Characters permitted in entity names (excluding A-Z a-z 0-9) none none none - . - . none none none none

* Entries one byte longer than this limit still get parsed, but incorrectly; for example, &#000000065; becomes a sequence of three characters, \x06 5 ;. Two characters and more do not get parsed at all - &#0000000065; is displayed literally).

An interesting piece of trivia is that, as per HTML entity encoding requirements, links such as:


Should be technically always encoded in HTML parameters (but not in JavaScript code) as:

<a href="http://example.com/?p1=v1&amp;p2=v2">Click here</a>

In practice, however, the convention is almost never followed by web developers, and browsers compensate for it by treating invalid HTML entities as literal &-containing strings.

Document Object Model

As the web matured and the concept of client-side programming gained traction, the need arose for HTML document to be programatically accessible and modifiable on the fly in response to user actions. One technology, Document Object Model (also referred to as "dynamic HTML"), emerged as the prevailing method for accomplishing this task.

In the context of web browsers, Document Object Model is simply an object-based representation of HTML document layout, and by a natural extension much of the browser internals, in a hierarchical structure with various read and write properties, and callable methods. To illustrate, to access the value of the first <INPUT> tag in a document through DOM, the following JavaScript code could be used:


Document Object Model also permits third-party documents to be referenced, subject to certain security checks discussed later on; for example, the following code accesses the <INPUT> tag in the second <IFRAME> on a current page:


DOM object hierarchy - as seen by programmers - begins with an implicit "root" object, sometimes named defaultView or global; all the scripts running on a page have defaultView as their default scope. This root object has the following members:

  • Various properties and methods relating to current document-specific window or frame (reference). Properties include window dimensions, status bar text, references to parent, top-level, and owner (opener) defaultView objects. Methods permit scripts to resize, move, change focus, or decor of current windows, display certain UI widgets, or - oddly enough - set up JavaScript timers to execute code after a certain "idle" interval.
  • All the current script's global variables and functions.
  • defaultView.document (reference) - a top-level object for the actual, proper document object model. This hierarchy represents all of the document elements, along with their specific methods and properties, and document-wide object lookup functions (e.g., getElementById, getElementsByTagName, etc). An assortment of browser-specific, proprietary APIs is usually present on top of the common functionality (e.g., document.execCommand in Internet Explorer).
  • defaultView.location (reference) - a simple object describing document's current location, both as an unparsed string, and split into parsed segments; provides methods and property setters that allow scripts to navigate away to other sites.
  • defaultView.history (reference) - another simple object offering three methods to enable pages to navigate back to previously visited pages.
  • defaultView.screen (reference) - yet another modestly sized object with a bunch of mostly read-only properties describing properties of the current display device, including pixel and DPI resolution, and so forth.
  • defaultView.navigator (reference) - an object containing read-only properties such as browser make and version, operating platform, or plugin configuration.
  • defaultView.window - a self-referential entry that points back to the root object; this is the name by which the defaultView object is most commonly known. In general, screen, navigator, and window DOM hierarchies combined usually contain enough information to very accurately fingerprint any given machine.

With the exception of cross-domain, cross-document access permissions outlined in later chapters, the operation of DOM bears relatively little significance to site security. It is, however, worth noting that DOM methods are wrappers providing access to highly implementation-specific internal data structures that may or may not obey the usual rules of JavaScript language, may haphazardly switch between bounded and ASCIZ string representations, and so forth. Because of this, multiple seemingly inexplicable oddities and quirks plague DOM data structures in every browsers - and some of these may interfere with client-side security mechanisms. Several such quirks are documented below:

Test description MSIE6 MSIE7 MSIE8 FF2 FF3 Safari Opera Chrome Android
Is window the same object as window.window? NO NO NO YES YES YES YES YES YES
Is document.URL writable? NO YES YES NO NO NO NO NO NO
Can builtin DOM objects be clobbered? overwrite overwrite overwrite shadowing shadowing shadowing overwrite overwrite shadowing
Does getElementsByName look up by ID= values as well? YES YES YES NO NO NO YES NO NO
Are .innerHTML assignments truncated at NUL? YES YES NO NO NO NO YES NO NO
Are location.* assignments truncated at NUL? YES YES YES YES YES YES YES NO NO

Trivia: traversing document.* is a possible approach to look up specific input or output fields when modifying complex pages from within scripts, but it is not a very practical one; because of this, most scripts resort to document.getElementById() method to uniquely identify elements regardless of their current location on the page. Although ID= parameters for tags within HTML documents are expected to be unique, there is no such guarantee. Currently, all major browsers seem to give the first occurrence a priority.

Browser-side Javascript

JavaScript is a relatively simple but rich, object-based imperative scripting language tightly integrated with HTML, and supported by all contemporary web browsers. Authors claim that the language has roots with Scheme and Self programming languages, although it is often characterized in a more down-to-earth way as resembling a crossover of C/C++ and Visual Basic. Confusingly, except for the common C syntax, it shares relatively few unique features with Java; the naming is simply a marketing gimmick devised by Netscape and Sun in the 90s.

The language is a creation of Netscape engineers, originally marketed under the name of Mocha, then Livescript, and finally rebranded to JavaScript; Microsoft embraced the concept shortly thereafter, under the name of JScript. The language, as usual with web technologies, got standardized only post factum, under a yet another name - ECMAScript. Additional, sometimes overlapping efforts to bring new features and extensions of the language - for example, ECMAScript 4, JavaScript versions 1.6-1.8, native XML objects, etc - are taking place, although none of these concepts managed to win broad acceptance and following.

Browser-side JavaScript is invoked from within HTML documents in four primary ways:

  1. Standalone <SCRIPT> tags that enclose code blocks,
  2. Event handlers tied to HTML tags (e.g. onmouseover="..."),
  3. Stylesheet expression(...) blocks that permit JavaScript syntax in some browsers,
  4. Special URL schemes specified as targets for certain resources or actions (javascript:...) - in HTML and in stylesheets.

Note that the first option does not always work when such a string is dynamically added by running JavaScript to an existing document. That said, any of the remaining options might be used as a comparable substitute.

Regardless of their source (<SCRIPT SRC="...">), remote scripts always execute in the security context of the document they are attached to. Once called, JavaScript has full access to the current DOM, and limited access to DOMs of other windows; it may also further invoke new JavaScript by calling eval(), configuring timers (setTimeout(...) and setInterval(...)), or producing JavaScript-invoking HTML. JavaScript may also configure self to launch when its objects are interacted with by third-party JavaScript code, by configuring watches, setters, or getters, or cross contexts by calling same-origin functions belonging to other documents.

JavaScript enjoys very limited input and output capabilities. Interacting with same-origin document data and drawing CANVASes, displaying window.* pop-up dialogs, and writing script console errors aside, there are relatively few I/O facilities available. File operations or access to other persistent storage is generally not possible, although some experimental features are being introduced and quickly scrapped. Scripts may send and receive data from the originating server using the XMLHttpRequest extension, which permits scripts to send and read arbitrary payloads; and may also automatically send requests and read back properly formatted responses by issuing <SCRIPT SRC="...">, or <LINK REL="Stylesheet" HREF="...">, even across domains. Lastly, JavaScript may send information to third-party servers by spawning objects such as <IMG>, <IFRAME>, <OBJECT>, <APPLET>, or by triggering page transitions, and encoding information in the URL that would be automatically requested by the browser immediately thereafter - but it may not read back the returned data. Some side channels related to browser caching mechanisms and lax DOM security checks also exist, providing additional side channels between cooperating parties.

Some of the other security-relevant characteristics of JavaScript environments in modern browsers include:

  • Dynamic, runtime code interpretation with no strict code caching rules. Any code snippets located in-line with HTML tags would be interpreted and executed, and JavaScript itself has a possibility to either directly evaluate strings as JavaScript code (eval(...)), or to produce new HTML that in turn may contain more JavaScript (.innerHTML and .outerHTML properties, document.write(), event handlers). The behavior of code that attempts to modify own container while executing (through DOM) is not well-defined, and varies between browsers.
  • Somewhat inconsistent exception support. Although within the language itself, exception-throwing conditions are well defined, it is not so when interacting with DOM structures. Depending on the circumstances, some DOM operations may fail silently (with writes or reads ignored), return various magic responses (undefined, null, object inaccessible), throw a non-standardized exception, or even abort execution unconditionally.
  • Somewhat inconsistent, but modifiable builtins and prototypes. The behavior of many language constructs might be redefined by programs within the current context by modifying object setters, getters, or overwriting prototypes. Because of this, reliance on any specific code - for example a security check - executing in a predictable manner in the vicinity of a potentially malicious payload is risky. Notably, however, not all builtin objects may be trivially tampered with - for example, Array prototype setters may be modifiable, but XML not so. There is no fully-fledged operator overloading in JavaScript 1.x.
  • Typically synchronous execution. Within a single document, browser-side JavaScript usually executes synchronously and in a single thread, and asynchronous timer events are not permitted to fire until the scripting engine enters idle state. No strict guarantees of synchronous execution exist, however, and multi-process rendering of Chrome or MSIE8 may permit cross-domain access to occur more asynchronously.
  • Global function lookups not dependent on ordering or execution flow. Functions may be invoked in a block of code before they are defined, and are indexed in a pass prior to execution. Because of this, reliance on top-level no-return statements such as while (1); to prevent subsequent code from being interpreted might be risky.
  • Endless loops that terminate. As a security feature, when the script executes for too long, the user is prompted to abort execution. This merely causes current execution to be stopped, but does not prevent scripts from being re-entered.
  • Broad, quickly evolving syntax. For example, E4X extensions in Firefox well-structured XHTML or XML to be interpreted as a JavaScript object, also executing any nested JavaScript initializers within such a block. This makes it very difficult to assume that a particular format of data would not constitute valid JavaScript syntax in a future-proof manner.

Inline script blocks embedded within HTML are somewhat tricky to sanitize if permitted to contain user-controlled strings, because, much like <TEXTAREA>, <STYLE>, and several other HTML tags, they follow a counterintuitive CDATA-style parsing: a literal sequence of </SCRIPT> ends the script block regardless of its location within the JavaScript syntax as such. For example, the following code block would be ended prematurely, and lead to unauthorized, additional JavaScript block being executed:

var user_string = 'Hello world</SCRIPT><SCRIPT>alert(1)</SCRIPT>';

Strictly speaking, HTML4 standard specifies that any </... sequence may be used to break out of non-HTML blocks such as <SCRIPT> (reference); in practice, this suit is not followed by browsers, and a literal </SCRIPT> string is required instead. This property is not necessarily safe to rely on, however.

Various differences between browser implementations are outlined below:

Test description MSIE6 MSIE7 MSIE8 FF2 FF3 Safari Opera Chrome Android
Is setter and getter support present? NO NO NO YES YES YES YES YES YES
Is access to prototypes via __proto__ possible? NO NO NO YES YES YES NO YES YES
Is it possible to alias eval() function? YES YES YES PARTLY PARTLY YES PARTLY YES YES
Are watches on objects supported? NO NO NO YES YES NO NO NO NO
May currently executing code blocks modify self? read-only read-only read-only NO NO NO YES NO NO
Is E4X extension supported? NO NO NO YES YES NO NO NO NO
Is charset= honored on <SCRIPT SRC="...">? YES YES YES YES YES YES NO YES YES
Do Unicode newlines (U+2028, U+2029) break lines in JavaScript? NO NO NO YES YES YES YES YES YES

Trivia: JavaScript programs can programatically initiate or intercept a variety of events, such as mouse and keyboard input within a window. This can interfere with other security mechanisms, such as "protected" <INPUT TYPE=FILE ...> fields, or non-same-origin HTML frames, and little or no consideration was given to these interactions at design time, resulting in a number of implementation problems in modern browsers.

Javascript character encoding

To permit handling of strings that otherwise contain restricted literal text (such as </SCRIPT>, ", ', CL or LF, other non-printable characters), JavaScript offers five character encoding strategies:

The fourth scheme is most space-efficient, although generally error-prone; for example, a failure to escape \ as \\ may lead to a string of \" + alert(1);" being converted to \\" + alert(1), and getting interpreted incorrectly; it also partly collides with the remaining \-prefixed escaping schemes, and is not compatible with C syntax.

The parsing of these schemes is uniform across all browsers if fewer than the expected number of input digits is seen: the value is interpreted correctly, and no subsequent text is consumed ("\1test" is accepted and becomes "\001test").

HTML entity encoding has no special meaning within HTML-embedded <SCRIPT> blocks.

Amusingly, although all the four aforementioned schemes are supported in JavaScript proper, a proposed informational standard for JavaScript Object Notation (JSON), a transport-oriented serialization scheme for JavaScript objects and arrays (RFC 4627), technically permits only the \u notation and a seemingly arbitrary subset of \ control character shorthand codes (options 3 and 5 on the list above). In practice, since JSON objects are almost always simply passed to eval(...) in client-side JavaScript code to convert them back to native data structures, the limitation is not enforced in a vast majority of uses. Because of the advice provided in the RFC, some non-native parseJSON implementations (such as the current example reference implementation from json.org) may not handle traditional \x escape codes properly, however. It is also not certain what approach would be followed by browsers in the currently discussed native, non-executing parsers proposed for performance and security reasons.

Unlike in some C implementations, stray multi-line string literals are not permitted by JavaScript, but lone \ at the end of a line may be used to break long lines in a seamless manner.

Other document scripting languages

VBScript is a language envisioned by Microsoft as a general-purpose "hosted" scripting environment, incorporated within Windows as such, Microsoft IIS, and supported by Microsoft Internet Explorer via vbscript: URL scheme, and through <SCRIPT> tags. As far as client-side scripting is considered, the language did not seem to offer any significant advantages over the competing JavaScript technology, and appeared to lag in several areas.

Presumably because of this, the technology never won broad browser support outside MSIE (with all current versions being able to execute it), and is very seldom relied upon. Where permitted to go through, it should be expected to have roughly the same security consequences as JavaScript. The attack surface and security controls within VBScript are poorly studied in comparison to that alternative, however.

Views: 569 | Added by: b1zz4rd | Rating: 0.0/0
Total comments: 1

Name *:
Email *:
Code *: