Browser Security Handbook, part 2
This
section provides a detailed discussion of explicit security mechanisms
and restrictions implemented within browsers. Long-standing design
deficiencies are discussed, but no specific consideration is given to
short-lived vulnerabilities.

Perhaps the most important security concept within modern browsers is the idea of the same-origin policy.
The principal intent of this mechanism is to permit
largely unrestrained scripting and other interactions between pages
served as a part of the same site (understood as having a particular
DNS host name, or part thereof), whilst almost completely preventing
any interference between unrelated sites. In practice, there is
no single same-origin policy, but rather, a set of mechanisms with some
superficial resemblance, but quite a few important differences. These
flavors are discussed below.

With no additional qualifiers, the term "same-origin policy" most commonly
refers to a mechanism that governs the ability for JavaScript and other scripting languages to access DOM properties and methods across domains. In essence, the model boils down to this three-step decision process:

- If protocol, host name, and - for browsers other than Microsoft Internet
Explorer - port number for two interacting pages match, access is
granted with no further checks.
- Any page may set document.domain parameter to a right-hand, fully-qualified fragment of its current host name (e.g., foo.bar.example.com may set it to example.com, but not ample.com). If two pages explicitly and mutually set their respective document.domain parameters to the same value, and the remaining same-origin checks are satisfied, access is granted.
- If neither of the above conditions is satisfied, access is denied.
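
The three steps above can be sketched in Python as a rough model. The `Page` class, its field names, and the `compare_port` flag are hypothetical and exist only for illustration. Reading "the remaining same-origin checks" as a protocol match is an assumption, and the corner cases around TLDs and IP addresses are deliberately ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical model of a page; field names are illustrative only."""
    protocol: str
    host: str
    port: int
    # Value the page explicitly assigned to document.domain, if any.
    document_domain: Optional[str] = None


def may_set_document_domain(host: str, value: str) -> bool:
    """True if value is the host itself or a right-hand, dot-aligned fragment.

    TLD-only and IP address values are not treated specially here.
    """
    return host == value or host.endswith("." + value)


def dom_access_allowed(a: Page, b: Page, compare_port: bool = True) -> bool:
    # Step 1: protocol, host name and (outside MSIE) port number all match.
    if a.protocol == b.protocol and a.host == b.host:
        if not compare_port or a.port == b.port:
            return True
    # Step 2: both pages explicitly set document.domain to the same value.
    if a.document_domain is not None and a.document_domain == b.document_domain:
        # Assumption: the "remaining checks" amount to a protocol match.
        return a.protocol == b.protocol
    # Step 3: everything else is denied.
    return False
```

Passing `compare_port=False` approximates the Microsoft Internet Explorer behavior described in the first step, where the port number is not part of the comparison.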
In theory, the model seems simple and robust enough to ensure proper
separation between unrelated pages, and to serve as a method for
sandboxing potentially untrusted or risky content within a particular
domain; upon closer inspection, however, quite a few drawbacks arise:

- Firstly, the document.domain mechanism functions as a security tarpit: once any two legitimate subdomains in example.com, e.g. www.example.com and payments.example.com, choose to cooperate this way, any other resource in that domain, such as user-pages.example.com, may then set its own document.domain likewise, and arbitrarily mess with payments.example.com. This means that in many scenarios, document.domain may not be used safely at all.
- Whenever document.domain
cannot be used - either because pages live in completely different
domains, or because of the aforementioned security problem - legitimate
client-side communication between, for example, embeddable page
gadgets, is completely forbidden in theory, and in practice very
difficult to arrange, requiring developers to resort to the abuse of
known browser bugs, or to latency-expensive server-side channels, in
order to build legitimate web applications.
- Whenever
tight integration of services within a single host name is pursued to
overcome these communication problems, because of the inflexibility of
same-origin checks, there is no usable method to sandbox any untrusted
or particularly vulnerable content to minimize the impact of security
problems.
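
The document.domain tarpit from the first drawback can be illustrated with a small simulation. This is only a sketch: the `document_domain` table and both helper functions are hypothetical stand-ins for browser state, and the host names are the ones used in the text.

```python
from typing import Dict, Optional

# Explicit document.domain value per host; None means "never set".
document_domain: Dict[str, Optional[str]] = {
    "www.example.com": None,
    "payments.example.com": None,
    "user-pages.example.com": None,
}


def set_document_domain(host: str, value: str) -> None:
    """Accept only the host itself or a right-hand, dot-aligned fragment."""
    if host != value and not host.endswith("." + value):
        raise ValueError(f"{host} may not set document.domain to {value}")
    document_domain[host] = value


def may_access(a: str, b: str) -> bool:
    """Cross-host access needs both sides to have opted in to the same value."""
    return document_domain[a] is not None and document_domain[a] == document_domain[b]


# The two legitimate subdomains agree to cooperate.
set_document_domain("www.example.com", "example.com")
set_document_domain("payments.example.com", "example.com")
blocked_before = may_access("user-pages.example.com", "payments.example.com")

# A user-controlled subdomain simply performs the same assignment.
set_document_domain("user-pages.example.com", "example.com")
allowed_after = may_access("user-pages.example.com", "payments.example.com")
```

In this model nothing distinguishes the intended partner from any other host under example.com, which is why the opt-in cannot be scoped to a chosen peer.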
On top of this, the specification is simplistic enough to actually omit quite a few corner cases; among other things:

- The document.domain behavior when hosts are addressed by IP addresses, as opposed to fully-qualified domain names, is not specified.
- The document.domain behavior with extremely vague specifications (e.g., com or co.uk) is not specified.
- The algorithms of context inheritance for pseudo-protocol windows, such as about:blank, are not specified.
- The behavior for URLs that do not meaningfully have a host name associated with them (e.g., file://)
is not defined, causing some browsers to permit locally saved files to
access every document on the disk or on the web; users are generally
not aware of this risk, potentially exposing themselves.
- The
behavior when a single name resolves to vastly different IP addresses
(for example, one on an internal network, and another on the Internet)
is not specified, permitting DNS rebinding attacks and related tricks that put certain mechanisms (captchas, ad click tracking, etc) at extra risk.
- Many
one-off exceptions to the model were historically made to permit
certain types of desirable interaction, such as the ability to point
own frames or script-spawned windows to new locations - and these are
not well-documented.
All this ambiguity leads to a
significant degree of variation between browsers, and historically,
resulted in a large number of browser security flaws. A detailed
analysis of DOM actions permitted across domains, as well as context
inheritance rules, is given in later sections. A quick survey of
several core same-origin differences between browsers is given below:

| Test description | MSIE6 | MSIE7 | MSIE8 | FF2 | FF3 | Safari | Opera | Chrome | Android |
|---|---|---|---|---|---|---|---|---|---|
| May document.domain be set to TLD alone? | NO | NO | NO | YES | NO | YES | NO | NO | YES |
| May document.domain be set to TLD with a trailing dot? | YES | YES | NO | YES | NO | YES | NO | YES | YES |
| May document.domain be set to right-hand IP address fragments? | YES | YES | NO | YES | NO | YES | NO | NO | YES |
| Do port numbers wrap around in same origin checks? | NO | NO | NO | uint32 | uint32 | uint16/32 | uint16 | NO | n/a |
| May local HTML access unrelated local files via DOM? | YES | YES | YES | YES | NO | YES | YES | YES | n/a |
| May local HTML access sites on the Internet via DOM? | NO | NO | NO | NO | NO | YES | NO | NO | n/a |
Note: Firefox 3 is currently the only browser that uses a directory-based scoping scheme for same-origin access within file://.
This bears some risk of breaking quirky local applications, and may not
offer protection for shared download directories, but is a sensible
approach otherwise.

On top of scripted DOM access, all of the contemporary browsers also provide the XMLHttpRequest JavaScript
API, by which scripts may make HTTP requests to their originating site,
and read back data as needed. The mechanism was originally envisioned
primarily to make it possible to read back XML responses (hence the
name, and the responseXML property), but is currently
perhaps more often used to read back JSON messages, HTML, and arbitrary
custom communication protocols, and serves as the foundation for much
of the web 2.0 behavior of rapid UI updates not dependent on full-page transitions.

The set of security-relevant features provided by XMLHttpRequest, and not seen in other browser mechanisms, is as follows:

- The ability to specify an arbitrary HTTP request method (via the open() method),
- The ability to set custom HTTP headers on a request (via setRequestHeader()),
- The ability to read back full response headers (via getResponseHeader() and getAllResponseHeaders()),
- The ability to read back full response body as JavaScript string (via responseText property).
Since all requests sent via XMLHttpRequest
include a browser-maintained set of cookies for the target site, and
given that the mechanism provides a far greater ability to interact
with server-side components than any other feature available to
scripts, it is extremely important to build in proper security
controls.

The set of checks implemented in all browsers for XMLHttpRequest is a close variation of DOM same-origin policy, with the following changes:

- Checks for XMLHttpRequest targets do not take document.domain into account, making it impossible for third-party sites to mutually agree to permit cross-domain requests between them.
- In
some implementations, there are additional restrictions on protocols,
header fields, and HTTP methods for which the functionality is
available, or HTTP response codes which would be shown to scripts (see
later).
- In Microsoft Internet Explorer, although
port number is not taken into account for "proper" DOM access
same-origin checks, it is taken into account for XMLHttpRequest.
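
A minimal sketch of the XMLHttpRequest variant of the check, under the assumption that it reduces to an exact comparison of protocol, host name, and port. The function names and the default port table are illustrative, and the implementation-specific restrictions on protocols, headers, and methods mentioned above are not modeled.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Illustrative defaults so that "http://host/" and "http://host:80/" compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Reduce a URL to its (protocol, host name, port) triple."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def xhr_allowed(page_url: str, target_url: str) -> bool:
    # document.domain is intentionally not an input: it plays no part here,
    # and the port is always compared, including in the MSIE case.
    return origin(page_url) == origin(target_url)
```

Because document.domain never enters the comparison, two cooperating subdomains that can script each other through the DOM still cannot issue requests to each other this way.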
Since the exclusion of document.domain made any sort of client-side cross-domain communication through XMLHttpRequest impossible, a W3C proposal for cross-domain XMLHttpRequest
access control would, as a much-demanded extension, permit cross-site traffic to happen under certain
additional conditions. The scheme envisioned by the proponents is as
follows:

- GET requests with custom headers limited
to a whitelist would be sent to the target system immediately, with no
advance verification, based on the assumption that GET
traffic is not meant to change server-side application state, and thus
will have no lasting side effects. This assumption is theoretically
sound, as per the "SHOULD NOT" recommendation spelled out in RFC 2616,
though is seldom observed in practice. Unless an appropriate HTTP
header or XML directive appears in the response, the result would not
be revealed to the requester, though.
- Non-GET requests (POST, etc) would be preceded by a "preflight" OPTIONS
request, again with only whitelisted headers permitted. Unless an
appropriate HTTP header or XML directive is seen in response, the
actual request would not be issued.
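
The flow of the proposed scheme can be sketched as follows. This is a simplified model rather than the proposal itself: the `Transport` callback, the `make_transport` helper, and the boolean that stands in for "an appropriate HTTP header or XML directive" are all hypothetical, and the header whitelist is left out. Applying the consent check to the final non-GET response is also an assumption.

```python
from typing import Callable, List, Optional, Tuple

# A transport takes (method, url) and returns (body, grants_access).
# grants_access stands in for the approving header or XML directive.
Transport = Callable[[str, str], Tuple[str, bool]]


def make_transport(grants_access: bool, log: List[str]) -> Transport:
    """Build a fake target server that records every method it receives."""
    def send(method: str, url: str) -> Tuple[str, bool]:
        log.append(method)
        return ("response body", grants_access)
    return send


def cross_domain_request(method: str, url: str, send: Transport) -> Optional[str]:
    """Return the response body, or None if nothing may be revealed."""
    if method == "GET":
        # Sent immediately, with no advance verification; only reading the
        # result back depends on the target's consent.
        body, granted = send("GET", url)
        return body if granted else None
    # Non-GET: a preflight OPTIONS request must be approved first.
    _, granted = send("OPTIONS", url)
    if not granted:
        return None  # the actual request is never issued
    body, granted = send(method, url)
    # Assumption: the final response is subject to the same consent check.
    return body if granted else None
```

Note that in the GET branch the target always receives the request, even when the caller gets nothing back, which is the property that exposes sites whose GET handlers change server-side state.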
Even in its current
shape, the mechanism would open some RFC-ignorant web sites to new
attacks; some of the earlier drafts had more severe problems, too. As
such, the functionality ended up being scrapped in Firefox 3, and is currently not available in any browser, pending further work. A competing proposal
from Microsoft, making an appearance in Microsoft Internet Explorer 8, implements a
completely incompatible, safer, but less useful scheme - permitting
sites to issue anonymous (cookie-less) cross-domain requests only.
There seems to be an ongoing feud
between these two factions, so it may take a while longer for any
particular API to succeed, and it is not clear what security properties
it would possess.

As noted earlier, although there is a great deal of flexibility in what data may be submitted via XMLHttpRequest
to same-origin targets, various browsers blacklist subsets of HTTP
headers to prevent ambiguous or misleading requests from being issued
to servers and cached by the browser or by any intermediaries. These
restrictions are generally highly browser-specific; for some common
headers, they are as follows:

| HTTP header | MSIE6 | MSIE7 | MSIE8 | FF2 | FF3 | Safari | Opera | Chrome | Android |
|---|---|---|---|---|---|---|---|---|---|
| Accept | OK | OK | OK | OK | OK | OK | OK | OK | OK |
| Accept-Charset | OK | OK | OK | OK | BANNED | BANNED | BANNED | BANNED | BANNED |
| Accept-Encoding | BANNED | BANNED | BANNED | OK | BANNED | BANNED | BANNED | BANNED | BANNED |
| Accept-Language | OK | OK | OK | OK | OK | OK | OK | BANNED | BANNED |
| Cache-Control | OK | OK | OK | OK | OK | OK | BANNED | OK | OK |
| Cookie | BANNED | BANNED | BANNED | OK | OK | BANNED | BANNED | BANNED | OK |
| If-* family (If-Modified-Since, etc) | OK | OK | OK | OK | OK | OK | BANNED | OK | OK |
| Host | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED |
| Range | OK | OK | OK | OK | OK | OK | BANNED | OK | OK |
| Referer | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED |
| Transfer-Encoding | OK | OK | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED |
| User-Agent | OK | OK | OK | OK | OK | BANNED | OK | BANNED | BANNED |
| Via | OK | OK | OK | BANNED | BANNED | BANNED | BANNED | BANNED | BANNED |
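
A header blacklist of this kind might be applied roughly as in the sketch below, which uses the FF3 column of the table as sample data. The function name is hypothetical, and silently dropping a banned header is an assumption: an actual browser may instead raise an error or handle the header differently.

```python
from typing import Dict, FrozenSet

# Headers marked BANNED in the FF3 column of the table above.
FF3_BANNED: FrozenSet[str] = frozenset(
    name.lower()
    for name in (
        "Accept-Charset",
        "Accept-Encoding",
        "Host",
        "Referer",
        "Transfer-Encoding",
        "Via",
    )
)


def filter_request_headers(
    headers: Dict[str, str], banned: FrozenSet[str] = FF3_BANNED
) -> Dict[str, str]:
    """Drop script-supplied headers whose names are on the blacklist.

    HTTP header names are case-insensitive, so the comparison is done
    on lowercased names.
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in banned
    }
```

For instance, a script-supplied `HOST` header would be removed regardless of its capitalization, while `Accept` and `Cookie` would pass through under this particular column.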
Specific implementations may be examined for a complete list, for example the current WebKit trunk and Firefox source code.

A long-standing security flaw in Microsoft Internet Explorer 6 permits stray newline characters to appear in some XMLHttpRequest fields, allowing arbitrary headers (such as Host)
to be injected into outgoing requests. This behavior needs to be
accounted for in any scenarios where a considerable population of
legacy MSIE6 users is expected.

Other important security properties of XMLHttpRequest are outlined below:

| Test description | MSIE6 | MSIE7 | MSIE8 | FF2 | FF3 | Safari | Opera | Chrome | Android |
|---|---|---|---|---|---|---|---|---|---|
| Banned HTTP methods | TRACE | CONNECT TRACE* | CONNECT TRACE* | TRACE | TRACE | CONNECT TRACE | CONNECT TRACE** | CONNECT TRACE | CONNECT TRACE |
| XMLHttpRequest may see httponly cookies? | NO | NO | NO | YES | NO | YES | NO | NO | NO |
| XMLHttpRequest may see invalid HTTP 30x responses? | NO | NO | NO | YES | YES | NO | NO | YES | NO |
| XMLHttpRequest may see cross-domain HTTP 30x responses? | NO | NO | NO | YES | YES | NO | NO | NO | NO |
| XMLHttpRequest may see other HTTP non-200 responses? | YES | YES | YES | YES | YES | YES | YES | YES | NO |
| May local HTML access unrelated local files via XMLHttpRequest? | NO | NO | NO | YES | NO | YES | YES | YES | n/a |
| May local HTML access sites on the Internet via XMLHttpRequest? | YES | YES | YES | NO | NO | YES | NO | NO | n/a |
| Is partial XMLHttpRequest data visible while loading? | NO | NO | NO | YES | YES | YES | NO | YES | NO |

* Implements a whitelist of known methods, rejects made up values.
** Implements a whitelist of known methods, replaces non-whitelisted methods with GET.

WARNING:
Microsoft Internet Explorer 7 may be forced to partly regress to the
less secure behavior of the previous version by invoking a proprietary,
legacy ActiveXObject('MSXML2.XMLHTTP') in place of the new, native XMLHttpRequest API.

Please note that the degree of flexibility offered by XMLHttpRequest,
and not seen in other cross-domain content referencing schemes, may
actually be used as a simple security mechanism: a check for a custom HTTP
header may be carried out on server side to confirm that a
cookie-authenticated request comes from JavaScript code that invoked XMLHttpRequest.setRequestHeader(), and hence must be triggered by same-origin content, as opposed to a random third-party site. This provides a coarse cross-site request forgery
defense, although the mechanism may be potentially subverted by the
incompatible same-origin logic within some plugin-based programming
languages, as discussed later on.
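
A server-side check of this kind could look roughly like the sketch below. The header name `X-Requested-By` is a made-up example, since the text does not prescribe one, and the function is a framework-neutral illustration rather than a complete cross-site request forgery defense.

```python
from typing import Dict

# Hypothetical custom header that same-origin scripts are expected to add
# with XMLHttpRequest.setRequestHeader(); the name is an assumption.
REQUIRED_HEADER = "x-requested-by"


def looks_like_same_origin_xhr(request_headers: Dict[str, str]) -> bool:
    """Return True if the request carries the agreed-upon custom header.

    Header names are compared case-insensitively. A plain form submission
    or image load from a third-party page cannot attach this header.
    """
    names = {name.lower() for name in request_headers}
    return REQUIRED_HEADER in names
```

A handler for a cookie-authenticated, state-changing request would reject the request whenever this function returns False, keeping in mind the plugin-related caveat noted above.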