As
the web started to move from static content to complex applications,
one of the most significant problems with HTTP was that the protocol
contained no specific provisions for maintaining any client-associated
context for subsequent requests, making it difficult to implement
contemporary mechanisms such as convenient, persistent authentication
or preference management (HTTP authentication, as discussed later on,
proved to be too cumbersome for this purpose, while any in-URL state
information would be often accidentally disclosed to strangers or
lost). To address the need, HTTP cookies were implemented in Netscape Navigator (and later captured in spirit as RFC 2109, with neither of the standards truly followed by most implementations): any server could return a short text token in a Set-Cookie header, and the client would store that token and include it on all future requests (in a Cookie header).

Key properties of the mechanism:

- Header structure: every Set-Cookie header sent by the server consists of one or more comma-separated NAME=VALUE
pairs, followed by a number of additional semicolon-separated
parameters or keywords. In practice, a vast majority of browsers
support only a single pair (confusingly, multiple NAME=VALUE pairs are accepted in all browsers via document.cookie, a simple JavaScript cookie manipulation API). Every Cookie header sent by the client consists of any number of semicolon-separated NAME=VALUE pairs with no additional metadata.
- Scope: by default, cookie scope is limited to all URLs on the current host name, and is not bound to port or protocol information. Scope may be limited with the path= parameter, which specifies a path prefix to which the cookie should be sent, or broadened to a group of DNS names, rather than a single host only, with domain=. The latter operation may specify any fully-qualified right-hand segment of the current host name, up to one level below the TLD (in other words, www.foo.bar.example.com may set a cookie to be sent to *.bar.example.com or *.example.com, but not to *.something.else.example.com or *.com); the former can be set with no specific security checks, and uses just a dumb left-hand substring match.

  Note: according to one of the specs, domain wildcards should be marked with a preceding period, so .example.com would denote a wildcard match for the entire domain (including, somewhat confusingly, example.com proper), whereas foo.example.com would denote an exact host match. Sadly, no browser follows this logic, and domain=example.com is exactly equivalent to domain=.example.com. There is no way to limit cookies to a single DNS name only, other than by not specifying the domain= value at all, and even this does not work in Microsoft Internet Explorer; likewise, there is no way to limit them to a specific port.
- Time to live:
by default, each cookie has a lifetime limited to the duration of the
current browser session (in practice meaning that it is stored in
program memory only, and not written to disk). Alternatively, an expires= parameter may be included to specify the date (in one of a large number of possible confusing and hard-to-parse
date formats) at which the cookie should be dropped. This automatically
enables persistent storage of the cookie. A much less commonly used, but RFC-mandated, max-age= parameter may be used to specify an expiration time delta instead.
- Overwriting cookies: if a new cookie with the same NAME, domain, and path
as an existing cookie is encountered, the old cookie is discarded.
Otherwise, even if a subtle difference exists (e.g., two distinct domain=
values in the same top-level domain), the two cookies will co-exist,
and may be sent by the client at the same time as two separate pairs in
Cookie headers, with no additional information to help resolve the conflict.
- Deleting cookies: no specific mechanism for deleting cookies was envisioned, although a common hack is to overwrite a cookie with a bogus value as outlined above, plus a backdated or short-lived expires= (using max-age=0 is not universally supported).
- "Protected" cookies: as a security feature, cookies may be marked with a special secure keyword, which causes them to be sent over HTTPS only. Note that in some implementations, non-HTTPS sites may still set secure cookies; they just cannot read them back.
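
The scope and overwriting rules above can be sketched as a toy cookie jar. This is only an illustration under stated assumptions, not any browser's actual algorithm: the names `ToyJar`, `Cookie`, and `domain_allowed` are hypothetical, the "one level below TLD" check is simplified to "contains at least one dot", and the secure keyword and expiration are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Cookie:
    name: str
    value: str
    domain: str      # host name or domain suffix the cookie is scoped to
    path: str        # path prefix the cookie is limited to
    host_only: bool  # True when no domain= value was supplied


def domain_allowed(host: str, domain: str) -> bool:
    """Accept domain= only if it is a right-hand segment of host below the TLD.

    Simplification (assumption): "below the TLD" is approximated as
    "contains a dot", so suffixes such as co.uk are wrongly accepted.
    """
    domain = domain.lstrip(".").lower()  # leading period makes no difference
    host = host.lower()
    if "." not in domain:                # bare TLD such as "com"
        return False
    return host == domain or host.endswith("." + domain)


class ToyJar:
    """Hypothetical jar keyed by (NAME, domain, path), as described in the text."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str, str], Cookie] = {}

    def set_cookie(self, host: str, name: str, value: str,
                   domain: Optional[str] = None, path: str = "/") -> bool:
        if domain is None:
            cookie = Cookie(name, value, host.lower(), path, True)
        elif domain_allowed(host, domain):
            cookie = Cookie(name, value, domain.lstrip(".").lower(), path, False)
        else:
            return False  # rejected: not a valid right-hand segment
        # Same NAME, domain, and path: the old cookie is replaced.
        self._store[(cookie.name, cookie.domain, cookie.path)] = cookie
        return True

    def cookie_header(self, host: str, path: str) -> str:
        """Build the Cookie header value: bare NAME=VALUE pairs, no metadata."""
        host = host.lower()
        pairs = []
        for c in self._store.values():
            if c.host_only:
                host_ok = host == c.domain
            else:
                host_ok = host == c.domain or host.endswith("." + c.domain)
            # Path check is a plain left-hand substring match.
            if host_ok and path.startswith(c.path):
                pairs.append(f"{c.name}={c.value}")
        return "; ".join(pairs)


if __name__ == "__main__":
    jar = ToyJar()
    jar.set_cookie("www.foo.bar.example.com", "sid", "A", domain="bar.example.com")
    jar.set_cookie("www.foo.bar.example.com", "sid", "B", domain=".example.com")
    print(jar.cookie_header("www.foo.bar.example.com", "/"))  # sid=A; sid=B
```

In this sketch the two `sid` cookies differ only in their domain= value, so they coexist and are both sent, and the server receiving `sid=A; sid=B` has nothing to tell them apart.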

The original design for HTTP cookies has multiple problems and drawbacks that resulted in various security problems and kludges to address them:

- Privacy issues: the chief concern with the
mechanism was that it permitted scores of users to be tracked
extensively across any number of collaborating domains without
permission (in the simplest form, by simply including tracking code in
an IFRAME pointing to a common evil-tracking.com resource on any number of web pages, so that the same evil-tracking.com
cookie can be correlated across all properties). It is a major
misconception that HTTP cookies were the only mechanism to store and
retrieve long-lived client-side tokens - for example, cache validation directives or the window.name DOM property may be naughtily repurposed to implement very similar functionality - but the development nevertheless caused public outcry.

  Widespread criticism eventually resulted in many browsers enabling restrictions on any included content on a page setting cookies for any domain other than that displayed in the URL bar (discussed later on), despite the fact that such a measure would not stop cooperating sites from tracking users using marginally more sophisticated methods. A minority of users to this day browse with cookies disabled altogether for similar reasons.
- Problems with ccTLDs:
the specification did not account for the fact that many country-code
TLDs are governed by odd or sometimes conflicting rules. For example, waw.pl, com.pl, and co.uk
should all be seen as generic, functional top-level domains, and so it
should not be possible to set cookies at this level, so as to avoid
interference between various applications; but example.pl or coredump.cx
are single-owner domains for which it should be possible to set
cookies. This resulted in many browsers having serious trouble
collecting empirical data from various ccTLDs and keeping it in sync
with the current state of affairs in the DNS world.
- Problems with conflict resolution:
when two identically named cookies with different scopes are to be sent
in a single request, there is no information available to the server to
resolve the conflict and decide which cookie came from where, or how
old it is. Browsers do not follow any specific conventions on the
ordering of supplied cookies either, and some behave in an outright buggy
manner. Additional metadata to address this problem is proposed in
the "cookies 2" design (RFC 2965), but the standard never gained widespread support.
- Problems with certain characters:
just like HTTP, cookies have no specific provisions for character
escaping, and no specified behavior for handling of high-bit and
control characters. This sometimes results in completely unpredictable
and dangerous situations if not accounted for.
- Problems with cookie jar size:
standards do relatively little to specify cookie count limits or
pruning strategies. Various browsers may implement various total and
per-domain caps, and the behavior may result in malicious content
purposefully disrupting session management, or legitimate content doing
so by accident.
- Perceived JavaScript-related problems: the aforementioned document.cookie JavaScript API permits JavaScript embedded on pages to access sensitive authentication cookies. If malicious scripts can be planted on a page due to insufficient escaping of user input, these cookies could be stolen and disclosed to the attacker. Concern over this possibility resulted in the httponly cookie flag being incorporated into Microsoft Internet Explorer, and later other browsers; such cookies would not be visible through document.cookie (but, as noted in the previous section, are not always adequately hidden in XMLHttpRequest calls). In reality, the degree of protection afforded this way is minimal, given the ability to interact with same-origin content through the DOM.
- Problems with "protected" cookie clobbering: as indicated earlier, secure and httponly cookies are meant not to be visible in certain situations, but no specific thought was given to preventing JavaScript from overwriting httponly cookies, or non-encrypted pages from overwriting secure cookies; likewise, httponly or secure
cookies may get dropped and replaced with evil versions by simply
overflowing the per-domain cookie jar. This oversight could be abused
to subvert at least some usage scenarios.
- Conflicts with DOM same-origin policy rules:
cookies have scoping mechanisms that are broader and essentially
incompatible with same-origin policy rules (e.g., as noted, no ability
to restrict cookies to a specific host or protocol) - sometimes undoing
some content security compartmentalization mechanisms that would
otherwise be possible under DOM rules.
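
The jar-overflow clobbering scenario described above can be illustrated with a toy model. Everything here is an assumption made for the sake of the sketch: the class name `CappedJar`, the tiny per-domain cap of 3, and the oldest-first eviction policy are hypothetical, and real browsers differ in both their limits and their pruning strategies.

```python
from collections import OrderedDict
from typing import Tuple

PER_DOMAIN_CAP = 3  # hypothetical, deliberately tiny cap


class CappedJar:
    """Toy per-domain jar that evicts the oldest cookie when over the cap."""

    def __init__(self, cap: int = PER_DOMAIN_CAP) -> None:
        self.cap = cap
        self._cookies: "OrderedDict[str, Tuple[str, bool]]" = OrderedDict()

    def set_from_http(self, name: str, value: str, httponly: bool = False) -> None:
        """Cookie set by the server through a Set-Cookie header."""
        self._store(name, value, httponly)

    def set_from_script(self, name: str, value: str) -> bool:
        """Cookie set by page script; refuses to overwrite httponly directly."""
        existing = self._cookies.get(name)
        if existing is not None and existing[1]:
            return False
        self._store(name, value, False)
        return True

    def _store(self, name: str, value: str, httponly: bool) -> None:
        self._cookies.pop(name, None)
        self._cookies[name] = (value, httponly)
        while len(self._cookies) > self.cap:
            self._cookies.popitem(last=False)  # assumed policy: drop the oldest

    def cookie_header(self) -> str:
        return "; ".join(f"{n}={v}" for n, (v, _) in self._cookies.items())


def demo() -> Tuple[bool, bool, str]:
    jar = CappedJar()
    jar.set_from_http("session", "genuine", httponly=True)
    direct = jar.set_from_script("session", "evil")    # blocked by the flag
    for i in range(jar.cap):                           # flood with junk cookies
        jar.set_from_script(f"junk{i}", "x")
    replaced = jar.set_from_script("session", "evil")  # original was evicted
    return direct, replaced, jar.cookie_header()


if __name__ == "__main__":
    print(demo())  # (False, True, 'junk1=x; junk2=x; session=evil')
```

Under these assumptions the direct overwrite is refused, yet after the flood the server still receives a `session` cookie, now attacker-chosen, with no flag information to reveal the swap.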

An IETF effort is currently underway to clearly specify currently deployed cookie behavior across major browsers.

| Test description | MSIE6 | MSIE7 | MSIE8 | FF2 | FF3 | Safari | Opera | Chrome | Android |
|---|---|---|---|---|---|---|---|---|---|
| Does document.cookie work on ftp URLs? | NO | NO | NO | NO | NO | NO | NO | NO | n/a |
| Does document.cookie work on file URLs? | YES | YES | YES | YES | YES | YES | YES | NO | n/a |
| Is Cookie2 standard supported? | NO | NO | NO | NO | NO | NO | YES | NO | NO |
| Are multiple comma-separated Set-Cookie pairs accepted? | NO | NO | NO | NO | NO | YES | NO | NO | NO |
| Are quoted-string values supported for HTTP cookies? | NO | NO | NO | YES | YES | NO | YES | NO | YES |
| Is max-age parameter supported? | NO | NO | NO | YES | YES | YES | YES | YES | YES |
| Does max-age=0 work to delete cookies? | (NO) | (NO) | (NO) | YES | YES | NO | YES | YES | YES |
| Is httponly flag supported? | YES | YES | YES | YES | YES | YES | YES | YES | NO |
| Can scripts clobber httponly cookies?* | NO | NO | NO | YES | NO | YES | NO | NO | (YES) |
| Can HTTP pages clobber secure cookies?* | YES | YES | YES | YES | YES | YES | YES | YES | YES |
| Ordering of duplicate cookies with different scope | random | random | some dropped | some dropped | most specific first | random | most specific first | most specific first | by age |
| Maximum length of a single cookie | 4 kB | 4 kB | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ | broken |
| Maximum number of cookies per site | 50 | 50 | 50 | ∞ | 100 | ∞ | ∞ | ~70 | 50 |
| Are cookies for right-hand IP address fragments accepted? | NO | NO | NO | NO | NO | YES | NO | NO | NO |
| Are host-scope cookies possible (no domain= value)? | NO | NO | NO | YES | YES | YES | YES | YES | YES |
| Overly permissive ccTLD behavior test results (3 tests) | 1/3 FAIL | 1/3 FAIL | 3/3 OK | 2/3 FAIL | 3/3 OK | 1/3 FAIL | 3/3 OK | 3/3 OK | 2/3 FAIL |

* Note that, as discussed earlier, even when this is not directly permitted, the attacker may still drop the original cookie by simply overflowing the cookie jar, and insert a new one without a httponly or secure flag set; and even if the ability to overflow the jar is limited, there is no way for a server to distinguish between a genuine httponly or secure cookie, and a differently scoped, but identically named lookalike.

Adobe Flash, a plugin believed to be installed on about 99% of all desktops,
incorporates a security model generally inspired by browser same-origin
checks. Flash applets have their security context derived from the URL
they are loaded from (as opposed to the site that embeds them with <OBJECT> or <EMBED>
tags), and within this realm, permission control follows the same basic
principle as applied by browsers to DOM access: the protocol, host name, and port of the requested resource are compared with those of the requestor, with universal access privileges granted to content stored on local disk. That said, there are important differences - and some interesting extensions - that make Flash capable of initiating cross-domain interactions to a degree greater than typically permitted for native browser content.

Some of the unique properties and gotchas of the current Flash security model include:

- The ability for sites to provide a cross-domain policy, often referred to as crossdomain.xml,
to allow a degree of interaction from non-same-origin content. Any
non-same-origin Flash applet may specify a location on the target
server at which this XML-based specification should be looked up; if it
matches a specific format, it would be interpreted as a permission to
carry out cross-domain actions for a given target URL path and its
descendants.

  Historically, the mechanism, due to an extremely lax XML parser and the absence of other security checks, posed a major threat: many types of user content, for example images or text files, could be trivially made to mimic such data without the site owner's knowledge or consent. Recent security improvements enabled better control of cross-domain policies; these include a more rigorous XML parser; a requirement for the MIME type on policies to match text/*, application/xml, or application/xhtml+xml; and the concept of site-wide meta-policies, stored at a fixed top-level location - /crossdomain.xml. These policies would specify global security rules, and for example prevent any lower-order policies from being interpreted, or require the MIME type on all policies to unambiguously match text/x-cross-domain-policy.
- The ability to make cookie-bearing cross-domain HTTP GET and POST requests via the browser stack, with fewer constraints than typically seen elsewhere in browsers. This is achieved through the URLRequest API. The functionality, most notably, includes the ability to specify arbitrary Content-Type
values, and to send binary payloads. Historically, Flash would also
permit nearly arbitrary headers to be appended to cross-domain traffic
via the requestHeaders property, although this has changed with a series of recent security updates, which now require an explicit crossdomain.xml directive to re-enable the feature.
- The ability to make same-origin HTTP requests, including setting and reading back HTTP headers to an extent greater than that of XMLHttpRequest (subject to a list of banned headers).
- The ability to access raw TCP sockets via XMLSockets,
to connect back to the same-origin host on any high port (> 1024),
or to access third-party systems in the same way. Following recent security
updates, this requires explicit cross-domain rules, although these may
be easily provided for same-origin traffic. In conjunction with DNS rebinding attacks or the behavior of certain firewall helpers, the mechanism could be abused to punch holes in the firewall or probe local and remote systems, although certain mitigations were incorporated since then.
- The ability for applet-embedding pages to restrict certain permissions for the included content by specifying <OBJECT> or <EMBED> parameters:
  - The ability to load external files and navigate the current browser window (allowNetworking attribute).
  - The ability to interact with on-page JavaScript context (allowScriptAccess attribute; previously unrestricted by default, now limited to sameDomain, which requires the accessed page to be same origin with the applet).
  - The ability to run in full-screen mode (allowFullScreen attribute).
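
The MIME type requirements for cross-domain policies mentioned in the list above can be expressed as a small check. This is a sketch based only on the rules as stated in this section, not on Flash's actual implementation; the function name `policy_mime_acceptable` and the `strict_meta_policy` parameter are hypothetical.

```python
def policy_mime_acceptable(content_type: str, strict_meta_policy: bool = False) -> bool:
    """Decide whether a policy file's Content-Type passes the described rules.

    strict_meta_policy (hypothetical flag) models a site-wide meta-policy
    that requires the unambiguous text/x-cross-domain-policy type.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if strict_meta_policy:
        return mime == "text/x-cross-domain-policy"
    # Default rule: text/*, application/xml, or application/xhtml+xml.
    return mime.startswith("text/") or mime in (
        "application/xml",
        "application/xhtml+xml",
    )


if __name__ == "__main__":
    print(policy_mime_acceptable("text/plain; charset=utf-8"))         # True
    print(policy_mime_acceptable("image/png"))                         # False
    print(policy_mime_acceptable("text/plain", strict_meta_policy=True))  # False
```

Note that the default rule still accepts any text/* response, so user-supplied text files remain candidates unless the stricter meta-policy is in force.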

This model is further marred by other bugs and oddities, such as the reliance on location.* DOM properties being tamper-proof for the purpose of executing same-origin security checks.

Flash applets running from the Internet do not have any specific permissions to access local files or input devices, although depending on user configuration decisions, some or all sites may use a limited quota within a virtualized data storage sandbox, or access the microphone.