Browser Security Handbook, part 2.5
For performance reasons, most browsers routinely issue several requests simultaneously by opening multiple TCP connections. To prevent overloading servers with excessive traffic, and to minimize the risk of abuse, the number of connections to the same target is usually capped. The following table captures these limits, as well as default read timeouts for network resources:
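The per-host capping described above can be sketched as follows. This is illustrative Python, not any browser's actual code; `MAX_PER_HOST` and `READ_TIMEOUT` are assumed values, since real limits vary by browser and version:

```python
import asyncio, collections

MAX_PER_HOST = 6    # a common per-host cap; actual limits vary by browser
READ_TIMEOUT = 300  # seconds; illustrative default read timeout

_host_slots = collections.defaultdict(lambda: asyncio.Semaphore(MAX_PER_HOST))

async def fetch(host, path):
    # Wait for a free slot if MAX_PER_HOST requests to this host are in flight.
    async with _host_slots[host]:
        # A real client would perform socket I/O here, bounded by READ_TIMEOUT.
        await asyncio.wait_for(asyncio.sleep(0), timeout=READ_TIMEOUT)
        return host + path

async def crawl():
    # Eight requests to one host: all complete, but at most six run at once.
    return await asyncio.gather(*(fetch("example.com", f"/{i}") for i in range(8)))

results = asyncio.run(crawl())
print(len(results))  # 8
```

The semaphore-per-host pattern mirrors the browser behavior: requests beyond the cap are queued rather than rejected, which is why a page with many sub-resources simply loads them in waves.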
Another interesting security control built on top of the existing mechanisms is the concept of restricting third-party cookies. For the privacy reasons noted earlier, there appeared to be a demand for a seemingly simple improvement: restricting the ability of any domain other than the top-level one displayed in the URL bar to set cookies while a page is being visited. This was to prevent third-party content (advertisements, etc.) included via <IMG>, <IFRAME>, <SCRIPT>, and similar tags, from setting tracking cookies that could identify the user across unrelated sites relying on the same ad technology.
The purpose of this design is to force legitimate businesses to make a (hopefully) binding legal statement through this mechanism, so that violations could be prosecuted. Sadly, the approach has the unfortunate property of being a legislative solution to a technical problem, bestowing potential liability on site owners who often simply copy-and-paste P3P header examples from the web without understanding their intended meaning; the mechanism also does nothing to stop shady web sites from making arbitrary claims in these HTTP headers and betting on the mechanism never being tested in court - or even simply disavowing any responsibility for untrue, self-contradictory, or nonsensical P3P policies.
The question of what constitutes "first-party" domains introduces yet another, incompatible same-origin check, called minimal domains. The idea is that www1.eu.example.com and www2.us.example.com should be considered first-party, which is not true for all the remaining same-origin logic in other places. Unfortunately, these implementations are generally even more buggy than cookies for country-code TLDs: for example, in Safari, test1.example.cc and test2.example.cc are not the same minimal domain, while in Internet Explorer, domain1.waw.pl and domain2.waw.pl are.
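A correct minimal-domain check could be sketched roughly as below. Real browsers consult large, regularly updated public-suffix data; the tiny `PUBLIC_SUFFIXES` set here is an assumption for illustration only, chosen so the examples above come out the way a non-buggy implementation should handle them:

```python
# Toy registrable-domain ("minimal domain") comparison. The suffix set is
# a stand-in for a real public-suffix list and covers only these examples.
PUBLIC_SUFFIXES = {"com", "cc", "pl", "waw.pl", "uk", "co.uk"}

def minimal_domain(host):
    labels = host.lower().split(".")
    # Find the longest known public suffix, then keep one extra label.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return ".".join(labels[max(i - 1, 0):])
    return host

def same_first_party(a, b):
    return minimal_domain(a) == minimal_domain(b)

print(same_first_party("www1.eu.example.com", "www2.us.example.com"))  # True
print(same_first_party("test1.example.cc", "test2.example.cc"))        # True
print(same_first_party("domain1.waw.pl", "domain2.waw.pl"))            # False
```

Note how the entire outcome hinges on the suffix data: dropping `waw.pl` from the set would reproduce the Internet Explorer behavior described above, and treating `example.cc` as a suffix would reproduce Safari's.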
Although third-party cookie restrictions are not a sufficient method to prevent cross-domain user tracking, they prove rather effective at disrupting, or impacting the security of, some legitimate web site features - most notably certain web gadgets and authentication mechanisms.
* This includes script-initiated form submissions.
The task of detecting and handling various file types and encoding schemes is one of the most hairy and broken mechanisms in modern web browsers. This situation stems from the fact that for a long while, virtually all browser vendors were trying to both ensure backward compatibility with HTTP/0.9 servers (the protocol included absolutely no metadata describing any of the content returned to clients), and compensate for incorrectly configured HTTP/1.x servers that would return HTML documents with nonsensical Content-Type values, or unspecified character sets. In fact, having as many content detection hacks as possible was perceived as a competitive advantage: the user would not care whose fault it was if example.com rendered correctly in Internet Explorer but did not open in the Netscape browser - Internet Explorer would be the winner.
As a result, each browser accumulated a unique and very poorly documented set of obscure content sniffing quirks that - because of no pressure on site owners to correct the underlying configuration errors - are now required to keep compatibility with existing content, or at least appear too risky to remove or tamper with.
Unfortunately, all these design decisions preceded the arrival of complex and sensitive web applications that would host user content - be it baby photos or videos, rich documents, source code files, or even binary blobs of unknown structure (mail attachments). Because of the limitations of same-origin policies, these very applications would critically depend on having the ability to reliably and accurately instruct the browser on how to handle such data, without ever being second-guessed and having what was meant to be an image rendered as HTML - and no mechanism to ensure this would be available.
The first and only method for web servers to clearly indicate the purpose of a particular hosted resource is through the Content-Type response header. This header should contain a standard MIME specification of document type - such as image/jpeg or text/html - along with some optional information, such as the character set. In theory, this is a simple and bullet-proof mechanism. In practice, not very much so.
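A well-behaved client honoring this header would split it into the MIME type proper and its optional parameters. The standard library of most languages can do this; a minimal Python sketch using `email.message` (the header value below is just an example):

```python
# Parsing a Content-Type value into its MIME type and charset parameter,
# the way a client that actually trusts the header would.
from email.message import EmailMessage

msg = EmailMessage()
msg['Content-Type'] = 'text/html; charset=utf-8'

print(msg.get_content_type())     # text/html
print(msg.get_content_charset())  # utf-8
```

In an ideal world, this pair of values - type plus character set - would be all a browser ever needs to decide how to render a response.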
The first problem is that - as noted on several occasions already - when loading many types of sub-resources, most notably for <OBJECT>, <EMBED>, <APPLET>, <SCRIPT>, <IMG>, <LINK REL="...">, or <BGSOUND> tags, as well as when requesting some plugin-specific, security-relevant data, the recipient would flat out ignore any values in Content-Type and Content-Disposition headers (or, amusingly, even HTTP status codes). Instead, the mechanism typically employed to interpret the data is as follows:
This mechanism makes it impossible for any server to opt out from having its responses passed to a variety of unexpected client-side interpreters, if any third-party page decides to do so. In many cases, misrouting the data in this manner is harmless - for example, while it is possible to construct a quasi-valid HTML document that also passes off as an image, and then load it via <IMG> tag, there is little or no security risk in allowing such a behavior. Some specific scenarios pose a major hazard, however: one such example is the infamous GIFAR flaw, where well-formed, user-supplied images could be also interpreted as Java code, and executed in the security context of the serving party.
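The structural trick behind GIFAR-style polyglots can be demonstrated without any Java at all: image formats are parsed from the start of a file, while ZIP (and therefore JAR) archives are located from the end, so a single file can satisfy both parsers. A hedged sketch - the GIF-like prefix and dummy payload below are placeholders, not a working exploit:

```python
import io, zipfile

# A GIF-like prefix; a real attack would use a complete, valid image here.
gif_prefix = b'GIF89a' + b'\x00' * 16

# Build a minimal ZIP archive with a dummy entry standing in for Java code.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('Main.class', b'\xca\xfe\xba\xbe')  # placeholder bytes

polyglot = gif_prefix + buf.getvalue()

# ZIP parsers find the central directory by scanning from the END of the
# file, so the prepended image bytes do not invalidate the archive:
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    names = zf.namelist()
print(names)  # ['Main.class']
```

Because each interpreter only inspects "its" end of the file, the server-side image validator and the client-side archive loader can both be perfectly correct and still disagree about what the file is - which is exactly the ambiguity the flaw exploited.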
The exact logic implemented here is usually contrived and just as poorly documented - but based on our current knowledge, could be generalized as:
As is probably apparent by now, not much thought or standardization was given to browser behavior in these areas previously. To complicate matters further, the documentation available on the web is often outdated, incomplete, or inaccurate (Firefox docs are an example). Following widespread complaints, current HTML 5 drafts attempt to take at least some content handling considerations into account - although these rules are far from being comprehensive. Likewise, some improvements to specific browser implementations are being gradually introduced (e.g., image/* behavior changes), while others were resisted (e.g., fixing text/plain logic).
Some of the interesting corner cases of content sniffing behavior are captured below:
In addition, the behavior for non-HTML resources is as follows (to test for these, please put sample HTML inside two files with extensions of .TXT and .UNKNOWN, then attempt to access them through the browser):
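The probe described above can be generated with a few lines of Python; file and directory names here are illustrative. Note that a stock web server maps .txt to text/plain and typically has no mapping at all for an invented extension, falling back to application/octet-stream - which is precisely what makes the browser's reaction informative:

```python
import mimetypes, pathlib, tempfile

# Identical HTML payload under two extensions, as described above.
payload = "<html><body>sniff me</body></html>"
root = pathlib.Path(tempfile.mkdtemp())
for name in ("probe.txt", "probe.unknown"):
    (root / name).write_text(payload)

# What a standards-following server would declare for each file:
print(mimetypes.guess_type("probe.txt")[0])      # text/plain
print(mimetypes.guess_type("probe.unknown")[0])  # None -> octet-stream fallback
```

Serving the directory (for example with `python -m http.server`) and loading both URLs in each browser reveals whether markup inside a text/plain or unknown-type response gets rendered as HTML.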
Microsoft Internet Explorer 8 gives an option to override some of its quirky content sniffing logic with a new X-Content-Type-Options: nosniff option (reference). Unfortunately, the feature is somewhat counterintuitive, disabling not only dangerous sniffing scenarios, but also some of the image-related logic; and it has no effect on plugin-handled data.
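On the server side, opting in is just a matter of attaching the header to every response. A minimal sketch using Python's built-in HTTP server (the handler name is made up for this example):

```python
import http.server, threading, urllib.request

class NoSniffHandler(http.server.SimpleHTTPRequestHandler):
    def end_headers(self):
        # Ask the browser not to second-guess the declared Content-Type.
        self.send_header('X-Content-Type-Options', 'nosniff')
        super().end_headers()

# Round-trip check: start the server on an ephemeral port, fetch once.
server = http.server.ThreadingHTTPServer(('127.0.0.1', 0), NoSniffHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f'http://127.0.0.1:{server.server_address[1]}/'
with urllib.request.urlopen(url) as resp:
    nosniff = resp.headers['X-Content-Type-Options']
print(nosniff)  # nosniff
server.shutdown()
```

Overriding end_headers() guarantees the header rides along on every response the handler emits, which matters here: the protection is only as good as its coverage, since a single unprotected endpoint serving user-supplied data is enough to reopen the sniffing problem.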
An interesting study of content sniffing signatures is given on this page.