Home » 2009 » November » 29 » Millions of PDF invisibly embedded with your internal disk paths
7:55 AM
Millions of PDF invisibly embedded with your internal disk paths
I found an interesting privacy issue while analyzing PDF files. This bug
occurs when you are using Internet Explorer to print locally saved web pages
as PDF and affects all IE versions including IE8. It does not matter which
PDF generation software you are using like Adobe Acrobat Professional,
CutePDF, PrimoPDF, etc as long as you are invoking it from inside the IE
print function. In Windows, even when your default browser is not IE and if
you right click a file to select the PRINT from the context menu, then by
default it invokes the IE print handler. So, you will still see this issue
in the generated PDF.

This bug is NOT ABOUT the local disk path appearing in the FOOTER of your
pdf since it is clearly visible and already known by most people. This is
easy enough to hide by just going File -> Page Setup -> Change the Footer
value from “URL” to “-Empty-”. After doing that, you will not expect your
internal disk path being put anywhere else. However, that does not happen.

The privacy issue arises from the fact that your local disk path gets
invisibly embedded inside your PDF in the title attribute. Only when you
open the file in an Editor like Notepad, you will see it. Currently, there
is no option in IE to disable it. The only workaround is to manually nullify
this value by editing the PDF file. Note that this problem does not occur
when using other browsers such as Firefox and Chrome. In fact, Chrome
handles the other footer issue intelligently as well by showing your disk
path as “…”, rather than exposing it.

Proof of Concept:

Steps to reproduce:
1. Pick a .HTM or .HTML or .MHT file on your local computer.
2. Open this file in IE and click Ctrl-P.
OR Right-click the file in explorer and select PRINT from context menu.
4. Select any PDF writer as Printer such as Adobe PDF / CutePDF / PrimoPDF /
5. Click Print. When the PDF writer asks for a filename, provide any name.
6. Open the generated pdf in notepad, and search for “file://” without

Search for this on your favorite search engine (Google/Bing)
filetype:pdf file c (htm OR html OR mhtml)

Google Search 1 (for drive C)
R+mhtml%29&btnG=Search&aq=f&oq=&aqi=] – 4 million results
Google Search 2 (for drive D)
R+mhtml%29&btnG=Search&aq=f&oq=&aqi=] – 13 million results
and so on…. (I added till drive letter J and total was more than 50

So, out of 280 million pdfs accessible on the internet, more than 20% look
to be exposing internal disk paths which is a huge number. I have contacted
the Microsoft and Adobe Security Teams about this issue. Microsoft has plans
to fix this in IE9, while Adobe has opened the case but hasn’t planned the
timelines yet.


01.<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 4.0-c316
44.253921, Sun Oct 01 2006 17:14:39">
02. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
03. <rdf:Description rdf:about=""
04. xmlns:dc="http://purl.org/dc/elements/1.1/">
05. <dc:format>application/pdf</dc:format>
06. <dc:creator>
07. <rdf:Seq>
08. <rdf:li>LewtasS</rdf:li>
09. </rdf:Seq>
10. </dc:creator>
11. <dc:title>
12. <rdf:Alt>
13. <rdf:li xml:lang="x-default">file://C:\Documents and
Settings\lewtass\Desktop\eda newsletter</rdf:li>
14. </rdf:Alt>
15. </dc:title>
16. </rdf:Description>


01.<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="3.1-701">
02. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
03. <rdf:Description rdf:about=""
04. xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
05. <pdf:Producer>Acrobat Distiller 7.0.5 (Windows)</pdf:Producer>
06. </rdf:Description>
07. <rdf:Description rdf:about=""
08. xmlns:xap="http://ns.adobe.com/xap/1.0/">
09. <xap:CreatorTool>PScript5.dll Version 5.2.2</xap:CreatorTool>
10. <xap:ModifyDate>2009-03-18T15:07:10-07:00</xap:ModifyDate>
11. <xap:CreateDate>2009-03-18T15:07:10-07:00</xap:CreateDate>
12. </rdf:Description>
13. <rdf:Description rdf:about=""
14. xmlns:dc="http://purl.org/dc/elements/1.1/">
15. <dc:format>application/pdf</dc:format>
16. <dc:title>
17. <rdf:Alt>
18. <rdf:li
xml:lang="x-default">mhtml:file://O:\fema\shsp_2009\draft ijs\fy 2009
investment jus</rdf:li>
19. </rdf:Alt>

Thanks and Regards,
Security Researcher

Consider compiled binaries. They also leak paths of the developer's
compile environment (mainly PDB -
http://support.microsoft.com/kb/121366). E.g. My firefox.exe is:


This reminds me of the iPhone worm. Everyone knew about the default
root password years ago... it is not like people weren't already
scanning for gaolbroken phones before worms were released >:)

Then again, the worm brought attention to the fact.

PDF's are mainly used for two reasons:

1) Prevent end-user modification
2) Because Microsoft Word conceals information & tracks changes etc,
and too many government & enterprise organisations have been bitten by
it that PDF is a rule of thumb. At least in my experience.

Considering that, perhaps for the PDF format specifically this could
be an issue, under the assumption that consumers use PDF
/specifically/ to prevent data leakage.


Views: 668 | Added by: apeh1706 | Rating: 0.0/0
Total comments: 0
Name *:
Email *:
Code *: