Malicious PDF File

pdf-parser parses a PDF document to identify the fundamental elements used in the analyzed file. It does not render the PDF document.

Features included:

  • Load/parse objects and headers
  • Extract metadata (author, description, …)
  • Extract text from ordered pages
  • Support for compressed PDF files
  • Support for the Mac OS Roman charset encoding
  • Handling of hexadecimal and octal encoding in text sections
  • PSR-0 compliant (autoloader)
  • PSR-1 compliant (code styling)

Analyzing a Malicious PDF File

For this walkthrough, we created a PDF file with an EXE file embedded in it.

Step 1: To launch the PDF parser, type pdf-parser. The -h option lists all available options.

~# pdf-parser -h

Step 2: Get the stats of the PDF document.

~# pdf-parser -a /root/Desktop/template.pdf

Step 3: Pass stream data through the supported filters (FlateDecode, ASCIIHexDecode, ASCII85Decode, LZWDecode and RunLengthDecode).

~# pdf-parser -f /root/Desktop/template.pdf

Step 4: Display the hash of each object in the PDF file.

~# pdf-parser -H /root/Desktop/template.pdf
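
To illustrate what hashing per object means, here is a minimal Python sketch that splits raw PDF bytes into indirect objects and hashes each body. The regex-based splitting, the helper name hash_objects and the choice of SHA-256 are assumptions made for this example; pdf-parser's own parsing and hash output may differ.

```python
import hashlib
import re

# Naive matcher for "<num> <gen> obj ... endobj" in raw PDF bytes.
# Assumption: the literal "endobj" never appears inside a stream.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)


def hash_objects(pdf_bytes: bytes) -> dict:
    """Return {(object number, generation): SHA-256 hex digest of the body}."""
    return {
        (int(m.group(1)), int(m.group(2))): hashlib.sha256(m.group(3)).hexdigest()
        for m in OBJ_RE.finditer(pdf_bytes)
    }


# Tiny hand-made sample, not a complete PDF file.
sample = (
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"2 0 obj\n<< /Type /Page >>\nendobj\n"
)
hashes = hash_objects(sample)
for key, digest in sorted(hashes.items()):
    print(key, digest)
```

Comparing such digests across samples is one way to spot the same embedded object reused in different documents.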

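Step 5: Case-sensitive search in streams. The --casesensitive flag only changes how a stream search matches, so it is normally combined with a stream search option such as --searchstream.

~# pdf-parser --casesensitive /root/Desktop/template.pdf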

Step 6: Find any JavaScript added to the document.

~# pdf-parser --search javascript --raw /root/Desktop/template.pdf

The stats option displays statistics on the objects found in the PDF document. Use it to identify PDF documents with unusual or unexpected objects, or to classify PDF documents.
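
The idea behind such statistics can be sketched in a few lines of Python: count indirect objects by their /Type entry and look for types that should not be there. The regexes and the helper name type_stats are assumptions for illustration only and are far cruder than pdf-parser's real parser.

```python
import re
from collections import Counter

# Naive matcher for indirect objects; assumes "endobj" is not inside a stream.
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj\b(.*?)endobj", re.DOTALL)
# First /Type entry in the object body (may belong to a nested dictionary).
TYPE_RE = re.compile(rb"/Type\s*/(\w+)")


def type_stats(pdf_bytes: bytes) -> Counter:
    """Count indirect objects grouped by the first /Type name they contain."""
    counts = Counter()
    for obj in OBJ_RE.finditer(pdf_bytes):
        found = TYPE_RE.search(obj.group(1))
        label = found.group(1).decode("latin-1") if found else "(no /Type)"
        counts[label] += 1
    return counts


# Tiny hand-made sample, not a complete PDF file.
sample = (
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"2 0 obj\n<< /Type /Page >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page >>\nendobj\n"
    b"4 0 obj\n<< /Length 0 >>\nendobj\n"
)
stats = type_stats(sample)
for name, count in stats.most_common():
    print(name, count)
```

In a document like the one analyzed here, an unexpected entry such as an embedded-file object would stand out in this kind of summary.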

The search option searches for a string in indirect objects (not inside the streams of indirect objects). The search is not case-sensitive, and it is susceptible to obfuscation techniques.
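
One common obfuscation is hex-escaping characters in PDF names, for example writing /J#61vaScript instead of /JavaScript. The Python sketch below shows why a plain substring search misses it and how decoding the escapes first helps. The helper names normalize_names and contains are hypothetical, and the sketch decodes escapes everywhere in the data rather than only inside names.

```python
import re

# A PDF name may encode any character as "#" followed by two hex digits.
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def normalize_names(data: bytes) -> bytes:
    """Decode #xx escapes so that /J#61vaScript becomes /JavaScript.

    Simplification: escapes are decoded across all bytes, not just in names.
    """
    return HEX_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


def contains(data: bytes, needle: str) -> bool:
    """Case-insensitive substring search after decoding name escapes."""
    return needle.lower().encode("latin-1") in normalize_names(data).lower()


# Hand-made object with an obfuscated /JavaScript name.
obfuscated = b"5 0 obj\n<< /S /J#61vaScript /JS (app.alert(1)) >>\nendobj\n"

naive_hit = b"javascript" in obfuscated.lower()
normalized_hit = contains(obfuscated, "JavaScript")
print("naive search:", naive_hit)
print("normalized search:", normalized_hit)
```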

The filter option applies the filter(s) to the stream, whereas the raw option makes pdf-parser output raw data.
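
As a rough sketch of what applying filters involves, the Python example below decodes a stream through a chain of filters in the order they are listed. It covers only FlateDecode, ASCIIHexDecode and ASCII85Decode using the standard library; LZWDecode and RunLengthDecode are left out. The function names and the sample payload are assumptions for illustration and are not taken from pdf-parser.

```python
import base64
import binascii
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    # Drop whitespace and the ">" end-of-data marker; pad odd length with "0".
    hexdigits = b"".join(data.split()).rstrip(b">")
    if len(hexdigits) % 2:
        hexdigits += b"0"
    return binascii.unhexlify(hexdigits)


def ascii85_decode(data: bytes) -> bytes:
    # PDF streams end ASCII85 data with "~>"; strip it before decoding.
    payload = data.strip()
    if payload.endswith(b"~>"):
        payload = payload[:-2]
    return base64.a85decode(payload)


DECODERS = {
    "FlateDecode": zlib.decompress,
    "ASCIIHexDecode": ascii_hex_decode,
    "ASCII85Decode": ascii85_decode,
}


def apply_filters(data: bytes, filters: list) -> bytes:
    """Decode stream data by applying each named filter in order."""
    for name in filters:
        data = DECODERS[name](data)
    return data


# Round trip on a made-up payload: compress, then hex-encode, then decode.
original = b"MZ fake payload for demonstration"
encoded = binascii.hexlify(zlib.compress(original)) + b">"
decoded = apply_filters(encoded, ["ASCIIHexDecode", "FlateDecode"])
print(decoded)
```

Stacking several filters on one stream is a frequent trick for hiding payloads, which is why decoding the whole chain matters during analysis.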
