This tool parses a PDF document to identify the fundamental elements used in the analyzed file. It does not render the PDF document.
Features included:
- Load/parse objects and headers
- Extract metadata (author, description, …)
- Extract text from ordered pages
- Support for compressed PDFs
- Support for the Mac OS Roman charset encoding
- Handling of hexadecimal and octal encoding in text sections
- PSR-0 compliant (autoloader)
- PSR-1 compliant (code styling)
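As an illustration of what handling hexadecimal and octal encoding in text sections involves, here is a minimal Python sketch. It is not taken from the tool: the function names decode_hex_string and decode_octal_escapes are hypothetical, and the sketch covers only hex strings and octal escapes, not the other escape sequences a full parser would need to handle.

```python
import re

def decode_hex_string(token: str) -> bytes:
    """Decode a PDF hex string such as '<48 65 6C 6C 6F>' (hypothetical helper)."""
    # Drop the surrounding angle brackets and any whitespace between digits.
    digits = re.sub(r"\s+", "", token.strip()[1:-1])
    # An odd number of digits is treated as if a trailing 0 were present.
    if len(digits) % 2:
        digits += "0"
    return bytes.fromhex(digits)

def decode_octal_escapes(text: str) -> str:
    """Replace \\ddd octal escapes (1 to 3 digits) in the body of a literal string."""
    return re.sub(r"\\([0-7]{1,3})",
                  lambda m: chr(int(m.group(1), 8) & 0xFF),
                  text)

print(decode_hex_string("<48 65 6C 6C 6F>"))   # b'Hello'
print(decode_octal_escapes(r"\110\145llo"))    # Hello
```

Both forms are legitimate ways to write strings in a PDF, which is why a string can look unreadable in the raw file and still decode to ordinary text.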
Analyzing a Malicious PDF File
We created a PDF file with an EXE file embedded in it.
Step 1: To launch the PDF parser, type pdf-parser. The -h option lists all the available options.
root@kali:~# pdf-parser -h
Step 2: To get the stats of the PDF document.
root@kali:~# pdf-parser -a /root/Desktop/template.pdf
Step 3: To pass stream data through the filters FlateDecode, ASCIIHexDecode, ASCII85Decode, LZWDecode, and RunLengthDecode.
root@kali:~# pdf-parser -f /root/Desktop/template.pdf
Step 4: To get the hashes of the objects in the PDF file.
root@kali:~# pdf-parser -H /root/Desktop/template.pdf
Step 5: Case-sensitive search in streams.
root@kali:~# pdf-parser --casesensitive /root/Desktop/template.pdf
Step 6: To get the JavaScript added to the document.
root@kali:~# pdf-parser --search javascript --raw /root/Desktop/template.pdf
The stats option displays statistics about the objects found in the PDF document. Use it to identify PDF documents with unusual or unexpected objects, or to classify PDF documents.
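To give a feel for the kind of summary the stats option provides, the following Python sketch counts the /Type values that appear in the raw bytes of a PDF. This is only a rough approximation under stated assumptions: the function name type_stats and the sample bytes are hypothetical, the sketch does no real parsing, and it will miss objects stored inside compressed object streams. It is not how pdf-parser itself computes its statistics.

```python
import re
from collections import Counter

def type_stats(data: bytes) -> Counter:
    """Count /Type values seen in the raw bytes of a PDF (regex only, no real parsing)."""
    names = re.findall(rb"/Type\s*/([A-Za-z0-9]+)", data)
    return Counter(name.decode("latin-1") for name in names)

# Hypothetical, hand-written fragment standing in for a real file's contents.
sample = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
    b"4 0 obj\n<< /Type /EmbeddedFile /Length 0 >>\nendobj\n"
)

for name, count in sorted(type_stats(sample).items()):
    print(f"/{name}: {count}")
```

An unexpected type in such a summary, for example an /EmbeddedFile in a document that should contain only text, is the sort of anomaly worth a closer look.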
The search option searches for a string in indirect objects (not inside the streams of indirect objects). The search is not case-sensitive, and it is susceptible to obfuscation techniques.
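One obfuscation technique that can defeat a plain string search is writing characters of a PDF name as #xx hex escapes, so that /JavaScript appears as /J#61vaScript. The Python sketch below shows the effect and one possible countermeasure. The helper names naive_search and normalize_name_escapes are hypothetical, and the normalization is applied to the whole byte buffer for simplicity, which a real parser would restrict to name tokens only.

```python
import re

def naive_search(data: bytes, needle: str) -> bool:
    """Case-insensitive substring search over raw bytes."""
    return needle.lower().encode() in data.lower()

def normalize_name_escapes(data: bytes) -> bytes:
    """Expand #xx hex escapes, e.g. b'/J#61vaScript' becomes b'/JavaScript'."""
    return re.sub(rb"#([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]),
                  data)

# Hypothetical object dictionary with an obfuscated name.
obj = b"<< /S /J#61vaScript /JS (app.alert(1)) >>"

print(naive_search(obj, "javascript"))                          # False
print(naive_search(normalize_name_escapes(obj), "javascript"))  # True
```

The first search misses the keyword even though it ignores case, which is why a negative search result should not be treated as proof that a document contains no JavaScript.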
The filter option applies the filter(s) to the stream, whereas the raw option makes pdf-parser output the raw data.
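To show what applying filters to a stream means in practice, here is a minimal Python sketch that undoes a chain of two filters, ASCIIHexDecode followed by FlateDecode, using only the standard library. The function names and the sample payload are hypothetical, and the sketch is not pdf-parser's own implementation; the remaining filters (ASCII85Decode, LZWDecode, RunLengthDecode) are left out.

```python
import binascii
import zlib

def ascii_hex_decode(data: bytes) -> bytes:
    """ASCIIHexDecode: hex digits, whitespace ignored, '>' marks the end of data."""
    digits = b"".join(data.split(b">", 1)[0].split())
    if len(digits) % 2:          # odd digit count: pad with a trailing 0
        digits += b"0"
    return binascii.unhexlify(digits)

def flate_decode(data: bytes) -> bytes:
    """FlateDecode: zlib/deflate decompression."""
    return zlib.decompress(data)

DECODERS = {"ASCIIHexDecode": ascii_hex_decode, "FlateDecode": flate_decode}

def apply_filters(data: bytes, filters: list) -> bytes:
    """Apply the decoders in the order listed in the stream's /Filter entry."""
    for name in filters:
        data = DECODERS[name](data)
    return data

# Build a hypothetical encoded stream, then decode it again.
payload = b"app.alert(1);"
encoded = binascii.hexlify(zlib.compress(payload)) + b">"
print(apply_filters(encoded, ["ASCIIHexDecode", "FlateDecode"]))  # b'app.alert(1);'
```

Filters are stacked, so the decoders must run in the order in which they are listed; attackers sometimes chain several filters to make a stream harder to read by eye.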