this post was submitted on 04 Nov 2024

cybersecurity

cross-posted from: https://lemmy.sdf.org/post/24645301

They emailed me a PDF. It opened fine with evince and looked like a simple doc at first. Then I clicked on a field in the form. Strangely, instead of simply populating the field with my text, a PDF note window popped up so my text entry went into a PDF note, which many viewers present as a sticky note icon.

If I were to fax this PDF, the PDF comments would just get lost. So to fill out the form I fed it to LaTeX and used the overpic pkg to write text wherever I choose. LaTeX rejected the file; it could not handle this PDF. Then I used the file command to see what I was dealing with:

$ file signature_page.pdf
signature_page.pdf: Java serialization data, version 5

WTF is that? I know PDF supports JavaScript (shitty indeed). Is that what this is? “Java” is not JavaScript, so I’m baffled. Why is Java in a PDF? (edit: explainer on Java serialization, and some analysis)
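For context, the `file` verdict above most likely comes from the first four bytes of the file alone: a Java object serialization stream starts with the magic number 0xACED followed by a two-byte stream version. A minimal sketch of the same check, where the function name `java_serialization_version` is made up for illustration:

```python
import struct
from typing import Optional

# Magic number that opens every Java object serialization stream
# (java.io.ObjectStreamConstants.STREAM_MAGIC).
STREAM_MAGIC = 0xACED


def java_serialization_version(data: bytes) -> Optional[int]:
    """Return the stream version if data starts with the Java
    serialization header, otherwise None."""
    if len(data) < 4:
        return None
    # Two big-endian unsigned 16-bit values: magic, then version.
    magic, version = struct.unpack(">HH", data[:4])
    return version if magic == STREAM_MAGIC else None
```

Feeding it the first bytes of the file, e.g. `java_serialization_version(open("signature_page.pdf", "rb").read(4))`, only says how the file begins. It says nothing about whether any Java code is involved when a PDF viewer opens it.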

My workaround was to use evince to print the PDF to PDF (using a PDF-building printer driver or whatever evince uses), then feed that into LaTeX. That worked.

My question is, how common is this? Is it going to become a mechanism to embed a tracking pixel like corporate assholes do with HTML email?

I probably need to change my habits. I know PDF docs can serve as carriers of copious malware anyway. Some people go to the extreme of creating a single-use virtual machine with a PDF viewer, printing the PDF to a new PDF inside it, and then destroying the VM, which is assumed to be compromised.

My temptation is to take a less tedious approach. E.g. something like:

$ firejail --net=none evince untrusted.pdf

I should be able to improve on that by doing something non-interactive. My first guess:

$ firejail --net=none gs -sDEVICE=pdfwrite -q -dFIXEDMEDIA -dSCALE=1 -o is_this_output_safe.pdf -- /usr/share/ghostscript/*/lib/viewpbm.ps untrusted_input.pdf

output:

Error: /invalidfileaccess in --file--
Operand stack:
   (untrusted_input.pdf)   (r)
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1990   1   3   %oparray_pop   1989   1   3   %oparray_pop   1977   1   3   %oparray_pop   1833   1   3   %oparray_pop   --nostringval--   %errorexec_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   %array_continue   --nostringval--
Dictionary stack:
   --dict:769/1123(ro)(G)--   --dict:0/20(G)--   --dict:87/200(L)--   --dict:0/20(L)--
Current allocation mode is local
Last OS error: Permission denied
Current file position is 10479
GPL Ghostscript 10.00.0: Unrecoverable error, exit code 1

What’s my problem? Better ideas? I would love it if attempts to reach the cloud could be trapped and recorded to a log file in the course of neutering the PDF.
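One partial, static way to approach the "what would this PDF try to reach" question is to count the PDF name tokens usually associated with actions and network access before opening the file at all. The sketch below is only a rough triage in the spirit of existing PDF scanners, not a log of real network attempts. The token list and the function name `count_risky_names` are assumptions chosen for illustration, and names hidden inside compressed object streams or written with `#xx` hex escapes will be missed.

```python
import re
from collections import Counter

# PDF name tokens often tied to scripting, auto-run actions, or
# outbound requests. This list is an assumption, not exhaustive.
RISKY_NAMES = (
    b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
    b"/Launch", b"/URI", b"/SubmitForm", b"/EmbeddedFile",
)


def count_risky_names(data: bytes) -> Counter:
    """Count occurrences of each risky name in the raw PDF bytes."""
    counts = Counter()
    for name in RISKY_NAMES:
        # Reject matches that continue into a longer name, so that
        # "/JS" does not match "/JSX".
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts
```

A zero count is not proof of safety, for the reasons above. A non-zero count for `/URI` or `/SubmitForm` is a hint that the document references something outside itself and deserves the sandboxed treatment.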

(note: I also wonder what happens when Firefox opens this PDF considering Mozilla is happy to blindly execute whatever code it receives no matter the context.)

[–] GetOffMyLan@programming.dev -1 points 1 week ago* (last edited 1 week ago) (3 children)

The file is a serialised java array that contains a pdf file. I've seen a few things online about this. Some pdf readers accept it, some don't.

And I'm not sure why an application would output a pdf this way. But there's nothing harmful going on.

You're kind of freaking out about nothing.

[–] Hirom@beehaw.org 17 points 1 week ago (1 children)

It's a fair question. There's precedent where malware is embedded in PDFs.

[–] GetOffMyLan@programming.dev 2 points 1 week ago* (last edited 1 week ago)

Indeed. But the pdf file itself isn't the issue here. They very clearly don't know what serialisation is.

And while there are risks with java serialisation it isn't being deserialized here.

[–] evenwicht@lemmy.sdf.org 14 points 1 week ago* (last edited 1 week ago) (2 children)

You’re kind of freaking out about nothing.

I highly recommend YouTube video l6eaiBIQH8k, if you can track it down. You seem to have no general idea about PDF security problems.

And I’m not sure why an application would output a pdf this way. But there’s nothing harmful going on.

If you can’t explain it, then you don’t understand it. Thus you don’t have answers.

It’s bad practice to open a PDF you did not produce without safeguards. Shame on me for doing it. I got sloppy, but it won’t happen again.

[–] GetOffMyLan@programming.dev 2 points 1 week ago* (last edited 1 week ago)

It's literally just the format of the file here. If you skip the java serialisation header it's a normal pdf file. I said nothing about the pdf file itself.
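The "skip the header" claim can be checked without opening the file in a viewer. A minimal sketch, assuming the wrapper is nothing more than leading bytes in front of an otherwise ordinary PDF, as described above. The function name `strip_to_pdf_header` is made up for illustration.

```python
def strip_to_pdf_header(data: bytes) -> bytes:
    """Drop everything before the first '%PDF-' marker.

    Assumes the payload is a plain PDF preceded by wrapper bytes;
    raises ValueError if no PDF header is present at all.
    """
    start = data.find(b"%PDF-")
    if start < 0:
        raise ValueError("no %PDF- header found")
    # Everything from the header onward is kept unchanged.
    return data[start:]
```

Writing the result of `strip_to_pdf_header(open("signature_page.pdf", "rb").read())` to a new file and running `file` on it would show whether the remainder is recognised as a PDF document. This only removes the wrapper; whatever is inside the PDF itself is left exactly as it was.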

I did explain what it is. I just don't know why certain programs encode it this way. It's supported by multiple pdf readers so it must be semi common but I can't find a reason for it to be encoded this way.

I'm trying to help you out; there's no need to be a dick.