Here’s an example PDF (from some other open source project, interestingly) that does work consistently in qpdfview.
The attachment does seem to be hidden in an annotation, indicated by an icon, so when you first mouse over it you see “text file” and need to click on it:
Then the icon will be replaced with a different icon, which you can then click to see the save options:
As mentioned, “Save and open…” will open up the file (called utf8test.txt), which is indeed a text file, in Featherpad.
To better understand what we’re dealing with, and to make sure we’re comparing apples to apples, I took the time to run hexdump -C against the PDF, and you can see how the text file (a stream) is hidden in the annotation.
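Here we can see an object of type EmbeddedFile whose uncompressed size is 4384 bytes (the /Size entry under /Params; the /Length of 2649 is the compressed stream), which can be confirmed by running stat --format='%s' on the saved file: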
00004dc0 65 6e 64 6f 62 6a 0a 39 20 30 20 6f 62 6a 0a 3c |endobj.9 0 obj.<|
00004dd0 3c 20 2f 54 79 70 65 20 2f 45 6d 62 65 64 64 65 |< /Type /Embedde|
00004de0 64 46 69 6c 65 20 2f 46 69 6c 74 65 72 20 2f 46 |dFile /Filter /F|
00004df0 6c 61 74 65 44 65 63 6f 64 65 20 2f 4c 65 6e 67 |lateDecode /Leng|
00004e00 74 68 20 32 36 34 39 20 2f 50 61 72 61 6d 73 20 |th 2649 /Params |
00004e10 3c 3c 2f 53 69 7a 65 20 34 33 38 34 3e 3e 20 3e |<</Size 4384>> >|
00004e20 3e 20 73 74 72 65 61 6d 0a 78 9c a5 58 4b 77 13 |> stream.x..XKw.|
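To cross-check the size without going through a viewer, the stream can be inflated directly. This is only a rough sketch, not a real PDF parser: it assumes the EmbeddedFile dictionary looks like the one in the dump above (a direct integer /Length, a single /FlateDecode filter, not tucked away inside an object stream). The function name extract_embedded_file is made up for illustration.

```python
import re
import zlib
from typing import Optional, Tuple


def extract_embedded_file(pdf_bytes: bytes) -> Tuple[bytes, Optional[int]]:
    """Inflate the first EmbeddedFile stream; return (payload, declared /Size)."""
    # Locate the stream dictionary by its /Type entry.
    type_pos = pdf_bytes.find(b"/Type /EmbeddedFile")
    if type_pos == -1:
        raise ValueError("no /Type /EmbeddedFile object found")

    # The compressed data starts right after the "stream" keyword and its newline.
    stream_match = re.compile(rb"stream\r?\n").search(pdf_bytes, type_pos)
    if stream_match is None:
        raise ValueError("no stream keyword after the EmbeddedFile dictionary")
    dict_end = stream_match.start()

    # Only look inside the dictionary, i.e. between /Type and "stream".
    # Assumption: /Length is a direct integer, not an indirect reference.
    length_match = re.compile(rb"/Length\s+(\d+)").search(pdf_bytes, type_pos, dict_end)
    size_match = re.compile(rb"/Size\s+(\d+)").search(pdf_bytes, type_pos, dict_end)
    if length_match is None:
        raise ValueError("no direct /Length entry found")

    start = stream_match.end()
    compressed = pdf_bytes[start:start + int(length_match.group(1))]
    payload = zlib.decompress(compressed)  # /FlateDecode is zlib/deflate
    declared = int(size_match.group(1)) if size_match else None
    return payload, declared
```

Fed the bytes of the PDF (read in binary mode), len(payload) should agree with both the declared /Size and what stat reports for the saved text file, if the assumptions above hold for that file.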
Here’s an indication of an object of type Annot (annotation) that has a subtype FileAttachment:
00005880 8d dd 0a 65 6e 64 73 74 72 65 61 6d 0a 65 6e 64 |...endstream.end|
00005890 6f 62 6a 0a 37 20 30 20 6f 62 6a 0a 3c 3c 2f 54 |obj.7 0 obj.<</T|
000058a0 79 70 65 20 2f 41 6e 6e 6f 74 20 2f 53 75 62 74 |ype /Annot /Subt|
000058b0 79 70 65 20 2f 46 69 6c 65 41 74 74 61 63 68 6d |ype /FileAttachm|
000058c0 65 6e 74 20 2f 52 65 63 74 20 5b 32 34 30 2e 39 |ent /Rect [240.9|
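Here’s the “text file” string that appears in the tooltip (it is stored as UTF-16BE with a leading fe ff byte-order mark, which is why there is a 00 byte before each letter):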
000058d0 34 34 38 38 32 20 37 35 31 2e 31 38 31 33 33 39 |44882 751.181339|
000058e0 20 32 35 35 2e 31 31 38 31 31 30 20 37 36 35 2e | 255.118110 765.|
000058f0 33 35 34 35 36 37 5d 20 2f 43 6f 6e 74 65 6e 74 |354567] /Content|
00005900 73 20 28 fe ff 00 74 00 65 00 78 00 74 00 20 00 |s (...t.e.x.t. .|
00005910 66 00 69 00 6c 00 65 29 20 2f 50 20 31 31 20 30 |f.i.l.e) /P 11 0|
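The bytes between the parentheses after /Contents can be decoded by hand to confirm they really spell out the tooltip text. A minimal sketch, with a made-up helper name (decode_pdf_text_string); the fallback branch treats non-BOM strings as Latin-1, which is only an approximation of PDFDocEncoding:

```python
def decode_pdf_text_string(raw: bytes) -> str:
    """Decode a PDF text string taken from between the parentheses."""
    # A leading FE FF byte-order mark signals UTF-16BE.
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be")
    # Assumption: approximate PDFDocEncoding with Latin-1 for everything else.
    return raw.decode("latin-1")


# Bytes copied from the hexdump at offsets 0x5903 to 0x5916.
raw = bytes.fromhex("fe ff 00 74 00 65 00 78 00 74 00 20 00 66 00 69 00 6c 00 65")
print(decode_pdf_text_string(raw))  # text file
```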
There’s even a reference to the pushpin:
00005990 20 38 20 30 20 52 20 2f 4e 61 6d 65 20 2f 50 75 | 8 0 R /Name /Pu|
000059a0 73 68 50 69 6e 3e 3e 0a 65 6e 64 6f 62 6a 0a 31 |shPin>>.endobj.1|
And later after some XML metadata, there’s a catalog which defines the name of the attached file:
00006db0 36 20 30 20 6f 62 6a 0a 3c 3c 20 2f 54 79 70 65 |6 0 obj.<< /Type|
00006dc0 20 2f 43 61 74 61 6c 6f 67 20 2f 56 65 72 73 69 | /Catalog /Versi|
00006dd0 6f 6e 20 2f 31 2e 37 20 2f 50 61 67 65 73 20 31 |on /1.7 /Pages 1|
00006de0 20 30 20 52 20 2f 4e 61 6d 65 73 20 3c 3c 20 2f | 0 R /Names << /|
00006df0 45 6d 62 65 64 64 65 64 46 69 6c 65 73 20 3c 3c |EmbeddedFiles <<|
00006e00 2f 4e 61 6d 65 73 20 5b 20 28 75 74 66 38 74 65 |/Names [ (utf8te|
00006e10 73 74 2e 74 78 74 29 20 38 20 30 20 52 20 5d 3e |st.txt) 8 0 R ]>|
Hopefully that helps us dissect what we’re dealing with.
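For comparing this PDF against another one, paging through hexdump output gets tedious, so here is a small sketch that just reports the byte offsets of the same names seen in the dumps above. The function name find_markers is made up, and this only looks at the raw bytes: anything stored inside a compressed object stream would be missed.

```python
import re

# PDF names that showed up in the hexdump of the working example.
MARKERS = (b"/EmbeddedFile", b"/FileAttachment", b"/EmbeddedFiles")


def find_markers(pdf_bytes: bytes) -> dict:
    """Map each marker name to the list of byte offsets where it occurs."""
    hits = {}
    for marker in MARKERS:
        # The lookahead keeps /EmbeddedFile from also matching /EmbeddedFiles.
        pattern = re.compile(re.escape(marker) + rb"(?![A-Za-z0-9])")
        hits[marker.decode("ascii")] = [m.start() for m in pattern.finditer(pdf_bytes)]
    return hits
```

Running it on both files and comparing which lists come back empty would be a quick first check on whether the two files attach things the same way.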
That said, I’ve seen cases before where PDFs are viewable in one PDF reader but not in others. My feeling is that there are PDF creators out there that make PDFs that do not strictly adhere to the PDF format, and some readers cheat and let them slide while others don’t. Perhaps they’re using deprecated features? That might be another explanation for why one thing works and another doesn’t.
I’m working on tackling the original example, but since it’s not as simple a file, it’s a lot harder to manage.