Digging for information by extracting data from a PDF document

Extracting text from a PDF document is one of the most popular information retrieval functions. But how about other information such as images, metadata and more? It can be simple, but also tricky.

Using blockchains as an alternative to PKIs for digital signatures

The traditional technical environment for a digital signature is the public key infrastructure (PKI). Digital signatures are also used to implement electronic money such as Bitcoin. Bitcoin, however, relies on a newer technology, the blockchain. This technical infrastructure can also be employed to sign documents. But what are the benefits?

How to render the text of a PDF document if the font is not embedded?

Every developer of a PDF viewer, a PDF printer or a PDF-to-image converter comes across the requirement to render non-embedded fonts, and faces quite a challenging task. Not only developers but also users of these tools might be interested in non-embedded fonts and how these tools treat them.

Inline images and Type 3 fonts

I often hear that the inline image construct is a major flaw in the design of the PDF page description language. Inline images are a frequently used feature in Type 3 fonts. Nevertheless, the discomfort of some experts even led them to adjust this feature in the upcoming PDF 2.0 standard. What are inline images and why do some programmers of PDF readers feel uncomfortable about them?

How to convert signed documents to PDF/A?

I am often asked whether it is possible to convert digitally signed documents to PDF/A. Because there is no short answer to this, I thought it would be helpful to explore the topic in a bit more detail.