Does OCR make sense for digitally generated PDFs?

Scanned PDF files usually consist of one raster image per page. An OCR engine can recognize the text in these images and make the document searchable. But what about digitally generated documents?

Using native applications in a PDF document conversion service

Automated conversion of Office documents into PDFs has become a popular service. When designing the architecture of such a service, the question arises whether the native application or a specially developed software library should carry out the conversion. The pros and cons are not obvious, so it's worth taking a closer look.

Importing images into a PDF file - a seemingly trivial task

A picture is worth a thousand words, which is why images are so often embedded in PDF files. One would expect embedding an image in a PDF file to be a simple task. Because it seems so easy, many tools exist for it, including free ones. But do these tools do what you expect them to do?

A closer look reveals that embedding images is anything but trivial.