The problem with embedded fonts in PDF mass printing applications

PDF is increasingly finding its way into mass printing applications. However, PDF spool files often overtax a print engine, resulting in aborted jobs or, even worse, incomplete prints that may go unnoticed. What is special about PDF mass printing, and what can be done about it?

Individual PDF files from various application software systems are assembled into large spool files along with print tickets before they are submitted to the mass print service. Print preparation steps such as merging, splitting, reformatting, pagination and barcode insertion lead to spool files that contain huge amounts of fonts and other resources. In particular, a spool file of 100'000 pages may contain 300'000 embedded, slightly different font subsets of the same Times Roman or Helvetica font family. It is immediately clear that an average print engine can't properly handle such a spool file.

One way to solve the problem is to omit the embedded fonts. However, since the PDF files often have to conform to PDF/A because they need to be archived, and PDF/A requires fonts to be embedded, this is not an option for a general solution. Furthermore, print service organizations are not used to handling this problem, because traditional spool file formats such as AFP and PostScript have been optimized to handle font resources economically. So one must find a general solution that reduces the amount of resources in the PDF spool file.

The general solution is an optimizer tool. It can replace redundant objects, such as repeatedly embedded logo images, with a single instance, and merge subsets of the same font family into a single font program. However, merging font programs is not as easy as it seems, for the following reasons:

The font subsets have been derived from different versions of the same font family, e.g. Helvetica 1.0, Helvetica 1.1 etc.
The font subsets have been created by different PDF libraries using different subsetting and embedding rules.
The character code to glyph mapping is different for each subset.
The various subsets use different font technologies such as TrueType, Type 1, CFF or OpenType.
The subsets use different metrics for equivalent glyphs.
etc.
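
To illustrate why the two optimizer tasks differ in difficulty, the following minimal sketch shows the easy case only: collapsing byte-identical objects, such as a repeatedly embedded logo image, to a single instance. The function name `deduplicate` and the input model (a plain dictionary of object ids and raw bytes) are assumptions made for illustration; a real optimizer works on parsed PDF objects. Font subsets that differ in any of the ways listed above produce different bytes, so this approach does not collapse them.

```python
import hashlib


def deduplicate(objects):
    """Map every object id to the id of the first object with identical bytes.

    `objects` is a hypothetical dict {object_id: raw_bytes}. It stands in
    for the object table of a spool file and is not a real PDF library API.
    """
    first_seen = {}  # content digest -> id of the instance that is kept
    remap = {}       # original id -> id of the instance that is kept
    for obj_id, data in objects.items():
        digest = hashlib.sha256(data).digest()
        # setdefault keeps the first id registered for this digest
        remap[obj_id] = first_seen.setdefault(digest, obj_id)
    return remap


# Objects 1 and 3 hold the same logo bytes, so 3 is redirected to 1.
spool_objects = {1: b"logo-image-bytes", 2: b"page-content", 3: b"logo-image-bytes"}
remap = deduplicate(spool_objects)
```

After this step, every reference to a redirected id would be rewritten to the kept id and the duplicate dropped.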

Since mass print preparation requires a powerful font merging algorithm, we have developed a special tool to perform the task. In the best case, the tool is able to reduce the number of embedded fonts in the above-mentioned spool file from 300'000 to only 3.
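
The tool's algorithm is not described here. As a rough sketch of the general idea only, the following code merges subsets of the same family by glyph name and records, for each subset, how its old character codes map to codes in the merged font. The function `merge_subsets` and the data model (a family name plus a code-to-glyph-name dictionary per subset) are hypothetical. The sketch addresses only the differing code-to-glyph mappings and ignores font versions, font technologies and metrics.

```python
from collections import defaultdict


def merge_subsets(subsets):
    """Merge font subsets per family, keyed by glyph name.

    `subsets` is a hypothetical list of (family, {char_code: glyph_name}).
    Returns (merged, remaps):
      merged: {family: {glyph_name: new_code}}
      remaps: one {old_code: new_code} dict per input subset, in input order
    """
    merged = defaultdict(dict)
    remaps = []
    for family, encoding in subsets:
        table = merged[family]
        remap = {}
        for code, glyph in sorted(encoding.items()):
            if glyph not in table:
                # Assign the next free code in the merged font (illustrative
                # numbering; real fonts have reserved codes and glyph ids).
                table[glyph] = len(table)
            remap[code] = table[glyph]
        remaps.append(remap)
    return dict(merged), remaps


subsets = [
    ("Helvetica", {1: "A", 2: "B"}),
    ("Helvetica", {1: "B", 2: "C"}),  # same family, different code mapping
    ("Times", {5: "A"}),
]
merged, remaps = merge_subsets(subsets)
```

In this example the three subsets collapse into two merged fonts, one per family. The text that used each subset would then have to be re-encoded with the corresponding remap.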