PDF validation with customer-specific extensions

When discussing PDF validation workflows, I often hear questions like "Can I make validation fail if the paper format does not match our corporate rules?". Customer-specific requirements like this one are indeed useful extensions to the pure file format and standard conformance tests.

Customer-specific tests depend on whether the document was scanned or produced from a digital source, and on how it is intended to be used. For scanned documents, the following checks could make sense:
  • Resolution of the scanned images
  • Compression algorithms
  • Manufacturer of the scanner (stored as the producer property in the document's metadata)
  • Presence of OCR text
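As a rough illustration of the resolution check, here is a minimal sketch in Python. It assumes that the pixel dimensions of a scanned image and the size at which it is placed on the page (in points, 1/72 inch) have already been extracted with a PDF library of your choice. The function names, the 300 dpi minimum and the tolerance are hypothetical and not taken from any particular product.

```python
POINTS_PER_INCH = 72.0  # PDF user space unit: 1 point = 1/72 inch


def effective_dpi(pixel_width, pixel_height, placed_width_pt, placed_height_pt):
    """Return the (horizontal, vertical) resolution of an image as placed on the page."""
    if placed_width_pt <= 0 or placed_height_pt <= 0:
        raise ValueError("placed size must be positive")
    horizontal = pixel_width / (placed_width_pt / POINTS_PER_INCH)
    vertical = pixel_height / (placed_height_pt / POINTS_PER_INCH)
    return horizontal, vertical


def resolution_ok(pixel_width, pixel_height, placed_width_pt, placed_height_pt,
                  min_dpi=300.0, tolerance=1.0):
    """Hypothetical corporate rule: both directions must reach min_dpi.

    The tolerance absorbs rounding, since page sizes in points are rarely
    exact multiples of the scanner's pixel grid.
    """
    horizontal, vertical = effective_dpi(
        pixel_width, pixel_height, placed_width_pt, placed_height_pt)
    return min(horizontal, vertical) >= min_dpi - tolerance


if __name__ == "__main__":
    # A 2480 x 3508 pixel scan placed on a full A4 page (595.28 x 841.89 pt)
    print(effective_dpi(2480, 3508, 595.28, 841.89))
    print(resolution_ok(2480, 3508, 595.28, 841.89))
```

The point of the sketch is that the resolution depends on both the pixel count and the placed size, so the same image can pass or fail depending on how large it appears on the page.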
For born-digital documents, the following information could be helpful:
  • Names of the fonts used
  • Font embedding information (if the file is not PDF/A, which already requires embedded fonts)
  • Creator and producer application name
And, in general:
  • Page format (A4, Letter, etc.)
  • Minimum and maximum PDF version (e.g. from 1.4 to 1.7)
  • Presence of specific features such as embedded files, transparency, patterns, shadings, color spaces etc.
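To make the general checks more concrete, the page format and version rules could be sketched as follows. This is only an illustration under stated assumptions: the page sizes (in points) and the version string are expected to come from whatever PDF library you use, and all names, the allowed formats and the one-point tolerance are hypothetical.

```python
# Nominal page sizes in points (1/72 inch), short side first.
PAGE_FORMATS_PT = {
    "A4": (595.28, 841.89),   # 210 x 297 mm
    "Letter": (612.0, 792.0),  # 8.5 x 11 inch
}


def classify_page(width_pt, height_pt, tolerance_pt=1.0):
    """Return the name of the matching format, or None. Orientation is ignored."""
    short, long_side = sorted((width_pt, height_pt))
    for name, (fmt_short, fmt_long) in PAGE_FORMATS_PT.items():
        if (abs(short - fmt_short) <= tolerance_pt
                and abs(long_side - fmt_long) <= tolerance_pt):
            return name
    return None


def parse_version(text):
    """Turn a version string such as '1.7' into a comparable tuple (1, 7)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def check_document(page_sizes_pt, pdf_version, allowed_formats=("A4",),
                   min_version="1.4", max_version="1.7"):
    """Return a list of rule violations; an empty list means the document passes."""
    errors = []
    version = parse_version(pdf_version)
    if not parse_version(min_version) <= version <= parse_version(max_version):
        errors.append(
            f"PDF version {pdf_version} is outside {min_version} to {max_version}")
    for number, (width, height) in enumerate(page_sizes_pt, start=1):
        name = classify_page(width, height)
        if name not in allowed_formats:
            errors.append(
                f"page {number}: format {name or 'unknown'} "
                f"({width} x {height} pt) is not allowed")
    return errors


if __name__ == "__main__":
    for problem in check_document([(595, 842), (612, 792)], "1.3"):
        print(problem)
```

Returning a list of violations instead of a single pass/fail flag lets the workflow report every broken rule at once, which tends to be more useful when documents are rejected.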
I'd like to learn more about your specific requirements. Please share them in a comment!