PDF/A Format — The Complete Guide to Archival PDFs
PDF/A is an ISO-standardized subset of PDF (ISO 19005) designed so that documents remain readable for decades, or even centuries, without depending on specific software, fonts, or external resources. Many courts, government agencies, and regulated industries around the world require or recommend it as their archival format.
What Is PDF/A?
PDF/A is a constrained profile of the Portable Document Format built for long-term digital preservation. Defined by ISO 19005 (first part published in 2005), it is designed so that a document archived today renders the same way on any conforming reader 50, 100, or even 200 years from now.
The core principle is self-containment: every resource the document needs — fonts, color profiles, metadata — must be embedded within the file itself. External dependencies like linked images, web fonts, or streaming multimedia are strictly prohibited.
According to the Library of Congress, PDF/A is one of only three digital formats recommended for long-term preservation of textual documents, alongside plain text (UTF-8) and XML. As of 2024, the U.S. Federal Courts system processes over 52 million PDF/A documents annually through its CM/ECF electronic filing system.
The key difference from standard PDF lies in its constraints: PDF/A removes features that could compromise future readability. JavaScript, audio, video, external font references, and encryption are all forbidden because they depend on software capabilities that may no longer exist decades from now.
PDF/A Conformance Levels Explained
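ISO 19005 is published in parts (PDF/A-1 through PDF/A-4), and each part defines one or more conformance levels. Later parts are based on newer PDF versions and permit more features, while the level letter sets how strict the requirements are: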
| Level | Standard | Key Requirement | Best For |
|---|---|---|---|
| PDF/A-1b | ISO 19005-1 (2005) | Visual preservation — layout looks correct | Basic archiving, scanned docs |
| PDF/A-1a | ISO 19005-1 (2005) | Full accessibility — tagged structure + Unicode | Accessible archives |
| PDF/A-2b | ISO 19005-2 (2011) | JPEG2000, transparency, layers | Modern documents |
| PDF/A-2u | ISO 19005-2 (2011) | 2b + Unicode text mapping | Searchable archives |
| PDF/A-2a | ISO 19005-2 (2011) | Full accessibility on PDF 1.7 | Accessible modern docs |
| PDF/A-3b | ISO 19005-3 (2012) | Embedded files of ANY format | XML data + source files |
| PDF/A-4 | ISO 19005-4 (2020) | Based on PDF 2.0, simplified levels | Next-generation archiving |
The "b" level (basic) guarantees visual fidelity — the document will look correct. The "a" level (accessible) additionally requires a tagged logical structure, making content navigable by screen readers and enabling reliable text extraction. The "u" level requires Unicode character mapping for searchability.
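The level suffixes described above can be captured in a small lookup. The following is a minimal sketch, not part of any PDF library: the names `parse_conformance` and `guarantees` are hypothetical, and the sketch covers only parts 1 to 3 with the "a", "b", and "u" levels from the table (PDF/A-4 uses a different level scheme and is deliberately left out).

```python
import re
from typing import NamedTuple


class Conformance(NamedTuple):
    part: int   # 1, 2, or 3
    level: str  # "a", "b", or "u"


# Matches labels such as "PDF/A-2u" (case-insensitive).
_LABEL = re.compile(r"^PDF/A-([1-3])([abu])$", re.IGNORECASE)

# PDF/A-1 defines only "a" and "b"; "u" was introduced with PDF/A-2.
_LEVELS_BY_PART = {1: {"a", "b"}, 2: {"a", "b", "u"}, 3: {"a", "b", "u"}}


def parse_conformance(label: str) -> Conformance:
    """Split a label like 'PDF/A-2u' into its part number and level letter."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a PDF/A-1..3 label: {label!r}")
    part, level = int(match.group(1)), match.group(2).lower()
    if level not in _LEVELS_BY_PART[part]:
        raise ValueError(f"PDF/A-{part} has no level {level!r}")
    return Conformance(part, level)


def guarantees(conf: Conformance) -> list[str]:
    """List what each level promises, following the article's description."""
    result = ["visual fidelity"]                    # every level includes "b"
    if conf.level in ("u", "a"):
        result.append("Unicode text mapping")       # "u" and "a"
    if conf.level == "a":
        result.append("tagged logical structure")   # "a" only
    return result


print(guarantees(parse_conformance("PDF/A-2u")))
# ['visual fidelity', 'Unicode text mapping']
```

Treating "a" as a superset of "u", and "u" as a superset of "b", mirrors the table: each stricter level keeps the guarantees of the looser one.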
According to the PDF Association, PDF/A-2b is the most widely adopted conformance level globally, accounting for approximately 61% of all PDF/A files created between 2018 and 2024. PDF/A-3 adoption is growing rapidly in the EU due to regulations requiring machine-readable data alongside human-readable documents (e.g., the ZUGFeRD e-invoicing standard in Germany).
PDF/A Requirements and Restrictions
To achieve PDF/A compliance, a document must satisfy strict technical requirements. Understanding these constraints helps explain why some standard PDFs fail validation:
Mandatory requirements:
- Font embedding: Every font used in the document must be embedded, either in full or as a subset. System font references are prohibited because the referenced font may not exist on future systems.
- Device-independent color: Color must be specified in a device-independent way, for example through ICC-based color spaces or an embedded output intent profile. Device-dependent color with no profile to interpret it (e.g., raw RGB) is not allowed because color rendering varies by hardware.
- XMP metadata: The file must contain an Extensible Metadata Platform (XMP) metadata stream that declares the PDF/A part and conformance level, alongside document properties such as creation and modification dates.
- Document structure: For "a" conformance, the file must include a complete tag tree mapping content to semantic elements (headings, paragraphs, tables, lists).
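To illustrate the XMP requirement, the sketch below reads the PDF/A claim from an XMP packet that has already been extracted from a file. It relies on the PDF/A identification schema (namespace `http://www.aiim.org/pdfa/ns/id/`, properties `pdfaid:part` and `pdfaid:conformance`); the function name `read_pdfa_claim` and the sample packet are hypothetical. Extracting the packet from the PDF itself is out of scope here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"


def read_pdfa_claim(xmp_packet: str):
    """Return (part, conformance) declared in the XMP, or None if absent.

    XMP allows a property to be written either as an attribute of
    rdf:Description or as a child element, so both forms are checked.
    """
    root = ET.fromstring(xmp_packet)
    part_tag = f"{{{PDFAID_NS}}}part"
    conf_tag = f"{{{PDFAID_NS}}}conformance"
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        part = desc.get(part_tag) or desc.findtext(part_tag)
        conf = desc.get(conf_tag) or desc.findtext(conf_tag)
        if part:
            return part.strip(), (conf or "").strip().upper()
    return None


# Hypothetical sample packet claiming PDF/A-2b.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>2</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(read_pdfa_claim(SAMPLE))  # ('2', 'B')
```

Note that this only reports what the file claims about itself. Whether the file actually meets that claim is a separate question that only a validator can answer.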
Prohibited features:
- JavaScript: No executable code of any kind.
- Encryption: The file must be unencrypted (encryption would prevent future access if the decryption algorithm becomes unavailable).
- External content references: No linked images, fonts, or resources outside the file.
- Audio and video: Multimedia content depends on codecs that may become obsolete.
- Transparency (PDF/A-1 only): Live transparency was prohibited in version 1; PDF/A-2 and later allow it.
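As a rough first-pass triage, some of the prohibited features above can be spotted by searching the raw bytes of a file for the PDF name tokens typically associated with them. The sketch below is a heuristic under stated assumptions, not a validator: the marker table and the function name `scan_prohibited` are illustrative, tokens hidden inside compressed object streams will be missed, and a token appearing inside ordinary text content would be a false positive.

```python
import re

# PDF name tokens commonly associated with features PDF/A forbids.
# Illustrative and incomplete; a real validator parses the object graph.
MARKERS = {
    "JavaScript": [rb"/JavaScript\b", rb"/JS\b"],
    "encryption": [rb"/Encrypt\b"],
    "audio/video": [rb"/Sound\b", rb"/Movie\b", rb"/RichMedia\b"],
}


def scan_prohibited(pdf_bytes: bytes) -> list[str]:
    """Return the names of prohibited features whose tokens appear in the bytes."""
    found = []
    for feature, patterns in MARKERS.items():
        if any(re.search(pattern, pdf_bytes) for pattern in patterns):
            found.append(feature)
    return found


# Hypothetical fragment of an uncompressed PDF with a script action
# and an encryption dictionary reference in the trailer.
fragment = (
    b"%PDF-1.7\n"
    b"1 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj\n"
    b"trailer << /Encrypt 5 0 R >>\n"
)
print(scan_prohibited(fragment))  # ['JavaScript', 'encryption']
```

A scan like this can flag obvious problems before a file enters a conversion queue, but an empty result says nothing about fonts, color spaces, or metadata.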
According to a 2023 study by the Digital Preservation Coalition, font embedding failures account for 43% of all PDF/A validation errors, followed by color-space violations (27%) and missing XMP metadata (18%).
When to Use PDF/A
PDF/A is the standard choice — and often a legal requirement — in these industries and scenarios:
- Government and public records — The U.S. National Archives (NARA) mandates PDF/A for permanent federal records. The European Commission requires PDF/A for all official EU publications.
- Legal and judicial systems — Courts in the United States, Germany, Brazil, and Australia require or strongly recommend PDF/A for electronic filings. The U.S. federal court system (CM/ECF) validates every submission against PDF/A-1 conformance.
- Healthcare — HIPAA-regulated organizations in the U.S. use PDF/A for patient records that must remain accessible throughout retention periods spanning 6 to 30+ years.
- Financial services — Banking regulations (Basel III, MiFID II) require transaction records to be preserved in tamper-evident formats. PDF/A satisfies the "human-readable" component alongside structured data.
- Libraries and archives — The Library of Congress, British Library, and Bibliothèque nationale de France all use PDF/A as a primary digital preservation format.
If your document needs to be readable in 10+ years without depending on specific software, PDF/A is the appropriate choice. For documents that will be consumed and discarded within weeks or months, standard PDF offers more flexibility.
How to Create and Validate PDF/A Files
Creating a compliant PDF/A file requires either generating the document in PDF/A mode from the start or converting an existing PDF:
Creating PDF/A natively:
1. Microsoft Word/PowerPoint: Export using "Save As" → PDF → Options → "ISO 19005-1 compliant (PDF/A)." This embeds fonts and applies ICC color profiles automatically.
2. Adobe Acrobat Pro: Use the "Standards" panel to apply PDF/A conversion with automatic font embedding and color-space normalization.
3. LibreOffice: Export as PDF with the "Archive (PDF/A-1a)" checkbox enabled under PDF Options.
Validating compliance: After creation, validation is critical. According to the PDF Association, approximately 30% of documents claimed to be PDF/A fail automated conformance checks, usually due to missing font subsets or incorrect color spaces.
Popular validation tools include veraPDF (the open-source industry reference validator developed with EU funding), Adobe Acrobat's built-in Preflight module, and the AuraPDF Health Checker, which analyzes PDF structure and metadata.
For organizations processing high volumes of documents, automated conversion pipelines using tools like Ghostscript or Apache PDFBox can batch-convert standard PDFs to PDF/A while logging validation results for quality assurance.
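A batch conversion of the kind described above might be planned as follows. This is a sketch under assumptions: it only builds Ghostscript command lines rather than running them, the helper names `ghostscript_pdfa_command` and `plan_batch` are hypothetical, and it assumes a `gs` executable on the PATH plus a copy of Ghostscript's `PDFA_def.ps` prolog that has been edited to point at an ICC profile. The exact flags that work best depend on the Ghostscript version and the input files.

```python
from pathlib import Path


def ghostscript_pdfa_command(src, dst, part=2, pdfa_def="PDFA_def.ps"):
    """Build a Ghostscript argument list that converts src to PDF/A.

    Assumes `gs` is on the PATH and `pdfa_def` points to an edited copy of
    the PDFA_def.ps prolog shipped with Ghostscript.
    """
    return [
        "gs",
        f"-dPDFA={part}",                 # target PDF/A part (1, 2, or 3)
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=RGB",  # convert colors to one space
        "-dPDFACompatibilityPolicy=1",    # drop features that break PDF/A
        f"-sOutputFile={dst}",
        str(pdfa_def),
        str(src),
    ]


def plan_batch(sources, out_dir):
    """Return one command per source file, writing into out_dir."""
    out_dir = Path(out_dir)
    return [
        ghostscript_pdfa_command(Path(s), out_dir / Path(s).name)
        for s in sources
    ]


for command in plan_batch(["a.pdf", "docs/b.pdf"], "archive"):
    print(" ".join(command))
```

Each command can then be passed to `subprocess.run` with its return code and output logged. A successful exit from Ghostscript does not prove conformance, so the converted files should still go through a validator such as veraPDF.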
Written by the AuraPDF Team
The AuraPDF team builds free, secure PDF tools used by thousands of people worldwide. Our Knowledge Base articles combine technical expertise with accessible explanations to help you understand PDF technology.