
How to Check PDF Health: Validate & Fix Document Issues

A complete guide to understanding your PDF's internal health — from font embedding and image resolution to structural integrity and compliance validation.

AuraPDF Team, March 29, 2026

How to Check PDF Health in 3 Steps

Analyzing a PDF's structure and identifying potential issues takes under 10 seconds:

Step 1: Go to AuraPDF's PDF Health Checker and upload your document.

Step 2: Review the health report. AuraPDF analyzes your PDF's internal structure and reports on:
• Document properties: page count, file size, PDF version, creator software
• Font status: which fonts are embedded, subsetted, or missing
• Image analysis: embedded image count, resolution (DPI), and compression
• Metadata: author, creation date, modification date, producer
• Security: encryption status, permission flags
• Structural integrity: cross-reference table validity, page tree structure

Step 3: Review flagged issues and use AuraPDF's recommended tools to fix them.

According to a 2024 analysis by veraPDF (the industry-standard open-source PDF validator), 38% of PDFs in enterprise environments contain at least one structural issue that could cause problems during printing, archiving, or processing. Catching these issues early prevents downstream failures.

What a PDF Health Check Reveals

A thorough PDF health check examines multiple layers of the document's internal structure:

1. Document Structure

The PDF specification (ISO 32000) defines a precise file structure: header, body objects, cross-reference table, and trailer. Corruption in any of these layers can cause the file to fail in specific viewers, printers, or processing tools, even if it appears to open normally in Adobe Acrobat (which has extensive error recovery).
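The layout described above can be spot-checked directly on the raw bytes. The following is a minimal sketch, not how AuraPDF's checker works internally: the function name `quick_structure_check` is hypothetical, and the check only confirms that a header, a `startxref` offset, and an `%%EOF` marker exist and point somewhere plausible. It does not parse objects or validate the cross-reference entries themselves.

```python
import re


def quick_structure_check(data: bytes) -> dict:
    """Rough structural sanity checks on raw PDF bytes (not a full parser)."""
    report = {}

    # Header: the file should begin with "%PDF-" followed by a version such as 1.7.
    header = re.match(rb"%PDF-(\d\.\d)", data)
    report["version"] = header.group(1).decode() if header else None

    # Trailer: the end of the file should hold "startxref", a byte offset, and "%%EOF".
    tail = data[-1024:]
    report["has_eof_marker"] = b"%%EOF" in tail

    # With incremental updates there may be several; the last one is the active one.
    offsets = re.findall(rb"startxref\s+(\d+)", tail)
    offset = int(offsets[-1]) if offsets else None
    report["startxref"] = offset

    # The offset should land on a classic "xref" table or, for cross-reference
    # streams, on an object header of the form "N G obj".
    if offset is not None and offset < len(data):
        target = data[offset:offset + 32]
        report["xref_target_ok"] = bool(
            target.startswith(b"xref") or re.match(rb"\d+\s+\d+\s+obj", target)
        )
    else:
        report["xref_target_ok"] = False
    return report
```

A file that passes all four checks can still be damaged elsewhere, so a result like this is best treated as a quick triage step before running a full validator.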

2. Font Embedding

Fonts are the most common source of PDF problems. The health checker identifies:
• Fully embedded fonts ✅: the complete font is stored in the PDF
• Subset embedded fonts ✅: only the used characters are embedded (normal for production PDFs)
• Referenced fonts ⚠️: the font name is listed but not embedded; appearance depends on the viewer's system fonts
• Missing font data ❌: the font reference is broken or the font program is corrupted
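The first three categories above can be told apart from the font dictionary alone. The sketch below assumes the font dictionary has already been read into a plain Python `dict` (how you obtain it depends on your PDF library, and `classify_font` is a hypothetical name). It relies on two conventions from the PDF specification: an embedded font program is stored under `FontFile`, `FontFile2`, or `FontFile3` in the font descriptor, and subset fonts carry a six-uppercase-letter tag plus `+` in front of the font name.

```python
import re

SUBSET_PREFIX = re.compile(r"^[A-Z]{6}\+")  # e.g. "ABCDEF+Helvetica"
FONT_FILE_KEYS = ("FontFile", "FontFile2", "FontFile3")


def classify_font(font: dict) -> str:
    """Classify one font dictionary as 'subset', 'embedded', or 'referenced'."""
    descriptor = font.get("FontDescriptor") or {}
    has_program = any(key in descriptor for key in FONT_FILE_KEYS)
    if not has_program:
        # Only the name is present; the viewer must substitute a system font.
        return "referenced"
    if SUBSET_PREFIX.match(font.get("BaseFont", "")):
        return "subset"
    return "embedded"
```

This simplification ignores Type 3 fonts (whose glyphs live in content streams) and composite fonts (whose descriptor sits on the descendant font). Detecting corrupted font data would additionally require parsing the font program itself.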

According to the Digital Preservation Coalition, font issues account for 43% of all PDF/A validation failures, making them the single most important factor in long-term document integrity.

3. Image Quality

The checker reports each embedded image's resolution (DPI), compression method, and color space. Images below 150 DPI may appear pixelated when printed; images above 600 DPI add unnecessary file size. The report helps you decide whether compression would benefit the document.
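An image's effective resolution depends on how large it is placed on the page, not only on its pixel dimensions. As a rough sketch (the function names are illustrative, and the 150/600 thresholds are simply the ones quoted above), the calculation divides the pixel width by the placed width in inches, using the PDF default of 72 points per inch:

```python
POINTS_PER_INCH = 72  # PDF user space defaults to 72 units per inch


def effective_dpi(pixel_width: int, placed_width_pt: float) -> float:
    """Pixels per inch of an image as it is actually placed on the page."""
    return pixel_width / (placed_width_pt / POINTS_PER_INCH)


def rate_resolution(dpi: float, low: float = 150, high: float = 600) -> str:
    """Label a resolution using the thresholds mentioned in the text."""
    if dpi < low:
        return "low"
    if dpi > high:
        return "oversized"
    return "ok"
```

For example, an image 1800 pixels wide placed across 432 points (6 inches) works out to 300 DPI, while the same image placed across a full 864 points drops to 150 DPI.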

4. Metadata Completeness

Metadata includes author name, creation software, dates, and keywords. For compliance contexts (legal, regulatory, archival), complete metadata is often mandatory. Missing or incorrect metadata can cause documents to be rejected by automated filing systems.
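A completeness check of this kind can be sketched as follows. The list of required fields is a hypothetical policy, not a rule from any particular filing system, and the metadata is assumed to be available as a plain `dict` keyed by the standard document information names. The date helper handles only the common full-length `D:YYYYMMDDHHmmSS` form and ignores any time zone suffix.

```python
from datetime import datetime

# Hypothetical policy: adjust to whatever the receiving system demands.
REQUIRED_FIELDS = ("Title", "Author", "CreationDate", "Producer")


def missing_metadata(info: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or blank."""
    return [name for name in required if not str(info.get(name, "")).strip()]


def parse_pdf_date(value: str) -> datetime:
    """Parse the leading 'D:YYYYMMDDHHmmSS' part of a PDF date string."""
    digits = value[2:] if value.startswith("D:") else value
    return datetime.strptime(digits[:14], "%Y%m%d%H%M%S")
```

Running `missing_metadata` before submission gives a simple list of fields to fill in at the source application.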

5. Security Analysis

The checker identifies encryption algorithms (40-bit RC4, 128-bit RC4, AES-128, AES-256), permission flags (print, copy, edit restrictions), and digital signature status. PDFs with legacy 40-bit encryption are flagged as insecure.
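To illustrate where these labels come from: a PDF's encryption dictionary carries a version number `V`, a key `Length`, and a crypt filter method, while the permission flags are packed into a signed integer `P`. The sketch below assumes those values have already been extracted into a plain `dict`. The flat `CFM` key is a simplification for illustration (in a real file the method sits inside a nested crypt filter dictionary), and both function names are hypothetical.

```python
def describe_encryption(encrypt: dict) -> str:
    """Map the encryption dictionary's V, Length, and CFM values to a label."""
    v = encrypt.get("V", 0)
    length = encrypt.get("Length", 40)  # the key length defaults to 40 bits
    cfm = encrypt.get("CFM")            # crypt filter method, flattened here
    if v == 1:
        return "40-bit RC4"
    if v == 2:
        return f"{length}-bit RC4"
    if v == 4:
        return "AES-128" if cfm == "AESV2" else "128-bit RC4"
    if v == 5:
        return "AES-256"
    return "unknown"


# Bit positions (1-based) of four common flags in the P value.
PERMISSION_BITS = {"print": 3, "modify": 4, "copy": 5, "annotate": 6}


def decode_permissions(p: int) -> dict:
    """Decode P; a set bit means the action is allowed."""
    return {name: bool(p & (1 << (bit - 1))) for name, bit in PERMISSION_BITS.items()}
```

A value of `P = -4` has every permission bit set, so all four actions are allowed; clearing bit 3 (giving `-8`) disables printing only.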

Common PDF Problems and How to Fix Them

Here are the most frequently detected issues and their solutions:

Problem: Missing or non-embedded fonts
• *Symptom:* Text appears in a different font on some systems, or characters display as squares/boxes
• *Cause:* The PDF creator didn't embed the fonts, relying on system fonts that may not be available
• *Fix:* Re-export the document from the source application with font embedding enabled. In Microsoft Word: File → Options → Save → "Embed fonts in the file"
• *Prevention:* Always embed fonts when creating PDFs for distribution. See our Font Embedding guide.

Problem: High-resolution images causing large file size
• *Symptom:* PDF is 20+ MB despite being only a few pages
• *Cause:* Embedded images at 600+ DPI, far more than needed for screen viewing or standard printing
• *Fix:* Use AuraPDF's Compress PDF tool to downsample images to the appropriate DPI for your use case
• *More info:* Our DPI and Resolution guide explains optimal settings

Problem: Corrupted cross-reference table
• *Symptom:* File opens with errors in some viewers, pages are missing, or content appears scrambled
• *Cause:* Incomplete download, interrupted save operation, or bug in the creating software
• *Fix:* Try opening in Adobe Acrobat (which can rebuild damaged cross-references) and re-saving. If the source file is available, re-export the PDF
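For a scriptable check without Acrobat, the free command-line tool QPDF offers a `--check` mode. The sketch below wraps it from Python. It assumes `qpdf` is installed and on the PATH, and the helper names are hypothetical. The exit-code mapping follows QPDF's documented convention (0 for clean, 2 for errors, 3 for warnings only); confirm it against the version you have installed.

```python
import shutil
import subprocess


def qpdf_check_command(path: str) -> list:
    """Build the command line for a structural check of one file."""
    return ["qpdf", "--check", path]


def interpret_exit_code(code: int) -> str:
    """Translate qpdf's exit status into a short label."""
    return {0: "ok", 2: "errors", 3: "warnings"}.get(code, "unknown")


def check_pdf(path: str) -> str:
    """Run the check if qpdf is available; otherwise report that it is missing."""
    if shutil.which("qpdf") is None:
        return "qpdf not installed"
    result = subprocess.run(qpdf_check_command(path), capture_output=True, text=True)
    return interpret_exit_code(result.returncode)
```

When a file reports errors, rewriting it with `qpdf damaged.pdf repaired.pdf` is often enough to produce a fresh cross-reference table, though recovery is not guaranteed for badly truncated files.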

Problem: Wrong page orientation
• *Symptom:* Pages display sideways or upside down
• *Cause:* Incorrect Rotate flag in the page dictionary, often caused by scanners
• *Fix:* Use AuraPDF's Rotate PDF tool to correct the orientation
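The Rotate entry is a clockwise angle in degrees and must be a multiple of 90, so correcting it is simple modular arithmetic. A minimal sketch, with a hypothetical helper name:

```python
def corrected_rotation(current: int, turn_clockwise: int) -> int:
    """Return a new Rotate value after turning the page clockwise.

    Both arguments must be multiples of 90; a negative turn rotates
    counterclockwise.
    """
    if current % 90 or turn_clockwise % 90:
        raise ValueError("Rotate must be a multiple of 90 degrees")
    return (current + turn_clockwise) % 360
```

A page scanned sideways with a Rotate value of 90 returns to upright with `corrected_rotation(90, 270)`, which yields 0.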

Problem: Unsearchable text (image-only PDF)
• *Symptom:* Text cannot be selected, copied, or searched; the PDF is effectively a picture
• *Cause:* The PDF was created by scanning without OCR, or text was converted to outlines
• *Fix:* Apply OCR (Optical Character Recognition) using Adobe Acrobat, ABBYY FineReader, or Tesseract

Problem: Legacy encryption (40-bit RC4)
• *Symptom:* Health checker flags security as "weak"
• *Cause:* PDF was created with outdated encryption that can be broken in seconds
• *Fix:* Remove the old encryption using Unlock PDF, then re-protect with AES-256 using Protect PDF

Professional PDF Validation Tools

Beyond AuraPDF's Health Checker, several specialized tools exist for deep PDF validation:

veraPDF (Free, open source)

The gold standard for PDF/A compliance validation, developed by the Open Preservation Foundation with EU funding. veraPDF checks documents against all PDF/A conformance levels (1a, 1b, 2a, 2b, 2u, 3a, 3b, 3u, 4, 4e, 4f) and generates detailed machine-readable reports. Used by the Library of Congress, British Library, and National Archives of Australia for digital preservation validation.

Adobe Acrobat Pro Preflight ($22.99/month)

The most comprehensive commercial validation tool. Preflight checks over 500 individual conditions including color accuracy, font embedding, image resolution, transparency flattening, print compatibility, and PDF/X compliance for commercial printing. It can also automatically fix many detected issues.

PAC (PDF Accessibility Checker) (Free)

Specializes in accessibility validation against the PDF/UA standard (ISO 14289). Developed by the Swiss foundation Access for All, PAC checks tag structure, reading order, alt text, and contrast, all essential for organizations required to meet Section 508 or WCAG compliance.

JHOVE (Free, open source)

Originally developed by JSTOR and Harvard University Library (the name stands for JSTOR/Harvard Object Validation Environment) and now maintained by the Open Preservation Foundation, JHOVE validates PDF file format conformance at a low level, verifying that the binary structure complies with the specification. It's less user-friendly but more thorough for structural validation than most alternatives.

According to the Open Preservation Foundation, organizations that implement automated PDF validation in their document workflows reduce document-related support tickets by 52% and processing failures by 67%.

When to Validate Your PDFs

Not every PDF needs a health check, but certain scenarios make validation essential:

Before archiving

Documents entering long-term storage (legal records, financial statements, medical records) should be validated for PDF/A compliance and font embedding. A font that renders correctly today may not be available in 10 years.

Before printing

Commercial print shops frequently reject PDFs that fail preflight checks. Missing fonts, wrong color spaces (RGB vs CMYK), insufficient image resolution (below 300 DPI), and transparency issues cause the most rejections. According to Printing Industries of America, 22% of print jobs are delayed by PDF quality issues that could have been caught with preflight validation.

Before submission to government or legal systems

Court e-filing systems (CM/ECF), government portals, and regulatory submission platforms often run automated validation. A PDF that fails their checks is rejected without review. Pre-validating with AuraPDF's Health Checker catches common rejection causes before submission.

After merging or editing

Merging PDFs from multiple sources can introduce font conflicts, duplicate resources, and structural inconsistencies. A quick health check after merging confirms the output is clean.

After receiving from external sources

PDFs received from clients, vendors, or unknown sources may contain structural issues, legacy encryption, or compatibility problems. Validating before processing prevents downstream failures.

During migration

Organizations migrating document archives to new systems should validate all PDFs so that issues are identified and remediated before any documents become inaccessible. The Dutch National Archives reported that 14% of PDFs in their collection had structural issues that required remediation during a 2022 migration project.

AuraPDF's PDF Health Checker provides instant, free analysis — making it practical to validate every document before sharing, archiving, or submitting.

Frequently Asked Questions

What does a PDF health check do?
A PDF health check analyzes the internal structure of your document — font embedding, image quality, metadata, security settings, and structural integrity. It identifies potential issues like missing fonts, oversized images, weak encryption, and corrupted cross-references that could cause problems during printing, archiving, or processing.
Is the PDF Health Checker free?
Yes. AuraPDF's PDF Health Checker is completely free. Upload your PDF and receive an instant analysis of its structure, fonts, images, metadata, and security status. No account or subscription required.
How do I fix a corrupted PDF?
Try opening the PDF in Adobe Acrobat, which has built-in repair capabilities — it can rebuild damaged cross-reference tables and recover content from corrupted files. If Acrobat cannot repair it, tools like QPDF (free, command-line) can attempt structural recovery. If the source file exists, re-exporting the PDF is the most reliable fix.
Why do my fonts look different on other computers?
The PDF likely has referenced (not embedded) fonts. When a font isn't embedded, each viewer substitutes a different system font, changing the appearance. Re-create the PDF with font embedding enabled, or use Adobe Acrobat Pro's Preflight to embed missing fonts after the fact.
How do I check if a PDF is PDF/A compliant?
Use veraPDF (free, open source) for definitive PDF/A validation — it checks all conformance levels and produces detailed reports. Adobe Acrobat Pro's Preflight also validates PDF/A compliance. AuraPDF's Health Checker reports the PDF version and basic structural compliance, which can indicate compatibility issues.
Should I validate every PDF I create?
For casual use (personal documents, informal sharing), validation isn't necessary. For professional contexts — legal filings, commercial printing, regulatory submissions, long-term archiving — validation prevents costly rejections and ensures the document works correctly across all systems. A quick health check takes seconds and can save hours of troubleshooting.


Written by the AuraPDF Team

The AuraPDF team builds free, secure PDF tools used by thousands of people worldwide. Our guides combine hands-on expertise with technical depth to help you work with PDFs more effectively.
