
📄 PDF Comparator

User Guide - FileForge Suite

📋 Overview

PDF Comparator is a visual comparison tool for PDF documents. It highlights text differences in a customizable color, making it easy to spot changes between document versions. Typical uses include contract reviews, document audits, version control, and compliance verification.

Key Features

  • Visual Text Comparison - Highlights added, deleted, and modified text
  • Side-by-Side View - View both PDFs simultaneously with synchronized scrolling
  • Customizable Highlight Color - Choose your preferred color for highlighting differences
  • Page-by-Page Navigation - Jump to specific pages or navigate sequentially
  • Difference Summary - See the total count of changes across the entire document
  • Export Reports - Generate comparison reports as PDFs or text files
  • Built-in OCR - Automatically detect and process scanned/image-based PDFs
  • Password Protection Support - Open and compare password-protected PDFs

🚀 Getting Started

Quick Start in 4 Steps

Step 1: Load your "original" (baseline) PDF using Browse or drag & drop
Step 2: Load your "modified" (comparison) PDF
Step 3: Click Compare to run the comparison
Step 4: Review highlighted differences in side-by-side view

First Time User? Start with two similar PDFs that you know have a few differences. This will help you understand how the highlighting works before tackling complex documents!

🖥️ Interface Guide

Main Areas

Area                    Purpose
File Selection (Top)    Browse or drag & drop Original PDF (left) and Modified PDF (right)
Action Buttons          Compare, Reset, Export Report
Comparison View         Side-by-side PDF viewers with synchronized scrolling
Page Navigation         Current page display and navigation buttons
Difference Counter      Total number of differences found
Status Bar              Progress indicator and status messages

Navigation Controls

◀️ Previous Page: Go to previous page (keyboard: Left Arrow)
▶️ Next Page: Go to next page (keyboard: Right Arrow)
Page Selector: Jump directly to any page number

⚙️ How Comparison Works

Text Extraction & Analysis

PDF Comparator extracts text from both PDFs page by page, then performs intelligent comparison to identify:

  • Added Text - Content present in modified PDF but not in original
  • Deleted Text - Content present in original PDF but not in modified
  • Modified Text - Content that changed between versions
  • Unchanged Text - Content that remains the same
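
The four categories above map naturally onto a word-level diff. The following is a minimal sketch, assuming the comparison works on whitespace-separated words. PDF Comparator's actual algorithm is not documented in this guide, and `classify_changes` is a hypothetical helper built on Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def classify_changes(original: str, modified: str) -> list[tuple[str, str, str]]:
    """Return (category, original_text, modified_text) for each run of words.

    Hypothetical helper: it illustrates the four categories in this guide
    and is not PDF Comparator's own code.
    """
    old_words = original.split()
    new_words = modified.split()
    # Map difflib's opcode tags onto the guide's category names.
    labels = {"equal": "unchanged", "insert": "added",
              "delete": "deleted", "replace": "modified"}
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        changes.append((labels[tag],
                        " ".join(old_words[i1:i2]),
                        " ".join(new_words[j1:j2])))
    return changes


for change in classify_changes("Payment is due in 30 days",
                               "Payment is due within 45 days of invoice"):
    print(change)
```

In this example the sketch reports "in 30" as modified (it became "within 45"), "of invoice" as added, and the remaining words as unchanged.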

Page-by-Page Comparison

Comparison happens page-by-page, matching page 1 to page 1, page 2 to page 2, and so on. Documents should have the same page count for best results.
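
To see why matching page counts matter, here is a minimal sketch of strict positional pairing. It assumes each page has already been reduced to a text string; `pair_pages` is a hypothetical name used only for illustration.

```python
from itertools import zip_longest


def pair_pages(original_pages: list[str], modified_pages: list[str]):
    """Pair pages strictly by position: page 1 with page 1, and so on.

    A page with no counterpart in the other document is paired with None.
    """
    pairs = []
    for number, (old, new) in enumerate(
            zip_longest(original_pages, modified_pages), start=1):
        pairs.append((number, old, new))
    return pairs


# One extra page at position 2 shifts every later page out of alignment.
original = ["Cover", "Terms", "Signatures"]
modified = ["Cover", "New appendix", "Terms", "Signatures"]
for number, old, new in pair_pages(original, modified):
    status = "same" if old == new else "differs"
    print(f"Page {number}: {old!r} vs {new!r} -> {status}")
```

Only page 1 matches here, even though three of the four pages are unchanged in content. A single inserted page is enough to make every following page look different.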

Important: PDF Comparator compares text content only. Images, formatting, fonts, and layout changes are not detected.

🎨 Text Highlighting

PDF Comparator highlights differences using a customizable highlight color (default: yellow). You can change the color using the color picker button in the toolbar.

How Highlighting Works

Panel               Highlighted Text Means    Description
Left Panel Only     Deleted Text              Text exists in the original but was removed
Right Panel Only    Added Text                New text added in the modified version
Both Panels         Modified Text             Text changed between versions
No Highlight        Unchanged Text            Same in both versions

Customizing the Highlight Color

  1. Click the color picker button in the toolbar (shows current color)
  2. Choose your preferred highlight color from the color dialog
  3. All highlights will update to use your selected color

How to Read Highlights

Original (Left): "The contract expires on December 31, 2024."

Modified (Right): "The contract expires on June 30, 2025."

Result: Both dates are highlighted - the old date on the left panel shows what was replaced, the new date on the right panel shows what replaced it.
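
The panel rule can be sketched in a few lines. This is an illustration under the assumption of a word-level diff, not PDF Comparator's actual implementation; `panel_highlights` is a hypothetical helper that uses Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def panel_highlights(original: str, modified: str) -> tuple[list[str], list[str]]:
    """Return (left_panel_words, right_panel_words) that would be highlighted."""
    old_words, new_words = original.split(), modified.split()
    left, right = [], []
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            left.extend(old_words[i1:i2])    # removed or replaced: left panel
        if tag in ("insert", "replace"):
            right.extend(new_words[j1:j2])   # added or replacing: right panel
    return left, right


left, right = panel_highlights(
    "The contract expires on December 31, 2024.",
    "The contract expires on June 30, 2025.",
)
print(left)   # ['December', '31,', '2024.']
print(right)  # ['June', '30,', '2025.']
```

Punctuation stays attached to the words because the sketch splits on whitespace only.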

Tip: When reviewing long documents, scan for colored highlights - they immediately show where changes occurred! Use a bright color like yellow or cyan for maximum visibility.

📷 OCR for Scanned Documents

PDF Comparator includes built-in OCR (Optical Character Recognition) to compare scanned or image-based PDFs that don't contain selectable text.

How It Works

  1. When you load a PDF, PDF Comparator checks if it contains extractable text
  2. If a scanned/image PDF is detected, you'll be prompted to enable OCR
  3. OCR uses Windows 10/11's built-in OCR engine to extract text from images
  4. Extracted text is then compared like any normal PDF

OCR Tips: For best results, use clear, high-resolution scans. Low-quality or skewed documents may produce less accurate text extraction.
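
The detection in step 1 is not specified in this guide. One plausible approach, sketched below, is to treat a document as scanned when most pages yield almost no extractable text. The function name `needs_ocr` and the 20-character threshold are assumptions made for illustration.

```python
def needs_ocr(page_texts: list[str], min_chars_per_page: int = 20) -> bool:
    """Guess whether a PDF is scanned, given the text extracted from each page.

    Hypothetical heuristic: the document counts as image-based when more than
    half of its pages yield fewer than min_chars_per_page characters.
    """
    if not page_texts:
        return False  # nothing to analyse, so nothing to OCR
    sparse_pages = sum(
        1 for text in page_texts if len(text.strip()) < min_chars_per_page
    )
    return sparse_pages > len(page_texts) / 2


scanned = ["", "  ", ""]
digital = ["This agreement is made between the parties named below.",
           "Clause 1: The supplier shall deliver the goods on time."]
print(needs_ocr(scanned))
print(needs_ocr(digital))
```

A threshold above zero allows for scanned pages that carry a stray text layer, such as a stamped page number.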

OCR Requirements

  • Windows 10/11 - OCR uses the built-in Windows OCR API
  • Language Pack - English OCR is included by default; other languages require Windows language packs

Note: OCR accuracy depends on document quality. Handwritten text, unusual fonts, or poor scans may not be recognized accurately.

📤 Export & Reports

Export Options

Generate comparison reports in multiple formats:

PDF Report

Side-by-side comparison with highlighted differences. Perfect for sharing with stakeholders or archiving.

  • Preserves all color highlighting
  • Shows both original and modified side by side
  • Includes page numbers and navigation
  • Professional presentation format

Text Report

Plain text format listing all differences. Great for version control systems or detailed review.

  • Lists changes page by page
  • Shows before/after text
  • Searchable and lightweight
  • Easy to parse programmatically

What's Included in Reports

  • Document Information - File names, comparison date, page counts
  • Summary Statistics - Total differences found
  • Page-by-Page Changes - All differences organized by page
  • Visual Highlighting - Color coding preserved (PDF reports)
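
As an illustration of how these pieces could fit together in a plain-text report, here is a minimal sketch. The layout, the function name `build_text_report`, and the sample file names are all hypothetical; the exact format PDF Comparator writes is not described in this guide.

```python
from datetime import date


def build_text_report(original_name, modified_name, report_date, changes_by_page):
    """Build a plain-text report. The layout is hypothetical, not FileForge's format.

    changes_by_page maps a page number to a list of (before, after) text pairs.
    """
    total = sum(len(changes) for changes in changes_by_page.values())
    lines = [
        f"Original: {original_name}",
        f"Modified: {modified_name}",
        f"Date: {report_date.isoformat()}",
        f"Total differences: {total}",
    ]
    for page in sorted(changes_by_page):
        lines.append(f"Page {page}")
        for before, after in changes_by_page[page]:
            lines.append(f"  - before: {before!r}")
            lines.append(f"    after:  {after!r}")
    return "\n".join(lines)


print(build_text_report(
    "contract_v1.pdf", "contract_v2.pdf", date(2025, 1, 15),
    {2: [("December 31, 2024", "June 30, 2025")]},
))
```

A line-oriented layout like this one keeps the report searchable and simple to process with ordinary text tools.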

🔄 Common Workflows

Workflow 1: Contract Review

  1. Load original contract as Original PDF
  2. Load revised contract as Modified PDF
  3. Click Compare
  4. Review all highlighted text (left panel = deleted, right panel = added)
  5. Pay special attention to numerical changes (dates, prices, quantities)
  6. Export PDF report for stakeholder review

Workflow 2: Document Version Control

  1. Load previous version as Original PDF
  2. Load current version as Modified PDF
  3. Click Compare to identify the changes
  4. Export text report for version history
  5. Archive both PDFs and comparison report together

Workflow 3: Compliance Verification

  1. Load approved/certified version as Original PDF
  2. Load current/submitted version as Modified PDF
  3. Click Compare to check for unauthorized changes
  4. Check the difference counter - it should be zero if the text is identical
  5. Export report as proof of verification

Workflow 4: Multi-Page Document Audit

  1. Load both document versions
  2. Compare entire document
  3. Use page navigation to review each page systematically
  4. Note pages with significant changes
  5. Export report highlighting changed pages

Workflow 5: Comparing Scanned Documents

  1. Load a scanned PDF - you'll be prompted about OCR
  2. Click Yes to enable OCR text extraction
  3. Wait for OCR processing (may take a moment for large documents)
  4. Load the second scanned PDF and enable OCR when prompted
  5. Click Compare to analyze the extracted text
  6. Review differences - OCR accuracy may vary based on scan quality

Workflow 6: Comparing Password-Protected PDFs

  1. Load the first password-protected PDF
  2. Enter the password when prompted
  3. Load the second PDF (password-protected or not)
  4. Enter its password if prompted
  5. Click Compare to analyze the documents
  6. Review differences as normal

💡 Tips & Best Practices

Document Preparation

  • Same page count - Documents should have matching page counts
  • Text-based PDFs - Works best with PDFs containing selectable text
  • Clean formatting - Simple layouts compare more reliably
  • Scanned documents - Enable OCR when prompted for image-based PDFs

Review Strategy

  • Start at page 1 - Review systematically to avoid missing changes
  • Focus on highlights - Don't read unchanged text
  • Check numbers carefully - Dates, prices, quantities are critical
  • Verify deletions - Highlights only in the left panel indicate removed content

Understanding Results

0 differences found: Documents are identical (text-wise)

Few differences: Minor revisions or corrections

Many differences: Substantial changes or different documents

Performance Considerations

  • Large documents (100+ pages) may take longer to compare
  • Complex layouts may affect comparison accuracy
  • Multiple columns or text boxes can reorder during extraction
  • Tables may not always align perfectly

Common Mistakes to Avoid

Comparing wrong versions: Always verify you've loaded the correct files before comparing. Check filenames carefully!
Ignoring page count mismatch: If documents have different page counts, pages won't align correctly. Add or remove pages so the counts match before comparing.
Expecting visual layout comparison: PDF Comparator shows text differences only, not layout, fonts, or formatting changes.

🔧 Troubleshooting

Q: No differences shown but I know files are different

A: Check if differences are visual only (fonts, colors, layout). PDF Comparator detects text changes only. If text is identical, no differences will be shown.

Q: Too many differences showing - documents look similar

A: The PDFs may have been generated by different programs, which can change the order in which text is extracted. The same visible content then produces many reported differences.
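
If you can extract the text of a page yourself, a quick check shows whether two versions differ only in word order. This is a diagnostic sketch that runs outside PDF Comparator; `differs_only_in_order` is a hypothetical helper, not a feature of the tool.

```python
from collections import Counter


def differs_only_in_order(original: str, modified: str) -> bool:
    """True when both texts contain the same words in a different sequence."""
    old_words, new_words = original.split(), modified.split()
    return old_words != new_words and Counter(old_words) == Counter(new_words)


# The same two-column page, read row by row and then column by column.
row_first = "Name Price Widget 10 Gadget 20"
column_first = "Name Widget Gadget Price 10 20"
print(differs_only_in_order(row_first, column_first))
```

A result of True suggests the reported differences come from extraction order and not from edits to the content.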

Q: Text appears garbled or incorrectly extracted

A: The PDF may use embedded fonts or special encoding. Try re-saving it in Adobe Reader or a similar tool before comparing.

Q: Comparison is very slow

A: Large documents with many pages take time. Be patient during comparison. Close other applications to free up memory.

Q: Pages aren't aligning correctly

A: Documents likely have different page counts. PDF Comparator matches page 1 to page 1, page 2 to page 2, etc. If one document has extra pages, alignment fails.

Q: Some text isn't highlighted as different

A: Minor spacing or whitespace differences may not trigger highlighting. Also, identical text in different locations may not be detected.
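
One way a comparison can end up ignoring spacing is by normalizing whitespace before comparing. The sketch below shows the idea; whether PDF Comparator does exactly this is an assumption, and `normalize_whitespace` is a hypothetical helper.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse every run of spaces, tabs, and newlines into a single space."""
    return " ".join(text.split())


a = "Total  due:\t$1,200"
b = "Total due: $1,200"
print(a == b)                                              # False
print(normalize_whitespace(a) == normalize_whitespace(b))  # True
```

After normalization the two strings are equal, so a comparison built this way would report no difference between them.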

Q: Export fails or report is corrupted

A: Ensure you have write permissions to the export location. Try exporting to a different folder like Desktop.

Q: Can I compare scanned PDFs?

A: Yes! PDF Comparator automatically detects scanned/image-based PDFs and prompts you to enable OCR. The built-in OCR uses Windows 10/11's OCR engine to extract text from images. For best results, ensure your scanned documents are clear and readable.

Q: Can I compare password-protected PDFs?

A: Yes! When you load a password-protected PDF, you'll be prompted to enter the password. Once unlocked, you can compare it like any other PDF. The two PDFs can have different passwords.

⌨️ Keyboard Shortcuts

Action                 Shortcut
Previous Page          Left Arrow
Next Page              Right Arrow
Reset Comparison       Ctrl + R
Return to Main Menu    Ctrl + M
Exit Application       Alt + F4

⚠️ Limitations

What PDF Comparator CAN Do:

  • ✅ Compare text content
  • ✅ Highlight additions, deletions, modifications
  • ✅ Generate comparison reports
  • ✅ Page-by-page navigation
  • ✅ Export results
  • ✅ OCR scanned/image-based PDFs (Windows 10/11)
  • ✅ Open and compare password-protected PDFs

What PDF Comparator CANNOT Do:

  • ❌ Detect image differences
  • ❌ Compare formatting or fonts
  • ❌ Detect layout changes
  • ❌ Detect metadata changes