
Strip metadata from a Word doc or PDF before you send it — without uploading the file

Before you email a draft to opposing counsel, file it on ECF, or send a contract to a client: a 4-minute workflow that removes the author name, the tracked changes, the comments, the prior-revision history, and the document GUID. Word and PDF both covered. Mac and Windows.

Why this matters (a 60-second briefing)

⚠ Litigation hold — read this before you scrub

Before you start — do NOT scrub metadata that is under a preservation duty

If the document is subject to a litigation hold, a records-retention rule, a discovery obligation, a regulatory preservation duty, or a contractual preservation clause, removing metadata may be sanctionable spoliation. This article is about outgoing-mail hygiene on documents you produced and are now sending out — not about modifying documents you have a duty to preserve. When in doubt: scrub a copy, retain the original, document the chain. If your matter has a litigation hold in place, check with your records manager or e-discovery counsel before running any of the steps below.

The three classes of hidden content (and which tool handles each)

Most articles in this space treat 'metadata' as one undifferentiated category. It is not. There are three distinct classes of hidden content in a Word doc or PDF, and each is handled by a different tool. Naming them separately is the difference between a clean scrub and the 'I cleaned it, why is it still there?' failure mode.

1. Document properties

What it is: Author, Last Saved By, Company, Manager, Created date, Last Modified date, Last Printed date, Document GUID, Title (which often differs from the filename), Subject, Keywords, Comments (the property field, not review comments), Hyperlink Base. The software sets most of these automatically, often without the user noticing.

Which tool handles it: Word: Document Inspector → 'Document Properties and Personal Information'. PDF: FileHop PDF → Compress with 'Remove metadata' enabled (strips the Info dictionary and the XMP metadata block referenced from the document catalog).
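To see what this class looks like on disk: a .docx is a ZIP package, and Word normally stores the properties listed above in two XML parts, docProps/core.xml and docProps/app.xml. The following is a minimal Python sketch, not part of FileHop or Word, that lists whatever non-empty properties a file still carries. The function name `read_docx_properties` is a hypothetical helper chosen for illustration, and custom properties (docProps/custom.xml) are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard property parts inside a .docx package.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_docx_properties(source):
    """Return {property_name: text} for non-empty top-level properties.

    `source` is a path or a binary file-like object holding a .docx.
    Hypothetical helper for inspection only; it never modifies the file.
    """
    found = {}
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(package.read(part))
            for element in root:
                # Tags look like "{namespace-uri}creator"; keep the local name.
                local_name = element.tag.rsplit("}", 1)[-1]
                text = (element.text or "").strip()
                if text:
                    found[local_name] = text
    return found
```

Run it against a copy before and after the Document Inspector pass. Names such as creator, lastModifiedBy, Company, and Manager are the ones to watch; entries such as page counts or the application name may remain and are usually not sensitive.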

2. Tracked changes and comments

What it is: The riskiest class. Tracked changes and comments can reveal negotiating strategy, internal disagreements, and the actual edit history a partner wanted hidden. Accepting all changes and turning off Track Changes is NOT enough: until the underlying revision records are removed, residual markup can resurface if the receiver toggles Track Changes back on.

Which tool handles it: Word ONLY: Document Inspector → 'Comments, Revisions, Versions, and Annotations'. There is no fix-this-after-the-fact PDF tool; tracked changes have to be cleared in the .docx before the PDF export, otherwise they get baked into the page content as flattened bubbles.
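As an independent cross-check on a scrubbed copy, the revision and comment markup can be counted directly in the package XML. This is a hedged sketch, not a replacement for Document Inspector: it assumes the common (transitional) WordprocessingML namespace, looks only at a handful of revision elements (insertions, deletions, moves, and formatting changes), and the name `count_revision_residue` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Transitional WordprocessingML namespace (assumed; strict-mode files differ).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Elements that record tracked edits: insertions, deletions, moves,
# and run/paragraph formatting changes.
REVISION_TAGS = {
    W + name
    for name in ("ins", "del", "moveFrom", "moveTo", "rPrChange", "pPrChange")
}


def count_revision_residue(source):
    """Count tracked-change and comment elements left in a .docx.

    `source` is a path or binary file-like object. Read-only inspection.
    """
    counts = {"revisions": 0, "comments": 0}
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            # Body, headers, footers, footnotes, and comments live under word/.
            if not (name.startswith("word/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(package.read(name))
            for element in root.iter():
                if element.tag in REVISION_TAGS:
                    counts["revisions"] += 1
                elif element.tag == W + "comment":
                    counts["comments"] += 1
    return counts
```

Both counts should be zero on the file you are about to convert. A non-zero count means the Word steps need another pass before the PDF export.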

3. Embedded-image EXIF (and other media metadata)

What it is: If you inserted a photo, a screenshot, or a phone-camera image, the EXIF can include the camera model, the exact timestamp, the GPS coordinates of where the photo was taken, and lens settings.

Which tool handles it: Word: Document Inspector handles document-level properties but does NOT walk every embedded image. Better practice: strip EXIF on each image file BEFORE inserting it. FileHop handles standalone image EXIF stripping in its image tools. If the image is already embedded in a PDF, the EXIF survives the PDF metadata strip — you would need to re-export the image, strip its EXIF, and re-insert.
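To find out whether images already inserted in a .docx still carry EXIF, the embedded media files can be scanned for the EXIF identifier. The sketch below is a coarse heuristic under stated assumptions: it only looks in word/media/, it only recognises the JPEG-style `Exif` marker (PNG, HEIC, and other formats store metadata differently), and `media_with_exif` is a hypothetical name. It reports; it does not strip anything.

```python
import zipfile

# Identifier that opens the EXIF payload inside a JPEG APP1 segment.
EXIF_MARKER = b"Exif\x00\x00"

# EXIF normally sits near the start of the image, so a bounded read is enough.
SCAN_BYTES = 64 * 1024


def media_with_exif(source):
    """Return the names of embedded media files that appear to contain EXIF.

    `source` is a path or binary file-like object holding a .docx.
    """
    flagged = []
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            if not name.startswith("word/media/"):
                continue
            head = package.read(name)[:SCAN_BYTES]
            if EXIF_MARKER in head:
                flagged.append(name)
    return flagged
```

Any file it lists is a candidate for the re-export, strip, and re-insert routine described above. An empty list is reassuring for JPEGs only, given the format limits noted.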

Workflow A — Word document you are about to send

If the file you are sending started as a .docx, run this four-step workflow. The first two steps happen inside Microsoft Word; the last two happen inside FileHop. Total time: about three minutes.

    Step 1: Accept (or reject) every tracked change and delete every comment — in Microsoft Word

    Open the .docx in Word. Go to Review → Changes → Accept → Accept All Changes. Then go to Review → Comments → Delete → Delete All Comments in Document. Then turn Track Changes off (Review → Track Changes, toggle off). Save the file. This step has to happen in Word, not in FileHop, because FileHop does not edit .docx track-changes or comment markup. Do not skip the 'turn Track Changes off' part: if you leave it on, every edit made after this point, whether yours or the recipient's, is recorded as a fresh tracked change.

    Step 2: Run Word's Document Inspector — in Microsoft Word

    Still inside Word: File → Info → Check for Issues → Inspect Document. Check every category (Document Properties and Personal Information; Comments, Revisions, Versions, and Annotations; Custom XML Data; Headers, Footers, and Watermarks; Hidden Text; Invisible Content). Click Inspect. For every category that returns results, click Remove All. Save the file again. Important: run Document Inspector on a copy, not your original — the removal is not always reversible. This step is the canonical Microsoft-recommended .docx scrub procedure; it is the right tool for the .docx file format. FileHop is not a Word add-in and does not replicate this step.

    Step 3: Convert the cleaned .docx to PDF — in the FileHop desktop app (locally on your computer)

    Open FileHop and use the Word-to-PDF conversion. Drag in the cleaned .docx and convert. The conversion runs locally on your computer — the file does not transit our servers. This matters because the most common 'oh no' workflow is: lawyer cleans the .docx in Word, then uploads it to a free online Word-to-PDF converter to get the PDF, which re-introduces a privacy problem the Word scrub was designed to solve. Convert locally. The output is a PDF that has lost the .docx-specific structures (track-changes markup, custom XML, document properties) — but it may still carry PDF-level metadata that the converter wrote in (Producer, Creator, CreationDate). That is what Step 4 cleans up.

    Step 4: Strip the PDF's own metadata — in FileHop

    In FileHop, choose PDF → Compress, open the file from Step 3, and turn on the 'Remove metadata' option. Pick any quality level (for a born-Word PDF the file is usually already small; the compress step is mostly there to host the metadata-removal pass). Save. This strips the PDF's Info dictionary (Author, Producer, Title, Subject, Keywords, Creator, CreationDate, ModDate) and the XMP metadata block from the document catalog. The file is now clean of both .docx-side metadata (Step 2 handled that) and PDF-side metadata (Step 4 handled that).

Workflow B — PDF you are about to send (no Word ancestor)

If the file is already a PDF and you did not produce it from a .docx (e.g., a scanned exhibit, a downloaded form, a PDF the client sent you), the workflow is shorter: there is no Word phase, just the PDF metadata strip and verification.

    Step 1: Strip the PDF's metadata in FileHop

    Open FileHop, choose PDF → Compress, drag in the file, and turn on 'Remove metadata'. Save. This removes the Info dictionary and XMP — Author, Producer, Title, Subject, Keywords, Creator, CreationDate, ModDate. If the PDF was produced by Adobe Acrobat or Word and never had its metadata cleaned, that is where the leaks live.

    Step 2: If the PDF carries leftover comment/annotation residue, also use FileHop's aggressive compression

    If your source PDF has comment bubbles, sticky-note annotations, or highlight markup that you do NOT want to send, FileHop's aggressive compression preset removes non-essential annotations (comments, highlights, stamps, ink) while preserving links, form fields, and signatures. This is a content-level step, not a metadata-level step — different category of hidden content, handled by a different code path. If your PDF has tracked-changes markup baked into the page as flattened content (rare but it happens with bad exports), that is page content at this point and needs real redaction — a separate redaction workflow, not a metadata strip.

Verify it actually worked (60-second check before you hit Send)

A minute of verification prevents the kind of disclosure that ends up in a Bar journal case study. Run all five checks on the final file before it leaves your machine.

  1. Open the final PDF in your default reader (Preview on Mac, Edge or Acrobat on Windows). Go to File → Properties (or the equivalent metadata panel). The Author, Title, Subject, Keywords, and Creator fields should be empty or generic. The Producer field may show the export library name; that is normal and not sensitive.
  2. On Windows: right-click the PDF in Explorer → Properties → Details. Scroll. Confirm Authors, Last saved by, Subject, Title, Tags, Comments are empty.
  3. On Mac: right-click the PDF in Finder → Get Info → More Info. Confirm Authors and Title are empty.
  4. If you ran Workflow A, also open the .docx (the one you scrubbed in Step 2) in Word once more and re-run File → Info → Check for Issues → Inspect Document. It should report no comments, no revisions, no document properties, and no custom XML.
  5. Open the final PDF cold (close it, reopen it) and skim every page. Confirm no comment bubbles, no track-changes residue, and no sticky notes are visible on the page. If the file went through a flattening export, anything that looks like markup and survives compression is now page content: flag it for redaction, not a metadata strip.
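For readers who want a scriptable first pass alongside the manual checks, the raw bytes of the final PDF can be scanned for populated Info-dictionary keys and an XMP packet. This is a deliberately crude sketch with real limits: metadata stored inside compressed object streams will not be seen, `/Title` also appears legitimately in bookmarks, and encrypted files are out of scope. The name `pdf_metadata_hints` is hypothetical, and a clean result here does not replace the reader check in item 1.

```python
import re

# An Info-style key followed by a non-empty literal "(...)" or hex "<...>" string.
INFO_VALUE = re.compile(
    rb"/(Author|Title|Subject|Keywords|Creator|Producer)\s*"
    rb"(?:\((?!\))|<(?![<>]))"
)

# Opening tag of an uncompressed XMP metadata packet.
XMP_OPEN = b"<x:xmpmeta"


def pdf_metadata_hints(data):
    """Return a list of metadata hints found in raw PDF bytes.

    Heuristic only: reports key names with non-empty values plus "XMP"
    if an uncompressed XMP packet is present.
    """
    hints = sorted({match.decode("ascii") for match in INFO_VALUE.findall(data)})
    if XMP_OPEN in data:
        hints.append("XMP")
    return hints
```

A typical call reads the file as bytes, for example `pdf_metadata_hints(open("final.pdf", "rb").read())` with a hypothetical filename. A result containing only Producer matches the expectation stated in item 1; anything else is worth a second look before sending.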

Why this workflow runs locally (and the exception)

The 'without uploading the file' line in the headline is the central promise of this guide. Here is what it means in practice, with the limits stated honestly.

  • Both the convert-to-PDF step and the metadata-strip step run inside the FileHop desktop app on your computer. Your draft does not transit our servers.
  • No telemetry on file contents. We do not log what you scrubbed, what was in it, or who you were sending it to.
  • No AI training on your files.
  • Open output format. FileHop writes standard PDF — opens in any reader the receiving party uses.
  • Honest scope on the .docx phase: Microsoft Word's Document Inspector is also a local tool — it does not phone home to Microsoft. The Word phase of this workflow is desktop-local end to end IF you run Word on your machine. Web-based Word (Office.com) is a different posture; the scrub still works but the file is on Microsoft's servers.
  • Honest scope on FileHop: cloud OCR is opt-in and clearly labelled in the app. If you don't turn it on, no part of the file leaves your computer. OCR is not relevant to this particular workflow.

FAQs

Doesn't 'Save As PDF' from Word remove all the metadata automatically?
It removes most of the .docx-specific metadata (tracked changes structures, custom XML, comment markup), but it writes new PDF-side metadata in the process — typically the Author from your Word user profile and the Producer string. That is why this workflow runs Word's Document Inspector first AND a PDF metadata strip second. One pass is not enough.
I accepted all tracked changes and turned off Track Changes. Why does the file still show 'tracked changes' when opened?
Two common reasons. First, accepting changes incorporates the edits but does not always remove the underlying revision-history records — Word's Document Inspector handles that under 'Comments, Revisions, Versions, and Annotations'. Second, the receiving party may have Track Changes turned on in their copy of Word, which causes their fresh edits to show as tracked changes — that is not your file's residue. The Document Inspector pass in Workflow A Step 2 covers the first case; the second is on the receiver.
Does FileHop remove metadata from a .docx file directly?
No. FileHop does not edit .docx metadata in place. For the .docx phase of this workflow, use Microsoft Word's Document Inspector (File → Info → Check for Issues → Inspect Document) — that is the right tool for that file format. FileHop handles the convert-to-PDF and PDF metadata strip phases.
Is FileHop a Word add-in?
No. FileHop is a separate desktop app. It does not install into Word's ribbon and does not run as an Office plug-in. Run Word's Document Inspector first, then come to FileHop for the convert-to-PDF and PDF metadata strip.
Can I do this on an iPad or Linux machine?
FileHop runs on macOS and Windows desktop only — no iPad, no Linux, no web version. Microsoft Word's Document Inspector is available on Word for Windows and Word for Mac; the iPad version of Word does not expose the full Document Inspector. If you only have an iPad, the safest path is to do this workflow on a desktop machine.
Will FileHop's metadata strip remove tracked-changes residue from a PDF?
If the tracked-changes markup is still structured (e.g., the PDF carries it as annotations), aggressive compression with annotation removal handles it. If the tracked-changes markup was flattened into page content during export — bubbles and balloons rendered as part of the page — that is no longer metadata, it is visible page content, and you need real redaction to remove it. A dedicated redaction workflow is a separate task from metadata strip.
What about EXIF on embedded images?
EXIF on a standalone image file can be stripped in FileHop's image tools. If the image is already embedded inside a Word doc or a PDF, the EXIF generally survives the document-level metadata strip. The defensive practice is to strip EXIF on each image BEFORE inserting it into the document.
Is it ethical for opposing counsel to look at my document's metadata if I forget to scrub it?
Under ABA Formal Opinion 06-442 (2006), reviewing metadata embedded in a received document is not prohibited by the Model Rules; the receiving lawyer may look. Some states disagree — New York State Bar Op. 749, for example, prohibits attorneys from 'surreptitiously' reviewing opposing-counsel metadata. Check your state. Either way, the sender's reasonable-care duty is real, which is why this workflow exists.
When should I NOT remove metadata?
If the document is subject to a litigation hold, a records-retention rule, a discovery production obligation, a regulatory preservation duty, or a contractual preservation clause, removing metadata may be sanctionable spoliation. Outgoing-mail hygiene on documents you produced is one problem; modifying documents under a preservation duty is a different problem with its own ethics framework. When in doubt, scrub a copy and retain the original.
Is the metadata-removal step destructive — can I get the metadata back?
For the output file, yes: FileHop writes a new PDF without the Info dictionary and XMP, and the metadata cannot be recovered from that output. The original file (the one you opened) is left untouched on disk, so the metadata still exists on that original. Best practice: scrub a copy, keep the original in your matter folder, document the chain if your firm requires that.
Will free online metadata removers work for this?
Functionally yes; the privacy posture is the problem. Uploading a draft you are about to send to opposing counsel to a third-party scrubber recreates the exact disclosure problem the scrub was meant to prevent. Many state bars now treat 'reasonable security measures' (ABA Op. 477R) as covering tool choice for documents protected by privilege or under a protective order. Process locally.

Before you hit Send

Run the four-step workflow on a copy of the file. Check that the Author and Title fields are empty in your PDF reader (the Producer field may show the export library name). Skim the final PDF cold. That minute of verification prevents the kind of disclosure that ends up in a Bar journal case study. If you do this kind of file work regularly (combine, compress, redact, annotate, sign, scrub), the persona page at /for/lawyers/ walks the broader workflow set, and the related guides below cover the adjacent steps.