Convert Old Client Scans (for Lawyers)
TIFF scans from a 2008 scanner. Image-only PDFs your old paralegal sent. Photographs of physical exhibits. Two distinct workflows — clean PDF and searchable PDF — explained honestly for both Mac and Windows.
When this guide is for you
You open a client file from years ago. The scans are TIFFs your modern PDF viewer won't preview. Or they're PDFs that look like documents but you can't select a word. Or they're phone photographs a client emailed you of a contract. You need them in a format that works in your current workflow — maybe just a clean PDF you can email and file, maybe a searchable PDF you can grep. This guide walks both paths on your own Mac or Windows machine, and is honest about which steps stay on your machine and which involve OCR (which is a different conversation).
Two workflows. Decide which you actually need.
Most search results for this topic conflate 'convert old scan' with 'OCR.' They are two different jobs. Pick the right one before you start.
| What you need | What it means mechanically | Where the work happens |
|---|---|---|
| Workflow A — Clean PDF | Convert the TIFF (or photograph, or image-only PDF) into a single, well-formed PDF you can email, archive, and file with a court. The output is still a picture of the page — you cannot search the text — but the format is universal. | Fully local on your Mac or Windows machine in FileHop. No cloud, no upload, no OCR. |
| Workflow B — Searchable PDF | Same as Workflow A, then add an invisible text layer behind the page image so you can search, select, and copy. This step requires OCR (optical character recognition). | OCR is a separate decision. FileHop ships a local OCR engine (the Extract Text tool, fully on-device) plus an optional cloud OCR path with per-file consent. For accuracy-critical legal work — depositions, evidence, contract text where a single misread word matters — route OCR to a specialist tool (ABBYY FineReader, Adobe Acrobat with local OCR enabled, or open-source Tesseract). FileHop then combines, compresses, and finalizes the searchable PDF locally. |
Most old-client-file work is Workflow A. If you can already read the document on screen and you only need to email it or file it with the court, you do not need OCR — you need a clean PDF. OCR is a separate, more expensive decision.
What FileHop handles locally on your Mac or Windows machine
The conversion primitives below are what the desktop app implements today. The list is what is in the codebase — no inflated claims. The two workflows that follow are built out of these primitives.
- TIFF → PDF (single-page or multi-page): Single- and multi-page TIFFs convert to a single PDF, page order preserved. Multi-page TIFF support is first-class as of v0.27 and is critical for legal scans, because fax archives and 2000s flatbed scanner output are almost always multi-page TIFF.
- Images (JPG / PNG / HEIC / WebP / BMP / GIF) → PDF: Drop a folder of scanner output or phone photographs and assemble a single PDF in the order you want. Useful when the client emailed three iPhone photos of a four-page contract.
- Image-only PDF → re-saved clean PDF: Open a legacy image-only PDF that your viewer struggles with and re-export a clean modern PDF. This does NOT add a text layer; that is OCR (see Workflow B). The output is universally readable, just not searchable.
- TIFF → JPG / PNG (per page): When you want raster output instead of PDF, for example to drop a single page image into a brief or motion as an exhibit figure. Covered by the existing TIFF-to-JPG tool page.
- Merge multiple PDFs: After conversion, combine the result with other PDFs (correspondence, pleadings, exhibits) into a single deliverable. Covered by the Merge PDF tool. Runs locally.
- Compress for ECF size limits: Old scans are often huge; a 100-page uncompressed multi-page TIFF can run 2-3 GB. Compress the output PDF to fit court size limits. Covered by the Compress PDF tool and the ECF size-limit guide.
- Re-page / split / rotate: Fix the order, drop duplicate pages, rotate misoriented scans. Covered by the Split PDF tool and the page editor.
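The 2-3 GB figure for a 100-page uncompressed TIFF can be sanity-checked with simple arithmetic. The sketch below is a rough estimate only: the page size (US Letter), resolution (300 DPI), and bit depths are assumptions chosen for illustration, not values taken from FileHop, and the function name is hypothetical. It ignores TIFF headers and row padding.

```python
def uncompressed_scan_bytes(pages, width_in=8.5, height_in=11.0,
                            dpi=300, bits_per_pixel=24):
    """Estimate raw pixel data for an uncompressed scan (headers ignored)."""
    width_px = round(width_in * dpi)    # 2550 px for US Letter at 300 DPI
    height_px = round(height_in * dpi)  # 3300 px for US Letter at 300 DPI
    total_bits = width_px * height_px * bits_per_pixel * pages
    return total_bits // 8


# Assumed scenario: 100 pages, US Letter, 300 DPI.
colour = uncompressed_scan_bytes(100)                     # 24-bit colour
bilevel = uncompressed_scan_bytes(100, bits_per_pixel=1)  # black-and-white fax style

print(f"24-bit colour: {colour / 1e9:.2f} GB")   # prints 2.52 GB
print(f"1-bit bilevel: {bilevel / 1e6:.0f} MB")  # prints 105 MB
```

Under these assumptions a colour archive lands in the 2-3 GB range quoted above, while the same pages stored as 1-bit black-and-white are far smaller even before compression. Actual sizes depend on how the original scanner was configured.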
What FileHop does NOT do
- Legal-grade OCR accuracy guarantee: FileHop's local and cloud OCR are AI-based and can occasionally misread. For evidence-grade work where a misread word matters, run OCR in a specialist tool (ABBYY FineReader, Acrobat with local OCR, or Tesseract).
- Bates numbering: if you need stamped Bates IDs on the converted PDF, run that step in Acrobat, CaseMap, or a dedicated litigation-support tool after conversion.
- Linux builds: Mac and Windows only. No web app.
“Added multi-page TIFF support: cached PNG preview, rasterize-first edit/mock, convert to PDF and raster formats, and resize via page-walking encoder.”
In plain English: when you drop a multi-page TIFF into FileHop, it generates cached PNG previews so you can scroll the pages, lets you reorder or remove pages, and exports to PDF or raster formats in page order. The original TIFF on disk is never modified — FileHop only writes new outputs (PDF, JPG, PNG) and the page-preview cache next to the source file.
Workflow A — I just need a clean PDF (fully local)
The most common case: a TIFF, an image-only PDF, or a photo that you need to convert into something you can email, archive, and file. No OCR involved. Five steps; the file never leaves your machine.
1. Identify the source format: Look at the file extension and how your existing viewer behaves. If it is .tif or .tiff and your PDF viewer will not open it, that is a TIFF, often a multi-page one (common scanner and fax output). If it is .pdf but you cannot select any text, that is an image-only PDF (a PDF wrapper around scanned page images). If it is .jpg, .png, or .heic, that is a photograph or single-page scanner output. The destination format for all three is the same, a single PDF, but the source determines the import step.
2. Open FileHop and drop the file in: FileHop runs locally on your Mac or Windows machine. The file does not transit our servers for any step of this workflow. Drop the TIFF, image-only PDF, or photograph into the FileHop window. For a multi-page TIFF, FileHop previews each page using its rasterize-first cache: the original TIFF stays untouched on disk, and FileHop only generates cached PNGs for the preview.
3. Order the pages: If the source is a folder of scanned JPGs (one page per file from an old flatbed scanner), drag them into the order you want. If the source is a single multi-page TIFF, the page order is already preserved from the original scan; verify and reorder only if needed. If the source is an image-only PDF, the page order comes from the original file. Drop, reorder, remove duplicates: none of it modifies the source file.
4. Export as a single PDF: Use the image-to-PDF wizard (two-step flow, shipped in v0.27.x). Choose the output filename, output location, and whether to overwrite the original or save as new. FileHop makes you choose explicitly; the overwrite mode is a deliberate step, not an accidental default. The output is a single PDF, all pages in order, no OCR text layer. This PDF opens in any modern viewer and is fileable with any court that accepts PDF (which is essentially all of them).
   Open the Images-to-PDF tool →
5. Compress and merge (optional): If the file is too large for court submission (ECF systems typically cap at 25-50 MB per PDF; see the ECF size-limit guide), run it through the Compress PDF tool. If you want to combine the converted scan with other case documents into a single deliverable, run it through Merge PDF. Both steps stay local on your machine. Conversion followed by compression often reduces a 2 GB legacy TIFF archive to a sub-30-MB filing-ready PDF.
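Step 1 relies on the file extension, but old archives sometimes contain mislabelled files. As a cross-check, the source format can usually be read from the first few bytes of the file. The sketch below is a minimal illustration in plain Python; the function names are hypothetical, it is not how FileHop itself detects formats, and it covers only the common signatures (BigTIFF and less common HEIF brands are not handled).

```python
HEIC_BRANDS = (b"heic", b"heix")  # assumed subset of HEIF brands


def classify_scan(header: bytes) -> str:
    """Guess the container format from the first 16 bytes of a file."""
    if header[:4] in (b"II*\x00", b"MM\x00*"):   # little- or big-endian TIFF
        return "tiff"
    if header.startswith(b"%PDF-"):
        return "pdf"
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[4:8] == b"ftyp" and header[8:12] in HEIC_BRANDS:
        return "heic"
    return "unknown"


def classify_file(path) -> str:
    """Read only the file header; the file itself is never modified."""
    with open(path, "rb") as handle:
        return classify_scan(handle.read(16))
```

Calling `classify_file` on each file in a client folder gives a quick inventory before import. Note that a result of "pdf" does not say whether the PDF is image-only; that still has to be checked by trying to select text.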
⚠ Before Workflow B — read this
About OCR — it is a separate decision from format conversion
FileHop has two OCR options of its own: a local OCR engine (the Extract Text tool, fully on-device) and an optional cloud OCR path (OpenAI or Gemini, with explicit per-file consent). The conversion steps in Workflow A do NOT involve OCR and stay on your machine. For routine Workflow B work, either of FileHop's OCR options is fine. For accuracy-critical legal work where a single misread word matters (depositions, evidence, contract text), run OCR in a specialist tool (ABBYY FineReader, Adobe Acrobat with local OCR enabled, or open-source Tesseract) regardless of whether the scan is privileged, because AI-based OCR (local or cloud) can occasionally misread. Then bring the searchable PDF back to FileHop to combine, compress, and file.
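The routing rule in the paragraph above comes down to two questions: do you need searchable text, and is the work accuracy-critical? A minimal sketch of that decision follows. The names `Route` and `choose_route` are hypothetical and exist only to illustrate the logic; nothing here is part of FileHop.

```python
from enum import Enum


class Route(Enum):
    CONVERT_ONLY = "Workflow A: convert locally, no OCR"
    FILEHOP_OCR = "Workflow B: FileHop OCR (local, or cloud with consent)"
    SPECIALIST_OCR = "Workflow B: specialist OCR tool, then finalize in FileHop"


def choose_route(needs_search: bool, accuracy_critical: bool) -> Route:
    """Encode the routing rule described in this guide."""
    if not needs_search:
        return Route.CONVERT_ONLY     # a clean PDF is enough
    if accuracy_critical:
        return Route.SPECIALIST_OCR   # depositions, evidence, contract text
    return Route.FILEHOP_OCR          # routine searchable copies


print(choose_route(needs_search=True, accuracy_critical=True).value)
```

Privilege does not appear as an input because, as stated above, the specialist route is recommended for accuracy-critical work regardless of whether the scan is privileged.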
Different audience, different routing
If you're a researcher — read the local-VLM version of this guide instead
Researchers handling IRB-restricted, embargoed, or unpublished material need different routing than lawyers. They can tolerate accuracy below evidence grade (given a spot-check verification habit and source scans kept alongside the output), but their privacy requirement is stricter: they cannot upload to cloud OCR, and they often have multi-language, handwritten, or scientific-layout content that benefits from a modern Vision-Language Model running locally. The researcher-cluster sibling guide leads with local VLM OCR (MiniCPM-V, Chandra OCR, olmOCR-2) for that reason. Both guides share the same underlying privacy concern; they differ on how much OCR error is acceptable.
Researchers — OCR a scanned research archive locally →
Workflow B — I need a searchable PDF (OCR required)
The second workflow: you need to search the text of the file. Two honest paths: FileHop's own OCR (local Extract Text or optional cloud, with consent) for routine work, or a specialist OCR tool (ABBYY, Acrobat-local, Tesseract) for accuracy-critical work.
Path 1 — FileHop's OCR (local or cloud)
Suitable for routine, non-evidence-critical scans where you want the convenience of FileHop's own OCR. Two options: (a) FileHop's local Extract Text tool runs OCR fully on-device — no upload, the file never leaves your machine; (b) FileHop's cloud OCR path uses OpenAI or Gemini for higher-end accuracy, with explicit per-file consent before any data leaves your machine. Either way, the resulting text layer is attached to a new PDF, which FileHop saves next to the original — you choose which to file. For accuracy-critical work where a single misread word matters, see Path 2.
Path 2 — Specialist OCR tools (accuracy-critical work)
For accuracy-critical work — deposition exhibits, evidence, contract text where a misread word matters — run OCR in a tool purpose-built for legal-grade accuracy before bringing the result to FileHop. This is the recommended path for evidence-grade work regardless of whether the scan is privileged, because AI-based OCR can occasionally misread in ways that matter. The three serious options:
- ABBYY FineReader PDF (Mac + Windows desktop): The standard for accuracy-critical legal OCR. Local processing. Around USD 99-165/year per seat. Best for high-volume archive conversion; it is the in-house equivalent of an outsourced litigation-scanning vendor.
- Adobe Acrobat Pro, local OCR mode (Mac + Windows desktop): Acrobat's Enhance Scans / Recognize Text feature does OCR locally when you have the desktop app installed. Around USD 240/year per seat. Confirm in Preferences that you are using the on-device OCR rather than Acrobat's cloud services; that toggle matters.
- Tesseract, open-source (Mac + Windows + Linux command line): Free, local, fully under your control. Slower per page than ABBYY or Acrobat and the learning curve is real (command-line interface, per-language model files). The right answer for firms that want zero per-seat licensing and have at least one technical person.
After OCR has been run upstream, bring the resulting searchable PDF back into FileHop for the combine, compress, re-page, and file steps — those remain local in FileHop as in Workflow A. FileHop never sees the document during the OCR step.
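For the Tesseract option, the upstream OCR step can be a single command: `tesseract <input> <output-base> -l <language> pdf` writes a searchable PDF next to the given output base. The sketch below wraps that command in Python. It assumes Tesseract and the English language data are already installed and on your PATH; the helper names and the `-ocr` filename suffix are hypothetical choices for illustration. Tesseract reads image formats such as TIFF, PNG, and JPEG, not PDF, so an image-only PDF has to be exported to images first.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_pdf_command(scan: Path, language: str = "eng") -> list[str]:
    """Build the command line; Tesseract appends '.pdf' to the output base."""
    out_base = scan.with_name(scan.stem + "-ocr")  # hypothetical naming scheme
    return ["tesseract", str(scan), str(out_base), "-l", language, "pdf"]


def run_local_ocr(scan: Path, language: str = "eng") -> Path:
    """Run OCR on this machine and return the path of the searchable PDF."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    command = tesseract_pdf_command(scan, language)
    subprocess.run(command, check=True)  # raises if Tesseract reports an error
    return Path(command[2] + ".pdf")


print(tesseract_pdf_command(Path("exhibit.tif")))
```

The source scan is only read, never rewritten, and the result is a new file (for example `exhibit-ocr.pdf`) that can then be dropped into FileHop for the combine and compress steps. Spot-check the recognized text against the page image before relying on it.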
TIFF specifically — why old TIFFs won't open and what to do
Concrete troubleshooting for the actual problems lawyers hit with legacy TIFF files. Most search results skip this; it is what makes the difference between a clean migration and a stuck file.
| Problem | What is happening and what to do |
|---|---|
| The TIFF opens in your image viewer but only shows page 1. | It is a multi-page TIFF (one file, many pages; the standard fax and 2000s-scanner output). Many macOS and Windows image viewers show only the first IFD (image file directory, the per-page record inside a TIFF). FileHop reads all pages: drop the file in to see the full page list and export to PDF. |
| The TIFF will not open at all (file dialog errors, blank window). | It is probably using a legacy or unusual compression, such as CCITT Group 4, LZW with predictor, or old JPEG-in-TIFF. FileHop's underlying TIFF library handles the standard variants; for very old files, IrfanView (Windows) or Preview (macOS) can often open them, after which you can re-save as a standard TIFF or PNG before bringing the file into FileHop. |
| The TIFF opens but quality is awful at any zoom level. | The original scan was probably 150 DPI or lower. There is no software fix — the resolution is what was captured. For court submission a 150 DPI scan is usually still legible; for evidence-quality reproduction, rescan if you can. The Society for Computers & Law PDF/A explainer covers preservation-grade scanning in more depth. |
| The TIFF is enormous (single file > 500 MB). | Uncompressed multi-page TIFFs are huge. Convert to PDF via Workflow A and then run the Compress PDF tool — a 2 GB TIFF often becomes a 30 MB PDF after conversion and lossless compression, which is below most ECF size limits. |
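Two of the rows above (only page 1 shows, or the file will not open) can be diagnosed by reading the TIFF's own directory structure. A classic TIFF stores one image file directory (IFD) per page in a linked chain, and each IFD may carry a Compression tag (tag 259). The sketch below walks that chain using only the Python standard library. It is an illustration under stated assumptions, not FileHop's TIFF reader: the function name is hypothetical, BigTIFF is not handled, and some files include thumbnail IFDs, so the count is an upper bound on real pages.

```python
import struct

COMPRESSION_NAMES = {
    1: "uncompressed", 2: "CCITT RLE", 3: "CCITT Group 3", 4: "CCITT Group 4",
    5: "LZW", 6: "old-style JPEG", 7: "JPEG", 8: "Deflate", 32773: "PackBits",
}


def tiff_pages(data: bytes) -> list[str]:
    """Return one compression name per IFD (page) in a classic TIFF."""
    if data[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled)")

    pages, seen = [], set()
    while offset != 0 and offset not in seen:   # 'seen' guards against loops
        seen.add(offset)
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        compression = 1   # TIFF default when the tag is absent
        for i in range(entry_count):
            entry = offset + 2 + 12 * i          # each entry is 12 bytes
            tag, _type, _count = struct.unpack_from(endian + "HHI", data, entry)
            if tag == 259:                       # Compression tag
                (compression,) = struct.unpack_from(endian + "H", data, entry + 8)
        pages.append(COMPRESSION_NAMES.get(compression, f"unknown ({compression})"))
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * entry_count)
    return pages
```

Reading a file's bytes and passing them in, for example `tiff_pages(Path("scan.tif").read_bytes())` with `pathlib.Path` imported, returns a list such as one entry per page. A list longer than one confirms a multi-page TIFF, and an "old-style JPEG" or "unknown" entry points to the compression problem described in the second row.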
What stays on your machine and what doesn't
Specific statement of where each step in this guide happens. Not a general privacy pitch — a step-by-step map.
- Workflow A (clean PDF): every step runs on your Mac or Windows machine. The source files, the cached preview, the output PDF, and the compression and merge steps are all local.
- Workflow B Path 1 (FileHop's OCR): if you use the local Extract Text tool, the OCR step runs entirely on your machine and nothing uploads. If you opt into FileHop's cloud OCR, only the OCR step round-trips, and only after FileHop prompts you for explicit per-file consent. The conversion and finalize steps before and after always stay local.
- Workflow B Path 2 (upstream local OCR): the OCR step happens in your chosen local tool (ABBYY, Acrobat-local, or Tesseract). FileHop never sees the document during OCR.
- FileHop does not maintain a server copy of your files for any of these workflows. There is no FileHop account required for the file-handling steps in Workflow A or for the upstream-OCR path in Workflow B.
- FileHop is Mac and Windows desktop only. There is no Linux build and no web app. That is a constraint, not a feature, but it is also why the file work in Workflow A can stay on your machine.
Frequently asked questions
Does FileHop do OCR on the converted PDF automatically?
I have a 200-page multi-page TIFF from a 2009 scanner. Will FileHop handle it?
Can I batch-convert a folder of old scans at once?
What about photographs of a contract a client emailed me from their phone?
Is the converted PDF court-acceptable / PDF/A-compliant?
Does FileHop do Bates numbering on the converted PDF?
Can I redact the converted PDF?
Why route accuracy-critical OCR to specialist tools instead of FileHop's own OCR?
Can I OCR an image-only PDF directly without converting first?
What if my old TIFF uses an obscure compression and FileHop can't open it?
Does FileHop run on Linux?
What does FileHop cost?
Get the desktop app
Free desktop install. No account required for the file-handling steps in Workflow A. Mac and Windows only. If you handle this kind of legacy-archive conversion regularly — TIFF, image-only PDFs, phone photos of physical documents — the lawyer persona page walks the broader workflow set, and the related guides below cover the adjacent steps.