How to split PDFs into logical files

Medical records, FOIA responses and responses from the Court often run to hundreds of pages or more. The logical documents they contain need to be split out, classified and correctly filed. MasterFile makes it simple.

One of the common issues in managing evidence is dealing with large PDFs that are actually bundles of multiple evidence files: dozens or hundreds of documents all in one large PDF file.

MasterFile simplifies PDF splitting: it automatically breaks large PDFs apart, regardless of the original file size, into the multiple logical documents these aggregate PDFs contain. That makes it easy to:

Review and extract key information, facts and issues, or apply work product to a single section of a document.
Produce specific documents.
Avoid navigating PDFs with large page counts and unrelated documents within them by dealing with each logical document on its own in MasterFile’s review platform.

Dozens of products, miscellaneous online tools or an online PDF splitter like Adobe Document Cloud will let you split files into smaller PDF documents based on a number of criteria, such as bookmarks, number of pages or file size. That’s all well and good, but for a litigator or a legal nurse consultant, simply extracting pages does nothing to help manage the evidence, key facts and extracts within each document inside the large PDF file now on your screen.

Let MasterFile split and load large PDF files for you automatically, so you can focus on what matters most – case analysis and case strategy.

Where do these irritating PDF files come from?

Common examples of large PDFs are medical records and production disclosures, FOIA responses and responses from the Court. Each of these bundles needs to be broken apart and split into its constituent logical documents, and the individual files classified and correctly filed with case evidence: by date, author, summary, document type (e.g. expert report, case law, invoice, ruling) and so forth. MasterFile’s Express Load splits such PDF files into their logical documents on-the-fly, one document per PDF, as the original PDF is being added to your case evidence.

You can split PDF files in one of two ways.

By bookmarks. Many digitally combined PDFs (by Adobe Document Cloud, Adobe Acrobat Pro or Acrobat DC, etc.) include bookmarks — either the original filename from which the PDF was made, or a description. MasterFile uses these as logical breaks at which to split a PDF. A bookmark is also used as a short summary for its related document. Nested and top-level bookmarks will all be processed.
By the starting page number of each contained document. That’s explained below.

Option 1 – Split PDF files on-the-fly via bookmarks

Simply navigate from Express Load to the folder containing the PDF to split.
Select it, and enter one of these options on how to interpret the PDF’s bookmarks:
- No bookmark numbers. Processing splits the entire file on each bookmark.
- Starting or ending bookmark numbers; if both are specified, just the documents between the starting and ending bookmarks are split out and loaded.
Click OK.
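To make the bookmark options concrete, here is a minimal Python sketch of how bookmark positions could translate into page ranges. It is an illustration only, not MasterFile’s implementation: the function name, the (title, starting page) tuples and the 1-based bookmark numbering are all assumptions.

```python
from typing import List, Optional, Tuple

Bookmark = Tuple[str, int]  # hypothetical shape: (title, 1-based starting page)


def ranges_from_bookmarks(
    bookmarks: List[Bookmark],
    total_pages: int,
    first: Optional[int] = None,
    last: Optional[int] = None,
) -> List[Tuple[str, int, int]]:
    """Return (title, start_page, end_page) for each logical document.

    Each document runs from its bookmark's page to the page before the
    next bookmark; the last document runs to the end of the PDF.
    `first` and `last` are assumed to be 1-based bookmark numbers that
    limit which documents are returned.
    """
    ordered = sorted(bookmarks, key=lambda b: b[1])
    docs = []
    for i, (title, start) in enumerate(ordered):
        if i + 1 < len(ordered):
            # max() keeps the range valid when two bookmarks share a page
            end = max(start, ordered[i + 1][1] - 1)
        else:
            end = total_pages
        docs.append((title, start, end))
    lo = (first or 1) - 1
    hi = last if last is not None else len(docs)
    return docs[lo:hi]


bookmarks = [("Cover letter", 1), ("Lab report", 4), ("Invoice", 9)]
print(ranges_from_bookmarks(bookmarks, total_pages=12))
print(ranges_from_bookmarks(bookmarks, total_pages=12, first=2, last=2))
```

With no bookmark numbers, every bookmark yields one document; with a starting and ending number, only the documents between them are returned.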

Option 2 – Split PDF files on-the-fly via page number

This option lets you split PDF files using an Excel spreadsheet or CSV file to control which pages the logical documents start and end on. Starting and ending page numbers for each contained document are entered on consecutive rows in their respective columns. MasterFile uses the Excel sheet to determine where and how to split the PDF. MasterFile can append bookmark content too, using this option.

Why use an Excel or CSV file to split a large PDF?

The Excel file is actually a pseudo load file, which means you can also include any other metadata you like, such as each document’s date, sender/author, document type, issues it pertains to, and a summary or description, rather than updating metadata during review. Excel’s cell copy commands make short work of any common information.

Note that the CSV and XLSX files should have exactly the same filename as the PDF you are splitting.
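As a rough illustration of what such a pseudo load file might contain, the Python sketch below reads a small CSV and checks its page ranges. The column names, the sample rows and both helper functions are hypothetical; MasterFile’s actual column layout is not described here. The sketch also assumes that “the same filename” means the same name with a .csv or .xlsx extension in place of .pdf.

```python
import csv
import io
from pathlib import Path

# Hypothetical load file: one row per logical document in the PDF.
SAMPLE = """start_page,end_page,date,author,doc_type,summary
1,3,2021-03-02,J. Smith,Letter,Cover letter
4,8,2021-03-05,Acme Labs,Report,Lab results
"""


def read_load_file(text):
    """Parse the CSV text and validate each row's page range."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        start, end = int(row["start_page"]), int(row["end_page"])
        if start < 1 or end < start:
            raise ValueError(f"bad page range {start}-{end}")
        rows.append({**row, "start_page": start, "end_page": end})
    return rows


def candidate_load_files(pdf_path):
    """Names a companion load file could take (assumed naming rule)."""
    pdf = Path(pdf_path)
    return [pdf.with_suffix(ext) for ext in (".csv", ".xlsx")]


for doc in read_load_file(SAMPLE):
    print(doc["start_page"], doc["end_page"], doc["doc_type"], doc["summary"])
print([p.name for p in candidate_load_files("foia_response.pdf")])
```

Checking the ranges before loading catches the most common spreadsheet slip, an ending page that falls before its starting page.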

Here’s a screenshot of a simple Excel sheet to split the same FOIA document being split via bookmarks in the short animation above.

Notes

Splitting a PDF using bookmarks is useful if a PDF already has descriptive bookmarks and each bookmark marks the start of a logical document in the PDF. You can of course add bookmarks yourself and load the PDF. When splitting with bookmarks, however, the bookmark is the only metadata loaded (as the document summary), so use short, meaningful summaries as bookmarks for each document within the large PDF.

Whether bookmarks exist or you are adding them, PDF products can and do introduce odd characters and invisible line breaks into bookmarks. For example:

Acrobat itself can add a square box character like this ▯ to bookmarks in some cases.
Bookmarks can’t span two or more lines, yet Acrobat and other products let you add carriage returns or new line characters, or add these themselves. These characters are invisible, so the bookmark looks like one long line. You will only be able to spot such line breaks by copying and pasting the entire bookmark into Notepad, removing the breaks and odd characters, and then replacing the bookmark’s text with the cleaned version.

Any of the above will cause an import to fail, and they cannot be detected in advance. We therefore always recommend that you check the characters in each bookmark and test load your PDF in a new database before finally importing.
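If you have many bookmarks to check, a short script can do the Notepad step for you. The Python sketch below is one possible approach, not a MasterFile feature: the set of “suspect” characters is an assumption and may not cover everything a PDF product can insert, so a test load is still worthwhile.

```python
import unicodedata

# Assumed set of box-like glyphs that sometimes appear in bookmark text.
BOX_GLYPHS = {"\u25af", "\u25a1", "\ufffd"}
# Unicode categories: control, format, private use, unassigned.
SUSPECT_CATEGORIES = {"Cc", "Cf", "Co", "Cn"}


def is_suspect(ch):
    """True for invisible or box-like characters (assumed list)."""
    return ch in BOX_GLYPHS or unicodedata.category(ch) in SUSPECT_CATEGORIES


def clean_bookmark(title):
    """Turn line breaks into spaces, drop suspect characters, tidy spacing."""
    spaced = "".join(" " if ch in "\r\n\t" else ch for ch in title)
    kept = "".join(ch for ch in spaced if not is_suspect(ch))
    return " ".join(kept.split())


raw = "FOIA response\r\nletter \u25af 2021"
print(any(is_suspect(ch) for ch in raw))
print(clean_bookmark(raw))
```

The cleaned text can then be pasted back over the original bookmark in your PDF editor.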

Questions

What happens to the original PDF?

If you want the original document, load it into MasterFile too, without splitting it into separate files (although we’ve found that’s rarely needed in practice). Splitting PDFs into logical documents as individual files is what matters: these documents form part of the case chronology and are what you will need for production, etc. in future. Each split document will retain a reference to its original page number range.

Can I split PDFs arbitrarily?

Yes. With Option 2, enter the page numbers to split at and a PDF document of that many pages will be created. You might use this to make separate PDF files of individual pages, PDFs of multiple pages with specific page counts per PDF file, or simply to split a large PDF file into several smaller ones with an equal number of pages in each. You can also use the same technique to select pages or a page range for each split PDF. Note that PDFs will also be created with the ‘excluded’ page ranges; simply delete them from MasterFile after the load process ends.
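The arithmetic behind these page ranges is simple enough to sketch in Python. Both functions below are hypothetical helpers for preparing the rows of a spreadsheet, not part of MasterFile; the second shows why ‘excluded’ ranges appear, since every page of the source PDF ends up in some output file.

```python
def equal_chunks(total_pages, pages_per_file):
    """Page ranges for splitting into files of equal size (last may be shorter)."""
    return [
        (start, min(start + pages_per_file - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_file)
    ]


def with_excluded(selected, total_pages):
    """Add the gaps between selected ranges, labelled 'excluded'.

    Assumes the selected ranges are 1-based, inclusive and do not overlap.
    """
    out = []
    next_page = 1
    for start, end in sorted(selected):
        if start > next_page:
            out.append((next_page, start - 1, "excluded"))
        out.append((start, end, "selected"))
        next_page = end + 1
    if next_page <= total_pages:
        out.append((next_page, total_pages, "excluded"))
    return out


print(equal_chunks(10, 4))
print(with_excluded([(3, 5), (8, 9)], total_pages=12))
```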

What numbering is applied to split PDF pages?

When a PDF that has been split out is loaded, the default assumption is that, as a separate document, its page numbering relates to it alone, not to the larger file MasterFile extracted the pages from. Page numbering therefore starts at 1.

However, MasterFile lets you manipulate starting numbers to suit specific situations. For example, suppose a split-out PDF of 3 pages occupied pages 223, 224 and 225 in its source PDF file, but those three pages’ actual printed page numbers are 6, 7 and 8. You can set the extraction numbering to be 6, 7 and 8 rather than 1, 2 and 3. The original page range of 223 – 225 is also preserved.
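The numbering example above can be expressed as a small mapping from source pages to page labels. This Python sketch is illustrative only; the function and parameter names are assumptions, not MasterFile settings.

```python
def page_labels(source_start, source_end, first_label=1):
    """Pair each source page with the number it carries in the split PDF.

    By default numbering restarts at 1; pass first_label to match the
    printed page numbers instead.
    """
    count = source_end - source_start + 1
    return [(source_start + i, first_label + i) for i in range(count)]


print(page_labels(223, 225))                 # default numbering from 1
print(page_labels(223, 225, first_label=6))  # match printed pages 6 to 8
```

Either way, the source page numbers stay alongside the new labels, which mirrors how the original page range is preserved.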

More about PDF splitting processes for legal.

Almost any PDF viewer like Adobe Acrobat DC, Adobe Acrobat Pro for Mac, Foxit, etc. (excluding readers like Adobe Acrobat Reader DC) will let you split a PDF manually. A PDF cutter helps automate part of that, but a law firm needs to classify, organize, review, and analyze the key information in each split PDF. Bulk rename is hardly of any value at all: each filename and date prefix will be different, as will the document’s author, recipient, description, etc. Whether a response to an FOIA request or a medical record, that means reviewing, renaming, and organizing each individual PDF file manually. Make sure to have significant amounts of time on hand!

MasterFile automates this document-splitting process for you. A multi-page PDF file will be split directly into logical documents as it is loaded; no additional software is needed. Whether you split a PDF using its dozens of bookmarks, or via a mapping using Excel, file organization takes place automatically. Content on one PDF page belonging to consecutive documents is also managed correctly. As new single or multi-page PDFs are created, you can review them, mark key extracts, classify by legal issue, add notes and metadata, etc. Or use MasterFile’s bulk loading and batch processing features and defer review to later.

Other advanced PDF-related features include creating briefs, exhibit and witness profiles, and disclosure sets in PDF hyperlinked to underlying evidentiary documents, and more.

MasterFile is more than a PDF tool. It’s the perfect tool for litigation, case management and case analysis for your firm on both Windows and Mac operating systems.

Manage your evidence, disclosures, production, case chronologies and case analysis more efficiently and more effectively in MasterFile. Win more – and have an easier time doing so!
