Though it took a number of years to gain initial acceptance, “concept search” technology has now matured to the point where it is deployed in the vast majority of document review platforms in the industry.
To be clear, I am NOT referring to Technology Assisted Review (TAR… aka “predictive coding”) technologies or workflows. In many ways, TAR evolved from concept search, but the two are not the same.
Let’s review. In the old-fashioned (but sadly still prevalent) “linear review” workflow, documents are batched out to reviewers based on objective criteria like custodian, date range, or even Bates® range. It has been the “tried and true” method of document review for decades, despite the enormous cost and inherent risks.
Why create a concept search index in the first place? Because it is, hands down, one of the best ways to reduce the cost of review for virtually every matter (no matter how small… or how large). The cost of document review has always boiled down to the tension between the need for accuracy and the need for speed. “Document decisions per hour” is one key metric. “Overturns” (the number of document decisions a reviewer makes that are subsequently changed in second pass or QC review) is another. These tensions exist for virtually every case, from 5 GB to 500 GB to 5 TB.
Concept search technology gives the document review team an opportunity to identify “semantically similar” documents to perform a more accurate and consistent document review at a lower cost. By clustering documents based on their similarity to one another, review teams can quickly identify near-duplicates and email threads. This gives them the ability to batch documents out to individual reviewers, organized in such a way that the individual reviewer can see the various iterations of a specific document and then apply coding decisions consistently across all “copies,” whether or not they were true “Hash” duplicates. This dramatically increases review speed (often by 5x to 10x or more) while at the same time reducing overturns.
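For readers who want to see the idea in miniature, here is a rough sketch of near-duplicate grouping. It is not how any particular review platform works. It swaps in a simple, well-known technique (overlapping word "shingles" compared with Jaccard similarity) as a stand-in, and the function names, the document IDs, and the 0.5 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def near_duplicate_groups(docs, threshold=0.5):
    """Group document IDs whose shingle sets overlap at or above threshold.

    docs maps a (hypothetical) document ID to its extracted text.
    Grouping is transitive: if A matches B and B matches C, all three
    end up in one group even if A and C fall below the threshold.
    """
    sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    parent = {doc_id: doc_id for doc_id in docs}

    def find(x):
        # Follow parent links to the group's representative ID.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for doc_id in docs:
        groups.setdefault(find(doc_id), []).append(doc_id)
    return sorted(sorted(group) for group in groups.values())
```

Two drafts of the same contract clause that differ by a single word would land in one group, while an unrelated email would sit in a group of its own. Each group is the kind of batch a single reviewer could code consistently in one pass. Comparing every pair of documents is fine for a toy example but would not scale to a real collection.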
Additionally, because the concept search index is built around “pattern matching” rather than specific keyword matching, two documents may be semantically similar but contain slightly different wording. These differences and the occurrences of each can be identified and quantified by the index. The result is that many platforms offer “keyword expansion” lists that can help attorneys identify additional keywords, “code words” and industry jargon word lists to search. They can also organize or categorize groups of documents based upon their overall subject matter, for more efficient review.
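To make "keyword expansion" a little more concrete, here is a deliberately crude sketch. It simply counts which other words show up in the same documents as a seed keyword. Commercial concept search indexes use far more sophisticated statistics than this, so treat it only as an illustration of the idea. The function name, the tiny stopword list, and the sample usage are assumptions, not any vendor's API.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real one would be much longer.
STOPWORDS = frozenset(
    {"a", "an", "the", "and", "or", "of", "to", "in", "it", "as", "was", "is", "not"}
)


def expand_keyword(seed, docs, top_n=5, stopwords=STOPWORDS):
    """Suggest terms that co-occur with seed, most frequent first.

    docs is a list of document texts. A term is counted once per
    document that also contains the seed keyword.
    """
    seed = seed.lower()
    counts = Counter()
    for text in docs:
        terms = set(re.findall(r"[a-z0-9]+", text.lower()))
        if seed in terms:
            counts.update(terms - {seed} - stopwords)
    return [term for term, _ in counts.most_common(top_n)]
```

If every document that mentions "kickback" also mentions "rebate", then "rebate" rises to the top of the suggestion list, and an attorney might decide it is a code word worth searching on.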
The financial ROI on employing this technology has been proven for over a decade. On that basis alone, if I were king of the world, I would declare that every single matter that every single client worked on would have it added to the workflow (right after declaring that the Red Sox are automatic winners of every World Series in the future… regardless of record!).
So why go “backwards” and discuss an older technology? Because while our industry continues to charge forward and looks for new “latest and greatest” technology, I believe that some of the most strategic values of concept search have either been forgotten… or more importantly, never learned in the first place. Even in instances where concept search is employed, it is only used for a relatively short period. We teach the junior attorneys how to use it to get through the initial review and document production. Once the production is out the door, we forget that it’s there for the rest of the case life-cycle. It’s time for the senior attorneys and yes, even first chair trial attorneys to get in the game. Concept search brings so much more to the table! Have you thought about these? Here are 5 key uses that most people miss:
Auditing your privileged documents to ensure that you have asserted privilege consistently on ALL versions of a document – This is one that I’ve been preaching for many years. It takes minimal effort but can yield significant results. One person with a priv log and a concept search tool can help prevent inadvertent waivers in a matter of minutes or hours.
Auditing redactions – This is the same idea as above but with more detail. IMHO, this should be part of the standard workflow during the actual document review. Review a document. Redact a document. Run a search for near duplicates. Redact them all the same way NOW.
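Both of these audits boil down to the same check: within each group of near-duplicates, did every document get the same treatment? Here is a minimal sketch of that check. It assumes your platform can export near-duplicate groups and coding decisions in some form; the function name, the data shapes, and the coding labels are hypothetical.

```python
def inconsistent_groups(groups, coding):
    """Return the near-duplicate groups whose members were coded differently.

    groups: list of lists of document IDs (for example, the result of a
            near-duplicate search).
    coding: dict mapping document ID to a coding value, such as
            "privileged" / "not privileged" or "redacted" / "unredacted".
    Documents missing from coding are reported as "uncoded".
    """
    flagged = []
    for group in groups:
        values = {doc_id: coding.get(doc_id, "uncoded") for doc_id in group}
        # More than one distinct value means the group was coded inconsistently.
        if len(set(values.values())) > 1:
            flagged.append(values)
    return flagged
```

Every entry that comes back is a set of look-alike documents that someone should examine before the production goes out the door.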
The last three require that clients have their provider add the opposing party’s document production to the concept search index. Sad to say, but the vast majority of clients opt out of this. I understand that while it’s easy to prove a hard-dollar ROI for the review and production phase, adding concept search to the documents that the other side has produced just seems like an extra cost to justify. Well, here is your justification…
Auditing the production from the opposing party – Run a keyword search on “redact” to pull out all redacted documents and then run a concept search on them. I can’t tell you how many times I’ve done this and found instances where the other side has either produced versions with no redactions whatsoever (can you say “privilege waiver”?), or at least multiple versions of documents that have significant differences in redactions applied.
Compare documents in the opposing production to the documents in “our” review set – What did they produce that looks just like what we had? What keywords did their set contain that we didn’t know to search on when we did our own review? This is an excellent opportunity to leverage concept searching during deposition and exhibit prep. Working on a patent case and getting ready for a Markman hearing? This is an excellent way to compare the language in “their” patent to the language in “ours”.
Last, but certainly not least, is the ability to use concept search on witness testimony – Search on a Q&A pair from a deposition transcript to find related documents. More importantly, with today’s technology, we now have the ability to be connected to our document databases wherever we go… even into a deposition or courtroom.
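Searching on a passage of testimony means handing the index a block of free text and asking for the documents that look most like it. The sketch below shows the shape of that workflow using plain TF-IDF weighting and cosine similarity. That is a swapped-in technique: it only rewards literal word overlap, whereas a true concept search engine also matches related wording. All function names and document IDs here are assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_by_similarity(query, docs):
    """Rank documents by TF-IDF cosine similarity to a free-text query.

    docs maps a (hypothetical) document ID to its text. Returns a list of
    (doc_id, score) pairs, best match first. A document that shares no
    terms with the query scores 0.0.
    """
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))

    def weigh(toks):
        counts = Counter(toks)
        # Smoothed IDF: rare terms weigh more, common terms keep some weight.
        return {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = weigh(tokenize(query))
    scores = [(doc_id, cosine(query_vec, weigh(toks))) for doc_id, toks in tokens.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Paste in a witness's answer as the query, and the documents that share the most distinctive language with that answer rise to the top of the list.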
To prove the last point, here is a real-world example, from a real case with a real client of mine. My client (outside counsel on a construction defect and delay matter) was questioning a very uncooperative witness for the other side in a deposition. During questioning, it was uncovered that there was a memo… (summarizing)
Who wrote the memo?
I don’t recall.
Who received the memo?
I don’t recall.
When was it sent?
I don’t recall.
(At this point, my client remembered that he had access to his document database, which held close to 1 million records. The key was that his database was concept search enabled. He loaded up his database and then started up Microsoft Word.)
Fine, just tell me what the memo talked about.
(describes the contents of the memo)
My client typed the description while the witness was speaking, copied the text to the Windows clipboard, opened his document database program, pasted the witness’ own description into the search box and ran the search. He then turned his laptop around to face the witness. The memo that the witness was describing was the third hit on the results list. Bingo.
Anyone care to put a dollar value on that?
Until next time,