[John Batchelor just interviewed me for his show tonight (8/24/15) and asked that I repost this entry from last March. It is more relevant now than ever, with the Department of Justice conducting an investigation into the Clinton staff over possible violations of national security protocols. Meanwhile, here’s a link to all my posts about the Clinton e-mail issue, most of which were written as the story was originally unfolding last March.]
Over a period of nearly 16 years (since mid-1999), I have been retained repeatedly as a consulting and/or testifying expert in litigation involving information technology (IT). Roughly half of my cases have involved troubled or failed IT projects involving two or more parties; roughly half have been intellectual property disputes (patent, trade secret, copyright); and the rest are a scattering of other topics (computer document forensics, licensing practices and interpretation, computer security, algorithmic analysis, and so on).
During that time, and particularly in the IT project failure cases, I have searched through literally millions of pages of documents, including probably hundreds of thousands of e-mails. In fact, e-mails are the lifeblood of my investigations and my favorite category of documents. Let me explain why.
You learn what people were really thinking, or at least saying, at that moment in time
Much more than prepared documents, and even more than memos and letters, e-mails contain unvarnished reactions to and commentaries on then-current events, particularly internal e-mails within a given group or organization. One of my all-time favorite e-mails, from an early IT project failure case, was sent by one programmer to the rest of his team, saying in pretty close to these exact words, “I can’t believe we’re charging our customer [a very large dollar amount] for this garbage! We should be embarrassed!” I was happy, because I was representing the customer, not the developers; I have seen similar e-mails in other cases from my client’s side and winced mightily. Most e-mails are more restrained than this, but even they can clearly indicate or at least suggest how people really felt about a given topic.
Generally speaking, I give internal e-mails the greatest weight for honesty; internal memos and reports come next; then e-mails traded with the other side, formal reports and memos next, then letters, then years-later recollections and testimony. It is always fascinating to see someone under oath at a deposition claim that things were a certain way at a given point in time — and then see that person presented with an e-mail from the time in question, written by them, that directly contradicts that assertion.
You can build a timeline
E-mails help establish — not just for a given month or day, but time-stamped right down to the hour, minute and often second — who said what, who knew what, who was told what. I have had cases where my ability to make a certain point — to establish a given fact or a high likelihood thereof in my expert report — boiled down to the interleaving time stamps of two or three e-mails. On a much broader scale, you can trace the history of the project and pinpoint the date when things started to go in a different direction — what I refer to as “the inflection point”. My preference, when I start a new case, is to get a comprehensive set of documents in chronological order and then read through them.
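The interleaving of time stamps can be sketched in a few lines of Python. This is only a minimal illustration, assuming the e-mails are available as raw text with their `Date` headers intact; the function name `build_timeline` and the sample messages are hypothetical, not taken from any real case.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def build_timeline(raw_messages):
    """Return (timestamp, sender, subject) tuples in chronological order."""
    entries = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        # parsedate_to_datetime honors the UTC offset in the Date header,
        # so messages sent from different time zones interleave correctly.
        sent = parsedate_to_datetime(msg["Date"])
        entries.append((sent, msg["From"], msg["Subject"]))
    return sorted(entries, key=lambda entry: entry[0])


# Hypothetical sample data: Bob's reply carries a later local clock time
# AND a different UTC offset than Alice's original question.
SAMPLE = [
    "From: bob@example.com\n"
    "Date: Tue, 03 Mar 2015 09:15:42 -0500\n"
    "Subject: Re: schedule slip\n\nWe knew last week.\n",
    "From: alice@example.com\n"
    "Date: Tue, 03 Mar 2015 14:02:10 +0000\n"
    "Subject: schedule slip\n\nAre we on track?\n",
]

for sent, sender, subject in build_timeline(SAMPLE):
    print(sent.isoformat(), sender, subject)
```

Sorting on the parsed, offset-aware time stamp rather than on the header text is the important design choice here: two messages can appear out of order on the page simply because they were sent from different time zones.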
You can uncover misrepresentations
Internal e-mails often act as a reality check against more formal documents or pronouncements. One of my largest cases (over 1 million documents in the electronic database) involved public representations in investor conference calls by the officers of a publicly-traded corporation about how well a major IT re-engineering project was going, even while internal e-mails and other documents showed not only that the project was going badly, but that said officers knew it. Similarly, I will often find a major disconnect between what is going into formal status reports or inter-party communications and what is actually being said internally on one side or the other.
You can see who was talking to whom and about what
In both project failure and intellectual property cases, it is often important to figure out the lines of communication of key information. So if Alice sends an e-mail about X to Bob and cc’s Carol at a given date and time, and Carol subsequently has a meeting with Dave about X, you know that Carol went into that meeting with Dave knowing what Alice said about X. That may seem obvious, but I have seen cases where major points hinged on just such a chain of events, allowing me to draw inferences about the context of Carol and Dave’s meeting that I would not have been able to draw otherwise.
You can flush out other documents and e-mails
[ADDED] Produced e-mails will often indicate that they have one or more documents attached, but there are times when those documents have not been produced. Similarly, e-mails may make mention of other (unattached) documents or may reference earlier e-mails on the same related topics — and it turns out those have not been produced, either. Subsequent requests can then be made that these additional e-mails and/or documents be produced.
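One way to spot earlier e-mails that were referenced but never produced is to compare the standard `Message-ID`, `In-Reply-To`, and `References` headers across the production. The sketch below assumes the e-mails were produced with those headers intact (which is often not the case); `missing_references` and the sample messages are hypothetical names and data used only for illustration.

```python
from email import message_from_string


def missing_references(raw_messages):
    """List message IDs that are referenced by produced e-mails
    but do not appear among the produced e-mails themselves."""
    messages = [message_from_string(raw) for raw in raw_messages]
    produced = {m["Message-ID"].strip() for m in messages if m["Message-ID"]}
    referenced = set()
    for m in messages:
        for header in ("In-Reply-To", "References"):
            # Both headers hold whitespace-separated IDs like <id@host>.
            referenced.update((m[header] or "").split())
    return sorted(referenced - produced)


# Hypothetical production: the middle message of a three-message thread
# was never produced.
PRODUCED = [
    "Message-ID: <3@example.com>\n"
    "In-Reply-To: <2@example.com>\n"
    "References: <1@example.com> <2@example.com>\n"
    "Subject: Re: Re: status\n\nSee my earlier note.\n",
    "Message-ID: <1@example.com>\n"
    "Subject: status\n\nWhere do we stand?\n",
]

print(missing_references(PRODUCED))
```

Any ID this turns up is only a lead, not proof that something was withheld; the referenced message may have been sent by a third party or may legitimately fall outside the scope of the request.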
Production format of documents (including e-mails) is critical
When I started doing expert witness work back in 1999, almost all documents I received were produced in hard copy. On a new case, I would receive anywhere from a few to a few dozen file boxes of documents, often in Bates-stamped order rather than chronological or topical. This made reading and especially searching through the documents very time consuming.
Today, the vast majority of document production is electronic, typically in PDF format, with the text already having been run through optical character recognition (OCR), so that the documents can be readily indexed and searched. In litigation today, if one side produced — or tried to produce — 50,000 pages of documents in hardcopy format, the other side would complain mightily to the judge or arbitrator, who would almost certainly go back and demand that the producing side scan in and index the documents before producing them in electronic format. I will note that in particularly nasty cases, one side may produce documents in TIFF (photo image) format, and often mediocre-quality images at that, which makes it hard to accurately OCR the documents in question. So having the documents in full-text-searchable format is critical.
Also critical with any production of files that were originally electronic files, e-mails included, is the preservation of associated metadata, that is, data about the document (vs. what’s actually in the document). The first level is getting the correct “Created” and “Last modified” dates from the hosting operating system. For many applications (e.g., Microsoft Word), there is internal metadata as well; you can recover this with various utilities, and much of it you can see for certain document types via the operating system. For example, right click on a Microsoft Word document, bring up the Properties panel, and then click over to the Details tab, and you can see the metadata that MS Word tracks and keeps with the document file itself.
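For a sense of what that internal metadata looks like, here is a minimal sketch that reads it directly from a Word file. It assumes the newer `.docx` format, which is a ZIP archive holding its core properties in `docProps/core.xml`; it does not handle the older binary `.doc` format. The function name `docx_metadata` is hypothetical.

```python
import os
import zipfile
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# XML namespaces used by the core-properties part of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def docx_metadata(path):
    """Return internal and file-system metadata for a .docx file."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text(tag):
        node = root.find(tag, NS)
        return node.text if node is not None else None

    # File-system time stamp: changes whenever the file is copied or
    # re-saved, which is why the internal dates matter as a cross-check.
    fs_modified = datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)
    return {
        "author": text("dc:creator"),
        "last_modified_by": text("cp:lastModifiedBy"),
        "created": text("dcterms:created"),
        "modified": text("dcterms:modified"),
        "fs_modified": fs_modified.isoformat(),
    }
```

Calling `docx_metadata("report.docx")` (a hypothetical file name) returns a dictionary in which the internal dates can be compared against the file-system date. A large gap between the two is not suspicious by itself, since copying a file resets the file-system date, but it does show why both levels of metadata need to be preserved.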
For e-mails, the ideal production is in the original e-mail electronic format itself (Outlook, etc.) with full headers and metadata. For most of my cases, the header information doesn’t matter that much — I’m usually just interested in authorship, recipients, and the date/time stamp. But in a matter such as this, where there are still serious questions about both the security and routing of said e-mails, full headers should have been produced in the disclosed e-mails. I strongly suspect that the headers are gone in the hardcopy e-mails produced by Clinton to the State Department — and I suspect the original electronic copies of those e-mails are likewise long gone.
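To illustrate what full headers add, each mail server that handles a message normally prepends a `Received` header recording where the message came from and when. The sketch below, a rough illustration only, reconstructs that routing from a raw message; the function name `routing_hops` and the sample host names are hypothetical, and it assumes well-formed `Received` headers that end in `; <date>`.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def routing_hops(raw_message):
    """Return (timestamp, route) pairs, oldest hop first."""
    msg = message_from_string(raw_message)
    hops = []
    # Servers prepend Received headers, so the first one in the file is
    # the most recent hop; reverse them to get chronological order.
    for received in reversed(msg.get_all("Received", [])):
        # Assumes the usual "<route info>; <date>" layout; malformed
        # headers are not handled in this sketch.
        route, _, stamp = received.rpartition(";")
        hops.append((parsedate_to_datetime(stamp.strip()), " ".join(route.split())))
    return hops


# Hypothetical message that passed through one relay.
RAW = (
    "Received: from relay.example.net by mx.example.org;\n"
    " Tue, 03 Mar 2015 09:15:45 -0500\n"
    "Received: from sender-pc.example.com by relay.example.net;\n"
    " Tue, 03 Mar 2015 09:15:43 -0500\n"
    "From: alice@example.com\n"
    "To: bob@example.org\n"
    "Subject: status\n\nBody text.\n"
)

for stamp, route in routing_hops(RAW):
    print(stamp.isoformat(), route)
```

None of this information survives a printout of the message body, which is why a hardcopy production cannot answer questions about which servers a message passed through.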
Preservation of documents is critical
In litigation, each side is usually ordered to produce all documents responsive to a particular demand. However, in the course of that litigation, the court may order one side or the other to produce additional documents, often in response to what was found in earlier productions. These often turn out to be critical documents, yet the producing side held them back initially as not being responsive. Because of that, the standard in litigation is that each side must preserve all documents in any way related to the matter at hand, or be judged guilty of spoliation (deliberate destruction of evidence). Furthermore, this standard applies once there is even the suspicion or possibility of litigation.
If this were a litigation, the Clinton team would be in major trouble. It appears from their own admission that they and they alone determined what was responsive, and then — reading between the lines just a bit — destroyed the electronic originals of all the clintonemail.com e-mails, both those they deemed related to State Department work and those they deemed private. All metadata is gone, all headers are gone, and there is no way for the Clinton team to go back and do additional searches through the non-produced e-mails to see if any are responsive to additional requests by State or by Congress on certain topics, parties, and/or keywords.
Remember that, while Hillary may paint herself as less than technically savvy, both she and Bill are lawyers. They know how production of documents works and how preservation of evidence works. True, they aren’t in litigation and aren’t subject to court sanctions or findings, but they know exactly what someone on the opposing side might look for, and they took preemptive and irreversible steps to block further investigation.
Again, with nearly 16 years of reading e-mails for litigation analysis purposes, I have to conclude that the Clinton team took deliberate steps to remove information from the e-mails they did produce, to destroy any backing electronic evidence of those e-mails, and to make them as difficult to search and analyze as possible. I’ve certainly seen it in my own professional work, and I know what is typically behind it. ..bruce..
[Here are all posts related to the Clinton e-mail issue.]
 Bates-stamping of documents means putting a unique sequential numeric ID on each page produced by a given party in the litigation, so that all parties involved can consistently refer to the exact same document and given page therein.