The Impenetrable Program Transforming How Courts Treat DNA Evidence

A legal battle is trying to expose the inner workings of TrueAllele, game-changing software that attempts to identify criminals based on subtle traces of DNA.
Li-Anne Dias

In the summer of 2013, Kern County law enforcement faced a serious problem. A series of women were reporting rapes in Bakersfield, California, a large industrial metropolis north of Los Angeles. The victims generally identified their attacker as a dark-skinned man wearing a ski mask and a hoodie. Despite a varied m.o., police believed a single perpetrator was responsible. He snuck into houses while the occupants were asleep. Sometimes he bound his victims with tape or zip ties. Sometimes he covered their faces. The local news began calling him the East Side Rapist.

Police caught a break when they pulled over a man named Billy Ray Johnson, who was driving a red Chevy Caprice with an expired license. Johnson had a record of theft, assault, and domestic battery; he also had brass knuckles in his car. Soon, law enforcement began tracking Johnson’s cell phone, which would ping the police with his location every 15 minutes. For police, this data placed Johnson in the locations of the crimes on nights they were committed. A few months later, they arrested him.

But the physical evidence linking Johnson to the crimes was proving difficult to assess. The Kern County Forensics Lab had collected samples of DNA from the crime scenes. These included both blood stains and touch DNA, the material that transfers from a person’s skin to a surface such as a windowsill or doorknob. There was some blood on a towel, and touch DNA on a vase and some zip ties. But most of these samples couldn’t be analyzed with typical methods: They were too fragile or included a mix of DNA from multiple people, a combination that makes analysis difficult.

So the lab turned to TrueAllele, a program sold by Cybergenetics, a small company dedicated to helping law enforcement analyze DNA where regular lab tests fail. The program relies on something called probabilistic genotyping, which uses complex mathematical formulas to examine the statistical likelihood that a certain genotype comes from one individual over another. It’s a type of DNA testing that’s becoming increasingly popular in courtrooms. Cybergenetics advertises that its program has been used in over 500 cases since 2009. STRmix, a competing program that uses a similar technique, has been used in thousands of cases. The TrueAllele tests determined that some of the DNA found at the crime scenes likely originated from Johnson; for one sample, the software put the chance that it originated from someone else at one in 211 quintillion.
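To give a rough sense of what a match statistic of this kind measures, here is a minimal sketch of a likelihood ratio for a two-person mixture at a single genetic marker. It is not TrueAllele’s method, whose code is not public. It is a simplified textbook-style model that looks only at which alleles are present and ignores signal strength, missing alleles, and contamination. The allele names, the frequencies, and the function names are all made up for illustration.

```python
from itertools import combinations_with_replacement

# Hypothetical allele frequencies at one marker (illustrative numbers only).
FREQ = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}


def genotype_probs(freq):
    """Probability of each unordered genotype, assuming Hardy-Weinberg proportions."""
    probs = {}
    for a, b in combinations_with_replacement(sorted(freq), 2):
        probs[(a, b)] = freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
    return probs


def prob_mixture(observed, known, n_unknown, freq):
    """P(exactly this allele set is seen | known people plus n_unknown random people)."""
    observed = frozenset(observed)
    gprobs = genotype_probs(freq)

    def recurse(alleles, remaining):
        if remaining == 0:
            # Every observed allele must be explained, with nothing extra.
            return 1.0 if alleles == observed else 0.0
        total = 0.0
        for genotype, p in gprobs.items():
            combined = alleles | set(genotype)
            if combined <= observed:  # skip genotypes that add unseen alleles
                total += p * recurse(combined, remaining - 1)
        return total

    start = frozenset(a for genotype in known for a in genotype)
    if not start <= observed:
        return 0.0
    return recurse(start, n_unknown)


def likelihood_ratio(observed, suspect, freq):
    """Compare 'suspect plus one unknown person' against 'two unknown people'."""
    p_with_suspect = prob_mixture(observed, [suspect], 1, freq)
    p_without_suspect = prob_mixture(observed, [], 2, freq)
    return p_with_suspect / p_without_suspect


if __name__ == "__main__":
    lr = likelihood_ratio({"A", "B", "C"}, ("A", "B"), FREQ)
    print(f"Likelihood ratio at this one marker: {lr:.2f}")
```

With these invented numbers, the evidence comes out 6.25 times more probable if the suspect contributed than if two unknown people did. Real casework combines many markers, and multiplying the per-marker ratios is one way very large figures can arise. Note that a ratio like this compares two explanations of the evidence; it is not, by itself, the probability that someone else left the DNA.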

Johnson was tried and convicted of the rapes, and received multiple life sentences. After the trial, Cynthia Zimmer, Johnson’s prosecutor, didn’t mince words, calling him a “sadistic monster” to the press. Zimmer, who is running for Kern County District Attorney in 2018, is campaigning in part on her strength as a prosecutor who knows how to use advanced scientific evidence. Programs like TrueAllele are a part of that. “You have to be current with technology,” she said at a campaign event. In a recent phone interview, she told me that “the science of DNA has progressed and continues to progress.”

But now legal experts, along with Johnson’s advocates, are joining forces to argue to a California court that TrueAllele—the seemingly magic software that helped law enforcement analyze the evidence that tied Johnson to the crimes—should be forced to reveal the code that sent Johnson to prison. This code, they say, is necessary in order to properly evaluate the technology. In fact, they say, justice from an unknown algorithm is no justice at all.

As technology advances, the law lags behind. As John Oliver commented last month, law enforcement and lawyers rarely understand the science behind detective work. Over the years, various types of “junk science” have been discredited. Arson burn patterns, bite marks, hair analysis, and even fingerprints have all been found to be less accurate than previously thought. A September 2016 report by President Obama’s Council of Advisors on Science and Technology (PCAST) found that many of the common techniques law enforcement has historically relied on lack common standards.

In this climate, DNA evidence has been a modern miracle. DNA remains the gold standard for solving crimes, bolstered by academics, verified scientific studies, and experts around the world. Since the advent of DNA testing, nearly 200 people have been exonerated using newly tested evidence; in some places, courts will only consider exonerations with DNA evidence. Juries, too, have become more trusting of DNA, a response known popularly as the “CSI Effect.” A number of studies suggest that the presence of DNA evidence increases the likelihood of conviction or a plea agreement.

But, as the PCAST report says, “DNA analysis, like all forensic analyses, is not infallible in practice.” Many of these mistakes are caused by humans: Massachusetts, for example, is facing appeals in hundreds of cases because a forensic lab technician was using drugs on the job, leading to potentially inaccurate results. But DNA analysis of complex mixtures, the kind that requires probabilistic genotype matching, is particularly error-prone. According to the report, for that type of sample, “substantially more evidence is needed to establish foundational validity.”

Initially, DNA matching required a relatively pure sample, untainted by other bodily fluids. This is called “single-source DNA,” the kind made famous by rape kit testing and exonerations. But, as technology has improved, more processes have become available to detect DNA in trace amounts—like the kind left by a fingertip on a computer keyboard. These processes can also often parse DNA when the blood of multiple people is mixed together. Through probabilistic genotype matching, programs like TrueAllele can sort out the DNA strands presented in such a biological stew.

When Dr. Mark Perlin formed Cybergenetics in 1994, such techniques were tenuous. A few years later, the company began to focus on forensic technology: Perlin patented various algorithms that would be able to predict the presence of a specific person’s DNA from a sample that might include several people’s biological product. Perlin has marketed the tool, called TrueAllele, as the newest incarnation of DNA technology. In a series of YouTube PowerPoint presentations from 2014, Perlin argues that TrueAllele, unlike humans, is “objective...when it’s solving for genotypes, it never considers a reference or a suspect.” The founder quickly became an outspoken advocate-slash-salesman for the method’s use both by law enforcement and in exonerations; he often turns to videos to articulate his explanations. (He is also a musician who has written songs about catching criminals through his software.)

In 2009, the first TrueAllele case reached a courtroom. A Pennsylvania state trooper named Kevin Foley was tried for the stabbing death of his girlfriend’s estranged husband. Perlin testified that DNA found under the victim’s fingernails had a strong match statistic to Foley’s DNA: 189 billion to one. The court admitted the evidence; Foley was convicted of first-degree murder.

TrueAllele isn’t cheap. The Kern County lab used a $200,000 grant from the National Institute of Justice to purchase TrueAllele technology. A license to use TrueAllele is $60,000, according to court documents.

But the very thing that makes tools like TrueAllele invaluable to courts—its ability to make connections that elude humans—makes it difficult for those courts to assess. Probabilistic genotyping can analyze very small amounts of DNA by using the kind of complex code that would be impossible for a human (but not a computer) to run. This year, a ProPublica investigation uncovered aspects of the probabilistic software used by New York City forensic labs that might make the results unreliable. (New York forensic labs switched to another probabilistic software, STRmix, and advocates called for a New York State inspector general investigation into the lab.) Similarly, in 2014, STRmix, a competitor to TrueAllele, was found by a judge to have coding errors, involving certain mixtures of three-person DNA samples, that created misleading results.

After the scandal, STRmix released the algorithm publicly. But the cofounder of STRmix, John Buckleton, told me that he does not think that access to the algorithm would help lawyers figure out if the tool was free from error or bias. “I think it’s rubbish,” he says. “It would take a genius to work out an error from a code.” He adds that he kept the code public as a way to overcome critics of the technology.

Cybergenetics’ tagline is “Justice through better science,” and volumes of marketing materials boast about TrueAllele’s ability to pinpoint better results in the courtroom. In a newsletter, Cybergenetics writes about the Johnson case, saying that TrueAllele obtained results for eight samples where other methods found the results “inconclusive.” “Recovering ‘inconclusive’ DNA mixture evidence through TrueAllele computer interpretation generally leads to guilty pleas,” the newsletter boasts.

Still, Johnson’s lawyers argue that the source code is crucial to their defense. Johnson’s case is just one of many that used TrueAllele or other programs like it. But his lawyers, along with the ACLU, the Electronic Frontier Foundation, and the Northern California Innocence Project, are making the case that the trial court’s decision not to allow defense experts to examine the source code prevented him from getting a fair trial. Jennifer Friedman, the forensic expert for the Los Angeles Public Defender’s Office, which also submitted a brief in the Johnson trial, called the decision not to require the source code “problematic.” “When we move into this technology, we are moving people who are doing algebra into doing calculus,” she added.

In trial documents, Perlin argues that allowing others to see his source code would violate his right to a trade secret, and ultimately threaten his business. He also says that it’s unnecessary, because his company runs its own validation testing. The ACLU and others, he writes on his blog, are just trying to “sow confusion” over an “unbiased system.” Furthermore, Cynthia Zimmer, the prosecutor, tells me that she found the arguments of the ACLU and the Innocence Project “hypocritical.” “We have also used TrueAllele to rule people out,” she explains. “I’m not saying we’ve gone back [to old cases], but we don’t file if TrueAllele rules it out.”

After I left phone messages, Perlin responded to emails from me, mostly reiterating his trial testimony in Johnson’s case and information available on his blog. Perlin has previously said that he believes his algorithm is more accurate than his competitors’. Cybergenetics has recently announced that it will make the TrueAllele code accessible to defense attorneys for $10,000, plus $2,000 a day. Stephanie Lacambra, a staff attorney at the Electronic Frontier Foundation who also filed a brief in the Johnson case, told me that “Perlin’s financial interests should never take precedence over liberty.”

In October, the Department of Commerce’s National Institute of Standards and Technology announced that it would embark on a study to determine the reliability of DNA testing, including the algorithmic methods used by programs like TrueAllele. Perlin has condemned this research, which NIST says is meant to establish “foundational validity,” in a post on the Cybergenetics blog that calls the study “wasteful,” “unnecessary,” and “meaningless.” Perlin argues that TrueAllele’s validity has already been proven through scientific, peer-reviewed research studies. But the studies that Perlin relies on are internal validation studies that were paid for and run by Cybergenetics.

The leader of the project, Dr. John Butler at NIST, has previously spoken about the problems with DNA probabilistic analysis. He told ProPublica that the study was not a “Consumer Reports on software,” but “to see, if presented with mixtures—and people are free to use manual methods or different software systems—what the different responses are.” I spoke with Dr. Mike Coble at NIST, an expert on forensic sciences who has published his own studies on probabilistic genotyping. He says that NIST is not planning to review particular companies, and that the study is instead a “look at the foundational review of mixtures” more generally.

Coble says that the goal is community education—including lawyers, judges and juries. “There’s a real hunger and desire to understand what’s going on in that box, what the program is doing and how does it do this,” he says.

This October, Perlin accepted an award from the Foundation for the Improvement of Justice. At the ceremony, he told the story of Darryl Pinkins, an Indiana exoneree who spent 25 years in prison for rape. He described TrueAllele as a program that “unmixes mixtures and it doesn’t take sides.” Perlin also announced his nonprofit, Justice Through Science. At the first conference, Zimmer, the prosecutor who put away Billy Johnson, was a guest speaker.

Tools like TrueAllele are continuing to become more common in courtrooms. Friedman, the forensic expert for the Los Angeles Public Defender’s Office, told me that she thinks probabilistic genotype matching is “becoming regular practice” in criminal cases. And though such advanced technology can provide results, it also creates a new strain for overworked attorneys. According to NYU Law Professor Erin Murphy, an expert in forensic evidence, defense lawyers—even good ones—often lack the funds required to understand this kind of technology. “Even lawyers doing their best are overwhelmed,” she says.

But Dana Degler, a staff attorney in the Innocence Project's Strategic Litigation Unit, argues that Johnson’s case is about more than scientific accuracy. It doesn’t matter whether or not TrueAllele “is garbage or the best,” she says. “That doesn’t change the entitlement of the defendant.” Especially when your life depends on the results.

Correction at 2:45 p.m. on 11/29/17: A previous version of this piece stated that the Department of Commerce National Institute of Standards and Technology was embarking on a study of TrueAllele and competitors. In fact, TrueAllele was not named in the study.