Forensics

By Naoum Salamé. Last modified 10/11/2019 11:48

Dossier in preparation

Forensics, DNA Fingerprinting, and CODIS

Citation: Norrgard, K. (2008) Forensics, DNA fingerprinting, and CODIS. Nature Education 1(1):35

DNA is present in nearly every cell of our bodies, and we leave cells behind everywhere we go without even realizing it. Flakes of skin, drops of blood, hair, and saliva all contain DNA that can be used to identify us. In fact, the study of forensics, commonly used by police departments and prosecutors around the world, frequently relies upon these small bits of shed DNA to link criminals to the crimes they commit. This fascinating science is often portrayed on popular television shows as a simple, exact, and infallible method of finding a perpetrator and bringing him or her to justice. In truth, however, teasing out a DNA fingerprint and determining the likelihood of a match between a suspect and a crime scene is a complicated process that relies upon probability to a greater extent than most people realize. Government-administered DNA databases, such as the Combined DNA Index System (CODIS), do help speed the process, but they also bring to light complex ethical issues involving the rights of victims and suspects alike. Thus, understanding the ways in which DNA evidence is obtained and analyzed, what this evidence can tell investigators, and how this evidence is used within the legal system is critical to appreciating the true ethical and legal impact of forensic genetics.

How Does DNA Identification Work?

Although the overwhelming majority of the human genome is identical across all individuals, there are regions of variation. This variation can occur anywhere in the genome, including areas that are not known to code for proteins. Investigation into these noncoding regions reveals repeated units of DNA that vary in length among individuals. Scientists have found that one particular type of repeat, known as a short tandem repeat (STR), is relatively easily measured and compared between different individuals. In fact, the Federal Bureau of Investigation (FBI) has identified 13 core STR loci that are now routinely used in the identification of individuals in the United States, and Interpol has identified 10 standard loci for the United Kingdom and Europe. Nine STR loci have also been identified for Indian populations.

As its name implies, an STR contains repeating units of a short (typically three- to four-nucleotide) DNA sequence. The number of repeats within an STR is referred to as an allele. For instance, the STR known as D7S820, found on chromosome 7, contains between 5 and 16 repeats of GATA. Therefore, there are 12 different alleles possible for the D7S820 STR. An individual with D7S820 alleles 10 and 15, for example, would have inherited a copy of D7S820 with 10 GATA repeats from one parent, and a copy of D7S820 with 15 GATA repeats from his or her other parent. Because there are 12 different alleles for this STR, there are 78 different possible genotypes, or pairs of alleles. Specifically, there are 12 homozygotes, in which the same allele is received from each parent, as well as 66 heterozygotes, in which the two alleles are different.
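The genotype count for D7S820 can be checked by enumerating every unordered pair of alleles. The following is a minimal sketch; the variable names are illustrative only.

```python
from itertools import combinations_with_replacement

# D7S820 alleles are named by their repeat count: 5 through 16 GATA repeats.
alleles = list(range(5, 17))

# A genotype is an unordered pair of alleles, one inherited from each parent.
genotypes = list(combinations_with_replacement(alleles, 2))
homozygotes = [g for g in genotypes if g[0] == g[1]]
heterozygotes = [g for g in genotypes if g[0] != g[1]]

print(len(alleles), len(genotypes), len(homozygotes), len(heterozygotes))
# prints: 12 78 12 66
```

In general, a locus with n alleles has n homozygous genotypes and n(n - 1)/2 heterozygous genotypes, for a total of n(n + 1)/2.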

The Statistical Strength of a 13-STR Profile

Within the U.S., the 13-STR profile is a widely used means of identification, and this technology is now routinely employed to identify human remains, to establish or exclude paternity, or to match a suspect to a crime scene sample.

In order to utilize STR information as a means of human identification, the FBI established the frequency with which each allele of each of the 13 core STRs naturally occurs in people of different ethnic backgrounds. To this end, the FBI analyzed DNA samples from hundreds of unrelated Caucasian, African American, Hispanic, and Asian individuals. Assuming that all 13 STRs follow the principle of independent assortment (and they should, as they are scattered widely across the genome) and that the population randomly mates, a statistical calculation based upon the FBI-determined STR allele frequencies reveals that the probability of two unrelated Caucasians having identical STR profiles, or so-called "DNA fingerprints," is approximately 1 in 575 trillion (Reilly, 2001).

This very small number needs to be put into perspective. Note that this figure refers to pairs of people, and there are many pairs of people in the world. Indeed, for the 100 million Caucasians in the world, there are about 5,000 trillion possible pairs of people, so roughly eight or nine pairs would be expected to match at the 13 STR loci. This predicted matching does not specify which profile is shared by two people, and the chance that anyone matches the particular profile associated with a crime is still very small. The distinction between two people sharing a profile and one person having a particular profile is an example of the so-called "birthday problem." Here, the probability that a person has a particular birthday is 1 in 365, ignoring February 29, but there is roughly a 50% chance that at least two people in a random group of 23 people have the same unspecified birthday (Weir, 2007).
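Both figures in this paragraph can be reproduced with a few lines of arithmetic. The sketch below uses the population size and match probability quoted in the text; the function name is illustrative, and the birthday calculation assumes 365 equally likely birthdays.

```python
from math import comb, prod

population = 100_000_000   # population figure quoted in the text
match_odds = 575e12        # 1 in 575 trillion, for any one pair of people

# Number of unordered pairs of people, then the expected number of matching pairs.
pairs = comb(population, 2)
expected_matching_pairs = pairs / match_odds

def shared_birthday_probability(group_size, days=365):
    """Chance that at least two people in the group share a birthday."""
    all_distinct = prod((days - i) / days for i in range(group_size))
    return 1 - all_distinct

print(f"{pairs:.3e} pairs, about {expected_matching_pairs:.1f} expected matches")
print(f"23 people: {shared_birthday_probability(23):.3f}")
```

The pair count comes to about 5 x 10^15 (5,000 trillion), giving between eight and nine expected matching pairs, and a group of 23 is the smallest for which the shared-birthday probability exceeds one half.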

DNA Extraction and Analysis

To perform a forensic DNA analysis, DNA is first extracted from a sample. Just one nanogram of DNA is usually a sufficient quantity to provide good data. The region containing each STR is then PCR amplified and resolved according to size, giving an overall profile of STR sizes (alleles). The 13 core STRs vary in length from 100 to 300 bases, allowing even partially degraded DNA samples to be successfully analyzed. The costs of analysis, both in time and reagents, are significantly reduced by the amplification of all 13 STRs in just two multiplex PCR reactions.

Depending on the complexity of the repeat unit, the different alleles of an STR can vary by as little as a single nucleotide. For instance, the aforementioned D7S820 STR is relatively simple and contains between 5 and 16 repeats of GATA. Another STR, D21S11, has a more complex repeat pattern consisting of a mixture of tetra- and trinucleotides, and it therefore has alleles that vary in size by a single base pair. Because of the need to differentiate single-base differences, PCR products are typically resolved using automated DNA sequencing technologies with software that recognizes allele patterns by comparison to a known "ladder."
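Single-base differences show up in allele names as a decimal suffix, such as the 9.3 and 30.2 alleles listed in Table 1. By convention, the number before the point counts complete repeat units and the digit after it counts extra bases from an incomplete unit. The sketch below assumes a four-base repeat unit; the function name is hypothetical and the result is only the length of the repeat region, not of the full PCR product.

```python
def repeat_region_length(allele, unit=4):
    """Length in bases of the repeat region for an allele name such as '9.3'.

    Assumes each complete repeat is `unit` bases long (four by default).
    """
    whole, _, partial = allele.partition(".")
    return int(whole) * unit + (int(partial) if partial else 0)

# An allele named 9.3 is one base shorter than allele 10.
print(repeat_region_length("9.3"), repeat_region_length("10"))
# prints: 39 40
```

Allele names are kept as strings here so that "9.3" is read as nine repeats plus three bases rather than as a floating-point number.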

Making an STR Match

In order to match, for example, crime scene evidence to a suspect, a lab would determine the allele profile of the 13 core STRs for both the evidence sample and the suspect's sample. If the STR alleles do not match between the two samples, the individual would be excluded as the source of the crime scene evidence. However, if the two samples have matching alleles at all 13 STRs, a statistical calculation would be made to determine the frequency with which this genotype is observed in the population. Such a probability calculation takes into account the frequency with which each STR allele occurs in the individual's ethnic group. Given the population frequency of each STR allele, a simple Hardy-Weinberg calculation gives the frequency of the observed genotype for each STR: p² for a homozygote carrying an allele of frequency p, and 2pq for a heterozygote carrying alleles of frequencies p and q. Multiplying together the frequencies of the individual STR genotypes then gives the overall profile frequency.
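The calculation described above can be sketched in a few lines. The allele frequencies and locus names below are hypothetical placeholders chosen for illustration; they are not FBI population data.

```python
from math import prod

def genotype_frequency(allele_a, allele_b, allele_freqs):
    """Hardy-Weinberg genotype frequency: p*p if homozygous, 2*p*q otherwise."""
    p = allele_freqs[allele_a]
    q = allele_freqs[allele_b]
    return p * p if allele_a == allele_b else 2 * p * q

# Hypothetical allele frequencies for two loci (illustrative values only).
allele_freqs = {
    "locus_1": {"10": 0.25, "11": 0.20},
    "locus_2": {"13": 0.35},
}
# Hypothetical profile: heterozygous at locus_1, homozygous at locus_2.
profile = {"locus_1": ("10", "11"), "locus_2": ("13", "13")}

per_locus = {
    locus: genotype_frequency(a, b, allele_freqs[locus])
    for locus, (a, b) in profile.items()
}
# Multiplying across loci assumes the loci are inherited independently.
profile_frequency = prod(per_locus.values())
print(per_locus, profile_frequency)
```

With these made-up inputs the per-locus frequencies are 2 x 0.25 x 0.20 = 0.10 and 0.35 x 0.35 = 0.1225, so the two-locus profile frequency is about 0.012.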

Table 1: Example DNA Profiles Showing the STR Alleles for Each Sample and the Genotype Frequency of Suspect B for Each STR Locus

STR Locus   Evidence Sample   Suspect A   Suspect B   Suspect B's Genotype Frequency for Each STR
D3S1358     15, 17            17, 17      15, 17      0.13
vWA         15, 16            18, 19      15, 16      0.22
FGA         23, 27            21, 23      23, 27      0.31
D8S1179     12, 13            14, 15      12, 13      0.34
D21S11      28, 30            27, 30.2    28, 30      0.06
D18S51      12, 18            14, 18      12, 18      0.11
D5S818      13, 13            9, 12       13, 13      0.29
D13S317     12, 12            12, 12      12, 12      0.21
D7S820      10, 11            9, 10       10, 11      0.26
CSF1PO      8, 11             11, 12      8, 11       0.18
TPOX        7, 8              8, 8        7, 8        0.30
TH01        9.3, 9.3          6, 9.3      9.3, 9.3    0.38
D16S539     9, 13             11, 12      9, 13       0.10

In the fictional case shown in Table 1, Suspect A is excluded as the source of the crime scene sample. Suspect B, on the other hand, matches the crime scene sample at all 13 STRs. A calculation of the frequency of Suspect B's genotype, based upon the STR allele frequencies within Suspect B's ethnic group, reveals that the likelihood that a random member of this ethnic group has this profile is about 1 in 1.5 billion. It is important to understand that this number is the probability of seeing this DNA profile if the crime scene evidence did not come from the suspect but from some other person.
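The exclusion of Suspect A and the profile frequency for Suspect B can both be reproduced from the values in Table 1. The sketch below is illustrative only: the data structure and function name are assumptions, while the alleles and genotype frequencies are copied from the table.

```python
from math import prod

# Table 1 rows: locus -> (evidence, suspect A, suspect B, suspect B genotype frequency)
table_1 = {
    "D3S1358": (("15", "17"), ("17", "17"), ("15", "17"), 0.13),
    "vWA":     (("15", "16"), ("18", "19"), ("15", "16"), 0.22),
    "FGA":     (("23", "27"), ("21", "23"), ("23", "27"), 0.31),
    "D8S1179": (("12", "13"), ("14", "15"), ("12", "13"), 0.34),
    "D21S11":  (("28", "30"), ("27", "30.2"), ("28", "30"), 0.06),
    "D18S51":  (("12", "18"), ("14", "18"), ("12", "18"), 0.11),
    "D5S818":  (("13", "13"), ("9", "12"), ("13", "13"), 0.29),
    "D13S317": (("12", "12"), ("12", "12"), ("12", "12"), 0.21),
    "D7S820":  (("10", "11"), ("9", "10"), ("10", "11"), 0.26),
    "CSF1PO":  (("8", "11"), ("11", "12"), ("8", "11"), 0.18),
    "TPOX":    (("7", "8"), ("8", "8"), ("7", "8"), 0.30),
    "TH01":    (("9.3", "9.3"), ("6", "9.3"), ("9.3", "9.3"), 0.38),
    "D16S539": (("9", "13"), ("11", "12"), ("9", "13"), 0.10),
}

def mismatched_loci(column):
    """Loci where the sample in `column` (1 = Suspect A, 2 = Suspect B)
    differs from the evidence, comparing genotypes as unordered pairs."""
    return [
        locus for locus, row in table_1.items()
        if sorted(row[column]) != sorted(row[0])
    ]

suspect_a_mismatches = mismatched_loci(1)
suspect_b_mismatches = mismatched_loci(2)

# Overall profile frequency: the product of the per-locus genotype frequencies.
profile_frequency = prod(row[3] for row in table_1.values())

print(len(suspect_a_mismatches), len(suspect_b_mismatches))
print(f"about 1 in {1 / profile_frequency:.2e}")
```

Suspect A differs from the evidence at 12 of the 13 loci (only D13S317 agrees), while Suspect B differs at none. The product of the 13 genotype frequencies is about 6.5 x 10^-10, which corresponds to the figure of roughly 1 in 1.5 billion.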

(...)