The world of DNA can be pretty daunting when it comes to understanding the lingo: chromosomes, autosomal, alleles, short tandem repeat, single-nucleotide polymorphism… The list goes on…
So, here at WhichDNA, we will demystify the terminology, and pretty much tell you what is essential and what can be parked for the time being, until you are ready to take the next step. The great problem with learning about DNA is that too much too soon can be quite overwhelming and lead to that glazed-over sensation when you simply can’t absorb it all.
You will notice around our website, here at WhichDNA, that those strange words have a broken line under them. We wanted to create an interactive learning toolkit, where you can run your cursor over a word and find out what it means. The glossary is courtesy of the International Society of Genetic Genealogy, ISOGG for short. We salute you!
So, let’s begin…
Whenever anyone new to DNA asks me what they really need to know, my answer is quite simple: as much or as little as you want. Some people just want very basic answers to very basic questions, and are quite happy just to see their DNA matches and ethnic breakdown. Job done! Having a more in-depth DNA experience might not be what they want, or what they have time for. Others have an instinctive need to know more. Either way, there is something for everyone.
So, here are the essentials (not in alphabetical order):
DNA: We will start with its full name, deoxyribonucleic acid. It is an acid made up of a sequence of hundreds of millions of nucleotides, found in the nuclei of cells, and it contains the genetic information about an individual. It is shaped like a double-stranded helix (double helix), which consists of two paired strands and resembles a ladder that has been twisted. The “rungs” of the ladder are made of base pairs, that is, nucleotides with complementary hydrogen bonding patterns. DNA for short!
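If you like to see things spelled out, here is a minimal Python sketch of how the “rungs” pair up. It relies on the standard base-pairing rules (A with T, C with G); the function name and the example sequence are made up for illustration.

```python
# Standard base-pairing rules: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the base that sits opposite each base of `strand`, rung by rung."""
    return "".join(PAIRS[base] for base in strand.upper())


# Made-up sequence, for illustration only.
example = "ATGCCGTA"
print(example)
print(complementary_strand(example))
```

Each letter in the second line printed is the partner of the letter directly above it, which is all a base pair really is.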
Swab: One of the ways DNA is collected. The other is spit!
DNA matches: Does what it says on the tin!
Y-DNA: In simple terms, this is the type of DNA carried through the fatherline: your father’s father’s line, also known in genetic genealogy as the patriline or patrilineal line, going back to time immemorial. In contrast, the term paternal line strictly speaking refers to any line of descent on the father’s side of the family, whether it runs through males or females. But note this: Y-DNA is only inherited by males, from father to son, so females can’t take a Y-DNA test.
Mitochondrial DNA (mt-DNA): In simple terms again, this is the type of DNA carried through the motherline: your mother’s mother’s line, also known in genetic genealogy as the matriline or matrilineal line, going back again to time immemorial. In contrast, the term maternal line strictly speaking refers to any line of descent on the mother’s side of the family, whether it runs through males or females. mt-DNA is inherited by both males and females (which doesn’t seem fair!), but only females can pass it on to their children. Both males and females can take an mt-DNA test.
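To picture the difference between the fatherline and the motherline, here is a small Python sketch. The family tree, the names and the `trace_line` helper are all invented for illustration; the idea is simply that you follow the same parent at every generation.

```python
# Hypothetical mini family tree: each person maps to (father, mother).
# All names are invented for illustration.
TREE = {
    "You": ("Dad", "Mum"),
    "Dad": ("Grandad A", "Granny A"),
    "Mum": ("Grandad B", "Granny B"),
    "Grandad A": ("Great-grandad A", "Great-granny A"),
    "Granny B": ("Great-grandad B", "Great-granny B"),
}


def trace_line(person, parent_index):
    """Follow one parent at every generation: 0 = father, 1 = mother."""
    line = []
    while person in TREE:
        person = TREE[person][parent_index]
        line.append(person)
    return line


patriline = trace_line("You", 0)  # the line Y-DNA follows
matriline = trace_line("You", 1)  # the line mt-DNA follows
print("Patriline:", " > ".join(patriline))
print("Matriline:", " > ".join(matriline))
```

Everyone else in the tree (Granny A or Grandad B, for example) is still on your paternal or maternal side, but sits outside these two direct lines. Remember that the patriline only carries Y-DNA down to you if you are male.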
Y-chromosome: The male sex chromosome. Only males have a Y-chromosome, which they receive from their father, who received it from his father, and so on. This transmission of the Y-chromosome down the male line is why it is useful for surname testing to determine if two males share a common ancestor.
X-chromosome: A sex chromosome. A female child receives one X-chromosome from her father and one X-chromosome from her mother. A male child receives an X-chromosome from his mother and a Y-chromosome from his father.
Autosomal DNA: You will hear a lot about autosomal DNA testing because it is “the” test when it comes to matching cousins from multiple ancestral lines and predicting your ethnic breakdown. Humans have 22 pairs of autosomes and one pair of sex chromosomes (the X chromosome and the Y chromosome). Autosomes are numbered roughly in relation to their sizes. For example, chromosome 1 has approximately 2,800 genes, while chromosome 22 has approximately 750 genes. So in genetic genealogy, autosomal DNA is the term used to describe DNA which is inherited from the autosomes.
Centimorgan: You’ll hear this a lot if you get deeper involved in autosomal DNA testing. Park this for now!
DYS: You will see this if you take a Y-DNA test and view your STR (short tandem repeat) results. DYS is an acronym for DNA Y-chromosome Segment – the assigned number of a marker on a segment of the Y-chromosome. Example: DYS# 393.
Haplogroup: One of the most exciting things to discover is your haplogroup. So, what is it? We all have two haplogroups! One on the patriline and another on the matriline. A haplogroup is a genetic population group of people who share a common ancestor. Haplogroups are assigned letters of the alphabet, and refinements consist of additional number and letter combinations. Because a haplogroup consists of similar haplotypes (also known as genetic signatures), it is possible to predict a haplogroup from a haplotype, but the prediction isn’t always accurate. A SNP test confirms a haplogroup, so remember that test for future reference.
Time to the Most Recent Common Ancestor (TMRCA): Another term you will hear quite a lot as you progress. It is the amount of time or number of generations since individuals shared a common ancestor. Since mutations occur at random, the estimate of the TMRCA is not an exact number (e.g., 7 generations), but rather a probability distribution. As more information is compared, the TMRCA estimate becomes more refined.
And here is a selection of DNA terms, which can be found in full in our Glossary:
Allele: An allele (pronounced UH-leel) is one of multiple alternative forms of a single gene, each of which is a viable DNA sequence occupying a given position, or locus, on a chromosome. For example, in humans, one allele of the eye-color gene produces blue eyes and another allele of the eye-color gene produces brown eyes.
DNA Newbie: Someone who is new to the field of genetic genealogy. It is also the name of a Yahoo mailing list forum sponsored by the International Society of Genetic Genealogy.
DNA sequencing: The process of determining the exact order of the nucleotide bases in a segment of DNA.
Exact match: Two individuals with exactly the same results for all markers or regions compared.
Full mitochondrial sequence (FMS): The name given by Family Tree DNA to a mitochondrial DNA test which sequences the entire mtDNA genome comprising all 16,569 base pairs.
GEDCOM: Acronym for Genealogical Data Communication. A plain text file format created for exchanging genealogical data between different genealogical programs.
Genealogical timeframe: Is the period in which it is possible to find genealogical records relating to individual ancestors which allow the researcher to construct family trees.
Generation: The number of years between the birth of the parents and the birth of their children. Different studies use different numbers of years per generation.
Genetic cousins: Individuals whose Y-DNA, mtDNA or autosomal DNA test results match one another.
Genetic genealogist: A genealogist who is involved in genetic genealogy.
Genetic genealogy: The use of DNA testing in combination with traditional genealogical and historical records to infer relationships between individuals.
Genetic distance: The number of differences, or mutations, between two sets of results. A genetic distance of zero means there are no differences in the results being compared against one another (exact match).
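For the curious, here is a rough Python sketch of counting a genetic distance between two sets of Y-STR results. It assumes a simple “stepwise” count, where each repeat of difference on a marker counts as one step. That is an assumption for illustration: testing companies apply their own counting rules. The marker values are invented.

```python
def genetic_distance(results_a, results_b):
    """Sum the repeat-count differences over the markers both people tested."""
    shared_markers = results_a.keys() & results_b.keys()
    return sum(abs(results_a[m] - results_b[m]) for m in shared_markers)


# Invented marker values, for illustration only.
person_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
person_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}

print(genetic_distance(person_a, person_b))  # two markers differ by one step each
print(genetic_distance(person_a, person_a))  # identical results: an exact match
```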
Genetics: The field of biology that studies genes and their inheritance; the study of DNA.
Genome: The entire complement of genetic material in the chromosome set of an organism, virus or organelle. Human cells carry 46 chromosomes (23 pairs), and a single set of 23 chromosomes contains about 3 billion base pairs.
Haplotype: A set of markers (polymorphisms) on a single chromosome that tend to be inherited together. A haplotype can refer to a combination of alleles or to a set of single-nucleotide polymorphisms (SNPs). Haplotype is a contraction of the term haploid genotype. In genetic genealogy, it is also the term for the set of numbers that makes up your Y-chromosome or mitochondrial DNA results.
Haplotree: A haplogroup tree. A diagram or chart showing the different lineages within a haplogroup.
Hypervariable region (HVR): The sections of non-coding mitochondrial DNA that are used for low-resolution genealogical DNA tests.
International Society of Genetic Genealogy: A free society founded in 2005 for the promotion and education of genetic genealogy.
JoGG: The Journal of Genetic Genealogy – An online journal published quarterly with articles and features pertaining to genetic genealogy and anthrogenealogy.
Junk DNA: Slang term usually used in referring to the non-coding region of DNA on the Y-chromosome.
Marker: A specific place on a chromosome with two or more forms, called alleles, the inheritance of which can be followed from one generation to the next. In Y-chromosome DNA testing, this refers to non-coding Y-chromosome DNA. Numbers designate the individual DNA segments. Example: 393=13. This means at marker #393, your allele value is 13.
Match: In genetic genealogy a match is considered to exist when a comparison of the DNA test results of two persons suggests there is a high probability of them sharing a common genetic ancestor within a relevant period of time. The testing companies each set their own criteria to determine what constitutes a match, and the criteria vary depending on the type of test taken (Y chromosome DNA test, mitochondrial DNA test or autosomal test).
Modal haplotype: Is the most commonly occurring haplotype (a set of STR marker values) derived from the DNA test results of a specific group of people. The modal haplotype does not necessarily correspond with the ancestral haplotype – the haplotype of the most recent common ancestor. The two most commonly discussed modal haplotypes are the Atlantic Modal Haplotype, the most common haplotype in parts of Europe, associated with Haplogroup R1b, and the Cohen Modal Haplotype, the haplotype associated with the Jewish Cohanim tradition. However, a specific modal haplotype may be determined for any DNA test-based surname project or other test group.
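As a hedged illustration, a modal haplotype can be worked out in a few lines of Python. This sketch assumes the modal value is taken marker by marker (the most common value at each marker across the group), and the group results below are invented.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common value at each marker across a group of results."""
    markers = haplotypes[0].keys()
    return {
        marker: Counter(h[marker] for h in haplotypes).most_common(1)[0][0]
        for marker in markers
    }


# Invented results for four members of a hypothetical surname project.
group = [
    {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    {"DYS393": 13, "DYS390": 25, "DYS19": 14},
    {"DYS393": 12, "DYS390": 24, "DYS19": 14},
    {"DYS393": 13, "DYS390": 24, "DYS19": 15},
]

print(modal_haplotype(group))
```

If two values tie at a marker, this sketch simply keeps whichever one it met first, so a real project would want a clearer rule for ties.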
Mutation: A change in the DNA that occurs spontaneously. Mutation is a scientific term that often carries a negative connotation as a result of 1950s ‘B’ movies, but in genetic genealogy, mutations are utilised for distinguishing different ancestral lines. Mutations can also occur due to environmental factors, such as exposure to radiation.
No call: When the allele(s) for a SNP are not determined during the process of genotyping a person’s DNA, allele values (such as A, C, G or T) cannot be reported. The SNP in question gets reported as a no call. The usual notation for this is either a dash (–) or a question mark (?). When comparing two people’s autosomal DNA results, it is customary to treat no calls as SNPs that match. That is, they are considered wildcards within most matching algorithms.
Non-paternity event: An event which has caused a break in the link between the surname and the Y-chromosome resulting in a son using a different surname from that of his biological father (e.g., illegitimacy, adoption, maternal infidelity).
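Here is a simplified Python sketch of the wildcard idea for no calls. It assumes one letter per SNP and uses a plain hyphen and a question mark as the no-call symbols. Real autosomal results report two alleles per SNP and each company has its own matching algorithm, so treat this as an illustration only.

```python
# Symbols assumed to mean "no call" in this sketch.
NO_CALLS = {"-", "?"}


def snps_match(a, b):
    """A no call on either side is treated as a wildcard that always matches."""
    return a in NO_CALLS or b in NO_CALLS or a == b


def count_mismatches(results_a, results_b):
    """Count the positions where two sets of results genuinely disagree."""
    return sum(not snps_match(a, b) for a, b in zip(results_a, results_b))


# Invented results at five SNPs, for illustration only.
person_a = ["A", "C", "-", "T", "G"]
person_b = ["A", "C", "G", "?", "T"]

print(count_mismatches(person_a, person_b))
```

Only the last position counts as a difference here, because the third and fourth positions each involve a no call.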
Nucleotide: One of the four monomers that make up a DNA molecule.
Null value: A null is a value of zero on a marker. Nulls can occur due to missing genetic material on a marker, or a SNP can sometimes cause a null result. Several Y-STR markers have been identified in certain families to have null results (for example, DYS439 and DYS448).
Polygenic risk score (PRS): Is a score that gives you a genetic risk for something, as calculated using many SNPs at the same time. The score is typically calculated as a score for a disease, but it can be used for any trait that is affected by many different SNPs. In traits and diseases that are genetically complex, polygenic risk scores are more useful in risk prediction than looking at individual mutations. Most common diseases like heart disease, diabetes, autoimmune disease and mental disorders are genetically complex. In genetic genealogy, polygenic risk scores are mostly of interest if you want to predict ancestral height or hair colour, and perhaps follow them through generations.
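To give a feel for how many SNPs can be combined into one number, here is a minimal Python sketch. It assumes the common weighted-sum approach, where each SNP’s weight is multiplied by how many copies of the relevant allele a person carries (0, 1 or 2). The SNP names, weights and genotype are all invented.

```python
def polygenic_score(weights, allele_counts):
    """Add up weight x number of copies (0, 1 or 2) across all scored SNPs."""
    return sum(weight * allele_counts.get(snp, 0) for snp, weight in weights.items())


# Invented SNP names, weights and genotype, for illustration only.
weights = {"snp_1": 0.12, "snp_2": -0.05, "snp_3": 0.30}
genotype = {"snp_1": 2, "snp_2": 1, "snp_3": 0}

print(round(polygenic_score(weights, genotype), 2))
```

A real score uses weights taken from large published studies and usually many more SNPs than the three shown here.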
Single-nucleotide polymorphism (SNP): (pronounced snip) A change at a single nucleotide position in the DNA. A SNP test confirms your haplogroup by determining whether a SNP is still in its ancestral state or has mutated to its derived state. A SNP is usually found on a different area of the Y-chromosome than where the Y-STR markers are. Sometimes, a SNP may cause a null result on a marker.
Subclade: Referring to a “branch” farther down the phylogenetic tree. Example: in H3, ‘3’ is a subclade of mitochondrial haplogroup H. Testing for subclades is also referred to as SNP testing or deep clade testing.
Surname era: The surname era is the period of time in a particular region or country that the use of a family name has been hereditary. Note that even within specific regions the introduction of surnames to landowners and nobles often occurred several generations before the use of surnames became universal in that region.
Triangulation: A method of determining the ancestral haplotype of an ancestor using the DNA results of direct line descendants.
Y-STR: Acronym for Y-chromosome Short Tandem Repeat. The number of times the bases repeat determines the value of the marker. Example: Thirteen repeats of the same bases equals a value of ‘13’. Y-STRs are often used in forensics, paternity, and genealogical DNA testing. Y-STRs are taken specifically from the male Y chromosome. They provide weaker analytical power than autosomal STRs because the Y chromosome is found only in males and is passed down only by the father, making the Y chromosome in any paternal line practically identical. This means there is significantly less distinction between Y-STR samples. Autosomal STRs provide much stronger analytical power because of the random matching that occurs between pairs of chromosomes when a zygote is formed.
If you visit our “learn and discover” hub, here at WhichDNA, we will be publishing a series of articles which cover a variety of subjects in greater depth than mentioned here. For example, dedicated articles on Y-DNA, mt-DNA, autosomal, advanced DNA testing… the list goes on…
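The idea that a marker value is simply a count of repeats can be sketched in Python. The repeated unit (GATA) and the surrounding letters below are made up for illustration, and the function name is hypothetical.

```python
import re


def count_tandem_repeats(sequence, motif):
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Made-up stretch of DNA: some flanking letters around 13 copies of GATA.
sequence = "CCTA" + "GATA" * 13 + "TTGC"

print(count_tandem_repeats(sequence, "GATA"))
```

The number printed is the marker value: thirteen repeats of the same bases gives a value of 13.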