How Does DNA Work?

DNA is a data storage molecule that lives inside almost all your cells. If you think of your genome as a recipe book, then chromosomes are the chapters, and genes are the recipes.

Your complete DNA set, called your genome, is made up of more than 20,000 genes. Bizarrely, the entire genome is present in each of your 30 trillion body cells. The only cells that don't contain DNA are red blood cells and the cornified cells in your skin, hair, and nails, all of which have no nucleus.

The difference between genes, chromosomes, DNA, and the genome is illustrated by the analogy of a recipe book

DNA, or deoxyribonucleic acid, is the chemical that encodes your biological blueprint.

For most of the time, DNA hangs around freely in the nucleus as long, fine strands of chromatin. Only when a cell is ready to divide does it coil up very precisely into 46 chromosomes to do the whole mitosis thing.

The relationship between DNA and chromosomes

Genes are long stretches of DNA which bundle into chromosomes when cells divide.

So, each gene is a stretch of DNA that encodes the recipe for building a protein. Proteins, in turn, do many critical jobs around the body: think myosin in your muscles, antibodies in your immune system, and oxytocin in your brain.

As a result, your genes are in use all the time. Every day. Every minute. Every second. They're continually copied and converted into proteins at a rate that boggles the mind, just to keep your body ticking over.

Different cell types reference different genes to make different proteins.

The Structure of DNA

Known as the double helix, DNA is made up of two complementary strands of bases known as adenine (A), thymine (T), cytosine (C), and guanine (G).
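If it helps to see the pairing rule written down, here's a minimal Python sketch. It assumes the standard pairings (A with T, C with G) and ignores the direction in which each strand is read. The function name and the example sequence are made up for illustration.

```python
# Standard pairing rules: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the partner strand, base by base (strand direction is ignored)."""
    return "".join(PAIRS[base] for base in strand)


sense = "ATGCCGTA"  # made-up example sequence
print(sense)                        # ATGCCGTA
print(complementary_strand(sense))  # TACGGCAT
```

Because the pairing is strict, either strand is enough to rebuild the other: running the function twice returns the sequence you started with.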

It's the precise sequence of A, T, C, and G bases that defines your genes and creates the diversity of life. In The Mysterious World of The Human Genome, Frank Ryan offers the analogy of DNA as a train track that stretches to the horizon, with three billion sleepers representing those base pairs.

The DNA base pairs (A, C, G, T) attached to the sugar-phosphate backbone.

Base pairing of DNA is extremely reliable. But heritable mistakes do occur—at an estimated rate of 1 in 100 million bases per generation in humans. This is how we get mutations which alter our genetic code, potentially culminating in disease or adaptation.
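To get a feel for what that error rate means, here's a back-of-the-envelope calculation in Python. It uses only the two rough figures quoted in this article (1 in 100 million, and three billion bases per genome copy), plus the assumption that a child inherits one copy from each parent. Treat the result as an order-of-magnitude estimate, not a measurement.

```python
# Rough figures quoted in this article; treat them as estimates.
MUTATION_RATE = 1 / 100_000_000         # heritable errors per base, per generation
BASES_PER_GENOME_COPY = 3_000_000_000   # one set of chromosomes

per_copy = MUTATION_RATE * BASES_PER_GENOME_COPY
per_child = per_copy * 2  # assumption: one genome copy inherited from each parent

print(f"~{per_copy:.0f} new mutations per genome copy")
print(f"~{per_child:.0f} new mutations per child")
```

So even with extremely reliable copying, every newborn is expected to carry a few dozen brand-new mutations.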

The correct pairings and mispairings between DNA bases.

In mispairings, DNA bases form a hydrogen bond between the wrong atoms, or gain an extra proton to create a protonated wobble. At the end of this article, we'll take a look at how a single base deletion can corrupt an entire gene sequence. First though, let's examine the role of DNA in protein synthesis.

How Does DNA Work?

Now that we have an idea of the structure, how does DNA work? How does it keep us alive on a moment-to-moment basis?

Let's zoom out to our cells: the multipurpose biological factories that make up our tissues. Cells have complex internal structures bustling with organelles and proteins that keep us alive.

The basic features of an animal cell. Our DNA is stored in the nucleus, informing the production of proteins in the cytoplasm.

Besides water and fat, your body is made almost entirely of proteins. These include enzymes that drive digestion, hormones that coordinate growth, and antibodies that neutralise invading pathogens.

DNA isn't only a blueprint for foetal growth. It's essential to your ongoing survival, such as making insulin if you've just had breakfast, or the enzymes that build cortisol when you need a stress response.

To make a new protein, a specific gene from your master recipe book must first be copied into a single-stranded molecule called messenger RNA (mRNA). The mRNA then exits the nucleus and makes its way to the factory floor of the cell, better known as the cytoplasm.

Here, the string of mRNA is translated into a chain of amino acids. In other words, the genetic recipe is converted into a product. But the product isn't ready for shipping until it twists and folds into a specific functional shape. Now it's a protein.

The Central Dogma of DNA.

An analogy is all very well and good, but how exactly is DNA converted into proteins on the molecular level? Do we even know? Yes, we do. I hope you're sitting down, because here comes an exquisite bit of molecular biology.

How DNA Expression Works

The Central Dogma describes the one-way flow of genetic information from DNA to proteins. We're going to examine three major stages here, known as transcription (copying DNA to mRNA), processing (customising mRNA), and translation (converting mRNA to proteins).

Step 1. Transcription

So you've got a bunch of DNA hanging around in the nucleus. It's time to express some genes.

A dedicated molecule known as RNA polymerase attaches itself to your DNA. It teases apart the two strands of the double helix, unwinding the ladder as it travels along the length of a gene.

This exposes the anti-sense strand, whose bases are complementary to those of the sense strand.

Amazingly, the RNA polymerase multitasks here. As it unwinds the DNA strands, it also reads the individual bases, and builds a new strand of mRNA based on the principles of complementary base pairing. The emerging genetic string is called pre-messenger RNA or pre-mRNA.

How DNA transcription works: (1) The initiation phase sees RNA polymerase bind to a promoter sequence at the start of a gene and (2) unwind the double helix to expose the anti-sense strand. (3) In the elongation phase, RNA polymerase reads the DNA template one base at a time and constructs a pre-mRNA transcript from free-floating nucleotides. (4) RNA polymerase re-winds the original strands into a double helix. (5) The pre-mRNA strand is cut loose when RNA polymerase reaches a terminator sequence at the end of a gene.
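The elongation step can be sketched in a few lines of Python. This is a toy model, not how the enzyme works chemically: it assumes the anti-sense strand is read one base at a time and that RNA uses uracil (U) wherever DNA would use thymine (T). Strand direction is ignored, and the function name and template sequence are invented for the example.

```python
# DNA template base -> RNA base added to the growing transcript.
# RNA uses uracil (U) in place of thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template: str) -> str:
    """Build a pre-mRNA string from the anti-sense (template) strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


anti_sense = "TACGGCATT"       # made-up template strand
print(transcribe(anti_sense))  # AUGCCGUAA
```

Notice that the transcript matches the sense strand letter for letter, except that every T has become a U.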

It's a beautiful molecular dance. And it's happening at astonishing speed in your cells right now, all driven by spontaneous chemical interactions.

Step 2. RNA Processing

The DNA recipe book is written in such a way that a single recipe can be cut and pasted to produce multiple alternative dishes. The biological term for this is alternative splicing.

So let's customise the dish. Still inside the nucleus, spliceosomes approach the pre-mRNA sequence to make their edits.

Spliceosomes cut out non-coding sequences of bases called introns and leave behind select coding sequences called exons.

On average, there are 9 exons per gene, although the supremely long dystrophin gene has 79 exons spanning 2.3 million bases. That's some heavy editing right there.

How RNA processing works: Spliceosomes remove non-coding introns from the pre-mRNA strand, leaving only the desired exons to customise the gene recipe.
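Here's a toy Python version of that cut-and-paste job. It assumes we already know where each exon starts and ends, which is a big simplification: real spliceosomes recognise sequence signals in the pre-mRNA rather than being handed coordinates. The sequence, the exon positions, and the function name are all hypothetical.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the chosen exon slices (start, end) and drop everything else."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Made-up transcript: exon 1, intron, exon 2, intron, exon 3.
pre_mrna = "AUGGCC" + "GUAAGUCAG" + "UUUGAA" + "GUCCCAG" + "UGGUAA"

all_exons = [(0, 6), (15, 21), (28, 34)]
skip_exon_2 = [(0, 6), (28, 34)]

print(splice(pre_mrna, all_exons))    # AUGGCCUUUGAAUGGUAA
print(splice(pre_mrna, skip_exon_2))  # AUGGCCUGGUAA
```

The second call is alternative splicing in miniature: same recipe, one exon left out, different dish.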

Step 3. Translation

So far, we've just been tinkering with the gene recipe. Now we need a chef to actually source the ingredients and produce the dish.

The finished messenger RNA strand exits the nucleus and lands in the fluid cell cytoplasm. Here, a ribosome binds to the start of the mRNA and does something remarkable.

Ribosomes read the mRNA in groups of three bases called codons. These are matched to complementary anticodons.

Free-floating transfer RNA units approach the ribosome, each carrying an amino acid matched to a specific anticodon. Spontaneous chemical bonding sees the mRNA translated into the desired sequence of amino acids.

How RNA translation works: (1) Free-floating tRNAs deposit amino acids at the ribosome. (2) Spontaneous reactions see mRNA codons match to their complementary tRNA anti-codons. (3) The ribosome builds a chain of amino acids while (4) cutting the spent tRNA free. The amino acid chain is released when the ribosome reaches a stop codon.
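The codon-to-anticodon matching can be sketched like so. This Python snippet assumes strict base pairing (A with U, C with G) and writes each anticodon lined up against its codon, base for base. Real tRNAs are a little more forgiving at the third position, which this toy model ignores. The example mRNA and the function names are invented.

```python
# RNA pairing rules used between a codon and its anticodon.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def codons(mrna: str) -> list[str]:
    """Split an mRNA string into complete groups of three bases."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def anticodon(codon: str) -> str:
    """Return the bases that pair with a codon, aligned position by position."""
    return "".join(RNA_PAIRS[base] for base in codon)


for codon in codons("AUGCCGGAU"):  # made-up mRNA
    print(codon, "pairs with", anticodon(codon))
```

Each tRNA carries exactly one kind of amino acid, so matching the anticodon is what guarantees the right amino acid lands in the right place.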

How Amino Acids Create Proteins

Amino acid chains—or peptides—begin with a primary structure: a linear string of amino acids connected by covalent bonds. But this doesn't last long. During translation, amino acids form weak hydrogen bonds between one another, twisting and folding the chain into alpha helices and beta sheets that make up the molecule's secondary structure.

Protein Structure: How Amino Acids Fold into Polypeptides to Create Primary, Secondary, Tertiary, and Quaternary Structures

The four levels of protein structure.

Tertiary structures are more complex yet. As amino acids fold and meet, they spontaneously form ionic bonds, disulphide bridges, and hydrophilic and hydrophobic interactions. This secures them into 3D proteins with unique forms. And when multiple polypeptide chains convene, they produce the largest of all proteins with complex quaternary structures.

At this point, the molecule is packaged off to its destination outside the cell, or retained within as a new cell worker. The synthesis of a new protein is complete.

The Genetic Code

"Tell me more about the codons!" I hear you scream. And you'd be right. This is a good thing to scream about, if anything is.

The RNA alphabet has only four letters (A, U, C, and G), and since codons occur in groups of three, it means there are only 64 words (4 x 4 x 4) in our entire codon dictionary.

Yet because the genetic code specifies only 20 different amino acids, it leaves us with a fair bit of redundancy. In other words, each amino acid tends to be associated with more than one codon.
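You can check the arithmetic yourself with a short Python sketch. It builds all 64 codons and maps them to amino acids using the standard genetic code, written here as a compact 64-letter string of one-letter amino acid codes (rather than the three-letter codes used in this article), with "*" standing for a stop codon. The variable names are my own.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, with codons listed in UCAG order.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): amino
    for triplet, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_amino = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))           # 64 codons in total
print(len(codons_per_amino) - 1)  # 20 amino acids (not counting "stop")
print(codons_per_amino["L"])      # 6 codons for leucine
print(codons_per_amino["M"])      # 1 codon for methionine
print(codons_per_amino["*"])      # 3 stop codons
```

Leucine gets six different spellings, while methionine has to make do with one.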

Every protein-coding sequence begins with a start codon (AUG), which also happens to translate to the amino acid methionine, making this the first amino acid to be docked with the ribosome. Amino acids keep docking one after another until the ribosome reaches a stop codon (UAG, UGA, or UAA), which signals the end of the peptide chain.
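Here's a minimal Python sketch of that start-to-stop reading. To keep it short, the lookup table is only a small excerpt of the genetic code (just the codons this example needs, plus the three stop codons), so it will raise an error on any other codon. The mRNA string and the function name are made up.

```python
# Excerpt of the genetic code: only the codons needed for this example.
CODON_TABLE = {
    "AUG": "Met", "CCG": "Pro", "UUU": "Phe", "GAA": "Glu",
    "UAG": "Stop", "UGA": "Stop", "UAA": "Stop",
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first stop codon in that frame."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing gets made
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "Stop":
            break
        peptide.append(amino)
    return "-".join(peptide)


print(translate("GGAUGCCGUUUGAAUAGCC"))  # Met-Pro-Phe-Glu
```

The bases before AUG and after UAG are simply ignored, which is why the start and stop codons matter so much.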

Here are all the other possible codons (groups of three bases in sequence) and their corresponding amino acids (also helpfully rendered as a three-letter code so as not to confuse you).

The Codon Table: Per the genetic code, a total of 64 codons refer to 20 corresponding amino acids.

Ala = Alanine          Leu = Leucine
Arg = Arginine         Lys = Lysine
Asn = Asparagine       Met = Methionine
Asp = Aspartic Acid    Phe = Phenylalanine
Cys = Cysteine         Pro = Proline
Gln = Glutamine        Ser = Serine
Glu = Glutamic Acid    Thr = Threonine
Gly = Glycine          Trp = Tryptophan
His = Histidine        Tyr = Tyrosine
Ile = Isoleucine       Val = Valine

How Fast Does DNA Work?

Depending on the size of the gene, it takes between 20 seconds and several minutes to produce a single protein molecule from mRNA.

Now scale the volume. Multiple ribosomes can work along the same mRNA strand with just 80 nucleotides separating them, in order to produce multiple proteins simultaneously. And there are up to 10 million ribosomes building proteins on demand in each cell.
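As a rough illustration of that scaling, here's a tiny Python calculation. It takes the 80-nucleotide spacing quoted above and applies it to a hypothetical mRNA of 1,200 nucleotides. The length is my own pick for the example, and the result is only a crude upper bound.

```python
SPACING_NT = 80  # nucleotides between neighbouring ribosomes, as quoted above


def max_ribosomes(mrna_length_nt: int, spacing_nt: int = SPACING_NT) -> int:
    """Crude upper bound on how many ribosomes fit along one mRNA."""
    return max(1, mrna_length_nt // spacing_nt)


print(max_ribosomes(1_200))  # 15 ribosomes on a hypothetical 1,200-nucleotide mRNA
```

So a single strand of mRNA could, in principle, have over a dozen proteins under construction at once.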

It's all rather amazing really. DNA and its entourage perform a constant choreography, culminating in the normal functioning of any living organism, such as a friendly old toad. Isn't that brilliant?



How Does DNA Mutate?

Before you go, I want to show you a cool thing about mutation.

When a cell divides, it must copy its entire genome, some six billion base pairs across both sets of chromosomes, which gives plenty of opportunities for error. Any addition, substitution, or deletion of a base is considered a mutation, with the potential to change an entire gene recipe and its protein product.

For instance, in a frameshift mutation, deleting a single base shifts the reading frame in which each codon appears. Now all the bases after the mutation are displaced. When the gene comes to be translated, it produces a very different string of amino acids.

A frameshift mutation caused by a single base deletion can corrupt the entire protein product.
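You can watch the reading frame slip in a few lines of Python. This sketch deletes one base from a made-up mRNA and then re-reads it in groups of three. The sequence, the position of the deletion, and the function names are all invented for the example.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into complete groups of three bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete_base(seq: str, index: int) -> str:
    """Return the sequence with the base at `index` removed."""
    return seq[:index] + seq[index + 1:]


original = "AUGCCGUUUGAAUAG"          # made-up mRNA
mutated = delete_base(original, 4)    # lose a single C from the second codon

print(codons(original))  # ['AUG', 'CCG', 'UUU', 'GAA', 'UAG']
print(codons(mutated))   # ['AUG', 'CGU', 'UUG', 'AAU']
```

Every codon after the deletion now reads differently, and the original stop codon (UAG) has vanished from the reading frame altogether.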

Of course, DNA mutation isn't all bad. When it occurs in germ cells (sperm and eggs) or in early embryonic development, it's the initiating factor for evolution by natural selection. The problem is that mutation is a blind trial-and-error process such that, when it impacts a gene, it often leads to disease.

What causes DNA mutations? They're most common under three circumstances:

  • Foetal development. Mutations are much more prevalent during the rapid growth phase of foetal development and are passed on to all daughter cells thereafter.
  • Environmental mutagens. DNA continues to mutate throughout your lifetime, exacerbated by environmental factors like UV light, cigarette smoke, and even viruses.
  • Genetic inheritance. Genes come in pairs, with one variant (allele) inherited from each parent. This is one way to get genetic diversity. While a single faulty gene from dad may not matter, faulty genes from both parents can lead to full-blown disease.

Genetic mutations are known to cause more than 6,000 diseases, including those present since birth as well as diseases that develop over the lifetime, such as diabetes, heart disease, and cancer. Find out how we're starting to permanently fix these errors in my article How Does Gene Therapy Work?

Rebecca Casale, Creator of Science Me

Rebecca Casale is a writer and illustrator in Auckland, New Zealand. If you like her content, why not share it with your friends? If you don't like it, why not punish your enemies by sharing it with them?