How Does DNA Work?
DNA is a data storage molecule that lives inside almost all your cells. If you think of your genome as a recipe book, then chromosomes are the chapters, and genes are the recipes.
Your complete DNA set, called your genome, is made up of more than 20,000 genes. Bizarrely, the entire genome is present in nearly every one of your 30 trillion body cells. The only cells that don't contain DNA are mature red blood cells and the cornified cells in your skin, hair, and nails, all of which have no nucleus.
Most of the time, DNA hangs around freely in the nucleus as long, fine strands of chromatin. Only when a cell is ready to divide does it coil up very precisely into 46 chromosomes to do the whole mitosis thing.
So, each gene is a stretch of DNA that encodes the recipe for building a protein. These protein molecules, in turn, have many critical functions around the body: think of myosin in your muscles, antibodies in your immune system, and oxytocin in your brain.
As a result, your genes are in use all the time. Every day. Every minute. Every second. They're continually copied and converted into proteins at a rate that boggles the mind, just to keep your body ticking over.
The Structure of DNA
Known as the double helix, DNA is made up of two complementary strands of bases known as adenine (A), thymine (T), cytosine (C), and guanine (G).
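To make "complementary" concrete, here's a minimal Python sketch of the pairing rule (A with T, C with G). The sequence is made up for illustration, and the function name is my own. It also ignores the fact that the two strands run in opposite directions.

```python
# Pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand."""
    return "".join(PAIRS[base] for base in strand)

example = "ATGCGT"  # made-up sequence, for illustration only
print(complement(example))               # TACGCA
print(complement(complement(example)))   # ATGCGT (back where we started)
```

Because each base has exactly one partner, either strand is enough to rebuild the other.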
It's the precise sequence of A, T, C, and G bases that defines your genes and creates the diversity of life. In The Mysterious World of The Human Genome, Frank Ryan offers the analogy of DNA as a train track that stretches to the horizon, with three billion sleepers representing those base pairs.
Base pairing of DNA is extremely reliable. But heritable mistakes do occur—at an estimated rate of 1 in 100 million bases per generation in humans. This is how we get mutations which alter our genetic code, potentially culminating in disease or adaptation.
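As a rough back-of-the-envelope check, we can combine that error rate with the three billion base pairs in one copy of the genome. This is only a sketch: it assumes the rate applies evenly across the genome, and the variable names are my own.

```python
# Figures taken from the text above.
BASES_PER_ERROR = 100_000_000        # one heritable mistake per 100 million bases
GENOME_BASE_PAIRS = 3_000_000_000    # one copy of the human genome

# Expected new mutations in one inherited copy of the genome.
expected_per_copy = GENOME_BASE_PAIRS / BASES_PER_ERROR
# A child inherits two copies, one from each parent.
expected_per_child = 2 * expected_per_copy

print(expected_per_copy)   # 30.0
print(expected_per_child)  # 60.0
```

So even with extremely reliable base pairing, each of us probably carries a few dozen brand-new mutations.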
In mispairings, DNA bases form a hydrogen bond between the wrong atoms, or gain an extra proton to create a protonated wobble. At the end of this article, we'll take a look at how a single base deletion can corrupt an entire gene sequence. First though, let's examine the role of DNA in protein synthesis.
How Does DNA Work?
Now that we have an idea of the structure, how does DNA work? How does it keep us alive on a moment-to-moment basis?
Let's zoom out to our cells: the multipurpose biological factories that make up our tissues. Cells have complex internal structures bustling with organelles and proteins that keep us alive.
Besides water and fat, your body is made almost entirely of proteins. These include enzymes that drive digestion, hormones that coordinate growth, and antibodies that neutralise invading pathogens.
DNA isn't only a blueprint for foetal growth. It's essential to your ongoing survival, whether that means making insulin if you've just had breakfast, or the enzymes that build cortisol to produce a stress response.
To make a new protein, a specific gene from your master recipe book must first be copied into a single-stranded molecule called messenger RNA (mRNA). The mRNA then exits the nucleus and makes its way to the factory floor of the cell, better known as the cytoplasm.
Here, the string of mRNA is translated into a chain of amino acids. In other words, the genetic recipe is converted into a product. But the product isn't ready for shipping until it twists and folds into a specific functional shape. Now it's a protein.
An analogy is all very well and good, but how exactly is DNA converted into proteins on the molecular level? Do we even know? Yes, we do. I hope you're sitting down, because here comes an exquisite bit of molecular biology.
How DNA Expression Works
The Central Dogma describes the one-way flow of genetic information from DNA to proteins. We're going to examine three major stages here, known as transcription (copying DNA to mRNA), processing (customising mRNA), and translation (converting mRNA to proteins).
Step 1. Transcription
So you've got a bunch of DNA hanging around in the nucleus. It's time to express some genes.
A dedicated molecule known as RNA polymerase attaches itself to your DNA. It teases apart the two strands of the double helix, unwinding the ladder as it travels along the length of a gene.
This exposes the anti-sense strand, whose bases are complementary to those on the sense strand.
Amazingly, the RNA polymerase multitasks here. As it unwinds the DNA strands, it also reads the individual bases, and builds a new strand of mRNA based on the principles of complementary base pairing. The emerging genetic string is called pre-messenger RNA or pre-mRNA.
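Here's a minimal Python sketch of that copying step. It assumes the anti-sense (template) strand is written in the order the polymerase reads it. The sequence and the function name are made up for illustration. Note that RNA uses uracil (U) wherever DNA would use thymine (T).

```python
# DNA base on the template (anti-sense) strand -> RNA base added to the transcript.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_strand: str) -> str:
    """Build the RNA strand complementary to a DNA template, base by base."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)

template = "TACGGCATT"  # hypothetical anti-sense strand
print(transcribe(template))  # AUGCCGUAA
```

The result has the same sequence as the sense strand, only with U in place of T.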
It's a beautiful molecular dance. And it's happening at astonishing speed in your cells right now, all driven by spontaneous chemical interactions.
Step 2. RNA Processing
The DNA recipe book is written in such a way that a single recipe can be cut and pasted to produce multiple alternative dishes. The biological term for this is alternative splicing.
So let's customise the dish. Still inside the nucleus, spliceosomes approach the pre-mRNA sequence to make their edits.
Spliceosomes cut out non-coding sequences of bases called introns and leave behind select coding sequences called exons.
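The cut-and-join idea can be sketched in a few lines of Python. Everything here is a toy: the sequence and the exon coordinates are invented, and the intron is written in lower case purely so you can see it. A real spliceosome recognises chemical signals in the RNA, not a list of positions.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon slices of a pre-mRNA, discarding everything in between."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Toy pre-mRNA: upper case = exons, lower case = introns (all made up).
pre_mrna = "AUGGCC" + "guaaguag" + "UUCGGA" + "gucagcag" + "CAUUAA"

# Keep all three exons.
print(splice(pre_mrna, [(0, 6), (14, 20), (28, 34)]))  # AUGGCCUUCGGACAUUAA

# Alternative splicing: skip the middle exon to get a different recipe.
print(splice(pre_mrna, [(0, 6), (28, 34)]))            # AUGGCCCAUUAA
```

Choosing a different set of exons from the same pre-mRNA is what lets one gene produce more than one dish.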
On average, there are 9 exons per gene, although the supremely long dystrophin gene has 79 exons spanning 2.3 million bases. That's some heavy gene editing right there.
Step 3. Translation
So far, we've just been tinkering with the gene recipe. Now we need a chef to actually source the ingredients and produce the dish.
The finished messenger RNA strand exits the nucleus and lands in the fluid cell cytoplasm. Here, a ribosome binds to the start of the mRNA and does something remarkable.
Ribosomes read the mRNA in groups of three bases called codons. These are matched to complementary anticodons.
Free-floating transfer RNA units approach the ribosome, each carrying an amino acid matched to a specific anticodon. Spontaneous chemical bonding sees the mRNA translated into the desired sequence of amino acids.
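As a small Python sketch, here's how an mRNA breaks into codons, and which anticodon bases pair with each one. The sequence and the function names are made up for illustration. The anticodon is written position by position against the codon, ignoring strand direction.

```python
# RNA pairing rules: A pairs with U, and C pairs with G.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}

def codons(mrna: str) -> list[str]:
    """Split an mRNA into consecutive three-base codons, dropping any incomplete tail."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

def anticodon(codon: str) -> str:
    """Return the bases that pair with each position of the codon."""
    return "".join(RNA_PAIRS[base] for base in codon)

mrna = "AUGCCGUUC"  # made-up sequence
for codon in codons(mrna):
    print(codon, "pairs with", anticodon(codon))
# AUG pairs with UAC
# CCG pairs with GGC
# UUC pairs with AAG
```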
How Amino Acids Create Proteins
Amino acid chains—or peptides—begin with a primary structure: a linear string of amino acids connected by covalent bonds. But this doesn't last long. During translation, amino acids form weak hydrogen bonds between one another, twisting and folding the chain into alpha helices and beta sheets that make up the molecule's secondary structure.
Tertiary structures are more complex yet. As amino acids fold and meet, they spontaneously form ionic bonds, disulphide bridges, and hydrophilic and hydrophobic interactions. This secures them into 3D proteins with unique forms. And when multiple polypeptide chains convene, they produce the largest of all proteins with complex quaternary structures.
At this point, the molecule is packaged off to its destination outside the cell, or retained within as a new cell worker. The synthesis of a new protein is complete.
The Genetic Code
"Tell me more about the codons!" I hear you scream. And you'd be right. This is a good thing to scream about, if anything is.
The RNA alphabet has only four letters (A, U, C, and G), and since codons occur in groups of three, it means there are only 64 words (4 x 4 x 4) in our entire codon dictionary.
Yet because the genetic code only specifies 20 different amino acids, it leaves us with a fair bit of redundancy. In other words, each amino acid tends to be associated with more than one codon.
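If you'd like to check the arithmetic, this short Python sketch spells out every possible codon and compares the total with the 20 amino acids. The variable names are my own.

```python
from itertools import product

BASES = "UCAG"          # the four-letter RNA alphabet
AMINO_ACID_COUNT = 20   # from the text above

# Every possible three-letter word over a four-letter alphabet.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(all_codons))                     # 64
print(all_codons[:4])                      # ['UUU', 'UUC', 'UUA', 'UUG']
print(len(all_codons) / AMINO_ACID_COUNT)  # 3.2
```

With roughly three codons available per amino acid, some doubling up is unavoidable.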
Every gene begins with a start codon (AUG), which also happens to translate to the amino acid methionine, making this the first amino acid to be docked with the ribosome. A string of amino acid docking ensues until the ribosome reaches a stop codon (either UAG, UGA, or UAA) which signals the end of the peptide chain.
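Here's a rough Python sketch of that start-to-stop reading process. The lookup table is the standard genetic code packed into one string, using one-letter amino acid symbols with `*` for stop. The mRNA and the function name are made up for illustration. A real ribosome is far more selective about where it starts.

```python
BASES = "UCAG"
# Standard genetic code, ordered UUU, UUC, UUA, UUG, UCU, ... GGG ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of the strand."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is made
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break  # stop codon: the chain is released
        peptide.append(residue)
    return "".join(peptide)

# Made-up mRNA with a little padding either side of the coding region.
print(translate("GGAUGCCGUUCUAAGG"))  # MPF (Met, Pro, Phe)
```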
Here are all the other possible codons (groups of three bases in sequence) and their corresponding amino acids (also helpfully rendered as a three-letter code so as not to confuse you).
Ala = Alanine       | Leu = Leucine
Arg = Arginine      | Lys = Lysine
Asn = Asparagine    | Met = Methionine
Asp = Aspartic Acid | Phe = Phenylalanine
Cys = Cysteine      | Pro = Proline
Gln = Glutamine     | Ser = Serine
Glu = Glutamic Acid | Thr = Threonine
Gly = Glycine       | Trp = Tryptophan
His = Histidine     | Tyr = Tyrosine
Ile = Isoleucine    | Val = Valine
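For anyone who wants the full dictionary in one place, this Python sketch lists every codon against the three-letter codes above. It's built from the standard genetic code. The layout and variable names are my own choices.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code in one-letter symbols, ordered UUU, UUC, ... GGG ("*" = stop).
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}

# Group the 64 codons by the amino acid (or stop signal) they stand for.
codons_for = defaultdict(list)
index = 0
for first in BASES:
    for second in BASES:
        for third in BASES:
            name = THREE_LETTER[ONE_LETTER[index]]
            codons_for[name].append(first + second + third)
            index += 1

for name in sorted(codons_for):
    print(f"{name:<4} ({len(codons_for[name])}): {' '.join(codons_for[name])}")
```

The counts in brackets show the redundancy at a glance: Leu, Ser, and Arg have six codons each, while Met and Trp have just one.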
How Fast Does DNA Work?
Depending on the size of the gene, it takes between 20 seconds and several minutes to produce a single protein molecule from mRNA.
Now scale the volume. Multiple ribosomes can work along the same mRNA strand with just 80 nucleotides separating them, in order to produce multiple proteins simultaneously. And there are up to 10 million ribosomes building proteins on demand in each cell.
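To get a feel for the scale, here's a rough Python calculation using the figures above. The mRNA length and the 60 seconds per protein are my own assumptions, picked from within the range quoted. The second number is an upper bound that assumes every ribosome is busy at once.

```python
# Figures from the text above.
RIBOSOME_SPACING_NT = 80           # nucleotides between neighbouring ribosomes
RIBOSOMES_PER_CELL = 10_000_000    # upper estimate per cell

# Assumptions for illustration only.
mrna_length_nt = 1_200             # hypothetical mRNA length
seconds_per_protein = 60           # somewhere between 20 seconds and several minutes

ribosomes_on_mrna = mrna_length_nt // RIBOSOME_SPACING_NT
proteins_per_second = RIBOSOMES_PER_CELL / seconds_per_protein

print(ribosomes_on_mrna)           # 15
print(round(proteins_per_second))  # 166667
```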
It's all rather amazing really. DNA and its entourage perform a constant choreography, culminating in the normal functioning of any living organism, such as a friendly old toad. Isn't that brilliant?
How Does DNA Mutate?
Before you go, I want to show you a cool thing about mutation.
When a cell divides, it has to copy its entire genome, some six billion base pairs across both sets of chromosomes, which gives plenty of opportunities for error. Any addition, substitution, or deletion of a base is considered a mutation, with the potential to change an entire gene recipe and its protein product.
For instance, in a frameshift mutation, deleting a single base shifts the reading frame in which each codon appears. Now all the bases after the mutation are displaced. When the gene comes to be translated, it produces a very different string of amino acids.
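Here's a small Python sketch of a frameshift in action. The mRNA is made up, and the lookup table holds only the handful of codons this example needs, not the whole genetic code.

```python
# Only the codons used in this example (a small subset of the genetic code).
CODONS = {
    "AUG": "Met", "GCC": "Ala", "UUC": "Phe", "GGA": "Gly", "CAU": "His",
    "CCU": "Pro", "UCG": "Ser", "GAC": "Asp", "AUU": "Ile", "UAA": "Stop",
}

def read_frame(mrna: str) -> list[str]:
    """Translate codon by codon from the first base until a stop codon or the end."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]
        if residue == "Stop":
            break
        residues.append(residue)
    return residues

original = "AUGGCCUUCGGACAUUAA"           # AUG GCC UUC GGA CAU UAA
mutated = original[:3] + original[4:]     # delete the fourth base (a single G)

print("-".join(read_frame(original)))  # Met-Ala-Phe-Gly-His
print("-".join(read_frame(mutated)))   # Met-Pro-Ser-Asp-Ile
```

Every amino acid after the deletion has changed. In this toy case the stop codon has also slipped out of frame, so the ribosome would carry on reading past the intended end.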
Of course, DNA mutation isn't all bad. When it occurs in germ cells (sperm and eggs) or in early embryonic development, it's the initiating factor for evolution by natural selection. The problem is that mutation is a blind trial-and-error process such that, when it impacts a gene, it often leads to disease.
What causes DNA mutations? They're most common under three circumstances:
- Foetal development. Mutations are much more prevalent during the rapid growth phase of foetal development and are passed on to all daughter cells thereafter.
- Environmental mutagens. DNA continues to mutate throughout your lifetime, exacerbated by environmental factors like UV light, cigarette smoke, and even viruses.
- Genetic inheritance. Genes come in pairs, with one variant (allele) inherited from each parent. This is one way to get genetic diversity. While a single faulty gene from dad may not matter, faulty genes from both parents can lead to full-blown disease.
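The inheritance point can be sketched with a simple allele count in Python. This assumes a hypothetical condition that only shows up when both copies of a gene are faulty, and two parents who each carry one working allele ("A") and one faulty allele ("a"). Real inheritance patterns are often messier.

```python
from collections import Counter
from itertools import product

# Hypothetical parents: each carries one working allele (A) and one faulty allele (a).
mother = ("A", "a")
father = ("A", "a")

# Each child gets one allele from each parent, and all four pairings are equally likely.
children = Counter("".join(sorted(pair)) for pair in product(mother, father))
total = sum(children.values())

for genotype in ("AA", "Aa", "aa"):
    print(genotype, children[genotype] / total)
# AA 0.25
# Aa 0.5
# aa 0.25
```

Under these assumptions, only the "aa" children (one in four) inherit a faulty gene from both parents.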
Genetic mutations are known to cause more than 6,000 diseases, including those present since birth as well as diseases that develop over the lifetime, such as diabetes, heart disease, and cancer. Find out how we're starting to permanently fix these errors in my article How Does Gene Therapy Work?