DNA Structure
Updated October 25, 2008
Adapted from http://faculty.clintoncc.suny.edu/faculty/michael.gregory/files/Bio%20101/Bio%20101%20Lectures/dna/dna.htm
Knowing that DNA is the genetic material involved integrating numerous discoveries over
an 80-year period.
| Year | Discovery | Details |
|---|---|---|
| 1869 | Friedrich Miescher isolates a chemical from the nuclei of white blood cells obtained from pus. He called the chemical nuclein because it came from nuclei. It later became known as nucleic acid. | |
| 1928 | Non-pathogenic strains of Streptococcus pneumoniae can be transformed into pathogenic strains, presumably by transfer of a heritable substance. (Frederick Griffith, Page 306) | |
| 1950 | For any sample of DNA, A=T and G=C. (Erwin Chargaff, Page 308) | See the base-ratio data following this table. |
| 1952 | Phage DNA, but not protein, penetrates bacteria to reprogram the cells to make more phages. (Alfred Hershey & Martha Chase, Page 307) | |
| 1952 | Rosalind Franklin produces X-ray diffraction photos of DNA which Watson and Crick correctly interpret to show that DNA is a helical structure. | |
| 1953 | Watson and Crick publish their seminal article on the structure of DNA in Nature. (J.D. Watson and F.H.C. Crick, Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid, Nature 171:737-738 [1953]). | |

Data leading to the formulation of Chargaff’s rules, from J. Biol. Chem. 177 (1949):

| Source | A/G | T/C | A/T | G/C | Purines/Pyrimidines |
|---|---|---|---|---|---|
| Ox | 1.29 | 1.43 | 1.04 | 1.00 | 1.1 |
| Human | 1.56 | 1.75 | 1.00 | 1.00 | 1.0 |
| Hen | 1.45 | 1.29 | 1.06 | 0.91 | 0.99 |
| Salmon | 1.43 | 1.43 | 1.02 | 1.02 | 1.02 |
| Wheat | 1.22 | 1.18 | 1.00 | 0.97 | 0.99 |
| Yeast | 1.67 | 1.92 | 1.03 | 1.20 | 1.0 |

Chargaff's rules showed that A = T and G = C, so
there was complementary base pairing of a purine with a pyrimidine, giving the
correct width for the helix.
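
To see why complementary pairing leads to Chargaff's rules, the short sketch below builds the partner of an arbitrary strand and counts the bases across both strands. The example sequence and the function name are made up for illustration; they are not from the original notes.

```python
from collections import Counter

# Watson-Crick pairing: A with T, G with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def duplex_base_counts(strand: str) -> Counter:
    """Count bases over both strands of the duplex formed by `strand`."""
    partner = "".join(PAIR[base] for base in strand)
    return Counter(strand + partner)

# Hypothetical example sequence (not taken from any real organism).
counts = duplex_base_counts("ATGGCGTACCTTAAG")

# In a fully paired duplex every A faces a T and every G faces a C,
# so the totals must match regardless of the sequence chosen.
print(counts["A"] == counts["T"], counts["G"] == counts["C"])
```

Real measurements, like those in Chargaff's data, deviate slightly from 1.00 because of experimental error.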

The paired bases can occur in any order, giving an
overwhelming diversity of sequences.
DNA is an ideal genetic material because it can
store information, is able to replicate, and is able to undergo changes
(mutate).

DNA is composed of units called nucleotides. Each
nucleotide contains a phosphate group, a deoxyribose sugar, and a nitrogenous
base.

The nucleotides are joined together to form a chain.
The phosphate end of the chain is referred to as the 5' end. The opposite end
is the 3' end.

DNA is composed of two chains of nucleotides
linked together in a ladder-like arrangement with the sides composed of
alternating deoxyribose sugar and phosphate groups and the rungs being the
nitrogenous bases as indicated by the diagram below.
The "A" of one strand is always paired
with a "T" on the other. Similarly, the "G" of one strand
is paired with a "C" on the other.
The two strands are held together by hydrogen
bonds (electrostatic attraction). Two hydrogen bonds hold adenine to
thymine. Three bonds attach cytosine to guanine as indicated in the diagram
above.
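
The pairing rules and hydrogen-bond counts above can be written as a small lookup. This is only a sketch; the function names are hypothetical, and the two strands are written position by position rather than with explicit 5' and 3' ends.

```python
# Pairing partners: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair, as stated in the text:
# two between adenine and thymine, three between cytosine and guanine.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def partner_strand(strand: str) -> str:
    """Return the bases that pair with `strand`, position by position."""
    return "".join(COMPLEMENT[base] for base in strand)

def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding `strand` to its partner."""
    return sum(H_BONDS[base] for base in strand)

example = "ATGC"  # hypothetical four-base sequence
print(partner_strand(example))   # TACG
print(hydrogen_bonds(example))   # 2 + 2 + 3 + 3 = 10
```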
During the process of cell division, the DNA
becomes tightly coiled, forming structures called chromosomes.
The diagram below is a portion of a double-stranded chromosome showing the
centromere and a portion of the base sequence. The diagram does not show the
extensive looping and coiling and the proteins associated with coiling. Notice
that the base sequence in the two chromatids is identical.

The diagram below shows that one strand of the DNA
double-helix serves as a template for the construction of mRNA. The sequence of
nucleotides in this DNA strand is complementary to the sequence in
mRNA. The diagram also shows that the sequence of nucleotides in mRNA
determines the amino acids in the protein. For example, GUG in mRNA (or CAC in
DNA) codes for valine (see below).
The strand of DNA that is copied into mRNA
is called the antisense strand. It is often referred to as the
template strand. The other strand (the sense strand, also called the coding
strand) is not transcribed. Notice that
the sense strand has the same base sequence as mRNA except that mRNA has U
instead of T.
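
The relationship between the template strand, the sense strand, and mRNA can be checked with a few lines of code. This is a minimal sketch: the sequence is hypothetical (it contains the CAC triplet from the valine example), strand direction is ignored, and the function name is an assumption.

```python
# Base pairing used when copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base pairing between the two DNA strands.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template: str) -> str:
    """Return the mRNA complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)

template = "TACCACTTT"  # hypothetical template strand
mrna = transcribe(template)
sense = "".join(DNA_PAIR[base] for base in template)

print(mrna)   # AUGGUGAAA
print(sense)  # ATGGTGAAA

# The sense strand matches the mRNA except that mRNA has U instead of T.
print(mrna == sense.replace("T", "U"))  # True
```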

The codes in DNA are copied to produce mRNA. Each
three-letter code in mRNA (called a codon) codes for one amino
acid. The sequence of amino acids in proteins is therefore most directly
determined by the sequence of codons in mRNA, which, in turn, is determined by
the sequence of bases in DNA.
There are four letters in the genetic alphabet (A,
T, G, and C; mRNA uses U in place of T) and each codon contains three letters.
It is therefore possible to have 4 x 4 x 4 = 64 different codons. Because there
are only 20 different amino acids and
64 possible codons, some amino acids have several different codons.
Terminators are codes that indicate the end of a
genetic message (gene).
An initiator codon (usually AUG) indicates where
the genetic information begins.
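
The codon count and the roles of the initiator and terminator codons can be illustrated with a short sketch. The codon table below is only a small excerpt of the standard genetic code, and the mRNA sequence and function name are hypothetical.

```python
from itertools import product

# Every three-letter combination of the four mRNA bases.
BASES = "ACGU"
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))  # 64

# Small excerpt of the standard genetic code (not the full table).
CODON_TABLE = {
    "AUG": "Met",   # also the usual initiator codon
    "GUG": "Val",
    "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",  # terminators
}

def translate(mrna: str) -> list:
    """Read codons from the first AUG until a terminator is reached.

    Codons missing from the excerpt above are reported as "?".
    """
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "?")
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide

print(translate("GGAUGGUGAAAUAAGG"))  # ['Met', 'Val', 'Lys']
```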
DNA replication involves:
1. The DNA must be unwound and the bonds between the bases broken so that the two strands become separated.
2. Each strand serves as a template for the synthesis of a new strand. DNA polymerase adds nucleotides that match the nucleotides present on the template strand. A is paired with T and G with C. Because each new molecule of DNA contains one strand from the original molecule, the process is called semiconservative replication. The nucleotides used for synthesis are dATP, dGTP, dCTP, and dTTP. Each of these DNA nucleotides has three phosphate groups. Two of the phosphates are removed when the nucleotide is attached to the growing chain of new DNA.
3. The strand shown on the right side of the diagram must be synthesized in fragments because the direction of synthesis is 5' to 3'. The area in a DNA molecule where unwinding is occurring is called a replication fork. In the diagram, it looks like an upside-down Y.
4. The resulting fragments are called Okazaki fragments. Covalent bonds must be formed between the newly added nucleotides.
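
The idea of semiconservative replication can be sketched in code: each original strand serves as a template, and each daughter molecule keeps one original strand. This is a simplified illustration with hypothetical names and a made-up sequence; it ignores strand direction, primers, and synthesis in fragments.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the bases that pair with `strand`, position by position."""
    return "".join(COMPLEMENT[base] for base in strand)

def replicate(duplex):
    """Separate the two strands and build a new partner for each one."""
    strand_a, strand_b = duplex
    return [
        {"template": strand_a, "new": complement(strand_a)},
        {"template": strand_b, "new": complement(strand_b)},
    ]

# Hypothetical parent molecule written as a pair of aligned strands.
parent = ("ATGCCGTA", complement("ATGCCGTA"))
daughters = replicate(parent)

for daughter in daughters:
    # Each daughter holds one original (template) strand and one new strand.
    print(daughter["template"], daughter["new"])
```

Both daughter molecules carry the same pair of sequences as the parent, which is why the two copies are identical.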

DNA helicase unwinds the DNA molecule by breaking
hydrogen bonds.
DNA polymerase lengthens the strand that is being
synthesized by adding nucleotides that are complementary to those on the
template strand (A paired with T and G paired with C).
It proofreads the new strand as it synthesizes it.
Incorrectly paired bases are removed and the correct ones are inserted (discussed
below).
DNA polymerase cannot initiate a new strand; it
can only elongate a strand that is already present. Synthesis of new DNA
therefore cannot begin until a short strand of nucleotides is added. This short
strand is called a primer. Primase creates an RNA primer. DNA polymerase can
extend this strand by adding DNA nucleotides. The RNA primer will be removed
and replaced by DNA.
DNA ligase catalyzes the formation of the covalent
bonds between the Okazaki fragments.
The link below may be a helpful summary.
http://www.johnkyrk.com/DNAreplication.html
DNA synthesis occurs at numerous different
locations on the same DNA molecule (hundreds in a human chromosome).
These form bubbles of replication with a replication
fork at the growing edge.
The replication rate of eukaryotic DNA is 500 to
5000 base pairs per minute.
A human cell typically requires a few hours to
duplicate the 6 billion base pairs.
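
A rough calculation shows why replication must start at many locations at once. The genome size and fork rates come from the figures above; the 8-hour replication window is an assumption chosen only to stand in for "a few hours", and the function names are hypothetical.

```python
import math

GENOME_BP = 6_000_000_000        # base pairs to duplicate (from the text)
FORK_RATES = (500, 5000)         # base pairs per minute (from the text)
REPLICATION_MINUTES = 8 * 60     # assumption: an 8-hour window

def single_fork_minutes(genome_bp: int, rate: int) -> float:
    """Minutes needed if one fork copied the whole genome."""
    return genome_bp / rate

def forks_needed(genome_bp: int, rate: int, minutes: int) -> int:
    """Forks that must work in parallel to finish within `minutes`."""
    return math.ceil(genome_bp / (rate * minutes))

for rate in FORK_RATES:
    days = single_fork_minutes(GENOME_BP, rate) / (60 * 24)
    forks = forks_needed(GENOME_BP, rate, REPLICATION_MINUTES)
    print(f"{rate} bp/min: about {days:.0f} days with one fork, "
          f"or {forks} forks working for 8 hours")
```

Even at the faster rate, a single fork would need years, so thousands of forks working in parallel are required under these assumptions.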
Changes in the DNA code are called mutations. Repair enzymes
repair most of the errors that occur in DNA. There are three different
classes of repair mechanisms.
1. Proofreading corrects errors made during the
DNA replication process.
2. Mismatch repair corrects mismatched base pairs (pairs other than A-T and G-C).
3. Excision repair removes and replaces small segments of damaged DNA.
The overall error rate during DNA replication in
E. coli is one base in one million (10^6).
DNA polymerase proofreads the new strand of DNA as
it is synthesized and it removes mismatched bases and replaces them with the
correct bases.
After proofreading, the error rate is 1 in 1
billion (10^9) base pairs.
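
To get a feel for what these error rates mean, the sketch below estimates the number of errors expected in one round of copying. Applying the E. coli rates to the 6 billion base pairs of a human cell is an assumption made only for illustration, and the function name is hypothetical.

```python
def expected_errors(genome_bp: int, bases_per_error: int) -> float:
    """Expected errors when copying `genome_bp` bases once."""
    return genome_bp / bases_per_error

GENOME_BP = 6_000_000_000  # human figure quoted earlier in these notes

# One error per million bases (before proofreading).
print(expected_errors(GENOME_BP, 10**6))  # 6000.0

# One error per billion bases (after proofreading).
print(expected_errors(GENOME_BP, 10**9))  # 6.0
```

Under these assumptions, proofreading reduces the expected number of errors per round of replication by a factor of 1000.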
After DNA is replicated, some enzymes function to
locate mismatched base pairs, remove a short segment of nucleotides containing
the error, and replace the segment with the correct nucleotides. The new
segment is then sealed to the original strand by DNA ligase. Recall that this
is the enzyme that seals the Okazaki fragments.
When repair enzymes detect a pairing error, how do
they know which DNA strand contains the error? The repair enzymes are capable
of distinguishing between the original strand of DNA and the new strand that
contains the error because the new strand is not methylated.
Methylation involves adding methyl groups (CH3) after DNA is
synthesized. Shortly after DNA is synthesized, however, the new strand is not
yet methylated. Mismatch repair enzymes are able to detect which strand is not
methylated.
A number of environmental agents such as radiation
(UV, X-rays, radioactive elements) and chemicals (pesticides, cigarette smoke)
can cause mutations (changes) in DNA.
A number of enzymes monitor the DNA and repair
these changes. For example, excision repair occurs when a mutated segment of
DNA is removed and replaced with a new segment.
A common type of mutation caused by ultraviolet
radiation occurs when two thymines become bonded to each other, forming a kink
in the DNA molecule. This type of mutation, called a thymine dimer,
can result in incorrect nucleotides being paired with it when the strand is
replicated. To repair this mutation, an enzyme removes a segment of DNA that
contains the dimer and replaces the removed nucleotides with nucleotides
complementary to the opposite strand. The new DNA is then bonded to the
original strand with DNA ligase.
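
The logic of excision repair can be sketched as follows: a damaged stretch is cut out and rebuilt from the intact opposite strand. The sequences, the "X" marker for the dimer, and the function name are all hypothetical; real repair enzymes choose the segment boundaries themselves.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def excision_repair(damaged: str, opposite: str, start: int, end: int) -> str:
    """Replace damaged[start:end] with bases complementary to the opposite strand."""
    patch = "".join(COMPLEMENT[base] for base in opposite[start:end])
    return damaged[:start] + patch + damaged[end:]

opposite = "TACGAAGCT"   # intact strand, aligned position by position
damaged = "ATGCXXCGA"    # "X" marks the two thymines fused into a dimer

# Remove a segment that includes the dimer (positions 3 to 6) and refill it.
repaired = excision_repair(damaged, opposite, 3, 7)
print(repaired)  # ATGCTTCGA
```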
Xeroderma pigmentosum is a genetic disease in
which some repair enzymes do not function.
Chromosomes are structures composed of condensed
DNA and associated proteins. When DNA condenses, the molecule becomes wrapped
around proteins called histones. The histones are then arranged
in a coiled pattern to produce a larger fiber. This larger fiber is further
compacted by looping to produce looped domains. The looped domains are coiled
and compacted to produce chromosomes.
Chromatin
is DNA and its associated protein. Heterochromatin
is DNA that is coiled and condensed. In this state, it is not transcribed. Euchromatin
is less condensed and is actively transcribed.
During interphase, looped domains may be attached
to protein supporting structures on the inside of the nuclear membrane. Some of
the DNA is coiled and compacted but other parts are not.
Less than 5% of eukaryotic DNA functions to code
for proteins. Approximately 1.5% of human DNA codes for protein. The function
of the remaining DNA is not known but perhaps much of it has no function.
Noncoding DNA is sometimes called "junk DNA".
Some parts of the DNA contain more genes than
other parts. The gene-rich portions are rich in G and C while the junk DNA is
rich in A and T. The light bands on chromosomes are gene-rich regions.
10-25% of eukaryotic DNA consists of sequences of
5 to 10 nucleotides repeated 100,000 to 1,000,000 times.
This type of DNA probably does not code for
proteins. A large proportion of this type of DNA is found at the tips of the
chromosomes and at the centromere.
Telomeres
DNA polymerase is not capable of initiating the
synthesis of DNA; it can only elongate a strand that has already been started.
Normally, an RNA primer functions to begin the process, allowing DNA polymerase
to attach and finish synthesizing the strand.

A DNA polymerase molecule will then replace the
RNA nucleotides with DNA nucleotides. This is not a problem for primers that
are not located at the 5' end of the new DNA strand because DNA polymerase extends
the neighboring DNA strand that is already there.

RNA primers located at the 5' end of a new DNA strand
cannot be replaced because DNA polymerase cannot begin at the end of a strand.
It can only add to an existing strand.

The new DNA strand is shorter than the template
strand. As a result of the inability of DNA polymerase to initiate synthesis,
the DNA molecule becomes shorter with each cell division.
Human chromosomes have the sequence
"TTAGGG" repeated 100 to 1500 times at each end of the DNA
strand. These repetitive sequences do not contain any genetic
information.
Each time a cell divides, some of this repetitive DNA (roughly 50 to 200 base
pairs) is lost, making the DNA shorter. Short telomeres may prevent a cell
from dividing. The length of telomeres, therefore, may limit the number of
times a cell can divide.
Telomerase is an enzyme that restores the length
of telomeres. This enzyme is normally not found in somatic (body) cells but is
found in germ cells.
Some genes are present in many identical or very
similar copies called multigene families.
Multiple copies of identical genes usually code
for ribosomal RNA, ribosomal proteins, and histones. Hundreds to thousands of
copies of these genes result in faster production of ribosomes and histones.
Copies of similar genes probably evolved from the same
ancestral gene. The globin genes are an example. Hemoglobin is composed of two
alpha chains and two beta chains. These two genes probably evolved from the
same ancestral gene and, in turn, gave rise to a family of various alpha globin
genes and also a family of various beta globin genes.
The DNA of prokaryotes is not condensed into
chromosomes as in eukaryotes. Their chromosome consists of a single loop of
DNA.
Replication begins at a single origin and proceeds
in both directions. The rate of replication is approximately 1,000,000 base
pairs per minute. It typically requires 40 minutes.
Another round of replication may begin before the
previous one has finished, thus allowing cells to divide every 20 minutes.