Upstream DNA sequences controlling transcription by RNA polymerase II in eukaryotes (TATA box, CAAT box and GC box).
As mentioned in Expression of Gene : Protein Synthesis 2. Transcription in Prokaryotes and Eukaryotes, eukaryotes have three RNA polymerases (I, II and III). RNA polymerase I synthesizes rRNA, RNA polymerase II transcribes protein-coding genes, and RNA polymerase III synthesizes tRNA and 5S RNA. These three RNA polymerases recognize different promoter sequences on DNA (see Expression of Gene : Protein Synthesis 2. Transcription in Prokaryotes and Eukaryotes).
Using mutations involving base substitutions at almost every position in the 100 bp sequence upstream of the β-globin startpoint (and also in other cases), three short sequences centred at -30, -75 and -90 were identified in the promoter for RNA polymerase II. Although the promoter itself may be long, the sequences surrounding these consensus sequences, called boxes, may actually influence the effectiveness of transcription. The boxes themselves are small in size and common to many tissues and organisms. The best characterized of these modules are the TATA box, CAAT box and GC box.
Fig. 37.9. Different consensus sequences (boxes) in the promoter region of the thymidine kinase gene (orientation of each sequence is shown by arrows; Sp1 and CTF are transcription factors binding to the GC and CAAT boxes, respectively).
The TATA box resembles the Pribnow box of prokaryotes and is invariably AT rich. It was identified in the laboratory of Dr. Hogness by Dr. Goldberg during his Ph.D. work, and is therefore often referred to as the Goldberg-Hogness box or TATA box. It has been shown that the presence of the TATA box is essential for transcription. This has been demonstrated in both in vitro (in culture) and in vivo (living system) experiments, in which deletions or mutations in the TATA box resulted in reduced transcription or a complete halt of transcription. However, this is the least effective component of the promoter. It has also been shown that other sequences, both downstream from the TATA box (between the TATA box and the initiation site) and upstream from it up to a distance of 200 bases, may be important in controlling transcription or, in other words, gene activity (see Expression of Gene : Protein Synthesis 2. Transcription in Prokaryotes and Eukaryotes).
Further upstream from the transcription initiation site, between positions -70 and -80, lies another consensus sequence with the pattern GG(C or T)CAATCT. It is called the CAAT box and is the most effective component of eukaryotic promoters. The GC box, which may occur between -50 and -60 or between -90 and -100, contains the sequence GGGCGG and may be found in multiple copies in either orientation.
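The consensus patterns quoted above can be turned into a simple scan of an upstream sequence. The following Python sketch is only an illustration: the function names, the regular-expression encoding of the boxes and the toy sequence are assumptions made for this example (the toy sequence is not the thymidine kinase or β-globin promoter), and exact string matching is a much cruder test than the mutational analysis described in the text.

```python
import re

# Consensus patterns as quoted in the text. Encoding them as regular
# expressions is an illustrative choice, not a validated motif model.
# The lookahead form (?=(...)) lets overlapping copies be reported too.
CAAT_BOX = re.compile(r"(?=(GG[CT]CAATCT))")
GC_BOX = re.compile(r"(?=(GGGCGG))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_boxes(upstream, pattern, both_orientations=False):
    """Return sorted (position, strand) hits for `pattern`.

    `upstream` is assumed to end at position -1, i.e. its last base lies
    immediately before the startpoint (+1). The reported position is the
    leftmost base of the hit on the given sequence.
    """
    n = len(upstream)
    hits = [(m.start(1) - n, "+") for m in pattern.finditer(upstream)]
    if both_orientations:
        rc = reverse_complement(upstream)
        for m in pattern.finditer(rc):
            # A hit ending at index e of the reverse complement starts at
            # index n - e of the original sequence, i.e. at position -e.
            hits.append((-m.end(1), "-"))
    return sorted(hits)


# Hypothetical 100 bp upstream region built only for this demonstration.
toy_upstream = (
    "A" * 10 + "GGGCGG"       # GC box, forward orientation
    + "A" * 9 + "GGCCAATCT"   # CAAT box
    + "A" * 8 + "CCGCCC"      # GC box, reverse orientation
    + "A" * 52
)

caat_hits = find_boxes(toy_upstream, CAAT_BOX)
gc_hits = find_boxes(toy_upstream, GC_BOX, both_orientations=True)
print(caat_hits)
print(gc_hits)
```

Scanning the reverse complement is one simple way to reflect the statement that the GC box may occur in either orientation; the CAAT box is scanned in one orientation here only to keep the example short.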
Downstream promoter for RNA polymerase III.
Promoter sequences for transcription have often been located only upstream of the startpoint. However, the genes coding for 5S RNA in Xenopus laevis have been shown to have promoters downstream of the initiation site, within the transcription unit, between +55 and +80 base pairs from the startpoint of the gene. There are also other cases of RNA polymerase III transcription in which promoters are located downstream. For instance, in VA1 (an adenovirus gene) the promoter lies between +9 and +72 base pairs, and in the tRNAfMet gene of Xenopus laevis the promoter lies in two parts, one between +8 and +30 and the other between +51 and +72. Deletions in these internal sites led to failure of initiation, whereas deletions in other regions of the gene did not. Some other tRNA genes in eukaryotes also have similar promoter regions (see Expression of Gene : Protein Synthesis 2. Transcription in Prokaryotes and Eukaryotes).
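The deletion experiments described above amount to an interval-overlap question: does a deleted stretch touch one of the internal promoter regions? The short Python sketch below records the coordinates quoted in the text and checks for overlap. The dictionary keys and the function name are hypothetical, and treating any overlap as equivalent to failure of initiation is a simplifying assumption made for illustration.

```python
# Internal promoter regions, in base pairs downstream of the startpoint,
# as quoted in the text. The key names are chosen for this example only.
INTERNAL_PROMOTERS = {
    "5S RNA (Xenopus laevis)": [(55, 80)],
    "VA1 (adenovirus)": [(9, 72)],
    "tRNAfMet (Xenopus laevis)": [(8, 30), (51, 72)],
}


def deletion_hits_promoter(gene, del_start, del_end):
    """Return True if the deletion [del_start, del_end] (inclusive)
    overlaps any internal promoter region recorded for `gene`."""
    return any(
        del_start <= region_end and del_end >= region_start
        for region_start, region_end in INTERNAL_PROMOTERS[gene]
    )


# A deletion between the two parts of the tRNAfMet promoter leaves both intact.
print(deletion_hits_promoter("tRNAfMet (Xenopus laevis)", 31, 50))
# A deletion reaching into the first part does not.
print(deletion_hits_promoter("tRNAfMet (Xenopus laevis)", 25, 35))
```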
Homeotic box and regulation of gene activity in Drosophila and other eukaryotes.
During 1984 and 1985, a set of genes called homeotic genes, which determine the body plan, received major attention from molecular biologists. Mutations in these homeotic genes cause a clean transformation of one body part into another. In this connection, two gene complexes in Drosophila, namely bithorax and antennapedia (antenna-foot), have been studied in some detail and have now been cloned. Each of these two complexes contains several genes, not all of which are homeotic. It has been demonstrated that in almost all these genes, homeotic or non-homeotic, a nucleotide sequence 180 base pairs long is found near the 3' end of the gene. This sequence seems to be highly conserved and has now been discovered in other organisms such as Xenopus (toad), mice and humans (in humans, at least five such homeotic genes having this 180 base pair sequence are known). This 180 bp sequence was named the homeo box by Walter Gehring of the University of Basel in Switzerland. Some of these genes will be described.
The genes of the antennapedia complex include 'Antp'. Similarly, the genes of the bithorax complex include 'Ubx'. Among these genes, the 'ftz' genes are interesting and are characterized as segmentation genes that follow the 'pair rule', since every alternate segment is missing, thus reducing the number of segments to half (the ftz homozygote dies after the early larval stage; fushi tarazu is a Japanese phrase meaning 'not enough segments'). In individuals homozygous for a mutation in the 'engrailed' gene, alternate segments are likewise defective, thus following the 'pair rule'. Similarly, the 'Antp' gene in its mutant form, in homozygous condition, causes the conversion of the antennae of adult flies into legs. These genes can prove useful in the study of the genetic control of developmental pattern. (Consult next main topic for more details).
In some of the above genes, the 'homeo box' has been sequenced, as in 'fushi tarazu'. It has been shown that although the genes having homeo boxes may differ widely in their phenotypic effect, to the extent that some of them are not really homeotic, the 'homeo box' sequence is similar and highly conserved. Therefore, it is believed that their role in the regulation of gene activity should be similar. The 60 amino acids coded by the 'homeo box' seem to give rise to a domain within the protein products of each of the homeotic genes that constitute complexes like bithorax and antennapedia. It has also been shown that the 'homeo boxes' of Drosophila genes are homologous to the yeast MATα2 gene, which regulates sets of genes determining the mating type of yeast cells (see later in this section for more details). The protein products of the MATα2 gene of yeast and of the 'engrailed' gene of Drosophila have been shown to bind to a specific DNA sequence, thus regulating the activity of either the genes to which the homeo box belongs or even other unrelated genes.
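The figure of 60 amino acids follows directly from the length of the homeo box, since each amino acid is specified by a codon of three bases: 180 / 3 = 60. A minimal Python sketch of this arithmetic is given below. The sequence used is an arbitrary placeholder, not a real homeo box, and the function name is an assumption made for illustration.

```python
HOMEO_BOX_LENGTH_BP = 180  # length quoted in the text
CODON_LENGTH = 3           # bases per codon


def split_into_codons(seq):
    """Split a coding sequence into consecutive three-base codons."""
    if len(seq) % CODON_LENGTH != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + CODON_LENGTH] for i in range(0, len(seq), CODON_LENGTH)]


# Placeholder sequence of the right length; NOT an actual homeo box.
placeholder_box = "ACG" * (HOMEO_BOX_LENGTH_BP // CODON_LENGTH)

codons = split_into_codons(placeholder_box)
print(len(codons))  # number of codons, hence of amino acids encoded
```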
Response elements for a variety of stimuli.
In signal transduction pathways (see later), smart genes are eventually expressed owing to the presence of unique modules in DNA. These modules are short consensus sequences, usually located less than 200 bp upstream of the gene to be expressed. They are called response elements (RE) and may be present, in the promoter or in the enhancer, in single or multiple copies. The response elements are recognized by certain proteins, and the binding region for a receptor protein or transcription factor extends for a short distance on either side of the RE. Some examples of these response elements are given in Table 37.3.
The response elements have often been identified through the use of gene constructs consisting of the sequences flanking the gene in question and a reporter gene (e.g. the cat gene for chloramphenicol acetyltransferase or the gus gene for β-glucuronidase). These gene constructs are used to produce transgenic plants, in which the response of the reporter gene under the control of the RE is assessed. (Consult next section for more details).