Separate DNA binding and transcription activation domains

Transcription factors have separate (i) DNA binding domains and (ii) transcription activation domains. Through their DNA binding domains, they recognize specific DNA sequences for regulating gene expression, and after binding to DNA, they may interact with either RNA polymerase or with other transcription factors. Transcription activation domains, on the other hand, function through protein-protein interaction to bring about activation of transcription.

(a) DNA binding domains. The following protein motifs are involved in regulation of transcription through DNA binding : (i) several steroid receptors are activated by their corresponding steroids, which leads these receptors to bind DNA and thus initiate transcription; (ii) the zinc finger motif is a DNA binding domain first recognized in transcription factor TFIIIA, which is required for transcription of 5S rRNA by RNA polymerase III (see later for details); (iii) the helix-turn-helix motif, first recognized in phage λ repressors, is now known to be present in several transcription factors in Drosophila (homeobox) and mammals; one alpha helix lies in the major groove of DNA and the other at an angle across the DNA; (iv) the helix-loop-helix motif was found in some developmental regulators; it regulates expression of genes coding for some eukaryotic DNA-binding proteins, and is involved in protein dimerization and DNA binding; (v) leucine zippers consist of a stretch of amino acids with a leucine residue at every seventh position (see later for details).

(i) Steroid receptors. The steroid receptors include receptors for the steroid hormones, retinoids, vitamin D, thyroid hormones and a number of other compounds. These proteins contain separate domains for hormone binding, DNA binding and transcriptional activation. The DNA binding domain contains 70 amino acid residues, including eight conserved cysteine residues that coordinate zinc to form two zinc fingers. A peptide from this domain can fold (into two α helices) in the presence of zinc, and is used for recognizing the appropriate binding site in DNA.

(ii) Zinc proteins (with Zn fingers, Zn clusters and Zn twists). In some proteins, there are zinc binding sites with distinct DNA binding motifs. In one such motif, a loop of amino acids protrudes as a zinc finger from the zinc binding site (Fig. 32.20). In other motifs, a zinc twist or a zinc cluster is found. More information is available about zinc fingers than about zinc twists and zinc clusters. The zinc fingers are described as Cys2/His2 and Cys2/Cys2 fingers.
 
Fig. 32.20. Three Cys2/His2 zinc fingers in the transcription factor SP1

Cys2/His2 fingers have the following consensus sequence : Cys-X2-4-Cys-X3-Phe-X5-Leu-X2-His-X3-His, where X is any amino acid and the number gives the count of residues. Each finger consists of about 23 amino acids and is linked to the next finger by 7-8 amino acids. Details of some transcription factors with zinc fingers are listed in Table 32.4. These zinc fingers are required for binding to DNA. At one extreme, the fingers may involve almost the entire protein, as in TFIIIA (9 fingers); at the other extreme, only a small domain is involved in forming zinc fingers, as in ADR1 (2 fingers). Since zinc fingers are found in several known transcription factors, this feature has also helped in recognizing proteins that may function as transcription factors, e.g. TDF (testis determining factor), ZFX and ZFY (zinc finger proteins encoded by genes on human X and Y chromosomes).
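The consensus above can be read as a search pattern. The following is a minimal sketch in Python, assuming one-letter amino acid codes and treating every X as "any residue"; the function name and the toy sequence are made up for illustration and do not correspond to a real protein. Real fingers deviate from the consensus, so a strict pattern like this would miss many genuine fingers.

```python
import re

# Cys-X2-4-Cys-X3-Phe-X5-Leu-X2-His-X3-His in one-letter code.
# "." stands for X (any residue); {n} and {m,n} give the residue counts.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}F.{5}L.{2}H.{3}H")


def find_c2h2_fingers(seq):
    """Return (start_index, matched_segment) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(seq.upper())]


# Made-up toy sequence (hypothetical, not a real protein):
# two fingers joined by a 7-residue linker, as described in the text.
finger_a = "CPECGKAFSRSDELTRHIRTH"    # two residues between the cysteines
finger_b = "CRIMGCNRKFARSDELKRHTKIH"  # four residues between the cysteines
toy = "MA" + finger_a + "TGEKPYK" + finger_b + "GS"

for start, segment in find_c2h2_fingers(toy):
    print(start, segment)
```

Counting the matches gives a rough estimate of how many consensus-like fingers a sequence carries, which is the kind of evidence the text describes for flagging candidate transcription factors.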



In contrast to the above Cys2/His2 fingers, which are repetitive in nature, there are also Cys2/Cys2 fingers, which are non-repetitive. Some examples are given in Table 32.5. As shown in the table, some of the proteins with Cys2/Cys2 fingers are steroid receptors, which recognize and perhaps bind to the response elements (in DNA) for these receptors. Mutations altering the amino acids in finger regions have been shown to alter the DNA binding ability of these zinc finger proteins, suggesting that the fingers are responsible for DNA binding. Rarely, fingers may bind RNA rather than DNA, or may not bind any nucleic acid at all.


Translation initiation factors (e.g. eIF2β) have also been shown to have zinc fingers, which help in recognition of initiation codons.

(iii) Helix-turn-helix (HTH) and homeodomain. The HTH motif was the first DNA-recognition motif discovered. It consists of a 20-residue segment with an α helix (residues 1-7), a turn (residues 8-11) and a second α helix (residues 12-20). It occurs in a large family of prokaryotic DNA-binding proteins including CAP, λ Cro and λ repressor. The second helix of the HTH lies in the major groove and contributes important base pair contacts for DNA binding. (For more details on prokaryotic HTH domains, see Regulation of Gene Expression 1. Operon Circuits in Bacteria and other Prokaryotes and Regulation of Gene Expression 2. Circuit of Lytic Cycle and Lysogeny in Bacteriophages.)

Among eukaryotic regulatory proteins, the most important domain with an HTH motif is the 60-residue homeodomain found in the products of a number of developmental genes in Drosophila. The polypeptide chain of the homeodomain folds as a 3-helix bundle, in which the second and third helices form the HTH.

(iv) Helix-loop-helix (HLH). The HLH proteins have some similarities with the leucine zipper family (see below). Like leucine zipper proteins, the HLH proteins have a basic region that contacts the DNA and a neighbouring region that mediates dimer formation. The dimerization region forms an α-helix, a loop and a second α-helix. The HLH proteins play an important role in differentiation and development, and their activity is modulated by heterodimer formation. For instance, the MyoD protein forms a heterodimer with the E2A protein and helps in the differentiation of muscle cells.

(v) Leucine zippers, dimer formation and DNA binding. In 1984, Robert Tjian at the University of California (Berkeley) identified a protein, SP1, which was shown to bind a motif repeated five times in the promoter of the early genes of the mammalian virus SV40. This binding led to selective activation of these genes. This first report of regulation of eukaryotic genes by DNA binding proteins led to the isolation and purification of a protein, C/EBP (it binds to CAAT and to the SV40 core enhancer; C/EBP = CAAT/Enhancer Binding Protein), by S.L. McKnight and his coworkers at the Carnegie Institution of Washington at Baltimore (U.S.A.). The gene for C/EBP was cloned and sequenced, giving the sequence of the 359 amino acids in this protein. A computer search revealed a 60-amino acid region in C/EBP similar to regions in two other proteins, encoded by the myc and fos proto-oncogenes (proto-oncogenes function normally, but become cancer causing due to mutations; see Genetics of Cancer : Proto-oncogenes, Oncogenes and Tumour Suppressor Genes). Subsequently, a leucine zipper was discovered in these and several other proteins; it is a stretch of amino acids in which leucine occupies every seventh position (heptad repeat). The region containing these leucine residues may form an amphipathic alpha helix (amphipathic means that in the alpha helix hydrophobic groups face one side and hydrophilic groups face the other; amino acids with different properties segregate on the alpha helix). The leucine zippers form dimers between similar molecules (e.g. Jun-Jun homodimer) or between dissimilar molecules (e.g. Jun-Fos heterodimer). Earlier it was believed that the leucines of one protein interdigitate with the leucines of the zipper of another protein in reverse orientation. However, it has now been established that zipper regions associate in parallel (leucines overlap and line up side-by-side) when they form a dimer (Fig. 32.21) in the form of a coiled coil (two right handed helices wind around each other).
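The heptad repeat described above (a leucine at every seventh position) can be located with a simple scan. The following is a minimal sketch in Python, assuming one-letter amino acid codes; the function name, the `min_repeats` threshold and the toy sequence are hypothetical choices made for illustration. It checks only the leucine spacing and says nothing about whether the region is actually helical or forms a dimer.

```python
def find_leucine_heptads(seq, min_repeats=4):
    """Find runs of leucines spaced exactly 7 residues apart.

    Returns a list of (start_index, number_of_leucines) for each maximal run
    containing at least `min_repeats` leucines. `min_repeats` is an
    illustrative cutoff, not a value taken from the text.
    """
    seq = seq.upper()
    hits = []
    for i, aa in enumerate(seq):
        if aa != "L":
            continue
        # Skip leucines that merely continue a run begun 7 residues earlier.
        if i >= 7 and seq[i - 7] == "L":
            continue
        count = 1
        j = i + 7
        while j < len(seq) and seq[j] == "L":
            count += 1
            j += 7
        if count >= min_repeats:
            hits.append((i, count))
    return hits


# Made-up toy sequence (hypothetical): leucines at positions 3, 10, 17, 24, 31.
toy_zipper = "AAA" + "LKQAEDK" * 4 + "L" + "GG"
print(find_leucine_heptads(toy_zipper))
```

A hit from such a scan marks only a potential zipper; the other features discussed in this section (absence of helix-breaking residues, a neighbouring basic region) would still need to be checked.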
 
Fig. 32.21. Association of leucine zippers : (a) antiparallel zippering (as earlier envisaged); (b) parallel zippering due to side-by-side overlapping (as known now to be correct).

Fig. 32.22. A consensus sequence in protruding arms of leucine zippers, based on a comparison of several proteins; regions that aid in binding and touch the DNA are highlighted.

Amino acid sequences were compared in at least 11 regulatory leucine zipper proteins, including four mammalian proteins (C/EBP, CREB, Jun, Fos), two yeast proteins (GCN4, YAP1), two other fungal proteins (CYS-3, CPC1) and three plant proteins (HBP1, TGA1, OPAQUE2). This comparison revealed the consensus sequence shown in Figure 32.22. The general features of these proteins are as follows : (i) proline and glycine are absent or rarely found, thus permitting formation of an alpha helix; (ii) 2-5 heptad repeats of leucine are found, which help in dimer formation for DNA binding; (iii) the basic amino acids arginine and lysine (arg/lys) are found close to the leucine zipper and help in DNA binding (zippering of two molecules brings arg/lys into the proper position for combining with dyad symmetric motifs in DNA); (iv) zippering leads to formation of Y shaped dimers; (v) an asparagine is always found at the same position and interrupts the alpha helix (like proline and glycine), thus allowing the arms of the Y shaped dimer to bend and grip the DNA motif meant for binding (Fig. 32.23).
 
Fig. 32.23. DNA binding regions of leucine zipper : (a) alpha helices protruding out from DNA (as earlier envisaged); (b) the protruding regions of alpha helices bend at asparagine residues forming a Y-shaped dimer, to establish a grip on DNA motif.

(b) Transcription activation domains. The transcription activation domains are separate from the DNA binding domains, and each consists of 30-100 amino acids. Different types of activation domains have been exchanged and combined with other DNA binding proteins to produce chimeric transcription factors. The transcription activation domains function through protein-protein interactions and often help in establishing contacts with components of the transcription complex, which leads to activation of transcription. The following three types of activation domains, shown in Figure 32.24, have been identified : (i) Acidic domains were first identified in the yeast transcription factors GAL4 and GCN4. Later they were found in the glucocorticoid hormone receptor and also in AP-1/Jun transcription factors. They have two features in common, i.e. they have regions of significant negative charge and they can form amphipathic alpha helical structures. The acidic domains help in the association of TFIIB and TFIID. (ii) Glutamine rich domains were first identified in SP1, and later also in Drosophila's Antennapedia and Ultrabithorax, in yeast's HAP1, HAP2 and GAL11, and in several mammalian factors (OCT-1, OCT-2, JUN, AP-2, SRF). (iii) Proline rich domains have been identified in CTF/NF-1 and in several other mammalian factors (AP-2, JUN, OCT-2, SRF). The mechanisms of action of these activation domains are shown in Figure 32.25.
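The three types of activation domains are distinguished by amino acid composition: negative charge, glutamine content or proline content. The following is a minimal sketch in Python of how these properties could be computed for a segment, assuming one-letter amino acid codes; the function names are hypothetical, the charge estimate is deliberately crude (+1 for Lys/Arg, -1 for Asp/Glu, histidine ignored), and no classification thresholds are implied by the text.

```python
def net_charge(segment):
    """Crude net charge: +1 per Lys/Arg, -1 per Asp/Glu (His ignored)."""
    segment = segment.upper()
    positive = sum(segment.count(aa) for aa in "KR")
    negative = sum(segment.count(aa) for aa in "DE")
    return positive - negative


def residue_fraction(segment, residues):
    """Fraction of the segment made up of the given one-letter residues."""
    segment = segment.upper()
    if not segment:
        return 0.0
    return sum(segment.count(aa) for aa in residues) / len(segment)


def describe_segment(segment):
    """Summarize the properties that distinguish the three domain types."""
    return {
        "net_charge": net_charge(segment),               # acidic domains: strongly negative
        "gln_fraction": residue_fraction(segment, "Q"),  # glutamine rich domains
        "pro_fraction": residue_fraction(segment, "P"),  # proline rich domains
    }


# Made-up toy segment (hypothetical), rich in Asp/Glu.
print(describe_segment("DDEELAKDEEL"))
```

Applied along a sliding window of 30-100 residues, the size range given in the text, such a summary would only highlight candidate regions; activation itself has to be shown experimentally, for example with the chimeric factors mentioned above.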
 
Fig. 32.24. Three types of protein domains responsible for transcriptional activation by DNA binding factors.

Fig. 32.25. Mechanism of action of activation domains of transcription factors : (a) a hypothetical array of cis elements in the promoter/enhancer regions of a gene transcribed by Pol II, and the associated transcription factors (all these DNA binding factors may not be required simultaneously to initiate transcription); (b) mechanism(s) by which cis elements activate transcription may involve protein-protein interaction, so that distally found factors can take part in transcription initiation.

Proteins that influence transcription without binding to DNA. Several proteins influence transcription without binding to DNA (sometimes through recognition of another protein); they include the following : (i) E1A activates Ad (adenovirus) genes and pol II and pol III genes; (ii) Tat acts at the post-transcriptional level with the help of the tar sequence in HIV RNA; and (iii) Vmw 65 activates immediate early (IE) genes in HSV (herpes simplex virus) by associating with another transcription factor.

Similarity between prokaryotic sigma factor and some eukaryotic transcription factors. Recently, similarity between sigma factor and at least three different nuclear mRNA transcription factors has been demonstrated. These transcription factors include the following : (i) RPO24, which is the fourth largest subunit of yeast RNA polymerase II; (ii) RAP30, which is part of a complex of two proteins that bind tightly to human RNA polymerase II; (iii) TFIID, which binds to the TATA box. The similarities of certain domains of sigma with these three factors are shown in Figure 32.26.

The above similarities are consistent with the observation that the two largest subunits of eukaryotic nuclear RNA polymerases are homologues of the large subunits of bacterial RNA polymerase.
 
Fig. 32.26. Similarity of several regions of E. coli sigma factor (σ70), with those in three nuclear transcription factors, i.e. RPO24 (a subunit of yeast RNA polymerase II), RAP30 (a human transcription factor) and TFIID; and one mitochondrial transcription factor, MTF1.