Molecular Biology of Cellulose Biosynthesis in Plants

Identification of Genes Encoding Cellulose Synthases in Plants
Cellulose synthase genes were first identified in A. xylinum and subsequently in other bacterial species (Matthysse et al., 1995b; Saxena et al., 1990; Wong et al., 1990) before they were identified in plants (Arioli et al., 1998; Pear et al., 1996). Because A. xylinum produces large amounts of cellulose and has served as a model organism for studies on cellulose biosynthesis, it is not surprising that cellulose biosynthesis genes were first identified in this organism. The genes from A. xylinum, however, did not prove useful for isolating cellulose synthase genes from other organisms by nucleic acid hybridization. Instead, Saxena et al. (1995) compared the derived amino acid sequence of the bacterial cellulose synthase with those of other proteins and used the comparison to identify conserved amino acid residues in β-glycosyltransferases, specifically the conserved residues and sequence motif designated D, D, D, QXXRW in processive β-glycosyltransferases. On the basis of the deduced amino acid sequences of bacterial cellulose synthases and other β-glycosyltransferases, genes for plant cellulose synthases were first identified by random sequencing of a cotton fiber cDNA library (Pear et al., 1996). Two cDNA clones (GhCesA1 and GhCesA2) were identified from this library, and the derived amino acid sequence of GhCesA1 gave the first glimpse of the primary structure of a plant cellulose synthase (Pear et al., 1996).
In addition to the transmembrane regions and the conserved residues found in bacterial cellulose synthase, the cellulose synthase from plants was found to have two additional features: two regions (originally referred to as CR-P and HVR) within the globular domain that contains the conserved residues, and a zinc-finger domain at the N-terminus.
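The D, D, D, QXXRW signature described above lends itself to a simple sequence screen. The following is a minimal sketch only, not the method used in the cited studies: it looks for the QXXRW motif (where X is any residue) and, as a deliberately crude assumption, requires at least three aspartate (D) residues anywhere upstream of it, without modeling their actual spacing. The function names and the two toy sequences are hypothetical and invented for illustration; they are not real cellulose synthase sequences.

```python
import re

# QXXRW: Q, any two residues, then R and W. The lookahead lets
# overlapping occurrences be reported as well.
QXXRW = re.compile(r"(?=Q..RW)")


def find_qxxrw(protein):
    """Return the 0-based start position of each QXXRW occurrence."""
    return [m.start() for m in QXXRW.finditer(protein.upper())]


def has_signature(protein, min_upstream_d=3):
    """Crude screen (assumption): a QXXRW motif preceded by at least
    `min_upstream_d` aspartate residues anywhere upstream of it."""
    seq = protein.upper()
    for start in find_qxxrw(seq):
        if seq[:start].count("D") >= min_upstream_d:
            return True
    return False


# Invented toy sequences, for illustration only.
toy_hit = "MKTDAAGLDPLVVDGSTQAVRWLK"   # three D residues, then QAVRW
toy_miss = "MKTAAGLPLVVGSTQAVRWLK"     # QAVRW present, no D upstream

if __name__ == "__main__":
    for name, seq in [("toy_hit", toy_hit), ("toy_miss", toy_miss)]:
        print(name, find_qxxrw(seq), has_signature(seq))
```

A screen this loose would return many false positives on real data; a realistic search would rely on alignment against known β-glycosyltransferase sequences rather than residue counting.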

Around the same time that cDNA clones encoding cellulose synthases were identified in cotton by random sequencing (Pear et al., 1996), a number of cDNA clones encoding amino acid sequences that contain the D, D, D, QXXRW conserved residues and sequence motif were identified by analysis of expressed sequence tag (EST) sequences of Arabidopsis and rice available in public databases (Cutler and Somerville, 1997; Saxena and Brown, 1997).
However, the proteins encoded by these cDNA clones did not show the additional features identified in the cotton cellulose synthases; instead, they more closely resembled the primary structure of the bacterial cellulose synthase and were referred to as cellulose synthase-like proteins, with a possible role in the synthesis of β-linked polysaccharides other than cellulose (Cutler and Somerville, 1997). Soon thereafter, a superfamily of genes encoding cellulose synthases (CesA) and cellulose synthase-like (Csl) proteins was identified in a large number of plants (Richmond and Somerville, 2000). The presence of so many genes belonging to the cellulose synthase superfamily in each plant was surprising at first, but the role of many of the CesA genes in cellulose biosynthesis became clear following analyses of a number of Arabidopsis mutants affected in cellulose biosynthesis. Interestingly, two cellulose synthase genes had been identified earlier in A. xylinum (Saxena and Brown, 1995). Although both genes encode a functional cellulose synthase, as determined by in vitro cellulose synthase activities in mutants, only one gene was found to be essential for normal in vivo cellulose synthesis in A. xylinum (Saxena and Brown, 1995).