Eur. J. Biochem. 269, 224–232 (2002) © FEBS 2002
Organization of six functional mouse alcohol dehydrogenase genes on two overlapping bacterial artificial chromosomes
Gabor Szalai1, Gregg Duester2, Robert Friedman1, Honggui Jia3, ShaoPing Lin3, Bruce A. Roe3 and Michael R. Felder1
1Department of Biological Sciences, University of South Carolina, Columbia, USA; 2Gene Regulation Program, The Burnham Institute, La Jolla, CA, USA; 3Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
Mammalian alcohol dehydrogenases (ADH) form a complex enzyme system based on amino-acid sequence, functional properties, and gene expression pattern. At least four mouse Adh genes are known to encode different enzyme classes that share less than 60% amino-acid sequence identity. Two ADH-containing and overlapping C57BL/6 bacterial artificial chromosome clones, RP23-393J8 and -463H24, were identified in a library screen, physically mapped, and sequenced. The gene order in the complex and two new mouse genes, Adh5a and Adh5b, and a pseudogene, Adh5ps, were obtained from the physical map and sequence. The mouse genes are all in the same transcriptional orientation in the order Adh4-Adh1-Adh5a-Adh5b-Adh5ps-Adh2-Adh3. A phylogenetic tree analysis shows that adjacent genes are most closely related, suggesting that a series of duplication events resulted in the gene complex. Although mouse and human ADH gene clusters contain at least one gene for ADH classes I–V, the human cluster contains three class I genes while the mouse cluster has two class V genes plus a class V pseudogene.
Keywords: alcohol dehydrogenase; mouse; gene complex.
The alcohol dehydrogenases (ADHs; EC 1.1.1.1) are zinc-containing, dimeric enzymes found in the cytosolic fraction of the cell that are capable of reversible oxidation of a spectrum of primary and secondary alcohols to the corresponding aldehydes and ketones. Mammalian ADHs currently known are grouped into six distinct classes [1–3] with members of different classes sharing less than 70% amino-acid sequence identity within a species. Humans possess classes I, II, III, IV and V [2] whereas the mouse is known to have expressed genes encoding ADHs of class I, II, III and IV [4,5]. Humans have three class I genes [6] encoding proteins with greater than 90% positional identity whereas the mouse has a single class I gene [7,8]. A human class V ADH with approximately 60% positional identity at the amino-acid level with the other human classes has been revealed from cDNA encoded sequence but not yet associated with an enzyme [9]. Deermouse [10] and rat [11] ADH cDNAs have been isolated that would encode proteins most closely related to this human class V, with the deermouse ADH cDNA encoding a protein with 67% positional identity to this class.
The ADH family of enzymes performs important metabolic functions. In the mouse, class III functions as a glutathione-dependent formaldehyde dehydrogenase [12] and is involved in S-nitrosoglutathione metabolism [13] and retinol metabolism (G. Duester, unpublished results). Class IV has an important role in retinol metabolism leading to retinoic acid production. Adh4 is expressed during embryogenesis [14,15] and Adh4 null mouse mutants have reduced fetal survival during vitamin A deficiency [16]. This observation, coupled with reduced conversion of retinol to retinoic acid in tissues of Adh4 mutant mice [12], suggests a role for class IV enzyme in retinoid signaling during embryogenesis. Class I is expressed at high levels in liver, and its role in ethanol metabolism has been confirmed by natural [17] and engineered [12] enzyme-deficient animals. Adh1 null mutant mice also demonstrate a significant decrease in metabolism of retinol to retinoic acid [12]. The in vivo physiological roles of class I, III, and IV ADHs in the mouse are being unveiled with the development of targeted disruptions for each gene. The mouse Adh1 and Adh4 genes are linked [18] on chromosome 3 at 71.2 cM (Mouse Genome Database), and this complex is a candidate region for a quantitative trait locus (Alcp3) for alcohol preference [19].
Correspondence to M. R. Felder, Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA. Fax: + 803 777 4002, Tel.: + 803 777 5135, E-mail: felder@mail.biol.sc.edu
Abbreviations: BAC, bacterial artificial chromosome; NJ, neighbor-joining; PC, Poisson corrected; CHEF, contour-clamped homogeneous electric field.
Definitions: The nomenclature for the alcohol dehydrogenase gene and enzyme family being followed is that of Duester, G., Farrés, J., Felder, M.R., Holmes, R.S., Höög, J.-O., Parés, X., Plapp, B.V., Yin, S.-J. & Jörnvall, H. (1999) Biochem. Pharmacol. 58, 389–395. The previously used human ADH1, ADH2, ADH3 genes encoding class I enzymes are renamed ADH1A, ADH1B, and ADH1C, indicating their greater than 90% similarity. The enzyme class names and gene names now correspond. The class II human and mouse enzymes are encoded by ADH2 and Adh2, respectively; class III enzymes by human ADH3 (old ADH5) and mouse Adh3 (old Adh5); class IV enzymes by human ADH4 (old ADH7) and mouse Adh4 (old Adh3); class V enzyme by human ADH5 (old ADH6). The deermouse class VI ADH is closest to human ADH5 protein in amino-acid sequence, and based on phylogeny and genome organization of the mouse presented here is now referred to as class V with a gene symbol of Adh5. New mouse genes reported here are named Adh5a, Adh5b, and Adh5ps.
(Received 20 September 2001, accepted 30 October 2001)
The genes encoding different classes of mammalian ADH have different tissue expression patterns, suggesting that complex regulatory mechanisms control this gene family. The mouse Adh1, Adh2, Adh3, and Adh4 genes have some overlap in tissue expression pattern, but levels of expression in different adult tissues and hormonal responsiveness of different genes in the complex vary widely [4,20–23].
cis-Acting sequences controlling the expression pattern of the Adh genes in mouse are not fully understood. A minimal promoter directing expression in transfected hepatoma cells has been defined for Adh1 [24,25]. However, as much as 10 kb of sequence upstream of an Adh1 minigene does not direct expression in liver of transgenic mice [26], although kidney and adrenal expression is promoted. An entire Adh1 gene containing 7 kb of 5′ and 21 kb of 3′ flanking sequence in transgenic mice expresses properly in most tissues except liver and intestine [27]. More distal sequences must control expression of Adh1 in these tissues, and this has stimulated an effort to obtain mouse Adh1 bacterial artificial chromosome (BAC) clones to identify these cis-acting regions. Analysis of these BAC clones has provided a more detailed knowledge of the genomic organization of the mouse Adh gene complex. This is an important prerequisite to understanding the regulatory strategy of this gene family.
In this report, we show that the previously known four mouse Adh genes are clustered within 250 kb of the mouse genome. Furthermore, two additional Adh genes and one pseudogene are found within this DNA region, and the order of these genes on two overlapping BAC clones has been determined. All genes are transcribed in the same orientation, and sequence comparisons and location within the complex suggest that the two newly identified Adh genes are most closely related to the human ADH5 gene.
MATERIALS AND METHODS
Materials
Chemicals and reagents were obtained from Sigma Chemical Co., Fisher Chemicals, J. T. Baker (Phillipsburg, NJ, USA), and New England Biolabs unless otherwise indicated. RPCI-23 segment 2 BAC library was purchased from BACPAC Resources of Roswell Park Memorial Institute (Buffalo, NY, USA). Individual clones were also purchased from this resource.
cDNA hybridization probes for mouse genes
The ADH cDNA probes used in this study were inserts from pADHn1 for deermouse ADH5 and pCK2 for mouse ADH1 [10]. The ADH4 and ADH3 cDNAs have been previously described [4]. Mouse ADH2 cDNA was prepared using methods previously described [4] by performing RT-PCR on mouse liver RNA with primers overlapping the start and stop codons of the published mouse ADH2 cDNA sequence [5]. The ADH2 cDNA was confirmed by nucleotide sequence analysis. Isolated cDNA inserts were used to prepare all probes for labeling by random priming [28].
Screening the BAC library for mouse Adh1-containing clones
Five filters containing a total of over 90 000 clones were screened by hybridization with 32P-labeled mouse ADH1 full-length insert cDNA in pCK2. Probe was used in the hybridization solution at 2.4 × 10⁶ c.p.m.·mL⁻¹. Filters were prehybridized for ≥ 2 h at 65 °C in 6 × NaCl/Pi/EDTA (1 × = 0.15 M NaCl, 0.01 M sodium phosphate, 1 mM EDTA, pH 7.4)/6 × Denhardt's solution, 0.5% SDS, and 50 µg·mL⁻¹ denatured salmon sperm DNA. After prehybridization, the solution was discarded and replaced with a hybridization solution of identical composition, except without Denhardt's solution, containing the radioactive probe. After overnight hybridization the filters were washed twice with 6 × NaCl/Pi/EDTA/0.5% SDS at room temperature for 15 min each, twice in 1 × NaCl/Pi/EDTA/0.5% SDS at 37–42 °C for 15 min each, and once in 0.1 × SSPE/0.5% SDS at 65 °C. The filters were blotted of excess moisture, wrapped in Saran wrap, and exposed to Kodak XAR film at −80 °C for several hours to a day, depending upon intensity of signal and level of background necessary to reveal the outline of the fields on the filter.
BAC DNA isolation and restriction enzyme digest analysis
A single isolated bacterial colony containing a BAC clone of interest was obtained from a freshly streaked plate and used to inoculate 300 mL of Luria–Bertani medium containing 20 µg·mL⁻¹ of chloramphenicol. After overnight growth at 37 °C, cells were harvested by centrifugation. Isolation of BAC DNA was by a rapid alkaline lysis method using no organic extractions, following the protocol provided by BACPAC Resources. Pipetting of the final BAC DNA preparation was carried out with large orifice tips from USA Scientific.
The BAC DNA was digested with various restriction endonucleases and the fragments were resolved on 1% agarose gels in 0.5 × Tris/borate/EDTA (1 × Tris/borate/EDTA = 0.089 M Tris, 0.089 M boric acid, 0.002 M EDTA) using a contour-clamped homogeneous electric field (CHEF) apparatus. Electrophoresis was conducted at 6 V·cm⁻¹ at 15 °C for 16 h with the switch time ramped from 1 to 12 s. After completion of electrophoresis, gels were stained in 0.5 µg·mL⁻¹ ethidium bromide in 0.5 × Tris/borate/EDTA and washed for several hours in distilled water. After photographing, the gels were treated with 0.25 M HCl for 15 min, rinsed in distilled water, denatured with 0.5 M NaOH/1.0 M NaCl, and neutralized with 0.5 M Tris/HCl/1.5 M NaCl (pH 7.4). The gels were inverted and DNA was transferred overnight onto Hybond Nytran membranes in 10 × NaCl/Pi/EDTA by capillary movement. After transfer, membranes were baked for 2 h at 80 °C.
BAC isolation, shotgun sequencing, and custom-synthetic primer directed closure
The detailed procedures for cloned, large insert genomic DNA isolation, random shotgun cloning, fluorescent-based DNA sequencing and subsequent analysis were used as described previously [29–31]. Briefly, BAC DNA was isolated free from host genomic DNA via a cleared lysate-acetate precipitation-based protocol [29]. Subsequently, 50 µg portions of purified BAC DNA were randomly sheared and made blunt ended [30,31]. After kinase treatment and gel purification, fragments in the 1- to 3-kb range were ligated into SmaI-cut, bacterial alkaline phosphatase-treated pUC18 (Pharmacia), and Escherichia coli strain XL1BlueMRF′ (Stratagene) was transformed by electroporation. Approximately 1200 random colonies were picked from each transformation and grown in terrific broth [32] supplemented with 100 µg·mL⁻¹ of ampicillin for 14 h at 37 °C with shaking at 250 r.p.m., and the sequencing templates were isolated by a cleared lysate-based protocol [30].
Sequencing reactions were performed as previously described [31] using Taq DNA polymerase and the Perkin Elmer Cetus fluorescently labeled Big Dye Taq terminators. The reactions were incubated for 60 cycles in a PerkinElmer Cetus DNA Thermocycler 9600 and, after removal of unincorporated dye terminators by filtration through Sephadex G-50, the fluorescently labeled nested fragment sets were resolved by electrophoresis on an ABI 3700 Capillary DNA Sequencer. After base calling with the ABI analysis software, the analyzed data were transferred to a Sun workstation cluster and assembled using the PHRED and PHRAP programs [33,34]. Overlapping sequences and contigs were analyzed using CONSED [35]. Gap closure and proofreading were performed using either custom primer walking or PCR amplification of the region corresponding to the gap in the sequence, followed by subcloning into pUC18 and cycle sequencing with the universal pUC-primers via Taq terminator chemistry. In some instances, additional synthetic custom primers were necessary to obtain at least threefold coverage for each base.
Draft and finished BAC sequences were analyzed on Sun workstations with the programs contained within the GCG package [36] as well as the BLAST [37], BEAUTY [38] and BLOCKS [39] programs. The sequences of BAC clones RPCI23-393J8 and RPCI23-463H24 have been deposited into GenBank and given accession numbers AC079823 and AC079682, respectively.
Computer analysis of DNA sequence
CLUSTAL W (version 1.81) was used to align the amino-acid sequences [40] for phylogenetic treatments. Any site at which the alignment postulated a gap in any sequence was removed from the data set for all pairwise comparisons so that a similar data set was used for each comparison. Phylogenies were constructed using the following methods: (a) the neighbor-joining (NJ) method [41] based on the Poisson corrected (PC) amino-acid distance [42] and (b) the NJ method based on the gamma-corrected (a = 2.0) amino-acid distance [42].
Both methods produced essentially identical results; therefore, only the results of the NJ tree based on PC are presented here. The NJ method is reliable at reconstructing phylogenies even when evolutionary rates differ among branches of a phylogenetic tree [42]. The reliability of clustering patterns in NJ trees was tested by bootstrapping [43], which involved clustering of trees based on pseudosamples of sites sampled in the data set (with replacement). Five hundred bootstrap pseudosamples were used.
GenBank and EST database searches were performed using BLAST programs [44] (available at the http://www.ncbi.nlm.nih.gov/BLAST website). Sequence analysis and manipulation were carried out using the GCG software package.
RESULTS
Identification of Adh1-containing BAC clones
DNA was isolated from all BAC clones hybridizing to the ADH1 cDNA probe. Purified DNA was digested with EcoRI and analyzed by Southern blotting and hybridization to ADH1 cDNA. Finally, 13 positive clones were identified and among these were found all EcoRI restriction fragments detectable in genomic DNA with an ADH1 cDNA probe. The Adh1-containing EcoRI fragments in C57BL/6 DNA are 6.8, 3.8, 2.5 and 1.1 kb. In addition, more faintly hybridizing 2.0- and 2.3-kb EcoRI fragments present in the genome were identified among the BAC clones. Of these 13 clones, two were chosen for extensive analysis because they overlap within the Adh1 gene. BAC 463H24 contains the entire Adh1 gene while 393J8 contains only the most 3′ end, the 3.8- and 1.1-kb EcoRI hybridizing fragments [7] (Fig. 2A). Therefore, 393J8 extends farther downstream than 463H24 relative to Adh1 orientation. 393J8, in contrast to 463H24, includes the 2.0-kb non-Adh1 hybridizing fragment [7], as does another BAC clone, 461A12. As 461A12 contains no Adh1 sequence, this confirms that the cross-hybridizing 2.0-kb EcoRI fragment is downstream of the Adh1 gene. Two additional BAC clones, 388D7 and 434F16, were identified that contained the 2.3-kb EcoRI Adh1-crosshybridizing fragment [7], but no Adh1 sequence, as the only species to hybridize to ADH1 cDNA.
Adh genes on 463H24
BAC 463H24 restriction endonuclease digests were produced, and the resulting fragments were resolved by CHEF analysis. As estimated by the sizes of the restriction fragments, 463H24 contains an insert of over 165 kb (Fig. 1A). The blots prepared from CHEF separation of restriction fragments were hybridized to several mouse ADH cDNA clones. Only ADH1 and ADH4 cDNAs hybridize to restriction fragments in 463H24 (Fig. 1A). The results of several restriction digests revealed that the positions of Adh1 and Adh4 are resolved on this clone. For example, Adh1 is located on the 60-kb RsrII fragment while Adh4 is found on the 100-kb fragment (Fig. 1A, N/R). Either because of context or the sequence recognized by RsrII [CGG(A/T)CCG], there was never complete digestion at this site, as revealed by the remaining 160-kb undigested insert present in a nonstoichiometric amount. As expected, this undigested fragment hybridizes to both ADH1 and ADH4 cDNAs. Adh4 is found on the 135-kb EagI fragment while Adh1 is found on the 135- and 25-kb fragments (Fig. 1A, N/E). Within the Adh1 gene a single EagI site is found near the 3′ end of exon 6, accounting for the two hybridizing fragments. This also localizes the 3′ end of Adh1 approximately 21 kb from one end of the clone. Adh1 and Adh4 are on the large SalI fragment, suggesting this site is nearer the opposite end of the clone than EagI. As EagI, RsrII, and SalI all cut the clone once and EagI and SalI are nearer the ends of the BAC, other double digests were performed to construct a more detailed map of 463H24 (Fig. 2A).
Two PmeI sites are present (Fig. 1A, N/P) and both Adh1 and Adh4 are found on the large 90-kb fragment. The location of the additional PmeI site was not determined from the digests performed. The RsrII/PmeI double digest (Fig. 1A, N/R/P) delineates the position of Adh1 and Adh4 (Fig. 2A) on the BAC.
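The fragment-size bookkeeping behind this kind of map can be illustrated with a short sketch. It only checks that the fragment sizes quoted above for the RsrII and EagI digests of 463H24 add up to the approximately 160-kb undigested insert; the function name, the dictionary layout, and the 5-kb tolerance are illustrative assumptions, not part of the reported analysis, and the sizes are gel estimates rather than exact values.

```python
# Minimal sketch (assumed helper, not the authors' procedure): check that the
# fragment sizes quoted for each digest of 463H24 sum to the insert size.

INSERT_KB = 160  # size of the undigested insert band reported in the text

# Fragment sizes (kb) quoted in the text; NotI releases the insert from the vector.
DIGESTS_KB = {
    "NotI/RsrII": [60, 100],  # Adh1 on the 60-kb, Adh4 on the 100-kb fragment
    "NotI/EagI": [135, 25],   # Adh4 on the 135-kb; Adh1 on both fragments
}


def fragments_match_insert(fragments_kb, insert_kb, tolerance_kb=5):
    """Return True if the fragments sum to the insert size within a tolerance.

    The tolerance is an assumption standing in for gel sizing error.
    """
    return abs(sum(fragments_kb) - insert_kb) <= tolerance_kb


for name, fragments in DIGESTS_KB.items():
    ok = fragments_match_insert(fragments, INSERT_KB)
    print(f"{name}: sum = {sum(fragments)} kb, consistent = {ok}")
```

A digest whose fragments fall well short of the insert size would point to an unresolved doublet or a small fragment run off the gel, which is the kind of discrepancy the double digests are used to resolve.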
Adh genes on BAC 393J8
The presence of the 3′ end of Adh1 near one end of 393J8 provides a useful approach to determine the location of each restriction enzyme site located nearest this end by probing with ADH1 cDNA. Sizes of fragments produced by digestion of 393J8 with various restriction enzymes were determined by CHEF (Fig. 1B). The results of probing a blot of these CHEF-separated restriction endonuclease fragments of 393J8 with four ADH cDNA probes are shown in Fig. 1B. The single SalI site is located approximately 30 kb from the Adh1-tagged end of the clone, as confirmed by the strong hybridization signal of the 30-kb fragment when probed with ADH1 cDNA (Fig. 1B, N/S). The faint signal observed from the large fragment must be due to cross-hybridization with other ADH genes. The only RsrII site is located about 100 kb from the Adh1 end of the clone (Fig. 1B, N/R). The single SgrAI site is located about 135 kb from the Adh1 end (Fig. 1B, N/Sg), and the EagI site located nearest Adh1 is over 60 kb (Fig. 1B, N/E) from the end (the largest EagI fragment). ADH1 cDNA also hybridizes to a 75-kb PmeI fragment (Fig. 1B, N/P), but this band is actually composed of two nearly identical fragments. The smaller 12-kb PmeI fragment is from the middle of this clone. The observation that SalI, RsrII, and SgrAI digests not only have a strong signal from the fragment containing the 3′ end of Adh1 but also a weak hybridization signal from the other fragment in the digest suggests that other Adh genes that weakly cross-hybridize to Adh1 are found on this clone. In EagI digests, ADH1 cDNA hybridizes to the 5-kb fragment which contains the portion of the Adh1 gene located 5′ of this site to the end of the clone, to the 60-kb fragment already mentioned, and to the slightly smaller, weakly hybridizing fragment, which must contain other Adh genes.
The ability to precisely locate the position of several restriction enzyme sites relative to the Adh1 end of the clone enabled a more detailed map to be constructed by analyzing additional double digests. The resulting blots were also hybridized with ADH2, ADH3 and deermouse ADH5 (Fig. 1B) cDNAs. ADH1 and deermouse ADH5 cDNAs both gave strong and weak signals, suggesting that these probes cross-hybridize to other ADHs. For instance, ADH1 probe gives strong and weak signals with the 30-kb and 140-kb NotI/SalI fragments, respectively. However, deermouse ADH5 probe gives approximately equal signals in both fragments. The deermouse ADH5 probe does cross-hybridize to Adh1 sequences, but the strong signal suggests there may be other cross-hybridizing sequences in the small fragment. Both mouse ADH2 and ADH3 cDNA probes hybridize only to the large fragment. Resolution of the positions of the genes on the BAC was obtained from single and double digests. The Adh2 gene is found on the large NotI/SgrAI fragment whereas Adh3 is on the small fragment (Fig. 1B, N/Sg).
A map (Fig. 2A) of the two overlapping BACs was generated based upon estimated fragment sizes generated in single and double digests, hybridization of the different probes, and the ability to determine precisely the location of restriction sites relative to the Adh1 end of the BAC. The positions of Adh4, Adh2, and Adh3 are arbitrarily placed in the middle of the flanking restriction sites. Adh1 is anchored by the presence of the EagI site in the gene. The deermouse ADH5 cross-hybridizing sequence is positioned encompassing restriction sites thought to reside in the hybridizing sequence. However, additional Adh sequences may reside between the position of Adh5 and Adh1, based upon weak cross-hybridization signals observed with both cDNAs as probe and the fact that the EagI site is placed within Adh5 but could be between related genes. However, the other nearby EagI site may be able to resolve two Adh5 cross-hybridizing species. The precise location of the mouse ortholog of deermouse Adh5 cannot be determined from mapping alone but was determined from sequence data (see below).
Fig. 1. CHEF analysis of restriction enzyme digests of BAC clone. The restriction enzymes used to prepare BAC digests loaded into each lane are indicated at the top. The enzymes used were N, NotI; B, BssHII; E, EagI; P, PmeI; R, RsrII; S, SalI; and Sg, SgrAI. The top panel in both (A) and (B) shows the fragments detected by ethidium bromide staining. (A) Identification of ADH-containing restriction fragments in BAC 463H24. Blots probed with ADH1 or ADH4 cDNAs are indicated. (B) Identification of ADH-containing restriction fragments in BAC 393J8. Autoradiograms resulting from probing identically prepared blots with ADH4, ADH2, ADH3 or deermouse ADH5 (mammalian class V) cDNAs are shown.
Transcriptional orientation of Adh4, Adh1, Adh2 and Adh3 genes
The draft sequence of 463H24 (AC 079682.14) consists of four unordered pieces and is suggested to be approximately 182 084 bp in length. BLAST analysis reveals that the Adh4 gene is on the 32 589–106 087 bp contig in that transcriptional orientation. The contig also contains three SalI sites, all located within 7 kb of each other, upstream from the 5′ end of the gene. These sites are too close to resolve on CHEF gels and must represent the single SalI site shown on the map (Fig. 2A). Based on the sequence, the PmeI site on the map is located upstream of Adh4. The sequence also identifies the location of the additional PmeI site, which is near the 5′ end of Adh4. The RsrII site on the map is on a different contig (182 084–106 188) and is upstream of the 5′ end of the Adh1 gene located on this contig. The single EagI site located near the 3′ end of exon 6 in the Adh1 gene localizes this gene on 463H24 and the draft sequence confirms the 5′ to 3′ orientation of this gene. As one end of 393J8 begins in the 3′ end of Adh1, this confirms that Adh4 and Adh1 are transcribed in the same orientation.
BLAST analysis of 393J8 complete sequence (AC 079832.16) with ADH1, ADH2, and ADH3 cDNAs also revealed that these genes have the same transcriptional orientation. The overall length of the sequence is 194 850 bp.
Identification of Adh5a, Adh5b, and Adh5ps between Adh1 and Adh2
Sequences in 393J8 hybridizing to the deermouse ADH5 cDNA map between Adh1 and Adh2 but closer to Adh2 (Fig. 2A). It is possible that other sequences in this region may cross-hybridize to ADH1 or deermouse ADH5 cDNA probes. BLASTN analysis of the region revealed that the exon 2–exon 6 deermouse ADH5-like sequence is found at position 97 694–110 934 in the BAC. Exon 6 sequence contains a 9-bp deletion followed by a 7-bp stretch and then a single-bp deletion when compared to the deermouse ADH5 or human ADH5 cDNAs. After 22 codons in the altered reading frame a stop codon is encountered, strongly suggesting this is a nonfunctioning pseudogene, Adh5ps. Before the deletion, the encoded sequence has a positional identity of 24/26 amino-acid residues with human ADH5, and there is an identity of 29/57 after the deletion by returning to the proper reading frame. No coding regions beyond exon 6 were found by BLASTN searches with other ADH cDNAs. However, a perfect match was found between nucleotide positions 7–95 of an adult male liver cDNA clone (GI: 12836366) and nucleotide position 89 989–90 077 in the BAC. This cDNA has a start codon at position 76 and encodes six amino acids with 3/6 identity with human ADH5 exon 1 encoded sequence. Also, this cDNA at position 511–2407 has a 99% identity with position 101 875–103 771 in 393J8, which includes potential exons 4 and 5. Because so much of the cDNA extends beyond these potential exons, it is probable that this cDNA represents partially spliced mRNA. Nucleotides 2513–2795 of this same cDNA have a 99% sequence identity with position 103 877–104 159 in the BAC. TBLASTN searches in the 75 kb of sequence between Adh1 and Adh5ps for sequences homologous to deermouse ADH5 or human ADH5 protein sequence revealed two probable complete genes, each with nine exons. These genes are defined as Adh5a and Adh5b based on location and phylogeny (see below).
TBLASTN and BLASTN analyses define the location of Adh5a as encompassing 29 317–47 164 in the 5′ to 3′ orientation. Adh5b is located from 57 150 to 74 874. A TBLASTN search of the GenBank database failed to locate the first and last exons of the Adh5 gene.
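The coordinates quoted above can be turned into gene spans and intergenic distances with simple arithmetic. The sketch below does only that; it assumes the positions are 1-based and inclusive (the usual GenBank convention, but not stated in the text), and the helper names are hypothetical.

```python
# Minimal sketch: spans and gaps from the 393J8 coordinates quoted in the text.
# Assumption: coordinates are 1-based and inclusive.

# (start, end) positions in the 393J8 sequence, listed in 5' to 3' order.
REGIONS = [
    ("Adh5a", 29317, 47164),
    ("Adh5b", 57150, 74874),
    ("Adh5ps exons 2-6", 97694, 110934),
]


def span_bp(start, end):
    """Length in bp of an inclusive interval."""
    return end - start + 1


def gap_bp(upstream_end, downstream_start):
    """Number of bases strictly between two intervals."""
    return downstream_start - upstream_end - 1


for name, start, end in REGIONS:
    print(f"{name}: {span_bp(start, end)} bp")

for (up_name, _, up_end), (down_name, down_start, _) in zip(REGIONS, REGIONS[1:]):
    print(f"{up_name} to {down_name}: {gap_bp(up_end, down_start)} bp apart")
```

Under these assumptions Adh5a and Adh5b each span a little under 18 kb and lie about 10 kb apart, with the mapped exons of Adh5ps starting roughly 23 kb downstream of Adh5b.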
A complete cDNA for Adh5a and an EST for Adh5b are represented in the database. This has allowed the last exon of Adh5b to be defined by an EST expressed in mouse skin (GI: 4404412). A full-length cDNA (GI: 12840922) obtained from 10-day-old male pancreas represents the expressed product of Adh5a and has been used to determine gene structure. The complete gene structures of Adh5a and Adh5b are fully annotated in GenBank (GI: 15383846). Both genes consist of nine exons and encode 374 amino acids excluding the initiation methionine, which is the same as the mouse Adh4 and Adh1 protein products. Exon 1 of Adh5b is tentatively identified as a sequence encoding the initiation Met and five additional amino acids before encountering a gt splice site. The nucleotide sequence of the Adh2 gene is fully annotated in GenBank (GI: 1538346). This gene has nine exons and encodes a polypeptide of 376 amino acids, again exclusive of the initiation methionine. All three genes, Adh2, Adh5a and Adh5b, have consensus gt and ag nucleotides flanking the intron sequences. The translation initiation codons for Adh5a and Adh2 have 8/10 and 6/10 nucleotide identity with the consensus initiator sequence, respectively. The Adh5b initiator shares 5/10 identity with the consensus sequence and does not have, in contrast to Adh2 and Adh5a, the important G residue after the ATG or the A/G three nucleotides upstream from the ATG. Although the first intron in the Adh5b gene begins with gt, the match with the consensus is poor, with only the AG at the end of the exon before the gt in the intron. The 5′ end of the first intron matches the consensus much more closely in Adh5a and Adh2.
The locations of the intron interruptions in the coding sequence of the mouse genes in the cluster are shown in Table 1. The molecular map of the mouse Adh complex is shown in Fig. 2B along with a comparison with the human ADH complex. The sizes of introns in all genes including the pseudogene are shown in Fig. 2C.
Fig. 2. Structural organization of the genes in the mouse Adh complex. (A) Restriction map of the area encompassed by the two BACs indicating positions of the mapped Adh genes. Restriction enzyme symbols are the same as in Fig. 1. Genes known to lie between two restriction sites are arbitrarily placed in the middle. The locations of the two BAC clones are shown below the physical map. (B) Molecular map of the mouse Adh complex (top) compared to the human ADH complex (bottom, based on the sequence in NT 022863). Arrows indicate transcriptional orientation. The draft sequence of 463H24 (GI 15042854) consists of four unordered contigs, but molecular distances and orientation were derived based on BAC end sequence from the database, restriction sites located within contigs compared to the physical map, and paired BLASTN analysis of the BAC against ADH4 and ADH1 cDNA. The order of contigs in 463H24 relative to the 5′ to 3′ transcriptional orientation of the Adh4 and Adh1 genes on the BAC is (32489/5054)-(32589/106087; Adh4)-(182084/106188; Adh1). One contig (1/4953) cannot be placed on the map, but is not at either end of the BAC. (C) Organization of introns and exons in the mouse genes. Introns are lines and exons are rectangles. The genes are given in their order within the complex.
Table 1. Amino-acid positions encoded by each exon in six mouse Adh genes. Codons split between exons are shown at the end and beginning of adjacent exons. Blanks indicate exon coding is the same as in Adh4. The human ADH2 exon coding regions are shown as this gene is most closely related to the mouse Adh2. Exons for human ADH2 are from [48], and human ADH5 organization is from [9,47]. Sources for organization of other mouse genes are: Adh4 [45], Adh1 [7] and Adh3 [49].
Gene    Exon 1    Exon 2    Exon 3    Exon 4    Exon 5     Exon 6     Exon 7     Exon 8     Exon 9
Adh4    Met1–5    6–39      40–86     86–115    115–188    189–275    276–321    321–368    368–374
Remaining genes listed in the table: Adh1, Adh5a, Adh5b, Adh3, Adh2, ADH2, ADH5. Entries that differ from Adh4, grouped by exon (their assignment to individual genes is not recoverable from this copy): exon 1, Met3–5; exon 3, 40–87, 40–87; exon 4, 87–116, 87–116; exon 5, 115–192, 116–193, 116–188; exon 6, 193–279, 194–280, 189–275; exon 7, 280–323, 281–326, 276–321; exon 8, 323–369, 326–372, 321–368; exon 9, 369–376, 372–379, 368–375.
Phylogeny of mouse and human Adh genes
In the phylogeny of the mouse and human genes, there were five major clusters (Fig. 3), each separated by a bootstrap value of 87 or greater: (a) hADH1A, hADH1B, hADH1C, and mADH1 cluster significantly with a bootstrap value of 100; (b) hADH4 and mADH4 cluster (100); (c) hADH5, mADH5A and mADH5B cluster (87); (d) hADH2 and mADH2 cluster (100); and (e) hADH3 and mADH3 cluster (100). The proteins encoded by the newly identified mouse Adh5a and Adh5b genes are most closely related to the human ADH5 and to each other, which is the basis for their nomenclature.
[Fig. 3 tree. Tip labels as printed: hADH1C, hADH1B, hADH1A, mADH1, hADH7, mADH7, hADH6, mADH6A, mADH6B, hADH4, mADH4, hADH5, mADH5. Scale bar: 0.1 PC distance.]
Fig. 3. Phylogeny of mouse and human Adh genes constructed by the neighbor-joining method. Numbers on the branches are percentages of 500 bootstrap samples that support the branch.
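The distance and resampling steps described in Materials and Methods (removal of gapped sites, Poisson-corrected distance, bootstrap pseudosamples of sites) can be sketched as follows. This is an illustrative outline under stated assumptions, not the software used in the study: the three short sequences are invented toy data, the function names are hypothetical, the Poisson correction is taken in its standard form d = −ln(1 − p), and the neighbor-joining step itself is omitted.

```python
import math
import random


def strip_gap_columns(seqs):
    """Drop every alignment column that contains a gap in any sequence."""
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def poisson_distance(a, b):
    """Poisson-corrected (PC) amino-acid distance, d = -ln(1 - p)."""
    return -math.log(1.0 - p_distance(a, b))


def bootstrap_columns(seqs, rng):
    """One bootstrap pseudosample: sites drawn with replacement."""
    n = len(seqs[0])
    cols = [rng.randrange(n) for _ in range(n)]
    return ["".join(s[i] for i in cols) for s in seqs]


# Toy alignment (invented for illustration only).
alignment = ["MSTAGKVIKC", "MSTAG-VIRC", "MGTTGKVIKC"]
clean = strip_gap_columns(alignment)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_columns(clean, rng) for _ in range(500)]

print(clean)
print(round(poisson_distance(clean[0], clean[1]), 4))
print(len(replicates))
```

In a full analysis each of the 500 pseudosamples would be converted to a distance matrix and a neighbor-joining tree, and the bootstrap value for a cluster would be the percentage of those trees in which the cluster appears.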
DISCUSSION

In this report, we have isolated a series of mouse BAC clones using a mouse ADH1 cDNA as a probe. As two of these BACs overlap at the Adh1 gene, they could be orientated relative to the 5′ to 3′ orientation of this gene. Using cDNAs for three other mouse ADH genes and a deermouse gene to probe restriction digests of these clones, we were able to determine the order of the four mouse genes and the sequences cross-hybridizing to the deermouse ADH5 cDNA probe. Sequence of the two overlapping BAC clones combined with the physical map allowed us to determine that the previously known mouse genes are found in the order Adh4-Adh1-Adh2-Adh3 in less than 250 kb of DNA, and all genes are in the same transcriptional orientation, 5′ to 3′ in the direction from Adh4 to Adh1. This is the same order and transcriptional orientation as found for the human genes, except that the human has three ADH1 genes (ADH1C, ADH1B and ADH1A) instead of the single mouse Adh1 ortholog.

The mouse sequence cross-hybridizing to deermouse ADH5 cDNA was mapped physically to a region between Adh1 and Adh2 corresponding to the location of the human ADH5 gene. The amino-acid similarity of 67% between the deermouse encoded protein and the human ADH5 protein [10] previously provided the rationale for suggesting that the deermouse sequence represents a separate class VI. A similar sequence has been identified in the rat [11]. However, the identity of 67% is an intermediate value between members of different classes and members of the same class between species. The orthologous physical location and this intermediate value suggest that this sequence is most closely related to human ADH5. Pairwise BLASTN comparisons between the draft sequence of the BAC in this area and the deermouse and human ADH5 cDNAs allowed the identification of potential exons 2–6 in this physically mapped region. However, exon 6 was found to contain deletions that altered the reading frame, leading to a termination codon. Although available cDNA indicates this region is expressed with spliced exons 1–3, the deletion in exon 6 strongly suggests that no functional protein is made. A partial sequence from exon 6 of an unidentified mouse Adh gene from a C57BL/6 library contains an identical deleted sequence in the exon 6 region [46] (D. Dolney, University of South Carolina, Columbia, SC, USA, unpublished results). Because the location of this sequence is similar relative to other genes in human, it is defined as Adh5ps.

TBLASTN pairwise searches between deermouse and human ADH5 cDNA translation products and the BAC sequence located between this mouse Adh5ps and Adh1 identified two additional genes. The newly defined Adh5a and Adh5b genes reside between Adh1 and Adh2, indicating an overall order of Adh4-Adh1-Adh5a-Adh5b-Adh5ps-Adh2-Adh3. The mouse Adh5a and Adh5b reside in the same relative position as the human ADH5 gene. The structures of both genes are very similar to the human ADH5 gene [9] except that an additional amino-acid residue is encoded by exon 4 in the human sequence compared to the mouse. The positions where intron sequence disrupts coding sequence in the Adh5a and Adh5b genes are identical to those in Adh4 [45] and Adh1 [7,46]. However, Adh5a, Adh5b, and Adh2 are remarkably similar to human ADH5 at the exon 8/exon 9 boundary. In all cases there is a potential stop codon just downstream of exon 8 that would produce a truncated protein if splicing between exons 8 and 9 fails to occur. Recently, alternative splicing of exons 8 and 9 has been reported for human ADH5 [47]. All the mouse genes except Adh4 [45] contain a potential translational stop codon in frame just downstream of exon 8, but it is unknown with what frequency a transcription product lacking exon 9 is produced for any of these genes. All genes in the complex are very similar in the location of intron positions within the coding region, with the exception of the Adh2 gene, which encodes four additional amino acids in exon 5 but two fewer in exon 7. This ADH2 protein contains 376 amino acids [5], but is not as large as the human ortholog, which contains 379 amino acids [48]. An overview of the gene structure within the complex suggests that genes in the middle of the complex are larger than the ones at the ends. Adh5ps, even with only six exons, is the largest in the complex. The genes at the 5′ end of the complex relative to transcriptional orientation (Adh4, Adh1, Adh5a, Adh5b, Adh5ps) characteristically have small introns between exons 4 and 5, but genes at the 3′ end (Adh2 and Adh3) have a larger intron 4.

This report presents a detailed map of the mouse Adh gene complex and finds that six genes in the same transcriptional orientation are found within 250 kb of DNA sequence. A pseudogene located in a transcribed region is also detected in this locus. The first gene in the complex at the 5′ end relative to transcriptional orientation, Adh4, is expressed at high level in adult stomach, esophagus and skin, with lower levels produced in ovary, uterus, and seminal vesicle [4,23]. The next gene, Adh1, is expressed at highest levels in liver, adrenal, and small intestine [4,21], with somewhat smaller amounts being found in kidney and still smaller amounts detectable in several tissues including ovary, uterus, and seminal vesicle. Expression of Adh2 occurs in liver with lesser expression in kidney [5], although an extensive expression pattern at the RNA level has not been established, while the Adh3 gene is widely expressed in mouse tissues [4]. The expression pattern of Adh5a and Adh5b is still to be defined. There is some order in the liver expression pattern of the different genes as related to their position on the chromosome, progressing from the 5′ end of the complex, in which Adh4 expression is totally absent from liver, to the middle and 3′ end of the complex, where Adh1, Adh2, and Adh3 are highly expressed in liver. At this time, it is unknown if individual regulatory elements between the genes control expression of each gene during development and differentiation, or if a locus control region in combination with local elements controls expression of the complex. The knowledge of the organization of this locus will be useful in addressing these questions in transgenic mouse expression studies.

ACKNOWLEDGEMENTS

This work was supported by NIH grants AA 11823 (M. R. F.), AA 09731 (G. D.), and HG 00313 (B. A. R.). The contribution of R. Friedman in constructing the phylogenetic tree was supported through NIH grant GM 43940 to A. L. Hughes. We are grateful to the NIH Mouse BAC Sequencing Program for generating the sequence of RP23-393J8 and RP23-463H24 BAC clones.

REFERENCES

1. Duester, G., Farrés, J., Felder, M.R., Holmes, R.S., Höög, J.-O., Parés, X., Plapp, B.V., Yin, S.-J. & Jörnvall, H. (1999) Recommended nomenclature for the vertebrate alcohol dehydrogenase gene family. Biochem. Pharmacol. 58, 389–395.
2. Jörnvall, H. & Höög, J.-O. (1995) Nomenclature of alcohol dehydrogenases. Alcohol Alcohol. 30, 153–161.
3. Jörnvall, H., Shafqat, J., el-Ahmad, M., Hjelmquist, L., Persson, B. & Danielsson, O. (1997) Alcohol dehydrogenase variability. Evolutionary and functional conclusions from characterization of further variants. Adv. Exp. Med. Biol. 414, 281–289.
4. Zgombic-Knight, M., Ang, H.L., Foglio, M.H. & Duester, G. (1995) Cloning of the mouse class IV alcohol dehydrogenase (retinol dehydrogenase) cDNA and tissue-specific expression patterns of the murine ADH gene family. J. Biol. Chem. 270, 10868–10877.
5. Svensson, S., Strömberg, P. & Höög, J.-O. (1999) A novel subtype of class II alcohol dehydrogenase in rodents. J. Biol. Chem. 274, 29712–29719.
6. Duester, G., Smith, M., Bilanchone, V. & Hatfield, G.W. (1986) Molecular analysis of the human class I alcohol dehydrogenase gene family and nucleotide sequence of the gene encoding the β subunit. J. Biol. Chem. 261, 2027–2033.
7. Ceci, J.D., Zheng, Y.-W. & Felder, M.R. (1987) Molecular analysis of mouse alcohol dehydrogenase: nucleotide sequence of the Adh-1 gene and genetic mapping of a related nucleotide sequence to chromosome 3. Gene 59, 171–182.
8. Edenberg, H.J., Zhang, K., Fong, K., Bosron, W.F. & Li, T.-K. (1985) Cloning and sequencing of cDNA encoding the complete mouse liver alcohol dehydrogenase. Proc. Natl Acad. Sci. USA 82, 2262–2266.
9. Yasunami, M., Chen, C.-S. & Yoshida, A. (1991) A human alcohol dehydrogenase gene (ADH6) encoding an additional class of isozyme. Proc. Natl Acad. Sci. USA 88, 7610–7614.
10. Zheng, Y.-W., Bey, M., Liu, H. & Felder, M.R. (1993) Molecular basis of the alcohol dehydrogenase-negative deer mouse. Evidence for deletion of the gene for class I enzyme and identification of a possible new enzyme class. J. Biol. Chem. 268, 24933–24939.
11. Höög, J.-O. & Brandt, M. (1995) Mammalian class VI alcohol dehydrogenase: novel types of the rodent enzymes. Adv. Exp. Med. Biol. 372, 355–365.
12. Deltour, L., Foglio, M.H. & Duester, G. (1999b) Metabolic deficiencies in alcohol dehydrogenase Adh1, Adh3, and Adh4 null mutant mice. J. Biol. Chem. 274, 16796–16801.
13. Liu, L., Hausladen, A., Zeng, M., Que, L., Heitman, J. & Stamler, J.S. (2001) A metabolic enzyme for S-nitrosothiol conserved from bacteria to humans. Nature 410, 490–494.
14. Ang, H.L., Deltour, L., Hayamizu, T.F., Zgombic-Knight, M. & Duester, G. (1996) Retinoic acid synthesis in mouse embryos during gastrulation and craniofacial development linked to class IV alcohol dehydrogenase gene expression. J. Biol. Chem. 271, 9526–9534.
15. Ang, H.L., Deltour, L., Zgombic-Knight, M., Wagner, M.A. & Duester, G. (1996) Expression patterns of class I and class IV alcohol dehydrogenase genes in developing epithelia suggest a role for alcohol dehydrogenase in local retinoic synthesis. Alcohol Clin. Exp. Res. 20, 1050–1064.
16. Deltour, L., Foglio, M.H. & Duester, G. (1999a) Impaired retinol utilization in Adh4 alcohol dehydrogenase mutant mice. Dev. Genet. 25, 1–10.
17. Burnett, K.G. & Felder, M.R. (1980) Ethanol metabolism in Peromyscus genetically deficient in alcohol dehydrogenase. Biochem. Pharmacol. 29, 125–130.
18. Holmes, R.S., Albanese, R., Whitehead, F.D. & Duley, J. (1981) Mouse alcohol dehydrogenase isozymes: products of closely localized duplicated genes exhibiting divergent kinetic properties. J. Exp. Zool. 217, 151–157.
19. Peirce, J.L., Derr, R., Shendure, J., Kolata, T. & Silver, L.M. (1998) A major influence of sex-specific loci on alcohol preference in C57Bl/6 and DBA/2 inbred mice. Mamm. Genome 9, 942–948.
20. Balak, K.J., Keith, R.H. & Felder, M.R. (1982) Genetic and developmental regulation of mouse liver alcohol dehydrogenase. J. Biol. Chem. 257, 15000–15007.
21. Felder, M.R., Watson, G., Huff, M.O. & Ceci, J.D. (1988) Mechanism of induction of mouse kidney alcohol dehydrogenase by androgen. J. Biol. Chem. 263, 14531–14537.
22. Tussey, L. & Felder, M.R. (1989) Tissue-specific genetic variation in the level of mouse alcohol dehydrogenase is controlled transcriptionally in kidney and posttranscriptionally in liver. Proc. Natl Acad. Sci. USA 86, 5903–5907.
23. Dolney, D.E.A., Szalai, G., Duester, G. & Felder, M.R. (2001) Molecular analysis of genetic differences among inbred mouse strains controlling tissue expression pattern of alcohol dehydrogenase 4. Gene 267, 145–156.
24. Lin, Z., Edenberg, H.J. & Carr, L. (1993) A novel negative element in the promoter of mouse liver alcohol dehydrogenase. J. Biol. Chem. 268, 10260–10267.
25. Carr, L.G., Zhang, K. & Edenberg, H.J. (1989) Protein–DNA interaction in the 5′ region of the mouse alcohol dehydrogenase gene Adh-1. Gene 78, 277–285.
26. Xie, D., Narasimhan, P., Zheng, Y.-W., Dewey, J.J. & Felder, M.R. (1996) Ten kilobases of 5′-flanking region confers proper regulation of the mouse alcohol dehydrogenase-1 (Adh-1) gene in kidney and adrenal of transgenic mice. Gene 181, 173–178.
27. Szalai, G., Ceci, J., Dewey, M. & Felder, M.R. (2001) Identification and expression of cosmids with an allelic variant of class I alcohol dehydrogenase in transgenic mice. Chemico-Biol. Interact. 130–131, 481–490.
28. Feinberg, A.P. & Vogelstein, B. (1983) A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132, 6–32.
29. Pan, H.Q., Wang, Y.P., Chissoe, S.L., Bodenteich, A., Wang, Z., Iyer, K., Clifton, S.W., Crabtree, J.S. & Roe, B.A. (1994) The complete nucleotide sequence of the SacBII domain of the P1 pAD10-SacBII cloning vector and three cosmid cloning vectors: pTCF, svPHEP, and LAWRIST16. Genet. Anal. Tech. Appl. 11, 181–186.
30. Bodenteich, A., Chissoe, S., Wang, Y.F. & Roe, B.A. (1993) Shotgun cloning as the strategy of choice to generate templates for high-throughput dideoxynucleotide sequencing. In Automated DNA Sequencing and Analysis Techniques (Venter, J.C., ed.), pp. 42–50. Academic Press, London, UK.
31. Chissoe, S.L., Bodenteich, A., Wang, Y.F., Wang, Y.P., Burian, D., Clifton, S.W., Crabtree, J., Freeman, A., Iyer, K., Jian, L., Ma, Y., McLaury, H.J., Pan, H.Q., Sharan, O., Toth, S., Wong, Z., Zhang, G., Heisterkamp, N., Groffen, J. & Roe, B.A. (1995) Sequence and analysis of the human ABL gene, the BCR gene, and regions involved in the Philadelphia chromosomal translocation. Genomics 27, 67–82.
32. Sambrook, J., Fritsch, E.F. & Maniatis, T. (1989) Molecular Cloning: a Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
33. Ewing, B., Hillier, L., Wendl, M. & Green, P. (1998) Basecalling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175–185.
34. Ewing, B. & Green, P. (1998) Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194.
35. Gordon, D., Abajian, C. & Green, P. (1998) Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202.
36. Genetics Computer Group (1994) Program Manual for the Wisconsin Package, Version 8, Genetics Computer Group, Madison, WI, USA.
37. Altschul, S.F., Gish, W., Myers, E.W. & Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
38. Worley, K.C., Wiese, B.A. & Smith, R.F. (1995) BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity searches. Genome Res. 5, 173–184.
39. Henikoff, S. & Henikoff, J.G. (1994) Protein family classification based on searching a database of blocks. Genomics 19, 97–107.
40. Thompson, J.D., Higgins, D.G. & Gibson, T.J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
41. Saitou, N. & Nei, M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.
42. Nei, M. & Kumar, S. (2000) Molecular Evolution and Phylogenies. Oxford University Press, New York.
43. Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
44. Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
45. Zgombic-Knight, M., Deltour, L., Haselbeck, R.J., Foglio, M.H. & Duester, G. (1997) Gene structure and promoter for Adh3 encoding mouse class IV alcohol dehydrogenase (retinol dehydrogenase). Genomics 41, 105–109.
46. Zhang, K., Bosron, W.F. & Edenberg, H.J. (1987) Structure of the mouse Adh-1 gene and identification of a deletion in a long alternating purine-pyrimidine sequence in the first intron of strains expressing low alcohol dehydrogenase activity. Gene 57, 27–36.
47. Strömberg, P. & Höög, J.-O. (2000) Human class V alcohol dehydrogenase (ADH5): a complex transcription unit generates C-terminal multiplicity. Biochem. Biophys. Res. Comm. 278, 544–549.
48. von Bahr-Landström, Jörnvall, H. & Höög, J.-O. (1991) Cloning and characterization of the human ADH4 gene. Gene 103, 269–274.
49. Foglio, M.H. & Duester, G. (1996) Characterization of the functional gene encoding mouse class III alcohol dehydrogenase (glutathione-dependent formaldehyde dehydrogenase) and an unexpressed processed pseudogene with an intact open reading frame. Eur. J. Biochem. 277, 496–504.