Phylogenetic analysis of the chromate ion transporter (CHR) superfamily
César Díaz-Pérez1, Carlos Cervantes1, Jesús Campos-García1, Adriana Julián-Sánchez2 and Héctor Riveros-Rosas2
1 Instituto de Investigaciones Químico-Biológicas, Universidad Michoacana, Morelia, Michoacán, México
2 Departamento de Bioquímica, Facultad de Medicina, Universidad Nacional Autónoma de México, México
Keywords chromate resistance; chromate transport; Cupriavidus metallidurans; phylogenetic analysis; Pseudomonas aeruginosa
Correspondence H. Riveros-Rosas, Departamento de Bioquímica, Facultad de Medicina, Universidad Nacional Autónoma de México, Apdo. Postal 70–159, Delegación Coyoacán, 04510 México, D.F., México Fax: +52 (55) 5616 2419 Tel: +52 (55) 5622 0829 E-mail: hriveros@servidor.unam.mx
(Received 18 June 2007, revised 26 September 2007, accepted 12 October 2007)
doi:10.1111/j.1742-4658.2007.06141.x
ChrA is a membrane protein that confers resistance to the toxic ion chromate through energy-dependent chromate efflux from the cytoplasm. In the protein databases, ChrA is a member of the chromate ion transporter (CHR) superfamily, composed of at least several dozen members, distributed in the three domains of life. The aim of this work was to perform a phylogenetic analysis of the CHR superfamily. An exhaustive search for ChrA homologous proteins was carried out at the National Center for Biotechnology Information database. One hundred and thirty-five sequences were identified as members of the CHR superfamily [77 long-chain sequences, or bidomains (LCHR), and 58 short-chain sequences, or monodomains (SCHR)], organized mainly as tandem pairs of genes whose resultant proteins probably possess oppositely oriented membrane topology. LCHR sequences were split into amino and carboxyl domains, and the resultant domains were aligned with the SCHR proteins. A phylogenetic tree was reconstructed using four different methods, obtaining similar results. The domains were grouped into three clusters: the SCHR proteins cluster, the amino domain cluster of LCHR proteins and the carboxyl domain cluster of LCHR proteins. These results, as well as differences in the genomic context of CHR proteins, enabled the proteins to be sorted into two families (SCHR and LCHR), and 10 subfamilies. Evidence was found suggesting an ancient origin of LCHR proteins from the fusion of two SCHR protein-encoding genes; however, some secondary events of fusion and fission may have occurred later. The separate distribution of the LCHR and SCHR proteins, differences in the genomic context in both groups and the fact that chromate transport has been demonstrated only in LCHR proteins suggest that the CHR proteins comprise two or more paralogous groups in the CHR superfamily.
Abbreviations CHR, chromate ion transporter; ME, minimum evolution; MP, maximum parsimony; NJ, neighbour joining; TMS, transmembrane segment; UPGMA, unweighted pair-group method using arithmetic averages.
FEBS Journal 274 (2007) 6215–6227 © 2007 The Authors Journal compilation © 2007 FEBS
Because toxic heavy metals (including chromium) have been abundant on the Earth since the beginning of life, microbes have been exposed to them for nearly four billion years [1]. In response, cells have developed diverse mechanisms that confer heavy metal resistance, such as, for example, the active extrusion of toxic metal cations [2,3]. This constitutes one of the best-studied bacterial mechanisms of resistance to chromium, which is based on the active efflux of chromate driven by the membrane potential [4,5]. The responsible protein, ChrA, has been characterized in Pseudomonas aeruginosa [6] and Cupriavidus metallidurans, formerly Alcaligenes eutrophus and Ralstonia metallidurans [7].
C. Díaz-Pérez et al.
Chromate ion transporter (CHR) phylogeny
[Fig. 1. Panel A: histogram; x-axis, sequence length of CHR proteins (number of amino acid residues); y-axis, number of proteins. Panel B: schematic of domain organization in SCHR proteins (one chr domain), bacterial LCHR proteins (two chr domains) and fungal LCHR proteins (two chr domains).]
Several putative homologues of the ChrA protein, called the chromate ion transporter (CHR) family, were identified after the sequencing of bacterial genomes [3,8]. ChrA proteins are not related by sequence to other families of efflux membrane proteins, but rather appear to be a widespread family from bacteria to archaea. They are about 400 amino acids in length with two homologous duplicated domains, which probably arose by a tandem internal gene duplication event from a putative ancestor of six transmembrane segments (TMSs) [3,8]. Several putative homologues of ChrA with half-size length have been reported, supporting this proposal. This is the first instance in which both extant primordial polypeptide unit-equivalent and full-length duplicated proteins have been identified [3,8]. The recent identification of protein sequences homologous to ChrA in fungi has increased interest in a phylogenetic analysis, as the broad phyletic distribution observed in protein members of the CHR family suggests that this corresponds to an ancient protein family. Therefore, this work reports the results of a phylogenetic analysis of the CHR family as an attempt to trace the probable evolutionary pathways followed by ChrA proteins. Moreover, the results obtained enabled the different reported ChrA proteins to be sorted into two families and 10 different subfamilies.
Results
Distribution of proteins inside the CHR family
Fig. 1. Sequence length distribution and organization of proteins that belong to the CHR superfamily. (A) The first group (123–234 amino acids) in the histogram (open columns) corresponds to the length of bacterial monodomain short-chain chromate ion transporters (SCHR); the second group (grey columns; 345–495 amino acids) corresponds to the bacterial bidomain long-chain chromate ion transporters (bacterial LCHR); a third group includes fungal bidomain long-chain chromate ion transporters (fungal LCHR) (black columns; 502–584 amino acids). (B) Schematic representation of the identified domains in the protein members of the CHR superfamily. Bacterial SCHR (monodomain), bacterial LCHR and fungal LCHR (bidomain) are shown. It should be noted that fungal LCHRs possess a longer interdomain segment than bacterial LCHRs. Only one fungal LCHR protein sequence from Gibberella zeae (accession Q4IG15) exhibits a longer sequence (906 amino acids in length, not shown), because it possesses a sequence homologous to the glycine/D-amino acid oxidase domain fused to the C-terminal domain.
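The three length groups reported in the Fig. 1 legend can be written down as a small lookup, which makes the gaps between the groups explicit. The following Python sketch is only an illustration: the names `LENGTH_CLASSES` and `classify_by_length` are hypothetical, and sequence length alone is a descriptive shortcut, not the criterion used in this work to assign proteins to families (that assignment rests on phylogeny and genomic context).

```python
from typing import Optional

# Length ranges (amino acid residues) quoted in the Fig. 1 legend.
LENGTH_CLASSES = {
    "SCHR": (123, 234),
    "bacterial LCHR": (345, 495),
    "fungal LCHR": (502, 584),
}


def classify_by_length(length: int) -> Optional[str]:
    """Return the size group whose range contains `length`, else None."""
    for name, (low, high) in LENGTH_CLASSES.items():
        if low <= length <= high:
            return name
    return None
```

Under these assumptions, a 416-residue protein such as P. aeruginosa ChrA (P14285) falls in the bacterial LCHR group, whereas the 906-residue Gibberella zeae sequence (Q4IG15) falls outside all three ranges and returns `None`, in line with its description as an exception in the legend.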
One hundred and thirty-five nonredundant protein sequences were identified as members of the CHR family. Of these, 128 proteins belong to bacteria, one to archaea and six to fungi; protein sequence members of the CHR family were not found in either animals or plants, although, recently, additional CHR homologous protein sequences have been found in the alveolate protozoan Paramecium tetraurelia [9] and the green algae Ostreococcus tauri and Ostreococcus lucimarinus [10]. This broad phyletic distribution (archaea, bacteria and eukarya) suggests an ancient origin for the CHR family. The amino acid length of CHR proteins sorted by size is shown in Fig. 1A. Two main sizes of CHR proteins were found: bacterial monodomain proteins with a sequence length of 123–234 amino acids, called short-chain CHR (SCHR); and bacterial bidomain proteins with a sequence length of 345–495 amino acids, called bacterial long-chain CHR (bacterial LCHR) (Fig. 1B). The difference in size amongst members of the same group (ranging from about 100 amino acids for SCHR to 150 amino acids for LCHR) is mainly a result of differences in the length of the hydrophilic loops that separate TMSs. Only bidomain LCHR protein sequences were identified in fungi (fungal LCHR). These showed a longer sequence length (502–584 amino acids) because they possess an interdomain sequence longer than that of bacterial LCHR sequences (Fig. 1B). The distribution of CHR proteins in monodomain SCHRs and bidomain LCHRs, the latter with homologous duplicated domains, suggests that they arose by a tandem gene duplication event.
It is interesting to note that the organisms with the largest number of different CHR proteins all belong to the β-proteobacteria. For example, the chromate-resistant bacterium C. metallidurans possesses three LCHRs and one pair of SCHRs, Burkholderia vietnamiensis G4 possesses five LCHRs and two pairs of SCHRs, Burkholderia xenovorans LB400 possesses three LCHRs and two pairs of SCHRs and Burkholderia sp. 383 possesses two LCHRs and two pairs of SCHRs. It is not known whether these last three bacteria are chromate resistant, but another species of the same genus, Burkholderia cepacia MCMB-821, isolated from an alkaline crater lake, was resistant to 0.1% (w/v) chromium(VI) and reduced chromate efficiently [11]. Although the presence of CHR proteins was not studied in this strain, its chromate-reducing ability was suppressed by respiratory chain inhibitors in the same way as in chromate-resistant bacteria with CHR proteins.
Phylogenetic analysis of monodomains
Monodomain chrA genes are always present as a tandem gene pair in each organism; no unpaired chrA gene was found [but four exceptions have been recorded recently after the phylogenetic analysis had been completed: the monodomain chrA genes from the firmicute bacteria Desulfotomaculum reducens MI-1 (UniProt accession number A4J468), Symbiobacterium thermophilum IAM 14863 (UniProt accession number Q67QP8), and Carboxydothermus hydrogenoformans Z-2901 (UniProt accession number Q3AEF8), and the proteobacterium Legionella pneumophila str. Lens (UniProt accession number Q5WZN7)]. This observation strongly suggests that: (a) the majority of SCHR proteins are not functional as monomers; and (b) a subsequent gene fusion of a monodomain chrA pair led to the bidomain LCHRs. To gain an insight into the relationships between monodomain and bidomain CHR proteins, an alignment of LCHR proteins was performed to identify the poorly conserved interdomain region. Each LCHR protein was then split in the middle of the interdomain region to obtain individual domains. These individual domains were aligned with the monodomain SCHR proteins, and a phylogenetic analysis from this alignment was performed using both distance-based [unweighted pair-group method using arithmetic averages (UPGMA), neighbour-joining (NJ) and minimum evolution (ME)] and character-based (maximum parsimony, MP) methods. Figure 2 shows a phylogenetic tree constructed with the individual domains contained in both SCHR and LCHR proteins. Three main clusters can be observed in this figure (delimited by yellow capped pins). The first cluster comprises the amino domains of LCHR proteins, which cluster together (blue branch on the tree); the second cluster comprises the carboxyl terminal domains of LCHR (red branch); and the third cluster comprises the monodomain SCHR proteins (green and orange branches). Only a few exceptions to this pattern were obtained and are highlighted in colour (Fig. 2): (a) Geobacter metallireducens (yellow shaded sequences) possesses two SCHR proteins that are located in the phylogenetic tree inside the protein clusters that comprise the amino (Q39S24) and carboxyl (Q39S25) terminal domains of LCHR proteins (indeed, the anomalous positions of these two monodomain proteins suggest that they are derived from an ancient LCHR that suffered a late fission as a secondary event); (b) Magnetospirillum magnetotacticum (purple shaded sequences) also possesses two SCHR proteins that are located in the phylogenetic tree inside the protein clusters that comprise the amino (UPI000038343D) and carboxyl (UPI0000383C0E) terminal domains of the LCHR5 proteins (however, because these two SCHR proteins are reported on different contigs, this case needs further revision); (c) Treponema pallidum (blue shaded sequences), Burkholderia cepacia and Burkholderia fungorum (magenta shaded sequences) possess bidomain LCHR proteins that are located inside the protein cluster that comprises the monodomain SCHR proteins, suggesting that they are derived from a late secondary fusion event of two adjacent SCHR genes.
It is interesting to note that the topology obtained within the amino terminal domains of LCHR proteins (blue branch in Fig. 2) is practically identical to the topology observed within the carboxyl terminal domains (red branch). Although bootstrap supporting values were relatively low, all resulting trees were similar, regardless of the algorithms employed to perform the phylogenetic analysis.
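The domain-splitting step used for this analysis (each LCHR protein cut in the middle of its poorly conserved interdomain region) can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions: the function name `split_at_interdomain_midpoint`, the 0-based half-open coordinates and the toy sequence are hypothetical, and in practice the interdomain boundaries would be read from the LCHR alignment rather than supplied by hand.

```python
from typing import Tuple


def split_at_interdomain_midpoint(
    sequence: str, linker_start: int, linker_end: int
) -> Tuple[str, str]:
    """Split a bidomain sequence in the middle of its interdomain region.

    `linker_start` and `linker_end` are hypothetical 0-based, half-open
    coordinates of the interdomain region within `sequence`.
    """
    if not 0 <= linker_start < linker_end <= len(sequence):
        raise ValueError("interdomain region must lie inside the sequence")
    cut = (linker_start + linker_end) // 2  # midpoint of the interdomain region
    return sequence[:cut], sequence[cut:]


# Toy example (not a real CHR sequence): two 'domains' joined by a GGGG linker.
n_domain, c_domain = split_at_interdomain_midpoint("AAAAAGGGGCCCCC", 5, 9)
```

Each half keeps part of the interdomain region, so no residues are discarded; the two resulting sequences can then be aligned with the monodomain sequences as separate entries.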
The near-identical topology of the amino and carboxyl domain clusters suggests that the bulk of LCHR proteins form a monophyletic group, and agrees with the proposal that the origin of bidomain LCHR proteins was an ancient duplication followed by a gene fusion of two ancestral monodomain CHRs [3,8]. In the lower part of Fig. 2, pairs of monodomain SCHR proteins are indicated by grey lines that connect each pair member, showing that SCHRs are clustered into three protein subfamilies (SCHR1, SCHR2 and SCHR3); green branches correspond to the first gene of each pair, and orange branches to the second gene. This clear-cut separation of SCHR proteins into first and second pair members, or between amino and carboxyl domains of LCHR proteins, suggests that each protein member in a pair cannot be exchanged, and therefore carries out a different function. This hypothesis is in agreement with experimental data suggesting that the two halves of a bidomain LCHR, the P. aeruginosa ChrA protein, carry out different roles in their transporting functions [12,13]. A similar situation has been reported for transporters of the major facilitator superfamily [14,15]. Only two SCHR protein pairs in Borrelia burgdorferi (orange shaded sequence) and Thermotoga maritima (green shaded sequence) possess an unexpected position in the phylogenetic tree that could be related to horizontal gene transfer events, although additional evidence is required.
The four unpaired SCHRs from D. reducens MI-1 (A4J468), S. thermophilum IAM 14863 (Q67QP8), C. hydrogenoformans Z-2901 (Q3AEF8) and L. pneumophila str. Lens (Q5WZN7), not included in Fig. 2, are the nearest relatives to SCHR proteins from the firmicute bacterium Desulfitobacterium hafniense Y51 (Q24WV9 and Q24WV8; red shaded sequences). Interestingly, the two SCHR proteins from De. hafniense Y51 are encoded by the monodomain chrA gene pair with the highest identity between first and second pair genes (I = 42%). These data suggest that SCHRs from D. reducens, S. thermophilum, L. pneumophila and De. hafniense probably comprise a new SCHR subfamily related to the SCHR3 subfamily, but additional information and more protein sequences are needed to confirm this assumption.
Fig. 2. Phylogenetic analysis of individual domains contained in both SCHR and LCHR proteins. Phylogenetic tree constructed using the MP method with the available protein sequences belonging to the CHR superfamily. Similar trees were obtained with UPGMA, NJ and ME methods. Sequence names are indicated by the abbreviated species names followed by the UniProt database accession number [38] and protein sequence length (protein sequences from bacteria are indicated in black, archaea in blue and fungi in brown). A full list of the organisms' names and proteins included in this tree is given in supplementary Table S1. Circles indicate nodes with bootstrap values of more than 50% (white), 70% (grey) or 90% (black) in 500 random replicates in all methods used (ME, NJ, UPGMA and MP) concurrently. Trees were rooted with a TRAP C4-dicarboxylate transport system DctM domain from Jannaschia sp. CCS1. Yellow-capped pins mark the boundaries among the three main clusters of proteins: N-domains of LCHR proteins (blue branch); C-domains of LCHR proteins (red branch), and monodomain SCHR proteins (green and orange branches). Black pins indicate the boundaries between individual CHR subfamilies. It should be noted that SCHR proteins are present in all organisms as a pair of sequences (indicated by external grey connecting lines). Protein sequences with an anomalous position in the tree are indicated as colour-shaded sequences and are discussed in the text. ChrA proteins for which there is experimental evidence of function are indicated by asterisks.
Phylogenetic analysis of bidomain proteins
Because SCHR proteins are organized as a tandem pair of genes, their sequences were merged into bidomain proteins and aligned with the LCHR proteins. A new phylogenetic analysis from this alignment was performed. Figure 3 shows a phylogenetic tree constructed with both bidomain LCHR and merged SCHR protein pairs. The topology of this tree was very similar to that obtained with the individual domains. Bacterial LCHR proteins were clustered into six subfamilies (LCHR1–LCHR6). Fungal LCHRs comprise another subfamily related to the bacterial LCHR1 subfamily. Again, SCHR protein pairs clustered into three subfamilies (SCHR1–SCHR3). In addition, the recently recorded LCHR protein sequences from Paramecium tetraurelia (UniProt accession numbers A0C488, A0BTM8, A0DNE3, A0D2A2, A0D495), Ostreococcus tauri (UniProt accession number Q00TQ5) and Ostreococcus lucimarinus (UniProt accession number A4S8Y6) are related to the bacterial LCHR1 subfamily (data not shown).
Figure 4 shows a classification of CHR protein sequences into three different taxonomic categories. This scheme was constructed using criteria proposed by Riveros-Rosas et al. [16] on the basis of phylogenetic and genomic context similarities of CHR proteins. Thus, each proposed subfamily comprises a set of homologous proteins that: (a) possesses a characteristic genomic context (see next section); and (b) forms a closed group in which the identity, similarity and statistical significance between any two members of the closed group are higher than those with any other protein sequence outside the subfamily. In Fig. 4, there are striking differences in the distribution of LCHR and SCHR subfamilies among bacterial taxa. Thus, β-proteobacteria possess CHR proteins from five different subfamilies (SCHR1, LCHR1, LCHR2, LCHR5 and LCHR6), whereas γ-proteobacteria possess CHR proteins from only two different subfamilies (LCHR1 and LCHR5) and none from SCHR subfamilies. By contrast, the LCHR1 subfamily possesses the widest distribution inside bacterial taxa; however, further information is needed to rationalize these data. Finally, it is important to emphasize that the number of protein sequences belonging to the CHR superfamily almost doubled in the last year; however, as of July 2007, all of these new protein sequences could be ascribed to one of the 10 protein subfamilies described in Fig. 4 (with the exception of the unpaired SCHR genes mentioned above, which probably constitute a new subfamily related to the SCHR3 subfamily). Therefore, the protein sequences included in Figs 2 and 3 can be considered as a representative sample of the whole CHR superfamily.
Genomic context
Because only two proteins inside the CHRs are well characterized to the level of biochemical function, we tried to obtain further insights by comparing the genomic context of different CHR proteins. Figure 5 shows the genomic context of LCHR2, LCHR3 and LCHR5 subfamilies. In this figure, the well-characterized genomic context of ChrA2 from C. metallidurans (LCHR2 subfamily, accession P17551 [17]) was used as a reference to compare with other genomes. It can be seen that LCHR2 members are usually associated with chrB (78%) and chrF (67%) genes. LCHR5 members are also associated with chrB, chrC and chrF genes, but less frequently (19%, 19% and 31%, respectively). Both ChrB and ChrF proteins have been shown to function as regulators of ChrA-mediated chromate resistance in C. metallidurans [17]. The chrC gene, whose product has been linked to the probable repair of chromate-caused cell damage by superoxide dismutase, was found to be variably associated with LCHR2 (33%) and LCHR5 (19%) clusters. It is interesting to note that a bidomain protein formed by the fusion of ChrE and ChrF exists only in some LCHR5 members. By contrast, LCHR3 members possess a diverse genomic context and are not associated with any genes related to chromate resistance. The same situation occurs within LCHR1, LCHR4, LCHR6, SCHR1, SCHR2 and SCHR3 subfamilies (data not shown). The difference between the high association of ChrB with LCHR2 protein members (78%), but not with LCHR5 protein members (19%), is probably related to the fact that the associated chrB gene is essential for chromate efflux by ChrA in C. metallidurans [17], an LCHR2 protein member, but not in P. aeruginosa [6] (M. Ramírez-Díaz & C. Cervantes, unpublished data), an LCHR5 protein member.
[Fig. 3: phylogenetic tree of bidomain LCHR proteins and merged SCHR protein pairs. Leaves are labelled with the abbreviated species name, UniProt accession number and sequence length (for example, PSEAE-P14285 416 aa); merged SCHR pairs carry two accession numbers (for example, GEOME-Q39S24-Q39S25). Square brackets mark the subfamilies LCHR5, LCHR2, LCHR3, LCHR4, SCHR3 & related, SCHR2, SCHR1, LCHR6, Fungal LCHR & related and LCHR1. Scale bar: 0.5.]
Fig. 3. Phylogenetic analysis of bidomain LCHR proteins and tandem pairs of SCHR proteins. Phylogenetic tree constructed using the ME method with the available protein sequences belonging to the CHR superfamily. Pairs of SCHR proteins were fused in tandem to produce bidomain sequences that were aligned with bidomain LCHR proteins. Similar trees were obtained with UPGMA, NJ and MP methods. Sequence names are indicated by the abbreviated species names followed by the UniProt database accession number [38] and protein sequence length (protein sequences from bacteria are indicated in black, archaea in blue and fungi in brown). A full list of the organisms' names and proteins included in this tree is given in supplementary Table S1. Circles indicate nodes with bootstrap values of more than 70% (white), 80% (grey) or 90% (black) in 1000 random replicates in all methods used (ME, NJ, UPGMA and MP) concurrently. The bootstrap values for critical nodes, marked with arrowheads, are explicitly shown using the UPGMA/NJ/MP/ME methods. Trees were rooted with a TRAP C4-dicarboxylate transport system DctM domain from Jannaschia sp. CCS1 (not indicated). Scale bar represents 0.5 amino acid substitutions per site. Square brackets indicate identified protein subfamilies. Protein sequences with an unexpected position in Fig. 2 are indicated as colour-shaded sequences and are discussed in the text. ChrA proteins for which there is experimental evidence of function are indicated by asterisks.
Membrane topology of SCHR proteins
The C. metallidurans ChrA protein (from the LCHR2 subfamily) has been reported to have 10 TMSs, with the amino and carboxyl terminal domains similarly oriented with respect to the membrane axis [8]. By contrast, the P. aeruginosa ChrA protein (from the LCHR5 subfamily) was found to possess a 13 TMS arrangement, with the amino and carboxyl terminal domains oppositely oriented [13]. The insertion of an additional TMS in the middle of the P. aeruginosa protein is assumed to have caused its distinct membrane topology. Many examples of oppositely oriented proteins, either as separate polypeptides or as tandem fusion proteins, have been reported [15]. As very few exceptions of organisms with only one copy of a monodomain chrA gene were found, the possibility that pairs of SCHR proteins are oppositely oriented proteins was considered. Thus, we analysed the distribution of positively charged lysine (K) and arginine (R) residues between the hydrophilic loops on either side of the membrane inside each SCHR protein subfamily, as expected from the observation that the orientation of membrane proteins is largely determined by the 'positive-inside' rule [15,18]. Because only the last four TMSs inside each CHR domain are common to both C. metallidurans and P. aeruginosa ChrA proteins from LCHR2 and LCHR5 subfamilies, respectively, the first two TMSs were not analysed. Figure 6B shows an alignment of the protein sequences from the SCHR2 subfamily and individual CHR domains from the P. aeruginosa ChrA protein. It can be seen that TMSs III, IV, V and VI from the N-terminal CHR domain of P. aeruginosa are homologous to TMSs X, XI, XII and XIII from the C-terminal CHR domain of P. aeruginosa. Furthermore, it can be observed that loops with a higher content of positively charged residues (K + R) are opposed in the N-terminal domain equivalent of SCHR2 and the C-terminal domain equivalent of SCHR2 (Fig. 6B). Figure 6C shows the average number of (K + R) residues per loop for each CHR domain; it is clear that predicted inside loops possess a higher (K + R) content than do periplasmic loops. Therefore, it can be concluded that SCHR2 protein pairs possess an opposed membrane orientation. Similar results were obtained with proteins from the SCHR1 and SCHR3 subfamilies (data not shown).
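The (K + R) comparison behind the 'positive-inside' argument can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the function names, the side labels `side_a` and `side_b`, and the toy loop sequences are invented and are not taken from any CHR protein or from the procedure used in this work, which relied on aligned SCHR2 sequences and predicted TMS boundaries.

```python
from typing import Dict, List


def count_positive(loop: str) -> int:
    """Count lysine (K) and arginine (R) residues in one loop sequence."""
    return sum(1 for residue in loop.upper() if residue in "KR")


def mean_positive_per_side(loops: Dict[str, List[str]]) -> Dict[str, float]:
    """Average (K + R) count per loop for each side of the membrane.

    `loops` maps a side label to the (non-empty) list of hydrophilic loop
    sequences exposed on that side.
    """
    return {
        side: sum(count_positive(seq) for seq in seqs) / len(seqs)
        for side, seqs in loops.items()
    }


def predicted_inside(loops: Dict[str, List[str]]) -> str:
    """Return the side with the higher mean (K + R) content.

    Under the positive-inside rule, this side is predicted to be cytoplasmic.
    """
    means = mean_positive_per_side(loops)
    return max(means, key=means.get)


# Toy loops (invented sequences, for illustration only).
toy = {"side_a": ["GKRLS", "RRKAD"], "side_b": ["GSDLE", "TNKQ"]}
```

With these toy loops, `side_a` averages 2.5 positive residues per loop against 0.5 for `side_b`, so `side_a` would be the predicted cytoplasmic side. For a pair of SCHR proteins compared at equivalent loop positions, an opposed orientation would appear as the higher-scoring side switching between the two members of the pair.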
Family
Subfamily
Fungal LCHR & related
LCHR1
ions. Metal
LCHR2
began, in a biosphere that had been previously exposed to volcanic activities, which created polluted environ- ments [19]. There is no general mechanism for resis- tance to all heavy metal ion resistance systems have been found on plasmids of every bacte- rial group tested; the absence of known resistance determinants in any group probably reflects insufficient research effort. The mechanisms of resistance are gen- erally efflux ‘pumping’ (removing toxic ions that enter the cell by systems involved in the transport of nutri- ent cations or oxyanions) and enzymatic detoxification (generally redox chemistry), converting more toxic to less toxic or less available metal ion species [19].
Ascomycota Basidiomycota Bacteroidetes β proteobacteria γ proteobacteria α proteobacteria Firmicutes Actinobacteria Chloroflexi Deinococcus-Thermus β proteobacteria Cyanobacteria Bacteroidetes
R H C L
LCHR3
y
Cyanobacteria δ proteobacteria
l i
LCHR4
m a f r e p u s R H C
LCHR5
LCHR6
Cyanobacteria δ proteobacteria Euryarchaeota α proteobacteria γ proteobacteria β proteobacteria δ proteobacteria β proteobacteria
SCHR1
β proteobacteria α proteobacteria
SCHR2
R H C S
Firmicutes Spirochaetes Bacteroidetes Fusobacteria
SCHR3 & related
As CHR protein members exist in all three domains of life, and no evidence of horizontal chrA gene trans- fer between these domains has been found in this work, we can propose that CHR is an ancient protein family. However, as the genomic context amongst CHR subfamilies displays a high diversity, even inside the majority of CHR subfamilies, it is probable that each CHR protein subfamily carries out different func- tional roles in addition to chromate efflux. Thus, dif- ferences exist in regulation and membrane topology between the two characterized CHRs: for example, ChrB is a membrane-bound regulatory protein essen- tial for chromate resistance in C. metallidurans [17], because deletion of the chrB gene leads to the hyperac- cumulation of chromate, whereas it is not required in P. aeruginosa for chromate resistance [6] (M. Ramı´ rez- Dı´ az & C. Cervantes, unpublished data).
elongatus
Firmicutes Spirochaetes Thermotogae
Fig. 4. Taxonomy of the CHR superfamily. CHR proteins were sorted into two families and 10 subfamilies. The group(s) of organism(s) in which these proteins were found are indicated on the right (if more than one-half of the proteins in one subfamily belong to a single group of organisms, it is indicated in bold). This scheme was constructed using criteria proposed by Riveros-Rosas et al. [16] on the basis of phylogenetic analysis and genomic context similarities of CHR pro- teins. The protein subfamilies were numbered after Nies et al. [3].
Discussion
Thus, on the one hand, these data suggest that CHR proteins may suffer drastic alterations in membrane topology but conserve their main function, whereas, on the other, the genomic context suggests that CHR proteins possess other functions in addition to chro- mate transport. The latter is supported by the proper- the ChrA protein from the nonchromate- ties of resistant bacterium Synechococcus strain PCC7942 (accession number Q55027). This ChrA pro- tein (from the LCHR2 subfamily) is located inside a sulfur-regulated operon whose expression does not provide chromate resistance [20]. Indeed, an Sy. elong- atus strain with a deletion of the chrA gene exhibited increased resistance to chromate, suggesting that ChrA may take up, instead of efflux, chromate.
FEBS Journal 274 (2007) 6215–6227 ª 2007 The Authors Journal compilation ª 2007 FEBS
6222
It is commonly believed that bacterial heavy metal resistance has arisen as a result of human pollution in recent centuries. It seems more likely, however, that toxic metal resistance systems arose soon after life In short, CHR proteins comprise an ancient protein superfamily present in the three domains of life. Their members are sorted into two protein families (LCHR and SCHR), and 10 different subfamilies. These pro- teins possess differences in membrane topology orien- tation and, in their genomic context, probably evolved diverse physiological functions in addition to chromate transport. The existence of protein superfamilies in
Fig. 5. Schematic representation of the local genomic context of CHR genes from LCHR2, LCHR3 and LCHR5 subfamilies. Boxed arrows indicate the genes and their direction of transcription. Identified genes are indicated according to the characterized chromate resistance determinant from Cupriavidus metallidurans [17] or the sulfur-regulated plasmid gene srp from Synechococcus sp. [20]. chrB encodes a membrane-bound protein necessary for the regulation of chromate resistance in C. metallidurans; chrC encodes a protein homologous to iron-containing superoxide dismutase; the chrE gene product is probably a rhodanese-type enzyme; chrF might encode a repressor for chromate-dependent induction; the srpA gene product exhibits similarity to the enzyme catalase; srpB encodes a membrane-bound P-type ATPase subunit; srpD encodes a cysteine synthase; srpE encodes a γ-glutamyl transferase; and merA encodes a protein homologous to mercuric reductase. Nonlabelled genes were observed in only one of the genomic contexts included in the figure.
[Fig. 6, panels A–C (image not reproduced). Recoverable labels: N-terminal CHR domain, C-terminal CHR domain, periplasm, cytoplasm, TMS III–VI and TMS X–XIII. Numbers extracted with TMS III–VI: 0.4, 0.4, 2.1, 2.6, 3.2; with TMS X–XIII: 0, 0.1, 1.8, 3.8, 1.9.]
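Panel C of Fig. 6 reports the average number of positively charged residues (K + R) per loop per sequence, with three residues from each flanking transmembrane helix added to every loop. A minimal Python sketch of that count is given below, assuming loops are supplied as 0-based, end-exclusive coordinates on ungapped sequences. The function names, the coordinate convention and the toy sequences are illustrative assumptions, not the authors' procedure.

```python
FLANK = 3  # residues borrowed from each flanking helix, as stated for Fig. 6


def count_kr(sequence: str, start: int, end: int, flank: int = FLANK) -> int:
    """Count K and R in sequence[start:end], widened by `flank` on both sides."""
    lo = max(0, start - flank)
    hi = min(len(sequence), end + flank)
    return sum(1 for residue in sequence[lo:hi].upper() if residue in "KR")


def mean_kr_per_loop(sequences: list[str], loops: list[tuple[int, int]]) -> list[float]:
    """Average K+R count for each loop across all sequences.

    Assumes the same loop coordinates apply to every sequence.
    """
    return [
        sum(count_kr(seq, start, end) for seq in sequences) / len(sequences)
        for start, end in loops
    ]


# Toy data (hypothetical): two 20-residue sequences, one loop at positions 8-12.
toy = ["LLLLLLLKRRKALLLLLLLL", "LLLLLLLLAKAALLLLLLLL"]
print(mean_kr_per_loop(toy, [(8, 12)]))
```

Widening each loop by a few residues, as the figure legend describes, makes the count tolerant of small errors in the predicted loop ends.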
Fig. 6. Alignment between protein sequences from the SCHR2 subfamily and individual CHR domains from the Pseudomonas aeruginosa ChrA protein. (A) Diagram illustrating the 13 TMSs from the P. aeruginosa ChrA protein and the N-terminal and C-terminal CHR domains with oppositely oriented membrane topology; grey-coloured TMSs and companion loops comprise the amino acid residues shown in the alignment of (B). (B) The first nine sequences in the alignment comprise the P. aeruginosa N-terminal CHR domain and equivalent SCHR2 sequences (TMSs III–VI). The last nine sequences in the alignment comprise the P. aeruginosa C-terminal CHR domain and equivalent SCHR2 sequences (TMSs X–XIII). Consensus sequences (50% threshold frequency) for the N-terminal domain equivalent of SCHR2 and C-terminal domain equivalent of SCHR2 are indicated. The identity and similarity between N-terminal and C-terminal domains from P. aeruginosa ChrA protein (accession P14285) are 16% and 25%, respectively. Lysine (K) and arginine (R) residues located inside the hydrophilic loops are highlighted in black (three residues from each of the flanking transmembrane helices were included to allow for possible misprediction of the exact positions of the loop ends). The membrane topology of the P. aeruginosa ChrA protein was determined previously by analysis of translational fusions with the reporter enzymes alkaline phosphatase and β-galactosidase [13]. (C) Diagram with the TMSs and periplasmic (“) and cytoplasmic (^) loops included in the alignment. The figures indicate the average number of positively charged residues (K + R) per loop per sequence in the N-terminal CHR domain and C-terminal CHR domain.

which one single scaffold supports multiple functions has been found as a recurrent theme in nature [21–23].

Experimental procedures

Sequence analysis

Protein sequence data were retrieved in the second half of 2005 from Swiss-Prot + TrEMBL [24] and GenBank [25] nonredundant protein sequence databases. The amino acid sequence from ChrA of P. aeruginosa (accession number P14285) and other identified ChrAs reported as members of the CHR family in PF02417 [26] and COG2059 [27] were used as bait for gapped BLASTP and PSI-BLAST searches using default gap penalties and the BLOSUM62 substitution matrix [28]. The E-value inclusion threshold was set at 10⁻⁵. Orthologous protein sequences identified in several bacterial strains belonging to the same species, with 100% identity, were considered as redundant and excluded from the phylogenetic analysis.
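The two selection rules described in this section, an E-value inclusion threshold of 10⁻⁵ and the removal of 100% identical sequences from strains of the same species, can be sketched as a simple filter. This is only an illustration: the `Hit` record, its field names and the example entries are hypothetical, and identity is approximated here as exact string equality of the sequences.

```python
from dataclasses import dataclass

E_VALUE_THRESHOLD = 1e-5  # inclusion threshold stated in the text


@dataclass(frozen=True)
class Hit:
    accession: str
    species: str
    sequence: str
    e_value: float


def select_hits(hits: list[Hit], threshold: float = E_VALUE_THRESHOLD) -> list[Hit]:
    """Keep hits at or below the threshold, one per (species, sequence) pair."""
    seen: set[tuple[str, str]] = set()
    kept: list[Hit] = []
    for hit in hits:
        if hit.e_value > threshold:
            continue  # too weak to be included
        key = (hit.species, hit.sequence)
        if key in seen:
            continue  # identical sequence already kept for this species
        seen.add(key)
        kept.append(hit)
    return kept


# Hypothetical hits: two strains of one species share an identical sequence.
hits = [
    Hit("X1", "Species a", "MSTLLA", 1e-40),
    Hit("X2", "Species a", "MSTLLA", 1e-40),
    Hit("X3", "Species b", "MSTLLA", 1e-12),
    Hit("X4", "Species c", "MKTAAV", 0.02),
]
print([h.accession for h in select_hits(hits)])
```

Keying on the species as well as the sequence keeps identical proteins found in different species, which matches the rule that only same-species duplicates were treated as redundant.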
Progressive multiple protein sequence alignments were calculated with MUSCLE [29,30] and the CLUSTAL W package [31]; they were corrected manually according to gapped BLASTP results [28]. Because automated gene prediction programs may introduce errors in exon/intron recognition [32], the exon/intron boundaries of eukaryotic ChrAs were manually inspected with the GeneComber system [33], and contrasted with the multiple protein sequence alignments obtained. Thus, from six analysed sequences, only exon/intron boundaries of ChrA from Magnaporthe grisea (accession A4UCC5) required corrections. Phylogenetic analyses were carried out with MEGA version 3.1 software [34], using both MP and distance-based methods (UPGMA, NJ and ME), with the aid of the empirical Jones–Taylor–Thornton amino acid substitution model; gaps were treated by pairwise deletion. Differences between amino acid sequences were corrected for multiple substitutions assuming a γ distribution for rate variations between sites. The γ shape parameter (α = 1.0) was estimated with the Whelan–Goldman matrix of substitutions and the eight-category discrete γ model using TREE-PUZZLE [35]. Confidence limits of branch points were estimated by 500 bootstrap replications. Phylogenetic trees were rooted with the TRAP C4-dicarboxylate transport system DctM domain from Jannaschia sp. CCS1 (accession number Q28KP9), identified using PSI-BLAST (E-value = 5 × 10⁻³) as a protein sequence distantly related to ChrAs, with a significant identity (25%) and similarity (40%). DctM is a large integral membrane protein with 12 putative TMSs that belongs to the ion transporter superfamily [36], and also uses the membrane potential as a source of energy [37]; it is probable that it arose by an intragenic duplication event from a primordial six TMS protein [36]. These data indicate that DctM is a suitable outgroup for the CHR protein family.

References
1 Silver S & Phung LT (2005) A bacterial view of the periodic table: genes and proteins for toxic inorganic ions. J Ind Microbiol Biotechnol 32, 587–605.
2 Canovas D, Cases I & de Lorenzo V (2003) Heavy metal tolerance and metal homeostasis in Pseudomonas putida as revealed by complete genome analysis. Environ Microbiol 5, 1242–1256.
3 Nies DH (2003) Efflux-mediated heavy metal resistance in prokaryotes. FEMS Microbiol Rev 27, 313–339.
4 Pimentel BE, Moreno-Sanchez R & Cervantes C (2002) Efflux of chromate by Pseudomonas aeruginosa cells expressing the ChrA protein. FEMS Microbiol Lett 212, 249–254.
5 Alvarez AH, Moreno-Sanchez R & Cervantes C (1999) Chromate efflux by means of the ChrA chromate resistance protein from Pseudomonas aeruginosa. J Bacteriol 181, 7398–7400.
6 Cervantes C, Ohtake H, Chu L, Misra TK & Silver S (1990) Cloning, nucleotide sequence, and expression of the chromate resistance determinant of Pseudomonas aeruginosa plasmid pUM505. J Bacteriol 172, 287–291.
7 Nies A, Nies DH & Silver S (1990) Nucleotide sequence and expression of a plasmid-encoded chromate resistance determinant from Alcaligenes eutrophus. J Biol Chem 265, 5648–5653.
8 Nies DH, Koch S, Wachi S, Peitzsch N & Saier MH Jr (1998) CHR, a novel family of prokaryotic proton motive force-driven transporters probably containing chromate/sulfate antiporters. J Bacteriol 180, 5799–5802.
9 Aury JM, Jaillon O, Duret L, Noel B, Jubin C, Porcel BM, Segurens B, Daubin V, Anthouard V, Aiach N, et al. (2006) Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444, 171–178.
10 Palenik B, Grimwood J, Aerts A, Rouze P, Salamov A, Putnam N, Dupont C, Jorgensen R, Derelle E, Rombauts S, et al. (2007) The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. Proc Natl Acad Sci USA 104, 7705–7710.
11 Wani R, Kodam KM, Gawai KR & Dhakephalkar PK (2007) Chromate reduction by Burkholderia cepacia MCMB-821, isolated from the pristine habitat of an alkaline crater lake. Appl Microbiol Biotechnol 75, 627–632.
12 Aguilera S, Aguilar ME, Chavez MP, Lopez-Meza JE, Pedraza-Reyes M, Campos-Garcia J & Cervantes C (2004) Essential residues in the chromate transporter ChrA of Pseudomonas aeruginosa. FEMS Microbiol Lett 232, 107–112.
13 Jimenez-Mejia R, Campos-Garcia J & Cervantes C (2006) Membrane topology of the chromate transporter ChrA of Pseudomonas aeruginosa. FEMS Microbiol Lett 262, 178–184.
14 Pao SS, Paulsen IT & Saier MH Jr (1998) Major facilitator superfamily. Microbiol Mol Biol Rev 62, 1–34.
15 Rapp M, Granseth E, Seppala S & von Heijne G (2006) Identification and evolution of dual-topology membrane proteins. Nat Struct Mol Biol 13, 112–116.
16 Riveros-Rosas H, Julian-Sanchez A, Villalobos-Molina R, Pardo JP & Pina E (2003) Diversity, taxonomy and evolution of medium-chain dehydrogenase/reductase superfamily. Eur J Biochem 270, 3309–3334.
17 Juhnke S, Peitzsch N, Hubener N, Grosse C & Nies DH (2002) New genes involved in chromate resistance in Ralstonia metallidurans strain CH34. Arch Microbiol 179, 15–25.
18 von Heijne G (1989) Control of topology and mode of assembly of a polytopic membrane protein by positively charged residues. Nature 341, 456–458.
19 Silver S & Phung LT (1996) Bacterial heavy metal resistance: new surprises. Annu Rev Microbiol 50, 753–789.
20 Nicholson ML & Laudenbach DE (1995) Genes encoded on a cyanobacterial plasmid are transcriptionally regulated by sulfur availability and CysR. J Bacteriol 177, 2143–2150.
21 Anantharaman V, Aravind L & Koonin EV (2003) Emergence of diverse biochemical activities in evolutionarily conserved structural scaffolds of proteins. Curr Opin Chem Biol 7, 12–20.
22 Dunwell JM, Purvis A & Khuri S (2004) Cupins: the most functionally diverse protein superfamily? Phytochemistry 65, 7–17.
23 Riveros-Rosas H & Julian-Sanchez A (2006) Functional plasticity of medium-chain dehydrogenases/reductases. In Enzymology and Molecular Biology of Carbonyl Metabolism, Vol. 12 (Weiner H, Plapp B, Lindahl R & Maser E, eds), pp. 419–433. Purdue University Press, West Lafayette, IN.
24 Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O’Donovan C, Phan I, et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 31, 365–370.
25 Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J & Wheeler DL (2006) GenBank. Nucleic Acids Res 34, D16–D20.
26 Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer EL, et al. (2004) The Pfam protein families database. Nucleic Acids Res 32, D138–D141.
27 Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, et al. (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4, 41.
28 Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W & Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389–3402.
29 Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32, 1792–1797.
30 Edgar RC (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113.
31 Thompson JD, Higgins DG & Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
32 Reese MG, Hartzell G, Harris NL, Ohler U, Abril JF & Lewis SE (2000) Genome annotation assessment in Drosophila melanogaster. Genome Res 10, 483–501.
33 Shah SP, McVicker GP, Mackworth AK, Rogic S & Ouellette BF (2003) GeneComber: combining outputs of gene prediction programs for improved results. Bioinformatics 19, 1296–1297.
34 Kumar S, Tamura K & Nei M (2004) MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform 5, 150–163.
35 Schmidt HA, Strimmer K, Vingron M & von Haeseler A (2002) TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502–504.
36 Rabus R, Jack DL, Kelly DJ & Saier MH Jr (1999) TRAP transporters: an ancient family of extracytoplasmic solute-receptor-dependent secondary active transporters. Microbiology 145, 3431–3445.
37 Forward JA, Behrendt MC, Wyborn NR, Cross R & Kelly DJ (1997) TRAP transporters: a new family of periplasmic solute transport systems encoded by the dctPQM genes of Rhodobacter capsulatus and by homologs in diverse gram-negative bacteria. J Bacteriol 179, 5482–5493.
38 Wu CH, Apweiler R, Bairoch A, Natale DA, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, et al. (2006) The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res 34, D187–D191.

Acknowledgements
This work was supported by Coordinación de Investigación Científica (UMSNH) and Consejo Nacional de Ciencia y Tecnologia (CONACYT) (México; 41712Q) grants to CC, and Dirección General de Asuntos del Personal, Universidad Nacional Autónoma de México (DGAPA-UNAM; IN224006) and CONACYT (México; 52278) grants to HRR. CD-P was supported by a fellowship from UMSNH. We thank Josefina Bolado for careful reading of the manuscript, and two anonymous referees for insightful comments.

Supplementary material
The following supplementary material is available online: Table S1. LCHR and SCHR proteins identified in archaea, bacteria and fungi. This material is available as part of the online article from: http://www.blackwell-synergy.com
Please note: Blackwell Publishing is not responsible for the content or functionality of any supplementary materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.