doi:10.1046/j.1432-1033.2002.03237.x
Eur. J. Biochem. 269, 5259–5263 (2002) © FEBS 2002
Extra terminal residues have a profound effect on the folding and solubility of a Plasmodium falciparum sexual stage-specific protein over-expressed in Escherichia coli
Sushil Prasad Sati1, Saurabh Kumar Singh1, Nirbhay Kumar2 and Amit Sharma1
1Malaria Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi, India; 2Department of Molecular Microbiology and Immunology, Hopkins Malaria Research Institute, The Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
The presence of extra N- and C-terminal residues can play a major role in the stability, solubility and yield of recombinant proteins. Pfg27 is a 27K soluble protein that is essential for sexual development in Plasmodium falciparum. It was over-expressed using the pMAL-p2 vector as a fusion protein with the maltose binding protein. Six different constructs were made and each of the fusion proteins was expressed and purified. Our results show that the fusion proteins were labile and only partially soluble in five of the constructs, resulting in very poor yields. Intriguingly, in the sixth construct, the yield of soluble fusion protein with an extended carboxyl terminus of 17 residues was several fold higher. Various constructs with either N-terminal or smaller C-terminal extensions failed to produce any soluble fusion protein. Furthermore, all five constructs produced Pfg27 that precipitated after protease cleavage from its fusion partner. The sixth construct, which produced soluble protein in high yields, also gave highly stable and soluble Pfg27 after cleavage of the fusion. These results indicate that extra amino acid residues at the termini of over-expressed proteins can have a significant effect on the folding of proteins expressed in E. coli. Our data suggest the potential for development of a novel methodology, which will entail construction of fusion proteins with maltose binding protein as a chaperone on the N-terminus and a C-terminal ‘solubilization tag’. This system may allow large-scale production of those proteins that have a tendency to misfold during expression.
Keywords: expression; fusion protein; precipitation; protein folding; solubility.
Despite the widespread use of fusion protein-based over-expression vectors for the production and purification of proteins in Escherichia coli, the molecular or structural elements which determine protein stability and solubility for recombinant proteins are not well understood. The solubility of non membrane-bound proteins is a complex biochemical phenomenon, and it is generally believed that properly folded proteins are reasonably soluble in aqueous solutions. There are many factors that affect protein solubility, and one such player is the amino acid sequence variation at the amino (N-) and carboxyl (C-) termini. In a cellular environment, partially folded or misfolded proteins are generally prone to aggregate formation, and the cell machinery gets rid of these aggregates by the combined action of chaperones and intracellular proteases [1–3]. Several studies have shown that the nature of terminal residues in proteins (i.e. polar or nonpolar) can play a role in recognition and subsequent action by cellular proteases [4,5]. In many cases, polar residues at the carboxyl terminus are able to prevent recognition by tail-specific proteases [4–7]. Together, these studies point to a complex and multifactorial nature of protein folding, and indicate that phenomena like protein stability or lability are not yet completely understood. The molecular and structural elements which determine protein folding are significant players in the success or failure of over-expression techniques.
We were interested in producing large amounts of a 27K cytoplasmic protein (Pfg27) from Plasmodium falciparum for biochemical and biophysical studies. This protein plays a crucial role in the sexual development of P. falciparum, and parasites lacking its gene fail to develop sexually [8]. Initial efforts to express soluble Pfg27 as a fusion protein with His-tag in the pRSET vector system (Invitrogen) were unsuccessful as the over-expressed Pfg27 aggregated after purification resulting in precipitation. Expression systems with the maltose binding protein (MBP) have been used routinely to enhance the solubility and yield of fusion products. Aside from being an efficient tag for affinity chromatography, MBP is able to act as a molecular chaperone to enhance the solubility of fused partners [9,10]. Therefore, various MBP-Pfg27 fusion constructs were engineered to study the behavior of the expressed proteins. These constructs had variations in the sequence and length of extra amino acid residues at the termini of Pfg27. All six MBP-Pfg27 constructs produced equivalent amounts of fusion protein as judged by induction analysis. Intriguingly, only one construct provided proteins of both high stability and solubility which could be used in biochemical and structural studies. Our results indicate that the critical element in obtaining high quality and quantity of Pfg27 was the presence of a carboxyl terminal extension of 17 residues, which seems to function as a ‘solubilization tag’. The exact mechanism for this phenomenon where an extra stretch of residues is able to confer enhanced solubility properties to a protein is not yet known.

Correspondence to A. Sharma, Malaria Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India. Tel/Fax: +91 11 6711731, E-mail: asharma@icgeb.res.in
Abbreviations: MBP, maltose binding protein.
(Received 11 July 2002, revised 26 August 2002, accepted 6 September 2002)
MATERIALS AND METHODS

Oligonucleotides were purchased from Genosys, USA. Various enzymes, the pMAL-p2 vector and amylose resin were bought from New England Biolabs (NEB), USA. All other chemicals were obtained from Sigma-Aldrich Co., USA.

Expression constructs

The Pfg27 constructs were PCR amplified from P. falciparum (3D7 strain) genomic DNA by using the following primer pairs (the respective restriction enzymes are given in parentheses):
Pfg27A: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAAAAGCTTAATATTGTTGTGATGTGGTTCATC-3′ (PstI-HindIII);
Pfg27B: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAACTGCAGTTAAATATTGTTGTGATGTGGTTCATC-3′ (PstI);
Pfg27C: 5′-AAAAAGCTTATGAGTAAGGTACAAAAG-3′ and 5′-AAAAAGCTTTTAAATATTGTTGTGATGTGGTTCATC-3′ (HindIII);
Pfg27D: 5′-AAAGAATTCATGAGTAAGGTACAAAAG-3′ and 5′-AAACTGCAGTTAAATATTGTTGTGATGTGGTTCATC-3′ (EcoRI-PstI);
Pfg27E: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAAAGCTTTCACTTCGAATTCCATGGTACCAG-3′ (PstI-HindIII);
Pfg27F: 5′-AAACTGCAGATGAGTAAGGTACAAAAG-3′ and 5′-AAAAAGCTTTTACGACGTTGTGTGATGTGGTTCATC-3′ (PstI-HindIII).

PCR was performed using standard protocols and the product purified using Qiagen PCR purification kits. DNA fragments were cloned into the expression vector pMAL-p2, resulting in constructs designated Pfg27A–F. All constructs were verified by DNA sequencing in an ABI 310 automated sequencer using a forward primer that anneals to the malE gene upstream of the polylinker region and a reverse primer that anneals to the lacZ alpha sequence downstream of the first in-frame stop codon in the pMAL vector.

Expression and analysis of recombinant proteins

DNA constructs were transformed into the BL21 (B834 DE3) strain of E. coli by the heat shock method and transformants were grown in LB broth in the presence of carbenicillin (50 mg·mL⁻¹) and 0.2% glucose. For protein production, 100 mL bacterial cultures were induced with 0.3 mM IPTG at an optical density (D) of ~0.6 and grown for another two hours. The cultures were spun down at 6000 g, and the pellets were suspended in lysis buffer (20 mM Tris pH 7.5, 200 mM NaCl, 1 mM EDTA and 1 mM phenylmethanesulfonyl fluoride) and subjected to sonication. The cell suspensions were then centrifuged at 26 000 g for 30 min, and the lysates loaded on pre-equilibrated amylose columns. The columns were washed with 12 volumes of lysis buffer and the MBP-Pfg27 proteins eluted with 10 mM maltose. Protein fractions were analyzed by SDS/PAGE and the fusion proteins cleaved with factor Xa in lysis buffer containing 10 mM CaCl2. In the case of constructs Pfg27B–F, the protease cleavage mixtures were centrifuged at 15 000 g for 10 min in a microcentrifuge. The resulting supernatants and pellets were loaded on an SDS/PAGE gel to check the solubility of MBP and Pfg27 after cleavage. All protein concentrations were measured by UV absorption at 280 nm and verified by SDS/PAGE. Standard protocols were followed for western analysis, where Pfg27 was probed with polyclonal anti-Pfg27 Ig produced in mice.

RESULTS

Design of the pRSET construct

We first attempted to express recombinant Pfg27-(His)6 fusion protein by taking advantage of a prokaryotic expression vector, pRSET-C (Invitrogen). The coding sequence of Pfg27 was PCR amplified using gene specific primers. The antisense primer lacked the final stop codon and the PCR product was cloned into the PvuII site of pRSET-C. Subsequent sequence analysis as well as expression upon induction confirmed production of the fusion protein. However, this construct produced insoluble protein which aggregated and precipitated upon purification, making it unacceptable for either functional or structural studies.

Design of various MBP + Pfg27 fusion constructs

The different constructs generated to examine the role of extra amino acid residues at N- and C-termini were as follows (Fig. 1):

Pfg27A: This construct encodes a fusion protein in which Pfg27 has 12 extra residues at its N-terminus (from the vector backbone) and 17 extra amino acids at its C-terminus. The natural end of Pfg27 is …HHNNI but in this construct Pfg27 ends with …HHNNI + LVPWNSKLGTGRRFTTS (the terminal polar residues are TTS). It also has an unexpected D to N mutation at the seventh residue position of Pfg27. This construct produces highly soluble and stable protein due to its 17 residue extension.
Fig. 1. Diagrammatic representation of the seven Pfg27 over-expression constructs. The arrow shows the factor Xa cleavage site, and wedge represents the D to N mutation in some constructs. Extra N- and C-terminal sequences are shown along with the original pRSET vector construct (Pfg27HIS) for reference. The underlined residues indicate the polar end residues.
Pfg27B: This construct has the same 12 extra residues at the N-terminus as Pfg27A, retains the D to N mutation, but has no C-terminal extension. Any difference observed between the behavior of fusion proteins Pfg27A and Pfg27B can be directly attributed to the 17 residue extension in Pfg27A. This construct produced insoluble protein as it lacks the 17 residue C-terminal extension. In addition, it clearly shows that the accidental D to N mutation does not affect protein solubility.
Pfg27C: This construct has 15 extra amino acid residues at the N-terminus, does not have the D to N mutation and also lacks any C-terminal extension. This construct indicates that the N-terminal variation has no effect on protein solubility.
Pfg27D: This construct has no extra residues at either terminus and contains no mutation. It is a control construct designed to express wild-type Pfg27 in its native state, without extensions or the D to N mutation, and therefore represents a typically preferred construct.
Pfg27E: This construct has extensions at the N- and C-termini and also the D to N mutation. It is closest to Pfg27A in design but has a truncated C-terminal extension of only seven amino acids which end in polar residues. This construct indicates that the length of the C-terminal extension plays an important role.
Pfg27F: This construct has the same N-terminal extension as Pfg27A, Pfg27B and Pfg27E but lacks a C-terminal extension. It was designed to address whether the presence of polar residues at the C-terminus is sufficient to confer increased solubility on the fusion protein. The end sequence of native Pfg27 (…NNI) was changed to (…TTS), so the terminal three residues are identical to those in Pfg27A.
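The C-terminal ends described for these constructs can be cross-checked against the antisense primers listed in Materials and methods. The following is a minimal sketch, not the authors' procedure: it reverse-complements an antisense primer and translates it with the standard genetic code up to the first stop codon. The function names are hypothetical, the Pfg27E primer is used as reassembled from the printed listing, and the reading frame is assumed to start at the first base of the reverse complement (an assumption chosen because it reproduces the …HHNNI end quoted in the text).

```python
# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def encoded_c_terminus(antisense_primer: str) -> str:
    """Translate the sense strand covered by an antisense primer.

    Assumption: the reading frame starts at the first base of the
    reverse complement. Translation stops at the first stop codon.
    """
    sense = reverse_complement(antisense_primer.upper())
    residues = []
    for pos in range(0, len(sense) - 2, 3):
        amino_acid = CODON_TABLE[sense[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# Antisense primers as listed in Materials and methods (5' to 3').
ANTISENSE_PRIMERS = {
    "Pfg27B": "AAACTGCAGTTAAATATTGTTGTGATGTGGTTCATC",
    "Pfg27E": "AAAAGCTTTCACTTCGAATTCCATGGTACCAG",
    "Pfg27F": "AAAAAGCTTTTACGACGTTGTGTGATGTGGTTCATC",
}

for name, primer in ANTISENSE_PRIMERS.items():
    print(name, encoded_c_terminus(primer))
```

Under these assumptions the Pfg27B primer encodes an end of …HHNNI followed by a stop codon, the Pfg27F primer encodes …HHTTS, and the Pfg27E primer encodes the seven residues LVPWNSK, which match the first seven residues of the Pfg27A extension.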
Stability and solubility of various MBP-Pfg27 proteins

The Pfg27A–F construct DNAs were freshly transformed into E. coli cells, and the transformants grown, induced and processed under identical conditions (see Materials and methods). The cell pellets were sonicated and the resulting lysates were loaded onto pre-equilibrated amylose columns. Sufficient and equivalent amounts of amylose resin were used for each construct so that the yield differences were normalized. All experiments were conducted in 100 mL cultures and the data presented in Table 1 have been scaled to represent the yields for one liter of cell culture. These constructs were used to express and purify MBP-Pfg27A–F fusion proteins by maltodextrin affinity chromatography according to manufacturer’s instructions. All constructs showed equivalent levels of protein induction but the resulting fusion protein had varying solubilities (Fig. 2, Table 1). The six constructs produced ~200 mg·L⁻¹ of fusion protein. However, the behavior of the proteins varied once the cells had been lysed for downstream processing and protein purification. Pfg27A produced the highest protein levels in a stable and soluble form (Table 1). Approximately 33% of the total induced protein could be purified off the amylose column for Pfg27A. In contrast, only ~7.5% was eluted off the amylose resin for Pfg27B–F. Further, no protein was detected in the Pfg27A column flow through but ~5% of the fusion proteins from Pfg27B–F did not bind to amylose resin and were found in the flow through fractions (Table 1). The latter observation suggests improper folding of Pfg27 in the case of Pfg27B–F fusions due to which these interacted poorly with the amylose resin, a behavior noted earlier [9].
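The percentages quoted above follow from the Table 1 amounts and the roughly 200 mg of induced fusion protein per liter. The sketch below only redoes that arithmetic; the variable and function names are hypothetical, and the 200 mg total is the approximate figure from the text rather than a measured value per construct.

```python
# Approximate induced fusion protein per liter of culture, as quoted in the text.
TOTAL_INDUCED_MG = 200.0

# Table 1 amounts in mg, scaled to one liter of culture.
TABLE_1 = {
    "Pfg27A": {"flow_through": 0, "eluted": 66, "final_pfg27": 24},
    "Pfg27B": {"flow_through": 12, "eluted": 13, "final_pfg27": 0},
    "Pfg27C": {"flow_through": 10, "eluted": 15, "final_pfg27": 0},
    "Pfg27D": {"flow_through": 10, "eluted": 17, "final_pfg27": 0},
    "Pfg27E": {"flow_through": 10, "eluted": 14, "final_pfg27": 0},
    "Pfg27F": {"flow_through": 13, "eluted": 15, "final_pfg27": 0},
}


def percent_of_induced(amount_mg: float, total_mg: float = TOTAL_INDUCED_MG) -> float:
    """Express an amount as a percentage of the total induced fusion protein."""
    return 100.0 * amount_mg / total_mg


def mean_percent(column: str, constructs: list[str]) -> float:
    """Average one Table 1 column over several constructs, as a percentage."""
    values = [percent_of_induced(TABLE_1[name][column]) for name in constructs]
    return sum(values) / len(values)


POORLY_SOLUBLE = ["Pfg27B", "Pfg27C", "Pfg27D", "Pfg27E", "Pfg27F"]

pfg27a_eluted_pct = percent_of_induced(TABLE_1["Pfg27A"]["eluted"])
others_eluted_pct = mean_percent("eluted", POORLY_SOLUBLE)
others_flow_pct = mean_percent("flow_through", POORLY_SOLUBLE)

print(f"Pfg27A eluted: {pfg27a_eluted_pct:.1f}% of induced protein")
print(f"Pfg27B-F eluted (mean): {others_eluted_pct:.1f}%")
print(f"Pfg27B-F flow through (mean): {others_flow_pct:.1f}%")
```

With the Table 1 values this gives 33% eluted for Pfg27A, and means of 7.4% eluted and 5.5% in the flow through for Pfg27B–F, in line with the rounded figures in the text.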
Fig. 2. (A–F) Protein expression and purification profile of Pfg27A–F constructs by SDS/PAGE analysis. Lane 1, protein standards; lane 2, uninduced cell pellet; lane 3: induced cell pellet; lane 4, induced cell supernatant; lane 5, induced cell pellet; lane 6, amylose column flow- through; lane 7, eluted MBP-Pfg27–70K fusion protein marked by an arrow.
Table 1. Expression and purification analysis of six MBP + Pfg27 constructs.

Construct    Fusion protein in flow through (mg)    Eluted fusion protein (mg)    Final yield of Pfg27 (mg)
Pfg27A       0                                      66                            24
Pfg27B       12                                     13                            0
Pfg27C       10                                     15                            0
Pfg27D       10                                     17                            0
Pfg27E       10                                     14                            0
Pfg27F       13                                     15                            0

In the MBP fusion system, a factor Xa protease site has been engineered in the multiple cloning site such that MBP can be released from the over-expressed fusion protein. In an attempt to obtain native Pfg27, the six fusion proteins were incubated with appropriate amounts of the factor Xa protease for cleavage. All fusion proteins cleaved successfully under identical reaction conditions. The stability and solubility of MBP after factor Xa cleavage was found high and identical in all cases. However, the resulting Pfg27 proteins from constructs Pfg27B–F were labile, and precipitated immediately after cleavage (Fig. 3). The identity of the precipitated protein (Pfg27) was confirmed by doing western blot analysis using anti-Pfg27 polyclonal antibodies. In sharp contrast, Pfg27A produced Pfg27 that remained
soluble, could be purified to homogeneity, and was subsequently crystallized. The final yield of Pfg27 from Pfg27A was 24 mg·L⁻¹ of starting culture while it was 0 for Pfg27B–F.

To dissect further the structural elements responsible for the enhanced solubility phenomenon observed in Pfg27A, we identified three key issues: (a) the role of the N-terminal extension; (b) the role of the D to N mutation; and (c) the role of the sequence and length of the C-terminal extension. To address whether the presence of extra residues at the N-terminus of Pfg27A was responsible for its high yields, we engineered constructs Pfg27B, Pfg27C, Pfg27E and Pfg27F which share similar N-terminal extensions. Clearly, the presence of these extensions was not enough to yield soluble fusion protein. Next, we addressed whether the differing solubilities were due to the accidental D to N mutation in Pfg27A. It is well documented that the solubility of proteins can be affected severely by single amino acid mutations. However, in the present study we can exclude this possibility as the D to N mutation is conserved in four of the six constructs (Fig. 1), which nonetheless retain the contrasting solubility profiles for their respective fusion proteins. Finally, to address whether the presence of terminal polar residues and the exact length of the C-terminal extension were responsible for the observed ‘solubilization effect’ in Pfg27A, we engineered two constructs, Pfg27E and Pfg27F. Once again, these constructs yielded only partially soluble fusion protein.

Effect of temperature on the expression profiles

We examined the role of temperature in enhancing the solubility properties of various fusion proteins. For some constructs, cell cultures were grown at 37 °C but the temperature was dropped to 30 °C or 25 °C after induction. However, no discernible difference in the relative protein solubility profiles was observed. Indeed, the drastic difference in fusion protein quality between Pfg27A and the rest of the constructs was retained. Therefore, the differing protein solubilities were inherent in the constructs and could not be modulated by varying the ambient growth conditions.

The postcleavage precipitation of Pfg27 from constructs Pfg27B–F indicated misfolding of Pfg27. It is probable that these fusions were maintained in solution due to the well known ‘solubilization effects’ of MBP. Although MBP remained soluble after cleavage, Pfg27 precipitated. These experiments also highlight the central issue that solubility of fusion proteins is not necessarily indicative of either proper folding or of increased solubility of the protein of interest. We propose that the success of fusion protein systems should be ascertained only once the protein of interest has been cleaved off and shown to retain both its folded state and its biological activity.

Fig. 3. SDS/PAGE analysis of Pfg27B (A) and Pfg27A (B) cleavage with factor Xa. (A) Lane 1, protein standards; lane 2, cleavage mixture; lane 3, cleavage mixture supernatant; and lane 4, cleavage mixture pellet. Proteins Pfg27C–F gave identical precipitation profiles. (B) Lane 1, Pfg27A fusion protein; and lane 2, cleavage mixture of MBP-Pfg27A fusion protein. Protein produced from constructs Pfg27C–F showed identical precipitation behavior.

DISCUSSION

The routine production of recombinant proteins in soluble and biologically active form remains a challenge. In this context, MBP and other fusion protein expression systems have been widely used, both because the fusions serve as efficient purification tags and because they are able to promote the solubility of the fused partner [9,10]. In the present study, we engineered several constructs with the overall aim of obtaining high quality Pfg27 protein for biochemical and biophysical applications. In the first instance, we used one of the most commonly used vehicles for over-expression, namely the pET vector as part of the pRSET system. Although reasonable amounts of Pfg27 were produced upon induction, this construct failed to produce protein in a soluble form. Subsequently, the MBP system was used and proved useful in producing at least partially soluble proteins from constructs Pfg27A–F. We found that fusion to MBP did indeed promote the proper folding of Pfg27 into a native and stable form but with a caveat. Our results suggest that the dramatic positive effect on stability and solubility of Pfg27 produced using the construct Pfg27A can be attributed to the presence of 17 extra residues at the C-terminal end. It is probable that MBP and the C-terminal extension both contribute to the high yield and solubility of Pfg27A in a complex process of assisted protein folding.

It is possible that the exact sequence and length of the 17 residue extension together contribute to the ‘solubilizing and stabilizing effects’ observed in Pfg27A. This phenomenon, where extra C-terminal residues affect the stability of an over-expressed protein, has some precedent (see Table 2). A more extensive study can now be undertaken to verify whether segments like the 17 residue extension at the C-terminus of Pfg27A can be used in a more generic fashion.

In the postgenomic era of functional genomics, structural genomics and an expanding scope for biotechnology products, the production of large amounts of soluble native protein has necessarily taken center stage. Our findings highlight the contribution of extra C-terminal residues in producing recombinant Pfg27 in a native, folded state which is now suitable for biochemical, biophysical and immunological characterization. Structural elements like the 17 residue C-terminal extension may have widespread application in recombinant protein production. More significantly, this study highlights the complex and multifactorial nature of protein folding.

Table 2. Effect of C-terminal extensions on various proteins expressed in E. coli.

Protein                      Variation          Effect                 Reference
Arc repressor                C-terminal tail    Increased stability    [4]
Lambda repressor protein     C-terminal tail    Increased stability    [5]
AraC                         C-terminal tail    Increased stability    [11]
Aldehyde dehydrogenase       C-terminal tail    Increased stability    [12]

ACKNOWLEDGEMENTS

We thank the past and present members of the Malaria Group, ICGEB, New Delhi for help and discussions. N.K. is supported by the National Institutes of Health grant AI46760. A.S. is supported by an International Wellcome Trust Senior Research Fellowship.

REFERENCES

1. Maurizi, M.R., Trisler, P. & Gottesman, S. (1985) Insertional mutagenesis of the lon gene in Escherichia coli: lon is dispensable. J. Bacteriol. 164, 1124–1135.
2. Keller, J.A. & Simon, L.D. (1988) Divergent effects of a dnaK mutation on abnormal protein degradation in Escherichia coli. Mol. Microbiol. 2, 31–41.
3. Straus, D.B., Walter, W.A. & Gross, C.A. (1988) Escherichia coli heat shock gene mutants are defective in proteolysis. Genes Dev. 2, 1851–1858.
4. Bowie, J.U. & Sauer, R.T. (1989) Identification of C-terminal extensions that protect proteins from intracellular proteolysis. J. Biol. Chem. 264, 7596–7602.
5. Silber, K.R., Keiler, K.C. & Sauer, R.T. (1992) Tsp: a tail-specific protease that selectively degrades proteins with nonpolar C termini. Proc. Natl Acad. Sci. USA 89, 295–299.
6. Keiler, K.C. & Sauer, R.T. (1996) Sequence determinants of C-terminal substrate recognition by the Tsp protease. J. Biol. Chem. 271, 2589–2593.
7. Gottesman, S., Roche, E., Zhou, Y. & Sauer, R.T. (1998) The ClpXP and ClpAP proteases degrade proteins with carboxy-terminal peptide tails added by the SsrA-tagging system. Genes Dev. 12, 1338–1347.
8. Lobo, C.A., Fujioka, H., Aikawa, M. & Kumar, N. (1999) Disruption of the Pfg27 locus by homologous recombination leads to loss of the sexual phenotype in P. falciparum. Mol. Cell 3, 793–798.
9. Kapust, R.B. & Waugh, D.S. (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci. 8, 1668–1674.
10. Riggs, P. (2000) Expression and purification of recombinant proteins by fusion to maltose-binding protein. Mol. Biotechnol. 15, 51–63.
11. Ghosh, M. & Schleif, R.F. (2001) Stabilizing C-terminal tails on AraC. Proteins 42, 177–181.
12. Rodriguez-Zavala, J. & Weiner, H. (2001) Role of the C-terminal tail on the quaternary structure of aldehyde dehydrogenases. Chem. Biol. Interact. 130–132, 151–160.