Physiological truncation and domain organization of a novel uracil-DNA-degrading factor
Mária Pukáncsik1, Angéla Békési1, Éva Klement2, Éva Hunyadi-Gulyás2, Katalin F. Medzihradszky2,3, Jan Kosinski4,5, Janusz M. Bujnicki4,6, Carlos Alfonso7, Germán Rivas7 and Beáta G. Vértessy1
1 Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
2 Proteomics Research Group, Biological Research Center, Hungarian Academy of Sciences, Szeged, Hungary
3 Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
4 Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
5 PhD School, Institute of Biochemistry and Biophysics PAS, Warsaw, Poland
6 Institute of Molecular Biology and Biotechnology, Adam Mickiewicz University, Poznan, Poland
7 Chemical and Physical Biology, Centro de Investigaciones Biológicas, Madrid, Spain
Keywords cell death; DNA; nuclease; protein structural modeling; uracil
Correspondence B. G. Vértessy, Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, H-1113, Budapest, Karolina út 29, Hungary Fax: +36 1 466 5465 Tel: +36 1 279 3116 E-mail: vertessy@enzim.hu
(Received 2 April 2009, revised 16 December 2009, accepted 18 December 2009)
doi:10.1111/j.1742-4658.2009.07556.x
Structured digital abstract MINT-7385914: UDE (uniprotkb:Q961C4) and UDE (uniprotkb:Q961C4) bind (MI:0407) by cosedimentation in solution (MI:0028)
Uracil in DNA is usually considered to be an error, but it may be used for signaling in Drosophila development via recognition by a novel uracil-DNA-degrading factor (UDE) [Bekesi A et al. (2007) Biochem Biophys Res Commun 355, 643–648]. The UDE protein has no detectable similarity to any other uracil-DNA-binding factors, and has no structurally or functionally described homologs. Here, a combination of theoretical and experimental analyses reveals the domain organization and DNA-binding pattern of UDE. Sequence alignments and limited proteolysis with different proteases show extensive protection by DNA at the N-terminal duplicated conserved motif 1A/1B segment, and a well-folded domain within the C-terminal half encompassing conserved motifs 2–4. Theoretical structure prediction suggests that motifs 1A and 1B fold as similar α-helical bundles, and reveals two conserved positively charged surface patches that may bind DNA. CD spectroscopy also supports the presence of α-helices in UDE. Full functionality of a physiologically occurring truncated isoform in Tribolium castaneum lacking one copy of the N-terminal conserved motif 1 is revealed by activity assays of a representative truncated construct of Drosophila melanogaster UDE. Gel filtration and analytical ultracentrifugation results, together with analysis of predicted structural models, suggest a possible dimerization mechanism for preserving functionality of the truncated isoform.
Abbreviations DmUDE, Drosophila melanogaster uracil-DNA-degrading factor; DmrcUDE, recombinant Drosophila melanogaster uracil-DNA-degrading factor; MQAP, model quality assessment program; TcUDE, Tribolium castaneum truncated uracil-DNA-degrading factor isoform; UDE, uracil-DNA-degrading factor; UDG, uracil-DNA glycosylase.
FEBS Journal 277 (2010) 1245–1259 © 2010 The Authors Journal compilation © 2010 FEBS
Introduction
The nucleobase uracil is not a normal constituent of DNA, although it provides the same Watson–Crick interaction pattern for adenine as does thymine (i.e. 5-methyl-uracil), and is actually used as the adenine-counterpart base in RNA. Despite its usual absence, there are two physiological ways for uracil to appear
M. Pukáncsik et al.
Protein function preserved in a truncated isoform
in DNA: cytosine deamination and thymine replacement. Cytosine-to-uracil transitions via hydrolytic deamination are among the most frequently occurring spontaneous mutations. These generate premutagenic U:G mispairs [1,2]. Thymine replacement by uracil can occur if the cellular dUTP/dTTP ratio increases, as most DNA polymerases will incorporate either uracil or thymine against adenine, based solely on the availability of the corresponding building block nucleotides [3,4]. Thymine-replacing uracil moieties are not mutagenic, as they provide the same genomic information, but may perturb the binding of factors that require the 5-methyl group on the thymine ring for recognition.
There are also two mechanisms to ensure uracil-free DNA: prevention and excision. dUTPases prevent uracil incorporation into DNA by removing dUTP from the DNA polymerase pathway [5]. Uracil in DNA, produced by either cytosine deamination or uracil misincorporation, is excised by uracil-DNA glycosylases (UDGs) in the base excision repair pathway [6,7]. Among the different UDGs, the protein product of the ung gene (termed UNG) is by far the most efficient in catalyzing uracil excision [8]. UNG is responsible for most of the repair process, as its mutation in Escherichia coli, mouse and human has been found to induce a considerable increase in uracil content [9–11]. Null mutations in the dUTPase gene (dut) result in a nonviable phenotype that can be rescued by a second null mutation in the ung gene. The dut−ung− genotype presents mass uracil incorporation into DNA [9,12]. Interestingly, an analogous situation, with simultaneous lack of dUTPase and UNG activities, arises in Drosophila larvae under physiological conditions. On the one hand, the ung gene coding for the major UDG enzyme is not present in the Drosophila genome [13]. On the other hand, it has been shown that the dUTPase level is under the limit of detection in larval tissues, and that the enzyme is present exclusively in the imaginal disks [14]. Simultaneous lack of UNG and dUTPase may lead to accumulation of uracil-substituted DNA in fruitfly larval tissues. A specific protein termed uracil-DNA-degrading factor (UDE), which recognizes and degrades uracil-DNA, was identified in Drosophila late larvae and pupae, strengthening the hypothesis that Drosophila melanogaster may use uracil-DNA as a signal to switch on metamorphosis-related cell death [15–17].
UDE is the first member of a new protein family whose members recognize uracil-DNA. It has no glycosylase activity, and its sequence does not show any appreciable similarity to those of other nucleases or uracil-DNA-recognizing proteins [15] (Fig. 1). Significantly similar protein sequences were found only in translated genomes of other pupating insects, but no structural or functional data have been published on any of these putative proteins. In all of these sequences of homologous proteins, four distinct conserved sequence motifs could be identified (motifs 1–4), the first of which is substantially longer and is usually present in two copies (motifs 1A and 1B). Comparison of these motifs with motifs in UDGs does not offer any clue regarding the structure and function of UDE, as no apparent similarity could be observed (Fig. 1B–E) [18]. Investigation of this protein may therefore offer new insights into the physiological role and catalytic mechanism of nucleases. To this end, in the present study we probed the domain organization of UDE from D. melanogaster, expressed as a recombinant protein (DmrcUDE), by limited proteolysis, and revealed that a specific truncated fragment lacking the N-terminus may fold into a stable conformation. Interestingly, we also identified such a truncated physiologically occurring UDE isoform from the pupating insect Tribolium castaneum (TcUDE) [19]. The TcUDE isoform lacks one copy of the N-terminal duplicated first motif. We generated the respective segment from DmrcUDE by chemical cleavage with hydroxylamine, and found that this truncated segment retains catalytic specificity and activity. The structural results therefore offer an explanation for the existence of the physiological truncated isoform.
De novo modeling was performed using ROSETTA, and a 3D structural model was constructed for the tandemly duplicated N-terminal motifs 1A and 1B. The model suggests that both motifs comprise similar three-helical bundles, with the same topology and relative orientation of α-helices. A high content of helical secondary structure in UDE was also independently confirmed by CD. The predictions, together with the domain organization studies, offer a model of DNA binding to an extended surface on the protein along the conserved motifs.
Results
Identification of a physiologically occurring truncated isoform of UDE
BLAST searches indicated that UDE has detectable homologs only in pupating insects (Fig. 1A). The multiple sequence alignment shows four conserved motifs (Fig. 1A,E). The first extended UDE motif is present in two highly similar copies. The UDE homolog from T. castaneum contains only one copy of motif 1, suggesting that lack of the first motif may still result in a functional protein (Fig. 1A).
[Figure 1 panels: (A) sequence alignment (not recoverable from the extracted text); (B) UDG families (AUDG, DRUDG, UDGX, UNG, MUG/TDG, SMUG) sharing a common α/β fold, with motifs 1–3; (C) organization of UDE motifs 1A, 1B, 2, 3 and 4.]
(D) Consensus motifs of UDG families
Family     Motif 1       F/Y    Motif 2    Motif 3
UNG        VhhLGQDPYH    F      hFhhWG     hhcppHPSP
MUG/TDG    hhhxGINPGL    F/Y    hhhFxG     haVhPppSh
UDGX       hhhhGShPxx    Y      hhhxpG     hhxLPSTSx
AUDG       hhhhGExPGx    F      hhhxhG     hhhxaHPSh
DRUDG      xLxLLExPGP    f      VVhxLG     xhxxxHPSh
SMUG       hhFhGMNPGP    F      hhVhVG     VxxLxHPSP
(E) Consensus motifs of UDE
Motif    Consensus
1A/1B    GFKDxxxAxxTLxxLxxRDxpYpxxxhxGLhxxAKRVLxxTKxExKhxxIKxAhxxhEpaL
2        GpYKcLRp
3        TWDIxRN
4        KxFpxcxxxPTxxHLxxIxWAYSxpxxKhK
Fig. 1. Sequence alignment of UDE homologs in D. melanogaster and T. castaneum, and conserved motifs in UDE and members of the UDG superfamily. (A) Alignment of D. melanogaster and T. castaneum UDE homologs. Gray background: conserved motifs. Red letters: strictly conserved residues. (B) Evolutionary relationship and organization of conserved motifs among UDG proteins [18]. Gray background: uracil-DNA-recognizing proteins present in D. melanogaster. (C) Organization of conserved motifs in UDE. (D, E) Consensus sequences of UDG (D) and UDE (E) motifs. Upper-case letters: conserved residues. Lower-case letters: residues with conserved characteristics (h, hydrophobic; a, aromatic; p, polar/charged). Nonconserved positions are indicated by x. A conserved F/Y residue, overlapping with the uracil ring, is invariably present C-terminal to motif 1 in UDGs. Underlined Asp/Glu residues in UDG motif 1 are involved in catalysis; the underlined His in UDG motif 3 is suggested to stabilize reaction intermediates. Note the lack of detectable similarities between UDE and UDG motifs.
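The consensus notation defined in the legend (upper-case for conserved residues, h/a/p for residue classes, x for nonconserved positions) can be turned into a searchable pattern. The sketch below is only an illustration of that idea: the residue sets chosen for each class, the reading of the undefined symbol c as "charged", the helper names and the toy sequence are all assumptions, not part of the published analysis.

```python
import re

# Residue sets are ASSUMPTIONS for illustration; the figure legend names the
# classes (h, a, p, x) but does not list which residues belong to each.
CLASSES = {
    "h": "[AVILMFWYC]",  # hydrophobic (assumed set)
    "a": "[FWYH]",       # aromatic (assumed set)
    "p": "[DEKRHNQST]",  # polar/charged (assumed set)
    "c": "[DEKRH]",      # 'c' is not defined in the legend; assumed "charged"
    "x": ".",            # nonconserved position: any residue
}

def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a consensus string into a compiled regular expression."""
    parts = []
    for symbol in consensus:
        if symbol.isupper():
            parts.append(symbol)           # strictly conserved residue
        elif symbol in CLASSES:
            parts.append(CLASSES[symbol])  # residue class or wildcard
        else:
            raise ValueError(f"unknown consensus symbol: {symbol!r}")
    return re.compile("".join(parts))

def find_motif(consensus: str, sequence: str) -> list[int]:
    """Return 1-based start positions of non-overlapping motif matches."""
    pattern = consensus_to_regex(consensus)
    return [match.start() + 1 for match in pattern.finditer(sequence)]

# Consensus strings taken from panel E of Fig. 1.
UDE_MOTIF_2 = "GpYKcLRp"
UDE_MOTIF_3 = "TWDIxRN"

# Hypothetical toy sequence, NOT a real UDE sequence.
TOY_SEQUENCE = "MATWDIARNGSYKELRD"

if __name__ == "__main__":
    print(find_motif(UDE_MOTIF_3, TOY_SEQUENCE))
    print(find_motif(UDE_MOTIF_2, TOY_SEQUENCE))
```

Because the class definitions are assumed, a scan of real sequences with such patterns would only be a first-pass filter; the alignment in Fig. 1A remains the reference for motif boundaries.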
To confirm the in silico prediction of the UDE-like protein product in Tribolium, extracts of the insect larvae were investigated by western blot, using the polyclonal antiserum produced against DmrcUDE. As expected from the high sequence similarity, the antiserum recognized the Tribolium protein (TcUDE) as well (Fig. 2). The blot clearly indicates that larval extract from T. castaneum contains a single protein that reacts with the UDE-specific antibody. This positive band is found at a position corresponding to a much lower molecular mass than that of DmrcUDE and that of the physiological form of D. melanogaster uracil-DNA-degrading factor (DmUDE). The altered position of TcUDE was in agreement with the genomic data (Fig. 1A), and led to the conclusion that the physiologically occurring TcUDE lacks the N-terminal segment. These results suggest that an isoform of UDE lacking motif 1A may fold on its own, and may form a functional protein.
Domain organization studies using limited proteolysis
To delineate the domain organization of the UDE protein more precisely, limited proteolysis experiments were performed. Three proteases with different specificities were used. Experiments were conducted with DmrcUDE alone, and also in the presence of added DNA to study potential DNA-binding segments in the protein.
Trypsin was selected first, as the UDE protein contains many potential tryptic cleavage sites (i.e. Lys and Arg residues) scattered throughout the sequence. Figure 3A indicates fast initial fragmentation leading to loss of 5–7 kDa fragments from either the N-terminus or the C-terminus, or both. This initial fragmentation is not affected by the presence of DNA. Flexibility of the N-terminal segment (residues 1–47) is also suggested by the drastic overrepresentation of basic residues, leading to an extremely high pI (11.5) for this segment. At later stages of proteolysis, DNA protection is evident, as a specific fragment persists stably in the presence of DNA, whereas this fragment is rapidly degraded in the absence of DNA. Several smaller fragments are produced in relatively large amounts in the absence of DNA, whereas these peptides are practically absent in the presence of DNA. The data suggest the presence of an inner folded core, which probably participates in DNA binding, on the basis of DNA-binding-induced stabilization. The large number of potential tryptic cleavage sites prevented straightforward identification of the fragments, observed on SDS/PAGE, by MS.
[Figure 2 panels: schematic of DmUDE (motifs 1A, 1B, 2, 3, 4) and TcUDE (motifs 1, 2, 3, 4); western blot lanes DmUDE, DmrcUDE and TcUDE, with 55 kDa and 36 kDa markers.]
Fig. 2. Immunodetection of UDE homolog from T. castaneum. Western blot indicates that polyclonal anti-DmUDE serum recognizes the UDE homolog from T. castaneum that appeared at a lower position than physiological DmUDE or DmrcUDE. Lane 1: D. melanogaster larval extract. Lane 2: purified DmrcUDE. Lane 3: T. castaneum larval extract.
For further characterization and localization of protein segments involved in DNA binding to UDE, two additional sets of experiments were conducted, using highly specific chymotrypsin [20] and Asp-N endoproteinase. These enzymes have considerably fewer potential cleavage sites (Fig. 3B,C). In both cases, protection by DNA is again evident. Figure 3B shows that, in the absence of DNA, initial chymotryptic cleavage removes a segment of about 9.6 kDa from UDE, whereas in the presence of DNA, the removed peptide is much smaller, around 3 kDa. MS analysis of the initially cleaved fragments revealed that the C-terminus remained intact, and the two peptide bonds most sensitive to chymotrypsin could therefore be localized at the N-terminus at Trp10 and Tyr69 in the presence and in the absence of DNA, respectively (Fig. 3D). DNA binding is therefore associated with significant protection at the Tyr69-Arg70 peptide bond located within the conserved motif 1A. In addition, DNA-binding-induced conformational changes are also reflected at the Phe104-Glu105 and Tyr311-Ile312 peptide bonds, which become exposed in the presence of DNA (Fig. 3D).
To characterize the potential involvement of the C-terminal region of UDE in DNA binding, Asp-N endoproteinase was also used for limited proteolysis, as the C-terminus of the protein is rather rich in Asp residues (Fig. 3C,D). When it is digested by Asp-N endoproteinase, the primary cleavage removes a short fragment of about 3.4 kDa, independently of the
[Figure 3 panels: (A) trypsin digestion, (B) chymotrypsin digestion and (C) Asp-N proteinase digestion, each without (w/o) and with (w/) U-DNA over the indicated time courses, with molecular markers (MM); (D) map of cleavage sites on DmrcUDE (His-tag, motifs 1A, 1B, 2, 3, 4): chymotryptic sites W10, Y69, F104, W107, F136, Y156, F194, F198 and Y311; hydroxylamine site N111; Asp-N sites D44, D66, D126, D179, D193 and D333.]
Fig. 3. Initial domain analysis of DmUDE by limited proteolysis. (A) Tryptic digestion pattern. Arrows indicate fragments that are preferentially produced in the absence of DNA; the star shows the detected position of stable fragment persisting in the presence of DNA. (B, C) Limited digestion patterns obtained using high-specificity chymotrypsin and Asp-N endoproteinase. The timescale of limited digestion and the presence or absence of added ligand are indicated at the top of the gel. MM, molecular markers. (D) Summary of cleavage sites identified by MS. Top row: chymotryptic sites. Bottom row: Asp-N sites. Solid arrows indicate cleavage sites that are similarly observable in both the presence and the absence of DNA. Dashed arrows indicate sites protected in the presence of DNA. Dotted arrows indicate cleavage sites detected only in the presence of DNA. The cleavage site of hydroxylamine is marked with a bold arrow.
presence of DNA. This loss is in good agreement with a C-terminal cleavage (at Asp333) leading to the loss of 2.6 kDa; cleavage at the first N-terminal Asp (Asp44) would remove a peptide of 6.6 kDa, which is much larger than estimated from the gel electrophoretic analysis. It is evident that, in the absence of DNA, additional cleavages can also occur, yielding 23–25 and 17 kDa polypeptides, as observed on SDS/PAGE. Binding of DNA induces significant protection against all of these cleavages, except at the Asp333 site, which shows the same highly exposed character for Asp-N endoproteinase digestion in the presence and in the absence of DNA.
The results of proteolytic experiments are summarized in Fig. 3D. It is obvious that the segment encompassing motifs 2–4 is a well-folded part of the protein that, even in the absence of DNA, lacks exposed proteolytic sites [despite the presence of numerous potential tryptic, chymotryptic and Asp-N sites (Figs 1A and 3D)]. Motifs 1A and 1B, on the other hand, are significantly more prone to proteolysis, especially in the absence of DNA. DNA binding provides significant protection against proteolytic cleavage along motifs 1A and 1B, indicating either DNA-binding-induced conformational changes or covering of otherwise exposed proteolytic sites by DNA binding to these segments.
Motif 1A is dispensable for UDE function
To produce a specific truncated DmUDE isoform mimicking the physiologically occurring protein in T. castaneum, we selected a chemical agent, hydroxylamine, that cleaves peptide bonds exclusively between Asn and Gly [21]. There is only one such peptide bond in DmUDE, at Asn111-Gly112 (Fig. 3D), located between motifs 1A and 1B. Figure 4A shows that, in agreement with the previously determined exposed character of the linker segment between motifs 1A and 1B, hydroxylamine cleaved the protein into an N-terminal Met1–Asn111 and a C-terminal Gly112–Glu355 fragment, as verified by MS. The molecular masses of the cleavage products are 14 and 28 kDa as calculated from the sequence, whereas values of 16 and 32 kDa were estimated from the SDS/PAGE gels. The C-terminal fragment closely corresponds to the physiological TcUDE isoform. The presence of the N-terminal His-tag on DmrcUDE allowed straightforward separation of N-terminal and C-terminal hydroxylamine-cleaved segments by Ni2+–nitrilotriacetic acid chromatography (Fig. 4A).
To check whether the removal of motif 1A alters the specific function of the protein, we performed catalytic assays and electrophoretic mobility shift assays with the purified Gly112–Glu355 C-terminal fragment. Figure 4B shows that the C-terminal segment preserves catalytic activity and specificity for uracil-substituted DNA that do not depend on the presence or absence of available divalent metal ions. The gel shift indicates the DNA-binding capability of the C-terminal fragment, and also demonstrates the specific DNA-cleaving activity (Fig. 4C).
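The hydroxylamine step described above (cleavage only at Asn-Gly bonds, followed by comparison of calculated fragment masses) can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: the function names and the toy sequence are hypothetical, and the masses are computed from standard average residue masses rounded to two decimals.

```python
# Standard average residue masses in Da, rounded to two decimals.
RESIDUE_MASS_DA = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_DA = 18.015  # one water per free polypeptide chain

def cleavage_sites(sequence: str) -> list[int]:
    """1-based positions of Asn residues that are followed by Gly."""
    return [i + 1 for i in range(len(sequence) - 1)
            if sequence[i] == "N" and sequence[i + 1] == "G"]

def hydroxylamine_fragments(sequence: str) -> list[str]:
    """Split a sequence after every Asn that precedes a Gly."""
    fragments, start = [], 0
    for position in cleavage_sites(sequence):
        fragments.append(sequence[start:position])
        start = position
    fragments.append(sequence[start:])
    return fragments

def average_mass_kda(sequence: str) -> float:
    """Average molecular mass of an unmodified polypeptide, in kDa."""
    return (sum(RESIDUE_MASS_DA[aa] for aa in sequence) + WATER_DA) / 1000.0

# Hypothetical toy peptide with a single Asn-Gly bond, NOT the DmUDE sequence.
TOY_PEPTIDE = "MKTAYNGSLEW"

if __name__ == "__main__":
    for fragment in hydroxylamine_fragments(TOY_PEPTIDE):
        print(fragment, round(average_mass_kda(fragment), 3))
```

Applied to the real DmrcUDE sequence (not reproduced here), the same logic would be expected to report the single site at Asn111 and the two fragments Met1–Asn111 and Gly112–Glu355 named in the text.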
[Figure 4 panels: (A) schematic of DmrcUDE (His-tag, motifs 1A, 1B, 2, 3, 4) with the hydroxylamine (HA) cleavage site at N111 separating the N-terminal M1-N111 and C-terminal G112-E355 fragments, and gels of intact, HA-digested and purified C-terminal protein; (B) G112-E355 DmrcUDE at 0, 50 and 100; (C) control DNA and U-DNA time courses; (D, E) G112-E355 DmrcUDE and full-length DmrcUDE with ss and ds U-oligo at 0, 30, 60 and 120 min, with a 31-mer standard.]
Fig. 4. (A) Production and characterization of the truncated UDE isoform. Cleavage with hydroxylamine (HA) generates the expected fragments. In the schematic representation, the single cleavage site at Asn111 between the 1A and 1B motifs is marked with an arrow. Gel images show gel electrophoretic analysis of hydroxylamine cleavage and purification of the C-terminal motif to homogeneity. (B) Electrophoretic mobility shift assay. The concentration of DmrcUDE Gly112–Glu355 segment used in the experiment is given at the top of the lanes (µg·mL−1). Uracil-DNA plasmid, 20 µg·mL−1, was used in all mixtures. (C–E) Truncated UDE lacking motif 1A retains uracil-DNA-degrading activity. (C) Uracil-DNA or control DNA linearized plasmid was incubated for the indicated time periods with truncated DmrcUDE (Gly112–Glu355 segment). Note degradation (as well as shift) of the uracil-containing DNA plasmid substrate. (D, E) Activities of full-length UDE and Gly112–Glu355 truncated DmrcUDE constructs were compared using uracil-containing fluorescently labeled synthetic double-stranded (ds) and single-stranded (ss) oligonucleotide substrates (incubation times are indicated). Note the specific degradation product very close to the 31mer standard position, indicating that cleavage of the oligonucleotide only occurred at the uracil-containing position. The catalytic activity of the truncated enzyme is still present, but is detectable only on single-stranded substrate.
To clearly identify the cleavage site of the UDE protein and its truncated form on uracil-containing DNA substrate, we performed cleavage experiments using synthetic 60mer single-stranded and double-stranded oligonucleotides, containing one single uracil moiety in one of the strands, at the 32nd position. The uracil-containing strand was labeled with a fluorescent dye to aid visualization of the reaction (Fig. 4D,E).
Quaternary protein structure of full-length and truncated proteins
To determine whether the absence of the N-terminus has any effect on the quaternary structure organization of UDE, the native molecular masses for the full-length protein and the C-terminal fragment were determined by analytical gel filtration. The full-length protein eluted at a position corresponding to 52 kDa, which is somewhat larger than the full-length monomer calculated molecular mass of 41.446 kDa. This difference may indicate partial rapid equilibrium dimerization, and/or the anomalous gel permeation behavior may suggest that the proteins contain significant amounts of natively unfolded, highly flexible segments. To check this suggestion, we performed an in silico analysis using several servers for sequence-based prediction of structural disorder [22–24]. The results are shown in Fig. 5, and indicate that the different predictors suggest, in agreement, considerably high flexibility at the N-terminus and C-terminus, as well as in the region between motifs 1A and 1B. Interestingly, the C-terminal Gly112–Glu355 fragment eluted from the gel filtration column at practically the same position as observed for the full-length UDE, corresponding to 52 kDa. As the calculated molecular mass of the monomeric Gly112–Glu355 fragment is 28 kDa, the elution profile strongly suggests that this fragment forms a dimer.
[Figure 5 axes: disorder probability against number of amino acids (0–350); traces: RONN, IUPred and DISOPRED.]
Fig. 5. Disorder profile of DmrcUDE. The plot shows sequence position against probability of disorder. Segments of the sequence at the N-terminus and C-terminus and between motif 1A and motif 1B were classified as disordered by three predictor programs (IUPRED, RONN, and DISOPRED).
Analytical ultracentrifugation was also applied to corroborate the results from the gel filtration studies. The sedimentation equilibrium technique is reported to be optimal for determining native molecular masses [25]. In fact, our results with full-length DmrcUDE indicate that the determined molecular mass was 42.8 ± 2 kDa, in very close agreement with the mass calculated from the amino acid sequence (Fig. 6). For the truncated Gly112–Glu355 construct, the determined native molecular mass was 49 ± 1.2 kDa, corresponding rather closely to a dimer of the truncated segment (for which the calculated masses are 28 kDa for the monomer and 56 kDa for the dimer). These results, in agreement with the gel filtration data, argue for a native monomer of the full-length protein and a native dimer for the truncated construct.
[Figure 6 axes: top panel, A280 against radius (cm), 6.95–7.20; bottom panel, residuals against radius; inset, c(s) against sedimentation coefficient (S), 0–14.]
Fig. 6. Determination of UDE oligomer status by analytical ultracentrifugation. Top panel: sedimentation equilibrium gradients of 0.53 mg·mL−1 for full-length DmrcUDE (○) and 0.8 mg·mL−1 for DmrcUDE Gly112–Glu355 (□) at 13 400 g as described in Experimental procedures. The solid line shows the fit of the experimental data to single ideal species. Bottom panel: residual distribution as a function of the sedimentation distance (this plot corresponds to the difference between the experimental data and the fitted data for each point). Inset panel: sedimentation coefficient distributions of full-length DmrcUDE (solid line) and DmrcUDE Gly112–Glu355 (dashed line).
Sedimentation velocity experiments revealed that full-length DmrcUDE has a main sedimenting species (82% of the loading concentration) with a standard sedimentation value of 2.6S ± 0.1S, which, together with the sedimentation equilibrium data, is compatible with a protein monomer whose hydrodynamic behavior deviates from that expected for a globular species [calculated frictional ratio (f/f0) = 1.6]; the rest of the protein sediments as faster oligomeric species. The truncated Gly112–Glu355 protein construct showed significant polydispersity, with main peaks at 2.5S, 4.3S, and 6.0S, representing approximately 70%, 20%, and 6%, respectively, of the loading concentration. The 2.5S peak is compatible with a globular protein monomer (f/f0 = 1.3). These data argue for potential monomer self-association into dimers and higher-order oligomers.
Structure prediction of UDE reveals a pseudosymmetrical arrangement of two α-helical bundles
For structural prediction, the DmUDE full-length sequence was submitted to the GENESILICO metaserver [26], which is the gateway providing a unified interface to several servers for secondary and tertiary structure predictions. The analysis of predictions of domain composition suggested that UDE contains an N-terminal helical region of approximately 30 residues and at least three structural domains corresponding to motifs 1A and 1B, and the C-terminus, encompassing motifs 2, 3 and 4. The C-terminus of 40–50 residues and the loop connecting motifs 1A and 1B (between residues 109 and 137) are predicted to be mostly disordered. All three domains are predicted to be mainly helical, although the secondary structure predictions for the third domain were uncertain, as there was no agreement between alternative servers.
The fold recognition analysis did not reveal any confident matches with known protein structures, suggesting that the UDE 3D structure may exhibit a novel fold. Therefore, to predict at least partially the tertiary structure of UDE, we performed de novo modeling of the region encompassing motifs 1A and 1B, using the ROSETTA program [27]. In total, about 500 000 different models (also known as decoys) were generated, and 10% of the lowest-energy structures were clustered on the basis of their similarity. The representatives of the best clusters were refined with the ROSETTA full atom refinement protocol, and scored with the model quality assessment programs (MQAPs) PROQ [28] and METAMQAP [29]. Evaluation of the largest clusters revealed that both motifs 1A and 1B comprise similar three-helical bundles, with the same topology and relative orientation of the helices. Nevertheless, the top clusters differed in relative orientation of the two helical bundles to each other (data not shown). Among these clusters, one single cluster contained the specific topology that exhibited pseudosymmetrical orientation of the two motifs. Importantly, members of this cluster exhibited low energy levels and were well scored by MQAPs (PROQ: predicted LGscore in the range 1.2–3.2; METAMQAP: predicted rmsd in the range 3–4.2 Å, for the five lowest-energy representatives of the cluster), which indicates a high probability that they resemble the currently unknown native structure. Figure 7 depicts the predicted model in several different orientations. The two homologous motifs (1A and 1B) form a four-helix bundle interaction surface (Fig. 7A,B). On the surface of the model, a well-conserved, positively charged surface is well defined. This may serve as the nucleic acid-binding surface, in agreement with the limited proteolysis data.
Estimation of secondary structural elements by CD spectroscopy
To verify structural predictions, CD spectroscopy measurements were performed, as CD spectra in the far-UV wavelength (190–240 nm) range are very indicative of different secondary structural elements [30]. Spectra of the intact protein and of the C-terminal fragment Gly112–Glu355 showed double negative maxima at 208 and 222 nm, which are characteristic for the presence of α-helices (Fig. 8). Quantitative evaluation of the spectral data was performed with K2D and SELCON [24,31,32]. The estimated percentages of protein secondary structures from CD spectra reveal 37% α-helices and 18–26% β-structure.
Discussion
The potential signaling role of deoxyuridine moieties in genomes of pupating insects was first suggested by Deutsch et al. [16], on the basis of the lack of UDG activity in these insects. The hypothesis stating that uracil-DNA might be present transiently in larval stages and that its degradation at the end of larval stages may contribute to cell death during metamorphosis was much debated, owing to independent findings from several laboratories showing the presence of UDG activity in some developmental stages of Drosophila [33–37]. This debate was resolved by the fully annotated Drosophila genome, which clearly indicated the lack of the major UDG gene ung but the presence of several other genes that encode catalytically much less efficient UDGs. The absence of dUTPase in larval stages [14] and our recent discovery of the strictly regulated UDE [15] reinforced the hypothesis on the possible role of uracil-DNA in Drosophila and suggested a
M. Puka´ncsik et al.
Protein function preserved in a truncated isoform
Fig. 7. Structural model of DmUDE duplication fragment. Structures are shown in two views: front (upper panel) and top (bottom panel). (A) Cartoon representation. Duplicated motifs 1A and 1B are colored green and orange, respectively, and the nonconserved linker is colored gray. Peptide bonds protected from proteolytic cleavage on DNA binding are colored blue. The peptide bond between residues 104 and 105, cleaved only on DNA binding, is colored red. Note that the duplicated fragments are only approximately symmetrical, as the model is of low resolution and the local conformation of the backbone is uncertain. (B, C) Sequence conservation mapped onto the ribbon diagram (B) or the molecular surface (C) (conserved residues are colored orange and yellow; variable residues are colored green). (D) Electrostatic potential mapped onto the molecular surface (positively and negatively charged regions are colored blue and red, respectively). Arrows indicate the positively charged conserved patches that may accommodate DNA.
Panel titles: (A) cartoon model colored by motifs (blue and red, protease cleavage sites); (B) cartoon model colored by sequence conservation; (C) surface model colored by sequence conservation; (D) surface model colored by electrostatic potential (scale from –3 kT/e to +3 kT/e). Conservation color key: orange, strictly conserved; yellow, conserved; green, variable.
role for UDE in programmed cell death during metamorphosis. Functional analysis of UDE identified this protein as a novel uracil-recognizing factor [15], with no similarities to either UDGs [18] or the ExoIII/Mth212 nuclease [38]. Multiple sequence alignments of UDE homologs from all available pupating insect genomes indicated the presence of conserved motifs in most species, with the same distribution (Fig. 1).

Fig. 8. CD spectra of intact UDE (solid line) and C-terminal fragment (dashed line) confirm the presence of α-helices. MRE, mean residue molar ellipticity.
Axes: wavelength (nm), 200–260; MRE Θ (deg·cm²·dmol⁻¹), –10 000 to 6000. Curves: intact DmrcUDE and G112–E355 DmrcUDE.
FEBS Journal 277 (2010) 1245–1259 © 2010 The Authors Journal compilation © 2010 FEBS

The UDE homolog in T. castaneum lacks one copy of the N-terminal duplicated first motif (Figs 1 and 2). TcUDE showed reactivity with the antiserum produced against DmrcUDE, suggesting that the truncated TcUDE isoform is a well-folded UDE-like protein. It was also observable on the blot that the physiological forms of the proteins from both Drosophila and Tribolium extracts were detected at much higher electrophoretic positions than expected from the calculated molecular mass values: molecular masses estimated
Fig. 9. Structural models of DmUDE pseudodimer (A) and TcUDE dimer (B). Structures are shown in cartoon representation and colored by motif (motif 1A in DmUDE and motif 1 in TcUDE, dark red; motif 1B, dark gray; nonconserved segments, light gray). Residues 1–11 of TcUDE are not shown (the conformation of this fragment is very uncertain). C-terminal parts corresponding to motifs 2, 3 and 4 are shown schematically only. (C) Alignment between motif 1 residues for DmUDE and TcUDE. Identical and conserved residues are colored red and green, respectively. The helical prediction is indicated. Note the numerous conserved hydrophobic and polar residues that may form the dimerization surface.
Panel labels: DmUDE (motif 1A, motif 1B, motifs 2, 3, 4); TcUDE (motif 1, motifs 2, 3, 4).
Fig. 10. Flowchart scheme of bioinformatics and experimental approaches.
Flowchart boxes: Theoretical analysis; Experimental analysis; UDE sequence; BLAST; T. castaneum predicted protein product; multiple sequence alignment; secondary structure prediction; domain organization; modeling; identification of conserved surface patches; mapping of electrostatic potential; prediction of DNA-binding site; Western blotting; circular dichroism; analytical ultracentrifugation; limited proteolysis (trypsin, chymotrypsin, Asp-N endoproteinase); hydroxylamine cleavage; peptide identification by MS; analysis of C-terminal fragment; DNA cleavage assay; DNA binding by EMSA; quaternary structure by gel filtration.
from the electrophoretic experiment are 53.3 and 41.7 kDa for DmUDE and TcUDE homologs, respectively, whereas the sequence-based theoretical masses are 39.9 and 28 kDa. Recombinant DmrcUDE, expressed in E. coli, did not show such a large deviation from its expected position (the calculated molecular mass is 41.446 kDa, and the electrophoretic estimation is 44.2 kDa). The large shift of the physiological samples to the higher apparent molecular mass positions on SDS/PAGE may be indicative of some post-translational modification, the identification of which is in progress.

Limited proteolysis experiments indicated that DNA binding may occur along the conserved motifs 1A and 1B, as binding induces significant protection against multiple proteases at peptide bonds scattered throughout these segments (Fig. 3). Secondary structure prediction revealed that the duplicated fragments at the N-terminal end of UDE are mainly helical, and this observation has also been confirmed by CD measurements (Fig. 8). According to the predicted structural model of the fragment encompassing motifs 1A and 1B, the duplicated motifs together form a conserved helical bundle (Fig. 7A,B). The pseudodimer contains a large conserved segment on the surface, composed of two positively charged patches separated by a small region of negative potential (Fig. 7C,D, arrows). These patches may correspond to the DNA-binding site and/or catalytic site. However, on the basis of this predicted model, the DNA-binding mode and catalytic site residues cannot be confidently predicted. Interestingly, many protease cleavage sites that are protected after DNA binding are located on the opposite, nonconserved and negatively charged, side of the structure (Fig. 7A). Therefore, these sites are not likely to be directly sterically protected from cleavage by bound DNA. Instead, they are probably localized in a region that is partially flexible or disordered in the absence of the DNA.

The primary structure of TcUDE implied that the lack of one copy of motif 1 does not necessarily perturb the formation of a functional protein. This hypothesis was confirmed by producing the respective truncated isoform with chemical cleavage from DmrcUDE (Fig. 4). We therefore conclude that the physiological form of TcUDE could have the same unique function and the same putative physiological role. Native molecular mass estimation by gel filtration and analytical ultracentrifugation indicated that the truncated DmrcUDE, representing TcUDE, forms a homodimer (Fig. 6). A structural model for this homodimer is shown in Fig. 9. The two homologous motifs, 1A and 1B, in the DmrcUDE monomer may lead to the formation of a partial pseudodimer, and this interaction pattern may be preserved in the suggested TcUDE dimer (Fig. 9). The predicted model could not provide direct assessment of the pattern of interaction between the two copies of motif 1 (1A/1B). Nevertheless, numerous hydrophobic interactions are very probable between the conserved apolar residues, and hydrogen bonds can also easily form between the polar side chains within the three-helical bundles. In the dimerized modules, the nucleic acid-binding surface can also be formed in a manner very similar to that in the full-length UDE proteins.

Conclusions

The significance of UDE is two-fold: (a) it may be developed into a versatile molecular biotechnological tool [39]; and (b) its targeting may yield species-specific insecticides to be used against, for example, malaria mosquitoes. Here, we employed a multidisciplinary set of theoretical and experimental approaches (schematically described in Fig. 10) to reveal structural and functional characteristics. The present data provide insights into the domain structure and nucleic acid-binding site of this novel DNA-degrading protein in the context of sequence motifs that have previously not been described in nucleases or uracil-recognition proteins.

Experimental procedures

DNA cloning and recombinant protein expression

Recombinant His-tagged UDE corresponding to the DmUDE (Q961C4) sequence (DmrcUDE) was expressed and purified as described previously [15]. The truncated Gly112–Glu355 construct of DmrcUDE corresponding to the T. castaneum UDE homolog sequence was generated from pET–HisUDE with the following primers: 5′-GAG ATA TAC ATA TGG GCG GAG GGG CGT CCA GCA AG-3′ and 5′-AAG CTT GAG CTC GAG CTC CTC CCT CTT CTT CTT CC-3′. The DNA fragment was cloned into pET22b (Novagen, Merck, Darmstadt, Germany), using NdeI and XhoI sites. The recombinant construct included a His6 tag and a linker segment at the C-terminus.

Western blotting

Western blotting was performed as described in [15], using anti-DmUDE serum at 1 : 180 000 dilution as primary antibody, and peroxidase-labeled secondary antibody. Extracts from D. melanogaster and T. castaneum larvae were prepared with the addition of protease inhibitor cocktail (Sigma-Aldrich, Budapest, Hungary). The same amount of total protein from each extract was loaded on SDS/PAGE gels. This was confirmed by running the SDS/PAGE gels in duplicate, and using one gel for Coomassie protein quantification by laser densitometry, and the other gel for blotting. Densitometry data indicated that the protein content in the lanes differed by less than 4%. Blot results were detected by enhanced chemiluminescence.
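The size of the electrophoretic shift discussed for the Western blot can be made explicit with a few lines of arithmetic. The sketch below only restates the masses quoted in the text (apparent SDS/PAGE mass versus sequence-based mass, in kDa); the dictionary labels and the helper name `relative_shift` are illustrative choices, not part of the original analysis.

```python
# Apparent (SDS/PAGE) and sequence-based masses in kDa, as quoted in the text.
# The labels are illustrative; only the numbers come from the paper.
MASSES_KDA = {
    "DmUDE (larval extract)": (53.3, 39.9),
    "TcUDE (larval extract)": (41.7, 28.0),
    "DmrcUDE (recombinant)": (44.2, 41.446),
}


def relative_shift(apparent_kda: float, theoretical_kda: float) -> float:
    """Return the apparent-mass excess as a percentage of the theoretical mass."""
    return (apparent_kda - theoretical_kda) / theoretical_kda * 100.0


SHIFTS = {
    name: relative_shift(apparent, theoretical)
    for name, (apparent, theoretical) in MASSES_KDA.items()
}

if __name__ == "__main__":
    for name, shift in SHIFTS.items():
        print(f"{name}: apparent mass exceeds theoretical mass by {shift:.1f}%")
```

Under these numbers the physiological samples run roughly a third to a half heavier than predicted, whereas the recombinant protein deviates by only a few percent, which is the contrast the text attributes to a possible post-translational modification.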
Limited proteolysis

DmrcUDE at 0.5 mg·mL⁻¹ was incubated with 50 ng·mL⁻¹ trypsin (Sigma-Aldrich), or 1.25 µg·mL⁻¹ chymotrypsin (EMP Biotech GmbH, Berlin, Germany), or 2.5 µg·mL⁻¹ Asp-N endoprotease (Sigma-Aldrich), in the absence or in the presence of 0.5 mg·mL⁻¹ uracil-DNA plasmid (prepared in a dut⁻ung⁻ K12 CJ236 E. coli strain [15]) in 20 mM Hepes (pH 7.5) containing 150 mM KCl and 1 mM dithiothreitol, or 50 mM Tris/HCl (pH 8.0) containing 1 mM CaCl2, or 100 mM Tris/HCl (pH 8.5) containing 5 mM MgCl2 (for trypsin, chymotrypsin or Asp-N endoprotease digestions, respectively). Reactions were run at room temperature and terminated after different time intervals by the addition of 1 mM phenylmethanesulfonyl fluoride (for trypsin and chymotrypsin digestions), or by addition of 5 mM EDTA and immediate freezing (for Asp-N endoprotease digestion).

Hydroxylamine digestion

Hydroxylamine is a chemical reagent that specifically induces peptide bond cleavage at the Asn–Gly bond [21]. There is only one such site in UDE, between residues 111 and 112. UDE at 2 mg·mL⁻¹ was incubated overnight with 2 M hydroxylamine in 0.2 M NH4HCO3 at 37 °C. The two products (N-terminal and C-terminal fragments) were dialyzed in 25 mM Hepes (pH 7.5) containing 150 mM KCl and 1 mM protease inhibitor cocktail, and this was followed by separation on Ni2+–nitrilotriacetic acid resin. The C-terminal Gly112–Glu355 fragment was stored in 25 mM Hepes (pH 7.5) containing 150 mM KCl and protease inhibitor cocktail, according to the manufacturer’s suggestion (Sigma-Aldrich).

MS

Analysis of the limited proteolysis fragments was performed either without fractionation or after 1D SDS/PAGE separation. The unfractionated fragments were analyzed on a Bruker Reflex III MALDI-TOF mass spectrometer in a sinapinic acid matrix in positive linear mode. SDS/PAGE-separated fragments were in-gel digested by trypsin, and the digests were analyzed by LC-MS/MS analysis as in [40–42].

Catalytic assay

For plasmid substrates, uracil-containing plasmid DNA was prepared by amplification of normal plasmid DNA (pSUPERIOR-puro; Invitrogen, Csertex, Budapest, Hungary) in the dut⁻ung⁻ K12 CJ236 E. coli strain [15]. Control plasmid was prepared in the XL1Blue E. coli strain. Plasmids were purified with a Qiagen plasmid isolation kit, and linearized with NotI restriction endonuclease. DNA, 50 µg·mL⁻¹, was incubated with 50 µg·mL⁻¹ UDE C-terminal fragment. The nuclease assay was performed in 25 mM Tris/HCl (pH 7.5), also containing 0.1 mg·mL⁻¹ BSA and either 1 mM MgCl2 or 1 mM EDTA at 37 °C. At given reaction times, aliquots were withdrawn and incubated at 65 °C for 15 min (for Mg2+-containing reaction mixtures); or at room temperature in the presence of 60 mM NaOH for 15 min (for EDTA-containing reaction mixtures). This treatment resulted in cleavage of abasic sites. Products were detected by standard ethidium bromide staining after agarose gel electrophoresis.

Alternatively, assays were also run using synthetic uracil-containing single-stranded and double-stranded oligonucleotides (purchased from Eurofins MWG Operon, Ebersberg, Germany). The uracil-containing oligonucleotide was labeled at the 5′-end with Cye3 fluorescent dye, and contained one single uracil moiety at the 32nd position. Its complementary strand (to be used for constructing the double-stranded substrate) did not contain either uracil or fluorescent label.

The uracil-containing oligonucleotide labeled with Cye3 was 5′-CTC GCA AAT GAA CTG GGC GAT GCG GTC GCA CUA CTT CAC CTC GAA ATC AAC ATC TGA GTG-3′ (with the uracil position underlined).

The complementary oligonucleotide was 5′-CAC TCA GAT GTT GAT TTC GAG GTG AAG TAG TGC GAC CGC ATC GCC CAG TTC ATT TGC GAG-3′ (with the adenine position opposite to uracil in the double-stranded oligonucleotide underlined).

For preparation of double-stranded substrates, equal amounts of uracil-containing oligonucleotide and its complementary strand were incubated at 95 °C for 5 min. For the assay, 25 pmol of single-stranded or double-stranded oligonucleotides was incubated with 50 µg·mL⁻¹ full-length or truncated UDE in a final volume of 10 µL, in 25 mM Tris/HCl (pH 7.5), also containing 0.1 mg·mL⁻¹ BSA and 1 mM EDTA, at 37 °C. At given reaction times, reaction mixtures were run on 10% Tris/borate/EDTA/PAGE gel. Products were visualized under UV light.

The activity assays performed using the Gly112–Glu355 construct, either produced by cloning or by hydroxylamine digestion, gave very similar results.

Electrophoretic mobility shift assay of DNA binding

The protein concentration used is listed in micrograms per milliliter at the top of the figure. Plasmid DNA, 20 µg·mL⁻¹, was used in all mixtures. The buffer was 25 mM Tris/HCl (pH 7.5), also containing 0.1 mg·mL⁻¹ BSA and 1 mM EDTA. Protein and DNA were mixed, and the mixtures were loaded on agarose gel.
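The two oligonucleotide sequences given for the catalytic assay can be checked for internal consistency, which is useful because the underlining that marked the uracil and its opposing adenine does not survive in plain text. The sketch below copies both sequences as printed and confirms that they are reverse complements, that the single uracil sits at position 32, and which position of the complementary strand faces it. The helper name `reverse_complement` and the variable names are illustrative.

```python
# Watson-Crick partner of each base; uracil pairs with adenine, as thymine does.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(seq))


# Sequences copied from the text (5' to 3'), with the spacing removed.
URACIL_STRAND = (
    "CTC GCA AAT GAA CTG GGC GAT GCG GTC GCA CUA CTT CAC CTC "
    "GAA ATC AAC ATC TGA GTG"
).replace(" ", "")
COMPLEMENT_STRAND = (
    "CAC TCA GAT GTT GAT TTC GAG GTG AAG TAG TGC GAC CGC ATC "
    "GCC CAG TTC ATT TGC GAG"
).replace(" ", "")

# 1-based position of the single uracil in the labeled strand.
uracil_position = URACIL_STRAND.index("U") + 1
# 1-based position (counted 5' to 3') of the base facing it on the other strand.
opposite_position = len(URACIL_STRAND) - uracil_position + 1
strands_pair = reverse_complement(URACIL_STRAND) == COMPLEMENT_STRAND

if __name__ == "__main__":
    print("length:", len(URACIL_STRAND))
    print("uracil position:", uracil_position)
    print("opposing base:", COMPLEMENT_STRAND[opposite_position - 1],
          "at position", opposite_position)
    print("strands fully complementary:", strands_pair)
```

Both strands are 60 nucleotides long, so the adenine facing the uracil is the 29th base of the complementary strand when read 5′ to 3′.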
Analytical gel filtration analysis

Analytical gel filtration was conducted on a Superdex 200HR column calibrated with BSA, ovalbumin, chymotrypsin, and RNase (molecular masses of 67, 43, 25 and 13.7 kDa, respectively). Calibrating proteins or UDE samples were applied in a total volume of 500 µL, at a concentration of 1–6 mg·mL⁻¹.

Analytical ultracentrifugation analysis

An Optima XL-A analytical ultracentrifuge (Beckman-Coulter, Palo Alto, CA, USA) was used to perform the analytical ultracentrifugation experiments. Detection was performed by means of a UV–visible absorbance detection system. Experiments were conducted at 20 °C, using an AnTi50 eight-hole rotor and epon–charcoal standard double-sector centerpieces (12 mm optical path). Absorbance scans were taken at the appropriate wavelength (280 nm). Proteins were used at 0.53 mg·mL⁻¹ for full-length DmrcUDE and 0.8 mg·mL⁻¹ for the Gly112–Glu355 construct. Sedimentation velocity was determined using 400 µL samples, and the selected speed was 16 800 g. Differential sedimentation coefficient distributions, c(s), were calculated by least-squares boundary modeling of sedimentation velocity data using the program sedfit [25,43]. From this analysis, the experimental sedimentation coefficients were corrected for solvent composition and temperature with the program sednterp to obtain the corresponding standard sedimentation values (Fig. 6, inset). Short-column (85 µL) sedimentation equilibrium runs were performed at multiple speeds (7900, 13 400 and 20 260 g). After the equilibrium scans, a high-speed centrifugation run (140 000 g) was performed to estimate the corresponding baseline offsets. Weight-average buoyant molecular masses were determined by fitting a single-species model to the experimental data (Fig. 6), using the heteroanalysis program [44]. The molecular masses of proteins DmrcUDE and DmrcUDE Gly112–Glu355 were determined from the experimental buoyant values, using 0.737 and 0.734 mL·g⁻¹ as their partial specific volumes (calculated from the amino acid composition, using sednterp [45]).

CD spectroscopy

Far-UV CD spectra (190–240 nm) were recorded on a JASCO 720 spectropolarimeter, using 1 mm pathlength cuvettes thermostatted at 25 °C and a Neslab RTE-100 computer-controlled thermostat. Full-length DmrcUDE and Gly112–Glu355 DmrcUDE fragment at 0.2 mg·mL⁻¹ were measured in 20 mM potassium phosphate buffer (pH 7.5). Three scans of every spectrum were averaged. Spectral data processing was performed with the built-in jasco software of the spectropolarimeter. Far-UV CD spectra were processed by k2d and selcon to estimate the fraction of secondary structural elements.

Bioinformatics analyses

Sequences of putatively homologous UDE proteins were identified by the tblastp program and genomic blast at the NCBI, used for similarity searches. A multiple sequence alignment of UDE proteins was calculated by clustalw [46]. Prediction of domain boundaries, secondary structure and fold recognition were conducted via the genesilico metaserver [26], which is the gateway providing a unified interface to several servers for secondary and tertiary structure predictions. Independent runs were performed for the full-length UDE sequence (NCBI GI number: 28572066), a variant without the N-terminus (amino acids 42–355), a region encompassing motifs 1A and 1B (amino acids 42–205) as well as both motifs separately (amino acids 42–111 or 129–198), and the C-terminal region alone, without the disordered C-terminus (amino acids 210–320).

Structural modeling was performed with rosetta, using a standard low-resolution ab initio procedure followed by a full atom refinement. In order to explore broader conformational space, several independent folding simulations were performed for UDE from D. melanogaster (GI: 28572066; amino acids 41–198) and its two homologs from Anopheles gambiae (GI: 58377038; amino acids 3–160) and Bombyx mori [the protein sequence was reconstructed from two contigs using NCBI ORF prediction on contig 6156 (GI: 46642882) and contig 413277 (GI: 46731897); amino acids 53–209]. This protocol is similar to the general de novo rosetta protocol used in casp6 [47]. Here, in the first low-resolution stage, about 100 000 decoys (i.e. preliminary models) were generated for each homolog. Additionally, separate simulations were run with options promoting more compact structures. Decoy sets from all simulations were clustered independently using a quality threshold algorithm, in such a way as to obtain the biggest cluster of minimal size 25 and a maximal rmsd threshold of 5 Å. Then, the centroid decoys (i.e. cluster members closest to the average structure of the cluster) and the five lowest-energy decoys from each cluster of size greater than 10 members were selected for the full atom refinement. Next, the refined structures of homologs were used as templates for modeling the corresponding set of structures of UDE from D. melanogaster. All resulting structures were scored with proq [28] and metamqap [29], and the final model was selected on the basis of the scores and evaluation of the approximate pseudosymmetry between duplicated fragments.

The electrostatic potential was calculated using apbs [48] and mapped on the molecular surface with pymol [49].
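The decoy clustering step in the structural modeling protocol relies on a quality threshold (QT) algorithm with an rmsd cutoff. The following is a minimal sketch of the general QT idea on a precomputed distance matrix, not the authors' implementation: the function name `qt_cluster`, the toy one-dimensional "decoys", and the small `min_size` value are all assumptions made for illustration (the paper used a 5 Å rmsd threshold and a minimal cluster size of 25 on rosetta decoys).

```python
def qt_cluster(dist, threshold, min_size):
    """Greedy quality-threshold clustering on a symmetric distance matrix.

    Repeatedly grows one candidate cluster per remaining point, keeping the
    cluster diameter at or below `threshold`, extracts the largest candidate,
    and stops when the largest candidate has fewer than `min_size` members.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best = []
        for seed in sorted(remaining):
            cluster = [seed]
            candidates = remaining - {seed}

            def farthest_member(j):
                # Largest distance from point j to any current cluster member.
                return max(dist[j][k] for k in cluster)

            while candidates:
                nxt = min(sorted(candidates), key=farthest_member)
                if farthest_member(nxt) > threshold:
                    break
                cluster.append(nxt)
                candidates.remove(nxt)
            if len(cluster) > len(best):
                best = cluster
        if len(best) < min_size:
            break
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


# Hypothetical stand-in for pairwise rmsd values: points on a line.
POINTS = [0.0, 1.0, 2.0, 10.0, 11.0, 30.0]
DIST = [[abs(a - b) for b in POINTS] for a in POINTS]
CLUSTERS = qt_cluster(DIST, threshold=5.0, min_size=2)

if __name__ == "__main__":
    print(CLUSTERS)
```

In a real application the distance matrix would hold pairwise rmsd values between decoys, and the outlying point that never reaches the minimal cluster size would simply be left unclustered, as the isolated sixth point is here.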
Acknowledgements

This work was supported by the following grants: the Hungarian Scientific Research Fund (OTKA K68229), Howard Hughes Medical Institutes (#55005628 and #55000342), Alexander von Humboldt Foundation, GVOP-3.2.1.-2004-05-0412/3.0, JÁP_TSZ_071128_TB_INTER from the National Office for Research and Technology of Hungary, FP6 STREP 012127, FP6 SPINE2c LSHG-CT-2006-031220, and TEACH-SG LSSG-CT-2007-037198 from the EU to B. G. Vértessy. A. Békési was supported by NKTH-OTKA H07-BEL74200. Research in the laboratory of J. M. Bujnicki has been supported by an FP6 grant from the European Union (‘DNA ENZYMES’ MRTN-CT-2005-019566) and by an NIH grant (1R01GM081680). J. Kosinski was a PhD student at the Postgraduate School of Molecular Medicine, Medical University of Warsaw, and had a START fellowship from the Foundation for Polish Science. Kromat Ltd is gratefully acknowledged for providing the use of an Agilent 1100 nanoLC-XCT Plus IonTrap system. C. Alfonso and G. Rivas are holders of grant BIO2008-04478-C03-03 from the Spanish Ministerio de Ciencia e Innovación.

References

1 Lindahl T (1993) Instability and decay of the primary structure of DNA. Nature 362, 709–715.
2 Krokan HE, Drablos F & Slupphaug G (2002) Uracil in DNA – occurrence, consequences and repair. Oncogene 21, 8935–8948.
3 Pearl LH & Savva R (1996) The problem with pyrimidines. Nat Struct Biol 3, 485–487.
4 Mosbaugh DW (1988) Purification and characterization of porcine liver DNA polymerase gamma: utilization of dUTP and dTTP during in vitro DNA synthesis. Nucleic Acids Res 16, 5645–5659.
5 Vertessy BG & Toth J (2008) Keeping uracil out of DNA: physiological role, structure and catalytic mechanism of dUTPases. Acc Chem Res 42, 97–106.
6 Krokan HE, Standal R & Slupphaug G (1997) DNA glycosylases in the base excision repair of DNA. Biochem J 325, 1–16.
7 Dogliotti E, Fortini P, Pascucci B & Parlanti E (2001) The mechanism of switching among multiple BER pathways. Prog Nucleic Acid Res Mol Biol 68, 3–27.
8 Kavli B, Sundheim O, Akbari M, Otterlei M, Nilsen H, Skorpen F, Aas PA, Hagen L, Krokan HE & Slupphaug G (2002) hUNG2 is the major repair enzyme for removal of uracil from U:A matches, U:G mismatches, and U in single-stranded DNA, with hSMUG1 as a broad specificity backup. J Biol Chem 277, 39926–39936.
9 Lari SU, Chen CY, Vertessy BG, Morre J & Bennett SE (2006) Quantitative determination of uracil residues in Escherichia coli DNA: contribution of ung, dug, and dut genes to uracil avoidance. DNA Repair (Amst) 5, 1407–1420.
10 Nilsen H, Rosewell I, Robins P, Skjelbred CF, Andersen S, Slupphaug G, Daly G, Krokan HE, Lindahl T & Barnes DE (2000) Uracil-DNA glycosylase (UNG)-deficient mice reveal a primary role of the enzyme during DNA replication. Mol Cell 5, 1059–1065.
11 Kavli B, Andersen S, Otterlei M, Liabakk NB, Imai K, Fischer A, Durandy A, Krokan HE & Slupphaug G (2005) B cells from hyper-IgM patients carrying UNG mutations lack ability to remove uracil from ssDNA and have elevated genomic uracil. J Exp Med 201, 2011–2021.
12 el-Hajj HH, Wang L & Weiss B (1992) Multiple mutant of Escherichia coli synthesizing virtually thymineless DNA during limited growth. J Bacteriol 174, 4450–4456.
13 Drysdale R (2003) The Drosophila melanogaster genome sequencing and annotation projects: a status report. Brief Funct Genomic Proteomic 2, 128–134.
14 Bekesi A, Zagyva I, Hunyadi-Gulyas E, Pongracz V, Kovari J, Nagy AO, Erdei A, Medzihradszky KF & Vertessy BG (2004) Developmental regulation of dUTPase in Drosophila melanogaster. J Biol Chem 279, 22362–22370.
15 Bekesi A, Pukancsik M, Muha V, Zagyva I, Leveles I, Hunyadi-Gulyas E, Klement E, Medzihradszky KF, Kele Z, Erdei A et al. (2007) A novel fruitfly protein under developmental control degrades uracil-DNA. Biochem Biophys Res Commun 355, 643–648.
16 Deutsch WA (1995) Why do pupating insects lack an activity for the repair of uracil-containing DNA? One explanation involves apoptosis. Insect Mol Biol 4, 1–5.
17 Dudley B, Hammond A & Deutsch WA (1992) The presence of uracil-DNA glycosylase in insects is dependent upon developmental complexity. J Biol Chem 267, 11964–11967.
18 Aravind L & Koonin EV (2000) The alpha/beta fold uracil DNA glycosylases: a common origin with diverse fates. Genome Biol 1, RESEARCH0007, doi:10.1186/gb-2000-1-4-research0007.
19 Richards S, Gibbs RA, Weinstock GM, Brown SJ, Denell R, Beeman RW, Gibbs R, Bucher G, Friedrich M, Grimmelikhuijzen CJ et al. (2008) The genome of the model beetle and pest Tribolium castaneum. Nature 452, 949–955.
20 Keil B (1992) Specificity of Proteolysis. Springer-Verlag, Berlin-Heidelberg-New York.
21 Bornstein P & Balian G (1977) Cleavage at Asn-Gly bonds with hydroxylamine. Methods Enzymol 47, 132–145.
22 Dosztanyi Z, Csizmok V, Tompa P & Simon I (2005) IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21, 3433–3434.
23 Yang ZR, Thomson R, McNeil P & Esnouf RM (2005) RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins. Bioinformatics 21, 3369–3376.
24 Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF & Jones DT (2004) Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 337, 635–645.
25 Schuck P, Perugini MA, Gonzales NR, Howlett GJ & Schubert D (2002) Size-distribution analysis of proteins by analytical ultracentrifugation: strategies and application to model systems. Biophys J 82, 1096–1111.
26 Kosinski J, Cymerman IA, Feder M, Kurowski MA, Sasin JM & Bujnicki JM (2003) A ‘FRankenstein’s monster’ approach to comparative modeling: merging the finest fragments of fold-recognition models and iterative model refinement aided by 3D structure evaluation. Proteins 53(Suppl 6), 369–379.
27 Mitrophanous K, Yoon S, Rohll J, Patil D, Wilkes F, Kim V, Kingsman S, Kingsman A & Mazarakis N (1999) Stable gene transfer to the nervous system using a non-primate lentiviral vector. Gene Ther 6, 1808–1818.
28 Wallner B, Fang H & Elofsson A (2003) Automatic consensus-based fold recognition using Pcons, ProQ, and Pmodeller. Proteins 53(Suppl 6), 534–541.
29 Kosinski J, Gajda MJ, Cymerman IA, Kurowski MA, Pawlowski M, Boniecki M, Obarska A, Papaj G, Sroczynska-Obuchowicz P, Tkaczuk KL et al. (2005) FRankenstein becomes a cyborg: the automatic recombination and realignment of fold recognition models in CASP6. Proteins 61(Suppl 7), 106–113.
30 Whitmore L & Wallace BA (2008) Protein secondary structure analyses from circular dichroism spectroscopy: methods and reference databases. Biopolymers 89, 392–400.
31 Andrade MA, Chacon P, Merelo JJ & Moran F (1993) Evaluation of secondary structure of proteins from UV circular dichroism spectra using an unsupervised learning neural network. Protein Eng 6, 383–390.
32 Deleage G & Geourjon C (1993) An interactive graphic program for calculating the secondary structure content of proteins from circular dichroism spectrum. Comput Appl Biosci 9, 197–199.
33 Deutsch WA & Spiering AL (1982) A new pathway expressed during a distinct stage of Drosophila development for the removal of dUMP residues in DNA. J Biol Chem 257, 3366–3368.
34 Deutsch WA (1987) Enzymatic studies of DNA repair in Drosophila melanogaster. Mutat Res 184, 209–215.
35 Morgan AR & Chlebek J (1989) Uracil-DNA glycosylase in insects. Drosophila and the locust. J Biol Chem 264, 9911–9914.
36 Green DA & Deutsch WA (1983) Repair of alkylated DNA: Drosophila have DNA methyltransferases but not DNA glycosylases. Mol Gen Genet 192, 322–325.
37 Breimer LH (1986) A DNA glycosylase for oxidized thymine residues in Drosophila melanogaster. Biochem Biophys Res Commun 134, 201–204.
38 Georg J, Schomacher L, Chong JP, Majernik AI, Raabe M, Urlaub H, Muller S, Ciirdaeva E, Kramer W & Fritz HJ (2006) The Methanothermobacter thermautotrophicus ExoIII homologue Mth212 is a DNA uridine endonuclease. Nucleic Acids Res 34, 5325–5336.
39 Békési A, Felföldi F, Pukáncsik M, Zagyva I & Vértessy GB (2008) USA Patent Application No. 11/160040.
40 Varga B, Barabas O, Kovari J, Toth J, Hunyadi-Gulyas E, Klement E, Medzihradszky KF, Tolgyesi F, Fidy J & Vertessy BG (2007) Active site closure facilitates juxtaposition of reactant atoms for initiation of catalysis by human dUTPase. FEBS Lett 581, 4783–4788.
41 Nemeth-Pongracz V, Barabas O, Fuxreiter M, Simon I, Pichova I, Rumlova M, Zabranska H, Svergun D, Petoukhov M, Harmat V et al. (2007) Flexible segments modulate co-folding of dUTPase and nucleocapsid proteins. Nucleic Acids Res 35, 495–505.
42 Dubrovay Z, Gaspari Z, Hunyadi-Gulyas E, Medzihradszky KF, Perczel A & Vertessy BG (2004) Multidimensional NMR identifies the conformational shift essential for catalytic competence in the 60-kDa Drosophila melanogaster dUTPase trimer. J Biol Chem 279, 17945–17950.
43 Schuck P (2000) Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and Lamm equation modeling. Biophys J 78, 1606–1619.
44 Cole JL (2004) Analysis of heterogeneous interactions. Methods Enzymol 384, 212–232.
45 Laue TM, Shah BD, Ridgeway TM & Pelletier SL (1992) Computer-aided interpretation of analytical sedimentation data for proteins. In Analytical Ultracentrifugation in Biochemistry and Polymer Science. pp. 90–125, Royal Society of Chemistry, Cambridge.
46 Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R et al. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948.
47 Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, Kaufman TC, Kellis M, Gelbart W, Iyer VN et al. (2007) Evolution of genes and genomes on the Drosophila phylogeny. Nature 450, 203–218.
48 Studebaker AW, Balendiran GK & Williams MV (2001) The herpesvirus encoded dUTPase as a potential chemotherapeutic target. Curr Protein Pept Sci 2, 371–379.
49 DeLano WL (2002) The PyMOL Molecular Graphics System. DeLano Scientific, San Carlos, CA.