A novel plant protein disulfide isomerase family
homologous to animal P5: molecular cloning and
characterization as a functional protein for folding of
soybean seed-storage proteins
Hiroyuki Wadahama1,*, Shinya Kamauchi1,*,†, Yumi Nakamoto2, Keito Nishizawa2, Masao Ishimoto2, Teruo Kawada1 and Reiko Urade1
1 Graduate School of Agriculture, Kyoto University, Uji, Japan
2 National Agricultural Research Center for Hokkaido Region, Sapporo, Japan
Secretory, organellar and membrane proteins are syn-
thesized and folded with the assistance of molecular
chaperones and other folding factors in the endoplas-
mic reticulum (ER). In many cases, protein folding in
the ER is accompanied by N-glycosylation and the for-
mation of disulfide bonds [1]. Formation of disulfide
bonds between correct pairs of cysteine residues in a
nascent polypeptide chain is thought to be catalyzed
Keywords
endoplasmic reticulum; protein disulfide
isomerase; soybean; storage protein;
unfolded protein response
Correspondence
R. Urade, Graduate School of Agriculture,
Kyoto University, Uji, Kyoto 611-0011, Japan
Fax: +81 774 38 3758
Tel: +81 774 38 3757
E-mail: urade@kais.kyoto-u.ac.jp
†Present address
Osaka Bioscience Institute, Suita, Japan
Database
The nucleotide sequence data for the cDNA
of GmPDIM and genomic GmPDIM have
been submitted to the DDBJ/EMBL/GenBank
databases under accession numbers
AB189994 and AB295118, respectively
*These authors contributed equally to this
article
(Received 11 October 2007, revised 18
November 2007, accepted 20 November
2007)
doi:10.1111/j.1742-4658.2007.06199.x
Protein disulfide isomerase is known to play important roles in the
folding of nascent polypeptides and in the formation of disulfide bonds in
the endoplasmic reticulum (ER). In this study, we cloned a gene of a novel
protein disulfide isomerase family from soybean leaf (Glycine max L.
Merrill cv. Jack) mRNA. The cDNA encodes a protein called GmPDIM. It is
composed of 438 amino acids, and its sequence and domain structure are
similar to those of animal P5. Recombinant GmPDIM expressed in
Escherichia coli displayed oxidative refolding activity on denatured RNase A.
The genomic sequence of GmPDIM was also cloned and sequenced.
Comparison of the soybean sequence with sequences from Arabidopsis thaliana
and Oryza sativa showed significant conservation of the exon/intron
structure. Consensus sequences within the promoters of the GmPDIM genes
contained a cis-acting regulatory element for the unfolded protein response,
and other regulatory motifs required for seed-specific expression. We
observed that GmPDIM was upregulated under ER-stress conditions and was
expressed ubiquitously in soybean tissues such as the cotyledon. It
localized to the lumen of the ER. Data from co-immunoprecipitation
experiments suggested that GmPDIM associated non-covalently
with proglycinin, a precursor of the seed-storage protein glycinin. In
addition, GmPDIM associated with the α′ subunit of β-conglycinin, a
seed-storage protein, in the presence of tunicamycin. These results suggest
that GmPDIM may play a role in the folding of storage proteins and may
function not only as a thiol-oxidoreductase, but also as a molecular chaperone.
Abbreviations
DSP, dithiobis[succinimidylpropionate]; ER, endoplasmic reticulum; ERSE, ER stress responsive element; PDI, protein disulfide isomerase.
FEBS Journal 275 (2008) 399–410 © 2007 The Authors Journal compilation © 2007 FEBS
by the protein disulfide isomerase (PDI) family of pro-
teins [2–4]. In humans, 17 genes of the PDI family
have been identified [5]. The physiological role of each
PDI protein and their interactions with each other and
with other ER-resident molecular chaperones have
been partially elucidated. P5, a member of the animal PDI family,
was first discovered in Chinese hamster [6]. P5 has
both thiol-oxidoreductase activity and chaperone activ-
ity [7,8]. In addition, roles for P5 other than the fold-
ing of nascent proteins have been reported in animal
cells. Zebrafish P5 is involved in the production of
midline-derived signals required to establish left/right
asymmetry [9]. In human tumor cells, cell-surface
P5 was required for shedding of the soluble major his-
tocompatibility complex class I-related ligand, resulting
in the promotion of tumor immune evasion [10]. In
plants, a set of 22 orthologs of known PDI-like pro-
teins was discovered using a genome-wide search of
Arabidopsis thaliana and these were separated into
10 phylogenetic groups [11]. Among these groups,
group V genes show structural similarities to ani-
mal P5. However, group V gene products in plant cells
have not been identified.
Large quantities of storage protein are synthesized
in the ER during seed development in soybean cotyle-
don cells [12]. Approximately 70% of seed-storage
proteins are composed of the two major globulins
glycinin and β-conglycinin. They are folded and
assembled into trimers in the ER, and then trans-
ported and deposited in the protein storage vacuoles
[13]. Glycinin is synthesized as a 60 kDa precursor
polypeptide and is proteolytically processed into
40 kDa acidic and 20 kDa basic subunits in the protein
storage vacuoles [14–16]. A1aB1b, a major glycinin,
possesses two intramolecular disulfide bonds,
Cys12–Cys45 and Cys88–Cys298. These disulfide
bonds are required for assembly into hexamers and
for the structural stability of the protein [17–19].
Thus, proper folding and disulfide bond formation are
important for the effective deposition of glycinin in
the vacuoles. ER-resident PDI proteins may play a
central role in this folding process. Previously, we
identified two novel PDI proteins belonging to
group IV, GmPDIS-1 and GmPDIS-2, and demon-
strated that GmPDIS-1 is associated with proglycinin
in the ER [20]. However, involvement of the other
PDI proteins in the folding of storage proteins
remains unclear.
In this study, we isolated cDNA clones and genomic
sequences encoding a soybean group V gene of the
PDI family. The tissue distribution and cellular locali-
zation of GmPDIM and changes in its expression dur-
ing seed development are described. In addition, our
data suggest that GmPDIM and proglycinin or β-conglycinin
associate during the course of the folding
process.
Results
cDNA cloning of GmPDIM
To clone the soybean ortholog of group V Arabidopsis
PDI-like2-2 or PDI-like2-3 [11], a BLAST search of the
Institute for Genomic Research Soybean Index was
performed using the nucleotide sequences of these
cDNAs. The tentative consensus sequence
BU926832 was obtained. Using primer sets designed
from this sequence, we cloned a cDNA from RNA
extracted from young soybean leaves by 3′-RACE and
5′-RACE. This cDNA encoded GmPDIM, a protein
of 438 amino acids (supplementary Fig. S1) containing
a putative N-terminal secretory signal sequence and a
C-terminal tetrapeptide, KDEL, which acts as a signal
for retention in the ER [21,22]. GmPDIM possesses
two tandem thioredoxin-like motifs, each containing a
CGHC active site. Arginine residues R126 and R255,
which are involved in the regulation of the active site
redox potential in human PDI [5,23], were conserved.
In addition, glutamic acid residues E58 and E186,
which have been suggested to facilitate the ‘escape’ of
the cysteine residue of the active site from a mixed
disulfide bond with substrate [5,23], were also conserved.
The amino acid sequences of GmPDIM and its
orthologs from other plant species were 80% similar,
excluding the putative N-terminal signal peptide.
The amino acid sequence identity between GmPDIM
and human P5 was 46%.
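As an illustration of how a percent identity figure of this kind can be obtained from a pairwise alignment, a minimal Python sketch follows. It assumes the two sequences have already been aligned (gaps written as "-") and counts identical residues over ungapped columns only. The alignment method and the denominator used in this study are not stated here, so both are assumptions, and the two fragments are hypothetical toy strings, not the GmPDIM or human P5 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy fragments, for illustration only.
toy_a = "ACGHC-KKLAP"
toy_b = "WCGHCQKALTP"
print(percent_identity(toy_a, toy_b))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different values for the same alignment, so the denominator should be reported alongside the figure.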
Recombinant GmPDIM was expressed in Escherichia
coli as a soluble protein and was purified by affinity-
column and gel-filtration chromatography (supplemen-
tary Fig. S2A). Recombinant GmPDIM had a CD
spectrum typical of a folded protein (supplementary
Fig. S3). The domain structure of GmPDIM was predicted
to be a linear sequence of three domains in an
a–a′–b arrangement, based on sequence homology to
conserved domains. Therefore, we subjected the recombinant
GmPDIM protein to limited proteolysis with either
trypsin or V8 protease to determine its domain
boundaries. The native recombinant protein was
digested to give smaller peptide fragments after treat-
ment with either protease. The sites of proteolytic
cleavage were determined to be Lys150 (K150) and
R255 by N-terminal sequencing of the trypsin peptide
fragments. The N-terminal sequences of other frag-
ments generated by protease digestion were AHHHHH
and corresponded to the N-terminal histidine tag of
the recombinant protein. We next determined the
C-terminal amino acid residues of the peptide frag-
ments by measuring their masses by MALDI-TOF
MS. Most cleavage sites resided in two narrow regions,
overlapping the putative boundary regions in
GmPDIM between a and a′, and a′ and b, respectively
(Fig. 1). These results show that GmPDIM has a linear
sequence of three domains in an a–a′–b pattern
similar to animal P5 [5].
We next determined the activity of recombinant
GmPDIM, which catalyzed oxidative refolding of the
reduced and denatured RNase A. The specific activity
of GmPDIM was 45 mmol RNase A·min⁻¹·mol⁻¹
(supplementary Fig. S2B). Although human P5
has molecular chaperone activity [7], no such activity
was detected with recombinant GmPDIM (data not
shown).
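The unit of this specific activity (mmol of RNase A refolded per minute per mol of enzyme) can be unpacked with a short calculation. The Python sketch below is illustrative only: the amount of refolded RNase A, the assay time and the amount of enzyme are hypothetical values chosen to give a result of the same magnitude, not measurements from this study, and the function name is an assumption.

```python
def specific_activity(refolded_nmol: float, minutes: float, enzyme_nmol: float) -> float:
    """Return activity in mmol substrate refolded per minute per mol enzyme."""
    if minutes <= 0 or enzyme_nmol <= 0:
        raise ValueError("assay time and enzyme amount must be positive")
    # nmol substrate per nmol enzyme is the same ratio as mol per mol.
    mol_per_min_per_mol = refolded_nmol / enzyme_nmol / minutes
    return 1000.0 * mol_per_min_per_mol  # convert mol to mmol


# Hypothetical assay: 0.9 nmol RNase A refolded in 10 min by 2 nmol enzyme.
print(specific_activity(0.9, 10.0, 2.0))
```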
Cloning of genomic sequences of GmPDIM
The genomic sequence encoding GmPDIM was cloned
and sequenced. Alignment and comparison with the
cDNA sequence showed that GmPDIM contains nine
exons (supplementary Fig. S4). The nucleotide sequence of
the ORF of the GmPDIM gene was identical to that
of the cDNA. Comparison of the soybean genomic
sequence of GmPDIM with those of A. thaliana (AGI
number At1g04980 and At2g32920) and Oryza sativa
(MOsDb number Os09g27830) identified significant
conservation in the exon/intron structure across these
species. Moreover, all introns contained the degenerate
plant branch-point consensus sequence (YTNAN)
upstream of the 3′ splice site [24].
We next analyzed the promoter region of GmPDIM,
2340 bp upstream of the translational initiation codon
ATG. A search of the database of plant promoters
(PLACE: http://www.dna.affrc.go.jp/PLACE/) using
the sequences upstream of the coding region of
GmPDIM as the query detected an ER stress responsive
element (ERSE; CCAAT-N9-CCACG) [25] and a
number of cis-acting regulatory elements involved in
the regulation of endosperm-specific genes (Table 1).
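A motif search of this kind can be sketched in a few lines of Python. The example below is not the PLACE implementation. It is a minimal sketch that converts an IUPAC consensus (such as CANNTG for the E-box or RTTTTTR for the SEF 4 motif) into a regular expression and scans both strands of an upstream sequence. The function names, the toy sequence and the coordinate convention (the last base before the ATG is position -1, and each hit is reported at its leftmost base on the + strand) are assumptions made for illustration.

```python
import re

# IUPAC codes needed for the motifs discussed here (a subset of the full code).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def scan(upstream: str, consensus: str):
    """Return (strand, distance_from_ATG, site) for every exact consensus match.

    `upstream` is the + strand sequence ending immediately before the ATG.
    """
    upstream = upstream.upper()
    length = len(upstream)
    hits = []
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # A lookahead is used so that overlapping sites are also reported.
        pattern = re.compile("(?=(" + to_regex(motif) + "))")
        for match in pattern.finditer(upstream):
            site = match.group(1)
            if strand == "-":
                site = reverse_complement(site)  # read the site on the - strand
            hits.append((strand, match.start() - length, site))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical 18-bp upstream fragment, not the GmPDIM promoter.
toy_upstream = "TTCATGCAGGCAAATGCC"
print(scan(toy_upstream, "CATGCA"))  # RY repeat
print(scan(toy_upstream, "CANNTG"))  # E-box
```

Because CANNTG is its own reverse complement, every E-box site is reported once per strand at the same position. Exact matching of this kind would not report a site that deviates from the consensus at one base, such as the ERSE-like sequence CCAAT-catatattt-aCACG listed in Table 1, so a search intended to recover such sites would need to tolerate mismatches.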
Expression of GmPDIM in soybean tissues
We next prepared antiserum against recombinant
GmPDIM. Anti-GmPDIM serum reacted with
recombinant GmPDIM on western immunoblots
(Fig. 2A, lane 1), and also with two bands of 50 and
52 kDa in cotyledon cell extract (Fig. 2A, lane 2).
The intensity of these bands decreased when anti-
GmPDIM serum was pre-incubated with purified
recombinant GmPDIM (Fig. 2A, lanes 3–5), suggest-
ing that the antibodies specifically immunoreacted with
GmPDIM or a protein homologous to GmPDIM.
Further western immunoblot analyses indicated that
GmPDIM is expressed ubiquitously in roots, stems,
trifoliolate leaves, flowers and cotyledons (Fig. 2B).
The approximate quantity of this protein in leaves
decreased during leaf expansion.
Large amounts of seed-storage proteins are syn-
thesized and are translocated to the ER during the
maturation stage of embryogenesis. Previously, we
demonstrated that the synthesis of glycinin was initi-
ated when the seeds achieved a mass of 50 mg and
increased gradually until they grew to 300 mg. We also
demonstrated that the synthesis of β-conglycinin was
initiated when the seeds achieved a mass of 40 mg,
increased until the seeds grew to 70 mg, and then
decreased [20]. Under such conditions, the folding
machinery comprised of molecular chaperones and
other functional proteins must be strengthened in
response to the increased de novo synthesis of seed-
storage proteins. Therefore, we next measured the
mRNA and protein levels of GmPDIM using real-time
RT-PCR and western immunoblot, respectively. The relative
level of GmPDIM mRNA was higher in the early
stages of seed development and subsequently decreased
(Fig. 3A). The amount of GmPDIM protein was also
higher in the early stages, but decreased until the seed
grew to 100 mg. Expression of GmPDIM increased in
the late stage of seed development (Fig. 3B). These
results suggest that upregulation of the expression of
GmPDIM occurs at a time when the requirement for
molecular chaperones is high.
Fig. 1. Putative domain structure of GmPDIM. Schematic representation of cleavage sites in GmPDIM by limited proteolysis. The upper line
represents recombinant protein. The lower boxes indicate the domain boundaries predicted by an NCBI conserved domain search. The
arrows indicate cleavage sites. Black boxes in domains a and a′ represent the CGHC motif. SP, signal peptide.
Upregulation of GmPDIM by ER stress
Many ER-resident chaperones are upregulated by the
accumulation of unfolded protein in the ER (i.e.
ER stress) [26–29]. Because a sequence matching the
ERSE consensus was found within the promoter region of
GmPDIM, we next tested whether expression of
GmPDIM responded to ER stress. When ER stress
was induced by treatment with tunicamycin or
L-azetidine-2-carboxylic acid in soybean cotyledons,
GmPDIM mRNA increased (Fig. 4A,B). Upregulation
of GmPDIM mRNA was also detected by DNA
array analysis with a GeneChip (Affymetrix, Santa
Clara, CA, USA) designed from soybean expressed
sequence tags (data not shown). In addition, protein
levels of GmPDIM, BiP and calreticulin were also
increased in the cotyledons treated with tunicamycin
(Fig. 4C).
Table 1. Putative regulatory motifs found within the promoter sequences of GmPDIM.
Motif | Consensus sequence | Function | Strand | Distance from ATG | Sequence (a)
ERSE | CCAAT-N9-CCACG | Putative cis-acting element involved in unfolded protein response | − | −117 | CCAAT-catatattt-aCACG
−300CORE | TGTAAAG | Found upstream of the promoter from the B-hordein gene of barley and the alpha-gliadin, gamma-gliadin, and low molecular weight glutenin genes of wheat | − | −1073 | TGTAAAG
 | | | − | −1474 | TGTAAAG
DPBFcore Dc3 | ACACNNG | bZIP transcription factors, DPBF-1 and 2 (Dc3 promoter-binding factor-1 and 2) binding core sequence; found in the carrot (D.c.) Dc3 gene promoter; Dc3 expression is normally embryo-specific, and also can be induced by ABA | + | −95 | ACACacG
 | | | − | −1100 | ACACttG
 | | | + | −1470 | ACACaaG
E-box | CANNTG | E-box of napA storage-protein gene of Brassica napus. Sequence is also known as RRE (R response element). Conserved in many storage-protein gene promoters | + | −140 | CAaaTG
 | | | + | −1100 | CAagTG
 | | | + | −1596 | CAaaTG
 | | | + | −1632 | CAaaTG
 | | | − | −140 | CAaaTG
 | | | − | −1100 | CActTG
 | | | − | −1596 | CAttTG
 | | | − | −1632 | CAgtTG
GCN4 motif | TGAGTCA | cis-acting element required for endosperm-specific expression | − | −1874 | TGAGTCA
Prolamine box | TGCAAAG | cis-acting element involved in quantitative regulation of the GluB-1 gene | + | −23 | TGCAAAG
RY repeat | CATGCA | RY repeat found in RY G box (the complex containing the two RY repeats and the G-box) of napA gene in Brassica napus; required for seed-specific expression | + | −270 | CATGCA
SEF3 motif | AACCCA | Soybean consensus sequence found in the 5′-upstream region of β-conglycinin gene | + | −1509 | AACCCA
SEF 4 motif | RTTTTTR | Soybean consensus sequence found in the 5′-upstream region of β-conglycinin gene | + | −299 | gTTTTTa
 | | | + | −419 | gTTTTTa
 | | | + | −476 | aTTTTTa
 | | | + | −1031 | gTTTTTa
 | | | − | −372 | aTTTTTa
 | | | − | −1608 | gTTTTTa
 | | | − | −1656 | aTTTTTg
 | | | − | −2298 | gTTTTTa
(a) Conserved bases of the motifs are in capital letters.
GmPDIM is an ER luminal protein
GmPDIM has an N-terminal signal sequence for tar-
geting it to the ER, and a C-terminal ER-retention sig-
nal sequence KDEL. We performed a magnesium-shift
assay to confirm the localization of GmPDIM in the
rough ER. Microsomes were prepared from the cotyle-
dons and centrifuged through a sucrose gradient in the
presence of magnesium or EDTA. The buoyant density
of rough ER is decreased by dissociation of ribosomes
in the presence of EDTA. Fractions were collected
from the sucrose gradient and were analyzed by western
immunoblot. The peak of GmPDIM at a density
of 1.21 g·mL⁻¹ in the presence of magnesium was
shifted to fractions of lighter sucrose (1.16 g·mL⁻¹) in
the presence of EDTA (Fig. 5A), indicating that GmPDIM
localized in the rough ER. Next, microsomes
were purified from cells and treated with proteinase K
in the absence or presence of Triton X-100. GmPDIM
was resistant to protease treatment in the absence of
detergent (Fig. 5B, lane 3), but when the microsomal
membranes were first disrupted by Triton X-100, GmPDIM
was degraded (Fig. 5B, lane 4). These results
indicate that GmPDIM is an ER luminal protein.
Association of GmPDIM with proglycinin and
β-conglycinin α′ in the cotyledon
GmPDIM has oxidative folding activity in vitro and
localizes to the ER lumen of the cotyledon, suggesting
that it may function in the folding of glycinin [17].
Because nascent polypeptides and molecular chaper-
ones transiently associate with each other in the ER,
we next attempted to detect an interaction between
GmPDIM and proglycinin, which is translocated into
the lumen of the ER for folding. Because a transient
association between a chaperone and nascent polypep-
tide is generally unstable, immunoprecipitation experi-
ments were carried out after treatment with the protein
cross-linker dithiobis[succinimidylpropionate] (DSP).
GmPDIM was detected in the immunoprecipitate with
Fig. 2. Expression of GmPDIM in soybean tissues. (A) Purified
recombinant GmPDIM (20 ng) (lane 1) and proteins extracted from
the cotyledon (30 µg) (lanes 2–5) were analyzed by western immunoblot
with anti-GmPDIM serum (1 µL) treated without (lanes 1, 2)
or with 16 µg (lane 3), 80 µg (lane 4) or 400 µg (lane 5) purified
recombinant GmPDIM. (B) Thirty micrograms of protein extracted
from the cotyledon (80 mg bean) (lane 1), root (lane 2), stem
(lane 3), 3 cm leaf (lane 4), 6 cm leaf (lane 5), 9 cm leaf (lane 6)
and flower (lane 7) were analyzed by western immunoblot with
anti-GmPDIM serum.
Fig. 3. Expression of GmPDIM in soybean cotyledons during matu-
ration. (A) GmPDIM mRNA was quantified by real time RT-PCR.
Each value was standardized by dividing the value by that for actin
mRNA. Values are calculated as a percentage of the highest value
obtained during maturation. Data represent the mean ± SD for four
experiments. (B) Proteins (25 µg) extracted from cotyledons were
analyzed by western immunoblot with anti-GmPDIM serum.
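The standardization described for panel A of Fig. 3 (division by the actin mRNA value, then expression as a percentage of the highest value) can be written out as a short Python sketch. The function name and the numbers are hypothetical and serve only to show the arithmetic. They are not data from this study.

```python
def relative_expression(target, actin):
    """Normalize target mRNA values to actin, then scale to percent of the maximum."""
    if len(target) != len(actin):
        raise ValueError("target and actin need one value per sample")
    ratios = [t / a for t, a in zip(target, actin)]
    highest = max(ratios)
    return [100.0 * ratio / highest for ratio in ratios]


# Hypothetical quantities for three seed stages (arbitrary units).
toy_target = [8.0, 6.0, 2.0]
toy_actin = [2.0, 2.0, 1.0]
print(relative_expression(toy_target, toy_actin))
```

Scaling to the highest value makes stages comparable within one series, but it removes absolute abundance, so values from different genes or experiments scaled in this way cannot be compared directly.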