MINIREVIEW
Alternative splicing: role of pseudoexons in human disease
and potential therapeutic strategies
Ashish Dhir and Emanuele Buratti
International Centre for Genetic Engineering and Biotechnology (ICGEB), Trieste, Italy
Keywords
alternative splicing; antisense oligonucleotides; mRNA; pseudoexons; splicing therapy
Correspondence
E. Buratti, Padriciano 99, 34012 Trieste, Italy
Fax: +39 040 226555
Tel: +39 040 3757316
E-mail: buratti@icgeb.org
(Received 26 August 2009, revised 15 October 2009, accepted 5 November 2009)
doi:10.1111/j.1742-4658.2009.07520.x
What makes a nucleotide sequence an exon (or an intron) is a question that still lacks a satisfactory answer. Indeed, most eukaryotic genes are full of sequences that look like perfect exons, but which are nonetheless ignored by the splicing machinery (hence the name ‘pseudoexons’). The existence of these pseudoexons has been known since the earliest days of splicing research, but until recently the tendency has been to view them as an interesting, but rather rare, curiosity. In recent years, however, the importance of pseudoexons in regulating splicing processes has been steadily re-evaluated. Even more importantly, clinically oriented screening studies that search for splicing mutations are beginning to show that aberrant pseudoexon inclusion as a cause of human disease is more frequent than previously thought. Here we aim to provide a review of the mechanisms that lead to pseudoexon activation in human genes and of how the various cis- and trans-acting cellular factors regulate their inclusion. Moreover, we list the potential therapeutic approaches that are being tested with the aim of inhibiting their inclusion in the final mRNA molecules.
Abbreviations
3′ss, 3′ splice site; 5′ss, 5′ splice site; AON, antisense oligonucleotide; LINE, long interspersed elements; NMD, nonsense-mediated decay; PTB, polypyrimidine tract binding protein; SINE, short interspersed elements.
FEBS Journal 277 (2010) 841–855 © 2010 ICGEB Trieste (Italy) Journal compilation © 2010 FEBS
Introduction
Towards the end of the 1970s, at the beginning of pre-mRNA splicing research [1,2], defining exons and introns was essentially based on observing the final composition of the mature mRNA molecule. In 1978, any sequence that was included in a mature mRNA became tagged as an ‘exon’, whereas all the intervening genomic sequences that were left out during the splicing process became defined as ‘introns’ [3]. However, this way of thinking did not explain what makes an exon an exon or an intron an intron. The discovery of the basic splice site consensus sequences during the same years [4,5], and later on of enhancer and repressor elements, has taken us a long way in the direction of discovering exon- and intron-definition complexes [6–8]. Nowadays, the identification of the splicing signals that define exons and introns has been greatly aided by basic research, bioinformatic approaches and advanced sequencing tools [9,10]. In this regard, we certainly know much more about splicing regulation than we did 20 years ago. Considering that several reviews have been written recently on the subject, the reader is referred to them for further information on the latest discoveries [11–14]. Most important, in this respect, have been the initial observations that in alternative splicing processes the same nucleotide sequence could be defined by the spliceosome as an intron or an exon in response to specific signals [15,16]. It is now clear that these kinds of decision (What is an exon? What is an intron?) are of paramount importance in explaining genome complexity and evolutionary pathways [17–20]. However, the sum of this new knowledge does not necessarily mean that we are near the goal of understanding most splicing decisions. Indeed, even the latest attempts at ‘designing’ exons based on current state-of-the-art knowledge have basically demonstrated that there is still a long way to go before we can become as good as the spliceosome in deciding what is an exon and what is an intron [21].
Where do pseudoexon sequences come
into the story?
Central to the issue of deciding what is an exon and what is an intron is the question of their origin, a much debated field to this day that basically deals with deciding the order of appearance of introns during evolution, whether first, early or late [22]. Whatever the answer to this question turns out to be, it is now clear that many of the ‘new’ exons in our genome originate from the insertion of transposable sequence elements belonging to the SINE and LINE classes in the eukaryotic genome [23–25]. In particular, exonization of Alu elements (which are primate specific and represent the most abundant mobile elements in the human genome) through retrotransposition–mutation events is a prominent source of new exons in the eukaryotic transcriptome, as schematically depicted in Fig. 1 [26,27].
However, even if we ignore this particular class of exonization event, every in silico analysis shows that ‘false exons’ are very abundant in the intronic sequences of most genes [with this term we refer to any nucleotide sequence between 50 and 200–300 nucleotides in length with apparently viable 5′ and 3′ splice sites (5′ss and 3′ss) at either end]. Presently, there is evidence that inclusion of many of these sequences is actively inhibited due to the presence of intrinsic defects [28], the presence of silencer elements [29–31] or the formation of inhibiting RNA secondary structures [32]. Even if a combination of all these elements succeeds in repressing the use of many of these pseudoexon sequences, we have to consider the possibility that there may be many exceptions to this rule.
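The working definition of a ‘false exon’ given above can be turned into a rough scan. The following Python sketch is purely illustrative: the function name `find_false_exons` and its parameters are hypothetical, and it treats any AG dinucleotide as an ‘apparently viable’ 3′ss and any GT dinucleotide as an ‘apparently viable’ 5′ss, which is an assumption far cruder than the scoring used by real in silico prediction programs.

```python
def find_false_exons(intron, min_len=50, max_len=300):
    """Return (start, end) pairs of candidate 'false exons' in an intron.

    Coordinates are 0-based with an exclusive end, so the candidate is
    intron[start:end]. ASSUMPTION (not from the review): a 3' splice site
    is approximated by an upstream AG and a 5' splice site by a
    downstream GT; no consensus scoring is attempted.
    """
    seq = intron.upper()
    # A candidate can start right after any AG dinucleotide.
    starts = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    # A candidate can end right before any GT dinucleotide.
    donor_positions = {i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"}

    candidates = []
    for start in starts:
        # Only consider ends that give a length within [min_len, max_len].
        last_end = min(start + max_len, len(seq) - 2)
        for end in range(start + min_len, last_end + 1):
            if end in donor_positions:
                candidates.append((start, end))
    return candidates


# Toy intron: one AG, 60 nucleotides of filler, then one GT.
toy_intron = "CCCAG" + "C" * 60 + "GTCCC"
print(find_false_exons(toy_intron))
```

Even this naive criterion returns very large numbers of hits on real intronic sequence, which is consistent with the point made above that dinucleotide-level ‘viability’ alone cannot explain which sequences are actually used.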
First, it is probable that several of these pseudoexons may actually be recognized only in particular circumstances, such as in response to particular external stimuli [33,34] or in a given tissue or developmental stage. Proof of this possibility is the observation that ‘novel’ exons keep being identified even in well-known and studied genes, such as the DMD gene [35].
Second, our failure to observe their use in normal conditions may also be due to the fact that their inclusion can intentionally lead to the premature insertion of a termination codon in the mature mRNA and the consequent rapid degradation by nonsense-mediated decay (NMD) pathways [36] (Fig. 1). Such an occurrence has been described in the rat α-tropomyosin gene with a putative pseudoexon sequence localized downstream of two mutually exclusive exons: an upstream exon that is included only in smooth muscle tissue and a downstream exon that is included in most cell types [37].
Fig. 1. The left panel shows a schematic model of Alu element exonization. The element (Alu) is inserted by retrotransposition and during
the course of evolution mutations within this sequence create viable splicing sequences. The middle panel shows the effect of the inclusion
of a nonsense exon sequence (NE) in a transcript. When this nonsense exon sequence is included, the resulting transcript is degraded by
NMD (lower diagram). The right panel shows the classical pathway of pseudoexon (PE) inclusion in human disease. In this case, a nucleotide
sequence on the brink of becoming an exon becomes activated following a number of different mutational events.
Pseudoexons in human disease A. Dhir and E. Buratti
Experimental analysis has shown that, when this pseudoexon is included in the mRNA molecule together with the ubiquitously expressed downstream exon, the formation of a stop codon causes activation of the NMD pathway. On the other hand, when inclusion of this pseudoexon occurs with the upstream smooth muscle tissue-specific exon, then it can still be removed through a resplicing pathway (and a normally processed mRNA molecule can be generated). For this reason, the term ‘nonsense’ exon is now preferred to define these kinds of sequence, which according to bioinformatic analyses may be more prevalent in human genes than previously thought [37].
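The effect of a ‘nonsense’ exon on the reading frame can be illustrated with a small Python sketch. It is a simplified assumption-laden model, not a description of how NMD is actually triggered in the cell: the function names are hypothetical, the exons are assumed to be coding sequence read in frame from the first nucleotide of the upstream exon, and the positional rules that real NMD applies relative to exon junctions are not modelled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds):
    """Return the index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


def inclusion_creates_premature_stop(upstream, pseudoexon, downstream):
    """Check whether including the pseudoexon brings a stop codon forward.

    ASSUMPTION: translation starts at position 0 of `upstream`, and a stop
    is called 'premature' if it occurs earlier than the position the
    normal stop would occupy after the pseudoexon has been inserted.
    """
    normal = (upstream + downstream).upper()
    with_pe = (upstream + pseudoexon + downstream).upper()

    normal_stop = first_stop_codon(normal)
    if normal_stop is not None and normal_stop < len(upstream):
        # Translation already ends in the upstream exon: nothing changes.
        return False

    pe_stop = first_stop_codon(with_pe)
    if pe_stop is None:
        return False
    if normal_stop is None:
        return True
    return pe_stop < normal_stop + len(pseudoexon)


# Toy sequences (hypothetical): the first pseudoexon carries an in-frame TAA.
print(inclusion_creates_premature_stop("ATGGCC", "TAAGGG", "GGATGA"))
print(inclusion_creates_premature_stop("ATGGCC", "GGGCCC", "GGATGA"))
```

In the first call the inserted sequence places a stop codon ahead of the normal one, which is the situation depicted for the nonsense exon in Fig. 1; in the second call the reading frame and the original stop are preserved.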
Nonetheless, from a human disease point of view, many pseudoexon intronic sequences seem poised on the brink of becoming exons (Fig. 1) and a comprehensive list of more than 60 published pathological pseudoexon events is presented in Table 1. Although briefly reviewed elsewhere [38], the recent advances in pseudoexon research warrant a second look at several pseudoexon-related issues, especially with regard to novel therapeutic approaches.
Cis-acting sequences in pseudoexon
inclusion
As previously mentioned, most pathological pseudoexon inclusion events originate from the creation of new donor or acceptor splice sites within an intronic sequence, followed by the selection of weaker ‘opportunistic’ acceptor or donor site sequences (Fig. 2A). A preliminary analysis of the strength of donor sites activated in pseudoexon inclusion events has highlighted their relatively high strength (according to in silico prediction programs) with respect to normally processed exons and to cryptic donor sites activated following normal donor site inactivation [39]. In a slightly lower number of cases, pseudoexon activation has been observed following the creation of de novo acceptor sites (Table 1), whereas branch-point creation still represents a minority (probably owing to the fact that a new branch point needs to find both a viable acceptor and donor site nearby, rather than just one of them).
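To give a feel for what ‘donor site strength’ means in the paragraph above, the following Python sketch counts matches to a 9-nucleotide 5′ss consensus. This is a toy measure and not the method used in [39]: the consensus string MAG|GTRAGT (last three exonic plus first six intronic nucleotides, written with IUPAC codes) and the function name `donor_match_score` are assumptions introduced here, and real prediction programs rely on weighted or probabilistic models rather than a simple match count.

```python
# IUPAC codes needed for the assumed consensus: M = A or C, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}

# ASSUMED 5' splice site consensus: 3 exonic + 6 intronic nucleotides.
DONOR_CONSENSUS = "MAGGTRAGT"


def donor_match_score(site):
    """Count how many of the 9 positions agree with the assumed consensus."""
    site = site.upper()
    if len(site) != len(DONOR_CONSENSUS):
        raise ValueError("expected a 9-nucleotide window (3 exonic + 6 intronic)")
    return sum(base in IUPAC[code] for base, code in zip(site, DONOR_CONSENSUS))


# A single C>T change that creates the invariant GT raises the score.
print(donor_match_score("CAGGCAAGT"))  # window before the hypothetical mutation
print(donor_match_score("CAGGTAAGT"))  # window after the hypothetical mutation
```

The pair of calls mimics a ‘5′ss creation’ event of the kind listed in Table 1: one nucleotide substitution is enough to turn an intronic window into a full consensus match.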
In addition to de novo creation of strong donor, acceptor and branch site sequences, the other most frequent mechanism that may lead to pseudoexon activation involves the creation/deletion of splicing regulatory sequences, which will be discussed in more detail below (Fig. 2B). Finally, in two individual cases, the rearrangement of genomic regions through gross deletions (Fig. 2C) [40] or genomic inversions (Fig. 2D) [41] has also been described to give rise to pseudoexon inclusion events. This has come about either by bringing together viable splice sites that would normally be too far away from each other on the gene sequence or by activating exons in what would normally have been the antisense genomic strand.
In a few genes, a particularly interesting mode of pseudoexon activation has also been observed following the inactivation of naturally occurring upstream 5′ss (FAA, IDS, MUT) [42–45] or downstream 3′ss (BRCA2, CFTR) [46,47] (Fig. 2E). These findings suggest that the processivity of these mRNA transcripts probably represents an element capable of determining pseudoexon repression apart from being capable of influencing normal splicing levels [48].
On a more general note, a still underappreciated aspect of pseudoexon recognition that concerns the effect of cis-acting sequences is represented by the potential influence of RNA secondary structure on splicing efficiency [49]. Recently, it has been shown that donor site usage in the inclusion of two pseudoexon sequences in the ATM and CFTR genes is strongly dependent on their availability in a single-stranded region [50]. Interestingly, the same conclusion was reached in a recent study by Schwartz et al. [51] analysing the differences between exonized and nonexonized Alu elements. In this work, it was found that one of the major discriminating factors between these two classes of Alu elements was represented by the potential availability of 5′ss sequences in an unstructured conformation.
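The notion of a 5′ss being ‘available in a single-stranded region’ can be made concrete with a short Python sketch. It assumes that a secondary structure has already been predicted by some external folding tool and is supplied in dot-bracket notation (a dot for an unpaired nucleotide, brackets for paired ones); the function name `unpaired_fraction` and the toy structure are hypothetical, and this is not the analysis performed in [50] or [51].

```python
def unpaired_fraction(dot_bracket, start, end):
    """Fraction of unpaired positions in dot_bracket[start:end].

    ASSUMPTION: `dot_bracket` comes from an external structure prediction
    and uses '.' for unpaired nucleotides. Coordinates are 0-based with an
    exclusive end, e.g. the window covering a candidate 5' splice site.
    """
    window = dot_bracket[start:end]
    if not window:
        raise ValueError("empty window")
    return window.count(".") / len(window)


# Toy hairpin: a 4-bp stem enclosing a 4-nucleotide loop.
structure = "((((....))))"
print(unpaired_fraction(structure, 4, 8))  # window inside the loop
print(unpaired_fraction(structure, 0, 4))  # window inside the stem
```

Under this simple measure, a donor site falling in the loop would count as fully accessible and one buried in the stem as inaccessible, which is the kind of contrast the studies cited above relate to pseudoexon and Alu exon usage.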
Trans-acting factors in pseudoexon
inclusion
Not many studies have focused on identifying the role
played by trans-acting factors in pseudoexon inclusion.
However, because of its significance, this is an area of
research that would probably benefit from increased
attention by researchers in the future.
In the case of nonpathologically related pseudoexons carrying nonsense codons, the presence of splicing regulatory elements may well provide a clue with regard to the possible roles played by these sequences. For example, in the case of the previously described tropomyosin pseudoexon [37], the specific binding of hnRNP H/F proteins has been described as a potential key modifier of this pseudoexon inclusion event [52]. The fact that these proteins are particularly downregulated in cardiomyocytes may explain the cell-specific repression of the downstream ‘normal’ exon 3 that is otherwise present in all cell types (Fig. 3A).
Table 1. Pathological pseudoexon inclusion events in human disease. NA, not available; SRE, splicing regulatory element.
Gene        Size (bp)   Activating mutation                      Reference     DBASS3/DBASS5 reference
α-Gal A     57          SRE creation                             [78]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=317
ATM         65          SRE deletion                             [56]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=324
ATM         137         5′ss creation                            [79]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=331
β-globin    165         5′ss creation                            [80]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=323
β-globin    126         5′ss creation                            [81]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=336
β-globin    73          5′ss creation                            [82]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=335
BRCA1       66          3′ss creation                            [83]          NA
BRCA2       93          Downstream 3′ss deletion                 [46]          NA
CD40L       59          5′ss creation                            [84]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=437
CEP290      128         5′ss creation                            [85]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=342
CFTR        49          5′ss creation                            [86]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=330
CFTR        84          5′ss creation                            [87]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=328
CFTR        101         SRE creation                             [88]          NA
CFTR        184         Downstream 3′ss deletion                 [47]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/view.asp?item=splice&id=31
CFTR        214         5′ss creation                            [89]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=322
CHM         98          3′ss creation                            [90]          NA
COL4A3ᵃ     74          3′ss creation                            [91]          NA
COL4A5      30          3′ss creation                            [92]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/view.asp?item=splice&id=240
COL4A5      147         SRE creation                             [92]          NA
CTDP1       95          5′ss creation                            [93]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=333
CYBB        56          5′ss creation                            [94]          NA
CYBB        61          5′ss creation                            [95]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=306
DHPR/QDPR   152         5′ss creation                            [96]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=334
DMD         98          28 kb gene inversion                     [41]          NA
DMD         108         28 kb gene inversion                     [41]          NA
DMD         125         28 kb gene inversion                     [41]          NA
DMD         149         28 kb gene inversion                     [41]          NA
DMD         160         28 kb gene inversion                     [41]          NA
DMD         180         28 kb gene inversion                     [41]          NA
DMD         58          5′ss creation                            [97]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=340
DMD         67          5′ss creation                            [98]          NA
DMD         89          5′ss creation                            [98]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=338
DMD         90          5′ss creation                            [98]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=339
DMD         95          3′ss creation                            [97]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/view.asp?item=splice&id=275
DMD         147         5′ss creation                            [99]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=326
DMD         149         3′ss creation                            [98]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/view.asp?item=splice&id=274
DMD         172/202     5′ss creation                            [100]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=320
DMD         46/132      3′ss creation                            [101]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/view.asp?item=splice&id=273
FBN1        93          5′ss creation                            [102]         NA
FGB         50          SRE creation                             [63]          NA
FGG         75          5′ss creation                            [103]         NA
FVIII       191         5′ss creation                            [104]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=332
GALC        34          ND                                       [105]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=344
GHER        69          5′ss creation                            [106]         NA
GHR         102         SRE deletion                             [57,107]      http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=316
GUSBᵃ       68          5′ss creation                            [108]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=311
Interestingly, repression of the tropomyosin nonsense exon was also observed following PTB overexpression. PTB is a well-known and powerful splicing modifier that plays a major role in alternative splicing regulation [8,53]. Recently, this protein has been reported to also downregulate the inclusion efficiency of a pathological pseudoexon in NF-1 intron 31 independently of the activating mutation that creates a very strong splicing acceptor site [54] (Fig. 3B). This finding suggests that silencer binding sites may be
Table 1. (Continued.)
Gene        Size (bp)   Activating mutation                      Reference     DBASS3/DBASS5 reference
HADHB       56/106      5′ss creation                            [109]         NA
HSPG2       130         5′ss creation                            [110]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=346
IDS         78          5′ss creation                            [111]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=329
IDS         103         Upstream 5′ss deletion                   [42,43]       http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=282
INI1/SNF5   72          5′ss creation                            [112]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=410
ISCU        86/100      3′ss creation                            [113–115]     NA
JK          136         Internal 7 kb deletion                   [40]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=341
MCBB        64          SRE deletion                             [116]         NA
MYO6        108         5′ss creation                            [117]         NA
MUT         76          5′ss creation or upstream 5′ss deletion  [45,68]       http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=434
                                                                               http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=394
NDUFS7      122         5′ss creation                            [118]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=357
NF-1        70          5′ss creation                            [70]          NA
NF-1        107         5′ss creation                            [70]          NA
NF-1        172         3′ss creation                            [119]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/view.asp?item=splice&id=277
NF-1        58          3′ss creation                            [120]         NA
NF-1        76          5′ss creation                            [120]         NA
NF-1        54          5′ss creation                            [121]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=411
NF-1        177         5′ss creation                            [70,122,123]  http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=318
NF-2        106         Branch-point creation                    [124]         NA
NPC1        194         5′ss creation                            [125]         NA
OA1/GPR143  165         3′ss creation                            [69]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/view.asp?item=splice&id=114
OATᵃ        142         5′ss creation                            [126]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=433
OTC         135         3′ss creation                            [127]         NA
PCCA        84          SRE creation                             [68]          NA
PCCB        72          5′ss creation                            [68]          http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=436
PHEX        50/100/170  5′ss creation                            [128]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=321
PKHD1       116         5′ss creation                            [129]         NA
PMM2        66          3′ss creation                            [130]         NA
PMM2        123         5′ss creation                            [130,131]     NA
PRPF31      175         5′ss creation                            [132]         NA
PTSᵃ        45          Branch-point optimization                [133]         NA
PTSᵇ        79          Py-tract optimization                    [133]         NA
RB1         103         3′ss creation                            [134]         NA
RYR1        119         5′ss creation                            [135]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=337
SOD-1       43          5′ss creation                            [136]         NA
TSC2        89          5′ss creation                            [137]         http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/viewsplicesite.aspx?id=307
ᵃ Alu-derived pseudoexons.
ᵇ LINE-2-derived pseudoexons.