New human and mouse microRNA genes found by homology search

Michel J. Weber

Laboratoire de Biologie Moléculaire Eucaryote, UMR5099, CNRS and Université Paul Sabatier, IFR109, Toulouse, France

Keywords antisense transcript; genomic localization; human genome; microRNA; mouse genome; mRNA degradation; RNA interference

Correspondence M. Weber, LBME, 118 route de Narbonne, 31062 Toulouse Cedex, France Fax: +33 5 61 33 58 86 Tel: +33 5 61 33 59 56 E-mail: weber@ibcg.biotoul.fr

(Received 4 June 2004, revised 24 August 2004, accepted 7 September 2004)

doi:10.1111/j.1432-1033.2004.04389.x

Conservation of microRNAs (miRNAs) among species suggests that they bear conserved biological functions. However, sequencing of new miRNAs has not always been accompanied by a search for orthologues in other species. I report herein the results of a systematic search for interspecies orthologues of miRNA precursors, leading to the identification of 35 human and 45 mouse new putative miRNA genes. MicroRNA tracks were written to visualize miRNAs in human and mouse genomes on the UCSC Genome Browser. Based on their localization, miRNA precursors can be excised either from introns or exons of mRNAs. When intronic miRNAs are antisense to the apparent host gene, they appear to originate from ill-characterized antisense transcription units. Exonic miRNAs generally reside in nonprotein-coding, poorly conserved genes, in the sense orientation. In three cases, the excision of an miRNA from a protein-coding mRNA might lead to the degradation of the rest of the transcript. Moreover, three new examples of miRNAs fully complementary to an mRNA are reported. Among these, miR-135a might control the stability and/or translation of an alternative form of the glycerate kinase mRNA by RNA interference. I also discuss the presence of human miRNAs in introns of paralogous genes and in miRNA clusters.

Abbreviations miRNA, microRNA; pre-miRNAs, miRNA precursors; RISC, RNA-induced silencing complex; snoRNA, small nucleolar RNA.

FEBS Journal 272 (2005) 59–73 © 2004 FEBS

MicroRNAs (miRNAs) are ~22 nucleotide-long RNAs that function in translational repression by base pairing with their target mRNA in a variety of pluricellular organisms. They originate from long precursors (pri-miRNAs) that, in animals, are cleaved by the Drosha endonuclease in the nucleus [1] to give ~70 nucleotide-long miRNA precursors (pre-miRNAs) with a characteristic hairpin structure. In plants, excision of pre-miRNAs is performed by DCL1, a Dicer homologue [2,3]. Following the export of pre-miRNAs to the cytoplasm by Exportin-5 [4,5], the loop region of the hairpin is removed by the Dicer endonuclease to produce a short double-stranded RNA (dsRNA), one strand of which, corresponding to the mature miRNA, is predominantly incorporated into the RNA-induced silencing complex (RISC). The RISC complex either inhibits translation elongation or triggers mRNA degradation, depending upon the degree of complementarity of the miRNA with its target (for reviews, see [6–8]).

Since the seminal identification of the first miRNAs, Caenorhabditis elegans lin-4 and let-7, using genetic approaches [9,10], hundreds of miRNAs have been characterized experimentally using various cloning strategies in plants, C. elegans, Drosophila melanogaster, zebrafish, pufferfish, mouse, rat and human [8]. More recently, algorithms have been developed to identify putative precursor miRNAs from sequenced genomes [11,12]. These approaches predict that the human genome might contain 200–255 miRNA genes [13]. However, a comprehensive list of these sequences is still not available.

It was recognized that many miRNAs are evolutionarily conserved, some of them from worm to human [14]. MiRNA genes have been characterized experimentally from a variety of organisms and might have orthologues in other species, suggesting a powerful method to predict the existence of new miRNA genes. I report here on the results of a systematic search for potential human orthologues of mouse miRNAs deposited in the miRNA Registry [15] and, vice versa, of potential mouse orthologues of known human miRNAs. In addition, I searched for human orthologues of recently identified rat miRNAs [16]. This led me to identify potential orthologues of miRNAs that were described previously in one species, but not in the other. After inclusion of these new data, the total number of human and mouse miRNAs deposited at the miRNA Registry now approaches the theoretical number of 255 predicted by Lim et al. [13].

I only considered, as valid candidates, those potential miRNA precursors that conformed to the empirical criteria proposed by Ambros et al. [18], in particular, a hairpin structure of the lowest free energy, as predicted by mfold, and a minimum of 16 nucleotides of the mature miRNA engaged in Watson–Crick or G/U base pairings (criterion C). The method used for searching the new miRNAs ensured their phylogenetic conservation (criterion D). Moreover, the detection of the ~22-nucleotide mature forms by Northern blot (criterion A) and/or their identification in a cDNA library (criterion B) were checked for the sequences deposited in the Rfam2.2 miRNA Registry (Wellcome Trust Sanger Institute) for at least one species. In most cases, sequences of the new mature miRNAs were perfectly conserved between human and mouse, but differed by one nucleotide in a few cases (see for example, mir-155). Further validation of these candidates was performed on the basis of additional criteria, such as conservation of the host gene (see below) or position relative to known miRNA clusters.
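Criterion C above (at least 16 nucleotides of the mature miRNA engaged in Watson–Crick or G/U pairs) can be illustrated with a minimal sketch. It assumes the predicted hairpin is supplied as a dot-bracket string, which is a convenience of this example and not necessarily the format used in the study; the function names and the toy hairpin in the usage lines are hypothetical.

```python
# Pairs accepted under criterion C: Watson-Crick plus G/U wobble.
ALLOWED_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def pair_table(structure):
    """Return, for each position, the index of its partner (None if unpaired)."""
    partners = [None] * len(structure)
    stack = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partners[i], partners[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return partners


def paired_mature_bases(seq, structure, start, end):
    """Count bases of the mature region seq[start:end] held in an allowed pair."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = seq.upper().replace("T", "U")
    partners = pair_table(structure)
    count = 0
    for i in range(start, end):
        j = partners[i]
        if j is not None and seq[i] + seq[j] in ALLOWED_PAIRS:
            count += 1
    return count


def meets_criterion_c(seq, structure, start, end, minimum=16):
    """True when at least `minimum` mature bases are paired (hypothetical helper)."""
    return paired_mature_bases(seq, structure, start, end) >= minimum


# Toy hairpin, not a real precursor: 20 G/C pairs around a 4-nt loop.
toy_seq = "G" * 20 + "AAAA" + "C" * 20
toy_fold = "(" * 20 + "...." + ")" * 20
print(meets_criterion_c(toy_seq, toy_fold, 0, 22))
```

The threshold is kept as a parameter so that the same check can be reused with a stricter or looser cut-off.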

Using this information, I wrote custom tracks that allow for the localization of the miRNA genes in the human and mouse genomes. These are now available on the UCSC Genome Browser [17]. Using this tool, I systematically determined the position and orientation of miRNA genes relative to known transcriptional units, examined the conservation of miRNA gene localization between the human and mouse genomes, and made a comprehensive list of miRNA clusters. This search led to several testable hypotheses concerning the transcription of miRNA genes, and to the prediction of new mRNA targets.
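As an illustration of what such a custom track can look like, the sketch below writes precursor coordinates as a nine-column BED track with a `track` header line, colouring each item by strand (green for the upper strand, magenta for the lower strand, following the convention stated in the legend of Fig. 2). The record type, function names and coordinates are hypothetical, and the exact RGB values are an assumption; this is not the author's track file.

```python
from collections import namedtuple

# Hypothetical record for one precursor; coordinates are 0-based, half-open (BED).
MirnaGene = namedtuple("MirnaGene", "chrom start end name strand")

# Assumed RGB values: green for the upper (+) strand, magenta for the lower (-) strand.
STRAND_RGB = {"+": "0,128,0", "-": "255,0,255"}


def chrom_key(chrom):
    """Sort chr1..chr22 numerically, then lettered chromosomes (X, Y, ...)."""
    label = chrom[3:] if chrom.startswith("chr") else chrom
    return (0, int(label), "") if label.isdigit() else (1, 0, label)


def bed_line(gene):
    """BED9: chrom, start, end, name, score, strand, thickStart, thickEnd, itemRgb."""
    fields = [gene.chrom, gene.start, gene.end, gene.name, 0, gene.strand,
              gene.start, gene.end, STRAND_RGB[gene.strand]]
    return "\t".join(str(f) for f in fields)


def custom_track(genes, name="miRNA", description="miRNA precursors"):
    """Return the text of a custom track, ordered by chromosome and position."""
    header = f'track name="{name}" description="{description}" itemRgb="On"'
    ordered = sorted(genes, key=lambda g: (chrom_key(g.chrom), g.start))
    return "\n".join([header] + [bed_line(g) for g in ordered])


# Made-up coordinates, for illustration only.
example = [MirnaGene("chr10", 500, 570, "mir-example-2", "-"),
           MirnaGene("chr2", 100, 170, "mir-example-1", "+")]
print(custom_track(example))
```

Sorting by chromosome and then by start position also makes neighbouring precursors adjacent in the file, which is convenient when looking for clusters.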

Results and Discussion

New potential human and mouse microRNA precursors

The entire set of human and mouse precursor and mature miRNA sequences from the miRNA Registry (version 2.2) was submitted to a BLAT search against the human genome. The results (in BED format) were exported to Excel™ to generate a table with 2873 entries. A similar BLAT search was performed against the mouse genome, generating an Excel™ table of a similar size. These tables were then ranked according to chromosome number and chromosome position and filtered for perfect and near-perfect matches. The corresponding sequences were subsequently examined for a potential hairpin structure with mfold, and the results were compared to those of known miRNAs from the miRNA Registry.

This led to the identification of 60 new potential miRNA precursors (15 for human and 45 for mouse) that were made available before publication in Rfam version 3 of the miRNA Registry of the Wellcome Trust Sanger Institute (Table S1). Moreover, with the collaboration of S. Griffiths-Jones (Wellcome Trust Sanger Institute), the names of several miRNAs deposited at the miRNA Registry have been changed, so that orthologous precursors, based on conservation of both sequences and synteny, have similar names in both human and mouse. This was particularly important when a mature miRNA had multiple, closely related, precursors (see for example mmu-mir-9-2, mmu-mir-138-12 and mmu-mir-199a-1).

Moreover, the coordinates of both new and previously described miRNA precursors were used to write custom tracks that allowed their localization in the human and mouse genomes on the UCSC Genome Browser. A similar track was also written for the genome of C. elegans.

New human miRNA precursors predicted by homology with rat microRNAs

I searched for potential human orthologues of recently cloned rat miRNAs [16]. The best BLAT hits were examined for conservation of synteny in human and rodent genomes and for the presence of a stem-loop structure. This allowed me to propose 20 new human microRNA precursors (Table 1).

Table 1. New human microRNAs predicted by homology with rat microRNA precursors described by Kim et al. [16]. MiRNAs that have been deposited in the miRNA Registry (version 3.1) are indicated by an asterisk. Columns: Name; Mature miRNA; Predicted precursor hairpin structure; Note (sequences, structures and notes not reproduced here).

Entries: hsa-miR-322, *hsa-miR-323, *hsa-miR-324-5p, *hsa-miR-324-3p, hsa-miR-325, *hsa-miR-326, *hsa-miR-328, hsa-miR-329, *hsa-miR-330, *hsa-miR-331, *hsa-miR-335, *hsa-miR-337, *hsa-miR-338, *hsa-miR-339, *hsa-miR-340, *hsa-miR-342, hsa-miR-345, hsa-miR-346, *hsa-miR-135b, *hsa-miR-148b, *hsa-miR-151*, *hsa-miR-151.

miR-151 was identified in the mouse [19], and miR-151* in the rat [16]. In both cases, the same predicted precursor encodes both mature miRNAs from its 5′ (miR-151*) and 3′ (miR-151) portions. This also holds true for hsa-mir-151, although the mature form of miR-151 differs from the rodent sequence by one nucleotide. This conservation reinforces the hypothesis that, in mammals, the same precursor gives rise to both miR-151 and miR-151*. From the predicted hairpin structure of the precursor, the energies of hybridization of the four nucleotides located at the 5′ end of miR-151* and miR-151 are 1.7 and 0.1 kcal·mol⁻¹, respectively. As 'RISC assembly favors the siRNA strand whose 5′ end has a greater propensity to fray' [20], it is expected that miR-151* will be more abundant than miR-151. Indeed, miR-151 was cloned only once among 913 miR sequences [19]. Although, to my knowledge, neither miR-151 nor miR-151* has been cloned in human, it is unlikely that the single base substitution compared to the rodent sequence (at nt 10 of miR-151) could alter the balance between the two miRs.

Fig. 1. Alignment of microRNA sequences from mammalian genomes. (A–C) Alignment of the sequences from mammalian mir-329 (A), mir-322 (B) and mir-346 (C). Abbreviations: hsa, Homo sapiens; mmu, Mus musculus; rno, Rattus norvegicus; pan, Pan troglodytes. (D) Alignment of the sequences of rodent mir-350 with the corresponding sequences from human and chimpanzee genomes. Rodent sequences were retrieved from the miRNA Registry. Human sequences were retrieved from the human genome sequence by examination of highly conserved regions in the syntenic segments. The sequence of pan-mir-350 was obtained by BLATing the human sequence against the chimpanzee genome. The sequences of mir-329 and mir-298 are 100% conserved between human and chimpanzee. The sequences of mature microRNAs are boxed. The antisense miR boxes indicate the portion of the hairpin precursor structure that is base-paired with the miR, as predicted by mfold.

For certain rodent miRNA precursors, a BLAT search in the human genome produced no matches. However, a likely orthologue could be found by exploring the most likely (i.e. of conserved synteny) portion of the human genome for conserved sequences; e.g. mmu-mir-345 resides upstream of the AK047628 RefSeq gene. Its human orthologue was found upstream of C14orf69, the best BLAT hit for AK047628. Such identification was made straightforward by the examination of the 'Human/Chimp/Mouse/Rat/Chicken Multiz Alignments & PhyloHMM Cons' track of the human UCSC Genome Browser. Similarly, hsa-mir-329 and -322 were identified on the basis of their conserved stem-loop structure and conserved position relative to other miRNAs or RefSeq genes. However, the presumptive mature miRNAs hsa-miR-329 and -322 differ from their mouse and rat orthologues by four nucleotides (Fig. 1A,B). Most of these changes retained base-pairing in the precursor miRNA by forming G/U interactions or resided in unpaired positions (data not shown). Consequently, the folding free energy calculated using mfold was either little affected (mir-329) or even decreased in the case of mir-322 (ΔG = −46.6 for human and −41.0 kcal·mol⁻¹ for mouse).

The same conclusion holds for hsa-mir-346, where the mature miR sequence differed from those of mouse and rat by two and three nucleotides, respectively (Fig. 1C). In this case, the folding free energy of the human and rat precursors remained comparable (−46.4 and −50.6 kcal·mol⁻¹, respectively).
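The nucleotide differences quoted above for orthologous mature sequences come down to a position-by-position comparison of two aligned sequences of equal length. A minimal sketch follows; the function name is hypothetical and the sequences in the usage line are made up, not real miRNAs.

```python
def mismatch_positions(seq_a, seq_b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Treat DNA and RNA spellings of the same base as identical.
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Made-up 8-nt sequences that differ at two positions.
print(mismatch_positions("ACGUACGU", "ACGAACGC"))
```

Reporting positions instead of a bare count makes it possible to tell whether the changes fall in the 5′ or the 3′ portion of the mature sequence.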

Mutations in mature miRNAs, particularly in their 3′ portion, are compatible with their function as translational repressors [21]. The same study, however, revealed that G:U wobble pairing in the 5′ region of the miRNA had detrimental effects that could not be predicted on the basis of changes in the free energy of annealing with the target mRNA. Therefore, hsa-mir-329, hsa-mir-322 and hsa-mir-346 require further experimental validation to be considered as bona fide miRNA precursors. It was, however, surprising that, in the predicted hairpin structures of mir-329, -322 and -346, the antisense sequence of the miR was more conserved than the miR itself (Fig. 1A–C). Significantly, a mutation in the antisense sequence of miR-322 was accompanied by a compensatory change in the miR sequence, so that a G/C base pair in the mouse precursor was replaced by an A/U in the human one. The single change in the antisense sequence of miR-329 occurred at an unpaired position in both human and rodent precursor hairpin structures. As it is difficult to conceive that evolutionary pressure might be higher on the antisense than on the sense strand [8], this may suggest that the antisense strand was cloned accidentally in certain cases [16].

In a few cases, the homology search allowed localization of human sequences that were similar to some rodent miRNA precursors but had accumulated deleterious mutations. For example, the human orthologous sequence of rodent mir-350 could be localized in an intron of the KARP-1-binding protein (KAB) gene. However, a nine base-pair deletion in the human and chimpanzee genomes removed the first seven nucleotides of the mature microRNA (Fig. 1D). In this case, it is noteworthy that the antisense strand accumulated mutations, possibly due to a lack of selective pressure after the inactivation of mir-350 by the deletion in the mature miRNA sequence.

Human and mouse miRNA precursors reside in conserved regions of synteny

Using the UCSC Genome Browser, I examined whether orthologous human and mouse miRNA precursors reside in conserved synteny regions. This proved to be the case for miRNAs located in known coding genes (see below). Many human miRNAs are located in introns of noncoding mRNAs. In general, in these cases, the mRNA was not conserved, or only poorly conserved, in the mouse genome. When human miRNAs were in nonconserved genes, or outside characterized genes, I examined the known flanking genes. In all cases but three, human and mouse miRNAs were found to reside in conserved synteny regions. The three apparent discrepancies in the position of a miRNA in the human and mouse genomes were hsa-mir-9-3, hsa-mir-339 and hsa-mir-326. As detailed in Appendix S1, these cases most probably originate from errors in the present assembly of the mouse genome. Orthologous human and mouse miRNAs thus reside either in introns of orthologous genes, and/or in conserved synteny with surrounding genes.

Human miRNA precursors that reside in introns of known genes

Eighty-one human miRNA precursors were found to be located in an intron of a known gene, or of a gene defined by a complete cDNA sequence, in the sense orientation (Table S2). It is, however, important to note that human miRNAs that were classified as located outside of known genes might in fact reside in still uncharacterized splicing variants. For example, hsa-mir-10b is located 972 nucleotides upstream of the HOXD4 gene (NM_014621). However, mmu-mir-10b resides in an intron of a long form of the mouse Hoxd4 pre-mRNA (NM_010469), but this alternative form has no documented human orthologue.

According to current models, intronic miRNA precursors that have the same orientation as their host gene might be produced upon cleavage of the spliced intron by the Drosha endonuclease. In certain cases, the miRNA sequence is included in intronless ESTs that are often members of a cluster of overlapping, also intronless, ESTs. It is striking that, often, no such EST cluster was found in the other introns of the miRNA host gene (e.g. hsa-mir-103-2 and hsa-mir-98) or only in adjacent introns. Due to the uncertainty in the orientation of intronless ESTs, it was not possible to assess the orientation of the clusters relative to that of the host gene, and that of the microRNA. Nevertheless, this suggests that introns that host microRNAs might be particularly stable. Alternatively, certain of these intronic miRNAs might be produced by uncharacterized transcription units embedded in the same orientation in the apparent host gene. However, this possibility appears unlikely as, to my knowledge, there are only two examples of such a situation in the human genome [22,23].
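The positional classification used in this section (precursor inside an intron, inside an exon, across an exon boundary, or outside a transcript, in the same or the opposite orientation) reduces to an interval comparison. The sketch below is only an illustration under simple assumptions: half-open coordinates, exons sorted by position, and hypothetical class and function names; it does not reproduce the table intersections done on the Genome Browser.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Transcript:
    name: str
    strand: str                    # "+" or "-"
    exons: List[Tuple[int, int]]   # half-open (start, end), sorted by start


@dataclass
class Mirna:
    name: str
    start: int
    end: int
    strand: str


def classify(mirna: Mirna, tx: Transcript) -> Tuple[str, Optional[str]]:
    """Return (location, orientation) of a precursor relative to one transcript."""
    tx_start, tx_end = tx.exons[0][0], tx.exons[-1][1]
    if mirna.end <= tx_start or mirna.start >= tx_end:
        return ("outside", None)
    orientation = "sense" if mirna.strand == tx.strand else "antisense"
    # Fully contained in a single exon.
    if any(s <= mirna.start and mirna.end <= e for s, e in tx.exons):
        return ("exonic", orientation)
    # Partial overlap with an exon: the precursor spans an exon boundary.
    if any(mirna.start < e and s < mirna.end for s, e in tx.exons):
        return ("overlaps exon boundary", orientation)
    return ("intronic", orientation)


# Hypothetical two-exon transcript and a precursor lying in its intron.
host = Transcript("host-example", "+", [(100, 200), (500, 600)])
print(classify(Mirna("mir-example", 300, 370, "-"), host))
```

A precursor would normally be tested against every annotated transcript and EST at the locus, since the apparent host gene is not necessarily the transcription unit that produces it.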

In addition, 17 miRNAs were located in an intron of a known gene, but in the antisense orientation (Table S3). Among these, 10 were in a miRNA cluster (see below). To determine how these miRNAs might be generated, their genomic context was carefully explored. As shown in Table S3, several miRNAs in this category are in fact in a transcription unit that has an orientation opposite that of the apparent host gene. In particular, hsa-mir-302 lies within an intron of the HDCMA18P gene in the antisense orientation, but in an intron of ESTs BG207228 and BU565001 in the sense orientation (Fig. 2A). These two ESTs largely overlap the HDCMA18P transcription unit in the opposite orientation.

Similarly, hsa-let-7d resides in the gene defined by the BC045813 mRNA in the opposite orientation (Fig. 2B). In addition, it is located in a cluster of ~50 unspliced ESTs that spans about 2 kb and overlaps the intronless BC064349 mRNA. The orientation of this cluster is opposite that of the BC045813 mRNA, as shown by the polyA tails of the BC064349 mRNA and of several ESTs (Fig. 2B). These data thus strongly suggest that hsa-let-7d, and possibly hsa-let-7f-1 and hsa-let-7a-1, are part of an intronless transcript antisense to the BC045813 transcription unit.

A third example is hsa-mir-142, which lies within an intron of the AK090885 gene in the antisense orientation, but also in the antisense, intronless and polyadenylated AX721088 mRNA (1.6 kb). Interestingly, a natural chromosome translocation fuses the 5′ portion of hsa-mir-142 to a truncated c-Myc gene in aggressive B-cell leukemia [19,24]. This translocation most probably fuses the AX721088 transcription unit to the c-Myc gene.

In several cases, the miRNA was embedded in a cluster of intronless ESTs, the orientation of which could not be ascertained. This, however, cannot be considered as indicative of an antisense transcript, as a similar observation was made for some miRNAs that reside in introns in the same orientation as the host gene (see above).

Taken together, these observations suggest that when miRNA genes are located in introns of known genes in the antisense orientation, they might in fact be part of transcription units in opposite orientation to the presumptive host gene. In the case of hsa-mir-302, the microRNA is clearly located in an intron of the antisense transcript (Fig. 2A). In other cases, like hsa-mir-142, hsa-mir-133a-1/mir-1-2 and possibly hsa-let-7d/let-7f-1/let-7a-1, the microRNA is located in an exon of the antisense transcript.

Fig. 2. Antisense intronic miRNAs. (A) Localization of hsa-mir-302. This miRNA resides in an intron of the HDCMA18P gene in the antisense orientation but in an intron of ESTs BG207228 and BU565001 in the sense orientation. In all figures, miRNA genes are colored in green when they reside on the upper strand, and in magenta when on the lower strand. (B) Localization of hsa-let-7d. This miRNA resides in an intron of pre-mRNAs BC045813 and BC036695 in the antisense orientation. The last exon of these two mRNAs overlaps the intronless BC064349 mRNA in the antisense direction. The asterisks indicate the positions of the polyA tails of mRNAs and ESTs. The miRNA hsa-let-7d resides in a cluster of several intronless ESTs, only some of which are shown. The orientation of this cluster is antisense to that of the BC045813 mRNA, as indicated by the presence of a polyA tail in the sequence of mRNA BC064349 and of two ESTs. The corresponding transcription unit might also contain hsa-let-7a-1 and hsa-let-7f-1 in the sense orientation. Figures 2 and 3 are adapted from windows of the UCSC Genome Browser.

Human miRNA precursors that overlap with exons of known genes or ESTs

After intersecting the new human miRNA table of the UCSC Genome Browser with the chrN_EST and chrN_mRNA tables, I examined miRNAs that are included in, or overlap with, exons. The miRNAs that were only included in intronless ESTs were for the most part not further examined due to the uncertainty in EST orientation. In addition, the localization of several miRNAs in an exon possibly corresponded to intron retention events (hsa-mir-126, hsa-mir-25 and hsa-mir-224). Accordingly, these miRNAs were classified as intronic.

Except for the four examples discussed below (hsa-mir-135a-1, hsa-mir-99b, hsa-let-7e, hsa-mir-125a), miRNAs that are embedded in, or overlap with, exons of known transcripts are always in the same orientation. The corresponding genes were generally noncoding, except for hsa-mir-198, which resides in the 3′-UTR of the FSTL1 (follistatin-like) gene, and for hsa-mir-133a-2, located in the C20orf166 gene that encodes a potential 117 amino acid protein. In addition, hsa-mir-21 probably resides in the sense orientation in the 3′-UTR of an alternative form of the VMP1 gene, characterized by mRNA BC053563. Accordingly, mmu-mir-21 is located in the 3′-UTR of the mouse VMP1 ortholog, 4930579A11Rik, also in the sense orientation.

These three cases are particularly intriguing, as excision of the miRNA precursor from the host mRNA by the Drosha endonuclease would probably trigger the degradation of the rest of the mRNA. A similar mechanism is documented in E. coli, where RNase III cleaves its own mRNA at a stem-loop structure and triggers its degradation [25]. Similarly, RNase III cleaves the polycistronic metY-nusA-infB RNA, to release the metY tRNA and initiate the decay of the nusA-infB protein-coding mRNA [26]. Whether such a mechanism also operates in higher eucaryotes remains speculative. Of note, no rodent orthologue of hsa-mir-198 could be found by a BLAT search in the mouse and rat genomes.

Four miRNAs (hsa-mir-34b, hsa-mir-205, hsa-mir-133a-2 and hsa-mir-99b) overlap a splicing site. In the first three cases, it is not clear how the miRNA precursor is produced. It might originate either from an exon, or from an intron of uncharacterized alternative forms of the host gene mRNA. The case of hsa-mir-99b is further discussed below.

MiRNAs complementary to expressed sequences and potential regulation of glycerate kinase gene expression by miR-135a

The mouse mmu-mir-135a-1 miRNA resides in the antisense orientation in an alternative, long form of the 3′-UTR of the 6230410P16Rik gene, the orthologue of the human GLYCTK (glycerate kinase) gene (Fig. 3A). Therefore, mmu-miR-135a is perfectly complementary to the alternative form of Glyctk mRNA and might regulate its stability by an siRNA mechanism. Interestingly, mmu-mir-135a-1 also resides in an intron of the spliced AK051019 mRNA in the sense orientation; the latter might thus be the actual miRNA host gene. These two transcriptional units are largely overlapping, so that several genomic segments are exonic on both strands (Fig. 3A).

A similar situation holds for the human genome: hsa-mir-135a-1 is located 741 base pairs downstream of the GLYCTK gene, in the antisense orientation. This miRNA is embedded in a cluster of 10 overlapping intronless ESTs that might be part of a longer, alternative form of the GLYCTK 3′-UTR (Fig. 3B). This hypothesis is supported by the fact that several ESTs of this cluster (AI493054, AI380271, AW204878, AW207007 and BM555864) are polyadenylated. In addition, hsa-mir-135a-1 resides, in the sense orientation, in an intron of EST AI936688. Based on these observations, it is tempting to speculate that, in both mouse and human, mir-135a-1 is produced from an intron of the host gene (AK051019 and AI936688 in mouse and human, respectively) and can direct the degradation of a long form of the glycerate kinase mRNA by an RNA interference mechanism. Accordingly, a switch from GLYCTK to AI936688 gene transcription would be accompanied by the production of hsa-miR-135a, which could base-pair with pre-existing GLYCTK long mRNAs and trigger their degradation. This mechanism would thus block glycerate kinase production at both the transcriptional and translational levels, while the shorter form of glycerate kinase mRNA, which results from the use of an alternative polyA site, would not be affected. In addition, hsa-miR-135a could be produced from its second precursor, hsa-mir-135a-2.

Fig. 3. Antisense exonic miRNAs. (A) Localization of mmu-mir-135a-1. This miRNA resides in the antisense orientation in an alternative longer form of the 3′-UTR of the mouse 6230410P16Rik mRNA. This gene is the orthologue of the human GLYCTK gene, as shown by BLATing the sequence of the human protein (see BLAT track). The miRNA mmu-mir-135a-1 is also located in an intron of the antisense AK051019 mRNA, indicated in red. (B) Localization of hsa-mir-135a-1. This miRNA resides in a cluster of ESTs downstream of the GLYCTK gene, in the opposite orientation. The asterisks indicate the localization of the polyA tail of some of these ESTs. In addition, hsa-mir-135a-1 resides in the sense orientation in an intron of EST AI936688 (indicated in red). (C) Localization of hsa-mir-99b, hsa-let-7e and hsa-mir-125a. The latter two miRNAs are in the first exon of the AK125996 mRNA, in the antisense orientation, whereas hsa-mir-99b overlaps the splicing donor site. Note that the AK125996 mRNA overlaps in the antisense direction with the BC041134 and AY358799 mRNAs.

As noted above, hsa-let-7e and hsa-mir-125a are located in the first exon of mRNA AK125996, in the antisense orientation, whereas hsa-mir-99b overlaps the splicing donor site (Fig. 3C). Interestingly, the first exon of mRNA AK125996 overlaps that of the antisense mRNA AY358799 over 113 nucleotides. Although the overlapping region is outside of the miRNA-containing segment, it is tempting to speculate that these three miRNAs reside, in the sense orientation, in a longer form of the AY358799 mRNA, and might regulate the translation and/or stability of the mRNA AK125996.
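A miRNA transcribed from the strand opposite to an exon is, by construction, fully complementary to the corresponding mRNA segment. The sketch below shows how such perfect complementarity can be checked on sequences: the reverse complement of the mature miRNA is searched for in the mRNA. The function names are hypothetical, the sequences are made up, and G/U wobble pairs are deliberately not allowed in this simple version.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq):
    """Upper-case a sequence and spell it as RNA."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq):
    """Reverse complement of an RNA (or DNA) sequence, returned as RNA."""
    return to_rna(seq).translate(COMPLEMENT)[::-1]


def perfect_target_sites(mirna, mrna):
    """0-based start positions in the mRNA that pair perfectly with the miRNA."""
    site = reverse_complement(mirna)
    target = to_rna(mrna)
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]


# Made-up sequences: the site CCGGGUUU pairs with the toy miRNA AAACCCGG.
print(perfect_target_sites("AAACCCGG", "GGCCGGGUUUAA"))
```

An empty list means that no perfectly complementary site exists, which is the usual situation for animal miRNAs acting through partial pairing.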

M. J. Weber

New human and mouse microRNA gene

reside in introns of paralogous gene

Table 2. MicroRNA that families.

MicroRNA

Host gene

Function

Notes

Dynamins

CTD RNA polII phosphatases

a

Protein tyrosine phosphatases

b

Slit homologs, axonal guidance

c

Pantothenate kinases

d

hsa-mir-199b hsa-mir-199a-1 hsa-mir-199a-2 hsa-mir-26b hsa-mir-26a-1 hsa-mir-26a-2 hsa-mir-153–1 hsa-mir-153–2 hsa-mir-218–1 hsa-mir-218–2 hsa-mir-107 hsa-mir-103–2 hsa-mir-103–1 hsa-mir-211

DNM1 DNM2 DNM3 CTDSP1 CTDSPL CTDSP2 PTPRN PTPRN2 SLIT2 SLIT3 PANK1 PANK2 PANK3 TRPM1

Transient receptor potential

cation channels (melastatins)

e

hsa-mir-204 hsa-mir-148b

TRPM3 COPZ1

Coatamer protein

complex, subunits zeta

hsa-mir-152

COPZ2

MicroRNAs in gene families

a mmu-mir-153 is located in the Ptprn2 gene. No documented miRNA resides in the mouse Ptprn gene. b No documented miRNA resides in the human SLIT1 gene. In the October 2003 freeze of the mouse gen- ome, the Slit2 gene, as evidenced by BLATing the human Slit2 protein, comes in two parts, located on chr5_random (aa 133–538) and chr5 (aa 539–1529). Mmu-mir-218–1 resides in the chr5_random part. c hsa-mir- 103–2 and mir-103–1 are closely related to hsa-mir-107 (87 and 91% identity, respectively). d hsa-mir-211 and )204 have no sequence homol- ogy. e hsa-mir-148b and )152 display significant homology (77% identity over 66 nucleotides).

As shown in Table 2, many related human precursor miRNAs reside in corresponding introns of paralogous genes. The mature miRNAs are either identical (miR-15, miR-218), closely related (miR-199a and b, miR-26a and b), or display significant homology (hsa-mir-148b and -152, hsa-mir-107, -103-1 and -103-2). Therefore, the sequences of these intronic miRNAs have been largely conserved after gene duplications, raising the possibility that their function might have been conserved as well. This also suggests that additional genes in the families shown in Table 2 might contain still uncharacterized miRNAs. Intriguingly, hsa-mir-211 and -204 are located within an intron of the TRPM1 and TRPM3 genes, respectively, that bracket paralogous exons. These localizations are conserved in the mouse genome. However, these two miRNAs have no sequence similarity. In this case, it is thus possible that the presence of miRNAs in the introns of the TRPM1 and TRPM3 genes is posterior to the expansion of the TRPM gene family. This hypothesis is reinforced by the fact that the other members of the family, TRPM2 and TRPM4 through TRPM8, do not host known miRNAs. The large difference in the size of the introns of the TRPM1 and TRPM3 genes that host mir-211 and mir-204 (3 and 44 kb, respectively) also suggests extensive rearrangements posterior to gene duplication.

It is also intriguing to note that hsa-mir-199b, -199a-1 and -199a-2 reside in introns of the DNM1, DNM2 and DNM3 genes, respectively, but in the opposite orientation (Table 2). As discussed above, the DNM genes are thus probably not the actual hosts for these miRNAs. Indeed, hsa-mir-199a-2 is embedded in a large cluster of ESTs antisense to the DNM3 gene that also contains hsa-mir-214 (Table S3). Similarly, mmu-mir-199a-2 and mmu-mir-214 reside in the opposite orientation in an intron of the dynamin 3 gene (9630020E24Rik) and are embedded in a large cluster of antisense mRNAs and ESTs (not shown). Although there is no clear evidence for transcripts antisense to DNM1 and DNM2 genes, these observations suggest that the conservation of highly related miRNAs in the DNM gene family might result from the expansion in the vertebrate genomes of antisense transcripts in addition to the DNM genes themselves.

Only two other vertebrate miRNAs have been previously shown to be fully complementary to a cellular mRNA: mmu-mir-127 and mmu-mir-136 reside in the intronless Rtl1 gene, in the opposite orientation. Whereas the Rtl1 gene is paternally expressed, the two miRNAs are only expressed from the maternal chromosome [27]. This reciprocal imprinting suggests that mmu-miR-127 and mmu-miR-136 regulate Rtl1 gene expression by an siRNA mechanism. This situation probably also holds for the human genome, as both hsa-mir-127 and hsa-mir-136 reside in the inverse orientation within the presumptive human orthologue of Rtl1 (XM_352144).
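Full complementarity of the kind discussed above for mmu-mir-127 and mmu-mir-136 and the Rtl1 transcript can be checked mechanically: a mature miRNA is fully complementary to an mRNA if every base of the miRNA pairs with a stretch of the mRNA read in the antiparallel direction. The Python sketch below is only an illustration and is not the procedure used in this study. The function names and the toy sequences are hypothetical and are not taken from The miRNA Registry; the optional flag that tolerates G:U wobble pairs is likewise an assumption about how one might relax the test.

```python
# Watson-Crick pairs for RNA, plus the optional G:U wobble pairs.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pairs_with(mirna: str, site: str, allow_wobble: bool = False) -> bool:
    """True if `mirna` (5'->3') pairs along its full length with `site`
    (5'->3'); the two strands are antiparallel, so `site` is read backwards."""
    if len(mirna) != len(site):
        return False
    allowed = CANONICAL | WOBBLE if allow_wobble else CANONICAL
    return all((m, s) in allowed for m, s in zip(mirna, reversed(site)))


def find_complementary_sites(mirna: str, mrna: str, allow_wobble: bool = False):
    """Return the 0-based start positions in `mrna` of fully paired sites."""
    n = len(mirna)
    return [
        i
        for i in range(len(mrna) - n + 1)
        if pairs_with(mirna, mrna[i:i + n], allow_wobble)
    ]


# Toy data (hypothetical sequences, much shorter than a real miRNA).
toy_mirna = "AUGGCUAC"
toy_mrna = "CCGUAGCCAUGG"          # contains the exact reverse complement
toy_mrna_wobble = "CCGUAGCUAUGG"   # same site with one G:U wobble pair

strict_sites = find_complementary_sites(toy_mirna, toy_mrna)
wobble_sites = find_complementary_sites(toy_mirna, toy_mrna_wobble, allow_wobble=True)
```

With these toy sequences the strict search finds a single site starting at index 2, and the second transcript is matched only when wobble pairs are allowed.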

FEBS Journal 272 (2005) 59–73 © 2004 FEBS

Therefore, this study uncovers new cases where a miRNA might regulate the stability of a cellular mRNA through an siRNA mechanism. In all cases discussed above, the miRNA resides on the opposite strand relative to its target mRNA. This situation differs from that of miR-196a, which is fully complementary to the 3′-UTR of HOXB8 mRNAs, except for a G/U wobble [28]. In that case, the miRNA gene resides at a distance from its target gene, although in the same HOX locus.

Clusters of microRNAs

The existence of miRNA clusters has already been noted [14,29,30], but a precise definition of a cluster was not given. I suggest here three criteria: miRNAs in a cluster are in the same orientation, and are not separated by a transcription unit or by a miRNA in the opposite orientation. This definition is clearly more restrictive than that used by others [31]. Using such a definition, hsa-mir-10a and hsa-mir-196-1, although 52 kb apart, do not form a cluster, as they are separated by five HoxB genes. Similarly, hsa-mir-181c was not included in the cluster formed by hsa-mir-24-2, -27a and -23a, as it is not in the same orientation. The condition that clustered miRNAs have the same orientation might imply that they originate from the nucleolytic degradation of a unique transcript, as previously shown in HeLa cells for the hsa-miR-23/27/24-2 and hsa-miR-17/18/19a/20/19b-1 clusters [32]. A comprehensive list of 37 human miRNA clusters is presented in Table S4. All of them are conserved in the mouse genome, although some mouse clusters contain additional members, for which no human orthologue could be detected (mmu-mir-300, -329 and -341).
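The three criteria given above for a cluster translate into a simple scan along a chromosome. The following Python sketch is only an illustration under stated assumptions and is not the procedure used to build Table S4. The `Feature` record, its field names and the toy coordinates are hypothetical; each feature is reduced to a single start position on one chromosome; and no maximal distance between neighbours is imposed, because the text does not specify one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feature:
    name: str
    start: int       # position on the chromosome (hypothetical coordinates)
    strand: str      # "+" or "-"
    is_mirna: bool   # False for any other transcription unit (e.g. a gene)


def group_clusters(features: List[Feature]) -> List[List[str]]:
    """Group consecutive miRNAs that share an orientation and are not
    separated by another transcription unit or an opposite-strand miRNA."""
    clusters: List[List[str]] = []
    current: List[Feature] = []
    for feat in sorted(features, key=lambda f: f.start):
        if feat.is_mirna and current and feat.strand == current[-1].strand:
            current.append(feat)  # same orientation: extend the running group
            continue
        # A gene, or a miRNA on the other strand, closes the running group.
        if len(current) > 1:
            clusters.append([f.name for f in current])
        current = [feat] if feat.is_mirna else []
    if len(current) > 1:
        clusters.append([f.name for f in current])
    return clusters


# Toy example with made-up names and positions.
example = [
    Feature("mir-A", 100, "+", True),
    Feature("mir-B", 300, "+", True),
    Feature("gene-X", 500, "+", False),  # separates mir-B from mir-C
    Feature("mir-C", 700, "+", True),
    Feature("mir-D", 900, "-", True),    # opposite orientation to mir-C
    Feature("mir-E", 1100, "-", True),
]
example_clusters = group_clusters(example)
```

In this toy layout mir-A and mir-B form one cluster and mir-D and mir-E another, while mir-C stays alone because a transcription unit lies on one side of it and an oppositely oriented miRNA on the other.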

Among the 37 human clusters, 19 are located in RefSeq genes, known genes, or mRNAs from GenBank, while 18 lie outside of any characterized transcription unit. In no case are clustered miRNAs located in different introns of the same gene. This contrasts with intronic snoRNAs, for which an intron always carries a single snoRNA, but multiple introns of the same gene can carry different, or highly related snoRNAs [33–35]. This apparent difference between miRNAs and snoRNAs might be in fact only provisional, as the maternally expressed mouse Mirg gene may contain several miRNAs in different exons [27]. Moreover, clustered miRNAs are in three cases distributed between the intronic and exonic parts of a gene. First, hsa-let-7c and hsa-mir-125b-b2 are located in the same exon of mRNA AK125996, whereas hsa-mir-99b resides on the intron–exon junction, as discussed above (Fig. 3C). Second, hsa-mir-34b is located within an intron of mRNA BC021736, whereas hsa-mir-34c resides in an adjacent exon of the same transcript. Finally, hsa-mir-145 is located within the intronless AK093957 mRNA, whereas hsa-mir-143 resides outside of the mRNA. These special cases will probably be more fully understood when additional transcripts are available in the databases.

The newly described hsa-mir-299, -mir-323 and -mir-329 miRNAs (Table 1) are part of a cluster that also includes hsa-mir-134 and -mir-154 on chr14q32.31. The mouse cluster contains in addition mmu-mir-300, as well as several copies related to mmu-mir-134/154 [27]. In both human and mouse, this locus is subjected to genomic imprinting [27,39]. Mmu-mir-154 (and possibly other members of the cluster) resides in an intron of the maternally expressed, noncoding Mirg gene (AJ517757), which is poorly conserved in the human genome. In this miRNA cluster, several miRNAs are conserved between human and mouse: miR-323, miR-134, miR-154 and miR-299, while hsa-miR-329 differs from its rodent orthologue by four nucleotides (Table 1). Moreover, the human orthologue of mmu-mir-300 could not be found by a BLAT search. The reason for such a differential conservation of miRNAs within this cluster will probably be understood when their function is elucidated.

The three miRNA clusters formed by hsa-mir-125b-1, -let-7a-2, -mir-100 (chr11), hsa-mir-125b-2, -let-7c, -mir-99a (chr21) and hsa-mir-125a, -let-7e, -mir-99b (chr19) are composed of related precursors (Table S4). Interestingly, hsa-mir-125b-2 is the closest homologue of C. elegans lin-4. The first two clusters have already been described and are conserved in the mouse genome, while only one copy of the cluster resides in the D. melanogaster and Anopheles gambiae genomes [36,37]. The present identification of hsa-mir-99b thus allows for the identification of an additional cluster on human chromosome 19. In the human genome, no sequence similarity was found in the mRNAs that host the three clusters (AK091713, AK125996 and AK095614, respectively). The chr11 and chr21 clusters reside in an intron of their host gene, but no homologies in the two introns containing the clusters were found outside of the miRNA sequences. In contrast, two miRNAs of the chr19 cluster, hsa-mir-125a and hsa-let-7e, were in an exon of the AK125996 host gene in the antisense orientation, whereas hsa-mir-99b encompassed the intron–exon junction, as noted above (Fig. 3C). Remarkably, the distances between the miRNAs in a given cluster are conserved between the human and mouse genomes, but vary considerably between clusters (Table S4). In contrast to the cases shown in Table 2, these three clusters were thus probably not generated by the duplication of their host gene. Functional studies are needed to better understand the reasons why the miRNAs of the miR-125/lin-4, miR-100/99 and let-7 families reside in clusters. Interestingly, miR-125, miR-100 and let-7 are co-regulated during Drosophila pupal development, and are expressed from a common precursor [36–38].

Prediction of miRNA candidates near known miRNA genes

The analysis of highly conserved regions with a hairpin structure near known miRNAs can reveal numerous new miRNA candidates. For example, such a sequence resides 94 base pairs from hsa-mir-144 (Figure S1A). The orthologous mouse sequence is located 101 base pairs from mmu-mir-144. Two other candidates are located near hsa-mir-195, in the intronless AK098506 mRNA (Figure S1B–C). This genomic region contains additional discrete peaks of conservation between the human, chimpanzee, mouse and rat genomes (Figure S1D) and might thus contain additional candidates. As pointed out in reference [8], there are more conserved hairpin structures in the human genome than there are miRNAs. However, restricting the informatic search of conserved hairpin structures to the vicinity of known miRNAs will certainly reveal many interesting candidates.

Conclusions

A systematic search of orthologues of known rodent microRNA precursors in the human genome has led to the identification of 35 new miRNA genes. Similarly, 42 new mouse miRNAs were found by orthology with known human miRNA precursors. Added to those deposited in the miRNA registry, about 200 miRNA genes are now localized in both human and mouse genomes, a number close to the total number (255) predicted by Lim et al. [13].

Among the 196 human miRNA precursors analyzed, 98 were located in introns of known genes, 81 in the sense orientation and 17 in the antisense orientation. In the latter case, it is suggested that the miRNA in fact belongs to an antisense transcription unit. Several cases were presented where miRNA precursors are totally included in a polyadenylated mRNA, sometimes as clusters. These data suggest that miRNA precursors can be generated by two mechanisms.

Precursors could be excised from introns, most probably after splicing. As miRNA precursors are excised by the Drosha endonuclease [1], their generation might occur independently of intron debranching. This mechanism can also apply for certain miRNA clusters. For example, the cluster of miR-17-18-19a-20-19-1 was detected by RT-PCR in a 0.4-kb product [32]. This cluster resides in a 3.8-kb intron of the BC040320 mRNA. The RT-PCR product might thus originate either from the primary transcript before splicing, or from a more stable product during the degradation of the spliced intron. It remains to be examined whether the cell/tissue specificity of the expression of the host gene fully accounts for the specificity of miRNA expression [19,40,41]. Moreover, the compilation of expression data might indicate whether the host gene of a miRNA and candidate target genes for the miRNA are expressed in the same cells.

Alternatively, miRNA precursors could also be generated from the exonic part of RNA polymerase II transcripts. This might be the case for most miRNAs classified as intergenic. It appears that such transcripts are for the most part non-protein-coding, often intronless and poorly conserved, and can carry a cluster of miRNAs.

These two possible mechanisms present similarities with those whereby snoRNAs are generated. In vertebrates, snoRNAs (except for U3, U8 and U13) are located in introns of host genes [42–44] and arise from the exonucleolytic degradation of spliced, debranched introns. A minor, splicing-independent pathway involving endonucleolytic cleavage within the intron has also been identified in yeast and rat [33,45–48]. In addition, in plants and certain yeasts, snoRNAs can be produced from independent, polycistronic transcription units. Differences exist, however, between miRNA and snoRNA localizations. First, an intron always contains a single snoRNA but can contain several miRNAs. Second, a single gene can host several snoRNAs in various introns, but a gene contains in general only one intronic miRNA, or one cluster of miRNAs in a single intron. The single known exception to this second rule is the maternally expressed mouse Mirg gene, which hosts several miRNA candidates in different exons [27]. Whether this situation is restricted to loci submitted to genomic imprinting deserves further experiments.

The identification of miRNA targets in animals is made difficult by the imperfect complementarity between the miRNA and the target mRNAs [21,49–51]. Therefore, only few miRNA targets have been proposed in mammals [27,28,49]. Here, I provide three new examples of perfect complementarity between a miRNA and a cellular mRNA. The first is mmu-miR-135a, whose sequence is fully complementary to a long form of the 3′-UTR of the mouse glycerate kinase gene, generated by the use of an alternative polyA site. The same situation probably holds true in human. Similarly, hsa-let-7e and hsa-miR-125a are perfectly complementary to parts of an exon of the noncoding AK125996 mRNA. The significance of this second case is, however, uncertain, as the AK125996 mRNA is only conserved in the mouse genome in the portions that correspond to the miRNA precursors.

A similar study was performed [52] and an analysis of miRNAs hosted in the Mirg gene has also been published [53].

Methods

The sequences of human and mouse mature (miR) and precursor (mir) miRNAs were obtained from The miRNA Registry (version Rfam2.2), and the entire set of sequences
was subjected to a BLAT search [54] in the human and mouse genomes (UCSC version hg16, July 2003 and UCSC version mm4, October 2003, respectively) on the UCSC Genome Browser server. The BLAT results were exported to two Excel™ tables, one for each genome, that were then ordered according to chromosome number and BLAT hit chromosomal position. This allowed for the determination of the position of the closest match of a mouse miRNA in the human genome, and vice versa. Sequences corresponding to potential new miRNA precursors were obtained from the UCSC Genome Browser, and their hairpin structure was assessed with the mfold program [55] (available at http://www.bioinfo.rpi.edu/applications/mfold/old/rna/). Multiple alignments were performed with the multalin program [56] (available at http://prodes.toulouse.inra.fr/multalin/multalin.html).

Acknowledgements

I thank Sam Griffiths-Jones (The Wellcome Trust Sanger Institute) for checking the new miRNAs before their addition to The miRNA Registry, and for his help in the submission of custom tracks to the UCSC Genome Browser; Jim Kent, Donna Karolchik and Hiram Clawson for integrating the microRNA tracks in the UCSC Genome Browser. I thank Emmanuel Käs (CNRS, Toulouse, France) for helpful discussions and Tamas Kiss and Hervé Seitz (CNRS, Toulouse, France) for their critical reading of the manuscript. This work was supported by grants from CNRS and La Ligue Nationale contre le Cancer to Tamas Kiss.

References

1 Lee Y, Ahn C, Han J, Choi H, Kim J, Yim J, Lee J, Provost P, Radmark O, Kim S & Kim VN (2003) The nuclear RNase III Drosha initiates microRNA processing. Nature 425, 415–419.

2 Park W, Li J, Song R, Messing J & Chen X (2002) CARPEL FACTORY, a Dicer homolog, and HEN1, a novel protein, act in microRNA metabolism in Arabidopsis thaliana. Curr Biol 12, 1484–1495.

3 Reinhart BJ, Weinstein EG, Rhoades MW, Bartel B & Bartel DP (2002) MicroRNAs in plants. Genes Dev 16, 1616–1626.

4 Lund E, Guttinger S, Calado A, Dahlberg JE & Kutay U (2004) Nuclear export of microRNA precursors. Science 303, 95–98.

5 Bohnsack MT, Czaplinski K & Gorlich D (2004) Exportin 5 is a RanGTP-dependent dsRNA-binding protein that mediates nuclear export of pre-miRNAs. RNA 10, 185–191.

6 Bartel DP (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116, 281–297.

7 Carrington JC & Ambros V (2003) Role of microRNAs in plant and animal development. Science 301, 336–338.

8 Lai EC (2003) microRNAs: runts of the genome assert themselves. Curr Biol 13, R925–R936.

9 Lee RC, Feinbaum RL & Ambros V (1993) The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843–854.

10 Reinhart BJ, Slack FJ, Basson M, Pasquinelli AE, Bettinger JC, Rougvie AE, Horvitz HR & Ruvkun G (2000) The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901–906.

11 Lai EC, Tomancak P, Williams RW & Rubin GM (2003) Computational identification of Drosophila microRNA genes. Genome Biol 4, R42.

12 Lim LP, Lau NC, Weinstein EG, Abdelhakim A, Yekta S, Rhoades MW, Burge CB & Bartel DP (2003) The microRNAs of Caenorhabditis elegans. Genes Dev 17, 991–1008.

13 Lim LP, Glasner ME, Yekta S, Burge CB & Bartel DP (2003) Vertebrate microRNA genes. Science 299, 1540.

14 Lagos-Quintana M, Rauhut R, Lendeckel W & Tuschl T (2001) Identification of novel genes coding for small expressed RNAs. Science 294, 853–858.

15 Griffiths-Jones S, Bateman A, Marshall M, Khanna A & Eddy SR (2003) Rfam: an RNA family database. Nucleic Acids Res 31, 439–441.

16 Kim J, Krichevsky A, Grad Y, Hayes GD, Kosik KS, Church GM & Ruvkun G (2004) Identification of many microRNAs that copurify with polyribosomes in mammalian neurons. Proc Natl Acad Sci USA 101, 360–365.

17 Karolchik D, Baertsch R, Diekhans M, Furey TS, Hinrichs A, Lu YT, Roskin KM, Schwartz M, Sugnet CW, Thomas DJ et al. (2003) The UCSC Genome Browser Database. Nucleic Acids Res 31, 51–54.

18 Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, Chen X, Dreyfuss G, Eddy SR, Griffiths-Jones S, Marshall M et al. (2003) A uniform system for microRNA annotation. RNA 9, 277–279.

19 Lagos-Quintana M, Rauhut R, Yalcin A, Meyer J, Lendeckel W & Tuschl T (2002) Identification of tissue-specific microRNAs from mouse. Curr Biol 12, 735–739.

20 Schwarz DS, Hutvagner G, Du T, Xu Z, Aronin N & Zamore PD (2003) Asymmetry in the assembly of the RNAi enzyme complex. Cell 115, 199–208.

21 Doench JG & Sharp PA (2004) Specificity of microRNA target selection in translational repression. Genes Dev 18, 504–511.

22 Bejanin S, Cervini R, Mallet J & Berrard S (1994) A unique gene organization for two cholinergic markers, choline acetyltransferase and a putative vesicular transporter of acetylcholine. J Biol Chem 269, 21944–21947.

23 Conrad C, Vianna C, Freeman M & Davies P (2002) A polymorphic gene nested within an intron of the tau gene: implications for Alzheimer's disease. Proc Natl Acad Sci USA 99, 7751–7756.

24 Gauwerky CE, Huebner K, Isobe M, Nowell PC & Croce CM (1989) Activation of MYC in a masked t(8;17) translocation results in an aggressive B-cell leukemia. Proc Natl Acad Sci USA 86, 8867–8871.

25 Bardwell JC, Regnier P, Chen SM, Nakamura Y, Grunberg-Manago M & Court DL (1989) Autoregulation of RNase III operon by mRNA processing. EMBO J 8, 3401–3407.

26 Regnier P & Grunberg-Manago M (1989) Cleavage by RNase III in the transcripts of the met Y-nus-A-infB operon of Escherichia coli releases the tRNA and initiates the decay of the downstream mRNA. J Mol Biol 210, 293–302.

27 Seitz H, Youngson N, Lin SP, Dalbert S, Paulsen M, Bachellerie JP, Ferguson-Smith AC & Cavaille J (2003) Imprinted microRNA genes transcribed antisense to a reciprocally imprinted retrotransposon-like gene. Nat Genet 34, 261–262.

28 Yekta S, Shih IH & Bartel DP (2004) MicroRNA-directed cleavage of HOXB8 mRNA. Science 304, 594–596.

29 Mourelatos Z, Dostie J, Paushkin S, Sharma A, Charroux B, Abel L, Rappsilber J, Mann M & Dreyfuss G (2002) miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev 16, 720–728.

30 Lagos-Quintana M, Rauhut R, Meyer J, Borkhardt A & Tuschl T (2003) New microRNAs from mouse and human. RNA 9, 175–179.

31 Calin GA, Sevignani C, Dumitru CD, Hyslop T, Noch E, Yendamuri S, Shimizu M, Rattan S, Bullrich F, Negrini M & Croce CM (2004) Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers. Proc Natl Acad Sci USA 101, 2999–3004.

32 Lee Y, Jeon K, Lee JT, Kim S & Kim VN (2002) MicroRNA maturation: stepwise processing and subcellular localization. EMBO J 21, 4663–4670.

33 Cavaille J, Vitali P, Basyuk E, Huttenhofer A & Bachellerie JP (2001) A novel brain-specific box C/D small nucleolar RNA processed from tandemly repeated introns of a noncoding RNA gene in rats. J Biol Chem 276, 26374–26383.

34 Cavaille J, Seitz H, Paulsen M, Ferguson-Smith AC & Bachellerie JP (2002) Identification of tandemly-repeated C/D snoRNA genes at the imprinted human 14q32 domain reminiscent of those at the Prader–Willi/Angelman syndrome region. Hum Mol Genet 11, 1527–1538.

35 Tycowski KT, Shu MD & Steitz JA (1996) A mammalian gene with introns instead of exons generating stable RNA products. Nature 379, 464–466.

36 Aravin AA, Lagos-Quintana M, Yalcin A, Zavolan M, Marks D, Snyder B, Gaasterland T, Meyer J & Tuschl T (2003) The small RNA profile during Drosophila melanogaster development. Dev Cell 5, 337–350.

37 Sempere LF, Sokol NS, Dubrovsky EB, Berger EM & Ambros V (2003) Temporal regulation of microRNA expression in Drosophila melanogaster mediated by hormonal signals and broad-complex gene activity. Dev Biol 259, 9–18.

38 Bashirullah A, Pasquinelli AE, Kiger AA, Perrimon N, Ruvkun G & Thummel CS (2003) Coordinate regulation of small temporal RNAs at the onset of Drosophila metamorphosis. Dev Biol 259, 1–8.

39 Lin SP, Youngson N, Takada S, Seitz H, Reik W, Paulsen M, Cavaille J & Ferguson-Smith AC (2003) Asymmetric regulation of imprinting on the maternal and paternal chromosomes at the Dlk1-Gtl2 imprinted cluster on mouse chromosome 12. Nat Genet 35, 97–102.

40 Chen CZ, Li L, Lodish HF & Bartel DP (2004) MicroRNAs modulate hematopoietic lineage differentiation. Science 303, 83–86.

41 Houbaviy HB, Murray MF & Sharp PA (2003) Embryonic stem cell-specific MicroRNAs. Dev Cell 5, 351–358.

42 Kiss T (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions. Cell 109, 145–148.

43 Bachellerie JP, Cavaille J & Huttenhofer A (2002) The expanding snoRNA world. Biochimie 84, 775–790.

44 Weinstein LB & Steitz JA (1999) Guided tours: from precursor snoRNA to functional snoRNP. Curr Opin Cell Biol 11, 378–384.

45 Caffarelli E, Arese M, Santoro B, Fragapane P & Bozzoni I (1994) In vitro study of processing of the intron-encoded U16 small nucleolar RNA in Xenopus laevis. Mol Cell Biol 14, 2966–2974.

46 Caffarelli E, Fatica A, Prislei S, De Gregorio E, Fragapane P & Bozzoni I (1996) Processing of the intron-encoded U16 and U18 snoRNAs: the conserved C and D boxes control both the processing reaction and the stability of the mature snoRNA. EMBO J 15, 1121–1131.

47 Fragapane P, Prislei S, Michienzi A, Caffarelli E & Bozzoni I (1993) A novel small nucleolar RNA (U16) is encoded inside a ribosomal protein intron and originates by processing of the pre-mRNA. EMBO J 12, 2921–2928.

48 Villa T, Ceradini F, Presutti C & Bozzoni I (1998) Processing of the intron-encoded U18 small nucleolar RNA in the yeast Saccharomyces cerevisiae relies on both exo- and endonucleolytic activities. Mol Cell Biol 18, 3376–3383.

49 Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP & Burge CB (2003) Prediction of mammalian microRNA targets. Cell 115, 787–798.

50 Stark A, Brennecke J, Russell RB & Cohen SM (2003) Identification of Drosophila MicroRNA Targets. PLoS Biol 1, E60.

51 Kiriakidou M, Nelson PT, Kouranov A, Fitziev P, Bouyioukos C, Mourelatos Z & Hatzigeorgiou A (2004) A combined computational-experimental approach predicts human microRNA targets. Genes Dev 18, 1165–1178.

52 Rodriguez A, Griffiths-Jones S, Ashurst JL & Bradley A (2004) Identification of mammalian microRNA host genes and transcription units. Genome Res 14, 1902–1910.

53 Seitz H, Royo H, Bortolin ML, Lin SP, Ferguson-Smith AC & Cavaille J (2004) A large imprinted microRNA gene cluster at the mouse Dlk1-Gtl2 domain. Genome Res 14, 1741–1748.

54 Kent WJ (2002) BLAT – the BLAST-like alignment tool. Genome Res 12, 656–664.

55 Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31, 3406–3415.

56 Corpet F (1988) Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res 16, 10881–10890.

57 Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM & Haussler D (2002) The human genome browser at UCSC. Genome Res 12, 996–1006.

Supplementary material

The following material is available from http://www.blackwellpublishing.com/products/journals/suppmat/EJB/EJB4389/EJB4389sm.htm

Appendix S1. Human and mouse miRNA precursors reside in conserved regions of synteny.

Fig. S1. miRNA gene candidates clustered with known miRNA genes.

Table S1. New microRNA precursors deposited in The miRNA Registry (version 3.0), at the Wellcome Trust Sanger Institute.

Table S2. Human miRNAs that reside in introns of known genes in the sense orientation.

Table S3. Human miRNAs that reside in introns of known genes in the reverse orientation.

Table S4. Clusters of human microRNAs.
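As an illustration of the post-processing step described in Methods, where BLAT results were ordered by chromosome number and hit position to locate the closest match of each query, a minimal Python sketch is given here. It is not the original workflow, which relied on spreadsheet tables. The `Hit` record, its fields, the query names and all values are hypothetical; real BLAT output has many more columns; and reading "closest match" as the highest-scoring hit for a query is an assumption.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    query: str   # name of the miRNA precursor used as the BLAT query
    chrom: str   # chromosome name, e.g. "chr5"
    start: int   # start position of the hit on the chromosome
    score: int   # alignment score reported for the hit


def chrom_key(chrom: str):
    """Sort numbered chromosomes numerically, then everything else by name."""
    label = chrom[3:] if chrom.startswith("chr") else chrom
    return (0, int(label), "") if label.isdigit() else (1, 0, label)


def order_hits(hits: List[Hit]) -> List[Hit]:
    """Order hits by chromosome, then by position on the chromosome."""
    return sorted(hits, key=lambda h: (chrom_key(h.chrom), h.start))


def best_hit_per_query(hits: List[Hit]) -> Dict[str, Hit]:
    """Keep, for each query, the hit with the highest score."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.query not in best or hit.score > best[hit.query].score:
            best[hit.query] = hit
    return best


# Toy table with made-up query names, positions and scores.
toy_hits = [
    Hit("mmu-mir-X", "chr10", 500, 40),
    Hit("mmu-mir-X", "chr2", 900, 85),
    Hit("mmu-mir-Y", "chrX", 100, 70),
    Hit("mmu-mir-Y", "chr2", 300, 30),
]
ordered = order_hits(toy_hits)
best = best_hit_per_query(toy_hits)
```

Sorting numbered chromosomes numerically avoids the lexical ordering in which chr10 would precede chr2; the ordered table then shows at a glance which known miRNAs lie next to each best hit.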