Computational reconstruction of metabolic networks from high throughput profiling data




Journal of Computer Science and Cybernetics, V.27, N.1 (2011), 23-35

COMPUTATIONAL RECONSTRUCTION OF METABOLIC NETWORKS FROM HIGH-THROUGHPUT PROFILING DATA

NGUYEN QUYNH DIEP1, PHAM THO HOAN1, HO TU BAO2, TRAN DANG HUNG1, PHAM QUOC THANG3

1 Hanoi National University of Education, 136 Xuan-Thuy, Cau-Giay, Hanoi, Vietnam
2 Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan
3 Tay-Bac University, Son-La city, Vietnam

Tóm tắt. Most current computational methods for reconstructing biological networks focus only on finding interactions between two elements, whereas metabolic networks consist of reactions involving from 2 to 6 compounds. Existing biological network discovery methods are therefore not suitable for reconstructing biochemical reactions with more than two participating compounds. This paper introduces a computational method for reconstructing metabolite networks from measurements of compound concentration/mass under different conditions or at different time points. The method detects not only interactions between two elements but also interactions among more than two elements, namely three-compound interactions, four-compound interactions, etc. In the proposed method, we use the third-order dependence information measure to search for multi-compound interactions. We offer a new view of this measure that is suited to detecting interactions among more than two variables. The performance of the proposed method has been evaluated on simulated data from biological metabolic systems. The accuracy of the reconstructed metabolic network is assessed at two levels: two-compound interactions and three-compound interactions. The reconstruction results of the proposed method are very promising.

Abstract. All computational methods of biological network reconstruction to date aim only to find pairwise interactions. Metabolic networks, however, are composed mainly of reactions that often involve 2 to 6 substrates/products, so the existing computational methods may not be appropriate for reconstructing interactions among more than two variables, such as the reactions in metabolic networks. In this paper, we develop a computational method for metabolic network reconstruction that can uncover not only pairwise interactions but also interactions involving more than two substrates/products, such as triple interactions, quartic interactions, etc. The proposed method uses the ternary mutual information to capture high-order interactions. The key idea is a novel view of the ternary mutual information that makes it appropriate for reconstructing reactions involving more than two substrates/products. We have applied the proposed method to synthesized metabolome data; the reconstruction accuracy has been evaluated at the levels of pairwise and triple interactions. The performance of the method is promising.

Keywords: Mutual information, entropy, biological network reconstruction.

1. INTRODUCTION

Thanks to the advancement of high-throughput technologies, we can now simultaneously measure the concentrations of thousands of molecular species in a biological system, such as mRNAs [22] and metabolites [18]. These high-throughput data are snapshots of a biological system and can be used to infer what has happened in the system.
The analysis of high-throughput data to uncover underlying biological mechanisms, e.g. gene regulatory networks (see [12] for an overview) or metabolic networks [6, 20], is one of the challenges in systems biology. Computational reconstruction of gene regulatory networks from transcriptome data has been investigated in depth using different approaches. These reverse-engineering methods fall into three broad categories: (1) information-theoretic models [24, 5, 19], which use a variety of measures of pairwise mutual information between genes; (2) Bayesian and graphical networks [10, 25], which maximize a scoring function over alternative network models to find the model that best fits the data; (3) differential and difference equations [11, 4], which explain the data by a system of mathematical equations. All work on gene regulatory network reconstruction to date aims to find only pairwise interactions (involving two genes).

Unlike gene regulatory networks, which mainly concern pairwise interactions, metabolic networks are composed mainly of reactions that often involve 2 to 6 metabolites (substrates/products). Metabolic network reconstruction should therefore aim to find groups of metabolites whose members take part in the same reaction. Previous efforts to reconstruct metabolic networks have used the methodologies of gene regulatory network reconstruction [6, 20]. As a consequence, they can detect only pairwise interactions, not interactions among more than two metabolites.

In this work, we develop a computational method, net-reconstruct, for metabolic network reconstruction that can uncover not only pairwise interactions but also interactions involving more than two substrates/products, for example triple interactions, quartic interactions, etc. In this method we use the interaction mutual information [9] to capture multiple interactions.
The key idea is to propose a novel view of the interaction mutual information that can be appropriately used to reconstruct reactions involving more than two substrates/products. When applied to synthetic perturbation data from fully random networks (all structures, kinetic laws and parameter values are randomly generated [2]) as well as from a semi-random network, the human red blood cell metabolism [14, 20], our method gave promising results: the interaction subsets it found are close to the validated metabolic reactions. The interaction subsets with the highest mutual information found by our method often correspond to metabolic reactions in the original networks, and many original reactions were found in the results of our software. When accuracy was evaluated at the level of pairwise interactions, the results of our method agreed with those of recent research on reconstruction methods.

2. METHODS

2.1. Mutual information between two variables

The mutual information measure is more general than Pearson's correlation coefficient (PPC) for capturing dependency between two variables. While PPC accounts only for linear or monotonic relationships, the mutual information takes into account all types of dependence.

[Figure 2.1. The Venn diagram for mutual information $MI^{(2)}$ of two variables. Two circles H(X) and H(Y); the non-overlapping regions are H(X|Y) and H(Y|X), and the intersection is MI(2)(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X).]

Given two random variables $X$ and $Y$ with joint density function $f_{X,Y}$ and marginal density functions $f_X$, $f_Y$, the mutual information $MI^{(2)}$ of $X$ and $Y$ [8] is defined as follows:

\[
MI^{(2)}(X, Y) = \iint f_{X,Y}(x, y) \log \frac{f_{X,Y}(x, y)}{f_X(x)\, f_Y(y)} \, dx \, dy \tag{2.1}
\]

(we use the superscript 2 to emphasize that the mutual information here is for 2 variables). If $X$ and $Y$ are independent, the mutual information $MI^{(2)}(X, Y) = 0$; if they are perfectly dependent, $MI^{(2)}(X, Y)$ approaches infinity.
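As an illustration of how Eq. 2.1 might be evaluated on measured profiles, the following is a minimal sketch that discretizes both variables into equal-width bins and applies the discrete analogue of the integral. The function name `mutual_information`, the choice of 8 bins, and the simulated data are assumptions made for this example only; the paper does not state which estimator its software uses.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of MI(2)(X, Y), in nats.

    Assumption: equal-width binning with `bins` bins per variable.
    This is one common estimator, not necessarily the authors' choice.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()              # joint probabilities
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, bins)
    p_indep = p_x @ p_y                     # product of marginals, shape (bins, bins)
    mask = p_xy > 0                         # empty cells contribute 0 to the sum
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / p_indep[mask])))

# Hypothetical data: one strongly dependent pair and one independent pair.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y_dep = x + 0.1 * rng.normal(size=5000)     # nearly a function of x
y_ind = rng.normal(size=5000)               # unrelated to x

mi_dep = mutual_information(x, y_dep)
mi_ind = mutual_information(x, y_ind)
print(f"dependent pair:   {mi_dep:.3f} nats")
print(f"independent pair: {mi_ind:.3f} nats")
```

The dependent pair should score well above the independent pair, whose estimate stays close to zero. Because of the binning, the estimate is bounded by the logarithm of the number of bins and carries a small positive bias, so the value depends on both `bins` and the sample size.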
The mutual information $MI^{(2)}(X, Y)$ can also be interpreted in terms of information entropy [8] as

\begin{align}
MI^{(2)}(X, Y) &= H(X) + H(Y) - H(X, Y) \tag{2.2} \\
&= H(X) - H(X \mid Y) \tag{2.3} \\
&= H(Y) - H(Y \mid X) \tag{2.4}
\end{align}

From Eq. 2.3 and Eq. 2.4, $MI^{(2)}(X, Y)$ measures the reduction in the uncertainty of $X$ due to knowledge of $Y$, or vice versa [3]. This entropy-based interpretation can be visualized by the Venn diagram in Figure 2.1, where $MI^{(2)}(X, Y)$ is the intersection of the two entropy circles $H(X)$ and $H(Y)$, and $H(X, Y)$ is their union [3, 13].

2.2. Mutual information for more than two variables

The mutual information $MI^{(2)}$ can detect interactions (edges) between two variables in a network. However, in most biological networks, each node (variable) may interact (link) with several others through the same or different mechanisms. Metabolic networks are an example of such networks: each metabolite may interact with others in different reactions. In this section, we present an extension of $MI^{(2)}$ that captures interactions among three variables. Generalizing mutual information from two variables to three is not trivial [3, 13]. One such generalization is the interaction mutual information [9], which has received much attention but whose interpretation remains controversial. It is defined as follows:

\begin{align}
MI^{(3)}(X, Y, Z) &= H(X) + H(Y) + H(Z) - H(X, Y) - H(Y, Z) - H(X, Z) + H(X, Y, Z) \tag{2.5} \\
&= MI^{(2)}(X, Y) - MI^{(2)}(X, Y \mid Z) \tag{2.6}
\end{align}

(a) (b) H(X) H(X) H(Z) H(Y) H(Y) H(Z) MI(3)(X,Y,Z)>0 (3) MI (X,Y,Z)