Journal of Chemistry, Vol. 38, No. 3, P. 91 - 96, 2000

Using a combination of theoretical descriptors and a neural network to predict the activity of a set of N-alkyl-N-acyl-α-aminoamide derivatives

Received 21-02-2000
Pham Van Tat, Pham Nu Ngoc Han
Department of Chemistry, University of Dalat

Summary

A back-propagation artificial neural net has been trained to estimate the activity values of a set of 18 N-alkyl-N-acyl-α-aminoamide derivatives from the results of molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations. The input descriptors include molecular properties such as the partition coefficient P, 3D structure dependent parameters, charge dependent parameters, and topological descriptors.

I - Introduction

Quantitative structure-activity relationship (QSAR) analysis has been used extensively to correlate molecular structure features of compounds with their biological, chemical, and physical properties. The advantage of QSAR is that it gives a quantitative connection between the microscopic (molecular structure) and the macroscopic (empirical) properties of a molecule, particularly its biological activity. Furthermore, this connection can be used to predict the empirical properties of a compound from its molecular structure.

The N-alkyl-N-acyl-α-aminoamides were synthesized and screened against several protein tyrosine phosphatases (PTPases), yielding several classes of potent and selective inhibitors of various PTPases. The compounds of the general formula in figure 1 bearing a cinnamate group were shown to exhibit low micromolar inhibitory activity against HePTP, a phosphatase specific to hematopoietic cells and implicated in acute leukemia.

In general, the skeleton of the N-alkyl-N-acyl-α-aminoamide derivatives is given in figure 1 [1].

Figure 1: The skeleton of N-alkyl-N-acyl-α-aminoamide. A) general structure with substituents R1 to R4; B) the R1 group is replaced by a cinnamic group (atoms numbered 1 to 16)

The electrostatic interaction, the bulk or steric effect, and the transfer property (transferability) of the molecules are considered as microscopic properties. The theoretical descriptors used here are shown in table 1, and the atom numbering needed for indicating the net charges is given in figure 1.

Table 1: Theoretical descriptors used in the calculations [4]

3D structure dependent parameters

| Descriptor | Description |
|---|---|
| Volume | Van der Waals volume |
| Mol. Weight | Molecular weight of the molecule |
| Polar | Molecular polarizability |
| Sp. Polar | Specific polarizability of the molecule |
| LogP | The logarithm of the partition coefficient P |

Charge dependent parameters

| Descriptor | Description |
|---|---|
| Dipole | Dipole moment of the molecule |
| MaxQpos | Largest positive charge over the atoms |
| MaxQneg | Largest negative charge over the atoms |
| ABSQ | Sum of absolute values of the charges on each atom |
| ABSQon | Sum of absolute values of the charges on the nitrogen and oxygen atoms in the molecule |

Net charge of the atoms

| Descriptor | Description | Descriptor | Description | Descriptor | Description |
|---|---|---|---|---|---|
| C1 | Net charge of atom 1 | C7 | Net charge of atom 7 | N13 | Net charge of atom 13 |
| C2 | Net charge of atom 2 | O8 | Net charge of atom 8 | C14 | Net charge of atom 14 |
| C3 | Net charge of atom 3 | N9 | Net charge of atom 9 | C15 | Net charge of atom 15 |
| C4 | Net charge of atom 4 | C10 | Net charge of atom 10 | C16 | Net charge of atom 16 |
| C5 | Net charge of atom 5 | C11 | Net charge of atom 11 | HOMO | Highest occupied MO |
| C6 | Net charge of atom 6 | O12 | Net charge of atom 12 | LUMO | Lowest unoccupied MO |

Topological descriptors

| Descriptor | Description and calculation |
|---|---|
| X1 | First-order molecular connectivity index computed over all single bonds of a hydrogen-suppressed graph of the molecule (no hydrogen atoms present). It has been found to be one of the most useful 2-D descriptors when computing a QSAR/QSPR expression. The quantitative form of X1 is $X1 = \sum (\delta_i \delta_j)^{-0.5}$, where the sum is over all i-j bonds connecting atom i to atom j. $\delta_i$ is the number of skeletal neighbors of atom i: $\delta_i = \sigma_i - h_i$, where $\sigma_i$ is the number of electrons of atom i in sigma orbitals and $h_i$ is the number of hydrogen atoms bonded to skeletal atom i. |
| X3 | Third-order molecular connectivity index |
| VX1 | First-order valence connectivity index over all bonds for the entire molecule. $VX1 = \sum \left(1/(\delta^v_i \delta^v_j)\right)^{0.5}$, where $\delta^v$ is defined under VX0. $VX1 = {}^1\chi^v$ in the Kier and Hall notation. |
| Ka3 (Kappa Alpha3) | Third-order shape index for molecules. It encodes the identity of the atoms involved in assessing the shape of a molecule and can therefore discern isomers of the same molecule. $Ka3 = (A + \alpha - 1)(A + \alpha - 3)^2 / (P_i + \alpha)$ for A odd; $Ka3 = (A + \alpha - 2)(A + \alpha - 3)^2 / (P_i + \alpha)$ for A even. A = number of atoms in the molecule; $\alpha = \sum \left((R_i / R_{Csp^3}) - 1\right)$, where the summation is over all atoms in the molecule and $R_i$ and $R_{Csp^3}$ are the radii of the i-th atom and of an sp3 carbon atom. |
| WienI | The Wiener index is a topological parameter W as formulated by H. Wiener. It is based on the graph of the molecule (skeletal system without hydrogen). The path number W is defined as the sum of the distances between any two carbon atoms in the molecule, in terms of carbon-carbon bonds. A brief method of calculation is as follows: multiply the number of carbon atoms on one side of any bond by the number on the other side; W is the sum of these values over all bonds. The Wiener index is generally higher for larger molecules and provides some measure of the branching of the molecule. In particular, it is larger for extended molecules and smaller for more compact ones. It correlates with ovality and volume and, in some cases, can be used in place of one or both of these molecular descriptors. It is a unitless parameter. |
| VX0 | Zero-order valence connectivity index computed over all atoms in the entire molecule. $VX0 = \sum (1/\delta^v_i)^{0.5}$, where the summation is over all atoms in the molecule. $\delta^v_i = (Z^v - h)/(Z - Z^v - 1)$, where $Z^v$ is the number of valence electrons in skeletal atom i, Z is the atomic number, and h is the number of hydrogen atoms bonded to atom i. |
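
As an illustration of the topological descriptors in table 1, the following minimal Python sketch computes the first-order connectivity index X1 and the Wiener index for a hydrogen-suppressed molecular graph. It is not the SciQSAR implementation: the adjacency-dictionary representation, the function names and the two butane test graphs are assumptions made for this example, every skeletal atom is treated like a carbon atom, and every bond is treated as a simple graph edge.

```python
from collections import deque
from math import sqrt


def first_order_connectivity(adjacency):
    """X1 = sum over bonds of (delta_i * delta_j) ** -0.5,
    where delta is the number of skeletal neighbours of an atom."""
    degree = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    total = 0.0
    for i, nbrs in adjacency.items():
        for j in nbrs:
            if i < j:  # visit each bond only once
                total += 1.0 / sqrt(degree[i] * degree[j])
    return total


def wiener_index(adjacency):
    """Sum of the shortest-path lengths (in bonds) over all atom pairs."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives the bond distance from `start`
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for nbr in adjacency[atom]:
                if nbr not in dist:
                    dist[nbr] = dist[atom] + 1
                    queue.append(nbr)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


# Hypothetical test graphs: carbon skeletons of n-butane and isobutane
n_butane = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
isobutane = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}

for name, graph in (("n-butane", n_butane), ("isobutane", isobutane)):
    print(name, round(first_order_connectivity(graph), 4), wiener_index(graph))
```

The extended skeleton (n-butane) gives the larger Wiener index and the branched one (isobutane) the smaller, in line with the description of WienI in table 1.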
<br />
<br />
In this work, we carried out molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations, from which the molecular properties were evaluated. The results were then analysed by multiple linear regression and by a neural network.

II - Computational method

1. The data and related software

The 18 N-alkyl-N-acyl-α-aminoamides and their activity values IC50 are taken from [1] and shown in table 2. The structures were optimized using molecular mechanics and RHF/PM3/SCF MO semi-empirical quantum chemical approaches with the help of the programs HyperChem 5.11 [6], Gaussian 98 [7], Alchemy 2000 and SciQSAR 3.0 [4]. The statistical analysis used Essential Regression 2.218 (3/1999), a compiled MS Excel macro (add-in) [8], and the neural network calculations used the NeuroSolution 3.0 program [2]. All calculations were carried out on a Pentium II 350 MHz computer with 128 MB RAM at the Faculty of Chemistry, University of Dalat.

Table 2: Inhibitory activity of substituted N-alkyl-N-acyl-α-aminoamide molecules [1]

| No | R2 | R3 | R4 | IC50, µM |
|---|---|---|---|---|
| 1 | n-hexyl | n-butyl | -H | 9.0 |
| 2 | n-hexyl | tert-butyl | -H | 7.5 |
| 3 | n-hexyl | Cyclohexyl | -H | 9.0 |
| 4 | n-hexyl | Benzyl | -H | 6.0 |
| 5 | n-hexyl | -CH2COOH | -H | 7.5 |
| 6 | n-hexyl | -CH2CO2Me | -H | 7.2 |
| 7 | n-hexyl | -CH2CO2Et | -H | 10 |
| 8 | Phenyl | n-butyl | -H | 6.7 |
| 9 | Phenyl | tert-butyl | -H | 4.0 |
| 10 | Phenyl | Cyclohexyl | -H | 6.20 |
| 11 | Phenyl | Benzyl | -H | 3.90 |
| 12 | Phenyl | -CH2COOH | -H | 10.4 |
| 13 | Phenyl | -CH2CO2Me | -H | 20.2 |
| 14 | Phenyl | -CH2CO2Et | -H | 9.60 |
| 15 | Methyl | Benzyl | -H | 7.20 |
| 16 | Ethyl | Benzyl | -H | 15.0 |
| 17 | n-propyl | Benzyl | -H | 6.10 |
| 18 | n-butyl | Benzyl | -H | 6.30 |

2. Multivariate linear regression analysis

The regression equation used here is as follows [3, 4]:

$A = \sum_i P_i X_i + C \qquad (1)$

where $X_i$ is the i-th independent descriptor, $P_i$ is the fitting parameter for that descriptor, A is the biological activity of the drug, and C is a constant.

3. Neural network

A neural net is a tool that can be used to predict the value of a parameter. It is a computational system made up of a number of simple, yet highly connected, processing elements called nodes, which process information by their dynamic state response to external inputs. A recent description of the use of neural nets to correlate physical properties of compounds can be found in Soman's article [2, 5].

III - Results and discussion

1. Multivariate linear regression analysis

Multivariate linear regression analysis is a fast method to identify the calculated properties that are important for the prediction of experimental quantities. The magnitude of the fitting parameter Pi indicates the amount of the contribution of the descriptor to the activity: the larger the magnitude of Pi, the more important the descriptor is to the activity.

To examine the data characteristics in detail, the descriptors were selected in the linear regression analysis by the leave-one-out method on the basis of the change in multiple R. The principal descriptors are the net charges of atoms located in the benzene ring, i.e. C2, C3, C5 and C6, and at other sites, i.e. O12, O8, C10 and C11. The descriptors that represent net charges are thus the principal ones for describing the activity of the N-alkyl-N-acyl-α-aminoamides.

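
A fit of the form of equation (1), together with a leave-one-out check, can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the Essential Regression procedure: the data are synthetic (18 rows of three made-up descriptors), the function names are hypothetical, and "leave-one-out" is read here as leaving out one molecule at a time and summing the squared prediction errors (the PRESS statistic), which is one possible reading of the selection step described above.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit of y = X @ P + C; returns (P, C)."""
    design = np.column_stack([X, np.ones(len(X))])  # last column fits C
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]


def press(X, y):
    """Leave-one-out PRESS: refit without row k, then predict row k."""
    total = 0.0
    for k in range(len(y)):
        keep = np.arange(len(y)) != k
        P, C = fit_linear(X[keep], y[keep])
        total += (y[k] - (X[k] @ P + C)) ** 2
    return total


# Synthetic stand-in data (assumption): 18 molecules, 3 descriptors,
# the third of which has no real influence on the activity.
rng = np.random.default_rng(0)
X = rng.normal(size=(18, 3))
true_P, true_C = np.array([0.4, -0.3, 0.0]), 0.1
y = X @ true_P + true_C + rng.normal(scale=0.01, size=18)

P, C = fit_linear(X, y)
press_all = press(X, y)
press_without_third = press(X[:, :2], y)
print("P =", P, "C =", C)
print("PRESS (3 descriptors):", press_all)
print("PRESS (third descriptor dropped):", press_without_third)
```

Comparing the PRESS values of candidate descriptor sets is one way to judge whether a descriptor earns its place in the model; the paper itself reports using the change in multiple R for this purpose.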
Besides these, the topological descriptors, the 3D structure dependent parameters and the charge dependent parameters also contribute, i.e. X1, VX0, VX1, logP, WienI, Polar, ABSQ and ABSQneg. The most important of these seem to be X1, VX0, VX1, WienI, Polar and ABSQneg. This means that for the N-alkyl-N-acyl-α-aminoamides, the electrostatic interaction, the steric effect and the transferability are important in determining the activity. These properties should be equally useful as descriptors for a neural net. All investigations were performed with the program Essential Regression 2.218 (3/1999), a compiled MS Excel macro (add-in). The parameters multiple R, R2, Standard Error, PRESS, Significance F and the t-values were used to select the best regression model. The best regression model has 10 significant variables, listed in table 3:

$\mathrm{DivIC50} = 1/\mathrm{IC50} = -4.1674 + 0.4393\,\mathrm{X1} - 0.3211\,\mathrm{VX0} - 0.4123\,\mathrm{VX1} + 0.0254\,\mathrm{Volume} - 0.0011\,\mathrm{WienI} + 0.0056\,\mathrm{MolWeight} - 0.0129\,\mathrm{Dipole} - 0.3144\,\mathrm{ABSQ} + 0.4165\,\mathrm{O8} - 1.48\,\mathrm{O12} \qquad (2)$

Table 3: The regression statistics, Pi-values and t-values of the best regression model

| No | Regression statistic | Value |
|---|---|---|
| 1 | Multiple R | 0.9900 |
| 2 | R Square (R2) | 0.9801 |
| 3 | Standard Error | 0.0122 |
| 4 | PRESS | 0.0101 |
| 5 | Significance F | 0.00005 |

| Parameter | Pi | t-value |
|---|---|---|
| PX1 | 0.4393 | 13.75 |
| PVX0 | -0.3211 | -7.371 |
| PVX1 | -0.4123 | -12.47 |
| PVolume | 0.0254 | 10.47 |
| PWienI | -0.0011 | -13.94 |
| PMol.Weight | 0.0056 | 8.901 |
| PDipole | -0.0129 | -4.010 |
| PABSQ | -0.3144 | -11.38 |
| PO8 | 0.4165 | 4.007 |
| PO12 | -1.4800 | -5.419 |
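
To make the use of equation (2) concrete, the short Python sketch below evaluates DivIC50 from a dictionary of descriptor values and converts it back to IC50. The coefficients are copied from equation (2) and table 3; the function names and the dictionary layout are assumptions of this example, and no descriptor values from the paper are reproduced, so the usage lines only show the all-zero baseline.

```python
# Constant C and fitting parameters Pi, copied from equation (2) / table 3
INTERCEPT = -4.1674
COEFFICIENTS = {
    "X1": 0.4393, "VX0": -0.3211, "VX1": -0.4123, "Volume": 0.0254,
    "WienI": -0.0011, "MolWeight": 0.0056, "Dipole": -0.0129,
    "ABSQ": -0.3144, "O8": 0.4165, "O12": -1.48,
}


def predict_div_ic50(descriptors):
    """Evaluate equation (2) for one molecule (hypothetical helper)."""
    missing = set(COEFFICIENTS) - set(descriptors)
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return INTERCEPT + sum(
        COEFFICIENTS[name] * descriptors[name] for name in COEFFICIENTS
    )


def ic50_from_div(div_ic50):
    """DivIC50 is defined as 1/IC50, so IC50 (in µM) is its reciprocal."""
    return 1.0 / div_ic50


# With every descriptor set to zero only the constant term remains.
baseline = {name: 0.0 for name in COEFFICIENTS}
print(predict_div_ic50(baseline))
```

In practice the dictionary would hold the ten descriptor values computed for a molecule, and `ic50_from_div` would turn the predicted DivIC50 into an IC50 in the units of table 2.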
<br />
The descriptors found in equation (2) were used for the back-propagation neural net. Nine N-alkyl-N-acyl-α-aminoamides were taken from an SDF file of the Cambridge databases. The DivIC50 values of these 9 derivatives predicted by multiple linear regression are given in table 4, and the regression plot in figure 2.

2. Results of the back-propagation neural network

The architecture of the neural net is as follows: the input layer has 10 nodes, equal to the number of variables in equation (2); there is 1 hidden layer with 20 nodes; and the output layer has 1 node (the DivIC50 value). We trained the neural net on the set of 18 N-alkyl-N-acyl-α-aminoamides with a momentum of 0.7, the TanhAxon transfer function, and a maximum of 2000 epochs. NeuroSolution 3.0 was used in this work.
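
The 10 x 20 x 1 architecture and the training settings above can be mimicked with a small NumPy sketch of batch back-propagation with momentum. This is not NeuroSolution and does not reproduce its results: the linear output node, the learning rate of 0.01, the weight initialisation and the synthetic training data are all assumptions made for illustration; only the layer sizes, the tanh hidden layer, the momentum of 0.7 and the 2000 epochs follow the text.

```python
import numpy as np


def train_mlp(X, y, hidden=20, epochs=2000, lr=0.01, momentum=0.7, seed=0):
    """Train an n_in x hidden x 1 net (tanh hidden layer, linear output)
    by batch back-propagation with momentum.
    Returns (predict, mse_history)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.3, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.3, size=(hidden, 1))
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    target = y.reshape(-1, 1)
    history = []
    for _ in range(epochs):
        # Forward pass
        H = np.tanh(X @ W1 + b1)
        err = (H @ W2 + b2) - target
        history.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error
        g_out = 2.0 * err / len(X)
        g_H = (g_out @ W2.T) * (1.0 - H ** 2)
        grads = [X.T @ g_H, g_H.sum(axis=0), H.T @ g_out, g_out.sum(axis=0)]
        # Momentum update, done in place so W1, b1, W2, b2 stay current
        for p, v, g in zip(params, velocity, grads):
            v *= momentum
            v -= lr * g
            p += v

    def predict(X_new):
        return (np.tanh(X_new @ W1 + b1) @ W2 + b2).ravel()

    return predict, history


# Synthetic stand-in for 18 molecules with 10 standardised descriptors
rng = np.random.default_rng(1)
X_train = rng.normal(size=(18, 10))
y_train = 0.06 + 0.02 * np.tanh(X_train[:, 0])

predict, history = train_mlp(X_train, y_train)
print("MSE at first epoch:", history[0])
print("MSE at last epoch:", history[-1])
```

With only 18 training molecules and 20 hidden nodes such a net has far more weights than data points, which is why checking predictions on molecules outside the training set, as done in table 4, matters.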
<br />
Table 4: The DivIC50 values predicted by multiple linear regression and by a 10 x 20 x 1 neural net

| No | R2 | R3 | R4 | R5 | DivIC50 (experimental) | DivIC50 predicted by multiple linear regression | DivIC50 predicted by neural net |
|---|---|---|---|---|---|---|---|
| 1 | n-hexyl | Benzyl | -H | 3-Br | 0.0507 | 0.05248 | 0.04971 |
| 2 | n-hexyl | Benzyl | -H | 3-Cl | 0.0506 | 0.04768 | 0.05364 |
| 3 | n-hexyl | Benzyl | -H | 3-F | 0.0519 | 0.05724 | 0.05082 |
| 4 | n-hexyl | Benzyl | -H | 3-OCH3 | 0.0615 | 0.06011 | 0.06301 |
| 5 | n-hexyl | Benzyl | -H | 3-OH | 0.1104 | 0.10134 | 0.10115 |
| 6 | n-hexyl | Benzyl | -H | 3-NH3 | 0.0551 | 0.05636 | 0.05157 |
| 7 | n-hexyl | tert-butyl | -H | 3-Br | 0.0535 | 0.05666 | 0.05102 |
| 8 | n-hexyl | Cyclohexyl | -H | 3-Cl | 0.0562 | 0.05824 | 0.05258 |
| 9 | n-hexyl | -CH2COOH | -H | 3-F | 0.0546 | 0.04765 | 0.05052 |

We used the 10 x 20 x 1 neural net to predict the DivIC50 values of the 9 N-alkyl-N-acyl-α-aminoamide derivatives in table 4, on which the neural net was not trained. The correlation is illustrated in figure 3. The correlation coefficient R2 is 0.97866 with a standard deviation of 0.00259. These initial investigations are thus very promising for predicting the inhibitory activity of new drugs.

Figure 2: The activity values predicted by multiple linear regression analysis

Figure 3: The activity values predicted by a 10 x 20 x 1 neural net
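
The agreement between the columns of table 4 can be checked with a short NumPy sketch. The numbers are copied from table 4; the choice of statistics is an assumption of this example: the squared Pearson correlation is used as R2, and the root-mean-square error is shown as one simple measure of spread, which is not necessarily the standard deviation the paper reports.

```python
import numpy as np

# DivIC50 values copied from table 4 (rows 1 to 9)
experimental = np.array(
    [0.0507, 0.0506, 0.0519, 0.0615, 0.1104, 0.0551, 0.0535, 0.0562, 0.0546]
)
predicted_mlr = np.array(
    [0.05248, 0.04768, 0.05724, 0.06011, 0.10134,
     0.05636, 0.05666, 0.05824, 0.04765]
)
predicted_nn = np.array(
    [0.04971, 0.05364, 0.05082, 0.06301, 0.10115,
     0.05157, 0.05102, 0.05258, 0.05052]
)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return float(r ** 2)


def rmse(observed, predicted):
    """Root-mean-square error of the predictions."""
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


r2_mlr = r_squared(experimental, predicted_mlr)
r2_nn = r_squared(experimental, predicted_nn)
print("Multiple linear regression: R2 =", r2_mlr,
      "RMSE =", rmse(experimental, predicted_mlr))
print("Neural net: R2 =", r2_nn,
      "RMSE =", rmse(experimental, predicted_nn))
```

The neural-net value printed here can be compared with the R2 of 0.97866 quoted in the text.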
<br />
Conclusion

We have used molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations to evaluate molecular properties, and combined a multivariate linear regression analysis with a back-propagation neural net to predict the DivIC50 values of 9 N-alkyl-N-acyl-α-aminoamide derivatives. The approach is a promising technique that is applicable to the development of new drugs.

The predictions of our neural network show very good agreement with the experimental values (R2 = 0.97866 for the 9 derivatives in table 4) when the net is trained with a maximum of 2000 epochs. The danger of overtraining the neural net was checked using the standard deviation. The correlation coefficients and standard deviations are appropriate.

References

1. X. Cao, E. J. Moran, D. Siev, A. Lio, C. Ohashi, A. M. M. Mjalli. Bioorg. Med. Chem. Lett. 24, 2953-2958 (1995).
2. A. G. Soman, J. A. Darsey, D. W. Noid and B. G. Sumpter. Chimicaoggi/Chemistry Today, March (1995).
3. N. R. Draper, H. Smith. Applied regression analysis, 2nd Edition, John Wiley & Sons, New York (1998).
4. Scivision. SciQSAR 3.0 User's Guide, Burlington USA, Copyright (1999).
5. Scivision. SciLogP 3.0 User's Guide, Burlington USA, Copyright (1999).
6. Hypercube, Inc. Hyperchem Release 5.1 for Windows, October (1996).
7. J. Michael Frisch. Gaussian 98 User's Reference, Gaussian, Inc, 1994-1998.
8. D. David Steppan, Joachim Werner, P. Robert Yeater. Essential Regression and Experimental Design for Chemists and Engineers, Copyright, June (1998).
<br />