
Using a combination of theoretical descriptors and a neural network to predict the activity of a set of N-alkyl-N-acyl-α-aminoamide derivatives




Journal of Chemistry, Vol. 38, No. 3, P. 91 - 96, 2000

Received 21-02-2000
Pham Van Tat, Pham Nu Ngoc Han
Department of Chemistry, University of Dalat

Summary
A back-propagation artificial neural net has been trained to estimate the activity values of a set of 18 N-alkyl-N-acyl-α-aminoamide derivatives from the results of molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations. The input descriptors include molecular properties such as the partition coefficient P, 3D structure dependent parameters, charge dependent parameters, and topological descriptors.

I - Introduction
Quantitative structure-activity relationship (QSAR) analysis has been used extensively to correlate the molecular structure features of compounds with their biological, chemical, and physical properties. The advantage of QSAR is that it gives a quantitative connection between the microscopic (molecular structure) and the macroscopic (empirical) properties of a molecule, particularly its biological activity. Furthermore, this connection can be used to predict the empirical properties of a compound from its molecular structure.
The N-alkyl-N-acyl-α-aminoamides were synthesized and screened against several protein tyrosine phosphatases (PTPases), and several classes of potent and selective inhibitors of various PTPases were identified. In general, the skeleton of the N-alkyl-N-acyl-α-aminoamide derivatives is given in Figure 1 [1].
The compounds of the general formula in Figure 1 bearing a cinnamate group were shown to exhibit low micromolar inhibitory activity against HePTP, a phosphatase specific to hematopoietic cells and implicated in acute leukemia.

Figure 1: The skeleton of N-alkyl-N-acyl-α-aminoamide. A) general structure; B) the R1 group is replaced by a cinnamic group

The electrostatic interaction, the bulk or steric effect, and the transfer property (transferability) of the molecules are considered as microscopic properties. The theoretical descriptors used here are shown in Table 1, and the atom numbering needed to indicate the net charges is given in Figure 1.

Table 1: Theoretical descriptors used in the calculations [4]

3D structure dependent parameters

| Descriptor | Description |
|---|---|
| Volume | Van der Waals volume |
| Mol. Weight | Molecular weight of the molecule |
| Polar | Molecular polarizability |
| Sp. Polar | Specific polarizability of the molecule |
| LogP | The logarithm of the partition coefficient P |

Charge dependent parameters

| Descriptor | Description |
|---|---|
| Dipole | Dipole moment of the molecule |
| MaxQpos | Largest positive charge over the atoms |
| MaxQneg | Largest negative charge over the atoms |
| ABSQ | Sum of the absolute values of the charges on each atom |
| ABSQon | Sum of the absolute values of the charges on the nitrogen and oxygen atoms in the molecule |

Net charge of the atoms

| Descriptor | Description | Descriptor | Description | Descriptor | Description |
|---|---|---|---|---|---|
| C1 | Net charge on atom 1 | C7 | Net charge on atom 7 | N13 | Net charge on atom 13 |
| C2 | Net charge on atom 2 | O8 | Net charge on atom 8 | C14 | Net charge on atom 14 |
| C3 | Net charge on atom 3 | N9 | Net charge on atom 9 | C15 | Net charge on atom 15 |
| C4 | Net charge on atom 4 | C10 | Net charge on atom 10 | C16 | Net charge on atom 16 |
| C5 | Net charge on atom 5 | C11 | Net charge on atom 11 | HOMO | Highest occupied MO |
| C6 | Net charge on atom 6 | O12 | Net charge on atom 12 | LUMO | Lowest unoccupied MO |

Topological descriptors

| Descriptor | Description and calculation |
|---|---|
| X1 | First-order molecular connectivity index, computed over all single bonds of the hydrogen-suppressed graph of the molecule (no hydrogen atoms present). It has been found to be one of the most useful 2-D descriptors when computing a QSAR/QSPR expression. The quantitative form is $X1 = \sum (\delta_i \delta_j)^{-0.5}$, where the sum is over all bonds ij connecting atom i to atom j. Here $\delta_i$ is the number of skeletal neighbors of atom i: $\delta_i = \sigma_i - h$, where $\sigma_i$ is the number of electrons of atom i in sigma orbitals and h is the number of hydrogen atoms bonded to skeletal atom i. |
| X3 | Third-order molecular connectivity index. |
| VX1 | First-order valence connectivity index over all bonds of the entire molecule: $VX1 = \sum \left(1 / (\delta^v_i \delta^v_j)\right)^{0.5}$, where $\delta^v$ is defined under VX0. $VX1 = {}^1\chi^v$ in the Kier and Hall notation. |
| Ka3 (Kappa Alpha3) | Third-order shape index for molecules. It encodes the identity of the atoms involved in assessing the shape of a molecule and can therefore discern isomers of the same molecule. $Ka3 = (A + \alpha - 1)(A + \alpha - 3)^2 / (P_i + \alpha)$ for odd A, and $Ka3 = (A + \alpha - 2)(A + \alpha - 3)^2 / (P_i + \alpha)$ for even A, where A is the number of atoms in the molecule and $\alpha = \sum \left((R_i / R_{Csp^3}) - 1\right)$. The summation is over all atoms in the molecule, and $R_i$ and $R_{Csp^3}$ are the radii of the i-th atom and of an sp3 carbon atom. |
| WienI | The Wiener index, a topological parameter W as formulated by H. Wiener. It is based on the graph of the molecule (the skeletal system without hydrogen). The path number W is defined as the sum of the distances between any two carbon atoms in the molecule, in terms of carbon-carbon bonds. In brief, it is calculated as follows: multiply the number of carbon atoms on one side of any bond by the number on the other side; W is the sum of these values over all bonds. The Wiener index is generally higher for larger molecules and provides some measure of the branching of the molecule. In particular, it is larger for extended molecules and smaller for more compact ones. It correlates with ovality and volume and, in some cases, can be used in place of one or both of these molecular descriptors. It is a unitless parameter. |
| VX0 | Zero-order valence connectivity index computed over all atoms of the entire molecule: $VX0 = \sum (1 / \delta^v_i)^{0.5}$, where the summation is over all atoms in the molecule and $\delta^v_i = (Z^v - h) / (Z - Z^v - 1)$. Here $Z^v$ is the number of valence electrons of skeletal atom i, Z is its atomic number, and h is the number of hydrogen atoms bonded to atom i. |

In this work, we carried out molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations, from which the molecular properties were evaluated. The results were then analyzed by multiple linear regression and by a neural network.

II - Computational method

1. The data and related software
The 18 N-alkyl-N-acyl-α-aminoamides and their activity values IC50 are taken from [1] and shown in Table 2. The structures were optimized using molecular mechanics and the RHF/PM3/SCF MO semi-empirical quantum chemical approach with the help of the programs HyperChem 5.11 [6], Gaussian 98 [7], Alchemy 2000 and SciQSAR 3.0 [4]; the statistical program Essential Regression 2.218 (3/1999), a compiled MS Excel macro (add-in) [8]; and the NeuroSolution 3.0 program [2]. All calculations were carried out on a Pentium II 350 MHz computer with 128 MB RAM at the Faculty of Chemistry, University of Dalat.

Table 2: Inhibitory activity of substituted N-alkyl-N-acyl-α-aminoamide molecules [1]

| No | R2 | R3 | R4 | IC50, µM |
|---|---|---|---|---|
| 1 | n-hexyl | n-butyl | -H | 9.0 |
| 2 | n-hexyl | tert-butyl | -H | 7.5 |
| 3 | n-hexyl | Cyclohexyl | -H | 9.0 |
| 4 | n-hexyl | Benzyl | -H | 6.0 |
| 5 | n-hexyl | -CH2COOH | -H | 7.5 |
| 6 | n-hexyl | -CH2CO2Me | -H | 7.2 |
| 7 | n-hexyl | -CH2CO2Et | -H | 10 |
| 8 | Phenyl | n-butyl | -H | 6.7 |
| 9 | Phenyl | tert-butyl | -H | 4.0 |
| 10 | Phenyl | Cyclohexyl | -H | 6.20 |
| 11 | Phenyl | Benzyl | -H | 3.90 |
| 12 | Phenyl | -CH2COOH | -H | 10.4 |
| 13 | Phenyl | -CH2CO2Me | -H | 20.2 |
| 14 | Phenyl | -CH2CO2Et | -H | 9.60 |
| 15 | Methyl | Benzyl | -H | 7.20 |
| 16 | Ethyl | Benzyl | -H | 15.0 |
| 17 | n-propyl | Benzyl | -H | 6.10 |
| 18 | n-butyl | Benzyl | -H | 6.30 |

2. Multivariate linear regression analysis
The regression equation used here is as follows [3, 4]:

$$A = \sum_i P_i X_i + C \qquad (1)$$

where $X_i$ is the i-th independent descriptor, $P_i$ is the fitting parameter for that descriptor, A is the biological activity of the drug, and C is a constant.

3. Neural network
A neural net is a tool that can be used to predict the value of a parameter. It is a computational system made up of a number of simple, yet highly connected, processing elements called nodes, which process information through their dynamic state response to external inputs. The use of neural nets to correlate physical properties of compounds is described in Soman's article [2, 5].

III - Results and discussion

1. Multivariate linear regression analysis
Multivariate linear regression analysis is a fast method to identify the calculated properties that are important for the prediction of experimental quantities. The magnitude of the fitting parameter $P_i$ indicates the contribution of the descriptor to the activity: the larger the magnitude of $P_i$, the more important the descriptor is to the activity.
To examine the data in detail, the descriptors were selected in the linear regression analysis by the leave-one-out method on the basis of the change in multiple R. The principal descriptors are the net charges of the atoms located in the benzene ring, i.e. C2, C3, C5 and C6, and at the other sites O12, O8, C10 and C11. The descriptors which represent net charges are therefore the principal ones for describing the activity of the N-alkyl-N-acyl-α-aminoamides.
Besides these, there are also topological descriptors, 3D structure dependent parameters and charge dependent parameters, i.e. X1, VX0, VX1, logP, WienI, Polar, ABSQ and ABSQneg. The most important parameters seem to be X1, VX0, VX1, WienI, Polar and ABSQneg. This means that for the N-alkyl-N-acyl-α-aminoamides, the electrostatic interaction, the steric effect and the transferability are important in determining the activity. These properties should be equally useful as descriptors for a neural net.
All investigations were performed with the program Essential Regression 2.218 (3/1999), a compiled MS Excel macro (add-in). The parameters multiple R, R², standard error, PRESS, significance F and the t-values were used to select the best regression model. The best regression model has 10 significant variables, listed in Table 3:

$$\mathrm{DivIC_{50}} = 1/\mathrm{IC_{50}} = -4.1674 + 0.4393\,\mathrm{X1} - 0.3211\,\mathrm{VX0} - 0.4123\,\mathrm{VX1} + 0.0254\,\mathrm{Volume} - 0.0011\,\mathrm{WienI} + 0.0056\,\mathrm{MolWeight} - 0.0129\,\mathrm{Dipole} - 0.3144\,\mathrm{ABSQ} + 0.4165\,\mathrm{O8} - 1.48\,\mathrm{O12} \qquad (2)$$

Table 3: The regression statistics, Pi values and t-values of the best regression model

| No | Regression statistic | Value | Parameter | Pi | t-value | Parameter | Pi | t-value |
|---|---|---|---|---|---|---|---|---|
| 1 | Multiple R | 0.9900 | P(X1) | 0.4393 | 13.75 | P(Mol. Weight) | 0.0056 | 8.901 |
| 2 | R Square (R²) | 0.9801 | P(VX0) | -0.3211 | -7.371 | P(Dipole) | -0.0129 | -4.010 |
| 3 | Standard Error | 0.0122 | P(VX1) | -0.4123 | -12.47 | P(ABSQ) | -0.3144 | -11.38 |
| 4 | PRESS | 0.0101 | P(Volume) | 0.0254 | 10.47 | P(O8) | 0.4165 | 4.007 |
| 5 | Significance F | 0.00005 | P(WienI) | -0.0011 | -13.94 | P(O12) | -1.4800 | -5.419 |

The descriptors found in equation (2) were used for the back-propagation neural net. Nine N-alkyl-N-acyl-α-aminoamides were taken from an SDF file of the Cambridge databases. The DivIC50 values of these nine derivatives predicted by multiple linear regression are given in Table 4, and the regression plot is shown in Figure 2.

2. Results of the back-propagation neural network
The architecture of the neural net is as follows: the input layer has 10 descriptors, equal to the number of variables in equation (2); there is 1 hidden layer with 20 nodes; and the output layer has 1 descriptor (the DivIC50 value). We trained the neural net with the set of 18 N-alkyl-N-acyl-α-aminoamides under the following training conditions: a momentum of 0.7, the TanhAxon transfer function, and a maximum of 2000 epochs.
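The 10 x 20 x 1 architecture and training conditions described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the NeuroSolution 3.0 implementation: the learning rate, the weight initialization, the linear output node, the mean-squared-error loss and the full-batch updates are all assumed, and the function names are hypothetical. Because the paper does not tabulate the descriptor values, the data below are random placeholders with the right shape (18 molecules, 10 descriptors), not the real training set.

```python
import numpy as np


def init_params(rng, n_in=10, n_hidden=20, n_out=1):
    """Small random weights for a 10 x 20 x 1 network (initialization is an assumption)."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, X):
    """Tanh hidden layer followed by a linear output node (output transfer is assumed)."""
    H = np.tanh(X @ params["W1"] + params["b1"])
    return H, H @ params["W2"] + params["b2"]


def train(X, y, epochs=2000, lr=0.01, momentum=0.7, seed=0):
    """Full-batch back-propagation with momentum; lr is an assumed value."""
    rng = np.random.default_rng(seed)
    params = init_params(rng, n_in=X.shape[1])
    velocity = {k: np.zeros_like(v) for k, v in params.items()}
    losses = []
    n = X.shape[0]
    for _ in range(epochs):
        H, y_hat = forward(params, X)
        err = y_hat - y
        losses.append(float(np.mean(err ** 2)))
        # Back-propagate the gradient of the mean squared error.
        d_out = 2.0 * err / n
        d_hidden = (d_out @ params["W2"].T) * (1.0 - H ** 2)
        grads = {
            "W1": X.T @ d_hidden,
            "b1": d_hidden.sum(axis=0),
            "W2": H.T @ d_out,
            "b2": d_out.sum(axis=0),
        }
        # Momentum update: keep 70% of the previous step, add the new gradient step.
        for k in params:
            velocity[k] = momentum * velocity[k] - lr * grads[k]
            params[k] += velocity[k]
    return params, losses


# Synthetic placeholder data: 18 "molecules" with 10 "descriptors" each.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(18, 10))
y_demo = np.tanh(0.3 * (X_demo @ rng.normal(size=(10, 1))))

params, losses = train(X_demo, y_demo)
print(f"MSE at epoch 1: {losses[0]:.4f}, at epoch {len(losses)}: {losses[-1]:.4f}")
```

With real data, the 10 columns of `X_demo` would be the descriptors of equation (2), ideally rescaled to comparable ranges, and `y_demo` would hold the DivIC50 values.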
NeuroSolution 3.0 was used in this work.

Table 4: The DivIC50 values predicted by multiple linear regression and by a 10 x 20 x 1 neural net

| No | R2 | R3 | R4 | R5 | Experimental DivIC50 | Predicted DivIC50 (multiple linear regression) | Predicted DivIC50 (neural net) |
|---|---|---|---|---|---|---|---|
| 1 | n-hexyl | Benzyl | -H | 3-Br | 0.0507 | 0.05248 | 0.04971 |
| 2 | n-hexyl | Benzyl | -H | 3-Cl | 0.0506 | 0.04768 | 0.05364 |
| 3 | n-hexyl | Benzyl | -H | 3-F | 0.0519 | 0.05724 | 0.05082 |
| 4 | n-hexyl | Benzyl | -H | 3-OCH3 | 0.0615 | 0.06011 | 0.06301 |
| 5 | n-hexyl | Benzyl | -H | 3-OH | 0.1104 | 0.10134 | 0.10115 |
| 6 | n-hexyl | Benzyl | -H | 3-NH3 | 0.0551 | 0.05636 | 0.05157 |
| 7 | n-hexyl | tert-butyl | -H | 3-Br | 0.0535 | 0.05666 | 0.05102 |
| 8 | n-hexyl | Cyclohexyl | -H | 3-Cl | 0.0562 | 0.05824 | 0.05258 |
| 9 | n-hexyl | -CH2COOH | -H | 3-F | 0.0546 | 0.04765 | 0.05052 |

We used the 10 x 20 x 1 neural net to predict the DivIC50 values of the 9 N-alkyl-N-acyl-α-aminoamide derivatives in Table 4, on which the neural net was not trained. The correlation is illustrated in Figure 3. The correlation coefficient R² is 0.97866 with a standard deviation of 0.00259. These initial investigations are thus very promising for predicting the inhibitory activity of new drugs.

Figure 2: The activity values predicted by multiple linear regression analysis
Figure 3: The activity values predicted by a 10 x 20 x 1 neural net

Conclusion
We have used molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations to evaluate molecular properties, and combined a multivariate linear regression analysis with a back-propagation neural net to predict the DivIC50 values of 9 N-alkyl-N-acyl-α-aminoamide derivatives. The approach is a promising technique that is applicable to the development of new drugs.
The predictions of our neural network show very good agreement with the experimental values when the network is trained with a maximum of 2000 epochs. The danger of overtraining the neural net was checked with the standard deviation. The correlation coefficients and standard deviations are appropriate.

References
1. X. Cao, E. J. Moran, D. Siev, A. Lio, C. Ohashi, A. M. M. Mjalli. Bioorg. Med. Chem. Lett. 24, 2953-2958 (1995).
2. A. G. Soman, J. A. Darsey, D. W. Noid and B. G. Sumpter. Chimicaoggi/Chemistry Today, March (1995).
3. N. R. Draper, H. Smith. Applied regression analysis, 2nd Edition, John Wiley & Sons, New York (1998).
4. Scivision. SciQSAR 3.0 User's Guide, Burlington, USA, Copyright (1999).
5. Scivision. SciLogP 3.0 User's Guide, Burlington, USA, Copyright (1999).
6. Hypercube, Inc. Hyperchem Release 5.1 for Windows, October (1996).
7. J. Michael Frisch. Gaussian 98 User's Reference, Gaussian, Inc., 1994-1998.
8. D. David Steppan, Joachim Werner, P. Robert Yeater. Essential Regression and Experimental Design for Chemists and Engineers, Copyright, June (1998).
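As a cross-check on the statistics quoted for Table 4, the agreement between the experimental and predicted DivIC50 columns can be recomputed from the tabulated values. The short sketch below does this with NumPy. It assumes that the reported R² is the squared Pearson correlation between the experimental and predicted values; the paper does not state this explicitly, and the function and variable names are illustrative only.

```python
import numpy as np

# DivIC50 = 1/IC50 values transcribed from Table 4 (rows 1 to 9).
experimental = np.array(
    [0.0507, 0.0506, 0.0519, 0.0615, 0.1104, 0.0551, 0.0535, 0.0562, 0.0546]
)
mlr_predicted = np.array(
    [0.05248, 0.04768, 0.05724, 0.06011, 0.10134, 0.05636, 0.05666, 0.05824, 0.04765]
)
nn_predicted = np.array(
    [0.04971, 0.05364, 0.05082, 0.06301, 0.10115, 0.05157, 0.05102, 0.05258, 0.05052]
)


def squared_correlation(observed, predicted):
    """Square of the Pearson correlation coefficient between two series."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return float(r ** 2)


def rmse(observed, predicted):
    """Root-mean-square deviation between observed and predicted values."""
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


r2_mlr = squared_correlation(experimental, mlr_predicted)
r2_nn = squared_correlation(experimental, nn_predicted)

print(f"Multiple linear regression: R^2 = {r2_mlr:.4f}, RMSE = {rmse(experimental, mlr_predicted):.5f}")
print(f"Neural net (10 x 20 x 1):   R^2 = {r2_nn:.4f}, RMSE = {rmse(experimental, nn_predicted):.5f}")
```

For the neural-net column this gives a squared correlation of about 0.979, in line with the R² of 0.97866 reported in the text. The root-mean-square deviation printed here is a separate measure and is not necessarily the "standard deviation" quoted in the paper.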