Summary of Physics doctoral thesis: The role of hydrophobic and polar sequence in the folding mechanism of proteins and the aggregation of peptides

MINISTRY OF EDUCATION AND TRAINING

VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY

GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY

NGUYEN BA HUNG

THE ROLE OF HYDROPHOBIC AND POLAR SEQUENCE ON FOLDING MECHANISMS OF PROTEINS AND AGGREGATION OF PEPTIDES

Major: Theoretical and computational physics Code: 9 44 01 03

HANOI − 2018

SUMMARY OF PHYSICS DOCTORAL THESIS

INTRODUCTION

The problem of protein folding has always been of prime concern in molecular biology. Under normal physiological conditions, most proteins acquire well-defined compact three-dimensional shapes, known as the native conformations, in which they are biologically active. When proteins unfold or misfold, they not only lose their inherent biological activity but can also aggregate into insoluble fibrillar structures called amyloids, which are known to be involved in many degenerative diseases such as Alzheimer's disease, Parkinson's disease, type 2 diabetes, cerebral palsy, and mad cow disease. Thus, determining the folded structure and clarifying the folding mechanism of proteins plays an important role in our understanding of living organisms as well as of human health.

Protein aggregation and amyloid formation have also been studied extensively in recent years. These studies have led to the hypothesis that the amyloid is a generic state of all proteins and is the ground state of the system when proteins can form intermolecular interactions. Thus, the tendency towards aggregation and amyloid formation exists for all proteins and competes with protein folding. However, experiments have also shown that the aggregation propensity and the aggregation rates depend on the solvent conditions and on the amino acid sequence of the protein. Some studies have shown that short amino acid segments in the protein chain may have a significant effect on the aggregation ability. As a result, knowledge of the link between the amino acid sequence and the aggregation propensity is essential for understanding amyloid-related diseases as well as for finding ways to treat them.

Although all-atom simulations are now widely used in molecular biology, applying these methods to the protein folding problem is not feasible due to the limits of computer speed. A suitable approach to the protein folding problem is to use simple theoretical models. There are quite a number of models with different ideas and levels of simplicity, most notably the Go model, the HP lattice model, and the tube model.

Considerations of tubular polymers suggest that tubular symmetry is a fundamental feature of protein molecules, giving rise to the secondary structures of proteins (α and β). Based on this idea, the tube model of proteins was developed by Hoang, Maritan, and coworkers and proposed in 2004. The results obtained with the tube model suggest that it is a simple model that describes well many of the basic features of proteins. The tube model is also the only current model that can be used to study both folding and aggregation processes.

In this thesis, we use the tube model to study the role of the hydrophobic and polar sequence in the folding mechanism of proteins and the aggregation of peptides. The steric (space-filling) constraints of the tubular polymer and the hydrogen bonds in the model play the role of background interactions and are independent of the amino acid sequence. The amino acid sequence in the simplified model consists of two types of amino acids, hydrophobic (H) and polar (P). To study the effect of the HP sequence on the folding process, we compare the folding properties of the tube model with hydrophobic interactions (HP tube model) with those of the tube model with native-contact pairing interactions similar to the Go model (Go tube model). This comparison helps to clarify the role of non-native interactions in protein folding. To study the role of the HP sequence in protein aggregation, we compare the aggregation propensities of peptides with different HP sequences, including the shapes of the aggregate structures and the properties of the aggregation transition. In addition, in the study of protein aggregation, we propose an improved model of the hydrophobic interaction in the tube model by taking into account the orientation of the side chains of hydrophobic amino acids. Our research shows that this improved model yields highly ordered, elongated aggregate structures resembling amyloid fibrils.

1. The objectives of the thesis:

The aim of the studies is to gain a fundamental understanding of the role of the hydrophobic and polar sequence in the folding mechanism of proteins and the aggregation of peptides.

2. The main contents of the thesis:

Chapters 1 and 2 introduce the general background on proteins, protein folding, and protein aggregation. Chapter 3 presents the methods used to simulate and analyze the data. The results on the role of the HP sequence in protein folding are presented in Chapter 4, and the results on its role in protein aggregation are presented in Chapter 5.

Chapter 1

Protein folding

1.1 Structural properties of proteins

Proteins are macromolecules that are synthesized in the cell and are responsible for the most basic and important aspects of life. Proteins are polymers (polypeptides) formed from sequences of 20 different types of amino acids, the monomers of the polymer. The amino acids in the protein differ only in their side chains and are linked together through peptide bonds to form a linear sequence in a particular order.

Under normal physiological conditions, most proteins acquire well-defined compact three-dimensional shapes, known as the native conformations, in which they are biologically active.

The amino acid sequence of the protein determines its structure and function. Proteins have four levels of structure.

Primary structure: the chemical sequence of amino acids along the backbone of the protein. The amino acids in the chain are linked together by peptide bonds.

Secondary structure: the local spatial arrangement of amino acids. There are two main types of such structures: α-helices and β-sheets. These structures maximize the number of hydrogen bonds (H-bonds) between the CO and NH groups of the backbone.

Tertiary structure: the compact packing of the secondary structures. Usually, this is the full three-dimensional structure of the protein. Tertiary structures of large proteins are usually composed of several domains.

Quaternary structure: Some proteins are composed of more than one polypeptide chain. The polypeptide chains may have identical or different amino acid sequences depending on the protein. Each chain is called a subunit and has its own tertiary structure. The spatial arrangement of these subunits in the protein is called the quaternary structure.

There are a number of semi-empirical interactions introduced by chemists and physicists to describe interactions in proteins: disulfide bridges, Coulomb interactions, hydrogen bonds, van der Waals interactions, and hydrophobic interactions.

1.2 Protein folding phenomenon

Once translated by a ribosome, each polypeptide folds from a random coil into its characteristic three-dimensional structure. Since the fold is maintained by a network of interactions between amino acids in the polypeptide, the native state of the protein chain is determined by the amino acid sequence (the thermodynamic hypothesis).

1.3 Paradox of Levinthal

The Levinthal paradox addresses the question: how can proteins possibly find their native state if the number of possible conformations of a polypeptide chain is astronomically large?

1.4 Folding funnel

Figure 1.1: Sketch of the folding funnel describing the protein folding energy landscape.

Based on theoretical and experimental findings, Onuchic and his colleagues came up with the idea of the folding funnel, as depicted in Figure 1.1. The folding process of the protein in the funnel is a simultaneous reduction of both energy and entropy. As the protein begins to fold, the free energy decreases and the number of accessible configurations decreases (represented by the reduced width of the well).

Figure 1.2: Free energy landscape in the two-state model. ∆FN and ∆FD are the heights of the barrier measured from the unfolded and folded states, respectively, and ∆F is the free energy difference between the N and U states.

In the canonical depiction of the folding funnel, the depth of the well represents the energetic stabilization of the native state relative to the denatured state, and the width of the well represents the conformational entropy of the system. The surface outside the well is shown as relatively flat to represent the heterogeneity of the random coil state.

1.5 The minimum frustration principle

The minimum frustration principle was introduced in 1989 by Bryngelson and Wolynes based on spin glass theory. This principle holds that the amino acid sequences of natural proteins have been optimized through natural selection so that the frustration caused by conflicting interactions in the native state is minimal.

1.6 Two-state model for protein folding

Experimental observations suggest that the two-state model is a common mechanism characterizing the folding dynamics of the majority of small, globular proteins. In a two-state model of protein folding, a single-domain protein can occupy only one of two states: the unfolded state (U) or the folded state (N). The free energy diagram of the two-state model is characterized by a large barrier separating the folded and unfolded states, which correspond to minima of the free energy along a reaction coordinate. The free energy difference between the N and U states (∆F), called the folding free energy, characterizes the stability of the folded state. The folding rate kf and the unfolding rate ku obey the van 't Hoff-Arrhenius law:

$$ k_{f,u} = \nu_0 \exp\left(-\frac{\Delta F_{N,D}}{k_B T}\right) \tag{1.1} $$

where ν0 is a constant prefactor, T is the temperature, and kB is the Boltzmann constant. Changes in conditions such as temperature, pressure, and concentration may affect ∆F.
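As an illustration of Eq. (1.1), the following minimal Python sketch evaluates the folding and unfolding rates for given barrier heights. The function name `two_state_rates`, the prefactor, and the numerical barrier values are assumptions chosen only for illustration, with energies in units where kB = 1.

```python
import math

def two_state_rates(dF_N, dF_D, T, nu0=1.0, kB=1.0):
    """Return (k_f, k_u) from Eq. (1.1).

    dF_N: barrier height seen from the unfolded state (controls folding).
    dF_D: barrier height seen from the folded state (controls unfolding).
    """
    k_f = nu0 * math.exp(-dF_N / (kB * T))
    k_u = nu0 * math.exp(-dF_D / (kB * T))
    return k_f, k_u

# Hypothetical barriers in arbitrary energy units
k_f, k_u = two_state_rates(dF_N=2.0, dF_D=5.0, T=1.0)
# The ratio of the two rates depends only on the barrier difference.
ratio = k_f / k_u
```

With these illustrative numbers the folded state is favoured, since the ratio kf/ku equals exp[(∆FD − ∆FN)/kBT] and the prefactor cancels.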

1.7 Cooperativity of protein folding

Cooperativity is a phenomenon displayed by systems involving identical or near-identical elements that act dependently of each other. The folding of proteins is a cooperative process. For proteins, cooperativity is applied to the two-state process and is understood as the sharpness of the thermodynamic transition. In practice, cooperativity is quantified by the ratio between the van 't Hoff enthalpy and the calorimetric enthalpy:

$$ \kappa_2 = \Delta H_{vH} / \Delta H_{cal} \tag{1.2} $$

The closer κ2 is to 1, the higher the cooperativity and the better the system satisfies the two-state criterion, and vice versa.
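The ratio in Eq. (1.2) can be estimated from a specific heat curve. The sketch below assumes one commonly used estimate, ∆HvH = 2 Tmax sqrt(kB Cmax), and takes ∆Hcal as the area under the specific heat curve without any baseline subtraction. Both choices, as well as the function name `kappa2` and the toy two-state system used to exercise it, are assumptions and not necessarily the procedure followed in the thesis.

```python
import math

def kappa2(temps, heat_capacity, kB=1.0):
    """Estimate kappa_2 = dH_vH / dH_cal from a specific heat curve.

    temps must be increasing; heat_capacity[i] is C(temps[i]).
    """
    # van 't Hoff enthalpy from the position and height of the peak
    i_max = max(range(len(temps)), key=lambda i: heat_capacity[i])
    dH_vH = 2.0 * temps[i_max] * math.sqrt(kB * heat_capacity[i_max])
    # calorimetric enthalpy: area under the curve (trapezoidal rule)
    dH_cal = sum(
        0.5 * (heat_capacity[i] + heat_capacity[i + 1]) * (temps[i + 1] - temps[i])
        for i in range(len(temps) - 1)
    )
    return dH_vH / dH_cal

# Toy check: ideal two-state system with a folded level at E = 0 and an
# unfolded level at E = dH with degeneracy g, so T_m = dH / ln g (kB = 1).
dH, ln_g = 50.0, 100.0
temps = [0.2 + 0.0005 * i for i in range(1601)]

def two_state_C(T):
    p_unf = 1.0 / (1.0 + math.exp(dH / T - ln_g))
    return dH ** 2 * p_unf * (1.0 - p_unf) / T ** 2

k2 = kappa2(temps, [two_state_C(T) for T in temps])
```

For this ideal two-state toy system the estimate comes out very close to 1, which is the limit that the two-state criterion refers to.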

1.8 Hydrophobic interaction

The hydrophobic effect is the observed tendency of nonpolar substances (such as oil or fat) to aggregate in a polar solvent, usually water, and to exclude water molecules. In the case of protein folding, the hydrophobic effect is important for understanding the structure of proteins. It is considered to be the major driving force for the folding of globular proteins and results in the burial of the hydrophobic residues in the core of the protein.

1.9 HP lattice model

In the HP lattice model, there are two types of amino acids with respect to their hydrophobicity: polar (P), which tend to be exposed to the solvent on the protein surface, and hydrophobic (H), which tend to be buried inside the protein globule. The conformation of the protein is represented as a self-avoiding walk on a 2D or 3D lattice. Using this model, Dill designed HP sequences for which the minimum-energy state among the compact conformations is unique. The folding transition of the designed sequences is highly cooperative. This research shows that collapse due to hydrophobic interactions is the main driving force for folding.
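To make the HP lattice energy concrete, here is a minimal sketch that counts contacts between H residues that are lattice neighbours but not consecutive along the chain. The contact energy of −1 per H-H contact is the usual convention for this model and is assumed here; the function name `hp_lattice_energy` and the four-residue example are hypothetical.

```python
def hp_lattice_energy(sequence, coords, e_hh=-1.0):
    """Energy of an HP chain on a square or cubic lattice.

    sequence: string of 'H' and 'P'; coords: list of integer tuples.
    Each pair of H residues that are lattice neighbours but not
    consecutive along the chain contributes e_hh.
    """
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    energy = 0.0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
                if dist == 1:  # nearest neighbours on the lattice
                    energy += e_hh
    return energy

# Four residues folded into a unit square: the two terminal H residues touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
e_fold = hp_lattice_energy("HPPH", square)
# The same sequence fully stretched has no H-H contact.
e_open = hp_lattice_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
```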

1.10 Go model

The Go model ignores the specificity of the amino acid sequence in the protein chain, and its interaction potential is built from the structure of the folded state. The basis of the Go model is the principle of maximum consistency of the interactions in the folded state. Studies show that the folding mechanisms obtained with the Go model agree quite well with experiment, especially in determining the contribution of individual amino acid positions in the polypeptide chain to the transition state during protein folding. Because the model is based on the native state structure, the Go model cannot predict the protein structure from the amino acid sequence; it can only be used to study the folding process of a known structure.

1.11 Tube model

Considerations of symmetry and geometry lead to a description of the protein backbone as a thick polymer, or a tube. At low temperatures, a homopolymer modeled as a short tube exhibits two conventional phases, a swollen, essentially featureless phase and a conventional compact phase, along with a novel marginally compact phase in between that has relatively few optimal structures made up of α-helices and β-sheets. The tube model predicts the existence of a fixed menu of folds determined by geometry, clarifies the role of the amino acid sequence in selecting the native-state structure from this menu, and explains the propensity for amyloid formation.

Chapter 2

Amyloid Formation

2.1 The structure of amyloid ﬁbril

Figure 2.1: 3D structure of Alzheimer's amyloid-β (1-42) fibrils (PDB code 2BEG): (a) view along the fibril axis; (b) view perpendicular to the fibril axis.

Amyloid fibrils possess a cross-β structure, in which β-strands are oriented perpendicularly to the fibril axis and are assembled into β-sheets that run the length of the fibrils (Figure 2.1). They generally comprise 2–6 protofilaments that often twist around each other. Repeated interactions between hydrophobic and polar groups run along the fibril axis.

2.2 Mechanism of amyloid aggregation

The formation of amyloid can be considered to involve at least three phases, generally referred to as the lag phase, the growth (or elongation) phase, and the equilibration phase. Seeding involves the addition of preformed fibrils to a monomer solution, thus increasing the rate of conversion to amyloid fibrils. The addition of seeds shortens the lag phase by eliminating the slow nucleation step.

Chapter 3

Methods and Models for simulations

3.1 HP tube model

The backbone of the protein is modeled as a chain of Cα atoms separated by a distance of 3.8 Å, forming a flexible tube of thickness ∆ = 2.5 Å. The finite thickness is imposed as a constraint on the radius Rijk of the circle drawn through any three Cα atoms i, j, k, both local and non-local. The three-body potential describing this condition is given by (Figure 3.1)

$$ V_{tube}(i,j,k) = \begin{cases} \infty & \text{if } R_{ijk} < \Delta \\ 0 & \text{if } R_{ijk} \ge \Delta \end{cases} \qquad \forall\, i,j,k \tag{3.1} $$
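The three-body constraint of Eq. (3.1) can be sketched as follows, taking the three-body radius to be the radius of the circle through three points and using the thickness of 2.5 Å quoted above as the default. The function names `three_body_radius` and `v_tube` are hypothetical, and this is only an illustration, not the thesis code.

```python
import math

def three_body_radius(a, b, c):
    """Radius of the circle through points a, b, c (math.inf if collinear)."""
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    # Heron's formula for the squared area of the triangle
    s = 0.5 * (ab + bc + ca)
    area_sq = max(s * (s - ab) * (s - bc) * (s - ca), 0.0)
    if area_sq == 0.0:
        return math.inf
    return ab * bc * ca / (4.0 * math.sqrt(area_sq))

def v_tube(a, b, c, delta=2.5):
    """Eq. (3.1): infinite penalty if the three-body radius is below delta."""
    return math.inf if three_body_radius(a, b, c) < delta else 0.0

# A 3-4-5 right triangle: the circumradius is half the hypotenuse.
r_345 = three_body_radius((0, 0, 0), (3, 0, 0), (0, 4, 0))
```

A collinear triplet has an infinite radius and is therefore always allowed, while three points that are too tightly packed are rejected.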

The bending potential in the tube model is related to the spatial constraints of the polypeptide chain. The bending potential at position i is given by (Figure 3.1)

$$ V_{bend}(i) = \begin{cases} \infty & \text{if } R_{i-1,i,i+1} < \Delta \\ e_R & \text{if } \Delta \le R_{i-1,i,i+1} < 3.2\ \text{\AA} \\ 0 & \text{if } R_{i-1,i,i+1} \ge 3.2\ \text{\AA} \end{cases} \tag{3.2} $$

Here eR = 0.3 ε > 0, and the unit ε corresponds to the energy of a local hydrogen bond. In the tube model, local hydrogen bonds are formed between atoms i and i + 3 and are assigned an energy of −ε. Non-local hydrogen bonds are formed between atoms i and j > i + 4 and have an energy of −0.7 ε. The energy and geometric constraints of a local hydrogen bond between atom i and atom j are defined as follows:



(3.3)

The same for a non-local hydrogen bond:

$$ \begin{cases} j > i + 4 \\ e_{hbond} = -0.7\,\epsilon \\ 4.1\ \text{\AA} \le r_{ij} \le 5.3\ \text{\AA} \\ |\vec{b}_i \cdot \vec{b}_j| > 0.8 \\ |\vec{b}_j \cdot \vec{c}_{ij}| > 0.94 \\ |\vec{b}_i \cdot \vec{c}_{ij}| > 0.94 \end{cases} \tag{3.4} $$

Figure 3.1: Sketch of the potentials used in the tube model of the protein. r and y are the local and non-local radii of curvature; z is the distance between two amino acid residues; eR and eW are the bending energy and the hydrophobic energy.

In the tube model, hydrophobic interactions are introduced in the form of a pairing potential between Cα atoms that are non-consecutive in sequence (j > i + 1), given by

$$ V_{hydrophobic}(i,j) = \begin{cases} e_W & \text{if } r_{ij} \le 7.5\ \text{\AA} \\ 0 & \text{if } r_{ij} > 7.5\ \text{\AA} \end{cases} \tag{3.5} $$

where eW denotes the hydrophobic interaction energy per contact, which depends on the hydrophobicity of the amino acids i and j. In most studies, these values were chosen as eHH = −0.5 ε and eHP = ePP = 0.
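The pairwise potential of Eq. (3.5) with the HP energies quoted above can be summed over a configuration as in the following sketch. The helper names and the four-bead square used as an example are hypothetical; only the cutoff of 7.5 Å and the values eHH = −0.5 ε, eHP = ePP = 0 come from the text.

```python
import math

E_HH, E_HP, E_PP = -0.5, 0.0, 0.0  # in units of epsilon, as quoted above
CUTOFF = 7.5                        # contact range in angstroms

def pair_energy(type_i, type_j):
    """Contact energy e_W for a pair of residue types ('H' or 'P')."""
    if type_i == "H" and type_j == "H":
        return E_HH
    if type_i == "P" and type_j == "P":
        return E_PP
    return E_HP

def hydrophobic_energy(sequence, coords):
    """Sum Eq. (3.5) over all pairs with j > i + 1."""
    total = 0.0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if math.dist(coords[i], coords[j]) <= CUTOFF:
                total += pair_energy(sequence[i], sequence[j])
    return total

# Four beads on a square of side 3.8 angstroms: every non-consecutive pair
# is within the cutoff, but only H-H pairs contribute to the energy.
coords = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0), (0, 3.8, 0)]
e_total = hydrophobic_energy("HPPH", coords)
```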

3.2 Go tube model

The Go tube model is a tube model in which the hydrophobic interaction energy is replaced by a Go-like interaction energy:

$$ E = E_{bend} + E_{hbond} + E_{Go} \tag{3.6} $$

Thus, the Go tube model retains the geometric and symmetry properties, the bending energy, and the hydrogen bonds of the tube model. The Go-type energy is built on the structure of the given native state. The Go interaction is given by:

$$ V_{Go}(i,j) = \begin{cases} C_{ij}\, e_W & \text{if } r_{ij} \le 7.5\ \text{\AA} \\ 0 & \text{if } r_{ij} > 7.5\ \text{\AA} \end{cases} \tag{3.7} $$

where Cij are the elements of the native contact map: Cij = 1 if a contact between i and j exists in the native state, and Cij = 0 otherwise. A contact in the native state is defined when the distance between two non-consecutive Cα atoms is less than 7.5 Å.
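A native contact map and the resulting Go energy of Eq. (3.7) might be computed as in the sketch below. The function names, the contact energy of −1 per contact, and the four-bead example are assumptions made for illustration; in the thesis the Go energies are chosen to match the hydrophobic energy of each native structure.

```python
import math

def native_contact_map(native_coords, cutoff=7.5):
    """C[i][j] = 1 if beads i and j (j > i + 1) are within cutoff natively."""
    n = len(native_coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            if math.dist(native_coords[i], native_coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

def go_energy(coords, cmap, e_w=-1.0, cutoff=7.5):
    """Sum Eq. (3.7) over pairs: only native contacts are rewarded."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):
            if cmap[i][j] and math.dist(coords[i], coords[j]) <= cutoff:
                total += e_w
    return total

# Hypothetical native state: four beads on a square of side 3.8 angstroms.
native = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0), (0, 3.8, 0)]
cmap = native_contact_map(native)
# A fully stretched chain forms none of the native contacts.
stretched = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (11.4, 0, 0)]
```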

3.3 Tube Model with correlated side chain orientations

We apply an additional constraint on the hydrophobic contact by taking into account the side chain orientation: ni · cij < 0.5 and −nj · cij < 0.5, where ni and nj are the normal vectors of the Frenet frames associated with beads i and j, respectively, and cij is a unit vector pointing from bead i to bead j. The new constraint is in accordance with the statistics drawn from an analysis of PDB structures.
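The orientation test can be sketched in a few lines. The discretization of the Frenet normal used here (the normalized discrete second derivative of the chain, pointing toward the local centre of curvature) is an assumption, as are the function names and the toy coordinates; the sketch also assumes the chain is locally bent at the beads considered, since a straight segment has no defined normal.

```python
import math

def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def frenet_normal(prev_pos, pos, next_pos):
    """Discrete normal at a bead (assumed form: normalized second difference)."""
    return unit(tuple(p + q - 2.0 * r for p, q, r in zip(prev_pos, next_pos, pos)))

def orientation_ok(chain_i, i, chain_j, j, threshold=0.5):
    """Check n_i . c_ij < threshold and -n_j . c_ij < threshold."""
    n_i = frenet_normal(chain_i[i - 1], chain_i[i], chain_i[i + 1])
    n_j = frenet_normal(chain_j[j - 1], chain_j[j], chain_j[j + 1])
    c_ij = unit(tuple(b - a for a, b in zip(chain_i[i], chain_j[j])))
    return dot(n_i, c_ij) < threshold and -dot(n_j, c_ij) < threshold

# Toy three-bead segments; the middle beads are the ones being tested.
chain_a = [(-1, 1, 0), (0, 0, 0), (1, 1, 0)]      # normal along +y
chain_b = [(-1, -6, 0), (0, -5, 0), (1, -6, 0)]   # normal along -y
chain_c = [(-1, -4, 0), (0, -5, 0), (1, -4, 0)]   # normal along +y
```

In this toy geometry the contact between `chain_a` and `chain_b` passes the test, while the one between `chain_a` and `chain_c` fails the second inequality.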

3.4 Structural protein parameters

To study protein folding to the native state, we examine the protein configurations obtained from the simulations through a number of characteristic quantities, including the folding contacts, the root mean square deviation (rmsd), and the radius of gyration (Rg).
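Of these quantities, the radius of gyration is the simplest to compute. A minimal sketch, assuming equal masses for all Cα beads and a hypothetical function name, is:

```python
import math

def radius_of_gyration(coords):
    """Root-mean-square distance of the beads from their centroid."""
    n = len(coords)
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    mean_sq = sum(math.dist(p, centroid) ** 2 for p in coords) / n
    return math.sqrt(mean_sq)

# Two beads placed symmetrically about the origin, each at distance 1
rg_pair = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
```

The rmsd additionally requires an optimal superposition onto the native structure, which is not shown here.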

3.5 Monte Carlo simulation method

To study the folding and aggregation of proteins, we carry out multiple independent Monte Carlo (MC) simulations with the Metropolis algorithm. The moves used to change the state of the system are pivot, crankshaft, and translation moves for protein aggregation, and pivot and crankshaft moves for protein folding.
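The Metropolis acceptance step applied to each trial move can be sketched as follows, in units where kB = 1. The function name, the handling of infinite energy changes (moves that violate a hard constraint), and the numerical example are assumptions for illustration.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng=random):
    """Accept a trial move with probability min(1, exp(-delta_e / T))."""
    if delta_e <= 0.0:
        return True
    if math.isinf(delta_e):
        return False  # the move violates a hard constraint of the model
    return rng.random() < math.exp(-delta_e / temperature)

# Acceptance frequency of an uphill move of 1 epsilon at T = 0.5 epsilon/kB
rng = random.Random(0)
trials = 20000
accepted = sum(metropolis_accept(1.0, 0.5, rng) for _ in range(trials))
frequency = accepted / trials
```

Over many trials the measured frequency should approach exp(−2), the Boltzmann factor of the uphill move.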

3.6 Parallel tempering

Parallel tempering, also known as replica exchange MCMC sampling, is a simulation method aimed at improving the dynamic properties of Monte Carlo simulations of physical systems, and of Markov chain Monte Carlo (MCMC) sampling methods more generally, by exchanging configurations between simulations run at different temperatures.

The swap of two configurations is accepted according to the Metropolis criterion:

$$ k_{BA} = \min\{1, \exp[(\beta_i - \beta_j)(E_i - E_j)]\} \tag{3.8} $$

where kBA is the probability of moving from state A to state B, that is, of exchanging the configurations at inverse temperatures βi and βj, whose energies are Ei and Ej. This method is very effective at finding the ground state while still sampling the equilibrium ensemble at each temperature, and it is easily implemented on parallel computers.
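Eq. (3.8) translates directly into a small function. The name `swap_probability` and the replica temperatures and energies below are hypothetical values chosen only to show the two possible outcomes.

```python
import math

def swap_probability(beta_i, beta_j, e_i, e_j):
    """Eq. (3.8): probability of exchanging the configurations of two replicas."""
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

# Hypothetical replicas at T = 0.25 and T = 0.5 (kB = 1), i.e. beta = 4 and 2.
# The colder replica currently holds the higher energy: always swap.
p_down = swap_probability(4.0, 2.0, e_i=-10.0, e_j=-12.0)
# The colder replica already holds the lower energy: swap only sometimes.
p_up = swap_probability(4.0, 2.0, e_i=-12.0, e_j=-10.0)
```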

3.7 The weighted histogram analysis method

The Weighted Histogram Analysis Method (WHAM) allows for optimal analysis of data obtained from MC simulations as well as other simulations over a wide range of parameters by combining multiple histograms together.

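The probability of finding the system with energy E at inverse temperature βk is

$$ P(\beta_k, E) = \frac{\sum_{l=1}^{R} N_l(E)\, e^{-\beta_k E}}{\sum_{l=1}^{R} n_l \exp[-\beta_l E - f_l]} \tag{3.9} $$

$$ f_k = \ln \sum_{E} P(E, \beta_k) \tag{3.10} $$

where R is the number of simulations, Nl(E) is the energy histogram collected in simulation l at inverse temperature βl, and nl is the total number of samples in that simulation. The parameters fk are calculated from Eqs. 3.9 and 3.10 self-consistently. Normally, fk converge quickly when the histograms are equilibrated and overlap. Determining the values of fk completely determines P(E, β) at any temperature.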
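The self-consistent solution of Eqs. (3.9) and (3.10) can be sketched by direct iteration. The code below is a minimal illustration under stated assumptions: the function names are hypothetical, f is only defined up to an additive constant and is fixed here by setting its first element to zero, and the input histograms are ideal (noise-free) histograms of a toy three-level system rather than simulation data.

```python
import math

def wham(betas, hists, energies, tol=1e-12, max_iter=100000):
    """Solve Eqs. (3.9)-(3.10) by direct iteration.

    hists[l][m] is the histogram count N_l(E_m) from the run at betas[l].
    Returns the list f with f[0] fixed to 0.
    """
    n_runs = len(betas)
    n_samples = [sum(h) for h in hists]                 # n_l
    total = [sum(h[m] for h in hists) for m in range(len(energies))]
    f = [0.0] * n_runs
    for _ in range(max_iter):
        denom = [
            sum(n_samples[l] * math.exp(-betas[l] * e - f[l]) for l in range(n_runs))
            for e in energies
        ]
        f_new = [
            math.log(sum(total[m] * math.exp(-b * e) / denom[m]
                         for m, e in enumerate(energies)))
            for b in betas
        ]
        f_new = [x - f_new[0] for x in f_new]           # fix the free constant
        if max(abs(a - b) for a, b in zip(f, f_new)) < tol:
            return f_new
        f = f_new
    return f

def mean_energy(beta, betas, hists, energies, f):
    """Reweighted average energy at an arbitrary beta."""
    n_samples = [sum(h) for h in hists]
    total = [sum(h[m] for h in hists) for m in range(len(energies))]
    weights = [
        total[m] * math.exp(-beta * e)
        / sum(n_samples[l] * math.exp(-betas[l] * e - f[l]) for l in range(len(betas)))
        for m, e in enumerate(energies)
    ]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)

# Toy system with density of states 1, 3, 5 at E = 0, 1, 2 and ideal histograms.
energies, omega, betas = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0], [1.0, 0.5]

def boltz(beta):
    w = [g * math.exp(-beta * e) for g, e in zip(omega, energies)]
    return [x / sum(w) for x in w], sum(w)

hists = [[1000.0 * p for p in boltz(b)[0]] for b in betas]
f = wham(betas, hists, energies)
```

For this toy input the difference f[1] − f[0] should reproduce the logarithm of the ratio of the two partition functions, and `mean_energy` can then be evaluated at temperatures that were not simulated.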

Chapter 4

The role of hydrophobic and polar sequence on folding mechanisms of proteins

Figure 4.1: Ground state conformations of two HP sequences considered in our study: a three-helix bundle (a) and a GB1-like structure (b)

In this chapter we study the folding process of proteins in two models: the HP tube model and the Go tube model. We construct the Go tube model for the two structures in such a way that the total hydrophobic energy of each structure is the same in the two models. The study was conducted with two proteins of the same length N = 48: a three-helix bundle (3HB) and a GB1-like structure (GB1). Figure 4.1 shows the native states of the 3HB and GB1 proteins.

In the HP tube model, eHH = −0.5 ε and eHP = ePP = 0, where the unit ε corresponds to the energy of a local hydrogen bond.

4.1 Thermodynamics of protein folding in HP tube model

Figure 4.2a–c shows the temperature dependence of the average radius of gyration ⟨Rg⟩, the average energy ⟨E⟩, and the specific heat of the 3HB protein in the HP tube model. Both the average energy and the radius of gyration decrease as the temperature decreases. The specific heat has a maximum Cmax = 1526 kB at Tf = 0.296 ε/kB. In the HP tube model there is a small shoulder to the right of the specific heat peak, at T ≈ 0.5 ε/kB, where the size of the protein decreases sharply as the temperature decreases while the energy does not decrease much. This shoulder corresponds to a collapse transition.

Figure 4.2: Temperature dependence of the average radius of gyration ⟨Rg⟩, the average energy ⟨E⟩, and the specific heat of the 3HB protein (a–c) and the GB1 protein (d–f) in the HP tube model.

Figure 4.3: Same as Figure 4.2 but for the Go tube model.

The GB1 protein behaves similarly (Figure 4.2d–f). Its specific heat maximum is Cmax = 509.7 kB at the transition temperature Tf = 0.243 ε/kB, both significantly lower than for 3HB, showing that the folding transition of GB1 is less sharp and less cooperative.

4.2 Thermodynamics of protein folding in Go tube model

Figure 4.3 shows the temperature dependence of the average energy ⟨E⟩, the average radius of gyration ⟨Rg⟩, and the specific heat of the 3HB and GB1 proteins in the Go tube model. The folding and collapse transitions are sharper than in the HP tube model. For both proteins, the changes of the average energy and of the average radius of gyration at the transition temperature are significantly larger and steeper than in the HP tube model. The specific heat has only a single peak at the transition temperature Tf; in particular, no shoulder appears at temperatures above the transition temperature. In the Go tube model, the collapse and folding transitions therefore coincide at the temperature of the specific heat peak.

The folding transition temperature Tf is also slightly higher in the Go tube model: 0.345 ε/kB versus 0.296 ε/kB for the 3HB protein and 0.291 ε/kB versus 0.243 ε/kB for the GB1 protein. The specific heat maximum Cmax is roughly 2.8 and 4.1 times higher in the Go tube model than in the HP tube model for the 3HB and GB1 proteins, respectively (4269 kB versus 1526 kB for 3HB and 2104 kB versus 509.7 kB for GB1). These observations suggest that the Go tube model is significantly more cooperative than the HP tube model and also yields a higher stability of the native state.

4.3 Folding transition phase in HP tube model and Go tube model

Figure 4.4: Trajectories and normalized histograms of the 3HB protein in the HP tube model, obtained from a long run of 2 × 10^9 MC steps at the folding transition temperature Tf = 0.296 ε/kB.

Figure 4.5: Same as Figure 4.4 but for GB1 in the HP tube model at Tf = 0.243 ε/kB.

Figure 4.6: Trajectories and normalized histograms of the 3HB protein in the Go tube model, obtained from a long run of 2 × 10^9 MC steps at the folding transition temperature Tf = 0.345 ε/kB.

Figure 4.7: Same as Figure 4.6 but for GB1 in the Go tube model at Tf = 0.291 ε/kB.

Figures 4.4 and 4.5 show long trajectories of 2 × 10^9 MC steps at the temperature Tf = 0.296 ε/kB for the 3HB protein and Tf = 0.243 ε/kB for the GB1 protein in the HP tube model. The energy and rmsd vary strongly at the transition temperature, while the radius of gyration Rg only fluctuates around its median value. This shows the existence of the folded phase at small energy and rmsd values and of the denatured phase at large energy and rmsd values. For the 3HB protein, the distributions of the energy (Figure 4.4(d)) and of the root mean square deviation (Figure 4.4(e)) have two peaks that distinguish the folded and unfolded phases, whereas the distribution of the radius of gyration Rg has only one peak (Figure 4.4(f)). For the GB1 protein, the distribution of Rg also has only one peak (Figure 4.5(f)), while the energy and rmsd distributions have two peaks (Figure 4.5(d,e)). These results indicate the existence of two phases, folded and unfolded, for both proteins, but the phase separation in terms of energy is more apparent for 3HB than for GB1. At the transition temperature, the two phases of both proteins do not differ in average size, as shown by the radius of gyration. There are also intermediate states between the two phases.

Figures 4.6 and 4.7 show long trajectories of 2 × 10^9 MC steps at the temperature Tf = 0.345 ε/kB for the 3HB protein and Tf = 0.291 ε/kB for the GB1 protein in the Go tube model. The energy, rmsd, and Rg of the two proteins all vary strongly over time. The energy and rmsd histograms have two distinct peaks, while the Rg histogram has a sharp peak at low values for the folded state and a broad shoulder at large values. The separation into folded and unfolded phases is much clearer in the Go tube model than in the HP tube model.

The effective free energy at a given temperature T is defined as F(E, rmsd) = −kBT log P(E, rmsd), where P(E, rmsd) is the probability density of finding the protein with energy E and the given rmsd.
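In practice such a surface can be estimated from a two-dimensional histogram of the sampled (E, rmsd) pairs. The sketch below is a minimal illustration: the function name, the bin widths, and the four sample points are hypothetical, and bins that were never visited are simply absent from the result.

```python
import math
from collections import Counter

def free_energy_surface(samples, temperature, e_bin=1.0, rmsd_bin=0.5, kB=1.0):
    """Return {(E bin index, rmsd bin index): F} with F = -kB T ln P.

    samples: iterable of (E, rmsd) pairs recorded along an equilibrium run.
    P is the fraction of samples in a bin divided by the bin area.
    """
    counts = Counter(
        (math.floor(e / e_bin), math.floor(r / rmsd_bin)) for e, r in samples
    )
    n_total = sum(counts.values())
    area = e_bin * rmsd_bin
    return {
        key: -kB * temperature * math.log(c / (n_total * area))
        for key, c in counts.items()
    }

# Hypothetical samples: three fall in one bin, one falls in another.
samples = [(-40.2, 1.1), (-40.7, 1.3), (-40.5, 1.2), (-20.3, 6.2)]
surface = free_energy_surface(samples, temperature=0.3)
```

Only free energy differences between bins are meaningful; here the more populated bin lies lower by kBT ln 3.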

Figure 4.8 shows the free energy at T = Tf for the 3HB and GB1 proteins in the HP tube model and the Go tube model. In the Go tube model, the free energy surface has only two minima, corresponding to the two states of the transition. The HP tube model has a more complex free energy surface, with three minima in the case of the 3HB protein and two minima in the case of the GB1 protein. Basically, the free energy surface of 3HB in the HP tube model still describes a two-state system, because the two minima of the unfolded phase are connected by a low barrier and can be lumped together. In all cases, there is a free energy barrier between the folded and unfolded phases. The unfolded phase of the proteins in the Go tube model is always high in energy, while the unfolded phase in the HP tube model involves states ranging from low to high energy. The existence of unfolded states with low energy is a consequence of the HP sequence in the HP tube model, which allows the formation of hydrophobic contacts that do not exist in the native state. At the same time, the lower folding transition temperature Tf in the HP tube model than in the Go tube model also makes it easier to form hydrogen bonds in the unfolded state.

Figure 4.8: Two-dimensional free energy landscape as a function of E and rmsd at the folding transition temperature: Tf = 0.296 ε/kB in the HP tube model (a) and Tf = 0.345 ε/kB in the Go tube model (b) for the 3HB protein; Tf = 0.243 ε/kB in the HP tube model (c) and Tf = 0.291 ε/kB in the Go tube model (d) for the GB1 protein.

Comparison of the HP tube model and the Go tube model suggests that changing the model changes the transition state. Specifically, for the 3HB protein, the transition state is near (E, rmsd) = (−43 ε, 5.5 Å) in the HP tube model and (−24 ε, 5 Å) in the Go tube model. For the GB1 protein, the transition state is near (−26 ε, 5.8 Å) in the HP tube model and (−28 ε, 8 Å) in the Go tube model. However, the change in the transition state is not as large as the change in the unfolded state when moving from the HP tube model to the Go tube model. This is consistent with previous theoretical and experimental studies suggesting that the mechanism of protein folding, as well as the transition state, depends primarily on the geometry of the folded state.

4.4 Eﬀect of hydrophobic interaction intensity on folding process

The 3HB protein is also used in this study. The value of |eHH| varies from 0.15 ε to 0.7 ε. Figure 4.10 shows the dependence of the specific heat on eHH. As |eHH| increases, Cmax decreases and Tf increases. The curves have a sharp peak signaling a first-order-like phase transition. From eHH = −0.3 ε to eHH = −0.7 ε the curve has a small shoulder, which broadens as |eHH| increases. For |eHH| < 0.3 ε the shoulder does not exist or is too small to be recognized on the graph.

Figure 4.9: Ground state conformations obtained from the simulations of the 3HB protein with varying hydrophobic interaction strengths. The structures shown correspond to eHH = −0.2 ε (a), eHH = −0.21 ε (b), eHH = −0.3 ε (c), eHH = −0.5 ε (d), and eHH = −0.7 ε (e).

Figure 4.10: Temperature dependence of the specific heat of the 3HB protein in the HP tube model with different hydrophobic interaction strengths eHH = −0.2 ε, −0.3 ε, −0.5 ε, and −0.7 ε.

Figure 4.11 shows the temperature dependence of the average energy ⟨E⟩ and of the radius of gyration ⟨Rg⟩. The average energy changes sharply at the folding transition temperature Tf. When |eHH| > 0.2 ε, Rg changes monotonically with temperature. As |eHH| increases, Rg changes more gradually with temperature and the inflection point of the curve shifts to higher temperatures. This shows that as |eHH| increases, the collapse transition occurs at higher temperatures. For |eHH| ≤ 0.2 ε, the radius of gyration depends on temperature non-monotonically: at low temperature Rg is large, corresponding to a ground state that is a single α-helix; as the temperature rises, the single helix becomes unstable due to thermal fluctuations and Rg decreases; as the temperature continues to rise, the hydrogen bonds break and the protein unfolds, so that its size and hence Rg increase.

The dependence of the cooperativity on the strength of the hydrophobic interaction is determined through the ratio between the van 't Hoff enthalpy and the calorimetric enthalpy, κ2 = ∆HvH/∆Hcal. The value of κ2 is 0.5975 ± 0.0166, 0.6181 ± 0.0116, 0.7267 ± 0.0206, and 0.7475 ± 0.0256 for |eHH| = 0.2 ε, 0.3 ε, 0.5 ε, and 0.7 ε, respectively. The results show that as the hydrophobic interaction becomes stronger, the cooperativity also becomes stronger, as shown by the increasing value of κ2.

Figure 4.11: Temperature dependence of the average energy ⟨E⟩ (a) and of the average radius of gyration ⟨Rg⟩ (b) of the 3HB protein in the HP tube model with different hydrophobic interaction strengths eHH = −0.2 ε, −0.3 ε, −0.5 ε, and −0.7 ε.

Chapter 5

The role of hydrophobic and polar sequence on aggregation of peptides

This chapter studies the aggregation of the short peptide in the tube model with correlated side chain orientations. We study the role of the HP sequence on protein aggregation and formation of amyloid ﬁbrils. We consider 12 HP sequences of length N = 8 as given in table 5.1 with number of peptide in each systems changing from m = 1 to m = 20. The sequences, denoted as S1 through S12, are selected in such a way that they contain only 2 or 3 hydrophobic (H) residues, corresponding to hydrophobic fraction of 25% and 37.5%, respectively. Figure 5.1 shows that the lowest energy conformation obtained in the simulations,supposed to be the ground state of a given system, strongly depends on the sequence.

5.1 Sequence dependence of aggregate structures

Fig. 5.1 shows the lowest energy conformations obtained in the simulations. Two sequences, S2 and S11, form a double-layer β-sheet structure with characteristics similar to those of a cross-β structure. A similar but less fibril-like structure is also found for sequence S12, with some parts that are non-β-sheet. Both sequences S3 and S4 form an α-helix bundle. The helix bundle of sequence S4, however, is more ordered and has an approximately cylindrical shape, in which the α-helices are almost parallel to each other.

Table 5.1: HP sequences of amino acids of the peptides considered in the present study (H – hydrophobic, P – polar). The parameter s denotes the minimal sequence separation between two consecutive H amino acids.

Sequence name   Sequence           s
S1              P P P H H P P P    1
S2              P P H P H P P P    2
S3              P P H P P H P P    3
S4              P H P P P H P P    4
S5              P H P P P P H P    5
S6              H P P P P P H P    6
S7              H P P P P P P H    7
S8              P P H H H P P P    1
S9              P P H P H H P P    1
S10             P H P P H H P P    1
S11             P H P H P H P P    2
S12             P H P P H P H P    2
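The parameter s and the hydrophobic fraction follow directly from the sequences in Table 5.1. The short sketch below recomputes both as a consistency check. The sequences are those of the table, while the function names are hypothetical and chosen only for this illustration.

```python
# HP sequences from Table 5.1 (H = hydrophobic, P = polar).
SEQUENCES = {
    "S1": "PPPHHPPP", "S2": "PPHPHPPP", "S3": "PPHPPHPP", "S4": "PHPPPHPP",
    "S5": "PHPPPPHP", "S6": "HPPPPPHP", "S7": "HPPPPPPH", "S8": "PPHHHPPP",
    "S9": "PPHPHHPP", "S10": "PHPPHHPP", "S11": "PHPHPHPP", "S12": "PHPPHPHP",
}

def min_h_separation(seq):
    """Smallest distance along the chain between two consecutive H residues."""
    positions = [i for i, residue in enumerate(seq) if residue == "H"]
    return min(b - a for a, b in zip(positions, positions[1:]))

def hydrophobic_fraction(seq):
    """Fraction of H residues in the sequence."""
    return seq.count("H") / len(seq)

for name, seq in SEQUENCES.items():
    fraction = 100.0 * hydrophobic_fraction(seq)
    print(f"{name:4s} {seq}  s = {min_h_separation(seq)}  H fraction = {fraction:.1f}%")
```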

The role of hydrophobic residues in aggregation can be inferred from the structures of the aggregates. The packing of hydrophobic side chains is best

Figure 5.1: Ground state conformations obtained by the simulations for systems of M = 10 peptides for 10 HP sequences (S1–S10) as given in Table 5.1.

observed for sequences S2 and S11, for which the hydrophobic residues are aligned within each β-sheet and the hydrophobic side chains from the two β-sheets face each other. This packing is possible due to the HPH pattern in these sequences, which positions the hydrophobic side chains on one side of each β-sheet. An alignment of hydrophobic residues is also seen for sequence S12 due to the HPH segment of this sequence. In the aggregate of sequence S4, which is a helix bundle, the hydrophobic side chains are gathered along the bundle axis, thanks to the alignment of hydrophobic side chains along one side of each α-helix. This alignment is due to the HPPPH pattern in the S4 sequence. On the other hand, the S3 sequence with the HPPH pattern also forms a helix, but the hydrophobic side chains are not well aligned in the helix, leading to a less ordered aggregate.
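The binary patterns named in this paragraph (HPH, HPPH and HPPPH) can be located in a sequence with a simple substring search. The following sketch is only an illustration of that bookkeeping. The function name is hypothetical, and the search reports which motifs occur rather than predicting the aggregate structure.

```python
# Binary motifs discussed in the text, in order of increasing H-H separation.
MOTIFS = ("HPH", "HPPH", "HPPPH")

def find_motifs(seq):
    """Return the motifs from MOTIFS that occur at least once in seq."""
    return [motif for motif in MOTIFS if motif in seq]

# Sequences taken from Table 5.1.
examples = {
    "S2": "PPHPHPPP",
    "S3": "PPHPPHPP",
    "S4": "PHPPPHPP",
    "S11": "PHPHPHPP",
    "S12": "PHPPHPHP",
}

for name, seq in examples.items():
    print(name, seq, find_motifs(seq))
```

Note that a motif can also occur in sequences whose aggregates are reported as disordered elsewhere in this chapter (S9, PPHPHHPP, contains HPH), so the presence of a motif alone does not determine the aggregate structure.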

5.2 Thermodynamics of aggregation

We find that the specific heat strongly depends on both the sequence and the system size. Fig. 5.2 and Fig. 5.3 show the temperature dependence of the specific heat per molecule for various system sizes for sequences S2 and S4, respectively. For sequence S2, as M increases the specific heat peak shifts toward higher temperature and its height increases (Fig. 5.2). This result indicates that the aggregate becomes increasingly stable and the transition becomes more cooperative as the system size increases. For sequence S4, for which the aggregates are helix bundles, the height of the main peak increases with M but the position of the peak varies non-monotonically (Fig. 5.3). Note that the aggregation transition for sequence S4 is always found at a slightly lower temperature than the folding transition of an individual chain. This is in contrast

Figure 5.3: Same as Fig. 5.2 but for sequence S4 systems at 1 mM concentration. For clarity, the system sizes shown are fewer than for sequence S2.

Figure 5.2: Temperature dependence of the specific heat C per molecule for sequence S2 systems with the number of chains M equal to 1, 2, 3, 4, 5, 6, 8 and 10, as indicated. The position of a putative physiological temperature, T*, is indicated.

with sequence S2, whose aggregation transition temperature is always higher than the folding temperature of a single chain.
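This summary does not state how the specific heat was evaluated. A common route in Monte Carlo simulations is the energy fluctuation formula C = (⟨E²⟩ − ⟨E⟩²)/(kB T²), and the sketch below assumes that formula. The function names and the sample numbers are hypothetical.

```python
def specific_heat_per_molecule(energies, temperature, n_chains, k_b=1.0):
    """Specific heat per chain from equilibrium energy samples.

    Assumes the fluctuation formula C = var(E) / (k_b * T^2), with the
    energies sampled at a single temperature T (reduced units, k_b = 1).
    """
    n = len(energies)
    mean_e = sum(energies) / n
    var_e = sum((e - mean_e) ** 2 for e in energies) / n
    return var_e / (k_b * temperature ** 2) / n_chains

def peak(temps, heat_capacities):
    """Return (T_peak, C_peak) for a sampled specific heat curve."""
    i_max = max(range(len(temps)), key=lambda i: heat_capacities[i])
    return temps[i_max], heat_capacities[i_max]

# Hypothetical energy samples for a system of 2 chains at T = 0.2.
samples = [-3.0, -2.0, -1.0]
print(specific_heat_per_molecule(samples, temperature=0.2, n_chains=2))
```

Dividing by the number of chains gives the per-molecule quantity plotted in Fig. 5.2 and Fig. 5.3, so curves for different M can be compared directly.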

In Fig. 5.4, the results for the maximum specific heat per molecule, Cpeak/M, and the temperature of the peak, Tpeak, are combined for all sequences considered and for several values of M. The variation of both Cpeak/M and Tpeak among sequences increases with M. Note that for M = 10, the highest specific heat maxima correspond to sequences S2 and S11, whose aggregates are fibril-like (see Fig. 5.1). For sequences S2 and S11, Cpeak/M is not only the highest among all sequences but also increases with M much faster than for the other sequences. Our results indicate that the propensity to form fibril-like aggregates is associated with the cooperativity of the aggregation transition.

The wide variation in the transition temperatures Tpeak among sequences suggests another interesting aspect of aggregation. Suppose that we consider the systems at the physiological temperature, T*. In our model, a rough estimate of T* could be 0.2 ε/kB, which corresponds to a local hydrogen bond energy of 5 kBT*. For M = 1, one finds that all sequences but S10 have Tpeak < T*, suggesting that the peptides are substantially unstructured at T* as single chains. For M = 6 and M = 10, only three sequences, S3, S4 and S5, have Tpeak < T*, while the others have Tpeak > T*. Thus, sequences S3, S4 and S5 do not aggregate at T* while the other sequences do. This result indicates that the variation of aggregation transition temperatures among sequences is also a reason why protein sequences behave differently towards aggregation at the physiological temperature. Some sequences do not aggregate because aggregation is thermodynamically unfavorable at this temperature.

Figure 5.5: Energy as a function of Monte Carlo steps in a trajectory at T = 0.2 ε/kB for the sequence S2 system with M = 4 at 1 mM concentration. The conformation shown is a metastable state with a 3-peptide β-sheet in contact with a disordered helix formed by the 4th peptide.

Figure 5.4: Dependence of the maximum of the specific heat Cpeak per molecule (a) and its temperature Tpeak (b) on the sequence for systems of M = 10 (solid), M = 6 (dashed) and M = 1 (dotted) peptides at 1 mM concentration. The horizontal line in (b) indicates a putative physiological temperature T*.

Note that the ability to form fibril-like aggregates is not necessarily associated with a high aggregation transition temperature. In fact, Fig. 5.4b shows that sequences S2 and S11 have only a medium value of Tpeak among all sequences, for both M = 6 and M = 10. Some sequences with a higher Tpeak, such as S8, S9 and S10, form disordered aggregates.

Fig. 5.2 shows that for sequence S2, systems of M ≤ 4 have the specific heat peak at a temperature lower than T* = 0.2 ε/kB, which means that these systems do not aggregate at T*. Only for M > 4 is the specific heat peak temperature higher than T*, indicating that the fibril-like aggregates formed by this sequence are stable at T*. Thus, a sufficient number of peptides is needed for aggregation to happen at a given temperature. We also find that the lower peak in the specific heat of the system of M = 4 (Fig. 5.2) corresponds to a transition from metastable aggregates at intermediate temperature to the ground state at low temperature.

Fig. 5.5 shows the trajectory of an equilibrium simulation at T = 0.2 ε/kB for sequence S2 with M = 4. The time dependence of the system's energy in this trajectory indicates that the peptides do not aggregate most of the time, so that the energy is relatively high, but for some short periods they can spontaneously form a metastable aggregate of much lower energy. This metastable aggregate has a three-stranded β-sheet (Fig. 5.5, inset) and could act as a template for fibril growth in systems of more peptides.

5.3 Kinetics of fibril formation

First, we consider a system of M = 10 peptides of sequence S2 at concentration c = 1 mM under equilibrium conditions. Fig. 5.6 shows the dependence of the total free energy of the system on the size of the largest aggregate, m, at three temperatures slightly below Tpeak, including T = T* = 0.2 ε/kB. For all these temperatures the free energy has a maximum at m = 3, suggesting that m = 3 could be the size of the critical nucleus for fibril formation. The free energy barrier for aggregation in Fig. 5.6 is found to increase with T and is about 1 kBT to 4 kBT. This barrier is not large, which is consistent with the fact that the sequence considered is highly aggregation-prone. For m > 3, Fig. 5.6 shows that the free energy decreases almost linearly with m, which is consistent with the fact that the growth of the aggregate in size is essentially one-dimensional.
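A free energy profile of this kind is often estimated from the equilibrium histogram of the order parameter, F(m) = −kB T ln P(m). The sketch below assumes that estimator, which this summary does not confirm as the method used. The function names are hypothetical and the histogram counts are synthetic, chosen only so that the profile has a barrier.

```python
import math
from collections import Counter

def free_energy_profile(largest_sizes, temperature, k_b=1.0):
    """F(m) = -k_b*T*ln P(m) from samples of the largest aggregate size m.

    The profile is shifted so that its minimum is zero.
    """
    counts = Counter(largest_sizes)
    total = len(largest_sizes)
    profile = {m: -k_b * temperature * math.log(c / total) for m, c in counts.items()}
    f_min = min(profile.values())
    return {m: profile[m] - f_min for m in sorted(profile)}

def barrier_position(profile):
    """Size m of the highest interior local maximum of F(m), or None."""
    sizes = sorted(profile)
    interior = [
        m for prev, m, nxt in zip(sizes, sizes[1:], sizes[2:])
        if profile[m] > profile[prev] and profile[m] > profile[nxt]
    ]
    return max(interior, key=profile.get) if interior else None

# Synthetic histogram of the largest aggregate size (not simulation data).
synthetic_counts = {1: 200, 2: 80, 3: 20, 4: 40, 5: 60,
                    6: 90, 7: 130, 8: 190, 9: 270, 10: 400}
samples = [m for m, c in synthetic_counts.items() for _ in range(c)]

T = 0.2
profile = free_energy_profile(samples, T)
m_star = barrier_position(profile)
print("barrier at m =", m_star)
print("barrier height relative to m = 1, in units of kB*T:",
      round((profile[m_star] - profile[1]) / T, 2))
```

With this estimator the least frequently visited size appears as the top of the barrier, which is how a maximum such as the one at m = 3 in Fig. 5.6 would show up in the sampled data.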

We then considered a larger system of M = 20 peptides and studied its time evolution from random configurations of dispersed monomers. Up to 100 independent trajectories are carried out to determine the statistics. We first consider the system at concentration c = 1 mM and T = 0.2 ε/kB. Fig. 5.7 (a and b) shows three typical trajectories with the total energy E and the size of the largest aggregate m as functions of time. Interestingly, these trajectories show clear evidence of an initial lag time, during which m fluctuates but remains small (m ≤ 3), before a rapid and almost monotonic growth (Fig. 5.7b). They also show that nucleation is complete at m = 3. A peptide configuration at a nucleation event is shown in Fig. 5.7d, indicating that a possible nucleus is a three-stranded β-sheet formed by three peptides (Fig. 5.7e). Fig. 5.7c shows that the system can form multiple aggregates of various sizes. The distribution of the aggregate size obtained after a sufficiently long time is bimodal, reflecting the fact that the system size is finite and clusters of fewer than 4 peptides are unstable. Thus, one observes either one large cluster with size close to the system size or several smaller clusters. The largest aggregates of m = 20 peptides have the form of an elongated double β-sheet that strongly resembles a cross-β structure (Fig. 5.7f).
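To give a sense of scale for the systems described above, a peptide count and a molar concentration can be converted into a simulation volume. The sketch below assumes a cubic box, which this summary does not specify; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23        # particles per mole (exact SI value)
ANGSTROM3_PER_LITRE = 1.0e27    # 1 L = 1e-3 m^3 = 1e27 cubic angstroms

def cubic_box_side(n_peptides, conc_millimolar):
    """Side length (in angstroms) of a cubic box with the given concentration.

    Assumes all n_peptides share one cubic volume V = n / (N_A * c).
    """
    volume_litre = n_peptides / (AVOGADRO * conc_millimolar * 1.0e-3)
    return (volume_litre * ANGSTROM3_PER_LITRE) ** (1.0 / 3.0)

for n, c in [(10, 1.0), (20, 1.0), (20, 0.5), (20, 0.25)]:
    print(f"M = {n:2d}, c = {c:4.2f} mM -> box side = {cubic_box_side(n, c):.1f} angstrom")
```

Under this assumption, halving the concentration at fixed M enlarges the box side by a factor of 2^(1/3), so the mean distance between dispersed monomers grows only slowly with dilution.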

It is shown in Fig. 5.8 (a and b) that for T = 0.2 ε/kB, the time dependence of the average number of peptides in β-sheet conformation, ⟨nβ⟩, can be fitted well by the exponential relaxation function M(1 − e^(−t/t0)), where t0 is the characteristic time of aggregation. This time dependence also depends strongly on the concentration c, with t0 increasing more than threefold upon

Figure 5.6: Dependence of the total free energy, F, on the size of the largest aggregate, m, for the sequence S2 system of M = 10 peptides at 1 mM concentration and at three different temperatures, T = 0.2, 0.21 and 0.22 ε/kB, as indicated. A barrier with the maximum located at m = 3 is indicated.

Figure 5.7: Kinetics of fibril formation for sequence S2 with M = 20 peptides at concentration 1 mM and temperature T = 0.2 ε/kB. (a) Dependence of the energy, E, on time, t, measured in MC steps for three different trajectories. (b) Time dependence of the maximum aggregate size m for the same three trajectories as shown in (a). Arrows indicate the nucleation event for each trajectory. (c) Histogram of the aggregate size, given by the number of peptides, obtained at a large time of t = 1.5 × 10^9 MC steps. (d) Snapshot of a peptide configuration at a nucleation event. (e) Conformation of the nucleated cluster formed by three peptides, taken from the configuration shown in (d). (f) Conformation of an elongated fibril-like structure formed by 20 peptides.

changing c from 1 mM to 0.5 mM. There seems to be no evidence of a lag phase at T = 0.2 ε/kB, as ⟨nβ⟩ increases linearly with t for small t (Fig. 5.8b). This lack of evidence, however, may be due to the fact that the deviation from the exponential growth is too small to be observed. Indeed, we find that if the temperature is increased slightly to T = 0.21 ε/kB, the lag phase can be observed. Fig. 5.8c shows that the growth of ⟨nβ⟩ in time deviates significantly from the exponential relaxation function at small times. This growth, when plotted on a log-log scale (Fig. 5.8d), shows that at small times ⟨nβ⟩ ∝ t^α with α ≈ 1.25. The exponent α > 1 means that ⟨nβ⟩ is a convex function of time at small times, which indicates the existence of a lag phase. The stronger evidence of the lag phase at T = 0.21 ε/kB compared to that at T = 0.2 ε/kB is consistent with the higher free energy barrier for nucleation at the former temperature, as shown in Fig. 5.6.
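The two analyses described in this paragraph, a fit to M(1 − e^(−t/t0)) and a power-law exponent from the log-log slope at small times, can be sketched as follows. The fitting procedure used in the thesis is not stated; the linearised least-squares fit below is one simple choice, the function names are hypothetical, and the data are synthetic curves generated from the quoted functional forms rather than simulation results.

```python
import math

def fit_relaxation_time(times, n_beta, n_total):
    """Estimate t0 in n_beta(t) = M*(1 - exp(-t/t0)).

    Uses the linearised form ln(1 - n_beta/M) = -t/t0 and a least-squares
    line through the origin. Points with n_beta <= 0 or >= M are skipped.
    """
    num = 0.0
    den = 0.0
    for t, n in zip(times, n_beta):
        if 0.0 < n < n_total:
            num += t * math.log(1.0 - n / n_total)
            den += t * t
    return -den / num

def loglog_exponent(times, n_beta):
    """Slope of ln(n_beta) versus ln(t) by ordinary least squares."""
    xs = [math.log(t) for t in times]
    ys = [math.log(n) for n in n_beta]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic exponential relaxation with M = 20 and t0 = 570e6 MC steps.
M_TOTAL, T0_TRUE = 20, 570.0e6
times = [1.0e8 * k for k in range(1, 16)]
n_exp = [M_TOTAL * (1.0 - math.exp(-t / T0_TRUE)) for t in times]
t0_fit = fit_relaxation_time(times, n_exp, M_TOTAL)

# Synthetic early-time power law with exponent 1.25.
early_times = [1.0e6, 2.0e6, 5.0e6, 1.0e7]
n_early = [1.0e-9 * t ** 1.25 for t in early_times]
alpha_fit = loglog_exponent(early_times, n_early)

print(f"t0 = {t0_fit:.3e} MC steps, alpha = {alpha_fit:.3f}")
```

On noisy data the linearised fit gives more weight to late times, where 1 − nβ/M is small, so a nonlinear least-squares fit may be preferable in practice.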

5.4 Aggregation of mixed sequences

Finally, we study the aggregation of a binary mixture of two sequences, S2 and S4. It was shown that in homogeneous systems the first sequence is strongly fibril-prone, whereas the second one forms only α-helices. Furthermore, sequence S4 has an aggregation transition temperature lower than T*, so its aggregate is not stable at T*. Strikingly, our simulations at T* show that in a binary system of 10 chains of each sequence, after a sufficiently long time, a fraction of the S4 chains aggregate and convert into β-sheet conformation on an existing aggregate formed by the S2 chains (see Fig. 5.9). Though this fraction is only about 10% on average, this observation shows that the template-based mechanism for fibril formation can be effective for polypeptides of very different natures. Here, the fibril-like aggregate formed by the aggregation-prone peptides acts as the template for the aggregation of non-aggregation-prone peptides. Note that due to the mismatch of the different hydrophobic patterns in the two sequences, the aggregates formed by the two sequences are more disordered than the homogeneous ones (Fig. 5.9b). It is also shown in Fig. 5.9c that the growth of this mixed aggregate at the given temperature remains exponential, but the characteristic time for aggregation is larger than in the corresponding homogeneous system of sequence S2.

Figure 5.9: (a) Snapshot of a conformation obtained in a simulation of the binary mixture of 10 chains of sequence S2 and 10 chains of sequence S4 at concentration c = 1 mM and temperature T = 0.2 ε/kB. H residues are shown in dark green. P residues are in light green and pink for the S2 and S4 chains, respectively. (b) Zoom-in side and top views of the aggregate shown in (a). Note that six S4 chains are present in the aggregate, and five of them are in the β-sheet configuration. (c) Time dependence of the average number of peptides in β-sheet conformation, ⟨nβ⟩, obtained from 100 independent simulations, for both sequences together (squares) and for sequence S4 only (circles). A fit to the exponential relaxation function given in the caption of Fig. 5.8, with t0 = 832 × 10^6 (solid line), is shown for the case of both sequences.

Figure 5.8: Time dependence of the average number of peptides in β-sheet conformation, ⟨nβ⟩, in the aggregation of sequence S2 with M = 20. The system is considered at temperatures T = 0.2 ε/kB (a, b) and 0.21 ε/kB (c, d) and at several concentrations, c = 1 mM (squares), 0.5 mM (circles) and 0.25 mM (triangles), as indicated. The average of nβ for each concentration is taken over 100 independent trajectories. The right panels (b and d) plot the same data as the left panels (a and c), respectively, but on a log-log scale. Data points are fitted to the exponential relaxation function M(1 − e^(−t/t0)) (solid line for c = 1 mM), with t0 = 570 × 10^6 for c = 1 mM and t0 = 1850 × 10^6 for c = 0.5 mM in (a), and t0 = 10^9 for c = 1 mM in (c). The log-log plots show that the growth of nβ at small times follows a power law, ⟨nβ⟩ ∝ t^α, with α = 1 in (b) and α = 1.25 in (d) for both concentrations of 1 mM and 0.5 mM.

5.5 Discussion

A previous study of the tube model has shown that the hydrophobic-polar sequence can select a protein's secondary and tertiary structures. In particular, the HPPH and HPPPH patterns have been identified as strong α-formers, whereas the HPH pattern is a β-former. Strikingly, exactly the same binary patterns have been used in experiments that allowed the successful design of de novo proteins. In the present study, we find that these simple selection rules still hold for the peptides in aggregates, even though the model has been changed by considering the orientations of side chains. The present study shows that the binary pattern also determines the degree of order of the aggregate. In particular, there should be some compatibility between the alignment of hydrophobic side chains and the overall symmetry of the aggregate. Interestingly, the HPH pattern appears to be both a strong β-former and a highly aggregation-prone sequence. Our finding is in full agreement with the experimental design of amyloids, which shows that segments with an alternating hydrophobic and polar pattern (such as PHPHPHP) can direct protein sequences to form amyloid-like fibrils.

The role of side chains in amyloid fibril formation has been stressed in early all-atom simulations of short peptides. In the tube model without consideration of side chain orientations, amyloid aggregates may also be obtained. However, these aggregates are sometimes disorganized in the arrangement of the β-sheets. Here, we show that the correlated orientations of hydrophobic side chains are important for both the ordered packing of β-strands within a β-sheet and the stacking of β-sheets in the fibril. In particular, the alternating hydrophobic-polar pattern leads to β-sheets with hydrophobic side chains oriented on one side of the β-sheet. This one-sided orientation stabilizes the two-layered β-sheet aggregate, which is the system's ground state and can grow into a long fibril, as shown for the case of sequence S2.

Our thermodynamic calculations show that the formation of fibril-like aggregates is much more cooperative than that of non-fibril-like aggregates. This cooperativity is indicated by both the height of the specific heat peak and the increase of the maximum specific heat per molecule with the system size. The high cooperativity of fibril formation can be understood as a consequence of the highly ordered nature of fibril structures and the dominating contribution of intermolecular interactions in these structures. We also find that thermodynamic stability is not a distinguishing feature of fibril-like aggregates. In particular, sequences associated with a very high aggregation transition temperature do not necessarily form fibril-like aggregates. An increased overall hydrophobicity of the sequence is shown to enhance the stability of the aggregates without impact on their fibril characteristics. Our work shows that the HP pattern, rather than the overall hydrophobicity of the sequence, is a determinant of both amyloid aggregation and its thermodynamic stability.

Sequence S2 in our study shows one-step nucleation. The impact of the HP sequence on nucleation is also reflected in the small nucleation barrier and the rapid nucleation, with an almost invisible lag phase, observed for this sequence. For this fibril-prone sequence, it is found that the non-equilibrium behavior of a larger system is consistent with the equilibrium properties of smaller systems at the same peptide concentration. In particular, the frequent formation and dissolution of the aggregates before nucleation and the growth of the aggregates after nucleation are in accord with their thermodynamic stabilities as isolated systems.

Interestingly, the small size of the critical nucleus found in our study agrees with those obtained in homopolymer studies as well as in lattice heteropolymer and all-atom simulations of short peptides.

Our simulation result for the peptide binary mixture is fully consistent with the experiment of Ridgley and shows that a cross-β-sheet can be heterogeneous in its peptide composition. It is possible that naturally occurring amyloid fibrils possess this heterogeneity due to the templated self-assembly process.

Conclusion

Our studies in this thesis have led to the following conclusions about the role of the hydrophobic and polar sequence in the folding mechanisms of proteins and the aggregation of peptides:

1. The hydrophobic and polar (HP) sequence has a strong influence on the folding mechanism of proteins. The folding process of a protein with a defined HP sequence is characterized by two separate transitions: the collapse transition from a swollen state to a compact but disordered state, occurring at a higher temperature, and the folding transition from the compact disordered state to the folded state, occurring at a lower temperature. The disordered state is stabilized primarily by the hydrophobic interaction. The strength of the hydrophobic interaction has a large effect on the temperatures of the collapse transition and of the folding transition, but within a fairly wide range it does not change the structure of the folded state of the protein.

2. The folding cooperativity and thermal stability of proteins in the tube model with an HP sequence are considerably lower than in the tube Go model. This shows that, in a free energy landscape presculpted by the geometry and symmetry of the protein, the conflicts between interactions (leading to frustration) remain considerable for a specific amino acid sequence. These conflicts are almost completely eliminated in the tube Go model, a model with interactions optimized for the folded structure of the protein.

3. The aggregate structure of short peptides and the thermodynamic properties of the aggregation transition strongly depend on the HP sequence. Among the HP sequences, there exist patterns with high aggregation propensities that form structures rich in β-sheet or α-helix. In addition to the HP sequence, the orientation of hydrophobic side chains also has a great influence on the order and the symmetry of aggregate structures. Our simulations show that for peptides whose sequences contain the HPH pattern (two H amino acids separated by one P), the aggregate is a two-layer β-sheet structure similar to that found in amyloid fibrils. The highest peak in the specific heat belongs to the sequences that form the double-layer β-sheet structure. This result suggests that the propensity to form amyloid may be linked to the cooperativity of the aggregation transition. The formation of amyloid fibrils follows the nucleation and growth mechanism with the existence of a lag phase.

4. The template structure plays an important role in fibril formation. Amyloid fibrils can be formed by a mixture of peptides with different amino acid sequences. Our simulations also show another feature of amyloid formation that is considerably non-specific to a sequence, namely the fibril-induced aggregation of a non-aggregation-prone sequence. This templating property certainly complicates the problem of amyloid formation, as it suggests that the cross-β structure can be heterogeneous in its sequence or peptide composition.

List of published works

[1] Nguyen Ba Hung, Trinh Xuan Hoang, “Folding of proteins in presculpted free energy landscape”, Communications in Physics 23, 313–320 (2013).

[2] Nguyen Ba Hung, Trinh Xuan Hoang, “Aggregation of peptides in the tube model with correlated sidechain orientations”, Journal of Physics: Conference Series 627, 012028 (2015).

[3] Nguyen Ba Hung, Duy Manh Le, Trinh X. Hoang, “Sequence dependent aggregation of peptides and fibril formation”, Journal of Chemical Physics 147, 105102 (2017).