The free energy landscape for the folding of a small acid-soluble protein from molecular dynamics simulations
Abstract
Intrinsically disordered proteins (IDPs) have been studied widely because of their abundance
in biological systems and, most notably, their important roles in cellular functions. Despite their
prevalence in nature, much remains unknown about these proteins. Because of their dynamic
nature, they have been difficult to characterize using traditional methods. Of particular interest
is the α/β-type small acid-soluble protein, an IDP that has been identified as the main factor in
bacterial spore survival under extreme conditions. Determining the mechanism by which the
α/β-type small acid-soluble protein binds to spore DNA is therefore essential for understanding
its role in spore resistance.
As a first step toward investigating the protein-DNA binding event, the C chain of the protein
was isolated and used to characterize the energetics of its folding and unfolding. The simulations
were run with the molecular dynamics package Gromacs 2018.6 and the Plumed 2.4.4 plugin.
Together, these enabled a new form of accelerated conformational sampling known as
Well-Tempered Bias-Exchange Metadynamics (WT-BEMETA). An unbiased simulation (UB) of the
C chain was run for a total of 400 ns.
Two biased simulations (F1, F2) were started from the initial, folded structure obtained from
X-ray crystallography, and two more (UF1, UF2) were started from unfolded structures; a total of
100 ns was collected for each of the four. These simulations were biased along several collective
variables: the number of hydrogen bonds (NH), monitored in two separate regions of the helices,
and the distance between the two helices (DC). For each CV, a replica of the system was produced
in order to apply the time-dependent bias potential. An additional, unbiased replica of the system
was also incorporated into the exchange process. This combination yielded a total of 4 replicas
for each of the 4 biased
systems using WT-BEMETA. Analysis with the VMD plug-in Metagui 3 allowed for the
identification of prominent folded and unfolded structures of F1, based on their sets of collective
variables (CVs) and corresponding energies. Two post-processing CVs, the root-mean-square
deviation (RMSD) from the reference folded structure and the number of native contacts (Q),
were also used to extract more information about the structural states of the systems and to
confirm the results. A comparison of the unbiased simulation (UB) with the first folded biased
simulation (F1) showed that incorporating a bias significantly improved the sampling efficiency.
The UB simulation was not capable of accessing unfolded structural states, even with its longer
simulation time, as confirmed by the small ranges of values obtained for the post-processing CVs.
This outcome illustrated the need for advanced sampling techniques. For the biased systems, F1 and F2
sampled both folded and unfolded structural states, based on the ranges of CV values obtained
for the simulations. UF1 and UF2, by contrast, sampled only unfolded structures. These results
were confirmed by the free energy profiles and by the post-processing CVs. More sampling is
therefore needed for the 4 WT-BEMETA systems to sample the same phase space and to improve
reliability. The UF1 and UF2 systems were not used for further analysis, since they did not sample
both folded and unfolded structures. Because F1 and F2 displayed similar free energy profiles,
only F1 was used to determine the microstates and the free energy landscape, for simplicity.
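
The two ingredients of WT-BEMETA described above can be illustrated with a minimal Python sketch. It is not the Plumed implementation or the input used in this work: the function names are hypothetical, and any initial Gaussian height, bias factor, or temperature passed to them is an assumption chosen only for illustration. The first function rescales the Gaussian height as in well-tempered metadynamics; the second gives the Metropolis probability of swapping configurations between two replicas that differ only in their bias potentials.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def well_tempered_height(w0, bias_at_cv, temperature, bias_factor):
    """Height of the next Gaussian in well-tempered metadynamics.

    w = w0 * exp(-V(s) / (kB * dT)), with dT = (bias_factor - 1) * T,
    where V(s) is the bias already deposited at the current CV value.
    """
    if bias_factor <= 1.0:
        raise ValueError("bias_factor must be greater than 1")
    delta_t = (bias_factor - 1.0) * temperature
    return w0 * math.exp(-bias_at_cv / (KB * delta_t))


def exchange_probability(v_a_xa, v_a_xb, v_b_xa, v_b_xb, temperature):
    """Metropolis probability of swapping configurations of replicas a and b.

    v_a_xb is the bias of replica a evaluated on the configuration of
    replica b, and so on. The replicas share temperature and force field,
    so only the bias terms enter. An unbiased replica has zero bias.
    """
    delta = (v_a_xb + v_b_xa) - (v_a_xa + v_b_xb)
    if delta <= 0.0:
        return 1.0  # swap lowers the total bias energy: always accepted
    return math.exp(-delta / (KB * temperature))
```

For example, `well_tempered_height(1.2, 5.0, 300.0, 10.0)` returns a height below the hypothetical initial value of 1.2 kJ mol⁻¹, reflecting how deposition slows in regions that have already been filled. For a swap with the unbiased replica, the two bias terms of that replica are simply set to zero.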
Ultimately, the use of WT-BEMETA allowed for the identification of prominent
microstates of the protein as it unfolded. The free energy landscape was determined for these
structures: folded structures were associated with lower energies (1-3.5 kJ mol⁻¹) and
unfolded structures with higher energies (3.5-12 kJ mol⁻¹). Since the IDP exists in an unfolded
structural state in nature, unfolded structures should correspond to lower energies. The results
therefore captured only transitioning unfolded structures of the protein, rather than true
unfolded structures. The higher energies obtained for these transitioning structures were
attributed to the initial breaking of hydrogen bonds as the protein unfolded. To obtain true
unfolded structures, and to improve the reliability of the results, more sampling should be
conducted in the future. With more simulation time, these results can be extended to determine
the binding mechanism of this IDP to spore DNA and to understand the effects of an additional
protein chain. Ultimately, this research will aid in identifying novel inhibitors that may prevent
this binding process and in avoiding the consequences of malfunctioning IDPs.
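
To give a sense of scale for the free energies quoted above, the short sketch below converts a free energy difference into a Boltzmann population relative to a reference state. It assumes that the reported values are measured relative to the lowest-free-energy state and that the temperature is 300 K; neither is stated in this abstract, and the function name is hypothetical.

```python
import math

KB = 0.0083144626    # Boltzmann constant in kJ mol^-1 K^-1
TEMPERATURE = 300.0  # K; assumed here, not stated in the abstract


def relative_population(delta_f, temperature=TEMPERATURE):
    """Population of a state lying delta_f (kJ/mol) above a reference state,
    relative to the population of that reference state."""
    return math.exp(-delta_f / (KB * temperature))


# Bounds of the energy ranges reported for folded and unfolded structures.
for delta_f in (1.0, 3.5, 12.0):
    print(f"{delta_f:5.1f} kJ/mol -> {relative_population(delta_f):.4f}")
```

Under these assumptions, states at the upper end of the reported range are populated far less than states near the minimum, which is consistent with the interpretation that the high-energy unfolded structures are transient.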
Rights
This material is made available for use in research, teaching, and private study, pursuant to U.S. Copyright law. The user assumes full responsibility for any use of the materials, including but not limited to, infringement of copyright and publication rights of reproduced materials. Any materials used should be fully credited with its source. All rights are reserved and retained regardless of current or future development or laws that may apply to fair use standards. Permission for publication of this material, in part or in full, must be secured with the author and/or publisher.
McGregor, Lauren