The virus wreaking havoc on our lives is an efficient infection machine. Comprising only 29 proteins (compared to our 400,000), with a genome 1/200,000 the size of ours, SARS-CoV-2 is expertly evolved to trick our cells into contributing their machinery to its propagation.
In the last few months, scientists have learned a great deal about the mechanics of this mindless enemy. But what we've learned still pales in comparison to what we don't know.
There are a number of ways scientists uncover the workings of a virus. Only by using these methods in tandem can we find and exploit the coronavirus's weak spots, says Ahmet Yildiz, associate professor of Physics and Molecular Cell Biology at the University of California, Berkeley.
Yildiz and his collaborator Mert Gur at Istanbul Technical University are combining supercomputer-powered molecular dynamics simulations with single-molecule experiments to uncover the secrets of the virus. In particular, they are studying its spike (S) protein, the part of the virus that binds to human cells and begins the process of inserting viral RNA into the cell.
"Many groups are attacking different stages of this process," Gur said. "Our initial goal is to use molecular dynamics simulations to identify the processes that happen when the virus binds to the host cell."
There are three critical phases that allow the spike protein to break into the cell and begin replicating, Yildiz says.
First, the spike protein needs to transform from a closed configuration to an open one. Second, the spike protein binds to its receptor on the outside of our cells. This binding triggers a conformational change within the spike protein and allows another human protein to cleave the spike. Finally, the newly exposed surface of the spike interacts with the host cell membrane and enables the viral RNA to enter and hijack the cell.
In early February, electron microscope images revealed the structure of the spike protein. But the snapshots only showed the main configurations that the protein takes, not the transitional, in-between steps.
"We only see snapshots of stable conformations. Because we don't know the timing of events that allow the protein to go from one stable conformation to the next one, we don't yet know those intermediary conformations."
Ahmet Yildiz, Associate Professor of Physics and Molecular Cell Biology, University of California, Berkeley
That's where computer modeling comes in. The microscope images provide a useful starting point to create models of every atom in the protein and its environment (water, ions, and the receptors of the cell). From there, Yildiz and Gur set the protein in motion and watched to see what happened.
"We showed that the S protein visits an intermediate state before it can dock to the receptor protein on the host cell membrane," Gur said. "This intermediate state can be useful for drug targeting to prevent the S protein from initiating viral infection."
Whereas many other groups around the world are probing the binding pocket of the virus, hoping to find a drug that can block the virus from latching onto human cells, Yildiz and Gur are taking a more nuanced approach.
"The spike protein strongly binds to its receptor with a complex interaction network," Yildiz explained. "We showed that if you just break one of those interactions, you still won't be able to stop the binding. That's why some of the basic drug development studies may not produce the desired outcomes."
But if it's possible to prevent the spike protein from going from the closed to the open state -- or from a third, in-between state that we're not yet aware of to the open state -- that might lend itself to a treatment.
Find, and break, the important bonds
Yildiz and Gur's second use of computer simulations identified not just new states but also the specific amino acids that stabilize each state.
"If we can determine the important linkages at the single amino acid level -- which interactions stabilize and are critical for these conformations -- it may be possible to target those states with small molecules," Yildiz said.
Simulating this behavior at the level of the atom or individual amino acid is incredibly computationally intensive. Yildiz and Gur were granted time on the Stampede2 supercomputer at the Texas Advanced Computing Center (TACC) -- the second fastest supercomputer at a U.S. university and the 19th fastest overall -- through the COVID-19 HPC Consortium. Simulating one microsecond of the virus and its interactions with human cells -- roughly one million atoms in total -- takes weeks on a supercomputer...and would take years without one.
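To get a feel for why one microsecond is so expensive, the arithmetic can be sketched in a few lines of Python. The 2-femtosecond time step is a typical choice for all-atom simulations, and both throughput figures are illustrative assumptions rather than numbers reported by the researchers.

```python
# Back-of-the-envelope cost of a 1-microsecond all-atom simulation.
# The time step and both throughput values are assumptions for illustration,
# not figures from the study.

FEMTOSECONDS_PER_MICROSECOND = 1_000_000_000  # 1 us = 1e9 fs


def integration_steps(sim_time_us: float, timestep_fs: float = 2.0) -> int:
    """Number of integration steps needed to cover sim_time_us."""
    return round(sim_time_us * FEMTOSECONDS_PER_MICROSECOND / timestep_fs)


def wall_clock_days(sim_time_us: float, ns_per_day: float) -> float:
    """Wall-clock days needed at a throughput of ns_per_day simulated nanoseconds."""
    return sim_time_us * 1000.0 / ns_per_day


steps = integration_steps(1.0)                  # 500,000,000 steps at 2 fs
supercomputer_days = wall_clock_days(1.0, 50.0)  # assumed 50 ns/day
workstation_days = wall_clock_days(1.0, 1.0)     # assumed 1 ns/day

print(f"{steps:,} steps")
print(f"{supercomputer_days:.0f} days at 50 ns/day")
print(f"{workstation_days:.0f} days at 1 ns/day")
```

Under these assumed rates the same microsecond takes about three weeks on the faster machine and close to three years on the slower one, which is the order of magnitude the article describes.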
"It's a computationally demanding process," Yildiz said. "But the predictive power of this approach is very powerful."
Yildiz and Gur's team, along with approximately 40 other research groups studying COVID-19, has been given priority access to TACC systems. "We're not limited by the speed at which the simulations happen, so there's a real-time race between our ability to run simulations and analyze the data."
With time of the essence, Gur and his collaborators have churned through calculations, re-enacting the atomic peregrinations of the spike protein as it approaches, binds to, and interacts with angiotensin-converting enzyme 2 (ACE2) receptors -- proteins that line the surface of many cell types.
Their initial findings, which used all-atom molecular dynamics (MD) simulations to propose the existence of an intermediate semi-open state of the S protein compatible with binding between its receptor-binding domain (RBD) and ACE2, were published in the Journal of Chemical Physics.
Furthermore, by performing all-atom MD simulations, they identified an extended network of salt bridges, hydrophobic and electrostatic interactions, and hydrogen bonding between the receptor-binding domain of the spike protein and ACE2. These results were released on bioRxiv.
Mutating the residues on the receptor-binding domain was not sufficient to destabilize binding but reduced the average work to unbind the spike protein from ACE2. They propose that blocking this site via a neutralizing antibody or nanobody could prove an effective strategy to inhibit spike protein-ACE2 interactions.
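The article does not say how the "average work to unbind" was computed. One common approach in pulling simulations is to drag the two partners apart, record force against displacement, integrate each curve to get the work, and average over repeated pulls. The Python sketch below does that with the trapezoidal rule. The function names and every force value are hypothetical, invented only to exercise the code; they are not data from the study.

```python
# Sketch: average unbinding work from force-displacement curves.
# All names and numbers here are hypothetical; none come from the study.

def pulling_work(displacements_nm, forces_pn):
    """Trapezoidal integral of force over displacement, in pN*nm."""
    if len(displacements_nm) != len(forces_pn) or len(forces_pn) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    work = 0.0
    for i in range(1, len(forces_pn)):
        dx = displacements_nm[i] - displacements_nm[i - 1]
        work += 0.5 * (forces_pn[i] + forces_pn[i - 1]) * dx
    return work


def average_work(pulls):
    """Mean work over a list of (displacements, forces) pulls."""
    works = [pulling_work(x, f) for x, f in pulls]
    return sum(works) / len(works)


# Made-up curves: force rises to a peak, then drops as the complex separates.
x = [0.0, 1.0, 2.0]
wild_type_pulls = [(x, [0.0, 100.0, 0.0]), (x, [0.0, 120.0, 0.0])]
mutant_pulls = [(x, [0.0, 60.0, 0.0]), (x, [0.0, 80.0, 0.0])]

wild_type_work = average_work(wild_type_pulls)
mutant_work = average_work(mutant_pulls)
print(wild_type_work, mutant_work)
```

In this toy comparison the mutant still resists being pulled apart, but less work is needed to separate it, which mirrors the kind of result described above: weakened, not abolished, binding.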
In order to confirm that the computer-derived insights are accurate, Yildiz's team performed lab experiments using single-molecule fluorescence resonance energy transfer (or smFRET) -- a biophysical technique used to measure distances at the 1 to 10 nanometer scale in single molecules.
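The distance sensitivity of FRET follows from the standard Förster relation, in which transfer efficiency falls off with the sixth power of the distance between the two probes: E = 1 / (1 + (r / R0)^6). The short Python sketch below converts between efficiency and distance. The Förster radius R0 of 5 nm is an assumed, typical value for common dye pairs, not the one used in these experiments, and the function names are illustrative.

```python
# Sketch of the Forster relation between probe distance and FRET efficiency.
# R0 = 5 nm is an assumed, typical value; the dyes used in the study may differ.

def fret_efficiency(distance_nm: float, r0_nm: float = 5.0) -> float:
    """Energy-transfer efficiency for two probes separated by distance_nm."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def distance_from_efficiency(efficiency: float, r0_nm: float = 5.0) -> float:
    """Invert the Forster relation to recover the probe separation in nm."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


for r in (1.0, 5.0, 10.0):
    print(f"{r:4.1f} nm -> efficiency {fret_efficiency(r):.4f}")
```

With this assumed R0, efficiency is essentially 1 at 1 nm, exactly 0.5 at 5 nm, and under 2 percent at 10 nm, which is why the technique works as a ruler only over roughly that range.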
"The technique allows us to see the conformational changes of the protein by measuring the energy transfer between two light emitting probes," Yildiz said.
Though scientists still don't have a technique to see the atomic details of molecules in motion in real time, the combination of electron microscopy, single-molecule imaging, and computer simulations can provide researchers with a rich picture of the virus' behavior, Yildiz says.
"We can get atomic resolution snapshots of frozen molecules using electron microscopy. We can get atomic level simulations of the protein in motion using molecular dynamics in a short time scale. And using single-molecule techniques we can derive the dynamics that are missing from electron microscopy and the simulations," Yildiz concluded.
"Combining these methods gives us the full picture and lets us dissect the mechanism of a virus entering the host cell."
Gur, M. (2020) Conformational transition of SARS-CoV-2 spike glycoprotein between its closed and open states. Journal of Chemical Physics. doi.org/10.1063/5.0011141.