Hype versus reality: What you can’t do with DeepMind’s AlphaFold in drug discovery

Analysis DeepMind’s AlphaFold model has predicted the structure of nearly every protein known to science, though its ability to help scientists discover new drugs remains unproven.

Proteins are complex molecules created by organisms to carry out the biological functions necessary for life. Generally built as chains drawn from 20 different amino acids, they fold up in countless ways, with their final shape determining how they work and interact with other things.

Determining how a protein will fold is not straightforward. For example, let’s say you wanted to synthesize a protein or slightly alter its operation. You can’t adjust its amino acids or come up with a new string of them and know for sure how the result will fold and function. This is where computers come in.

Advances in AI algorithms and training have led to the development of software, such as AlphaFold, that can accurately predict the 3D shapes of proteins given their amino acid sequences.

AlphaFold is impressive, and has now predicted the structures of over 200 million proteins from their amino acid strings. Researchers hoped that building such a large database would allow scientists to develop treatments targeting specific proteins associated with diseases such as cancer or dementia. Coming up with such medicines may require you to know the physical structure of the protein, which is where programs like AlphaFold can be used.

An investigation led by academics at MIT in America, however, shows just how difficult the task is in practice. Essentially, the AI software is useful in one step of the process – structure prediction – but can’t help in other stages, such as modeling how drugs and proteins would physically interact.

“Breakthroughs such as AlphaFold are expanding the possibilities for in silico (computer simulation) drug discovery efforts, but these developments need to be coupled with additional advances in other aspects of modeling that are part of drug discovery efforts,” James Collins, lead author of the study published in Molecular Systems Biology and a bioengineering professor at MIT, said in a statement.

“Our study speaks to both the current abilities and the current limitations of computational platforms for drug discovery.”

Collins and his colleagues used AlphaFold to simulate interactions between bacterial proteins and antibacterial compounds, a task known as molecular docking. The goal was to rank the candidate compounds by how strongly they bind to the target protein. A molecule that binds strongly to a protein is more likely to be an effective drug, because it could be better at preventing the protein from carrying out a pathogenic function, such as driving tumor growth.

The team tested AlphaFold’s ability to model interactions between 296 essential proteins from E. coli bacteria and 218 antibacterial compounds, including antibiotics such as tetracyclines. AlphaFold did not prove very effective for molecular docking simulations.

“Utilizing these standard molecular docking simulations, we obtained an auROC value of roughly 0.5, which basically says you’re doing no better than if you were randomly guessing,” Collins said.
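To see why an auROC of about 0.5 amounts to guessing, it helps to look at what the metric measures: the probability that a randomly chosen true binder gets a better score than a randomly chosen non-binder. The sketch below is only an illustration, not the study's code. The `auroc` helper, the toy scores, and the convention that a higher score means stronger predicted binding are all assumptions made for the example.

```python
import random


def auroc(scores, labels):
    """Chance that a random true binder outscores a random non-binder.

    Assumes higher score = stronger predicted binding; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = compound really binds, 0 = it does not.
labels = [1, 1, 1, 0, 0, 0]
perfect_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every binder ranked first
inverted_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]  # every binder ranked last

# Scores with no relation to the labels, as in random guessing.
rng = random.Random(0)
random_labels = [1] * 1000 + [0] * 1000
random_scores = [rng.random() for _ in random_labels]

perfect_auroc = auroc(perfect_scores, labels)
inverted_auroc = auroc(inverted_scores, labels)
random_auroc = auroc(random_scores, random_labels)
```

A perfect ranking gives 1.0, a fully inverted one gives 0.0, and scores unrelated to the truth land close to 0.5, which is the regime Collins describes.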

Not the smartest AI on the block

Other machine learning models were more accurate than AlphaFold for some simulations, according to Felix Wong, co-author of the paper and a postdoctoral researcher at MIT.

“The machine-learning models learn not just the shapes, but also chemical and physical properties of the known interactions, and then use that information to reassess the docking predictions,” he said. “We found that if you were to filter the interactions using those additional models, you can get a higher ratio of true positives to false positives.”
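The filtering idea Wong describes can be sketched in a few lines. The example below is a hypothetical illustration, not the team's pipeline: the `pairs` data, the `THRESHOLD` value, and the `tp_fp_ratio` helper are all invented, and the "rescoring probability" stands in for whatever a second machine-learning model might output.

```python
def tp_fp_ratio(predicted, actual):
    """Ratio of true positives to false positives among predicted binders."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / fp if fp else float("inf")


# Hypothetical protein-compound pairs:
# (docking_says_binds, rescoring_probability, truly_binds)
pairs = [
    (True, 0.9, True),
    (True, 0.2, False),
    (True, 0.8, True),
    (True, 0.3, False),
    (True, 0.7, False),
    (True, 0.1, False),
    (False, 0.6, True),
    (False, 0.2, False),
]

THRESHOLD = 0.5  # assumed cutoff for the second model's confidence

actual = [truth for _, _, truth in pairs]
docking_only = [dock for dock, _, _ in pairs]
# Keep a docking hit only if the second model also rates it as likely.
filtered = [dock and prob >= THRESHOLD for dock, prob, _ in pairs]

ratio_before = tp_fp_ratio(docking_only, actual)
ratio_after = tp_fp_ratio(filtered, actual)
```

In this toy set, docking alone flags two real binders and four false ones, a ratio of 0.5. After filtering, two real binders and one false one remain, a ratio of 2.0. Real gains depend entirely on how good the second model is.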

Derek Lowe, a longtime drug discovery chemist and science writer, told The Register he wasn’t surprised by the results given that AlphaFold was not really trained for molecular docking simulations. “Docking small molecules into a given protein structure is really a different problem than determining that protein structure in the first place,” he said.

Being able to model these types of chemical interactions is an unsolved problem. No algorithm is perfect. Even if scientists have a good model of the protein, its shape changes in ways that are hard to predict when it interacts with a potential drug candidate.

“Virtual screening has never yet reached the ‘works every time’ level – sometimes it provides useful information and sometimes it doesn’t, and you are never sure up front which of those regimes you’re working in. Added to that is the way that different docking software will give you different answers, and for any given target one of them might give notably more useful answers than another – but again, you don’t know up front which of those it’ll be,” Lowe said.

“Even with perfect protein structures, some of them are going to be better ‘fits’ for a docking-and-scoring approach than others, and AlphaFold structures, while impressive, are not perfect, either. But to me, this isn’t so much on AlphaFold as it is on docking technology.”

AlphaFold may prove useful for other parts of the drug discovery pipeline, where comparing protein structures obtained via different methods against the model’s predictions is valuable.

“The biggest problems in drug discovery are the ones that contribute to our roughly 85 percent failure rate in the clinic. And those are picking the right targets and getting early warnings about toxicity. Neither of those are helped much at all by knowing protein structures,” Lowe added. ®
