
‘Open science’ can be a miracle cure for drug research
It was a question that had stumped scientists for decades. In his 1972 acceptance speech for the Nobel Prize in Chemistry, Christian Anfinsen launched a quest to predict a protein’s 3D structure.
If scientists could predict how proteins fold, they could unlock everything from drug development for rare diseases, to enzymes that could eat rubbish.
Fifty years later, researchers at Google’s DeepMind used a machine-learning system named AlphaFold to solve the problem by learning from what scientists had already discovered about proteins in labs. AlphaFold was fed years of data from those experiments and was then able to predict a protein’s structure with 99% accuracy. Today over two million scientists are using AlphaFold’s predicted structures to tackle other hard-to-solve medical problems, such as the discovery of new antibiotics and treatments for malaria.
The discovery is an example of what is referred to as “open science,” where data and scientific research are freely accessible to all scientists looking to solve a problem. Last year Anfinsen’s challenge came full circle when AlphaFold’s developers accepted the Nobel Prize in Chemistry.
Matthew Todd, chair of drug discovery at UCL and head of chemistry at the Structural Genomics Consortium, wants to build on AlphaFold’s success by using machine learning to predict molecules that can be used as the starting point for medical research.
“In early-stage drug research what you need is a molecule that can get you going and that interacts with a specific biological molecule in a cell,” he explains. “Finding that molecule is hard. You can screen millions of molecules before finding the first candidate. AI is helping with that.”
Before machine learning, finding a molecule required years of trial and error. Most medical research would fail and cost pharmaceutical companies millions (if not billions) in wasted money. Open science has the potential to speed that process along by taking the very best data from multiple sources and then feeding it into an AI. Now pharmaceutical giants Pfizer and AstraZeneca are teaming up with the Structural Genomics Consortium to fund early-stage open research to improve the ability to predict molecules that bind to proteins. Their research will be open source, so it won’t just benefit the two pharmaceutical giants, but all medical researchers trying to find cures for diseases.
“In the early stage of research, it’s very difficult for one company to do all the research on their own,” explains Karen Godbold, head of global public-private partnerships at Pfizer. “You must have a collective working on it, and through a public-private partnership you can bring all the experts together and accelerate the research.”
Asked to describe the collaboration with the Structural Genomics Consortium, Godbold adds: “For me, early-stage open science research ensures that research is addressing the needs of patients more broadly. A fragmented approach doesn’t do that as successfully.”
The collaborative approach to open science doesn’t come naturally to the medical research industry. Traditionally, early-stage research is conducted in silos, either in publicly funded institutions like charities or universities, or in private pharmaceutical companies. The research is quickly patented and then must be licensed for further use. While this is sometimes lucrative for the patent owners, it also slows down medical advances when research is hoarded and mistakes are replicated.
But for Dafydd Owen, senior director of medicinal chemistry at Pfizer, collaboration isn’t just about getting the best minds together to solve a problem; it also makes good business sense: “If there is anything we can do to improve those attrition rates so we can deploy our research dollars more effectively then that’s a good investment for us.”
Running out of steam
While AlphaFold has been hailed as a potential game-changer for drug discovery, there is a snag: it’s running low on the right kind of data. The machine learns from real-world examples from known protein structures stored in the global Protein Data Bank (PDB). This database holds more than 200,000 protein structures, but most show proteins interacting with natural molecules, not drug compounds. For AlphaFold to be truly effective in drug discovery, it needs to see examples of how proteins interact with a wide variety of drug-like molecules. Those examples mostly sit behind the locked doors of pharmaceutical companies.
To solve this problem, a group of big pharma companies – including AbbVie, Johnson & Johnson, and Sanofi – is contributing data to build an AlphaFold-style model called OpenFold 3. To do so, they are training the AI on thousands of protein-drug interaction structures that have never been publicly shared. Using privacy-preserving tech from Berlin startup Apheris, companies can contribute data while keeping trade secrets safe. The resulting AI will be fully open source to serve both industry and academia.
One of the biggest risks to open science isn’t a lack of desire for collaboration, but sudden funding cuts that risk stunting medical advances. The US National Institutes of Health has slashed 800 medical research projects and cut over $2bn in funding. Most of the cuts hit projects mentioning the words “HIV”, “trans”, “Covid-19”, and “climate”. These cuts could also hamper efforts to find cures for other diseases if machines can’t learn from abundant data.
“There is no question that AI is having a positive effect speeding up drug discovery and that will only grow as we provide better, more reliable, data on how drug-like molecules interact with proteins,” says Todd.
To make that happen, humans will need to do one of the things they do best: collaborate.