Lighting up the biological darkness, with structures and chemical probes

Credit: Temerty Faculty of Medicine

When Aled Edwards launched the Structural Genomics Consortium (SGC) in 2003, he saw a chance to take on the biological unknown. With the Human Genome Project wrapping up, researchers had identified the ~20,000 genes that encode the proteins that make the cell tick. But the roles of most of these proteins were unknown, and the structures of even fewer of these had been solved. With the SGC — a public–private partnership backed initially by GSK, the Wellcome Trust and Canadian funders — Edwards and colleagues set out to solve the structures of more of these proteins to shed light on their functions.

“Sorting out the roles of all our proteins is the biggest problem in biology,” says Edwards.

In some ways, times have changed. The once empty Protein Data Bank (PDB) now holds over 200,000 structures. The SGC has deposited over 4,000 structures of human proteins, providing first snapshots for many of the solved proteins from the human proteome. By 2021, the world’s researchers had structural insight into nearly 50% of human proteins. And with the rise of structure prediction programmes such as AlphaFold, that number jumped to over 75%.

And yet, says Edwards, the biological unknown remains as vast as ever. Popular proteins keep racking up citations, but most proteins are still understudied — not because researchers can’t work on them, but because they don’t.

As the SGC turns 20, with a new cast of 6 academic and 8 industry partners, it remains focused on this void. In addition to generating protein structures, it now also develops and shares chemical tools, protein purification protocols and antibody assessments for others to explore the abyss. Working as part of the Target 2035 initiative, the SGC wants to help identify chemical probes or other pharmacological modulators for every protein in the proteome.

The technical challenges ahead are as massive as the philosophical ones, adds Edwards. “We're going to need a cultural shift, because right now folks are very uncomfortable working in research areas that have no papers and no knowledge.”

What was your goal for SGC at launch in 2003?

There was the practical part of what we needed to do, and there was what I wanted to do. And these are two different things.

On the practical side, a lot of the large pharmaceutical firms realized in the late 1990s that they were all pursuing similar structural biology projects, and thought “wouldn't it be great if we could just get this done for us”. Because they don't sell structures, they sell drugs. And so, they wanted to set up the SGC, to put money into a pot and hire a CEO to “get the job done”. Eventually that group and the Wellcome Trust contacted me, and asked “would you like to start this thing?” I was given a bag of money and we had to deliver 365 structures, from a target list, in 3 years.

But my personal driver was different. I have a traditional science background in biochemistry, and was lucky enough to land in Roger Kornberg’s lab in Stanford — kicking off the structural biology of RNA polymerase II — just as his work on transcription was taking off. [Kornberg won the Nobel Prize in 2006 for his insights into the molecular basis of transcription.] But when I moved back to Canada to work on RNA polymerase structural biology, all of a sudden, the yeast genome was sequenced. And I thought, “Oh, my God, it's like a playground”. Yeast have 6,000 genes, and nobody knew what most of them did. And a lot of folks were working on transcription. Where am I going to have more fun?

So I started to think then about how we could do experiments at scale — to tackle what I thought, and continue to think, was the biggest problem in biology. Our genome encodes the instruction manual for humans; in some sense, the answer to all our questions is in there. And yet, so many of us focus on the problems and proteins of yesterday — the one protein that we happen to learn about. And I thought, “wouldn't it be cool to develop a mechanism to determine the function of every protein in the proteome?”

But the art is in the how. So, when the Wellcome Trust asked me if I wanted the opportunity to build the SGC and generate structures of human proteins, a practical piece fell into place. From the work of GSK and others in the field of nuclear receptors, I'd seen the impact that chemical probes could have on the trajectory of science, and thought that we needed to make tool compounds for all proteins. And having purified proteins and structural biology capabilities shortened the path to that goal. Indeed, I remember sitting on a bench in 2002 in Berlin with Michael Sundström, who founded the SGC’s Oxford lab, discussing even then that this would be a way to get compounds for every protein.

How did you pivot from structural biology to chemical biology?

For the first 7 years or so, we focused on establishing our reputation for delivery and reproducibility. The pharmaceutical firms that partnered with us in the early days did so because they could repeat what we did and what we found. This might sound prosaic, especially with today’s focus on reproducibility, but back then it was less common to devote incredible attention to materials and methods. Many scientists think, “oh no, now I’ve got to write the materials and methods section”. But we thought, “this is the part we have got to get right”. We also implemented electronic lab notebooks from the outset to institutionalize this commitment. So, we spent a lot of our first 7–8 years proving we could do it, and gaining the trust of industry by developing a culture of delivery against milestones. Of course, we also published papers as much as we could to keep our academic street cred.

We were also really focused on being open, including never filing for patents as a core principle. This ‘pure as the driven snow’ approach was a business decision for us. If you commit never to file a patent, there's a level of trust that you can engender with your partners, and a stronger chance of attracting enough resources from the business and philanthropic and academic sectors together to tackle the understudied protein problem.

Then in 2009, I wrote a Commentary in Nature Chemical Biology with colleagues articulating the need for more open-access chemical probes, while concocting a plan to build a consortium to invent them. We chose a “pioneer” area of science that was too large for any single organization to take on, and proposed to tackle it together. We chose epigenetics, and specifically methyl transferases and bromodomains. Soon after, companies really piled on, and we got to work with some great scientists inside and outside of pharma to crack the question of whether epigenetic targets were druggable. It was a lot of fun.

Does AlphaFold change your approach?

I love the AlphaFold story. But to me, this isn't the instant success that the media loves; it was a 50-year instant success story.

It started with the community’s creation of the PDB in the early 1970s, as a mechanism to share structural coordinates. There you have a real example of open science — a pioneering repository where people can share data. Isn't that cool? And then there is the small detail of the over US$10 billion invested into structural biology data collection, data curation, data sharing, benchmarking competitions and more.

So, something like AlphaFold was eventually going to happen. That is part of the reason we started to shift our emphasis away from just solving the structures, to also developing the capabilities to generate chemical probes. Today, this cannot be done computationally. You still have to be able to actually purify proteins, experimentally find hits, solve structures, and do all the medicinal chemistry — which is our core competency.

Remarkably, and distressingly, there's still just tons of proteins that no one has ever worked on. In 2011, our ‘Too many roads not taken’ paper in Nature looked at the factors that influence which proteins were understudied. We looked at everything that we could think of — including the availability of mouse knockout studies, human genetics findings and chemical probe availability. And we found that the availability of high-quality, openly available tool compounds had the biggest influence on research intensity.

The availability of a probe is probably the most direct indication of druggability as well, which makes it useful for industry. But for me, that's secondary to its value for exploring the biology around a protein. So, our decision to focus on tool compounds wasn't capricious. It was practical: we knew that the greatest influence we could have on the community was by generating tool compounds.

You’ve since outlined the Target 2035 plan to make a probe for every protein. Where is this at?

We’re in an intense planning phase where we have engaged about 80 scientists from around the world to help us to figure out the Target 2035 strategy and the role for SGC.

Regardless of that outcome, we know we’ll need to be able to purify all of the human proteins, so that is our current priority. We also continue to work on structures and probes at SGC, and on making screening data available. All these will be useful for who knows how many applications. Gifts that will keep on giving.

What attracts me to Target 2035 is that it’s a bounded challenge. There are ~20,000 genes in you and me, and in a million years there are going to be 20,000 plus or minus a handful. It makes sense to approach it like an organized army, but that’s not how hypothesis-driven science is funded. I think we continue to approach biological research as if the scope is unbounded, as if we have to do a random walk and we don't know where we're going. So, part of what we're doing is figuring out which aspects of science can be organized. What parts of the process can be scaled and replicated? Are there screening technologies that can be applied to any and all proteins?

In my fantasy world, what happens is that we divide and conquer. For example, one group will focus on the GTPases and another will focus on the kinases — and then another group can come along and say “I want to do extracellular glycanases”. These groups would get access to screening platforms and compound libraries. And before you know it, as a community, another 100 probes are done.

And we need to think ambitiously. If we wanted to pat ourselves on the back, we could point out that we and our collaborators have made and donated nearly 200 chemical probes to date — more, probably, than anyone else has produced. But I prefer to look at the 18,000-plus probes to go, and say we need to raise our game a lot. We need to imagine a new way of making this happen at a scale that's not dreamed of yet.

Pulling the year 2035 out of my you-know-what was a bit audaciously stupid. But if we didn’t set a date, we’d never get there. And it's not impossible. If we can provide a roadmap, gather the world around and get people excited, we can do it.

Anticipating technological advances is almost impossible. But computational methods will start to kick into gear, and new screening methods on the horizon will make screening easier. Thanks to companies like Enamine, researchers have easy access to billions of compounds that they can just buy. None of this was possible a few years ago.

What about the scientific appetite for tackling understudied proteins?

It’s a disappointing answer, but I don't think it's growing fast enough. Culture is the hardest thing to change.

Optimistically, there's more attention now to the fact that research is relatively concentrated among a subset of genes, and there are a few individual research programmes that are explicitly funding research into new genes — such as the NIH’s Accelerating Medicines Partnership Program for Alzheimer's Disease and the Illuminating the Druggable Genome project. But it's not systematic. It’s usually dependent on a champion inside a funding agency.

I just don't think scientists, in general, agree that working on proteins that no one's ever worked on before is a good idea. They might theoretically like the idea, but then it goes through peer review and gets killed. Because imagine: you’ve got one grant from a person who has published three papers in Nature hinting at a cool hypothesis about how protein X might cause epilepsy, and another grant to study kinase 483, that has literally no papers on it yet. Who are you going to fund? And who wants to work on kinase 483?

We need to get a global movement together if we are ever going to make progress on this — to devote resources to research areas that have no papers or no knowledge. But culturally, it’s not cool yet.
