AI has done it again.
After solving one of the grandest mysteries in biology—predicting protein structure—it decoded how proteins link up into complexes, and dreamed up novel protein structures that may ultimately be turned into drugs to control our basic biology, health, and life.
Yet when faced with enormous protein complexes, AI faltered. Until now. In a mind-bending feat, a new algorithm deciphered the structure at the heart of inheritance: a massive complex of roughly 1,000 proteins that helps channel DNA instructions to the rest of the cell. The AI model is built on AlphaFold by DeepMind and RoseTTAFold from Dr. David Baker’s lab at the University of Washington, both of which were released to the public for further experimentation.
Our genes are housed in a planet-like structure, dubbed the nucleus, for protection. The nucleus is a high-security castle: only specific molecules are allowed in and out to deliver DNA instructions to the outside world—for example, to protein-making factories in the cell that translate genetic instructions into proteins.
At the heart of regulating this traffic are nuclear pore complexes, or NPCs (wink to gamers). They’re like extremely intricate drawbridges that strictly monitor the ins and outs of molecular messengers. In biology textbooks, NPCs often look like thousands of cartoonish potholes dotted on a globe. In reality, each NPC is a massively complex, donut-shaped architectural wonder, and one of the largest protein complexes in our bodies.
Why care? Like tackling a massive jigsaw puzzle, solving the NPC structure is rewarding on its own. But because they control how DNA information is transmitted to the rest of the cell, NPCs are essential for gene therapy, mRNA-type vaccines, CRISPR, and potentially other genetic treatments we haven’t yet imagined.
“NPC [is] a hotspot for disease-associated mutations and host–pathogen interactions,” said Dr. Di Jiang, a senior editor at Science, in a deep dive into NPCs for their latest issue. “The work reported here represents a triumph of experimental structural biology.”
A Structural Enigma
“Nuclear pores” sounds like something from a skincare video. But for cell biologists, they’re a decades-long enigma. “NPCs are essential for life,” explained Dr. Francis Collins, a former director of the National Institutes of Health (NIH).
Our DNA strands are curled around a protein spool. They’re then sequestered inside the nucleus, which shields DNA from potentially harmful chemicals, viruses, or other junk. Picture wrapping donut holes in double-layered plastic wrap—that’s the nuclear envelope. Now punch a few holes into the wrapper—those are the NPCs.
These seemingly simple “holes in the wall” are critical gatekeepers of genetic control in cells. Our cells function by translating DNA code into proteins to build physical tissues or control basic biological functions—telling a cell when to divide or die, balancing metabolism, and warding off viral invaders.
But DNA is sequestered inside the nucleus. Hundreds of protein messengers need to enter the nuclear sanctum to transcribe DNA instructions into mRNA and shuttle it back to the cell’s protein-making factories. Each run has to pass through NPCs, which act as guards and channels in one structure.
Scientists have long sought to decode NPC structure, using biochemical wizardry to tamper with its normal function or X-rays to scan its crystalline structure. The work was utterly painstaking. From those data, scientists found that two main types of proteins form the gate.
The first type builds the gating scaffold. These proteins, dubbed NUPs (nucleoporins), tag-team to line the tunnel. The second type acts like living drywall mud. These proteins are far more flexible, plastered along the scaffolding proteins and extending into the central channel, where they can physically grab onto cargo to help it move along.
Made up of nearly 1,000 proteins that form roughly 30 different “docks,” NPC structures are tough to solve because they dynamically change. Multiple proteins, for example, act as interconnected hinges to change the configuration or size of the pores. Because the whole structure “intimately embraces” the nuclear envelope, NPCs can’t be studied in isolation, explained the team, led by Drs. Gerhard Hummer and Martin Beck at the Max Planck Institute of Biophysics, and Dr. Jan Kosinski at the European Molecular Biology Laboratory. So far, even with state-of-the-art biochemical means, scientists have only solved 46 percent of the NPC structure.
“It’s like when you disassemble and reassemble an electronic device. There will always be some screws left, and you just don’t know where they are supposed to be,” said Kosinski. But thanks to AI, “we finally managed to fit most of them, and now, we know exactly where they are, what they do, and how.”
Enter AI
The team first tapped into and improved upon a popular method for analyzing NPCs called cryo-electron tomography (cryo-ET). The method rose to fame in 2015 when it resolved cellular structures down to near-atomic scale. One of the problems with solving NPC structure is the low resolution of previous datasets, the team explained. Here, they collected an “approximately fivefold larger dataset” than in their previous attempt, and used a new computing method to analyze the data.
Looking at the newly drawn maps, the team could tell when the nuclear envelope, or DNA “wrapper,” was in a constricted state versus a more relaxed one. Digging deeper, the team tapped into AlphaFold and RoseTTAFold to predict a comprehensive set of models for NPC proteins. The duo worked well: the analysis could model most nuclear proteins with high confidence, and the results matched data from traditional microscopic analysis methods.
Then came the hard part. Like docks in a shipyard, NPCs are heavily linked with protein traffic ways, which are often difficult to model in 3D. Using their model, the team mapped “anchor points” of protein linkers to the NPC main tunnel. Further modeling built a “Google Map” for how linkers connect. As in a well-organized shipyard, each linker helps maintain the NPC structure.
Hacking the Heart of Inheritance
Using AI to solve protein structures has been touted as the breakthrough of the decade. This study is one of the first to showcase the algorithm’s power in messy, complicated, real-world settings.
“This work exemplifies how, in the future, structural biology will embrace cell biology to create atomic models of ever larger assemblies of molecules that perform different functions in different parts of the cell,” said Beck.
The revolution is already on its way. In the same journal issue, a separate team led by Dr. Hao Wu at Harvard Medical School combined microscopy imaging with AlphaFold to solve part of the NPC structure using the eggs of Xenopus laevis, the African clawed frog that’s a darling in biochemical research.
But AI isn’t yet the savior. As Dr. Thomas Schwartz at MIT, who wasn’t involved in the studies, pointed out, NPCs behave like living creatures, constantly changing their configuration. For example, their channels tend to be wider when they’re happily nestled inside the nuclear envelope than after they’re yanked out to study under a microscope. In other words, protein complexes are hard to decipher and control. But AI is on our side.
“We can now think of building a complete dynamic model of the NPC and simulate nuclear transport in atomic detail,” he said. With AI-based protein prediction on a roll, even more exciting is what’s yet to come.