Scientists Build Programmable Protein Cages That Mimic Viruses
<p>A virus has no blueprint. Over billions of years, nature refined a trick: build large, complex shells from a single protein that plays different geometric roles depending on where it sits. On May 20, 2026, researchers at the University of Washington&#8217;s <a href='https://www.ipd.uw.edu/' target='_blank' rel='noopener'>Institute for Protein Design</a> (IPD) published two back-to-back studies in Nature showing that AI-powered computational methods can now replicate that trick from scratch, with structures confirmed at atomic resolution and cages demonstrated inside living mammalian cells.</p>
<p>The new protein cages are <strong>two to three times larger</strong> in internal volume than anything the same laboratory previously built using strict symmetry. They can be loaded with molecular cargo, enter cells, and be tuned to different sizes by adjusting a single geometric parameter. Building virus-like nanoscale containers on demand, from proteins designed entirely at a computer, has moved from open question to demonstrated capability, although clinical use remains untested.</p>
<h2>The Soccer Ball Principle, Scaled to the Nanometer</h2>
<p>Think of a soccer ball. Its surface panels come in two shapes, pentagons and hexagons, arranged so the pentagons introduce curvature and the hexagons fill the flatter regions in between. No single panel changes its size or chemistry. Together, the arrangement closes a flat lattice into a sphere. Viral capsids have used this same geometric logic for billions of years to protect genetic material during transit between host cells.</p>
<p>That logic has a name: quasisymmetry. Structural biologist Aaron Klug and biophysicist Donald Caspar first formally described the principle in 1962, analyzing how icosahedral viruses build shells far larger than a strictly symmetric icosahedron allows. In a strictly symmetric cage, every protein subunit is surrounded by identical neighbors. In a quasisymmetric cage, chemically identical subunits adopt subtly different backbone conformations depending on their local environment, letting the shell grow far larger before it closes.</p>
<p>Prior computational protein design had hit a ceiling at the strictly symmetric, 60-subunit icosahedron. Quasisymmetric designs demand that one amino acid sequence accommodate multiple distinct local geometries simultaneously, a requirement that earlier methods lacked the precision to satisfy. Shunzhi Wang, now an assistant professor at the New York University (NYU) Grossman School of Medicine and lead author of one of the two papers, built the new strategy around an insight borrowed from soft-matter physics.</p>
<p>The obstacle he exploited is called <strong>geometric frustration</strong>: a flat hexagonal lattice cannot tile a spherical surface without distortion. Rather than treating that frustration as a problem to eliminate, the team used it as a design lever. Accept the frustration, allow pentagonal defects to appear at calculated positions in the growing hexagonal sheet, and the lattice bends into a curve and then closes into a hollow sphere. The cage size is set by how many hexagons appear between each pair of pentagons, a tunable quantity controlled by the geometry of the bridging linker protein.</p>
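<p>The link between pentagon spacing and shell size can be made concrete with the classic Caspar-Klug triangulation number. The sketch below is a minimal illustration of that textbook formula, not code from the study. The function names and the step variables <code>h</code> and <code>k</code> (hexagon-lattice steps from one pentagon to the next) are chosen here for illustration only.</p>

```python
def triangulation_number(h, k):
    """Caspar-Klug T number for a path of h steps, a 60-degree turn,
    then k steps across the hexagonal lattice between two pentagons."""
    return h * h + h * k + k * k


def hexagon_count(t):
    """Hexagons in a closed icosahedral shell that has 12 pentagons."""
    return 10 * (t - 1)


def allowed_t_numbers(max_step):
    """Distinct T values for all (h, k) with steps from 0 to max_step."""
    values = {
        triangulation_number(h, k)
        for h in range(max_step + 1)
        for k in range(max_step + 1)
        if (h, k) != (0, 0)
    }
    return sorted(values)


# List the smallest shells and how many hexagons each one needs.
for t in allowed_t_numbers(2):
    print(f"T={t}: 12 pentagons, {hexagon_count(t)} hexagons")
```

<p>In this idealized picture, T=1 is the dodecahedron-like shell with no hexagons and T=3 is the soccer-ball pattern with 20. Only certain T values are possible, so shell size changes in discrete jumps rather than continuously.</p>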
<figure class="wp-block-image aligncenter featured-image" style="margin:1.5em auto;text-align:center;"><img class="aligncenter" src="https://budgyapp.com/wp-content/uploads/2026/05/computational-design-of-quasisymmetric-protein-cage-for-gene-therapy-delivery.webp" alt="Computational design of quasisymmetric protein cage for gene therapy delivery." style="width:100%;max-width:800px;height:auto;border-radius:8px;display:block;margin:0 auto;" /><figcaption style="text-align:center;font-size:0.85em;color:#888;margin-top:0.5em;">Computational design of quasisymmetric protein cage for gene therapy delivery.</figcaption></figure>
<h2>How RFdiffusion Cracked the Geometry</h2>
<h3>The Linker That Bends the Lattice</h3>
<p>The design relies on two proteins produced separately and combined at controlled ratios. The first is a trimeric component designated C3-A, which forms the hexagonal faces of the growing lattice. The second is a dimeric linker designated C2-B, which bridges adjacent C3-A trimers at each lattice edge. The cone angle of the C2-B linker determines the local curvature of the entire assembly.</p>
<p>A shallow cone angle keeps adjacent trimers nearly coplanar, favoring flat sheets that never close. A steeper angle introduces positive curvature. At the angle corresponding to T=3 icosahedral geometry, pentagonal defects appear spontaneously in the growing hexagonal lattice and force it to close. The team designed a family of C2-B variants, labeled by their cone angle in degrees. The α20 variant produces a small cage analogous to a regular dodecahedron. The α30 variant, with a steeper angle, yields a mixture of larger topologies. The α25 variant gives the confirmed icosahedral structure that the study characterized in full structural detail.</p>
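<p>The dodecahedron-like and T=3 shells can be compared by simple counting. The sketch below assumes an idealized closed shell in which exactly three faces meet at every vertex. That is an assumption of this illustration, not a statement about how the study modeled its cages. It uses Euler's formula to derive faces, edges, and vertices from the hexagon count, and <code>closed_cage_counts</code> is a hypothetical helper name.</p>

```python
PENTAGONS = 12  # forced by Euler's formula for a closed trivalent shell


def closed_cage_counts(hexagons):
    """Face, edge, and vertex counts for a closed cage built from
    12 pentagons plus `hexagons` hexagons, three faces per vertex."""
    faces = PENTAGONS + hexagons
    sides = 5 * PENTAGONS + 6 * hexagons  # total polygon sides
    edges = sides // 2                    # each edge is shared by two faces
    vertices = sides // 3                 # each vertex is shared by three faces
    # Sanity check: Euler's formula for any sphere-like polyhedron.
    assert vertices - edges + faces == 2
    return {"faces": faces, "edges": edges, "vertices": vertices}


for name, hexagons in [("dodecahedron", 0), ("T=3 soccer ball", 20)]:
    print(name, closed_cage_counts(hexagons))
```

<p>Adding hexagons increases every count while the number of pentagons stays fixed at 12, which is why a closed shell of any size still needs pentagonal defects.</p>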
<h3>Where AI Enters the Pipeline</h3>
<p>Designing C2-B required holding two C3-A trimers in geometrically precise, rigid register using a protein scaffold with no natural equivalent. The team used <a href='https://github.com/RosettaCommons/RFdiffusion' target='_blank' rel='noopener'>RFdiffusion, the deep-learning diffusion model for protein backbone generation</a> developed at the same institute, to generate candidate backbones satisfying those geometric constraints. Given fixed positions of two input motifs extracted from the previously designed LHD101 heterodimer pair, RFdiffusion ran symmetric diffusion sampling to produce C2-B scaffolds de novo, with no natural starting structure to copy.</p>
<p>ProteinMPNN then assigned amino acid sequences to those backbones. AlphaFold2 predicted the folded structures, and only candidates that passed structure-prediction quality thresholds advanced to experimental testing. The pipeline mirrors the general framework used across recent IPD projects, applied here to a problem that required generating a protein capable of bridging two fixed geometric anchors at a tunable angle while self-assembling correctly at nanomolar concentrations.</p>
<ul>
<li><strong>Two proteins,</strong> C3-A (trimeric) and C2-B (dimeric), produced separately and mixed to trigger cage assembly</li>
<li><strong>T=3 cage</strong> confirmed by cryo-electron microscopy (cryo-EM, which fires electrons through frozen protein samples for near-atomic-resolution imaging), with coordinates deposited in the Protein Data Bank (PDB, the global repository for macromolecule structures) as accession 9OM3</li>
<li><strong>2-3x larger</strong> internal volume than prior strictly symmetric two-component cages from the same laboratory</li>
<li><strong>Multiple cage sizes</strong> accessible from one C3-A component by substituting C2-B linker variants carrying different cone angles</li>
</ul>
<h2>Two Papers, One Day, Two Routes to the Same Sphere</h2>
<p>The Wang et al. paper was published simultaneously with a companion study, <a href='https://www.doi.org/10.1038/s41586-026-10554-z' target='_blank' rel='noopener'>Design of One-Component Quasisymmetric Protein Nanocages</a>, led by Sangmin Lee, assistant professor of chemical engineering at Pohang University of Science and Technology (POSTECH) and a former postdoctoral researcher in David Baker&#8217;s group at IPD. Where the two-component approach controls curvature through a tunable bridging linker, Lee&#8217;s approach encodes the required curvature into a single protein subunit designed to undergo spontaneous symmetry breaking as it self-assembles from one genetic construct.</p>
<table>
<thead>
<tr>
<th>Feature</th>
<th>Two-Component Cages (Wang et al.)</th>
<th>One-Component Cages (Lee et al.)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Building blocks</td>
<td>Two proteins: C3-A trimer and C2-B dimer</td>
<td>Single protein subunit</td>
</tr>
<tr>
<td>Assembly mechanism</td>
<td>Geometric frustration via tunable cone angle</td>
<td>Spontaneous symmetry breaking from programmed curvature</td>
</tr>
<tr>
<td>Assembly trigger</td>
<td>Mixing separately produced components</td>
<td>Self-assembly from one protein species</td>
</tr>
<tr>
<td>Cargo loading method</td>
<td>Controllable at the mixing step</td>
<td>Interior volume accessible by design</td>
</tr>
<tr>
<td>Key practical advantage</td>
<td>On-demand cargo packaging during co-assembly</td>
<td>Single gene encodes the full cage</td>
</tr>
</tbody>
</table>
<p>Together the two papers mark a categorical shift. The Baker lab has produced strictly symmetric two-component cages since 2016. Quasisymmetric designs, which capture the actual architectural logic of real viral capsids, have now arrived in two independent forms on the same day, from overlapping teams, using different underlying principles. That convergence argues for a design capability that is reproducible engineering rather than a singular lucky result.</p>
<h2>Electron Microscopy Validates the Cage Structures</h2>
<p>Structural confirmation drew on three electron microscopy modalities, supplemented by X-ray crystallography of an isolated linker. During early design screening, negative-stain electron microscopy (nsEM) revealed multiple cage populations, including assemblies consistent with dodecahedral geometry, the target icosahedral topology, and potential larger analogs whose morphology tracks known classes of fullerene-like carbon structures. The appearance of distinct size classes when the linker angle was varied confirmed that cage size was genuinely tunable by linker choice rather than fixed by the C3-A trimer itself.</p>
<p>The icosahedral cage structure was resolved by cryo-EM at near-atomic resolution and deposited in the <a href='https://www.rcsb.org/structure/9OM3' target='_blank' rel='noopener'>Protein Data Bank as accession 9OM3</a>. Separately, cryo-electron tomography (cryo-ET) produced a three-dimensional reconstruction of the same cage under near-physiological conditions, deposited as PDB 9OP9. The X-ray crystal structure of the isolated C2-B-α20 linker, solved at the National Synchrotron Light Source II at Brookhaven National Laboratory, was deposited as PDB 9NDL and showed the designed fold achieved with high geometric fidelity.</p>
<p>Defect analysis of nsEM micrographs matched theoretical predictions closely. For cage topologies below a triangulation number of nine, only pentagonal defects appeared, consistent with positive Gaussian curvature throughout the shell. For larger topologies, both pentagonal and heptagonal defects were visible. Heptagonal defects produce negative curvature and are energetically unfavorable, but the authors&#8217; analysis indicates they can form under kinetic control during closure of large cages, precisely the behavior documented in carbon fullerene chemistry. The authors draw that comparison explicitly, placing their protein cages in the same geometric family as buckminsterfullerene and its larger carbon cousins.</p>
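<p>The defect pattern follows a bookkeeping rule that comes from Euler's formula. In a closed shell where three faces meet at every vertex, each pentagon contributes +1, each hexagon 0, and each heptagon -1, and the total must equal 12. The sketch below checks that rule, which is necessary but not sufficient for a valid cage. The function name is hypothetical, and the face counts are invented for illustration rather than taken from the study.</p>

```python
def satisfies_closure_rule(face_counts):
    """Check the Euler bookkeeping rule for a closed trivalent shell.

    face_counts maps the number of sides of a polygon to how many such
    faces the shell has, for example {5: 12, 6: 20}.
    """
    curvature_charge = sum((6 - sides) * count
                           for sides, count in face_counts.items())
    return curvature_charge == 12


# Illustrative face counts only (not measurements from the study).
examples = {
    "pentagons and hexagons only": {5: 12, 6: 20},
    "two heptagons offset by two extra pentagons": {5: 14, 6: 40, 7: 2},
    "one heptagon with no extra pentagon": {5: 12, 6: 40, 7: 1},
}
for label, faces in examples.items():
    print(label, satisfies_closure_rule(faces))
```

<p>Under this rule, every heptagon that forms during closure has to be offset by one pentagon beyond the usual twelve, which is the same accounting used for defects in carbon fullerenes.</p>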
<h2>Cargo Loading and Cellular Uptake Tests</h2>
<p>Structural characterization was accompanied by functional demonstrations designed to establish what the cages can carry and where they can go. Experiments documented across the study included:</p>
<ul>
<li>Fusing superfolder green fluorescent protein (sfGFP, an engineered GFP variant that retains fluorescence even when attached to large protein assemblies) to cage components for live-cell tracking of cage position and movement inside mammalian cells</li>
<li>Covalent cargo attachment via HaloTag (a protein tag forming an irreversible bond with synthetic ligands) and SpyTag/SpyCatcher (an isopeptide-bond-forming pair that permanently links protein modules), enabling modular decoration of both cage interior and exterior surfaces</li>
<li>Cellular uptake experiments showing that functionalized cages are internalized by mammalian cells, the minimum prerequisite for any intracellular delivery application</li>
<li>Live-cell imaging in collaboration with Liam J. Holt&#8217;s research group at NYU Langone Medical Center&#8217;s Institute for Systems Genetics, using the cages as calibrated probes to study cytoplasmic crowding, intracellular diffusion, and protein localization</li>
</ul>
<p>Extended data also demonstrated successful cage assembly when the building-block proteins were fused to a broad set of additional proteins of interest, suggesting the modular surface-attachment strategy generalizes beyond the specific cargo combinations tested in the main experiments. The cage&#8217;s two-component design offers a particular logistical advantage for cargo loading: molecular passengers can be encapsulated during the mixing step that triggers assembly, without requiring disassembly and reassembly of a pre-formed structure.</p>
<h2>Immunogenicity and the Path to Clinical Use</h2>
<p>David Baker, director of the Institute for Protein Design at the University of Washington and a 2024 Nobel laureate in Chemistry, described the combined significance of the two papers in an IPD statement accompanying the publications:</p>
<blockquote>
<p>These papers show that protein design is beginning to capture some of the architectural principles that nature uses to build at very large scales.</p>
</blockquote>
<p>That framing also marks where the work stops. The cages are de novo proteins whose sequences did not exist in nature before computational design produced them. Whether a human immune system will treat them as benign carriers or mount an inflammatory response is not yet established. <strong>Immunogenicity testing</strong>, the systematic measurement of immune responses a new protein antigen triggers, is the next required step before any human research application becomes feasible. Additional pre-clinical studies are needed, the authors note, before the cages enter human research protocols.</p>
<p>The research was funded by a consortium spanning the Defense Threat Reduction Agency, the Bill and Melinda Gates Foundation, the Howard Hughes Medical Institute, and the NIH&#8217;s National Institute on Aging, among others. A gift from Microsoft supported part of the computational work. That funding portfolio, spanning national security, global infectious disease, and basic cell biology, reflects the multiple application areas researchers envision for programmable protein containers at this scale.</p>
<p>If immunogenicity data show the cages are well-tolerated, the size range now accessible through quasisymmetric design places genetic payloads previously too large for strictly symmetric carriers within reach of a single engineered vessel. If immune reactivity proves significant, engineering around that response becomes the next multi-year problem. Both paths forward start with the same experiment, and none of those results exist yet.</p>