
The therapeutic potential of cannabinoids such as cannabidiol (CBD) and ∆⁹-tetrahydrocannabinol (THC) is broad, spanning applications from pain management to epilepsy treatment. However, a fundamental physicochemical barrier has long constrained their clinical utility: cannabinoids are lipophilic, or "oily," and therefore dissolve poorly in water. This limits their bioavailability in oral and topical formulations and creates a significant bottleneck for pharmaceutical development. Glycosylation, the attachment of sugar molecules, is a strategy that nature commonly uses to enhance the solubility and stability of such compounds. The concept is simple, but producing glycosylated cannabinoids efficiently and selectively has remained a major challenge, slowing the transition from laboratory curiosity to scalable biomanufacturing.
The journey toward water-soluble cannabinoids began not in the cannabis plant itself but with borrowed tools. Foundational research demonstrated the principle using enzymes from entirely different species. A key 2017 study, for instance, identified a glycosyltransferase from the stevia plant, UGT76G1, capable of glycosylating various cannabinoids in vitro [3]. This work was a crucial proof of concept, confirming that "cannabosides" could be created and did indeed exhibit vastly improved water solubility. Concurrently, the field of synthetic biology was making strides in engineering microbes such as Saccharomyces cerevisiae to produce cannabinoid precursors from simple sugars [4]. Yet these advances ran on parallel tracks. Scientists had the parts (microbial production chassis and foreign enzymes) but lacked a native, optimized, and integrated blueprint for efficient synthesis. The central questions remained unanswered: Does cannabis naturally produce these valuable compounds, and if so, what specific molecular machinery does it use?
A landmark 2025 study published in PNAS by Pinkas et al. provides the missing blueprint, fundamentally shifting the paradigm of cannabinoid engineering [1]. The research elegantly dismantles the problem through a multi-stage, discovery-driven approach.
The first major contribution was to confirm a long-standing hypothesis. By developing a sophisticated analytical workflow using high-resolution mass spectrometry, the researchers hunted for glycosylated molecules in various Cannabis sativa tissues. For the first time, they documented the natural occurrence of glycosylated forms of key cannabinoid pathway molecules, including olivetolic acid (OA), cannabigerolic acid (CBGA), and cannabidiolic acid (CBDA). This discovery was pivotal, proving that the plant possesses the inherent capability and providing a natural template for bioengineering efforts.
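To make the mass-spectrometry search concrete, the sketch below computes the m/z values one might look for when hunting for glycosylated forms of OA, CBGA, and CBDA. It is only an illustration, not the authors' workflow: it assumes a single hexose is attached (a net gain of C6H10O5), assumes detection as the deprotonated [M-H]- ion, and the function and variable names are invented for this example.

```python
# Monoisotopic masses of the elements involved (standard values, in Da).
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646  # proton mass, removed to form the [M-H]- ion

# Elemental compositions of the pathway molecules named in the text.
COMPOUNDS = {
    "OA": {"C": 12, "H": 16, "O": 4},    # olivetolic acid
    "CBGA": {"C": 22, "H": 32, "O": 4},  # cannabigerolic acid
    "CBDA": {"C": 22, "H": 30, "O": 4},  # cannabidiolic acid
}

# Assumption: each sugar is a hexose joined with loss of water,
# so every attachment adds C6H10O5 to the aglycone.
HEXOSE_GAIN = {"C": 6, "H": 10, "O": 5}


def mono_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def glycoside_mz(name, n_sugars=1):
    """Expected m/z of the [M-H]- ion carrying n_sugars hexose units."""
    neutral = mono_mass(COMPOUNDS[name]) + n_sugars * mono_mass(HEXOSE_GAIN)
    return neutral - PROTON


for name in COMPOUNDS:
    print(f"{name}: aglycone {glycoside_mz(name, 0):.4f}, "
          f"monoglycoside {glycoside_mz(name, 1):.4f}")
```

Each attached hexose shifts the expected mass by about 162.05 Da, which is the kind of signature a high-resolution instrument can distinguish from other modifications. A real search would also consider other adducts, other sugars, and fragmentation patterns.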
With confirmation that these compounds exist, the next logical step was to find the enzymes responsible. The team performed a genome-wide search for UDP-glycosyltransferases (UGTs)—the enzyme class responsible for glycosylation—within the cannabis genome. By combining phylogenetic analysis and gene expression data, they narrowed down the candidates to those highly expressed in cannabinoid-rich tissues. Subsequent functional screening in E. coli successfully identified four native cannabis UGTs (CsUGTs) with the ability to glycosylate the cannabinoid precursor OA [1]. This provided the field with a native toolkit, optimized by evolution for the plant's unique chemistry.
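The narrowing step described above can be pictured as a simple filter over candidate genes. The sketch below is a toy version under loud assumptions: the gene names, clade labels, expression values, and thresholds are all made up for illustration and do not come from the study.

```python
# Hypothetical candidates with made-up expression values (arbitrary units).
CANDIDATES = [
    {"gene": "UGT_a", "clade": "A", "rich_tissue": 120.0, "other_tissue": 10.0},
    {"gene": "UGT_b", "clade": "A", "rich_tissue": 5.0, "other_tissue": 40.0},
    {"gene": "UGT_c", "clade": "B", "rich_tissue": 60.0, "other_tissue": 55.0},
    {"gene": "UGT_d", "clade": "C", "rich_tissue": 200.0, "other_tissue": 20.0},
]


def shortlist(candidates, clades, min_expr=50.0, min_fold=3.0):
    """Keep genes from the chosen clades that are both highly expressed and
    enriched in cannabinoid-rich tissue, ranked by fold enrichment."""
    kept = []
    for c in candidates:
        # Guard against division by zero for genes silent elsewhere.
        fold = c["rich_tissue"] / max(c["other_tissue"], 1e-9)
        if c["clade"] in clades and c["rich_tissue"] >= min_expr and fold >= min_fold:
            kept.append((c["gene"], fold))
    return sorted(kept, key=lambda item: item[1], reverse=True)


picked = shortlist(CANDIDATES, clades={"A", "C"})
for gene, fold in picked:
    print(f"{gene}: {fold:.1f}-fold enriched")
```

In this toy data set only UGT_a and UGT_d survive: UGT_b is barely expressed in the cannabinoid-rich tissue and UGT_c falls outside the selected clades. The shortlisted genes would then go on to functional screening.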
However, discovery of the native enzymes revealed a new, more subtle bottleneck. The most active enzyme, CsUGT14, showed a strong preference for glycosylating the intermediate OA rather than the final cannabinoid products like CBGA. In a production context, this is a critical flaw, as it would divert metabolic flux away from the desired end-products, sabotaging overall yield.
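Why substrate preference matters for yield can be shown with a small calculation. When two substrates compete for one enzyme under Michaelis-Menten kinetics, the ratio of their conversion rates equals the ratio of each substrate's catalytic efficiency (kcat/Km) multiplied by its concentration. The sketch below applies that standard relationship; the efficiencies and concentrations are hypothetical placeholders, not measurements for CsUGT14.

```python
def fraction_to_intermediate(eff_intermediate, eff_product,
                             conc_intermediate, conc_product):
    """Share of glycosylation events that hit the intermediate.

    eff_* are catalytic efficiencies (kcat/Km) and conc_* are substrate
    concentrations; each competing rate is proportional to eff * conc.
    """
    rate_intermediate = eff_intermediate * conc_intermediate
    rate_product = eff_product * conc_product
    return rate_intermediate / (rate_intermediate + rate_product)


# Hypothetical numbers: an enzyme that prefers the intermediate 10:1,
# and a redesigned enzyme with the preference reversed, at equal
# substrate concentrations.
wild_type = fraction_to_intermediate(10.0, 1.0, 1.0, 1.0)
redesigned = fraction_to_intermediate(1.0, 10.0, 1.0, 1.0)

print(f"prefers intermediate: {wild_type:.2f} of sugar transfers go to it")
print(f"prefers product:      {redesigned:.2f} of sugar transfers go to it")
```

With these placeholder values, about 0.91 of the transfers go to the intermediate in the first case and about 0.09 in the second, which is why shifting specificity toward the final cannabinoids matters so much in a production strain.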
Here, the study makes its most innovative leap. Traditional protein engineering relies on obtaining a high-resolution crystal structure of the enzyme to guide modifications—a process that is often time-consuming and sometimes impossible. Lacking a crystal structure for CsUGT14, the researchers turned to artificial intelligence.
Using AlphaFold2, they generated a highly accurate predicted 3D structure of the enzyme. This in silico model served as the digital scaffold for computational protein design.
With the predicted structure in hand, they employed the FuncLib design methodology to rationally engineer the enzyme's active site. The goal was twofold: first, to shift its substrate preference away from the OA intermediate and toward the final cannabinoid products, and second, to improve its overall stability and activity. Through systematic screening of designed variants, they identified mutants that achieved exactly that. Notably, they discovered a single point mutation that could even control the specific chemical isomer being produced—a remarkable level of precision. This crystallography-free workflow represents a powerful new approach to enzyme optimization.
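The "systematic screening of designed variants" step starts from a combinatorial library. The sketch below only illustrates that enumeration idea and is not FuncLib itself, which additionally filters and ranks designs using evolutionary information and energy calculations. The positions, wild-type residues, and allowed substitutions are hypothetical and are not taken from CsUGT14.

```python
from itertools import combinations, product

# Hypothetical active-site positions with their wild-type residues.
WILD_TYPE = {88: "F", 121: "L", 190: "S", 276: "M"}

# Hypothetical substitutions judged tolerable at each position.
ALLOWED = {
    88: ["Y", "W"],
    121: ["I", "V", "M"],
    190: ["A", "T"],
    276: ["L"],
}


def enumerate_variants(allowed, min_mut=1, max_mut=3):
    """Yield every mutant carrying between min_mut and max_mut substitutions."""
    positions = sorted(allowed)
    for k in range(min_mut, max_mut + 1):
        for chosen in combinations(positions, k):
            # All residue choices for the selected positions.
            for residues in product(*(allowed[p] for p in chosen)):
                yield dict(zip(chosen, residues))


def label(variant):
    """Conventional mutation label, e.g. F88Y/L121I."""
    return "/".join(f"{WILD_TYPE[p]}{p}{aa}" for p, aa in sorted(variant.items()))


variants = list(enumerate_variants(ALLOWED))
print(len(variants), "variants, for example:", label(variants[0]), label(variants[-1]))
```

Even this tiny example produces 59 variants from four positions, which shows why a computational ranking step is needed before anything is built and tested at the bench.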
The significance of this work extends far beyond cannabinoid synthesis. It establishes a powerful and generalizable research paradigm for natural product engineering: AI-predicted structure + computational design + functional screening. This workflow can be applied to countless other valuable plant-derived molecules where structural information is a limiting factor, accelerating the development of novel biotherapeutics, materials, and chemicals.
For cannabinoids, this study provides the critical missing components for building a complete, cell-based factory. By introducing these engineered, highly specific CsUGT enzymes into microbes already engineered to produce cannabinoids, the scalable biosynthesis of water-soluble cannabosides is now within reach. This opens the door to novel oral drugs, injectables, and cosmetic formulations with superior performance.
The design-build-test-learn cycle demonstrated in the paper is the essence of modern synthetic biology. This iterative process could be further accelerated by platforms that automate large-scale library construction and screening. For instance, tools such as DNA Synthesis & Cloning and Ailurus vec, which link enzyme performance directly to host cell survival, could generate large, high-quality datasets well suited to training next-generation AI models for protein engineering.
Ultimately, this research masterfully bridges the gap between natural product discovery and AI-driven engineering. By first listening to nature and then precisely editing its machinery, the study not only solves a long-standing problem in pharmaceutical science but also illuminates a faster, more intelligent path toward harnessing the vast chemical diversity of the biological world.
