Self-Designing Cells: Evolutionary Algorithms Driving Intelligent Virtual Microbial Systems

ABSTRACT

Artificial intelligence, combined with evolutionary algorithms, represents an ongoing advancement in modelling and analysing cellular processes. Conventional simulations of virtual cells are based on predefined architectures, thereby limiting the possibility of adaptation and self-optimisation observed in living organisms. In this paper, a self-designing virtual cell framework, evolving through Darwinian principles, was developed to adapt and optimise cellular responses to nutrients and drug exposure using multi-objective genetic algorithms. It was compared with three established biological models. Simulations were conducted using Python-based tools. Key parameters—including kill rate constants, clearance rates, and transport flux limits related to metabolic efficiency and drug response—were optimised over 30 generations using Pareto-front techniques to balance trade-offs between biomass production, ATP stability, and effective agent clearance. By enabling multi-generational self-optimisation without manual intervention, this work connects artificial evolution with computational biology. The framework represents a foundational step toward adaptive, AI-powered models capable of simulating resilience, efficiency, and evolutionary dynamics. This study demonstrates the potential of AI-driven approaches to develop realistic, adaptive virtual cells, contributing toward next-generation synthetic biological systems.

INTRODUCTION.

Globally, the AI-driven modelling of cellular systems is transforming drug discovery and synthetic biology [1]. Institutes such as the Arc Institute and the Chan Zuckerberg Initiative are advancing virtual cell platforms using single-cell data [2, 3]. In India, institutions such as IISc and NCBS support computational biology, while AI-driven cellular evolution is still in its early stages. This work extends these initiatives by using evolutionary algorithms to develop self-optimising virtual cells [4]. Self-optimisation here describes the autonomous adjustment of a virtual cell's parameters through Darwinian evolutionary processes – namely variation, selection, and inheritance – without manual intervention or predefined behavioural rules.

The virtual modelling of cells is a rapidly growing field at the intersection of artificial intelligence and biology. Early approaches, including agent-based modelling and rule-based systems such as COPASI (“COPASI,” n.d.) and BioJazz, simulated biological pathways using static architectures; however, these lack the adaptability seen in real cells [5]. Furthermore, AI-powered initiatives, such as The AI Virtual Cell, have demonstrated the potential for predicting cell responses [6]. Such models have been used in cancer research, gene circuit optimization, and cellular metabolism modeling. Despite these advances, few models allow for open-ended evolution [7].

Recently, artificial intelligence (AI) and machine learning (ML) have transformed biotechnology by enabling the virtual modelling of complex biological systems. While traditional cell modelling required a set of handwritten rules, machine learning now allows systems to learn from data and make predictions from what they have learned [8]. Evolutionary algorithms simulate natural selection, allowing systems to adapt under defined environmental pressures [9]. Recent studies demonstrate the use of genetic programming and differential evolution in optimizing cellular networks [10]. While both AI and ML have shown great promise in simulating biological systems, most research focuses solely on static models that rely on predefined datasets. There is therefore a lack of systems that evolve autonomously through Darwinian principles, as existing virtual cell simulators primarily focus on prediction rather than adaptation. No current model offers a framework in which virtual cells can self-optimise over generations, mirroring natural evolution. Pareto-front optimization, which is used in this work, is a method for solving problems with multiple competing objectives: instead of finding one single “best” solution, it finds a set of balanced solutions in which improving one objective would worsen another.
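The idea of a Pareto front can be made concrete with a short sketch. The following is a minimal illustration rather than the study's implementation: the function names and the four candidate score pairs are hypothetical, and both objectives are assumed to be maximised.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical (biomass, ATP stability) scores for four candidate parameter sets.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5)]
front = pareto_front(candidates)
print(front)  # (0.5, 0.5) is dropped because (0.6, 0.6) beats it on both objectives
```

Each of the three remaining points is a different compromise: moving from one to another improves one objective only at the cost of the other.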

Thus, there is a growing need for virtual cells that can modify their architecture, function, and behaviour over multiple generations, as real cells do [11]. This project therefore aims to close the self-optimisation gap by applying evolutionary algorithms that mimic Darwinian selection to evolve virtual cells for energy efficiency, resilience and intelligence. Darwinian evolution, or selection, is the process by which heritable variation, combined with natural selection, leads to adaptive changes in populations over generations. A virtual cell is a model that simulates realistic cell behaviour. In this study, the virtual cell framework is based on Escherichia coli (E. coli) as a representative prokaryotic model organism due to its well-characterised metabolic pathways and widespread use in computational and experimental biology. Darwinian principles are implemented computationally to allow virtual cells to evolve over generations. Self-designing refers to the cell’s ability to evolve structure and function without direct human input. This paper applies Darwinian evolution through AI-powered evolutionary algorithms, allowing virtual cells to self-design and optimise. The study is significant because it introduces a framework for creating self-optimising virtual cells with adaptive traits, going beyond static simulations and towards more lifelike biological simulations. Projects such as the Virtual Cell Challenge and CZI Biohub reinforce the need for such dynamic, learning-based cell models [12]. A model such as this could transform our understanding of complex diseases, stress responses and synthetic life, opening new pathways for precision simulation and disease modelling.

MATERIALS AND METHODS.

Data Collection.

This study aimed to evolve biological parameters inside an SBML-defined metabolic network, to optimise biological response to drug exposure or nutrient uptake using evolutionary computation techniques. A multi-objective evolutionary algorithm was used to evolve key metabolic variables and evaluate trade-offs between different objectives, such as biomass production, ATP maintenance and drug clearance. The simulations were conducted using Python (v3.13). The following libraries were used: NumPy (v2.3.2), Matplotlib (v3.10.0), DEAP (v1.4.3), and Tellurium (v2.2.11.1). The models run were: BIOMD0000000425, Infection–drug–bacteria dynamics; MODEL2405080002, Antibiotics with low cytotoxicity; MODEL3023609334, Core E. coli metabolism.

Evolution of Virtual Genome.

The virtual genome was developed using a genetic algorithm implemented in Python to simulate Darwinian evolution. Genomes were first generated randomly as binary or numeric codes that encode various traits and parameter sets. The genomes were then subjected to crossover and mutation and scored against a fitness function that encompasses traits such as energy efficiency and resilience. Resilience is defined as a cell’s capacity to survive environmental stress with minimal accumulated damage, measured using survival time and damage relative to a predefined maximum threshold. The process was run over multiple generations to study the adaptive behaviors that emerge over time. The mutation and crossover rates were adjusted dynamically to prevent stagnation caused by genetic similarity. The end result was a set of evolved genomes defining the fittest virtual organisms, those best able to adjust to their environments.
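The loop described above can be sketched in a few lines of standard-library Python. This is an illustrative toy under stated assumptions, not the code used in the study: the genome length, population size, rate values, diversity threshold, and the `fitness` function are all hypothetical stand-ins, and a single scalar fitness replaces the multi-objective scoring for brevity.

```python
import random
import statistics

GENOME_LENGTH = 8     # assumed number of numeric traits per genome
POPULATION_SIZE = 30  # assumed population size


def fitness(genome):
    """Toy stand-in that rewards trait values near 1.0 (not the study's function)."""
    return sum(genome) / len(genome)


def crossover(parent_a, parent_b, rng):
    """One-point crossover: head of one parent joined to the tail of the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(genome, rate, rng):
    """Perturb each gene with probability `rate`, clipping the result to [0, 1]."""
    return [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) if rng.random() < rate else g
            for g in genome]


def evolve(generations=80, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]
    best_history = []
    for _ in range(generations):
        scores = [fitness(g) for g in population]
        best_history.append(max(scores))
        # Dynamic rate: mutate more aggressively once the population looks uniform.
        rate = 0.3 if statistics.pstdev(scores) < 0.02 else 0.05
        # Truncation selection: the better half survives unchanged as parents.
        parents = sorted(population, key=fitness, reverse=True)[:POPULATION_SIZE // 2]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                           rate, rng)
                    for _ in range(POPULATION_SIZE - len(parents))]
        population = parents + children
    return best_history


best_per_generation = evolve()
print(best_per_generation[0], best_per_generation[-1])
```

Because the surviving parents are copied into the next generation unchanged, the best fitness in this sketch can never decrease from one generation to the next; a real multi-objective run would track a Pareto front instead of a single best value.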

Comparison of Virtual Genome with Biomodels.

The procedure comprised five steps. First, model loading: each SBML model was parsed using Tellurium and RoadRunner in order to simulate biochemical reactions and access species, fluxes and local parameters. Second, parameter selection: a subset of the accessible parameters was evolved for each model. For BIOMD0000000425, the evolved parameters were: kill rate constants (k_kill); drug clearance rate; binding affinity (Km or IC50); bacterial regrowth rate (µ); and drug degradation constant. For MODEL2405080002, they were: molecular descriptor weights; feature importance weights; descriptor thresholds; and regularisation strength in the model. For MODEL3023609334, they were: upper bounds or capacities on key transporter reactions (glucose uptake, efflux pumps); maintenance cost parameters (GAM, NGAM); biomass reaction stoichiometric coefficients; enzyme usage bounds; and flux bounds of resistance-related reactions.
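One way to hand such a parameter list to a genetic algorithm is to store each parameter with a lower and upper bound and sample candidates inside those bounds. The sketch below assumes this encoding; the parameter names echo the BIOMD0000000425 list above, but the numeric ranges are invented for illustration and are not the ranges used in the study.

```python
import random

# Hypothetical bounds, for illustration only (the study's ranges are not stated).
PARAMETER_BOUNDS = {
    "k_kill": (0.01, 5.0),
    "drug_clearance": (0.001, 1.0),
    "regrowth_rate": (0.05, 2.0),
}


def random_parameter_set(bounds, rng):
    """Draw one candidate uniformly inside every parameter's bounds."""
    return {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}


def clip_to_bounds(params, bounds):
    """Pull values back inside their bounds, e.g. after a mutation step."""
    return {name: min(max(value, bounds[name][0]), bounds[name][1])
            for name, value in params.items()}


rng = random.Random(42)
population = [random_parameter_set(PARAMETER_BOUNDS, rng) for _ in range(20)]
print(population[0])
```

Keeping the bounds next to the names makes it straightforward to reject or repair candidates that drift outside a biologically plausible range.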

Third, simulation: for each set of parameters, the model was run in a simulated environment using the functions simulate() or steadyState(). Fourth, recording: the resulting output variables were recorded, and if a simulation failed, a penalty value was assigned. Fifth, evolution using a genetic algorithm, which consisted of: population initialisation, in which a random population of parameter sets was generated within biologically valid bounds; evaluation, in which each parameter set was scored based on simulation output; and Pareto optimisation, in which Pareto fronts were constructed to capture optimal trade-offs. This process was repeated over 30 generations. The following data were obtained: fitness values for each generation, the evolution trajectory over generations, and the final Pareto front showing optimal trade-offs. The data were then plotted using Matplotlib to visualise fitness progression, convergence of the population, and the final distribution of solutions on the Pareto front.
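The penalty rule for failed simulations can be sketched as a thin wrapper around the simulation call. In the sketch below, `toy_simulate` is a hypothetical stand-in for the Tellurium run, and the objective formulas, the parameter names, and the penalty value are assumptions made only so the example executes on its own.

```python
import math

# Assumed penalty: a very poor score on all three maximised objectives.
PENALTY = (-1e6, -1e6, -1e6)


def toy_simulate(params):
    """Hypothetical stand-in for a model run; raises to mimic a solver failure."""
    if params["k_kill"] <= 0:
        raise RuntimeError("simulation failed")
    biomass = 1.0 / (1.0 + params["k_kill"])
    atp_stability = math.exp(-params["clearance"])
    drug_cleared = params["clearance"] / (1.0 + params["clearance"])
    return biomass, atp_stability, drug_cleared


def evaluate(params, simulate=toy_simulate):
    """Score one parameter set, falling back to PENALTY on any failure."""
    try:
        result = simulate(params)
    except Exception:
        return PENALTY
    # Non-finite outputs are treated as failures too.
    if not all(math.isfinite(value) for value in result):
        return PENALTY
    return result


print(evaluate({"k_kill": 1.0, "clearance": 0.5}))
print(evaluate({"k_kill": -1.0, "clearance": 0.5}))  # falls back to PENALTY
```

Returning a penalty rather than raising keeps the evolutionary loop running: a failed candidate is simply dominated by every successful one and drops out during selection.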

RESULTS.

Evolution of Virtual Genome.

The simulation showed that the virtual genome was capable of adaptive optimization under simulated evolution. The initial population had randomly distributed and inefficient genetic codes; over time, however, adaptive lineages clearly emerged. The fitness function improved steadily, with clear convergence toward optimal genetic codes. The fittest genomes showed characteristics similar to those found in biology, one of which was energy-efficient regulation. The initial generation of the virtual genome is shown in Fig. 1. Mutations permitted limited evolutionary innovation, at times producing genomes with increased fitness while preserving essential traits. Overall fitness improved in the final population through multi-objective optimization, with balanced development in growth efficiency, resource use, and stability. The best values at selected generations are given in Table 1, and Figs. 2 and 3 illustrate the increase in average fitness, the diminishing variability over generations, and the alignment with Pareto fronts.

**Figure 1.** Initial Generation (0) of Resilience against Energy Efficiency. The x-axis represents Energy Efficiency (EE), and the y-axis represents Resilience (RS) of the virtual genomes. Each blue dot corresponds to an individual genome in the population, showing its performance in terms of EE and RS.

Table 1. Best energy efficiency, resilience, and intelligence values of the evolved virtual genomes at selected generations, illustrating the progression of optimization during the evolutionary process.
S. No.	Generation	Best Energy Efficiency	Best Resilience	Best Intelligence
1	0	0.5996147866	0.9015718327	0.7811236355
2	11	0.5926175906	0.8979988583	0.9648819169
3	27	0.5947135548	0.9003556253	0.889856943
4	56	0.5893943124	0.9072629292	0.8295106028
5	80	0.5881250656	0.9033531319	0.8667021357

**Figure 2.** Generation 40 of Evolved Resilience against Energy Efficiency. The x-axis represents Energy Efficiency (EE), while the y-axis represents Resilience (RS) of the evolved virtual genomes. Each blue dot denotes an individual genome optimized through 40 generations of evolution. Compared to Generation 0, the distribution is more compact, indicating partial optimization and the emergence of genomes with balanced energy efficiency and resilience trade-offs.

**Figure 3.** Final Generation (80) of Evolved Resilience against Energy Efficiency. The x-axis represents Energy Efficiency (EE), and the y-axis represents Resilience (RS) of the final evolved genomes. Each blue dot denotes an individual genome after 80 generations of evolution. The clustering of points in the upper-right region indicates convergence toward optimal solutions, where genomes exhibit high resilience and stable energy efficiency, reflecting the success of evolutionary optimization in improving overall system performance.

Parameter selection for Biomodels.

The parameters selected for evolution in each model were chosen based on their biological influence and system sensitivity, that is, their direct control over the dynamic response or steady-state efficiency of the network. In the infection–drug–bacteria model (BIOMD0000000425), parameters governing antibiotic efficacy and bacterial survival, such as kill rate constants (k_kill), drug clearance rates, and regrowth constants (µ), were prioritized because they dictate the pharmacodynamic equilibrium between bacterial inhibition and recovery. In the low-cytotoxicity antibiotics model (MODEL2405080002), selection focused on feature-weighting parameters and descriptor thresholds, as these regulate how molecular features influence antibiotic toxicity and potency, allowing the evolutionary algorithm to optimize for high selectivity with minimal side effects. In the E. coli metabolic model (MODEL3023609334), parameters were chosen for their role in cellular resource allocation, including flux bounds on key transport reactions (glucose and oxygen uptake), ATP maintenance demands (GAM/NGAM), and stoichiometric coefficients of biomass precursors. By targeting parameters that modulate both energy efficiency and growth yield, the study ensured that the evolutionary process captured realistic metabolic trade-offs across varying nutrient conditions.
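System sensitivity of the kind referred to above is often estimated by perturbing one parameter at a time and measuring the relative change in an output. The sketch below shows that general technique only; it is not claimed to be the selection procedure used in this study, and `toy_bacterial_load` is a hypothetical output function chosen so the example runs on its own.

```python
def relative_sensitivity(model, params, name, step=0.01):
    """Approximate (dY / Y) / (dp / p) with a central finite difference."""
    base = model(params)
    up = dict(params, **{name: params[name] * (1 + step)})
    down = dict(params, **{name: params[name] * (1 - step)})
    return (model(up) - model(down)) / (2 * step * base)


def toy_bacterial_load(p):
    """Hypothetical output proxy: regrowth rate divided by kill rate."""
    return p["regrowth_rate"] / p["k_kill"]


params = {"regrowth_rate": 0.8, "k_kill": 1.6}
sensitivities = {name: relative_sensitivity(toy_bacterial_load, params, name)
                 for name in params}
print(sensitivities)
```

A coefficient near +1 or -1 means a 1% change in the parameter shifts the output by about 1%, so parameters with the largest absolute coefficients are natural candidates for evolution.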

Comparison of BioModels with Virtual Genome.

The virtual genome method mirrors natural selection by simulating adaptive evolution across different biological modeling levels. The three biomodels were compared using a virtual genome evolution framework, which showed different adaptation behaviors depending on model complexity. The ODE-based bacterial growth model (BIOMD0000000425) adapted quickly to environmental changes and optimized growth and antibiotic tolerance by adjusting kinetic parameters. The deep learning antibiotic activity model (MODEL2405080002) improved prediction accuracy by evolving feature weights and showed the emergence of “virtual resistance traits,” such as enhanced resilience. The E. coli genome-scale metabolic model (MODEL3023609334) exhibited more nuanced adaptation: it balanced growth, efficiency, and robustness, and evolved Pareto-optimal trade-offs between energy use and stress resilience. Overall, the virtual genome framework successfully simulated evolutionary adaptation in diverse biological models. Collectively, these results suggest that the virtual genome approach can emulate Darwinian selection pressures across mechanistic, statistical, and systems-scale biological models, creating a unified digital ecosystem for simulating cellular self-design and evolutionary intelligence.

DISCUSSION.

The results demonstrate that the proposed evolutionary framework successfully models adaptive behaviour within a virtual cellular environment. The application of multi-objective genetic algorithms made it possible to optimize several objectives, such as energy efficiency, robustness, and intelligence, within one process instead of relying on previously defined biological models. The fitness of each generation of the virtual genome was characterised by constructing Pareto fronts, which tracked the progression of the population towards optimal fitness. The framework successfully captured the core principles of Darwinian evolution: variation, selection, and inheritance.

Two important similarities with existing work can be identified. First, a previous study used evolutionary algorithms to evolve virtual networks [2]. The present work overlaps conceptually with that study in that both rely on gradual adaptation through population-based optimisation. Second, another study illustrated that molecular evolution across generations can be simulated using evolutionary algorithms [9]. The present project pursues similar objectives with similar algorithms, but evolves system-level characteristics instead of molecular features. Its novel contribution is to apply the evolutionary process to a virtual genome with features such as energy efficiency.

In comparison, BioJazz was proposed to simulate the evolution of cellular networks virtually; however, its models still relied on rules defined beforehand [5]. In contrast, the workflow presented in this study waives strict structural requirements to introduce the possibility of self-evolution. Similarly, AI-aided virtual cell simulations developed for cancer research remain primarily predictive in nature and lack the capacity for autonomous self-evolution across generations [6]. The key limitation of the present framework lies in its use of abstraction: processes such as the correction of mutations are too complex in reality to be taken into account. Potential extensions in subsequent projects include incorporating real omics data so that the simulated processes mimic reality more accurately. In conclusion, the project has shown that evolving a virtual genome is possible.

ACKNOWLEDGMENTS.

I would like to express my sincere gratitude to Ms. Nirupma Singh, my mentor, for her constant guidance, technical insight, and encouragement throughout the development of this project. Her expertise in computational biology and evolutionary modelling was instrumental in refining the concept of virtual genome evolution. I am also deeply thankful to my parents for their continuous support, motivation, and belief in my work.

REFERENCES.

1. M. DiNuzzo, How artificial intelligence enables modeling and simulation of biological networks to accelerate drug discovery. Frontiers in Drug Discovery 2, 221-296 (2022).
2. R. W. Smith, B. V. Sluijs, and C. Fleck, Designing synthetic networks in silico: a generalised evolutionary algorithm approach. BMC Systems Biology 11(1), 118-119 (2017).
3. Y. H. Roohani, et al., Virtual Cell Challenge: Toward a Turing test for the virtual cell. Cell 188(13), 3370-3374 (2025).
4. T. Yang, et al., Build the virtual cell with artificial intelligence: a perspective for cancer research. Military Medical Research 12(1), 4-5 (2025).
5. S. Feng, et al., BioJazz: in silico evolution of cellular networks with unbounded complexity using rule-based modeling. Nucleic Acids Research 43(19), 123-124 (2015).
6. C. Bunne, et al., How to build the virtual cell with artificial intelligence: Priorities and opportunities. Cell 187(25), 7045-7063 (2024).
7. S. Zhang, et al., Assessing deep learning methods in cis-regulatory motif finding based on genomic sequencing data. Briefings in Bioinformatics 23(1), 374-375 (2022).
8. C. M. Ardila and P. K. Yadalam, Unresolved questions in the application of artificial intelligence virtual cells for cancer research. Military Medical Research 12(1), 19-20 (2025).
9. J. S. L. Browning, D. R. Tauritz, and J. Beckmann, Evolutionary algorithms simulating molecular evolution: a new field proposal. Briefings in Bioinformatics 25(5), 360-361 (2024).
10. J. Jumper, et al., Highly accurate protein structure prediction with AlphaFold. Nature 596(7873), 583-589 (2021).
11. C. S. Groff-Vindman, et al., The convergence of AI and synthetic biology: the looming deluge. Biomedical Innovations 2(1), 20-21 (2025).
12. S. Nijim, et al., Rare Disease Drug Repurposing. JAMA Netw Open 8(5), 258330-258331 (2025).

Posted by buchanle on Wednesday, June 3, 2026 in May 2026.

Tags: Evolutionary, Self-designing, Self-optimisation, Virtual-cells