
Cellular Deepfake Engine: A Dual-GAN Framework Redefining Data Generation for Biomedical Imaging AI

ABSTRACT

Artificial intelligence (AI) has revolutionized biomedical imaging and diagnostics; however, its success is still limited by the requirement for large, diverse, and annotated microscopy datasets. Conventional data acquisition is costly and hindered by privacy issues, sample variance, and tedious annotation procedures. Current synthetic image generation techniques often fail to retain the full morphological variability of biological samples, limiting their use in practical biomedical applications. This research presents the Cellular Deepfake Engine (CDE), a generative AI framework designed to overcome these limitations by producing high-fidelity, biologically plausible microscopy images across three domains: blood cells, nuclei, and cancerous cells. The system employs a dual generative adversarial network (GAN) pipeline, with each model tailored to the characteristics of its dataset, and incorporates data curation techniques including denoising, normalization, augmentation, and manual annotation. The generated images preserve fine structural details and perform well in both visual and quantitative evaluations, achieving a Fréchet Inception Distance (FID) of 123 on the blood cell dataset. Across training runs of up to 1600 epochs, generator and discriminator losses converged stably, correlating with improved image fidelity. Downstream classifiers trained on synthetic data generalized well to real-world datasets, with synthetic augmentation improving accuracy and mitigating class imbalance. Most importantly, the CDE enables privacy-preserving, data-efficient AI studies on high-risk biomedical topics. By creating scalable, varied training data without revealing actual patient data, the system improves the reproducibility and transferability of biomedical AI. Future research will optimize the architecture and evaluation metrics to further increase image realism and clinical significance.

INTRODUCTION.

Nearly half the world’s population lacks essential diagnostic services, as highlighted by the 2021 Lancet Commission, which reports that “47% of the global population has little to no access to diagnostics” [1]. Biomedical imaging drives breakthroughs across healthcare and research, ranging from lung lesion detection in COVID-19 patients, to dermatological body part classification in clinical workflows, to quantitative analysis in cell and tissue biology [2,3]. Progress in biomedical imaging research has widened its role from cell biology to clinical diagnostics. Self-supervised learning approaches have permitted automated extraction of protein localization and cell morphology features without annotated data [4]. Furthermore, comprehensive reviews highlight how imaging, when combined with deep learning, improves not only segmentation but also feature extraction [5]. Together, these studies illustrate both the progress and the challenges in biomedical imaging, including insufficient dataset availability and high annotation costs.

Artificial intelligence (AI) models, especially deep-learning models, have shown significant potential for advancing biomedical imaging. Their integration into this field offers multiple advantages, such as automated segmentation, enhanced diagnostic accuracy, and reduced reliance on manual annotation [6]. GANs have demonstrated significant utility outside of biomedical research, particularly in high-fidelity visual synthesis. For example, GANs have been used to generate synthetic human faces that are nearly indistinguishable from real ones, with studies showing that some generated images were even rated as more trustworthy than actual human photographs [7]. In another application, GANs were employed to generate synthetic muscle histopathology images rendered entirely in simulation software. These synthetic datasets were shown to train segmentation models with accuracy matching or even exceeding that of models trained on real data, emphasizing the viability of GANs for annotation-free image training in complex domains [8].

Generative adversarial networks, such as StyleGAN, have also been tested on embryo imaging, producing realistic synthetic data while protecting patient privacy, one of the most notable obstacles to the advancement of biomedical imaging [9]. Despite these developments, challenges persist in validation, including generalization across diverse datasets and accessibility for non-expert users [10]. Studies today also remain constrained by limited annotated datasets, high costs, and privacy concerns, which inhibit the scalability of AI-driven analysis.

This research addresses a critical gap in biomedical imaging by developing a generative adversarial network (GAN)-based Cellular Deepfake Engine (CDE) to create realistic synthetic cell microscopy images. The proposed approach combines dataset curation, noise removal, and normalization with GAN training, supported by evaluation metrics such as the Fréchet Inception Distance (FID) to quantify image fidelity. After iterative training over more than a thousand epochs on diverse microscopy data, the models generated high-resolution cellular structures that capture fine organelle morphology and its variability, making the images realistic enough for research purposes. Results indicate strong generative performance, with FID values indicating that the generated images are statistically similar to real ones. The accuracy of 98.5% achieved by state-of-the-art machine learning (ML) models, such as the residual network (ResNet), trained on the generated images shows the potential of synthetic augmentation to address existing dataset limitations. This integration of generative modelling into biomedical imaging represents an advance towards scalable and cost-effective data expansion, ultimately supporting more accurate diagnostics and advanced computational biology research.

MATERIALS AND METHODS.

Data Collection.

For this study, the datasets were obtained from publicly available biomedical imaging repositories, with a focus on large-scale microscopy datasets containing diverse examples of cell morphology and cellular structures. The main goal of this stage was to choose a representative sample that retained real-world variability. The exploration process involved reviewing available datasets and then selecting those with a suitable sample size. The datasets were chosen from the open repositories Kaggle and the Broad Bioimage Benchmark Collection (BBBC), and were stored in JPEG and PNG file formats. A total of three datasets were used: two from Kaggle, one containing images of normal human cell nuclei and one containing images of cancerous and non-cancerous blood cells, and one from BBBC, containing images of cells and nuclei affected by colon cancer. Working with such data ensures clinical and biological relevance, allowing this study to serve as a foundation for developing scalable diagnostic models and synthetic data generation in oncology. This curation process resulted in a dataset that was balanced, diverse, and representative of the cellular imaging domain under study.

Model Building.

The primary framework for this cellular deepfake model is a GAN architecture implemented in Python with the PyTorch deep learning framework, enabling the generation of high-quality synthetic images in a biomedical context. The GAN is composed of two neural networks trained in competition with each other: a Generator (G) and a Discriminator (D). The Generator continuously improves its ability to produce realistic synthetic images, while the Discriminator simultaneously improves its ability to distinguish real data from synthetic input. A separate model was created for each dataset. The generator was trained to take a random noise vector as input and output an image that mimics the structure and features of real microscopy images. The discriminator, in turn, was shown both real and generated images and trained to distinguish between them. This back-and-forth process continued for 1000 to 1600 training cycles (“epochs”), depending on the dataset, allowing the generator and discriminator to improve together.
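The adversarial training described above can be sketched in PyTorch as follows. This is a minimal illustration, not the architecture used in the study: the fully connected layers, the latent size `LATENT_DIM`, the flattened image size `IMG_PIXELS`, the Adam settings, and the random placeholder batch are all assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn

LATENT_DIM = 64        # assumed size of the random noise vector
IMG_PIXELS = 32 * 32   # assumed flattened grayscale image size


class Generator(nn.Module):
    """Maps a noise vector to a flattened image with values in [-1, 1]."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, IMG_PIXELS), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)


class Discriminator(nn.Module):
    """Maps a flattened image to one raw logit (real vs. generated)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_PIXELS, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x):
        return self.net(x)


def train_step(G, D, opt_g, opt_d, real, loss_fn):
    """One discriminator update followed by one generator update."""
    batch = real.size(0)
    ones = torch.ones(batch, 1)
    zeros = torch.zeros(batch, 1)

    # Discriminator: push real images toward 1 and generated images toward 0.
    fake = G(torch.randn(batch, LATENT_DIM)).detach()
    d_loss = loss_fn(D(real), ones) + loss_fn(D(fake), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator label generated images as real.
    g_loss = loss_fn(D(G(torch.randn(batch, LATENT_DIM))), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()


torch.manual_seed(0)
G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
loss_fn = nn.BCEWithLogitsLoss()

# Placeholder batch standing in for normalized real microscopy images.
real_batch = torch.rand(8, IMG_PIXELS) * 2 - 1
d_loss, g_loss = train_step(G, D, opt_g, opt_d, real_batch, loss_fn)
```

In a full run, `train_step` would be called once per batch inside a loop over epochs, with the two losses logged each epoch to produce curves like those reported in the Results.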

The general structure of the GANs was similar across all three models, but each was modified to suit the dataset it was applied to. For example, cancerous cells and nuclei have different morphology and patterns compared to normal cells, so the corresponding model was trained for more epochs to capture those subtle differences. During training, images were saved at regular epoch intervals to track the progress of the generated outputs over time. Overall, this GAN-based approach allowed the creation of synthetic cell images that closely resemble real ones without the need for expensive lab experiments.

Model Evaluation.

To assess the quality of the images generated by each GAN model, the Fréchet Inception Distance (FID) was used as the primary metric. The FID score measures the similarity between generated images and real images by comparing the mean and covariance of features extracted by an Inception network [11]. A lower FID score indicates that the synthetic images are more realistic and closer in distribution to the real dataset. Together with visual inspection of the generated samples, this metric offers an assessment of the statistical realism and visual quality of the generated images, helping to ensure that the synthetic data can be reliably used in biomedical AI applications.
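The mean-and-covariance comparison behind FID can be sketched as follows. This assumes the Inception features have already been extracted into two arrays of shape (samples, features); the function name `frechet_distance` and the small random arrays are illustrative placeholders, not the study’s evaluation code. The distance is the squared difference of the feature means plus Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^(1/2)), where the trace of the matrix square root is computed here from the eigenvalues of Σ_r Σ_g.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Both inputs are arrays of shape (n_samples, n_features); in FID these
    would be Inception features (assumed to be precomputed here).
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # covariance matrices (small negative values are numerical noise).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


rng = np.random.default_rng(0)
real_feats = rng.normal(size=(200, 4))            # placeholder features
shifted_feats = real_feats + np.array([3.0, 0.0, 0.0, 0.0])

same_score = frechet_distance(real_feats, real_feats)
shifted_score = frechet_distance(real_feats, shifted_feats)
```

Identical feature sets give a distance of approximately zero, and shifting every feature vector by a constant offset leaves the covariance unchanged, so the distance reduces to the squared length of the offset. This is why lower values indicate distributions that are closer together.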

Validation of Generated Images.

To measure the structural validity of the GAN-generated images, the synthetic dataset was added to the training pipeline of five convolutional neural network (CNN) models, and held-out real microscopy images were used to evaluate their output. This design ensured that classification performance reflected generalization to realistic biological data rather than internal consistency within the synthetic domain. The five architectures were ResNet-50, DenseNet-121, EfficientNet-B0, VGG16, and InceptionV3, each selected for its ability to capture complex cell morphology. These models were trained on the GAN-generated cancerous and non-cancerous image datasets and then assessed on four classification metrics: Accuracy, Precision, Recall, and F1 Score, which collectively provide a balanced view of model performance. The models were not built to determine which performed best, but rather to validate the credibility of the GAN-generated images. The high precision obtained indicates that the generated images contained diagnostically relevant morphological components consistent with actual cellular structures. Strong performance across all metrics indicated that the synthetic images carried sufficient structural information to train high-performing classification networks, supporting their use in further diagnostic cases.
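The four classification metrics can be computed from predicted and true labels as in the sketch below. The label encoding (1 for cancerous, 0 for non-cancerous), the function name `binary_metrics`, and the toy label lists are assumptions for illustration only; they are not taken from the study.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels standing in for held-out real images and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Precision penalizes false positives and recall penalizes false negatives, so reporting both alongside their harmonic mean (F1) gives a more balanced picture than accuracy alone when classes are imbalanced.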

RESULTS.

During training, each GAN model was allowed to progress for around 1000 epochs. Throughout this process, generator and discriminator losses were monitored to evaluate training stability and performance. As shown in Fig. 1, both generator (G) and discriminator (D) losses exhibited a sharp increase in the early epochs, followed by a period of stabilization. The discriminator loss dropped to almost zero, while the generator loss oscillated mildly at low values. This behavior was interpreted as a successful training dynamic, in which the generator gradually learns to produce highly realistic synthetic microscopy images and the discriminator becomes less able to distinguish them from real ones.
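When reading loss curves that oscillate, as the generator loss does here, a moving average over epochs can make the underlying trend easier to see. The sketch below is one simple way to do this; the function name `moving_average`, the window length, and the example loss values are hypothetical and are not the values plotted in Fig. 1.

```python
def moving_average(values, window):
    """Trailing moving average; early entries average over what is available."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Hypothetical per-epoch generator losses, for illustration only.
example_losses = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed_losses = moving_average(example_losses, window=2)
```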

Figure 1. A graph showing the generator and discriminator models’ loss during training. The x-axis denotes the number of epochs, while the y-axis represents the loss of the model. The orange line on the graph denotes the discriminator’s loss over the process of 1000 epochs. The blue line on the graph denotes the generator’s loss over the process of 1000 epochs.

The model trained on the Kaggle Nuclei dataset, shown in Fig. 2, achieved an FID score of 264.01. The GAN for the Kaggle Blood cell dataset showed the lowest FID score, 115, indicating the closest statistical similarity to real images among the three models, as shown in Fig. 3. Meanwhile, the model for the Colon Cancer Affected Cells Dataset from BBBC showed a more balanced performance with an FID of 123, as shown in Fig. 4, producing outputs with both statistically realistic distributions and perceptual resemblance to real microscopy data. In conclusion, the Blood cell model scored best in terms of FID, while the Colon Cancer model had the most balanced performance across statistical similarity and perceptual resemblance. These visual and quantitative results together support the quality of the generated images.

Figure 2. Images of cell nuclei. (a) The original image downloaded from the nuclei dataset. (b) An image classified as real by the discriminator model. (c) An image classified as fake by the discriminator model.
Figure 3. Images of blood cells. (a) The original image downloaded from the blood cell dataset. (b) An image classified as real by the discriminator model. (c) An image classified as fake by the discriminator model.
Figure 4. Images of colon cancer cells. (a) The original image downloaded from the colon cancer dataset. (b) An image classified as real by the discriminator model. (c) An image classified as fake by the discriminator model.

To quantify authenticity and suitability, each of the five state-of-the-art deep learning models (ResNet-50, DenseNet-121, EfficientNet-B0, VGG16, and InceptionV3) was trained on the generated synthetic dataset, serving as a validator of how well the synthetic images support classification between two categories. Each model was evaluated on four metrics: Accuracy, Recall, Precision, and F1 Score. Table 1 provides the evaluation results for each model tested. ResNet-50 performed best on every metric, achieving 98.5% accuracy, 98.2% recall, 98.8% precision, and a 98.5% F1 score. DenseNet-121 came next, with slightly lower but still solid scores on each metric.

Table 1. Accuracy, Recall, Precision, and F1 Score of the five validation models.
S.No | Model Name | Accuracy | Recall | Precision | F1 Score
1 | ResNet-50 | 98.50% | 98.20% | 98.80% | 98.50%
2 | DenseNet-121 | 97.10% | 96.80% | 97.40% | 97.10%
3 | EfficientNet-B0 | 96.30% | 96.00% | 96.60% | 96.30%
4 | VGG16 | 95.00% | 94.60% | 95.40% | 95.00%
5 | InceptionV3 | 94.20% | 93.80% | 94.50% | 94.10%
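Because F1 is the harmonic mean of precision and recall, the F1 column of Table 1 can be recomputed from the other two columns as a quick consistency check. The sketch below uses only the values reported in Table 1; the variable and function names are illustrative.

```python
# (recall %, precision %, reported F1 %) per model, copied from Table 1.
table_1 = {
    "ResNet-50":       (98.2, 98.8, 98.5),
    "DenseNet-121":    (96.8, 97.4, 97.1),
    "EfficientNet-B0": (96.0, 96.6, 96.3),
    "VGG16":           (94.6, 95.4, 95.0),
    "InceptionV3":     (93.8, 94.5, 94.1),
}


def f1_from(recall, precision):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 and record whether it matches the reported value to one
# decimal place (a tolerance of half the last reported digit).
f1_check = {}
for model, (recall, precision, reported_f1) in table_1.items():
    recomputed = f1_from(recall, precision)
    f1_check[model] = (recomputed, abs(recomputed - reported_f1) <= 0.05)
```

For every model in Table 1, the recomputed value agrees with the reported F1 Score to within rounding.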

EfficientNet-B0, VGG16, and InceptionV3 performed slightly below the two models discussed above, but even the lowest result, 94.2% accuracy for InceptionV3, shows that reliable classifiers could still be trained using the synthetic images and that the images retained appropriate structure and morphology. The results support the use of GAN-generated images of cellular and tissue ultrastructure and morphology as a substitute for real microscopy data in tasks that require classification or recognition of cells. The validation was consistent enough across models to suggest that the synthetic datasets captured enough detail to inform biomedical classification and interpretation, including many small features that have no direct value in medical interpretation but matter for training, classification, and recognition. Thus, at this preliminary level of testing, the synthetic datasets provided images that could be used in practical situations relying solely on synthetic imagery.

DISCUSSION.

This study set out to address two key limitations in biomedical imaging: the scarcity of annotated datasets and the high cost of generating microscopy data. High-resolution images were collected from publicly accessible biomedical databases and categorized by cellular structure. These original images served as the foundation for training multiple GANs, each tailored to a specific type of cell morphology. Each model consisted of a generator network trained to produce synthetic cell images that mimic real biological patterns and a discriminator trained to distinguish authentic images from generated ones. To evaluate the realism and quality of the generated images, FID was used as the primary validation metric. The whole process, from data augmentation and model design to generation and validation, was designed to produce reliable synthetic biomedical images that can support other researchers working on AI for diagnostics and cellular analysis.

Prior GAN applications in biomedical imaging have focused largely on image enhancement or data augmentation. For example, GANs have been used to create synthetic retinal fundus images featuring age-related macular degeneration (AMD) lesions, achieving such convincing realism that human graders struggled to differentiate them from actual photos. The evaluation in that study included both subjective assessments and an objective “realness” scale, lending credence to the images’ authenticity [12]. Furthermore, CycleGAN frameworks have been applied to enhance the resolution of confocal microscopy images using unpaired low- and high-resolution image datasets. This approach significantly improved image fidelity and spatial resolution without requiring exact image pairings, showcasing the potential for GANs to augment microscopy imaging workflows in cost-effective, scalable ways [13]. By comparison, the present approach focuses on synthesising completely new, high-fidelity cellular microscopy images, addressing the challenges of data scarcity, privacy, and the need for rich training samples in cellular-level diagnostics and research.

Despite its success in generating high-fidelity synthetic cellular images, this study has certain limitations. The generalizability of the models remains constrained by the diversity of the training data; while varied, the datasets still reflect a narrow biological spectrum. Moreover, while the FID scores suggest visual similarity, they do not guarantee functional or pathological accuracy without clinical validation. These challenges, however, point to directions for future research. Expanding GAN architectures to simulate rare diseases, integrating multi-modal imaging data, and pairing synthetic visuals with corresponding metadata could redefine AI in biomedical research. More importantly, by replacing expensive lab imaging with high-quality synthetic data, this work opens the door to diagnostics and scalable medical research, particularly in underserved regions where real imaging infrastructure is limited. In a world inching toward virtual biology, this cellular deepfake engine is not just a proof of concept; it is a proof of possibility.

ACKNOWLEDGMENTS.

I extend my deepest gratitude to my parents for their unwavering encouragement, patience, and support, which enabled me to pursue this project with focus and determination.

REFERENCES

  1. Schroeder, A. B. et al. The ImageJ ecosystem: Open-source software for image visualization, processing, and analysis. Protein Sci 30, 234-249.
  2. Roth, H. R. et al. Rapid artificial intelligence solutions in a pandemic—The COVID-19-20 Lung CT Lesion Segmentation Challenge. Medical Image Analysis 82, 102605.
  3. Sitaru, S. et al. Automatic body part identification in real-world clinical dermatological images using machine learning. J Dtsch Dermatol Ges 21, 863-869.
  4. Lu, A. X., Kraus, O. Z., Cooper, S. & Moses, A. M. Learning unsupervised feature representations for single cell microscopy images with paired cell inpainting. PLoS Comput Biol 15, e1007348.
  5. Ali, M. et al. Applications of Artificial Intelligence, Deep Learning, and Machine Learning to Support the Analysis of Microscopic Images of Cells and Tissues. J Imaging 11(2).
  6. Litjens, G. et al. A survey on deep learning in medical image analysis. Medical Image Analysis 42, 60-88.
  7. Tucciarelli, R., Vehar, N., Chandaria, S. & Tsakiris, M. On the realness of people who do not exist: The social processing of artificial faces. iScience 25, 105441.
  8. Li, W. Genetic associations of human metabolic traits. Nature Genetics 56, 557-557.
  9. Cao, P. et al. Generative artificial intelligence to produce high-fidelity blastocyst-stage embryo images. Hum Reprod 39, 1197-1207.
  10. Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nature Medicine 25, 44-56.
  11. Liu, L. et al. Digital ray: enhancing cataractous fundus images using style transfer generative adversarial networks to improve retinopathy detection. Br J Ophthalmol 108, 1423-1429.
  12. Wang, Z. et al. Synthetic artificial intelligence using generative adversarial network for retinal imaging in detection of age-related macular degeneration. Front Med (Lausanne) 10, 1184892.
  13. Trujillo, C., Thompson, L., Skalli, O. & Doblas, A. Unpaired data training enables super-resolution confocal microscopy from low-resolution acquisitions. Opt Lett 49, 5775-5778.


Posted by on Wednesday, June 3, 2026 in May 2026.
