Abstract
Cervical cancer is a severe threat to women’s health. The majority of cervical cancer cases occur in developing countries. To accelerate the elimination of cervical cancer, the WHO has proposed that, by 2030, 70% of women be screened with a high-performance test by 35 years of age and again by 45 years of age. Owing to inadequate health infrastructure and the lack of organized screening strategies, most low- and middle-income countries are still far from achieving this goal. As part of the efforts to improve the performance of cervical cancer screening, it is necessary to investigate the most accurate, efficient, and effective methods and strategies. Artificial intelligence (AI) is rapidly expanding its application in cancer screening and diagnosis, and deep learning algorithms have offered human-like interpretation capabilities on various medical images. AI is expected to have a more significant role in improving the implementation of cervical cancer screening, management, and follow-up. This review reports the state of AI with respect to cervical cancer screening. We discuss the primary AI applications and the development of AI technology for image recognition applied to the detection of abnormal cytology and cervical neoplastic diseases, as well as the challenges that we anticipate in the future.
Introduction
Cervical cancer remains a major cause of mortality among women with > 662,000 new cases diagnosed and approximately 349,000 deaths reported globally in 20221. The long premalignant phase of cervical cancer and the natural progression of the disease make cervical cancer the only cancer that is currently preventable through primary and secondary prevention2. The World Health Organization (WHO) has launched a global strategy to eliminate cervical cancer within this century that includes vaccination against human papillomavirus (HPV), screening of women using high-performance testing, early diagnosis, and timely treatment of high-grade cervical intraepithelial neoplasia (CIN 2/3) and cancer. Cervical cytology has been widely used for cervical cancer screening, but global implementation poses several challenges and the suboptimal sensitivity of the test necessitates frequent screening. The WHO has recommended HPV detection-based tests for primary screening due to the higher sensitivity and objective nature of the test. Although triaging HPV-positive women with cytology before referral to colposcopy is an effective approach in high-income countries, deploying this strategy requires a well-organized infrastructure and the expertise of professionals, including pathologists, cytopathologists, laboratory scientists, and experienced colposcopists3. The WHO also recommends visual inspection with acetic acid (VIA) for triaging HPV-positive women because VIA is a cost-effective approach suitable for resource-limited settings. However, VIA is highly subjective and may lack precision, potentially resulting in pre-cancerous lesions going undetected for an extended period of time4. Colposcopy, as a diagnostic tool, is also limited by being a subjective test that requires a high level of competency.
Recent developments in artificial intelligence (AI) offer considerable prospects for an automated, objective, and unbiased detection of cervical cancer and precancerous conditions. The idea of computers simulating human behavior, cognition, and actual thinking was proposed by Alan Turing as early as 1950. The term, “artificial intelligence” was officially coined by John McCarthy at an academic conference in 19565. AI began with the following major directions: perceptrons; Bayesian networks; pattern recognition; human-computer interaction; knowledge representation; and computer vision. As AI entered the “golden” era, there was a surge in interest and the performance of AI gradually evolved into complex algorithms that resemble the logic of human beings. With the development of machine learning as a core technology of AI, computers can learn from data analysis, derive standards from the data, and use these standards to predict and classify unknown objects. Computers have the capability of finding features and learning and distinguishing new text, images, signals, and other data automatically. From the development of artificial neural networks, the concept of deep learning emerged and is now widely used in the fields of medical diagnosis, medical image recognition, natural language processing, and health management applications. In recent years, AI has demonstrated significant advantages in several aspects of detecting cervical cancer, including the segmentation and classification of cytology6,7, colposcopy8, and the early detection of cervical cancer lymph node metastasis (LNM) on magnetic resonance imaging (MRI)9. A significant proportion of current research focuses on developing deep learning algorithms for automatic processing, recognition, feature extraction, and classification of cervical images, which enables AI to analyze images, identify patterns, and interpret cancer characteristics. 
The WHO has noted that AI can enhance screening tests and techniques that involve visual evaluation of digital images10. It is anticipated that AI-assisted screening will have a major role in low-resource areas, addressing the shortage of competent healthcare personnel. Additionally, the internet and mobile data, cloud computing, and mobile devices have improved access to healthcare services in remote areas, thereby reducing healthcare costs, while remote digital education platforms can enhance the professionalism of local physicians and alleviate the global shortage of specialists11,12. Despite the positive developments in AI for cervical cancer screening, further exploration and validation are needed to prove its effectiveness at the population level. It is essential that AI align with the standards required for adjunctive use in routine clinical settings. The precision of AI in screening tasks, particularly regarding misdiagnoses and missed diagnoses, is a critical issue for future enhancements. In addition, the willingness of clinicians to trust and embrace AI is a hurdle that this novel technology must overcome to be fully applied to cervical cancer screening solutions. In this review we summarize the current state of research and the application of AI in cervical cancer screening, analyze the ongoing challenges related to technological advancement, and advocate for the promotion and acceleration of widespread use of AI to screen for and diagnose cervical cancer.
Machine learning in cervical cancer risk prediction
Cervical cancer screening typically involves a series of procedures, including HPV testing, visual examinations, cytology, colposcopy, and biopsies. Each method requires skill and experience and/or substantial resources and time. In resource-limited areas, the implementation of comprehensive and high-quality screening programs presents considerable challenges. Leveraging existing clinical data for efficient and intelligent screening or prediction is of substantial value to overcome the challenges. The results of HPV testing and HPV genotyping combined with other clinical information, such as age, menstrual status, and behavior, can be utilized to predict the progression of positive high-risk (hr)HPV cases and the risk of cervical cancer13,14. Moreover, prediction of cervical cancer risk by integrating HPV test results with cytologic findings and biomarkers has been shown to improve upon conventional screening methods, thereby reducing the referral rates for colposcopy15,16. A predictive model to identify those at high risk of developing cervical cancer has been developed based on prior HPV results and historical medical records, allowing for individualized risk stratification and management17. These predictive models may guide development of risk-stratified cervical cancer screening strategies.
Technically, most predictive models are constructed with classical machine learning algorithms, such as support vector machines (SVMs) and random forests, which predate deep learning in the development of AI. Machine learning algorithms are relatively interpretable in medical applications and perform well in classification and prediction tasks. However, intelligent analysis of diverse data types presents challenges, particularly the highly abstract feature extraction required for unstructured data, which calls for neural network architectures. Deep learning solutions are more adept at handling different types of unstructured data and at integrating multimodal data, and are therefore better suited to the varied tasks in cervical cancer screening. Several studies have combined the two approaches to enhance the robustness of diagnostic classification, using deep neural networks for feature extraction and machine learning algorithms for the final classification, which results in more accurate and interpretable classifications18–20. Through the application of these technologies, cervical cancer screening can become more efficient and accurate, providing essential support for early detection and intervention.
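As a rough illustration of how a risk prediction model can be built from tabular clinical data, the following minimal sketch fits a classifier on a handful of records. It makes several assumptions that are not taken from the studies cited above: the three features (scaled age, HPV16/18 positivity, abnormal cytology), the eight records, and the labels are synthetic, and a plain logistic regression is used in place of an SVM or random forest purely to keep the example self-contained.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_risk(w, b, x):
    """Return the predicted probability of the positive (high-risk) class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Synthetic, made-up records (NOT patient data):
# [age scaled to 0-1, HPV16/18 positive (0/1), abnormal cytology (0/1)]
X = [
    [0.30, 0, 0], [0.45, 0, 0], [0.50, 0, 1], [0.35, 1, 0],
    [0.60, 1, 1], [0.55, 1, 1], [0.40, 0, 0], [0.65, 1, 1],
]
y = [0, 0, 0, 0, 1, 1, 0, 1]  # hypothetical outcome labels

w, b = train_logistic(X, y)
low = predict_risk(w, b, [0.35, 0, 0])
high = predict_risk(w, b, [0.60, 1, 1])
print(f"low-risk profile: {low:.2f}, high-risk profile: {high:.2f}")
```

The fitted weights can be read directly as the contribution of each feature, which is one reason such models are regarded as relatively interpretable. A real model would need far more data, proper validation, and calibration.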
AI-guided technologies in cervical cancer screening
The tests used for cervical cancer screening include HPV testing, cytology (both conventional and liquid-based cytology), and VIA (by naked eye or enhanced with a magnifying device). Lugol’s iodine can also be used in place of acetic acid (VILI), although VILI is not widely recommended. Recently, cervical cancer screening has become increasingly dependent on the detection of hrHPV, which has a higher sensitivity and negative predictive value compared to cytology. Most programs recommend triaging HPV-positive women with a combination of HPV16/18 testing and cytology followed by colposcopy. However, cytology tests in low- and middle-income countries (LMICs) have highly variable performance and low sensitivity due to a lack of trained personnel, infrastructure, and quality assurance. Colposcopy and cervical biopsy serve diagnostic purposes, and a colposcopy-guided biopsy is critical for determining whether further treatment is necessary. Therefore, colposcopists need comprehensive training to achieve the level of proficiency required for accurate diagnosis. However, colposcopic equipment and expert or well-trained colposcopists are both scarce in LMICs.
AI is the simulation of human-like cognitive and learning capabilities by computer systems. AI refers to the capability of machines to learn patterns from representative examples and to make predictions on previously unseen data. Presently, AI in cervical cancer research is primarily focused on the automatic detection, feature extraction, and classification of various cervical images. Intelligent analysis of cervical images by advanced computer vision techniques is becoming an auxiliary or even alternative method for detecting cervical cancer at an early stage.
Availability of datasets for AI-guided cervical cancer screening
The availability of large and high-quality datasets of cervical clinical data provides a solid foundation for training and validating AI algorithms. Several high-quality public datasets with annotations are available, including the Cx22 dataset5 and ISBI Challenge Database6,7 for segmenting cytology images and the SIPaKMeD dataset8 and Herlev dataset9 for classifying cytology cells based on morphology. However, datasets for colposcopy images are relatively limited. Presently, the largest public dataset, Intel & MobileODT Cervical Cancer Screening10, was collected with mobile-level colposcopy devices, and public access to datasets captured with high-magnification colposcopy equipment is still lacking. The International Agency for Research on Cancer (IARC) Cervical Cancer Image Bank11 is one such database compiled by collaborating colposcopists using standard formats, although the scale is quite modest.
Feature representation in AI models
Representative variables, such as female age, menopausal status, parity, medical history, and HPV results, are typically selected as relevant clinical features for a cervical cancer risk prediction model, ensuring a comprehensive and robust representation of the data. However, for whole-slide and colposcopy images, deep learning algorithms, like convolutional neural networks (CNNs), are required to extract high-dimensional features from the images. With increasing network depth, the selected features become more representative, ultimately capturing unique patterns and textures that can contribute to the diagnosis of cervical cancer. Several preprocessing steps are performed before the images are fed into the neural network, such as normalizing the images, resizing the images to a consistent pixel size, and applying augmentation techniques to enhance the robustness of the model. After the features have been extracted, the features are normalized and combined to form a feature vector, which is used as input in the AI model. By utilizing imaging data appropriately, the multi-faceted nature of the disease can be captured and performance of the AI model can be enhanced.
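The preprocessing steps mentioned above (resizing to a consistent pixel size, normalization, and augmentation) can be sketched as follows. This is only an illustration under simplifying assumptions: the image is a small grayscale grid stored as nested lists, resizing uses nearest-neighbor sampling, and the only augmentation is a horizontal flip. All function names are hypothetical, and real pipelines typically rely on an imaging library and a richer set of augmentations.

```python
def resize_nearest(img, out_h, out_w):
    """Resize a 2D grayscale image with nearest-neighbor sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def normalize(img):
    """Rescale pixel values to zero mean and unit variance."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = var ** 0.5 or 1.0  # avoid division by zero on constant images
    return [[(p - mean) / std for p in row] for row in img]

def hflip(img):
    """Augmentation: mirror the image left to right."""
    return [row[::-1] for row in img]

def preprocess(img, size=(4, 4)):
    """Return the normalized image and one augmented (flipped) copy."""
    norm = normalize(resize_nearest(img, *size))
    return [norm, hflip(norm)]

# Toy 2x2 "image" with made-up intensities
image = [[0, 255], [255, 0]]
views = preprocess(image)
print(len(views), "views of size", len(views[0]), "x", len(views[0][0]))
```

Each returned view would then be passed to the network, so that the model sees the same content under slightly different presentations.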
AI algorithms for cervical cancer detection and diagnosis
With deep learning, various features in images, such as color, texture, and the relationships among objects, are systematically captured by neural networks. The CNN, a leading deep learning architecture, extracts hierarchical features, from edges and textures to higher-level patterns, from cervical cell images through multiple layers of convolution and pooling operations. The CNN is widely utilized for cell detection, segmentation, classification, and extraction of regions of interest (ROIs) in cytologic images. Moreover, advanced deep learning methods, such as graph convolutional networks (GCNs), perform convolution operations on graph-structured data, capturing the relationships between nodes and the structural information of the graph. By leveraging the node relationships and information propagation within the graph structure, GCNs enhance the ability to process complex structured data. GCNs are increasingly used to interpret high-resolution colposcopic images. Annotating and interpreting medical images requires well-trained cytologists, pathologists, and specialists with at least 5–10 years of experience, making the process both time-consuming and resource-intensive. At present, the exploration of deep learning algorithms is primarily aimed at alleviating this issue. A semi- or weakly-supervised learning method, for example, can analyze and learn features from partially or minimally annotated images, applying pseudo-labels to unannotated images for classification and object detection tasks. With self-supervised and unsupervised learning methods, which do not require manually annotated category labels, feature learning can be achieved from a vast collection of unannotated image samples. For example, generative adversarial networks (GANs) generate high-quality synthetic images, enhancing the diversity of datasets. GANs consist of a generator and a discriminator that work through an adversarial training process to produce high-quality images. 
The generator creates realistic images to deceive the discriminator, while the discriminator distinguishes between real and generated images. The generator progressively improves the quality of the generated images in this adversarial process, making it increasingly difficult for the discriminator to differentiate the generated images from real images. This training method does not require labeled data and can also improve the robustness of the model. Cervical cancer screening comprises a variety of tests, such as HPV testing as the primary screening, cytology triage, or HPV and cytology co-testing. Colposcopy serves as a preliminary diagnostic step, and the screening results are needed for reference. The Transformer neural network (Transformer) captures dependencies between different positions in sequence data through self-attention mechanisms, while multi-head attention mechanisms allow the model to focus on different parts of the input sequence simultaneously. This enables Transformers to effectively handle long-range dependencies, making them highly effective in natural language and image processing tasks. Specifically, Transformers can integrate different modalities of data, such as HPV testing results, cytology, and cervical images, showcasing their strong capabilities in multimodal data processing and complex sequence modeling12,13. Such models enable automatic image classification and abnormality detection and assist physicians in diagnostic decision-making by integrating multimodal data. As a result, the rates of misdiagnosis and missed diagnosis can be reduced, improving screening efficiency.
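The self-attention mechanism described above can be illustrated with a minimal sketch of single-head scaled dot-product attention. Several simplifications are assumed here: the learned query, key, and value projections of a real Transformer are replaced by the identity, only one head is used, and the three input vectors are made-up stand-ins for embeddings of an HPV result, a cytology finding, and an image feature.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Simplification: queries, keys, and values are the tokens themselves
    (identity projections) instead of learned linear maps.
    """
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        attn = softmax(scores)
        weights.append(attn)
        # Output is the attention-weighted average of all value vectors
        outputs.append([sum(a * v[j] for a, v in zip(attn, tokens))
                        for j in range(d)])
    return outputs, weights

# Hypothetical 3-dimensional embeddings, one per modality
tokens = [
    [1.0, 0.0, 0.5],  # e.g. HPV test result
    [0.9, 0.1, 0.4],  # e.g. cytology finding
    [0.0, 1.0, 0.2],  # e.g. colposcopic image feature
]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(a, 3) for a in row])
```

Each row of the printed matrix shows how strongly one input attends to every other input. Multi-head attention repeats this computation with several independent sets of learned projections and concatenates the results.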
Finally, multiple performance metrics are used to evaluate AI models so that effectiveness is assessed comprehensively. Classification performance is typically assessed with sensitivity (recall), specificity, accuracy, precision, F1-score, and the area under the receiver operating characteristic curve (AUC-ROC) as primary metrics. Intersection over union (IoU) and the Dice coefficient are typically used to evaluate segmentation accuracy, whereas mean average precision (mAP) is used for object detection: average precision (AP) is calculated for each class and the mAP is computed as the mean across all classes to provide an overall performance measure. Metrics such as diagnostic accuracy, which reflects the overall correctness of the AI diagnostic outputs, and positive predictive values (PPVs) and negative predictive values (NPVs), which indicate the probability that a positive or negative result is correct, respectively, are also used to evaluate an AI model as a diagnostic tool. These metrics are essential for assessing the practical applicability of the AI model in clinical settings and the potential impact on screening outcomes. The performance of AI technology in the early screening and detection of cervical cancer has been validated by several studies and has demonstrated good diagnostic accuracy14–16.
AI model workflows require rigorous data quality control and model selection to avoid biases and noise that can cause under- or over-fitting, which affects the generalization ability of the model. A model is then selected and fine-tuned based on its performance. High sensitivity and specificity are typically expected in clinical decision applications, but clinically acceptable results may vary depending on the application. Thus, internal and external validation is crucial to ensuring that the model is stable and generalizable. In general, developing an AI model is an evolving process that requires regular updates of the datasets used to train the model as well as optimization for clinical use.
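The classification and segmentation metrics named above follow directly from counts of true and false positives and negatives. The sketch below computes them for a binary task; the labels, predictions, and masks are invented for illustration and do not come from any study discussed in this review.

```python
def safe_div(num, den):
    """Return num/den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def classification_metrics(y_true, y_pred):
    """Binary metrics from the confusion matrix (1 = abnormal, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    ppv = safe_div(tp, tp + fp)          # also called precision
    return {
        "sensitivity": sensitivity,
        "specificity": safe_div(tn, tn + fp),
        "ppv": ppv,
        "npv": safe_div(tn, tn + fn),
        "accuracy": safe_div(tp + tn, len(pairs)),
        "f1": safe_div(2 * ppv * sensitivity, ppv + sensitivity),
    }

def iou_and_dice(mask_true, mask_pred):
    """Overlap metrics for two flattened binary segmentation masks."""
    inter = sum(1 for t, p in zip(mask_true, mask_pred) if t and p)
    size_true, size_pred = sum(mask_true), sum(mask_pred)
    union = size_true + size_pred - inter
    return safe_div(inter, union), safe_div(2 * inter, size_true + size_pred)

# Made-up example: 8 slides, 4 truly abnormal
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)

# Made-up 4-pixel masks
iou, dice = iou_and_dice([1, 1, 1, 0], [0, 1, 1, 1])
print(metrics, round(iou, 3), round(dice, 3))
```

Note that PPV and NPV depend on the prevalence of disease in the evaluated population, so values measured on an enriched test set may not transfer to a screening population.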
AI-guided methods in enhancing cervical cytology
An AI-assisted system analyzes cytologic images, develops mathematical models based on deep learning or other AI techniques, and screens digital smear images to identify normal and abnormal cells and facilitate cervical cancer screening. The first attempt to automate cervical smear screening began in 1992 with approval of the PAPNET Testing System for rescreening conventional cervical smears that had been manually screened17. Liquid-based cytology was an important innovation over the traditional Pap smear for improving the quality of cervical cytology. Cervical samples are collected in liquid media (ThinPrep18 or SurePath19). After the smears are prepared and stained, they are scanned with a microscope slide scanner to produce digital images. The US Food and Drug Administration (FDA) approved two computer-aided imaging systems for automated detection of abnormal cells in 2010 (BD FocalPoint GS Imaging System and the ThinPrep Imaging System20,21). Although suspicious cells are detected on a slide by computer-aided systems, the entire slide must be manually screened and interpreted by a cytopathologist. These systems have improved sensitivity and efficiency21–23, but extensive manual screening is still required and the final diagnosis depends entirely on manual review by the cytopathologist. BestCyte, a whole slide image (WSI)-based scanning technology, was introduced in 201424. It categorizes and systematically displays images of clinically significant cells in galleries based on the cytomorphologic characteristics within fields of view (FOVs). BestCyte supports the annotation of images at the cell level (40× magnification), which could potentially standardize objectivity among cytologists, leading to fewer discrepancies in final diagnoses25. BestCyte also incorporates cell annotation and WSI review through a remote operation platform for peer review by cytopathologists. 
These automated screening systems, however, are not actually AI-assisted screening technologies, but rather forms of computer-aided imaging techniques.
Automated smear analysis involves the following procedures: digital image slide acquisition; identifying ROIs; segmenting to isolate relevant features of cells; and classifying images into pre-neoplastic categories for cytopathologist review26. AI technology is primarily utilized in the segmentation and classification phases, which helps reduce the daily workload of cytopathologists and improves the efficiency of screening. Figure 1 illustrates the workflow of deep learning networks used in cervical cytology diagnosis. Segmentation refers to the isolation of multiple regions of cells to extract precise information about the ROI for the detection of abnormal cells. Because cytopathology generally requires close examination of the characteristics of the cell nucleus, the nucleus and cytoplasm must be precisely segmented. Sompawong et al.27 developed a Mask-RCNN architecture that used a classification branch to distinguish between normal and abnormal features based on the nuclear locations and a segmentation branch to pinpoint the nuclei locations on a Pap smear slide. U-Net is a convolutional network widely used for medical segmentation tasks. U-Net was introduced by Ronneberger et al.28 as a model that can learn from a few annotated images. Several studies in recent years have examined the performance of U-Net on the segmentation of cervical cells and reported Dice coefficients > 90% on internal datasets29–31. Zhang et al.30 validated the proposed GC-UNet in an actual cervical cancer diagnosis setting, achieving a precision of 99.5%. This finding indicates that U-Net might be a highly effective method for segmenting cervical nuclei, which will serve as an important tool for diagnosing and screening cervical cancer. Additionally, the method shows significant potential for practical application, with a rapid processing time of just 0.85 seconds per image. 
As an alternative to U-Net, Wang et al.32 proposed a multi-layer deep learning framework for improving the accuracy of detection of CIN2+ cells. The method employs a coarse-to-fine strategy for quick identification of target ROI tissue location through semantic segmentation, followed by precise single HSIL cell detection on specific ROIs. The multi-layer deep learning framework is also 20 times faster than U-Net in processing a single WSI.
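The general idea of a coarse-to-fine strategy can be sketched as follows. This is a toy illustration of the control flow only and is not the method of Wang et al.: the "slide" is a small grid of made-up scores, the coarse stage is a hypothetical mean-intensity check standing in for semantic segmentation, and the fine stage is a hypothetical thresholding function standing in for a single-cell detector.

```python
def coarse_to_fine(slide, tile, coarse_score, fine_detect, threshold):
    """Score every tile cheaply, then run the costly detector only on
    tiles whose coarse score reaches the threshold."""
    detections = []
    rows, cols = len(slide), len(slide[0])
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            patch = [row[c0:c0 + tile] for row in slide[r0:r0 + tile]]
            if coarse_score(patch) >= threshold:
                # Convert patch-local coordinates back to slide coordinates
                for dr, dc in fine_detect(patch):
                    detections.append((r0 + dr, c0 + dc))
    return detections

def mean_intensity(patch):
    """Hypothetical coarse stage: average value of the tile."""
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels)

def bright_pixels(patch, cutoff=0.9):
    """Hypothetical fine stage: positions whose value reaches the cutoff."""
    return [(r, c) for r, row in enumerate(patch)
            for c, p in enumerate(row) if p >= cutoff]

# Made-up 4x4 "slide"; only the bottom-right tile looks suspicious
slide = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.1],
    [0.0, 0.0, 0.95, 0.6],
    [0.1, 0.0, 0.7, 1.0],
]
hits = coarse_to_fine(slide, 2, mean_intensity, bright_pixels, 0.5)
print(hits)
```

Because three of the four tiles are rejected by the coarse stage, the fine stage runs on only a quarter of the slide, which is the source of the speed advantage that such strategies aim for.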
In general, the classification of cervical cells is intended to improve detection rates of cervical intraepithelial neoplasia. Current AI-assisted slide recognition and cytologic classification tasks rely on digital images from whole slide imaging, mainly including scanned conventional Pap smears and scanned LBC slides after staining. Recent studies have demonstrated that deep learning methods, such as CNNs and feature attention networks, can achieve an accuracy > 90% in both binary and multiple classification tasks for conventional smear images33–37. It is worth mentioning that Wang et al.33 contributed a transfer-learned CNN, which is used for classification by a limited number of Pap smear images with coarse image-level labels but has achieved remarkable performance. However, the liquid-based cytologic smear preparation method has the advantage of a smaller scan area and minimizing obscurations and cell overlap, which has led to the development of efficient AI cytology applications14,38–40. Bao et al.41 developed a supervised deep learning algorithm based on 188,542 digital LBC images and evaluated the capability of detecting CIN2+ and CIN3+ lesions. The accuracy of the algorithm was comparable to that of experienced cytologists. The exploratory development of deep learning classifiers also covers specially stained cytology slides, such as p16/Ki-67 dual-stained (DS) and H&E-stained slides. Wentzensen et al.42 applied two neural networks (CNN4 and Inception-v3) to develop a new image analysis platform based on deep learning and validated the platform in a large sample population. Compared with manual screening, the performance of AI classifier has similar sensitivity and higher specificity42. The simultaneous presence of p16 and Ki-67 in the same cell provides a valuable triage strategy, thus this AI classifier reduces the workload of cytologists and saves unnecessary resources. 
Considering the variety of types of cytologic images, Cheng et al.43 enhanced abnormal cell detection in cervical smears by integrating CNN with a recurrent neural network (RNN), providing an adaptable algorithm that could be used with different slide preparations, staining, and imaging methods. The most recent research on AI systems for cytology is shown in Table 1.
Aside from AI research on cervical cytology images, AI-based digital microscopes have also been developed, providing new opportunities to address the challenges of cervical cancer screening in LMICs. Tang et al.48 utilized the augmented reality (AR) technique with an AI microscope to provide real-time assistance for cervical cytology diagnosis. The AR technique significantly improved the sensitivity of LSIL and HSIL, and also enhanced the consistency of various atypical squamous cells48. In addition, a digital diagnostic system for Papanicolaou smears was developed by Holmstrom et al.49 using a portable whole-slide microscope scanner and a deep convolutional network trained on commercially available image-analysis platforms for the detection of cervical carcinoma cells4. The Holmstrom et al. study49 found that using the system for identifying squamous cell atypia is viable, resulting in high sensitivity, especially for detecting high-grade atypia slides, which might help reduce the workload for microscopists and cytopathologists in low-resource settings. The findings of all the research studies validate the potential and benefits of using AI to aid in cytologic diagnosis, as well as the potential to assist in screening efforts in areas with limited resources. Moreover, Tang et al.48 discussed the potential utilization of AI technology, such as AI microscopes, to enhance the professional training of newly trained cytopathologists in low-resource settings. Hologic, Inc. announced the launch of the first FDA-approved digital cytology system in February 2024 (Genius™ Digital Diagnostics System equipped with Genius™ Cervical AI algorithm and volumetric imaging technology). This diagnostics system consists of image acquisition, analysis of images by the AI algorithm, image storage, and remote peer review50. 
The Genius™ Digital Diagnostics System demonstrates that AI technology has a bright future in cervical cytology and is projected to have a significant impact on cervical cancer screening during the coming years. AI-assisted liquid-based cytology testing may facilitate the rapid expansion of cervical cancer screening, while also being more cost-effective51.
AI applications in colposcopic diagnosis and assistance in biopsies
The AI-assisted colposcopy diagnostic system combines high-definition colposcopic imaging to identify cervical lesions from annotated colposcopic images with assessment of suspicious lesion areas using image recognition algorithms. Due to the subjectivity of colposcopic diagnosis, AI technology is critical in helping primary care colposcopists in low-resource healthcare areas correctly differentiate between normal and abnormal cervical findings and grade and categorize cervical lesions efficiently. A few studies have evaluated the effectiveness of AI-based diagnosis with smartphone-obtained colposcope images, and these studies have shown promising results that are systematically superior to those of medical experts52–55. AI technology is increasingly being used to assist experienced colposcopists in enhancing their diagnostic performance, classifying lesions more effectively, identifying the transformation zone (TZ), and determining biopsy sites (Figure 2). The most recent AI-colposcope research has concentrated on the development of deep learning-based classifiers for cervical neoplasia on magnified cervix images obtained with special equipment, which increases the consistency with histopathologic findings56–63. Considering the heterogeneity of colposcopic imaging equipment and the prevalent lack of standardized annotations of colposcopic images, the application of semi-supervised learning algorithms for inferring cervical dysplasia categorization from limited high-quality colposcopy images represents a current research trajectory within the field of AI-assisted colposcopy64,65. Based on the ASCCP colposcopy standards, one of the most critical factors for grading colposcopic findings is the type of cervical TZ and whether it is fully or partially visible66. 
According to the Colposcopy Terminology published by the International Federation for Cervical Pathology and Colposcopy (IFCPC), a cervical TZ is typically defined as a region where squamous metaplasia has developed and is known to be a predisposing site for cervical cancer development47. Thus, a few studies have been conducted on the implementation of deep learning algorithms to improve the segmentation of the acetowhite lesion and determine its TZs67,68. The results of these studies reveal that precise segmentation of the TZ can effectively enhance the discriminative representation capacity of the deep learning-based CIN classifier60,63. In clinical colposcopy practice, a critical objective of cervical cancer screening is differentiating CIN grades. When lesions are detected, biopsies of 2–4 sites are obtained to ascertain the most severe lesion. For lesions diagnosed as CIN 2/3, treatments, such as conization or LEEP, might be required. To improve the accuracy and appropriateness of biopsy sites, some AI-assisted colposcopes provide guidance regarding cervical biopsies and predict the location of the biopsy site57,69,70. The most advanced current AI models for colposcopy are summarized in Table 2.
The development of AI models for the classification of cervical lesions has yielded impressive results, even achieving diagnostic capabilities comparable to colposcopists in some studies69,74–76. However, the independent interpretability of AI models still lacks clinical credibility. Moreover, very few of these developed AI models have been validated for applicability to real clinical use. Kim et al.72 evaluated the feasibility of the Cerviray AI system® on 234 patients and reported superior sensitivity and similar specificity compared with two colposcopists for detecting high-grade lesions. Moreover, the sensitivity improved significantly when the AI system worked in conjunction with at least one colposcopist. Wu et al.71 conducted a retrospective hospital-based study evaluating the colposcopic AI diagnostic system, CAIADS. CAIADS guided fewer biopsy sites and had the greatest biopsy sensitivity for high-grade lesions compared to subspecialists71. With the assistance of CAIADS, the sensitivity achieved by junior colposcopists on CIN grades and biopsy was significantly improved. These studies demonstrated the clinical applicability of the AI-assisted colposcopy diagnostic system, which can assist novice colposcopists in developing diagnostic abilities to the level of experienced practitioners and guide novice colposcopists in performing efficient biopsy procedures. Given these findings, an AI-assisted colposcopy diagnostic system could have a valuable support role in cervical cancer screening in areas with limited resources. AI colposcopy systems, like CAIADS and Cerviray AI, are also equipped with remote access functionality. Thus, the further development of cloud-based AI colposcopy platforms might narrow the gap in colposcopic examinations between LMICs. 
In addition, mobile-based and other portable colposcopy devices that incorporate AI technology in a more accessible and user-friendly format, such as MobileODT77 and Cervicare AI78, have also shown promising performance in validation studies and have been successfully commercialized. However, there is currently no FDA-approved AI-based colposcopy tool for cervical cancer detection. Currently, AI in colposcopy aims to address the shortage of experienced colposcopists, reduce misdiagnosis rates, and enhance the diagnostic efficiency of traditional colposcopy. AI also holds significant potential for providing effective education in colposcopy. The IARC offers training resources for colposcopy in several languages, providing comprehensive and accessible educational materials to a global audience79. In a recent study, Chen et al.80 developed an online digital education tool with numerous real-world colposcopy images for colposcopy training, which provided short-term improvements in colposcopist competency and confidence. In addition to supporting the acquisition of standard terminology and a greater understanding of colposcopy, digital training platforms facilitate interactive educational exchanges with colposcopists. Such platforms enable learners to tailor their learning to specific educational needs (Figure 3). Building on this approach, the interpretability of AI on colposcopic images might also offer a training benefit for novice colposcopists.
AI-assisted Cervical Cancer Screening Challenges and Suggestions
AI will have significant implications for cervical cancer screening, especially because AI-based methods for cytology screening have achieved considerable technical proficiency. This has been attributed to the development of AI algorithms in cervical cytology over the past few years. Segmentation models29,31,32,81,82 and classification models33–35,37 have demonstrated impressive evaluation results, with precision and accuracy both exceeding 90%. From the perspective of learning strategy, there has been a shift from ensemble learning83,84 to transfer learning33,35. Advances in AI algorithms for cytology have been supported by several public datasets, including the Herlev9, ISBI Challenge6,7, SIPaKMeD8, and Cx22 databases5. These datasets provide various types of Pap smear, single-cell, and overlapping cervical cell images for training different segmentation and classification tasks. In contrast, AI-based colposcopy diagnostic models still require further enhancement to ensure precise cervical lesion classification, including increasing specificity for early cervical lesions and accurately subdividing more subtypes of cervical lesions (e.g., adenocarcinoma in situ and cervical adenocarcinoma). In general, AI algorithm models achieve approximately 85% accuracy in the classification of cervical lesions on colposcopy images. Notably, light-weight neural networks, such as MobileNet, EfficientNet, and SqueezeNet, have increasingly been adopted as backbone networks in both cytology and colposcopy62,67,85. Given this trend, similarly efficient models can likely be deployed on mobile devices or portable computers, making the models more suitable for resource-limited environments.
However, there are very few publicly available colposcopy image datasets, with the IARC Cervical Cancer Image Bank11 being one of the few that meet expert qualification standards for collection procedures and image quality. Most colposcopy images used in AI algorithm development are not publicly available. Thus, establishing a standardized and regulated data platform for systematic management and quality control is essential for further advances. The emergence of large language models (LLMs) has marked a new phase in artificial intelligence. Recently, Meta’s FAIR lab released the Segment Anything Model (SAM), a large image segmentation model that aims to provide versatile and accurate segmentation86. This state-of-the-art model is designed not to be restricted by image type or domain. Wang et al.87 developed a foundation model based on whole-slide pathologic images, demonstrating that the latest generative AI technologies are already being applied in the medical field to address the modeling challenges of such large-scale medical images. This general model can also be applied to other types of image data, such as CT, MRI, or X-rays. Therefore, for the development of advanced AI algorithm models, a future direction will be for a single foundation AI model to process colposcopy images of varying quality from different devices.
As discussed in this review, validation studies that evaluate AI models may show heterogeneity due to factors such as the diversity of populations, variations in slide preparation, and the use of different evaluation metrics. Therefore, it is difficult to compare AI algorithms directly. Most evaluation results indicate that AI models used in clinical validation for cytology have difficulty achieving a sensitivity of 0.9014,15,46, although the speed of slide reading has significantly improved45,47. When compared to cytologists, AI-based systems for CIN detection are generally in line with the proficiency of junior cytologists, with some exceeding senior cytologists in performance14,47. Additionally, the specificity of different AI models varies considerably, most likely due to the varying case sample distributions included in the studies. Despite the relatively advanced application of AI image analysis in cytology4, conducting more clinical validation studies and standardizing how their results are reported should be prioritized. Furthermore, large-scale studies are lacking for the clinical validation of AI image analysis in colposcopy. Currently, only Yuan et al.60 and Xue et al.69 have included sufficient numbers of cases to evaluate the performance of AI systems in clinical settings. However, despite the improved sensitivity of AI systems compared to junior colposcopists in these validation studies61,71,72, some differences remain. An advantage of AI-assisted colposcopy is that it facilitates guided biopsies and enables lesions to be detected more effectively with fewer biopsy samples, thereby causing less cervical damage. The evidence on this point is mixed, however. Yuan et al.60 showed that the number of biopsies performed using the AI system was slightly higher than the number performed by the colposcopist in each case, whereas Wu et al.71 reported that the AI system performed fewer biopsies than colposcopists.
The significant variability in colposcopy image capture and collection leads to differences in AI system performance, emphasizing the importance of standardized protocols for colposcopy image collection. Additionally, prospective clinical studies will be needed to assess the extent to which AI models can be applied to large real-world populations, so that AI can improve the effectiveness of cervical cancer screening and be incorporated into screening strategies.
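Because the comparisons above rest on sensitivity and specificity, a minimal sketch of how these two metrics are derived from paired reference and AI readings may help when comparing reports across studies. The function names and the toy labels below are illustrative assumptions, not data or code from any study cited in this review; `True` stands for a high-grade lesion confirmed by the reference standard (e.g., histopathology).

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # reference positive, AI positive
    fp: int  # reference negative, AI positive
    tn: int  # reference negative, AI negative
    fn: int  # reference positive, AI negative


def confusion_counts(reference, predicted):
    """Tally a 2x2 confusion table from paired boolean labels."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1
        elif ref and not pred:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def sensitivity(c):
    """Proportion of reference-positive cases that the reader flags."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else float("nan")


def specificity(c):
    """Proportion of reference-negative cases that the reader clears."""
    negatives = c.tn + c.fp
    return c.tn / negatives if negatives else float("nan")


# Hypothetical toy data: 4 reference-positive and 6 reference-negative cases.
reference = [True, True, True, True, False, False, False, False, False, False]
ai_reading = [True, True, True, False, True, False, False, False, False, False]

counts = confusion_counts(reference, ai_reading)
print(f"sensitivity={sensitivity(counts):.2f}, specificity={specificity(counts):.2f}")
```

Reporting the underlying counts alongside the two ratios, rather than the ratios alone, is one simple way to make results from different validation studies easier to compare.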
Finally, despite the potential benefits and cost-effectiveness of AI in cervical cancer screening, the lack of clear interpretability of diagnostic decisions raises substantial concerns among clinicians regarding safety, resilience, and ethics. If AI technology is integrated into cervical cancer screening strategies, especially by using AI-assisted cytology to replace traditional cytology screening every 5 years, the cost-effectiveness could be comparable to HPV testing51. AI systems can significantly reduce diagnostic time and cost. However, pathologists and colposcopists using these systems for diagnostic assistance must navigate the allocation of responsibility for clinical outcomes, as well as the privacy and security of patient data, both of which require further definition. Hence, AI applications can only assist physicians in diagnosing rather than replace physicians in clinical decision-making. Healthcare professionals should understand the advantages and limitations of AI when using AI-based products. Furthermore, AI must be used in a manner that respects the autonomy of patients. To ensure that AI is used ethically, clear guidelines and regulations must also be established. Digital health innovation is the focus of the US FDA Digital Health Innovation Action Plan88, which is intended to regulate and monitor digital health devices. Similarly, the European Union Medical Device Regulation emphasizes stringent requirements for the clinical evaluation and post-market surveillance of AI-powered medical devices89. The National Medical Products Administration in China is strengthening its regulatory processes to accommodate rapid advances in AI technology. The International Telecommunication Union is working on international standards to facilitate global harmonization90. Global efforts are being made towards establishing robust governance for AI in healthcare, ensuring patient safety, and promoting innovation.
This technology must be approached with caution, with strict regulations in place to protect patient privacy, ensure informed consent, uphold ethical standards, and prevent misuse.
Conclusions
In this paper, we have described the current development, application, challenges, and future directions of AI in cervical cancer screening. Cervical cancer screening methods with AI technology have the potential to significantly transform the prevention and control of cervical cancer. Further applications of AI to cervical cancer screening could deliver high-quality clinical performance, provide an explainable and interpretable diagnostic rationale on standardized platforms, and archive extensive real-world cervical images for educational purposes. This might narrow the gap between tertiary and primary care hospitals, in turn maximizing health care for a broader segment of the population. AI can also reduce the workload of medical personnel while increasing diagnostic accuracy and efficiency. It is hoped that current research on AI will translate into clinical practice and thereby expedite the global goal of eliminating cervical cancer.
Conflicts of interest statement
No potential conflicts of interest are disclosed.
Author contributions
Conceived and designed the analysis: Partha Basu, Youlin Qiao, Fanghui Zhao.
Collected the data: Tong Wu.
Performed the analysis: Tong Wu, Eric Lucas.
Wrote the paper: Tong Wu, Partha Basu.
Acknowledgements
We thank Dr. Peng Xue and Ms. Mingyang Chen at the Chinese Academy of Medical Sciences and Peking Union Medical College for comments on the original manuscript.
- Received May 30, 2024.
- Accepted August 12, 2024.
- Copyright: © 2024 The Authors
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License.