Colorectal cancer (CRC) accounts for approximately 10% of newly diagnosed cancer cases and cancer-related deaths worldwide1. The identification of molecular subtypes of CRC has significantly advanced treatment strategies, including targeted therapy and immunotherapy. CRC comprises a highly heterogeneous set of tumors with well-classified subtypes based on genome, DNA methylome, and transcriptome signatures2–4, among which genetic factors and transcriptomic profiles have been more extensively studied and are better understood. Epigenetic regulatory mechanisms also exert a profound influence on tumor cell phenotype5, yet epigenetic subtyping and the regulation of gene expression in different CRC subtypes remain poorly understood. This gap in knowledge underscores the need to understand the epigenetic determinants of CRC phenotypic diversity.
The (intrinsic) consensus molecular subtype (CMS) of CRCs
Cancer subtyping based on transcriptomic profiles has been widely accepted as an important source of disease stratification because gene expression is closely associated with cellular phenotypes and tumor behaviors. In 2015, an international consortium integrated six independently identified transcriptome-based subtyping systems and established the CMS system for CRCs. Most tumors were classified into one of four CMSs (CMS1-4)3, which were characterized by high immune infiltration, canonical WNT and MYC activation, metabolic dysregulation, and high mesenchymal infiltration, respectively. The fibrotic CMS4 had the worst overall and relapse-free survival3.
The CMS groups were identified from bulk gene expression profiles of heterogeneous tissues, which obscured the composition of cell types and microenvironmental interactions within the tumor. Subsequent research using single-cell RNA sequencing (scRNA-seq) provided global cellular landscapes of CRCs. Joanito et al.6 performed subtyping analysis exclusively on malignant epithelial cells, which identified two cancer cell states and established the intrinsic CMS (iCMS) classification system. Cancer cells were classified into iCMS2 and iCMS3, in which iCMS2 cells exhibited higher WNT and MYC activities and adenoma gene signatures, while iCMS3 cells exhibited higher MAPK pathway activity and sessile serrated lesion gene signatures.
The (i)CMS systems were initially defined based on transcriptomic profiles, and subsequent correlative analysis with well-defined genetic and microenvironmental features has deepened the understanding of each CMS (Figure 1). First, iCMS2 cells exhibited high frequencies of somatic copy number alterations (SCNAs) and mutations in APC and TP53, whereas iCMS3 cells had few SCNAs and were enriched for mutations in KRAS, PIK3CA, and BRAF. Second, all iCMS2 cells were microsatellite stable (MSS), while iCMS3 cells were further classified into MSS and microsatellite unstable (MSI). The iCMS3-MSI tumors featured elevated immune response signatures and higher immune cell infiltration6. Indeed, iCMS2, iCMS3-MSS, and iCMS3-MSI corresponded to CMS2, CMS3, and CMS1 CRCs, respectively. The fibrotic CMS4 subtype was not included in the iCMS system, suggesting that microenvironmental mesenchymal features are decoupled from the intrinsic characteristics of tumor cells.
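The correspondence between the intrinsic and consensus subtypes described above can be written as a small lookup. This is only an illustrative sketch: the function name `icms_to_cms` and the string labels are assumptions made here, and the mapping encodes nothing beyond the three correspondences stated in the text.

```python
from typing import Optional


def icms_to_cms(icms: str, msi_status: str) -> Optional[str]:
    """Return the CMS label that corresponds to an iCMS/microsatellite pair.

    Hypothetical helper for illustration. Combinations that the text does not
    describe (for example, an MSI tumor labelled iCMS2) return None.
    """
    correspondence = {
        ("iCMS2", "MSS"): "CMS2",
        ("iCMS3", "MSS"): "CMS3",
        ("iCMS3", "MSI"): "CMS1",
    }
    return correspondence.get((icms, msi_status))


print(icms_to_cms("iCMS3", "MSI"))
```

CMS4 is never returned, reflecting the point that the fibrotic subtype is defined by the microenvironment rather than by the intrinsic state of the tumor cells.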
Epigenetic classification of CRCs
Non-mutational epigenetic reprogramming is a hallmark of cancer5,7. Previous research on epigenetic CRC subtyping has primarily focused on DNA methylation because DNA methylation assays are technically more feasible than chromatin-related assays. In 1999, Toyota et al.4 first reported that a subset of CRCs exhibited abnormally increased DNA methylation at cancer-specific methylated regions (hypermethylation of the promoter regions of eight marker genes), forming a distinct subtype referred to as the CpG island methylator phenotype (CIMP)4. Subsequent studies expanded the methylated marker panel of CIMP tumors and further classified all CRCs into CIMP-high, CIMP-low, and CIMP-negative groups based on the degree of genome-wide hypermethylation8,9. The CIMP-high subtype is closely associated with the prognosis of CRC patients and encompasses nearly all cases with BRAF mutations8,9.
Unlike DNA methylation, chromatin-related epigenetic regulatory profiles, including histone modification, chromatin accessibility, and transcription factor (TF) binding, have not been thoroughly studied in CRC. Liu and colleagues10 recently reported in Cancer Discovery a systematic analysis of the CRC epigenetic regulatory landscape using the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq). Based on genome-wide chromatin accessibility profiles, Liu et al.10 identified two distinct epigenetic CRC subgroups [group-1 and group-2 (Figure 3A in Liu et al.10)]. Group-1 cells are characterized by frequent SCNAs, MSS status, and predominant origin in the left colon, whereas group-2 cells encompass all MSI cases and primarily originate from the right colon. The genetic alterations, clinical features, and, more importantly, gene expression signatures of the epigenetically identified group-1 and group-2 CRC subgroups matched the scRNA-seq-defined iCMS2 and iCMS3 CRCs, respectively (Figure 3D-F of Liu et al.10). This finding showed for the first time that the well-characterized (i)CMS classification system is robust at the epigenetic level, extending the scope of the iCMS system to epigenetic regulation.
TF activities shape both inter- and intra-subtype heterogeneities
The (i)CMS system provides a multi-dimensional phenotypic classification of CRCs, encompassing genomic, transcriptomic, microenvironmental, and clinical features. However, the subtype-specific gene regulation mechanisms underlying this phenotypic classification have not been established. Using the scATAC-seq atlas, Liu and colleagues10 systematically identified epigenetic regulators with subtype-specific activities, including cis-regulatory elements (Figure 4B, D in Liu et al.10) and transcription factors. The activities of several TFs differ significantly between iCMS subtypes, including HNF4A, PPARA, and CDX2 for iCMS2 tumors, and FOXA3 and MAFK for iCMS3 tumors (Figure 4E, F in Liu et al.10). Notably, a subsequent patient-level analysis confirmed that the activities of these TFs differ strongly between subtypes and are highly consistent within each subtype. HNF4A, for example, was hyper-activated in the tumor cells of every patient with an iCMS2 tumor, but not in patients with iCMS3 tumors (Figure 4G, H in Liu et al.10). This finding indicated that hyper-activation of subtype-specific TFs shapes the phenotypic characteristics of iCMS subtypes.
Despite the shared TF activity profiles of tumors within the same subtype, strong intra-subtype heterogeneity has been noted for both gene expression and chromatin accessibility. Among patients of the same subtype, only a small fraction (approximately 10%) of genes and regulatory elements were consistently activated across all patients, while the vast majority of aberrantly activated genes were specific to individual patients (Figure 4H, I in Liu et al.10). Notably, this combination of limited similarity and extensive diversity within a subtype could be accurately explained by TF binding profiles. TFs, which serve as master regulators of gene expression and cellular phenotypes, shape both inter- and intra-subtype heterogeneities through subtype-specific activation combined with patient-specific binding and activation of downstream target sites (Figure 2).
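The distinction between consistently activated and patient-specific features can be made concrete with a simple count of how many patients share each feature. The sketch below is an illustration under stated assumptions, not the procedure used by Liu et al.: the function name `sharing_profile`, the input layout, and the toy gene labels are all invented here.

```python
from collections import Counter


def sharing_profile(activated_by_patient):
    """Summarize how widely each aberrantly activated feature is shared.

    activated_by_patient: dict mapping a patient ID to the set of genes or
    regulatory elements called as activated in that patient (hypothetical input).
    Returns (fraction shared by all patients, fraction private to one patient),
    both relative to the union of activated features.
    """
    n_patients = len(activated_by_patient)
    counts = Counter(
        feature
        for features in activated_by_patient.values()
        for feature in features
    )
    total = len(counts)
    if total == 0:
        return 0.0, 0.0
    shared = sum(1 for c in counts.values() if c == n_patients)
    private = sum(1 for c in counts.values() if c == 1)
    return shared / total, private / total


# Toy data invented for illustration; not taken from any study.
toy = {
    "P1": {"g1", "g2", "g3", "g4"},
    "P2": {"g1", "g5", "g6", "g7"},
    "P3": {"g1", "g8", "g9", "g10"},
}
shared_frac, private_frac = sharing_profile(toy)
print(shared_frac, private_frac)
```

In this toy example one of ten features is shared by all three patients and the remaining nine are private, mirroring the pattern of a small common core and a large patient-specific remainder.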
Synergistic TF modules regulate multi-dimensional CRC phenotypes
The molecular subtyping of CRCs encompasses multiple aspects of tumor cell phenotypes, including clinical features, driver mutations, genome instabilities, transcriptome signatures, and DNA methylation profiles11. Different aspects of cancer cell characteristics are modulated and interconnected by complex gene regulation networks. Using weighted correlation network analysis, Liu and colleagues10 identified several TF modules related to multi-dimensional CRC heterogeneities. TFs within the same module exhibit highly correlated activities (Figure 7A in Liu et al.10), thus forming distinct collaborative regulatory units that shape cellular phenotypes. TF modules associated with iCMS classifications, including ME5 and ME8, are also significantly associated with the left/right side origins of CRC tumors, confirming shared gene regulation programs between iCMS subtypes and the location of tumor origin (Figure 7B, C in Liu et al.10). Several TF modules are independently related to microsatellite instability or CIMP subtypes, while the iCMS-related ME8 also contributes to these features. These results indicate that complex phenotypic subtypes can be deconvoluted into distinct regulatory modules, and that the synergistic activities of TFs within the same module collaboratively shape the cellular phenotypes of tumor cells.
In addition to modulating multi-dimensional CRC phenotypes, regulation of a single CRC subtype also requires the collaborative efforts of different TF modules. Specifically, three distinct TF modules (ME5, ME6, and ME8) are all associated with the gene expression signatures of iCMS2 CRCs. However, ME5 and ME6 are exclusively associated with the genes downregulated in iCMS2 CRCs compared with healthy colon epithelial cells, while ME8 is only associated with the upregulated iCMS2 genes10 (Figure 7F, G in Liu et al.10). Consistent with this observation, several upregulated genes in iCMS2 were confirmed as direct targets of ME8 TFs. Taken together, the synergistic activities of different TF modules collaboratively regulate distinct downstream genes and shape multi-dimensional heterogeneities of CRCs.
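The idea of grouping TFs into modules by correlated activity can be illustrated with a deliberately simplified sketch. It is not weighted correlation network analysis as implemented in dedicated packages: it keeps only the signed soft-threshold adjacency, a = ((1 + r) / 2)^beta, and replaces the topological overlap and hierarchical clustering steps with connected components above a cutoff. The names `tf_modules`, `beta`, and `cutoff`, the parameter values, and the toy TF labels are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (no constant vectors)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tf_modules(activity, beta=6, cutoff=0.5):
    """Group TFs whose activities are strongly positively correlated.

    activity: dict mapping a TF name to its activity across samples.
    Two TFs are linked when ((1 + r) / 2) ** beta >= cutoff; modules are the
    connected components of the resulting graph (a simplification).
    """
    names = sorted(activity)
    neighbors = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(activity[a], activity[b])
            if ((1 + r) / 2) ** beta >= cutoff:
                neighbors[a].add(b)
                neighbors[b].add(a)

    modules, seen = [], set()
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        modules.append(sorted(component))
    return modules


# Toy activities invented for illustration; the TF names are placeholders.
toy_activity = {
    "TF_A": [1, 2, 3, 4, 5],
    "TF_B": [2, 4, 6, 8, 10],
    "TF_C": [5, 4, 3, 2, 1],
    "TF_D": [5, 3, 4, 1, 2],
}
modules = tf_modules(toy_activity)
print(modules)
```

Using the signed adjacency keeps anti-correlated TFs in separate modules, which fits the observation that different modules can be tied to upregulated and downregulated genes of the same subtype.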
Epigenetic regulation of intra-tumor heterogeneities (ITHs)
As a tumor evolves, cancer cells gradually form genetically heterogeneous subclones, which together constitute complex ITHs. ITHs can arise from genetic, epigenetic, or microenvironmental inputs and fuel therapy resistance during tumor evolution12. ITH analysis requires multi-regional sampling and high-resolution profiling, which are difficult to achieve with traditional methods. Recent progress in single-cell multi-omics sequencing technologies has provided powerful tools for studying the dynamics of multi-dimensional ITH during cancer evolution. A milestone study published in Science by Bian and colleagues13 used single-cell multi-omics sequencing to analyze primary tumors and lymphatic and distant metastases of CRCs, simultaneously acquiring genome (CNVs), DNA methylome, and transcriptome profiles from each single cell. Different subclones within each tumor, as well as the phylogenetic relationships between subclones, can be accurately identified using SCNA profiles. Further correlative analysis between genetic lineages and DNA methylation profiles revealed that genome-wide DNA methylation levels are tightly associated with genetic lineages, indicating that the ITH of DNA methylation is also shaped by genetic ITH. Associations between DNA methylation and gene expression levels have also been demonstrated, in which the DNA methylation ratios of gene promoters are negatively associated with the levels of gene expression13. These results demonstrated the feasibility of single-cell multi-omics sequencing for reconstructing genetic lineages and tracing epigenomic and transcriptomic dynamics during tumor evolution.
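The link between genetic lineage and methylation described above can be sketched by grouping cells on their copy-number profile and comparing the average methylation of each group. This is a toy simplification under stated assumptions: real analyses cluster noisy SCNA profiles and build phylogenies rather than requiring identical profiles, and the function name `methylation_by_subclone`, the dictionary keys `cn` and `meth`, and all values are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def methylation_by_subclone(cells):
    """Average genome-wide methylation per copy-number-defined subclone.

    cells: list of dicts with keys
      "cn"   - copy-number states per genomic segment (hypothetical layout)
      "meth" - genome-wide methylation fraction of that cell
    Cells with identical copy-number vectors are treated as one subclone,
    which is a simplifying assumption.
    """
    groups = defaultdict(list)
    for cell in cells:
        groups[tuple(cell["cn"])].append(cell["meth"])
    return {profile: mean(values) for profile, values in groups.items()}


# Toy cells invented for illustration; not taken from any dataset.
toy_cells = [
    {"cn": [2, 3, 2], "meth": 0.70},
    {"cn": [2, 3, 2], "meth": 0.72},
    {"cn": [2, 3, 1], "meth": 0.55},
    {"cn": [2, 3, 1], "meth": 0.57},
]
by_clone = methylation_by_subclone(toy_cells)
for profile, level in by_clone.items():
    print(profile, round(level, 2))
```

When methylation tracks genetic lineage, cells within a subclone show similar methylation levels while the subclone averages differ, as in the two toy groups here.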
The milestone work of Bian et al.13 provided comprehensive insights into the genomic, transcriptomic, and DNA methylome evolution of CRCs. However, little is known about the dynamics of chromatin-related epigenetic regulators, including histone modification, chromatin accessibility, and TF binding. Liu and colleagues10 adopted a similar strategy in the scATAC-seq atlas, inferring genetic subclones and the corresponding phylogenetic relationships from SCNA profiles, which made it possible to investigate the dynamics of chromatin accessibility and TF binding during cancer evolution. This analysis revealed profound ITH in chromatin accessibility profiles. Later-emerging subclones were shown to have stronger iCMS-specific gene expression signatures (Figure 5E in Liu et al.10), and the activities of iCMS-specific TFs were gradually established along the phylogenetic evolutionary trajectory (Figure 5G, H in Liu et al.10). This stepwise activation of iCMS-specific TFs aligned with the expression patterns of iCMS signature genes, providing insights into the epigenetic programs regulating the formation of iCMS phenotypes during tumor evolution.
Early epigenetic dynamics before malignant transformation
Most CRCs are derived from adenomatous polyps. Elucidating the regulatory mechanisms underlying adenoma formation could therefore provide targets for early screening and diagnosis of CRC. Liu and colleagues10 systematically identified chromatin state signatures for precancerous polyps and observed a strong anti-correlation between chromatin accessibility and DNA methylation levels (Figure 2D, F in Liu et al.10), which together regulate the expression of several CRC drivers (Figure 3). Notably, the abnormal chromatin states acquired in adenomas, in both chromatin accessibility and DNA methylation, are predominantly maintained during subsequent malignant transformation (Figure 3). This co-regulation by chromatin accessibility and DNA methylation provides promising candidates for CRC screening (Figure 2H-K in Liu et al.10). For example, HOXA genes exhibit extremely high frequencies of hypermethylation in CRCs and have excellent diagnostic value14. However, the chronologic order of, and causal relationships between, changes in chromatin accessibility and DNA methylation remain unclear. One likely sequence is that an epithelial cell first loses DNA methylation due to metabolic abnormalities15; loss of DNA methylation at many regulatory elements, such as promoters and enhancers, then increases the chromatin accessibility of these elements, which in turn leads to aberrant activation of target gene transcription. Careful studies are still needed to clarify how these epigenetic factors jointly regulate gene expression and contribute to disease initiation and progression.
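An anti-correlation between chromatin accessibility and DNA methylation across regulatory regions can be quantified with a rank correlation. The sketch below computes Spearman's coefficient in plain Python as one reasonable choice; the source does not state which statistic was used, and the function names and toy values are assumptions made for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Toy per-region values invented for illustration; not taken from any dataset.
accessibility = [0.9, 0.7, 0.5, 0.3, 0.1]
methylation = [0.1, 0.2, 0.4, 0.7, 0.9]
rho = spearman(accessibility, methylation)
print(rho)
```

A coefficient close to -1 indicates that the most accessible regions tend to be the least methylated. A correlation of this kind says nothing about which change comes first, which is the open question raised above.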
Perspective
The (i)CMS classification system, initially defined from gene expression profiles, provides comprehensive molecular stratification for CRCs. Different CRC subtypes have been characterized based on distinct clinical, genomic, transcriptomic, and microenvironmental profiles, which has reshaped the paradigms of precision medicine in CRC11. Liu et al.10 were the first to extend the scope of (i)CMS from phenotypic traits to gene regulation programs, linking the well-established subtypes to their epigenetic underpinnings and providing novel insights and potential targets for CRC therapy.
TFs are master regulators of cellular phenotypes. Dysregulated TF activities have been frequently observed in various cancer types and are associated with multiple cancer hallmark properties. Thus, dysregulated TFs could serve as a unique class of therapeutic targets. Several strategies have been established to design TF-targeting drugs, including disrupting DNA binding activities, inhibiting interactions with coactivators, and modulating proteasomal degradation16. Several TFs are widely activated across patients, suggesting that suppressing these TFs may lead to relatively uniform antitumor responses in a wide variety of patients. Abnormally activated TFs in CRCs, whether subtype-specific or shared, might serve as novel targets for CRC treatment.
Conflict of interest statement
No potential conflicts of interest are disclosed.
Author contributions
Conceived and designed the analysis: Xin Zhou, Fuchou Tang.
Performed the analysis: Zhenyu Liu.
Wrote the paper: Zhenyu Liu.
- Received May 9, 2024.
- Accepted June 24, 2024.
- Copyright: © 2024, The Authors
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License.