Comprehensive characterization of CRC with germline mutations reveals a distinct somatic mutational landscape and elevated cancer risk in the Chinese population

Objective: Hereditary colorectal cancer (CRC) accounts for approximately 5%–10% of all CRC cases. The full profile of CRC-related germline mutations and the corresponding somatic mutational profile have not been fully determined in the Chinese population. Methods: We performed the first population study investigating the germline mutation status in more than 1,000 (n = 1,923) Chinese patients with CRC and examined its relationship with the somatic mutational landscape. Germline alterations were examined with a 58-gene next-generation sequencing panel, and somatic alterations were examined with a 605-gene panel. Results: A total of 92 pathogenic (P) mutations were identified in 85 patients, and 81 likely pathogenic (LP) germline mutations were identified in 62 patients; together, these patients accounted for 7.6% (147/1,923) of the cohort. MSH2 and APC were the most frequently mutated genes in the Lynch syndrome and non-Lynch syndrome groups, respectively. Patients with P/LP mutations had significantly higher rates of high microsatellite instability, deficient mismatch repair, and family history of CRC, and were significantly younger. The somatic mutational landscape revealed a significantly higher mutational frequency in the P group and a trend toward higher copy number variations in the non-P group. The Lynch syndrome group had a significantly higher mutational frequency and tumor mutational burden than the non-Lynch syndrome group. Clustering analysis revealed that the Notch signaling pathway was uniquely clustered in the Lynch syndrome group, and the MAPK and cAMP signaling pathways were uniquely clustered in the non-Lynch syndrome group. Population risk analysis indicated that the overall odds ratio was 11.13 (95% CI: 8.289–15.44) for the P group and 20.68 (95% CI: 12.89–33.18) for the LP group. Conclusions: Distinct features were revealed in Chinese patients with CRC with germline mutations. The Notch signaling pathway was uniquely clustered in the Lynch syndrome group, and the MAPK and cAMP signaling pathways were uniquely clustered in the non-Lynch syndrome group. Patients with P/LP germline mutations exhibited higher CRC risk.


Introduction
Colorectal cancer (CRC) is the third and second most common cancer in men and women worldwide, respectively 1 , and the fifth most common cancer in China 2 . Although most cases of CRC are sporadic, inherited factors are known to contribute to approximately 30%-35% of CRC cases 3 . Approximately 5%-10% of patients with CRC carry high-risk germline mutations that are associated with known hereditary CRC syndromes, including Lynch syndrome (also known as hereditary non-polyposis CRC), familial adenomatous polyposis (FAP), MUTYH-associated polyposis, Peutz-Jeghers syndrome, juvenile polyposis syndrome, PTEN hamartoma tumor syndrome, and serrated polyposis syndrome [4][5][6] . The germline mutations associated with these syndromes have been extensively investigated at both the genomic and individual gene levels, and the heritability of many of these mutations has been confirmed in population and/or family studies. New germline mutations with suspected heritability have also been reported in recent years 7,8 . Many hotspot mutations have been identified in hereditary CRC syndromes, primarily involving APC, MLH1, MSH2, MSH6, and PMS2 7,8 . Therefore, hereditary CRC syndromes are associated with both hotspot and non-hotspot germline mutations.
Previous research has shown that pathogenic germline mutations increase the risk of cancers, including not only CRC 7 but also hereditary breast and ovarian cancer syndrome 9 and lung cancer 10 . However, this risk remains to be clearly defined for Chinese patients with CRC. Furthermore, the somatic mutational landscape of hereditary CRC syndromes has yet to be characterized and compared with that of sporadic CRC. This comparison may aid in understanding the mechanisms underlying hereditary CRC syndromes. In this study, we recruited a large cohort of 1,923 unselected patients with CRC, investigated both the germline and somatic mutational landscapes, and performed extensive comparisons between patients with and without pathogenic germline mutations. More importantly, by comparing the incidence of individual mutations in our cohort with that in the general population, we clarified the risk associated with the identified germline mutations. This study provides important information regarding the mutational landscape, cancer risk, and potential carcinogenic mechanisms of CRC-related germline mutations in the Chinese population. Our findings may help establish preventive and therapeutic strategies for patients with CRC with suspected heritability.

Ethics approval
All experimental plans and protocols for the study were submitted to the ethics/licensing committees of the indicated participating hospitals for review and approval before the start of the clinical study, and were approved by the corresponding committees of the participating hospitals (Approval No. S2015-032-02). Because the study had a retrospective design and used retrospective samples collected by the participating hospitals, informed consent was not required. Patients with pathogenic (P) or likely pathogenic (LP) germline mutations were informed of the test results. All experiments, methods, procedures, and personnel training were carried out in accordance with the relevant guidelines and regulations of the participating hospitals and laboratories.

Study design
The study was designed and implemented in 7 Chinese hospitals, and both cancer tissue and blood samples were collected retrospectively. The study was designed to include as many patients with CRC as possible, provided that tissue or blood samples were available for next-generation sequencing (NGS). Samples collected between January 2016 and August 2020 from 1,923 patients with CRC were obtained according to the availability of samples for NGS testing in the participating hospitals. The details of patient demographic information, pathological information, family history, and microsatellite instability (MSI)/mismatch repair (MMR) information are summarized in Table 1. A positive family history was defined as a patient with confirmed CRC having at least one immediate family member (first-degree relative) with a history of CRC diagnosis. Immediate family members included parents, siblings, and children. The collected samples comprised tissue samples [formalin-fixed paraffin-embedded (FFPE) samples or frozen samples from surgery] and blood samples obtained at the time of CRC diagnosis confirmation. Diagnosis was confirmed with imaging examinations and subsequent pathological examinations. No participants received chemotherapy, radiotherapy, targeted therapy, or immunotherapy before the tissue and blood samples were collected. The somatic sequencing data presented in this study were from FFPE samples or frozen tissue samples. Germline sequencing data were obtained from the corresponding genomic DNA of white blood cells. Data meeting the following criteria were included in subsequent analysis: ratio of remaining data filtered by fastq in raw data ≥85%; proportion of Q30 bases ≥85%; ratio of reads on the reference genome ≥85%; target region coverage ≥98%; and average sequencing depth in tissues ≥2,200×.
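The inclusion criteria listed above amount to a conjunction of five thresholds. A minimal sketch of such a filter is given here for illustration only; the `SampleQC` record and its field names are hypothetical and do not correspond to the output of any specific pipeline used in the study.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical per-sample QC summary; field names are illustrative assumptions."""
    clean_data_ratio: float   # fraction of raw data remaining after filtering
    q30_ratio: float          # fraction of bases with quality >= Q30
    mapped_ratio: float       # fraction of reads aligned to the reference genome
    target_coverage: float    # fraction of the target region covered
    mean_tissue_depth: float  # average sequencing depth in tissue (x)


def passes_qc(qc: SampleQC) -> bool:
    """Return True only if every threshold stated in the text is met."""
    return (
        qc.clean_data_ratio >= 0.85
        and qc.q30_ratio >= 0.85
        and qc.mapped_ratio >= 0.85
        and qc.target_coverage >= 0.98
        and qc.mean_tissue_depth >= 2200
    )
```

A sample failing any single criterion is excluded, which matches the wording that data had to meet all of the listed criteria.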
The called somatic variants were required to meet the following criteria: read depth at a position ≥20×; variant allele fraction (VAF) ≥2% for tissue and PBL genomic DNA; somatic-P value ≤0.01; strand filter ≥1. VAF values were calculated for Q30 bases. Copy number variations (CNVs) were detected with CNVkit version 0.9.3 (https://github.com/etal/cnvkit). Further analyses of genomic alterations, including single nucleotide variants (SNVs), insertions/deletions (indels), and CNVs, were also performed.
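The variant-level criteria above can be sketched in the same way. This is an illustrative filter under stated assumptions: the `SomaticCall` record is hypothetical, VAF is expressed as a fraction, and the "strand filter" is treated as a numeric value compared against 1, because the text does not define it further.

```python
from dataclasses import dataclass


@dataclass
class SomaticCall:
    """Hypothetical somatic call record; field names are illustrative assumptions."""
    depth: int            # read depth at the variant position
    vaf: float            # variant allele fraction, as a fraction (0.02 == 2%)
    somatic_p: float      # somatic P value reported by the caller
    strand_filter: int    # assumed numeric strand-filter value


def keep_call(call: SomaticCall) -> bool:
    """Apply the four thresholds stated in the text to one call."""
    return (
        call.depth >= 20
        and call.vaf >= 0.02
        and call.somatic_p <= 0.01
        and call.strand_filter >= 1
    )


def filter_calls(calls):
    """Return only the calls that satisfy every criterion."""
    return [c for c in calls if keep_call(c)]
```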

Interpretation of pathogenicity of germline mutations and calculation of somatic TMB
The pathogenicity of germline mutations was defined and predicted according to the 5-grade classification system of the American College of Medical Genetics and Genomics guidelines for the interpretation of sequence variants. All germline mutations were categorized into P, LP, or non-pathogenic (non-P) groups. Variants of uncertain significance (VUS) and benign and likely benign mutations were assigned to the non-P group in this study. Tumor mutational burden (TMB) was calculated by dividing the total number of non-synonymous SNVs and indels detected in tissue (VAF > 2%) by the full length of the exome region of the 605-gene NGS panel (Supplementary Table S2). Genomic DNA from PBLs served as the matched reference when somatic mutations were called.
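The TMB definition above is a simple ratio, sketched below for illustration. Two points are assumptions rather than statements from the study: the result is expressed per megabase (the conventional unit), and the panel length is passed in as a parameter because its value is given only in Supplementary Table S2. The variant dictionaries and their `kind` labels are hypothetical.

```python
# Variant classes counted toward TMB, following the definition in the text.
COUNTED_KINDS = {"nonsynonymous_snv", "indel"}


def tumor_mutational_burden(variants, panel_exome_bp, min_vaf=0.02):
    """Count qualifying variants and normalize by panel exome length.

    variants: iterable of dicts with hypothetical keys "kind" and "vaf".
    panel_exome_bp: exome length of the panel in base pairs (assumed input).
    Returns mutations per megabase (unit is an assumption).
    """
    if panel_exome_bp <= 0:
        raise ValueError("panel_exome_bp must be positive")
    counted = sum(
        1
        for v in variants
        if v["kind"] in COUNTED_KINDS and v["vaf"] > min_vaf
    )
    return counted / (panel_exome_bp / 1_000_000)
```

Note that the TMB threshold is strict (VAF > 2%), whereas the variant-calling criterion is inclusive (VAF ≥ 2%); the sketch follows the wording of each passage.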

Statistical analysis
Statistical analysis was performed, and figures were plotted, in GraphPad Prism 5.0 software (GraphPad Software, Inc, La Jolla, CA, USA). Student's t-test was performed when 2 groups were compared, and analysis of variance and post hoc tests were performed when 3 or more groups were compared. The chi-square test and Fisher's exact test were performed when rates or percentages were compared for significance. Figures for the mutation spectrum were produced with R software (https://www.r-project.org/). Data for pathway enrichment analysis were analyzed with the method described by DAVID Bioinformatics Resources 6.8 (https://david.ncifcrf.gov/) and were visualized with corresponding packages for R software. The protein-protein interaction network was analyzed with the STRING database, and the hub genes were determined with Cytoscape software (cytoscape.org); the Degree method was used to rank the genes. The odds ratio (OR) was calculated on the basis of the frequency of a certain germline mutation from the Genome Aggregation Database (gnomAD) in the general population and the corresponding mutation frequency obtained from this study. The OR and 95% confidence interval (CI) for each germline mutation were calculated in SPSS 17.0 software (IBM China Company Limited, Beijing, China). *P < 0.05; **P < 0.01; and ***P < 0.001.
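For readers unfamiliar with the OR calculation, the following sketch shows one common way to obtain an OR and 95% CI from carrier counts in a patient cohort and a reference population. It is not the SPSS procedure used in the study: the 2 × 2 table layout, the Woolf (log-scale) interval, and the 0.5 continuity correction for empty cells are all assumptions made for illustration, and the function name is hypothetical.

```python
import math


def odds_ratio_ci(case_carriers, case_total, ref_carriers, ref_total, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval.

    a, b = carriers / non-carriers among cases
    c, d = carriers / non-carriers in the reference population
    z = 1.96 gives an approximate 95% interval.
    """
    a = case_carriers
    b = case_total - case_carriers
    c = ref_carriers
    d = ref_total - ref_carriers
    if min(a, b, c, d) < 0:
        raise ValueError("carrier counts cannot exceed totals or be negative")
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction: add 0.5 to every cell (an assumption).
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

When a mutation is very rare in the reference population, the interval becomes wide, which is one reason a pooled OR across all P or LP mutations can be more stable than per-variant estimates.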

The panorama of germline mutations in Chinese patients with CRC
First, we investigated the genetic landscape of germline alterations in all 1,923 recruited patients with CRC, among whom we identified 92 P germline mutations in 85 patients (Figure 1A) and 81 LP germline mutations in 62 patients (Figure 1A). The remaining 1,776 patients carried VUS, benign, or likely benign germline alterations (non-P). The proportion of patients with P or LP germline mutations was 7.6% (147/1,923). The highest number of P mutations was seen in APC and MSH2 (n = 14),  , and RAD50 (n = 7). MLH1 and MSH2 exhibited the highest number of LP mutations (n = 10), followed by MSH6 (n = 7), NTRK1 (n = 7), and ATM (n = 6). Further analysis indicated that 27 of 92 P mutations were detected in patients who had been diagnosed with Lynch syndrome (Figure 1B, left panel). MSH2 was the gene with the most mutations in Lynch syndrome (n = 14), followed by MLH1 (n = 7), MSH6 (n = 4), and PMS2 (n = 2) (Figure 1B, middle panel). For patients without Lynch syndrome, APC was the gene with the most mutations (n = 14), followed by BRCA1 (n = 8), RAD50 (n = 7), MUTYH (n = 5), ATM (n = 5), and BRCA2 (n = 4) (Figure 1B, right panel). Interestingly, we observed a significantly higher ratio of patients with MSI-H or dMMR in the P or LP group than the non-P group (Table 1). We also identified a significantly higher ratio of patients with family history in the P and LP groups than the non-P group. Patients with P or LP mutations were significantly younger than those in the non-P group (Table 1). A significant difference in stage distribution was observed between the LP and the non-P group, possibly because of the low number of patients in the LP group in stages I and III. We observed no significant differences in P and LP germline mutations between males and females (Table 1). Next, we identified the specific types of mutations related to the P and LP alterations. Most mutations were frameshift (deletion and insertion), nonsense, nonsynonymous (single nucleotide), or splicing mutations (Figure 2A). These mutations may cause large fragment changes or key amino acid alterations in proteins and therefore substantially influence gene function and potentially lead to high susceptibility to CRC. Mutations in APC, MSH2, and MLH1, the 3 genes with the highest number of P and LP mutations, are associated with familial adenomatous polyposis (APC) and Lynch syndrome (MSH2 and MLH1).
The distribution of germline mutations in the highly mutated genes is shown in Figure 2B. Both P (red) and LP (blue) mutations of APC, ATM, MLH1, MSH2, MSH6, and PMS2 are plotted on individual gene schemes. Most germline mutations were located in key functional domains (blue bars). This effect was most prominent for APC, in which several mutations were distributed in the suppressor APC, APC_u9, and PTZ00449 superfamily domains. This observation suggested that P/LP germline mutations within key functional domains are more likely to be pathogenic than other mutations.
We identified several novel germline mutations that had not previously been reported in the dbSNP, gnomAD, or ClinVar databases (Table 2). These mutations included frameshift, nonsense, and splicing mutations potentially causing large fragment alterations in genes. All were classified as LP mutations, owing to their deleterious properties and undetermined clinical significance. Interestingly, patients with germline mutations in mismatch repair-related genes (MSH2 and MSH6) or in NTRK1 exhibited very high levels of somatic TMB and a high ratio of MSI-H, thus suggesting that these mutations might behave in the same manner as known P mutations, although further clinical evidence is needed to validate this hypothesis.

Correlations among characteristic somatic mutational landscapes, functional alterations, and germline mutations in CRC
The somatic mutational features of CRC with germline mutations, and how this condition relates to sporadic CRC, remain to be investigated in detail. Here we studied the somatic mutational features of CRC with or without P/LP germline mutations (Supplementary Figure S1A), focusing specifically on the differences among the P, LP, and non-P groups in terms of individual gene mutational frequency ( Figure 3A-D), TMB ( Figure 3E), and mutations significantly affecting pathways or functions (Figure 4).
We identified substantial differences in the SNV/indel mutational frequency of highly mutated genes (Figure 3A). For many genes, including TP53, SYNE1, and KMT2D, a significantly higher mutational frequency was identified in the P group than the non-P group. Similarly, a higher mutational frequency was found in the LP group than the non-P group in several genes, including ZFHX3 and KMT2D. Interestingly, the mutational frequency of APC and KRAS did not differ among the 3 groups. In contrast, most CNV alterations did not differ significantly across the 3 groups, except for NCOA3 (P < 0.05), although we did observe a trend toward higher CNV alterations in the non-P group (Figure 3B). The overall CNV rate of the P group was significantly lower than that of the non-P group (P < 0.001).
Next, we investigated the difference between the Lynch syndrome and non-Lynch syndrome groups with P mutations (Supplementary Figure S1B). Patients with Lynch syndrome exhibited a significantly higher mutational frequency than those who did not have Lynch syndrome ( Figure 3C); this was the case for most genes, except APC, TP53, and PIK3CA, whose mutational frequency did not significantly differ. In contrast, patients without Lynch syndrome exhibited a trend toward a higher frequency of CNV alterations than those with Lynch syndrome, although this association was not significant ( Figure 3D). Next, we examined and compared the TMB for the P (including both patients with and without Lynch syndrome), LP, and non-P groups. Patients with Lynch syndrome and P mutations exhibited a much higher TMB than patients without Lynch syndrome with P mutations, and patients from the LP and non-P groups ( Figure 3E).
To further investigate the similarities and differences in somatic mutations among the P, LP, and non-P groups, and to study the mechanistic discrepancies between patients with CRC with and without Lynch syndrome, we performed gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) clustering analysis and compared the results from each group. Figure 4A shows the most significant clustering in the GO (upper row) and KEGG (lower row) analysis for the P, LP, and non-P groups. Some common biological processes, functions, and pathways were observed among the groups, together with several substantial differences. The common clustering for GO and KEGG findings across the 3 groups is summarized in Supplementary Table S3. Similarities and differences were also compared between the Lynch syndrome and non-Lynch syndrome groups with regard to P germline mutations. Figure 4B shows the most significant clustering in GO (upper row) and KEGG (lower row) analysis for the Lynch syndrome and non-Lynch syndrome groups. Common clustering is shown in Supplementary Table S6. The most common pathways were the Wnt signaling pathway, the calcium signaling pathway, and human papillomavirus infection. Differences in the biological processes in terms of GO clustering are listed in Supplementary Table S7; interestingly, a large amount of Lynch-unique clustering was observed. Differences in KEGG clustering are shown in Supplementary Table S8. Notably, the Notch signaling pathway was clustered in the Lynch syndrome group but not the non-Lynch syndrome group, whereas the MAPK signaling pathway and cAMP signaling pathway were clustered in the non-Lynch syndrome group but not the Lynch syndrome group. Information related to the genes enriched in each GO and KEGG category in Figure 4 is provided in Supplementary Table S9 (GO enrichment) and Supplementary Table S10 (KEGG enrichment).
Next, we used the STRING database to analyze the protein interaction network for each subgroup. The top 20 genes in terms of protein interaction are listed in Supplementary Table  S11. Each subgroup was compared with the P group, and the same genes are labeled with identical colors. In all groups, TP53 was the most common interacting gene. However, EGFR and SRC genes were found in the LP, non-P, and non-Lynch syndrome groups, but not in the P group, thus suggesting substantial differences in the protein interaction network. NOTCH1 was found only in the P and P-Lynch syndrome groups but not in the other groups, thus verifying the results of the pathway enrichment analysis. These findings strongly suggest that the mechanism of carcinogenesis in patients with P germline mutations is distinct from that in patients with no P germline mutations.
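Ranking genes by degree, as in the top-20 lists above, means ordering them by their number of distinct interaction partners in the network. The sketch below illustrates that idea on a plain edge list; it is not the Cytoscape implementation used in the study, the function name is hypothetical, and the alphabetical tie-break is an assumption.

```python
from collections import defaultdict


def rank_by_degree(edges, top_n=20):
    """Rank nodes by number of distinct interaction partners.

    edges: iterable of (gene_a, gene_b) pairs from an undirected network.
    Self-loops are ignored and duplicate edges are counted once.
    Ties are broken alphabetically (an assumption for reproducibility).
    """
    neighbors = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a == gene_b:
            continue
        neighbors[gene_a].add(gene_b)
        neighbors[gene_b].add(gene_a)
    ranked = sorted(neighbors.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(gene, len(partners)) for gene, partners in ranked[:top_n]]
```

Applying the same ranking to each subgroup's network and comparing the resulting lists corresponds to the comparison against the P group described above.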

Germline mutations increase the risk of CRC in the Chinese population
P or LP germline mutations may increase cancer susceptibility and risk. To quantify the risk of CRC in individuals carrying P or LP germline mutations, we calculated the ORs for individual germline mutations and all mutations as a whole. The prevalence of all germline mutations in the general population was determined by gnomAD screening. By comparing the prevalence in the general population and the mutation frequency identified in this study, we calculated the OR for each mutation site, or all mutations as a whole, as an indicator of CRC risk. Table 3 shows the detailed demographic information, gene names, variation sites, allele counts, allele frequencies in the general population, and ORs for each P germline mutation detected in this study. The overall OR for all P mutations was 11.13 (95% CI: 8.289-15.44). Similarly, Table 4 shows demographic and mutational information, along with the calculated OR of all LP mutations, with an overall OR of 20.68 (95% CI: 12.89-33.18). These results indicated strong enrichment in P or LP mutations in the studied population of patients with CRC, thus indicating a significantly higher risk of CRC in patients carrying these germline mutations.
Some patients with CRC recruited for this study lacked prognostic data. Consequently, we were unable to perform prognostic analysis. However, prognostic data were successfully obtained from a previous report 11 ; the patient prognosis was then compared between those with and without germline mutations. As shown in Supplementary Figure S2, patients with germline mutations exhibited significantly poorer overall survival than those without germline mutations (P = 0.0087). The median survival time for the germline group was 1,323 days, whereas the median survival for the non-germline group had not been reached.

Discussion
Previous research has identified correlations between P germline mutations and hereditary CRC, including MLH1/ MSH2/MSH6/PMS2 mutations with Lynch syndrome (also known as hereditary non-polyposis CRC), APC mutations with FAP, MUTYH mutations with MUTYH-associated polyposis, STK11 mutations with Peutz-Jeghers syndrome, SMAD4/BMPR1A mutations with juvenile polyposis syndrome, PTEN mutations with PTEN hamartoma tumor syndrome, and RNF43 mutations with serrated polyposis syndrome [4][5][6][7] . Although the relationships among these diseases and mutations are known, the frequency, location, and distribution of germline mutations in the Chinese population, and their quantitative relationships with CRC risk have yet to be elucidated. The distribution of rare germline mutations and their roles in the pathogenesis of CRC are also worthy of exploration. In addition, no systematic studies have investigated the similarities and differences in the somatic mutational landscape between patients with and without P/LP germline mutations. In this study, we recruited a large cohort of 1,923 cases and systematically investigated germline mutations and corresponding somatic mutational alterations in a Chinese population.
As expected, a significantly higher proportion of patients with P or LP mutations had a family history of CRC than did non-P patients, thus suggesting that these germline mutations increased the risk of CRC in affected families. Because of the high proportion of affected MMR genes in P and LP mutations, the proportion of patients with dMMR and MSI-H was significantly higher in these groups; therefore, these patients may respond well to immunotherapy. Our results also confirmed the early onset of CRC in patients with P or LP mutations, thereby indicating a similar trend to those of FAP and Lynch syndrome. Although some novel mutations were not determined to be pathogenic, their   7,12,13 . However, because of a lack of sufficient evidence for LP germline mutations, many mutations in MLH1, MSH2, MSH6, and PMS2 could not be confirmed as Lynch syndrome mutations. Therefore, the incidence of Lynch syndrome might have been underestimated, and the actual incidence could have exceeded 2%, as described in previous reports 7,12,13 . The APC gene had the highest number of P germline mutations, thus indicating that FAP is the most common form of hereditary CRC in the Chinese population, followed by Lynch syndrome. In addition, ATM gene germline mutations have been detected in other malignant tumors 14 . Because ATM is an important candidate member of the DNA damage and repair (DDR) pathway, germline mutations may directly lead to abnormal DNA repair. The present evidence suggests that ATM germline mutations are not cancer type-specific, because they have been reported in many cancers and have been suggested to potentially increase the risk of some cancers 14 10 . Because no hotspot mutations have been reported in BRCA1/2 in the Chinese population, many mutations were categorized as LP or VUS. Additional clinical evidence is necessary to confirm their pathogenicity in cancer.
We compared the ratio and distribution of germline mutations between Chinese and Western populations by using the data from the present study and data reported by Hahnen et al. 11 in 2017. We found that PALB2 was ranked as the top P mutation in the Western population but had a much lower ranking in the Chinese population (Supplementary Figure  S3A). In contrast, APC was ranked as the top P mutation in the Chinese population but was not detected in the Western population. Moreover, ATR was ranked as the top LP mutation in the Western population but was not detected in the Chinese population. Differences between these populations were also reflected in the proportion of patients with Lynch syndrome. The proportion of patients with Lynch syndrome with P mutations in the Chinese population was 29.3% (27/92), compared with a ratio of 15.0% in the Western population (3/20) (Supplementary Figure S3B). These comparisons indicate a potential differential germline mutational landscape in CRC.
Frameshift and nonsense mutations were the 2 most common types of mutations detected in the study, followed by missense and splicing mutations. Frameshift and nonsense mutations lead to the partial or complete loss of function of corresponding proteins, thus increasing the risk of cancer in mutation carriers. Missense mutations in key amino acids can also induce substantial changes in protein function, whereas splicing mutations can influence transcription and subsequent translation. We found that most mutations in highly mutated genes were located in known functional domains, thus reflecting the roles of these domains in maintaining normal protein function. Indeed, because all mutations identified in this study were heterozygous, a partial loss of function might be compensated for by the other normal allele. These heterozygous mutations might not be lethal but could increase the risk of cellular aberrant transformation and carcinogenesis.
In this study, we conducted the first comparative study of somatic mutational landscapes on the basis of the pathogenicity classification of germline mutations. We found that the mutational frequency of most of the highly mutated genes in the P group was higher than that in the non-P group; the LP group also showed a similar trend toward a higher mutational frequency, possibly because the mutations in the P group affected the MMR, DDR, and homologous recombination deficiency pathways, thus leading to abnormal DNA repair and a large number of somatic mutations 16 . The patients with and without Lynch syndrome in the P group showed a similar trend, and the mutational frequency in patients with Lynch syndrome was much higher than that in patients without Lynch syndrome. This finding was also confirmed by TMB statistics: the TMB of patients with Lynch syndrome was significantly higher than that of the other 3 groups. TMB has been suggested to be an effective indicator for patient prognosis stratification in immunotherapy 17 . Our data provided strong evidence supporting the use of immunotherapy in patients with Lynch syndrome. Interestingly, we observed no difference in the frequency of APC and KRAS mutations across the 3 groups, thus suggesting that major driver gene mutations may be common driving factors for CRC, regardless of P germline mutations. In addition, our data showed that the CNV variation in the non-P group was higher than that in the P group, and that CNV variation in the patients without Lynch syndrome was also higher than that in patients with Lynch syndrome, thus indicating a seesaw effect. That is, a higher proportion of SNV/indel mutations corresponded to a lower proportion of CNV alterations, whereas a lower proportion of SNV/indel mutations corresponded to a higher proportion of CNV alterations.
This observation suggests that CRC is a highly heterogeneous cancer in which pathogenesis is diverse and depends on different types of genetic alterations. The co-existence and balance of mutations and CNVs may be related to both genetic and environmental backgrounds. Similar observations of the seesaw effect have also been reported in other studies 10,18,19 .
Our detailed clustering analysis led to interesting discoveries. To our knowledge, this is the first report that, among patients with P germline mutations, the Notch pathway is clustered only in those with Lynch syndrome and not in those without Lynch syndrome. Furthermore, we observed that the MAPK and cAMP signaling pathways were clustered in patients without Lynch syndrome but not in patients with Lynch syndrome. In contrast, the Wnt and calcium signaling pathways, along with the human papillomavirus infection pathway, were clustered in both groups. This finding suggests that the Notch pathway is specific to patients with Lynch syndrome, whereas the MAPK and cAMP signaling pathways are specific to patients without Lynch syndrome.
The Wnt and calcium signaling pathways, along with human papillomavirus infection, may be common pathogenic factors for CRC, regardless of germline mutations. The Notch pathway plays an important role in embryonic development, cell proliferation, and differentiation. Furthermore, the role of the Notch pathway has been investigated for many different types of tumors 20 , including CRC 21 . However, the role of the Notch pathway in Lynch syndrome has not been studied previously. Our identification of Lynch-specific Notch pathway clustering suggests the existence of distinct pathogenic mechanisms in patients with CRC with and without Lynch syndrome; therefore, our research provides key information that may facilitate molecular typing.
In this study, we report the first quantification of the risk of CRC associated with P and LP germline mutations. We also calculated the overall OR for the P and LP groups. The frequency of mutations identified by gnomAD screening represents the frequency of a certain alteration in the general population. Because most P or LP germline mutations exhibited very low incidence, the frequency in the general population, and in patients with cancer, may exhibit a certain degree of randomness and may not accurately represent the true frequency. Thus, the overall OR for the P or LP group as a whole may have greater relevance and significance for the population. For some relatively common germline mutations, such as those from APC and the 4 MMR genes, the risk associated with individual genes can be calculated; for the less frequent gene mutations, larger population studies and familial evidence are urgently needed. In this study, the overall OR of both the P and LP groups exceeded 10, thus suggesting that patients with such germline mutations had a significantly greater risk of CRC than the average-risk population. Previous studies of other cancers also support this method for evaluating the risk of germline mutations from population data 10,22,23 . From the perspective of treatment, personalized therapeutic strategies should be given to patients with such mutations, and more frequent and detailed examinations should be performed on their unaffected family members carrying these mutations. This practice would enable detection of tumors as early as possible and support early intervention.

Conclusions
In this study, we fully characterized germline and somatic mutations in Chinese patients with CRC. We found that 7.6% of our study cohort carried germline variants linked to greater susceptibility to CRC. Patients with P or LP mutations had a higher proportion of MSI-H, dMMR, and family history of CRC, and were significantly younger. The somatic mutations in Chinese patients with CRC were found to exhibit distinct features. The Notch signaling pathway was uniquely clustered in patients with Lynch syndrome, whereas the MAPK and cAMP signaling pathways were uniquely clustered in patients with CRC who did not have Lynch syndrome. Our findings provide important information for potential molecular typing and therapy for patients with CRC with germline mutations.