A nomogram-based immune-serum scoring system predicts overall survival in patients with lung adenocarcinoma

Objective: The immunoscore, which is used to quantify immune infiltrates, has greater relative prognostic value than tumor, node, and metastasis (TNM) stage and might serve as a new system for classification of colorectal cancer. However, a comparable immunoscore for predicting lung adenocarcinoma (LUAD) prognosis is currently lacking. Methods: We analyzed the expression of 18 immune features by immunohistochemistry in 171 specimens. The relationship of immune marker expression and clinicopathologic factors to overall survival (OS) was analyzed with the Kaplan-Meier method. A nomogram was developed by using the optimal features selected by least absolute shrinkage and selection operator (LASSO) regression in the training cohort (n = 111) and evaluated in the validation cohort (n = 60). Results: The indicators integrated in the nomogram were TNM stage, neuron-specific enolase, carcinoembryonic antigen, CD8 in the center of the tumor (CT), CD8 in the invasive margin (IM), FoxP3 CT, and CD45RO CT. The calibration curve showed good agreement between the observed 2- and 5-year OS and that predicted by the nomogram. To simplify the nomogram, we developed a new immune-serum scoring system (I-SSS) based on the points awarded for each factor in the nomogram. Our I-SSS was able to stratify same-stage patients into different risk subgroups. The combination of I-SSS and TNM stage had better prognostic value than the TNM stage alone. Conclusions: Our new I-SSS can accurately and individually predict LUAD prognosis and may be used to supplement prognostication based on the TNM stage.


Introduction
Lung cancer is the most lethal malignancy worldwide. It has become a major threat to public health 1 . Non-small cell lung carcinoma (NSCLC) accounts for 85% of all primary lung cancer cases, and the most commonly diagnosed pathological type is lung adenocarcinoma (LUAD) 2 . For patients with early-stage LUAD, radical surgical resection remains the preferred treatment. For LUAD, as for other types of solid tumors, prognosis is mainly based on tumor, node, and metastasis (TNM) clinical staging after surgery. However, the prognosis varies widely among patients with the same clinical stage. Traditional TNM staging provides limited prognostic information 3 . Therefore, there is an urgent need for a method to improve prediction of patient prognosis and identify high-risk patients among those in similar disease stages.
Accumulating evidence indicates that cancer development is influenced by the host immune system 4,5 . Moreover, an innovative definition of TNM staging, in which T stands for T cells, and M stands for memory cells, has been described 6 . The immunoscore, first proposed by Galon et al. and mainly based on the density of immune cell infiltration in the center of the tumor (CT) and the invasive margin (IM), has been reported to have prognostic value that may supplement TNM classification in colorectal cancer 7,8 . Thus, the incorporation of immune cells into the new staging system is crucial 1,9 . Unfortunately, an intuitive and effective staging system for predicting LUAD prognosis remains to be developed.
Least absolute shrinkage and selection operator (LASSO) regression has been extended and broadly applied to survival analysis of high-dimensional data 10 . Nomograms are often used as a precise medical tool to predict personalized prognosis by integrating and illustrating statistically significant features 11,12 . Nomograms graphically demonstrate the predicted contributions of different risk factors, and therefore are intuitive and convenient 3,13,14 . In this study, we aimed to screen for key immune indicators and clinicopathological features affecting LUAD prognosis by using LASSO regression in a training cohort. We then used these variables to construct a nomogram that we validated in a validation cohort. Our nomogram-based immune-serum scoring system (I-SSS) was able to further classify patients in the same clinical stage into different risk-based subgroups, thus supplementing prognostication with TNM staging and guiding individualized treatment in clinical settings.

Patients and tissue specimens
In this retrospective study, we selected 171 patients with LUAD who underwent radical surgery at Tianjin Medical University Cancer Institute and Hospital (TMUCIH) between September 2012 and March 2013. This study was approved by TMUCIH. Informed consent was obtained from all individual participants included in the study. The patients did not receive any adjuvant treatment or oncogene screening before surgery. Hematoxylin and eosin-stained tissue sections from all patients were reviewed by 2 pathologists, who then selected the most appropriate tissue sections including the CT and IM regions. Tumor staging was determined according to the American Joint Committee on Cancer (8th edition) criteria. Clinical information for all patients was obtained through review of archived data, and follow-up information was obtained through medical records and telephone interviews. The median follow-up time for the survivors was 68 months (range, 1-72 months). The endpoint of the study was overall survival (OS).

Immunohistochemistry (IHC)
We selected 9 prognostic immune biomarkers for IHC staining, on the basis of previous research results, including those from pan T cells (CD3), cytotoxic T cells (CD8), B cells (CD20), memory T cells (CD45RO), naive T cells (CD45RA), natural killer cells (CD57), neutrophils (CD66b), macrophages (CD68), and regulatory T cells (FoxP3) [15][16][17][18][19][20][21][22] . IHC for these markers was performed with standard procedures 23,24 . Briefly, 3-4 μm tissue sections were dewaxed in xylene and hydrated through a graded ethanol series. Antigen retrieval was performed at 100 °C in citrate buffer (pH 6.0) for 3 min in all cases, except during CD45RA IHC, for which antigen retrieval was performed at 100 °C in TRIS-EDTA (pH 9.0). After the peroxidase was inactivated with hydrogen peroxide for 20 min, all slides were incubated with the primary antibody overnight at 4 °C. The sections were then successively stained with a broad-spectrum secondary antibody for 1 h at room temperature, treated with 3,3′-diaminobenzidine, and finally counterstained with hematoxylin. Detailed antibody information and staining concentrations are shown in Supplementary Table S1.

Selection of cutoff scores
Two senior pathologists, blinded to clinical information and outcome, independently scored all stained sections. Under a light microscope (model BX51; Olympus), the staining was first evaluated according to overall impression at low magnification (×100), and the 5 most representative areas in the CT and IM regions were selected. Next, the densities of the positive cells were scored at high magnification (×200). The stained cells in each area were quantified and expressed as the number of cells per field. The count of each immune marker was the average of the counts in the 5 areas. The scoring concordance was approximately 87% between the pathologists. In cases of disagreement, the slides were reviewed collaboratively, and a consensus was reached by the 2 pathologists. For subsequent statistical analyses, each biomarker was recorded as a dichotomous (high vs. low) variable according to the optimal cutoff value. For the 18 markers across the two tumor regions (CT and IM), a corresponding statistically significant correlation was found between the density of immune cells and patient outcomes at a wide range of cutoff values. The optimum cutoff score of the density that produced the "minimum P-value" provided the best OS-related stratification 13,25 . The detailed cutoffs and P-values are provided in Table 1.
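The "minimum P-value" selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`logrank_p`, `min_p_cutoff`), the `min_group` safeguard, and the toy data are hypothetical, and the two-group log-rank test is written out by hand so that the example has no dependencies.

```python
import math

def logrank_p(times, events, groups):
    """Two-group log-rank test; returns the P-value (chi-square, 1 df)."""
    obs_minus_exp = 0.0
    variance = 0.0
    # Walk through each distinct time at which a death was observed.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)
        n1 = sum(at_risk)  # patients at risk in the "high" group
        d = sum(1 for ti, e in zip(times, events) if e and ti == t)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if e and ti == t and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def min_p_cutoff(density, times, events, min_group=2):
    """Try every observed density as a cutoff; keep the smallest log-rank P."""
    best = None
    for c in sorted(set(density)):
        groups = [1 if x >= c else 0 for x in density]  # 1 = high density
        n_high = sum(groups)
        if min(n_high, len(groups) - n_high) < min_group:
            continue  # hypothetical safeguard against tiny groups
        p = logrank_p(times, events, groups)
        if best is None or p < best[1]:
            best = (c, p)
    return best

# Toy data (hypothetical): cells per field, months of follow-up, death flag.
toy_density = [2, 3, 4, 5, 20, 22, 25, 30]
toy_times = [5, 8, 10, 12, 40, 50, 60, 66]
toy_events = [1, 1, 1, 1, 0, 1, 0, 0]

cutoff, p_value = min_p_cutoff(toy_density, toy_times, toy_events)
print(cutoff, round(p_value, 4))
```

Each marker would then be dichotomized as high (density at or above the chosen cutoff) or low for the downstream analyses.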

Statistical analysis
All statistical analyses were performed in SPSS 22.0 (IBM, Chicago, IL, USA). Kaplan-Meier survival analysis and log-rank tests were used to determine the potential correlations of OS with the immune biomarkers and various clinicopathological parameters. Heatmaps and correlation matrices were created with the "pheatmap" package in R (R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org, 2018). All statistical tests were two-sided. P < 0.05 indicated statistical significance.

Feature selection with LASSO
The 171 patients were divided into training (n = 111) and validation (n = 60) cohorts in a 65%:35% proportion. LASSO regression was used to further identify predictive features after screening for prognosis-related clinicopathologic characteristics and immune indicators (P < 0.05) with the Kaplan-Meier method in the training cohort. LASSO uses both variable selection and regularization to select a subset of variables that minimizes the prediction error of the outcome 26 . Ten-fold cross-validation was performed to assess model classification performance. Feature selection was performed with the "glmnet" package in R.
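To illustrate how LASSO combines shrinkage with variable selection, the sketch below runs cyclic coordinate descent with a soft-thresholding update on a toy problem. It is only an illustration of the mechanism: it uses a squared-error loss rather than a survival model, it is not the "glmnet" routine used in the study, and the function names and data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: the fit with feature j left out.
            r = [y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / scale
    return b

# Toy data (hypothetical): the outcome depends on feature 0 only.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2, -2, 2, -2]

print(lasso_coordinate_descent(X, y, lam=0.5))  # [1.5, 0.0]
print(lasso_coordinate_descent(X, y, lam=3.0))  # [0.0, 0.0]
```

With a moderate penalty the informative coefficient is shrunk and the uninformative one is set exactly to zero; with a large penalty every coefficient is dropped. In practice the penalty would be chosen by cross-validation, as described above.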

Construction and calibration of the prognostic nomogram
On the basis of the aforementioned factors, a multivariate logistic regression model was used to develop a nomogram for predicting LUAD prognosis. In the training cohort, the features obtained through LASSO dimension reduction were included for developing the prognostic nomogram with the R "survival" package. To evaluate the predictive performance of the nomogram, we used the concordance index (C-index), which ranges from 0.5 to 1, with higher values indicating more accurate predictive results. The calibration curves of the nomogram for 2- and 5-year OS were then generated to compare the predicted and observed survival in the validation cohort. Bootstraps with 1,000 resamples were used.
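As a rough illustration of what the C-index measures, the following sketch computes Harrell's concordance for a risk score such as the nomogram total points. The function name and toy data are hypothetical, and tied survival times are simply ignored; it is not the routine used in the study.

```python
def concordance_index(times, events, risk):
    """Fraction of comparable pairs in which the higher-risk patient dies first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's
            # last observed time (death or censoring).
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable

# Toy data (hypothetical): months of follow-up, death flag, nomogram points.
toy_times = [5, 10, 20, 30]
toy_events = [1, 1, 0, 1]
toy_risk = [90, 60, 70, 10]

print(concordance_index(toy_times, toy_events, toy_risk))  # 0.8
```

Here 4 of the 5 comparable pairs are ordered correctly, so the C-index is 0.8; a score that ranks patients no better than chance gives about 0.5.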

Risk group stratification based on the nomogram
To further simplify the nomogram, we established a new I-SSS by assigning values to different factors according to the score in the nomogram. By dividing the patients into different risk groups according to the total risk score (from highest to lowest), we determined the cutoff values 3 . Receiver operating characteristic (ROC) analysis was performed to compare the accuracy of I-SSS and TNM stage in predicting prognosis.
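The ROC comparison can be summarized by the area under the curve (AUC). The sketch below computes the AUC directly from its rank interpretation. It assumes a binary outcome (death by a fixed follow-up time), which the text does not specify; the function name and data are hypothetical.

```python
def auc(scores, died):
    """Probability that a patient who died has a higher score than a survivor."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    # Count pairwise wins; tied scores count as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): total risk score and death flag per patient.
toy_scores = [120, 80, 60, 40, 30, 10]
toy_died = [1, 1, 0, 1, 0, 0]

print(round(auc(toy_scores, toy_died), 3))  # 0.889
```

Computing this quantity once with the I-SSS-based score and once with TNM stage alone, on the same patients, gives the kind of AUC comparison described above.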

Expression and correlation analysis of 18 immune cell markers in the CT and IM regions
We examined the expression of 18 immune cell markers in the CT and IM regions of 171 LUAD specimens with IHC. The staining sites for all indicators were the cell membrane or nucleus (Figure 1). Supplementary Figure S1 shows an example CD3 IHC stained slide with the areas selected for quantification annotated. The heat map (Figure 2A) showed distinct immune cell expression profiles in the CT and IM regions of the patients. The densities of CD20 CT , CD20 IM , CD3 CT , CD3 IM , CD45RO CT , CD45RO IM , CD45RA CT , and CD8 IM were generally high. Meanwhile, the densities of CD8 CT , CD45RA IM , CD68 CT , CD68 IM , CD66b CT , CD66b IM , CD57 CT , CD57 IM , FoxP3 CT , and FoxP3 IM were relatively low (Figure 2A). In addition, the correlation of the various immune cell markers in LUAD varied from weak to moderate (Figure 2B).

Prognostic effects of immune cell expression on survival
The cutoff values for different immune markers in the CT and IM regions were obtained with the "minimum P-value" method 25 . In Kaplan-Meier analysis, the patients with high densities of CD8 CT (P = 0.001), CD8 IM (P < 0.001), CD45RO CT (P = 0.014), and CD45RO IM (P = 0.002) had significantly better OS than those with low densities of CD8 CT , CD8 IM , CD45RO CT , and CD45RO IM (Table 2, Figure 3). Meanwhile, the patients with high densities of CD66b CT (P = 0.002) and FoxP3 CT (P < 0.001) had significantly poorer OS than those with low densities of CD66b CT and FoxP3 CT (Table 2, Figure 3). None of the other immune markers in either region had significant prognostic value (Table 2, Figure 3).

Feature selection with LASSO
To comprehensively evaluate the influence of the clinicopathological parameters and the immune markers on prognosis, we selected all significant factors (P < 0.05) in the Kaplan-Meier analysis in the training cohort, including 6 of the 17 clinicopathological characteristics and 6 of the 18 immune markers. Because the TNM stage contains tumor size, lymph node metastasis, and distant metastasis, we used only TNM stage in the LASSO screening 13 . The features obtained with LASSO screening included 3 clinicopathological characteristics (TNM stage and preoperative serum NSE and CEA levels) and 4 immune features (CD8 CT , CD8 IM , CD45RO CT, and FoxP3 CT ) (Figure 4A and 4B, Supplementary Table S2).

Nomogram development for OS and validation
Although more prognostic features were selected, the complex interrelationships between variables and the weighted contribution of each factor to tumor formation and development remained unclear. Therefore, a more comprehensive and intuitive model to predict OS was required. Nomograms developed by considering individualized calculations of outcomes on the basis of clinical and pathological features are usually used to predict prognosis. In the training cohort, 7 variables identified with LASSO regression were used to establish a nomogram for OS prediction (Figure 5A). The calibration curves showed good agreement between the observed and nomogram-predicted 2- and 5-year OS in the validation cohort (Figure 5B and 5C).

Performance of the immune-serum scoring system in stratifying patient risk
To enable more extensive and convenient clinical use, we developed a new I-SSS based on the points awarded for each factor in the nomogram. TNM stages I, II, III, and IV corresponded to 0, 33, 65, and 100 points, respectively. Serum NSE levels >15.2 μg/L corresponded to 34 points. Serum CEA levels >5 μg/L corresponded to 18 points. Low-density CD8 CT (<10%), CD8 IM (<12%), and CD45RO CT (<25%), and high-density FoxP3 CT (≥7%) corresponded to 5, 20, 8, and 26 points, respectively. We determined the cutoff value by grouping the patients evenly into 4 subgroups after sorting by total score (score: 0-25, 26-50, 51-75, and >75); each group represented a distinct prognosis (P < 0.001); the higher the I-SSS score, the poorer the prognosis (Figure 6A). The I-SSS performed better than TNM staging in revealing the differences in prognosis between groups 2 and 3 (Figure 6A and 6B). In patients with the same TNM stage, the independent discrimination ability of the I-SSS was further illustrated. After 50 was used as the cutoff value to group patients, stratification into different risk subgroups resulted in prominent differences in the Kaplan-Meier curves for OS within each TNM stage (Figure 6C-6F). Furthermore, the combination of I-SSS and TNM stage had a better prognostic value than the TNM stage alone when 60 and 120 were selected as the cutoff values (total score: 0-60, 61-120, >120). Each group had a distinct prognosis (P < 0.001) (Figure 7A), with a predictive accuracy of OS higher than that of TNM stage (AUC I-SSS and TNM stage = 0.861 vs. AUC TNM stage = 0.827) (Figure 7B).
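The point assignments above translate directly into a small calculator. The sketch below uses only the points, thresholds, and four-group cutoffs stated in this section. The function and argument names are hypothetical, and the example patient is invented for illustration.

```python
# Points per TNM stage, as listed in the text.
TNM_POINTS = {"I": 0, "II": 33, "III": 65, "IV": 100}

def i_sss(tnm_stage, nse, cea, cd8_ct, cd8_im, cd45ro_ct, foxp3_ct):
    """Total I-SSS points; serum levels in ug/L, densities in percent."""
    score = TNM_POINTS[tnm_stage]
    score += 34 if nse > 15.2 else 0      # elevated serum NSE
    score += 18 if cea > 5 else 0         # elevated serum CEA
    score += 5 if cd8_ct < 10 else 0      # low-density CD8 CT
    score += 20 if cd8_im < 12 else 0     # low-density CD8 IM
    score += 8 if cd45ro_ct < 25 else 0   # low-density CD45RO CT
    score += 26 if foxp3_ct >= 7 else 0   # high-density FoxP3 CT
    return score

def risk_group(score):
    """Map a total score to the four groups: 0-25, 26-50, 51-75, >75."""
    if score <= 25:
        return 1
    if score <= 50:
        return 2
    if score <= 75:
        return 3
    return 4

# Hypothetical patient: stage II, high NSE, normal CEA, low CD8 IM, high FoxP3 CT.
total = i_sss("II", nse=18.0, cea=3.2, cd8_ct=15, cd8_im=8, cd45ro_ct=30, foxp3_ct=9)
print(total, risk_group(total))  # 113 4
```

This example patient scores 33 + 34 + 20 + 26 = 113 points and falls into the highest-risk group, even though TNM stage alone contributes only 33 points.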

Discussion
In this article, we investigated the associations of the densities and locations of 18 different immune markers with patient survival in LUAD. We also analyzed the effects of various clinicopathological parameters, including tumor markers, on prognosis. In the training cohort, we used LASSO regression to further screen prognostic factors from the results of the Kaplan-Meier analysis, to avoid the problems of multi-collinearity and over-fitting in multiple regression models. Thereafter, we constructed a nomogram by integrating critical prognostic factors for survival. Notably, nomograms are accepted tools for quantifying risk factors, as extensively reported for different cancers. Liang et al. 3 have built a prognostic nomogram based on general clinical parameters in NSCLC. In addition, Wang et al. 27 have created a nomogram integrating clinicopathologic features and serum tumor marker levels in NSCLC. In the current study, beyond basic demographics, the candidate variables for model building included clinicopathologic characteristics, preoperative serum tumor marker levels, and infiltrating immune cells.
Our nomogram ultimately included 3 clinicopathological characteristics (TNM stage, and preoperative serum NSE and CEA levels) and 4 immune features (CD8 CT , CD8 IM , FoxP3 CT , and CD45RO CT ) that affected LUAD prognosis. Among preoperative serum tumor markers in LUAD, more attention has been focused on CEA, whereas less attention has been focused on NSE. In the present study, in addition to CEA, high levels of NSE were associated with poor prognosis. As described by Li et al. 28 , NSE is a key enzyme in glycolysis that expedites cancer cell replication. Our results indicated that more attention should be paid to NSE in future clinical studies. CD8 is an important part of the immune microenvironment and plays a crucial role in the anti-tumor immune response 29 . We observed that high CD8 + T-cell infiltration in both CT and IM regions was associated with favorable prognosis, similarly to findings from previous studies 21,30 . Donnem et al. 31 have reported that stromal CD8 + T-cell density is an independent prognostic factor for OS, and its prognostic effect increases in different stages in patients with stage I-IIIA NSCLC. FoxP3 is one of the most specific Treg markers, although its effect on prognosis remains ambiguous 23,32,33 . Growing evidence indicates that Tregs play a vital role in promoting cancer by inhibiting the anti-tumor effect of CD8 + T-cells and inhibiting host immunity against tumors 23 . Moreover, Wculek et al. 34 have clarified the key roles played by dendritic cells (DCs) in the initiation and regulation of innate and adaptive immune responses. The roles of Tregs in promoting tumors may be associated with the concomitant absence of DCs, thereby potentiating immunosuppression. In our study, increased FoxP3 CT expression was correlated with poor prognosis. This finding was consistent with those of Fu et al. 15 , who have shown that high-density FoxP3 infiltration in the tumor bed in breast cancer is associated with shorter OS. 
We did not identify a prognostic trend for FoxP3 IM , possibly because cancer cells in the CT produce chemokines, such as CCL22 and CCR4, thus resulting in lower DC infiltration and recruitment of more FoxP3, and consequently favoring tumor growth 35 . This discovery should be important for guiding immunotherapy for LUAD in the future. CD45RO exerts an antitumor effect, mainly by activating the host immune response 36,37 . In our study, high expression of CD45RO in patients with LUAD had a positive prognostic effect, regardless of whether it was in the CT or IM regions, in agreement with findings from other studies 38 . CD45RO CT rather than CD45RO IM was included in the nomogram after LASSO screening, probably because high expression of CD45RO CT had a more pronounced effect than CD45RO IM on prognosis. Recently, Gao et al. 39 have shown that high density of both CD68 CT and CD68 IM is associated with decreased survival, and have demonstrated that the macrophage immunoscore-based prognostic nomogram can effectively predict the prognosis of stage I NSCLC patients and enhance the predictive value of the TNM stage system. In contrast, we found that neither CD68 CT nor CD68 IM was associated with prognosis. Further research on the prognostic value of macrophages in lung adenocarcinoma may be warranted. Our results suggested that CD8 CT , CD8 IM , FoxP3 CT , and CD45RO CT might be good candidate immunological markers for establishing a LUAD TNM-immune staging similar to that used in colorectal cancer 1,17,33,40 . Most importantly, we established a new I-SSS based on the scores for each factor in the nomogram. By stratifying the patients into 4 risk groups according to the cutoff values, we separated the 171 patients into groups with distinct survival outcomes. Our I-SSS was able to better distinguish the differences in prognosis between group 2 and 3 patients than TNM stage. However, some overlaps in survival curves were observed in TNM stage II and III patients.
Furthermore, although patients with the same TNM stage could be stratified into different risk groups with the I-SSS, we did not observe a statistically significant prognostic value in TNM stage IV patients. We believe that the limited sample size of the TNM stage IV subgroup was the main contributor to this lack of significance. Among patients with stage I, II, or III disease, those with low I-SSS had longer OS than those with high I-SSS. Therefore, patients with high I-SSS may need more aggressive treatment or intensive follow-up to improve prognosis. In addition, the combination of I-SSS and TNM stage had a better prognostic value than the TNM stage alone (AUC I-SSS and TNM stage = 0.861 vs. AUC TNM stage = 0.827), thus indicating that, beyond the TNM stage, the influence of immune cells and tumor markers on LUAD prognosis should not be ignored, and the I-SSS reinforces the prognostic ability of TNM stage [41][42][43] . These findings suggested that the I-SSS can be used to supplement the prognostic value of TNM staging 44 .
To our knowledge, this study is the first to comprehensively evaluate the effects of the immune microenvironment and clinicopathological features on prognosis, and to develop a nomogram for predicting the survival of patients with LUAD. By using this I-SSS, physicians could provide personalized survival prediction. Moreover, high-risk patients with poor prognosis could be identified and treated with more aggressive therapy or could be followed up more frequently. We note that the existence of anthracotic pigments in lung specimens may mask or confound positively stained cells. We chose to evaluate the IHC staining results by direct microscopic visualization, rather than by using pathological digital software, because microscopy can better distinguish between positive cells and anthracotic pigments 23 . However, this study has some limitations. Although 18 immune features associated with prognosis were selected according to literature reviews and clinical standards, not all features of the immune microenvironment were represented; for example, CD4 and CD56 were not included. Furthermore, this was not a multicenter study; samples from only one hospital were selected. Future studies should examine a larger sample size, specimens from multiple hospitals, and a greater number of immune indicators. A more comprehensive, multi-center, large-scale collaborative study is warranted for further exploration.