Drug repurposing for cancer treatment through global propagation with a greedy algorithm in a multilayer network

Objective: Drug repurposing, the application of existing therapeutics to new indications, holds promise in achieving rapid clinical effects at a much lower cost than that of de novo drug development. The aim of our study was to perform a more comprehensive drug repurposing prediction of diseases, particularly cancers. Methods: Here, by targeting 4,096 human diseases, including 384 cancers, we propose a greedy computational model based on a heterogeneous multilayer network for the repurposing of 1,419 existing drugs in DrugBank. We performed additional experimental validation for the dominant repurposed drugs in cancer. Results: The overall performance of the model was well supported by cross-validation and literature mining. Focusing on the top-ranked repurposed drugs in cancers, we verified, in drug sensitivity experiments, the anticancer effects of 5 repurposed drugs that are widely used clinically. Because of the distinctive antitumor effects of nifedipine (an antihypertensive agent) and nortriptyline (an antidepressant drug) in prostate cancer, we further explored their underlying mechanisms by using quantitative proteomics. Our analysis revealed that both nifedipine and nortriptyline affected the cancer-related pathways of DNA replication, the cell cycle, and RNA transport. Moreover, in vivo experiments demonstrated that nifedipine and nortriptyline significantly inhibited the growth of prostate tumors in a xenograft model. Conclusions: Our predicted results, which have been released in a public database named The Predictive Database for Drug Repurposing (PAD), provide an informative resource for discovering and ranking drugs that may potentially be repurposed for cancer treatment and for determining new therapeutic effects of existing drugs.


Determination of initial probability $p_0$
In the initial probability $p_0$, probability 1 was assigned to the seed nodes, and probability 0 was assigned to other vertices, forming the drug network $h_0$ and the protein network $v_0$. Given that we added disease nodes, the initial probability of disease network $u_0$ is a zero vector containing no seed nodes. Hence, the initial probability of the heterogeneous network can be represented as: are inter-transition matrices representing the probability of the transition from one disease/protein/drug to another disease/protein/drug node. is the transition matrix from the drug network to the protein network.
The transition probability from vertex disease i to protein j was defined as: The transition probability from vertex protein i to protein j was defined as: were set as zero matrices. However, in the second strategy, the transition probability from vertex drug i to disease j was defined as:
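The seeding scheme described above, combined with a generic random walk with restart, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the toy layer sizes, the ring-shaped adjacency, the column-normalized matrix `M`, the way the layer weight `a` is applied, the interpretation of `r` as the restart probability, and the update rule p(t+1) = (1 - r) M p(t) + r p0 are all assumed (the textbook form of the algorithm), and the transition matrices defined in this section are not reproduced.

```python
import numpy as np

def initial_probability(n_disease, n_protein, n_drug, protein_seeds, drug_seeds, a=0.6):
    """Build p0 = [u0, v0, h0]; `a` (drug-layer weight) is an assumed parameterization."""
    u0 = np.zeros(n_disease)   # disease layer: zero vector, no seed nodes
    v0 = np.zeros(n_protein)   # protein layer
    h0 = np.zeros(n_drug)      # drug layer
    v0[list(protein_seeds)] = 1.0   # probability 1 on seed nodes
    h0[list(drug_seeds)] = 1.0
    p0 = np.concatenate([u0, (1.0 - a) * v0, a * h0])
    total = p0.sum()
    if total == 0:
        raise ValueError("at least one seed node is required")
    return p0 / total          # normalize so the vector sums to 1 (assumption)

def random_walk_with_restart(M, p0, r=0.7, tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - r) M p + r p0 until the L1 change drops below tol."""
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - r) * (M @ p) + r * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy network: 2 diseases, 3 proteins, 2 drugs (7 nodes) joined in a ring.
n = 7
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
M = A / A.sum(axis=0, keepdims=True)   # column-stochastic transition matrix

p0 = initial_probability(2, 3, 2, protein_seeds=[0], drug_seeds=[1], a=0.6)
p = random_walk_with_restart(M, p0, r=0.7)
ranking = np.argsort(-p)               # nodes ordered by steady-state probability
```

In such a sketch, candidate nodes would be ranked by their steady-state probability; the disease entries of `p0` stay zero because the disease layer contains no seeds.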

The parameter optimization process
Random walk differs from many other machine learning algorithms in that it has no loss function to minimize during iteration. Consequently, its accuracy can be measured only after computing and ranking, by cross-validation. Thus, conventional parameter optimization cannot be applied directly. In this research, we tested as many parameter combinations as our computing power allowed and used AUC values to determine which parameter combination was optimal.
Here, the weight of the drug network, a, was preferentially given a higher proportion (more than 0.5), according to previous studies 1,2. The random walk model implemented for drug repurposing has been demonstrated to be robust to the selection of r; therefore, only 3 values between 0 and 1 (0.3, 0.5, and 0.7) were chosen to test whether this robustness also holds in our model. The results showed that our model was robust to the selection of r (AUC value difference ≤ 0.01). Therefore, we chose r = 0.7 because it had the best performance in both previous studies and our research.
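The selection procedure described above amounts to an exhaustive search over parameter combinations scored by AUC. A minimal sketch is given below; the `evaluate` callback (which would run the model with cross-validation and return an AUC), the candidate values for `a`, and the toy stand-in `toy_evaluate` are hypothetical and serve only to show the control flow. Only the three `r` values come from the text.

```python
from itertools import product

def auc(scores, labels):
    """Rank-based AUC: probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def grid_search(evaluate, a_values, r_values):
    """Score every (a, r) pair and return the best pair plus all results."""
    results = {(a, r): evaluate(a, r) for a, r in product(a_values, r_values)}
    best = max(results, key=results.get)
    return best, results

def toy_evaluate(a, r):
    # Hypothetical stand-in for "run the model with (a, r), cross-validate, return AUC".
    labels = [1, 1, 0, 0]
    scores = [a, r, 0.55, 0.4]
    return auc(scores, labels)

best, results = grid_search(
    toy_evaluate,
    a_values=[0.6, 0.7, 0.8, 0.9],   # assumed candidates, all above 0.5
    r_values=[0.3, 0.5, 0.7],        # the three values named in the text
)
```

Because there is no loss function to differentiate, each combination is evaluated independently, so the cost grows with the product of the candidate list lengths.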

Enrichment analysis for differentially expressed proteins
The log2-transformed value of each reporter ion intensity (corrected) was obtained. The SVA package was applied to remove batch effects (Supplementary Figure S7). Then the data were imported into Perseus v1.6.1.3 for statistical analysis. The processed intensities were normalized, and two-tailed t-tests were performed as described previously 3. Proteins meeting significance criteria were subjected to analysis with the Database for Annotation, Visualization and Integrated Discovery (DAVID 6.8) tools with the total human genome information as the background. On the basis of fold change, the proteins with significant differences were classified into 2 data sets: the upregulated data set (fold change >1.2) and the downregulated data set (fold change <0.83). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was used to investigate the molecular mechanisms. The adjusted P value (Benjamini-Hochberg correction) cutoff was 0.05.
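The fold-change classification and the Benjamini-Hochberg adjustment mentioned above can be illustrated with a short sketch. It is not the Perseus or DAVID implementation: the function names, the protein identifiers, and the example numbers are hypothetical, and fold change is assumed to be a linear (not log2) ratio, which is consistent with the 1.2 and 0.83 thresholds.

```python
def classify_by_fold_change(fold_changes, up=1.2, down=0.83):
    """Split proteins into up- and downregulated sets using linear fold change."""
    upregulated = {name for name, fc in fold_changes.items() if fc > up}
    downregulated = {name for name, fc in fold_changes.items() if fc < down}
    return upregulated, downregulated

def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example values.
fold_changes = {"P1": 1.5, "P2": 0.70, "P3": 1.05, "P4": 0.83}
up_set, down_set = classify_by_fold_change(fold_changes)

pathway_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(pathway_p)
significant = [q <= 0.05 for q in adjusted_p]   # cutoff of 0.05 from the text
```

Note that a protein with a fold change of exactly 0.83 or 1.2 falls into neither set, because both thresholds are strict inequalities.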

Network analysis
Cytoscape (version 3.6.1) software, with interaction data from the STRING database (version 10.5), was used to analyze protein-protein interactions among the downregulated proteins 4,5. Interactions with an interaction score ≥0.7 and active interaction sources from experiments and databases were exported from STRING for Cytoscape analysis.
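The score filter described above can be sketched as a simple pass over an edge list. The tuple layout `(protein_a, protein_b, score)`, the 0 to 1 score scale, and the protein names are assumptions made for illustration; they do not reproduce the STRING export format or any Cytoscape API.

```python
from collections import defaultdict

def filter_interactions(edges, min_score=0.7):
    """Keep only interactions whose score is at least min_score."""
    return [(a, b, s) for a, b, s in edges if s >= min_score]

def node_degrees(edges):
    """Count how many retained interactions each protein takes part in."""
    degree = defaultdict(int)
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)

# Hypothetical interactions among downregulated proteins.
edges = [
    ("A", "B", 0.95),
    ("A", "C", 0.70),
    ("B", "C", 0.65),   # below the cutoff, dropped
    ("C", "D", 0.88),
]
kept = filter_interactions(edges, min_score=0.7)
degrees = node_degrees(kept)
```

Highly connected proteins in the retained network are natural candidates for closer inspection in the network view.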

Analysis of drug-compound similarity in mechanism
Connectivity Map (CMAP) of the Broad Institute Drug Repurposing Hub (https://clue.io), data version 1.1.1.2 and software version 1.1.1.33, was used for further analysis of compound similarity on the basis of gene-expression profiling. CMAP reveals connections among small molecules by measuring the similarity of transcriptional responses to perturbation in different human cell lines 6. We extracted the items in "Compound" for nifedipine and nortriptyline. The score threshold was set at 99.