Although advances in medical science and therapeutic strategy have led to an improvement in the survival of patients with cancer, complications that increase morbidity and mortality have also been widely observed (1, 2). Cardiovascular disease (CVD) is one of the most common complications of cancer treatment, leading to premature morbidity and death among long-term cancer survivors (3, 4). For example, anthracycline chemotherapy improves disease-free and overall survival in patients with breast cancer (5), but it has long been noticed that anthracycline drugs, such as doxorubicin, can cause dose-dependent cardiotoxicity by redox cycling and free-radical formation mediated by topoisomerase-IIβ (6, 7). Another common cardiovascular side effect of antineoplastic drugs is observed during the treatment of metastatic human epidermal growth factor receptor 2 (HER2)–positive breast cancer using trastuzumab (Herceptin) (8, 9). The evident cardiovascular risk induced by antineoplastic drugs has impelled researchers to exploit all possible factors that can accurately predict cardiotoxicity (10).
Compared with clinically relevant conditions and outcomes, biomarker-dependent assessments, including assessments of genetics, epigenetics, and other molecular phenotypes, could facilitate the development of evidence-based methods for precise evaluation of cardiovascular risk with antineoplastic drugs (11, 12). Since the emergence of genome-wide association studies (GWASs) and next-generation sequencing (NGS), there has been impressive progress in identifying the genetic variants that influence disease risk and prognosis in cardio-oncology (13), greatly facilitating their broad usage in clinic (14). One canonical discovery is that a nonsynonymous variant of RARG confers susceptibility to anthracycline-induced cardiotoxicity in childhood cancer by altering the expression of TOP2B (15). A recent NGS study revealed that rare genetic variants of the TTN gene contribute to the susceptibility of cancer therapy–induced cardiomyopathy among adult and pediatric patients with cancer (16). Such examples (17) and rapid progress in the genetic understanding of CVD in huge cohorts, such as that enrolled in the UK Biobank (UKBB), introduce the question of how best to connect genetic associations in CVD with the potential cardiotoxicity of existing antineoplastic drugs to guide treatment selection and dissect the underlying mechanisms of side effects (18–21).
Previously, we investigated how well genetic associations predict drug mechanisms and whether they could be used to guide drug target selection and indications (22). In this study, we hypothesized that unintended cardiovascular risk could be partially attributed to deleterious mechanisms of action (MOA) in antineoplastic drug–target (or off-target) interactions and that such effects could be genetically supported by the impact of risk alleles on CVD genes that are directly or indirectly linked to the drug target. On the basis of 30 full GWAS summary statistics of 13 CVDs collected from the UKBB and public resources, we identified credible risk variants (CRVs) that could potentially cause CVD using statistical fine-mapping. We linked these CRVs to target genes to obtain their direction of causal effect (DirCE) mediated by risk alleles. Second, using direct gene matching and network propagation methods, we investigated the genetic evidence for antineoplastic drug–induced cardiovascular risk by identifying the concordant direction between the MOA of drug targets and the DirCE of CRV-associated genes. Third, we used functional experiments to investigate the pharmacologic effect of several predictive drugs, including the retinoid X receptor (RXR) agonists, alitretinoin and bexarotene, on the risk of coronary artery disease. Last, we established a predicted catalog of genetically supported cardiovascular risk from existing antineoplastic drugs at different evidence levels.
Comprehensive integration of CVD GWAS summary statistics and antineoplastic drugs for cardiovascular risk prediction
Existing studies have indicated that genetic evidence is an important predictor of drug target discovery (21, 23–26), drug repositioning (27–29), and drug side effect identification (30). However, no studies have systematically investigated the genetic evidence for antineoplastic drug–induced cardiovascular risk, which is frequently observed in the clinic. By assuming that the risk alleles of CVD genes share similar biological effects with the deleterious MOA of antineoplastic drug–target (or off-target) interactions, we first performed a comprehensive scan of genetically supported associations between cardiovascular risk and antineoplastic drug use (Fig. 1A).
To ensure precise and complete causal variant mapping of CVD, we collected extensive CVD GWAS summary statistics from Gene Atlas on the UKBB cohort (31), the GWAS Catalog (32), GRASP (33), GWAS Atlas (34), and PhenoScanner (35), respectively. After ontology mapping using the International Classification of Disease, 10th Revision (ICD-10) or Medical Subject Headings (MeSH), we selected 30 nonredundant and fine-mappable CVD GWAS results from large European (EUR) cohorts (table S1). According to the official classification of cardiovascular toxicity by the European Society of Cardiology (10), we further grouped these mapped terms into 13 cardiovascular complications (Fig. 1B and fig. S1). We found that coronary artery disease incorporates more studies compared with other complications in our collection.
Antineoplastic drugs and target information were collected from the Drug Gene Interaction Database (DGIdb) (36) and the AdisInsight Database. Out of a total of 13,051 drugs, 2197 antineoplastic drugs were preserved after mapping to OncoTree. Because we were required to consider the MOA of drug-target interactions in the following analysis, only drug targets with known MOA were preserved, yielding 1191 drugs, 10,555 indications, and 4408 drug-target MOA (table S2). Our statistics on the clinical status of drugs showed that clinical trial drugs account for the largest proportion (48.95%), followed by the U.S. Food and Drug Administration (FDA)–approved drugs (42.65%) and drugs in the experimental stages (8.40%) (Fig. 1C). Many of these drugs have multiple targets (fig. S2A) and indications (fig. S2B). In addition, we found that a majority of drug-target MOA belong to the inhibition class (Fig. 1C). Among the antineoplastic drugs identified, we showed the top 30 tumor types according to the number of drugs (fig. S2C). The number of drugs used to treat breast cancer was the greatest, followed by the number of drugs used to treat non–small cell lung cancer.
Multiscale functional predictions to identify causal allele effect of CVD genes
To determine the DirCE of CVD genes mediated by each risk locus, we identified CRVs that potentially cause CVD using statistical fine-mapping. We then linked these CRVs to their target genes (Fig. 1D). FINEMAP (37) was applied to each of the 30 CVD GWAS results, and linkage disequilibrium (LD) block-wise CRVs were identified under a single causal variant per LD block assumption (see Methods). We found that the number of causal loci and CRVs varied among these CVD GWA studies, but, in general, they followed a positive correlation (Fig. 2A). Coronary artery disease, hypertension, and atrial fibrillation obtained the greatest number of causal loci and CRVs, which is consistent with the results of current genetic studies on CVD (18, 38, 39).
We annotated CRVs using ensembl variant effect predictor (VEP) (40) and classified variant annotation consequences into protein-truncating, missense, and other types of variant according to the function affected. These CRVs at different levels were then linked to potential target genes using their genomic location and Genotype-Tissue Expression (GTEx) cis-expression quantitative trait loci (eQTLs) in cardiovascular tissues. As a result, we obtained 5 protein-truncating CRV genes, 18 missense CRV genes, and 146,529 regulatory CRV genes involving 717 unique genes and 850 CVD-gene pairs (Fig. 2, B and C). For regulatory CRV genes, most of the CRVs were located in introns and upstream/downstream of cis-eQTL genes (fig. S3). A comparison of these CRV genes in each of the disease types showed distinct expression patterns among human tissues. For example, the CRV genes for coronary artery disease and myocardial infarction were uniquely expressed in the heart, while the CRV genes for atrial fibrillation displayed similar expression patterns in the heart and adrenal gland (fig. S4), indicating that the causal tissues of certain CVDs could be complex.
For each CRV-gene pair in different categories, we predicted the DirCE, including loss of function (LoF) or gain of function (GoF), by inferring the risk alleles of CRVs on target genes (Fig. 1D). For example, we could predict whether a missense risk allele causes deleterious or activating consequences on protein function using bi-directional sorting intolerant from tolerant (B-SIFT) algorithm (41). We could also determine how a regulatory risk allele controls the expression of a target gene by an effect size of tissue-matched eQTL (see Methods). We found that the DirCE of CVD genes mediated by risk alleles was roughly balanced between LoF and GoF (Fig. 2B), which was significantly different from the MOA of antineoplastic drug targets.
Because of the limited CRVs or unqualified allele effects for several cardiovascular complications, we only observed valid DirCE of CRV genes for six CVDs, including coronary artery disease, hypertension, atrial fibrillation, myocardial infarction, thromboembolic disease, and other unclassified arterial diseases (Fig. 2C). Among 850 CVD-associated gene pairs, most contained multiple CRVs and incorporated conflict DirCEs. Thus, we investigated the DirCEs of each CVD-associated gene and found that only 33 CVD-associated genes (3.9%) obtained inconsistent predictions of risk allele effects (16 of them had a conflict rate of >0.05) (fig. S5 and table S3). This could demonstrate the reliability of our DirCE prediction and provide mechanically distinguishable genetic evidence for the cardiovascular risk–associated drug analysis.
Mapping antineoplastic-induced cardiovascular risk by direct gene matching
By assuming a shared biological mechanism between deleterious MOA of antineoplastic drug–target interactions and risk alleles of CVD-associated genes, we first used a direct gene matching strategy to investigate the concordant direction between the MOA of drug-target interactions and the DirCE of CRV genes. In short, we required that the drug target be the same as the CRV-associated gene, and the direction between the drug-target MOA and DirCE should be consistent (i.e., activation versus GoF and inhibition versus LoF). As a result, we obtained 2369 direct gene-matching evidence for five CVDs and 86 oncological drugs (Fig. 3A). Of note, hypertension had the greatest number of associated drugs supported by the most genetic evidence. Conversely, coronary artery disease only had a small number of drugs but contained the greatest number of associated genes (Figs. 2C and 3A). Most of the gene-based evidence came from the regulatory effects of CRVs, which is probably due to the fact that noncoding regulatory variants explain a larger proportion of the heritability of CVD (Fig. 3B). This may also suggest that when the drug’s direct target (or direct off-target) is the same as the CVD causal gene, the side effects of the drug on the cardiovascular system are more likely to share similar but not identical mechanisms to the risk allele by altering the expression of CVD genes.
Among the gene-based evidence, we observed evidence of cardiovascular toxicity with the aminopeptidase inhibitor, tosedostat, which was genetically supported by a protein-truncating CRV in hypertension (Fig. 3, B and C). The single-nucleotide variant (SNV) of rs33966350 (G>A, ENPEP:p.Trp317*) was detected as a risk variant for essential hypertension in the UKBB GWAS results from Gene Atlas (GWAS P = 2.63 × 10−12) (31). Several large-scale GWASs also identified that rs33966350 was an independent signal associated with hypertension (20, 42, 43). The A allele of rs33966350 was predicted to be a causal allele by GWAS fine-mapping [posterior probability (PP) = 0.10315] and was thought to introduce stop-gained function, which results in a truncated Glutamyl Aminopeptidase (ENPEP) protein. ENPEP encodes aminopeptidase A in the renin-angiotensin-aldosterone system and converts angiotensin (Ang) II into Ang III. Ang II causes vasoconstriction, while Ang III promotes vasodilation and protects against hypertension (44), implying that LoF of the ENPEP protein may lead to an increase in blood pressure due to long-lasting Ang II signaling.
On the other hand, tosedostat is an inhibitor of the M1 family of aminopeptidases and has a direct binding affinity for ENPEP. Tosedostat has demonstrated antineoplastic activity in several models of cancer, and it also entered phase 2 clinical trials for patients with hematological or pancreatic malignancies. We predicted the tosedostat-induced hypertension risk according to the concordant direction between the inhibition effect of tosedostat-ENPEP and the LoF effect of truncated ENPEP mediated by the risk allele of rs33966350 (Fig. 3C). This genetic evidence also highlights the necessity of blood pressure monitoring during long-term systemic reduction of aminopeptidase A activity (42). Because the associated genes of particular CVDs usually contain numerous estimated CRVs and one antineoplastic drug may incorporate multiple targets or off-targets, for each drug-CVD side effect, we observed genetic evidence that was supported by different causal alleles (Fig. 3D). For example, we found that tosedostat may induce atrial fibrillation by inhibiting C9orf3, which is supported by an abundance of consistent genetic evidence and clinical trials (45).
Inferring antineoplastic-induced cardiovascular toxicity using network random walk
Using a direct gene matching strategy, we found that almost all genetic evidence was attributed to the causal risk allele altering the expression of CVD-associated genes, which implies the rare possibility of identical biological mechanisms between deleterious MOA of antineoplastic drug–target interactions and disease-causal alleles in CVD-associated genes. To investigate whether the shared mechanism could be transmitted through signaling pathways or molecular interactions, we further designed a network propagation strategy to build and score path connections between drug targets and CVD-associated genes. By integrating the signaling pathway and protein-protein interaction (PPI) network, we constructed a fused directional biological network involving 10,009 gene nodes and 116,283 associated edges, which only contain inhibition and activation effects (table S4). To measure the relationship between the objective gene and related network genes, we calculated the probability of random walk with restart and filtered out unlikely connections by thresholding probabilities below the inflection point of the probability accumulation curve (fig. S6). We deduced the most likely molecular mechanism upon the shortest path connecting drug targets and CVD-associated genes and estimated the concordant direction between the MOA of drug-target interactions and the DirCE of CRV-associated genes (see Methods).
Through this network strategy, the genetic evidence in support of antineoplastic-induced cardiovascular risk was greatly expanded, involving 812 antineoplastic drugs in six CVDs (fig. S7). The amount of network-level evidence was far more than that obtained by direct gene matching, which also incorporated 33 CRV paths supported by missense variants. Because of the complexity of the biological network, we observed a large amount of genetic evidence in different pathways for a unique drug-CVD side effect (Fig. 3E). For example, bosutinib, vandetanib, and ponatinib had the greatest amount of network-level evidence in hypertension.
For the regulatory CRV paths, we found that the investigational drug, uprosertib, obtained a large amount of network-level evidence in atrial fibrillation. The causal alleles of atrial fibrillation could down-regulate the THRB gene and negatively regulate thyroid hormone signaling, thereby inhibiting the phosphatidylinositol 3-kinase–AKT pathway (46). Uprosertib is an AKT inhibitor, and its MOA exhibit consistent direction with the DirCE of THRB mediated by causal variants of atrial fibrillation (fig. S8). It has been reported that inhibition of the PIK3CA-AKT-Bad pathway could increase the risk of atrial fibrillation (47), which is consistent with our prediction. The network-level results also incorporate evidence supported by deleterious missense variants (fig. S9). For instance, we predicted an increased risk of coronary artery disease with bexarotene (RXR agonist), which was genetically supported by a rare nonsynonymous CRV (Fig. 3F). The common allele, G, of SNV rs116843064 (A>G, ANGPTL4:p.K40E) was detected as a risk allele for coronary artery disease in several GWASs (18, 48). This variant was predicted to be a highly causal signal by GWAS fine-mapping (PP = 0.996) and could introduce GoF in ANGPTL4 according to B-SIFT prediction. ANGPTL4 encodes angiopoietin-like proteins, which inhibit lipoprotein lipase (LPL), implying that GoF in the ANGPTL4 protein may lead to an increase in plasma lipids due to dysfunction in lipid metabolism (49). On the other hand, bexarotene is an agonist of the retinoic acid receptor family of proteins and has a direct binding affinity for RXRA, indirectly promoting the expression and secretion of Angiopoietin Like 4 (ANGPTL4) (50). Therefore, we predicted the bexarotene-induced coronary artery disease risk according to the consistent direction between the activation effect of bexarotene ➔ RXRA ➔ ANGPTL4 and the GoF effect of ANGPTL4:p.K40E (Fig. 3F). Recent clinical studies also reported that bexarotene induces triglyceride (TG) elevation and hypertriglyceridemia (51, 52), which may cause CVD development in the long term.
Functional experiments validate that RXR agonists can reduce TG lipolysis
To measure the efficacy of our genetics-based predictions and validate the likely molecular pathway underlying antineoplastic-induced cardiovascular toxicity, we investigated the pharmacologic effect of aforementioned RXR agonists, such as bexarotene and alitretinoin, on the risk of coronary artery disease using a series of functional experiments. Many studies have shown that a high TG level is associated with an increase in CVD risk (53, 54). In our prediction, RXR agonists could modulate TG metabolism through LPL and its well-characterized inhibitor, ANGPTL4 (Fig. 3F). Because ANGPTL4 and LPL are both mainly synthesized in adipocytes (55), we used differentiated 3T3-L1 adipocytes to test whether alitretinoin and bexarotene can reduce lipolysis of TGs through the RXR ➔ ANGPTL4 –| LPL pathway. As expected, we found that Angptl4 mRNA was increased 2.8- and 10-fold after alitretinoin and bexarotene treatment, respectively (Fig. 4A). However, the RXR antagonist, UVI3003, completely abolished the increase in Angptl4 expression (Fig. 4A). Similarly, alitretinoin- and bexarotene-treated differentiated 3T3-L1 adipocytes expressed more Angptl4 protein compared with the control group. This effect was inhibited by UVI3003, as evaluated by Western blot (Fig. 4B) and immunofluorescence (Fig. 4C). Consistently, there was a higher TG level (Fig. 4D) and less LPL activity (Fig. 4E) after both alitretinoin and bexarotene treatment in differentiated 3T3-L1 adipocytes. These effects were abolished by UVI3003. To verify that Angptl4 is the direct effector protein of alitretinoin and bexarotene, Angptl4 was knocked down by treatment with small interfering RNA (siRNA). Specific siRNA caused a 58% decrease in Angptl4 mRNA (Fig. 4F). Si-Angptl4 infection also blunted the augmented Angptl4 expression in alitretinoin- and bexarotene-treated differentiated 3T3-L1 adipocytes (Fig. 4G). Furthermore, Angplt4 gene silencing with specific siRNA eliminated the effect of alitretinoin and bexarotene on TG (Fig. 4H) and reduced LPL activity (Fig. 4I) in differentiated 3T3-L1 adipocytes.
By constructing 3T3-L1 cells stably expressing different ANGPTL4 alleles, we observed that differentiated 3T3-L1 adipocytes expressing ANGPTL4 (118G) produced more TG compared with differentiated 3T3-L1 adipocytes expressing ANGPTL4 (118A) (fig. S10, A and B), implying a shared biological effect between risk alleles of CVD genes and antineoplastic drugs. To evaluate the potential adverse effect of adipocyte-secreted TGs on cardiovascular tissues/cells, we cocultured human umbilical vein endothelial cells (HUVECs) with differentiated 3T3-L1 adipocytes stably expressing ANGPTL4 and measured the expression of three endothelium-derived inflammatory or stress factors, including adhesion marker E-selectin, vascular cell adhesion molecule 1 (VCAM1), and intercellular adhesion molecule 1 (ICAM-1) (fig. S10C). Compared with untreated and ANGPLT4 (118A) 3T3-L1 adipocyte–treated conditions, HUVECs cocultured with differentiated ANGPLT4 (118G) 3T3-L1 adipocytes exhibited a significant increase in the expression of E-selectin, VCAM1, and ICAM-1 (fig. S10D). This suggests that the genetic effect of different ANGPTL4 alleles could be transmitted to the cardiovascular system. In summary, these experiments support our genetic prediction that alitretinoin and bexarotene reduce TG lipolysis through RXR/ANGPTL4 signaling and further modulate the risk of coronary artery disease.
Evaluation of genetics-based predictions of antineoplastic-induced cardiotoxicity
To investigate whether our genetic predictions of antineoplastic-induced cardiovascular risk (table S5) occurred by chance, we first used a PubMed text-mining method to produce an independent literature-based association analysis of predicted CVDs and antineoplastic drugs. Presumably, whether the set of drugs predicted to have CVD side effects (hereinafter referred to as predicted positive drugs) are more likely to be associated with CVD than the set of antineoplastic drugs predicted to have no CVD side effects (hereinafter referred to as predicted negative drugs) in PubMed articles could be determined by counting the number of articles that mention predicted drugs or CVDs individually and comparing them to the number of articles that mention both. If the number of articles that mention both predicted positive drugs and CVDs is higher than expected, then the hypothesis could be validated (56). We queried PubMed using two keyword sets, including six CVDs and all 1191 antineoplastic drugs involved in the analysis, respectively. After some calculation, we obtained 115,066 articles mentioning both CVDs and 814 predicted positive drugs and 31,604 articles incorporating CVDs and 377 predicted negative drugs. A total of 2,568,289 articles mentioned predicted positive drugs but not CVDs, and 998,939 articles mentioned predicted negative drugs but not CVDs. Then, using a one-tailed Fisher’s exact test, we found statistically significant enrichment of CVD occurrence in articles mentioning predicted positive drugs (P < 2 × 10−16, odds ratio = 1.416), indicating the effectiveness of our predictions (Fig. 5A).
A disadvantage of the above PubMed testing method is that it does not consider the context in which the CVD and/or the drug are mentioned in an article, although this assessment had a very large sample size and was extremely significant. This prevents us from distinguishing whether the association was attributed to the side effects of the drug. We used SIDER, a curated resource for recording adverse drug reactions (57), to evaluate our predicted results. Among 223 overlapped antineoplastic drugs in SIDER 4.1 and the present study (out of 1191 drugs), we found 157 predicted positive drugs and 139 drugs with adverse CVD events in SIDER. We also identified 104 predicted positive drugs, which had adverse CVD events in SIDER. This indicates that our predicted positive drugs are significantly enriched in drugs recorded to have CVD adverse reactions (P = 0.04456, odds ratio = 1.734, one-tailed Fisher’s exact test), further demonstrating that antineoplastic-induced cardiovascular risks can be correctly predicted by assessing genetic associations in CVD (Fig. 5B). This moderate significance but higher odds ratio compared with the PubMed text-mining assessment could be attributed to the smaller sample size. Conversely, by requesting discordant direction between the MOA of drug-target interactions and the DirCE of CRV-associated genes, we identified 814 antineoplastic drugs without predicted cardiovascular risk. Following the same validation criteria above, we did not observe significant enrichment of CVD side effects for these drugs (P = 0.2547, odds ratio = 1.262, one-tailed Fisher’s exact test) (fig. S11), which further supports the validity of our cardiovascular risk predictions.
To illustrate the genetic evidence of cardiovascular risk for commonly used FDA-approved drugs, we first curated 178 first-line antineoplastic drugs from the latest National Comprehensive Cancer Network (NCCN) (table S6). In our predictions, we found 115 of 178 drugs that might induce cardiovascular side effects in different genetic conditions and at different evidence levels. Through searching PubMed, 85 of 115 (73.9%) predicted positive drugs were validated with reported adverse CVD events. We also found that 67 of 78 (86.9%) SIDER-included drugs displayed adverse CVD events (table S7), indicating good power in our genetic-based prediction. For example, a first-line anticolorectal cancer drug, regorafenib, was predicted to induce hypertension, which is consistent with observations in clinical practice (58, 59). Another classical example was that trastuzumab-induced cardiac dysfunction was recorded in our prediction, encompassing myocardial infarction, atrial fibrillation, and coronary artery disease (8, 9).
According to the drug properties, we further divided these first-line drugs into chemotherapy drugs (28 of 42), targeted therapy drugs (37 of 50), immunotherapy drugs (10 of 21), and others (40 of 65). For each drug category, we used Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis to test the possible biological pathways for CVD-associated gene sets with concordant DirCE supporting cardiovascular toxicity. We found that the CVD-associated genes linked to the side effects of chemotherapy were significantly enriched in hypoxia inducible pathway and the human T lymphotropic virus type 1 infection pathway, while CVD genes relevant to the side effects of cancer-targeted therapy were significantly enriched in pathways for epidermal growth factor receptor tyrosine kinase inhibitor resistance, stem cell pluripotency regulation, blood coagulation, and insulin resistance (fig. S12). These results imply that the biological mechanisms underlying antineoplastic drug–induced cardiotoxicity vary among drug types, which could be affected by shared pathogenic signaling between cancer and CVD.
In addition to the above global evaluations, we also illustrated some of our predictions for well-established and frequently used antineoplastic drugs with known cardiovascular side effects. As the most common chemotherapy approach, anthracycline-involved treatment improves overall survival but may induce severe or life-threatening heart failure in some individuals (5, 60). Mechanistically, anthracycline can cause dose-dependent cardiotoxicity by redox cycling and free-radical formation mediated by topoisomerase-IIβ (6, 7). Our genetic evidence shows that individuals who carry the hypertension risk allele, C, of rs7222781 might have greater expression of CDC27. Thus, through the CDC27 –| CCNE2 –| RBL2 –| TOP2A cell cycle signaling and DREAM complex regulation network (61, 62), TOP2A expression could be suppressed. TOP2A forms a complex structure with TOP2B, which is essential during mitosis and meiosis (63). Anthracycline can inhibit TOP2A, which has the same function as the effect of risk allele–mediated pathways in hypertension (Fig. 5C). Thus, anthracycline-induced cardiotoxicity could be supported by our genetic-based prediction. In addition, ponatinib and other BCR-ABL1 tyrosine kinase inhibitors have been linked to widespread cardiovascular events in patients with chronic myeloid leukemia (64, 65). We found that several CRVs in coronary artery disease can down-regulate EPHA3 expression and inhibit ABL1 activity through the axon guidance pathway and PPIs (66). Therefore, ponatinib-induced cardiovascular risk can be validated by the consistent direction between the CRV effect in coronary artery disease and the drug mechanism of BCR-ABL1 tyrosine kinase inhibitors (Fig. 5D).
Cardiovascular toxicity during cancer therapy has become a nonnegligible complication that affects patient survival. However, there is little guidance about how genetic factors influence the potential cardiovascular toxicity of antineoplastic drugs. In this study, by leveraging summary statistics of large cohort CVD GWASs, we introduced a computational strategy to align the MOA of antineoplastic drug–target interactions and the risk allele effect on CVD gene. For the first time, we established a compendium of genetic evidence for cardiovascular risk induced by current antineoplastic drugs with different levels of evidence. One of the most distinctive merits of this study comes from identification of CRVs using GWAS statistical fine-mapping and predicting the DirCE for associated CVD genes mediated by risk alleles. We also improved the confidence in linking noncoding risk alleles to target genes using GTEx eQTLs and their effect size in CVD-associated tissues. Last, in addition to the direct gene matching–based prediction, we applied a network proximity scheme to score path connections between drug targets and disease genes in directional biological networks, which significantly extended the capability of side effect prediction.
Our predictions revealed that a large fraction of antineoplastic drugs can cause potential cardiotoxicity with different levels of evidence. Among these drugs, 38.6% are FDA-approved and 53.6% have been tested in clinical trials, which implies that cardiovascular risk induced by antineoplastic drugs may be mild and may be a long-term process in the general population (4). However, such accumulated risk could be prominent in patients who carry particular alleles that confer susceptibility to CVD. Thus, according to our genetic evidence, we could advise patients with cancer with certain CVD risk alleles to avoid or reduce treatment with certain antineoplastic drugs. On the other hand, the support of genetic evidence for antineoplastic-induced cardiotoxicity will help future target selection or drug repositioning in cancer drug development.
Recent enrichment analysis has shown a significant correlation between the organ systems affected by disease-associated drug target variants and the organ systems in which side effects are observed (30), but it cannot identify which side effect could be modulated by drug target genetic variants because of the lack of genetic effect direction. In contrast, our prediction can establish a link between the genetically supported DirCE of disease genes and the direction of pharmacological modulation, which provides a clue to select optimal drug targets with better safety and efficacy and guides patient recruitment to avoid potential side effects during clinical trial design. Although we cannot provide accurate prioritization according to the identified genetic evidence at present, our predictions can significantly facilitate hypothesis-driven pharmacogenetic and pharmacological research. The applied methods in this work could also be generalized to predict a wide range of side effects for drugs with clear target information.
There are some limitations in the present study that should be highlighted. First, the public GWAS summary statistics for CVDs could be unbalanced and population-biased. Statistical fine-mapping requires full summary information to infer the most likely causal variants, but some of the collected GWAS results did not reach the minimal requirements and were therefore excluded. For example, we only collected the summary statistics from one qualified GWAS on hypertension from the UKBB, representing a subset of recent multicenter GWASs of blood pressure. Also, most of our collected summary statistics were from the EUR population and had large sample sizes. To simplify our analysis pipeline, we only used GWASs performed in the EUR population. Such biases from disease type, sample size, population, and statistical power would cause an underestimation in cardiotoxicity induced by antineoplastic drugs. Second, we only assumed a single causal signal in each LD block and neglected genetic interactions among multiple variants/loci, such as the complex CVD locus 9p21.3, which contains multiple independent causal variants that may interact in either a synergetic or an antagonistic manner to predispose cardiovascular phenotypes (67), which further limits the identification of genetic evidence. Recent Mendelian randomization analysis leverages genetic proxies for the effect of antihypertensive drug classes to inform drug efficacy and side effects, which represents a complementary angle of combined genetic effects (68). Third, although we considered directions of noncoding risk alleles affecting gene expression in cardiovascular tissues, we did not assess effects in other human tissues or effects that were controlled by other types of variant, including splicing and protein translation-altering variants. This problem could be solved using some algorithms that accurately predict disease-causal tissues and the direction of the allelic effect in splicing or translation, such as the inference from accumulated protein quantitative trait loci data (69). In addition, our prediction started with a specific CRV gene in CVDs and learned from the STRING PPI network and KEGG pathways, so the mechanism we predicted for side effects could be biased. For example, our method cannot predict drug-induced cardiotoxicity due to functionally impaired drug metabolizing enzymes and/or transporters. Last, we cannot distinguish whether the induced cardiovascular risk is derived from the on-target or off-target effect of antineoplastic drugs, which could be attributed to the ambiguous definition of the target in our collected drug-related resources. Despite the above shortcomings, we believe that the current study could be helpful for rational drug target evaluation in drug development and could provide an auxiliary reference for selecting patients during clinical trials for testing novel or repurposed antineoplastic drugs.
GWAS data collection and disease mapping
The CVDs GWAS summary statistics were collected from Gene Atlas (31), NHGRI-EBI GWAS Catalog (32), GRASP (33), GWAS Atlas (34), and PhenoScanner (35), respectively, which were based on UKBB, CARDIoGRAMplusC4D, HUNT, and other large-scale cohorts. We removed duplicate GWAS datasets from the above sources according to the PubMed ID and UKBB field of study. Studies containing the non-EUR population and insufficient summary information (at least P value, effect size, effective allele, and baseline allele) were removed. Then, we mapped the GWAS reported diseases from Gene Atlas and GWAS Atlas to ICD-10 and others to MeSH. According to the classification of cardiovascular toxicity by the European Society of Cardiology, we grouped these mapped terms into nine cardiovascular complications, including arrhythmias, coronary artery disease, artery hypertension, myocardial dysfunction and heart failure, pericardial complication, peripheral vascular disease, pulmonary hypertension, thromboembolic disease, and valvar heart disease (10). In addition, atherosclerosis, atrial fibrillation, myocardial infarction, and other arterial diseases were listed separately. Each GWAS summary statistics was only classed into one cardiovascular complication for subsequent analysis.
Antineoplastic drugs data collection
The drug information containing drug targets, drug indications, drug MOA, and drug status was extracted from DGIdb 3.0 (36). Then, we mapped the drug indications to OncoTree (http://oncotree.mskcc.org) to retain all antineoplastic drugs. In addition, we reannotated each drug with classification by searching in the Adisinsight Database (https://adisinsight.springer.com). Last, we normalized MOA for each drug-target pair (activation or inhibition) and filtered out the drug targets with unknown MOA.
GWAS fine-mapping and potentially causal variant selection
We identified causal variants for each CVD GWAS dataset by several steps. First, we leveraged independent LD blocks of the EUR population estimated by ldetect (70) to partition genetic variants into LD blocks. Then, we used GWAS P < 5 × 10−8 as a filter to select significant LD blocks for statistical fine-mapping. Second, for each selected LD block, we applied FINEMAP (37) to estimate the PP of causality for each variant based on 1000 Genomes Project EUR genotypes and single causal variant assumption (71). Third, we selected potentially causal variants in 99% credible set of each locus. To further remove false positives, we excluded variants with GWAS P > 1 × 10−5 and PP < 0.01 if the leading variant is not selected in the credible set. To control possible false negatives for variants not available in 1000 Genome projects, we kept those variants with GWAS P < 1 × 10−5. Thus, we constructed CRVs that potentially cause corresponding CVD for each GWAS dataset.
The significant cis-eQTLs were downloaded from the portal of GTEx project v7 release (72). To ensure the tissues of eQTL data are relevant tissues matching with CVDs, we only included eQTL data from cardiovascular tissues including artery-aorta, artery-coronary, artery-tibial, heart–atrial appendage and heart–left ventricle.
Linking CRVs to target genes
We annotated CRVs of each CVD with VEP release 94 (40) and obtained variant consequences and gene-level annotations. We mapped CRVs to VEP-reported genes if variant shows protein-altering consequences. For the remaining consequences, we assigned CRV to its cis-eQTL genes (window of 1 Mb around each gene) using the above selected GTEx datasets. In addition, we downloaded the gene expression (Transcripts Per Million) across different tissues from the GTEx v7 release for the comparison of the expression pattern of CRV genes. To compare the expression pattern of CRV genes of each CVD across human tissues, we first transferred sample-gene expression matrix to a concatenate expression vector for each tissue. For each CVD, we used paired sample t test to investigate the expression difference of CRV genes between every two tissues by considering overlapping samples.
According to the risk allele of each CRV, we predicted DirCE for each CRV-gene pair in the following three levels. For protein-truncating CRVs annotated with “stop_gained,” “frameshift_variant,” “stop_lost,” and “start_lost” consequences, DirCE was defined as LoF. For missense CRVs annotated with “inframe_insertion,” “inframe_deletion,” and “missense variant” consequences, we predicted the functional impact score of risk allele using B-SIFT (41). As suggested in the original publication, DirCE was defined as LoF if B-SIFT score was less than −0.95 and as GoF if B-SIFT score was higher than 0.5. Last, for other (regulatory) CRVs, we harmonized the eQTL effective allele and its effect size with the risk allele of each CRV. We defined DirCE as LoF if the harmonized effect size of risk allele was less than zero; otherwise, it was GoF.
Cardiovascular toxicity mapping
We assumed that potential cardiotoxicity could occur if there is a concordant direction between the MOA of drug target and the DirCE of CRV-associated genes mediated by risk allele. To identify this at different levels, we adopted direct gene matching and network mapping, respectively. In direct gene matching, we required the drug target to be the same as the CRV-associated gene and the direction between drug MOA and DirCE to be consistent (i.e., activation versus GoF and inhibition versus LoF). In the network mapping, we applied a random walk with restart strategy to search and score path connections between drug target and CVD-associated gene. We first integrated a directed biological network by fusing human PPI networks from the STRING database (PPI score > 0.7) (73) and human signaling pathways from KEGG database (only gene type was included) (74). To facilitate the following network inference of concordant direction, we only selected interaction pairs with clear effect direction (i.e., either activation or inhibition). On the basis of the above-integrated network, the relatedness scores (random walk probability) between start gene and network genes were calculated by walker R package with default restart probability of 0.7 and normalized to sum to 1 (75). To exclude unlikely connections, for each start gene (either drug target or disease gene), we drew a probability accumulation curve for all scored nodes (normalized probability > 0) and adopted the probability at the inflection point of curve as the filtering threshold. We used igraph R package (76) to calculate the shortest path between each start gene and filtered genes. Because the network is directed and contains a clear effect for each edge, we were able to inspect the outcome of the transmission effect and determine the mechanism for any paths. We assigned 1 to the activation effect and −1 to the inhibition effect; hence, a final drug-target MOA in a given path could be calculated by multiplication of all assigned scores in this path. For example, given a path of “drug –| geneA ➔ geneB ➔ disease gene,” where “–|” means inhibition and “➔” means activation, the MOA of the drug on disease gene could be inferred by “−1*1*1 = −1”, which is inhibition ultimately. On the basis of this strategy, we mapped CRV-associated genes to the shortest path initiated from drug target (or vice versa) and calculated the indirect mechanism for such connection. We also allowed an intermediate gene crossed over by two independent paths that start from drug target and disease gene, respectively. Last, we investigated the concordant direction between MOA of drug target and DirCE as described in the direct gene matching strategy.
Drugs and reagents
Alitretinoin was bought from Santa Cruz Biotechnology (Santa Cruz, CA). Bexarotene and heparin sodium were purchased from Selleck Chemicals (Houston, TX, USA). UVI3003, dexamethasone (DEX), 3-isobutyl-1-methylxanthine (IBMX), and insulin were obtained from Sigma Company (Sigma-Aldrich, St. Louis, MO, USA).
Cell culture, treatment, and transfection
3T3-L1 cells were purchased from Cell Bank (Shanghai Institutes for Biological Sciences, CAS, China), cultured, and differentiated into the adipocyte as described previously (77). In brief, cells were cultured in a humidified atmosphere at 37°C with 5% CO2 and 95% room air with culture medium containing Dulbecco’s modified Eagle’s medium (Gibco, 11965092) supplemented with penicillin/streptomycin (50 μg/ml; Gibco, 15070063) and 10% (v/v) fetal bovine serum (Gibco, 10099141). Differentiation was initiated 1 day after confluence by incubation for 7 days with culture medium containing 1 μM DEX, 0.5 mM IBMX, and insulin (1 μg/ml). Medium was changed every other day during all culture or differentiation stages. Alitretinoin, bexarotene, UVI3003, or dimethyl sulfoxide (DMSO) as a vehicle control was added in differentiated 3T3-L1 cells, for the indicated times, and then processed for downstream experiments.
RNA extraction and qRT-PCR
Total RNA from the differentiated 3T3-L1 cells was extracted using TRIzol reagent (Life Technologies) according to the manufacturer’s method. Total RNA (1 μg) was reverse-transcribed to complementary DNA (cDNA) by a reverse transcription reagent kit (Takara, Dalian, China) according to the manufacturer’s instructions. Angptl4 expression was normalized to the level of β-actin mRNA. The quantitative real-time polymerase chain reaction (qRT-PCR) protocol was as follows: 5 min at 95°C for 1 cycle, followed by 40 cycles at 95°C for 30 s, 60°C for 30 s, and 72°C for 20 s, and a final extension at 72°C for 10 min. The relative transcript level of a gene is expressed in ∆Ct values (∆Ct = Ctreference − Cttarget), and the relative changes in transcript levels compared with controls are expressed as ∆∆Ct values (∆∆Ct = ∆Cttreated – ∆Ctcontrol), as previously described (78). The primer sequences were as follows: Angptl4, 5′-CCCCACGCACCTAGACAATG-3′ (sense), 5′-GCCTCCATCTGAAGTCATCTCA-3′ (anti-sense); β-actin, 5′-GGCTGTATTCCCCTCCATCG-3′ (sense), 5′-CCAGTTGGTAACAATGCCATGT-3′ (antisense).
Western blot analysis
Proteins from the differentiated 3T3-L1 cells were extracted using the lysis buffer (Beyotime Biotechnology, Jiangsu, China) with protease inhibitors. The protein concentrations were determined using Pierce BCA Protein Assay Kit (Pierce, Rockford, IL, USA). Equivalent levels of proteins were denatured and resolved with 10% sodium dodecyl sulfate polyacrylamide gel electrophoresis gels and then transferred to nitrocellulose membranes, incubated with 5% skimmed milk, and probed with primary antibodies overnight at 4°C. Primary antibodies were diluted and listed as follows: anti-Angptl4 (A2011, ABclonal Technology, Woburn, MA, USA), anti-HSP90 (A5027, ABclonal Technology, Woburn, MA, USA), anti-ICAM1 (sc-107, Santa Cruz Biotechnology, Dallas, TX, USA), anti-VCAM1 (sc-13160, Santa Cruz Biotechnology, Dallas, TX, USA), anti–E-selectin (20894-1-AP, Proteintech Group, Rosemont, IL, USA), and anti–glyceraldehyde-3-phosphate dehydrogenase (60004-1-Ig, Proteintech Group, Rosemont, IL, USA). The membranes were washed and then incubated in horseradish peroxidase–labeled secondary antibody for 1 to 2 hours at room temperature. Proteins were detected using enhanced chemiluminescence reagents (Thermo Fisher Scientific, Waltham, MA, USA).
3T3-L1 cells were cultured in 12-well plate climbing glass coverslips (BD Biosciences) and differentiated for 48 hours at about 50% confluence. The cells were treated with different drugs for 24 hours and then fixed in 4% paraformaldehyde for 10 min at room temperature; the cells were washed three times with phosphate-buffered saline (PBS) and blocked by 3% bovine serum albumin in PBS with 0.1% Triton X-100 for 45 min at room temperature. Thereafter, cells were incubated with anti-Angptl4 (1:100) antibodies overnight at 4°C. After washing three times with PBS, the cells were incubated with Alexa Fluor 488–conjugated anti-rabbit immunoglobulin G antibody (Invitrogen; 1:1000) for 2 hours at room temperature. The slides were mounted in ProLong Gold antifade reagent with 4′,6-diamidino-2-phenylindole (Invitrogen). All of the immunofluorescence pictures were captured and analyzed using a laser-scanning confocal microscope (Carl Zeiss, Oberkochen, Germany). At least five random fields were taken in the central region of each glass slide.
TG level and LPL activity
TG level and LPL activity were determined using the TG assay kits (Nanjing Jiancheng Bioengineering Institute, Jiangsu, China) and Lipoprotein Lipase Assay kit (Abcam, Cambridge, MA, USA), according to the manufacturer’s instructions. TG was quantified by spectrophotometry (absorbance at 510 nm); LPL activity was measured and calculated by fluorescence at Excitation/Emission = 482/515 nm.
The siRNA for Angptl4 and the Scramble were from GenePharma Company (Shanghai, China). The sequences of Angptl4-siRNAs were as follows: Angptl4-siRNA1 sense, GCAUGGCUGCCUGUGGUAATT; Angptl4-siRNA1 antisense, UUACCACAGGCAGCCAUGCTT; Angptl4-siRNA2 sense, GGGACUGCCAGGAACUCUUTT; Angptl4-siRNA2 antisense, AAGAGUUCCUGGCAGUCCCTT; Angptl4-siRNA3 sense, CCCUGCUGAUCCAGCCCAUTT; Angptl4-siRNA3 antisense, AUGGGCUGGAUCAGCAGGGTT. The Angptl4 siRNA or Scramble siRNAs were transfected into differentiated 3T3-L1 cells with Lipofectamine 2000 according to the manufacturer’s instructions. At indicated times after the transfection, cells were collected for further analysis.
Lentiviral vector preparation and viral infection
The human ANGPTL4 (118G) gene with green fluorescent protein tag was cloned into pCDH via Sal I and Not I restriction enzyme from cDNA. The ANGPTL4 (118A) gene was constructed via QuickMutationPlus Site-Directed Mutagenesis Kit (Beyotime, Shanghai, China) following the manufacturer’s instructions. The ANGPTL4 overexpression vectors were cotransfected with lentiviral packaging vectors pMDL, vesicular stomatitis virus glycoprotein, and Rev. into HEK293FT cells to generate lentivirus (79). 3T3-L1 cells were infected with the collected viruses for 48 hours and screened as stable cell lines by using puromycin (1 μg/ml). The stable ANGPTL4-overexpressing cell lines were used to differentiate into the adipocytes.
HUVECs were isolated and cultured as described previously (80). For the cocultures, ANGPTL4-overexpressed adipocytes were seeded on semipermeable membrane supports in 12-well (2 × 105 cells per well), 0.4-μm pore size transwell plates (Corning, New York, USA). After 1 day, membrane inserts with the ANGPTL4-overexpressed adipocytes were transferred on top of wells in 12-well plates with about 80% confluence HUVECs, and the cells were cocultured for a further 24 hours.
Statistical analysis in experiment validations
All data were expressed as the mean ± SEM. Statistical analysis was performed using GraphPad Prism 6 (GraphPad Software Inc., San Diego, CA, USA). Comparisons between the two groups were assessed using the unpaired Student’s t test. For all experiments, P values of less than 0.05 were considered statistically significant.
Validation by text mining
We performed text mining on PubMed database to produce an independent literature-based association analysis for CVDs and antineoplastic drugs. To conduct the text mining, we selected a specific antineoplastic drug (drug name or ChEMBL ID from DGIdb) and estimated cardiovascular risk (mapped cardiovascular complications using standard from European Society of Cardiology) with genetic evidence and obtained a contingency table containing the number of articles mentioning both (comentioning) of the antineoplastic drugs predicted with CVD side effects (hereinafter referred to as predicted positive drugs) and the CVDs, the number of articles incorporating the CVDs but without the predicted positive drugs, the number of articles having the predicted positive drugs but without the CVDs, and the remaining number of articles within the corpus (the corpus refers to whole articles queried by all the antineoplastic drugs we analyzed). These queries were conducted on the Entrez Programming Utilities (81) on 14 June 2019. Then, one-tailed Fisher’s exact test was used to measure the significance of observing the number of articles with comentioning of predicted drugs and CVDs given the number of articles mentioning the predicted drugs of CVDs individually (82).
Validation by side effect evidence
We extracted drug with cardiovascular side effect information from SIDER4.1 (57). Through mapping by drug name, we obtained shared antineoplastic drugs between the SIDER database and our prediction. A contingency table was derived by calculating the number of comentioned drugs in the SIDER and our result, the number of drugs only in our prediction, the number of drugs only in the SIDER, and the remaining number of drugs within the overlapped drugs. The one-tailed Fisher’s exact test was used to determine whether our genetically supported drugs enriched in the CVD-induced drugs annotated in the SIDER database. Also, we applied this validation to drug obtained by discordant direction between the MOA of drug-target interactions and the DirCE of CRV-associated genes.
Evaluation on FDA-approved first-line antineoplastic drugs
We selected the first-line antineoplastic drugs according to NCCN (https://www.nccn.org/). Then, we searched for the predicted positive first-line drugs in the PubMed literature and the SIDER4.1 to verify the cardiovascular side effects. For the PubMed literature validation, we searched for the drug name and “cardiovascular risk” as the keywords and recorded the PubMed ID for the drug if the article mentioned the CVD adverse effect for the corresponding drug. For the SIDER database validation, we checked whether the drug had been recorded with the CVD adverse effect. We also partitioned these first-line drugs into different categories according to drug properties. For each drug class, we applied the KEGG enrichment analysis in Metascape (83) to investigate the likely associated pathway with the CVD-associated genes with valid DirCE supporting cardiovascular toxicity.