Ligand Receptor Interaction Survival Analysis

By Tiago Maié and James Nagai

Here we introduce the survival model analysis used to the orthogonal validation of the scACCorDiON subcluters.

For installing the R package use

pak::pak("CostaLab/scACCorDiON.su")
[1]:
library(scACCorDiON.su)
library(IRdisplay) ## Tables rendering
library(reactable) ## Tables display

---------------------------------
scACCorDiON.su version 0.0.0.9000
---------------------------------

Loading the data used in the PDAC experiment

[2]:
data(pdac_cci_data)
data(paad_tcga_clinical_data) ## TCGA data
data(paad_tcga_expression_data) ## TCGA data
data(comparison_of_interest) ## Here he have the desired interaction

Interaction: DCT2 and DCT1 were renamed in the paper for the readability, Malignant Ductal and Ductal cell, respectivelly

[3]:
comparison_of_interest
'Ductal.cell.type.2_Ductal.cell.type.1'

Reading CrossTalkeR sankey data (Supp. Fig. 8)

Here we can supply ligand receptor interactions for the PDAC data

[4]:
lrsankey <- read.csv("../../../scaccordion/data/PDAC_differential.csv")

Note: table must be names as the example below

[5]:
head(lrsankey[,c("source","target","gene_A","gene_B","LRScore")])
A data.frame: 6 × 5
sourcetargetgene_Agene_BLRScore
<chr><chr><chr><chr><dbl>
1Ductal cell type 2Macrophage cellB2M|L CD1C|R -2.578362
2Endocrine cell B cell INS|L LILRB1|R-2.548423
3Ductal cell type 2B cell B2M|L CD1C|R -2.495319
4B cell B cell APP|L CD74|R -2.422047
5B cell B cell COPA|LCD74|R -2.386160
6T cell B cell COPA|LCD74|R -2.381791

Preparing the data from the sankey plot to perform survival analysis(Table 4)

Here, the data is prepared to be used in the run_survival funtion. We first select the source and target interaction. Second, HLA genes are filtered. Finally, we filter the overlapping communication gened and select the top 10

[6]:
pairs <- lrsankey |>
            dplyr::filter(grepl("Ductal cell type 2",source))|>
            dplyr::filter(grepl("Ductal cell type 1",target)) |>
            dplyr::filter(!grepl("HLA-",gene_A)) |>
            dplyr::mutate(ligand=gsub("\\|L","",gene_A)) |>
            dplyr::mutate(receptor=gsub("\\|R","",gene_B)) |>
            dplyr::filter(ligand %in% paad_tcga_expression_data$symbol) |>
            dplyr::filter(receptor %in% paad_tcga_expression_data$symbol) |>
            dplyr::slice_max(n=10,order_by = LRScore,with_ties = FALSE) |>
            dplyr::select(ligand,receptor)

Once this analysis were dont comparing PDAC2_x_PDAC1 the highest LRscores are associated to interactions that are highly expressed in PDAC1

LR Survival

[7]:
results <-
  run_survival(
    comparison_oi = comparison_of_interest,
    lr_data = pdac_cci_data,
    tcga_exp_data = paad_tcga_expression_data,
    tcga_clinical_data = paad_tcga_clinical_data,
    selection_method = c("limma"),
    custom_selection = pairs,
    n_lr_selected = 10,
    which_get = c("ligand", "receptor"),
    is_signif = TRUE,
    clinical_vars_in_model = c("stage")
  )
#> Warning: Zero sample variances detected, have been offset away from zero
results$cox_models$coxph_clinical.lr_pairs |>
  broom::tidy() |>
  dplyr::mutate(issignif=p.value<0.05) |>
  dplyr::arrange(p.value)
A tibble: 13 × 6
termestimatestd.errorstatisticp.valueissignif
<chr><dbl><dbl><dbl><dbl><lgl>
C3_CD81 0.682814130.2405836 2.838157320.004537481 TRUE
MMP7_SDC1 0.341646110.1228055 2.782010000.005402338 TRUE
CD99_CD81 -0.854022990.3603318-2.370101880.017783184 TRUE
TIMP1_FGFR2 -0.350780210.1691310-2.074015320.038077889 TRUE
S100A4_ERBB3 0.190785580.1240457 1.538026020.124042254FALSE
TIMP1_CD63 -0.351905800.2511306-1.401285810.161128624FALSE
COPA_CD74 -0.458176030.4770598-0.960416310.336845731FALSE
LAMB2_RPSA -0.280676650.3610970-0.777288800.436988400FALSE
stageStage_IV -0.406011920.8660926-0.468785820.639222740FALSE
APP_CD74 -0.206867610.4565172-0.453143090.650445706FALSE
stageStage_II -0.143418410.4416047-0.324766460.745357834FALSE
stageStage_III-0.344049051.0930201-0.314769180.752936915FALSE
LGALS3BP_ITGB1-0.010678050.3099802-0.034447530.972520284FALSE

Ligand only survival

[8]:
results <-
  run_survival(
    comparison_oi = comparison_of_interest,
    lr_data = pdac_cci_data,
    tcga_exp_data = paad_tcga_expression_data,
    tcga_clinical_data = paad_tcga_clinical_data,
    selection_method = c("limma"),
    custom_selection = pairs,
    n_lr_selected = 10,
    which_get = c("ligand"),
    is_signif = TRUE,
    clinical_vars_in_model = c("stage")
  )
#> Warning: Zero sample variances detected, have been offset away from zero
results$cox_models$coxph_clinical.gene_expression |>
  broom::tidy() |>
  dplyr::mutate(issignif=p.value<0.05)  |>
  dplyr::arrange(p.value)
A tibble: 12 × 6
termestimatestd.errorstatisticp.valueissignif
<chr><dbl><dbl><dbl><dbl><lgl>
MMP7 0.155411090.07010018 2.21698570.02662407 TRUE
TIMP1 -0.427647500.19417184-2.20241780.02763581 TRUE
C3 0.181241620.11222101 1.61504180.10630165FALSE
LAMB2 -0.352288230.25350917-1.38964690.16463612FALSE
S100A4 0.103126850.07768648 1.32747490.18435162FALSE
CD99 -0.235794450.22318141-1.05651470.29073312FALSE
COPA -0.124650430.35093778-0.35519240.72244547FALSE
LGALS3BP -0.056544360.21479237-0.26325130.79235693FALSE
stageStage_II 0.085689170.45425232 0.18863780.85037668FALSE
stageStage_III-0.152438871.08546592-0.14043630.88831524FALSE
APP 0.031279090.27363357 0.11431020.90899194FALSE
stageStage_IV 0.090401310.86870068 0.10406500.91711779FALSE

Receptor only survival

[9]:
results <-
  run_survival(
    comparison_oi = comparison_of_interest,
    lr_data = pdac_cci_data,
    tcga_exp_data = paad_tcga_expression_data,
    tcga_clinical_data = paad_tcga_clinical_data,
    selection_method = c("limma"),
    custom_selection = pairs,
    n_lr_selected = 10,
    which_get = c("receptor"),
    is_signif = TRUE,
    clinical_vars_in_model = c("stage")
  )
#> Warning: Zero sample variances detected, have been offset away from zero
results$cox_models$coxph_clinical.gene_expression |>
  broom::tidy() |>
  dplyr::mutate(issignif=p.value<0.05) |>
  dplyr::arrange(p.value)
A tibble: 11 × 6
termestimatestd.errorstatisticp.valueissignif
<chr><dbl><dbl><dbl><dbl><lgl>
ERBB3 0.365589470.1570826 2.327371510.01994550 TRUE
RPSA 0.522212760.2247357 2.323674770.02014293 TRUE
ITGB1 0.425288590.1852087 2.296266720.02166064 TRUE
CD63 -0.590150330.2715733-2.173079330.02977435 TRUE
SDC1 0.178010060.1223412 1.455029670.14566105FALSE
FGFR2 -0.124298900.1015618-1.223874180.22099970FALSE
CD81 -0.221816240.1813146-1.223378040.22118694FALSE
CD74 0.156562890.1485338 1.054055340.29185758FALSE
stageStage_IV -0.561584090.8621951-0.651342210.51482561FALSE
stageStage_II 0.185146750.4332413 0.427352470.66912262FALSE
stageStage_III-0.014823811.0831958-0.013685250.98908109FALSE

Finding survival hits without custom Selection

[10]:
results <-
  run_survival(
    comparison_oi = comparison_of_interest,
    lr_data = pdac_cci_data,
    tcga_exp_data = paad_tcga_expression_data,
    tcga_clinical_data = paad_tcga_clinical_data,
    selection_method = c("limma"),
    n_lr_selected = 30,
    which_get = c("ligand", "receptor"),
    is_signif = TRUE,
    clinical_vars_in_model = c("stage")
  )
#> Warning: Zero sample variances detected, have been offset away from zero
results$cox_models$coxph_clinical.lr_pairs |>
  broom::tidy() |>
  dplyr::mutate(issignif=p.value<0.05)  |>
  dplyr::arrange(p.value)
Warning message:
“Zero sample variances detected, have been offset away from zero”
A tibble: 33 × 6
termestimatestd.errorstatisticp.valueissignif
<chr><dbl><dbl><dbl><dbl><lgl>
WNT10A_FZD1 4.8042406122.15116680 2.233318500.02552795 TRUE
WNT7B_FZD5 4.7107184282.11236817 2.230065050.02574312 TRUE
FGF2_SDC3 -0.9610843350.44546094-2.157505280.03096632 TRUE
WNT7B_FZD1 -4.2236381971.98969477-2.122756850.03377423 TRUE
WNT7A_FZD5 -2.2370144281.14867741-1.947469680.05147845FALSE
WNT7A_FZD6 2.1825464461.12241215 1.944514260.05183347FALSE
AMH_EGFR -0.0867401020.04518292-1.919754440.05488892FALSE
FGF2_GPC4 0.6429192580.36467082 1.763012640.07789836FALSE
ADAM17_ERBB4 -0.4293564920.27402613-1.566845060.11715091FALSE
WNT10A_FZD5 -3.5293126332.25814499-1.562925610.11807009FALSE
WNT5B_FZD6 -1.6054533831.02823780-1.561363900.11843792FALSE
COL4A5_ITGB8 -0.3605800370.23878081-1.510087990.13102097FALSE
BTC_EGFR 0.3834562620.27661830 1.386228820.16567704FALSE
WNT5B_FZD3 1.4390819481.09242484 1.317328110.18772867FALSE
L1CAM_EPHB2 0.1766604750.14666414 1.204523960.22838713FALSE
WNT10A_FZD3 -1.3998544611.16983695-1.196623570.23145331FALSE
WNT5B_LRP5 -0.8069765440.67525915-1.195061980.23206281FALSE
EREG_ERBB4 0.2578383130.21999882 1.171998610.24119761FALSE
DKK1_KREMEN1 0.1470606770.14137620 1.040208180.29824319FALSE
SHH_SCUBE2 -0.1599076050.16118859-0.992052860.32117174FALSE
WNT5B_LRP6 0.6802640250.74898046 0.908253360.36374438FALSE
ZP3_EGFR 0.1464120920.16844895 0.869177830.38474987FALSE
FGF2_FGFR4 0.2018182230.23227989 0.868857910.38492485FALSE
stageStage_III -0.8638728691.16022534-0.744573350.45652968FALSE
PCSK9_VLDLR 0.2473152270.41280770 0.599105160.54910276FALSE
TNFSF11_TNFRSF11A 0.1007928040.18219314 0.553219530.58011310FALSE
PF4_FGFR2 0.0368325470.10577431 0.348218270.72767627FALSE
PCSK9_SORT1 -0.1300406080.38297481-0.339553950.73419247FALSE
stageStage_II 0.1656573060.53312013 0.310731670.75600462FALSE
stageStage_IV 0.2067538111.05712158 0.195581870.84493746FALSE
SLITRK6_PTPRS 0.0306910640.16191554 0.189549840.84966190FALSE
FGF2_FGFR2 0.0269920980.26272164 0.102740300.91816909FALSE
SCGB1A1_LMBR1L 0.0019599260.05951826 0.032929830.97373055FALSE
[18]:
options(repr.plot.width=8,repr.plot.height=10)
survminer::ggforest(results$cox_models$coxph_clinical.lr_pairs,results$model_data)

../_images/notebooks_LRSurvival_25_0.png