Last updated: 2021-09-13

Checks: 7 0

Knit directory: ctwas_applied/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20210726) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 72c8ef7. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


working directory clean

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30740_irnt_Adipose_Subcutaneous_known.Rmd) and HTML (docs/ukb-d-30740_irnt_Adipose_Subcutaneous_known.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 72c8ef7 wesleycrouse 2021-09-13 changing mart for biomart
Rmd a49c40e wesleycrouse 2021-09-13 updating with bystander analysis
html 7e22565 wesleycrouse 2021-09-13 updating reports
Rmd 3a7fbc1 wesleycrouse 2021-09-08 generating reports for known annotations
Rmd cbf7408 wesleycrouse 2021-09-08 adding enrichment to reports

Overview

These are the results of a ctwas analysis of the UK Biobank trait Glucose using Adipose_Subcutaneous gene weights.

The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30740_irnt. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.

The weights are mashr GTEx v8 models on Adipose_Subcutaneous eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)

LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])

Weight QC

TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)

qclist_all <- list()

qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))

for (i in 1:length(qc_files)){
  load(qc_files[i])
  chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
  qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}

qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"

rm(qclist, wgtlist, z_gene_chr)

#number of imputed weights
nrow(qclist_all)
[1] 12961
#number of imputed weights by chromosome
table(qclist_all$chr)

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
1278  918  741  500  602  740  638  485  494  502  777  731  256  426  444 
  16   17   18   19   20   21   22 
 610  799  198  977  380  144  321 
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.7639843

Load ctwas results

#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))

#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")

#load z scores for SNPs and collect sample size
load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd"))

sample_size <- z_snp$ss
sample_size <- as.numeric(names(which.max(table(sample_size))))

#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size

#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)

#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)

ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])

#add z scores to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z

z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,]
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)]

#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)

#merge gene and snp results with added information
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA

ctwas_res <- rbind(ctwas_gene_res,
                   ctwas_snp_res[,colnames(ctwas_gene_res)])

#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])

report_cols_snps <- c("id", report_cols[-1])

#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)

Check convergence of parameters

library(ggplot2)
library(cowplot)

********************************************************
Note: As of version 1.0.0, cowplot does not change the
  default ggplot2 theme anymore. To recover the previous
  behavior, execute:
  theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
  
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
                 value = c(group_prior_rec[1,], group_prior_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)

df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument

p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(pi)) +
  ggtitle("Prior mean") +
  theme_cowplot()

df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
                 value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(sigma^2)) +
  ggtitle("Prior variance") +
  theme_cowplot()

plot_grid(p_pi, p_sigma2)

Version Author Date
7e22565 wesleycrouse 2021-09-13
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
        gene          snp 
0.0131489668 0.0001435707 
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
    gene      snp 
10.71966 10.47239 
#report sample size
print(sample_size)
[1] 314916
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1]   12961 8696600
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
       gene         snp 
0.005801183 0.041520854 
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.02579552 0.27574421

Genes with highest PIPs

#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")

Version Author Date
7e22565 wesleycrouse 2021-09-13
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
      genename region_tag susie_pip    mu2     PVE      z
3633     CCND2       12_4     0.994 116.64 3.7e-04 -11.03
4200     PPDPF      20_37     0.987  26.32 8.3e-05  -5.04
11938   CHAMP1      13_62     0.979  28.90 9.0e-05   5.30
9666      CHD2      15_43     0.967  31.34 9.6e-05  -5.58
1237     CPNE3       8_62     0.966  26.16 8.0e-05   4.97
6954   ZFP36L2       2_27     0.957  80.13 2.4e-04  10.17
8836   FAM234A       16_1     0.943  50.88 1.5e-04   8.39
7223      NAT2       8_20     0.929  23.78 7.0e-05  -4.74
2816      MADD      11_29     0.911  33.39 9.7e-05   7.42
7048     JAZF1       7_23     0.908  35.74 1.0e-04  -6.09
6020      AKT1      14_55     0.859  20.40 5.6e-05  -4.14
6523    GIGYF1       7_62     0.843  38.89 1.0e-04  -6.95
66     FAM214B       9_27     0.831  21.22 5.6e-05  -4.20
3620     ABCG2       4_59     0.812  19.76 5.1e-05  -4.10
11303 C15orf52      15_14     0.788  23.00 5.8e-05  -4.41
1526   CWF19L1      10_64     0.766  20.81 5.1e-05  -4.18
92       ABCC8      11_12     0.742  19.98 4.7e-05   3.48
5790      WARS      14_52     0.734  24.11 5.6e-05   4.88
5881  MAP1LC3B      16_52     0.733  20.79 4.8e-05   3.96
7529       ACE      17_37     0.732  22.21 5.2e-05   4.42

Genes with largest effect sizes

#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")

Version Author Date
7e22565 wesleycrouse 2021-09-13
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
       genename region_tag susie_pip     mu2     PVE      z
6925      G6PC2      2_102     0.000 1201.61 1.3e-11 -40.22
6924      SPC25      2_102     0.000  131.08 4.0e-12  10.06
3633      CCND2       12_4     0.994  116.64 3.7e-04 -11.03
8011    RPL22L1      3_104     0.055  103.82 1.8e-05  10.33
3270      SNX17       2_16     0.096   96.74 2.9e-05  -9.53
2474        GCK       7_32     0.000   90.52 6.4e-11  14.68
3267      NRBP1       2_16     0.040   83.19 1.0e-05  -9.09
6954    ZFP36L2       2_27     0.957   80.13 2.4e-04  10.17
7920    NOSTRIN      2_102     0.000   70.08 1.4e-13  -8.43
14678 LINC00506       3_57     0.000   61.82 0.0e+00  -1.21
8941       FEN1      11_34     0.656   52.38 1.1e-04   7.56
11270   NCR3LG1      11_12     0.660   51.60 1.1e-04  -7.01
8836    FAM234A       16_1     0.943   50.88 1.5e-04   8.39
4388       RNF6       13_6     0.188   50.02 3.0e-05   7.67
3368      THADA       2_27     0.023   49.46 3.7e-06   7.66
4875       CDK8       13_6     0.184   47.29 2.8e-05  -6.42
8010     EIF5A2      3_104     0.016   46.29 2.4e-06   8.68
2476       YKT6       7_32     0.005   43.65 7.3e-07  11.82
12250    MANSC4      12_19     0.629   43.06 8.6e-05  -6.66
14659 LINC01126       2_27     0.014   42.87 2.0e-06  -6.91

Genes with highest PVE

#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
      genename region_tag susie_pip    mu2     PVE      z
3633     CCND2       12_4     0.994 116.64 3.7e-04 -11.03
6954   ZFP36L2       2_27     0.957  80.13 2.4e-04  10.17
8836   FAM234A       16_1     0.943  50.88 1.5e-04   8.39
11270  NCR3LG1      11_12     0.660  51.60 1.1e-04  -7.01
8941      FEN1      11_34     0.656  52.38 1.1e-04   7.56
7048     JAZF1       7_23     0.908  35.74 1.0e-04  -6.09
6523    GIGYF1       7_62     0.843  38.89 1.0e-04  -6.95
2816      MADD      11_29     0.911  33.39 9.7e-05   7.42
9666      CHD2      15_43     0.967  31.34 9.6e-05  -5.58
12080     RNF5       6_26     0.709  40.82 9.2e-05  -6.52
11938   CHAMP1      13_62     0.979  28.90 9.0e-05   5.30
12250   MANSC4      12_19     0.629  43.06 8.6e-05  -6.66
4200     PPDPF      20_37     0.987  26.32 8.3e-05  -5.04
3652   CTNNAL1       9_56     0.701  36.37 8.1e-05  -5.88
1237     CPNE3       8_62     0.966  26.16 8.0e-05   4.97
9852   ARHGAP1      11_28     0.559  39.99 7.1e-05  -6.18
7223      NAT2       8_20     0.929  23.78 7.0e-05  -4.74
10743    AIFM3       22_4     0.706  28.47 6.4e-05  -4.76
7490     UBE2Z      17_28     0.580  33.80 6.2e-05  -5.77
11303 C15orf52      15_14     0.788  23.00 5.8e-05  -4.41

Genes with largest z scores

#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
      genename region_tag susie_pip     mu2     PVE      z
6925     G6PC2      2_102     0.000 1201.61 1.3e-11 -40.22
2474       GCK       7_32     0.000   90.52 6.4e-11  14.68
2476      YKT6       7_32     0.005   43.65 7.3e-07  11.82
3633     CCND2       12_4     0.994  116.64 3.7e-04 -11.03
8011   RPL22L1      3_104     0.055  103.82 1.8e-05  10.33
6954   ZFP36L2       2_27     0.957   80.13 2.4e-04  10.17
6924     SPC25      2_102     0.000  131.08 4.0e-12  10.06
3270     SNX17       2_16     0.096   96.74 2.9e-05  -9.53
3267     NRBP1       2_16     0.040   83.19 1.0e-05  -9.09
8010    EIF5A2      3_104     0.016   46.29 2.4e-06   8.68
7920   NOSTRIN      2_102     0.000   70.08 1.4e-13  -8.43
8836   FAM234A       16_1     0.943   50.88 1.5e-04   8.39
10698   FAM46C       1_72     0.450   35.86 5.1e-05   7.73
4388      RNF6       13_6     0.188   50.02 3.0e-05   7.67
3368     THADA       2_27     0.023   49.46 3.7e-06   7.66
8941      FEN1      11_34     0.656   52.38 1.1e-04   7.56
2816      MADD      11_29     0.911   33.39 9.7e-05   7.42
4009    KBTBD4      11_29     0.019   34.04 2.0e-06  -7.05
11270  NCR3LG1      11_12     0.660   51.60 1.1e-04  -7.01
6523    GIGYF1       7_62     0.843   38.89 1.0e-04  -6.95

Comparing z scores and PIPs

#set nominal signifiance threshold for z scores
alpha <- 0.05

#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)

#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))

plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)

Version Author Date
7e22565 wesleycrouse 2021-09-13
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)

Version Author Date
7e22565 wesleycrouse 2021-09-13
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.006712445
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
      genename region_tag susie_pip     mu2     PVE      z
6925     G6PC2      2_102     0.000 1201.61 1.3e-11 -40.22
2474       GCK       7_32     0.000   90.52 6.4e-11  14.68
2476      YKT6       7_32     0.005   43.65 7.3e-07  11.82
3633     CCND2       12_4     0.994  116.64 3.7e-04 -11.03
8011   RPL22L1      3_104     0.055  103.82 1.8e-05  10.33
6954   ZFP36L2       2_27     0.957   80.13 2.4e-04  10.17
6924     SPC25      2_102     0.000  131.08 4.0e-12  10.06
3270     SNX17       2_16     0.096   96.74 2.9e-05  -9.53
3267     NRBP1       2_16     0.040   83.19 1.0e-05  -9.09
8010    EIF5A2      3_104     0.016   46.29 2.4e-06   8.68
7920   NOSTRIN      2_102     0.000   70.08 1.4e-13  -8.43
8836   FAM234A       16_1     0.943   50.88 1.5e-04   8.39
10698   FAM46C       1_72     0.450   35.86 5.1e-05   7.73
4388      RNF6       13_6     0.188   50.02 3.0e-05   7.67
3368     THADA       2_27     0.023   49.46 3.7e-06   7.66
8941      FEN1      11_34     0.656   52.38 1.1e-04   7.56
2816      MADD      11_29     0.911   33.39 9.7e-05   7.42
4009    KBTBD4      11_29     0.019   34.04 2.0e-06  -7.05
11270  NCR3LG1      11_12     0.660   51.60 1.1e-04  -7.01
6523    GIGYF1       7_62     0.843   38.89 1.0e-04  -6.95

Locus plots for genes and SNPs

ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]

n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
  ctwas_res_region <-  ctwas_res[ctwas_res$region_tag==region_tag_plot,]
  start <- min(ctwas_res_region$pos)
  end <- max(ctwas_res_region$pos)
  
  ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
  ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
  ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
  
  #region name
  print(paste0("Region: ", region_tag_plot))
  
  #table of genes in region
  print(ctwas_res_region_gene[,report_cols])
  
  par(mfrow=c(4,1))
  
  #gene z scores
  plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
   ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
   main=paste0("Region: ", region_tag_plot))
  abline(h=sig_thresh,col="red",lty=2)
  
  #significance threshold for SNPs
  alpha_snp <- 5*10^(-8)
  sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
  
  #snp z scores
  plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
   ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
  abline(h=sig_thresh_snp,col="purple",lty=2)
  
  #gene pips
  plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
  
  #snp pips
  plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 2_102"
     genename region_tag susie_pip     mu2     PVE      z
7920  NOSTRIN      2_102         0   70.08 1.4e-13  -8.43
6924    SPC25      2_102         0  131.08 4.0e-12  10.06
6925    G6PC2      2_102         0 1201.61 1.3e-11 -40.22

Version Author Date
7e22565 wesleycrouse 2021-09-13
[1] "Region: 7_32"
        genename region_tag susie_pip   mu2     PVE     z
17         HECW1       7_32     0.000  8.59 5.1e-12  2.49
2465        COA1       7_32     0.000 20.57 7.3e-11 -2.05
2466       BLVRA       7_32     0.000  5.58 2.4e-12 -1.26
627       MRPS24       7_32     0.000 12.24 1.4e-11  0.07
1072      UBE2D4       7_32     0.000 39.47 1.7e-11 -6.85
2467       URGCP       7_32     0.000 18.44 2.6e-11  1.49
12733 AC004951.6       7_32     0.000  9.17 4.2e-12 -0.37
12967  LINC00957       7_32     0.000  7.13 3.4e-12  2.46
3938        POLM       7_32     0.000  8.05 4.5e-12 -1.05
2472       POLD2       7_32     0.000 25.22 1.3e-10  3.04
2474         GCK       7_32     0.000 90.52 6.4e-11 14.68
2476        YKT6       7_32     0.005 43.65 7.3e-07 11.82
270       NUDCD3       7_32     0.000 12.17 7.0e-12 -1.53
268       NPC1L1       7_32     0.000 14.68 2.2e-11 -0.43
5285       DDX56       7_32     0.000 12.88 1.3e-11  1.54
7441       TMED4       7_32     0.000 16.32 2.8e-11  1.42
2384        OGDH       7_32     0.000 25.61 1.6e-10 -2.17

Version Author Date
7e22565 wesleycrouse 2021-09-13
[1] "Region: 12_4"
     genename region_tag susie_pip    mu2     PVE      z
4562  CRACR2A       12_4     0.010   8.32 2.6e-07   1.13
2871   PARP11       12_4     0.011   8.64 3.0e-07   1.04
3633    CCND2       12_4     0.994 116.64 3.7e-04 -11.03

Version Author Date
7e22565 wesleycrouse 2021-09-13
[1] "Region: 3_104"
           genename region_tag susie_pip    mu2     PVE     z
13155     LINC02082      3_104     0.015   9.14 4.5e-07  1.08
14310 RP11-368I23.3      3_104     0.009   4.61 1.3e-07  0.14
10810        ACTRT3      3_104     0.015   9.70 4.5e-07  1.50
9407         LRRC34      3_104     0.100  26.10 8.3e-06 -2.45
173           SEC62      3_104     0.010   6.33 2.1e-07 -0.96
9702         GPR160      3_104     0.009   5.43 1.6e-07 -0.78
9701           PHC3      3_104     0.011   5.86 2.0e-07 -0.03
8004          PRKCI      3_104     0.009   6.14 1.8e-07  1.46
242          CLDN11      3_104     0.010   4.90 1.5e-07 -0.22
8011        RPL22L1      3_104     0.055 103.82 1.8e-05 10.33
8010         EIF5A2      3_104     0.016  46.29 2.4e-06  8.68

Version Author Date
7e22565 wesleycrouse 2021-09-13
[1] "Region: 2_27"
        genename region_tag susie_pip   mu2     PVE     z
6954     ZFP36L2       2_27     0.957 80.13 2.4e-04 10.17
14659  LINC01126       2_27     0.014 42.87 2.0e-06 -6.91
3368       THADA       2_27     0.023 49.46 3.7e-06  7.66
12562 C1GALT1C1L       2_27     0.013  4.62 1.9e-07 -0.10
6956     PLEKHH2       2_27     0.016  5.83 2.9e-07  0.45
5539    DYNC2LI1       2_27     0.037 15.07 1.8e-06 -2.00
5552      LRPPRC       2_27     0.021  8.94 6.1e-07 -1.16

Version Author Date
7e22565 wesleycrouse 2021-09-13

SNPs with highest PIPs

#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
                id region_tag susie_pip     mu2     PVE      z
54769   rs79687284      1_109     1.000  110.51 3.5e-04 -12.08
75841     rs780093       2_16     1.000  187.59 6.0e-04 -14.95
81780    rs2121564       2_28     1.000   59.90 1.9e-04  -8.01
166409 rs148685409       3_57     1.000  853.37 2.7e-03   2.99
176521  rs72964564       3_76     1.000  235.76 7.5e-04  16.88
196568   rs4234603      3_115     1.000   39.15 1.2e-04  -5.05
322828  rs76623841        6_7     1.000   56.94 1.8e-04   6.80
385777  rs10225316       7_15     1.000   45.08 1.4e-04  -7.51
395269 rs138917529       7_32     1.000   78.08 2.5e-04  10.45
434650   rs7012814       8_12     1.000   62.58 2.0e-04   9.18
528687  rs61856594      10_33     1.000   38.34 1.2e-04   6.23
548498  rs12244851      10_70     1.000  291.83 9.3e-04 -16.46
556804   rs3842727       11_3     1.000   99.24 3.1e-04   8.57
559533   rs3750952       11_6     1.000   52.60 1.7e-04   7.40
644589    rs576674      13_11     1.000  104.18 3.3e-04  10.82
692315  rs35889227      14_45     1.000  115.48 3.7e-04  11.34
869379   rs2232318      2_102     1.000  118.88 3.8e-04  -2.07
869385    rs560887      2_102     1.000 2693.07 8.6e-03 -57.71
869395  rs71397670      2_102     1.000  839.32 2.7e-03 -42.67
114580  rs12692596       2_96     0.999   45.79 1.5e-04  -7.24
287368  rs12189028       5_45     0.999   33.95 1.1e-04   5.08
477622  rs10758593        9_4     0.999   70.73 2.2e-04  -8.64
54778    rs3754140      1_109     0.998   57.31 1.8e-04  10.01
235039  rs11728350       4_69     0.998   42.34 1.3e-04  -6.68
510433 rs115478735       9_70     0.997   81.69 2.6e-04  -9.52
754067  rs28489441      17_15     0.997   32.27 1.0e-04   5.83
810592  rs34783010      19_32     0.995   32.77 1.0e-04   6.40
322847  rs55792466        6_7     0.994   97.58 3.1e-04   9.67
484862 rs572168822       9_16     0.993   40.97 1.3e-04   6.61
658936   rs1327315      13_40     0.992   34.72 1.1e-04   7.02
869466  rs56100844      2_102     0.991  152.72 4.8e-04  17.86
635707   rs4765221      12_76     0.989   33.47 1.1e-04  -5.80
395249  rs17769733       7_32     0.986  132.27 4.1e-04   7.76
547950  rs11195508      10_70     0.986   60.78 1.9e-04  10.76
556809  rs10743152       11_3     0.983   39.26 1.2e-04  -1.91
411852   rs4729755       7_63     0.982   26.37 8.2e-05  -4.96
475580   rs4977218       8_94     0.975   29.86 9.2e-05   5.34
484857   rs1333045       9_16     0.975   40.69 1.3e-04  -6.62
702978  rs12912777      15_13     0.974   25.21 7.8e-05  -3.77
627206   rs6538805      12_58     0.973   31.92 9.9e-05   6.74
755     rs60330317        1_2     0.968   36.62 1.1e-04   6.18
34659     rs893230       1_72     0.968   39.62 1.2e-04   7.56
642899  rs60353775       13_7     0.968   82.20 2.5e-04  -9.78
191411  rs10653660      3_104     0.967  346.87 1.1e-03  23.54
174555   rs9875598       3_73     0.962   27.15 8.3e-05   5.10
466327   rs4433184       8_78     0.960   52.39 1.6e-04  -4.78
395261  rs10259649       7_32     0.949  396.06 1.2e-03 -28.65
511723  rs28624681       9_73     0.941   51.50 1.5e-04  -7.55
169326  rs62276527       3_63     0.938   33.19 9.9e-05  -5.85
699455  rs35767992       15_4     0.936   24.57 7.3e-05  -4.72
571437 rs117396352      11_28     0.930   27.38 8.1e-05  -4.93
832926   rs6026545      20_34     0.928   31.93 9.4e-05  -5.72
755576 rs543720569      17_18     0.924   44.21 1.3e-04   7.03
454682  rs10957704       8_54     0.923   24.50 7.2e-05  -4.68
796901  rs10410896       19_5     0.917   27.00 7.9e-05  -5.04
560955  rs34718245       11_9     0.916   32.22 9.4e-05   5.37
710385  rs11637069      15_29     0.913   28.11 8.2e-05  -5.00
574641   rs7941126      11_36     0.911   31.41 9.1e-05   5.63
572457 rs182512331      11_31     0.899   27.53 7.9e-05   5.05
561887 rs117720468      11_11     0.890   44.43 1.3e-04  -6.82
336539  rs62396405       6_31     0.887   25.89 7.3e-05   4.78
287429   rs6887019       5_45     0.886   26.51 7.5e-05  -5.23
359059 rs118126621       6_73     0.881   25.14 7.0e-05  -4.67
638739   rs1882297      12_82     0.875   38.50 1.1e-04  -6.47
793684  rs41404946      18_44     0.870   24.21 6.7e-05  -4.56
572039 rs139913257      11_30     0.857   31.37 8.5e-05   5.63
659287  rs79317015      13_40     0.840   24.30 6.5e-05  -4.40
365537 rs112388031       6_87     0.838   26.15 7.0e-05   4.65
154838   rs3172494       3_34     0.837   26.99 7.2e-05   5.09
257174  rs62332172      4_113     0.833   24.67 6.5e-05  -4.54
19026    rs6699568       1_42     0.827   27.21 7.1e-05  -5.02
397605  rs11763778       7_36     0.817   41.83 1.1e-04   7.61
116081 rs112308555      2_103     0.815   24.92 6.4e-05  -4.32
196108  rs73185688      3_114     0.807   25.15 6.4e-05  -4.69
605021  rs12582270      12_17     0.806   25.76 6.6e-05   4.72
576671  rs11603349      11_41     0.805   44.79 1.1e-04   6.78

SNPs with largest effect sizes

#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")

Version Author Date
7e22565 wesleycrouse 2021-09-13
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
                id region_tag susie_pip     mu2     PVE      z
869385    rs560887      2_102     1.000 2693.07 8.6e-03 -57.71
869374    rs573225      2_102     0.000 2299.86 5.9e-11 -54.23
869362  rs13431652      2_102     0.000 2198.87 1.1e-10  53.00
869519    rs853787      2_102     0.004 2158.94 2.4e-05 -53.06
869516    rs853789      2_102     0.003 2155.95 2.1e-05 -53.04
869546    rs853777      2_102     0.002 2121.19 1.2e-05 -52.75
869411    rs537183      2_102     0.000 2098.86 1.8e-09 -52.66
869409    rs563694      2_102     0.000 2093.73 1.8e-09 -52.60
869434    rs557462      2_102     0.000 2093.31 1.7e-09 -52.60
869413    rs502570      2_102     0.000 2061.83 1.3e-09 -52.14
869427    rs475612      2_102     0.000 2050.99 5.8e-10 -52.14
869430    rs518598      2_102     0.000 2050.44 5.8e-10 -52.13
869423    rs578763      2_102     0.000 2047.87 7.2e-10 -52.08
869418    rs580670      2_102     0.000 2036.60 6.2e-10 -51.95
869462    rs508506      2_102     0.000 1988.79 3.4e-09 -51.54
869453    rs569829      2_102     0.000 1985.83 2.9e-09 -51.52
869474    rs494874      2_102     0.000 1985.61 4.0e-09 -51.48
869481    rs552976      2_102     0.000 1977.83 3.9e-09 -51.36
869420    rs580613      2_102     0.000 1977.14 4.8e-10 -51.23
869419    rs580639      2_102     0.000 1963.41 1.4e-09 -51.53
869449    rs486981      2_102     0.000 1945.21 1.1e-09 -51.05
869450    rs485094      2_102     0.000 1943.80 1.1e-09 -51.04
869456    rs566879      2_102     0.000 1942.59 1.0e-09 -51.03
869455    rs579060      2_102     0.000 1940.45 1.0e-09 -51.00
869454    rs569805      2_102     0.000 1937.19 1.1e-09 -50.95
869432 rs796703396      2_102     0.000 1857.75 4.8e-12 -49.98
869407    rs504979      2_102     0.000 1785.37 4.4e-12 -49.37
869451    rs484066      2_102     0.000 1717.34 2.8e-12 -48.49
869525  rs71397677      2_102     0.000 1661.43 1.0e-10 -47.29
869364  rs12475700      2_102     0.000 1322.20 2.4e-14 -38.64
869494   rs2544360      2_102     0.000 1286.73 1.7e-11 -39.60
869490    rs508743      2_102     0.000 1282.53 1.1e-11  39.58
869521    rs862662      2_102     0.000 1273.92 1.0e-11 -39.50
869493  rs79933700      2_102     0.000 1273.37 8.8e-12  39.46
869540   rs1101532      2_102     0.000 1268.64 1.2e-11 -39.54
869500   rs2685807      2_102     0.000 1259.72 7.9e-12 -39.24
869514    rs853791      2_102     0.000 1255.90 6.5e-12 -39.21
869511   rs2250677      2_102     0.000 1250.19 5.6e-12 -39.12
869472    rs531772      2_102     0.000 1234.27 7.6e-13 -38.74
869549    rs853775      2_102     0.000 1234.14 5.0e-12 -39.04
869445    rs575671      2_102     0.000 1229.97 6.6e-13 -38.66
869486    rs567074      2_102     0.000 1227.64 2.1e-12  38.66
869443    rs473351      2_102     0.000 1223.97 5.9e-13 -38.57
869530    rs853781      2_102     0.000 1212.94 2.2e-12 -38.77
869529    rs853782      2_102     0.000 1210.48 2.0e-12 -38.75
869425    rs500432      2_102     0.000 1205.74 9.9e-14 -37.99
869551    rs853773      2_102     0.000 1201.04 4.1e-11 -39.87
869518    rs853788      2_102     0.000 1124.29 1.1e-13 -37.47
869539  rs35351007      2_102     0.000 1123.25 2.8e-13 -37.70
869522    rs853785      2_102     0.000 1122.24 9.8e-14 -37.45

SNPs with highest PVE

#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
                id region_tag susie_pip     mu2     PVE      z
869385    rs560887      2_102     1.000 2693.07 0.00860 -57.71
166409 rs148685409       3_57     1.000  853.37 0.00270   2.99
869395  rs71397670      2_102     1.000  839.32 0.00270 -42.67
869387    rs492594      2_102     0.764  795.31 0.00190 -17.82
166411   rs1436648       3_57     0.505  875.28 0.00140   3.09
166412   rs7619398       3_57     0.518  875.49 0.00140   3.08
395274    rs917793       7_32     0.582  704.38 0.00130 -33.28
395261  rs10259649       7_32     0.949  396.06 0.00120 -28.65
191411  rs10653660      3_104     0.967  346.87 0.00110  23.54
166413   rs2256473       3_57     0.367  874.83 0.00100   3.06
548498  rs12244851      10_70     1.000  291.83 0.00093 -16.46
166410   rs2575789       3_57     0.314  874.71 0.00087   3.06
176521  rs72964564       3_76     1.000  235.76 0.00075  16.88
75841     rs780093       2_16     1.000  187.59 0.00060 -14.95
395278   rs2908282       7_32     0.258  702.30 0.00057 -33.26
869466  rs56100844      2_102     0.991  152.72 0.00048  17.86
385837   rs1974619       7_15     0.606  241.15 0.00046 -16.89
395249  rs17769733       7_32     0.986  132.27 0.00041   7.76
869379   rs2232318      2_102     1.000  118.88 0.00038  -2.07
692315  rs35889227      14_45     1.000  115.48 0.00037  11.34
869391    rs567243      2_102     0.145  797.22 0.00037 -17.95
395268   rs4607517       7_32     0.160  702.26 0.00036 -33.23
54769   rs79687284      1_109     1.000  110.51 0.00035 -12.08
644589    rs576674      13_11     1.000  104.18 0.00033  10.82
322847  rs55792466        6_7     0.994   97.58 0.00031   9.67
556804   rs3842727       11_3     1.000   99.24 0.00031   8.57
510433 rs115478735       9_70     0.997   81.69 0.00026  -9.52
395269 rs138917529       7_32     1.000   78.08 0.00025  10.45
642899  rs60353775       13_7     0.968   82.20 0.00025  -9.78
869390    rs570876      2_102     0.090  796.62 0.00023 -18.01
477622  rs10758593        9_4     0.999   70.73 0.00022  -8.64
292632  rs17085655       5_57     0.547  118.27 0.00021  11.48
916610  rs11039165      11_29     0.653  101.61 0.00021  11.82
434650   rs7012814       8_12     1.000   62.58 0.00020   9.18
81780    rs2121564       2_28     1.000   59.90 0.00019  -8.01
547950  rs11195508      10_70     0.986   60.78 0.00019  10.76
54778    rs3754140      1_109     0.998   57.31 0.00018  10.01
322828  rs76623841        6_7     1.000   56.94 0.00018   6.80
548492 rs117764423      10_70     0.726   76.01 0.00018   5.24
385835  rs10228796       7_15     0.221  238.64 0.00017 -16.82
559533   rs3750952       11_6     1.000   52.60 0.00017   7.40
261258  rs12499544      4_119     0.723   67.95 0.00016   8.39
466327   rs4433184       8_78     0.960   52.39 0.00016  -4.78
823488 rs111405491      20_16     0.674   74.16 0.00016   8.97
114580  rs12692596       2_96     0.999   45.79 0.00015  -7.24
327439   rs2206734       6_15     0.740   64.95 0.00015  -8.26
466351  rs28529793       8_78     0.688   69.90 0.00015   6.80
511723  rs28624681       9_73     0.941   51.50 0.00015  -7.55
385777  rs10225316       7_15     1.000   45.08 0.00014  -7.51
235039  rs11728350       4_69     0.998   42.34 0.00013  -6.68

SNPs with largest z scores

#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
                id region_tag susie_pip     mu2     PVE      z
869385    rs560887      2_102     1.000 2693.07 8.6e-03 -57.71
869374    rs573225      2_102     0.000 2299.86 5.9e-11 -54.23
869519    rs853787      2_102     0.004 2158.94 2.4e-05 -53.06
869516    rs853789      2_102     0.003 2155.95 2.1e-05 -53.04
869362  rs13431652      2_102     0.000 2198.87 1.1e-10  53.00
869546    rs853777      2_102     0.002 2121.19 1.2e-05 -52.75
869411    rs537183      2_102     0.000 2098.86 1.8e-09 -52.66
869409    rs563694      2_102     0.000 2093.73 1.8e-09 -52.60
869434    rs557462      2_102     0.000 2093.31 1.7e-09 -52.60
869413    rs502570      2_102     0.000 2061.83 1.3e-09 -52.14
869427    rs475612      2_102     0.000 2050.99 5.8e-10 -52.14
869430    rs518598      2_102     0.000 2050.44 5.8e-10 -52.13
869423    rs578763      2_102     0.000 2047.87 7.2e-10 -52.08
869418    rs580670      2_102     0.000 2036.60 6.2e-10 -51.95
869462    rs508506      2_102     0.000 1988.79 3.4e-09 -51.54
869419    rs580639      2_102     0.000 1963.41 1.4e-09 -51.53
869453    rs569829      2_102     0.000 1985.83 2.9e-09 -51.52
869474    rs494874      2_102     0.000 1985.61 4.0e-09 -51.48
869481    rs552976      2_102     0.000 1977.83 3.9e-09 -51.36
869420    rs580613      2_102     0.000 1977.14 4.8e-10 -51.23
869449    rs486981      2_102     0.000 1945.21 1.1e-09 -51.05
869450    rs485094      2_102     0.000 1943.80 1.1e-09 -51.04
869456    rs566879      2_102     0.000 1942.59 1.0e-09 -51.03
869455    rs579060      2_102     0.000 1940.45 1.0e-09 -51.00
869454    rs569805      2_102     0.000 1937.19 1.1e-09 -50.95
869432 rs796703396      2_102     0.000 1857.75 4.8e-12 -49.98
869407    rs504979      2_102     0.000 1785.37 4.4e-12 -49.37
869451    rs484066      2_102     0.000 1717.34 2.8e-12 -48.49
869525  rs71397677      2_102     0.000 1661.43 1.0e-10 -47.29
869395  rs71397670      2_102     1.000  839.32 2.7e-03 -42.67
869551    rs853773      2_102     0.000 1201.04 4.1e-11 -39.87
869494   rs2544360      2_102     0.000 1286.73 1.7e-11 -39.60
869490    rs508743      2_102     0.000 1282.53 1.1e-11  39.58
869540   rs1101532      2_102     0.000 1268.64 1.2e-11 -39.54
869521    rs862662      2_102     0.000 1273.92 1.0e-11 -39.50
869493  rs79933700      2_102     0.000 1273.37 8.8e-12  39.46
869500   rs2685807      2_102     0.000 1259.72 7.9e-12 -39.24
869514    rs853791      2_102     0.000 1255.90 6.5e-12 -39.21
869511   rs2250677      2_102     0.000 1250.19 5.6e-12 -39.12
869549    rs853775      2_102     0.000 1234.14 5.0e-12 -39.04
869530    rs853781      2_102     0.000 1212.94 2.2e-12 -38.77
869529    rs853782      2_102     0.000 1210.48 2.0e-12 -38.75
869472    rs531772      2_102     0.000 1234.27 7.6e-13 -38.74
869445    rs575671      2_102     0.000 1229.97 6.6e-13 -38.66
869486    rs567074      2_102     0.000 1227.64 2.1e-12  38.66
869364  rs12475700      2_102     0.000 1322.20 2.4e-14 -38.64
869443    rs473351      2_102     0.000 1223.97 5.9e-13 -38.57
869425    rs500432      2_102     0.000 1205.74 9.9e-14 -37.99
869539  rs35351007      2_102     0.000 1123.25 2.8e-13 -37.70
869542    rs853778      2_102     0.000 1112.85 7.9e-14 -37.54

Gene set enrichment for genes with PIP>0.8

#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ... 
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]

#number of genes for gene set enrichment
length(genes)
[1] 14
if (length(genes)>0){
  GO_enrichment <- enrichr(genes, dbs)

  for (db in dbs){
    print(db)
    df <- GO_enrichment[[db]]
    df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
    print(df)
  }
  
  #DisGeNET enrichment
  
  # devtools::install_bitbucket("ibi_group/disgenet2r")
  library(disgenet2r)
  
  disgenet_api_key <- get_disgenet_api_key(
                    email = "wesleycrouse@gmail.com", 
                    password = "uchicago1" )
  
  Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
  
  res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
                               database = "CURATED" )
  
  df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio",  "BgRatio")]
  print(df)
  
  #WebGestalt enrichment
  library(WebGestaltR)
  
  background <- ctwas_gene_res$genename
  
  #listGeneSet()
  databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
  
  enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
                              interestGene=genes, referenceGene=background,
                              enrichDatabase=databases, interestGeneType="genesymbol",
                              referenceGeneType="genesymbol", isOutput=F)
  print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
  Querying GO_Biological_Process_2021... Done.
  Querying GO_Cellular_Component_2021... Done.
  Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
                                                                                                             Term
1                                              insulin-like growth factor receptor signaling pathway (GO:0048009)
2                   positive regulation of cyclin-dependent protein serine/threonine kinase activity (GO:0045737)
3                                    positive regulation of cyclin-dependent protein kinase activity (GO:1904031)
4                                       positive regulation of G1/S transition of mitotic cell cycle (GO:1900087)
5                                                        cellular response to growth factor stimulus (GO:0071363)
6                                            positive regulation of cell cycle G1/S phase transition (GO:1902808)
7                                                                          response to growth factor (GO:0070848)
8                                         positive regulation of mitotic cell cycle phase transition (GO:1901992)
9                                                                             ERBB signaling pathway (GO:0038127)
10                                                                 positive regulation of cell cycle (GO:0045787)
11                                               regulation of G1/S transition of mitotic cell cycle (GO:2000045)
12                                                            regulation of fat cell differentiation (GO:0045598)
13                           regulation of cyclin-dependent protein serine/threonine kinase activity (GO:0000079)
14                                   positive regulation of protein serine/threonine kinase activity (GO:0071902)
15                                                                 response to tumor necrosis factor (GO:0034612)
16                                            regulation of protein serine/threonine kinase activity (GO:0071900)
17                                                              regulation of mRNA catabolic process (GO:0061013)
18                                                 positive regulation of deoxyribonuclease activity (GO:0032077)
19                                                                    cellular response to metal ion (GO:0071248)
20                    cellular response to granulocyte macrophage colony-stimulating factor stimulus (GO:0097011)
21                                 regulation of long-chain fatty acid import across plasma membrane (GO:0010746)
22                                      response to granulocyte macrophage colony-stimulating factor (GO:0097012)
23                                                   positive regulation of I-kappaB phosphorylation (GO:1903721)
24                                                   xenobiotic transport across blood-brain barrier (GO:1990962)
25                                                                      regulation of mRNA stability (GO:0043488)
26                                                     regulation of Rab protein signal transduction (GO:0032483)
27                                                               cellular response to interleukin-18 (GO:0071351)
28                                                negative regulation of cell cycle phase transition (GO:1901988)
29                                                sphingosine-1-phosphate receptor signaling pathway (GO:0003376)
30                                                            regulation of I-kappaB phosphorylation (GO:1903719)
31                                                                          T cell apoptotic process (GO:0070231)
32                                                                            definitive hemopoiesis (GO:0060216)
33                                                         interleukin-18-mediated signaling pathway (GO:0035655)
34                                           positive regulation of mitochondrial membrane potential (GO:0010918)
35                                                     mammary gland epithelial cell differentiation (GO:0060644)
36                                                       negative regulation of fatty acid oxidation (GO:0046322)
37                                                      regulation of endodeoxyribonuclease activity (GO:0032071)
38                                                               protein localization to microtubule (GO:0035372)
39                                                  protein localization to microtubule cytoskeleton (GO:0072698)
40                                                         regulation of glucan biosynthetic process (GO:0010962)
41                                                                                           anoikis (GO:0043276)
42                                                                 glycerolipid biosynthetic process (GO:0045017)
43                                                         positive regulation of membrane potential (GO:0045838)
44                           cellular response to oxidised low-density lipoprotein particle stimulus (GO:0140052)
45                                                    negative regulation of transmembrane transport (GO:0034763)
46                                                  regulation of type B pancreatic cell development (GO:2000074)
47                                                                                T cell homeostasis (GO:0043029)
48                                                             tetrahydrobiopterin metabolic process (GO:0046146)
49                                                                regulation of lymphocyte migration (GO:2000401)
50                                                                           urate metabolic process (GO:0046415)
51                                                       negative regulation of lymphocyte migration (GO:2000402)
52                                                        cellular response to tumor necrosis factor (GO:0071356)
53                                                               regulation of programmed cell death (GO:0043067)
54                                                          regulation of glycogen metabolic process (GO:0070873)
55                                                        regulation of leukocyte cell-cell adhesion (GO:1903037)
56                                         attachment of mitotic spindle microtubules to kinetochore (GO:0051315)
57                                                      protein localization to condensed chromosome (GO:1903083)
58                                                  negative regulation of stem cell differentiation (GO:2000737)
59                                                   pteridine-containing compound metabolic process (GO:0042558)
60                                    positive regulation of cellular carbohydrate metabolic process (GO:0010676)
61                                                 negative regulation of mononuclear cell migration (GO:0071676)
62                                                             protein localization to mitochondrion (GO:0070585)
63                                               positive regulation of protein modification process (GO:0031401)
64                                                           regulation of fatty acid beta-oxidation (GO:0031998)
65                                                                  T cell differentiation in thymus (GO:0033077)
66                                                  positive regulation of glucose metabolic process (GO:0010907)
67                                              positive regulation of glycogen biosynthetic process (GO:0045725)
68                                                                                  response to UV-A (GO:0070141)
69                                                               protein localization to kinetochore (GO:0034501)
70  positive regulation of nuclear-transcribed mRNA catabolic process, deadenylation-dependent decay (GO:1900153)
71                                                              mammary gland epithelium development (GO:0061180)
72                                                                 nitric oxide biosynthetic process (GO:0006809)
73                                                 positive regulation of glycogen metabolic process (GO:0070875)
74           regulation of nuclear-transcribed mRNA catabolic process, deadenylation-dependent decay (GO:1900151)
75                                                              3'-UTR-mediated mRNA destabilization (GO:0061158)
76                                                                    nitric oxide metabolic process (GO:0046209)
77                                           positive regulation of small molecule metabolic process (GO:0062013)
78                                                                         transepithelial transport (GO:0070633)
79                                                                            heme catabolic process (GO:0042167)
80                                                   porphyrin-containing compound catabolic process (GO:0006787)
81                                                      cellular response to corticosteroid stimulus (GO:0071384)
82                                  negative regulation of release of cytochrome c from mitochondria (GO:0090201)
83                                            protein localization to chromosome, centromeric region (GO:0071459)
84                                                      cellular response to glucocorticoid stimulus (GO:0071385)
85                                             positive regulation of nitric-oxide synthase activity (GO:0051000)
86                                                              regulation of B cell differentiation (GO:0045577)
87                                               negative regulation of leukocyte cell-cell adhesion (GO:1903038)
88                                                    negative regulation of lipid catabolic process (GO:0050995)
89                      regulation of oxidative stress-induced intrinsic apoptotic signaling pathway (GO:1902175)
90                                                 cell migration involved in sprouting angiogenesis (GO:0002042)
91                                                            positive regulation of phosphorylation (GO:0042327)
92                                                                 excitatory postsynaptic potential (GO:0060079)
93                                                  maintenance of protein localization in organelle (GO:0072595)
94                                                                            diol metabolic process (GO:0034311)
95             negative regulation of oxidative stress-induced intrinsic apoptotic signaling pathway (GO:1902176)
96                                                                   regulation of apoptotic process (GO:0042981)
97                                                          regulation of lymphocyte differentiation (GO:0045619)
98                                                                              xenobiotic transport (GO:0042908)
99                                                             regulation of protein phosphorylation (GO:0001932)
100                                                cellular response to nerve growth factor stimulus (GO:1990090)
101                                                                        heme biosynthetic process (GO:0006783)
102                                  negative regulation of signal transduction in absence of ligand (GO:1901099)
103                                                                            ERK1 and ERK2 cascade (GO:0070371)
104                negative regulation of extrinsic apoptotic signaling pathway in absence of ligand (GO:2001240)
105                                                          blood vessel endothelial cell migration (GO:0043534)
106                                                             regulation of monooxygenase activity (GO:0032768)
107                                                                       protein kinase B signaling (GO:0043491)
108                                               porphyrin-containing compound biosynthetic process (GO:0006779)
109                                                     chemical synaptic transmission, postsynaptic (GO:0099565)
110                                                    positive regulation of ERBB signaling pathway (GO:1901186)
111                                                   positive regulation of lipid metabolic process (GO:0045834)
112                                                      regulation of glycogen biosynthetic process (GO:0005979)
113                                                                                    TOR signaling (GO:0031929)
114                                                            positive regulation of glucose import (GO:0046326)
115                                                            negative regulation of macroautophagy (GO:0016242)
116                                         positive regulation of nitric oxide biosynthetic process (GO:0045429)
117                                                                       response to glucocorticoid (GO:0051384)
118                                                                 cellular response to cadmium ion (GO:0071276)
119                         regulation of extrinsic apoptotic signaling pathway in absence of ligand (GO:2001239)
120                                            positive regulation of nitric oxide metabolic process (GO:1904407)
121                                                                                  amide transport (GO:0042886)
122                                                                        sulfur compound transport (GO:0072348)
123                                                                  regulation of B cell activation (GO:0050864)
124                                       negative regulation of oxidative stress-induced cell death (GO:1903202)
125                               regulation of plasma membrane bounded cell projection organization (GO:0120035)
126                                                    positive regulation of monooxygenase activity (GO:0032770)
127                                            cellular response to epidermal growth factor stimulus (GO:0071364)
    Overlap Adjusted.P.value              Genes
1      2/11      0.008103861        GIGYF1;AKT1
2      2/17      0.009298289         CCND2;AKT1
3      2/20      0.009298289         CCND2;AKT1
4      2/26      0.010736217         CCND2;AKT1
5     3/158      0.010736217 AKT1;CPNE3;ZFP36L2
6      2/35      0.014472206         CCND2;AKT1
7      2/41      0.017054687         AKT1;CPNE3
8      2/58      0.029878625         CCND2;AKT1
9      2/66      0.030918376         AKT1;CPNE3
10     2/66      0.030918376         CCND2;AKT1
11     2/71      0.032497906         CCND2;AKT1
12     2/80      0.036587970       AKT1;ZFP36L2
13     2/82      0.036587970         CCND2;AKT1
14    2/106      0.044174001         CCND2;AKT1
15    2/110      0.044174001       AKT1;ZFP36L2
16    2/111      0.044174001         CCND2;AKT1
17    2/122      0.044174001       AKT1;ZFP36L2
18      1/5      0.044174001               AKT1
19    2/131      0.044174001         AKT1;CPNE3
20      1/6      0.044174001            ZFP36L2
21      1/6      0.044174001               AKT1
22      1/6      0.044174001            ZFP36L2
23      1/6      0.044174001               AKT1
24      1/6      0.044174001              ABCG2
25    2/146      0.044174001       AKT1;ZFP36L2
26      1/7      0.044174001               MADD
27      1/7      0.044174001               AKT1
28      1/7      0.044174001            ZFP36L2
29      1/7      0.044174001               AKT1
30      1/7      0.044174001               AKT1
31      1/7      0.044174001               AKT1
32      1/7      0.044174001            ZFP36L2
33      1/7      0.044174001               AKT1
34      1/7      0.044174001               AKT1
35      1/8      0.044174001               AKT1
36      1/8      0.044174001               AKT1
37      1/8      0.044174001               AKT1
38      1/8      0.044174001             CHAMP1
39      1/8      0.044174001             CHAMP1
40      1/9      0.044174001               AKT1
41      1/9      0.044174001               AKT1
42      1/9      0.044174001              CPNE3
43      1/9      0.044174001               AKT1
44     1/10      0.044174001               AKT1
45     1/10      0.044174001               AKT1
46     1/10      0.044174001               AKT1
47     1/11      0.044174001               AKT1
48     1/11      0.044174001               AKT1
49     1/11      0.044174001               AKT1
50     1/11      0.044174001              ABCG2
51     1/11      0.044174001               AKT1
52    2/194      0.044174001       AKT1;ZFP36L2
53    2/194      0.044174001          MADD;AKT1
54     1/12      0.044174001               AKT1
55     1/12      0.044174001               AKT1
56     1/12      0.044174001             CHAMP1
57     1/12      0.044174001             CHAMP1
58     1/13      0.044174001            ZFP36L2
59     1/13      0.044174001               AKT1
60     1/13      0.044174001               AKT1
61     1/13      0.044174001               AKT1
62     1/13      0.044174001               AKT1
63    2/214      0.044174001         CCND2;AKT1
64     1/14      0.044174001               AKT1
65     1/14      0.044174001            ZFP36L2
66     1/14      0.044174001               AKT1
67     1/14      0.044174001               AKT1
68     1/14      0.044174001               AKT1
69     1/14      0.044174001             CHAMP1
70     1/15      0.044174001            ZFP36L2
71     1/15      0.044174001               AKT1
72     1/15      0.044174001               AKT1
73     1/15      0.044174001               AKT1
74     1/15      0.044174001            ZFP36L2
75     1/16      0.044174001            ZFP36L2
76     1/16      0.044174001               AKT1
77     1/16      0.044174001               AKT1
78     1/16      0.044174001              ABCG2
79     1/16      0.044174001              ABCG2
80     1/16      0.044174001              ABCG2
81     1/16      0.044174001            ZFP36L2
82     1/16      0.044174001               AKT1
83     1/17      0.045234301             CHAMP1
84     1/18      0.045234301            ZFP36L2
85     1/18      0.045234301               AKT1
86     1/18      0.045234301            ZFP36L2
87     1/18      0.045234301               AKT1
88     1/18      0.045234301               AKT1
89     1/18      0.045234301               AKT1
90     1/18      0.045234301               AKT1
91    2/253      0.045234301         CCND2;AKT1
92     1/19      0.045234301               AKT1
93     1/19      0.045234301               AKT1
94     1/19      0.045234301               AKT1
95     1/19      0.045234301               AKT1
96    3/742      0.046142478    CCND2;MADD;AKT1
97     1/20      0.046142478            ZFP36L2
98     1/20      0.046142478              ABCG2
99    2/266      0.047378384         CCND2;AKT1
100    1/22      0.049217157               AKT1
101    1/22      0.049217157              ABCG2
102    1/23      0.049478084               AKT1
103    1/23      0.049478084            ZFP36L2
104    1/23      0.049478084               AKT1
105    1/23      0.049478084               AKT1
106    1/24      0.049719225               AKT1
107    1/24      0.049719225               AKT1
108    1/24      0.049719225              ABCG2
109    1/25      0.049719225               AKT1
110    1/25      0.049719225               AKT1
111    1/25      0.049719225               AKT1
112    1/26      0.049719225               AKT1
113    1/26      0.049719225               AKT1
114    1/26      0.049719225               AKT1
115    1/26      0.049719225               AKT1
116    1/27      0.049719225               AKT1
117    1/27      0.049719225            ZFP36L2
118    1/27      0.049719225               AKT1
119    1/28      0.049719225               AKT1
120    1/28      0.049719225               AKT1
121    1/28      0.049719225              ABCG2
122    1/28      0.049719225              ABCG2
123    1/28      0.049719225            ZFP36L2
124    1/28      0.049719225               AKT1
125    1/28      0.049719225               AKT1
126    1/28      0.049719225               AKT1
127    1/28      0.049719225            ZFP36L2
[1] "GO_Cellular_Component_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
                                                        Term Overlap
1             protein kinase regulator activity (GO:0019887)    2/98
2      urate transmembrane transporter activity (GO:0015143)     1/5
3          potassium channel activator activity (GO:0099104)     1/5
4 riboflavin transmembrane transporter activity (GO:0032217)     1/5
  Adjusted.P.value      Genes
1       0.03408014 CCND2;MADD
2       0.03408014      ABCG2
3       0.03408014       AKT1
4       0.03408014      ABCG2
MADD gene(s) from the input list not found in DisGeNET CURATEDPPDPF gene(s) from the input list not found in DisGeNET CURATEDFAM214B gene(s) from the input list not found in DisGeNET CURATEDGIGYF1 gene(s) from the input list not found in DisGeNET CURATED
                                                   Description         FDR
65                                         Prostatic Neoplasms 0.001129474
127                             Malignant neoplasm of prostate 0.001129474
18                                        Colorectal Neoplasms 0.008314537
23                    Diabetes Mellitus, Non-Insulin-Dependent 0.022319439
29                                                    Fibrosis 0.022319439
163               Squamous cell carcinoma of the head and neck 0.022319439
180                                                  Cirrhosis 0.022319439
181 URIC ACID CONCENTRATION, SERUM, QUANTITATIVE TRAIT LOCUS 1 0.022319439
189           Precursor T-Cell Lymphoblastic Leukemia-Lymphoma 0.022319439
197                                          COWDEN SYNDROME 6 0.022319439
    Ratio  BgRatio
65   6/10 616/9703
127  6/10 616/9703
18   4/10 277/9703
23   3/10 221/9703
29   2/10  50/9703
163  2/10  52/9703
180  2/10  50/9703
181  1/10   1/9703
189  2/10  53/9703
197  1/10   1/9703
******************************************

*                                        *

*          Welcome to WebGestaltR !      *

*                                        *

******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet,
minNum = minNum, : No significant gene set is identified based on FDR 0.05!
NULL

Sensitivity, specificity and precision for silver standard genes

library("readxl")

known_annotations <- read_xlsx("data/summary_known_genes_annotations.xlsx", sheet="Glucose")
known_annotations <- unique(known_annotations$`Gene Symbol`)

unrelated_genes <- ctwas_gene_res$genename[!(ctwas_gene_res$genename %in% known_annotations)]

#number of genes in known annotations
print(length(known_annotations))
[1] 87
#number of genes in known annotations with imputed expression
print(sum(known_annotations %in% ctwas_gene_res$genename))
[1] 39
#assign ctwas, TWAS, and bystander genes
ctwas_genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
twas_genes <- ctwas_gene_res$genename[abs(ctwas_gene_res$z)>sig_thresh]
novel_genes <- ctwas_genes[!(ctwas_genes %in% twas_genes)]

#significance threshold for TWAS
print(sig_thresh)
[1] 4.618904
#number of ctwas genes
length(ctwas_genes)
[1] 14
#number of TWAS genes
length(twas_genes)
[1] 87
#show novel genes (ctwas genes with not in TWAS genes)
ctwas_gene_res[ctwas_gene_res$genename %in% novel_genes,report_cols]
     genename region_tag susie_pip   mu2     PVE     z
3620    ABCG2       4_59     0.812 19.76 5.1e-05 -4.10
66    FAM214B       9_27     0.831 21.22 5.6e-05 -4.20
6020     AKT1      14_55     0.859 20.40 5.6e-05 -4.14
#sensitivity / recall
sensitivity <- rep(NA,2)
names(sensitivity) <- c("ctwas", "TWAS")
sensitivity["ctwas"] <- sum(ctwas_genes %in% known_annotations)/length(known_annotations)
sensitivity["TWAS"] <- sum(twas_genes %in% known_annotations)/length(known_annotations)
sensitivity
     ctwas       TWAS 
0.00000000 0.03448276 
#specificity
specificity <- rep(NA,2)
names(specificity) <- c("ctwas", "TWAS")
specificity["ctwas"] <- sum(!(unrelated_genes %in% ctwas_genes))/length(unrelated_genes)
specificity["TWAS"] <- sum(!(unrelated_genes %in% twas_genes))/length(unrelated_genes)
specificity
    ctwas      TWAS 
0.9989166 0.9934995 
#precision / PPV
precision <- rep(NA,2)
names(precision) <- c("ctwas", "TWAS")
precision["ctwas"] <- sum(ctwas_genes %in% known_annotations)/length(ctwas_genes)
precision["TWAS"] <- sum(twas_genes %in% known_annotations)/length(twas_genes)
precision
     ctwas       TWAS 
0.00000000 0.03448276 
#ROC curves

pip_range <- (0:1000)/1000
sensitivity <- rep(NA, length(pip_range))
specificity <- rep(NA, length(pip_range))

for (index in 1:length(pip_range)){
  pip <- pip_range[index]
  ctwas_genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>=pip]
  sensitivity[index] <- sum(ctwas_genes %in% known_annotations)/length(known_annotations)
  specificity[index] <- sum(!(unrelated_genes %in% ctwas_genes))/length(unrelated_genes)
}

plot(1-specificity, sensitivity, type="l", xlim=c(0,1), ylim=c(0,1))

sig_thresh_range <- seq(from=0, to=max(abs(ctwas_gene_res$z)), length.out=length(pip_range))

for (index in 1:length(sig_thresh_range)){
  sig_thresh_plot <- sig_thresh_range[index]
  twas_genes <- ctwas_gene_res$genename[abs(ctwas_gene_res$z)>=sig_thresh_plot]
  sensitivity[index] <- sum(twas_genes %in% known_annotations)/length(known_annotations)
  specificity[index] <- sum(!(unrelated_genes %in% twas_genes))/length(unrelated_genes)
}

lines(1-specificity, sensitivity, xlim=c(0,1), ylim=c(0,1), col="red", lty=2)

Version Author Date
7e22565 wesleycrouse 2021-09-13

Sensitivity, specificity and precision for silver standard genes - bystanders only

This section first uses all silver standard genes to identify bystander genes within 1Mb. The silver standard and bystander gene lists are then subset to only genes with imputed expression in this analysis. Then, the ctwas and TWAS gene lists from this analysis are subset to only genes that are in the (subset) silver standard and bystander genes. These gene lists are then used to compute sensitivity, specificity and precision for ctwas and TWAS.

library(biomaRt)
library(GenomicRanges)
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs
The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind,
    colnames, dirname, do.call, duplicated, eval, evalq, Filter,
    Find, get, grep, grepl, intersect, is.unsorted, lapply, Map,
    mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, Position, rank, rbind, Reduce, rownames, sapply,
    setdiff, sort, table, tapply, union, unique, unsplit, which,
    which.max, which.min
Loading required package: S4Vectors

Attaching package: 'S4Vectors'
The following object is masked from 'package:base':

    expand.grid
Loading required package: IRanges
Loading required package: GenomeInfoDb
ensembl <- useEnsembl(biomart="ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
G_list <- getBM(filters= "chromosome_name", attributes= c("hgnc_symbol","chromosome_name","start_position","end_position","gene_biotype"), values=1:22, mart=ensembl)
G_list <- G_list[G_list$hgnc_symbol!="",]
G_list <- G_list[G_list$gene_biotype %in% c("protein_coding","lncRNA"),]
G_list$start <- G_list$start_position
G_list$end <- G_list$end_position
G_list_granges <- makeGRangesFromDataFrame(G_list, keep.extra.columns=T)

known_annotations_positions <- G_list[G_list$hgnc_symbol %in% known_annotations,]
half_window <- 1000000
known_annotations_positions$start <- known_annotations_positions$start_position - half_window
known_annotations_positions$end <- known_annotations_positions$end_position + half_window
known_annotations_positions$start[known_annotations_positions$start<1] <- 1
known_annotations_granges <- makeGRangesFromDataFrame(known_annotations_positions, keep.extra.columns=T)

bystanders <- findOverlaps(known_annotations_granges,G_list_granges)
bystanders <- unique(subjectHits(bystanders))
bystanders <- G_list$hgnc_symbol[bystanders]
bystanders <- bystanders[!(bystanders %in% known_annotations)]
unrelated_genes <- bystanders

#number of genes in known annotations
print(length(known_annotations))
[1] 87
#number of genes in known annotations with imputed expression
print(sum(known_annotations %in% ctwas_gene_res$genename))
[1] 39
#number of bystander genes
print(length(unrelated_genes))
[1] 1944
#number of bystander genes with imputed expression
print(sum(unrelated_genes %in% ctwas_gene_res$genename))
[1] 1086
#remove genes without imputed expression from gene lists
known_annotations <- known_annotations[known_annotations %in% ctwas_gene_res$genename]
unrelated_genes <- unrelated_genes[unrelated_genes %in% ctwas_gene_res$genename]

#assign ctwas and TWAS genes
ctwas_genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
twas_genes <- ctwas_gene_res$genename[abs(ctwas_gene_res$z)>sig_thresh]

#significance threshold for TWAS
print(sig_thresh)
[1] 4.618904
#number of ctwas genes
length(ctwas_genes)
[1] 14
#number of ctwas genes in known annotations or bystanders
sum(ctwas_genes %in% c(known_annotations, unrelated_genes))
[1] 0
#number of ctwas genes
length(twas_genes)
[1] 87
#number of TWAS genes
sum(twas_genes %in% c(known_annotations, unrelated_genes))
[1] 13
#remove genes not in known or bystander lists from results
ctwas_genes <- ctwas_genes[ctwas_genes %in% c(known_annotations, unrelated_genes)]
twas_genes <- twas_genes[twas_genes %in% c(known_annotations, unrelated_genes)]

#sensitivity / recall
sensitivity <- rep(NA,2)
names(sensitivity) <- c("ctwas", "TWAS")
sensitivity["ctwas"] <- sum(ctwas_genes %in% known_annotations)/length(known_annotations)
sensitivity["TWAS"] <- sum(twas_genes %in% known_annotations)/length(known_annotations)
sensitivity
     ctwas       TWAS 
0.00000000 0.07692308 
#specificity
specificity <- rep(NA,2)
names(specificity) <- c("ctwas", "TWAS")
specificity["ctwas"] <- sum(!(unrelated_genes %in% ctwas_genes))/length(unrelated_genes)
specificity["TWAS"] <- sum(!(unrelated_genes %in% twas_genes))/length(unrelated_genes)
specificity
    ctwas      TWAS 
1.0000000 0.9907919 
#precision / PPV
precision <- rep(NA,2)
names(precision) <- c("ctwas", "TWAS")
precision["ctwas"] <- sum(ctwas_genes %in% known_annotations)/length(ctwas_genes)
precision["TWAS"] <- sum(twas_genes %in% known_annotations)/length(twas_genes)
precision
    ctwas      TWAS 
      NaN 0.2307692 

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] GenomicRanges_1.36.0 GenomeInfoDb_1.20.0  IRanges_2.18.1      
 [4] S4Vectors_0.22.1     BiocGenerics_0.30.0  biomaRt_2.40.1      
 [7] readxl_1.3.1         WebGestaltR_0.4.4    disgenet2r_0.99.2   
[10] enrichR_3.0          cowplot_1.0.0        ggplot2_3.3.3       

loaded via a namespace (and not attached):
 [1] bitops_1.0-6           fs_1.3.1               bit64_4.0.5           
 [4] doParallel_1.0.16      progress_1.2.2         httr_1.4.1            
 [7] rprojroot_2.0.2        tools_3.6.1            doRNG_1.8.2           
[10] utf8_1.2.1             R6_2.5.0               DBI_1.1.1             
[13] colorspace_1.4-1       withr_2.4.1            tidyselect_1.1.0      
[16] prettyunits_1.0.2      bit_4.0.4              curl_3.3              
[19] compiler_3.6.1         git2r_0.26.1           Biobase_2.44.0        
[22] labeling_0.3           scales_1.1.0           readr_1.4.0           
[25] stringr_1.4.0          apcluster_1.4.8        digest_0.6.20         
[28] rmarkdown_1.13         svglite_1.2.2          XVector_0.24.0        
[31] pkgconfig_2.0.3        htmltools_0.3.6        fastmap_1.1.0         
[34] rlang_0.4.11           RSQLite_2.2.7          farver_2.1.0          
[37] generics_0.0.2         jsonlite_1.6           dplyr_1.0.7           
[40] RCurl_1.98-1.1         magrittr_2.0.1         GenomeInfoDbData_1.2.1
[43] Matrix_1.2-18          Rcpp_1.0.6             munsell_0.5.0         
[46] fansi_0.5.0            gdtools_0.1.9          lifecycle_1.0.0       
[49] stringi_1.4.3          whisker_0.3-2          yaml_2.2.0            
[52] zlibbioc_1.30.0        plyr_1.8.4             grid_3.6.1            
[55] blob_1.2.1             promises_1.0.1         crayon_1.4.1          
[58] lattice_0.20-38        hms_1.1.0              knitr_1.23            
[61] pillar_1.6.1           igraph_1.2.4.1         rjson_0.2.20          
[64] rngtools_1.5           reshape2_1.4.3         codetools_0.2-16      
[67] XML_3.98-1.20          glue_1.4.2             evaluate_0.14         
[70] data.table_1.14.0      vctrs_0.3.8            httpuv_1.5.1          
[73] foreach_1.5.1          cellranger_1.1.0       gtable_0.3.0          
[76] purrr_0.3.4            assertthat_0.2.1       cachem_1.0.5          
[79] xfun_0.8               later_0.8.0            tibble_3.1.2          
[82] iterators_1.0.13       AnnotationDbi_1.46.0   memoise_2.0.0         
[85] workflowr_1.6.2        ellipsis_0.3.2