Last updated: 2021-09-09

Checks: 6 1

Knit directory: ctwas_applied/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20210726) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 59e5f4d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Unstaged changes:
    Modified:   analysis/ukb-d-30500_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30500_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30600_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30600_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30610_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30610_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30620_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30620_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30630_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30630_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30640_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30640_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30650_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30650_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30660_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30660_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30670_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30670_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30680_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30690_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30690_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30700_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30700_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30710_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30710_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30720_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30720_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30730_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30740_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30740_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30750_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30750_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30760_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30760_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30770_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30770_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30780_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30780_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30790_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30800_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30800_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30810_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30820_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30820_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30830_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30830_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30840_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30840_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30850_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30850_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30860_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30870_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30880_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30890_irnt_Liver.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30850_irnt_Whole_Blood.Rmd) and HTML (docs/ukb-d-30850_irnt_Whole_Blood.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd cbf7408 wesleycrouse 2021-09-08 adding enrichment to reports
html cbf7408 wesleycrouse 2021-09-08 adding enrichment to reports
Rmd 4970e3e wesleycrouse 2021-09-08 updating reports
html 4970e3e wesleycrouse 2021-09-08 updating reports
Rmd dfd2b5f wesleycrouse 2021-09-07 regenerating reports
html dfd2b5f wesleycrouse 2021-09-07 regenerating reports
Rmd 61b53b3 wesleycrouse 2021-09-06 updated PVE calculation
html 61b53b3 wesleycrouse 2021-09-06 updated PVE calculation
Rmd 837dd01 wesleycrouse 2021-09-01 adding additional fixedsigma report
Rmd d0a5417 wesleycrouse 2021-08-30 adding new reports to the index
Rmd 0922de7 wesleycrouse 2021-08-18 updating all reports with locus plots
html 0922de7 wesleycrouse 2021-08-18 updating all reports with locus plots
html 1c62980 wesleycrouse 2021-08-11 Updating reports
Rmd 5981e80 wesleycrouse 2021-08-11 Adding more reports
html 5981e80 wesleycrouse 2021-08-11 Adding more reports
Rmd da9f015 wesleycrouse 2021-08-07 adding more reports
html da9f015 wesleycrouse 2021-08-07 adding more reports

Overview

These are the results of a ctwas analysis of the UK Biobank trait Testosterone (quantile) using Whole_Blood gene weights.

The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30850_irnt. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.

The weights are mashr GTEx v8 models on Whole_Blood eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)

LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])

Weight QC

TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)

qclist_all <- list()

qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))

for (i in 1:length(qc_files)){
  load(qc_files[i])
  chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
  qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}

qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"

rm(qclist, wgtlist, z_gene_chr)

#number of imputed weights
nrow(qclist_all)
[1] 11095
#number of imputed weights by chromosome
table(qclist_all$chr)

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
1129  747  624  400  479  621  560  383  404  430  682  652  192  362  331 
  16   17   18   19   20   21   22 
 551  725  159  911  313  130  310 
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.762776

Load ctwas results

#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))

#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")

#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size #check PVE calculation

#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)

#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)

ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])

#add z score to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z

#load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd")) #for new version, stored after harmonization
z_snp <- readRDS(paste0(results_dir, "/", trait_id, ".RDS")) #for old version, unharmonized

z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,] #subset snps to those included in analysis, note some are duplicated, need to match which allele was used
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)] #for duplicated snps, this arbitrarily uses the first allele
ctwas_snp_res$z_flag <- as.numeric(ctwas_snp_res$id %in% z_snp$id[duplicated(z_snp$id)]) #mark the unclear z scores, flag=1

#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)

#merge gene and snp results with added information
ctwas_gene_res$z_flag=NA
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA

ctwas_res <- rbind(ctwas_gene_res,
                   ctwas_snp_res[,colnames(ctwas_gene_res)])

#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])

report_cols_snps <- c("id", report_cols[-1])

#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)

Check convergence of parameters

library(ggplot2)
library(cowplot)

********************************************************
Note: As of version 1.0.0, cowplot does not change the
  default ggplot2 theme anymore. To recover the previous
  behavior, execute:
  theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
  
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
                 value = c(group_prior_rec[1,], group_prior_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)

df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument

p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(pi)) +
  ggtitle("Prior mean") +
  theme_cowplot()

df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
                 value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(sigma^2)) +
  ggtitle("Prior variance") +
  theme_cowplot()

plot_grid(p_pi, p_sigma2)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
        gene          snp 
0.0250742164 0.0001009486 
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
     gene       snp 
 6.868055 15.896828 
#report sample size
print(sample_size)
[1] 312102
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1]   11095 8697330
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
      gene        snp 
0.00612198 0.04471985 
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.03572312 0.41772427

Genes with highest PIPs

#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
      genename region_tag susie_pip   mu2     PVE     z
3804     OPRL1      20_38     0.989 29.28 9.3e-05 -5.63
2237      TBL2       7_47     0.946 21.58 6.5e-05 -5.43
8320    B3GNT2       2_41     0.919 20.21 6.0e-05 -4.43
8504      UTF1      10_85     0.917 22.68 6.7e-05 -4.88
209      CEP68       2_42     0.914 20.93 6.1e-05 -4.48
6064     PTPRJ      11_29     0.909 23.98 7.0e-05 -3.65
7636    ZNF219       14_1     0.894 17.87 5.1e-05  4.02
9030      TYMS       18_1     0.889 18.14 5.2e-05 -3.95
4219      ASS1       9_68     0.876 18.46 5.2e-05 -4.00
10046  CCDC157      22_10     0.851 18.47 5.0e-05  4.06
6977      CYGB      17_43     0.808 17.74 4.6e-05 -3.94
6122    CLEC1A       12_9     0.804 17.01 4.4e-05  3.70
7786  CATSPER2      15_16     0.798 40.88 1.0e-04 -7.43
7256     FABP1       2_55     0.796 18.77 4.8e-05  3.78
10765  ZDHHC18       1_18     0.787 39.28 9.9e-05 -6.36
9050    FBXO46      19_32     0.781 17.37 4.3e-05 -3.70
2409   HSD17B1      17_25     0.770 21.39 5.3e-05  4.50
4610      ACP2      11_29     0.760 19.96 4.9e-05 -2.96
6590     NTAN1      16_15     0.746 20.23 4.8e-05 -4.34
2073   SULT2A1      19_33     0.745 20.30 4.8e-05  4.45

Genes with largest effect sizes

#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
         genename region_tag susie_pip      mu2     PVE      z
1980        FCGRT      19_34         0 12904.87 0.0e+00  -4.49
5520         RCN3      19_34         0  4199.65 0.0e+00  -5.70
4733         AHI1       6_89         0  1346.81 0.0e+00   2.02
8165        CPT1C      19_34         0   927.03 0.0e+00   2.98
11526     TNFSF12       17_7         0   538.42 0.0e+00  27.26
4093       ATP1B2       17_7         0   523.68 2.4e-15 -29.79
7008      TNFSF13       17_7         0   343.24 1.2e-19   3.98
7009        SENP3       17_7         0   273.89 2.2e-13  12.96
4096        MPDU1       17_7         0   246.28 1.5e-12 -13.99
571       SLC6A16      19_34         0   233.32 0.0e+00   0.89
10492 CTC-301O7.4      19_34         0   222.86 0.0e+00   1.11
5425       WRAP53       17_7         0   197.45 5.8e-13 -15.51
9555        NAA38       17_7         0   197.02 1.6e-12  10.56
5427         SAT2       17_7         0   177.25 4.4e-19 -11.35
11220        ADM5      19_34         0   140.94 0.0e+00   0.32
846         TEAD2      19_34         0   138.68 0.0e+00   1.54
6980     ALDH16A1      19_34         0   134.13 0.0e+00  -0.65
9576       RUVBL2      19_34         0   127.57 0.0e+00  -5.67
1969         GYS1      19_34         0   127.22 0.0e+00   5.62
5430         TP53       17_7         0   104.56 5.6e-19   9.21

Genes with highest PVE

#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
          genename region_tag susie_pip   mu2     PVE     z
10416       ZNF655       7_61     0.436 82.11 1.1e-04 12.48
7786      CATSPER2      15_16     0.798 40.88 1.0e-04 -7.43
10765      ZDHHC18       1_18     0.787 39.28 9.9e-05 -6.36
3804         OPRL1      20_38     0.989 29.28 9.3e-05 -5.63
11637 GS1-259H13.2       7_61     0.281 80.19 7.2e-05 12.40
6064         PTPRJ      11_29     0.909 23.98 7.0e-05 -3.65
8504          UTF1      10_85     0.917 22.68 6.7e-05 -4.88
2237          TBL2       7_47     0.946 21.58 6.5e-05 -5.43
209          CEP68       2_42     0.914 20.93 6.1e-05 -4.48
8320        B3GNT2       2_41     0.919 20.21 6.0e-05 -4.43
1145          ACHE       7_62     0.622 29.69 5.9e-05  5.19
8787         ZBTB4       17_6     0.578 32.10 5.9e-05 -5.14
2186         PTCD1       7_61     0.231 78.56 5.8e-05 12.30
4545       IER3IP1      18_26     0.731 24.71 5.8e-05 -5.12
9453           ADO      10_42     0.500 34.22 5.5e-05 -4.79
3099       ARHGEF2       1_77     0.688 24.47 5.4e-05  5.55
2409       HSD17B1      17_25     0.770 21.39 5.3e-05  4.50
4219          ASS1       9_68     0.876 18.46 5.2e-05 -4.00
9030          TYMS       18_1     0.889 18.14 5.2e-05 -3.95
1087          GCKR       2_16     0.468 34.25 5.1e-05 -7.81

Genes with largest z scores

#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
          genename region_tag susie_pip    mu2     PVE      z
4093        ATP1B2       17_7     0.000 523.68 2.4e-15 -29.79
11526      TNFSF12       17_7     0.000 538.42 0.0e+00  27.26
5425        WRAP53       17_7     0.000 197.45 5.8e-13 -15.51
4096         MPDU1       17_7     0.000 246.28 1.5e-12 -13.99
7009         SENP3       17_7     0.000 273.89 2.2e-13  12.96
10416       ZNF655       7_61     0.436  82.11 1.1e-04  12.48
11637 GS1-259H13.2       7_61     0.281  80.19 7.2e-05  12.40
2186         PTCD1       7_61     0.231  78.56 5.8e-05  12.30
6935         CPSF4       7_61     0.063  70.64 1.4e-05 -11.76
5427          SAT2       17_7     0.000 177.25 4.4e-19 -11.35
9403        POLR2A       17_7     0.000  56.52 0.0e+00  10.71
2953         NRBP1       2_16     0.031  81.04 8.0e-06 -10.67
9555         NAA38       17_7     0.000 197.02 1.6e-12  10.56
5430          TP53       17_7     0.000 104.56 5.6e-19   9.21
2956         SNX17       2_16     0.032  73.02 7.5e-06  -9.04
7726         CYB5A      18_43     0.170  48.50 2.6e-05   8.65
4402         KDM6B       17_7     0.000  11.05 0.0e+00  -8.11
5076        ATRAID       2_16     0.055  44.54 7.9e-06   8.01
1087          GCKR       2_16     0.468  34.25 5.1e-05  -7.81
5057        IFT172       2_16     0.451  34.02 4.9e-05   7.78

Comparing z scores and PIPs

#set nominal signifiance threshold for z scores
alpha <- 0.05

#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)

#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))

plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.008021631
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
          genename region_tag susie_pip    mu2     PVE      z
4093        ATP1B2       17_7     0.000 523.68 2.4e-15 -29.79
11526      TNFSF12       17_7     0.000 538.42 0.0e+00  27.26
5425        WRAP53       17_7     0.000 197.45 5.8e-13 -15.51
4096         MPDU1       17_7     0.000 246.28 1.5e-12 -13.99
7009         SENP3       17_7     0.000 273.89 2.2e-13  12.96
10416       ZNF655       7_61     0.436  82.11 1.1e-04  12.48
11637 GS1-259H13.2       7_61     0.281  80.19 7.2e-05  12.40
2186         PTCD1       7_61     0.231  78.56 5.8e-05  12.30
6935         CPSF4       7_61     0.063  70.64 1.4e-05 -11.76
5427          SAT2       17_7     0.000 177.25 4.4e-19 -11.35
9403        POLR2A       17_7     0.000  56.52 0.0e+00  10.71
2953         NRBP1       2_16     0.031  81.04 8.0e-06 -10.67
9555         NAA38       17_7     0.000 197.02 1.6e-12  10.56
5430          TP53       17_7     0.000 104.56 5.6e-19   9.21
2956         SNX17       2_16     0.032  73.02 7.5e-06  -9.04
7726         CYB5A      18_43     0.170  48.50 2.6e-05   8.65
4402         KDM6B       17_7     0.000  11.05 0.0e+00  -8.11
5076        ATRAID       2_16     0.055  44.54 7.9e-06   8.01
1087          GCKR       2_16     0.468  34.25 5.1e-05  -7.81
5057        IFT172       2_16     0.451  34.02 4.9e-05   7.78

Locus plots for genes and SNPs

ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]

n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
  ctwas_res_region <-  ctwas_res[ctwas_res$region_tag==region_tag_plot,]
  start <- min(ctwas_res_region$pos)
  end <- max(ctwas_res_region$pos)
  
  ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
  ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
  ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
  
  #region name
  print(paste0("Region: ", region_tag_plot))
  
  #table of genes in region
  print(ctwas_res_region_gene[,report_cols])
  
  par(mfrow=c(4,1))
  
  #gene z scores
  plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
   ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
   main=paste0("Region: ", region_tag_plot))
  abline(h=sig_thresh,col="red",lty=2)
  
  #significance threshold for SNPs
  alpha_snp <- 5*10^(-8)
  sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
  
  #snp z scores
  plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
   ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
  abline(h=sig_thresh_snp,col="purple",lty=2)
  
  #gene pips
  plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
  
  #snp pips
  plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 17_7"
        genename region_tag susie_pip    mu2     PVE      z
7010       FGF11       17_7         0  12.89 0.0e+00   1.61
8293      CHRNB1       17_7         0  95.27 0.0e+00   0.30
9403      POLR2A       17_7         0  56.52 0.0e+00  10.71
11526    TNFSF12       17_7         0 538.42 0.0e+00  27.26
7008     TNFSF13       17_7         0 343.24 1.2e-19   3.98
7009       SENP3       17_7         0 273.89 2.2e-13  12.96
4096       MPDU1       17_7         0 246.28 1.5e-12 -13.99
5427        SAT2       17_7         0 177.25 4.4e-19 -11.35
4093      ATP1B2       17_7         0 523.68 2.4e-15 -29.79
5425      WRAP53       17_7         0 197.45 5.8e-13 -15.51
5430        TP53       17_7         0 104.56 5.6e-19   9.21
4402       KDM6B       17_7         0  11.05 0.0e+00  -8.11
7989      TMEM88       17_7         0  25.68 0.0e+00   3.57
9555       NAA38       17_7         0 197.02 1.6e-12  10.56
8272        CHD3       17_7         0  40.93 0.0e+00  -4.52
9286  AC025335.1       17_7         0  42.35 0.0e+00   4.38
8279      KCNAB3       17_7         0  31.34 0.0e+00  -2.31
8277      CNTROB       17_7         0  26.76 0.0e+00  -1.48
8278     TRAPPC1       17_7         0   6.66 0.0e+00   2.04
11172      VAMP2       17_7         0  65.12 4.6e-20  -3.80
9234     TMEM107       17_7         0  19.21 0.0e+00   1.68
10292     BORCS6       17_7         0  34.87 0.0e+00  -3.13
9228   LINC00324       17_7         0   6.03 0.0e+00  -0.81
9218        PFAS       17_7         0   5.65 0.0e+00  -0.19
9226        CTC1       17_7         0  85.96 1.1e-18  -4.49
3790    SLC25A35       17_7         0  17.54 0.0e+00   2.16
9716       KRBA2       17_7         0  20.46 0.0e+00  -2.50
7011       RPL26       17_7         0   5.11 0.0e+00   0.67

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 7_61"
          genename region_tag susie_pip   mu2     PVE      z
10651       SMURF1       7_61     0.125 13.63 5.5e-06   0.36
11572       ARPC1A       7_61     0.082 16.40 4.3e-06  -2.61
4189        ARPC1B       7_61     0.026  4.69 3.9e-07   1.06
2184         PDAP1       7_61     0.104 15.25 5.1e-06   2.09
2185         BUD31       7_61     0.026  6.93 5.8e-07   1.98
6935         CPSF4       7_61     0.063 70.64 1.4e-05 -11.76
2186         PTCD1       7_61     0.231 78.56 5.8e-05  12.30
11566       ATP5J2       7_61     0.047 10.35 1.6e-06  -2.00
10617       ZNF789       7_61     0.027 21.35 1.8e-06   6.18
10416       ZNF655       7_61     0.436 82.11 1.1e-04  12.48
11637 GS1-259H13.2       7_61     0.281 80.19 7.2e-05  12.40
10368      ZSCAN25       7_61     0.031  6.55 6.5e-07   1.20
2187        CYP3A5       7_61     0.081 12.56 3.2e-06  -0.61
2188       ZKSCAN1       7_61     0.027  5.83 5.0e-07   1.89
7760       ZSCAN21       7_61     0.036 10.79 1.3e-06  -2.68
7758          ZNF3       7_61     0.063 10.52 2.1e-06  -0.91
8029         COPS6       7_61     0.061 12.78 2.5e-06  -1.95
11177        AP4M1       7_61     0.026  4.59 3.8e-07  -0.70
7838         CNPY4       7_61     0.032  7.44 7.7e-07   1.29
2190          TAF6       7_61     0.032  9.40 9.5e-07  -2.60
11096       MBLAC1       7_61     0.059 12.49 2.4e-06   1.92
5920       C7orf43       7_61     0.033  7.08 7.6e-07  -0.96
10076      LAMTOR4       7_61     0.037  7.01 8.3e-07   0.53
10379      GAL3ST4       7_61     0.026  5.92 5.0e-07   1.91
672          STAG3       7_61     0.027  4.67 4.1e-07  -0.25
11022         GPC2       7_61     0.052 14.81 2.5e-06  -2.81
11021        PVRIG       7_61     0.027  5.23 4.5e-07   1.45
11095       SPDYE3       7_61     0.029  6.30 5.8e-07  -1.52
3503         PILRB       7_61     0.029  6.62 6.2e-07   1.64
959         ZCWPW1       7_61     0.029  6.77 6.3e-07  -1.76
5924         MEPCE       7_61     0.029  6.77 6.3e-07  -1.76
7824       TSC22D4       7_61     0.031  7.54 7.5e-07   1.89
7823         NYAP1       7_61     0.032  7.82 8.0e-07  -1.94
2201         AGFG2       7_61     0.030  5.96 5.7e-07   1.03
10908        SAP25       7_61     0.034  7.44 8.2e-07   1.34
2196        FBXO24       7_61     0.070 10.89 2.5e-06  -0.79

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 2_16"
      genename region_tag susie_pip   mu2     PVE      z
11045  SLC35F6       2_16     0.031  7.69 7.6e-07   2.36
3366   TMEM214       2_16     0.034  5.30 5.8e-07   0.36
5074   EMILIN1       2_16     0.040 14.67 1.9e-06  -3.40
5061       KHK       2_16     0.035  5.40 6.0e-07  -0.36
5059    CGREF1       2_16     0.031  4.61 4.6e-07  -0.25
5070      PREB       2_16     0.033  5.12 5.4e-07   1.46
5076    ATRAID       2_16     0.055 44.54 7.9e-06   8.01
1090       CAD       2_16     0.098 25.51 8.0e-06   5.00
5071    SLC5A6       2_16     0.031 10.99 1.1e-06  -2.75
7303       UCN       2_16     0.057 24.94 4.5e-06   6.53
2952    GTF3C2       2_16     0.047 23.06 3.4e-06  -6.40
2956     SNX17       2_16     0.032 73.02 7.5e-06  -9.04
7304    ZNF513       2_16     0.031 40.11 4.0e-06  -6.44
2953     NRBP1       2_16     0.031 81.04 8.0e-06 -10.67
5057    IFT172       2_16     0.451 34.02 4.9e-05   7.78
1087      GCKR       2_16     0.468 34.25 5.1e-05  -7.81
10613     GPN1       2_16     0.031 12.20 1.2e-06  -2.90
9018   CCDC121       2_16     0.073 15.43 3.6e-06   2.73
6660       BRE       2_16     0.055 24.59 4.3e-06  -6.18

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 18_43"
     genename region_tag susie_pip   mu2     PVE     z
878    TIMM21      18_43     0.027  5.19 4.5e-07 -0.04
7726    CYB5A      18_43     0.170 48.50 2.6e-05  8.65

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 17_28"
            genename region_tag susie_pip   mu2     PVE     z
84            OSBPL7      17_28     0.027 15.87 1.4e-06  2.40
5401          LRRC46      17_28     0.054 19.32 3.4e-06  2.90
6750          MRPL10      17_28     0.043 17.07 2.3e-06  2.84
5402           SCRN2      17_28     0.012  9.25 3.7e-07  1.09
7861             SP2      17_28     0.172 25.50 1.4e-05 -2.47
2370            PNPO      17_28     0.012  9.17 3.5e-07 -0.01
7862          PRR15L      17_28     0.015 10.55 5.2e-07  2.41
2373        CDK5RAP3      17_28     0.021 10.47 6.9e-07 -1.14
64             COPZ2      17_28     0.007  6.32 1.4e-07  1.41
1043          NFE2L1      17_28     0.006  4.66 9.4e-08 -0.90
12573   RP5-890E16.5      17_28     0.095 23.22 7.0e-06  3.51
2374            CBX1      17_28     0.052 18.31 3.1e-06 -3.45
21             SNX11      17_28     0.063 20.15 4.1e-06 -3.34
5400           SKAP1      17_28     0.013 10.99 4.4e-07 -1.90
3394           HOXB3      17_28     0.007  5.15 1.1e-07  0.80
9531           HOXB4      17_28     0.007  5.26 1.1e-07  0.83
8755           HOXB2      17_28     0.011  9.67 3.5e-07 -1.30
8369           HOXB9      17_28     0.027 16.65 1.4e-06 -1.71
11962          HOXB7      17_28     0.008  5.95 1.5e-07  0.13
11650      LINC02086      17_28     0.048 18.82 2.9e-06 -1.50
4852        CALCOCO2      17_28     0.006  4.75 9.6e-08 -0.42
6759          ATP5G1      17_28     0.007  6.24 1.4e-07  0.62
6761           UBE2Z      17_28     0.007  6.49 1.5e-07  0.99
6762            SNF8      17_28     0.014 10.08 4.6e-07  0.18
7846           GNGT2      17_28     0.010 29.62 9.3e-07 -7.58
2412            ABI3      17_28     0.006 11.17 2.2e-07 -3.28
8749        PHOSPHO1      17_28     0.018 14.00 8.0e-07 -1.50
11703 RP11-1079K10.3      17_28     0.007 13.31 2.8e-07 -0.01

Version Author Date
dfd2b5f wesleycrouse 2021-09-07

SNPs with highest PIPs

#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
                id region_tag susie_pip      mu2     PVE      z
29498   rs12029116       1_62     1.000   113.62 3.6e-04  10.77
31545    rs1730862       1_66     1.000    86.35 2.8e-04  -9.50
75084     rs780093       2_16     1.000   228.89 7.3e-04  16.58
76439   rs11124265       2_20     1.000    71.12 2.3e-04  11.99
76713   rs72787520       2_20     1.000    97.17 3.1e-04  13.91
76932  rs112564689       2_21     1.000    36.13 1.2e-04   5.91
166268 rs768688512       3_58     1.000   576.13 1.8e-03   3.17
199205   rs5855544      3_120     1.000    49.97 1.6e-04  -7.28
223653   rs7696472       4_48     1.000    58.93 1.9e-04   8.69
366650 rs199804242       6_89     1.000  6998.83 2.2e-02  -2.78
411658  rs45467892       7_61     1.000   172.72 5.5e-04 -16.38
516576  rs17134158       10_7     1.000    58.30 1.9e-04  -9.64
579521    rs754042      11_41     1.000    38.18 1.2e-04  -5.56
723440  rs58217463      15_46     1.000    84.92 2.7e-04   8.66
723442   rs8028588      15_46     1.000    44.06 1.4e-04   5.37
747884   rs9931108      16_46     1.000    93.27 3.0e-04   7.38
747913  rs57960111      16_46     1.000   177.24 5.7e-04   1.51
755194  rs62059238       17_6     1.000    49.05 1.6e-04   7.31
755269  rs62059839       17_7     1.000  1126.74 3.6e-03  40.23
755280  rs72829446       17_7     1.000   237.89 7.6e-04  13.01
796290 rs141207866      18_43     1.000    55.55 1.8e-04   7.59
802523   rs8111359       19_9     1.000    46.76 1.5e-04  -6.68
804841 rs141356897      19_14     1.000    56.23 1.8e-04   7.59
940380  rs61371437      19_34     1.000 15063.58 4.8e-02   4.54
940389 rs113176985      19_34     1.000 15073.92 4.8e-02   4.53
940392 rs374141296      19_34     1.000 15075.64 4.8e-02   3.70
755204  rs73233955       17_6     0.998    53.60 1.7e-04  -6.19
200992  rs36205397        4_4     0.996    57.03 1.8e-04   7.71
456879   rs4236857       8_56     0.996    30.43 9.7e-05   5.27
55473   rs17558745      1_110     0.995    30.11 9.6e-05  -5.32
733751   rs2764772      16_18     0.994    36.86 1.2e-04   6.15
747910  rs12934751      16_46     0.993    87.81 2.8e-04  -8.48
234625 rs116652741       4_68     0.989    30.50 9.7e-05  -4.88
534374  rs11510917      10_39     0.983    40.03 1.3e-04  -6.25
749292   rs2255451      16_49     0.982    32.06 1.0e-04   5.54
76510   rs10439444       2_20     0.980    37.74 1.2e-04  -6.99
812360   rs2316974      19_30     0.980    33.84 1.1e-04  -5.65
420412    rs125124       7_80     0.978    27.02 8.5e-05   4.89
547596    rs632196      10_65     0.978    27.37 8.6e-05   4.29
755242 rs142700974       17_7     0.978   155.50 4.9e-04 -20.08
458025    rs382796       8_57     0.977    29.17 9.1e-05   5.35
484738  rs12683780       9_13     0.977    27.51 8.6e-05  -4.99
43226   rs10798655       1_89     0.970    61.49 1.9e-04  -8.15
683350  rs72681869      14_20     0.970    27.22 8.5e-05   4.99
578466  rs72932198      11_38     0.968    27.33 8.5e-05  -5.27
332970 rs553169727       6_26     0.965    32.78 1.0e-04  -5.02
54019    rs3754140      1_108     0.964    27.21 8.4e-05   4.53
54062    rs1223802      1_108     0.964    31.49 9.7e-05  -5.11
75281    rs7606480       2_19     0.964    28.32 8.7e-05   5.06
305673    rs329123       5_80     0.963    28.98 8.9e-05  -5.03
411300   rs4268041       7_60     0.955    74.17 2.3e-04   8.96
287522   rs2928169       5_45     0.953    26.38 8.1e-05   4.44
926265  rs17637241      17_28     0.951    51.75 1.6e-04   7.79
35912   rs10788826       1_74     0.947    26.90 8.2e-05   4.83
578503  rs12804411      11_38     0.944    28.56 8.6e-05   5.35
16004   rs12751490       1_35     0.942    27.08 8.2e-05   4.93
337187   rs6912200       6_32     0.942    28.58 8.6e-05  -5.28
491318  rs11557154       9_27     0.932    27.32 8.2e-05   4.77
701366  rs34639393      14_56     0.930    32.35 9.6e-05  -5.50
630639 rs375115050      12_59     0.926    26.34 7.8e-05  -4.88
357506   rs4515420       6_70     0.917    45.31 1.3e-04   6.30
75886   rs76886327       2_19     0.903    26.33 7.6e-05  -4.73
507163 rs201250488       9_57     0.888    31.98 9.1e-05  -5.36
447169  rs12543287       8_37     0.883    25.68 7.3e-05   4.29
830278   rs3212201      20_28     0.868    41.44 1.2e-04   6.39
564205  rs11024433      11_13     0.866    27.14 7.5e-05  -4.89
366765  rs62431291       6_90     0.861    26.14 7.2e-05  -4.61
634014  rs75622376      12_67     0.851    25.96 7.1e-05   4.67
50946    rs4951319      1_103     0.847    40.16 1.1e-04   6.26
159420  rs73128974       3_44     0.844    28.59 7.7e-05   5.06
554311   rs2225079      10_78     0.842    26.02 7.0e-05   4.56
384845 rs117483607       7_14     0.838    26.49 7.1e-05  -4.66
485945 rs112107457       9_15     0.838    38.69 1.0e-04   5.99
461261 rs369077882       8_64     0.824    25.77 6.8e-05  -4.48
17687  rs137898851       1_39     0.820    26.51 7.0e-05   4.60
547870  rs35035059      10_65     0.817    26.51 6.9e-05  -4.00
579514    rs341048      11_41     0.816    25.71 6.7e-05  -3.79
699223  rs35604463      14_52     0.808    26.82 6.9e-05   4.65

SNPs with largest effect sizes

#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
                id region_tag susie_pip      mu2     PVE    z
940392 rs374141296      19_34         1 15075.64 4.8e-02 3.70
940389 rs113176985      19_34         1 15073.92 4.8e-02 4.53
940380  rs61371437      19_34         1 15063.58 4.8e-02 4.54
940382  rs35295508      19_34         0 15039.96 7.2e-09 4.62
940370    rs739349      19_34         0 15002.21 1.2e-07 4.50
940371    rs756628      19_34         0 15002.09 9.6e-08 4.49
940396   rs2946865      19_34         0 14990.66 2.7e-13 4.49
940387  rs73056069      19_34         0 14982.66 7.5e-13 4.53
940367    rs739347      19_34         0 14977.09 1.1e-08 4.56
940368   rs2073614      19_34         0 14961.72 3.3e-11 4.55
940384   rs2878354      19_34         0 14952.50 2.4e-15 4.59
940373   rs2077300      19_34         0 14915.60 5.0e-16 4.45
940363   rs4802613      19_34         0 14891.61 0.0e+00 4.43
940377  rs73056059      19_34         0 14889.32 1.5e-16 4.50
940397  rs60815603      19_34         0 14779.58 0.0e+00 4.58
940400   rs1316885      19_34         0 14704.19 0.0e+00 4.45
940361  rs10403394      19_34         0 14694.52 0.0e+00 4.69
940362  rs17555056      19_34         0 14688.74 0.0e+00 4.67
940402  rs60746284      19_34         0 14683.22 0.0e+00 4.52
940405   rs2946863      19_34         0 14678.04 0.0e+00 4.49
940398  rs35443645      19_34         0 14666.72 0.0e+00 4.51
940378  rs73056062      19_34         0 14504.73 0.0e+00 4.40
940408 rs553431297      19_34         0 14292.56 0.0e+00 4.65
940391 rs112283514      19_34         0 14235.94 0.0e+00 4.24
940393  rs11270139      19_34         0 14162.14 0.0e+00 4.80
940358  rs10421294      19_34         0 13259.32 0.0e+00 4.01
940357   rs8108175      19_34         0 13257.41 0.0e+00 4.01
940356   rs1858742      19_34         0 13232.23 0.0e+00 3.98
940350  rs59192944      19_34         0 13232.08 0.0e+00 4.00
940347  rs55991145      19_34         0 13222.68 0.0e+00 3.99
940342   rs3786567      19_34         0 13217.41 0.0e+00 3.98
940338   rs2271952      19_34         0 13212.34 0.0e+00 3.98
940341   rs4801801      19_34         0 13212.17 0.0e+00 3.97
940337   rs2271953      19_34         0 13197.98 0.0e+00 3.98
940339   rs2271951      19_34         0 13197.84 0.0e+00 3.99
940328  rs60365978      19_34         0 13183.92 0.0e+00 3.94
940334   rs4802612      19_34         0 13130.13 0.0e+00 3.94
940344   rs2517977      19_34         0 13110.26 0.0e+00 3.95
940331  rs55893003      19_34         0 13091.84 0.0e+00 3.97
940323  rs55992104      19_34         0 12786.69 0.0e+00 3.84
940317  rs60403475      19_34         0 12783.14 0.0e+00 3.84
940320   rs4352151      19_34         0 12782.45 0.0e+00 3.82
940314  rs11878448      19_34         0 12773.36 0.0e+00 3.81
940308   rs9653100      19_34         0 12769.47 0.0e+00 3.84
940304   rs4802611      19_34         0 12760.02 0.0e+00 3.80
940296   rs7251338      19_34         0 12740.38 0.0e+00 3.79
940295  rs59269605      19_34         0 12739.41 0.0e+00 3.80
940316   rs1042120      19_34         0 12706.13 0.0e+00 3.80
940312 rs113220577      19_34         0 12695.12 0.0e+00 3.80
940306   rs9653118      19_34         0 12676.34 0.0e+00 3.83

SNPs with highest PVE

#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
                id region_tag susie_pip      mu2     PVE      z
940380  rs61371437      19_34     1.000 15063.58 0.04800   4.54
940389 rs113176985      19_34     1.000 15073.92 0.04800   4.53
940392 rs374141296      19_34     1.000 15075.64 0.04800   3.70
366650 rs199804242       6_89     1.000  6998.83 0.02200  -2.78
366649   rs2327654       6_89     0.715  7015.15 0.01600  -2.39
366653 rs113527452       6_89     0.597  6992.31 0.01300  -2.51
366666   rs6923513       6_89     0.475  7014.71 0.01100  -2.38
755269  rs62059839       17_7     1.000  1126.74 0.00360  40.23
166268 rs768688512       3_58     1.000   576.13 0.00180   3.17
535562      rs1935      10_42     0.620   580.87 0.00120  27.16
755249   rs8073177       17_7     0.747   432.97 0.00100  31.09
166265  rs12493756       3_58     0.435   563.77 0.00079   3.02
755280  rs72829446       17_7     1.000   237.89 0.00076  13.01
75084     rs780093       2_16     1.000   228.89 0.00073  16.58
166266  rs12493835       3_58     0.386   564.01 0.00070   3.00
166263 rs138503435       3_58     0.341   563.68 0.00062   3.00
166267   rs6765538       3_58     0.337   563.80 0.00061   2.96
747913  rs57960111      16_46     1.000   177.24 0.00057   1.51
755270      rs6257       17_7     0.643   275.72 0.00057 -20.73
411658  rs45467892       7_61     1.000   172.72 0.00055 -16.38
166264  rs73141241       3_58     0.297   561.02 0.00053   3.01
755242 rs142700974       17_7     0.978   155.50 0.00049 -20.08
29498   rs12029116       1_62     1.000   113.62 0.00036  10.77
76713   rs72787520       2_20     1.000    97.17 0.00031  13.91
755268 rs112885647       17_7     0.357   274.08 0.00031 -20.68
747884   rs9931108      16_46     1.000    93.27 0.00030   7.38
31545    rs1730862       1_66     1.000    86.35 0.00028  -9.50
747910  rs12934751      16_46     0.993    87.81 0.00028  -8.48
723440  rs58217463      15_46     1.000    84.92 0.00027   8.66
755247   rs9892862       17_7     0.188   429.17 0.00026  31.04
535573  rs10761729      10_42     0.136   577.23 0.00025  27.10
76439   rs11124265       2_20     1.000    71.12 0.00023  11.99
411300   rs4268041       7_60     0.955    74.17 0.00023   8.96
43226   rs10798655       1_89     0.970    61.49 0.00019  -8.15
223653   rs7696472       4_48     1.000    58.93 0.00019   8.69
516576  rs17134158       10_7     1.000    58.30 0.00019  -9.64
200992  rs36205397        4_4     0.996    57.03 0.00018   7.71
642796    rs606950       13_3     0.595    92.75 0.00018   9.88
796290 rs141207866      18_43     1.000    55.55 0.00018   7.59
804841 rs141356897      19_14     1.000    56.23 0.00018   7.59
755204  rs73233955       17_6     0.998    53.60 0.00017  -6.19
166289   rs9310061       3_58     0.287   169.36 0.00016  -5.57
199205   rs5855544      3_120     1.000    49.97 0.00016  -7.28
535593   rs5785566      10_42     0.084   576.33 0.00016  27.08
747921   rs6564897      16_46     0.370   137.66 0.00016   4.61
755194  rs62059238       17_6     1.000    49.05 0.00016   7.31
926265  rs17637241      17_28     0.951    51.75 0.00016   7.79
707145  rs28534747      15_13     0.649    69.91 0.00015   8.41
802523   rs8111359       19_9     1.000    46.76 0.00015  -6.68
166279  rs56323967       3_58     0.306   144.40 0.00014   5.18

SNPs with largest z scores

#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
                id region_tag susie_pip     mu2     PVE      z
755269  rs62059839       17_7     1.000 1126.74 3.6e-03  40.23
755263 rs149932962       17_7     0.000  927.71 0.0e+00  37.98
755250  rs62059797       17_7     0.000  679.52 0.0e+00  33.12
755248  rs35049113       17_7     0.000  675.96 0.0e+00  33.07
755254  rs12941509       17_7     0.000  683.49 0.0e+00  33.07
755259   rs4968212       17_7     0.000  663.25 0.0e+00 -32.21
755257  rs62059801       17_7     0.000  633.98 0.0e+00  31.72
755249   rs8073177       17_7     0.747  432.97 1.0e-03  31.09
755247   rs9892862       17_7     0.188  429.17 2.6e-04  31.04
755253  rs11078694       17_7     0.065  429.86 8.9e-05 -30.85
535562      rs1935      10_42     0.620  580.87 1.2e-03  27.16
535573  rs10761729      10_42     0.136  577.23 2.5e-04  27.10
535593   rs5785566      10_42     0.084  576.33 1.6e-04  27.08
535609   rs6479896      10_42     0.050  575.30 9.3e-05  27.06
535605  rs10822160      10_42     0.044  574.55 8.2e-05  27.05
535611   rs7910927      10_42     0.034  574.44 6.3e-05  27.05
535626   rs3956912      10_42     0.016  572.47 3.0e-05  27.02
535617  rs10822168      10_42     0.011  571.63 2.1e-05  27.00
535557  rs35751397      10_42     0.004  569.14 7.2e-06  26.94
535619  rs10640079      10_42     0.001  567.00 1.4e-06  26.85
755218  rs34474914       17_7     0.000  432.92 0.0e+00  26.65
755289   rs1641549       17_7     0.000  268.09 1.9e-19 -26.37
755279 rs745412832       17_7     0.000  314.45 6.7e-19  25.69
535654   rs2163188      10_42     0.000  507.20 4.3e-07 -25.42
535595  rs10761739      10_42     0.002  531.32 2.6e-06  25.34
755258  rs12601581       17_7     0.000  457.60 0.0e+00 -25.31
535662  rs10822186      10_42     0.000  494.23 1.5e-07  25.11
535663   rs7895549      10_42     0.000  493.80 1.5e-07  25.09
535661 rs570234162      10_42     0.000  489.39 1.5e-07 -24.96
535658   rs4746205      10_42     0.000  492.37 4.2e-07 -24.92
535659   rs7090758      10_42     0.000  468.75 2.6e-07 -24.39
755271   rs1642797       17_7     0.000  448.97 9.6e-19  23.61
755272   rs1642808       17_7     0.000  447.84 9.6e-19  23.57
755273   rs1641538       17_7     0.000  447.87 9.6e-19  23.57
755274   rs1641531       17_7     0.000  447.53 9.6e-19  23.57
755275   rs1641528       17_7     0.000  448.13 1.1e-18  23.57
755276   rs1641522       17_7     0.000  447.95 1.1e-18  23.55
535547  rs10995445      10_42     0.000  423.25 1.7e-07  23.42
755211     rs13290       17_7     0.000  132.59 0.0e+00  23.39
755208  rs11652328       17_7     0.000  313.88 0.0e+00  22.13
755212  rs34706172       17_7     0.000  131.70 0.0e+00  21.20
755270      rs6257       17_7     0.643  275.72 5.7e-04 -20.73
755268 rs112885647       17_7     0.357  274.08 3.1e-04 -20.68
755286   rs4968186       17_7     0.000  188.88 0.0e+00  20.23
755215  rs12600863       17_7     0.000  153.57 0.0e+00  20.11
755214   rs3829603       17_7     0.000  155.86 0.0e+00  20.08
755242 rs142700974       17_7     0.978  155.50 4.9e-04 -20.08
755227 rs116560331       17_7     0.022  146.20 1.0e-05 -19.55
755260  rs12939472       17_7     0.000  244.22 0.0e+00 -19.28
535613 rs149727843      10_42     0.000  343.36 1.3e-07  18.86

Gene set enrichment for genes with PIP>0.8

#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ... 
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]

#number of genes for gene set enrichment
length(genes)
[1] 12
if (length(genes)>0){
  GO_enrichment <- enrichr(genes, dbs)

  for (db in dbs){
    print(db)
    df <- GO_enrichment[[db]]
    df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
    print(df)
  }
  
  #DisGeNET enrichment
  
  # devtools::install_bitbucket("ibi_group/disgenet2r")
  library(disgenet2r)
  
  disgenet_api_key <- get_disgenet_api_key(
                    email = "wesleycrouse@gmail.com", 
                    password = "uchicago1" )
  
  Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
  
  res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
                               database = "CURATED" )
  
  df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio",  "BgRatio")]
  print(df)
  
  #WebGestalt enrichment
  library(WebGestaltR)
  
  background <- ctwas_gene_res$genename
  
  #listGeneSet()
  databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
  
  enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
                              interestGene=genes, referenceGene=background,
                              enrichDatabase=databases, interestGeneType="genesymbol",
                              referenceGeneType="genesymbol", isOutput=F)
  print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
  Querying GO_Biological_Process_2021... Done.
  Querying GO_Cellular_Component_2021... Done.
  Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
                                                                                   Term
1                                                           kinase binding (GO:0019900)
2                                                   protein kinase binding (GO:0019901)
3                               G protein-coupled opioid receptor activity (GO:0004985)
4                                           sequence-specific mRNA binding (GO:1990825)
5  N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase activity (GO:0008532)
6                          platelet-derived growth factor receptor binding (GO:0005161)
7                                                     neuropeptide binding (GO:0042923)
8                                 acetylgalactosaminyltransferase activity (GO:0008376)
9                                 mitogen-activated protein kinase binding (GO:0051019)
10                                                      amino acid binding (GO:0016597)
   Overlap Adjusted.P.value            Genes
1    3/461       0.03115665 CEP68;TBL2;PTPRJ
2    3/506       0.03115665 CEP68;TBL2;PTPRJ
3      1/6       0.03115665            OPRL1
4     1/10       0.03368827             TYMS
5     1/12       0.03368827           B3GNT2
6     1/13       0.03368827            PTPRJ
7     1/21       0.04654297            OPRL1
8     1/27       0.04949615           B3GNT2
9     1/29       0.04949615            PTPRJ
10    1/32       0.04949615             ASS1
UTF1 gene(s) from the input list not found in DisGeNET CURATEDTBL2 gene(s) from the input list not found in DisGeNET CURATEDZNF219 gene(s) from the input list not found in DisGeNET CURATEDCCDC157 gene(s) from the input list not found in DisGeNET CURATED
                                                    Description        FDR
5                                                 Aspergillosis 0.01942232
21                                   Digestive System Neoplasms 0.01942232
60                                                  Hypokinesia 0.01942232
64                                                Citrullinemia 0.01942232
67                                                 Bradykinesia 0.01942232
71                                                  Hypodynamia 0.01942232
72                                    Poisoning by fluorouracil 0.01942232
80                                   Cancer of Digestive System 0.01942232
82                                 Hypokinesia, Antiorthostatic 0.01942232
83 Argininosuccinic Acid Synthetase Deficiency Disease, Partial 0.01942232
   Ratio BgRatio
5    1/8  2/9703
21   1/8  2/9703
60   1/8  3/9703
64   1/8  1/9703
67   1/8  3/9703
71   1/8  3/9703
72   1/8  2/9703
80   1/8  2/9703
82   1/8  3/9703
83   1/8  1/9703
******************************************

*                                        *

*          Welcome to WebGestaltR !      *

*                                        *

******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet,
minNum = minNum, : No significant gene set is identified based on FDR 0.05!
NULL

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0       cowplot_1.0.0    
[5] ggplot2_3.3.3    

loaded via a namespace (and not attached):
  [1] bitops_1.0-6                matrixStats_0.57.0         
  [3] fs_1.3.1                    bit64_4.0.5                
  [5] doParallel_1.0.16           progress_1.2.2             
  [7] httr_1.4.1                  rprojroot_2.0.2            
  [9] GenomeInfoDb_1.20.0         doRNG_1.8.2                
 [11] tools_3.6.1                 utf8_1.2.1                 
 [13] R6_2.5.0                    DBI_1.1.1                  
 [15] BiocGenerics_0.30.0         colorspace_1.4-1           
 [17] withr_2.4.1                 tidyselect_1.1.0           
 [19] prettyunits_1.0.2           bit_4.0.4                  
 [21] curl_3.3                    compiler_3.6.1             
 [23] git2r_0.26.1                Biobase_2.44.0             
 [25] DelayedArray_0.10.0         rtracklayer_1.44.0         
 [27] labeling_0.3                scales_1.1.0               
 [29] readr_1.4.0                 apcluster_1.4.8            
 [31] stringr_1.4.0               digest_0.6.20              
 [33] Rsamtools_2.0.0             svglite_1.2.2              
 [35] rmarkdown_1.13              XVector_0.24.0             
 [37] pkgconfig_2.0.3             htmltools_0.3.6            
 [39] fastmap_1.1.0               BSgenome_1.52.0            
 [41] rlang_0.4.11                RSQLite_2.2.7              
 [43] generics_0.0.2              farver_2.1.0               
 [45] jsonlite_1.6                BiocParallel_1.18.0        
 [47] dplyr_1.0.7                 VariantAnnotation_1.30.1   
 [49] RCurl_1.98-1.1              magrittr_2.0.1             
 [51] GenomeInfoDbData_1.2.1      Matrix_1.2-18              
 [53] Rcpp_1.0.6                  munsell_0.5.0              
 [55] S4Vectors_0.22.1            fansi_0.5.0                
 [57] gdtools_0.1.9               lifecycle_1.0.0            
 [59] stringi_1.4.3               whisker_0.3-2              
 [61] yaml_2.2.0                  SummarizedExperiment_1.14.1
 [63] zlibbioc_1.30.0             plyr_1.8.4                 
 [65] grid_3.6.1                  blob_1.2.1                 
 [67] parallel_3.6.1              promises_1.0.1             
 [69] crayon_1.4.1                lattice_0.20-38            
 [71] Biostrings_2.52.0           GenomicFeatures_1.36.3     
 [73] hms_1.1.0                   knitr_1.23                 
 [75] pillar_1.6.1                igraph_1.2.4.1             
 [77] GenomicRanges_1.36.0        rjson_0.2.20               
 [79] rngtools_1.5                codetools_0.2-16           
 [81] reshape2_1.4.3              biomaRt_2.40.1             
 [83] stats4_3.6.1                XML_3.98-1.20              
 [85] glue_1.4.2                  evaluate_0.14              
 [87] data.table_1.14.0           foreach_1.5.1              
 [89] vctrs_0.3.8                 httpuv_1.5.1               
 [91] gtable_0.3.0                purrr_0.3.4                
 [93] assertthat_0.2.1            cachem_1.0.5               
 [95] xfun_0.8                    later_0.8.0                
 [97] tibble_3.1.2                iterators_1.0.13           
 [99] GenomicAlignments_1.20.1    AnnotationDbi_1.46.0       
[101] memoise_2.0.0               IRanges_2.18.1             
[103] workflowr_1.6.2             ellipsis_0.3.2