Last updated: 2021-09-09

Checks: 6 1

Knit directory: ctwas_applied/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20210726) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 59e5f4d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Unstaged changes:
    Modified:   analysis/ukb-d-30500_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30500_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30600_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30600_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30610_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30610_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30620_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30620_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30630_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30630_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30640_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30640_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30650_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30650_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30660_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30660_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30670_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30670_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30680_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30690_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30690_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30700_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30700_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30710_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30710_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30720_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30720_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30730_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30740_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30740_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30750_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30750_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30760_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30760_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30770_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30770_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30780_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30780_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30790_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30800_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30800_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30810_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30820_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30820_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30830_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30830_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30840_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30840_irnt_Whole_Blood.Rmd
    Modified:   analysis/ukb-d-30850_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30860_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30870_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30880_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30890_irnt_Liver.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30840_irnt_Whole_Blood.Rmd) and HTML (docs/ukb-d-30840_irnt_Whole_Blood.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd cbf7408 wesleycrouse 2021-09-08 adding enrichment to reports
html cbf7408 wesleycrouse 2021-09-08 adding enrichment to reports
Rmd 4970e3e wesleycrouse 2021-09-08 updating reports
html 4970e3e wesleycrouse 2021-09-08 updating reports
Rmd dfd2b5f wesleycrouse 2021-09-07 regenerating reports
html dfd2b5f wesleycrouse 2021-09-07 regenerating reports
Rmd 61b53b3 wesleycrouse 2021-09-06 updated PVE calculation
html 61b53b3 wesleycrouse 2021-09-06 updated PVE calculation
Rmd 837dd01 wesleycrouse 2021-09-01 adding additional fixedsigma report
Rmd d0a5417 wesleycrouse 2021-08-30 adding new reports to the index
Rmd 0922de7 wesleycrouse 2021-08-18 updating all reports with locus plots
html 0922de7 wesleycrouse 2021-08-18 updating all reports with locus plots
html 1c62980 wesleycrouse 2021-08-11 Updating reports
Rmd 5981e80 wesleycrouse 2021-08-11 Adding more reports
html 5981e80 wesleycrouse 2021-08-11 Adding more reports
Rmd da9f015 wesleycrouse 2021-08-07 adding more reports
html da9f015 wesleycrouse 2021-08-07 adding more reports

Overview

These are the results of a ctwas analysis of the UK Biobank trait Total bilirubin (quantile) using Whole_Blood gene weights.

The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30840_irnt. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.

The weights are mashr GTEx v8 models on Whole_Blood eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)

LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])

Weight QC

TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)

qclist_all <- list()

qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))

for (i in 1:length(qc_files)){
  load(qc_files[i])
  chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
  qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}

qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"

rm(qclist, wgtlist, z_gene_chr)

#number of imputed weights
nrow(qclist_all)
[1] 11095
#number of imputed weights by chromosome
table(qclist_all$chr)

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
1129  747  624  400  479  621  560  383  404  430  682  652  192  362  331 
  16   17   18   19   20   21   22 
 551  725  159  911  313  130  310 
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.762776

Load ctwas results

#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))

#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")

#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size #check PVE calculation

#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)

#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)

ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])

#add z score to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z

#load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd")) #for new version, stored after harmonization
z_snp <- readRDS(paste0(results_dir, "/", trait_id, ".RDS")) #for old version, unharmonized

z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,] #subset snps to those included in analysis, note some are duplicated, need to match which allele was used
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)] #for duplicated snps, this arbitrarily uses the first allele
ctwas_snp_res$z_flag <- as.numeric(ctwas_snp_res$id %in% z_snp$id[duplicated(z_snp$id)]) #mark the unclear z scores, flag=1

#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)

#merge gene and snp results with added information
ctwas_gene_res$z_flag=NA
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA

ctwas_res <- rbind(ctwas_gene_res,
                   ctwas_snp_res[,colnames(ctwas_gene_res)])

#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])

report_cols_snps <- c("id", report_cols[-1])

#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)

Check convergence of parameters

library(ggplot2)
library(cowplot)

********************************************************
Note: As of version 1.0.0, cowplot does not change the
  default ggplot2 theme anymore. To recover the previous
  behavior, execute:
  theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
  
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
                 value = c(group_prior_rec[1,], group_prior_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)

df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument

p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(pi)) +
  ggtitle("Prior mean") +
  theme_cowplot()

df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
                 value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(sigma^2)) +
  ggtitle("Prior variance") +
  theme_cowplot()

plot_grid(p_pi, p_sigma2)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
        gene          snp 
4.720945e-02 1.195297e-05 
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
       gene         snp 
   8.479027 1498.981237 
#report sample size
print(sample_size)
[1] 342829
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1]   11095 8697330
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
      gene        snp 
0.01295462 0.45454857 
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1]  0.05599477 38.53797345

Genes with highest PIPs

#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
           genename region_tag susie_pip   mu2     PVE     z
5830         GTF2H2       5_42     0.999 24.27 7.1e-05  4.63
7782          CASC4      15_17     0.998 26.11 7.6e-05  5.13
9073           HIC1       17_3     0.998 22.67 6.6e-05 -4.67
10046       CCDC157      22_10     0.998 26.73 7.8e-05 -5.34
5823         IQGAP2       5_44     0.997 22.95 6.7e-05  4.77
4401          EIF5A       17_6     0.997 23.89 6.9e-05 -4.83
11297     LINC01624      6_112     0.996 22.24 6.5e-05 -4.62
3102          DOCK7       1_39     0.993 60.86 1.8e-04  8.35
10508          ADH5       4_66     0.993 36.55 1.1e-04  6.07
6181        CSNK1G3       5_75     0.993 27.34 7.9e-05 -5.87
6327        TXNDC11      16_12     0.993 20.44 5.9e-05  4.23
5941       SLC25A37       8_24     0.992 24.90 7.2e-05 -1.53
6888         SHKBP1      19_30     0.992 24.22 7.0e-05  5.10
368          FBXO42       1_11     0.991 21.15 6.1e-05  3.91
4564          PSRC1       1_67     0.991 19.16 5.5e-05 -3.49
4296         PDLIM4       5_79     0.991 31.38 9.1e-05 -6.38
3748           GNMT       6_33     0.990 49.45 1.4e-04  7.50
5692          ASXL2       2_15     0.988 29.72 8.6e-05  5.42
9028        FAM91A1       8_80     0.987 19.47 5.6e-05  4.24
4660         UBQLN1       9_41     0.987 20.37 5.9e-05  4.34
9924           GLDN      15_20     0.985 20.25 5.8e-05  4.40
8331        ADORA2B      17_14     0.985 34.83 1.0e-04  6.09
7393          CEP44      4_113     0.984 21.23 6.1e-05  4.71
1267         PABPC4       1_24     0.981 39.18 1.1e-04  6.42
6076        ZC3H12C      11_65     0.980 19.74 5.6e-05  4.30
6631        C9orf43       9_58     0.979 19.17 5.5e-05  4.24
7786       CATSPER2      15_16     0.979 23.30 6.7e-05 -4.88
3762          RAB17      2_140     0.978 19.15 5.5e-05  4.17
12450  RP11-88E10.5      13_61     0.978 19.59 5.6e-05  4.26
8178        SLC50A1       1_77     0.977 19.79 5.6e-05  3.77
9687          MROH7       1_34     0.976 17.48 5.0e-05 -3.90
6089          FADS1      11_34     0.976 65.66 1.9e-04 -8.78
7291        PCOLCE2       3_87     0.975 18.73 5.3e-05 -4.11
8486        PLEKHG5        1_5     0.974 19.69 5.6e-05  4.34
2977        CCDC88A       2_36     0.970 17.84 5.0e-05 -3.91
5188           FGD4      12_22     0.969 18.80 5.3e-05 -4.13
6395        UBASH3B      11_74     0.968 18.88 5.3e-05  4.13
8978         SMIM19       8_37     0.967 50.86 1.4e-04 -7.52
9863          LAMP1      13_62     0.966 18.56 5.2e-05  4.66
7950   CTB-50L17.10       19_5     0.963 18.42 5.2e-05 -4.26
3065           GPX7       1_33     0.961 16.72 4.7e-05  3.71
11023         SIPA1      11_36     0.961 21.63 6.1e-05 -4.57
12471 RP11-334C17.6      17_45     0.958 17.89 5.0e-05  3.99
8294           GYPA       4_94     0.957 22.10 6.2e-05  5.35
6290        ZFP36L2       2_27     0.956 18.91 5.3e-05 -4.25
4845           TTC5       14_1     0.955 17.09 4.8e-05 -3.86
9223         ZBTB7A       19_4     0.954 21.89 6.1e-05  4.61
11381     BACH1-AS1      21_11     0.953 18.35 5.1e-05  4.01
10164         RNFT1      17_35     0.950 57.04 1.6e-04  8.43
3023          BIRC6       2_20     0.942 25.85 7.1e-05  5.10
4398            UNK      17_42     0.942 24.50 6.7e-05 -4.77
11894        INAFM1      19_33     0.941 18.84 5.2e-05  4.03
7462          DAGLB        7_9     0.940 17.58 4.8e-05  3.95
562           DGAT2      11_42     0.938 63.18 1.7e-04 -8.58
12003     LINC00565      13_62     0.938 16.76 4.6e-05  3.74
10438          GYPE       4_94     0.933 22.30 6.1e-05 -5.61
5783        TM4SF19      3_121     0.932 22.18 6.0e-05 -4.51
746            SMG6       17_3     0.932 21.48 5.8e-05 -4.92
5171            EGF       4_71     0.928 25.97 7.0e-05  5.38
10781       HLA-DMA       6_27     0.919 28.80 7.7e-05 -5.99
5074        EMILIN1       2_16     0.916 20.13 5.4e-05 -4.57
3848          RRBP1      20_13     0.915 17.56 4.7e-05 -4.08
4656           NREP       5_66     0.912 19.76 5.3e-05 -4.28
5486          SIRT3       11_1     0.912 18.37 4.9e-05 -3.86
5318           USP3      15_29     0.908 88.19 2.3e-04 10.40
8641          OXSR1       3_27     0.905 16.38 4.3e-05  3.73
5540          SYTL1       1_19     0.897 16.79 4.4e-05 -3.71
4698          SNX14       6_58     0.895 17.57 4.6e-05  3.82
1958           NEFM       8_25     0.894 17.18 4.5e-05  3.72
11815       MPV17L2      19_14     0.894 18.04 4.7e-05  3.82
7283          ABHD6       3_40     0.893 17.11 4.5e-05  3.80
7627          STOX1      10_46     0.890 15.79 4.1e-05  2.63
10324         DAPK1       9_45     0.884 16.11 4.2e-05 -3.53
5268          CUL4A      13_62     0.882 17.66 4.5e-05 -4.31
8291           SIK2      11_66     0.876 20.58 5.3e-05  4.86
10319          AMZ2      17_39     0.875 15.91 4.1e-05  3.44
1366        CWF19L1      10_64     0.872 26.84 6.8e-05 -5.45
6892           PKN3       9_66     0.869 18.44 4.7e-05 -3.91
4413          ERAL1      17_18     0.867 28.32 7.2e-05 -5.20
9345         RNF182       6_12     0.866 17.00 4.3e-05 -3.65
6761          UBE2Z      17_28     0.865 19.29 4.9e-05 -3.93
5916          CCT6A       7_40     0.860 17.44 4.4e-05 -3.77
4254          HABP4       9_49     0.859 15.97 4.0e-05 -3.45
2423         LUC7L3      17_29     0.856 16.87 4.2e-05 -3.78
6601         NECAP2       1_11     0.851 15.47 3.8e-05 -2.30
9597         NPIPA1      16_15     0.851 17.26 4.3e-05 -3.64
12181         EGLN2      19_30     0.830 19.84 4.8e-05 -4.23
3906           SBDS       7_44     0.825 17.04 4.1e-05  3.61
6672         LRRC43      12_75     0.825 19.84 4.8e-05 -4.37
5881         MMS22L       6_66     0.818 17.66 4.2e-05 -3.67
11000        KLHL23      2_103     0.814 17.07 4.1e-05  3.75
10174         NKAPL       6_22     0.814 15.51 3.7e-05  1.73
2237           TBL2       7_47     0.804 16.82 3.9e-05 -3.38
10895       TMEM240        1_1     0.803 18.96 4.4e-05  4.10
6911           ADAR       1_75     0.802 18.57 4.3e-05 -4.28

Genes with largest effect sizes

#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
       genename region_tag susie_pip       mu2     PVE      z
12599     HCP5B       6_26         0 622305.18 0.0e+00  -7.45
10848    TRIM10       6_26         0 410616.30 0.0e+00   5.76
10855     HLA-G       6_26         0 371837.82 0.0e+00   4.62
10853      HCG9       6_26         0 257715.88 0.0e+00  -0.55
10968     HLA-A       6_26         0 219855.64 0.0e+00   0.24
10844     HLA-E       6_26         0 179116.29 0.0e+00  -4.02
11418    TRIM26       6_26         0 177294.54 0.0e+00  -0.70
11120 LINC00243       6_26         0 154816.67 0.0e+00  -3.39
922        DGKD      2_137         0 144372.20 0.0e+00 -58.58
1119      USP40      2_137         0 131909.17 0.0e+00  59.39
5868    PPP1R18       6_26         0 122993.00 0.0e+00  -4.68
10841   MRPS18B       6_26         0  94840.51 0.0e+00   2.11
8464      WIPF2      17_23         0  44658.98 1.3e-09   2.94
10847    TRIM15       6_26         0  40767.56 0.0e+00   1.92
1346       CDC6      17_23         0  40224.64 0.0e+00   3.69
4971       IER3       6_26         0  37277.52 0.0e+00   0.17
1118    ATG16L1      2_137         0  23658.50 0.0e+00 -35.02
10840  C6orf136       6_26         0  22947.29 0.0e+00   1.62
4970      FLOT1       6_26         0  19010.36 0.0e+00  -0.11
10381     ZGPAT      20_38         0  17932.50 1.8e-08  -4.58

Genes with highest PVE

#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
       genename region_tag susie_pip    mu2     PVE     z
7536      NFIL3       9_46     0.775 102.06 2.3e-04 11.30
5318       USP3      15_29     0.908  88.19 2.3e-04 10.40
6089      FADS1      11_34     0.976  65.66 1.9e-04 -8.78
3102      DOCK7       1_39     0.993  60.86 1.8e-04  8.35
562       DGAT2      11_42     0.938  63.18 1.7e-04 -8.58
10164     RNFT1      17_35     0.950  57.04 1.6e-04  8.43
3748       GNMT       6_33     0.990  49.45 1.4e-04  7.50
8978     SMIM19       8_37     0.967  50.86 1.4e-04 -7.52
1267     PABPC4       1_24     0.981  39.18 1.1e-04  6.42
10508      ADH5       4_66     0.993  36.55 1.1e-04  6.07
8331    ADORA2B      17_14     0.985  34.83 1.0e-04  6.09
4296     PDLIM4       5_79     0.991  31.38 9.1e-05 -6.38
5692      ASXL2       2_15     0.988  29.72 8.6e-05  5.42
6181    CSNK1G3       5_75     0.993  27.34 7.9e-05 -5.87
10046   CCDC157      22_10     0.998  26.73 7.8e-05 -5.34
8277     CNTROB       17_7     0.779  33.96 7.7e-05 -6.57
10781   HLA-DMA       6_27     0.919  28.80 7.7e-05 -5.99
7782      CASC4      15_17     0.998  26.11 7.6e-05  5.13
12299 U91328.19       6_20     0.793  31.71 7.3e-05 -8.24
5941   SLC25A37       8_24     0.992  24.90 7.2e-05 -1.53

Genes with largest z scores

#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
          genename region_tag susie_pip       mu2     PVE      z
1119         USP40      2_137     0.000 131909.17 0.0e+00  59.39
922           DGKD      2_137     0.000 144372.20 0.0e+00 -58.58
1118       ATG16L1      2_137     0.000  23658.50 0.0e+00 -35.02
3636         HJURP      2_137     0.000  16455.03 0.0e+00  13.62
7536         NFIL3       9_46     0.775    102.06 2.3e-04  11.30
5318          USP3      15_29     0.908     88.19 2.3e-04  10.40
10374      ZSCAN26       6_22     0.269     69.86 5.5e-05   9.35
6089         FADS1      11_34     0.976     65.66 1.9e-04  -8.78
11433 MAPKAPK5-AS1      12_67     0.219     56.30 3.6e-05  -8.68
562          DGAT2      11_42     0.938     63.18 1.7e-04  -8.58
2220           AHR       7_17     0.122     57.29 2.0e-05  -8.51
10164        RNFT1      17_35     0.950     57.04 1.6e-04   8.43
3102         DOCK7       1_39     0.993     60.86 1.8e-04   8.35
10805        EHMT2       6_26     0.000    281.40 0.0e+00   8.32
5979           AUH       9_46     0.062     54.02 9.8e-06   8.26
12299    U91328.19       6_20     0.793     31.71 7.3e-05  -8.24
2607         SH2B3      12_67     0.079     46.30 1.1e-05   8.21
4024           TST      22_14     0.127     60.39 2.2e-05   8.15
10765      ZDHHC18       1_18     0.069     61.43 1.2e-05   8.08
10789         PBX2       6_26     0.000    243.77 0.0e+00   7.73

Comparing z scores and PIPs

#set nominal signifiance threshold for z scores
alpha <- 0.05

#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)

#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))

plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.01360973
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
          genename region_tag susie_pip       mu2     PVE      z
1119         USP40      2_137     0.000 131909.17 0.0e+00  59.39
922           DGKD      2_137     0.000 144372.20 0.0e+00 -58.58
1118       ATG16L1      2_137     0.000  23658.50 0.0e+00 -35.02
3636         HJURP      2_137     0.000  16455.03 0.0e+00  13.62
7536         NFIL3       9_46     0.775    102.06 2.3e-04  11.30
5318          USP3      15_29     0.908     88.19 2.3e-04  10.40
10374      ZSCAN26       6_22     0.269     69.86 5.5e-05   9.35
6089         FADS1      11_34     0.976     65.66 1.9e-04  -8.78
11433 MAPKAPK5-AS1      12_67     0.219     56.30 3.6e-05  -8.68
562          DGAT2      11_42     0.938     63.18 1.7e-04  -8.58
2220           AHR       7_17     0.122     57.29 2.0e-05  -8.51
10164        RNFT1      17_35     0.950     57.04 1.6e-04   8.43
3102         DOCK7       1_39     0.993     60.86 1.8e-04   8.35
10805        EHMT2       6_26     0.000    281.40 0.0e+00   8.32
5979           AUH       9_46     0.062     54.02 9.8e-06   8.26
12299    U91328.19       6_20     0.793     31.71 7.3e-05  -8.24
2607         SH2B3      12_67     0.079     46.30 1.1e-05   8.21
4024           TST      22_14     0.127     60.39 2.2e-05   8.15
10765      ZDHHC18       1_18     0.069     61.43 1.2e-05   8.08
10789         PBX2       6_26     0.000    243.77 0.0e+00   7.73

Locus plots for genes and SNPs

ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]

n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
  ctwas_res_region <-  ctwas_res[ctwas_res$region_tag==region_tag_plot,]
  start <- min(ctwas_res_region$pos)
  end <- max(ctwas_res_region$pos)
  
  ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
  ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
  ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
  
  #region name
  print(paste0("Region: ", region_tag_plot))
  
  #table of genes in region
  print(ctwas_res_region_gene[,report_cols])
  
  par(mfrow=c(4,1))
  
  #gene z scores
  plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
   ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
   main=paste0("Region: ", region_tag_plot))
  abline(h=sig_thresh,col="red",lty=2)
  
  #significance threshold for SNPs
  alpha_snp <- 5*10^(-8)
  sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
  
  #snp z scores
  plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
   ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
  abline(h=sig_thresh_snp,col="purple",lty=2)
  
  #gene pips
  plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
  
  #snp pips
  plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 2_137"
      genename region_tag susie_pip       mu2 PVE      z
10759   GIGYF2      2_137         0     66.65   0  -4.28
9519   C2orf82      2_137         0     18.89   0   0.49
1118   ATG16L1      2_137         0  23658.50   0 -35.02
922       DGKD      2_137         0 144372.20   0 -58.58
1119     USP40      2_137         0 131909.17   0  59.39
3636     HJURP      2_137         0  16455.03   0  13.62

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 9_46"
          genename region_tag susie_pip    mu2     PVE     z
11350 RP11-305L7.1       9_46     0.104  32.00 9.7e-06 -5.82
7536         NFIL3       9_46     0.775 102.06 2.3e-04 11.30
5979           AUH       9_46     0.062  54.02 9.8e-06  8.26

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 15_29"
        genename region_tag susie_pip   mu2     PVE     z
5315        TPM1      15_29     0.139  9.44 3.8e-06  1.72
1885       LACTB      15_29     0.150  8.61 3.8e-06 -0.90
9763      RPS27L      15_29     0.488 23.39 3.3e-05  3.86
5132       APH1B      15_29     0.126 10.10 3.7e-06  2.29
852         CA12      15_29     0.124  7.91 2.9e-06 -1.44
11938 AC007950.1      15_29     0.096 20.44 5.7e-06  4.42
5318        USP3      15_29     0.908 88.19 2.3e-04 10.40
10420     FBXL22      15_29     0.130  6.98 2.6e-06 -0.12
1888       HERC1      15_29     0.095 13.77 3.8e-06  3.71
355        DAPK2      15_29     0.415 19.73 2.4e-05  2.92
319         SNX1      15_29     0.139 10.79 4.4e-06 -2.44
6634       SNX22      15_29     0.106  5.47 1.7e-06  0.11
7791        PPIB      15_29     0.155 11.85 5.4e-06 -2.51
8159     CSNK1G1      15_29     0.111  6.78 2.2e-06 -1.40
1889       TRIP4      15_29     0.102  6.19 1.8e-06 -1.54
7795       PCLAF      15_29     0.102  6.19 1.8e-06 -1.54
9333      ZNF609      15_29     0.163 12.20 5.8e-06  2.51

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 6_22"
      genename region_tag susie_pip   mu2     PVE     z
10407   ZNF165       6_22     0.084 47.57 1.2e-05 -7.40
10335  ZSCAN16       6_22     0.060  9.93 1.7e-06 -2.54
10578  ZKSCAN8       6_22     0.762 17.60 3.9e-05  3.33
4950    ZSCAN9       6_22     0.028 11.55 9.4e-07  2.79
10174    NKAPL       6_22     0.814 15.51 3.7e-05  1.73
10021  ZKSCAN4       6_22     0.025 27.95 2.1e-06 -6.35
10374  ZSCAN26       6_22     0.269 69.86 5.5e-05  9.35
4972     PGBD1       6_22     0.030  6.40 5.6e-07  0.42
10188  ZKSCAN3       6_22     0.033  6.76 6.5e-07  0.18
11445  ZSCAN31       6_22     0.061 33.54 6.0e-06 -6.38
6712   ZSCAN12       6_22     0.027 26.02 2.1e-06  6.06
10865   TRIM27       6_22     0.028  6.31 5.2e-07 -0.16

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 11_34"
           genename region_tag susie_pip   mu2     PVE     z
10165       FAM111B      11_34     0.091  8.00 2.1e-06  1.07
7794        FAM111A      11_34     0.062  4.52 8.2e-07 -0.21
2506           DTX4      11_34     0.144 12.43 5.2e-06  1.64
10468         MPEG1      11_34     0.208 15.66 9.5e-06 -1.98
2515         MS4A6A      11_34     0.080  6.78 1.6e-06 -0.93
7815          PATL1      11_34     0.089  7.86 2.0e-06  1.14
7817           STX3      11_34     0.092  8.12 2.2e-06 -1.13
7818         MRPL16      11_34     0.100  8.91 2.6e-06  1.19
4634            GIF      11_34     0.088  7.86 2.0e-06 -1.16
4638           TCN1      11_34     0.076  6.45 1.4e-06 -0.82
6096          MS4A2      11_34     0.064  4.76 8.9e-07  0.48
11819    AP001257.1      11_34     0.077  6.53 1.5e-06 -0.80
11116        MS4A4E      11_34     0.115 10.30 3.4e-06 -1.50
2516         MS4A4A      11_34     0.064  4.73 8.8e-07 -0.33
7825         MS4A6E      11_34     0.065  4.84 9.1e-07 -0.09
7826          MS4A7      11_34     0.117 10.50 3.6e-06 -1.49
7827         MS4A14      11_34     0.071  5.80 1.2e-06  0.78
2519         CCDC86      11_34     0.063  4.79 8.8e-07  0.41
9570         PTGDR2      11_34     0.077  6.54 1.5e-06 -0.82
6093            ZP1      11_34     0.115 10.58 3.5e-06  1.50
2520         PRPF19      11_34     0.067  5.35 1.1e-06  0.60
2521        TMEM109      11_34     0.069  5.48 1.1e-06  0.61
2546        SLC15A3      11_34     0.075  5.97 1.3e-06 -0.35
2547            CD5      11_34     0.088  7.17 1.8e-06  0.51
8008         VPS37C      11_34     0.076  6.12 1.4e-06 -0.49
11874          PGA5      11_34     0.064  5.25 9.8e-07 -0.89
11340          PGA3      11_34     0.093  9.21 2.5e-06 -1.58
8009           VWCE      11_34     0.066  4.97 9.6e-07 -0.10
6088        TMEM138      11_34     0.062  4.75 8.6e-07 -0.42
7030       CYB561A3      11_34     0.062  4.75 8.6e-07 -0.42
9981        TMEM216      11_34     0.088  8.05 2.1e-06 -1.19
11871 RP11-286N22.8      11_34     0.062  4.59 8.4e-07 -0.30
4631          DAGLA      11_34     0.086 12.34 3.1e-06  3.11
3765           MYRF      11_34     0.330 31.28 3.0e-05 -4.94
4636          FADS2      11_34     0.064 34.98 6.5e-06 -6.34
4637        TMEM258      11_34     0.110 15.11 4.8e-06 -2.91
6089          FADS1      11_34     0.976 65.66 1.9e-04 -8.78
11190         FADS3      11_34     0.116  9.14 3.1e-06 -0.16
8011          BEST1      11_34     0.063  7.58 1.4e-06 -2.00
6092         INCENP      11_34     0.071  6.51 1.4e-06 -1.16
7032         ASRGL1      11_34     0.127 10.88 4.0e-06  1.30

Version Author Date
dfd2b5f wesleycrouse 2021-09-07

SNPs with highest PIPs

#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
                 id region_tag susie_pip        mu2     PVE      z
27225     rs1730862       1_66     1.000      52.14 1.5e-04   6.87
60143    rs12239046      1_131     1.000      47.10 1.4e-04  -6.51
126771    rs4973550      2_136     1.000      64.04 1.9e-04   7.33
126782    rs6731991      2_136     1.000      72.95 2.1e-04  -7.87
127119   rs76063448      2_137     1.000   65220.78 1.9e-01  87.93
127120   rs11888459      2_137     1.000 3533847.50 1.0e+01 233.42
127121    rs2885296      2_137     1.000 3352259.20 9.8e+00 316.17
127123   rs11568318      2_137     1.000  867241.58 2.5e+00 -89.02
313271   rs72834643       6_20     1.000     192.99 5.6e-04  12.21
313292  rs115740542       6_20     1.000     255.08 7.4e-04  14.39
363841  rs542176135       7_17     1.000     251.75 7.3e-04 -10.36
573150   rs11045819      12_16     1.000    5699.99 1.7e-02 -24.62
573167    rs4363657      12_16     1.000    2505.38 7.3e-03  54.59
600000     rs653178      12_67     1.000     230.50 6.7e-04 -15.66
676033     rs340029      15_27     1.000     101.84 3.0e-04  -9.73
997241    rs1611236       6_26     1.000 1800234.92 5.3e+00  -5.01
1130624  rs10995596      10_42     1.000   61551.50 1.8e-01  -4.75
1130642 rs773090945      10_42     1.000   61583.47 1.8e-01  -4.39
1134968  rs17476364      10_46     1.000     508.47 1.5e-03  22.54
1281603 rs138395719      17_23     1.000   64809.06 1.9e-01  -3.58
1360925 rs202143810      20_38     1.000   30924.58 9.0e-02   4.97
6858     rs79598313       1_18     0.999     100.15 2.9e-04   9.63
55463      rs822928      1_122     0.999      47.67 1.4e-04   6.57
422007   rs12550646       8_36     0.999      46.11 1.3e-04   7.32
527497   rs76153913       11_2     0.999      46.64 1.4e-04   6.13
781927    rs6129760      20_25     0.999      53.73 1.6e-04   7.10
1333479  rs59616136      19_14     0.999      52.37 1.5e-04   8.30
31587     rs2779116       1_79     0.997     155.00 4.5e-04 -12.25
766349     rs814573      19_31     0.996      41.04 1.2e-04   5.98
527511  rs118092312       11_2     0.994      54.45 1.6e-04  -2.24
712940    rs2608604      16_54     0.994      92.63 2.7e-04  -7.55
483876   rs34755157       9_71     0.992      46.74 1.4e-04  -6.53
365436   rs56130071       7_19     0.991      73.25 2.1e-04   8.31
572905   rs10770693      12_15     0.989      97.21 2.8e-04  10.67
127309    rs6431241      2_138     0.987      45.43 1.3e-04  -7.18
582345   rs10876377      12_33     0.982      44.99 1.3e-04   6.21
363863    rs4721597       7_17     0.977     146.35 4.2e-04   1.77
127324   rs72982172      2_138     0.975      38.66 1.1e-04  -4.20
313913  rs559462124       6_23     0.975      65.78 1.9e-04  -8.19
807183    rs6000553      22_14     0.975     156.55 4.5e-04  11.98
313136   rs72838866       6_19     0.962     101.13 2.8e-04  10.46
88741     rs1992769       2_58     0.953      36.52 1.0e-04  -5.62
530519    rs4910498       11_8     0.953      50.89 1.4e-04   6.55
74782    rs10495928       2_28     0.952      37.20 1.0e-04  -5.66
531948   rs10766077      11_10     0.938      58.06 1.6e-04   7.28
759592    rs3794991      19_15     0.928      67.02 1.8e-04   7.93
652044     rs873642      14_30     0.923      36.20 9.7e-05  -5.54
170572    rs6779146       3_84     0.920     312.97 8.4e-04  17.61
483491  rs115478735       9_70     0.914      54.28 1.4e-04  -7.00
583259    rs2657880      12_35     0.914      40.25 1.1e-04   5.92
1333486  rs61166126      19_14     0.909      51.55 1.4e-04  -8.19
471741     rs290992       9_46     0.904      36.81 9.7e-05  -6.09
387137  rs150063393       7_60     0.879      46.79 1.2e-04  -6.44
306987    rs2765359        6_7     0.835      41.58 1.0e-04   6.03
712888    rs8044367      16_54     0.825      47.77 1.2e-04   3.68
313861     rs149972       6_21     0.823     123.22 3.0e-04 -10.99
556181   rs11224303      11_58     0.818      38.54 9.2e-05   5.77
572865    rs7962574      12_15     0.811      71.60 1.7e-04  -9.30
757765  rs141645070      19_11     0.803      39.36 9.2e-05  -5.49

SNPs with largest effect sizes

#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
                id region_tag susie_pip     mu2     PVE      z
127120  rs11888459      2_137     1.000 3533848 1.0e+01 233.42
127103  rs13401281      2_137     0.000 3460773 0.0e+00 231.28
127121   rs2885296      2_137     1.000 3352259 9.8e+00 316.17
127117  rs17862875      2_137     0.000 3324144 0.0e+00 314.44
127100   rs1875263      2_137     0.000 3321675 0.0e+00 225.15
127099   rs6749496      2_137     0.000 2943314 0.0e+00 258.20
127092   rs7583278      2_137     0.000 2800028 0.0e+00 248.73
127108   rs4663332      2_137     0.000 2678089 0.0e+00 240.16
127109 rs200973045      2_137     0.000 2647161 0.0e+00 239.38
127110  rs57258852      2_137     0.000 2535219 0.0e+00 248.23
127114   rs3821242      2_137     0.000 2494136 0.0e+00 253.70
127112   rs2008584      2_137     0.000 2478403 0.0e+00 252.98
127086   rs6753320      2_137     0.000 2462593 0.0e+00 230.24
127087   rs6736743      2_137     0.000 2462593 0.0e+00 230.24
127091   rs2070959      2_137     0.000 2290791 0.0e+00 308.88
127090   rs1105880      2_137     0.000 2201096 0.0e+00 299.78
127085  rs77070100      2_137     0.000 2183541 0.0e+00 298.67
127098   rs2012734      2_137     0.000 2111027 0.0e+00 234.28
997227 rs111734624       6_26     0.417 1800450 2.2e+00  -5.03
997224   rs2508055       6_26     0.412 1800450 2.2e+00  -5.03
997260   rs1611252       6_26     0.290 1800449 1.5e+00  -5.03
997256   rs1611248       6_26     0.387 1800445 2.0e+00  -5.04
997271   rs1611260       6_26     0.138 1800444 7.3e-01  -5.03
997276   rs1611265       6_26     0.136 1800442 7.1e-01  -5.03
997179   rs1633033       6_26     0.028 1800407 1.5e-01  -5.04
997279   rs2394171       6_26     0.000 1800405 5.6e-07  -5.02
997278   rs1611267       6_26     0.000 1800403 3.5e-09  -5.01
997281   rs2893981       6_26     0.000 1800401 3.8e-07  -5.02
997230   rs1611228       6_26     0.000 1800396 2.2e-06  -5.02
997221   rs1737020       6_26     0.000 1800396 3.6e-11  -5.01
997222   rs1737019       6_26     0.000 1800396 3.6e-11  -5.01
997188   rs2844838       6_26     0.000 1800391 1.0e-03  -5.03
997191   rs1633032       6_26     0.000 1800275 4.3e-11  -5.03
997241   rs1611236       6_26     1.000 1800235 5.3e+00  -5.01
997215   rs1633020       6_26     0.000 1800177 0.0e+00  -5.01
997219   rs1633018       6_26     0.000 1800163 0.0e+00  -5.00
997239   rs1611234       6_26     0.000 1800053 0.0e+00  -5.00
997137   rs1610726       6_26     0.000 1799957 0.0e+00  -5.01
997186   rs2844840       6_26     0.000 1799774 0.0e+00  -5.04
997409   rs3129185       6_26     0.000 1799708 0.0e+00  -5.06
997417   rs1736999       6_26     0.000 1799632 0.0e+00  -5.06
997423   rs1633001       6_26     0.000 1799480 0.0e+00  -5.05
997254   rs1611246       6_26     0.000 1799463 0.0e+00  -5.02
997530   rs1632980       6_26     0.000 1799371 0.0e+00  -5.07
997204   rs1614309       6_26     0.000 1798920 0.0e+00  -5.04
997203   rs1633030       6_26     0.000 1797454 0.0e+00  -5.05
997289   rs9258382       6_26     0.000 1795543 0.0e+00  -5.04
997286   rs9258379       6_26     0.000 1792638 0.0e+00  -5.07
997249   rs1611241       6_26     0.000 1790436 0.0e+00  -5.12
997207   rs1633028       6_26     0.000 1787977 0.0e+00  -4.98

SNPs with highest PVE

#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
                 id region_tag susie_pip        mu2     PVE      z
127120   rs11888459      2_137     1.000 3533847.50 1.0e+01 233.42
127121    rs2885296      2_137     1.000 3352259.20 9.8e+00 316.17
997241    rs1611236       6_26     1.000 1800234.92 5.3e+00  -5.01
127123   rs11568318      2_137     1.000  867241.58 2.5e+00 -89.02
997224    rs2508055       6_26     0.412 1800450.19 2.2e+00  -5.03
997227  rs111734624       6_26     0.417 1800450.47 2.2e+00  -5.03
997256    rs1611248       6_26     0.387 1800444.76 2.0e+00  -5.04
997260    rs1611252       6_26     0.290 1800449.44 1.5e+00  -5.03
997271    rs1611260       6_26     0.138 1800443.58 7.3e-01  -5.03
997276    rs1611265       6_26     0.136 1800441.82 7.1e-01  -5.03
127119   rs76063448      2_137     1.000   65220.78 1.9e-01  87.93
1281603 rs138395719      17_23     1.000   64809.06 1.9e-01  -3.58
1130624  rs10995596      10_42     1.000   61551.50 1.8e-01  -4.75
1130642 rs773090945      10_42     1.000   61583.47 1.8e-01  -4.39
997179    rs1633033       6_26     0.028 1800407.22 1.5e-01  -5.04
1281602   rs9893520      17_23     0.514   64908.56 9.7e-02  -3.48
1360925 rs202143810      20_38     1.000   30924.58 9.0e-02   4.97
1360922   rs6089961      20_38     0.721   30930.57 6.5e-02   4.69
1360924   rs2738758      20_38     0.721   30930.57 6.5e-02   4.69
1281607   rs9897092      17_23     0.299   64907.43 5.7e-02  -3.48
1281572   rs9916782      17_23     0.248   64890.52 4.7e-02  -3.52
1281571  rs16965687      17_23     0.232   64890.02 4.4e-02  -3.52
1281573  rs79880154      17_23     0.170   64905.67 3.2e-02  -3.49
1281558   rs7359537      17_23     0.127   64903.24 2.4e-02  -3.49
1281561   rs9892404      17_23     0.124   64903.27 2.3e-02  -3.49
1281565   rs9906409      17_23     0.123   64903.38 2.3e-02  -3.49
573150   rs11045819      12_16     1.000    5699.99 1.7e-02 -24.62
573051   rs12366506      12_16     0.677    5056.28 1.0e-02  23.86
1281551  rs72836642      17_23     0.049   64889.11 9.4e-03  -3.50
573057   rs11045612      12_16     0.560    5051.97 8.3e-03  23.88
573167    rs4363657      12_16     1.000    2505.38 7.3e-03  54.59
1281554  rs58989731      17_23     0.024   63901.49 4.5e-03  -3.33
573068   rs73062442      12_16     0.301    5056.76 4.4e-03  23.83
1360903  rs35201382      20_38     0.047   30919.14 4.2e-03   4.68
1360905   rs2750483      20_38     0.041   30917.71 3.7e-03   4.68
573060   rs11513221      12_16     0.179    5056.45 2.6e-03  23.82
1129710  rs10761756      10_42     0.183    4337.80 2.3e-03 -15.02
1134968  rs17476364      10_46     1.000     508.47 1.5e-03  22.54
1360904  rs67468102      20_38     0.016   30913.17 1.5e-03   4.68
1360789 rs113486739      20_38     0.577     773.14 1.3e-03   0.25
572967    rs7965380      12_16     0.496     820.74 1.2e-03 -28.32
997188    rs2844838       6_26     0.000 1800390.56 1.0e-03  -5.03
1281679 rs562160239      17_23     0.005   64677.28 9.7e-04  -3.40
170572    rs6779146       3_84     0.920     312.97 8.4e-04  17.61
1281651  rs60220138      17_23     0.004   64029.11 7.5e-04  -3.45
313292  rs115740542       6_20     1.000     255.08 7.4e-04  14.39
363841  rs542176135       7_17     1.000     251.75 7.3e-04 -10.36
1281678   rs1807273      17_23     0.004   64720.04 6.9e-04  -3.42
600000     rs653178      12_67     1.000     230.50 6.7e-04 -15.66
1281529 rs546912945      17_23     0.004   64494.60 6.6e-04  -3.42

SNPs with largest z scores

#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
                id region_tag susie_pip        mu2   PVE       z
127121   rs2885296      2_137         1 3352259.20  9.80  316.17
127117  rs17862875      2_137         0 3324143.86  0.00  314.44
127091   rs2070959      2_137         0 2290790.93  0.00  308.88
127090   rs1105880      2_137         0 2201096.23  0.00  299.78
127085  rs77070100      2_137         0 2183540.64  0.00  298.67
127099   rs6749496      2_137         0 2943314.21  0.00  258.20
127114   rs3821242      2_137         0 2494136.46  0.00  253.70
127112   rs2008584      2_137         0 2478402.82  0.00  252.98
127092   rs7583278      2_137         0 2800027.53  0.00  248.73
127110  rs57258852      2_137         0 2535219.32  0.00  248.23
127076   rs2741034      2_137         0 1426356.18  0.00  243.32
127068   rs2602363      2_137         0 1423610.85  0.00  243.08
127063   rs2741013      2_137         0 1420350.81  0.00  242.86
127108   rs4663332      2_137         0 2678089.04  0.00  240.16
127109 rs200973045      2_137         0 2647161.36  0.00  239.38
127098   rs2012734      2_137         0 2111026.84  0.00  234.28
127120  rs11888459      2_137         1 3533847.50 10.00  233.42
127103  rs13401281      2_137         0 3460772.69  0.00  231.28
127086   rs6753320      2_137         0 2462592.75  0.00  230.24
127087   rs6736743      2_137         0 2462592.75  0.00  230.24
127100   rs1875263      2_137         0 3321674.62  0.00  225.15
127135   rs2361502      2_137         0 1484520.51  0.00  200.59
127072   rs6431558      2_137         0  813258.17  0.00 -179.35
127080   rs1113193      2_137         0  309874.18  0.00 -123.49
127074   rs1823803      2_137         0  624535.82  0.00  109.93
127132  rs10202032      2_137         0  533854.86  0.00 -104.88
127122 rs143373661      2_137         0  415355.58  0.00  100.66
127133   rs6723936      2_137         0  354083.38  0.00  100.34
127123  rs11568318      2_137         1  867241.58  2.50  -89.02
127113  rs45507691      2_137         0  857756.71  0.00  -88.76
127119  rs76063448      2_137         1   65220.78  0.19   87.93
127070  rs13027376      2_137         0  432857.74  0.00  -87.77
127067   rs4047192      2_137         0  432460.69  0.00  -87.66
127115  rs45510999      2_137         0   64739.98  0.00   87.41
127107 rs183532563      2_137         0   64198.55  0.00   86.77
127089  rs12476197      2_137         0  505544.66  0.00  -82.78
127083   rs4663871      2_137         0  501885.52  0.00  -82.54
127088 rs765251456      2_137         0  495299.59  0.00  -82.33
127064   rs7563478      2_137         0  712492.31  0.00  -80.75
127116 rs139257330      2_137         0  258993.87  0.00   77.73
127081  rs17863773      2_137         0  394984.33  0.00  -76.44
127073  rs17863766      2_137         0  342986.84  0.00  -75.62
127075  rs10929252      2_137         0  349921.03  0.00  -75.55
127062 rs140719475      2_137         0  348420.79  0.00  -75.37
127125   rs2003569      2_137         0  423254.24  0.00  -74.03
127078   rs2602372      2_137         0  228650.01  0.00   73.61
127065   rs6713902      2_137         0  305824.08  0.00  -72.04
127050   rs4047189      2_137         0  185907.06  0.00   69.43
127055  rs28802965      2_137         0  170760.00  0.00   67.17
127048  rs62192764      2_137         0  221087.62  0.00  -65.91

Gene set enrichment for genes with PIP>0.8

#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ... 
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]

#number of genes for gene set enrichment
length(genes)
[1] 95
if (length(genes)>0){
  GO_enrichment <- enrichr(genes, dbs)

  for (db in dbs){
    print(db)
    df <- GO_enrichment[[db]]
    df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
    print(df)
  }
  
  #DisGeNET enrichment
  
  # devtools::install_bitbucket("ibi_group/disgenet2r")
  library(disgenet2r)
  
  disgenet_api_key <- get_disgenet_api_key(
                    email = "wesleycrouse@gmail.com", 
                    password = "uchicago1" )
  
  Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
  
  res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
                               database = "CURATED" )
  
  df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio",  "BgRatio")]
  print(df)
  
  #WebGestalt enrichment
  library(WebGestaltR)
  
  background <- ctwas_gene_res$genename
  
  #listGeneSet()
  databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
  
  enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
                              interestGene=genes, referenceGene=background,
                              enrichDatabase=databases, interestGeneType="genesymbol",
                              referenceGeneType="genesymbol", isOutput=F)
  print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
  Querying GO_Biological_Process_2021... Done.
  Querying GO_Cellular_Component_2021... Done.
  Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
CTB-50L17.10 gene(s) from the input list not found in DisGeNET CURATEDPKN3 gene(s) from the input list not found in DisGeNET CURATEDMMS22L gene(s) from the input list not found in DisGeNET CURATEDRP11-88E10.5 gene(s) from the input list not found in DisGeNET CURATEDPSRC1 gene(s) from the input list not found in DisGeNET CURATEDRRBP1 gene(s) from the input list not found in DisGeNET CURATEDSIK2 gene(s) from the input list not found in DisGeNET CURATEDUBASH3B gene(s) from the input list not found in DisGeNET CURATEDSMG6 gene(s) from the input list not found in DisGeNET CURATEDTTC5 gene(s) from the input list not found in DisGeNET CURATEDFBXO42 gene(s) from the input list not found in DisGeNET CURATEDLUC7L3 gene(s) from the input list not found in DisGeNET CURATEDC9orf43 gene(s) from the input list not found in DisGeNET CURATEDLAMP1 gene(s) from the input list not found in DisGeNET CURATEDBACH1-AS1 gene(s) from the input list not found in DisGeNET CURATEDNECAP2 gene(s) from the input list not found in DisGeNET CURATEDRNFT1 gene(s) from the input list not found in DisGeNET CURATEDABHD6 gene(s) from the input list not found in DisGeNET CURATEDAMZ2 gene(s) from the input list not found in DisGeNET CURATEDUNK gene(s) from the input list not found in DisGeNET CURATEDTM4SF19 gene(s) from the input list not found in DisGeNET CURATEDCCDC157 gene(s) from the input list not found in DisGeNET CURATEDSYTL1 gene(s) from the input list not found in DisGeNET CURATEDCCT6A gene(s) from the input list not found in DisGeNET CURATEDGPX7 gene(s) from the input list not found in DisGeNET CURATEDKLHL23 gene(s) from the input list not found in DisGeNET CURATEDFAM91A1 gene(s) from the input list not found in DisGeNET CURATEDRAB17 gene(s) from the input list not found in DisGeNET CURATEDLRRC43 gene(s) from the input list not found in DisGeNET CURATEDINAFM1 gene(s) from the input list not found in DisGeNET CURATEDCSNK1G3 gene(s) from the input list not found in DisGeNET CURATEDDAGLB gene(s) from the input list not found in DisGeNET CURATEDRP11-334C17.6 gene(s) from the input list not found in DisGeNET CURATEDLINC01624 gene(s) from the input list not found in DisGeNET CURATEDGTF2H2 gene(s) from the input list not found in DisGeNET CURATEDSMIM19 gene(s) from the input list not found in DisGeNET CURATEDTBL2 gene(s) from the input list not found in DisGeNET CURATEDTXNDC11 gene(s) from the input list not found in DisGeNET CURATEDUBE2Z gene(s) from the input list not found in DisGeNET CURATEDCEP44 gene(s) from the input list not found in DisGeNET CURATEDSHKBP1 gene(s) from the input list not found in DisGeNET CURATEDLINC00565 gene(s) from the input list not found in DisGeNET CURATEDMPV17L2 gene(s) from the input list not found in DisGeNET CURATEDNPIPA1 gene(s) from the input list not found in DisGeNET CURATEDSLC50A1 gene(s) from the input list not found in DisGeNET CURATEDHABP4 gene(s) from the input list not found in DisGeNET CURATED
                                                Description        FDR
26                                    Conjunctival Diseases 0.06862745
70                                                   polyps 0.06862745
138               Symmetrical dyschromatosis of extremities 0.06862745
180                                Preeclampsia Eclampsia 4 0.06862745
181                    CHARCOT-MARIE-TOOTH DISEASE, TYPE 4H 0.06862745
183                               SPINOCEREBELLAR ATAXIA 21 0.06862745
191 Spinal Muscular Atrophy, Distal, Autosomal Recessive, 4 0.06862745
194                                 Hypomagnesemia 4, Renal 0.06862745
205                            AICARDI-GOUTIERES SYNDROME 6 0.06862745
212   CHARCOT-MARIE-TOOTH DISEASE, RECESSIVE INTERMEDIATE C 0.06862745
    Ratio BgRatio
26   1/49  1/9703
70   1/49  1/9703
138  1/49  1/9703
180  1/49  1/9703
181  1/49  1/9703
183  1/49  1/9703
191  1/49  1/9703
194  1/49  1/9703
205  1/49  1/9703
212  1/49  1/9703
******************************************

*                                        *

*          Welcome to WebGestaltR !      *

*                                        *

******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet,
minNum = minNum, : No significant gene set is identified based on FDR 0.05!
NULL

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0       cowplot_1.0.0    
[5] ggplot2_3.3.3    

loaded via a namespace (and not attached):
  [1] bitops_1.0-6                matrixStats_0.57.0         
  [3] fs_1.3.1                    bit64_4.0.5                
  [5] doParallel_1.0.16           progress_1.2.2             
  [7] httr_1.4.1                  rprojroot_2.0.2            
  [9] GenomeInfoDb_1.20.0         doRNG_1.8.2                
 [11] tools_3.6.1                 utf8_1.2.1                 
 [13] R6_2.5.0                    DBI_1.1.1                  
 [15] BiocGenerics_0.30.0         colorspace_1.4-1           
 [17] withr_2.4.1                 tidyselect_1.1.0           
 [19] prettyunits_1.0.2           bit_4.0.4                  
 [21] curl_3.3                    compiler_3.6.1             
 [23] git2r_0.26.1                Biobase_2.44.0             
 [25] DelayedArray_0.10.0         rtracklayer_1.44.0         
 [27] labeling_0.3                scales_1.1.0               
 [29] readr_1.4.0                 apcluster_1.4.8            
 [31] stringr_1.4.0               digest_0.6.20              
 [33] Rsamtools_2.0.0             svglite_1.2.2              
 [35] rmarkdown_1.13              XVector_0.24.0             
 [37] pkgconfig_2.0.3             htmltools_0.3.6            
 [39] fastmap_1.1.0               BSgenome_1.52.0            
 [41] rlang_0.4.11                RSQLite_2.2.7              
 [43] generics_0.0.2              farver_2.1.0               
 [45] jsonlite_1.6                BiocParallel_1.18.0        
 [47] dplyr_1.0.7                 VariantAnnotation_1.30.1   
 [49] RCurl_1.98-1.1              magrittr_2.0.1             
 [51] GenomeInfoDbData_1.2.1      Matrix_1.2-18              
 [53] Rcpp_1.0.6                  munsell_0.5.0              
 [55] S4Vectors_0.22.1            fansi_0.5.0                
 [57] gdtools_0.1.9               lifecycle_1.0.0            
 [59] stringi_1.4.3               whisker_0.3-2              
 [61] yaml_2.2.0                  SummarizedExperiment_1.14.1
 [63] zlibbioc_1.30.0             plyr_1.8.4                 
 [65] grid_3.6.1                  blob_1.2.1                 
 [67] parallel_3.6.1              promises_1.0.1             
 [69] crayon_1.4.1                lattice_0.20-38            
 [71] Biostrings_2.52.0           GenomicFeatures_1.36.3     
 [73] hms_1.1.0                   knitr_1.23                 
 [75] pillar_1.6.1                igraph_1.2.4.1             
 [77] GenomicRanges_1.36.0        rjson_0.2.20               
 [79] rngtools_1.5                codetools_0.2-16           
 [81] reshape2_1.4.3              biomaRt_2.40.1             
 [83] stats4_3.6.1                XML_3.98-1.20              
 [85] glue_1.4.2                  evaluate_0.14              
 [87] data.table_1.14.0           foreach_1.5.1              
 [89] vctrs_0.3.8                 httpuv_1.5.1               
 [91] gtable_0.3.0                purrr_0.3.4                
 [93] assertthat_0.2.1            cachem_1.0.5               
 [95] xfun_0.8                    later_0.8.0                
 [97] tibble_3.1.2                iterators_1.0.13           
 [99] GenomicAlignments_1.20.1    AnnotationDbi_1.46.0       
[101] memoise_2.0.0               IRanges_2.18.1             
[103] workflowr_1.6.2             ellipsis_0.3.2