Last updated: 2021-09-09

Checks: 6 1

Knit directory: ctwas_applied/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20210726) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 59e5f4d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Unstaged changes:
    Modified:   analysis/ukb-d-30500_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30600_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30610_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30620_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30630_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30640_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30650_irnt_Liver.Rmd
    Modified:   analysis/ukb-d-30660_irnt_Liver.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30660_irnt_Liver.Rmd) and HTML (docs/ukb-d-30660_irnt_Liver.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd cbf7408 wesleycrouse 2021-09-08 adding enrichment to reports
html cbf7408 wesleycrouse 2021-09-08 adding enrichment to reports
Rmd 4970e3e wesleycrouse 2021-09-08 updating reports
html 4970e3e wesleycrouse 2021-09-08 updating reports
Rmd 627a4e1 wesleycrouse 2021-09-07 adding heritability
Rmd dfd2b5f wesleycrouse 2021-09-07 regenerating reports
html dfd2b5f wesleycrouse 2021-09-07 regenerating reports
Rmd 61b53b3 wesleycrouse 2021-09-06 updated PVE calculation
html 61b53b3 wesleycrouse 2021-09-06 updated PVE calculation
Rmd 837dd01 wesleycrouse 2021-09-01 adding additional fixedsigma report
Rmd d0a5417 wesleycrouse 2021-08-30 adding new reports to the index
Rmd 0922de7 wesleycrouse 2021-08-18 updating all reports with locus plots
html 0922de7 wesleycrouse 2021-08-18 updating all reports with locus plots
html 1c62980 wesleycrouse 2021-08-11 Updating reports
Rmd 5981e80 wesleycrouse 2021-08-11 Adding more reports
html 5981e80 wesleycrouse 2021-08-11 Adding more reports
Rmd 05a98b7 wesleycrouse 2021-08-07 adding additional results
html 05a98b7 wesleycrouse 2021-08-07 adding additional results
html 03e541c wesleycrouse 2021-07-29 Cleaning up report generation
Rmd 276893d wesleycrouse 2021-07-29 Updating reports
html 276893d wesleycrouse 2021-07-29 Updating reports

Overview

These are the results of a ctwas analysis of the UK Biobank trait Direct bilirubin (quantile) using Liver gene weights.

The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30660_irnt. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.

The weights are mashr GTEx v8 models on Liver eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)

LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])

Weight QC

TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)

qclist_all <- list()

qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))

for (i in 1:length(qc_files)){
  load(qc_files[i])
  chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
  qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}

qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"

rm(qclist, wgtlist, z_gene_chr)

#number of imputed weights
nrow(qclist_all)
[1] 10901
#number of imputed weights by chromosome
table(qclist_all$chr)

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
1070  768  652  417  494  611  548  408  405  434  634  629  195  365  354 
  16   17   18   19   20   21   22 
 526  663  160  859  306  114  289 
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.8366205

Load ctwas results

#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))

#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")

#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size #check PVE calculation

#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)

#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)

ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])

#add z score to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z

#load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd")) #for new version, stored after harmonization
z_snp <- readRDS(paste0(results_dir, "/", trait_id, ".RDS")) #for old version, unharmonized

z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,] #subset snps to those included in analysis, note some are duplicated, need to match which allele was used
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)] #for duplicated snps, this arbitrarily uses the first allele
ctwas_snp_res$z_flag <- as.numeric(ctwas_snp_res$id %in% z_snp$id[duplicated(z_snp$id)]) #mark the unclear z scores, flag=1

#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)

#merge gene and snp results with added information
ctwas_gene_res$z_flag=NA
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA

ctwas_res <- rbind(ctwas_gene_res,
                   ctwas_snp_res[,colnames(ctwas_gene_res)])

#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])

report_cols_snps <- c("id", report_cols[-1])

#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)

Check convergence of parameters

library(ggplot2)
library(cowplot)

********************************************************
Note: As of version 1.0.0, cowplot does not change the
  default ggplot2 theme anymore. To recover the previous
  behavior, execute:
  theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
  
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
                 value = c(group_prior_rec[1,], group_prior_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)

df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument

p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(pi)) +
  ggtitle("Prior mean") +
  theme_cowplot()

df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
                 value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
                 group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
  geom_line(aes(color=group)) +
  geom_point(aes(color=group)) +
  xlab("Iteration") + ylab(bquote(sigma^2)) +
  ggtitle("Prior variance") +
  theme_cowplot()

plot_grid(p_pi, p_sigma2)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
        gene          snp 
4.548826e-02 6.866634e-06 
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
       gene         snp 
   5.325796 1575.790741 
#report sample size
print(sample_size)
[1] 292933
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1]   10901 8697330
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
       gene         snp 
0.009015335 0.321262531 
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1]  0.05609462 10.78474853

Genes with highest PIPs

#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
           genename region_tag susie_pip   mu2     PVE     z
3212          CCND2       12_4     1.000 25.94 8.9e-05  5.34
5563          ABCG8       2_27     0.999 30.18 1.0e-04  5.88
12467 RP11-219B17.3      15_27     0.999 43.14 1.5e-04  7.18
10495         PRMT6       1_66     0.997 24.37 8.3e-05  5.14
7040          INHBB       2_70     0.997 21.98 7.5e-05  4.81
1848          CD276      15_35     0.997 32.17 1.1e-04  6.13
11790        CYP2A6      19_28     0.995 21.10 7.2e-05 -4.73
1320        CWF19L1      10_64     0.994 26.77 9.1e-05 -7.09
1231         PABPC4       1_24     0.992 20.58 7.0e-05  4.52
3562         ACVR1C       2_94     0.992 20.57 7.0e-05  4.62
6093        CSNK1G3       5_75     0.992 21.37 7.2e-05  4.74
5780       IRAK1BP1       6_54     0.990 22.21 7.5e-05  4.85
2092            SP4       7_19     0.982 19.81 6.6e-05  4.51
8811         SMIM19       8_37     0.982 27.65 9.3e-05 -5.58
6291          JAZF1       7_23     0.979 21.96 7.3e-05  4.83
2924          EFHD1      2_136     0.977 25.00 8.3e-05  6.05
10214        ZNF165       6_22     0.971 22.14 7.3e-05  5.99
2359          ABCC3      17_29     0.971 18.15 6.0e-05  4.38
12687   RP4-781K5.7      1_121     0.967 17.86 5.9e-05 -4.17
5978        ZC3H12C      11_65     0.966 18.04 5.9e-05 -4.19
11669 RP11-452H21.4      11_43     0.964 28.01 9.2e-05  5.78
9390           GAS6      13_62     0.964 19.49 6.4e-05  4.46
4269          ITGB4      17_42     0.963 19.97 6.6e-05 -4.91
8156           GYPA       4_94     0.962 20.07 6.6e-05  4.83
7547           LIPC      15_26     0.962 17.81 5.8e-05  3.99
537           DGAT2      11_42     0.959 46.58 1.5e-04 -7.51
5089         SCAF11      12_29     0.958 17.64 5.8e-05  4.13
583           ZNF76       6_28     0.946 18.05 5.8e-05 -4.20
59              MPO      17_34     0.939 19.17 6.1e-05  4.62
1120           CETP      16_31     0.932 17.60 5.6e-05 -4.03
11009     LINC01118       2_29     0.931 17.02 5.4e-05  3.96
1146         DNMT3B      20_19     0.924 16.75 5.3e-05 -3.98
10501        MAP3K3      17_37     0.922 15.89 5.0e-05  3.49
1153           TGDS      13_47     0.920 17.15 5.4e-05 -4.00
666           COASY      17_25     0.920 17.40 5.5e-05 -3.97
1609          UQCC1      20_21     0.920 16.50 5.2e-05 -3.60
7651          CASC4      15_17     0.919 16.06 5.0e-05  3.77
5799        SLC22A3      6_104     0.916 19.80 6.2e-05  4.51
669          ATP11A      13_61     0.916 17.00 5.3e-05 -3.91
5544          CNIH4      1_114     0.912 16.35 5.1e-05 -3.78
7266           HHIP       4_94     0.910 17.25 5.4e-05 -4.02
6439            MCU      10_48     0.909 15.39 4.8e-05  3.22
6682         CYB5R1      1_102     0.903 16.93 5.2e-05 -3.95
4178         PDLIM4       5_79     0.887 18.39 5.6e-05 -4.33
7256          CEP44      4_113     0.886 16.77 5.1e-05  3.81
1494           TXN2      22_14     0.862 16.02 4.7e-05  3.67
10212          IL27      16_23     0.861 20.59 6.1e-05 -4.76
4295           RHBG       1_77     0.857 16.61 4.9e-05 -3.78
3791           PROZ      13_62     0.847 16.41 4.7e-05 -3.64
8506        FAM222B      17_17     0.824 17.58 4.9e-05  4.12
4271          H3F3B      17_42     0.818 23.75 6.6e-05  5.28
8190        ADORA2B      17_14     0.817 17.86 5.0e-05  4.14
2204           AKNA       9_59     0.810 16.42 4.5e-05  3.72

Genes with largest effect sizes

#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
           genename region_tag susie_pip       mu2 PVE       z
12683         HCP5B       6_24         0 323765.43   0   -3.52
10663        TRIM31       6_24         0 169467.05   0    1.07
4833          FLOT1       6_24         0 160968.35   0   -1.07
10651         ABCF1       6_24         0  73493.54   0   -3.76
5766        PPP1R18       6_24         0  63880.99   0   -3.94
10425        AKR1C4       10_6         0  59625.40   0    6.16
11533        UGT1A4      2_137         0  51199.81   0  232.75
11447        UGT1A1      2_137         0  43485.02   0 -230.41
4836            NRM       6_24         0  36023.92   0   -0.40
10667         HLA-G       6_24         0  34755.83   0   -6.69
11489        UGT1A3      2_137         0  34565.53   0  213.80
7732         UGT1A6      2_137         0  29022.18   0  186.96
624           ZNRD1       6_24         0  25299.13   0    0.19
10661        TRIM10       6_24         0  10671.43   0   -0.49
11033 RP11-445P17.8       10_6         0   7886.61   0   -1.15
11522        UGT1A7      2_137         0   5624.83   0  -71.90
10648      C6orf136       6_24         0   4697.83   0    0.11
11136         HCG20       6_24         0   4212.20   0   -1.98
10664         RNF39       6_24         0   3872.54   0   -1.02
10774         HLA-A       6_24         0   1958.68   0   -0.83

Genes with highest PVE

#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
           genename region_tag susie_pip   mu2     PVE     z
537           DGAT2      11_42     0.959 46.58 1.5e-04 -7.51
12467 RP11-219B17.3      15_27     0.999 43.14 1.5e-04  7.18
1848          CD276      15_35     0.997 32.17 1.1e-04  6.13
5563          ABCG8       2_27     0.999 30.18 1.0e-04  5.88
8811         SMIM19       8_37     0.982 27.65 9.3e-05 -5.58
11669 RP11-452H21.4      11_43     0.964 28.01 9.2e-05  5.78
1320        CWF19L1      10_64     0.994 26.77 9.1e-05 -7.09
3212          CCND2       12_4     1.000 25.94 8.9e-05  5.34
10495         PRMT6       1_66     0.997 24.37 8.3e-05  5.14
2924          EFHD1      2_136     0.977 25.00 8.3e-05  6.05
7040          INHBB       2_70     0.997 21.98 7.5e-05  4.81
5780       IRAK1BP1       6_54     0.990 22.21 7.5e-05  4.85
2004          TGFB1      19_28     0.749 29.34 7.5e-05  5.64
10214        ZNF165       6_22     0.971 22.14 7.3e-05  5.99
6291          JAZF1       7_23     0.979 21.96 7.3e-05  4.83
11216       CYP21A2       6_26     0.754 28.09 7.2e-05 -6.07
6093        CSNK1G3       5_75     0.992 21.37 7.2e-05  4.74
11790        CYP2A6      19_28     0.995 21.10 7.2e-05 -4.73
1231         PABPC4       1_24     0.992 20.58 7.0e-05  4.52
3562         ACVR1C       2_94     0.992 20.57 7.0e-05  4.62

Genes with largest z scores

#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
           genename region_tag susie_pip      mu2     PVE       z
11533        UGT1A4      2_137     0.000 51199.81 0.0e+00  232.75
11447        UGT1A1      2_137     0.000 43485.02 0.0e+00 -230.41
11489        UGT1A3      2_137     0.000 34565.53 0.0e+00  213.80
7732         UGT1A6      2_137     0.000 29022.18 0.0e+00  186.96
11522        UGT1A7      2_137     0.000  5624.83 0.0e+00  -71.90
1088          USP40      2_137     0.000  1758.34 0.0e+00  -46.64
10747       SLCO1B7      12_16     0.000  1025.26 0.0e+00   12.26
3556          HJURP      2_137     0.000   169.37 0.0e+00   10.96
8651           MSL2       3_84     0.094    78.80 2.5e-05   10.28
2584        SLCO1B3      12_16     0.000   506.35 0.0e+00    9.93
2586         GOLT1B      12_16     0.000    63.88 0.0e+00    7.53
537           DGAT2      11_42     0.959    46.58 1.5e-04   -7.51
11290  MAPKAPK5-AS1      12_67     0.155    34.68 1.8e-05   -7.21
12467 RP11-219B17.3      15_27     0.999    43.14 1.5e-04    7.18
2541          ALDH2      12_67     0.134    32.96 1.5e-05    7.10
1320        CWF19L1      10_64     0.994    26.77 9.1e-05   -7.09
2536          SH2B3      12_67     0.084    29.11 8.3e-06    6.80
10667         HLA-G       6_24     0.000 34755.83 0.0e+00   -6.69
2170            AHR       7_17     0.065    24.94 5.5e-06   -6.58
4962          EXOC6      10_59     0.110    35.21 1.3e-05   -6.37

Comparing z scores and PIPs

#set nominal signifiance threshold for z scores
alpha <- 0.05

#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)

#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))

plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.006696633
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
           genename region_tag susie_pip      mu2     PVE       z
11533        UGT1A4      2_137     0.000 51199.81 0.0e+00  232.75
11447        UGT1A1      2_137     0.000 43485.02 0.0e+00 -230.41
11489        UGT1A3      2_137     0.000 34565.53 0.0e+00  213.80
7732         UGT1A6      2_137     0.000 29022.18 0.0e+00  186.96
11522        UGT1A7      2_137     0.000  5624.83 0.0e+00  -71.90
1088          USP40      2_137     0.000  1758.34 0.0e+00  -46.64
10747       SLCO1B7      12_16     0.000  1025.26 0.0e+00   12.26
3556          HJURP      2_137     0.000   169.37 0.0e+00   10.96
8651           MSL2       3_84     0.094    78.80 2.5e-05   10.28
2584        SLCO1B3      12_16     0.000   506.35 0.0e+00    9.93
2586         GOLT1B      12_16     0.000    63.88 0.0e+00    7.53
537           DGAT2      11_42     0.959    46.58 1.5e-04   -7.51
11290  MAPKAPK5-AS1      12_67     0.155    34.68 1.8e-05   -7.21
12467 RP11-219B17.3      15_27     0.999    43.14 1.5e-04    7.18
2541          ALDH2      12_67     0.134    32.96 1.5e-05    7.10
1320        CWF19L1      10_64     0.994    26.77 9.1e-05   -7.09
2536          SH2B3      12_67     0.084    29.11 8.3e-06    6.80
10667         HLA-G       6_24     0.000 34755.83 0.0e+00   -6.69
2170            AHR       7_17     0.065    24.94 5.5e-06   -6.58
4962          EXOC6      10_59     0.110    35.21 1.3e-05   -6.37

Locus plots for genes and SNPs

ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]

n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
  ctwas_res_region <-  ctwas_res[ctwas_res$region_tag==region_tag_plot,]
  start <- min(ctwas_res_region$pos)
  end <- max(ctwas_res_region$pos)
  
  ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
  ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
  ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
  
  #region name
  print(paste0("Region: ", region_tag_plot))
  
  #table of genes in region
  print(ctwas_res_region_gene[,report_cols])
  
  par(mfrow=c(4,1))
  
  #gene z scores
  plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
   ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
   main=paste0("Region: ", region_tag_plot))
  abline(h=sig_thresh,col="red",lty=2)
  
  #significance threshold for SNPs
  alpha_snp <- 5*10^(-8)
  sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
  
  #snp z scores
  plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
   ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
  abline(h=sig_thresh_snp,col="purple",lty=2)
  
  #gene pips
  plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
  
  #snp pips
  plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
  abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 2_137"
        genename region_tag susie_pip      mu2 PVE       z
10567     GIGYF2      2_137         0    30.97   0   -5.12
9340     C2orf82      2_137         0     5.79   0    0.45
620         NGEF      2_137         0     7.24   0    2.52
8006      INPP5D      2_137         0    15.40   0    4.05
879         DGKD      2_137         0    31.39   0   -0.07
1088       USP40      2_137         0  1758.34   0  -46.64
11522     UGT1A7      2_137         0  5624.83   0  -71.90
7732      UGT1A6      2_137         0 29022.18   0  186.96
11533     UGT1A4      2_137         0 51199.81   0  232.75
11489     UGT1A3      2_137         0 34565.53   0  213.80
11447     UGT1A1      2_137         0 43485.02   0 -230.41
3556       HJURP      2_137         0   169.37   0   10.96
11098 AC006037.2      2_137         0     5.10   0    1.50

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 12_16"
      genename region_tag susie_pip     mu2 PVE     z
2584   SLCO1B3      12_16         0  506.35   0  9.93
10747  SLCO1B7      12_16         0 1025.26   0 12.26
3400      IAPP      12_16         0  138.98   0  5.16
3399   PYROXD1      12_16         0   19.02   0  0.89
36       RECQL      12_16         0  221.55   0  3.81
2586    GOLT1B      12_16         0   63.88   0  7.53
4482       SPX      12_16         0  230.14   0  4.60
2587      LDHB      12_16         0    8.56   0  0.07
3401     KCNJ8      12_16         0    6.44   0 -0.33
689      ABCC9      12_16         0    8.41   0  2.11
2590     C2CD5      12_16         0    9.94   0 -1.39
5073     ETNK1      12_16         0   11.12   0  0.30

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 3_84"
     genename region_tag susie_pip   mu2     PVE     z
796   PPP2R3A       3_84     0.178 13.30 8.1e-06 -2.46
8651     MSL2       3_84     0.094 78.80 2.5e-05 10.28
2795     PCCB       3_84     0.083  5.45 1.6e-06  1.22
3148    STAG1       3_84     0.086  4.57 1.3e-06 -0.03
6584     NCK1       3_84     0.085  7.94 2.3e-06 -2.19

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 11_42"
           genename region_tag susie_pip   mu2     PVE     z
7611          XRRA1      11_42     0.161  7.17 3.9e-06 -1.00
3170          SPCS2      11_42     0.129  5.40 2.4e-06  0.73
6901           NEU3      11_42     0.116  4.39 1.7e-06  0.39
4848        SLCO2B1      11_42     0.116  4.35 1.7e-06 -0.22
12001         TPBGL      11_42     0.145  6.17 3.0e-06  0.68
6617          GDPD5      11_42     0.163  9.03 5.0e-06  1.95
8328           MAP6      11_42     0.128  5.23 2.3e-06  0.24
7603         MOGAT2      11_42     0.323 11.93 1.3e-05  0.85
537           DGAT2      11_42     0.959 46.58 1.5e-04 -7.51
10381         UVRAG      11_42     0.135  6.65 3.1e-06  1.62
1082          WNT11      11_42     0.157  7.13 3.8e-06  1.06
11773 RP11-619A14.3      11_42     0.128  5.23 2.3e-06  0.61
4849         THAP12      11_42     0.125  5.14 2.2e-06 -0.66
12265 RP11-111M22.5      11_42     0.135  5.75 2.6e-06  0.80
11766 RP11-111M22.3      11_42     0.116  4.39 1.7e-06  0.31
11751  RP11-672A2.4      11_42     0.124  4.95 2.1e-06  0.53
9350           TSKU      11_42     0.121  4.65 1.9e-06  0.25
905           ACER3      11_42     0.339 13.87 1.6e-05 -2.04
5976          CAPN5      11_42     0.254 11.44 9.9e-06  1.72

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
[1] "Region: 12_67"
          genename region_tag susie_pip   mu2     PVE     z
5112          TCHP      12_67     0.664 15.54 3.5e-05  3.21
5111          GIT2      12_67     0.380 13.95 1.8e-05  3.36
8639      C12orf76      12_67     0.105  5.91 2.1e-06 -0.30
3515         IFT81      12_67     0.107  9.01 3.3e-06  2.50
10093       ANAPC7      12_67     0.108  7.56 2.8e-06  2.16
2531         ARPC3      12_67     0.115  6.40 2.5e-06  0.54
10684      FAM216A      12_67     0.091  7.13 2.2e-06  2.42
2532          GPN3      12_67     0.083  4.49 1.3e-06  1.40
2533         VPS29      12_67     0.083  4.51 1.3e-06 -1.41
10683        TCTN1      12_67     0.104  5.51 1.9e-06  0.02
3517         HVCN1      12_67     0.158 11.55 6.2e-06  2.67
9717        PPP1CC      12_67     0.158 11.49 6.2e-06 -2.65
10375      FAM109A      12_67     0.085  6.47 1.9e-06 -1.46
2536         SH2B3      12_67     0.084 29.11 8.3e-06  6.80
10680        ATXN2      12_67     0.084 16.47 4.7e-06  3.97
2541         ALDH2      12_67     0.134 32.96 1.5e-05  7.10
11290 MAPKAPK5-AS1      12_67     0.155 34.68 1.8e-05 -7.21
1191         ERP29      12_67     0.225 27.91 2.1e-05  6.25
10370      TMEM116      12_67     0.225 27.91 2.1e-05 -6.25
2544         NAA25      12_67     0.193 25.94 1.7e-05 -6.12
8505        HECTD4      12_67     0.247 30.21 2.6e-05  6.33
9084        PTPN11      12_67     0.112  6.73 2.6e-06 -0.78

Version Author Date
dfd2b5f wesleycrouse 2021-09-07

SNPs with highest PIPs

#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
                id region_tag susie_pip       mu2     PVE      z
130485   rs2070959      2_137     1.000  54842.78 0.19000 238.14
130513  rs76063448      2_137     1.000   8087.82 0.02800  70.66
130514  rs11888459      2_137     1.000  58593.32 0.20000 187.72
130515   rs2885296      2_137     1.000  77570.97 0.26000 240.61
322220  rs72834643       6_20     1.000    129.63 0.00044   9.73
322241 rs115740542       6_20     1.000    197.25 0.00067  12.66
364959  rs12208357      6_103     1.000     91.40 0.00031  -6.65
376137 rs542176135       7_17     1.000    177.50 0.00061  -8.38
525245   rs6480402      10_46     1.000    609.89 0.00210   8.90
594103  rs11045819      12_16     1.000   5758.71 0.02000 -14.34
594120   rs4363657      12_16     1.000   1630.52 0.00560  43.78
785491  rs59616136      19_14     1.000     53.09 0.00018   7.00
900606  rs13031505      2_136     1.000     66.42 0.00023  -8.07
906830 rs200216446      3_104     1.000  11556.78 0.03900   3.98
934781   rs1611236       6_24     1.000 954165.10 3.30000  -3.60
988935 rs201623796       10_6     1.000 111465.14 0.38000  -3.61
988943 rs186871467       10_6     1.000  12681.91 0.04300   2.21
323083   rs3130253       6_23     0.999     47.83 0.00016  -6.63
376159   rs4721597       7_17     0.999    111.11 0.00038   1.94
501207 rs115478735       9_70     0.999     70.32 0.00024  -8.02
546127  rs76153913       11_2     0.996     48.26 0.00016   6.70
620933    rs653178      12_67     0.996    165.62 0.00056 -13.12
793635    rs814573      19_32     0.991     42.72 0.00014  -6.68
35207    rs2779116       1_78     0.987     72.82 0.00025  -8.25
593818   rs7962574      12_15     0.984     58.36 0.00020  -8.40
734216   rs2608604      16_53     0.983     46.90 0.00016  -6.33
604347   rs7397189      12_36     0.963     37.73 0.00012   5.69
786276   rs3794991      19_15     0.957    139.37 0.00046  11.64
793571   rs1551891      19_32     0.955     58.35 0.00019   7.88
130518  rs12052787      2_137     0.939   2604.89 0.00840 -11.25
593858  rs10770693      12_15     0.934     66.81 0.00021   8.86
487410   rs9410381       9_45     0.923     79.08 0.00025   8.62
805368  rs34507316      20_13     0.918     37.16 0.00012   5.60
906829  rs12492030      3_104     0.917  11474.13 0.03600   2.70
900543  rs34247311      2_136     0.913     41.29 0.00013   4.44
322790    rs156737       6_21     0.872     44.29 0.00013  -6.37
322059  rs75080831       6_19     0.867     68.04 0.00020   7.96
549149   rs4910498       11_8     0.831     49.64 0.00014   6.71
934804   rs1611248       6_24     0.812 954152.00 2.60000  -3.63
594004  rs12366506      12_16     0.805   4961.06 0.01400  23.44
988923   rs7914015       10_6     0.805 112204.93 0.31000  -5.52
546173  rs16929046       11_2     0.803     65.89 0.00018   7.10

SNPs with largest effect sizes

#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")

Version Author Date
dfd2b5f wesleycrouse 2021-09-07
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
                id region_tag susie_pip      mu2     PVE     z
934781   rs1611236       6_24     1.000 954165.1 3.3e+00 -3.60
934763 rs111734624       6_24     0.169 954153.4 5.5e-01 -3.62
934760   rs2508055       6_24     0.168 954153.2 5.5e-01 -3.62
934808   rs1611252       6_24     0.098 954153.2 3.2e-01 -3.62
934804   rs1611248       6_24     0.812 954152.0 2.6e+00 -3.63
934825   rs1611260       6_24     0.059 954150.2 1.9e-01 -3.62
934831   rs1611265       6_24     0.049 954149.2 1.6e-01 -3.62
934699   rs1633033       6_24     0.280 954133.3 9.1e-01 -3.63
934835   rs2394171       6_24     0.000 954132.4 3.7e-05 -3.61
934833   rs1611267       6_24     0.000 954131.4 8.4e-05 -3.62
934837   rs2893981       6_24     0.000 954130.2 2.7e-05 -3.61
934767   rs1611228       6_24     0.000 954127.5 1.6e-04 -3.62
934756   rs1737020       6_24     0.000 954127.3 1.7e-07 -3.60
934757   rs1737019       6_24     0.000 954127.3 1.7e-07 -3.60
934712   rs2844838       6_24     0.003 954124.3 1.1e-02 -3.63
934716   rs1633032       6_24     0.000 954063.1 2.3e-07 -3.62
934750   rs1633020       6_24     0.000 954011.3 6.3e-12 -3.61
934754   rs1633018       6_24     0.000 954003.8 5.4e-15 -3.60
934779   rs1611234       6_24     0.000 953946.2 0.0e+00 -3.60
934639   rs1610726       6_24     0.000 953894.5 0.0e+00 -3.61
934707   rs2844840       6_24     0.000 953797.6 0.0e+00 -3.62
935034   rs3129185       6_24     0.000 953764.3 0.0e+00 -3.63
935049   rs1736999       6_24     0.000 953724.2 0.0e+00 -3.63
935062   rs1633001       6_24     0.000 953644.1 0.0e+00 -3.64
934802   rs1611246       6_24     0.000 953631.8 0.0e+00 -3.62
935238   rs1632980       6_24     0.000 953585.9 0.0e+00 -3.63
934735   rs1614309       6_24     0.000 953345.1 0.0e+00 -3.63
934734   rs1633030       6_24     0.000 952567.1 0.0e+00 -3.61
934847   rs9258382       6_24     0.000 951556.3 0.0e+00 -3.69
934844   rs9258379       6_24     0.000 950013.0 0.0e+00 -3.62
934793   rs1611241       6_24     0.000 948857.8 0.0e+00 -3.84
934738   rs1633028       6_24     0.000 947556.2 0.0e+00 -3.72
934796   rs1611244       6_24     0.000 943964.8 0.0e+00 -3.64
934751   rs2735042       6_24     0.000 942481.9 0.0e+00 -3.61
934832   rs1611266       6_24     0.000 935325.1 0.0e+00 -3.97
934805   rs1611249       6_24     0.000 931224.2 0.0e+00 -4.10
934771   rs1611230       6_24     0.000 928943.0 0.0e+00 -4.08
934820 rs145043018       6_24     0.000 928751.9 0.0e+00 -4.09
934830 rs147376303       6_24     0.000 928746.9 0.0e+00 -4.09
934841   rs9258376       6_24     0.000 928734.8 0.0e+00 -4.09
934848   rs1633016       6_24     0.000 928724.5 0.0e+00 -4.09
934693   rs1633035       6_24     0.000 928699.5 0.0e+00 -4.09
934901   rs1633014       6_24     0.000 928634.7 0.0e+00 -4.09
934726   rs1618670       6_24     0.000 928631.3 0.0e+00 -4.09
934753   rs1633019       6_24     0.000 928572.8 0.0e+00 -4.06
935015   rs1633010       6_24     0.000 928331.9 0.0e+00 -4.09
935139    rs909722       6_24     0.000 928184.5 0.0e+00 -4.08
935171   rs1610713       6_24     0.000 928174.0 0.0e+00 -4.09
935096   rs1736991       6_24     0.000 928166.7 0.0e+00 -4.09
935151   rs1610648       6_24     0.000 928111.8 0.0e+00 -4.10

SNPs with highest PVE

#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
                id region_tag susie_pip       mu2     PVE      z
934781   rs1611236       6_24     1.000 954165.10 3.30000  -3.60
934804   rs1611248       6_24     0.812 954152.00 2.60000  -3.63
934699   rs1633033       6_24     0.280 954133.33 0.91000  -3.63
934760   rs2508055       6_24     0.168 954153.21 0.55000  -3.62
934763 rs111734624       6_24     0.169 954153.37 0.55000  -3.62
988935 rs201623796       10_6     1.000 111465.14 0.38000  -3.61
934808   rs1611252       6_24     0.098 954153.16 0.32000  -3.62
988923   rs7914015       10_6     0.805 112204.93 0.31000  -5.52
988914   rs4880705       10_6     0.795 112311.88 0.30000  -5.51
130515   rs2885296      2_137     1.000  77570.97 0.26000 240.61
130514  rs11888459      2_137     1.000  58593.32 0.20000 187.72
130485   rs2070959      2_137     1.000  54842.78 0.19000 238.14
934825   rs1611260       6_24     0.059 954150.17 0.19000  -3.62
934831   rs1611265       6_24     0.049 954149.25 0.16000  -3.62
988926   rs4644565       10_6     0.195 112156.63 0.07500  -5.51
988913   rs7907985       10_6     0.166 112301.84 0.06400  -5.51
988943 rs186871467       10_6     1.000  12681.91 0.04300   2.21
906830 rs200216446      3_104     1.000  11556.78 0.03900   3.98
906829  rs12492030      3_104     0.917  11474.13 0.03600   2.70
130513  rs76063448      2_137     1.000   8087.82 0.02800  70.66
934916 rs372065521       6_24     0.266  31064.73 0.02800   0.27
934917 rs555157405       6_24     0.259  30950.71 0.02700   0.27
934918 rs572275478       6_24     0.259  30950.71 0.02700   0.27
934915 rs376865941       6_24     0.207  31066.51 0.02200   0.25
906812  rs61793869      3_104     0.524  11502.80 0.02100   2.58
594103  rs11045819      12_16     1.000   5758.71 0.02000 -14.34
988965   rs1937847       10_6     0.039 112335.10 0.01500  -5.51
594004  rs12366506      12_16     0.805   4961.06 0.01400  23.44
594090  rs12371691      12_16     0.724   5459.79 0.01300   7.75
934712   rs2844838       6_24     0.003 954124.32 0.01100  -3.63
906818  rs61793871      3_104     0.231  11502.37 0.00910   2.57
130518  rs12052787      2_137     0.939   2604.89 0.00840 -11.25
906821  rs61793896      3_104     0.206  11502.40 0.00810   2.56
594120   rs4363657      12_16     1.000   1630.52 0.00560  43.78
594084  rs17328763      12_16     0.230   5436.65 0.00430   7.87
525245   rs6480402      10_46     1.000    609.89 0.00210   8.90
594010  rs11045612      12_16     0.114   4954.76 0.00190  23.44
906900  rs61791069      3_104     0.049  11491.33 0.00190   2.64
906912  rs61791073      3_104     0.033  11470.02 0.00130   2.67
525250  rs35233497      10_46     0.529    468.47 0.00085  -4.24
593914   rs3060461      12_16     0.359    685.47 0.00084 -21.86
593918  rs35290079      12_16     0.329    681.64 0.00077 -21.76
525253  rs79086908      10_46     0.471    468.43 0.00075  -4.24
906866  rs61791061      3_104     0.019  11505.87 0.00075   2.51
594021  rs73062442      12_16     0.042   4959.80 0.00070  23.39
322241 rs115740542       6_20     1.000    197.25 0.00067  12.66
594013  rs11513221      12_16     0.039   4960.27 0.00066  23.38
376137 rs542176135       7_17     1.000    177.50 0.00061  -8.38
620933    rs653178      12_67     0.996    165.62 0.00056 -13.12
594064  rs11045737      12_16     0.029   5385.26 0.00053   9.34

SNPs with largest z scores

#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
                id region_tag susie_pip      mu2     PVE       z
130515   rs2885296      2_137         1 77570.97 2.6e-01  240.61
130511  rs17862875      2_137         0 77087.61 0.0e+00  239.53
130485   rs2070959      2_137         1 54842.78 1.9e-01  238.14
130484   rs1105880      2_137         0 52752.67 0.0e+00  232.16
130479  rs77070100      2_137         0 52391.48 0.0e+00  231.38
130493   rs6749496      2_137         0 51255.87 0.0e+00  208.13
130508   rs3821242      2_137         0 43716.35 0.0e+00  203.17
130506   rs2008584      2_137         0 43425.08 0.0e+00  202.59
130486   rs7583278      2_137         0 47399.66 0.0e+00  200.99
130504  rs57258852      2_137         0 47478.17 0.0e+00  198.41
130502   rs4663332      2_137         0 44334.04 0.0e+00  194.41
130503 rs200973045      2_137         0 44112.98 0.0e+00  194.10
130470   rs2741034      2_137         0 32648.39 0.0e+00  190.53
130462   rs2602363      2_137         0 32597.37 0.0e+00  190.40
130457   rs2741013      2_137         0 32510.39 0.0e+00  190.21
130492   rs2012734      2_137         0 37095.00 0.0e+00  187.89
130514  rs11888459      2_137         1 58593.32 2.0e-01  187.72
130480   rs6753320      2_137         0 40890.94 0.0e+00  186.96
130481   rs6736743      2_137         0 40890.94 0.0e+00  186.96
130497  rs13401281      2_137         0 57559.11 6.5e-17  186.34
130494   rs1875263      2_137         0 54641.41 5.5e-15  181.67
130529   rs2361502      2_137         0 33132.93 0.0e+00  157.91
130466   rs6431558      2_137         0 19510.26 0.0e+00 -144.34
130474   rs1113193      2_137         0  7484.14 0.0e+00  -97.79
130468   rs1823803      2_137         0  8995.59 0.0e+00   91.76
130526  rs10202032      2_137         0  8376.88 0.0e+00  -88.03
130527   rs6723936      2_137         0  9712.61 0.0e+00   78.31
130516 rs143373661      2_137         0  8720.64 0.0e+00   78.24
130464  rs13027376      2_137         0  6682.23 0.0e+00  -74.88
130461   rs4047192      2_137         0  6673.02 0.0e+00  -74.81
130483  rs12476197      2_137         0  7944.69 0.0e+00  -71.92
130477   rs4663871      2_137         0  7899.05 0.0e+00  -71.72
130482 rs765251456      2_137         0  7826.70 0.0e+00  -71.57
130513  rs76063448      2_137         1  8087.82 2.8e-02   70.66
130509  rs45510999      2_137         0  8004.30 0.0e+00   70.21
130501 rs183532563      2_137         0  7936.26 0.0e+00   69.70
130517  rs11568318      2_137         0  1455.89 0.0e+00  -65.81
130507  rs45507691      2_137         0  1446.95 0.0e+00  -65.58
130475  rs17863773      2_137         0  6266.01 0.0e+00  -65.44
130469  rs10929252      2_137         0  5358.59 0.0e+00  -63.60
130467  rs17863766      2_137         0  5241.57 0.0e+00  -63.58
130456 rs140719475      2_137         0  5347.22 0.0e+00  -63.55
130459   rs6713902      2_137         0  4568.80 0.0e+00  -60.77
130510 rs139257330      2_137         0  5347.89 0.0e+00   60.10
130458   rs7563478      2_137         0  1129.04 0.0e+00  -59.73
130472   rs2602372      2_137         0  4810.94 0.0e+00   58.13
130519   rs2003569      2_137         0  3751.41 2.4e-08  -57.58
130442  rs62192764      2_137         0  3153.10 0.0e+00  -54.32
130434  rs62192761      2_137         0  3137.98 0.0e+00  -54.28
130444   rs4047189      2_137         0  4173.22 0.0e+00   53.85

Gene set enrichment for genes with PIP>0.8

#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ... 
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]

#number of genes for gene set enrichment
length(genes)
[1] 53
if (length(genes)>0){
  GO_enrichment <- enrichr(genes, dbs)

  for (db in dbs){
    print(db)
    df <- GO_enrichment[[db]]
    df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
    print(df)
  }
  
  #DisGeNET enrichment
  
  # devtools::install_bitbucket("ibi_group/disgenet2r")
  library(disgenet2r)
  
  disgenet_api_key <- get_disgenet_api_key(
                    email = "wesleycrouse@gmail.com", 
                    password = "uchicago1" )
  
  Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
  
  res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
                               database = "CURATED" )
  
  df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio",  "BgRatio")]
  print(df)
  
  #WebGestalt enrichment
  library(WebGestaltR)
  
  background <- ctwas_gene_res$genename
  
  #listGeneSet()
  databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
  
  enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
                              interestGene=genes, referenceGene=background,
                              enrichDatabase=databases, interestGeneType="genesymbol",
                              referenceGeneType="genesymbol", isOutput=F)
  print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
  Querying GO_Biological_Process_2021... Done.
  Querying GO_Cellular_Component_2021... Done.
  Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
                                                             Term Overlap
1        low-density lipoprotein particle remodeling (GO:0034374)    3/14
2                            cholesterol homeostasis (GO:0042632)    4/71
3                                 sterol homeostasis (GO:0055092)    4/72
4                           triglyceride homeostasis (GO:0070328)    3/31
5   very-low-density lipoprotein particle remodeling (GO:0034372)     2/9
6                              cholesterol transport (GO:0030301)    3/51
7                     triglyceride metabolic process (GO:0006641)    3/55
8  triglyceride-rich lipoprotein particle remodeling (GO:0034370)    2/13
9                      reverse cholesterol transport (GO:0043691)    2/17
10      high-density lipoprotein particle remodeling (GO:0034375)    2/18
11                activin receptor signaling pathway (GO:0032924)    2/19
12                              xenobiotic transport (GO:0042908)    2/20
13                                  sterol transport (GO:0015918)    2/21
14         regulation of interferon-gamma production (GO:0032649)    3/86
   Adjusted.P.value                 Genes
1       0.002831856         CETP;LIPC;MPO
2       0.005960472 ABCG8;CETP;LIPC;DGAT2
3       0.005960472 ABCG8;CETP;LIPC;DGAT2
4       0.008468662       CETP;LIPC;DGAT2
5       0.022157824             CETP;LIPC
6       0.025196105       ABCG8;CETP;LIPC
7       0.027004534       CETP;LIPC;DGAT2
8       0.029802322             CETP;LIPC
9       0.045876892             CETP;LIPC
10      0.046371623             CETP;LIPC
11      0.047035707          ACVR1C;INHBB
12      0.047825564         ABCC3;SLC22A3
13      0.048711028            ABCG8;CETP
14      0.049711019       IL27;GAS6;CD276
[1] "GO_Cellular_Component_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
RP11-219B17.3 gene(s) from the input list not found in DisGeNET CURATEDRP4-781K5.7 gene(s) from the input list not found in DisGeNET CURATEDCEP44 gene(s) from the input list not found in DisGeNET CURATEDRP11-452H21.4 gene(s) from the input list not found in DisGeNET CURATEDCYB5R1 gene(s) from the input list not found in DisGeNET CURATEDRHBG gene(s) from the input list not found in DisGeNET CURATEDAKNA gene(s) from the input list not found in DisGeNET CURATEDH3F3B gene(s) from the input list not found in DisGeNET CURATEDCSNK1G3 gene(s) from the input list not found in DisGeNET CURATEDSMIM19 gene(s) from the input list not found in DisGeNET CURATEDZNF76 gene(s) from the input list not found in DisGeNET CURATEDSCAF11 gene(s) from the input list not found in DisGeNET CURATEDMCU gene(s) from the input list not found in DisGeNET CURATEDFAM222B gene(s) from the input list not found in DisGeNET CURATEDLINC01118 gene(s) from the input list not found in DisGeNET CURATEDCNIH4 gene(s) from the input list not found in DisGeNET CURATEDZNF165 gene(s) from the input list not found in DisGeNET CURATED
                             Description        FDR Ratio  BgRatio
69        Hypercholesterolemia, Familial 0.01150470  3/36  18/9703
72                 Hyperlipoproteinemias 0.02074241  2/36   5/9703
22                        Brain Diseases 0.03959329  2/36  25/9703
68                  Hypercholesterolemia 0.03959329  3/36  39/9703
71                        Hyperlipidemia 0.03959329  2/36  17/9703
100                      Opisthorchiasis 0.03959329  1/36   1/9703
108                  Prostatic Neoplasms 0.03959329  8/36 616/9703
113             Salivary Gland Neoplasms 0.03959329  3/36  47/9703
161 Malignant neoplasm of salivary gland 0.03959329  3/36  45/9703
178                      Pyloric Atresia 0.03959329  1/36   1/9703
******************************************

*                                        *

*          Welcome to WebGestaltR !      *

*                                        *

******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
                             description size overlap        FDR
1               Gestational hypertension   64       5 0.02265364
2                        Hyperlipidemias   64       5 0.02265364
3                          Dyslipidaemia   84       5 0.03996497
4 Pulmonary Disease, Chronic Obstructive   87       5 0.03996497
5      Genetic Predisposition to Disease  449      10 0.03996497
6                 Disease Susceptibility  456      10 0.03996497
        database                                                    userId
1 disease_GLAD4U                              INHBB;PROZ;GAS6;LIPC;ADORA2B
2 disease_GLAD4U                                 ABCG8;DGAT2;LIPC;CETP;MPO
3 disease_GLAD4U                                 ABCG8;DGAT2;LIPC;CETP;MPO
4 disease_GLAD4U                             HHIP;GYPA;IL27;ADORA2B;CYP2A6
5 disease_GLAD4U ABCG8;HHIP;SLC22A3;JAZF1;LIPC;IL27;CETP;MPO;CYP2A6;DNMT3B
6 disease_GLAD4U ABCG8;HHIP;SLC22A3;JAZF1;LIPC;IL27;CETP;MPO;CYP2A6;DNMT3B

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0       cowplot_1.0.0    
[5] ggplot2_3.3.3    

loaded via a namespace (and not attached):
  [1] bitops_1.0-6                matrixStats_0.57.0         
  [3] fs_1.3.1                    bit64_4.0.5                
  [5] doParallel_1.0.16           progress_1.2.2             
  [7] httr_1.4.1                  rprojroot_2.0.2            
  [9] GenomeInfoDb_1.20.0         doRNG_1.8.2                
 [11] tools_3.6.1                 utf8_1.2.1                 
 [13] R6_2.5.0                    DBI_1.1.1                  
 [15] BiocGenerics_0.30.0         colorspace_1.4-1           
 [17] withr_2.4.1                 tidyselect_1.1.0           
 [19] prettyunits_1.0.2           bit_4.0.4                  
 [21] curl_3.3                    compiler_3.6.1             
 [23] git2r_0.26.1                Biobase_2.44.0             
 [25] DelayedArray_0.10.0         rtracklayer_1.44.0         
 [27] labeling_0.3                scales_1.1.0               
 [29] readr_1.4.0                 apcluster_1.4.8            
 [31] stringr_1.4.0               digest_0.6.20              
 [33] Rsamtools_2.0.0             svglite_1.2.2              
 [35] rmarkdown_1.13              XVector_0.24.0             
 [37] pkgconfig_2.0.3             htmltools_0.3.6            
 [39] fastmap_1.1.0               BSgenome_1.52.0            
 [41] rlang_0.4.11                RSQLite_2.2.7              
 [43] generics_0.0.2              farver_2.1.0               
 [45] jsonlite_1.6                BiocParallel_1.18.0        
 [47] dplyr_1.0.7                 VariantAnnotation_1.30.1   
 [49] RCurl_1.98-1.1              magrittr_2.0.1             
 [51] GenomeInfoDbData_1.2.1      Matrix_1.2-18              
 [53] Rcpp_1.0.6                  munsell_0.5.0              
 [55] S4Vectors_0.22.1            fansi_0.5.0                
 [57] gdtools_0.1.9               lifecycle_1.0.0            
 [59] stringi_1.4.3               whisker_0.3-2              
 [61] yaml_2.2.0                  SummarizedExperiment_1.14.1
 [63] zlibbioc_1.30.0             plyr_1.8.4                 
 [65] grid_3.6.1                  blob_1.2.1                 
 [67] parallel_3.6.1              promises_1.0.1             
 [69] crayon_1.4.1                lattice_0.20-38            
 [71] Biostrings_2.52.0           GenomicFeatures_1.36.3     
 [73] hms_1.1.0                   knitr_1.23                 
 [75] pillar_1.6.1                igraph_1.2.4.1             
 [77] GenomicRanges_1.36.0        rjson_0.2.20               
 [79] rngtools_1.5                codetools_0.2-16           
 [81] reshape2_1.4.3              biomaRt_2.40.1             
 [83] stats4_3.6.1                XML_3.98-1.20              
 [85] glue_1.4.2                  evaluate_0.14              
 [87] data.table_1.14.0           foreach_1.5.1              
 [89] vctrs_0.3.8                 httpuv_1.5.1               
 [91] gtable_0.3.0                purrr_0.3.4                
 [93] assertthat_0.2.1            cachem_1.0.5               
 [95] xfun_0.8                    later_0.8.0                
 [97] tibble_3.1.2                iterators_1.0.13           
 [99] GenomicAlignments_1.20.1    AnnotationDbi_1.46.0       
[101] memoise_2.0.0               IRanges_2.18.1             
[103] workflowr_1.6.2             ellipsis_0.3.2