Last updated: 2021-09-09
Checks: 6 1
Knit directory: ctwas_applied/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish
to commit the R Markdown file and build the HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20210726)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 59e5f4d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Unstaged changes:
Modified: analysis/ukb-d-30500_irnt_Liver.Rmd
Modified: analysis/ukb-d-30600_irnt_Liver.Rmd
Modified: analysis/ukb-d-30610_irnt_Liver.Rmd
Modified: analysis/ukb-d-30620_irnt_Liver.Rmd
Modified: analysis/ukb-d-30630_irnt_Liver.Rmd
Modified: analysis/ukb-d-30640_irnt_Liver.Rmd
Modified: analysis/ukb-d-30650_irnt_Liver.Rmd
Modified: analysis/ukb-d-30660_irnt_Liver.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30660_irnt_Liver.Rmd
) and HTML (docs/ukb-d-30660_irnt_Liver.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
html | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
Rmd | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
html | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
Rmd | 627a4e1 | wesleycrouse | 2021-09-07 | adding heritability |
Rmd | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
html | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
Rmd | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
html | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
Rmd | 837dd01 | wesleycrouse | 2021-09-01 | adding additional fixedsigma report |
Rmd | d0a5417 | wesleycrouse | 2021-08-30 | adding new reports to the index |
Rmd | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 1c62980 | wesleycrouse | 2021-08-11 | Updating reports |
Rmd | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
html | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
Rmd | 05a98b7 | wesleycrouse | 2021-08-07 | adding additional results |
html | 05a98b7 | wesleycrouse | 2021-08-07 | adding additional results |
html | 03e541c | wesleycrouse | 2021-07-29 | Cleaning up report generation |
Rmd | 276893d | wesleycrouse | 2021-07-29 | Updating reports |
html | 276893d | wesleycrouse | 2021-07-29 | Updating reports |
These are the results of a ctwas
analysis of the UK Biobank trait Direct bilirubin (quantile)
using Liver
gene weights.
The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30660_irnt
. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.
The weights are mashr GTEx v8 models on Liver
eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)
LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])
TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)
qclist_all <- list()
qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))
for (i in 1:length(qc_files)){
load(qc_files[i])
chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}
qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"
rm(qclist, wgtlist, z_gene_chr)
#number of imputed weights
nrow(qclist_all)
[1] 10901
#number of imputed weights by chromosome
table(qclist_all$chr)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1070 768 652 417 494 611 548 408 405 434 634 629 195 365 354
16 17 18 19 20 21 22
526 663 160 859 306 114 289
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.8366205
#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))
#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")
#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size #check PVE calculation
#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)
#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)
ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])
#add z score to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z
#load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd")) #for new version, stored after harmonization
z_snp <- readRDS(paste0(results_dir, "/", trait_id, ".RDS")) #for old version, unharmonized
z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,] #subset snps to those included in analysis, note some are duplicated, need to match which allele was used
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)] #for duplicated snps, this arbitrarily uses the first allele
ctwas_snp_res$z_flag <- as.numeric(ctwas_snp_res$id %in% z_snp$id[duplicated(z_snp$id)]) #mark the unclear z scores, flag=1
#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)
#merge gene and snp results with added information
ctwas_gene_res$z_flag=NA
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA
ctwas_res <- rbind(ctwas_gene_res,
ctwas_snp_res[,colnames(ctwas_gene_res)])
#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])
report_cols_snps <- c("id", report_cols[-1])
#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)
library(ggplot2)
library(cowplot)
********************************************************
Note: As of version 1.0.0, cowplot does not change the
default ggplot2 theme anymore. To recover the previous
behavior, execute:
theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
value = c(group_prior_rec[1,], group_prior_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)
df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument
p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(pi)) +
ggtitle("Prior mean") +
theme_cowplot()
df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(sigma^2)) +
ggtitle("Prior variance") +
theme_cowplot()
plot_grid(p_pi, p_sigma2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
gene snp
4.548826e-02 6.866634e-06
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
gene snp
5.325796 1575.790741
#report sample size
print(sample_size)
[1] 292933
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1] 10901 8697330
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
gene snp
0.009015335 0.321262531
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.05609462 10.78474853
#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
genename region_tag susie_pip mu2 PVE z
3212 CCND2 12_4 1.000 25.94 8.9e-05 5.34
5563 ABCG8 2_27 0.999 30.18 1.0e-04 5.88
12467 RP11-219B17.3 15_27 0.999 43.14 1.5e-04 7.18
10495 PRMT6 1_66 0.997 24.37 8.3e-05 5.14
7040 INHBB 2_70 0.997 21.98 7.5e-05 4.81
1848 CD276 15_35 0.997 32.17 1.1e-04 6.13
11790 CYP2A6 19_28 0.995 21.10 7.2e-05 -4.73
1320 CWF19L1 10_64 0.994 26.77 9.1e-05 -7.09
1231 PABPC4 1_24 0.992 20.58 7.0e-05 4.52
3562 ACVR1C 2_94 0.992 20.57 7.0e-05 4.62
6093 CSNK1G3 5_75 0.992 21.37 7.2e-05 4.74
5780 IRAK1BP1 6_54 0.990 22.21 7.5e-05 4.85
2092 SP4 7_19 0.982 19.81 6.6e-05 4.51
8811 SMIM19 8_37 0.982 27.65 9.3e-05 -5.58
6291 JAZF1 7_23 0.979 21.96 7.3e-05 4.83
2924 EFHD1 2_136 0.977 25.00 8.3e-05 6.05
10214 ZNF165 6_22 0.971 22.14 7.3e-05 5.99
2359 ABCC3 17_29 0.971 18.15 6.0e-05 4.38
12687 RP4-781K5.7 1_121 0.967 17.86 5.9e-05 -4.17
5978 ZC3H12C 11_65 0.966 18.04 5.9e-05 -4.19
11669 RP11-452H21.4 11_43 0.964 28.01 9.2e-05 5.78
9390 GAS6 13_62 0.964 19.49 6.4e-05 4.46
4269 ITGB4 17_42 0.963 19.97 6.6e-05 -4.91
8156 GYPA 4_94 0.962 20.07 6.6e-05 4.83
7547 LIPC 15_26 0.962 17.81 5.8e-05 3.99
537 DGAT2 11_42 0.959 46.58 1.5e-04 -7.51
5089 SCAF11 12_29 0.958 17.64 5.8e-05 4.13
583 ZNF76 6_28 0.946 18.05 5.8e-05 -4.20
59 MPO 17_34 0.939 19.17 6.1e-05 4.62
1120 CETP 16_31 0.932 17.60 5.6e-05 -4.03
11009 LINC01118 2_29 0.931 17.02 5.4e-05 3.96
1146 DNMT3B 20_19 0.924 16.75 5.3e-05 -3.98
10501 MAP3K3 17_37 0.922 15.89 5.0e-05 3.49
1153 TGDS 13_47 0.920 17.15 5.4e-05 -4.00
666 COASY 17_25 0.920 17.40 5.5e-05 -3.97
1609 UQCC1 20_21 0.920 16.50 5.2e-05 -3.60
7651 CASC4 15_17 0.919 16.06 5.0e-05 3.77
5799 SLC22A3 6_104 0.916 19.80 6.2e-05 4.51
669 ATP11A 13_61 0.916 17.00 5.3e-05 -3.91
5544 CNIH4 1_114 0.912 16.35 5.1e-05 -3.78
7266 HHIP 4_94 0.910 17.25 5.4e-05 -4.02
6439 MCU 10_48 0.909 15.39 4.8e-05 3.22
6682 CYB5R1 1_102 0.903 16.93 5.2e-05 -3.95
4178 PDLIM4 5_79 0.887 18.39 5.6e-05 -4.33
7256 CEP44 4_113 0.886 16.77 5.1e-05 3.81
1494 TXN2 22_14 0.862 16.02 4.7e-05 3.67
10212 IL27 16_23 0.861 20.59 6.1e-05 -4.76
4295 RHBG 1_77 0.857 16.61 4.9e-05 -3.78
3791 PROZ 13_62 0.847 16.41 4.7e-05 -3.64
8506 FAM222B 17_17 0.824 17.58 4.9e-05 4.12
4271 H3F3B 17_42 0.818 23.75 6.6e-05 5.28
8190 ADORA2B 17_14 0.817 17.86 5.0e-05 4.14
2204 AKNA 9_59 0.810 16.42 4.5e-05 3.72
#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
genename region_tag susie_pip mu2 PVE z
12683 HCP5B 6_24 0 323765.43 0 -3.52
10663 TRIM31 6_24 0 169467.05 0 1.07
4833 FLOT1 6_24 0 160968.35 0 -1.07
10651 ABCF1 6_24 0 73493.54 0 -3.76
5766 PPP1R18 6_24 0 63880.99 0 -3.94
10425 AKR1C4 10_6 0 59625.40 0 6.16
11533 UGT1A4 2_137 0 51199.81 0 232.75
11447 UGT1A1 2_137 0 43485.02 0 -230.41
4836 NRM 6_24 0 36023.92 0 -0.40
10667 HLA-G 6_24 0 34755.83 0 -6.69
11489 UGT1A3 2_137 0 34565.53 0 213.80
7732 UGT1A6 2_137 0 29022.18 0 186.96
624 ZNRD1 6_24 0 25299.13 0 0.19
10661 TRIM10 6_24 0 10671.43 0 -0.49
11033 RP11-445P17.8 10_6 0 7886.61 0 -1.15
11522 UGT1A7 2_137 0 5624.83 0 -71.90
10648 C6orf136 6_24 0 4697.83 0 0.11
11136 HCG20 6_24 0 4212.20 0 -1.98
10664 RNF39 6_24 0 3872.54 0 -1.02
10774 HLA-A 6_24 0 1958.68 0 -0.83
#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
genename region_tag susie_pip mu2 PVE z
537 DGAT2 11_42 0.959 46.58 1.5e-04 -7.51
12467 RP11-219B17.3 15_27 0.999 43.14 1.5e-04 7.18
1848 CD276 15_35 0.997 32.17 1.1e-04 6.13
5563 ABCG8 2_27 0.999 30.18 1.0e-04 5.88
8811 SMIM19 8_37 0.982 27.65 9.3e-05 -5.58
11669 RP11-452H21.4 11_43 0.964 28.01 9.2e-05 5.78
1320 CWF19L1 10_64 0.994 26.77 9.1e-05 -7.09
3212 CCND2 12_4 1.000 25.94 8.9e-05 5.34
10495 PRMT6 1_66 0.997 24.37 8.3e-05 5.14
2924 EFHD1 2_136 0.977 25.00 8.3e-05 6.05
7040 INHBB 2_70 0.997 21.98 7.5e-05 4.81
5780 IRAK1BP1 6_54 0.990 22.21 7.5e-05 4.85
2004 TGFB1 19_28 0.749 29.34 7.5e-05 5.64
10214 ZNF165 6_22 0.971 22.14 7.3e-05 5.99
6291 JAZF1 7_23 0.979 21.96 7.3e-05 4.83
11216 CYP21A2 6_26 0.754 28.09 7.2e-05 -6.07
6093 CSNK1G3 5_75 0.992 21.37 7.2e-05 4.74
11790 CYP2A6 19_28 0.995 21.10 7.2e-05 -4.73
1231 PABPC4 1_24 0.992 20.58 7.0e-05 4.52
3562 ACVR1C 2_94 0.992 20.57 7.0e-05 4.62
#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
11533 UGT1A4 2_137 0.000 51199.81 0.0e+00 232.75
11447 UGT1A1 2_137 0.000 43485.02 0.0e+00 -230.41
11489 UGT1A3 2_137 0.000 34565.53 0.0e+00 213.80
7732 UGT1A6 2_137 0.000 29022.18 0.0e+00 186.96
11522 UGT1A7 2_137 0.000 5624.83 0.0e+00 -71.90
1088 USP40 2_137 0.000 1758.34 0.0e+00 -46.64
10747 SLCO1B7 12_16 0.000 1025.26 0.0e+00 12.26
3556 HJURP 2_137 0.000 169.37 0.0e+00 10.96
8651 MSL2 3_84 0.094 78.80 2.5e-05 10.28
2584 SLCO1B3 12_16 0.000 506.35 0.0e+00 9.93
2586 GOLT1B 12_16 0.000 63.88 0.0e+00 7.53
537 DGAT2 11_42 0.959 46.58 1.5e-04 -7.51
11290 MAPKAPK5-AS1 12_67 0.155 34.68 1.8e-05 -7.21
12467 RP11-219B17.3 15_27 0.999 43.14 1.5e-04 7.18
2541 ALDH2 12_67 0.134 32.96 1.5e-05 7.10
1320 CWF19L1 10_64 0.994 26.77 9.1e-05 -7.09
2536 SH2B3 12_67 0.084 29.11 8.3e-06 6.80
10667 HLA-G 6_24 0.000 34755.83 0.0e+00 -6.69
2170 AHR 7_17 0.065 24.94 5.5e-06 -6.58
4962 EXOC6 10_59 0.110 35.21 1.3e-05 -6.37
#set nominal signifiance threshold for z scores
alpha <- 0.05
#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)
#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))
plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.006696633
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
11533 UGT1A4 2_137 0.000 51199.81 0.0e+00 232.75
11447 UGT1A1 2_137 0.000 43485.02 0.0e+00 -230.41
11489 UGT1A3 2_137 0.000 34565.53 0.0e+00 213.80
7732 UGT1A6 2_137 0.000 29022.18 0.0e+00 186.96
11522 UGT1A7 2_137 0.000 5624.83 0.0e+00 -71.90
1088 USP40 2_137 0.000 1758.34 0.0e+00 -46.64
10747 SLCO1B7 12_16 0.000 1025.26 0.0e+00 12.26
3556 HJURP 2_137 0.000 169.37 0.0e+00 10.96
8651 MSL2 3_84 0.094 78.80 2.5e-05 10.28
2584 SLCO1B3 12_16 0.000 506.35 0.0e+00 9.93
2586 GOLT1B 12_16 0.000 63.88 0.0e+00 7.53
537 DGAT2 11_42 0.959 46.58 1.5e-04 -7.51
11290 MAPKAPK5-AS1 12_67 0.155 34.68 1.8e-05 -7.21
12467 RP11-219B17.3 15_27 0.999 43.14 1.5e-04 7.18
2541 ALDH2 12_67 0.134 32.96 1.5e-05 7.10
1320 CWF19L1 10_64 0.994 26.77 9.1e-05 -7.09
2536 SH2B3 12_67 0.084 29.11 8.3e-06 6.80
10667 HLA-G 6_24 0.000 34755.83 0.0e+00 -6.69
2170 AHR 7_17 0.065 24.94 5.5e-06 -6.58
4962 EXOC6 10_59 0.110 35.21 1.3e-05 -6.37
ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]
n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
ctwas_res_region <- ctwas_res[ctwas_res$region_tag==region_tag_plot,]
start <- min(ctwas_res_region$pos)
end <- max(ctwas_res_region$pos)
ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
#region name
print(paste0("Region: ", region_tag_plot))
#table of genes in region
print(ctwas_res_region_gene[,report_cols])
par(mfrow=c(4,1))
#gene z scores
plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
main=paste0("Region: ", region_tag_plot))
abline(h=sig_thresh,col="red",lty=2)
#significance threshold for SNPs
alpha_snp <- 5*10^(-8)
sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
#snp z scores
plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
abline(h=sig_thresh_snp,col="purple",lty=2)
#gene pips
plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
#snp pips
plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 2_137"
genename region_tag susie_pip mu2 PVE z
10567 GIGYF2 2_137 0 30.97 0 -5.12
9340 C2orf82 2_137 0 5.79 0 0.45
620 NGEF 2_137 0 7.24 0 2.52
8006 INPP5D 2_137 0 15.40 0 4.05
879 DGKD 2_137 0 31.39 0 -0.07
1088 USP40 2_137 0 1758.34 0 -46.64
11522 UGT1A7 2_137 0 5624.83 0 -71.90
7732 UGT1A6 2_137 0 29022.18 0 186.96
11533 UGT1A4 2_137 0 51199.81 0 232.75
11489 UGT1A3 2_137 0 34565.53 0 213.80
11447 UGT1A1 2_137 0 43485.02 0 -230.41
3556 HJURP 2_137 0 169.37 0 10.96
11098 AC006037.2 2_137 0 5.10 0 1.50
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 12_16"
genename region_tag susie_pip mu2 PVE z
2584 SLCO1B3 12_16 0 506.35 0 9.93
10747 SLCO1B7 12_16 0 1025.26 0 12.26
3400 IAPP 12_16 0 138.98 0 5.16
3399 PYROXD1 12_16 0 19.02 0 0.89
36 RECQL 12_16 0 221.55 0 3.81
2586 GOLT1B 12_16 0 63.88 0 7.53
4482 SPX 12_16 0 230.14 0 4.60
2587 LDHB 12_16 0 8.56 0 0.07
3401 KCNJ8 12_16 0 6.44 0 -0.33
689 ABCC9 12_16 0 8.41 0 2.11
2590 C2CD5 12_16 0 9.94 0 -1.39
5073 ETNK1 12_16 0 11.12 0 0.30
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 3_84"
genename region_tag susie_pip mu2 PVE z
796 PPP2R3A 3_84 0.178 13.30 8.1e-06 -2.46
8651 MSL2 3_84 0.094 78.80 2.5e-05 10.28
2795 PCCB 3_84 0.083 5.45 1.6e-06 1.22
3148 STAG1 3_84 0.086 4.57 1.3e-06 -0.03
6584 NCK1 3_84 0.085 7.94 2.3e-06 -2.19
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 11_42"
genename region_tag susie_pip mu2 PVE z
7611 XRRA1 11_42 0.161 7.17 3.9e-06 -1.00
3170 SPCS2 11_42 0.129 5.40 2.4e-06 0.73
6901 NEU3 11_42 0.116 4.39 1.7e-06 0.39
4848 SLCO2B1 11_42 0.116 4.35 1.7e-06 -0.22
12001 TPBGL 11_42 0.145 6.17 3.0e-06 0.68
6617 GDPD5 11_42 0.163 9.03 5.0e-06 1.95
8328 MAP6 11_42 0.128 5.23 2.3e-06 0.24
7603 MOGAT2 11_42 0.323 11.93 1.3e-05 0.85
537 DGAT2 11_42 0.959 46.58 1.5e-04 -7.51
10381 UVRAG 11_42 0.135 6.65 3.1e-06 1.62
1082 WNT11 11_42 0.157 7.13 3.8e-06 1.06
11773 RP11-619A14.3 11_42 0.128 5.23 2.3e-06 0.61
4849 THAP12 11_42 0.125 5.14 2.2e-06 -0.66
12265 RP11-111M22.5 11_42 0.135 5.75 2.6e-06 0.80
11766 RP11-111M22.3 11_42 0.116 4.39 1.7e-06 0.31
11751 RP11-672A2.4 11_42 0.124 4.95 2.1e-06 0.53
9350 TSKU 11_42 0.121 4.65 1.9e-06 0.25
905 ACER3 11_42 0.339 13.87 1.6e-05 -2.04
5976 CAPN5 11_42 0.254 11.44 9.9e-06 1.72
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 12_67"
genename region_tag susie_pip mu2 PVE z
5112 TCHP 12_67 0.664 15.54 3.5e-05 3.21
5111 GIT2 12_67 0.380 13.95 1.8e-05 3.36
8639 C12orf76 12_67 0.105 5.91 2.1e-06 -0.30
3515 IFT81 12_67 0.107 9.01 3.3e-06 2.50
10093 ANAPC7 12_67 0.108 7.56 2.8e-06 2.16
2531 ARPC3 12_67 0.115 6.40 2.5e-06 0.54
10684 FAM216A 12_67 0.091 7.13 2.2e-06 2.42
2532 GPN3 12_67 0.083 4.49 1.3e-06 1.40
2533 VPS29 12_67 0.083 4.51 1.3e-06 -1.41
10683 TCTN1 12_67 0.104 5.51 1.9e-06 0.02
3517 HVCN1 12_67 0.158 11.55 6.2e-06 2.67
9717 PPP1CC 12_67 0.158 11.49 6.2e-06 -2.65
10375 FAM109A 12_67 0.085 6.47 1.9e-06 -1.46
2536 SH2B3 12_67 0.084 29.11 8.3e-06 6.80
10680 ATXN2 12_67 0.084 16.47 4.7e-06 3.97
2541 ALDH2 12_67 0.134 32.96 1.5e-05 7.10
11290 MAPKAPK5-AS1 12_67 0.155 34.68 1.8e-05 -7.21
1191 ERP29 12_67 0.225 27.91 2.1e-05 6.25
10370 TMEM116 12_67 0.225 27.91 2.1e-05 -6.25
2544 NAA25 12_67 0.193 25.94 1.7e-05 -6.12
8505 HECTD4 12_67 0.247 30.21 2.6e-05 6.33
9084 PTPN11 12_67 0.112 6.73 2.6e-06 -0.78
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
id region_tag susie_pip mu2 PVE z
130485 rs2070959 2_137 1.000 54842.78 0.19000 238.14
130513 rs76063448 2_137 1.000 8087.82 0.02800 70.66
130514 rs11888459 2_137 1.000 58593.32 0.20000 187.72
130515 rs2885296 2_137 1.000 77570.97 0.26000 240.61
322220 rs72834643 6_20 1.000 129.63 0.00044 9.73
322241 rs115740542 6_20 1.000 197.25 0.00067 12.66
364959 rs12208357 6_103 1.000 91.40 0.00031 -6.65
376137 rs542176135 7_17 1.000 177.50 0.00061 -8.38
525245 rs6480402 10_46 1.000 609.89 0.00210 8.90
594103 rs11045819 12_16 1.000 5758.71 0.02000 -14.34
594120 rs4363657 12_16 1.000 1630.52 0.00560 43.78
785491 rs59616136 19_14 1.000 53.09 0.00018 7.00
900606 rs13031505 2_136 1.000 66.42 0.00023 -8.07
906830 rs200216446 3_104 1.000 11556.78 0.03900 3.98
934781 rs1611236 6_24 1.000 954165.10 3.30000 -3.60
988935 rs201623796 10_6 1.000 111465.14 0.38000 -3.61
988943 rs186871467 10_6 1.000 12681.91 0.04300 2.21
323083 rs3130253 6_23 0.999 47.83 0.00016 -6.63
376159 rs4721597 7_17 0.999 111.11 0.00038 1.94
501207 rs115478735 9_70 0.999 70.32 0.00024 -8.02
546127 rs76153913 11_2 0.996 48.26 0.00016 6.70
620933 rs653178 12_67 0.996 165.62 0.00056 -13.12
793635 rs814573 19_32 0.991 42.72 0.00014 -6.68
35207 rs2779116 1_78 0.987 72.82 0.00025 -8.25
593818 rs7962574 12_15 0.984 58.36 0.00020 -8.40
734216 rs2608604 16_53 0.983 46.90 0.00016 -6.33
604347 rs7397189 12_36 0.963 37.73 0.00012 5.69
786276 rs3794991 19_15 0.957 139.37 0.00046 11.64
793571 rs1551891 19_32 0.955 58.35 0.00019 7.88
130518 rs12052787 2_137 0.939 2604.89 0.00840 -11.25
593858 rs10770693 12_15 0.934 66.81 0.00021 8.86
487410 rs9410381 9_45 0.923 79.08 0.00025 8.62
805368 rs34507316 20_13 0.918 37.16 0.00012 5.60
906829 rs12492030 3_104 0.917 11474.13 0.03600 2.70
900543 rs34247311 2_136 0.913 41.29 0.00013 4.44
322790 rs156737 6_21 0.872 44.29 0.00013 -6.37
322059 rs75080831 6_19 0.867 68.04 0.00020 7.96
549149 rs4910498 11_8 0.831 49.64 0.00014 6.71
934804 rs1611248 6_24 0.812 954152.00 2.60000 -3.63
594004 rs12366506 12_16 0.805 4961.06 0.01400 23.44
988923 rs7914015 10_6 0.805 112204.93 0.31000 -5.52
546173 rs16929046 11_2 0.803 65.89 0.00018 7.10
#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
934781 rs1611236 6_24 1.000 954165.1 3.3e+00 -3.60
934763 rs111734624 6_24 0.169 954153.4 5.5e-01 -3.62
934760 rs2508055 6_24 0.168 954153.2 5.5e-01 -3.62
934808 rs1611252 6_24 0.098 954153.2 3.2e-01 -3.62
934804 rs1611248 6_24 0.812 954152.0 2.6e+00 -3.63
934825 rs1611260 6_24 0.059 954150.2 1.9e-01 -3.62
934831 rs1611265 6_24 0.049 954149.2 1.6e-01 -3.62
934699 rs1633033 6_24 0.280 954133.3 9.1e-01 -3.63
934835 rs2394171 6_24 0.000 954132.4 3.7e-05 -3.61
934833 rs1611267 6_24 0.000 954131.4 8.4e-05 -3.62
934837 rs2893981 6_24 0.000 954130.2 2.7e-05 -3.61
934767 rs1611228 6_24 0.000 954127.5 1.6e-04 -3.62
934756 rs1737020 6_24 0.000 954127.3 1.7e-07 -3.60
934757 rs1737019 6_24 0.000 954127.3 1.7e-07 -3.60
934712 rs2844838 6_24 0.003 954124.3 1.1e-02 -3.63
934716 rs1633032 6_24 0.000 954063.1 2.3e-07 -3.62
934750 rs1633020 6_24 0.000 954011.3 6.3e-12 -3.61
934754 rs1633018 6_24 0.000 954003.8 5.4e-15 -3.60
934779 rs1611234 6_24 0.000 953946.2 0.0e+00 -3.60
934639 rs1610726 6_24 0.000 953894.5 0.0e+00 -3.61
934707 rs2844840 6_24 0.000 953797.6 0.0e+00 -3.62
935034 rs3129185 6_24 0.000 953764.3 0.0e+00 -3.63
935049 rs1736999 6_24 0.000 953724.2 0.0e+00 -3.63
935062 rs1633001 6_24 0.000 953644.1 0.0e+00 -3.64
934802 rs1611246 6_24 0.000 953631.8 0.0e+00 -3.62
935238 rs1632980 6_24 0.000 953585.9 0.0e+00 -3.63
934735 rs1614309 6_24 0.000 953345.1 0.0e+00 -3.63
934734 rs1633030 6_24 0.000 952567.1 0.0e+00 -3.61
934847 rs9258382 6_24 0.000 951556.3 0.0e+00 -3.69
934844 rs9258379 6_24 0.000 950013.0 0.0e+00 -3.62
934793 rs1611241 6_24 0.000 948857.8 0.0e+00 -3.84
934738 rs1633028 6_24 0.000 947556.2 0.0e+00 -3.72
934796 rs1611244 6_24 0.000 943964.8 0.0e+00 -3.64
934751 rs2735042 6_24 0.000 942481.9 0.0e+00 -3.61
934832 rs1611266 6_24 0.000 935325.1 0.0e+00 -3.97
934805 rs1611249 6_24 0.000 931224.2 0.0e+00 -4.10
934771 rs1611230 6_24 0.000 928943.0 0.0e+00 -4.08
934820 rs145043018 6_24 0.000 928751.9 0.0e+00 -4.09
934830 rs147376303 6_24 0.000 928746.9 0.0e+00 -4.09
934841 rs9258376 6_24 0.000 928734.8 0.0e+00 -4.09
934848 rs1633016 6_24 0.000 928724.5 0.0e+00 -4.09
934693 rs1633035 6_24 0.000 928699.5 0.0e+00 -4.09
934901 rs1633014 6_24 0.000 928634.7 0.0e+00 -4.09
934726 rs1618670 6_24 0.000 928631.3 0.0e+00 -4.09
934753 rs1633019 6_24 0.000 928572.8 0.0e+00 -4.06
935015 rs1633010 6_24 0.000 928331.9 0.0e+00 -4.09
935139 rs909722 6_24 0.000 928184.5 0.0e+00 -4.08
935171 rs1610713 6_24 0.000 928174.0 0.0e+00 -4.09
935096 rs1736991 6_24 0.000 928166.7 0.0e+00 -4.09
935151 rs1610648 6_24 0.000 928111.8 0.0e+00 -4.10
#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
934781 rs1611236 6_24 1.000 954165.10 3.30000 -3.60
934804 rs1611248 6_24 0.812 954152.00 2.60000 -3.63
934699 rs1633033 6_24 0.280 954133.33 0.91000 -3.63
934760 rs2508055 6_24 0.168 954153.21 0.55000 -3.62
934763 rs111734624 6_24 0.169 954153.37 0.55000 -3.62
988935 rs201623796 10_6 1.000 111465.14 0.38000 -3.61
934808 rs1611252 6_24 0.098 954153.16 0.32000 -3.62
988923 rs7914015 10_6 0.805 112204.93 0.31000 -5.52
988914 rs4880705 10_6 0.795 112311.88 0.30000 -5.51
130515 rs2885296 2_137 1.000 77570.97 0.26000 240.61
130514 rs11888459 2_137 1.000 58593.32 0.20000 187.72
130485 rs2070959 2_137 1.000 54842.78 0.19000 238.14
934825 rs1611260 6_24 0.059 954150.17 0.19000 -3.62
934831 rs1611265 6_24 0.049 954149.25 0.16000 -3.62
988926 rs4644565 10_6 0.195 112156.63 0.07500 -5.51
988913 rs7907985 10_6 0.166 112301.84 0.06400 -5.51
988943 rs186871467 10_6 1.000 12681.91 0.04300 2.21
906830 rs200216446 3_104 1.000 11556.78 0.03900 3.98
906829 rs12492030 3_104 0.917 11474.13 0.03600 2.70
130513 rs76063448 2_137 1.000 8087.82 0.02800 70.66
934916 rs372065521 6_24 0.266 31064.73 0.02800 0.27
934917 rs555157405 6_24 0.259 30950.71 0.02700 0.27
934918 rs572275478 6_24 0.259 30950.71 0.02700 0.27
934915 rs376865941 6_24 0.207 31066.51 0.02200 0.25
906812 rs61793869 3_104 0.524 11502.80 0.02100 2.58
594103 rs11045819 12_16 1.000 5758.71 0.02000 -14.34
988965 rs1937847 10_6 0.039 112335.10 0.01500 -5.51
594004 rs12366506 12_16 0.805 4961.06 0.01400 23.44
594090 rs12371691 12_16 0.724 5459.79 0.01300 7.75
934712 rs2844838 6_24 0.003 954124.32 0.01100 -3.63
906818 rs61793871 3_104 0.231 11502.37 0.00910 2.57
130518 rs12052787 2_137 0.939 2604.89 0.00840 -11.25
906821 rs61793896 3_104 0.206 11502.40 0.00810 2.56
594120 rs4363657 12_16 1.000 1630.52 0.00560 43.78
594084 rs17328763 12_16 0.230 5436.65 0.00430 7.87
525245 rs6480402 10_46 1.000 609.89 0.00210 8.90
594010 rs11045612 12_16 0.114 4954.76 0.00190 23.44
906900 rs61791069 3_104 0.049 11491.33 0.00190 2.64
906912 rs61791073 3_104 0.033 11470.02 0.00130 2.67
525250 rs35233497 10_46 0.529 468.47 0.00085 -4.24
593914 rs3060461 12_16 0.359 685.47 0.00084 -21.86
593918 rs35290079 12_16 0.329 681.64 0.00077 -21.76
525253 rs79086908 10_46 0.471 468.43 0.00075 -4.24
906866 rs61791061 3_104 0.019 11505.87 0.00075 2.51
594021 rs73062442 12_16 0.042 4959.80 0.00070 23.39
322241 rs115740542 6_20 1.000 197.25 0.00067 12.66
594013 rs11513221 12_16 0.039 4960.27 0.00066 23.38
376137 rs542176135 7_17 1.000 177.50 0.00061 -8.38
620933 rs653178 12_67 0.996 165.62 0.00056 -13.12
594064 rs11045737 12_16 0.029 5385.26 0.00053 9.34
#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
130515 rs2885296 2_137 1 77570.97 2.6e-01 240.61
130511 rs17862875 2_137 0 77087.61 0.0e+00 239.53
130485 rs2070959 2_137 1 54842.78 1.9e-01 238.14
130484 rs1105880 2_137 0 52752.67 0.0e+00 232.16
130479 rs77070100 2_137 0 52391.48 0.0e+00 231.38
130493 rs6749496 2_137 0 51255.87 0.0e+00 208.13
130508 rs3821242 2_137 0 43716.35 0.0e+00 203.17
130506 rs2008584 2_137 0 43425.08 0.0e+00 202.59
130486 rs7583278 2_137 0 47399.66 0.0e+00 200.99
130504 rs57258852 2_137 0 47478.17 0.0e+00 198.41
130502 rs4663332 2_137 0 44334.04 0.0e+00 194.41
130503 rs200973045 2_137 0 44112.98 0.0e+00 194.10
130470 rs2741034 2_137 0 32648.39 0.0e+00 190.53
130462 rs2602363 2_137 0 32597.37 0.0e+00 190.40
130457 rs2741013 2_137 0 32510.39 0.0e+00 190.21
130492 rs2012734 2_137 0 37095.00 0.0e+00 187.89
130514 rs11888459 2_137 1 58593.32 2.0e-01 187.72
130480 rs6753320 2_137 0 40890.94 0.0e+00 186.96
130481 rs6736743 2_137 0 40890.94 0.0e+00 186.96
130497 rs13401281 2_137 0 57559.11 6.5e-17 186.34
130494 rs1875263 2_137 0 54641.41 5.5e-15 181.67
130529 rs2361502 2_137 0 33132.93 0.0e+00 157.91
130466 rs6431558 2_137 0 19510.26 0.0e+00 -144.34
130474 rs1113193 2_137 0 7484.14 0.0e+00 -97.79
130468 rs1823803 2_137 0 8995.59 0.0e+00 91.76
130526 rs10202032 2_137 0 8376.88 0.0e+00 -88.03
130527 rs6723936 2_137 0 9712.61 0.0e+00 78.31
130516 rs143373661 2_137 0 8720.64 0.0e+00 78.24
130464 rs13027376 2_137 0 6682.23 0.0e+00 -74.88
130461 rs4047192 2_137 0 6673.02 0.0e+00 -74.81
130483 rs12476197 2_137 0 7944.69 0.0e+00 -71.92
130477 rs4663871 2_137 0 7899.05 0.0e+00 -71.72
130482 rs765251456 2_137 0 7826.70 0.0e+00 -71.57
130513 rs76063448 2_137 1 8087.82 2.8e-02 70.66
130509 rs45510999 2_137 0 8004.30 0.0e+00 70.21
130501 rs183532563 2_137 0 7936.26 0.0e+00 69.70
130517 rs11568318 2_137 0 1455.89 0.0e+00 -65.81
130507 rs45507691 2_137 0 1446.95 0.0e+00 -65.58
130475 rs17863773 2_137 0 6266.01 0.0e+00 -65.44
130469 rs10929252 2_137 0 5358.59 0.0e+00 -63.60
130467 rs17863766 2_137 0 5241.57 0.0e+00 -63.58
130456 rs140719475 2_137 0 5347.22 0.0e+00 -63.55
130459 rs6713902 2_137 0 4568.80 0.0e+00 -60.77
130510 rs139257330 2_137 0 5347.89 0.0e+00 60.10
130458 rs7563478 2_137 0 1129.04 0.0e+00 -59.73
130472 rs2602372 2_137 0 4810.94 0.0e+00 58.13
130519 rs2003569 2_137 0 3751.41 2.4e-08 -57.58
130442 rs62192764 2_137 0 3153.10 0.0e+00 -54.32
130434 rs62192761 2_137 0 3137.98 0.0e+00 -54.28
130444 rs4047189 2_137 0 4173.22 0.0e+00 53.85
#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ...
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
#number of genes for gene set enrichment
length(genes)
[1] 53
if (length(genes)>0){
GO_enrichment <- enrichr(genes, dbs)
for (db in dbs){
print(db)
df <- GO_enrichment[[db]]
df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
print(df)
}
#DisGeNET enrichment
# devtools::install_bitbucket("ibi_group/disgenet2r")
library(disgenet2r)
disgenet_api_key <- get_disgenet_api_key(
email = "wesleycrouse@gmail.com",
password = "uchicago1" )
Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
database = "CURATED" )
df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]
print(df)
#WebGestalt enrichment
library(WebGestaltR)
background <- ctwas_gene_res$genename
#listGeneSet()
databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
interestGene=genes, referenceGene=background,
enrichDatabase=databases, interestGeneType="genesymbol",
referenceGeneType="genesymbol", isOutput=F)
print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
Querying GO_Biological_Process_2021... Done.
Querying GO_Cellular_Component_2021... Done.
Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
Term Overlap
1 low-density lipoprotein particle remodeling (GO:0034374) 3/14
2 cholesterol homeostasis (GO:0042632) 4/71
3 sterol homeostasis (GO:0055092) 4/72
4 triglyceride homeostasis (GO:0070328) 3/31
5 very-low-density lipoprotein particle remodeling (GO:0034372) 2/9
6 cholesterol transport (GO:0030301) 3/51
7 triglyceride metabolic process (GO:0006641) 3/55
8 triglyceride-rich lipoprotein particle remodeling (GO:0034370) 2/13
9 reverse cholesterol transport (GO:0043691) 2/17
10 high-density lipoprotein particle remodeling (GO:0034375) 2/18
11 activin receptor signaling pathway (GO:0032924) 2/19
12 xenobiotic transport (GO:0042908) 2/20
13 sterol transport (GO:0015918) 2/21
14 regulation of interferon-gamma production (GO:0032649) 3/86
Adjusted.P.value Genes
1 0.002831856 CETP;LIPC;MPO
2 0.005960472 ABCG8;CETP;LIPC;DGAT2
3 0.005960472 ABCG8;CETP;LIPC;DGAT2
4 0.008468662 CETP;LIPC;DGAT2
5 0.022157824 CETP;LIPC
6 0.025196105 ABCG8;CETP;LIPC
7 0.027004534 CETP;LIPC;DGAT2
8 0.029802322 CETP;LIPC
9 0.045876892 CETP;LIPC
10 0.046371623 CETP;LIPC
11 0.047035707 ACVR1C;INHBB
12 0.047825564 ABCC3;SLC22A3
13 0.048711028 ABCG8;CETP
14 0.049711019 IL27;GAS6;CD276
[1] "GO_Cellular_Component_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
RP11-219B17.3 gene(s) from the input list not found in DisGeNET CURATEDRP4-781K5.7 gene(s) from the input list not found in DisGeNET CURATEDCEP44 gene(s) from the input list not found in DisGeNET CURATEDRP11-452H21.4 gene(s) from the input list not found in DisGeNET CURATEDCYB5R1 gene(s) from the input list not found in DisGeNET CURATEDRHBG gene(s) from the input list not found in DisGeNET CURATEDAKNA gene(s) from the input list not found in DisGeNET CURATEDH3F3B gene(s) from the input list not found in DisGeNET CURATEDCSNK1G3 gene(s) from the input list not found in DisGeNET CURATEDSMIM19 gene(s) from the input list not found in DisGeNET CURATEDZNF76 gene(s) from the input list not found in DisGeNET CURATEDSCAF11 gene(s) from the input list not found in DisGeNET CURATEDMCU gene(s) from the input list not found in DisGeNET CURATEDFAM222B gene(s) from the input list not found in DisGeNET CURATEDLINC01118 gene(s) from the input list not found in DisGeNET CURATEDCNIH4 gene(s) from the input list not found in DisGeNET CURATEDZNF165 gene(s) from the input list not found in DisGeNET CURATED
Description FDR Ratio BgRatio
69 Hypercholesterolemia, Familial 0.01150470 3/36 18/9703
72 Hyperlipoproteinemias 0.02074241 2/36 5/9703
22 Brain Diseases 0.03959329 2/36 25/9703
68 Hypercholesterolemia 0.03959329 3/36 39/9703
71 Hyperlipidemia 0.03959329 2/36 17/9703
100 Opisthorchiasis 0.03959329 1/36 1/9703
108 Prostatic Neoplasms 0.03959329 8/36 616/9703
113 Salivary Gland Neoplasms 0.03959329 3/36 47/9703
161 Malignant neoplasm of salivary gland 0.03959329 3/36 45/9703
178 Pyloric Atresia 0.03959329 1/36 1/9703
******************************************
* *
* Welcome to WebGestaltR ! *
* *
******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
description size overlap FDR
1 Gestational hypertension 64 5 0.02265364
2 Hyperlipidemias 64 5 0.02265364
3 Dyslipidaemia 84 5 0.03996497
4 Pulmonary Disease, Chronic Obstructive 87 5 0.03996497
5 Genetic Predisposition to Disease 449 10 0.03996497
6 Disease Susceptibility 456 10 0.03996497
database userId
1 disease_GLAD4U INHBB;PROZ;GAS6;LIPC;ADORA2B
2 disease_GLAD4U ABCG8;DGAT2;LIPC;CETP;MPO
3 disease_GLAD4U ABCG8;DGAT2;LIPC;CETP;MPO
4 disease_GLAD4U HHIP;GYPA;IL27;ADORA2B;CYP2A6
5 disease_GLAD4U ABCG8;HHIP;SLC22A3;JAZF1;LIPC;IL27;CETP;MPO;CYP2A6;DNMT3B
6 disease_GLAD4U ABCG8;HHIP;SLC22A3;JAZF1;LIPC;IL27;CETP;MPO;CYP2A6;DNMT3B
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0 cowplot_1.0.0
[5] ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] bitops_1.0-6 matrixStats_0.57.0
[3] fs_1.3.1 bit64_4.0.5
[5] doParallel_1.0.16 progress_1.2.2
[7] httr_1.4.1 rprojroot_2.0.2
[9] GenomeInfoDb_1.20.0 doRNG_1.8.2
[11] tools_3.6.1 utf8_1.2.1
[13] R6_2.5.0 DBI_1.1.1
[15] BiocGenerics_0.30.0 colorspace_1.4-1
[17] withr_2.4.1 tidyselect_1.1.0
[19] prettyunits_1.0.2 bit_4.0.4
[21] curl_3.3 compiler_3.6.1
[23] git2r_0.26.1 Biobase_2.44.0
[25] DelayedArray_0.10.0 rtracklayer_1.44.0
[27] labeling_0.3 scales_1.1.0
[29] readr_1.4.0 apcluster_1.4.8
[31] stringr_1.4.0 digest_0.6.20
[33] Rsamtools_2.0.0 svglite_1.2.2
[35] rmarkdown_1.13 XVector_0.24.0
[37] pkgconfig_2.0.3 htmltools_0.3.6
[39] fastmap_1.1.0 BSgenome_1.52.0
[41] rlang_0.4.11 RSQLite_2.2.7
[43] generics_0.0.2 farver_2.1.0
[45] jsonlite_1.6 BiocParallel_1.18.0
[47] dplyr_1.0.7 VariantAnnotation_1.30.1
[49] RCurl_1.98-1.1 magrittr_2.0.1
[51] GenomeInfoDbData_1.2.1 Matrix_1.2-18
[53] Rcpp_1.0.6 munsell_0.5.0
[55] S4Vectors_0.22.1 fansi_0.5.0
[57] gdtools_0.1.9 lifecycle_1.0.0
[59] stringi_1.4.3 whisker_0.3-2
[61] yaml_2.2.0 SummarizedExperiment_1.14.1
[63] zlibbioc_1.30.0 plyr_1.8.4
[65] grid_3.6.1 blob_1.2.1
[67] parallel_3.6.1 promises_1.0.1
[69] crayon_1.4.1 lattice_0.20-38
[71] Biostrings_2.52.0 GenomicFeatures_1.36.3
[73] hms_1.1.0 knitr_1.23
[75] pillar_1.6.1 igraph_1.2.4.1
[77] GenomicRanges_1.36.0 rjson_0.2.20
[79] rngtools_1.5 codetools_0.2-16
[81] reshape2_1.4.3 biomaRt_2.40.1
[83] stats4_3.6.1 XML_3.98-1.20
[85] glue_1.4.2 evaluate_0.14
[87] data.table_1.14.0 foreach_1.5.1
[89] vctrs_0.3.8 httpuv_1.5.1
[91] gtable_0.3.0 purrr_0.3.4
[93] assertthat_0.2.1 cachem_1.0.5
[95] xfun_0.8 later_0.8.0
[97] tibble_3.1.2 iterators_1.0.13
[99] GenomicAlignments_1.20.1 AnnotationDbi_1.46.0
[101] memoise_2.0.0 IRanges_2.18.1
[103] workflowr_1.6.2 ellipsis_0.3.2