Last updated: 2021-09-09
Checks: 6 1
Knit directory: ctwas_applied/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish
to commit the R Markdown file and build the HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20210726)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 59e5f4d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Unstaged changes:
Modified: analysis/ukb-d-30500_irnt_Liver.Rmd
Modified: analysis/ukb-d-30500_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30600_irnt_Liver.Rmd
Modified: analysis/ukb-d-30600_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30610_irnt_Liver.Rmd
Modified: analysis/ukb-d-30610_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30620_irnt_Liver.Rmd
Modified: analysis/ukb-d-30620_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30630_irnt_Liver.Rmd
Modified: analysis/ukb-d-30630_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30640_irnt_Liver.Rmd
Modified: analysis/ukb-d-30640_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30650_irnt_Liver.Rmd
Modified: analysis/ukb-d-30650_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30660_irnt_Liver.Rmd
Modified: analysis/ukb-d-30660_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30670_irnt_Liver.Rmd
Modified: analysis/ukb-d-30670_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30680_irnt_Liver.Rmd
Modified: analysis/ukb-d-30690_irnt_Liver.Rmd
Modified: analysis/ukb-d-30690_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30700_irnt_Liver.Rmd
Modified: analysis/ukb-d-30700_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30710_irnt_Liver.Rmd
Modified: analysis/ukb-d-30710_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30720_irnt_Liver.Rmd
Modified: analysis/ukb-d-30720_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30730_irnt_Liver.Rmd
Modified: analysis/ukb-d-30740_irnt_Liver.Rmd
Modified: analysis/ukb-d-30740_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30750_irnt_Liver.Rmd
Modified: analysis/ukb-d-30750_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30760_irnt_Liver.Rmd
Modified: analysis/ukb-d-30760_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30770_irnt_Liver.Rmd
Modified: analysis/ukb-d-30770_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30780_irnt_Liver.Rmd
Modified: analysis/ukb-d-30780_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30790_irnt_Liver.Rmd
Modified: analysis/ukb-d-30800_irnt_Liver.Rmd
Modified: analysis/ukb-d-30800_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30810_irnt_Liver.Rmd
Modified: analysis/ukb-d-30820_irnt_Liver.Rmd
Modified: analysis/ukb-d-30820_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30830_irnt_Liver.Rmd
Modified: analysis/ukb-d-30830_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30840_irnt_Liver.Rmd
Modified: analysis/ukb-d-30840_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30850_irnt_Liver.Rmd
Modified: analysis/ukb-d-30850_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30860_irnt_Liver.Rmd
Modified: analysis/ukb-d-30860_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30870_irnt_Liver.Rmd
Modified: analysis/ukb-d-30870_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30880_irnt_Liver.Rmd
Modified: analysis/ukb-d-30880_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30890_irnt_Liver.Rmd
Modified: analysis/ukb-d-30890_irnt_Whole_Blood.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30890_irnt_Whole_Blood.Rmd
) and HTML (docs/ukb-d-30890_irnt_Whole_Blood.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
html | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
Rmd | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
html | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
Rmd | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
html | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
Rmd | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
html | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
Rmd | 837dd01 | wesleycrouse | 2021-09-01 | adding additional fixedsigma report |
Rmd | d0a5417 | wesleycrouse | 2021-08-30 | adding new reports to the index |
Rmd | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 1c62980 | wesleycrouse | 2021-08-11 | Updating reports |
Rmd | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
html | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
Rmd | da9f015 | wesleycrouse | 2021-08-07 | adding more reports |
html | da9f015 | wesleycrouse | 2021-08-07 | adding more reports |
These are the results of a ctwas
analysis of the UK Biobank trait Vitamin D (quantile)
using Whole_Blood
gene weights.
The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30890_irnt
. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.
The weights are mashr GTEx v8 models on Whole_Blood
eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)
LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])
TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)
qclist_all <- list()
qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))
for (i in 1:length(qc_files)){
load(qc_files[i])
chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}
qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"
rm(qclist, wgtlist, z_gene_chr)
#number of imputed weights
nrow(qclist_all)
[1] 11095
#number of imputed weights by chromosome
table(qclist_all$chr)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1129 747 624 400 479 621 560 383 404 430 682 652 192 362 331
16 17 18 19 20 21 22
551 725 159 911 313 130 310
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.762776
#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))
#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")
#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size #check PVE calculation
#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)
#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)
ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])
#add z score to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z
#load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd")) #for new version, stored after harmonization
z_snp <- readRDS(paste0(results_dir, "/", trait_id, ".RDS")) #for old version, unharmonized
z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,] #subset snps to those included in analysis, note some are duplicated, need to match which allele was used
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)] #for duplicated snps, this arbitrarily uses the first allele
ctwas_snp_res$z_flag <- as.numeric(ctwas_snp_res$id %in% z_snp$id[duplicated(z_snp$id)]) #mark the unclear z scores, flag=1
#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)
#merge gene and snp results with added information
ctwas_gene_res$z_flag=NA
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA
ctwas_res <- rbind(ctwas_gene_res,
ctwas_snp_res[,colnames(ctwas_gene_res)])
#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])
report_cols_snps <- c("id", report_cols[-1])
#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)
library(ggplot2)
library(cowplot)
********************************************************
Note: As of version 1.0.0, cowplot does not change the
default ggplot2 theme anymore. To recover the previous
behavior, execute:
theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
value = c(group_prior_rec[1,], group_prior_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)
df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument
p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(pi)) +
ggtitle("Prior mean") +
theme_cowplot()
df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(sigma^2)) +
ggtitle("Prior variance") +
theme_cowplot()
plot_grid(p_pi, p_sigma2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
gene snp
0.0019923611 0.0002010424
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
gene snp
157.189551 6.999876
#report sample size
print(sample_size)
[1] 329247
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1] 11095 8697330
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
gene snp
0.01055352 0.03717423
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.01238507 0.32669942
#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
genename region_tag susie_pip mu2 PVE z
2073 SULT2A1 19_33 1.000 357.93 1.1e-03 -19.30
8347 HSD17B13 4_59 0.941 30.22 8.6e-05 4.70
4564 PSRC1 1_67 0.874 49.00 1.3e-04 6.56
3811 INSIG2 2_69 0.874 38.97 1.0e-04 -5.95
2410 MLX 17_25 0.783 44.84 1.1e-04 -6.51
2496 ZPR1 11_70 0.767 85.44 2.0e-04 8.92
2720 MED23 6_87 0.677 53.08 1.1e-04 6.97
4249 LRP3 19_23 0.606 32.59 6.0e-05 -4.40
8634 DHCR7 11_40 0.519 1482.51 2.3e-03 36.46
5772 LPP 3_115 0.506 37.63 5.8e-05 4.45
4084 PSMA1 11_11 0.500 1232.13 1.9e-03 -17.47
6264 PDE3B 11_11 0.500 1232.13 1.9e-03 -17.47
8633 NADSYN1 11_40 0.481 1482.36 2.2e-03 36.46
5318 USP3 15_29 0.480 49.83 7.3e-05 -7.06
1043 NFE2L1 17_28 0.343 27.56 2.9e-05 -4.22
7089 USP1 1_39 0.339 76.99 7.9e-05 -8.54
10765 ZDHHC18 1_18 0.319 35.28 3.4e-05 -4.52
4394 UTP3 4_49 0.221 34.98 2.3e-05 4.63
3191 AKR1A1 1_28 0.209 35.40 2.3e-05 -4.82
4151 LDLR 19_9 0.194 59.83 3.5e-05 3.77
#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
genename region_tag susie_pip mu2 PVE z
8634 DHCR7 11_40 0.519 1482.51 2.3e-03 36.46
8633 NADSYN1 11_40 0.481 1482.36 2.2e-03 36.46
9889 CYP2R1 11_11 0.000 1261.34 0.0e+00 35.17
4084 PSMA1 11_11 0.500 1232.13 1.9e-03 -17.47
6264 PDE3B 11_11 0.500 1232.13 1.9e-03 -17.47
7886 TRIM68 11_3 0.000 503.24 0.0e+00 -0.85
4083 COPB1 11_11 0.000 403.08 0.0e+00 -6.44
2073 SULT2A1 19_33 1.000 357.93 1.1e-03 -19.30
9362 OR51E1 11_3 0.000 91.12 0.0e+00 1.13
2496 ZPR1 11_70 0.767 85.44 2.0e-04 8.92
7089 USP1 1_39 0.339 76.99 7.9e-05 -8.54
3102 DOCK7 1_39 0.003 66.73 6.8e-07 -7.91
9151 STAP2 19_4 0.153 63.01 2.9e-05 -4.11
3099 ARHGEF2 1_77 0.004 61.79 6.7e-07 -5.14
7922 ZNF701 19_37 0.050 61.05 9.3e-06 -4.07
574 CA11 19_33 0.004 60.81 6.6e-07 3.80
4151 LDLR 19_9 0.194 59.83 3.5e-05 3.77
10443 ZNF665 19_37 0.041 59.81 7.5e-06 -3.57
4434 SYT11 1_77 0.026 58.93 4.7e-06 -5.79
9500 TRAPPC6B 14_11 0.002 57.65 4.0e-07 -7.19
#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
genename region_tag susie_pip mu2 PVE z
8634 DHCR7 11_40 0.519 1482.51 2.3e-03 36.46
8633 NADSYN1 11_40 0.481 1482.36 2.2e-03 36.46
4084 PSMA1 11_11 0.500 1232.13 1.9e-03 -17.47
6264 PDE3B 11_11 0.500 1232.13 1.9e-03 -17.47
2073 SULT2A1 19_33 1.000 357.93 1.1e-03 -19.30
2496 ZPR1 11_70 0.767 85.44 2.0e-04 8.92
4564 PSRC1 1_67 0.874 49.00 1.3e-04 6.56
2720 MED23 6_87 0.677 53.08 1.1e-04 6.97
2410 MLX 17_25 0.783 44.84 1.1e-04 -6.51
3811 INSIG2 2_69 0.874 38.97 1.0e-04 -5.95
8347 HSD17B13 4_59 0.941 30.22 8.6e-05 4.70
7089 USP1 1_39 0.339 76.99 7.9e-05 -8.54
5318 USP3 15_29 0.480 49.83 7.3e-05 -7.06
4249 LRP3 19_23 0.606 32.59 6.0e-05 -4.40
5772 LPP 3_115 0.506 37.63 5.8e-05 4.45
4151 LDLR 19_9 0.194 59.83 3.5e-05 3.77
10765 ZDHHC18 1_18 0.319 35.28 3.4e-05 -4.52
1043 NFE2L1 17_28 0.343 27.56 2.9e-05 -4.22
9151 STAP2 19_4 0.153 63.01 2.9e-05 -4.11
3191 AKR1A1 1_28 0.209 35.40 2.3e-05 -4.82
#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
8633 NADSYN1 11_40 0.481 1482.36 2.2e-03 36.46
8634 DHCR7 11_40 0.519 1482.51 2.3e-03 36.46
9889 CYP2R1 11_11 0.000 1261.34 0.0e+00 35.17
2073 SULT2A1 19_33 1.000 357.93 1.1e-03 -19.30
4084 PSMA1 11_11 0.500 1232.13 1.9e-03 -17.47
6264 PDE3B 11_11 0.500 1232.13 1.9e-03 -17.47
2496 ZPR1 11_70 0.767 85.44 2.0e-04 8.92
7089 USP1 1_39 0.339 76.99 7.9e-05 -8.54
3102 DOCK7 1_39 0.003 66.73 6.8e-07 -7.91
5633 CRNN 1_74 0.000 34.86 1.2e-08 7.29
9500 TRAPPC6B 14_11 0.002 57.65 4.0e-07 -7.19
5318 USP3 15_29 0.480 49.83 7.3e-05 -7.06
2720 MED23 6_87 0.677 53.08 1.1e-04 6.97
3270 ARG1 6_87 0.154 49.91 2.3e-05 6.75
4564 PSRC1 1_67 0.874 49.00 1.3e-04 6.56
2410 MLX 17_25 0.783 44.84 1.1e-04 -6.51
4083 COPB1 11_11 0.000 403.08 0.0e+00 -6.44
9893 ZBTB6 9_63 0.031 43.82 4.2e-06 -6.18
205 RABGAP1 9_63 0.024 44.61 3.2e-06 -6.15
5007 BUD13 11_70 0.000 26.05 2.6e-08 -6.04
#set nominal signifiance threshold for z scores
alpha <- 0.05
#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)
#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))
plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.00396575
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
8633 NADSYN1 11_40 0.481 1482.36 2.2e-03 36.46
8634 DHCR7 11_40 0.519 1482.51 2.3e-03 36.46
9889 CYP2R1 11_11 0.000 1261.34 0.0e+00 35.17
2073 SULT2A1 19_33 1.000 357.93 1.1e-03 -19.30
4084 PSMA1 11_11 0.500 1232.13 1.9e-03 -17.47
6264 PDE3B 11_11 0.500 1232.13 1.9e-03 -17.47
2496 ZPR1 11_70 0.767 85.44 2.0e-04 8.92
7089 USP1 1_39 0.339 76.99 7.9e-05 -8.54
3102 DOCK7 1_39 0.003 66.73 6.8e-07 -7.91
5633 CRNN 1_74 0.000 34.86 1.2e-08 7.29
9500 TRAPPC6B 14_11 0.002 57.65 4.0e-07 -7.19
5318 USP3 15_29 0.480 49.83 7.3e-05 -7.06
2720 MED23 6_87 0.677 53.08 1.1e-04 6.97
3270 ARG1 6_87 0.154 49.91 2.3e-05 6.75
4564 PSRC1 1_67 0.874 49.00 1.3e-04 6.56
2410 MLX 17_25 0.783 44.84 1.1e-04 -6.51
4083 COPB1 11_11 0.000 403.08 0.0e+00 -6.44
9893 ZBTB6 9_63 0.031 43.82 4.2e-06 -6.18
205 RABGAP1 9_63 0.024 44.61 3.2e-06 -6.15
5007 BUD13 11_70 0.000 26.05 2.6e-08 -6.04
ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]
n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
ctwas_res_region <- ctwas_res[ctwas_res$region_tag==region_tag_plot,]
start <- min(ctwas_res_region$pos)
end <- max(ctwas_res_region$pos)
ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
#region name
print(paste0("Region: ", region_tag_plot))
#table of genes in region
print(ctwas_res_region_gene[,report_cols])
par(mfrow=c(4,1))
#gene z scores
plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
main=paste0("Region: ", region_tag_plot))
abline(h=sig_thresh,col="red",lty=2)
#significance threshold for SNPs
alpha_snp <- 5*10^(-8)
sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
#snp z scores
plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
abline(h=sig_thresh_snp,col="purple",lty=2)
#gene pips
plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
#snp pips
plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 11_40"
genename region_tag susie_pip mu2 PVE z
8634 DHCR7 11_40 0.519 1482.51 2.3e-03 36.46
8633 NADSYN1 11_40 0.481 1482.36 2.2e-03 36.46
11821 KRTAP5-9 11_40 0.000 6.41 5.0e-09 -0.37
6699 FAM86C1 11_40 0.000 10.57 1.1e-08 -1.16
4999 RNF121 11_40 0.000 8.47 5.8e-09 1.87
11802 RP11-849H4.2 11_40 0.000 5.00 3.4e-09 -0.19
4991 IL18BP 11_40 0.001 21.56 8.6e-08 -1.62
9668 LRTOMT 11_40 0.000 10.48 7.1e-09 -2.06
4992 NUMA1 11_40 0.000 11.41 1.2e-08 1.56
2526 ANAPC15 11_40 0.000 6.97 4.9e-09 0.77
2527 FOLR3 11_40 0.000 10.18 7.7e-09 -1.76
2525 FOLR1 11_40 0.001 17.41 5.4e-08 0.80
7579 FOLR2 11_40 0.005 31.34 4.4e-07 2.16
7580 INPPL1 11_40 0.000 5.03 3.4e-09 -0.12
7028 CLPB 11_40 0.000 5.07 3.5e-09 0.02
11293 LINC01537 11_40 0.000 10.05 7.4e-09 2.22
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 11_11"
genename region_tag susie_pip mu2 PVE z
10464 FAR1 11_11 0.0 21.74 0.0000 -2.05
4083 COPB1 11_11 0.0 403.08 0.0000 -6.44
4084 PSMA1 11_11 0.5 1232.13 0.0019 -17.47
6264 PDE3B 11_11 0.5 1232.13 0.0019 -17.47
9889 CYP2R1 11_11 0.0 1261.34 0.0000 35.17
10103 INSC 11_11 0.0 10.36 0.0000 -1.15
11818 RP11-531H8.1 11_11 0.0 31.29 0.0000 -3.75
11810 RP11-531H8.2 11_11 0.0 39.36 0.0000 4.53
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 19_33"
genename region_tag susie_pip mu2 PVE z
2050 PRKD2 19_33 0.000 5.68 2.2e-10 -0.41
1257 STRN4 19_33 0.000 6.51 2.7e-10 0.49
9389 FKRP 19_33 0.000 7.15 3.2e-10 0.69
400 AP2S1 19_33 0.000 5.59 2.2e-10 0.37
6825 ARHGAP35 19_33 0.000 5.47 2.1e-10 -0.10
5502 SAE1 19_33 0.000 5.22 1.9e-10 0.52
2055 BBC3 19_33 0.000 20.35 3.0e-09 -1.90
2053 CCDC9 19_33 0.000 13.11 1.3e-09 1.17
11894 INAFM1 19_33 0.000 8.63 4.4e-10 0.65
4639 C5AR2 19_33 0.000 5.97 2.4e-10 -0.22
4635 DHX34 19_33 0.000 5.94 2.4e-10 0.46
2077 MEIS3 19_33 0.000 6.46 2.5e-10 -0.17
2074 NAPA 19_33 0.000 5.75 2.3e-10 -0.24
3238 ZNF541 19_33 0.000 6.95 2.6e-10 -0.37
572 GLTSCR1 19_33 0.000 7.37 3.4e-10 -0.49
294 EHD2 19_33 0.000 14.66 1.6e-09 -0.96
2066 GLTSCR2 19_33 0.000 10.25 5.9e-10 -0.50
2073 SULT2A1 19_33 1.000 357.93 1.1e-03 -19.30
2089 PLA2G4C 19_33 0.000 5.93 2.2e-10 0.88
2086 LIG1 19_33 0.000 7.87 3.8e-10 0.19
9808 C19orf68 19_33 0.000 6.86 3.0e-10 0.60
2091 CABP5 19_33 0.000 10.68 6.5e-10 1.42
2085 CARD8 19_33 0.000 11.56 8.3e-10 1.39
5501 EMP3 19_33 0.000 7.18 3.2e-10 0.82
2084 CCDC114 19_33 0.000 5.27 2.0e-10 -0.48
2081 GRWD1 19_33 0.000 7.77 3.6e-10 -1.12
2080 CYTH2 19_33 0.000 8.50 4.4e-10 0.95
9493 KCNJ14 19_33 0.000 5.59 2.1e-10 0.12
5504 LMTK3 19_33 0.000 5.38 2.0e-10 -0.27
1173 SULT2B1 19_33 0.000 11.79 9.2e-10 0.96
573 SPHK2 19_33 0.000 10.91 6.8e-10 1.44
574 CA11 19_33 0.004 60.81 6.6e-07 3.80
5503 NTN5 19_33 0.000 5.15 1.9e-10 -0.08
9034 MAMSTR 19_33 0.000 29.10 1.3e-08 2.35
2097 RASIP1 19_33 0.000 26.07 8.4e-09 -2.10
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 11_70"
genename region_tag susie_pip mu2 PVE z
5007 BUD13 11_70 0.000 26.05 2.6e-08 -6.04
2496 ZPR1 11_70 0.767 85.44 2.0e-04 8.92
3237 APOA1 11_70 0.001 10.15 1.5e-08 1.22
6898 SIK3 11_70 0.000 5.92 6.0e-09 0.14
8030 PAFAH1B2 11_70 0.001 21.79 3.3e-08 -0.86
6104 TAGLN 11_70 0.000 8.19 7.5e-09 2.06
6902 PCSK7 11_70 0.001 14.53 2.9e-08 -2.30
7873 RNF214 11_70 0.001 17.02 5.6e-08 -1.19
9915 BACE1 11_70 0.000 10.95 1.2e-08 3.02
2530 CEP164 11_70 0.001 15.12 3.6e-08 1.53
5018 FXYD2 11_70 0.002 21.17 9.9e-08 1.74
5017 FXYD6 11_70 0.002 25.28 1.7e-07 -2.05
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 1_39"
genename region_tag susie_pip mu2 PVE z
7088 TM2D1 1_39 0.001 6.08 1.9e-08 0.37
4449 PATJ 1_39 0.001 6.32 2.0e-08 -0.46
7089 USP1 1_39 0.339 76.99 7.9e-05 -8.54
3102 DOCK7 1_39 0.003 66.73 6.8e-07 -7.91
3822 ATG4C 1_39 0.001 10.33 4.6e-08 1.31
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
id region_tag susie_pip mu2 PVE z
35277 rs140655928 1_74 1.000 44.21 1.3e-04 9.17
35366 rs115288876 1_74 1.000 104.31 3.2e-04 12.20
35535 rs61813920 1_74 1.000 48.54 1.5e-04 9.77
73583 rs1042034 2_13 1.000 33.99 1.0e-04 -5.24
75319 rs780093 2_16 1.000 54.90 1.7e-04 7.97
225050 rs189319377 4_49 1.000 39.57 1.2e-04 -6.98
225199 rs222042 4_52 1.000 453.30 1.4e-03 4.62
225201 rs843005 4_52 1.000 1579.06 4.8e-03 59.21
225202 rs7697091 4_52 1.000 2017.45 6.1e-03 -64.16
561121 rs12282912 11_3 1.000 1171.13 3.6e-03 2.65
561126 rs759352454 11_3 1.000 1154.25 3.5e-03 2.92
565265 rs576218922 11_11 1.000 500.09 1.5e-03 -35.89
565284 rs10766194 11_11 1.000 1617.70 4.9e-03 -41.01
565288 rs7129781 11_11 1.000 949.01 2.9e-03 -20.22
583355 rs3740776 11_42 1.000 36.62 1.1e-04 6.23
632606 rs6538696 12_57 1.000 161.23 4.9e-04 15.75
657309 rs775834524 13_25 1.000 1174.60 3.6e-03 -2.53
716351 rs2070895 15_27 1.000 96.36 2.9e-04 -10.63
752338 rs11542462 16_46 1.000 34.19 1.0e-04 -6.06
838982 rs6123359 20_32 1.000 58.27 1.8e-04 8.76
838988 rs2585441 20_32 1.000 70.61 2.1e-04 9.56
838989 rs12480880 20_32 1.000 59.72 1.8e-04 9.97
893819 rs561086564 19_33 1.000 235.44 7.2e-04 2.81
893870 rs75778010 19_33 1.000 191.71 5.8e-04 2.70
35475 rs11205006 1_74 0.999 55.39 1.7e-04 -11.74
716345 rs62000868 15_27 0.999 44.61 1.4e-04 -7.33
839011 rs6068816 20_32 0.999 36.41 1.1e-04 -7.56
859473 rs2072193 22_11 0.998 29.81 9.0e-05 5.74
632610 rs7308827 12_57 0.997 62.99 1.9e-04 -11.80
843372 rs2823025 21_2 0.995 27.57 8.3e-05 -5.33
225215 rs35096193 4_52 0.994 152.34 4.6e-04 25.11
224979 rs112345622 4_49 0.979 26.55 7.9e-05 4.88
759491 rs4968186 17_7 0.974 25.99 7.7e-05 5.18
451826 rs4738679 8_45 0.964 27.14 7.9e-05 5.22
329172 rs75080831 6_19 0.958 25.04 7.3e-05 -4.95
565693 rs117731662 11_12 0.958 26.87 7.8e-05 -5.55
86718 rs2862874 2_39 0.956 27.58 8.0e-05 -5.01
764298 rs112178027 17_18 0.953 26.54 7.7e-05 -5.24
223960 rs445908 4_48 0.952 44.12 1.3e-04 -6.66
59159 rs11122453 1_117 0.940 25.44 7.3e-05 5.06
322677 rs6925468 6_7 0.929 23.48 6.6e-05 -4.68
893742 rs191652498 19_33 0.924 70.69 2.0e-04 2.43
818716 rs2972561 19_31 0.906 24.30 6.7e-05 -4.79
54245 rs79687284 1_108 0.900 23.72 6.5e-05 -4.64
494898 rs11143795 9_34 0.898 24.54 6.7e-05 4.85
816186 rs1137844 19_24 0.897 24.16 6.6e-05 -4.71
565260 rs11023314 11_11 0.892 823.26 2.2e-03 12.46
637857 rs55947413 12_69 0.892 23.04 6.2e-05 4.46
443078 rs10086575 8_26 0.880 23.71 6.3e-05 4.67
683333 rs2144530 14_11 0.873 117.56 3.1e-04 -11.93
597633 rs2847500 11_72 0.862 27.26 7.1e-05 -5.37
579618 rs575513603 11_36 0.859 25.69 6.7e-05 4.77
743886 rs821840 16_31 0.858 44.48 1.2e-04 -6.90
201276 rs78649910 4_4 0.856 30.30 7.9e-05 -5.87
300492 rs1966478 5_72 0.849 24.49 6.3e-05 -4.73
287261 rs6887019 5_45 0.848 24.18 6.2e-05 -4.66
35748 rs61803125 1_75 0.837 24.74 6.3e-05 4.69
747689 rs139861017 16_37 0.815 24.73 6.1e-05 4.71
468612 rs12056610 8_78 0.813 74.46 1.8e-04 -8.77
564769 rs560227606 11_10 0.810 25.07 6.2e-05 -4.56
513801 rs115478735 9_70 0.801 24.60 6.0e-05 -4.60
865471 rs5767072 22_22 0.801 24.63 6.0e-05 4.55
#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
225202 rs7697091 4_52 1.000 2017.45 6.1e-03 -64.16
565284 rs10766194 11_11 1.000 1617.70 4.9e-03 -41.01
225201 rs843005 4_52 1.000 1579.06 4.8e-03 59.21
225200 rs705119 4_52 0.000 1456.00 2.2e-13 56.46
657309 rs775834524 13_25 1.000 1174.60 3.6e-03 -2.53
561121 rs12282912 11_3 1.000 1171.13 3.6e-03 2.65
657307 rs7999449 13_25 0.559 1160.51 2.0e-03 -2.79
657299 rs7337153 13_25 0.579 1160.03 2.0e-03 -2.81
657300 rs9527399 13_25 0.219 1156.60 7.7e-04 2.80
657303 rs9597193 13_25 0.208 1156.54 7.3e-04 2.80
657304 rs9537143 13_25 0.197 1156.54 6.9e-04 2.79
657302 rs9527401 13_25 0.206 1156.53 7.2e-04 2.79
657297 rs9527398 13_25 0.178 1155.93 6.3e-04 2.80
657298 rs9537125 13_25 0.178 1155.93 6.2e-04 2.80
657295 rs9537123 13_25 0.173 1155.83 6.1e-04 2.79
561126 rs759352454 11_3 1.000 1154.25 3.5e-03 2.92
581864 rs2276362 11_40 0.001 1151.31 3.8e-06 36.48
581860 rs12790010 11_40 0.001 1150.81 3.7e-06 36.47
581862 rs11233747 11_40 0.001 1150.52 3.7e-06 36.47
581859 rs12789751 11_40 0.001 1150.12 3.7e-06 36.46
581868 rs11233789 11_40 0.001 1147.36 3.7e-06 36.41
225208 rs114289627 4_52 0.008 1142.39 2.7e-05 -55.56
657288 rs3013347 13_25 0.006 1132.57 2.0e-05 -2.63
657289 rs2937326 13_25 0.006 1132.53 1.9e-05 -2.63
657290 rs9597179 13_25 0.001 1128.46 2.8e-06 2.57
657314 rs9537159 13_25 0.000 1107.37 7.1e-07 -2.51
657320 rs539380 13_25 0.000 1106.19 4.7e-07 -2.51
657291 rs9537116 13_25 0.000 1105.22 6.9e-08 2.43
657313 rs35800055 13_25 0.000 1103.60 1.7e-07 2.49
657312 rs67100646 13_25 0.000 1103.57 1.8e-07 2.50
657310 rs4536353 13_25 0.000 1103.53 1.7e-07 2.49
657311 rs4296148 13_25 0.000 1103.40 1.6e-07 2.49
657317 rs7994036 13_25 0.000 1103.28 1.8e-07 2.50
657315 rs9597201 13_25 0.000 1103.23 1.7e-07 2.50
657319 rs9537174 13_25 0.000 1103.14 1.7e-07 2.49
565285 rs7125781 11_11 0.000 1087.09 0.0e+00 -33.70
657286 rs3105089 13_25 0.002 1061.11 8.0e-06 -3.11
657283 rs2315887 13_25 0.005 1056.52 1.8e-05 -3.22
657284 rs2315886 13_25 0.005 1056.52 1.8e-05 -3.22
657285 rs3124374 13_25 0.004 1056.29 1.2e-05 -3.19
657275 rs2315898 13_25 0.008 1056.11 2.6e-05 -3.26
657281 rs3124402 13_25 0.012 1055.13 3.8e-05 -3.29
657277 rs3105045 13_25 0.004 1055.08 1.4e-05 -3.22
657278 rs2315895 13_25 0.004 1055.07 1.4e-05 -3.22
657279 rs3124405 13_25 0.004 1055.05 1.4e-05 -3.22
561086 rs11500272 11_3 0.000 1054.45 1.3e-09 3.09
657273 rs7317475 13_25 0.002 1052.73 5.4e-06 -3.15
657262 rs1960704 13_25 0.002 1051.10 7.5e-06 -3.19
657270 rs520268 13_25 0.002 1051.09 7.2e-06 -3.19
657267 rs616312 13_25 0.002 1050.96 6.6e-06 -3.18
#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
225202 rs7697091 4_52 1.000 2017.45 0.00610 -64.16
565284 rs10766194 11_11 1.000 1617.70 0.00490 -41.01
225201 rs843005 4_52 1.000 1579.06 0.00480 59.21
561121 rs12282912 11_3 1.000 1171.13 0.00360 2.65
657309 rs775834524 13_25 1.000 1174.60 0.00360 -2.53
561126 rs759352454 11_3 1.000 1154.25 0.00350 2.92
565288 rs7129781 11_11 1.000 949.01 0.00290 -20.22
565260 rs11023314 11_11 0.892 823.26 0.00220 12.46
657299 rs7337153 13_25 0.579 1160.03 0.00200 -2.81
657307 rs7999449 13_25 0.559 1160.51 0.00200 -2.79
225196 rs16846876 4_52 0.728 720.42 0.00160 -52.15
565265 rs576218922 11_11 1.000 500.09 0.00150 -35.89
225199 rs222042 4_52 1.000 453.30 0.00140 4.62
657300 rs9527399 13_25 0.219 1156.60 0.00077 2.80
657303 rs9597193 13_25 0.208 1156.54 0.00073 2.80
657302 rs9527401 13_25 0.206 1156.53 0.00072 2.79
893819 rs561086564 19_33 1.000 235.44 0.00072 2.81
657304 rs9537143 13_25 0.197 1156.54 0.00069 2.79
657297 rs9527398 13_25 0.178 1155.93 0.00063 2.80
657298 rs9537125 13_25 0.178 1155.93 0.00062 2.80
657295 rs9537123 13_25 0.173 1155.83 0.00061 2.79
893870 rs75778010 19_33 1.000 191.71 0.00058 2.70
632606 rs6538696 12_57 1.000 161.23 0.00049 15.75
225215 rs35096193 4_52 0.994 152.34 0.00046 25.11
225183 rs34186014 4_52 0.183 702.54 0.00039 -51.73
581876 rs78434528 11_40 0.635 192.47 0.00037 -8.60
35366 rs115288876 1_74 1.000 104.31 0.00032 12.20
683333 rs2144530 14_11 0.873 117.56 0.00031 -11.93
716351 rs2070895 15_27 1.000 96.36 0.00029 -10.63
565268 rs73418613 11_11 0.108 818.41 0.00027 12.14
838988 rs2585441 20_32 1.000 70.61 0.00021 9.56
581843 rs7105151 11_40 0.133 486.51 0.00020 24.33
893742 rs191652498 19_33 0.924 70.69 0.00020 2.43
225182 rs11732044 4_52 0.087 696.51 0.00019 -51.61
632610 rs7308827 12_57 0.997 62.99 0.00019 -11.80
468612 rs12056610 8_78 0.813 74.46 0.00018 -8.77
838982 rs6123359 20_32 1.000 58.27 0.00018 8.76
838989 rs12480880 20_32 1.000 59.72 0.00018 9.97
35475 rs11205006 1_74 0.999 55.39 0.00017 -11.74
75319 rs780093 2_16 1.000 54.90 0.00017 7.97
35535 rs61813920 1_74 1.000 48.54 0.00015 9.77
716345 rs62000868 15_27 0.999 44.61 0.00014 -7.33
35277 rs140655928 1_74 1.000 44.21 0.00013 9.17
223960 rs445908 4_48 0.952 44.12 0.00013 -6.66
225050 rs189319377 4_49 1.000 39.57 0.00012 -6.98
743886 rs821840 16_31 0.858 44.48 0.00012 -6.90
811056 rs73004951 19_15 0.743 53.78 0.00012 7.89
583355 rs3740776 11_42 1.000 36.62 0.00011 6.23
839011 rs6068816 20_32 0.999 36.41 0.00011 -7.56
73583 rs1042034 2_13 1.000 33.99 0.00010 -5.24
#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
225202 rs7697091 4_52 1.000 2017.45 6.1e-03 -64.16
225201 rs843005 4_52 1.000 1579.06 4.8e-03 59.21
225200 rs705119 4_52 0.000 1456.00 2.2e-13 56.46
225208 rs114289627 4_52 0.008 1142.39 2.7e-05 -55.56
225196 rs16846876 4_52 0.728 720.42 1.6e-03 -52.15
225183 rs34186014 4_52 0.183 702.54 3.9e-04 -51.73
225182 rs11732044 4_52 0.087 696.51 1.9e-04 -51.61
225175 rs11733890 4_52 0.000 641.59 9.4e-07 -49.98
225190 rs1526692 4_52 0.000 859.79 1.2e-15 -46.84
225186 rs7682810 4_52 0.000 857.93 1.2e-15 -46.79
565284 rs10766194 11_11 1.000 1617.70 4.9e-03 -41.01
225210 rs34539583 4_52 0.000 291.35 1.3e-14 40.43
225197 rs1491711 4_52 0.000 536.97 2.9e-15 39.27
581864 rs2276362 11_40 0.001 1151.31 3.8e-06 36.48
581860 rs12790010 11_40 0.001 1150.81 3.7e-06 36.47
581862 rs11233747 11_40 0.001 1150.52 3.7e-06 36.47
581859 rs12789751 11_40 0.001 1150.12 3.7e-06 36.46
581868 rs11233789 11_40 0.001 1147.36 3.7e-06 36.41
565265 rs576218922 11_11 1.000 500.09 1.5e-03 -35.89
565289 rs10766196 11_11 0.000 976.48 0.0e+00 -35.15
225225 rs2139641 4_52 0.000 437.07 4.1e-11 34.17
225227 rs4694436 4_52 0.000 436.80 4.0e-11 34.17
565285 rs7125781 11_11 0.000 1087.09 0.0e+00 -33.70
565262 rs10832297 11_11 0.000 826.40 0.0e+00 -33.59
565251 rs10832291 11_11 0.000 822.27 0.0e+00 -33.58
565252 rs78474492 11_11 0.000 821.98 0.0e+00 -33.56
565256 rs7119320 11_11 0.000 821.34 0.0e+00 -33.56
581884 rs7926180 11_40 0.001 799.77 3.0e-06 32.92
581866 rs1790344 11_40 0.001 888.15 2.7e-06 -31.64
581867 rs1790343 11_40 0.001 888.15 2.7e-06 -31.64
581865 rs142902943 11_40 0.001 887.91 2.7e-06 -31.63
581869 rs1790339 11_40 0.001 887.79 2.7e-06 -31.62
581871 rs4402322 11_40 0.001 887.37 2.7e-06 -31.62
581858 rs2002064 11_40 0.001 883.03 2.7e-06 -31.54
581853 rs1790328 11_40 0.001 828.59 2.6e-06 -30.67
581855 rs1792264 11_40 0.001 826.84 2.6e-06 -30.62
225171 rs28393895 4_52 0.000 259.36 1.4e-15 29.35
565216 rs12792120 11_11 0.000 621.43 0.0e+00 -28.89
565207 rs12804549 11_11 0.000 616.18 0.0e+00 -28.80
565212 rs12287212 11_11 0.000 615.27 0.0e+00 -28.78
225191 rs2201126 4_52 0.000 106.45 2.0e-17 -28.74
581879 rs10898191 11_40 0.001 589.60 2.0e-06 -28.55
565215 rs36093983 11_11 0.000 611.13 0.0e+00 -28.51
581886 rs79634040 11_40 0.001 572.45 1.9e-06 -28.17
581887 rs11234027 11_40 0.001 558.91 1.8e-06 -27.83
581885 rs28762388 11_40 0.001 549.99 1.7e-06 -27.58
225193 rs35937576 4_52 0.000 211.65 2.5e-17 27.03
225189 rs13111974 4_52 0.000 211.18 2.4e-17 26.99
225185 rs371397 4_52 0.000 209.16 2.5e-17 -26.89
225184 rs67769961 4_52 0.000 206.72 2.4e-17 26.71
#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ...
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
#number of genes for gene set enrichment
length(genes)
[1] 4
if (length(genes)>0){
GO_enrichment <- enrichr(genes, dbs)
for (db in dbs){
print(db)
df <- GO_enrichment[[db]]
df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
print(df)
}
#DisGeNET enrichment
# devtools::install_bitbucket("ibi_group/disgenet2r")
library(disgenet2r)
disgenet_api_key <- get_disgenet_api_key(
email = "wesleycrouse@gmail.com",
password = "uchicago1" )
Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
database = "CURATED" )
df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]
print(df)
#WebGestalt enrichment
library(WebGestaltR)
background <- ctwas_gene_res$genename
#listGeneSet()
databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
interestGene=genes, referenceGene=background,
enrichDatabase=databases, interestGeneType="genesymbol",
referenceGeneType="genesymbol", isOutput=F)
print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
Querying GO_Biological_Process_2021... Done.
Querying GO_Cellular_Component_2021... Done.
Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
Term
1 sterol metabolic process (GO:0016125)
2 cholesterol metabolic process (GO:0008203)
3 ethanol catabolic process (GO:0006068)
4 SREBP signaling pathway (GO:0032933)
5 cellular response to sterol depletion (GO:0071501)
6 primary alcohol catabolic process (GO:0034310)
7 maintenance of protein localization in endoplasmic reticulum (GO:0035437)
8 cellular response to sterol (GO:0036315)
9 sulfation (GO:0051923)
10 positive regulation of cyclin-dependent protein serine/threonine kinase activity (GO:0045737)
11 regulation of spindle organization (GO:0090224)
12 ethanol metabolic process (GO:0006067)
13 purine ribonucleoside bisphosphate metabolic process (GO:0034035)
14 positive regulation of cyclin-dependent protein kinase activity (GO:1904031)
15 lipid droplet organization (GO:0034389)
16 3'-phosphoadenosine 5'-phosphosulfate metabolic process (GO:0050427)
17 positive regulation of lipid metabolic process (GO:0045834)
18 positive regulation of microtubule polymerization or depolymerization (GO:0031112)
19 positive regulation of microtubule polymerization (GO:0031116)
20 secondary alcohol biosynthetic process (GO:1902653)
21 cholesterol biosynthetic process (GO:0006695)
22 positive regulation of lipid biosynthetic process (GO:0046889)
23 regulation of mitotic spindle organization (GO:0060236)
24 regulation of lipid biosynthetic process (GO:0046890)
25 oxoacid metabolic process (GO:0043436)
26 sterol biosynthetic process (GO:0016126)
27 regulation of microtubule polymerization (GO:0031113)
28 positive regulation of biosynthetic process (GO:0009891)
29 microtubule bundle formation (GO:0001578)
30 secondary alcohol metabolic process (GO:1902652)
31 organic hydroxy compound biosynthetic process (GO:1901617)
32 mitotic metaphase plate congression (GO:0007080)
33 steroid biosynthetic process (GO:0006694)
34 positive regulation of cell cycle (GO:0045787)
35 positive regulation of protein polymerization (GO:0032273)
36 purine ribonucleotide metabolic process (GO:0009150)
37 regulation of cyclin-dependent protein serine/threonine kinase activity (GO:0000079)
38 response to insulin (GO:0032868)
39 positive regulation of supramolecular fiber organization (GO:1902905)
40 mitotic sister chromatid segregation (GO:0000070)
41 steroid metabolic process (GO:0008202)
42 positive regulation of protein serine/threonine kinase activity (GO:0071902)
43 cellular response to peptide hormone stimulus (GO:0071375)
44 negative regulation of cell growth (GO:0030308)
45 negative regulation of growth (GO:0045926)
46 cellular response to insulin stimulus (GO:0032869)
47 regulation of cell growth (GO:0001558)
Overlap Adjusted.P.value Genes
1 2/70 0.002270902 INSIG2;SULT2A1
2 2/77 0.002270902 INSIG2;SULT2A1
3 1/7 0.014522738 SULT2A1
4 1/8 0.014522738 INSIG2
5 1/9 0.014522738 INSIG2
6 1/10 0.014522738 SULT2A1
7 1/12 0.014522738 INSIG2
8 1/16 0.014522738 INSIG2
9 1/17 0.014522738 SULT2A1
10 1/17 0.014522738 PSRC1
11 1/18 0.014522738 PSRC1
12 1/19 0.014522738 SULT2A1
13 1/19 0.014522738 SULT2A1
14 1/20 0.014522738 PSRC1
15 1/22 0.014522738 HSD17B13
16 1/25 0.014522738 SULT2A1
17 1/25 0.014522738 HSD17B13
18 1/26 0.014522738 PSRC1
19 1/28 0.014522738 PSRC1
20 1/34 0.014522738 INSIG2
21 1/35 0.014522738 INSIG2
22 1/35 0.014522738 HSD17B13
23 1/35 0.014522738 PSRC1
24 1/35 0.014522738 HSD17B13
25 1/35 0.014522738 SULT2A1
26 1/38 0.015157694 INSIG2
27 1/40 0.015362222 PSRC1
28 1/44 0.016084577 HSD17B13
29 1/45 0.016084577 PSRC1
30 1/49 0.016512774 SULT2A1
31 1/50 0.016512774 INSIG2
32 1/51 0.016512774 PSRC1
33 1/65 0.020089839 INSIG2
34 1/66 0.020089839 PSRC1
35 1/69 0.020398338 PSRC1
36 1/77 0.022117773 SULT2A1
37 1/82 0.022846560 PSRC1
38 1/84 0.022846560 INSIG2
39 1/91 0.024103153 PSRC1
40 1/102 0.025435818 PSRC1
41 1/104 0.025435818 SULT2A1
42 1/106 0.025435818 PSRC1
43 1/106 0.025435818 INSIG2
44 1/126 0.028847918 PSRC1
45 1/126 0.028847918 PSRC1
46 1/129 0.028886207 INSIG2
47 1/217 0.047244425 PSRC1
[1] "GO_Cellular_Component_2021"
Term Overlap
1 spindle microtubule (GO:0005876) 1/61
2 lipid droplet (GO:0005811) 1/77
3 intracellular non-membrane-bounded organelle (GO:0043232) 2/1158
Adjusted.P.value Genes
1 0.04955318 PSRC1
2 0.04955318 HSD17B13
3 0.04955318 PSRC1;HSD17B13
[1] "GO_Molecular_Function_2021"
Term
1 oxysterol binding (GO:0008142)
2 sulfotransferase activity (GO:0008146)
3 sterol binding (GO:0032934)
4 oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor (GO:0016616)
Overlap Adjusted.P.value Genes
1 1/7 0.008396047 INSIG2
2 1/40 0.023893772 SULT2A1
3 1/60 0.023893772 INSIG2
4 1/87 0.025931914 HSD17B13
HSD17B13 gene(s) from the input list not found in DisGeNET CURATEDPSRC1 gene(s) from the input list not found in DisGeNET CURATED
Description FDR Ratio BgRatio
1 Liver Cirrhosis, Experimental 0.02543268 2/2 774/9703
3 Weight Gain 0.04183427 1/2 102/9703
2 Prostatic Neoplasms 0.12295901 1/2 616/9703
4 Malignant neoplasm of prostate 0.12295901 1/2 616/9703
NA <NA> NA <NA> <NA>
NA.1 <NA> NA <NA> <NA>
NA.2 <NA> NA <NA> <NA>
NA.3 <NA> NA <NA> <NA>
NA.4 <NA> NA <NA> <NA>
NA.5 <NA> NA <NA> <NA>
******************************************
* *
* Welcome to WebGestaltR ! *
* *
******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet,
minNum = minNum, : No significant gene set is identified based on FDR 0.05!
NULL
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0 cowplot_1.0.0
[5] ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] bitops_1.0-6 matrixStats_0.57.0
[3] fs_1.3.1 bit64_4.0.5
[5] doParallel_1.0.16 progress_1.2.2
[7] httr_1.4.1 rprojroot_2.0.2
[9] GenomeInfoDb_1.20.0 doRNG_1.8.2
[11] tools_3.6.1 utf8_1.2.1
[13] R6_2.5.0 DBI_1.1.1
[15] BiocGenerics_0.30.0 colorspace_1.4-1
[17] withr_2.4.1 tidyselect_1.1.0
[19] prettyunits_1.0.2 bit_4.0.4
[21] curl_3.3 compiler_3.6.1
[23] git2r_0.26.1 Biobase_2.44.0
[25] DelayedArray_0.10.0 rtracklayer_1.44.0
[27] labeling_0.3 scales_1.1.0
[29] readr_1.4.0 apcluster_1.4.8
[31] stringr_1.4.0 digest_0.6.20
[33] Rsamtools_2.0.0 svglite_1.2.2
[35] rmarkdown_1.13 XVector_0.24.0
[37] pkgconfig_2.0.3 htmltools_0.3.6
[39] fastmap_1.1.0 BSgenome_1.52.0
[41] rlang_0.4.11 RSQLite_2.2.7
[43] generics_0.0.2 farver_2.1.0
[45] jsonlite_1.6 BiocParallel_1.18.0
[47] dplyr_1.0.7 VariantAnnotation_1.30.1
[49] RCurl_1.98-1.1 magrittr_2.0.1
[51] GenomeInfoDbData_1.2.1 Matrix_1.2-18
[53] Rcpp_1.0.6 munsell_0.5.0
[55] S4Vectors_0.22.1 fansi_0.5.0
[57] gdtools_0.1.9 lifecycle_1.0.0
[59] stringi_1.4.3 whisker_0.3-2
[61] yaml_2.2.0 SummarizedExperiment_1.14.1
[63] zlibbioc_1.30.0 plyr_1.8.4
[65] grid_3.6.1 blob_1.2.1
[67] parallel_3.6.1 promises_1.0.1
[69] crayon_1.4.1 lattice_0.20-38
[71] Biostrings_2.52.0 GenomicFeatures_1.36.3
[73] hms_1.1.0 knitr_1.23
[75] pillar_1.6.1 igraph_1.2.4.1
[77] GenomicRanges_1.36.0 rjson_0.2.20
[79] rngtools_1.5 codetools_0.2-16
[81] reshape2_1.4.3 biomaRt_2.40.1
[83] stats4_3.6.1 XML_3.98-1.20
[85] glue_1.4.2 evaluate_0.14
[87] data.table_1.14.0 foreach_1.5.1
[89] vctrs_0.3.8 httpuv_1.5.1
[91] gtable_0.3.0 purrr_0.3.4
[93] assertthat_0.2.1 cachem_1.0.5
[95] xfun_0.8 later_0.8.0
[97] tibble_3.1.2 iterators_1.0.13
[99] GenomicAlignments_1.20.1 AnnotationDbi_1.46.0
[101] memoise_2.0.0 IRanges_2.18.1
[103] workflowr_1.6.2 ellipsis_0.3.2