Last updated: 2021-09-09
Checks: 6 1
Knit directory: ctwas_applied/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish
to commit the R Markdown file and build the HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20210726)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 59e5f4d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Unstaged changes:
Modified: analysis/ukb-d-30500_irnt_Liver.Rmd
Modified: analysis/ukb-d-30500_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30600_irnt_Liver.Rmd
Modified: analysis/ukb-d-30600_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30610_irnt_Liver.Rmd
Modified: analysis/ukb-d-30610_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30620_irnt_Liver.Rmd
Modified: analysis/ukb-d-30620_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30630_irnt_Liver.Rmd
Modified: analysis/ukb-d-30630_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30640_irnt_Liver.Rmd
Modified: analysis/ukb-d-30640_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30650_irnt_Liver.Rmd
Modified: analysis/ukb-d-30650_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30660_irnt_Liver.Rmd
Modified: analysis/ukb-d-30660_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30670_irnt_Liver.Rmd
Modified: analysis/ukb-d-30670_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30680_irnt_Liver.Rmd
Modified: analysis/ukb-d-30690_irnt_Liver.Rmd
Modified: analysis/ukb-d-30690_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30700_irnt_Liver.Rmd
Modified: analysis/ukb-d-30700_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30710_irnt_Liver.Rmd
Modified: analysis/ukb-d-30710_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30720_irnt_Liver.Rmd
Modified: analysis/ukb-d-30720_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30730_irnt_Liver.Rmd
Modified: analysis/ukb-d-30740_irnt_Liver.Rmd
Modified: analysis/ukb-d-30740_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30750_irnt_Liver.Rmd
Modified: analysis/ukb-d-30750_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30760_irnt_Liver.Rmd
Modified: analysis/ukb-d-30760_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30770_irnt_Liver.Rmd
Modified: analysis/ukb-d-30770_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30780_irnt_Liver.Rmd
Modified: analysis/ukb-d-30780_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30790_irnt_Liver.Rmd
Modified: analysis/ukb-d-30800_irnt_Liver.Rmd
Modified: analysis/ukb-d-30800_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30810_irnt_Liver.Rmd
Modified: analysis/ukb-d-30820_irnt_Liver.Rmd
Modified: analysis/ukb-d-30820_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30830_irnt_Liver.Rmd
Modified: analysis/ukb-d-30830_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30840_irnt_Liver.Rmd
Modified: analysis/ukb-d-30840_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30850_irnt_Liver.Rmd
Modified: analysis/ukb-d-30850_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30860_irnt_Liver.Rmd
Modified: analysis/ukb-d-30870_irnt_Liver.Rmd
Modified: analysis/ukb-d-30880_irnt_Liver.Rmd
Modified: analysis/ukb-d-30890_irnt_Liver.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30850_irnt_Whole_Blood.Rmd
) and HTML (docs/ukb-d-30850_irnt_Whole_Blood.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
html | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
Rmd | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
html | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
Rmd | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
html | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
Rmd | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
html | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
Rmd | 837dd01 | wesleycrouse | 2021-09-01 | adding additional fixedsigma report |
Rmd | d0a5417 | wesleycrouse | 2021-08-30 | adding new reports to the index |
Rmd | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 1c62980 | wesleycrouse | 2021-08-11 | Updating reports |
Rmd | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
html | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
Rmd | da9f015 | wesleycrouse | 2021-08-07 | adding more reports |
html | da9f015 | wesleycrouse | 2021-08-07 | adding more reports |
These are the results of a ctwas
analysis of the UK Biobank trait Testosterone (quantile)
using Whole_Blood
gene weights.
The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30850_irnt
. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.
The weights are mashr GTEx v8 models on Whole_Blood
eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)
LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])
TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)
qclist_all <- list()
qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))
for (i in 1:length(qc_files)){
load(qc_files[i])
chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}
qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"
rm(qclist, wgtlist, z_gene_chr)
#number of imputed weights
nrow(qclist_all)
[1] 11095
#number of imputed weights by chromosome
table(qclist_all$chr)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1129 747 624 400 479 621 560 383 404 430 682 652 192 362 331
16 17 18 19 20 21 22
551 725 159 911 313 130 310
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.762776
#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))
#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")
#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size #check PVE calculation
#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)
#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)
ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])
#add z score to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z
#load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd")) #for new version, stored after harmonization
z_snp <- readRDS(paste0(results_dir, "/", trait_id, ".RDS")) #for old version, unharmonized
z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,] #subset snps to those included in analysis, note some are duplicated, need to match which allele was used
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)] #for duplicated snps, this arbitrarily uses the first allele
ctwas_snp_res$z_flag <- as.numeric(ctwas_snp_res$id %in% z_snp$id[duplicated(z_snp$id)]) #mark the unclear z scores, flag=1
#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)
#merge gene and snp results with added information
ctwas_gene_res$z_flag=NA
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA
ctwas_res <- rbind(ctwas_gene_res,
ctwas_snp_res[,colnames(ctwas_gene_res)])
#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])
report_cols_snps <- c("id", report_cols[-1])
#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)
library(ggplot2)
library(cowplot)
********************************************************
Note: As of version 1.0.0, cowplot does not change the
default ggplot2 theme anymore. To recover the previous
behavior, execute:
theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
value = c(group_prior_rec[1,], group_prior_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)
df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument
p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(pi)) +
ggtitle("Prior mean") +
theme_cowplot()
df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(sigma^2)) +
ggtitle("Prior variance") +
theme_cowplot()
plot_grid(p_pi, p_sigma2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
gene snp
0.0250742164 0.0001009486
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
gene snp
6.868055 15.896828
#report sample size
print(sample_size)
[1] 312102
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1] 11095 8697330
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
gene snp
0.00612198 0.04471985
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.03572312 0.41772427
#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
genename region_tag susie_pip mu2 PVE z
3804 OPRL1 20_38 0.989 29.28 9.3e-05 -5.63
2237 TBL2 7_47 0.946 21.58 6.5e-05 -5.43
8320 B3GNT2 2_41 0.919 20.21 6.0e-05 -4.43
8504 UTF1 10_85 0.917 22.68 6.7e-05 -4.88
209 CEP68 2_42 0.914 20.93 6.1e-05 -4.48
6064 PTPRJ 11_29 0.909 23.98 7.0e-05 -3.65
7636 ZNF219 14_1 0.894 17.87 5.1e-05 4.02
9030 TYMS 18_1 0.889 18.14 5.2e-05 -3.95
4219 ASS1 9_68 0.876 18.46 5.2e-05 -4.00
10046 CCDC157 22_10 0.851 18.47 5.0e-05 4.06
6977 CYGB 17_43 0.808 17.74 4.6e-05 -3.94
6122 CLEC1A 12_9 0.804 17.01 4.4e-05 3.70
7786 CATSPER2 15_16 0.798 40.88 1.0e-04 -7.43
7256 FABP1 2_55 0.796 18.77 4.8e-05 3.78
10765 ZDHHC18 1_18 0.787 39.28 9.9e-05 -6.36
9050 FBXO46 19_32 0.781 17.37 4.3e-05 -3.70
2409 HSD17B1 17_25 0.770 21.39 5.3e-05 4.50
4610 ACP2 11_29 0.760 19.96 4.9e-05 -2.96
6590 NTAN1 16_15 0.746 20.23 4.8e-05 -4.34
2073 SULT2A1 19_33 0.745 20.30 4.8e-05 4.45
#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
genename region_tag susie_pip mu2 PVE z
1980 FCGRT 19_34 0 12904.87 0.0e+00 -4.49
5520 RCN3 19_34 0 4199.65 0.0e+00 -5.70
4733 AHI1 6_89 0 1346.81 0.0e+00 2.02
8165 CPT1C 19_34 0 927.03 0.0e+00 2.98
11526 TNFSF12 17_7 0 538.42 0.0e+00 27.26
4093 ATP1B2 17_7 0 523.68 2.4e-15 -29.79
7008 TNFSF13 17_7 0 343.24 1.2e-19 3.98
7009 SENP3 17_7 0 273.89 2.2e-13 12.96
4096 MPDU1 17_7 0 246.28 1.5e-12 -13.99
571 SLC6A16 19_34 0 233.32 0.0e+00 0.89
10492 CTC-301O7.4 19_34 0 222.86 0.0e+00 1.11
5425 WRAP53 17_7 0 197.45 5.8e-13 -15.51
9555 NAA38 17_7 0 197.02 1.6e-12 10.56
5427 SAT2 17_7 0 177.25 4.4e-19 -11.35
11220 ADM5 19_34 0 140.94 0.0e+00 0.32
846 TEAD2 19_34 0 138.68 0.0e+00 1.54
6980 ALDH16A1 19_34 0 134.13 0.0e+00 -0.65
9576 RUVBL2 19_34 0 127.57 0.0e+00 -5.67
1969 GYS1 19_34 0 127.22 0.0e+00 5.62
5430 TP53 17_7 0 104.56 5.6e-19 9.21
#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
genename region_tag susie_pip mu2 PVE z
10416 ZNF655 7_61 0.436 82.11 1.1e-04 12.48
7786 CATSPER2 15_16 0.798 40.88 1.0e-04 -7.43
10765 ZDHHC18 1_18 0.787 39.28 9.9e-05 -6.36
3804 OPRL1 20_38 0.989 29.28 9.3e-05 -5.63
11637 GS1-259H13.2 7_61 0.281 80.19 7.2e-05 12.40
6064 PTPRJ 11_29 0.909 23.98 7.0e-05 -3.65
8504 UTF1 10_85 0.917 22.68 6.7e-05 -4.88
2237 TBL2 7_47 0.946 21.58 6.5e-05 -5.43
209 CEP68 2_42 0.914 20.93 6.1e-05 -4.48
8320 B3GNT2 2_41 0.919 20.21 6.0e-05 -4.43
1145 ACHE 7_62 0.622 29.69 5.9e-05 5.19
8787 ZBTB4 17_6 0.578 32.10 5.9e-05 -5.14
2186 PTCD1 7_61 0.231 78.56 5.8e-05 12.30
4545 IER3IP1 18_26 0.731 24.71 5.8e-05 -5.12
9453 ADO 10_42 0.500 34.22 5.5e-05 -4.79
3099 ARHGEF2 1_77 0.688 24.47 5.4e-05 5.55
2409 HSD17B1 17_25 0.770 21.39 5.3e-05 4.50
4219 ASS1 9_68 0.876 18.46 5.2e-05 -4.00
9030 TYMS 18_1 0.889 18.14 5.2e-05 -3.95
1087 GCKR 2_16 0.468 34.25 5.1e-05 -7.81
#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
4093 ATP1B2 17_7 0.000 523.68 2.4e-15 -29.79
11526 TNFSF12 17_7 0.000 538.42 0.0e+00 27.26
5425 WRAP53 17_7 0.000 197.45 5.8e-13 -15.51
4096 MPDU1 17_7 0.000 246.28 1.5e-12 -13.99
7009 SENP3 17_7 0.000 273.89 2.2e-13 12.96
10416 ZNF655 7_61 0.436 82.11 1.1e-04 12.48
11637 GS1-259H13.2 7_61 0.281 80.19 7.2e-05 12.40
2186 PTCD1 7_61 0.231 78.56 5.8e-05 12.30
6935 CPSF4 7_61 0.063 70.64 1.4e-05 -11.76
5427 SAT2 17_7 0.000 177.25 4.4e-19 -11.35
9403 POLR2A 17_7 0.000 56.52 0.0e+00 10.71
2953 NRBP1 2_16 0.031 81.04 8.0e-06 -10.67
9555 NAA38 17_7 0.000 197.02 1.6e-12 10.56
5430 TP53 17_7 0.000 104.56 5.6e-19 9.21
2956 SNX17 2_16 0.032 73.02 7.5e-06 -9.04
7726 CYB5A 18_43 0.170 48.50 2.6e-05 8.65
4402 KDM6B 17_7 0.000 11.05 0.0e+00 -8.11
5076 ATRAID 2_16 0.055 44.54 7.9e-06 8.01
1087 GCKR 2_16 0.468 34.25 5.1e-05 -7.81
5057 IFT172 2_16 0.451 34.02 4.9e-05 7.78
#set nominal signifiance threshold for z scores
alpha <- 0.05
#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)
#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))
plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.008021631
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
4093 ATP1B2 17_7 0.000 523.68 2.4e-15 -29.79
11526 TNFSF12 17_7 0.000 538.42 0.0e+00 27.26
5425 WRAP53 17_7 0.000 197.45 5.8e-13 -15.51
4096 MPDU1 17_7 0.000 246.28 1.5e-12 -13.99
7009 SENP3 17_7 0.000 273.89 2.2e-13 12.96
10416 ZNF655 7_61 0.436 82.11 1.1e-04 12.48
11637 GS1-259H13.2 7_61 0.281 80.19 7.2e-05 12.40
2186 PTCD1 7_61 0.231 78.56 5.8e-05 12.30
6935 CPSF4 7_61 0.063 70.64 1.4e-05 -11.76
5427 SAT2 17_7 0.000 177.25 4.4e-19 -11.35
9403 POLR2A 17_7 0.000 56.52 0.0e+00 10.71
2953 NRBP1 2_16 0.031 81.04 8.0e-06 -10.67
9555 NAA38 17_7 0.000 197.02 1.6e-12 10.56
5430 TP53 17_7 0.000 104.56 5.6e-19 9.21
2956 SNX17 2_16 0.032 73.02 7.5e-06 -9.04
7726 CYB5A 18_43 0.170 48.50 2.6e-05 8.65
4402 KDM6B 17_7 0.000 11.05 0.0e+00 -8.11
5076 ATRAID 2_16 0.055 44.54 7.9e-06 8.01
1087 GCKR 2_16 0.468 34.25 5.1e-05 -7.81
5057 IFT172 2_16 0.451 34.02 4.9e-05 7.78
ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]
n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
ctwas_res_region <- ctwas_res[ctwas_res$region_tag==region_tag_plot,]
start <- min(ctwas_res_region$pos)
end <- max(ctwas_res_region$pos)
ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
#region name
print(paste0("Region: ", region_tag_plot))
#table of genes in region
print(ctwas_res_region_gene[,report_cols])
par(mfrow=c(4,1))
#gene z scores
plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
main=paste0("Region: ", region_tag_plot))
abline(h=sig_thresh,col="red",lty=2)
#significance threshold for SNPs
alpha_snp <- 5*10^(-8)
sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
#snp z scores
plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
abline(h=sig_thresh_snp,col="purple",lty=2)
#gene pips
plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
#snp pips
plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 17_7"
genename region_tag susie_pip mu2 PVE z
7010 FGF11 17_7 0 12.89 0.0e+00 1.61
8293 CHRNB1 17_7 0 95.27 0.0e+00 0.30
9403 POLR2A 17_7 0 56.52 0.0e+00 10.71
11526 TNFSF12 17_7 0 538.42 0.0e+00 27.26
7008 TNFSF13 17_7 0 343.24 1.2e-19 3.98
7009 SENP3 17_7 0 273.89 2.2e-13 12.96
4096 MPDU1 17_7 0 246.28 1.5e-12 -13.99
5427 SAT2 17_7 0 177.25 4.4e-19 -11.35
4093 ATP1B2 17_7 0 523.68 2.4e-15 -29.79
5425 WRAP53 17_7 0 197.45 5.8e-13 -15.51
5430 TP53 17_7 0 104.56 5.6e-19 9.21
4402 KDM6B 17_7 0 11.05 0.0e+00 -8.11
7989 TMEM88 17_7 0 25.68 0.0e+00 3.57
9555 NAA38 17_7 0 197.02 1.6e-12 10.56
8272 CHD3 17_7 0 40.93 0.0e+00 -4.52
9286 AC025335.1 17_7 0 42.35 0.0e+00 4.38
8279 KCNAB3 17_7 0 31.34 0.0e+00 -2.31
8277 CNTROB 17_7 0 26.76 0.0e+00 -1.48
8278 TRAPPC1 17_7 0 6.66 0.0e+00 2.04
11172 VAMP2 17_7 0 65.12 4.6e-20 -3.80
9234 TMEM107 17_7 0 19.21 0.0e+00 1.68
10292 BORCS6 17_7 0 34.87 0.0e+00 -3.13
9228 LINC00324 17_7 0 6.03 0.0e+00 -0.81
9218 PFAS 17_7 0 5.65 0.0e+00 -0.19
9226 CTC1 17_7 0 85.96 1.1e-18 -4.49
3790 SLC25A35 17_7 0 17.54 0.0e+00 2.16
9716 KRBA2 17_7 0 20.46 0.0e+00 -2.50
7011 RPL26 17_7 0 5.11 0.0e+00 0.67
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 7_61"
genename region_tag susie_pip mu2 PVE z
10651 SMURF1 7_61 0.125 13.63 5.5e-06 0.36
11572 ARPC1A 7_61 0.082 16.40 4.3e-06 -2.61
4189 ARPC1B 7_61 0.026 4.69 3.9e-07 1.06
2184 PDAP1 7_61 0.104 15.25 5.1e-06 2.09
2185 BUD31 7_61 0.026 6.93 5.8e-07 1.98
6935 CPSF4 7_61 0.063 70.64 1.4e-05 -11.76
2186 PTCD1 7_61 0.231 78.56 5.8e-05 12.30
11566 ATP5J2 7_61 0.047 10.35 1.6e-06 -2.00
10617 ZNF789 7_61 0.027 21.35 1.8e-06 6.18
10416 ZNF655 7_61 0.436 82.11 1.1e-04 12.48
11637 GS1-259H13.2 7_61 0.281 80.19 7.2e-05 12.40
10368 ZSCAN25 7_61 0.031 6.55 6.5e-07 1.20
2187 CYP3A5 7_61 0.081 12.56 3.2e-06 -0.61
2188 ZKSCAN1 7_61 0.027 5.83 5.0e-07 1.89
7760 ZSCAN21 7_61 0.036 10.79 1.3e-06 -2.68
7758 ZNF3 7_61 0.063 10.52 2.1e-06 -0.91
8029 COPS6 7_61 0.061 12.78 2.5e-06 -1.95
11177 AP4M1 7_61 0.026 4.59 3.8e-07 -0.70
7838 CNPY4 7_61 0.032 7.44 7.7e-07 1.29
2190 TAF6 7_61 0.032 9.40 9.5e-07 -2.60
11096 MBLAC1 7_61 0.059 12.49 2.4e-06 1.92
5920 C7orf43 7_61 0.033 7.08 7.6e-07 -0.96
10076 LAMTOR4 7_61 0.037 7.01 8.3e-07 0.53
10379 GAL3ST4 7_61 0.026 5.92 5.0e-07 1.91
672 STAG3 7_61 0.027 4.67 4.1e-07 -0.25
11022 GPC2 7_61 0.052 14.81 2.5e-06 -2.81
11021 PVRIG 7_61 0.027 5.23 4.5e-07 1.45
11095 SPDYE3 7_61 0.029 6.30 5.8e-07 -1.52
3503 PILRB 7_61 0.029 6.62 6.2e-07 1.64
959 ZCWPW1 7_61 0.029 6.77 6.3e-07 -1.76
5924 MEPCE 7_61 0.029 6.77 6.3e-07 -1.76
7824 TSC22D4 7_61 0.031 7.54 7.5e-07 1.89
7823 NYAP1 7_61 0.032 7.82 8.0e-07 -1.94
2201 AGFG2 7_61 0.030 5.96 5.7e-07 1.03
10908 SAP25 7_61 0.034 7.44 8.2e-07 1.34
2196 FBXO24 7_61 0.070 10.89 2.5e-06 -0.79
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 2_16"
genename region_tag susie_pip mu2 PVE z
11045 SLC35F6 2_16 0.031 7.69 7.6e-07 2.36
3366 TMEM214 2_16 0.034 5.30 5.8e-07 0.36
5074 EMILIN1 2_16 0.040 14.67 1.9e-06 -3.40
5061 KHK 2_16 0.035 5.40 6.0e-07 -0.36
5059 CGREF1 2_16 0.031 4.61 4.6e-07 -0.25
5070 PREB 2_16 0.033 5.12 5.4e-07 1.46
5076 ATRAID 2_16 0.055 44.54 7.9e-06 8.01
1090 CAD 2_16 0.098 25.51 8.0e-06 5.00
5071 SLC5A6 2_16 0.031 10.99 1.1e-06 -2.75
7303 UCN 2_16 0.057 24.94 4.5e-06 6.53
2952 GTF3C2 2_16 0.047 23.06 3.4e-06 -6.40
2956 SNX17 2_16 0.032 73.02 7.5e-06 -9.04
7304 ZNF513 2_16 0.031 40.11 4.0e-06 -6.44
2953 NRBP1 2_16 0.031 81.04 8.0e-06 -10.67
5057 IFT172 2_16 0.451 34.02 4.9e-05 7.78
1087 GCKR 2_16 0.468 34.25 5.1e-05 -7.81
10613 GPN1 2_16 0.031 12.20 1.2e-06 -2.90
9018 CCDC121 2_16 0.073 15.43 3.6e-06 2.73
6660 BRE 2_16 0.055 24.59 4.3e-06 -6.18
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 18_43"
genename region_tag susie_pip mu2 PVE z
878 TIMM21 18_43 0.027 5.19 4.5e-07 -0.04
7726 CYB5A 18_43 0.170 48.50 2.6e-05 8.65
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 17_28"
genename region_tag susie_pip mu2 PVE z
84 OSBPL7 17_28 0.027 15.87 1.4e-06 2.40
5401 LRRC46 17_28 0.054 19.32 3.4e-06 2.90
6750 MRPL10 17_28 0.043 17.07 2.3e-06 2.84
5402 SCRN2 17_28 0.012 9.25 3.7e-07 1.09
7861 SP2 17_28 0.172 25.50 1.4e-05 -2.47
2370 PNPO 17_28 0.012 9.17 3.5e-07 -0.01
7862 PRR15L 17_28 0.015 10.55 5.2e-07 2.41
2373 CDK5RAP3 17_28 0.021 10.47 6.9e-07 -1.14
64 COPZ2 17_28 0.007 6.32 1.4e-07 1.41
1043 NFE2L1 17_28 0.006 4.66 9.4e-08 -0.90
12573 RP5-890E16.5 17_28 0.095 23.22 7.0e-06 3.51
2374 CBX1 17_28 0.052 18.31 3.1e-06 -3.45
21 SNX11 17_28 0.063 20.15 4.1e-06 -3.34
5400 SKAP1 17_28 0.013 10.99 4.4e-07 -1.90
3394 HOXB3 17_28 0.007 5.15 1.1e-07 0.80
9531 HOXB4 17_28 0.007 5.26 1.1e-07 0.83
8755 HOXB2 17_28 0.011 9.67 3.5e-07 -1.30
8369 HOXB9 17_28 0.027 16.65 1.4e-06 -1.71
11962 HOXB7 17_28 0.008 5.95 1.5e-07 0.13
11650 LINC02086 17_28 0.048 18.82 2.9e-06 -1.50
4852 CALCOCO2 17_28 0.006 4.75 9.6e-08 -0.42
6759 ATP5G1 17_28 0.007 6.24 1.4e-07 0.62
6761 UBE2Z 17_28 0.007 6.49 1.5e-07 0.99
6762 SNF8 17_28 0.014 10.08 4.6e-07 0.18
7846 GNGT2 17_28 0.010 29.62 9.3e-07 -7.58
2412 ABI3 17_28 0.006 11.17 2.2e-07 -3.28
8749 PHOSPHO1 17_28 0.018 14.00 8.0e-07 -1.50
11703 RP11-1079K10.3 17_28 0.007 13.31 2.8e-07 -0.01
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
id region_tag susie_pip mu2 PVE z
29498 rs12029116 1_62 1.000 113.62 3.6e-04 10.77
31545 rs1730862 1_66 1.000 86.35 2.8e-04 -9.50
75084 rs780093 2_16 1.000 228.89 7.3e-04 16.58
76439 rs11124265 2_20 1.000 71.12 2.3e-04 11.99
76713 rs72787520 2_20 1.000 97.17 3.1e-04 13.91
76932 rs112564689 2_21 1.000 36.13 1.2e-04 5.91
166268 rs768688512 3_58 1.000 576.13 1.8e-03 3.17
199205 rs5855544 3_120 1.000 49.97 1.6e-04 -7.28
223653 rs7696472 4_48 1.000 58.93 1.9e-04 8.69
366650 rs199804242 6_89 1.000 6998.83 2.2e-02 -2.78
411658 rs45467892 7_61 1.000 172.72 5.5e-04 -16.38
516576 rs17134158 10_7 1.000 58.30 1.9e-04 -9.64
579521 rs754042 11_41 1.000 38.18 1.2e-04 -5.56
723440 rs58217463 15_46 1.000 84.92 2.7e-04 8.66
723442 rs8028588 15_46 1.000 44.06 1.4e-04 5.37
747884 rs9931108 16_46 1.000 93.27 3.0e-04 7.38
747913 rs57960111 16_46 1.000 177.24 5.7e-04 1.51
755194 rs62059238 17_6 1.000 49.05 1.6e-04 7.31
755269 rs62059839 17_7 1.000 1126.74 3.6e-03 40.23
755280 rs72829446 17_7 1.000 237.89 7.6e-04 13.01
796290 rs141207866 18_43 1.000 55.55 1.8e-04 7.59
802523 rs8111359 19_9 1.000 46.76 1.5e-04 -6.68
804841 rs141356897 19_14 1.000 56.23 1.8e-04 7.59
940380 rs61371437 19_34 1.000 15063.58 4.8e-02 4.54
940389 rs113176985 19_34 1.000 15073.92 4.8e-02 4.53
940392 rs374141296 19_34 1.000 15075.64 4.8e-02 3.70
755204 rs73233955 17_6 0.998 53.60 1.7e-04 -6.19
200992 rs36205397 4_4 0.996 57.03 1.8e-04 7.71
456879 rs4236857 8_56 0.996 30.43 9.7e-05 5.27
55473 rs17558745 1_110 0.995 30.11 9.6e-05 -5.32
733751 rs2764772 16_18 0.994 36.86 1.2e-04 6.15
747910 rs12934751 16_46 0.993 87.81 2.8e-04 -8.48
234625 rs116652741 4_68 0.989 30.50 9.7e-05 -4.88
534374 rs11510917 10_39 0.983 40.03 1.3e-04 -6.25
749292 rs2255451 16_49 0.982 32.06 1.0e-04 5.54
76510 rs10439444 2_20 0.980 37.74 1.2e-04 -6.99
812360 rs2316974 19_30 0.980 33.84 1.1e-04 -5.65
420412 rs125124 7_80 0.978 27.02 8.5e-05 4.89
547596 rs632196 10_65 0.978 27.37 8.6e-05 4.29
755242 rs142700974 17_7 0.978 155.50 4.9e-04 -20.08
458025 rs382796 8_57 0.977 29.17 9.1e-05 5.35
484738 rs12683780 9_13 0.977 27.51 8.6e-05 -4.99
43226 rs10798655 1_89 0.970 61.49 1.9e-04 -8.15
683350 rs72681869 14_20 0.970 27.22 8.5e-05 4.99
578466 rs72932198 11_38 0.968 27.33 8.5e-05 -5.27
332970 rs553169727 6_26 0.965 32.78 1.0e-04 -5.02
54019 rs3754140 1_108 0.964 27.21 8.4e-05 4.53
54062 rs1223802 1_108 0.964 31.49 9.7e-05 -5.11
75281 rs7606480 2_19 0.964 28.32 8.7e-05 5.06
305673 rs329123 5_80 0.963 28.98 8.9e-05 -5.03
411300 rs4268041 7_60 0.955 74.17 2.3e-04 8.96
287522 rs2928169 5_45 0.953 26.38 8.1e-05 4.44
926265 rs17637241 17_28 0.951 51.75 1.6e-04 7.79
35912 rs10788826 1_74 0.947 26.90 8.2e-05 4.83
578503 rs12804411 11_38 0.944 28.56 8.6e-05 5.35
16004 rs12751490 1_35 0.942 27.08 8.2e-05 4.93
337187 rs6912200 6_32 0.942 28.58 8.6e-05 -5.28
491318 rs11557154 9_27 0.932 27.32 8.2e-05 4.77
701366 rs34639393 14_56 0.930 32.35 9.6e-05 -5.50
630639 rs375115050 12_59 0.926 26.34 7.8e-05 -4.88
357506 rs4515420 6_70 0.917 45.31 1.3e-04 6.30
75886 rs76886327 2_19 0.903 26.33 7.6e-05 -4.73
507163 rs201250488 9_57 0.888 31.98 9.1e-05 -5.36
447169 rs12543287 8_37 0.883 25.68 7.3e-05 4.29
830278 rs3212201 20_28 0.868 41.44 1.2e-04 6.39
564205 rs11024433 11_13 0.866 27.14 7.5e-05 -4.89
366765 rs62431291 6_90 0.861 26.14 7.2e-05 -4.61
634014 rs75622376 12_67 0.851 25.96 7.1e-05 4.67
50946 rs4951319 1_103 0.847 40.16 1.1e-04 6.26
159420 rs73128974 3_44 0.844 28.59 7.7e-05 5.06
554311 rs2225079 10_78 0.842 26.02 7.0e-05 4.56
384845 rs117483607 7_14 0.838 26.49 7.1e-05 -4.66
485945 rs112107457 9_15 0.838 38.69 1.0e-04 5.99
461261 rs369077882 8_64 0.824 25.77 6.8e-05 -4.48
17687 rs137898851 1_39 0.820 26.51 7.0e-05 4.60
547870 rs35035059 10_65 0.817 26.51 6.9e-05 -4.00
579514 rs341048 11_41 0.816 25.71 6.7e-05 -3.79
699223 rs35604463 14_52 0.808 26.82 6.9e-05 4.65
#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
940392 rs374141296 19_34 1 15075.64 4.8e-02 3.70
940389 rs113176985 19_34 1 15073.92 4.8e-02 4.53
940380 rs61371437 19_34 1 15063.58 4.8e-02 4.54
940382 rs35295508 19_34 0 15039.96 7.2e-09 4.62
940370 rs739349 19_34 0 15002.21 1.2e-07 4.50
940371 rs756628 19_34 0 15002.09 9.6e-08 4.49
940396 rs2946865 19_34 0 14990.66 2.7e-13 4.49
940387 rs73056069 19_34 0 14982.66 7.5e-13 4.53
940367 rs739347 19_34 0 14977.09 1.1e-08 4.56
940368 rs2073614 19_34 0 14961.72 3.3e-11 4.55
940384 rs2878354 19_34 0 14952.50 2.4e-15 4.59
940373 rs2077300 19_34 0 14915.60 5.0e-16 4.45
940363 rs4802613 19_34 0 14891.61 0.0e+00 4.43
940377 rs73056059 19_34 0 14889.32 1.5e-16 4.50
940397 rs60815603 19_34 0 14779.58 0.0e+00 4.58
940400 rs1316885 19_34 0 14704.19 0.0e+00 4.45
940361 rs10403394 19_34 0 14694.52 0.0e+00 4.69
940362 rs17555056 19_34 0 14688.74 0.0e+00 4.67
940402 rs60746284 19_34 0 14683.22 0.0e+00 4.52
940405 rs2946863 19_34 0 14678.04 0.0e+00 4.49
940398 rs35443645 19_34 0 14666.72 0.0e+00 4.51
940378 rs73056062 19_34 0 14504.73 0.0e+00 4.40
940408 rs553431297 19_34 0 14292.56 0.0e+00 4.65
940391 rs112283514 19_34 0 14235.94 0.0e+00 4.24
940393 rs11270139 19_34 0 14162.14 0.0e+00 4.80
940358 rs10421294 19_34 0 13259.32 0.0e+00 4.01
940357 rs8108175 19_34 0 13257.41 0.0e+00 4.01
940356 rs1858742 19_34 0 13232.23 0.0e+00 3.98
940350 rs59192944 19_34 0 13232.08 0.0e+00 4.00
940347 rs55991145 19_34 0 13222.68 0.0e+00 3.99
940342 rs3786567 19_34 0 13217.41 0.0e+00 3.98
940338 rs2271952 19_34 0 13212.34 0.0e+00 3.98
940341 rs4801801 19_34 0 13212.17 0.0e+00 3.97
940337 rs2271953 19_34 0 13197.98 0.0e+00 3.98
940339 rs2271951 19_34 0 13197.84 0.0e+00 3.99
940328 rs60365978 19_34 0 13183.92 0.0e+00 3.94
940334 rs4802612 19_34 0 13130.13 0.0e+00 3.94
940344 rs2517977 19_34 0 13110.26 0.0e+00 3.95
940331 rs55893003 19_34 0 13091.84 0.0e+00 3.97
940323 rs55992104 19_34 0 12786.69 0.0e+00 3.84
940317 rs60403475 19_34 0 12783.14 0.0e+00 3.84
940320 rs4352151 19_34 0 12782.45 0.0e+00 3.82
940314 rs11878448 19_34 0 12773.36 0.0e+00 3.81
940308 rs9653100 19_34 0 12769.47 0.0e+00 3.84
940304 rs4802611 19_34 0 12760.02 0.0e+00 3.80
940296 rs7251338 19_34 0 12740.38 0.0e+00 3.79
940295 rs59269605 19_34 0 12739.41 0.0e+00 3.80
940316 rs1042120 19_34 0 12706.13 0.0e+00 3.80
940312 rs113220577 19_34 0 12695.12 0.0e+00 3.80
940306 rs9653118 19_34 0 12676.34 0.0e+00 3.83
#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
940380 rs61371437 19_34 1.000 15063.58 0.04800 4.54
940389 rs113176985 19_34 1.000 15073.92 0.04800 4.53
940392 rs374141296 19_34 1.000 15075.64 0.04800 3.70
366650 rs199804242 6_89 1.000 6998.83 0.02200 -2.78
366649 rs2327654 6_89 0.715 7015.15 0.01600 -2.39
366653 rs113527452 6_89 0.597 6992.31 0.01300 -2.51
366666 rs6923513 6_89 0.475 7014.71 0.01100 -2.38
755269 rs62059839 17_7 1.000 1126.74 0.00360 40.23
166268 rs768688512 3_58 1.000 576.13 0.00180 3.17
535562 rs1935 10_42 0.620 580.87 0.00120 27.16
755249 rs8073177 17_7 0.747 432.97 0.00100 31.09
166265 rs12493756 3_58 0.435 563.77 0.00079 3.02
755280 rs72829446 17_7 1.000 237.89 0.00076 13.01
75084 rs780093 2_16 1.000 228.89 0.00073 16.58
166266 rs12493835 3_58 0.386 564.01 0.00070 3.00
166263 rs138503435 3_58 0.341 563.68 0.00062 3.00
166267 rs6765538 3_58 0.337 563.80 0.00061 2.96
747913 rs57960111 16_46 1.000 177.24 0.00057 1.51
755270 rs6257 17_7 0.643 275.72 0.00057 -20.73
411658 rs45467892 7_61 1.000 172.72 0.00055 -16.38
166264 rs73141241 3_58 0.297 561.02 0.00053 3.01
755242 rs142700974 17_7 0.978 155.50 0.00049 -20.08
29498 rs12029116 1_62 1.000 113.62 0.00036 10.77
76713 rs72787520 2_20 1.000 97.17 0.00031 13.91
755268 rs112885647 17_7 0.357 274.08 0.00031 -20.68
747884 rs9931108 16_46 1.000 93.27 0.00030 7.38
31545 rs1730862 1_66 1.000 86.35 0.00028 -9.50
747910 rs12934751 16_46 0.993 87.81 0.00028 -8.48
723440 rs58217463 15_46 1.000 84.92 0.00027 8.66
755247 rs9892862 17_7 0.188 429.17 0.00026 31.04
535573 rs10761729 10_42 0.136 577.23 0.00025 27.10
76439 rs11124265 2_20 1.000 71.12 0.00023 11.99
411300 rs4268041 7_60 0.955 74.17 0.00023 8.96
43226 rs10798655 1_89 0.970 61.49 0.00019 -8.15
223653 rs7696472 4_48 1.000 58.93 0.00019 8.69
516576 rs17134158 10_7 1.000 58.30 0.00019 -9.64
200992 rs36205397 4_4 0.996 57.03 0.00018 7.71
642796 rs606950 13_3 0.595 92.75 0.00018 9.88
796290 rs141207866 18_43 1.000 55.55 0.00018 7.59
804841 rs141356897 19_14 1.000 56.23 0.00018 7.59
755204 rs73233955 17_6 0.998 53.60 0.00017 -6.19
166289 rs9310061 3_58 0.287 169.36 0.00016 -5.57
199205 rs5855544 3_120 1.000 49.97 0.00016 -7.28
535593 rs5785566 10_42 0.084 576.33 0.00016 27.08
747921 rs6564897 16_46 0.370 137.66 0.00016 4.61
755194 rs62059238 17_6 1.000 49.05 0.00016 7.31
926265 rs17637241 17_28 0.951 51.75 0.00016 7.79
707145 rs28534747 15_13 0.649 69.91 0.00015 8.41
802523 rs8111359 19_9 1.000 46.76 0.00015 -6.68
166279 rs56323967 3_58 0.306 144.40 0.00014 5.18
#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
755269 rs62059839 17_7 1.000 1126.74 3.6e-03 40.23
755263 rs149932962 17_7 0.000 927.71 0.0e+00 37.98
755250 rs62059797 17_7 0.000 679.52 0.0e+00 33.12
755248 rs35049113 17_7 0.000 675.96 0.0e+00 33.07
755254 rs12941509 17_7 0.000 683.49 0.0e+00 33.07
755259 rs4968212 17_7 0.000 663.25 0.0e+00 -32.21
755257 rs62059801 17_7 0.000 633.98 0.0e+00 31.72
755249 rs8073177 17_7 0.747 432.97 1.0e-03 31.09
755247 rs9892862 17_7 0.188 429.17 2.6e-04 31.04
755253 rs11078694 17_7 0.065 429.86 8.9e-05 -30.85
535562 rs1935 10_42 0.620 580.87 1.2e-03 27.16
535573 rs10761729 10_42 0.136 577.23 2.5e-04 27.10
535593 rs5785566 10_42 0.084 576.33 1.6e-04 27.08
535609 rs6479896 10_42 0.050 575.30 9.3e-05 27.06
535605 rs10822160 10_42 0.044 574.55 8.2e-05 27.05
535611 rs7910927 10_42 0.034 574.44 6.3e-05 27.05
535626 rs3956912 10_42 0.016 572.47 3.0e-05 27.02
535617 rs10822168 10_42 0.011 571.63 2.1e-05 27.00
535557 rs35751397 10_42 0.004 569.14 7.2e-06 26.94
535619 rs10640079 10_42 0.001 567.00 1.4e-06 26.85
755218 rs34474914 17_7 0.000 432.92 0.0e+00 26.65
755289 rs1641549 17_7 0.000 268.09 1.9e-19 -26.37
755279 rs745412832 17_7 0.000 314.45 6.7e-19 25.69
535654 rs2163188 10_42 0.000 507.20 4.3e-07 -25.42
535595 rs10761739 10_42 0.002 531.32 2.6e-06 25.34
755258 rs12601581 17_7 0.000 457.60 0.0e+00 -25.31
535662 rs10822186 10_42 0.000 494.23 1.5e-07 25.11
535663 rs7895549 10_42 0.000 493.80 1.5e-07 25.09
535661 rs570234162 10_42 0.000 489.39 1.5e-07 -24.96
535658 rs4746205 10_42 0.000 492.37 4.2e-07 -24.92
535659 rs7090758 10_42 0.000 468.75 2.6e-07 -24.39
755271 rs1642797 17_7 0.000 448.97 9.6e-19 23.61
755272 rs1642808 17_7 0.000 447.84 9.6e-19 23.57
755273 rs1641538 17_7 0.000 447.87 9.6e-19 23.57
755274 rs1641531 17_7 0.000 447.53 9.6e-19 23.57
755275 rs1641528 17_7 0.000 448.13 1.1e-18 23.57
755276 rs1641522 17_7 0.000 447.95 1.1e-18 23.55
535547 rs10995445 10_42 0.000 423.25 1.7e-07 23.42
755211 rs13290 17_7 0.000 132.59 0.0e+00 23.39
755208 rs11652328 17_7 0.000 313.88 0.0e+00 22.13
755212 rs34706172 17_7 0.000 131.70 0.0e+00 21.20
755270 rs6257 17_7 0.643 275.72 5.7e-04 -20.73
755268 rs112885647 17_7 0.357 274.08 3.1e-04 -20.68
755286 rs4968186 17_7 0.000 188.88 0.0e+00 20.23
755215 rs12600863 17_7 0.000 153.57 0.0e+00 20.11
755214 rs3829603 17_7 0.000 155.86 0.0e+00 20.08
755242 rs142700974 17_7 0.978 155.50 4.9e-04 -20.08
755227 rs116560331 17_7 0.022 146.20 1.0e-05 -19.55
755260 rs12939472 17_7 0.000 244.22 0.0e+00 -19.28
535613 rs149727843 10_42 0.000 343.36 1.3e-07 18.86
#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ...
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
#number of genes for gene set enrichment
length(genes)
[1] 12
if (length(genes)>0){
GO_enrichment <- enrichr(genes, dbs)
for (db in dbs){
print(db)
df <- GO_enrichment[[db]]
df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
print(df)
}
#DisGeNET enrichment
# devtools::install_bitbucket("ibi_group/disgenet2r")
library(disgenet2r)
disgenet_api_key <- get_disgenet_api_key(
email = "wesleycrouse@gmail.com",
password = "uchicago1" )
Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
database = "CURATED" )
df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]
print(df)
#WebGestalt enrichment
library(WebGestaltR)
background <- ctwas_gene_res$genename
#listGeneSet()
databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
interestGene=genes, referenceGene=background,
enrichDatabase=databases, interestGeneType="genesymbol",
referenceGeneType="genesymbol", isOutput=F)
print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
Querying GO_Biological_Process_2021... Done.
Querying GO_Cellular_Component_2021... Done.
Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
Term
1 kinase binding (GO:0019900)
2 protein kinase binding (GO:0019901)
3 G protein-coupled opioid receptor activity (GO:0004985)
4 sequence-specific mRNA binding (GO:1990825)
5 N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase activity (GO:0008532)
6 platelet-derived growth factor receptor binding (GO:0005161)
7 neuropeptide binding (GO:0042923)
8 acetylgalactosaminyltransferase activity (GO:0008376)
9 mitogen-activated protein kinase binding (GO:0051019)
10 amino acid binding (GO:0016597)
Overlap Adjusted.P.value Genes
1 3/461 0.03115665 CEP68;TBL2;PTPRJ
2 3/506 0.03115665 CEP68;TBL2;PTPRJ
3 1/6 0.03115665 OPRL1
4 1/10 0.03368827 TYMS
5 1/12 0.03368827 B3GNT2
6 1/13 0.03368827 PTPRJ
7 1/21 0.04654297 OPRL1
8 1/27 0.04949615 B3GNT2
9 1/29 0.04949615 PTPRJ
10 1/32 0.04949615 ASS1
UTF1 gene(s) from the input list not found in DisGeNET CURATEDTBL2 gene(s) from the input list not found in DisGeNET CURATEDZNF219 gene(s) from the input list not found in DisGeNET CURATEDCCDC157 gene(s) from the input list not found in DisGeNET CURATED
Description FDR
5 Aspergillosis 0.01942232
21 Digestive System Neoplasms 0.01942232
60 Hypokinesia 0.01942232
64 Citrullinemia 0.01942232
67 Bradykinesia 0.01942232
71 Hypodynamia 0.01942232
72 Poisoning by fluorouracil 0.01942232
80 Cancer of Digestive System 0.01942232
82 Hypokinesia, Antiorthostatic 0.01942232
83 Argininosuccinic Acid Synthetase Deficiency Disease, Partial 0.01942232
Ratio BgRatio
5 1/8 2/9703
21 1/8 2/9703
60 1/8 3/9703
64 1/8 1/9703
67 1/8 3/9703
71 1/8 3/9703
72 1/8 2/9703
80 1/8 2/9703
82 1/8 3/9703
83 1/8 1/9703
******************************************
* *
* Welcome to WebGestaltR ! *
* *
******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet,
minNum = minNum, : No significant gene set is identified based on FDR 0.05!
NULL
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0 cowplot_1.0.0
[5] ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] bitops_1.0-6 matrixStats_0.57.0
[3] fs_1.3.1 bit64_4.0.5
[5] doParallel_1.0.16 progress_1.2.2
[7] httr_1.4.1 rprojroot_2.0.2
[9] GenomeInfoDb_1.20.0 doRNG_1.8.2
[11] tools_3.6.1 utf8_1.2.1
[13] R6_2.5.0 DBI_1.1.1
[15] BiocGenerics_0.30.0 colorspace_1.4-1
[17] withr_2.4.1 tidyselect_1.1.0
[19] prettyunits_1.0.2 bit_4.0.4
[21] curl_3.3 compiler_3.6.1
[23] git2r_0.26.1 Biobase_2.44.0
[25] DelayedArray_0.10.0 rtracklayer_1.44.0
[27] labeling_0.3 scales_1.1.0
[29] readr_1.4.0 apcluster_1.4.8
[31] stringr_1.4.0 digest_0.6.20
[33] Rsamtools_2.0.0 svglite_1.2.2
[35] rmarkdown_1.13 XVector_0.24.0
[37] pkgconfig_2.0.3 htmltools_0.3.6
[39] fastmap_1.1.0 BSgenome_1.52.0
[41] rlang_0.4.11 RSQLite_2.2.7
[43] generics_0.0.2 farver_2.1.0
[45] jsonlite_1.6 BiocParallel_1.18.0
[47] dplyr_1.0.7 VariantAnnotation_1.30.1
[49] RCurl_1.98-1.1 magrittr_2.0.1
[51] GenomeInfoDbData_1.2.1 Matrix_1.2-18
[53] Rcpp_1.0.6 munsell_0.5.0
[55] S4Vectors_0.22.1 fansi_0.5.0
[57] gdtools_0.1.9 lifecycle_1.0.0
[59] stringi_1.4.3 whisker_0.3-2
[61] yaml_2.2.0 SummarizedExperiment_1.14.1
[63] zlibbioc_1.30.0 plyr_1.8.4
[65] grid_3.6.1 blob_1.2.1
[67] parallel_3.6.1 promises_1.0.1
[69] crayon_1.4.1 lattice_0.20-38
[71] Biostrings_2.52.0 GenomicFeatures_1.36.3
[73] hms_1.1.0 knitr_1.23
[75] pillar_1.6.1 igraph_1.2.4.1
[77] GenomicRanges_1.36.0 rjson_0.2.20
[79] rngtools_1.5 codetools_0.2-16
[81] reshape2_1.4.3 biomaRt_2.40.1
[83] stats4_3.6.1 XML_3.98-1.20
[85] glue_1.4.2 evaluate_0.14
[87] data.table_1.14.0 foreach_1.5.1
[89] vctrs_0.3.8 httpuv_1.5.1
[91] gtable_0.3.0 purrr_0.3.4
[93] assertthat_0.2.1 cachem_1.0.5
[95] xfun_0.8 later_0.8.0
[97] tibble_3.1.2 iterators_1.0.13
[99] GenomicAlignments_1.20.1 AnnotationDbi_1.46.0
[101] memoise_2.0.0 IRanges_2.18.1
[103] workflowr_1.6.2 ellipsis_0.3.2