Last updated: 2021-09-09
Checks: 6 1
Knit directory: ctwas_applied/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish
to commit the R Markdown file and build the HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20210726)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 59e5f4d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Unstaged changes:
Modified: analysis/ukb-d-30500_irnt_Liver.Rmd
Modified: analysis/ukb-d-30500_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30600_irnt_Liver.Rmd
Modified: analysis/ukb-d-30600_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30610_irnt_Liver.Rmd
Modified: analysis/ukb-d-30610_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30620_irnt_Liver.Rmd
Modified: analysis/ukb-d-30620_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30630_irnt_Liver.Rmd
Modified: analysis/ukb-d-30630_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30640_irnt_Liver.Rmd
Modified: analysis/ukb-d-30640_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30650_irnt_Liver.Rmd
Modified: analysis/ukb-d-30650_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30660_irnt_Liver.Rmd
Modified: analysis/ukb-d-30660_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30670_irnt_Liver.Rmd
Modified: analysis/ukb-d-30670_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30680_irnt_Liver.Rmd
Modified: analysis/ukb-d-30690_irnt_Liver.Rmd
Modified: analysis/ukb-d-30690_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30700_irnt_Liver.Rmd
Modified: analysis/ukb-d-30700_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30710_irnt_Liver.Rmd
Modified: analysis/ukb-d-30710_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30720_irnt_Liver.Rmd
Modified: analysis/ukb-d-30720_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30730_irnt_Liver.Rmd
Modified: analysis/ukb-d-30740_irnt_Liver.Rmd
Modified: analysis/ukb-d-30740_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30750_irnt_Liver.Rmd
Modified: analysis/ukb-d-30750_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30760_irnt_Liver.Rmd
Modified: analysis/ukb-d-30760_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30770_irnt_Liver.Rmd
Modified: analysis/ukb-d-30770_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30780_irnt_Liver.Rmd
Modified: analysis/ukb-d-30780_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30790_irnt_Liver.Rmd
Modified: analysis/ukb-d-30800_irnt_Liver.Rmd
Modified: analysis/ukb-d-30800_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30810_irnt_Liver.Rmd
Modified: analysis/ukb-d-30820_irnt_Liver.Rmd
Modified: analysis/ukb-d-30820_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30830_irnt_Liver.Rmd
Modified: analysis/ukb-d-30830_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30840_irnt_Liver.Rmd
Modified: analysis/ukb-d-30840_irnt_Whole_Blood.Rmd
Modified: analysis/ukb-d-30850_irnt_Liver.Rmd
Modified: analysis/ukb-d-30860_irnt_Liver.Rmd
Modified: analysis/ukb-d-30870_irnt_Liver.Rmd
Modified: analysis/ukb-d-30880_irnt_Liver.Rmd
Modified: analysis/ukb-d-30890_irnt_Liver.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-d-30840_irnt_Whole_Blood.Rmd
) and HTML (docs/ukb-d-30840_irnt_Whole_Blood.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
html | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
Rmd | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
html | 4970e3e | wesleycrouse | 2021-09-08 | updating reports |
Rmd | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
html | dfd2b5f | wesleycrouse | 2021-09-07 | regenerating reports |
Rmd | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
html | 61b53b3 | wesleycrouse | 2021-09-06 | updated PVE calculation |
Rmd | 837dd01 | wesleycrouse | 2021-09-01 | adding additional fixedsigma report |
Rmd | d0a5417 | wesleycrouse | 2021-08-30 | adding new reports to the index |
Rmd | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 0922de7 | wesleycrouse | 2021-08-18 | updating all reports with locus plots |
html | 1c62980 | wesleycrouse | 2021-08-11 | Updating reports |
Rmd | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
html | 5981e80 | wesleycrouse | 2021-08-11 | Adding more reports |
Rmd | da9f015 | wesleycrouse | 2021-08-07 | adding more reports |
html | da9f015 | wesleycrouse | 2021-08-07 | adding more reports |
These are the results of a ctwas
analysis of the UK Biobank trait Total bilirubin (quantile)
using Whole_Blood
gene weights.
The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-d-30840_irnt
. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.
The weights are mashr GTEx v8 models on Whole_Blood
eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)
LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])
TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)
qclist_all <- list()
qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))
for (i in 1:length(qc_files)){
load(qc_files[i])
chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}
qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"
rm(qclist, wgtlist, z_gene_chr)
#number of imputed weights
nrow(qclist_all)
[1] 11095
#number of imputed weights by chromosome
table(qclist_all$chr)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1129 747 624 400 479 621 560 383 404 430 682 652 192 362 331
16 17 18 19 20 21 22
551 725 159 911 313 130 310
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.762776
#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))
#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")
#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size #check PVE calculation
#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)
#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)
ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])
#add z score to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z
#load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd")) #for new version, stored after harmonization
z_snp <- readRDS(paste0(results_dir, "/", trait_id, ".RDS")) #for old version, unharmonized
z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,] #subset snps to those included in analysis, note some are duplicated, need to match which allele was used
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)] #for duplicated snps, this arbitrarily uses the first allele
ctwas_snp_res$z_flag <- as.numeric(ctwas_snp_res$id %in% z_snp$id[duplicated(z_snp$id)]) #mark the unclear z scores, flag=1
#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)
#merge gene and snp results with added information
ctwas_gene_res$z_flag=NA
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA
ctwas_res <- rbind(ctwas_gene_res,
ctwas_snp_res[,colnames(ctwas_gene_res)])
#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])
report_cols_snps <- c("id", report_cols[-1])
#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)
library(ggplot2)
library(cowplot)
********************************************************
Note: As of version 1.0.0, cowplot does not change the
default ggplot2 theme anymore. To recover the previous
behavior, execute:
theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
value = c(group_prior_rec[1,], group_prior_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)
df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument
p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(pi)) +
ggtitle("Prior mean") +
theme_cowplot()
df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(sigma^2)) +
ggtitle("Prior variance") +
theme_cowplot()
plot_grid(p_pi, p_sigma2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
gene snp
4.720945e-02 1.195297e-05
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
gene snp
8.479027 1498.981237
#report sample size
print(sample_size)
[1] 342829
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1] 11095 8697330
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
gene snp
0.01295462 0.45454857
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.05599477 38.53797345
#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
genename region_tag susie_pip mu2 PVE z
5830 GTF2H2 5_42 0.999 24.27 7.1e-05 4.63
7782 CASC4 15_17 0.998 26.11 7.6e-05 5.13
9073 HIC1 17_3 0.998 22.67 6.6e-05 -4.67
10046 CCDC157 22_10 0.998 26.73 7.8e-05 -5.34
5823 IQGAP2 5_44 0.997 22.95 6.7e-05 4.77
4401 EIF5A 17_6 0.997 23.89 6.9e-05 -4.83
11297 LINC01624 6_112 0.996 22.24 6.5e-05 -4.62
3102 DOCK7 1_39 0.993 60.86 1.8e-04 8.35
10508 ADH5 4_66 0.993 36.55 1.1e-04 6.07
6181 CSNK1G3 5_75 0.993 27.34 7.9e-05 -5.87
6327 TXNDC11 16_12 0.993 20.44 5.9e-05 4.23
5941 SLC25A37 8_24 0.992 24.90 7.2e-05 -1.53
6888 SHKBP1 19_30 0.992 24.22 7.0e-05 5.10
368 FBXO42 1_11 0.991 21.15 6.1e-05 3.91
4564 PSRC1 1_67 0.991 19.16 5.5e-05 -3.49
4296 PDLIM4 5_79 0.991 31.38 9.1e-05 -6.38
3748 GNMT 6_33 0.990 49.45 1.4e-04 7.50
5692 ASXL2 2_15 0.988 29.72 8.6e-05 5.42
9028 FAM91A1 8_80 0.987 19.47 5.6e-05 4.24
4660 UBQLN1 9_41 0.987 20.37 5.9e-05 4.34
9924 GLDN 15_20 0.985 20.25 5.8e-05 4.40
8331 ADORA2B 17_14 0.985 34.83 1.0e-04 6.09
7393 CEP44 4_113 0.984 21.23 6.1e-05 4.71
1267 PABPC4 1_24 0.981 39.18 1.1e-04 6.42
6076 ZC3H12C 11_65 0.980 19.74 5.6e-05 4.30
6631 C9orf43 9_58 0.979 19.17 5.5e-05 4.24
7786 CATSPER2 15_16 0.979 23.30 6.7e-05 -4.88
3762 RAB17 2_140 0.978 19.15 5.5e-05 4.17
12450 RP11-88E10.5 13_61 0.978 19.59 5.6e-05 4.26
8178 SLC50A1 1_77 0.977 19.79 5.6e-05 3.77
9687 MROH7 1_34 0.976 17.48 5.0e-05 -3.90
6089 FADS1 11_34 0.976 65.66 1.9e-04 -8.78
7291 PCOLCE2 3_87 0.975 18.73 5.3e-05 -4.11
8486 PLEKHG5 1_5 0.974 19.69 5.6e-05 4.34
2977 CCDC88A 2_36 0.970 17.84 5.0e-05 -3.91
5188 FGD4 12_22 0.969 18.80 5.3e-05 -4.13
6395 UBASH3B 11_74 0.968 18.88 5.3e-05 4.13
8978 SMIM19 8_37 0.967 50.86 1.4e-04 -7.52
9863 LAMP1 13_62 0.966 18.56 5.2e-05 4.66
7950 CTB-50L17.10 19_5 0.963 18.42 5.2e-05 -4.26
3065 GPX7 1_33 0.961 16.72 4.7e-05 3.71
11023 SIPA1 11_36 0.961 21.63 6.1e-05 -4.57
12471 RP11-334C17.6 17_45 0.958 17.89 5.0e-05 3.99
8294 GYPA 4_94 0.957 22.10 6.2e-05 5.35
6290 ZFP36L2 2_27 0.956 18.91 5.3e-05 -4.25
4845 TTC5 14_1 0.955 17.09 4.8e-05 -3.86
9223 ZBTB7A 19_4 0.954 21.89 6.1e-05 4.61
11381 BACH1-AS1 21_11 0.953 18.35 5.1e-05 4.01
10164 RNFT1 17_35 0.950 57.04 1.6e-04 8.43
3023 BIRC6 2_20 0.942 25.85 7.1e-05 5.10
4398 UNK 17_42 0.942 24.50 6.7e-05 -4.77
11894 INAFM1 19_33 0.941 18.84 5.2e-05 4.03
7462 DAGLB 7_9 0.940 17.58 4.8e-05 3.95
562 DGAT2 11_42 0.938 63.18 1.7e-04 -8.58
12003 LINC00565 13_62 0.938 16.76 4.6e-05 3.74
10438 GYPE 4_94 0.933 22.30 6.1e-05 -5.61
5783 TM4SF19 3_121 0.932 22.18 6.0e-05 -4.51
746 SMG6 17_3 0.932 21.48 5.8e-05 -4.92
5171 EGF 4_71 0.928 25.97 7.0e-05 5.38
10781 HLA-DMA 6_27 0.919 28.80 7.7e-05 -5.99
5074 EMILIN1 2_16 0.916 20.13 5.4e-05 -4.57
3848 RRBP1 20_13 0.915 17.56 4.7e-05 -4.08
4656 NREP 5_66 0.912 19.76 5.3e-05 -4.28
5486 SIRT3 11_1 0.912 18.37 4.9e-05 -3.86
5318 USP3 15_29 0.908 88.19 2.3e-04 10.40
8641 OXSR1 3_27 0.905 16.38 4.3e-05 3.73
5540 SYTL1 1_19 0.897 16.79 4.4e-05 -3.71
4698 SNX14 6_58 0.895 17.57 4.6e-05 3.82
1958 NEFM 8_25 0.894 17.18 4.5e-05 3.72
11815 MPV17L2 19_14 0.894 18.04 4.7e-05 3.82
7283 ABHD6 3_40 0.893 17.11 4.5e-05 3.80
7627 STOX1 10_46 0.890 15.79 4.1e-05 2.63
10324 DAPK1 9_45 0.884 16.11 4.2e-05 -3.53
5268 CUL4A 13_62 0.882 17.66 4.5e-05 -4.31
8291 SIK2 11_66 0.876 20.58 5.3e-05 4.86
10319 AMZ2 17_39 0.875 15.91 4.1e-05 3.44
1366 CWF19L1 10_64 0.872 26.84 6.8e-05 -5.45
6892 PKN3 9_66 0.869 18.44 4.7e-05 -3.91
4413 ERAL1 17_18 0.867 28.32 7.2e-05 -5.20
9345 RNF182 6_12 0.866 17.00 4.3e-05 -3.65
6761 UBE2Z 17_28 0.865 19.29 4.9e-05 -3.93
5916 CCT6A 7_40 0.860 17.44 4.4e-05 -3.77
4254 HABP4 9_49 0.859 15.97 4.0e-05 -3.45
2423 LUC7L3 17_29 0.856 16.87 4.2e-05 -3.78
6601 NECAP2 1_11 0.851 15.47 3.8e-05 -2.30
9597 NPIPA1 16_15 0.851 17.26 4.3e-05 -3.64
12181 EGLN2 19_30 0.830 19.84 4.8e-05 -4.23
3906 SBDS 7_44 0.825 17.04 4.1e-05 3.61
6672 LRRC43 12_75 0.825 19.84 4.8e-05 -4.37
5881 MMS22L 6_66 0.818 17.66 4.2e-05 -3.67
11000 KLHL23 2_103 0.814 17.07 4.1e-05 3.75
10174 NKAPL 6_22 0.814 15.51 3.7e-05 1.73
2237 TBL2 7_47 0.804 16.82 3.9e-05 -3.38
10895 TMEM240 1_1 0.803 18.96 4.4e-05 4.10
6911 ADAR 1_75 0.802 18.57 4.3e-05 -4.28
#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
genename region_tag susie_pip mu2 PVE z
12599 HCP5B 6_26 0 622305.18 0.0e+00 -7.45
10848 TRIM10 6_26 0 410616.30 0.0e+00 5.76
10855 HLA-G 6_26 0 371837.82 0.0e+00 4.62
10853 HCG9 6_26 0 257715.88 0.0e+00 -0.55
10968 HLA-A 6_26 0 219855.64 0.0e+00 0.24
10844 HLA-E 6_26 0 179116.29 0.0e+00 -4.02
11418 TRIM26 6_26 0 177294.54 0.0e+00 -0.70
11120 LINC00243 6_26 0 154816.67 0.0e+00 -3.39
922 DGKD 2_137 0 144372.20 0.0e+00 -58.58
1119 USP40 2_137 0 131909.17 0.0e+00 59.39
5868 PPP1R18 6_26 0 122993.00 0.0e+00 -4.68
10841 MRPS18B 6_26 0 94840.51 0.0e+00 2.11
8464 WIPF2 17_23 0 44658.98 1.3e-09 2.94
10847 TRIM15 6_26 0 40767.56 0.0e+00 1.92
1346 CDC6 17_23 0 40224.64 0.0e+00 3.69
4971 IER3 6_26 0 37277.52 0.0e+00 0.17
1118 ATG16L1 2_137 0 23658.50 0.0e+00 -35.02
10840 C6orf136 6_26 0 22947.29 0.0e+00 1.62
4970 FLOT1 6_26 0 19010.36 0.0e+00 -0.11
10381 ZGPAT 20_38 0 17932.50 1.8e-08 -4.58
#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
genename region_tag susie_pip mu2 PVE z
7536 NFIL3 9_46 0.775 102.06 2.3e-04 11.30
5318 USP3 15_29 0.908 88.19 2.3e-04 10.40
6089 FADS1 11_34 0.976 65.66 1.9e-04 -8.78
3102 DOCK7 1_39 0.993 60.86 1.8e-04 8.35
562 DGAT2 11_42 0.938 63.18 1.7e-04 -8.58
10164 RNFT1 17_35 0.950 57.04 1.6e-04 8.43
3748 GNMT 6_33 0.990 49.45 1.4e-04 7.50
8978 SMIM19 8_37 0.967 50.86 1.4e-04 -7.52
1267 PABPC4 1_24 0.981 39.18 1.1e-04 6.42
10508 ADH5 4_66 0.993 36.55 1.1e-04 6.07
8331 ADORA2B 17_14 0.985 34.83 1.0e-04 6.09
4296 PDLIM4 5_79 0.991 31.38 9.1e-05 -6.38
5692 ASXL2 2_15 0.988 29.72 8.6e-05 5.42
6181 CSNK1G3 5_75 0.993 27.34 7.9e-05 -5.87
10046 CCDC157 22_10 0.998 26.73 7.8e-05 -5.34
8277 CNTROB 17_7 0.779 33.96 7.7e-05 -6.57
10781 HLA-DMA 6_27 0.919 28.80 7.7e-05 -5.99
7782 CASC4 15_17 0.998 26.11 7.6e-05 5.13
12299 U91328.19 6_20 0.793 31.71 7.3e-05 -8.24
5941 SLC25A37 8_24 0.992 24.90 7.2e-05 -1.53
#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
1119 USP40 2_137 0.000 131909.17 0.0e+00 59.39
922 DGKD 2_137 0.000 144372.20 0.0e+00 -58.58
1118 ATG16L1 2_137 0.000 23658.50 0.0e+00 -35.02
3636 HJURP 2_137 0.000 16455.03 0.0e+00 13.62
7536 NFIL3 9_46 0.775 102.06 2.3e-04 11.30
5318 USP3 15_29 0.908 88.19 2.3e-04 10.40
10374 ZSCAN26 6_22 0.269 69.86 5.5e-05 9.35
6089 FADS1 11_34 0.976 65.66 1.9e-04 -8.78
11433 MAPKAPK5-AS1 12_67 0.219 56.30 3.6e-05 -8.68
562 DGAT2 11_42 0.938 63.18 1.7e-04 -8.58
2220 AHR 7_17 0.122 57.29 2.0e-05 -8.51
10164 RNFT1 17_35 0.950 57.04 1.6e-04 8.43
3102 DOCK7 1_39 0.993 60.86 1.8e-04 8.35
10805 EHMT2 6_26 0.000 281.40 0.0e+00 8.32
5979 AUH 9_46 0.062 54.02 9.8e-06 8.26
12299 U91328.19 6_20 0.793 31.71 7.3e-05 -8.24
2607 SH2B3 12_67 0.079 46.30 1.1e-05 8.21
4024 TST 22_14 0.127 60.39 2.2e-05 8.15
10765 ZDHHC18 1_18 0.069 61.43 1.2e-05 8.08
10789 PBX2 6_26 0.000 243.77 0.0e+00 7.73
#set nominal signifiance threshold for z scores
alpha <- 0.05
#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)
#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))
plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.01360973
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
1119 USP40 2_137 0.000 131909.17 0.0e+00 59.39
922 DGKD 2_137 0.000 144372.20 0.0e+00 -58.58
1118 ATG16L1 2_137 0.000 23658.50 0.0e+00 -35.02
3636 HJURP 2_137 0.000 16455.03 0.0e+00 13.62
7536 NFIL3 9_46 0.775 102.06 2.3e-04 11.30
5318 USP3 15_29 0.908 88.19 2.3e-04 10.40
10374 ZSCAN26 6_22 0.269 69.86 5.5e-05 9.35
6089 FADS1 11_34 0.976 65.66 1.9e-04 -8.78
11433 MAPKAPK5-AS1 12_67 0.219 56.30 3.6e-05 -8.68
562 DGAT2 11_42 0.938 63.18 1.7e-04 -8.58
2220 AHR 7_17 0.122 57.29 2.0e-05 -8.51
10164 RNFT1 17_35 0.950 57.04 1.6e-04 8.43
3102 DOCK7 1_39 0.993 60.86 1.8e-04 8.35
10805 EHMT2 6_26 0.000 281.40 0.0e+00 8.32
5979 AUH 9_46 0.062 54.02 9.8e-06 8.26
12299 U91328.19 6_20 0.793 31.71 7.3e-05 -8.24
2607 SH2B3 12_67 0.079 46.30 1.1e-05 8.21
4024 TST 22_14 0.127 60.39 2.2e-05 8.15
10765 ZDHHC18 1_18 0.069 61.43 1.2e-05 8.08
10789 PBX2 6_26 0.000 243.77 0.0e+00 7.73
ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]
n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
ctwas_res_region <- ctwas_res[ctwas_res$region_tag==region_tag_plot,]
start <- min(ctwas_res_region$pos)
end <- max(ctwas_res_region$pos)
ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
#region name
print(paste0("Region: ", region_tag_plot))
#table of genes in region
print(ctwas_res_region_gene[,report_cols])
par(mfrow=c(4,1))
#gene z scores
plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
main=paste0("Region: ", region_tag_plot))
abline(h=sig_thresh,col="red",lty=2)
#significance threshold for SNPs
alpha_snp <- 5*10^(-8)
sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
#snp z scores
plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
abline(h=sig_thresh_snp,col="purple",lty=2)
#gene pips
plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
#snp pips
plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 2_137"
genename region_tag susie_pip mu2 PVE z
10759 GIGYF2 2_137 0 66.65 0 -4.28
9519 C2orf82 2_137 0 18.89 0 0.49
1118 ATG16L1 2_137 0 23658.50 0 -35.02
922 DGKD 2_137 0 144372.20 0 -58.58
1119 USP40 2_137 0 131909.17 0 59.39
3636 HJURP 2_137 0 16455.03 0 13.62
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 9_46"
genename region_tag susie_pip mu2 PVE z
11350 RP11-305L7.1 9_46 0.104 32.00 9.7e-06 -5.82
7536 NFIL3 9_46 0.775 102.06 2.3e-04 11.30
5979 AUH 9_46 0.062 54.02 9.8e-06 8.26
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 15_29"
genename region_tag susie_pip mu2 PVE z
5315 TPM1 15_29 0.139 9.44 3.8e-06 1.72
1885 LACTB 15_29 0.150 8.61 3.8e-06 -0.90
9763 RPS27L 15_29 0.488 23.39 3.3e-05 3.86
5132 APH1B 15_29 0.126 10.10 3.7e-06 2.29
852 CA12 15_29 0.124 7.91 2.9e-06 -1.44
11938 AC007950.1 15_29 0.096 20.44 5.7e-06 4.42
5318 USP3 15_29 0.908 88.19 2.3e-04 10.40
10420 FBXL22 15_29 0.130 6.98 2.6e-06 -0.12
1888 HERC1 15_29 0.095 13.77 3.8e-06 3.71
355 DAPK2 15_29 0.415 19.73 2.4e-05 2.92
319 SNX1 15_29 0.139 10.79 4.4e-06 -2.44
6634 SNX22 15_29 0.106 5.47 1.7e-06 0.11
7791 PPIB 15_29 0.155 11.85 5.4e-06 -2.51
8159 CSNK1G1 15_29 0.111 6.78 2.2e-06 -1.40
1889 TRIP4 15_29 0.102 6.19 1.8e-06 -1.54
7795 PCLAF 15_29 0.102 6.19 1.8e-06 -1.54
9333 ZNF609 15_29 0.163 12.20 5.8e-06 2.51
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 6_22"
genename region_tag susie_pip mu2 PVE z
10407 ZNF165 6_22 0.084 47.57 1.2e-05 -7.40
10335 ZSCAN16 6_22 0.060 9.93 1.7e-06 -2.54
10578 ZKSCAN8 6_22 0.762 17.60 3.9e-05 3.33
4950 ZSCAN9 6_22 0.028 11.55 9.4e-07 2.79
10174 NKAPL 6_22 0.814 15.51 3.7e-05 1.73
10021 ZKSCAN4 6_22 0.025 27.95 2.1e-06 -6.35
10374 ZSCAN26 6_22 0.269 69.86 5.5e-05 9.35
4972 PGBD1 6_22 0.030 6.40 5.6e-07 0.42
10188 ZKSCAN3 6_22 0.033 6.76 6.5e-07 0.18
11445 ZSCAN31 6_22 0.061 33.54 6.0e-06 -6.38
6712 ZSCAN12 6_22 0.027 26.02 2.1e-06 6.06
10865 TRIM27 6_22 0.028 6.31 5.2e-07 -0.16
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
[1] "Region: 11_34"
genename region_tag susie_pip mu2 PVE z
10165 FAM111B 11_34 0.091 8.00 2.1e-06 1.07
7794 FAM111A 11_34 0.062 4.52 8.2e-07 -0.21
2506 DTX4 11_34 0.144 12.43 5.2e-06 1.64
10468 MPEG1 11_34 0.208 15.66 9.5e-06 -1.98
2515 MS4A6A 11_34 0.080 6.78 1.6e-06 -0.93
7815 PATL1 11_34 0.089 7.86 2.0e-06 1.14
7817 STX3 11_34 0.092 8.12 2.2e-06 -1.13
7818 MRPL16 11_34 0.100 8.91 2.6e-06 1.19
4634 GIF 11_34 0.088 7.86 2.0e-06 -1.16
4638 TCN1 11_34 0.076 6.45 1.4e-06 -0.82
6096 MS4A2 11_34 0.064 4.76 8.9e-07 0.48
11819 AP001257.1 11_34 0.077 6.53 1.5e-06 -0.80
11116 MS4A4E 11_34 0.115 10.30 3.4e-06 -1.50
2516 MS4A4A 11_34 0.064 4.73 8.8e-07 -0.33
7825 MS4A6E 11_34 0.065 4.84 9.1e-07 -0.09
7826 MS4A7 11_34 0.117 10.50 3.6e-06 -1.49
7827 MS4A14 11_34 0.071 5.80 1.2e-06 0.78
2519 CCDC86 11_34 0.063 4.79 8.8e-07 0.41
9570 PTGDR2 11_34 0.077 6.54 1.5e-06 -0.82
6093 ZP1 11_34 0.115 10.58 3.5e-06 1.50
2520 PRPF19 11_34 0.067 5.35 1.1e-06 0.60
2521 TMEM109 11_34 0.069 5.48 1.1e-06 0.61
2546 SLC15A3 11_34 0.075 5.97 1.3e-06 -0.35
2547 CD5 11_34 0.088 7.17 1.8e-06 0.51
8008 VPS37C 11_34 0.076 6.12 1.4e-06 -0.49
11874 PGA5 11_34 0.064 5.25 9.8e-07 -0.89
11340 PGA3 11_34 0.093 9.21 2.5e-06 -1.58
8009 VWCE 11_34 0.066 4.97 9.6e-07 -0.10
6088 TMEM138 11_34 0.062 4.75 8.6e-07 -0.42
7030 CYB561A3 11_34 0.062 4.75 8.6e-07 -0.42
9981 TMEM216 11_34 0.088 8.05 2.1e-06 -1.19
11871 RP11-286N22.8 11_34 0.062 4.59 8.4e-07 -0.30
4631 DAGLA 11_34 0.086 12.34 3.1e-06 3.11
3765 MYRF 11_34 0.330 31.28 3.0e-05 -4.94
4636 FADS2 11_34 0.064 34.98 6.5e-06 -6.34
4637 TMEM258 11_34 0.110 15.11 4.8e-06 -2.91
6089 FADS1 11_34 0.976 65.66 1.9e-04 -8.78
11190 FADS3 11_34 0.116 9.14 3.1e-06 -0.16
8011 BEST1 11_34 0.063 7.58 1.4e-06 -2.00
6092 INCENP 11_34 0.071 6.51 1.4e-06 -1.16
7032 ASRGL1 11_34 0.127 10.88 4.0e-06 1.30
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
id region_tag susie_pip mu2 PVE z
27225 rs1730862 1_66 1.000 52.14 1.5e-04 6.87
60143 rs12239046 1_131 1.000 47.10 1.4e-04 -6.51
126771 rs4973550 2_136 1.000 64.04 1.9e-04 7.33
126782 rs6731991 2_136 1.000 72.95 2.1e-04 -7.87
127119 rs76063448 2_137 1.000 65220.78 1.9e-01 87.93
127120 rs11888459 2_137 1.000 3533847.50 1.0e+01 233.42
127121 rs2885296 2_137 1.000 3352259.20 9.8e+00 316.17
127123 rs11568318 2_137 1.000 867241.58 2.5e+00 -89.02
313271 rs72834643 6_20 1.000 192.99 5.6e-04 12.21
313292 rs115740542 6_20 1.000 255.08 7.4e-04 14.39
363841 rs542176135 7_17 1.000 251.75 7.3e-04 -10.36
573150 rs11045819 12_16 1.000 5699.99 1.7e-02 -24.62
573167 rs4363657 12_16 1.000 2505.38 7.3e-03 54.59
600000 rs653178 12_67 1.000 230.50 6.7e-04 -15.66
676033 rs340029 15_27 1.000 101.84 3.0e-04 -9.73
997241 rs1611236 6_26 1.000 1800234.92 5.3e+00 -5.01
1130624 rs10995596 10_42 1.000 61551.50 1.8e-01 -4.75
1130642 rs773090945 10_42 1.000 61583.47 1.8e-01 -4.39
1134968 rs17476364 10_46 1.000 508.47 1.5e-03 22.54
1281603 rs138395719 17_23 1.000 64809.06 1.9e-01 -3.58
1360925 rs202143810 20_38 1.000 30924.58 9.0e-02 4.97
6858 rs79598313 1_18 0.999 100.15 2.9e-04 9.63
55463 rs822928 1_122 0.999 47.67 1.4e-04 6.57
422007 rs12550646 8_36 0.999 46.11 1.3e-04 7.32
527497 rs76153913 11_2 0.999 46.64 1.4e-04 6.13
781927 rs6129760 20_25 0.999 53.73 1.6e-04 7.10
1333479 rs59616136 19_14 0.999 52.37 1.5e-04 8.30
31587 rs2779116 1_79 0.997 155.00 4.5e-04 -12.25
766349 rs814573 19_31 0.996 41.04 1.2e-04 5.98
527511 rs118092312 11_2 0.994 54.45 1.6e-04 -2.24
712940 rs2608604 16_54 0.994 92.63 2.7e-04 -7.55
483876 rs34755157 9_71 0.992 46.74 1.4e-04 -6.53
365436 rs56130071 7_19 0.991 73.25 2.1e-04 8.31
572905 rs10770693 12_15 0.989 97.21 2.8e-04 10.67
127309 rs6431241 2_138 0.987 45.43 1.3e-04 -7.18
582345 rs10876377 12_33 0.982 44.99 1.3e-04 6.21
363863 rs4721597 7_17 0.977 146.35 4.2e-04 1.77
127324 rs72982172 2_138 0.975 38.66 1.1e-04 -4.20
313913 rs559462124 6_23 0.975 65.78 1.9e-04 -8.19
807183 rs6000553 22_14 0.975 156.55 4.5e-04 11.98
313136 rs72838866 6_19 0.962 101.13 2.8e-04 10.46
88741 rs1992769 2_58 0.953 36.52 1.0e-04 -5.62
530519 rs4910498 11_8 0.953 50.89 1.4e-04 6.55
74782 rs10495928 2_28 0.952 37.20 1.0e-04 -5.66
531948 rs10766077 11_10 0.938 58.06 1.6e-04 7.28
759592 rs3794991 19_15 0.928 67.02 1.8e-04 7.93
652044 rs873642 14_30 0.923 36.20 9.7e-05 -5.54
170572 rs6779146 3_84 0.920 312.97 8.4e-04 17.61
483491 rs115478735 9_70 0.914 54.28 1.4e-04 -7.00
583259 rs2657880 12_35 0.914 40.25 1.1e-04 5.92
1333486 rs61166126 19_14 0.909 51.55 1.4e-04 -8.19
471741 rs290992 9_46 0.904 36.81 9.7e-05 -6.09
387137 rs150063393 7_60 0.879 46.79 1.2e-04 -6.44
306987 rs2765359 6_7 0.835 41.58 1.0e-04 6.03
712888 rs8044367 16_54 0.825 47.77 1.2e-04 3.68
313861 rs149972 6_21 0.823 123.22 3.0e-04 -10.99
556181 rs11224303 11_58 0.818 38.54 9.2e-05 5.77
572865 rs7962574 12_15 0.811 71.60 1.7e-04 -9.30
757765 rs141645070 19_11 0.803 39.36 9.2e-05 -5.49
#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
dfd2b5f | wesleycrouse | 2021-09-07 |
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
127120 rs11888459 2_137 1.000 3533848 1.0e+01 233.42
127103 rs13401281 2_137 0.000 3460773 0.0e+00 231.28
127121 rs2885296 2_137 1.000 3352259 9.8e+00 316.17
127117 rs17862875 2_137 0.000 3324144 0.0e+00 314.44
127100 rs1875263 2_137 0.000 3321675 0.0e+00 225.15
127099 rs6749496 2_137 0.000 2943314 0.0e+00 258.20
127092 rs7583278 2_137 0.000 2800028 0.0e+00 248.73
127108 rs4663332 2_137 0.000 2678089 0.0e+00 240.16
127109 rs200973045 2_137 0.000 2647161 0.0e+00 239.38
127110 rs57258852 2_137 0.000 2535219 0.0e+00 248.23
127114 rs3821242 2_137 0.000 2494136 0.0e+00 253.70
127112 rs2008584 2_137 0.000 2478403 0.0e+00 252.98
127086 rs6753320 2_137 0.000 2462593 0.0e+00 230.24
127087 rs6736743 2_137 0.000 2462593 0.0e+00 230.24
127091 rs2070959 2_137 0.000 2290791 0.0e+00 308.88
127090 rs1105880 2_137 0.000 2201096 0.0e+00 299.78
127085 rs77070100 2_137 0.000 2183541 0.0e+00 298.67
127098 rs2012734 2_137 0.000 2111027 0.0e+00 234.28
997227 rs111734624 6_26 0.417 1800450 2.2e+00 -5.03
997224 rs2508055 6_26 0.412 1800450 2.2e+00 -5.03
997260 rs1611252 6_26 0.290 1800449 1.5e+00 -5.03
997256 rs1611248 6_26 0.387 1800445 2.0e+00 -5.04
997271 rs1611260 6_26 0.138 1800444 7.3e-01 -5.03
997276 rs1611265 6_26 0.136 1800442 7.1e-01 -5.03
997179 rs1633033 6_26 0.028 1800407 1.5e-01 -5.04
997279 rs2394171 6_26 0.000 1800405 5.6e-07 -5.02
997278 rs1611267 6_26 0.000 1800403 3.5e-09 -5.01
997281 rs2893981 6_26 0.000 1800401 3.8e-07 -5.02
997230 rs1611228 6_26 0.000 1800396 2.2e-06 -5.02
997221 rs1737020 6_26 0.000 1800396 3.6e-11 -5.01
997222 rs1737019 6_26 0.000 1800396 3.6e-11 -5.01
997188 rs2844838 6_26 0.000 1800391 1.0e-03 -5.03
997191 rs1633032 6_26 0.000 1800275 4.3e-11 -5.03
997241 rs1611236 6_26 1.000 1800235 5.3e+00 -5.01
997215 rs1633020 6_26 0.000 1800177 0.0e+00 -5.01
997219 rs1633018 6_26 0.000 1800163 0.0e+00 -5.00
997239 rs1611234 6_26 0.000 1800053 0.0e+00 -5.00
997137 rs1610726 6_26 0.000 1799957 0.0e+00 -5.01
997186 rs2844840 6_26 0.000 1799774 0.0e+00 -5.04
997409 rs3129185 6_26 0.000 1799708 0.0e+00 -5.06
997417 rs1736999 6_26 0.000 1799632 0.0e+00 -5.06
997423 rs1633001 6_26 0.000 1799480 0.0e+00 -5.05
997254 rs1611246 6_26 0.000 1799463 0.0e+00 -5.02
997530 rs1632980 6_26 0.000 1799371 0.0e+00 -5.07
997204 rs1614309 6_26 0.000 1798920 0.0e+00 -5.04
997203 rs1633030 6_26 0.000 1797454 0.0e+00 -5.05
997289 rs9258382 6_26 0.000 1795543 0.0e+00 -5.04
997286 rs9258379 6_26 0.000 1792638 0.0e+00 -5.07
997249 rs1611241 6_26 0.000 1790436 0.0e+00 -5.12
997207 rs1633028 6_26 0.000 1787977 0.0e+00 -4.98
#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
127120 rs11888459 2_137 1.000 3533847.50 1.0e+01 233.42
127121 rs2885296 2_137 1.000 3352259.20 9.8e+00 316.17
997241 rs1611236 6_26 1.000 1800234.92 5.3e+00 -5.01
127123 rs11568318 2_137 1.000 867241.58 2.5e+00 -89.02
997224 rs2508055 6_26 0.412 1800450.19 2.2e+00 -5.03
997227 rs111734624 6_26 0.417 1800450.47 2.2e+00 -5.03
997256 rs1611248 6_26 0.387 1800444.76 2.0e+00 -5.04
997260 rs1611252 6_26 0.290 1800449.44 1.5e+00 -5.03
997271 rs1611260 6_26 0.138 1800443.58 7.3e-01 -5.03
997276 rs1611265 6_26 0.136 1800441.82 7.1e-01 -5.03
127119 rs76063448 2_137 1.000 65220.78 1.9e-01 87.93
1281603 rs138395719 17_23 1.000 64809.06 1.9e-01 -3.58
1130624 rs10995596 10_42 1.000 61551.50 1.8e-01 -4.75
1130642 rs773090945 10_42 1.000 61583.47 1.8e-01 -4.39
997179 rs1633033 6_26 0.028 1800407.22 1.5e-01 -5.04
1281602 rs9893520 17_23 0.514 64908.56 9.7e-02 -3.48
1360925 rs202143810 20_38 1.000 30924.58 9.0e-02 4.97
1360922 rs6089961 20_38 0.721 30930.57 6.5e-02 4.69
1360924 rs2738758 20_38 0.721 30930.57 6.5e-02 4.69
1281607 rs9897092 17_23 0.299 64907.43 5.7e-02 -3.48
1281572 rs9916782 17_23 0.248 64890.52 4.7e-02 -3.52
1281571 rs16965687 17_23 0.232 64890.02 4.4e-02 -3.52
1281573 rs79880154 17_23 0.170 64905.67 3.2e-02 -3.49
1281558 rs7359537 17_23 0.127 64903.24 2.4e-02 -3.49
1281561 rs9892404 17_23 0.124 64903.27 2.3e-02 -3.49
1281565 rs9906409 17_23 0.123 64903.38 2.3e-02 -3.49
573150 rs11045819 12_16 1.000 5699.99 1.7e-02 -24.62
573051 rs12366506 12_16 0.677 5056.28 1.0e-02 23.86
1281551 rs72836642 17_23 0.049 64889.11 9.4e-03 -3.50
573057 rs11045612 12_16 0.560 5051.97 8.3e-03 23.88
573167 rs4363657 12_16 1.000 2505.38 7.3e-03 54.59
1281554 rs58989731 17_23 0.024 63901.49 4.5e-03 -3.33
573068 rs73062442 12_16 0.301 5056.76 4.4e-03 23.83
1360903 rs35201382 20_38 0.047 30919.14 4.2e-03 4.68
1360905 rs2750483 20_38 0.041 30917.71 3.7e-03 4.68
573060 rs11513221 12_16 0.179 5056.45 2.6e-03 23.82
1129710 rs10761756 10_42 0.183 4337.80 2.3e-03 -15.02
1134968 rs17476364 10_46 1.000 508.47 1.5e-03 22.54
1360904 rs67468102 20_38 0.016 30913.17 1.5e-03 4.68
1360789 rs113486739 20_38 0.577 773.14 1.3e-03 0.25
572967 rs7965380 12_16 0.496 820.74 1.2e-03 -28.32
997188 rs2844838 6_26 0.000 1800390.56 1.0e-03 -5.03
1281679 rs562160239 17_23 0.005 64677.28 9.7e-04 -3.40
170572 rs6779146 3_84 0.920 312.97 8.4e-04 17.61
1281651 rs60220138 17_23 0.004 64029.11 7.5e-04 -3.45
313292 rs115740542 6_20 1.000 255.08 7.4e-04 14.39
363841 rs542176135 7_17 1.000 251.75 7.3e-04 -10.36
1281678 rs1807273 17_23 0.004 64720.04 6.9e-04 -3.42
600000 rs653178 12_67 1.000 230.50 6.7e-04 -15.66
1281529 rs546912945 17_23 0.004 64494.60 6.6e-04 -3.42
#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
127121 rs2885296 2_137 1 3352259.20 9.80 316.17
127117 rs17862875 2_137 0 3324143.86 0.00 314.44
127091 rs2070959 2_137 0 2290790.93 0.00 308.88
127090 rs1105880 2_137 0 2201096.23 0.00 299.78
127085 rs77070100 2_137 0 2183540.64 0.00 298.67
127099 rs6749496 2_137 0 2943314.21 0.00 258.20
127114 rs3821242 2_137 0 2494136.46 0.00 253.70
127112 rs2008584 2_137 0 2478402.82 0.00 252.98
127092 rs7583278 2_137 0 2800027.53 0.00 248.73
127110 rs57258852 2_137 0 2535219.32 0.00 248.23
127076 rs2741034 2_137 0 1426356.18 0.00 243.32
127068 rs2602363 2_137 0 1423610.85 0.00 243.08
127063 rs2741013 2_137 0 1420350.81 0.00 242.86
127108 rs4663332 2_137 0 2678089.04 0.00 240.16
127109 rs200973045 2_137 0 2647161.36 0.00 239.38
127098 rs2012734 2_137 0 2111026.84 0.00 234.28
127120 rs11888459 2_137 1 3533847.50 10.00 233.42
127103 rs13401281 2_137 0 3460772.69 0.00 231.28
127086 rs6753320 2_137 0 2462592.75 0.00 230.24
127087 rs6736743 2_137 0 2462592.75 0.00 230.24
127100 rs1875263 2_137 0 3321674.62 0.00 225.15
127135 rs2361502 2_137 0 1484520.51 0.00 200.59
127072 rs6431558 2_137 0 813258.17 0.00 -179.35
127080 rs1113193 2_137 0 309874.18 0.00 -123.49
127074 rs1823803 2_137 0 624535.82 0.00 109.93
127132 rs10202032 2_137 0 533854.86 0.00 -104.88
127122 rs143373661 2_137 0 415355.58 0.00 100.66
127133 rs6723936 2_137 0 354083.38 0.00 100.34
127123 rs11568318 2_137 1 867241.58 2.50 -89.02
127113 rs45507691 2_137 0 857756.71 0.00 -88.76
127119 rs76063448 2_137 1 65220.78 0.19 87.93
127070 rs13027376 2_137 0 432857.74 0.00 -87.77
127067 rs4047192 2_137 0 432460.69 0.00 -87.66
127115 rs45510999 2_137 0 64739.98 0.00 87.41
127107 rs183532563 2_137 0 64198.55 0.00 86.77
127089 rs12476197 2_137 0 505544.66 0.00 -82.78
127083 rs4663871 2_137 0 501885.52 0.00 -82.54
127088 rs765251456 2_137 0 495299.59 0.00 -82.33
127064 rs7563478 2_137 0 712492.31 0.00 -80.75
127116 rs139257330 2_137 0 258993.87 0.00 77.73
127081 rs17863773 2_137 0 394984.33 0.00 -76.44
127073 rs17863766 2_137 0 342986.84 0.00 -75.62
127075 rs10929252 2_137 0 349921.03 0.00 -75.55
127062 rs140719475 2_137 0 348420.79 0.00 -75.37
127125 rs2003569 2_137 0 423254.24 0.00 -74.03
127078 rs2602372 2_137 0 228650.01 0.00 73.61
127065 rs6713902 2_137 0 305824.08 0.00 -72.04
127050 rs4047189 2_137 0 185907.06 0.00 69.43
127055 rs28802965 2_137 0 170760.00 0.00 67.17
127048 rs62192764 2_137 0 221087.62 0.00 -65.91
#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ...
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
#number of genes for gene set enrichment
length(genes)
[1] 95
if (length(genes)>0){
GO_enrichment <- enrichr(genes, dbs)
for (db in dbs){
print(db)
df <- GO_enrichment[[db]]
df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
print(df)
}
#DisGeNET enrichment
# devtools::install_bitbucket("ibi_group/disgenet2r")
library(disgenet2r)
disgenet_api_key <- get_disgenet_api_key(
email = "wesleycrouse@gmail.com",
password = "uchicago1" )
Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
database = "CURATED" )
df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]
print(df)
#WebGestalt enrichment
library(WebGestaltR)
background <- ctwas_gene_res$genename
#listGeneSet()
databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
interestGene=genes, referenceGene=background,
enrichDatabase=databases, interestGeneType="genesymbol",
referenceGeneType="genesymbol", isOutput=F)
print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
Uploading data to Enrichr... Done.
Querying GO_Biological_Process_2021... Done.
Querying GO_Cellular_Component_2021... Done.
Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
CTB-50L17.10 gene(s) from the input list not found in DisGeNET CURATEDPKN3 gene(s) from the input list not found in DisGeNET CURATEDMMS22L gene(s) from the input list not found in DisGeNET CURATEDRP11-88E10.5 gene(s) from the input list not found in DisGeNET CURATEDPSRC1 gene(s) from the input list not found in DisGeNET CURATEDRRBP1 gene(s) from the input list not found in DisGeNET CURATEDSIK2 gene(s) from the input list not found in DisGeNET CURATEDUBASH3B gene(s) from the input list not found in DisGeNET CURATEDSMG6 gene(s) from the input list not found in DisGeNET CURATEDTTC5 gene(s) from the input list not found in DisGeNET CURATEDFBXO42 gene(s) from the input list not found in DisGeNET CURATEDLUC7L3 gene(s) from the input list not found in DisGeNET CURATEDC9orf43 gene(s) from the input list not found in DisGeNET CURATEDLAMP1 gene(s) from the input list not found in DisGeNET CURATEDBACH1-AS1 gene(s) from the input list not found in DisGeNET CURATEDNECAP2 gene(s) from the input list not found in DisGeNET CURATEDRNFT1 gene(s) from the input list not found in DisGeNET CURATEDABHD6 gene(s) from the input list not found in DisGeNET CURATEDAMZ2 gene(s) from the input list not found in DisGeNET CURATEDUNK gene(s) from the input list not found in DisGeNET CURATEDTM4SF19 gene(s) from the input list not found in DisGeNET CURATEDCCDC157 gene(s) from the input list not found in DisGeNET CURATEDSYTL1 gene(s) from the input list not found in DisGeNET CURATEDCCT6A gene(s) from the input list not found in DisGeNET CURATEDGPX7 gene(s) from the input list not found in DisGeNET CURATEDKLHL23 gene(s) from the input list not found in DisGeNET CURATEDFAM91A1 gene(s) from the input list not found in DisGeNET CURATEDRAB17 gene(s) from the input list not found in DisGeNET CURATEDLRRC43 gene(s) from the input list not found in DisGeNET CURATEDINAFM1 gene(s) from the input list not found in DisGeNET CURATEDCSNK1G3 gene(s) from the input list not found in DisGeNET CURATEDDAGLB gene(s) from the input list not found in DisGeNET CURATEDRP11-334C17.6 gene(s) from the input list not found in DisGeNET CURATEDLINC01624 gene(s) from the input list not found in DisGeNET CURATEDGTF2H2 gene(s) from the input list not found in DisGeNET CURATEDSMIM19 gene(s) from the input list not found in DisGeNET CURATEDTBL2 gene(s) from the input list not found in DisGeNET CURATEDTXNDC11 gene(s) from the input list not found in DisGeNET CURATEDUBE2Z gene(s) from the input list not found in DisGeNET CURATEDCEP44 gene(s) from the input list not found in DisGeNET CURATEDSHKBP1 gene(s) from the input list not found in DisGeNET CURATEDLINC00565 gene(s) from the input list not found in DisGeNET CURATEDMPV17L2 gene(s) from the input list not found in DisGeNET CURATEDNPIPA1 gene(s) from the input list not found in DisGeNET CURATEDSLC50A1 gene(s) from the input list not found in DisGeNET CURATEDHABP4 gene(s) from the input list not found in DisGeNET CURATED
Description FDR
26 Conjunctival Diseases 0.06862745
70 polyps 0.06862745
138 Symmetrical dyschromatosis of extremities 0.06862745
180 Preeclampsia Eclampsia 4 0.06862745
181 CHARCOT-MARIE-TOOTH DISEASE, TYPE 4H 0.06862745
183 SPINOCEREBELLAR ATAXIA 21 0.06862745
191 Spinal Muscular Atrophy, Distal, Autosomal Recessive, 4 0.06862745
194 Hypomagnesemia 4, Renal 0.06862745
205 AICARDI-GOUTIERES SYNDROME 6 0.06862745
212 CHARCOT-MARIE-TOOTH DISEASE, RECESSIVE INTERMEDIATE C 0.06862745
Ratio BgRatio
26 1/49 1/9703
70 1/49 1/9703
138 1/49 1/9703
180 1/49 1/9703
181 1/49 1/9703
183 1/49 1/9703
191 1/49 1/9703
194 1/49 1/9703
205 1/49 1/9703
212 1/49 1/9703
******************************************
* *
* Welcome to WebGestaltR ! *
* *
******************************************
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet,
minNum = minNum, : No significant gene set is identified based on FDR 0.05!
NULL
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0 cowplot_1.0.0
[5] ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] bitops_1.0-6 matrixStats_0.57.0
[3] fs_1.3.1 bit64_4.0.5
[5] doParallel_1.0.16 progress_1.2.2
[7] httr_1.4.1 rprojroot_2.0.2
[9] GenomeInfoDb_1.20.0 doRNG_1.8.2
[11] tools_3.6.1 utf8_1.2.1
[13] R6_2.5.0 DBI_1.1.1
[15] BiocGenerics_0.30.0 colorspace_1.4-1
[17] withr_2.4.1 tidyselect_1.1.0
[19] prettyunits_1.0.2 bit_4.0.4
[21] curl_3.3 compiler_3.6.1
[23] git2r_0.26.1 Biobase_2.44.0
[25] DelayedArray_0.10.0 rtracklayer_1.44.0
[27] labeling_0.3 scales_1.1.0
[29] readr_1.4.0 apcluster_1.4.8
[31] stringr_1.4.0 digest_0.6.20
[33] Rsamtools_2.0.0 svglite_1.2.2
[35] rmarkdown_1.13 XVector_0.24.0
[37] pkgconfig_2.0.3 htmltools_0.3.6
[39] fastmap_1.1.0 BSgenome_1.52.0
[41] rlang_0.4.11 RSQLite_2.2.7
[43] generics_0.0.2 farver_2.1.0
[45] jsonlite_1.6 BiocParallel_1.18.0
[47] dplyr_1.0.7 VariantAnnotation_1.30.1
[49] RCurl_1.98-1.1 magrittr_2.0.1
[51] GenomeInfoDbData_1.2.1 Matrix_1.2-18
[53] Rcpp_1.0.6 munsell_0.5.0
[55] S4Vectors_0.22.1 fansi_0.5.0
[57] gdtools_0.1.9 lifecycle_1.0.0
[59] stringi_1.4.3 whisker_0.3-2
[61] yaml_2.2.0 SummarizedExperiment_1.14.1
[63] zlibbioc_1.30.0 plyr_1.8.4
[65] grid_3.6.1 blob_1.2.1
[67] parallel_3.6.1 promises_1.0.1
[69] crayon_1.4.1 lattice_0.20-38
[71] Biostrings_2.52.0 GenomicFeatures_1.36.3
[73] hms_1.1.0 knitr_1.23
[75] pillar_1.6.1 igraph_1.2.4.1
[77] GenomicRanges_1.36.0 rjson_0.2.20
[79] rngtools_1.5 codetools_0.2-16
[81] reshape2_1.4.3 biomaRt_2.40.1
[83] stats4_3.6.1 XML_3.98-1.20
[85] glue_1.4.2 evaluate_0.14
[87] data.table_1.14.0 foreach_1.5.1
[89] vctrs_0.3.8 httpuv_1.5.1
[91] gtable_0.3.0 purrr_0.3.4
[93] assertthat_0.2.1 cachem_1.0.5
[95] xfun_0.8 later_0.8.0
[97] tibble_3.1.2 iterators_1.0.13
[99] GenomicAlignments_1.20.1 AnnotationDbi_1.46.0
[101] memoise_2.0.0 IRanges_2.18.1
[103] workflowr_1.6.2 ellipsis_0.3.2