Last updated: 2021-09-13
Checks: 7 0
Knit directory: ctwas_applied/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20210726)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 72c8ef7. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
working directory clean
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/ukb-a-536_Heart_Left_Ventricle_known.Rmd
) and HTML (docs/ukb-a-536_Heart_Left_Ventricle_known.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 7e22565 | wesleycrouse | 2021-09-13 | updating reports |
Rmd | 3a7fbc1 | wesleycrouse | 2021-09-08 | generating reports for known annotations |
html | 3a7fbc1 | wesleycrouse | 2021-09-08 | generating reports for known annotations |
Rmd | cbf7408 | wesleycrouse | 2021-09-08 | adding enrichment to reports |
These are the results of a ctwas
analysis of the UK Biobank trait Diagnoses - main ICD10: I48 Atrial fibrillation and flutter
using Heart_Left_Ventricle
gene weights.
The GWAS was conducted by the Neale Lab, and the biomarker traits we analyzed are discussed here. Summary statistics were obtained from IEU OpenGWAS using GWAS ID: ukb-a-536
. Results were obtained from from IEU rather than Neale Lab because they are in a standardard format (GWAS VCF). Note that 3 of the 34 biomarker traits were not available from IEU and were excluded from analysis.
The weights are mashr GTEx v8 models on Heart_Left_Ventricle
eQTL obtained from PredictDB. We performed a full harmonization of the variants, including recovering strand ambiguous variants. This procedure is discussed in a separate document. (TO-DO: add report that describes harmonization)
LD matrices were computed from a 10% subset of Neale lab subjects. Subjects were matched using the plate and well information from genotyping. We included only biallelic variants with MAF>0.01 in the original Neale Lab GWAS. (TO-DO: add more details [number of subjects, variants, etc])
TO-DO: add enhanced QC reporting (total number of weights, why each variant was missing for all genes)
qclist_all <- list()
qc_files <- paste0(results_dir, "/", list.files(results_dir, pattern="exprqc.Rd"))
for (i in 1:length(qc_files)){
load(qc_files[i])
chr <- unlist(strsplit(rev(unlist(strsplit(qc_files[i], "_")))[1], "[.]"))[1]
qclist_all[[chr]] <- cbind(do.call(rbind, lapply(qclist,unlist)), as.numeric(substring(chr,4)))
}
qclist_all <- data.frame(do.call(rbind, qclist_all))
colnames(qclist_all)[ncol(qclist_all)] <- "chr"
rm(qclist, wgtlist, z_gene_chr)
#number of imputed weights
nrow(qclist_all)
[1] 10971
#number of imputed weights by chromosome
table(qclist_all$chr)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1078 769 621 445 494 650 544 404 412 451 645 591 199 362 370
16 17 18 19 20 21 22
529 681 157 850 330 119 270
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.7405888
#load ctwas results
ctwas_res <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.susieIrss.txt"))
#make unique identifier for regions
ctwas_res$region_tag <- paste(ctwas_res$region_tag1, ctwas_res$region_tag2, sep="_")
#load z scores for SNPs and collect sample size
load(paste0(results_dir, "/", analysis_id, "_expr_z_snp.Rd"))
sample_size <- z_snp$ss
sample_size <- as.numeric(names(which.max(table(sample_size))))
#compute PVE for each gene/SNP
ctwas_res$PVE = ctwas_res$susie_pip*ctwas_res$mu2/sample_size
#separate gene and SNP results
ctwas_gene_res <- ctwas_res[ctwas_res$type == "gene", ]
ctwas_gene_res <- data.frame(ctwas_gene_res)
ctwas_snp_res <- ctwas_res[ctwas_res$type == "SNP", ]
ctwas_snp_res <- data.frame(ctwas_snp_res)
#add gene information to results
sqlite <- RSQLite::dbDriver("SQLite")
db = RSQLite::dbConnect(sqlite, paste0("/project2/compbio/predictdb/mashr_models/mashr_", weight, ".db"))
query <- function(...) RSQLite::dbGetQuery(db, ...)
gene_info <- query("select gene, genename from extra")
gene_info <- query("select gene, genename, gene_type from extra")
RSQLite::dbDisconnect(db)
ctwas_gene_res <- cbind(ctwas_gene_res, gene_info[sapply(ctwas_gene_res$id, match, gene_info$gene), c("genename", "gene_type")])
#add z scores to results
load(paste0(results_dir, "/", analysis_id, "_expr_z_gene.Rd"))
ctwas_gene_res$z <- z_gene[ctwas_gene_res$id,]$z
z_snp <- z_snp[z_snp$id %in% ctwas_snp_res$id,]
ctwas_snp_res$z <- z_snp$z[match(ctwas_snp_res$id, z_snp$id)]
#formatting and rounding for tables
ctwas_gene_res$z <- round(ctwas_gene_res$z,2)
ctwas_snp_res$z <- round(ctwas_snp_res$z,2)
ctwas_gene_res$susie_pip <- round(ctwas_gene_res$susie_pip,3)
ctwas_snp_res$susie_pip <- round(ctwas_snp_res$susie_pip,3)
ctwas_gene_res$mu2 <- round(ctwas_gene_res$mu2,2)
ctwas_snp_res$mu2 <- round(ctwas_snp_res$mu2,2)
ctwas_gene_res$PVE <- signif(ctwas_gene_res$PVE, 2)
ctwas_snp_res$PVE <- signif(ctwas_snp_res$PVE, 2)
#merge gene and snp results with added information
ctwas_snp_res$genename=NA
ctwas_snp_res$gene_type=NA
ctwas_res <- rbind(ctwas_gene_res,
ctwas_snp_res[,colnames(ctwas_gene_res)])
#store columns to report
report_cols <- colnames(ctwas_gene_res)[!(colnames(ctwas_gene_res) %in% c("type", "region_tag1", "region_tag2", "cs_index", "gene_type", "z_flag", "id", "chrom", "pos"))]
first_cols <- c("genename", "region_tag")
report_cols <- c(first_cols, report_cols[!(report_cols %in% first_cols)])
report_cols_snps <- c("id", report_cols[-1])
#get number of SNPs from s1 results; adjust for thin
ctwas_res_s1 <- data.table::fread(paste0(results_dir, "/", analysis_id, "_ctwas.s1.susieIrss.txt"))
n_snps <- sum(ctwas_res_s1$type=="SNP")/thin
rm(ctwas_res_s1)
library(ggplot2)
library(cowplot)
********************************************************
Note: As of version 1.0.0, cowplot does not change the
default ggplot2 theme anymore. To recover the previous
behavior, execute:
theme_set(theme_cowplot())
********************************************************
load(paste0(results_dir, "/", analysis_id, "_ctwas.s2.susieIrssres.Rd"))
df <- data.frame(niter = rep(1:ncol(group_prior_rec), 2),
value = c(group_prior_rec[1,], group_prior_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_rec)))
df$group <- as.factor(df$group)
df$value[df$group=="SNP"] <- df$value[df$group=="SNP"]*thin #adjust parameter to account for thin argument
p_pi <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(pi)) +
ggtitle("Prior mean") +
theme_cowplot()
df <- data.frame(niter = rep(1:ncol(group_prior_var_rec), 2),
value = c(group_prior_var_rec[1,], group_prior_var_rec[2,]),
group = rep(c("Gene", "SNP"), each = ncol(group_prior_var_rec)))
df$group <- as.factor(df$group)
p_sigma2 <- ggplot(df, aes(x=niter, y=value, group=group)) +
geom_line(aes(color=group)) +
geom_point(aes(color=group)) +
xlab("Iteration") + ylab(bquote(sigma^2)) +
ggtitle("Prior variance") +
theme_cowplot()
plot_grid(p_pi, p_sigma2)
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
gene snp
0.0087835822 0.0001568083
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
gene snp
2.722269 2.483925
#report sample size
print(sample_size)
[1] 337199
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1] 10971 7535010
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
gene snp
0.0007779696 0.0087037244
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.00970027 0.14034857
#distribution of PIPs
hist(ctwas_gene_res$susie_pip, xlim=c(0,1), main="Distribution of Gene PIPs")
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
#genes with PIP>0.8 or 20 highest PIPs
head(ctwas_gene_res[order(-ctwas_gene_res$susie_pip),report_cols], max(sum(ctwas_gene_res$susie_pip>0.8), 20))
genename region_tag susie_pip mu2 PVE z
3725 SSPN 12_18 0.777 23.82 5.5e-05 -5.47
3147 WIPF1 2_105 0.628 23.59 4.4e-05 4.81
2401 PITX3 10_65 0.497 22.47 3.3e-05 -5.35
2380 SEC23IP 10_74 0.452 27.22 3.7e-05 -4.14
3169 PRRX1 1_84 0.445 26.36 3.5e-05 5.30
7794 KDM1B 6_14 0.436 20.51 2.7e-05 -4.75
3876 DEK 6_14 0.424 20.53 2.6e-05 -4.73
1374 SLC22A17 14_3 0.416 31.18 3.8e-05 -4.78
2987 GNB4 3_110 0.413 26.30 3.2e-05 -3.87
503 TBC1D22A 22_21 0.384 26.57 3.0e-05 3.91
7205 SRSF2 17_43 0.352 25.50 2.7e-05 -3.60
6373 NOCT 4_92 0.342 24.28 2.5e-05 3.54
7939 FBN1 15_19 0.328 24.15 2.3e-05 3.44
7567 LIPH 3_114 0.313 23.79 2.2e-05 3.85
9712 COPG1 3_80 0.312 23.02 2.1e-05 -3.66
844 SCARB1 12_76 0.308 23.23 2.1e-05 3.69
1298 LTBP4 19_28 0.304 23.51 2.1e-05 -3.38
1031 DDX43 6_51 0.273 21.92 1.8e-05 3.45
9646 OXTR 3_7 0.271 22.54 1.8e-05 3.12
7278 C1orf123 1_33 0.270 21.86 1.8e-05 -3.37
#plot PIP vs effect size
plot(ctwas_gene_res$susie_pip, ctwas_gene_res$mu2, xlab="PIP", ylab="mu^2", main="Gene PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
#genes with 20 largest effect sizes
head(ctwas_gene_res[order(-ctwas_gene_res$mu2),report_cols],20)
genename region_tag susie_pip mu2 PVE z
9455 STX19 3_59 0.000 62.50 4.1e-14 0.17
9450 NSUN3 3_59 0.000 42.21 9.3e-14 -1.48
9987 PROS1 3_59 0.000 33.99 7.7e-14 -2.06
1374 SLC22A17 14_3 0.416 31.18 3.8e-05 -4.78
2380 SEC23IP 10_74 0.452 27.22 3.7e-05 -4.14
503 TBC1D22A 22_21 0.384 26.57 3.0e-05 3.91
3169 PRRX1 1_84 0.445 26.36 3.5e-05 5.30
2987 GNB4 3_110 0.413 26.30 3.2e-05 -3.87
7205 SRSF2 17_43 0.352 25.50 2.7e-05 -3.60
7445 DCST2 1_76 0.180 24.60 1.3e-05 -4.49
6373 NOCT 4_92 0.342 24.28 2.5e-05 3.54
7939 FBN1 15_19 0.328 24.15 2.3e-05 3.44
9511 ZNF664 12_75 0.232 23.95 1.7e-05 3.65
3725 SSPN 12_18 0.777 23.82 5.5e-05 -5.47
12085 MIR4458HG 5_8 0.233 23.80 1.6e-05 -3.25
7567 LIPH 3_114 0.313 23.79 2.2e-05 3.85
3147 WIPF1 2_105 0.628 23.59 4.4e-05 4.81
1298 LTBP4 19_28 0.304 23.51 2.1e-05 -3.38
4518 EIF5A 17_7 0.185 23.36 1.3e-05 -3.45
11057 ZNF525 19_36 0.152 23.33 1.0e-05 2.97
#genes with 20 highest pve
head(ctwas_gene_res[order(-ctwas_gene_res$PVE),report_cols],20)
genename region_tag susie_pip mu2 PVE z
3725 SSPN 12_18 0.777 23.82 5.5e-05 -5.47
3147 WIPF1 2_105 0.628 23.59 4.4e-05 4.81
1374 SLC22A17 14_3 0.416 31.18 3.8e-05 -4.78
2380 SEC23IP 10_74 0.452 27.22 3.7e-05 -4.14
3169 PRRX1 1_84 0.445 26.36 3.5e-05 5.30
2401 PITX3 10_65 0.497 22.47 3.3e-05 -5.35
2987 GNB4 3_110 0.413 26.30 3.2e-05 -3.87
503 TBC1D22A 22_21 0.384 26.57 3.0e-05 3.91
7794 KDM1B 6_14 0.436 20.51 2.7e-05 -4.75
7205 SRSF2 17_43 0.352 25.50 2.7e-05 -3.60
3876 DEK 6_14 0.424 20.53 2.6e-05 -4.73
6373 NOCT 4_92 0.342 24.28 2.5e-05 3.54
7939 FBN1 15_19 0.328 24.15 2.3e-05 3.44
7567 LIPH 3_114 0.313 23.79 2.2e-05 3.85
9712 COPG1 3_80 0.312 23.02 2.1e-05 -3.66
844 SCARB1 12_76 0.308 23.23 2.1e-05 3.69
1298 LTBP4 19_28 0.304 23.51 2.1e-05 -3.38
7278 C1orf123 1_33 0.270 21.86 1.8e-05 -3.37
9646 OXTR 3_7 0.271 22.54 1.8e-05 3.12
10102 GPRIN3 4_60 0.270 22.45 1.8e-05 3.17
#genes with 20 largest z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
3725 SSPN 12_18 0.777 23.82 5.5e-05 -5.47
2401 PITX3 10_65 0.497 22.47 3.3e-05 -5.35
3169 PRRX1 1_84 0.445 26.36 3.5e-05 5.30
3147 WIPF1 2_105 0.628 23.59 4.4e-05 4.81
12761 RP1-79C4.4 1_84 0.231 23.10 1.6e-05 4.80
1374 SLC22A17 14_3 0.416 31.18 3.8e-05 -4.78
7794 KDM1B 6_14 0.436 20.51 2.7e-05 -4.75
3876 DEK 6_14 0.424 20.53 2.6e-05 -4.73
7445 DCST2 1_76 0.180 24.60 1.3e-05 -4.49
7134 ZBTB7B 1_76 0.046 15.00 2.0e-06 4.18
2380 SEC23IP 10_74 0.452 27.22 3.7e-05 -4.14
503 TBC1D22A 22_21 0.384 26.57 3.0e-05 3.91
2554 SLC2A9 4_11 0.254 22.50 1.7e-05 -3.88
2987 GNB4 3_110 0.413 26.30 3.2e-05 -3.87
7567 LIPH 3_114 0.313 23.79 2.2e-05 3.85
6811 FBXO32 8_80 0.178 23.26 1.2e-05 -3.80
7930 CMTM5 14_3 0.138 21.47 8.8e-06 -3.80
7381 MAIP1 2_118 0.127 19.06 7.2e-06 -3.71
844 SCARB1 12_76 0.308 23.23 2.1e-05 3.69
3052 STAM2 2_91 0.241 21.59 1.5e-05 3.68
#set nominal signifiance threshold for z scores
alpha <- 0.05
#bonferroni adjusted threshold for z scores
sig_thresh <- qnorm(1-(alpha/nrow(ctwas_gene_res)/2), lower=T)
#Q-Q plot for z scores
obs_z <- ctwas_gene_res$z[order(ctwas_gene_res$z)]
exp_z <- qnorm((1:nrow(ctwas_gene_res))/nrow(ctwas_gene_res))
plot(exp_z, obs_z, xlab="Expected z", ylab="Observed z", main="Gene z score Q-Q plot")
abline(a=0,b=1)
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
#plot z score vs PIP
plot(abs(ctwas_gene_res$z), ctwas_gene_res$susie_pip, xlab="abs(z)", ylab="PIP")
abline(v=sig_thresh, col="red", lty=2)
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)
[1] 0.0007291952
#genes with most significant z scores
head(ctwas_gene_res[order(-abs(ctwas_gene_res$z)),report_cols],20)
genename region_tag susie_pip mu2 PVE z
3725 SSPN 12_18 0.777 23.82 5.5e-05 -5.47
2401 PITX3 10_65 0.497 22.47 3.3e-05 -5.35
3169 PRRX1 1_84 0.445 26.36 3.5e-05 5.30
3147 WIPF1 2_105 0.628 23.59 4.4e-05 4.81
12761 RP1-79C4.4 1_84 0.231 23.10 1.6e-05 4.80
1374 SLC22A17 14_3 0.416 31.18 3.8e-05 -4.78
7794 KDM1B 6_14 0.436 20.51 2.7e-05 -4.75
3876 DEK 6_14 0.424 20.53 2.6e-05 -4.73
7445 DCST2 1_76 0.180 24.60 1.3e-05 -4.49
7134 ZBTB7B 1_76 0.046 15.00 2.0e-06 4.18
2380 SEC23IP 10_74 0.452 27.22 3.7e-05 -4.14
503 TBC1D22A 22_21 0.384 26.57 3.0e-05 3.91
2554 SLC2A9 4_11 0.254 22.50 1.7e-05 -3.88
2987 GNB4 3_110 0.413 26.30 3.2e-05 -3.87
7567 LIPH 3_114 0.313 23.79 2.2e-05 3.85
6811 FBXO32 8_80 0.178 23.26 1.2e-05 -3.80
7930 CMTM5 14_3 0.138 21.47 8.8e-06 -3.80
7381 MAIP1 2_118 0.127 19.06 7.2e-06 -3.71
844 SCARB1 12_76 0.308 23.23 2.1e-05 3.69
3052 STAM2 2_91 0.241 21.59 1.5e-05 3.68
ctwas_gene_res_sortz <- ctwas_gene_res[order(-abs(ctwas_gene_res$z)),]
n_plots <- 5
for (region_tag_plot in head(unique(ctwas_gene_res_sortz$region_tag), n_plots)){
ctwas_res_region <- ctwas_res[ctwas_res$region_tag==region_tag_plot,]
start <- min(ctwas_res_region$pos)
end <- max(ctwas_res_region$pos)
ctwas_res_region <- ctwas_res_region[order(ctwas_res_region$pos),]
ctwas_res_region_gene <- ctwas_res_region[ctwas_res_region$type=="gene",]
ctwas_res_region_snp <- ctwas_res_region[ctwas_res_region$type=="SNP",]
#region name
print(paste0("Region: ", region_tag_plot))
#table of genes in region
print(ctwas_res_region_gene[,report_cols])
par(mfrow=c(4,1))
#gene z scores
plot(ctwas_res_region_gene$pos, abs(ctwas_res_region_gene$z), xlab="Position", ylab="abs(gene_z)", xlim=c(start,end),
ylim=c(0,max(sig_thresh, abs(ctwas_res_region_gene$z))),
main=paste0("Region: ", region_tag_plot))
abline(h=sig_thresh,col="red",lty=2)
#significance threshold for SNPs
alpha_snp <- 5*10^(-8)
sig_thresh_snp <- qnorm(1-alpha_snp/2, lower=T)
#snp z scores
plot(ctwas_res_region_snp$pos, abs(ctwas_res_region_snp$z), xlab="Position", ylab="abs(snp_z)",xlim=c(start,end),
ylim=c(0,max(sig_thresh_snp, max(abs(ctwas_res_region_snp$z)))))
abline(h=sig_thresh_snp,col="purple",lty=2)
#gene pips
plot(ctwas_res_region_gene$pos, ctwas_res_region_gene$susie_pip, xlab="Position", ylab="Gene PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
#snp pips
plot(ctwas_res_region_snp$pos, ctwas_res_region_snp$susie_pip, xlab="Position", ylab="SNP PIP", xlim=c(start,end), ylim=c(0,1))
abline(h=0.8,col="blue",lty=2)
}
[1] "Region: 12_18"
genename region_tag susie_pip mu2 PVE z
13136 RP11-707G18.1 12_18 0.027 9.72 7.9e-07 -1.67
3723 RASSF8 12_18 0.082 17.62 4.3e-06 -2.57
3724 BHLHE41 12_18 0.068 19.13 3.9e-06 3.25
3725 SSPN 12_18 0.777 23.82 5.5e-05 -5.47
3726 ITPR2 12_18 0.013 4.17 1.6e-07 -0.53
600 INTS13 12_18 0.025 9.22 7.0e-07 -1.53
2749 FGFR1OP2 12_18 0.025 9.22 7.0e-07 1.53
601 TM7SF3 12_18 0.012 3.86 1.4e-07 -0.34
13030 RP11-582E3.6 12_18 0.016 5.79 2.7e-07 1.06
6540 MED21 12_18 0.021 7.97 5.0e-07 -1.43
11318 STK38L 12_18 0.014 4.70 1.9e-07 -0.65
333 ARNTL2 12_18 0.060 15.31 2.7e-06 -2.11
7909 SMCO2 12_18 0.016 5.58 2.6e-07 0.89
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
[1] "Region: 10_65"
genename region_tag susie_pip mu2 PVE z
7945 BTRC 10_65 0.020 3.68 2.1e-07 0.12
7947 DPCD 10_65 0.020 3.79 2.2e-07 -0.40
2398 FBXW4 10_65 0.055 10.72 1.7e-06 -1.55
2399 NPM3 10_65 0.028 8.47 6.9e-07 2.35
10914 MGEA5 10_65 0.020 4.08 2.4e-07 -0.72
3509 KCNIP2 10_65 0.020 3.83 2.3e-07 0.25
3508 C10orf76 10_65 0.025 6.54 4.9e-07 1.81
10975 LDB1 10_65 0.023 4.65 3.1e-07 0.50
6247 PPRC1 10_65 0.032 7.75 7.3e-07 1.55
7953 NOLC1 10_65 0.024 5.69 4.0e-07 1.36
3496 ELOVL3 10_65 0.027 6.89 5.6e-07 1.57
2401 PITX3 10_65 0.497 22.47 3.3e-05 -5.35
2402 GBF1 10_65 0.040 9.26 1.1e-06 1.61
557 PSD 10_65 0.030 7.14 6.3e-07 -1.42
2405 FBXL15 10_65 0.020 4.10 2.4e-07 0.89
2406 CUEDC2 10_65 0.024 5.45 3.8e-07 1.12
5214 MFSD13A 10_65 0.021 4.16 2.6e-07 0.52
2407 SUFU 10_65 0.021 4.24 2.7e-07 -0.46
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
[1] "Region: 1_84"
genename region_tag susie_pip mu2 PVE z
12761 RP1-79C4.4 1_84 0.231 23.10 1.6e-05 4.80
3169 PRRX1 1_84 0.445 26.36 3.5e-05 5.30
134 FMO3 1_84 0.015 3.82 1.7e-07 0.28
1412 FMO2 1_84 0.015 3.75 1.6e-07 0.28
195 FMO1 1_84 0.016 4.54 2.2e-07 -0.53
933 FMO4 1_84 0.014 3.67 1.6e-07 0.09
3311 PRRC2C 1_84 0.015 4.00 1.8e-07 -0.42
362 MYOC 1_84 0.015 3.77 1.6e-07 -0.23
174 METTL13 1_84 0.015 3.74 1.6e-07 0.24
10837 DNM3 1_84 0.020 6.02 3.6e-07 -0.94
4893 PIGC 1_84 0.015 3.71 1.6e-07 0.18
9656 C1orf105 1_84 0.014 3.67 1.6e-07 -0.04
1413 SUCO 1_84 0.016 4.51 2.2e-07 0.49
3315 FASLG 1_84 0.014 3.67 1.6e-07 0.06
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
[1] "Region: 2_105"
genename region_tag susie_pip mu2 PVE z
6480 PDK1 2_105 0.068 15.23 3.1e-06 -2.19
1348 MAP3K20 2_105 0.020 6.11 3.6e-07 0.96
5895 CDCA7 2_105 0.014 3.71 1.6e-07 0.20
8872 SP3 2_105 0.033 10.10 1.0e-06 -1.86
12726 RP11-394I13.2 2_105 0.038 11.02 1.2e-06 -2.00
5254 OLA1 2_105 0.082 17.22 4.2e-06 3.09
5892 SCRN3 2_105 0.078 16.66 3.9e-06 2.96
7442 GPR155 2_105 0.028 9.73 8.2e-07 -2.95
11646 AC010894.3 2_105 0.015 4.28 1.9e-07 1.46
3147 WIPF1 2_105 0.628 23.59 4.4e-05 4.81
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
[1] "Region: 14_3"
genename region_tag susie_pip mu2 PVE z
1629 ABHD4 14_3 0.014 4.82 2.1e-07 -0.68
6717 OXA1L 14_3 0.015 5.09 2.3e-07 -0.67
6843 MMP14 14_3 0.028 9.62 7.9e-07 -1.56
10731 LRP10 14_3 0.013 4.28 1.7e-07 0.52
1633 RBM23 14_3 0.012 3.66 1.3e-07 -0.06
1634 PRMT5 14_3 0.014 4.43 1.8e-07 0.55
1369 HAUS4 14_3 0.013 4.28 1.7e-07 0.44
4236 AJUBA 14_3 0.016 5.46 2.5e-07 -0.95
1679 C14orf93 14_3 0.012 3.66 1.3e-07 -0.07
1680 PSMB5 14_3 0.013 4.26 1.7e-07 -0.54
5420 CDH24 14_3 0.015 4.94 2.1e-07 0.94
1682 ACIN1 14_3 0.013 3.97 1.5e-07 0.55
1372 SLC7A8 14_3 0.012 3.71 1.4e-07 -0.17
11487 HOMEZ 14_3 0.041 12.56 1.5e-06 -1.90
4235 BCL2L2 14_3 0.013 4.11 1.6e-07 0.31
11842 PPP1R3E 14_3 0.067 16.32 3.2e-06 -2.51
1685 PABPN1 14_3 0.028 9.64 8.1e-07 -1.29
1374 SLC22A17 14_3 0.416 31.18 3.8e-05 -4.78
1686 EFS 14_3 0.019 6.90 3.9e-07 -1.40
7930 CMTM5 14_3 0.138 21.47 8.8e-06 -3.80
7929 IL25 14_3 0.014 4.31 1.7e-07 0.13
1371 MYH7 14_3 0.012 3.69 1.4e-07 -0.28
4231 NGDN 14_3 0.015 5.26 2.4e-07 -0.68
4966 ZFHX2 14_3 0.013 4.11 1.6e-07 -0.38
12378 THTPA 14_3 0.013 4.11 1.6e-07 -0.38
6848 DHRS4 14_3 0.015 5.08 2.3e-07 -0.75
1691 PCK2 14_3 0.013 4.10 1.6e-07 0.44
4243 NRL 14_3 0.013 4.15 1.6e-07 0.42
5425 FITM1 14_3 0.013 3.80 1.4e-07 -0.10
12372 RP11-468E2.5 14_3 0.013 3.92 1.5e-07 0.45
1694 EMC9 14_3 0.012 3.67 1.3e-07 -0.21
1375 RNF31 14_3 0.012 3.67 1.3e-07 0.06
1695 PSME2 14_3 0.099 19.09 5.6e-06 2.58
11409 IRF9 14_3 0.027 9.43 7.5e-07 -1.56
1698 TM9SF1 14_3 0.013 4.14 1.6e-07 0.49
13059 RP11-468E2.10 14_3 0.017 5.84 2.9e-07 -0.97
5423 TSSK4 14_3 0.013 4.22 1.7e-07 -0.43
12244 CHMP4A 14_3 0.046 13.41 1.8e-06 -1.97
1700 GMPR2 14_3 0.013 3.96 1.5e-07 -0.45
1384 TINF2 14_3 0.017 5.94 3.0e-07 -1.05
1383 TGM1 14_3 0.014 4.36 1.8e-07 -0.49
1702 RABGGTA 14_3 0.012 3.68 1.4e-07 -0.06
6853 DHRS1 14_3 0.169 23.20 1.2e-05 3.04
11405 LTB4R2 14_3 0.031 10.40 9.5e-07 -1.62
11404 LTB4R 14_3 0.016 5.58 2.7e-07 -0.97
4233 ADCY4 14_3 0.012 3.74 1.4e-07 -0.08
4232 RIPK3 14_3 0.012 3.66 1.3e-07 0.01
11294 NYNRIN 14_3 0.019 6.82 3.8e-07 1.11
1630 KHNYN 14_3 0.056 14.80 2.5e-06 2.12
5422 CBLN3 14_3 0.030 10.13 8.9e-07 -1.63
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
#snps with PIP>0.8 or 20 highest PIPs
head(ctwas_snp_res[order(-ctwas_snp_res$susie_pip),report_cols_snps],
max(sum(ctwas_snp_res$susie_pip>0.8), 20))
id region_tag susie_pip mu2 PVE z
36036 rs72700114 1_83 1.000 59.84 1.8e-04 8.86
102715 rs7568037 2_104 1.000 229.26 6.8e-04 3.74
146915 rs73146926 3_59 1.000 180.40 5.3e-04 3.17
272131 rs280460 5_92 1.000 238.16 7.1e-04 -3.73
673200 rs4257245 17_40 1.000 290.39 8.6e-04 -4.18
36035 rs12142529 1_83 0.999 31.73 9.4e-05 5.85
207639 rs186546224 4_72 0.978 34.36 1.0e-04 -3.71
272126 rs7714176 5_92 0.976 260.25 7.5e-04 3.80
650195 rs60602157 16_39 0.878 26.19 6.8e-05 6.10
207612 rs1906615 4_72 0.821 163.93 4.0e-04 18.79
522447 rs76540106 11_80 0.689 24.34 5.0e-05 -5.07
604044 rs6574343 14_36 0.619 28.15 5.2e-05 4.90
47761 rs80025148 1_109 0.608 28.45 5.1e-05 4.68
207629 rs1906613 4_72 0.585 32.23 5.6e-05 6.06
31838 rs34292822 1_76 0.574 40.02 6.8e-05 7.84
207613 rs75725917 4_72 0.565 150.64 2.5e-04 17.95
650169 rs16971447 16_39 0.546 27.30 4.4e-05 6.53
362075 rs729949 7_70 0.529 29.54 4.6e-05 -7.17
207581 rs1823291 4_72 0.490 80.81 1.2e-04 -11.44
554071 rs6489955 12_69 0.475 25.50 3.6e-05 5.67
#plot PIP vs effect size
plot(ctwas_snp_res$susie_pip, ctwas_snp_res$mu2, xlab="PIP", ylab="mu^2", main="SNP PIPs vs Effect Size")
Version | Author | Date |
---|---|---|
3a7fbc1 | wesleycrouse | 2021-09-08 |
#SNPs with 50 largest effect sizes
head(ctwas_snp_res[order(-ctwas_snp_res$mu2),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
673208 rs8066344 17_40 0.421 316.68 4.0e-04 4.19
673202 rs7220878 17_40 0.214 315.55 2.0e-04 4.10
673203 rs2366020 17_40 0.214 315.55 2.0e-04 4.10
673204 rs9302905 17_40 0.214 315.55 2.0e-04 4.10
673205 rs9302908 17_40 0.214 315.55 2.0e-04 4.10
673207 rs7224273 17_40 0.214 315.55 2.0e-04 4.10
673199 rs7216073 17_40 0.135 314.57 1.3e-04 4.06
673195 rs9909041 17_40 0.136 314.46 1.3e-04 4.07
673194 rs714769 17_40 0.102 313.84 9.5e-05 4.05
673200 rs4257245 17_40 1.000 290.39 8.6e-04 -4.18
272126 rs7714176 5_92 0.976 260.25 7.5e-04 3.80
272118 rs6883510 5_92 0.287 256.16 2.2e-04 3.71
102712 rs7576894 2_104 0.334 250.42 2.5e-04 -3.74
102716 rs4668385 2_104 0.319 250.39 2.4e-04 -3.77
102726 rs7572334 2_104 0.205 249.40 1.5e-04 -3.74
102731 rs3821084 2_104 0.203 249.39 1.5e-04 -3.74
102728 rs3731981 2_104 0.173 249.10 1.3e-04 -3.72
102729 rs2356781 2_104 0.169 249.05 1.2e-04 -3.72
102725 rs77634005 2_104 0.168 249.04 1.2e-04 -3.71
102733 rs17221367 2_104 0.163 248.98 1.2e-04 -3.71
102721 rs62183502 2_104 0.140 248.69 1.0e-04 -3.69
272131 rs280460 5_92 1.000 238.16 7.1e-04 -3.73
102715 rs7568037 2_104 1.000 229.26 6.8e-04 3.74
272125 rs11739522 5_92 0.000 228.51 3.6e-08 3.58
102717 rs28742013 2_104 0.002 228.37 1.2e-06 -4.15
272137 rs42685 5_92 0.000 226.36 9.4e-09 3.41
272133 rs6860519 5_92 0.000 225.86 5.5e-09 3.32
272134 rs1363743 5_92 0.000 196.45 3.7e-13 3.12
146915 rs73146926 3_59 1.000 180.40 5.3e-04 3.17
102707 rs62183465 2_104 0.000 171.94 1.6e-16 -2.57
272104 rs2431769 5_92 0.000 169.25 6.5e-18 2.46
146916 rs73146943 3_59 0.358 166.96 1.8e-04 -3.15
146917 rs73146950 3_59 0.345 166.88 1.7e-04 -3.15
146918 rs112404167 3_59 0.344 166.87 1.7e-04 -3.14
146920 rs6774870 3_59 0.311 166.59 1.5e-04 -3.13
146922 rs73145304 3_59 0.304 166.23 1.5e-04 -3.14
146921 rs73149115 3_59 0.240 165.88 1.2e-04 -3.09
146914 rs12638777 3_59 0.236 165.74 1.2e-04 -3.09
146913 rs2048521 3_59 0.227 165.64 1.1e-04 -3.08
207612 rs1906615 4_72 0.821 163.93 4.0e-04 18.79
146911 rs12634269 3_59 0.167 159.80 7.9e-05 -3.25
207611 rs2634074 4_72 0.179 159.46 8.5e-05 -18.71
146925 rs819293 3_59 0.042 153.38 1.9e-05 -3.06
146945 rs73153258 3_59 0.011 153.33 5.1e-06 -2.76
146948 rs73153283 3_59 0.014 153.01 6.5e-06 -2.82
207613 rs75725917 4_72 0.565 150.64 2.5e-04 17.95
146899 rs73137393 3_59 0.043 149.70 1.9e-05 -3.25
146908 rs73139131 3_59 0.015 148.27 6.5e-06 -3.05
146906 rs17801380 3_59 0.014 148.21 6.3e-06 -3.03
207610 rs1906611 4_72 0.310 148.11 1.4e-04 17.84
#SNPs with 50 highest pve
head(ctwas_snp_res[order(-ctwas_snp_res$PVE),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
673200 rs4257245 17_40 1.000 290.39 8.6e-04 -4.18
272126 rs7714176 5_92 0.976 260.25 7.5e-04 3.80
272131 rs280460 5_92 1.000 238.16 7.1e-04 -3.73
102715 rs7568037 2_104 1.000 229.26 6.8e-04 3.74
146915 rs73146926 3_59 1.000 180.40 5.3e-04 3.17
207612 rs1906615 4_72 0.821 163.93 4.0e-04 18.79
673208 rs8066344 17_40 0.421 316.68 4.0e-04 4.19
102712 rs7576894 2_104 0.334 250.42 2.5e-04 -3.74
207613 rs75725917 4_72 0.565 150.64 2.5e-04 17.95
102716 rs4668385 2_104 0.319 250.39 2.4e-04 -3.77
272118 rs6883510 5_92 0.287 256.16 2.2e-04 3.71
673202 rs7220878 17_40 0.214 315.55 2.0e-04 4.10
673203 rs2366020 17_40 0.214 315.55 2.0e-04 4.10
673204 rs9302905 17_40 0.214 315.55 2.0e-04 4.10
673205 rs9302908 17_40 0.214 315.55 2.0e-04 4.10
673207 rs7224273 17_40 0.214 315.55 2.0e-04 4.10
36036 rs72700114 1_83 1.000 59.84 1.8e-04 8.86
146916 rs73146943 3_59 0.358 166.96 1.8e-04 -3.15
146917 rs73146950 3_59 0.345 166.88 1.7e-04 -3.15
146918 rs112404167 3_59 0.344 166.87 1.7e-04 -3.14
102726 rs7572334 2_104 0.205 249.40 1.5e-04 -3.74
102731 rs3821084 2_104 0.203 249.39 1.5e-04 -3.74
146920 rs6774870 3_59 0.311 166.59 1.5e-04 -3.13
146922 rs73145304 3_59 0.304 166.23 1.5e-04 -3.14
207610 rs1906611 4_72 0.310 148.11 1.4e-04 17.84
102728 rs3731981 2_104 0.173 249.10 1.3e-04 -3.72
673195 rs9909041 17_40 0.136 314.46 1.3e-04 4.07
673199 rs7216073 17_40 0.135 314.57 1.3e-04 4.06
102725 rs77634005 2_104 0.168 249.04 1.2e-04 -3.71
102729 rs2356781 2_104 0.169 249.05 1.2e-04 -3.72
102733 rs17221367 2_104 0.163 248.98 1.2e-04 -3.71
146914 rs12638777 3_59 0.236 165.74 1.2e-04 -3.09
146921 rs73149115 3_59 0.240 165.88 1.2e-04 -3.09
207581 rs1823291 4_72 0.490 80.81 1.2e-04 -11.44
146913 rs2048521 3_59 0.227 165.64 1.1e-04 -3.08
102721 rs62183502 2_104 0.140 248.69 1.0e-04 -3.69
207639 rs186546224 4_72 0.978 34.36 1.0e-04 -3.71
673194 rs714769 17_40 0.102 313.84 9.5e-05 4.05
36035 rs12142529 1_83 0.999 31.73 9.4e-05 5.85
207611 rs2634074 4_72 0.179 159.46 8.5e-05 -18.71
207584 rs2044674 4_72 0.339 79.55 8.0e-05 11.37
146911 rs12634269 3_59 0.167 159.80 7.9e-05 -3.25
31838 rs34292822 1_76 0.574 40.02 6.8e-05 7.84
650195 rs60602157 16_39 0.878 26.19 6.8e-05 6.10
207629 rs1906613 4_72 0.585 32.23 5.6e-05 6.06
604044 rs6574343 14_36 0.619 28.15 5.2e-05 4.90
31837 rs34515871 1_76 0.439 39.14 5.1e-05 7.79
47761 rs80025148 1_109 0.608 28.45 5.1e-05 4.68
522447 rs76540106 11_80 0.689 24.34 5.0e-05 -5.07
362075 rs729949 7_70 0.529 29.54 4.6e-05 -7.17
#SNPs with 50 largest z scores
head(ctwas_snp_res[order(-abs(ctwas_snp_res$z)),report_cols_snps],50)
id region_tag susie_pip mu2 PVE z
207612 rs1906615 4_72 0.821 163.93 4.0e-04 18.79
207611 rs2634074 4_72 0.179 159.46 8.5e-05 -18.71
207613 rs75725917 4_72 0.565 150.64 2.5e-04 17.95
207610 rs1906611 4_72 0.310 148.11 1.4e-04 17.84
207602 rs4529121 4_72 0.038 143.15 1.6e-05 17.46
207607 rs74496596 4_72 0.020 140.26 8.3e-06 17.43
207606 rs76013973 4_72 0.019 140.11 7.9e-06 17.42
207603 rs12647316 4_72 0.018 139.73 7.3e-06 17.41
207608 rs12639820 4_72 0.014 138.98 6.0e-06 17.39
207609 rs28521134 4_72 0.005 135.34 2.2e-06 17.28
207604 rs12647393 4_72 0.006 135.66 2.3e-06 17.27
207605 rs10019689 4_72 0.006 135.62 2.2e-06 17.27
207596 rs12650829 4_72 0.000 105.03 1.3e-08 14.63
207593 rs60197024 4_72 0.000 99.73 7.3e-09 14.56
207601 rs12644107 4_72 0.000 54.01 8.6e-11 11.66
207617 rs6533530 4_72 0.000 71.87 5.8e-10 -11.49
207581 rs1823291 4_72 0.490 80.81 1.2e-04 -11.44
207587 rs12505886 4_72 0.000 46.15 7.4e-11 -11.39
207618 rs3866831 4_72 0.000 69.41 4.5e-10 -11.38
207584 rs2044674 4_72 0.339 79.55 8.0e-05 11.37
207588 rs112927894 4_72 0.000 44.27 5.2e-11 11.34
207589 rs2723301 4_72 0.000 42.52 4.6e-11 -11.27
207590 rs2595074 4_72 0.000 41.97 4.4e-11 -11.25
207591 rs2723302 4_72 0.000 41.92 4.4e-11 -11.24
207585 rs2723296 4_72 0.171 77.62 3.9e-05 -11.21
207592 rs2218698 4_72 0.000 41.43 4.3e-11 -11.20
207594 rs1584429 4_72 0.000 41.50 4.2e-11 -11.15
207595 rs1448799 4_72 0.000 42.86 4.6e-11 11.14
207597 rs2197814 4_72 0.000 42.92 4.6e-11 11.14
207599 rs2723318 4_72 0.000 41.89 4.3e-11 -11.09
36036 rs72700114 1_83 1.000 59.84 1.8e-04 8.86
207583 rs12642151 4_72 0.000 60.11 1.7e-08 8.76
207576 rs17553596 4_72 0.000 50.67 3.3e-10 7.94
31838 rs34292822 1_76 0.574 40.02 6.8e-05 7.84
31837 rs34515871 1_76 0.439 39.14 5.1e-05 7.79
207637 rs17513625 4_72 0.000 46.41 9.2e-09 7.36
362075 rs729949 7_70 0.529 29.54 4.6e-05 -7.17
362064 rs4727833 7_70 0.124 32.07 1.2e-05 7.15
362076 rs3807990 7_70 0.463 29.08 4.0e-05 -7.14
31840 rs12058931 1_76 0.152 45.49 2.1e-05 7.13
207619 rs4032974 4_72 0.000 37.07 5.2e-11 -6.96
207615 rs4124159 4_72 0.000 36.84 5.1e-11 -6.94
207616 rs10006881 4_72 0.000 36.84 5.1e-11 -6.94
207614 rs1906606 4_72 0.000 36.64 5.0e-11 -6.93
362054 rs1007751 7_70 0.092 28.85 7.9e-06 6.77
362055 rs717957 7_70 0.088 28.62 7.4e-06 6.76
362056 rs28557111 7_70 0.089 28.67 7.5e-06 6.76
362050 rs76160518 7_70 0.113 29.81 1.0e-05 6.74
362046 rs12706089 7_70 0.100 29.28 8.7e-06 6.72
362034 rs2157799 7_70 0.202 32.55 1.9e-05 6.64
#GO enrichment analysis
library(enrichR)
Welcome to enrichR
Checking connection ...
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
dbs <- c("GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021")
genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
#number of genes for gene set enrichment
length(genes)
[1] 0
if (length(genes)>0){
GO_enrichment <- enrichr(genes, dbs)
for (db in dbs){
print(db)
df <- GO_enrichment[[db]]
df <- df[df$Adjusted.P.value<0.05,c("Term", "Overlap", "Adjusted.P.value", "Genes")]
print(df)
}
#DisGeNET enrichment
# devtools::install_bitbucket("ibi_group/disgenet2r")
library(disgenet2r)
disgenet_api_key <- get_disgenet_api_key(
email = "wesleycrouse@gmail.com",
password = "uchicago1" )
Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
res_enrich <-disease_enrichment(entities=genes, vocabulary = "HGNC",
database = "CURATED" )
df <- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")]
print(df)
#WebGestalt enrichment
library(WebGestaltR)
background <- ctwas_gene_res$genename
#listGeneSet()
databases <- c("pathway_KEGG", "disease_GLAD4U", "disease_OMIM")
enrichResult <- WebGestaltR(enrichMethod="ORA", organism="hsapiens",
interestGene=genes, referenceGene=background,
enrichDatabase=databases, interestGeneType="genesymbol",
referenceGeneType="genesymbol", isOutput=F)
print(enrichResult[,c("description", "size", "overlap", "FDR", "database", "userId")])
}
# library("readxl")
#
# known_annotations <- read_xlsx("data/summary_known_genes_annotations.xlsx", sheet="BMI")
# known_annotations <- unique(known_annotations$`Gene Symbol`)
#
# unrelated_genes <- ctwas_gene_res$genename[!(ctwas_gene_res$genename %in% known_annotations)]
#
# #number of genes in known annotations
# print(length(known_annotations))
#
# #number of genes in known annotations with imputed expression
# print(sum(known_annotations %in% ctwas_gene_res$genename))
#
# #assign ctwas, TWAS, and bystander genes
# ctwas_genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>0.8]
# twas_genes <- ctwas_gene_res$genename[abs(ctwas_gene_res$z)>sig_thresh]
# novel_genes <- ctwas_genes[!(ctwas_genes %in% twas_genes)]
#
# #significance threshold for TWAS
# print(sig_thresh)
#
# #number of ctwas genes
# length(ctwas_genes)
#
# #number of TWAS genes
# length(twas_genes)
#
# #show novel genes (ctwas genes with not in TWAS genes)
# ctwas_gene_res[ctwas_gene_res$genename %in% novel_genes,report_cols]
#
# #sensitivity / recall
# sensitivity <- rep(NA,2)
# names(sensitivity) <- c("ctwas", "TWAS")
# sensitivity["ctwas"] <- sum(ctwas_genes %in% known_annotations)/length(known_annotations)
# sensitivity["TWAS"] <- sum(twas_genes %in% known_annotations)/length(known_annotations)
# sensitivity
#
# #specificity
# specificity <- rep(NA,2)
# names(specificity) <- c("ctwas", "TWAS")
# specificity["ctwas"] <- sum(!(unrelated_genes %in% ctwas_genes))/length(unrelated_genes)
# specificity["TWAS"] <- sum(!(unrelated_genes %in% twas_genes))/length(unrelated_genes)
# specificity
#
# #precision / PPV
# precision <- rep(NA,2)
# names(precision) <- c("ctwas", "TWAS")
# precision["ctwas"] <- sum(ctwas_genes %in% known_annotations)/length(ctwas_genes)
# precision["TWAS"] <- sum(twas_genes %in% known_annotations)/length(twas_genes)
# precision
#
# #ROC curves
#
# pip_range <- (0:1000)/1000
# sensitivity <- rep(NA, length(pip_range))
# specificity <- rep(NA, length(pip_range))
#
# for (index in 1:length(pip_range)){
# pip <- pip_range[index]
# ctwas_genes <- ctwas_gene_res$genename[ctwas_gene_res$susie_pip>=pip]
# sensitivity[index] <- sum(ctwas_genes %in% known_annotations)/length(known_annotations)
# specificity[index] <- sum(!(unrelated_genes %in% ctwas_genes))/length(unrelated_genes)
# }
#
# plot(1-specificity, sensitivity, type="l", xlim=c(0,1), ylim=c(0,1))
#
# sig_thresh_range <- seq(from=0, to=max(abs(ctwas_gene_res$z)), length.out=length(pip_range))
#
# for (index in 1:length(sig_thresh_range)){
# sig_thresh_plot <- sig_thresh_range[index]
# twas_genes <- ctwas_gene_res$genename[abs(ctwas_gene_res$z)>=sig_thresh_plot]
# sensitivity[index] <- sum(twas_genes %in% known_annotations)/length(known_annotations)
# specificity[index] <- sum(!(unrelated_genes %in% twas_genes))/length(unrelated_genes)
# }
#
# lines(1-specificity, sensitivity, xlim=c(0,1), ylim=c(0,1), col="red", lty=2)
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] enrichR_3.0 cowplot_1.0.0 ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] tidyselect_1.1.0 xfun_0.8 purrr_0.3.4
[4] colorspace_1.4-1 vctrs_0.3.8 generics_0.0.2
[7] htmltools_0.3.6 yaml_2.2.0 utf8_1.2.1
[10] blob_1.2.1 rlang_0.4.11 later_0.8.0
[13] pillar_1.6.1 glue_1.4.2 withr_2.4.1
[16] DBI_1.1.1 bit64_4.0.5 lifecycle_1.0.0
[19] stringr_1.4.0 munsell_0.5.0 gtable_0.3.0
[22] workflowr_1.6.2 memoise_2.0.0 evaluate_0.14
[25] labeling_0.3 knitr_1.23 fastmap_1.1.0
[28] httpuv_1.5.1 curl_3.3 fansi_0.5.0
[31] Rcpp_1.0.6 promises_1.0.1 scales_1.1.0
[34] cachem_1.0.5 farver_2.1.0 fs_1.3.1
[37] bit_4.0.4 rjson_0.2.20 digest_0.6.20
[40] stringi_1.4.3 dplyr_1.0.7 grid_3.6.1
[43] rprojroot_2.0.2 tools_3.6.1 magrittr_2.0.1
[46] RSQLite_2.2.7 tibble_3.1.2 crayon_1.4.1
[49] whisker_0.3-2 pkgconfig_2.0.3 ellipsis_0.3.2
[52] data.table_1.14.0 rmarkdown_1.13 httr_1.4.1
[55] R6_2.5.0 git2r_0.26.1 compiler_3.6.1