We describe the design and scope of @mrbase2 (mrbase.org): a platform that integrates a database of complete GWAS results (no restrictions according to p-values) with an API, webapp and R packages that automate Mendelian randomization (MR) using GWAS results 2/n
New postdoc opportunity in Health Data Science at the MRC Integrative Epidemiology Unit (@mrcieu) at University of Bristol. Get in touch if interested! bristol.ac.uk/jobs/find/deta…
Join me at #ElasticCC where I'll be talking about how we use Elasticsearch for @OpenGwas. This is @Elastic’s free technical event from the community, for the community — happening from Feb 26 - 27 ela.st/community-conference
We're experiencing problems with the mrbase web app at the moment. You can continue to access the data via github.com/MRCIEU/TwoSampleM… and @OpenGwas while we work on a fix
Analytical tools - The gwasglue R package connects the local or cloud data sources, as well as cloud hosted LD reference panels, to a range of third party tools. Fine mapping, colocalisation, MR, genetic correlations, PheWAS, etc. mrcieu.github.io/gwasglue/
Local querying - All the data can also be downloaded as GWAS VCF files. There is a need for a storage format that is unambiguous, self contained and performant for querying. We designed this for large scale analysis on compute clusters bit.ly/33VbOfL
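To illustrate what "self contained" means for a VCF-style record, here is a minimal Python sketch that reads one data line and pairs the FORMAT keys with the per-study values. The keys `ES`, `SE` and `LP` (effect size, standard error, -log10 p-value) are assumed here and should be checked against the published specification; the example line is made up, and `parse_gwas_vcf_line` is a hypothetical helper, not part of any released package.

```python
from typing import Dict


def parse_gwas_vcf_line(line: str) -> Dict[str, object]:
    """Parse one tab-separated VCF data line with a single sample column."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, rsid, ref, alt = fields[:5]

    # Column 9 lists the FORMAT keys; column 10 holds the matching values.
    fmt_keys = fields[8].split(":")
    values = fields[9].split(":")
    sample = dict(zip(fmt_keys, values))

    record: Dict[str, object] = {
        "chrom": chrom,
        "pos": int(pos),
        "id": rsid,
        "ref": ref,
        "alt": alt,
    }

    # Assumed keys: ES = effect size, SE = standard error, LP = -log10(p).
    for key in ("ES", "SE", "LP"):
        if key in sample and sample[key] != ".":
            record[key] = float(sample[key])

    # Convert -log10(p) back to a p-value when it is present.
    if "LP" in record:
        record["pval"] = 10 ** (-float(record["LP"]))
    return record


# Made-up example line, for illustration only.
example = "1\t1000\trs0000\tA\tG\t.\tPASS\t.\tES:SE:LP\t0.5:0.1:2.0"
rec = parse_gwas_vcf_line(example)
```

Because alleles, effect and precision travel together in one line, a record can be interpreted without a separate metadata file.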
Cloud querying - we use Elasticsearch to store the data, hosted in collaboration with Oracle Cloud Infrastructure. We worked hard to optimise queries, allowing rapid search by position, rsid, range, p-value. This can be done in R, Python or through a RESTful API.
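As a rough idea of what a range plus p-value search looks like in Elasticsearch's query DSL, here is a minimal Python sketch that builds the request body. The field names (`chr`, `position`, `p`) and the helper `build_region_query` are hypothetical; they are not the actual OpenGWAS index mapping or API.

```python
from typing import Any, Dict, List, Optional


def build_region_query(
    chrom: str,
    start: int,
    end: int,
    max_pval: Optional[float] = None,
) -> Dict[str, Any]:
    """Build a bool/filter query body for one genomic range.

    Field names are assumptions made for illustration only.
    """
    filters: List[Dict[str, Any]] = [
        {"term": {"chr": chrom}},
        {"range": {"position": {"gte": start, "lte": end}}},
    ]
    # Optionally keep only associations at or below a p-value threshold.
    if max_pval is not None:
        filters.append({"range": {"p": {"lte": max_pval}}})
    return {"query": {"bool": {"filter": filters}}}


query = build_region_query("1", 1_000_000, 1_100_000, max_pval=5e-8)
```

Filter clauses are used rather than scored clauses because a lookup by position or p-value needs exact matching, not relevance ranking.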
QC process - we align the non-effect allele to the human genome reference sequence and annotate the positions with dbSNP identifiers. Example QC report: gwas.mrcieu.ac.uk/datasets/u…
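The allele alignment step can be sketched in a few lines of Python. This is a simplified illustration, not the actual pipeline: `Assoc` and `align_to_reference` are hypothetical names, and strand flips, indels and palindromic variants are deliberately not handled.

```python
from typing import NamedTuple, Optional


class Assoc(NamedTuple):
    effect_allele: str
    other_allele: str
    beta: float
    eaf: float  # effect allele frequency


def align_to_reference(rec: Assoc, ref: str) -> Optional[Assoc]:
    """Return the record with the non-effect allele equal to `ref`.

    Returns None when neither allele matches the reference base.
    """
    ea = rec.effect_allele.upper()
    oa = rec.other_allele.upper()
    ref = ref.upper()

    if oa == ref:
        # Already aligned: the non-effect allele is the reference allele.
        return Assoc(ea, oa, rec.beta, rec.eaf)
    if ea == ref:
        # Swap alleles, so the effect sign and allele frequency flip too.
        return Assoc(oa, ea, -rec.beta, 1.0 - rec.eaf)
    return None


aligned = align_to_reference(Assoc("A", "G", 0.5, 0.25), ref="A")
```

Here the effect allele `A` matches the reference, so the record is flipped to report the effect of `G` instead.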
Data sources - To date, the summary data have been manually harvested/pooled from various GWAS consortia and biobanks. Many thanks to the studies that have made their summary data available!
OpenGWAS (gwas.mrcieu.ac.uk/) is a harmonised database of complete GWAS summary data (126 billion associations, almost 15000 complete datasets) with programmatic connections to a range of third party analytical tools.
Continued data harvesting - for new GWAS results that are published, upload them to the EBI GWAS catalog and we will pull them in from there. For unpublished results, or large batches, we have pipelines to do that. Get in touch here: github.com/mrcieu/opengwas-r…
We built the OpenGWAS resource to be a free and open platform to support work on GWAS summary data. Much of it is based on extensive feedback on @mrbase2 from both internal and external colleagues. Paper here: bit.ly/2DRbHGT, key points below:
Paper from Matt Lyon @mrc_ieu describing an adapted variant call format (VCF) for efficient and robust storage of GWAS summary statistics disq.us/t/3p55pl6
Guidelines on performing Mendelian randomization investigations written by an all-star line-up of MR researchers are now available on Wellcome Open Research: wellcomeopenresearch.org/art… - represents a consensus statement after 12 months of deliberation. Comments welcome!
Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases biorxiv.org/content/10.1101/…
New MR paper suggests proteins with MR and colocalization evidence are more likely to predict drug trial success