Wednesday, November 7, 2018

Meet the Knowledge Portal team at AHA

This weekend, cardiovascular researchers from around the globe will be meeting in Chicago for the 2018 Scientific Sessions of the American Heart Association. Members of the Knowledge Portal Network team will be there to meet the geneticists and biologists who use the Portals and to hear your input on how we can improve them.

Please come visit us at booth #2249 in the Exhibit Hall! We'll be there on Saturday, Nov. 10 from 11am-5pm; on Sunday, Nov. 11 from 10am-4:30pm; and on Monday, Nov. 12 from 10am-3pm.

Friday, October 26, 2018

New features in the CDKP

Today the Cerebrovascular Disease Knowledge Portal has several new features that help bring meaning to genetic association results.

Calculated credible sets

Credible sets are useful because they assign to individual variants in a locus a probability of being causal for a phenotype. On Gene Pages, when viewing the type 2 diabetes (T2D) phenotype, the Credible sets tab displays credible sets generated by the MAGIC consortium. However, credible sets have not been generated by researchers for phenotypes in the CDKP other than T2D.

Now, the CDKP provides calculated credible sets for all phenotypes. When viewing a phenotype other than T2D on the Gene page, the Credible sets tab is replaced by a Calculated credible set tab. This LocusZoom module, developed by researchers at the University of Michigan, automatically calculates posterior probabilities from p-values. Calculated credible sets include up to 10 variants; the credible interval covered by the set may vary, depending on the strength of associations across the region.
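For readers curious how posterior probabilities can be derived from p-values alone, here is a minimal Python sketch. It is not the LocusZoom module's code. It assumes a single causal variant in the region, equal prior probability for every variant, and an approximate Bayes factor proportional to exp(z²/2), where z is the normal score implied by a two-sided p-value. The function names and variant IDs are hypothetical.

```python
import math
from statistics import NormalDist


def posterior_probabilities(p_values):
    """Turn two-sided p-values (each in (0, 1]) into posterior probabilities.

    Assumptions (illustrative only): one causal variant per region, equal
    priors, and an approximate Bayes factor proportional to exp(z^2 / 2).
    """
    std_normal = NormalDist()
    log_bayes_factors = []
    for p in p_values:
        z = std_normal.inv_cdf(p / 2.0)  # normal score for a two-sided p-value
        log_bayes_factors.append(z * z / 2.0)
    # Subtract the maximum before exponentiating to avoid overflow.
    largest = max(log_bayes_factors)
    weights = [math.exp(value - largest) for value in log_bayes_factors]
    total = sum(weights)
    return [w / total for w in weights]


def top_variant_set(variant_ids, p_values, max_variants=10):
    """Return the highest-probability variants (at most max_variants)
    together with the total posterior probability they cover."""
    probabilities = posterior_probabilities(p_values)
    ranked = sorted(zip(variant_ids, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    chosen = ranked[:max_variants]
    coverage = sum(prob for _, prob in chosen)
    return chosen, coverage


if __name__ == "__main__":
    ids = ["variant_a", "variant_b", "variant_c"]  # hypothetical IDs
    chosen, coverage = top_variant_set(ids, [1e-10, 1e-2, 0.5])
    print(chosen, coverage)
```

Because the set is capped at a fixed number of variants, the coverage it reports depends on how concentrated the association signal is: one dominant variant gives coverage close to 1, while many similar p-values spread the probability thinly.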

UK Biobank PheWAS

The PheWAS display in the "Associations at a glance" section of Variant pages (see an example) is another LocusZoom module for displaying phenome-wide associations. The default PheWAS plot on the Variant page shows associations for a variant across all of the phenotypes included in the CDKP.

Now, by checking the "Use UKBB data" box, you can view associations for a variant across about 1,400 UK Biobank phenotypes from an analysis performed at the University of Michigan.


New LocusZoom visualization shows variant associations across UK Biobank phenotypes


Forest plot visualization of variant associations

We provide yet another LocusZoom visualization on a separate tab of the "Associations at a glance" section of the Variant page. The Forest plot is an alternative way to visualize phenotypic associations for a variant. In addition to displaying the significance of variant associations, the Forest plot also shows their direction of effect and confidence interval.

Forest plot on the Variant page
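To make the idea of a confidence interval in a forest plot concrete, here is a small Python sketch of a standard Wald interval computed from an effect estimate and its standard error. This is a generic textbook calculation, not necessarily the one the LocusZoom visualization uses. The function names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def wald_interval(beta, standard_error, level=0.95):
    """Confidence interval for an effect estimate: beta +/- z * SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return beta - z * standard_error, beta + z * standard_error


def as_odds_ratio(beta, standard_error, level=0.95):
    """Exponentiate a log-odds estimate and its interval bounds."""
    low, high = wald_interval(beta, standard_error, level)
    return math.exp(beta), math.exp(low), math.exp(high)


if __name__ == "__main__":
    # Hypothetical log-odds effect and standard error.
    print(wald_interval(0.10, 0.02))
    print(as_odds_ratio(0.10, 0.02))
```

An interval that lies entirely above zero (or above 1 on the odds-ratio scale) indicates a positive direction of effect; one that spans zero is compatible with no effect.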


Check out a new portal!

We've just launched a new member of the Knowledge Portal Network: the Sleep Disorder Knowledge Portal for the genetics of sleep and circadian traits. Find a link to it on the CDKP home page:




We hope you enjoy the new data and features in the CDKP. Please contact us any time with suggestions or questions!


Monday, October 15, 2018

Connect with the Knowledge Portal Network team at ASHG!

Next week, the human genetics research community will come together in San Diego for one of the most important conferences of the year: the annual American Society of Human Genetics meeting. The Knowledge Portal Network team will be there, and in addition to presenting all the new data and features in the Type 2 Diabetes, Cerebrovascular Disease, and Cardiovascular Disease Knowledge Portals (KPs), we'll be launching an entirely new Portal for the genetics of sleep disorders!

We'll also present an interactive workshop on Friday that will go over the basics of navigating the Knowledge Portal Network. Download the flyer here, and find more details below.

Here's the schedule of events for the week:

Tuesday, October 16
2:05-2:30 pm: Jason Flannick will present a talk, "Infrastructure for analyzing and disseminating large-scale genetic data for type 2 diabetes and other complex diseases," in the ASHG/IGES/ISCB Joint Symposium.
Room 6C - Upper Level/San Diego Convention Center

Wednesday, October 17
The Knowledge Portal team will be at our booth, #219, in the exhibit hall from 10am-4:30pm.
We'll also be at the Broad Institute Genomic Services booth, #1634, from 10:30-11:30am.

Thursday, October 18
The team will again be at our booth, #219, in the exhibit hall from 10am-4:30pm.

Friday, October 19
We'll again be at our booth, #219, in the exhibit hall from 10am-4:30pm, but today the booth will be closed around lunchtime so that we can present a special tutorial session on the Knowledge Portals. See details and sign up below. After the session, we'll be back at our booth until 4:30pm and will also be at the Broad Institute Genomic Services booth, #1634, from 2:30-3:30pm.

At lunchtime on Friday, grab your laptop and come to a workshop on the Knowledge Portals:

Navigating complex disease genetics: using the Knowledge Portal Network to move from SNPs to functional insights
12:30-1:45pm
Room 28C, Upper Level, San Diego Convention Center

We'll go over some basics, illustrate workflows, and answer questions about how you can use KPs to investigate SNPs, genes, or regions of interest and turn genetic data into insights about complex diseases.

Please sign up so we can plan for refreshments. We'll send you a reminder a few days beforehand. We look forward to seeing you there! Please contact us with any questions or suggestions for topics you'd like to discuss.



Wednesday, September 26, 2018

New data and new features in the CDKP

We are pleased to announce the addition of two new summary-level datasets to the Cerebrovascular Disease Knowledge Portal. MEGASTROKE is a genome-wide association study of ~520,000 subjects, including controls. Using stroke risk scores and LD scores, this study discovered 22 novel stroke-related loci. With the addition of this study, the sample size for stroke-related associations in the CDKP increases more than 5-fold. Since the METASTROKE results previously available in the CDKP are a subset of MEGASTROKE, they are no longer displayed as a separate set.

Another new dataset in the CDKP is the Han Population Taiwan-NGCM study of ~2,000 subjects, a genome-wide association study that discovered novel loci for large- and small-vessel ischemic strokes. Results from both of these datasets may be searched using the Variant Finder tool and may be browsed:

• On Gene Pages in the Common variants and High-impact variants tables and in LocusZoom plots;

• On Variant Pages in the Associations at a glance section, the Associations across all datasets section, and in LocusZoom plots;

• From the View full genetic association results for a phenotype search on the home page: first select a phenotype, then select a dataset on the resulting page.



In addition to new results, the CDKP now includes four new features that simplify the interpretation of genetic association data, making it easier to pinpoint variants and datasets that are informative for a disease or phenotype of interest.

"Clumping" variants by linkage disequilibrium

The first step in getting an overview of the results of a particular experiment is typically to plot variant associations vs. chromosomal location, in a so-called "Manhattan plot." These plots are available from the CDKP home page after choosing a phenotype:


After selecting a phenotype, you may select a dataset, and the Manhattan plot is displayed above a table of the top variants:



Now, in addition to selecting a dataset to view associations, you may select a threshold for linkage disequilibrium (LD) in order to reduce the number of linked variants that represent a single association signal. For example, without "clumping" variants by LD (r2 = 1), when viewing the "All ischemic stroke" phenotype and the NINDS SiGN 2016 dataset, 9 of the top 25 significantly associated variants are near the KCNQ3 gene; but setting the most stringent LD threshold (r2 = 0.1) reduces that number to just 2 variants by displaying only the most significant associations after clumping variants by LD. Intermediate LD thresholds of r2 = 0.2, 0.4, 0.6, or 0.8 may also be set, allowing more versatility in this analysis.
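The general idea behind LD clumping can be sketched in a few lines of Python. This is a simplified greedy version written for illustration and is not the CDKP's implementation. It assumes pairwise r2 values are already available, and it treats a variant as linked to a lead variant when their r2 is strictly greater than the threshold, so that a threshold of 1 leaves every variant in place. The function name, variant IDs, and numbers are hypothetical.

```python
def clump_by_ld(p_values, pairwise_r2, threshold):
    """Greedy LD clumping (illustrative sketch).

    p_values:    dict mapping variant ID -> p-value
    pairwise_r2: dict mapping frozenset({id_a, id_b}) -> r2; missing pairs
                 are treated as unlinked (r2 = 0)
    threshold:   variants with r2 > threshold to an already-kept lead
                 variant are dropped
    """
    kept = []
    # Visit variants from most to least significant.
    for variant in sorted(p_values, key=p_values.get):
        linked = any(
            pairwise_r2.get(frozenset((variant, lead)), 0.0) > threshold
            for lead in kept
        )
        if not linked:
            kept.append(variant)
    return kept


if __name__ == "__main__":
    # Hypothetical variants and LD values.
    p_values = {"v1": 1e-9, "v2": 1e-8, "v3": 1e-7}
    pairwise_r2 = {
        frozenset(("v1", "v2")): 0.9,
        frozenset(("v1", "v3")): 0.05,
        frozenset(("v2", "v3")): 0.3,
    }
    print(clump_by_ld(p_values, pairwise_r2, threshold=0.1))
    print(clump_by_ld(p_values, pairwise_r2, threshold=1.0))
```

In this toy example the stringent threshold keeps only v1 and v3, because v2 is strongly linked to the more significant v1, while a threshold of 1 keeps all three.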

New Region page

The Gene page of the CDKP (see an example) integrates and summarizes information about the associations of variants across the region of a gene. Now, you can see this integration and summation for any region of the genome, not just the areas surrounding protein-coding genes. Simply enter a chromosome and coordinates in the home page search box:



The resulting page resembles a Gene page. The traffic light integrates all associations across the region to give you an immediate indication of whether there are significant associations found in any of the datasets in the CDKP. Further down the page, tools and displays let you drill down to the specifics for a phenotype or variant of interest. This new Region page provides a way to explore any part of the genome in great detail.



PheWAS graphic on the Variant page

Previously, the Variant page of the CDKP displayed significant associations for each variant in a graphic that showed a color-coded box for each phenotype-dataset combination. But the rapidly increasing number of phenotypes becoming available from biobank studies has made this view unsustainably large. In its place, we have incorporated a phenome-wide association study (PheWAS) visualization developed at the University of Michigan. The graphic shows at a glance which phenotype associations are most significant for a particular variant. Mouse over a point to see more details.


All Associations graphic on the Variant page

The PheWAS graphic distills variant associations in order to highlight the most significant ones. But suppose you want to drill down to the details and explore associations in every dataset, viewing parameters like sample size, odds ratio, and more? There's a graphic for that too: our new All Associations interactive graphic, located in the "Associations across all datasets" section of the Variant page. Start by using keywords to filter phenotypes. Filtering allows you to view one specific phenotype, several related phenotypes, or phenotypes in a broad category, such as ischemic stroke; both the graphic and the table below it change in response to phenotype filtering. There are also options to filter by setting ranges of p-values and/or sample sizes.

The graph plots p-value (vertical axis) vs. dataset sample size (horizontal axis) for each association. Points in the graph are triangular; whether the triangle points up or down indicates a positive or negative direction of effect, respectively. Mousing over a point shows you more details about the association and the dataset. This graphic can help you evaluate whether an association is likely to be real: a genuine signal should increase in significance (i.e., decrease in p-value) with increasing sample size.
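Why should a genuine signal grow more significant as sample size increases? A rough Python sketch illustrates the reasoning. It assumes a fixed true effect, so that the expected test statistic scales with the square root of the sample size (z = effect x sqrt(N)). The per-sample effect value and the function name are hypothetical, and real datasets will deviate from this idealized curve.

```python
import math
from statistics import NormalDist


def expected_p_value(per_sample_effect, sample_size):
    """Two-sided p-value expected for a fixed true effect.

    Assumption (illustrative): z = per_sample_effect * sqrt(sample_size).
    """
    z = per_sample_effect * math.sqrt(sample_size)
    return 2.0 * NormalDist().cdf(-abs(z))


if __name__ == "__main__":
    # Hypothetical standardized effect observed at increasing sample sizes.
    for n in (10_000, 40_000, 160_000):
        print(n, expected_p_value(0.02, n))
```

Under these assumptions the expected p-value shrinks steadily as the sample size grows, whereas an association with no true effect behind it shows no such trend.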


Stay in touch!

Like the rest of the CDKP, these features are under continuous development. Please give them a try and let us know what you think.

Wednesday, August 15, 2018

Sign up for a hands-on tutorial session on the Knowledge Portals

Are you attending the American Society of Human Genetics meeting in October? If so, save your Friday lunch break for a tutorial session on the Knowledge Portals!

Navigating complex disease genetics: using the Knowledge Portal Network to move from SNPs to functional insights
12:30pm - 1:45pm
Friday, October 19
San Diego Convention Center
Room 28C, Upper Level

Bring your laptop and your questions about the Cerebrovascular Disease, Type 2 Diabetes, or Cardiovascular Disease Knowledge Portals (KPs). We'll go over some basics, illustrate workflows, and answer questions about how you can use KPs to investigate SNPs, genes, or regions of interest and turn genetic data into insights about complex diseases.

Please sign up so we can plan for refreshments. We'll send you a reminder a few days beforehand. We look forward to seeing you there! Please contact us with any questions or suggestions for topics you'd like to discuss.

Wednesday, May 2, 2018

Join the Knowledge Portal Network team!

At the Knowledge Portal Network (currently consisting of the Type 2 Diabetes, Cerebrovascular Disease, and Cardiovascular Disease Knowledge Portals), we are looking for energetic, talented people to help us produce web portals that aggregate and serve genetic association results to the world in order to spark insights into complex diseases. There are positions open for a software engineer to help in developing and producing these web portals, and for a technical release manager to manage and coordinate tasks during production and maintenance of the portals.

The positions are located at the Broad Institute in Cambridge, MA, a dynamic and exciting work environment where cutting-edge science is applied to critical biomedical problems.

Find more details and apply for the software engineer or technical release manager positions at the Broad Careers site.

Monday, February 12, 2018

CDKP Publication

An article describing the design and development of the Cerebrovascular Disease Knowledge Portal is now available! It has been published as a topical review in the February issue of Stroke. The article details the challenges and successes that the CDKP has faced during development and outlines plans for the future of the knowledge portal. It is entitled “Cerebrovascular Disease Knowledge Portal: An Open-Access Data Resource to Accelerate Genomic Discoveries in Stroke” and is available through the journal website.