
RMSS 2017

Krista Fischer

Personalized Prediction of the Risk of Common Complex Diseases: Some Statistical Aspects

The talk will provide an overview of the development and validation of algorithms for personalized prediction of the risk of Type 2 Diabetes (T2D) in the Estonian Biobank cohort. In addition, various methodological challenges at different steps of the process will be discussed.


The results of a large-scale meta-analysis of Genome-Wide Association Studies (GWAS) can be used to rank Single Nucleotide Polymorphisms (SNPs) by the strength of their established association with the phenotype (as indicated, for instance, by the p-value). A certain number of the top independent SNPs can then be combined into a Genetic Risk Score (GRS) that has considerably better predictive ability than any single SNP alone. An efficient GRS requires optimal criteria for selecting the SNPs and their corresponding weights. We show that for T2D, the doubly-weighted GRS that combines more than 5000 SNPs provides the strongest association with both prevalent and incident T2D.
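
As a rough illustration of how a GRS is formed, the sketch below computes a weighted sum of risk-allele dosages for one individual. It is a minimal sketch, not the method used in the talk: the function name, the example numbers, and the reading of "doubly-weighted" as a GWAS effect size multiplied by a second per-SNP weight are all assumptions made for this example.

```python
def genetic_risk_score(dosages, effect_sizes, inclusion_weights=None):
    """Weighted sum of risk-allele dosages for one individual.

    dosages           -- risk-allele counts per SNP (0, 1 or 2)
    effect_sizes      -- per-SNP weights, e.g. log odds ratios from a GWAS
    inclusion_weights -- optional second per-SNP weight (assumed here to be
                         a value in [0, 1]); defaults to 1 for every SNP,
                         which gives an ordinary singly-weighted score
    """
    if inclusion_weights is None:
        inclusion_weights = [1.0] * len(dosages)
    if not (len(dosages) == len(effect_sizes) == len(inclusion_weights)):
        raise ValueError("all inputs must have one entry per SNP")
    return sum(g * b * w
               for g, b, w in zip(dosages, effect_sizes, inclusion_weights))


# Hypothetical values for three SNPs (illustration only, not real data).
dosages = [0, 1, 2]
effect_sizes = [0.10, 0.05, 0.20]
inclusion_weights = [1.0, 0.5, 0.25]

single = genetic_risk_score(dosages, effect_sizes)
double = genetic_risk_score(dosages, effect_sizes, inclusion_weights)
```

With a second weight below 1, weakly supported SNPs still contribute to the score, but less than they would under a hard inclusion threshold.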


To provide accurate estimates of the risk of a complex disease, a GRS needs to be combined with known environmental and lifestyle-related risk factors. We discuss important steps in the development of a prediction model that combines genetic and non-genetic predictors for T2D. For use in practical risk assessment, the model needs to be combined with estimates of the baseline (age-specific) risk level so that it yields estimates of absolute risk. We will show important stages of the process as well as the resulting risk prediction tool. We also discuss statistical challenges related to specific features of population-based biobank data (left-truncation for some outcomes, a mix of retrospective and prospective data for others, etc.). In addition, we point out some common mistakes, such as partial overlap of discovery and validation cohorts, and their consequences.
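
One common way to turn a relative-risk model into an absolute risk is sketched below, assuming proportional hazards: the baseline (age-specific) risk is converted to a survival probability, raised to the power of the individual's hazard ratio, and converted back. This is an assumption made for illustration, not necessarily the approach used in the talk; the function name and the example numbers are hypothetical.

```python
import math


def absolute_risk(baseline_risk, linear_predictor):
    """Absolute risk under an assumed proportional-hazards model.

    baseline_risk    -- age-specific probability of disease over the
                        prediction window for a reference individual
    linear_predictor -- log hazard ratio relative to that reference,
                        e.g. a centred sum of coefficient * predictor
                        terms (GRS, BMI, ...)
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a probability")
    hazard_ratio = math.exp(linear_predictor)
    baseline_survival = 1.0 - baseline_risk
    return 1.0 - baseline_survival ** hazard_ratio


# Hypothetical example: 10% baseline risk, hazard ratio of 2.
risk_reference = absolute_risk(0.10, 0.0)
risk_doubled_hazard = absolute_risk(0.10, math.log(2.0))
```

Under this assumption, doubling the hazard does not double the absolute risk: a 10% baseline risk becomes about 19%, because the calculation works on the survival probability.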

William Gates Building, 15 J.J. Thomson Avenue, Cambridge CB3 0FD


This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 305280.
