RMSS 2017

Nuala Sheehan

Pedigree Reconstruction from Genetic Marker Data

The problem of estimating relationships amongst a group of individuals from genetic marker data ('pedigree reconstruction') is of interest in many diverse areas. Vast numbers of genetic markers are now routinely genotyped on large population cohorts (e.g. UK Biobank) of purportedly unrelated individuals. These cohorts undoubtedly contain relatives, and dense marker sets are hugely informative about relatedness. Standard marker-based estimators of pairwise relatedness are often used to adjust association analyses for cryptic relatedness, which is thus treated as a nuisance factor. Full relationship information, as provided by a pedigree, could perhaps be exploited to improve inference if it could be reliably recovered. Pedigrees are also important for identifying rare disease alleles via linkage analysis, are essential to understanding parent-of-origin genetic effects, and inform the structure of human populations.

In theory, estimating the pedigree for a given set of individuals from genetic marker data requires considering every possible set of relationships amongst them and computing the likelihood of each. For large problems, brute-force enumeration is clearly impractical. The reconstruction problem can be formulated as a problem of graphical structure estimation and is known to be NP-hard. We propose an integer linear programming (ILP) approach to graphical structure estimation, which is adapted to find valid pedigrees by imposing appropriate constraints. Our method, unlike others, is guaranteed to return a maximum likelihood pedigree in the standard situation where all individuals are observed at unlinked marker loci. The more realistic situation, where observed individuals are typically connected by (possibly many) missing individuals, poses a far harder problem. Such applications will require efficient formulations of general-purpose and graph learning algorithms. In particular, a Bayesian approach enabling the incorporation of additional prior information in a principled way would seem appropriate.
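
To make the search space concrete, the following is a minimal Python sketch of the brute-force enumeration described above, not of the ILP method itself. It assumes that every individual is observed with known sex, that each individual is either a founder or has exactly one father and one mother among the others, and that the score of a pedigree is a sum of per-individual local scores. The function names and the toy `local_score` are hypothetical placeholders; in a real analysis the local score would be a log-likelihood contribution computed from marker genotypes.

```python
from itertools import product


def candidate_parent_sets(child, sexes):
    """Founder (no parents) or one (father, mother) pair among the others."""
    others = [i for i in range(len(sexes)) if i != child]
    fathers = [i for i in others if sexes[i] == "M"]
    mothers = [i for i in others if sexes[i] == "F"]
    return [()] + [(f, m) for f in fathers for m in mothers]


def is_acyclic(parents):
    """Return True if no individual is its own ancestor."""
    state = {}  # 1 = on the current search path, 2 = fully explored

    def visit(i):
        if state.get(i) == 2:
            return True
        if state.get(i) == 1:
            return False  # reached an individual already on the path: a cycle
        state[i] = 1
        ok = all(visit(p) for p in parents[i])
        state[i] = 2
        return ok

    return all(visit(i) for i in range(len(parents)))


def enumerate_pedigrees(sexes):
    """Yield every valid pedigree as a tuple of parent sets, one per individual."""
    choices = [candidate_parent_sets(i, sexes) for i in range(len(sexes))]
    for parents in product(*choices):
        if is_acyclic(parents):
            yield parents


def best_pedigree(sexes, local_score):
    """Exhaustively maximise a score that decomposes over individuals."""
    return max(
        enumerate_pedigrees(sexes),
        key=lambda ped: sum(local_score(i, ps) for i, ps in enumerate(ped)),
    )


# Toy example: individuals 0 (male), 1 (female), 2 (male).
sexes = ["M", "F", "M"]

def toy_score(child, parent_set):
    # Hypothetical score favouring individual 2 as the child of 0 and 1.
    return 1.0 if (child, parent_set) == (2, (0, 1)) else 0.0

best = best_pedigree(sexes, toy_score)
```

Even under these restrictive assumptions the number of candidate parent-set combinations grows multiplicatively with the number of individuals, which is why exhaustive search is only feasible for very small problems. The acyclicity and sex-consistency checks used here are the kind of validity conditions that a constrained optimisation formulation has to encode.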

William Gates Building, 15 J.J. Thomson Avenue, Cambridge CB3 0FD

This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 305280
