
RMSS 2017

Korbinian Strimmer

A New Look at Omics Data Integration from an Entropy and Network Perspective

(joint work with Takoua Jendoubi)

Probably the most commonly used approach for joint integrative analysis of omics data is classical canonical correlation analysis (CCA), or a modern variant of it (e.g., sparse CCA). Other popular approaches for data integration include O2PLS, a related projection-based approach developed for use in chemometrics, and the RV coefficient, which measures and dissects the total association between groups of genes, metabolites, etc. Both are now also in widespread use in omics data analysis. Unfortunately, these approaches have a number of crucial drawbacks, including lack of interpretability of the underlying factors, incoherence with standard multivariate regression, and difficulty in application to large-scale data.


Here we present as an alternative a simple network-based approach to integrative data analysis that employs relative entropy to characterize the overall association between two (or more) sets of omics data. This approach is natural in the setting of latent-variable multivariate regression, and we show that in the case of normal variables it enables a canonical decomposition that additionally allows inference of the underlying association network among the individual constituents. Furthermore, our approach to data integration is computationally inexpensive and hence can be applied to high-dimensional data sets. It can also be easily extended to more than two data sets. We illustrate this approach, which can be interpreted as a networked extension of CCA, by analyzing metabolomic and transcriptomic data.
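
As a rough illustration of the link between relative entropy and CCA in the normal case, the sketch below uses the standard identity that the mutual information between two jointly Gaussian blocks (the relative entropy between their joint distribution and the product of their marginals) equals -1/2 times the sum of log(1 - lambda_i^2), where the lambda_i are the canonical correlations. This is a textbook relation, not necessarily the estimator used in the talk; the function names are hypothetical, and the code assumes more samples than variables, which typically does not hold for raw omics data without regularization.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations between data blocks X (n x p) and Y (n x q).

    Assumes n > p + q and full column rank (an assumption of this sketch).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the centered column spaces of X and Y.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    lam = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(lam, 0.0, 1.0)

def gaussian_mutual_information(X, Y):
    """Plug-in mutual information (in nats) between X and Y under normality."""
    lam = canonical_correlations(X, Y)
    # Each canonical pair contributes -0.5 * log(1 - lambda_i^2).
    return -0.5 * np.sum(np.log1p(-lam**2))

# Simulated stand-ins for two omics blocks (not real data).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
Y_ind = rng.standard_normal((500, 2))                     # unrelated to X
Y_dep = X[:, :2] + 0.5 * rng.standard_normal((500, 2))    # depends on X

mi_ind = gaussian_mutual_information(X, Y_ind)
mi_dep = gaussian_mutual_information(X, Y_dep)
print(f"independent blocks: {mi_ind:.3f} nats")
print(f"dependent blocks:   {mi_dep:.3f} nats")
```

The per-pair terms show how the overall association decomposes along the canonical directions; recovering an association network among individual variables, as described above, requires further steps that this sketch does not attempt.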

William Gates Building, 15 J.J. Thomson Avenue Cambridge CB3 0FD


This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 305280
