
RMSS 2017

Simon Rogers

Decomposing Metabolomics Mass Spectrometry Data with Topic Models

The key challenge in the analysis of metabolomics mass spectrometry data is the identification of the ions detected in the mass spectrometer. Fragmentation is the most popular strategy for molecular identification, but it relies upon matching fragment spectra to databases that have very low coverage: in a typical experiment, <10% of the measured molecules can be matched to database spectra.

In this talk, I will present an approach for the analysis of mass spectrometry fragment data that uses methods developed for the analysis of text -- topic models -- to extract commonly co-occurring patterns of fragments and losses that can be interpreted as molecular substructures. I will present results that indicate that identification of substructures (topics) is often possible, allowing all molecules that include a given topic to be partially annotated even if they cannot be identified in a traditional manner. In addition, I will show how topics can be linked to molecular intensity and how the approach can be extended to analyse the change in substructure prevalence across groups of samples.
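
To make the text analogy concrete, the sketch below shows one way a fragment spectrum could be turned into a "document" whose "words" are fragments and losses, which is the kind of input a topic model expects. It is only an illustration under stated assumptions: the function names, the rounding to two decimal places, the use of integer intensities as word counts, and the toy m/z values are all hypothetical and are not taken from the talk. The only domain fact relied on is that a loss is the mass difference between the precursor and a fragment.

```python
from collections import Counter


def spectrum_to_document(precursor_mz, peaks, decimals=2):
    """Turn one fragment spectrum into a bag of fragment and loss 'words'.

    peaks is a list of (fragment m/z, intensity) pairs. Rounding m/z values
    to `decimals` places is an assumed, simplistic binning scheme.
    """
    doc = Counter()
    for fragment_mz, intensity in peaks:
        # Integer intensity plays the role of a word count (an assumption).
        count = int(round(intensity))
        if count <= 0:
            continue
        doc[f"fragment_{fragment_mz:.{decimals}f}"] += count
        # A loss is the mass difference between precursor and fragment.
        loss_mz = precursor_mz - fragment_mz
        if loss_mz > 0:
            doc[f"loss_{loss_mz:.{decimals}f}"] += count
    return doc


def build_count_matrix(documents):
    """Build a documents-by-words count matrix over a shared vocabulary."""
    vocabulary = sorted({word for doc in documents for word in doc})
    matrix = [[doc.get(word, 0) for word in vocabulary] for doc in documents]
    return vocabulary, matrix


# Made-up toy spectra: (precursor m/z, [(fragment m/z, intensity), ...]).
toy_spectra = [
    (180.06, [(162.05, 10), (85.03, 5)]),
    (342.12, [(324.11, 8), (85.03, 3)]),
]
documents = [spectrum_to_document(p, peaks) for p, peaks in toy_spectra]
vocabulary, matrix = build_count_matrix(documents)
```

In this toy data both spectra share the word `loss_18.01` and the word `fragment_85.03`, even though their precursors differ. A topic model fitted to such a count matrix would look for groups of words that repeatedly occur together in this way; fitting the model itself is outside the scope of this sketch.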

William Gates Building, 15 J.J. Thomson Avenue, Cambridge CB3 0FD

This project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 305280
