MSstats and Cardinal: Next Generation Statistical Mass Spectrometry in R
Lead PI
Co-PIs
- Kylie Ariel Bemis
- Meena Choi
Abstract
MSstats
MSstats is a family of open-source R/Bioconductor packages for statistical relative quantification of peptides and proteins in mass spectrometry-based proteomics. MSstats is applicable to experiments with arbitrarily complex designs (e.g., factorial experiments, paired designs, time courses); to data acquired with shotgun data-dependent acquisition (DDA), data-independent acquisition (DIA/SWATH), parallel reaction monitoring (PRM), or targeted selected reaction monitoring (SRM) workflows; and to label-free or label-based (e.g., TMT labeling) workflows.
At the core of MSstats are state-of-the-art statistical models and algorithms that address technological aspects specific to mass spectrometry-based proteomics. Its functionalities include data visualization, normalization, and transformation; detection of differentially abundant proteins; estimation of protein abundance; and detection of sites with differential post-translational modifications. MSstats can also monitor system suitability and perform quality control for mass spectrometric assays, characterize those assays (e.g., limits of detection, dynamic range), and help plan future experiments (sample size calculation for detecting differentially abundant proteins or predictive proteins).
MSstats takes as input a list of identified and quantified spectral features. It interfaces with most widely used open-source and commercial tools (e.g., Skyline, MaxQuant, OpenMS, Spectronaut, Proteome Discoverer).
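To make the idea of relative quantification from a list of quantified features concrete, the following is a minimal sketch in Python. It is not MSstats code and does not reproduce the statistical models MSstats implements. The table layout, the column order, the feature names, and the intensities are all hypothetical, chosen only for illustration. The sketch summarizes each run by the median log2 feature intensity and reports the difference between condition means as a log2 fold change.

```python
import math
from collections import defaultdict
from statistics import median

# Hypothetical long-format feature table: one row per quantified feature per run.
# Columns: (protein, feature, condition, run, intensity). All values are made up.
features = [
    ("P1", "PEPTIDEA_2", "control", "run1", 1000.0),
    ("P1", "PEPTIDEB_2", "control", "run1", 4000.0),
    ("P1", "PEPTIDEA_2", "control", "run2", 1000.0),
    ("P1", "PEPTIDEB_2", "control", "run2", 4000.0),
    ("P1", "PEPTIDEA_2", "treated", "run3", 2000.0),
    ("P1", "PEPTIDEB_2", "treated", "run3", 8000.0),
    ("P1", "PEPTIDEA_2", "treated", "run4", 2000.0),
    ("P1", "PEPTIDEB_2", "treated", "run4", 8000.0),
]


def run_level_summaries(rows):
    """Summarize each (protein, condition, run) as the median log2 feature intensity."""
    grouped = defaultdict(list)
    for protein, _feature, condition, run, intensity in rows:
        grouped[(protein, condition, run)].append(math.log2(intensity))
    return {key: median(values) for key, values in grouped.items()}


def log2_fold_change(rows, protein, numerator, denominator):
    """Difference of mean run-level summaries between two conditions."""
    summaries = run_level_summaries(rows)

    def condition_mean(condition):
        values = [
            value
            for (prot, cond, _run), value in summaries.items()
            if prot == protein and cond == condition
        ]
        return sum(values) / len(values)

    return condition_mean(numerator) - condition_mean(denominator)


lfc = log2_fold_change(features, "P1", "treated", "control")
print(f"log2 fold change (treated vs control): {lfc:.2f}")
```

In this toy table every treated intensity is twice its control counterpart, so the estimated log2 fold change is 1. A real analysis would also need to handle missing values, normalization, and uncertainty in the estimate, which this sketch deliberately leaves out.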
Cardinal
Cardinal is a family of open-source R/Bioconductor packages for statistical analysis of mass spectrometry-based imaging (MSI) experiments of biological samples, such as tissues. Technology-specific issues, such as data accessibility, difficulties of mapping spectra between samples, and complexities of analyte ionization and fragmentation, require specialized analysis tools for MSI.
Cardinal supports 2- and 3-dimensional MSI experiments with multiple tissues, multiple conditions, and complex designs, as well as workflows based on matrix-assisted laser desorption/ionization (MALDI) and desorption electrospray ionization (DESI).
Cardinal’s functionalities include image visualization, image segmentation, image classification, and detection of differentially abundant ions across conditions.
Because raw MSI data are large, Cardinal's back-end, Matter, supports direct interaction with larger-than-memory datasets stored in an arbitrary number of files in arbitrary formats. Matter supports reproducible research by minimizing the need to convert and store data in multiple formats.
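The general idea of working with data that stay on disk can be illustrated with a short, language-agnostic sketch, written here in Python using only the standard library. This is not the API of Cardinal or of its back-end. The file layout (a dense matrix of little-endian float32 intensities, one row per spectrum), the file name, and the sizes are assumptions made for the example. The point is that a single spectrum is read through a memory map by byte offset, so the full file never has to be loaded into memory.

```python
import mmap
import os
import struct
import tempfile

N_SPECTRA = 1000               # hypothetical number of pixels (spectra)
N_BINS = 500                   # hypothetical number of m/z bins per spectrum
ITEM = struct.calcsize("<f")   # bytes per float32 intensity (4)


def write_demo_file(path):
    """Write a toy intensity matrix row by row as little-endian float32."""
    with open(path, "wb") as f:
        for i in range(N_SPECTRA):
            row = [float(i)] * N_BINS  # every bin in spectrum i holds the value i
            f.write(struct.pack(f"<{N_BINS}f", *row))


def read_spectrum(path, index):
    """Read one spectrum through a memory map, without loading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = index * N_BINS * ITEM  # byte offset of the requested row
            return struct.unpack_from(f"<{N_BINS}f", mm, start)


path = os.path.join(tempfile.mkdtemp(), "toy_msi.bin")
write_demo_file(path)
spectrum = read_spectrum(path, 42)
print(len(spectrum), spectrum[0])
```

Because the data are addressed in their stored layout, no converted copy of the file is created, which is the property the paragraph above ties to reproducibility.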
Funding
Essential Open Source Software for Science from the Chan Zuckerberg Initiative
Related Publications
- M. Choi, C.-Y. Chang, T. Clough, D. Broudy, T. Killeen, B. MacLean, O. Vitek. “MSstats: an R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments”. Bioinformatics, 30:2524, 2014. DOI: 10.1093/bioinformatics/btu305
- K. D. Bemis, A. Harry, L. S. Eberlin, C. Ferreira, S. M. van de Ven, P. Mallick, M. Stolowitz, O. Vitek. “Cardinal: an R package for statistical analysis of mass spectrometry-based imaging experiments”. Bioinformatics, 31:2418, 2015. DOI: 10.1093/bioinformatics/btv146
- E. Dogu, S. Mohammad-Taheri, S. E. Abbatiello, M. S. Bereman, B. MacLean, B. Schilling, O. Vitek. “MSstatsQC: Longitudinal System Suitability Monitoring and Quality Control for Targeted Proteomic Experiments”. Molecular & Cellular Proteomics, 16:1335, 2017. DOI: 10.1074/mcp.M116.064774
- E. Dogu, S. Mohammad-Taheri, R. Olivella, F. Marty, I. Lienert, L. Reiter, E. Sabidó, O. Vitek. “MSstatsQC 2.0: R/Bioconductor package for statistical quality control of mass spectrometry-based proteomic experiments”. Journal of Proteome Research, 18:678, 2019. DOI: 10.1021/acs.jproteome.8b00732
- T. Huang, M. Choi, M. Tzouros, S. Golling, N. J. Pandya, B. Banfai, T. Dunkley, O. Vitek. “MSstatsTMT: Statistical detection of differentially abundant proteins in experiments with isobaric labeling and multiple mixtures”. Molecular & Cellular Proteomics, mcp.RA120.002105, 2020. DOI: 10.1074/mcp.RA120.002105