Search | arXiv e-print repository

doi 10.1038/s41592-023-01802-5

Sampling the proteome by emerging single-molecule and mass-spectrometry methods

Authors: Michael J. MacCoss, Javier Alfaro, Meni Wanunu, Danielle A. Faivre, Nikolai Slavov

Abstract: Mammalian cells have about 30,000-fold more protein molecules than mRNA molecules. This larger number of molecules and the associated larger dynamic range have major implications in the development of proteomics technologies. We examine these implications for both liquid chromatography-tandem mass spectrometry (LC-MS/MS) and single-molecule counting and provide estimates on how many molecules are routinely measured in proteomics experiments by LC-MS/MS. We review strategies that have been helpful for counting billions of protein molecules by LC-MS/MS and suggest that these strategies can benefit single-molecule methods, especially in mitigating the challenges of the wide dynamic range of the proteome. We also examine the theoretical possibilities for scaling up single-molecule and mass spectrometry proteomics approaches to quantifying the billions of protein molecules that make up the proteomes of our cells.

Submitted 27 January, 2023; v1 submitted 31 July, 2022; originally announced August 2022.

Comments: Recorded presentation: https://youtu.be/w0IOgJrrvNM

Journal ref: Nat Methods 20, 339–346 (2023)

arXiv:2207.10815 [pdf]

doi 10.1038/s41592-023-01785-3

Initial recommendations for performing, benchmarking, and reporting single-cell proteomics experiments

Authors: Laurent Gatto, Ruedi Aebersold, Juergen Cox, Vadim Demichev, Jason Derks, Edward Emmott, Alexander M. Franks, Alexander R. Ivanov, Ryan T. Kelly, Luke Khoury, Andrew Leduc, Michael J. MacCoss, Peter Nemes, David H. Perlman, Aleksandra A. Petelski, Christopher M. Rose, Erwin M. Schoof, Jennifer Van Eyk, Christophe Vanderaa, John R. Yates III, Nikolai Slavov

Abstract: Analyzing proteins from single cells by tandem mass spectrometry (MS) has become technically feasible. While such analysis has the potential to accurately quantify thousands of proteins across thousands of single cells, the accuracy and reproducibility of the results may be undermined by numerous factors affecting experimental design, sample preparation, data acquisition, and data analysis. Broadly accepted community guidelines and standardized metrics will enhance rigor, data quality, and alignment between laboratories. Here we propose best practices, quality controls, and data reporting recommendations to assist in the broad adoption of reliable quantitative workflows for single-cell proteomics.

Submitted 12 September, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

Comments: Supporting website: https://single-cell.net/guidelines

Journal ref: Nature Methods, 20, 375–386 (2023)

arXiv:2108.07660 [pdf]

New views of old proteins: clarifying the enigmatic proteome

Authors: Participants in a NIH Workshop on Functional, Integrative Proteomics: Kristin E. Burnum Johnson, Thomas P. Conrads, Richard R. Drake, Amy E. Herr, Ravi Iyengar, Ryan T. Kelly, Emma Lundberg, Michael J. MacCoss, Alexandra Naba, Garry P. Nolan, Pavel A. Pevzner, Karin D. Rodland, Salvatore Sechi, Nikolai Slavov, Jeffrey M. Spraggins, Jennifer E. Van Eyk, Marc Vidal, Christine Vogel, David R. Walt, Neil L. Kelleher

Abstract: All human diseases involve proteins, yet our current tools to characterize and quantify them are limited. To better elucidate proteins across space, time, and molecular composition, we provide provocative projections for technologies to meet the challenges that protein biology presents. With a broad perspective, we discuss grand opportunities to transition the science of proteomics into a more propulsive enterprise. Extrapolating recent trends, we offer potential futures for a next generation of disruptive approaches to define, quantify and visualize the multiple dimensions of the proteome, thereby transforming our understanding and interactions with human disease in the coming decade.

Submitted 17 August, 2021; originally announced August 2021.

Comments: Submitted to Nature. 22 Pages. 6 Figures. Corresponding author Neil L. Kelleher

arXiv:1207.5848 [pdf, other]

On the feasibility and utility of exploiting real time database search to improve adaptive peak selection

Authors: Benjamin J. Diament, Michael J. MacCoss, William Stafford Noble

Abstract: Rationale: In a shotgun proteomics experiment with data-dependent acquisition, real-time analysis of a precursor scan results in selection of a handful of peaks for subsequent isolation, fragmentation and secondary scanning. This peak selection protocol typically focuses on the most abundant peaks in the precursor scan, while attempting to avoid re-sampling the same m/z values in rapid succession. The protocol does not, however, incorporate analysis of previous fragmentation scans into the peak selection procedure. Methods: In this work, we investigate the feasibility and utility of incorporating analysis of previous fragmentation scans into the peak selection protocol. We demonstrate that real-time identification of fragmentation spectra is feasible in principle, and we investigate, via simulations, several strategies to make use of the resulting peptide identifications during peak selection. Results: Our simulations fail to provide evidence that peptide identifications can provide a large improvement in the total number of peptides identified by a shotgun proteomics experiment. Conclusions: These results are significant because they point out the feasibility of using peptide identifications during peak selection, and because our experiments may provide a starting point for others working in this direction.

Submitted 24 July, 2012; originally announced July 2012.

arXiv:1011.2087 [pdf, ps, other]

doi 10.1214/09-AOAS316

A nested mixture model for protein identification using mass spectrometry

Authors: Qunhua Li, Michael J. MacCoss, Matthew Stephens

Abstract: Mass spectrometry provides a high-throughput way to identify proteins in biological samples. In a typical experiment, proteins in a sample are first broken into their constituent peptides. The resulting mixture of peptides is then subjected to mass spectrometry, which generates thousands of spectra, each characteristic of its generating peptide. Here we consider the problem of inferring, from these spectra, which proteins and peptides are present in the sample. We develop a statistical approach to the problem, based on a nested mixture model. In contrast to commonly used two-stage approaches, this model provides a one-stage solution that simultaneously identifies which proteins are present, and which peptides are correctly identified. In this way our model incorporates the evidence feedback between proteins and their constituent peptides. Using simulated data and a yeast data set, we compare and contrast our method with existing widely used approaches (PeptideProphet/ProteinProphet) and with a recently published new approach, HSM. For peptide identification, our single-stage approach yields consistently more accurate results. For protein identification the methods have similar accuracy in most settings, although we exhibit some scenarios in which the existing methods perform poorly.

Submitted 9 November, 2010; originally announced November 2010.

Comments: Published at http://dx.doi.org/10.1214/09-AOAS316 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS316

Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 2, 962-987

Showing 1–5 of 5 results for author: MacCoss, M J