Proteomic profiling is an important aspect of cancer research and drug development. It provides protein abundance information that complements genomics data from next-generation sequencing (NGS).
In this blog post we explore a recently published study that demonstrates how researchers can achieve better proteomic profiling of human tumor xenografts, by separating human and mouse cells prior to protein extraction and experimental data acquisition via mass spectrometry.
The Value of Large-Scale Proteomic Profiling for Cancer Drug Development
Large-scale proteomic profiling provides protein abundance information that complements other -omics information, such as NGS genomic data. For instance, the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium project intends to “accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis.”
Recently, various initiatives have started to systematically characterize the proteomes of human xenograft models, which have long been used to drive preclinical cancer research and drug development. In general, these models are established by implanting either human cancer cell lines (CDXs) or patient tumor material (PDX) into immunodeficient mice; each having its own advantages and disadvantages. For example, CDXs can be developed faster and offer high reproducibility and good convenience for genetic engineering, while PDXs are highly clinically relevant and recapitulate the original tumor’s molecular pathology, histology, genomics, and drug response.
Human tumor xenograft models are unique in that they consist of both human cancer cells and mouse noncancerous cells that not only allow researchers to study the molecular basis of cancer but also facilitate studies relevant to cancer drug development such as drug mechanism and biomarker discovery studies. However, this human-mouse mixture can make proteomic profiling of xenografts challenging when it comes to distinguishing human and mouse proteins. This is because human stroma is rapidly replaced by mouse stroma after engraftment.
While some studies separate human and mouse cells before proteomic profiling, the vast majority do not, and instead rely on the “run-then-separate”, which is a notoriously difficult and imperfect approach that relies on computational algorithms to do the species-specific peptide assignments. The major challenge in using computational methods is due to the high overlap in peptides derived from human and mouse proteins; referred to as the “common peptide problem”.
While different computational approaches have been developed for species-specific peptide assignments, including the full-peptide, unique-peptide, and human/mouse-only methods, each has limitations, and any variant or combination of the three is unlikely to achieve good performance due to the common peptide problem.
Now, a new study published in Cancer Research Communications and led by scientists at Crown Bioscience and ShanghaiTech University demonstrates how separating human and mouse cells prior to protein extraction and experimental data acquisition via mass spectrometry (MS) can achieve superior proteomic profiling results when using human tumor xenografts where approximately half of the identified peptides are common.
Below is a high-level summary of this important study.
Study Objectives
The investigators evaluated the performance of three computational methods for characterizing and quantifying human protein expression in the presence of mouse proteins.
- Full-peptide approach: Both common and species-unique peptides are searched against a unified human and mouse protein database. Common peptides are assigned to one or both species or discarded.
- Unique-peptide approach: Common peptides are discarded to allow unambiguous peptide assignment
- Human/mouse-only approach: Peptides are searched against only the human or mouse protein database.
The performance of these methods was tested using two different model systems. The first used peptide mixtures derived from two cell lines: human 293T cells and mouse MC38 cells. The second system used liver PDX models with differing levels of mouse contents. Tandem mass tagging (TMT) quantitative proteomics technology was used because it has high sensitivity, specificity, and the multiplexing capability to accommodate 16 samples. TMT identifies and quantitates proteins in different samples using tandem mass spectrometry.
Key Findings
Human Protein Quantification in Human and Mouse Cell Line Mixtures
Shi et al. generated a series of human-mouse mixed cell samples. Human protein quantification was done by searching against a combined human and mouse reference protein database using all mappable peptides (full-peptide), only unique peptides (unique-peptide), or human-only and mouse-only reference protein databases (human/mouse-only).
The first two methods quantified roughly twice as many human as mouse proteins. This was expected, given that the samples contained either purely or primarily human cells. The human-only method (which ignored mouse content) returned ~1,000 more human proteins compared to the other two methods, although many of these were actually common peptides.
The researchers also evaluated the effect of mouse proteins on human protein quantification. Human differentially expressed proteins (DEPs) were identified between samples with different mouse cell percentages. Mouse proteins negatively affected the results. Many identified human DEPs were not truly differentially expressed between the samples and instead were artifacts due to the influence of mouse proteins.
DEPs also rose rapidly as the mouse cell percentage difference increased. This was due to the high percentage of shared peptides, an issue that is unlikely to be resolved by computational methods. The existence of mouse proteins did not significantly affect the results for the full-peptide and unique-peptide methods, but only if the mouse cell percentage was below ~8%. The human-only approach performed less well compared to these two methods.
Human Protein Quantification in PDX Tumors
Much like these cell line experiments, human protein expression was assessed in five liver PDX tumors by different computational methods. Overall, approximately 3,000–4,000 DEPs were detected between the PDX models for either the original or “demoused” tumor samples (the intra-PDX study).
The high number of DEPs reflects the complex and heterogeneous proteomes in individual PDX models, with a strong positive correlation between the number of DEPs and the percentage of mouse cells in the original tumor sample. The investigators reported about 1,000 DEPs between the original and demoused tumor samples, with only 7.8% mouse cells in a liver PDX tumor sample.
In the inter-PDX study, DEPs were compared between demoused and original tumor samples. The original tumors always had roughly 20–30% of human proteins, with real differential expression between two PDXs that were not detected. Furthermore, approximately 30–40% of DEPs were false positives. The analysis showed that profiling of PDX tumors is very sensitive to mouse stroma when quantifying human protein expression, with many false positives and real DEPs that could not be identified in both intra-PDX and inter-PDX comparative studies.
The results of this analysis were consistent with the human-mouse cell line mixture studies. These outcomes demonstrate the risk of both false positives and negatives and that spurious results can arise from xenograft proteomics data that do not separate human and mouse cells.
Conclusion
Proteomic profiling can provide key insights and an abundance of information that complements other -omics data. Based on the recent publication described in this post, the data convincingly demonstrate that separating human and mouse cells prior to protein extraction and experimental data acquisition via mass spectrometry can achieve superior proteomic profiling results when using human tumor xenografts.
Crown Bioscience provides high-quality and reliable proteomics services. Data-independent acquisition (DIA) has gained increasing popularity in the field of bottom-up proteomics, due to higher reproducibility and improved data completeness. View our poster to learn more - “Systematic Evaluation of Label-Free Protein Quantification Pipelines (DDA vs DIA) in 12 Mouse Syngeneic Models.”