The Role of Proteomics in Oncology Biomarker Discovery
Proteomics in Biomarker Discovery for Cancer Drug Development
While genomics has a long history in biomarker discovery, the power of proteomics is required to interpret the function of identified genes in the context of functional networks.1
Proteomic biomarkers provide key information about the dynamic nature of proteins, including functionality, post-translational modifications, interaction with other biological molecules, and response to environmental factors.1 With advances in mass spectrometry (MS) and other analytical techniques, proteomic workflows can now profile large datasets with the high precision and resolution required for biomarker discovery.
Protein Separation Techniques
Reliable and validated methods for protein separation and analysis are key to identifying protein biomarkers. Three commonly used techniques for separation of complex protein or peptide samples are:
- Denaturing polyacrylamide gel electrophoresis (PAGE) or sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE)
- 2D gel electrophoresis (2D-PAGE)
- High-performance liquid chromatography (HPLC)
Below, each of these techniques is discussed in more detail.
SDS-PAGE is a one-dimensional method commonly used to separate proteins by size. Samples are treated with a reducing agent to cleave disulfide bonds and with a denaturing detergent (SDS) that unfolds the proteins and imparts a uniform net negative charge. The unfolded, negatively charged proteins then migrate through a gel matrix (i.e. polyacrylamide) under an applied electric field, with smaller proteins migrating faster because they meet less resistance.
Following electrophoresis, the separated proteins are visualized by staining the gel with a dye (e.g. Coomassie blue, SYPRO™ Ruby), electroblotted onto a membrane for immunoblotting, or excised for further analysis by techniques such as MS.
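Because migration distance in SDS-PAGE depends on size, the mass of an unknown band is commonly estimated against a ladder of standards run on the same gel. The sketch below assumes the usual approximation that log10 of molecular weight is roughly linear in relative mobility (Rf) over the useful range of the gel. The function names and ladder values are hypothetical and chosen only for illustration.

```python
import math

def fit_ladder(ladder):
    """Least-squares fit of log10(MW in kDa) against relative mobility (Rf).

    `ladder` is a list of (rf, mw_kda) pairs for marker proteins.
    Returns (slope, intercept) of the fitted line.
    """
    xs = [rf for rf, _ in ladder]
    ys = [math.log10(mw) for _, mw in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_mw(rf, slope, intercept):
    """Convert a band's Rf back to an estimated molecular weight in kDa."""
    return 10 ** (slope * rf + intercept)

# Hypothetical ladder: (Rf, kDa). Real ladders and gels will differ.
ladder = [(0.1, 100.0), (0.3, 56.0), (0.5, 32.0), (0.9, 10.0)]
slope, intercept = fit_ladder(ladder)
unknown_kda = estimate_mw(0.4, slope, intercept)
```

The fitted slope is negative, reflecting that larger proteins travel a shorter distance. The estimate is only as good as the linearity assumption, which tends to break down at the extremes of the gel.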
One-dimensional separation is not usually sufficient, however, for in-depth analysis of complex protein mixtures, especially to discover and develop biomarkers. Combining isoelectric focusing (IEF) with SDS-PAGE separates proteins by charge and size, respectively. This two-dimensional method is known as “2D-PAGE,” which has long been a workhorse in proteomics due to its ability to resolve thousands of proteins in a single analytical run.2
Normally, when in solution, proteins carry a net positive or negative charge depending on their amino acid composition. The isoelectric point (pI) refers to the pH level at which a protein has a net charge of zero. This feature can be exploited to separate a mixture of proteins by running the sample through an established pH gradient (often in a tube or strip gel).
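As a rough illustration of the pI concept, the net charge of an unmodified peptide can be approximated with the Henderson-Hasselbalch relation and the pH of zero net charge found by bisection. This is a minimal sketch: the pKa values are illustrative assumptions (published scales differ slightly), and the function names are hypothetical.

```python
# Illustrative side-chain and terminal pKa values (assumed, not authoritative).
PKA_POSITIVE = {"N_term": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"C_term": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}

def net_charge(sequence, ph):
    """Approximate net charge of an unmodified peptide at the given pH."""
    positive = [PKA_POSITIVE["N_term"]]
    positive += [PKA_POSITIVE[aa] for aa in sequence if aa in ("K", "R", "H")]
    negative = [PKA_NEGATIVE["C_term"]]
    negative += [PKA_NEGATIVE[aa] for aa in sequence if aa in ("D", "E", "C", "Y")]
    charge = 0.0
    for pka in positive:
        # Fraction of the basic group that is protonated (+1).
        charge += 1.0 / (1.0 + 10 ** (ph - pka))
    for pka in negative:
        # Fraction of the acidic group that is deprotonated (-1).
        charge -= 1.0 / (1.0 + 10 ** (pka - ph))
    return charge

def isoelectric_point(sequence, tol=1e-4):
    """Bisection search for the pH at which the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid  # negatively charged: pI lies at lower pH
    return (low + high) / 2.0

acidic_pi = isoelectric_point("DDDD")
basic_pi = isoelectric_point("KKKK")
```

Bisection works here because the approximate net charge decreases monotonically as pH rises, so there is a single crossing point between pH 0 and pH 14.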
Once a protein reaches its respective pI, migration stops, and it becomes “focused” at a specific pH point. The IEF gel is inserted into a traditional one-dimensional gel for SDS-PAGE size-based separation. Visualization and image analysis of stained gels are used to quantify the total number of proteins in a sample, or “spot” extraction is performed for more detailed interrogation by MS.
Although 2D-PAGE has been extensively used, it does have limitations. For instance, some highly acidic or basic proteins are difficult to separate, and low-abundance proteins are not easily detected.
HPLC offers high resolving power and reproducibility. Samples are enzymatically digested (e.g. with trypsin) to generate peptide fragments, which are carried by a solvent pumped under high pressure through a chromatographic column packed with adsorbent particles. A wide variety of columns are commercially available for protein and peptide analysis, including those that separate by size, hydrophobicity, charge, or affinity for specific amino acid sequences.
Differences in how strongly proteins and peptides interact with the adsorbent particles translate into differences in elution time from the column. Specialized detectors (e.g. absorbance detectors, such as ultraviolet or photodiode array detectors) capture these data and convert them into a chromatogram.
Compared to 2D-PAGE, HPLC is considerably more sensitive and has a broader dynamic range, allowing for better detection of low-abundance proteins. HPLC can also be combined with MS to create a powerful proteomics workflow.
Other useful tools for protein separation, albeit not discussed here, include capillary electrophoresis and affinity chromatography.
The Different Types of Proteomics Platforms
Proteomics and biomarker discovery rely heavily on MS. There are two typical approaches:
- In the “bottom-up” approach, intact proteins undergo enzymatic digestion, and the peptide fragments are characterized by MS.
- In the “top-down” approach, intact proteins are directly fragmented by MS to obtain sequence data for identification.
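The enzymatic digestion step of the bottom-up approach can be sketched in silico. The example below assumes trypsin and the commonly used simplified rule that it cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name and the example sequence are hypothetical.

```python
def tryptic_digest(protein, missed_cleavages=0):
    """Return peptides from a simplified in-silico trypsin digest.

    Cleaves C-terminal to K or R, except when the next residue is P.
    `missed_cleavages` also emits peptides spanning up to that many
    skipped cleavage sites.
    """
    if not protein:
        return []
    # Positions where the chain is cut (index of the first residue after the cut).
    sites = [
        i + 1
        for i, aa in enumerate(protein[:-1])
        if aa in "KR" and protein[i + 1] != "P"
    ]
    bounds = [0] + sites + [len(protein)]
    peptides = []
    for start in range(len(bounds) - 1):
        for missed in range(missed_cleavages + 1):
            end = start + 1 + missed
            if end >= len(bounds):
                break
            peptides.append(protein[bounds[start]:bounds[end]])
    return peptides

# Hypothetical sequence for illustration only.
peptides = tryptic_digest("AAKRPGGRCCK")
peptides_with_missed = tryptic_digest("AAKRPGGRCCK", missed_cleavages=1)
```

For this sequence, the K at position 3 is a cleavage site, whereas the R immediately before P is not, so the digest yields three peptides. Real digests are less clean, which is why search tools usually allow one or two missed cleavages.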
Although a variety of MS technologies and instruments exist, all consist of three main components:
- An ionization source
- A mass analyzer
- An ion detector system
Ionization of proteins and peptides generates charged gas-phase ions, which are separated by the mass analyzer based on their mass-to-charge ratio (m/z). The ion detector system measures the precise m/z values and their abundance to generate a mass spectrum for analysis.
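To make the mass-to-charge ratio concrete, the sketch below computes the expected m/z of an unmodified peptide carrying z protons, using m/z = (M + z × 1.007276) / z, where M is the neutral monoisotopic mass. It assumes positive-mode protonation and no modifications, and the function names are hypothetical.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056    # added once per peptide for the free termini
PROTON_MASS = 1.007276   # mass of each charge-carrying proton

def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS

def mass_to_charge(peptide, charge):
    """Expected m/z of the peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (monoisotopic_mass(peptide) + charge * PROTON_MASS) / charge

singly_charged = mass_to_charge("GG", 1)
doubly_charged = mass_to_charge("GG", 2)
```

The same peptide appears at a lower m/z as its charge increases, which is why multiply charged species from large peptides still fall within the range of common mass analyzers.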
Significant innovation in mass spectrometers has led to modern instruments being equipped with advanced mass analyzers and/or detectors. These include time of flight (TOF), Fourier transform ion cyclotron resonance (FT-ICR), and Orbitrap, all of which provide high mass accuracy and resolution.3
As mentioned above, MS-based techniques are also coupled with separation methods, such as electrophoresis and HPLC. These combinations offer a synergistic approach that is applied to particularly complex protein mixtures where either technique alone is not sufficient.
Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI TOF-MS), also known as “ProteinChip® technology,” is another combination approach widely used in biomarker discovery.4 This high-throughput technique utilizes chips with distinct chromatographic surfaces to separate proteins based on physical and chemical properties (hydrophobic, hydrophilic, acidic, basic, surface affinity). Following a washing step to remove contaminants and unbound proteins, the sample is crystallized with an energy-absorbing matrix and analyzed by TOF-MS.
SELDI TOF-MS has a number of features that make it a useful tool for biomarker discovery. For example, minimal sample material is required, and crude samples, such as tissue biopsies and body fluids, can be readily analyzed. In addition, the technology has high sensitivity for the detection of low-molecular-weight proteins. As a result, it has identified biomarkers that other methods missed.
Although there are some recognized challenges with the SELDI TOF-MS technology,5 which are mostly related to poor experimental design, a myriad of potential biomarkers has been identified across multiple cancer types. For example, in a recent study, serum profiling with SELDI TOF-MS identified a small panel of proteins that distinguished breast cancer patients from healthy controls, including those with benign breast disease.6
Affinity-Reagent Array-Based Methods
Although MS-based methods dominate the protein biomarker discovery space, alternative strategies have emerged and are gaining popularity. For example, the protein, antibody, and tissue microarray (TMA) methods represent targeted approaches for protein detection and quantification with high sensitivity as well as multiplexing (up to 5000-plex) capabilities that can be automated for high-throughput sample analysis. While predefined targets introduce a level of bias into the discovery process, affinity-reagent-array-based methods have great clinical potential for monitoring validated disease biomarkers in different sample types (e.g. tissue, plasma, cell lines, biofluids/solids).
A variety of affinity reagents have been generated to support biomarker discovery and development in clinical settings. These include antibodies (e.g. the Olink, Myriad, and Kiloplex1000 platforms) and nucleic-acid-based aptamers (e.g. SomaScan).7 Methods for quantifying protein levels with these assays range from those used in traditional immunoassay protocols to more advanced statistical analysis for array-based platforms.
There are two major protein microarray formats:
- Forward-phase protein arrays (FPPAs)
- Reverse-phase protein arrays (RPPAs)
In an FPPA (commonly referred to as an “antibody array”), a large number of affinity reagents (antibodies) directed against different protein targets are fixed onto a glass array slide. The test sample (e.g. lysate, body fluid) is spread over the array, and the bound proteins are detected by a labeled secondary antibody.8
In an RPPA, the test samples are spotted onto the glass array slides, and each array is incubated with an individual antibody. This format allows for measuring an individual protein across a large number of samples during a single run, which is critical for cancer biomarker discovery and development. Multiplexing can also be achieved by spotting samples from the same lysate onto different arrays that are incubated with different antibodies.8
RPPAs are often preferred over FPPAs, as they require only one antibody for signal detection. Moreover, the signal can be optimized to fit within the dynamic range of the assay by using a dilution series of the test sample on the same array slide. This is an important consideration because antibody affinity is influenced by protein concentration.9
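One way to picture the use of a dilution series is to keep only the dilution points whose signal falls inside the assay's linear range and to scale each back by its dilution factor. The sketch below is a deliberately simplified assumption, not how any particular RPPA analysis package works. The function name, the linear range, and the intensity values are hypothetical.

```python
def estimate_undiluted_signal(dilution_series, linear_range):
    """Estimate the undiluted signal from in-range points of a dilution series.

    `dilution_series` is a list of (dilution_factor, measured_signal) pairs.
    `linear_range` is the (low, high) signal window assumed to be linear.
    """
    low, high = linear_range
    estimates = [
        signal * factor
        for factor, signal in dilution_series
        if low <= signal <= high
    ]
    if not estimates:
        raise ValueError("no dilution point falls inside the linear range")
    return sum(estimates) / len(estimates)

# Hypothetical spot intensities: the undiluted spot is saturated.
series = [(1, 60000.0), (2, 40000.0), (4, 20000.0), (8, 10000.0)]
estimate = estimate_undiluted_signal(series, linear_range=(500.0, 50000.0))
```

Here the undiluted spot exceeds the assumed upper limit and is ignored, while the three diluted spots agree on the back-calculated value. In practice, curve-fitting approaches that model the whole series are generally preferred over this simple average.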
In the clinic, the RPPA platform has shown some promise, particularly when combined with laser capture microdissection to evaluate distinct tumor cell populations. For example, an RPPA capable of measuring total protein and phosphorylated (activated) levels of the HER2 receptor, related family members (EGFR and HER3), and downstream signaling mediators may identify a novel subgroup of patients who respond to a pan-HER small molecule inhibitor.10
Antibody- and Affinity-Based MS Methods
Antibody-based proteomics can also be combined with MS methods to create a workflow incorporating the strengths of immunoassays and MS to rapidly profile a large number of proteins with high sensitivity. Examples of these technologies include antibody-enriched selective reaction monitoring (SRM) and affinity MS.11
The combination of peptide enrichment with antibodies immobilized on affinity columns and MS offers great promise in biomarker discovery, especially for complex samples, such as serum, which contains thousands of proteins at concentrations covering a large dynamic range. For example, this approach has been used to develop an assay that predicts tamoxifen resistance in breast cancer patients.6
Despite the noted strengths of affinity-based arrays, there are challenges with this technology. For instance, the data generated are only as good as the antibodies available. This aspect of targeted approaches to biomarker discovery remains a significant issue, as developing and validating such reagents is not trivial. Most affinity-array-based platforms include only a few hundred antibodies to detect total or modified protein levels.9 With uncertainties in antibody specificity, interpretation of already complex data can be problematic and limit confidence in widespread clinical use.
With the rise of “-omics” technologies, large datasets are being generated at an increasingly rapid pace. As proteomic approaches continue to advance and expand, several repositories and resources (e.g. ProteomeXchange, PRIDE, GPMdb, neXtProt, ProteomicsDB, Human Protein Atlas, PeptideAtlas, SRMAtlas, HPP Browser, and MissingProteinPedia) will play an increasingly important role in facilitating biomarker discovery.12
New computational tools have also been developed that ease the burden of mining the vast amount of existing literature regarding proteomics, support optimization of complex MS experiments, or provide much-needed capacity for processing large datasets.12
Significant effort has also gone into bioinformatic methods specific to biomarker research and discovery. This includes advanced statistical tools for dealing with high-dimensionality datasets, algorithms for training and validating potential biomarkers in clinical data, and ways to integrate multiple biomarker types and prior knowledge.13
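One widely used statistical tool for high-dimensional data is multiple-testing correction: when thousands of proteins are each tested for differential abundance, raw p-values overstate the evidence. The sketch below implements the Benjamini-Hochberg procedure for controlling the false discovery rate, chosen here as an illustrative example rather than as the method of the cited work. The function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical p-values for four candidate protein biomarkers.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
# Candidates kept at a 5% false discovery rate.
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

Each adjusted value is at least as large as its raw p-value, so a candidate that survives correction is less likely to be a chance finding among the many proteins tested.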
Future Technologies in Proteomic Platforms
Affinity-based proteomic platforms will continue to evolve as the number of validated and highly specific antibodies for target quantification increases. In addition, developing more complete affinity reagents that can distinguish the proteoforms of individual proteins will further enhance the application of this technology to biomarker discovery.3
For MS-based proteomics platforms, detecting low-abundance peptides is often challenging due to noise from confounding ions. One potential solution is gas-phase fractionation by ion mobility spectrometry. High-field asymmetric waveform ion mobility spectrometry (FAIMS) has been shown to perform gas-phase separation of peptide ions, leading to enrichment of multiply charged species and therefore a more comprehensive analysis of the proteome.14
Capillary electrophoresis (CE)-MS is an alternative and complementary strategy to HPLC-MS for peptide analysis. The technology is used frequently in clinical studies or for patient assessment due to minimal sample requirements and high reproducibility. It can also detect hydrophilic peptides.15 CE-MS has been utilized to discover cancer biomarkers, particularly in urinary profiles. For example, a panel of 193 peptides was identified that was shared across multiple cancer types and may have application as a general primary or relapsed tumor marker.16
CE has also been optimized with MS-based methods for efficient analysis of middle-mass-range proteins17 and microscale multidimensional fractionation (e.g. dynamic pH-junction-based CE) for improved coverage similar to that delivered by 2D-LC-CE-MS/MS.18
Although CE-MS has been useful in biomarker discovery, the technology has yet to be widely adopted. This is due at least in part to better familiarity with LC-MS in the proteomics field and the associated costs of bringing in new CE equipment.
Cancer biomarkers are crucial to discovering and developing novel cancer therapeutics. They also play a key role in clinical practice, where they have applications in risk assessment, diagnosis, prognosis, and determination of treatment efficacy, safety, and/or relapse.
Proteomics in oncology biomarker discovery is continually evolving. Technologies encompassing MS and antibody- and affinity-based methods have developed greatly over the past few years, opening up the development of biomarkers that can be used for the early detection of cancer and personalized therapeutic strategies. However, it remains a challenge to transfer candidate biomarkers from bench to bedside effectively and efficiently. Overcoming this challenge will take a multidisciplinary effort, with contributions from different fields and a multi-omics approach, for the best chance at translational success.
References and Further Reading
- Hristova, V. A. & Chan, D. W. Cancer biomarker discovery and translation: proteomics and beyond. Expert Review of Proteomics 16, 93–103 (2019).
- Issaq, H. J. & Veenstra, T. D. Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE): Advances and perspectives. BioTechniques 44, 697–700 (2008).
- Kellie, J. F. et al. A new era for proteomics. Bioanalysis 11, 1731–1735 (2019).
- Liu, C. The application of SELDI-TOF-MS in clinical diagnosis of cancers. J. Biomed. Biotechnol. 2011, 245821 (2011).
- Muthu, M., Vimala, A., Mendoza, O. H. & Gopal, J. Tracing the voyage of SELDI-TOF MS in cancer biomarker discovery and its current depreciation trend - need for resurrection? TrAC - Trends in Analytical Chemistry 76, 95–101 (2016).
- De Marchi, T. et al. Targeted MS Assay Predicting Tamoxifen Resistance in Estrogen-Receptor-Positive Breast Cancer Tissues and Sera. J. Proteome Res. 15, 1230–1242 (2016).
- Brody, E. N., Gold, L., Lawn, R. M., Walker, J. J. & Zichi, D. High-content affinity-based proteomics: Unlocking protein biomarker discovery. Expert Review of Molecular Diagnostics 10, 1013–1022 (2010).
- Huang, Y. & Zhu, H. Protein Array-based Approaches for Biomarker Discovery in Cancer. Genomics, Proteomics and Bioinformatics 15, 73–81 (2017).
- Creighton, C. J. & Huang, S. Reverse phase protein arrays in signaling pathways: A data integration perspective. Drug Design, Development and Therapy 9, 3519–3527 (2015).
- Wulfkuhle, J. D. et al. Evaluation of the HER/PI3K/AKT Family Signaling Network as a Predictive Biomarker of Pathologic Complete Response for Patients With Breast Cancer Treated With Neratinib in the I-SPY 2 TRIAL. JCO Precis. Oncol. 1–20 (2018). doi:10.1200/po.18.00024
- Weiß, F. et al. Catch and measure-mass spectrometry-based immunoassays in biomarker research. Biochimica et Biophysica Acta - Proteins and Proteomics 1844, 927–932 (2014).
- Peng, L. et al. Tissue and plasma proteomics for early stage cancer detection. Molecular Omics 14, 405–423 (2018).
- Perera-Bel, J., Leha, A. & Beißbarth, T. Bioinformatic Methods and Resources for Biomarker Discovery, Validation, Development, and Integration. in Predictive Biomarkers in Oncology 149–164 (Springer International Publishing, 2019). doi:10.1007/978-3-319-95228-4_11
- Pfammatter, S. et al. A novel differential ion mobility device expands the depth of proteome coverage and the sensitivity of multiplex proteomic measurements. Mol. Cell. Proteomics 17, 2051–2067 (2018).
- Pontillo, C. et al. CE-MS-based proteomics in biomarker discovery and clinical application. Proteomics - Clinical Applications 9, 322–334 (2015).
- Belczacka, I. et al. Urinary CE-MS peptide marker pattern for detection of solid tumors. Sci. Rep. 8, 1–11 (2018).
- Li, Y., Compton, P. D., Tran, J. C., Ntai, I. & Kelleher, N. L. Optimizing capillary electrophoresis for top-down proteomics of 30-80 kDa proteins. Proteomics 14, 1158–1164 (2014).
- Yang, Z., Shen, X., Chen, D. & Sun, L. Microscale Reversed-Phase Liquid Chromatography/Capillary Zone Electrophoresis-Tandem Mass Spectrometry for Deep and Highly Sensitive Bottom-Up Proteomics: Identification of 7500 Proteins with Five Micrograms of an MCF7 Proteome Digest. Anal. Chem. 90, 10479–10486 (2018).