Review best practices in cancer cell line authentication, including a comparison of STR and SNP assays.
The Importance of Authenticating Cancer Cell Lines
Preclinical cancer research models - such as human xenograft tumors, murine homograft tumors, human and murine cell lines, and organoids - are widely used in cancer research and drug development. Since these models are used for a broad spectrum of applications, they are often heavily utilized (and shared) across labs worldwide.
Consequently, the cell lines used to establish these cancer models can quickly become contaminated, cross-contaminated, or even misidentified. The prevalence of contaminated cell lines is estimated at 20%, with the most frequent contaminant being HeLa cells, followed by T24 bladder cancer cells.
An estimated $700 million per year in funding goes to studies employing contaminated or misidentified cell lines. These studies are at increased risk of reaching erroneous conclusions, and a resulting retraction can damage a researcher's reputation. Cell line and model misidentification also contributes to the sometimes poor reproducibility of preclinical studies.
It’s become vitally important for researchers and CROs to implement routine assays for quality control of their cultures, to validate identity and quality before establishing a tumor model. Although cell line authentication has been widely recommended for many years, contamination, cross-contamination, and misidentification remain a serious problem.
This post reviews two of the most common genomic-based assays for cell authentication, short tandem repeat (STR) and single nucleotide polymorphism (SNP) assays, along with their strengths and limitations and the guidelines for determining identity.
STR or SNP: Choosing the Best Approach
How Does Short Tandem Repeat (STR) Testing Work?
STR testing employs primers that amplify repeated DNA segments, with repeat units typically 2–6 base pairs long. STR has been especially valuable for authenticating human cell lines, since the number of repeats at each genetic site varies within the human population. STR genotyping can therefore track the genetic identity of tumors of human origin, including both cell line-derived and patient-derived xenografts (PDX).
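To make the idea of "number of repeats at a site" concrete, here is a minimal Python sketch that counts back-to-back copies of a repeat unit in a DNA string. It is only an illustration: the `GATA` motif, the flanking bases, and the function name `longest_tandem_repeat` are made up for this example, and laboratory STR assays read repeat number from amplified fragment sizes rather than by scanning sequence text.

```python
import re


def longest_tandem_repeat(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    # Match one or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical alleles at one site: same flanks, different repeat counts.
allele_a = "TTCA" + "GATA" * 9 + "CCGT"
allele_b = "TTCA" + "GATA" * 12 + "CCGT"

# The genotype at this site is the set of repeat counts seen across both alleles.
genotype = sorted({longest_tandem_repeat(allele_a, "GATA"),
                   longest_tandem_repeat(allele_b, "GATA")})
print(genotype)  # [9, 12]
```

Two samples from different donors would usually differ in these counts at several sites, which is what makes the combined profile useful as an identifier.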
The American Type Culture Collection (ATCC) has published robust guidelines that have standardized STR analysis for human cell line authentication.
STR Testing Limitations
One main limitation of STR assays is inadequate accuracy, especially for samples from related individuals or with a close genetic background. Standard STR assays are designed for human cells and are not applicable to murine tumors, which originate from a limited number of inbred experimental mouse strains and therefore lack the unique markers needed to readily identify individual murine tumor models. STR profiling also cannot distinguish between cell lines established from the same human donor.
What are Single Nucleotide Polymorphism (SNP) Assays?
While STR has traditionally been considered the gold standard of authentication, its limitations have prompted the development of improved methods. Advances in high-throughput sequencing technologies have led to robust and cost-efficient SNP profiling assays. SNP testing is increasingly being used to complement or replace STR authentication due to its improved accuracy.
SNP testing relies upon the natural variation in single DNA nucleotides among individuals in a population. For instance, a SNP may replace the nucleotide cytosine (C) with thymine (T) in a specific region.
SNPs can authenticate murine cell lines as well as human tumor models, providing a key advantage compared to standard STR profiling. Mismatch repair (MMR) deficient human cell lines that can be misclassified by STR profiling can also be identified using SNPs, and testing can also:
- Determine gender and ethnicity
- Detect viral infection and mycoplasma contamination
- Characterize common immunodeficient mouse strains.
Strengths and Limitations of the STR and SNP Assays
The main strengths and limitations of STR and SNP assays are compiled here for reference.
Assay | Strengths | Limitations |
---|---|---|
STR | Useful to authenticate human cell lines; databases with known STR profiles are available, allowing data from different laboratories to be compared to reference profiles; the DSMZ (Leibniz Institute) and ATCC repositories have large, online, and freely accessible collections of profiles to find profile matches | Often inadequately accurate, especially with kinship or close genetic background (e.g. not applicable for authenticating murine tumor models); unreliable for samples with varying levels of DNA degradation (e.g. formalin-fixed paraffin embedded (FFPE) tissues); cannot determine absence of interspecies cross-contamination and cannot identify chromosomal copy number aberrations |
SNP | Efficient and robust next generation sequencing (NGS) arrays to authenticate tumor models; applicable to murine as well as human tumor models; can be adopted for quality assurance of banking and tracking tumor model collections; can determine gender and ethnicity and detect viral infection and mycoplasma contamination | Lacking a searchable database for identifying cell lines |
What Threshold of Match Probability Is Used for Determining Identity?
Interpreting authentication tests requires matching criteria to distinguish between “related” and “unrelated” samples. For instance, using the STR-based “Masters algorithm” or “Alternative Masters algorithm,” cell lines with a ≥80% match are generally considered related (i.e. derived from the same patient or donor), while those with a match of 56% to just under 80% require further profiling (e.g. alternate test methods).
The matching algorithm is based on comparing the number of shared alleles between two cell line samples, expressed as a percentage. An authenticated sample is selected as a “reference” profile, while the sample undergoing authentication is the “questioned” profile. The matching algorithm to determine the percent match between two cell lines is as follows:
Percent Match = (number of shared alleles in both STR profiles ÷ total number of alleles in the questioned profile) × 100 (note: a homozygous allele is counted as one allele)
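The calculation above can be sketched in a few lines of Python. This is an illustrative sketch, not a validated implementation: the function name `percent_match`, the dictionary layout, and the allele calls are assumptions made for the example. Storing each locus as a set means a homozygous allele is counted once, as the note requires.

```python
from typing import Dict, Set

Profile = Dict[str, Set[str]]  # locus name -> set of allele calls (assumed layout)


def percent_match(questioned: Profile, reference: Profile) -> float:
    """Shared alleles divided by total alleles in the questioned profile, times 100."""
    shared = 0
    total = 0
    for locus, alleles in questioned.items():
        total += len(alleles)
        # Alleles at this locus that also appear in the reference profile.
        shared += len(alleles & reference.get(locus, set()))
    if total == 0:
        raise ValueError("questioned profile contains no alleles")
    return 100.0 * shared / total


# Hypothetical allele calls, for illustration only.
reference = {"TH01": {"7"}, "TPOX": {"8", "12"},
             "vWA": {"16", "18"}, "D5S818": {"11", "12"}}
questioned = {"TH01": {"7"}, "TPOX": {"8", "12"},
              "vWA": {"16", "18"}, "D5S818": {"11", "13"}}

print(round(percent_match(questioned, reference), 1))  # 85.7 (6 of 7 alleles shared)
```

In this made-up example the questioned profile has seven alleles, six of which also occur in the reference, so the samples fall above the 80% mark.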
When two samples are judged to be related (≥80% match), keep in mind that they may have originated from:
- The same cell line
- Different cell lines that were derived from the same donor (e.g. different tissues of the same donor)
- A related cell line (e.g. daughter or sister cell line) or
- A cell line that may be misidentified or cross-contaminated.
Since genetic variations may occur in cell lines that are maintained in culture, these algorithms allow for some variation. The choice of ≥80% as a suitable threshold is based on published work from the ATCC ASN-0002 Standard Workgroup, which examined eight core loci in related cell line samples from five cell banks. With these criteria, 98% of related cell line samples are successfully authenticated using the matching algorithm.
In a small number of cell lines, STR profiles show a greater degree of instability, which results in a percent match of <80% even between related samples. The ATCC Workgroup indicates that unrelated cell lines generally show percent matches of 55% or less. Based on this threshold, cell lines below 56% are considered unrelated, and those from 56% to just under 80% are probably unrelated, although further data are needed to confidently confirm or refute this conclusion.
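The thresholds described in this section can be written as a small decision helper. The following Python sketch is an assumption-laden illustration: the function name `classify_match` and the category labels are invented here, and the handling of fractional values between 55% and 56% (treated as unrelated) is a choice made for the example rather than something the guidelines spell out.

```python
def classify_match(percent: float) -> str:
    """Map a percent match to the interpretation bands described in the text."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent match must be between 0 and 100")
    if percent >= 80.0:
        return "related"
    if percent >= 56.0:
        # Ambiguous band: follow up with further profiling or another method.
        return "further profiling needed"
    return "unrelated"


for value in (85.7, 70.0, 40.0):
    print(value, classify_match(value))
```

A "related" result still needs interpretation, since it covers the same cell line, a derivative, another line from the same donor, or a cross-contaminated culture.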
How Often Should I Authenticate My Cell Lines?
While there are no definitive guidelines for how often cell lines should be authenticated, the following are a few factors to consider:
- When a cell line is received from an unreliable source
- After ten passages
- After preparing a cell bank
- When in doubt!
Overall, it is best to authenticate cell lines frequently, and scientific journals and funding agencies are increasingly requesting this information. Presumably, there is always at least a small amount of doubt about any cell line that has not been authenticated recently.
Conclusion
With the importance of cell cultures in biomedical research, correct cell line authentication is key for generating reliable and reproducible data. However, cell line contamination, cross-contamination, and misidentification continue to be widespread problems, which are expected to expand given the increasing number of cell lines generated and the high rate of cell culture use in worldwide research laboratories.
The problem is exacerbated by significant gaps in basic quality control principles related to authentication. A routine and rigorous authentication process can help ensure your laboratory is using the correct cell lines that are in optimal health.