iPSC WES Genetic Stability Testing
1. Background
Induced pluripotent stem cells (iPSCs) hold tremendous potential across regenerative medicine, disease modeling, and drug screening. However, during prolonged in vitro culture and serial passaging, iPSCs inevitably accumulate genetic mutations that may compromise differentiation capacity and functional properties, or introduce pathogenic and tumorigenic risks, thereby posing serious threats to the safety and efficacy of their clinical applications [1]. Consequently, dynamic monitoring and comprehensive assessment of iPSC genetic stability across different passage stages is a critical component of iPSC quality control and clinical translation.
Whole Exome Sequencing (WES) is an efficient and cost-effective technology for detecting genetic variants. By capturing the exonic regions of the genome (approximately 1-1.5% of the total genome), WES can detect single-nucleotide variants (SNVs) and small insertions/deletions (InDels) at high depth (typically >100X). Because an estimated 85% of known pathogenic mutations reside within exonic regions, WES is well suited to monitoring the genetic stability of iPSCs [2].
This protocol is based on WES technology and is designed to provide a scientifically rigorous, comprehensive, and cost-effective solution for the dynamic monitoring of iPSC genetic stability. By comparing iPSC samples at different passage numbers (e.g., P10, P20, P30), the protocol not only elucidates the patterns of mutation accumulation, but also enables thorough bioinformatics-based assessment of the potential functional impact, pathogenicity, and tumorigenic risk of identified mutations, providing critical scientific evidence for iPSC quality control, passage number selection, and clinical application safety.
2. Technical Principles and Analysis Pipeline
2.1 WES Exon Capture Principle
The core of WES lies in exon capture technology. This technique uses specially designed biotinylated probes to hybridize in solution with a fragmented genomic DNA library. These probes specifically bind to target exonic regions, after which magnetic beads (typically conjugated with streptavidin) are used to capture the probe-bound DNA fragments, thereby enriching the exonic regions. The enriched DNA fragments are then sequenced to generate high-depth exonic sequence information.

Figure 1. Schematic Diagram of WES Exon Capture Principle
2.2 iPSC WES Genetic Stability Testing Workflow
The testing workflow comprises two major stages: laboratory library construction and sequencing, followed by bioinformatics analysis. The central aim is to perform comparative analysis of iPSC samples from different passage numbers in order to assess the dynamic changes in their genetic stability.

Figure 2. iPSC WES Genetic Stability Testing Workflow Diagram
2.3 Core Analytical Framework: From Mutation Accumulation to Risk Assessment
2.3.1 Analysis of Genetic Mutation Accumulation Across Passage Numbers
The core of this protocol is to identify somatic mutations that arise during serial passaging by comparing iPSC samples at different passage numbers (e.g., P10, P20, P30) against an initial reference sample (e.g., fibroblasts or low-passage iPSCs).
(1) Variant Calling: High-sensitivity somatic mutation callers such as GATK Mutect2 are applied to perform paired analysis of each passage sample against the reference sample, identifying SNVs and InDels.
(2) Mutation Accumulation Statistics: The number of high-quality filtered variants unique to each passage sample is tallied. By comparing variant counts across passage numbers, the trend of mutation accumulation can be directly visualized. In general, the cumulative variant count increases with passage number, reflecting the genetic instability of iPSCs during prolonged culture.
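The tallying step in (2) can be sketched as follows. This is a minimal illustration, assuming each sample's filtered calls have already been reduced to (chrom, pos, ref, alt) tuples. The function name and the toy data are hypothetical and are not part of the protocol's actual pipeline.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def count_passage_specific_variants(
    reference: Set[Variant],
    passages: Dict[str, Set[Variant]],
) -> Dict[str, int]:
    """Count variants present in each passage sample but absent from the reference."""
    return {name: len(calls - reference) for name, calls in passages.items()}


# Toy data for illustration only (not real results).
reference = {("chr1", 100, "A", "G")}
passages = {
    "P10": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")},
    "P20": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")},
    "P30": {
        ("chr1", 100, "A", "G"),
        ("chr2", 200, "C", "T"),
        ("chr3", 300, "G", "A"),
        ("chr4", 400, "T", "C"),
    },
}

counts = count_passage_specific_variants(reference, passages)
for name, n in counts.items():
    print(name, n)
```

Plotting these counts against passage number gives the accumulation trend described above.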

Figure 3. Schematic Diagram of iPSC Genetic Stability Assessment
As shown in Figure 3, as iPSCs are passaged from low passage (P10) to high passage (P30), the cellular mutation burden progressively accumulates (red dots represent mutations). The variant count increases from 1,274 at P10 to 1,439 at P30. Concurrently, the number of pathogenic variants also increases, rising from 2 at P10 to 3 at P20, resulting in a risk assessment that escalates from low risk to high risk. This trend underscores the need for strict control over the number of iPSC passages to ensure genetic stability and clinical application safety.
2.3.2 Variant Prioritization and Key Variant Filtering
To identify the most clinically relevant variants from among the thousands of accumulated mutations, we have established a rigorous variant prioritization and filtering framework.
Prioritization Criteria:
Priority | Filtering Criteria |
High | Located in exonic region + amino acid-altering + frequency <0.01 in East Asian population + predicted damaging by at least one tool + not in repeat region |
Likely high | Located in exonic or splicing region + frequency <0.01 in East Asian population + predicted damaging by at least one tool + not in repeat region |
Medium | Located in exonic or splicing region + frequency <0.01 in East Asian population + not in repeat region |
Low | All remaining variants |
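The prioritization table can be read as a cascade of checks. The sketch below is one possible encoding. The class, field names, and the decision to treat a variant missing from the population database as rare are illustrative assumptions rather than details stated in this protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnnotatedVariant:
    region: str                     # e.g. "exonic", "splicing", "intronic"
    amino_acid_altering: bool
    eas_frequency: Optional[float]  # None when absent from the population database
    damaging_predictions: int       # number of tools predicting a damaging effect
    in_repeat_region: bool


def assign_priority(v: AnnotatedVariant, max_frequency: float = 0.01) -> str:
    """Assign a priority tier following the criteria table."""
    # Assumption: a variant absent from the population database counts as rare.
    rare = v.eas_frequency is None or v.eas_frequency < max_frequency
    coding_or_splice = v.region in ("exonic", "splicing")
    if not (coding_or_splice and rare and not v.in_repeat_region):
        return "Low"
    if v.damaging_predictions >= 1:
        if v.region == "exonic" and v.amino_acid_altering:
            return "High"
        return "Likely high"
    return "Medium"


example = AnnotatedVariant("exonic", True, 0.001, 2, False)
print(assign_priority(example))
```

Checking the shared conditions first (region, frequency, repeat status) keeps each tier's extra requirement visible in a single branch.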
Key Variant Filtering Criteria:
Building on the above prioritization, variants with the greatest potential impact on cellular function and safety are further selected as "key variants":
(1) Functional Impact: Synonymous mutations (no amino acid change) and variants of unknown function are excluded.
(2) Clinical Significance: Variants classified as "Benign" or "Likely Benign" in ClinVar are excluded.
(3) Disease Association: Variants with no association to known diseases in OMIM are excluded.
Following this multi-step filtering, a compact, highly enriched list of potentially deleterious key variants is generated, providing the foundation for subsequent manual review and risk assessment.
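The three exclusion rules can be sketched as a single predicate. The argument names and the label spellings (for example "synonymous SNV", "Likely_benign", and "." for a missing value) are assumptions about how an annotation table might encode these fields and should be adapted to the actual column contents.

```python
EXCLUDED_FUNCTIONS = {"synonymous snv", "unknown"}
BENIGN_LABELS = {"benign", "likely benign"}


def is_key_variant(exonic_function: str, clinvar_significance: str, omim_phenotype: str) -> bool:
    """Return True when a variant passes all three key-variant filters."""
    # (1) Functional impact: drop synonymous and unknown-function variants.
    if exonic_function.strip().lower() in EXCLUDED_FUNCTIONS:
        return False
    # (2) Clinical significance: drop ClinVar Benign / Likely Benign variants.
    labels = {
        part.strip().lower().replace("_", " ")
        for part in clinvar_significance.split("/")
    }
    if labels & BENIGN_LABELS:
        return False
    # (3) Disease association: drop variants with no OMIM disease entry.
    if omim_phenotype.strip() in ("", "."):
        return False
    return True


print(is_key_variant("nonsynonymous SNV", "Pathogenic", "Li-Fraumeni syndrome"))
print(is_key_variant("synonymous SNV", "Pathogenic", "Li-Fraumeni syndrome"))
```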
2.3.3 Comprehensive Variant Annotation and Risk Assessment
Comprehensive functional annotation of the filtered key variants is essential for assessing their potential risk. We utilize the ANNOVAR annotation tool, integrating dozens of authoritative databases to perform multi-dimensional, in-depth characterization of each variant.
Multi-dimensional Annotation Framework:
Annotation Category | Example Databases | Annotation Purpose |
Genomic Region Annotation | RefSeq, Ensembl | Determine the genomic location of each variant (exon, intron, UTR, promoter, etc.). |
Population Frequency Databases | gnomAD, 1000G, ExAC | Assess variant frequency in the general population; rare variants warrant greater scrutiny. |
Clinical Phenotype Databases | ClinVar, OMIM | (Pathogenicity assessment) Associate variants with known genetic diseases and clinical phenotypes. |
Cancer-related Databases | COSMIC | (Tumorigenicity assessment) Identify somatic mutations associated with cancer. |
Functional Prediction Algorithms | SIFT, PolyPhen2, CADD | Predict the potential deleterious effect of non-synonymous coding variants. |
Through this comprehensive annotation framework, three core assessment objectives are achieved:
(1) In-depth Understanding of Mutational Impact: Clarify the specific functional consequences of each variant and distinguish benign variants from potentially deleterious ones.
(2) Pathogenicity and Tumorigenicity Risk Assessment: Identify mutations associated with known diseases or cancers, providing critical evidence for the clinical safety of iPSC-based applications.
(3) Genetic Stability Assessment: Comprehensively evaluate whether harmful mutations have accumulated during iPSC passaging, providing scientific guidance for quality control and passage number selection.
3. Technical Advantages
Advantage Dimension | Description |
Dynamic Monitoring of Genetic Stability | By comparing samples from different passage numbers, mutation accumulation patterns in iPSCs during culture can be dynamically monitored, providing direct evidence for genetic stability assessment. |
Exceptional Cost-effectiveness | By focusing on the most functionally relevant exonic regions, the core information needed for genetic stability assessment can be obtained at approximately 1/10 the cost of WGS, making this approach particularly suitable for multi-timepoint longitudinal studies. |
High Depth and High Sensitivity | Sequencing depth is typically >100X, far exceeding WGS (30-50X), enabling more sensitive detection of low-frequency somatic mutations. |
Focus on Functional Variants | By directly targeting exonic regions relevant to protein function, detected variants are more likely to have functional consequences, with a well-defined analytical focus. |
Comprehensive Risk Assessment | Integration of dozens of authoritative databases enables comprehensive annotation and in-depth assessment of the potential functional impact, pathogenicity, and tumorigenicity risk of each variant. |
Established Analysis Pipeline | Using the internationally recognized GATK toolkit and ANNOVAR annotation pipeline ensures accurate, reliable, and reproducible analytical results. |
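To illustrate why depth matters for low-frequency somatic mutations, the sketch below uses a simple binomial model of read sampling. The 5% variant allele fraction and the requirement of at least 3 variant-supporting reads are illustrative assumptions, not thresholds defined by this protocol, and the model ignores sequencing error and capture bias.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int = 3) -> float:
    """P(at least min_alt_reads variant-supporting reads) under Binomial(depth, vaf)."""
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


# Compare a typical WGS depth with a typical WES depth at an assumed 5% VAF.
for depth in (30, 100):
    print(f"{depth}X: {detection_probability(depth, 0.05):.3f}")
```

Under these assumptions the probability of observing enough supporting reads is several times higher at 100X than at 30X, which is the intuition behind the sensitivity advantage listed in the table.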
4. Application Scenarios
(1) iPSC Quality Control: Serves as a routine quality control measure during iPSC line establishment and serial passaging.
(2) Passage Number Optimization: By monitoring mutation accumulation across passage numbers, this protocol provides scientific evidence for defining a safe passaging window for iPSCs.
(3) Clone Selection: Among multiple iPSC clones, the genetically most stable clones with the lowest mutation burden are selected for downstream research or clinical application.
(4) Preclinical Safety Assessment: Constitutes an essential component of the safety assessment required before iPSC-derived cell therapy products enter clinical trials.
(5) Culture Condition Optimization: Compares iPSC genetic stability under different culture systems or conditions to identify optimal culture protocols.
5. Service Content and Sample Submission Requirements
5.1 Testing Service Content
Service Stage | Service Content |
Experimental Stage | Sample DNA quality inspection, WES library construction, exon capture, high-throughput sequencing (>100X). |
Bioinformatics Analysis | Raw data quality control, reference genome alignment, SNV/InDel calling, cross-passage comparative analysis, mutation accumulation statistics, variant prioritization, comprehensive functional annotation and risk assessment. |
Deliverables | Complete analysis report, raw sequencing data (FASTQ), alignment files (BAM), variant calling results (VCF), detailed variant annotation tables. |
5.2 Sample Submission Requirements
Category | Specific Requirements |
Basic Service Options | 1) DNA extraction and quality inspection service available; 2) PCR-free library construction and sequencing service available (client must provide high-quality DNA); 3) Stand-alone bioinformatics analysis service available (client must provide raw FASTQ files). |
Cell Sample Requirements | 1) Cell number: minimum 1x10^6 cells, recommended >=5x10^6 cells; 2) Viability: cell viability >=80%; 3) Shipping: dry ice, fully frozen throughout transport; 4) Integrity: normal cell morphology, no apparent fragmentation. |
DNA Sample Requirements | 1) Total DNA: >=2 ug, quantified by Qubit; 2) Concentration: >=50 ng/uL; 3) Purity: OD260/280 = 1.8-2.0, OD260/230 >=1.8; 4) Integrity: main electrophoresis band >10 kb, no apparent degradation; 5) Shipping: ice pack at -20 degrees C. |
Reference Sample Requirements | 1) Unedited parental iPSCs must be provided as a reference for filtering background variants. Reference samples must be from the same batch and cultured under identical conditions as the experimental samples, meeting the same quality standards. |
Additional Client Information | 1) Sample type and designation; 2) Experimental design and grouping information; 3) iPSC origin, culture conditions, passage number, and other relevant details. |
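The numeric DNA sample requirements above lend themselves to a pre-submission check. The sketch below is a hypothetical helper, not a tool provided with this service. It covers only the quantity, concentration, and purity thresholds; the electrophoresis integrity criterion still needs to be assessed from the gel.

```python
from typing import List


def check_dna_sample(
    total_ug: float,
    conc_ng_per_ul: float,
    od260_280: float,
    od260_230: float,
) -> List[str]:
    """Return a list of unmet DNA sample requirements (empty if all are met)."""
    problems = []
    if total_ug < 2.0:
        problems.append("total DNA below 2 ug")
    if conc_ng_per_ul < 50.0:
        problems.append("concentration below 50 ng/uL")
    if not 1.8 <= od260_280 <= 2.0:
        problems.append("OD260/280 outside 1.8-2.0")
    if od260_230 < 1.8:
        problems.append("OD260/230 below 1.8")
    return problems


print(check_dna_sample(3.0, 80.0, 1.85, 2.1))
print(check_dna_sample(1.5, 40.0, 1.6, 1.9))
```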
6. References
[1] Abyzov, A., et al. (2012). Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells. Nature, 492(7429), 438-442.
[2] Choi, M., et al. (2009). Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proceedings of the National Academy of Sciences, 106(45), 19096-19101.
[3] Gore, A., et al. (2011). Somatic coding mutations in human induced pluripotent stem cells. Nature, 471(7336), 63-67.
[4] Martins Taylor, K., & Nisler, B. S. (2018). A review of genomic and epigenomic stability of human induced pluripotent stem cells. Stem Cell Reviews and Reports, 14(5), 625-634.