WGS-DSB Off-Target Detection
1. Background
Gene editing technologies represented by CRISPR/Cas9 introduce DNA double-strand breaks (DSBs) at specific genomic loci, greatly advancing both life science research and gene therapy. However, safety concerns - particularly off-target effects - remain a fundamental challenge that must be resolved before clinical application. In contrast to base editing, which primarily induces single-nucleotide mutations, DSB-type gene editing gives rise to a more complex and diverse spectrum of off-target events [1].
Diverse Off-target Variant Types: The DSB repair process, particularly the error-prone non-homologous end joining (NHEJ) pathway, generates not only small insertions/deletions (InDels), but can also cause large-scale structural variants (SVs) such as large deletions, duplications, inversions, and chromosomal translocations, as well as copy number variants (CNVs) [2].
sgRNA-dependent Off-targets: Similar to base editing, the primary off-target events of DSB-type editing occur at genomic sites sharing sequence homology with the sgRNA.
To identify DSB-induced off-target events comprehensively and without bias, from single-nucleotide changes to chromosome-level rearrangements, high-depth whole genome sequencing (WGS) is widely regarded as the gold-standard approach. By deep sequencing edited cells alongside unedited control cells and comparing the two with rigorous bioinformatics analysis, WGS surveys editing-induced variants across the entire genome at single-base resolution, providing comprehensive data support for the safety assessment of DSB-type gene editing systems.
2. Technical Principles and Analysis Pipeline
This protocol integrates high-depth WGS with a DSB-specific, multi-dimensional advanced bioinformatics analysis pipeline, designed to provide the most scientifically rigorous off-target effect assessment for both research and clinical applications.
2.1 CRISPR/Cas9 Editing Principle
The CRISPR/Cas9 system uses an sgRNA to direct the Cas9 nuclease to a specific genomic locus. About 3 bp upstream of the PAM sequence (for SpCas9), the two nuclease domains of Cas9, HNH and RuvC, each cleave one DNA strand to generate a DSB. The cellular DNA damage repair system then responds, primarily through two pathways:
Non-homologous End Joining (NHEJ): A rapid but error-prone repair pathway that directly ligates the broken DNA ends, frequently introducing random base insertions or deletions (InDels) and resulting in gene knockout.
Homology-directed Repair (HDR): A precise repair pathway that uses an exogenously provided homologous DNA template to achieve accurate sequence replacement or gene insertion at the break site.
In most somatic cells, NHEJ is the predominant repair mechanism; therefore, DSB-type gene editing primarily generates InDel mutations.

Figure 1. Schematic Diagram of CRISPR/Cas9 Editing Principle
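The relationship between protospacer, PAM, and break position can be sketched in a few lines. This is a minimal illustration only, assuming SpCas9 with an NGG PAM and a blunt cut 3 bp upstream of the PAM; the function name `spcas9_cut_sites` and the example sequences are hypothetical, and only the forward strand is searched.

```python
import re


def spcas9_cut_sites(sequence: str, protospacer: str) -> list[int]:
    """Return 0-based cut indices for exact protospacer + NGG matches.

    A returned index c means the break lies between sequence[c - 1]
    and sequence[c]. Assumption: SpCas9 cuts 3 bp upstream of the PAM.
    Only the forward strand is scanned in this sketch.
    """
    sequence = sequence.upper()
    protospacer = protospacer.upper()
    # Lookahead so that overlapping matches are not skipped.
    pattern = re.compile(f"(?={re.escape(protospacer)}[ACGT]GG)")
    cuts = []
    for match in pattern.finditer(sequence):
        pam_start = match.start() + len(protospacer)
        cuts.append(pam_start - 3)
    return cuts


# Hypothetical 20 nt protospacer followed by a TGG PAM.
guide = "GACGTTACCGGATTCAGCTA"
locus = "TT" + guide + "TGG" + "AA"
cut_positions = spcas9_cut_sites(locus, guide)
```

A production search would also scan the reverse complement and tolerate mismatches; both are left out here to keep the coordinate arithmetic visible.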
2.2 WGS-DSB Off-target Detection Workflow
Our detection workflow follows strictly standardized procedures, comprising three core stages - laboratory testing, data processing, and advanced analysis - to ensure data accuracy and analytical depth.
Laboratory Stage: Edited and control samples (cells, gDNA, etc.) provided by the client are received and subjected to rigorous quality control, followed by high-quality WGS library construction and sequencing at >=50X depth on a high-throughput sequencing platform.
Data Processing Stage: Raw sequencing data undergo comprehensive quality assessment and filtering to generate high-quality clean data. Reads are then aligned to the reference genome using mainstream aligners such as BWA-MEM, followed by duplicate removal, base quality score recalibration, and other standard processing steps.
Advanced Analysis Stage: This is the core of the protocol, encompassing a multi-dimensional, high-stringency analytical strategy for the diverse variant types induced by DSBs. See Section 2.3 for details.
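As a rough illustration of the Data Processing Stage, the sketch below assembles the command lines for one sample without running them. Only BWA-MEM is named in this protocol; the use of fastp for read filtering, samtools for sorting, and GATK4 for duplicate removal and base quality score recalibration is an assumption made for the example, as are the function name and all file names.

```python
import shlex


def build_preprocessing_commands(sample, fastq1, fastq2, reference,
                                 known_sites, threads=8):
    """Return shell command strings for one sample (nothing is executed).

    Tool choices other than BWA-MEM are assumptions for illustration.
    """
    q = shlex.quote
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    clean1 = f"{sample}_clean_R1.fq.gz"
    clean2 = f"{sample}_clean_R2.fq.gz"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    recal_table = f"{sample}.recal.table"
    final_bam = f"{sample}.recal.bam"
    return [
        # 1. Read-level quality filtering to produce clean data.
        f"fastp -i {q(fastq1)} -I {q(fastq2)} -o {q(clean1)} -O {q(clean2)}",
        # 2. Alignment with BWA-MEM, piped into a coordinate sort.
        f"bwa mem -t {threads} -R {q(read_group)} {q(reference)} "
        f"{q(clean1)} {q(clean2)} | "
        f"samtools sort -@ {threads} -o {q(sorted_bam)} -",
        # 3. Duplicate removal.
        f"gatk MarkDuplicates -I {q(sorted_bam)} -O {q(dedup_bam)} "
        f"-M {q(metrics)} --REMOVE_DUPLICATES true",
        # 4. Base quality score recalibration: build the model...
        f"gatk BaseRecalibrator -R {q(reference)} -I {q(dedup_bam)} "
        f"--known-sites {q(known_sites)} -O {q(recal_table)}",
        # 5. ...and apply it.
        f"gatk ApplyBQSR -R {q(reference)} -I {q(dedup_bam)} "
        f"--bqsr-recal-file {q(recal_table)} -O {q(final_bam)}",
    ]


commands = build_preprocessing_commands(
    "WT", "WT_R1.fq.gz", "WT_R2.fq.gz", "ref.fa", "known_sites.vcf.gz")
```

Building the commands as data rather than running them directly makes it easy to apply the identical steps to the edited and control samples, which the downstream comparison relies on.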
2.3 Core Analytical Logic for DSB Off-target Analysis
To precisely distinguish genuine DSB-induced off-target signals from background noise within large-scale genomic datasets, we have designed a rigorous, multi-tier variant filtering and classification pipeline, with an independent analysis workflow for each variant type:

Figure 2. WGS-DSB Off-target Detection and Genome Stability Assessment Workflow
This workflow diagram illustrates the complete path from WGS sequencing to final analytical results, comprising three parallel analysis branches. Left branch: SNV/InDel off-target analysis, employing a full pipeline of three-tool joint detection, quality control, control filtering, three-tool intersection, sequence similarity screening, and final filtering. Middle branch: SV off-target analysis, using Manta detection, control filtering to obtain somatic SVs, and intersection with Cas-OFFinder predictions. Right branch: CNV detection for genome stability assessment. The three analytical branches complement each other to form a complete safety assessment framework for DSB-type gene editing.
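The core set logic of the SNV/InDel branch (control filtering, then intersection across callers, then restriction to the neighborhood of predicted sites) can be sketched as follows. This is a simplified illustration under stated assumptions: variants are reduced to (chrom, pos, ref, alt) keys, the caller names are placeholders because the three tools are not named here, and the 20 bp window is an arbitrary example value.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def consensus_edited_only(edited_by_caller, control_by_caller):
    """Keep variants absent from the control and reported by every caller."""
    per_caller = []
    for caller, edited_calls in edited_by_caller.items():
        control_calls = control_by_caller.get(caller, set())
        # Control filtering: drop anything also seen in the unedited sample.
        per_caller.append(set(edited_calls) - set(control_calls))
    # Multi-tool intersection: require agreement of all callers.
    return set.intersection(*per_caller) if per_caller else set()


def near_predicted_site(variant, predicted_sites, window=20):
    """True if the variant lies within `window` bp of a predicted site.

    predicted_sites: iterable of (chrom, start, end) tuples, e.g. parsed
    from an off-target prediction tool's output (format assumed here).
    """
    return any(
        chrom == variant.chrom and start - window <= variant.pos <= end + window
        for chrom, start, end in predicted_sites
    )


# Hypothetical toy data; caller names are placeholders.
germline = Variant("chr1", 1000, "A", "G")
off_target = Variant("chr2", 5010, "AT", "A")
single_caller_noise = Variant("chr3", 42, "C", "T")

edited = {
    "caller_a": {germline, off_target, single_caller_noise},
    "caller_b": {germline, off_target},
    "caller_c": {germline, off_target},
}
control = {name: {germline} for name in edited}

candidates = consensus_edited_only(edited, control)
predicted = [("chr2", 5000, 5023)]
final_calls = {v for v in candidates if near_predicted_site(v, predicted)}
```

Exact-key matching is the simplest possible comparison; real InDel calls usually need left-normalization before keys from different callers can be compared reliably.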

Figure 3. Schematic Diagram of DSB-induced Off-target Variant Types
This figure illustrates the four major variant types that can be generated by DSB-type gene editing, highlighting the complexity of its off-target effects.

Figure 4. Schematic Diagram of DSB On-target and Off-target Types
This figure compares on-target editing with sgRNA-dependent off-target events in terms of sequence matching and repair outcomes, emphasizing the diverse mutational consequences that may arise at off-target sites.
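Sequence matching between the sgRNA and a candidate site can be made concrete with a small mismatch comparison. The sketch below is illustrative only: the function names, the example sequences, and the NGG PAM check (which assumes SpCas9) are not taken from the pipeline itself.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions at which the site differs from the guide."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))


def mismatch_map(guide: str, site: str) -> str:
    """Render matches as '.' and mismatches as the site base in lowercase."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return "".join(
        "." if g == s else s.lower()
        for g, s in zip(guide.upper(), site.upper())
    )


def has_ngg_pam(pam: str) -> bool:
    """Check a 3 nt PAM against the NGG pattern (SpCas9 assumption)."""
    pam = pam.upper()
    return len(pam) == 3 and pam[0] in "ACGT" and pam[1:] == "GG"


# Hypothetical guide and candidate off-target site.
guide = "GACGTTACCGGATTCAGCTA"
candidate = "GACCTTACCGGATTCAGTTA"
n_mismatches = count_mismatches(guide, candidate)
rendered = mismatch_map(guide, candidate)
```

A site passing a chosen mismatch threshold and carrying a valid PAM would then be treated as sgRNA-dependent; the threshold itself is a design decision not fixed by this sketch.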
3. Technical Advantages
Advantage | Description |
Whole-genome Scanning | WGS scans the entire genome in an unbiased manner without relying on any prediction algorithms, and is internationally recognized as the most comprehensive method for off-target assessment. |
Multi-dimensional Detection, Comprehensive Coverage | The defining feature of this protocol. In addition to comprehensive off-target analysis covering SNVs, InDels, and SVs, CNV detection is also provided as a key assessment of genome stability, ensuring no blind spots in the safety evaluation. |
High-confidence Identification, Reliable Results | Multi-tool joint detection combined with stringent sequence similarity screening maximally eliminates false positives, ensuring the highest level of result reliability. |
Multi-tier Filtering, Maximum Stringency | Integrating quality control, control filtering, and GC/repeat region filtering as multiple data cleaning strategies ensures that the final delivered off-target list is clean and trustworthy. |
High-depth Sequencing, High Sensitivity | A sequencing depth of >=50X provides high sensitivity for off-target events; the lower a variant's allele frequency, the fewer supporting reads it yields at a given depth. |
Quantitative Analysis, Comprehensive Assessment | Beyond providing a list of off-target sites, editing efficiency at each site is precisely quantified and detailed functional annotation is provided, enabling comprehensive assessment of biological risk. |
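The link between sequencing depth, allele frequency, and sensitivity in the table above can be explored with a simple binomial model, alongside the per-site editing frequency calculation. This is a back-of-the-envelope sketch, assuming reads are sampled independently, depth is exactly the stated value at the site, and a call requires at least `min_alt_reads` supporting reads; the threshold of 3 is an illustrative assumption, not a pipeline parameter.

```python
from math import comb


def detection_probability(depth: int, allele_fraction: float,
                          min_alt_reads: int = 3) -> float:
    """P(X >= min_alt_reads) for X ~ Binomial(depth, allele_fraction)."""
    p_below = sum(
        comb(depth, k)
        * allele_fraction ** k
        * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_below


def editing_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the edited allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


# Compare a heterozygous clonal event with a rare subclonal one at 50X.
p_clonal = detection_probability(50, 0.5)
p_rare = detection_probability(50, 0.01)
site_frequency = editing_frequency(12, 50)
```

Under these assumptions the clonal event is detected almost always, whereas the 1% event is usually missed at 50X, which is why the expected allele frequency of the sample should inform the choice of depth.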
4. Application Scenarios
(1) CRISPR/Cas9 Drug IND Filing: Provides a comprehensive and rigorous off-target safety assessment report meeting regulatory requirements.
(2) Gene Editing System Optimization and Comparison: Quantitatively compares the specificity and off-target profiles of different Cas9 variants or sgRNA designs, guiding iterative optimization of the editing system.
(3) sgRNA Screening and Validation: Through parallel off-target detection, identifies the optimal guide sequence combining high activity with maximum safety from among multiple sgRNA candidates.
(4) Gene/Cell Therapy Product QC Release: Serves as a critical quality control checkpoint in the manufacturing of gene-edited cell products, ensuring the safety of every batch.
(5) Basic Scientific Research: Enables in-depth investigation of DSB-induced off-target mechanisms, repair patterns, and distribution characteristics across different genomic contexts.
5. Sample Report
5.1 Data Quality Control and Alignment
Sample | Clean reads | Clean bases | GC content | N rate | Q20 | Q30 | Clean ratio |
WT | 893526612 | 127.53G | 42.07% | 0.007% | 99.01% | 96.15% | 89.82% |
KO-691 | 955516290 | 135.47G | 41.72% | 0.015% | 99.08% | 96.30% | 90.91% |

Figure 5. Sequencing Depth and Coverage Visualization After Genome Alignment
5.2 Key Results

Figure 6. Venn Diagram of Variant Intersections Across Detection Algorithms

Figure 7. On-target Editing Visualization

Figure 8. sgRNA-dependent Off-target Mismatch Map
6. Service Contents and Sample Requirements
6.1 Service Contents
We provide a one-stop service from experimental design consultation to final report delivery, ensuring smooth project execution.
Service Item | Service Content |
Project Consultation | Senior technical experts assist in designing a rigorous experimental plan and defining sample and information requirements. |
Sample Testing | Standardized sample quality inspection, library construction, and high-depth WGS sequencing. |
Data Analysis | Execution of the DSB-specific multi-dimensional advanced bioinformatics analysis pipeline described above. |
Report Delivery | Delivery of a comprehensive PDF report and complete analysis result files within the committed turnaround time (35-40 business days). |
After-sales Support | Professional report interpretation and ongoing technical consultation. |
6.2 Sample and Information Requirements
Accurate analysis depends on high-quality samples and complete information. Please prepare strictly according to the following requirements:
Requirement Category | Item | Specific Requirements |
Sample Submission Requirements | Genomic DNA (gDNA) | • Total amount: >= 2 µg (Qubit quantification) |
Sample Submission Requirements | Cell Samples | • Cell number: >= 5 x 10^6 cells |
Sample Submission Requirements | Tissue Samples | • Weight: >= 100 mg |
Required Information | sgRNA Information | • Complete 20 nt sgRNA sequence |
Required Information | Cas System Information | • Specify the Cas protein type used (e.g., SpCas9, SaCas9) |
Required Information | Sample Information | • Clear sample identifiers |
Turnaround Time | Turnaround Time | 35-40 business days |
7. References
[1] Kosicki, M., et al. (2018). Repair of double-strand breaks induced by CRISPR–Cas9 leads to large deletions and complex rearrangements. Nature Biotechnology, 36(8), pp.765–771.
[2] Cullot, G., et al. (2019). CRISPR-Cas9-induced DSB repair analysis reveals that a single nucleotide insertion is a hallmark of NHEJ. Cell Reports, 29(1), pp.80–94.