CRISPRme Software Prediction
1. Background
CRISPR gene editing technology has demonstrated tremendous potential in disease treatment and basic research. However, off-target effects remain a core safety challenge limiting its clinical translation. Accurate prediction of off-target risks has become a decisive factor in advancing gene editing technology from basic research to clinical application.
In its latest guidance on human gene therapy products, the FDA emphasizes that evaluating the safety of gene editing products requires multiple complementary validation methods, and it highlights computational prediction ("in silico" methods) as the first step in off-target risk assessment. Traditional off-target prediction tools (such as Cas-OFFinder and CRISPOR) search only the reference genome and ignore the impact of genetic variation on off-target risk, which may lead to substantial underestimation of risk in genetically diverse populations.
CRISPRme, as a next-generation off-target prediction tool, overcomes the limitations of traditional methods by integrating large-scale genetic variation databases such as the 1000 Genomes Project and HGDP. It can identify individual-specific potential off-target sites, providing a more comprehensive and accurate solution for gene editing safety assessment.
2. CRISPRme Technical Principles
CRISPRme employs an innovative haplotype-aware algorithm that considers not only the reference genome but also integrates genetic variation information (SNVs/Indels). Through the following core technologies, it achieves comprehensive off-target risk assessment:
(1) Efficient Genome Indexing:
Constructs whole-genome indices containing both reference and variant genomes, enabling rapid retrieval of potential off-target sites.
(2) Haplotype-Aware Target Nomination:
Systematically identifies off-target sites created or eliminated by genetic variants, assessing individual-specific risks.
(3) Multi-Dimensional Scoring System:
Integrates CFD score, CRISTA score, and sequence homology score to provide comprehensive off-target risk quantification.
(4) Multi-PAM Type Analysis:
Covers NGG (canonical PAM), NAG, and NGA (non-canonical PAMs), comprehensively assessing off-target risks under different PAM sequence conditions.

Figure 1. CRISPRme Software Prediction Principle Diagram
3. CRISPRme Technical Innovation and Advantages
3.1 Core Technical Innovation
3.1.1 Individualized Risk Assessment
Integrates large-scale genetic variation data from the 1000 Genomes Project (2,504 individuals, 26 populations) and HGDP (1,043 individuals, 54 ethnic groups), enabling:
1. Identification of new off-target sites created by SNVs/Indels (increased risk);
2. Identification of off-target sites eliminated by variants (decreased risk);
3. Population- and individual-specific risk assessment for different ethnic groups.
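The effect of a single variant on a candidate off-target site can be illustrated with a minimal Python sketch. It is not CRISPRme's implementation: the function names and the sequences are hypothetical, and only mismatches (no bulges) are considered. It shows how an SNV in the genomic site can lower the mismatch count against the spacer and so increase the predicted off-target risk.

```python
def count_mismatches(spacer: str, site: str) -> int:
    """Count positions where the spacer and a same-length genomic site differ."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return sum(a != b for a, b in zip(spacer.upper(), site.upper()))


def apply_snv(site: str, offset: int, alt: str) -> str:
    """Return the site with a single-nucleotide variant applied at a 0-based offset."""
    return site[:offset] + alt + site[offset + 1:]


# Hypothetical sequences for illustration only (not real genomic data).
spacer = "GACGTTACCGGATCAAGCTT"
ref_site = "GACGTTACCGGATCTAGCTA"        # reference allele: 2 mismatches
alt_site = apply_snv(ref_site, 14, "A")  # alternative allele removes one mismatch

ref_mm = count_mismatches(spacer, ref_site)
alt_mm = count_mismatches(spacer, alt_site)
print(f"reference: {ref_mm} mismatches, variant: {alt_mm} mismatches")
```

A variant acting in the opposite direction (adding a mismatch or disrupting the PAM) would correspond to the decreased-risk case in item 2.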
3.1.2 Comprehensive PAM Coverage Strategy
SpCas9 can recognize multiple PAM sequences, which differ significantly in binding affinity and cleavage efficiency. CRISPRme analyzes all three PAM types in parallel to ensure comprehensive off-target risk assessment:
1. NGG (canonical PAM): Cleavage efficiency ~100%, highest off-target risk;
2. NAG (non-canonical PAM): Relative cleavage efficiency 20-50%;
3. NGA (non-canonical PAM): Relative cleavage efficiency 5-20%.
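As an illustration of how a 3-nt sequence can be assigned to one of the three PAM types listed above, the following Python sketch expands the IUPAC code N and tests each pattern in turn. The helper names are hypothetical and this is not how CRISPRme itself performs the search; it only demonstrates the pattern-matching idea.

```python
from typing import Optional

# IUPAC codes needed for the three PAM patterns discussed above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}
PAM_PATTERNS = ("NGG", "NAG", "NGA")


def pam_matches(pattern: str, pam: str) -> bool:
    """Return True if a concrete PAM sequence is compatible with an IUPAC pattern."""
    if len(pattern) != len(pam):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern.upper(), pam.upper()))


def classify_pam(pam: str) -> Optional[str]:
    """Return the first matching PAM pattern, or None if no pattern matches."""
    for pattern in PAM_PATTERNS:
        if pam_matches(pattern, pam):
            return pattern
    return None


for candidate in ("TGG", "CAG", "AGA", "TTT"):
    print(candidate, "->", classify_pam(candidate))
```

Because the three patterns differ in their second or third position, a concrete 3-nt sequence can match at most one of them.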
3.1.3 Multi-Dimensional Bioinformatics Analysis
Integrates professional bioinformatics analysis workflows to achieve:
1. Precise identification of chromosomal coordinates and functional characteristics of off-target sites;
2. Mismatch analysis in the seed region (PAM-proximal 10 bp);
3. Quantitative assessment of off-target risk changes caused by variants.
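The seed-region analysis in item 2 can be sketched as follows. This is a simplified illustration, assuming a spacer written 5' to 3' with the PAM at its 3' end, so that the seed region is the last 10 positions; bulges are ignored, and the function name and sequences are hypothetical.

```python
from typing import Tuple

SEED_LENGTH = 10  # PAM-proximal positions treated as the seed region


def split_mismatches(spacer: str, site: str, seed_length: int = SEED_LENGTH) -> Tuple[int, int]:
    """Return (seed_mismatches, non_seed_mismatches) for an ungapped alignment."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    seed_start = len(spacer) - seed_length  # first 0-based index inside the seed
    seed = non_seed = 0
    for i, (a, b) in enumerate(zip(spacer.upper(), site.upper())):
        if a != b:
            if i >= seed_start:
                seed += 1
            else:
                non_seed += 1
    return seed, non_seed


# Hypothetical sequences: one mismatch at index 1 (non-seed), one at index 14 (seed).
spacer = "GACGTTACCGGATCAAGCTT"
site = "GTCGTTACCGGATCTAGCTT"
seed_mm, non_seed_mm = split_mismatches(spacer, site)
print(f"seed mismatches: {seed_mm}, non-seed mismatches: {non_seed_mm}")
```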
4. Application Scenarios and Service Advantages
4.1 Application Scenarios
CRISPRme has extensive applications in the development and regulation of CRISPR gene editing products:
(1) sgRNA Design Optimization: Comprehensive off-target prediction during the sgRNA design phase to screen candidate sgRNAs with low off-target risk.
(2) Gene Therapy Product Safety Evaluation: Accurate assessment of off-target risks for CRISPR gene editing products, identifying high-risk off-target sites.
(3) IND Application Support: Provision of computational prediction analysis reports that comply with FDA and NMPA requirements to support clinical trial applications.
(4) Personalized Precision Medicine: Assessment of off-target risks based on individual patient genetic backgrounds to support personalized gene therapy strategies.
(5) Preclinical Research: Assistance in designing off-target validation experiments and guiding wet-lab validation strategies.
4.2 Service Advantages
(1) Advanced Technology: Utilizes the latest CRISPRme v2.1.3 (released December 2024), integrating the most recent genetic variation databases.
(2) Comprehensive Analysis: Covers three PAM types (NGG, NAG, NGA), providing dual analysis of reference and variant genomes.
(3) Standardized Reports: Provides standardized analysis reports that comply with FDA and NMPA guidelines, fully supporting regulatory submissions.
(4) Personalized Service: Customized analysis and technical support available based on client needs.
5. CRISPRme Off-Target Prediction Sample Report
We provide comprehensive CRISPRme off-target prediction analysis reports that meet regulatory requirements, including detailed information on the analysis platform, sgRNA sequence information, genetic variation database descriptions, and analysis workflow descriptions. In addition, the report includes the following core content:
(1) Off-Target Site Count Statistics:
For each sgRNA and each PAM type, provides statistics on the number of off-target sites under different mismatch and bulge conditions, including reference genome, variant genome, and merged total site counts. The system displays bulge type (DNA/RNA/none), mismatch count, bulge count, and corresponding off-target site numbers, providing quantitative evidence for off-target risk assessment.

Figure 2. Example of Off-Target Site Count Statistics
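The kind of tally described in (1) can be sketched in Python with a counter keyed by bulge type, mismatch count, bulge count, and genome source. The record layout and field names below are assumptions made for illustration and do not reflect CRISPRme's actual output format.

```python
from collections import Counter

# Hypothetical site records; field names are illustrative only.
sites = [
    {"bulge_type": "none", "mismatches": 3, "bulges": 0, "source": "reference"},
    {"bulge_type": "none", "mismatches": 3, "bulges": 0, "source": "variant"},
    {"bulge_type": "none", "mismatches": 3, "bulges": 0, "source": "reference"},
    {"bulge_type": "DNA", "mismatches": 2, "bulges": 1, "source": "variant"},
]


def tally_sites(records):
    """Count sites per (bulge_type, mismatches, bulges, source), plus a merged total."""
    counts = Counter()
    for rec in records:
        key = (rec["bulge_type"], rec["mismatches"], rec["bulges"])
        counts[key + (rec["source"],)] += 1
        counts[key + ("total",)] += 1
    return counts


counts = tally_sites(sites)
for key in sorted(counts):
    print(key, counts[key])
```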
(2) Detailed Information on Top 50 High-Risk Sites:
All off-target sites are sorted in descending order based on CFD scores, displaying detailed information on the top 50 sites with the highest off-target risk, including: chromosomal location, start coordinate, DNA strand orientation, PAM sequence, mismatch count, bulge count, reference genome CFD score, variant genome CFD score, and highest CFD score. Seed region (PAM-proximal 10 bp) alignment information is also provided for these high-risk sites, showing the distribution of mismatches and bulges in seed and non-seed regions.

Figure 3. Example of Detailed Information for the Top 50 Sites with the Highest Off-Target Risks
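The ranking described in (2) can be sketched as a sort on the higher of the two CFD scores per site. The field names (`cfd_ref`, `cfd_alt`) and the scores are hypothetical, and the sketch assumes both scores are available for every site.

```python
def highest_cfd(site: dict) -> float:
    """Highest CFD score of a site across reference and variant genomes."""
    return max(site["cfd_ref"], site["cfd_alt"])


def top_sites(sites: list, n: int = 50) -> list:
    """Return the n sites with the highest CFD score, in descending order."""
    return sorted(sites, key=highest_cfd, reverse=True)[:n]


# Hypothetical scores for illustration only.
sites = [
    {"id": "a", "cfd_ref": 0.10, "cfd_alt": 0.80},
    {"id": "b", "cfd_ref": 0.55, "cfd_alt": 0.55},
    {"id": "c", "cfd_ref": 0.30, "cfd_alt": 0.05},
]

ranked = top_sites(sites, n=2)
print([site["id"] for site in ranked])
```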
(3) Analysis of Off-Target Risk Changes Caused by Variants:
By calculating the difference in CFD scores between the variant and reference alleles at each site, the system identifies and displays:
① Sites with Increased Off-Target Risk (Top 50): Off-target sites created or enhanced by genetic variants, providing variant information (position, type, allele frequency, rsID), PAM creation status, and risk score changes.

Figure 4. Example of Top 50 Off-Target Sites with Increased Risk
② Sites with Decreased Off-Target Risk (Top 50): Off-target sites eliminated or reduced by genetic variants, helping to identify protective variants in specific populations.

Figure 5. Example of Top 50 Off-Target Sites with Decreased Risk
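The score-difference analysis in (3) can be sketched as follows, taking the change at each site to be the variant CFD score minus the reference CFD score. Field names and values are hypothetical; this illustrates the idea and is not the reporting pipeline itself.

```python
def split_by_risk_change(sites: list, n: int = 50):
    """Split sites into the top-n increased-risk and top-n decreased-risk lists."""
    scored = [(site, site["cfd_alt"] - site["cfd_ref"]) for site in sites]
    increased = sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)
    decreased = sorted((p for p in scored if p[1] < 0), key=lambda p: p[1])
    return increased[:n], decreased[:n]


# Hypothetical scores for illustration only.
sites = [
    {"id": "a", "cfd_ref": 0.10, "cfd_alt": 0.80},  # risk increases
    {"id": "b", "cfd_ref": 0.55, "cfd_alt": 0.55},  # unchanged
    {"id": "c", "cfd_ref": 0.30, "cfd_alt": 0.05},  # risk decreases
    {"id": "d", "cfd_ref": 0.20, "cfd_alt": 0.50},  # risk increases
]

increased, decreased = split_by_risk_change(sites)
print("increased:", [site["id"] for site, _ in increased])
print("decreased:", [site["id"] for site, _ in decreased])
```

Sites whose score is unchanged by the variant appear in neither list.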
(4) Multi-PAM Deduplication and Standard Chromosome Filtering:
Prediction results for the three PAM types (NGG, NAG, NGA) are deduplicated, and only off-target sites on standard chromosomes (chr1-22, X, Y) are retained. Site counts before and after deduplication and filtering are reported to ensure the accuracy and comparability of the analysis results.

Figure 6. Example of Multi-PAM Deduplication and Standard Chromosome Filtering
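A minimal sketch of the deduplication and chromosome filtering step is shown below. It assumes that two records describe the same site when they share chromosome, start coordinate, and strand; that key, like the field names, is an assumption made for illustration rather than a documented rule.

```python
STANDARD_CHROMOSOMES = {f"chr{i}" for i in range(1, 23)} | {"chrX", "chrY"}


def dedup_and_filter(sites: list) -> list:
    """Keep the first record per (chrom, start, strand) on standard chromosomes."""
    seen = set()
    kept = []
    for site in sites:
        key = (site["chrom"], site["start"], site["strand"])
        if site["chrom"] not in STANDARD_CHROMOSOMES or key in seen:
            continue
        seen.add(key)
        kept.append(site)
    return kept


# Hypothetical records for illustration only.
sites = [
    {"chrom": "chr1", "start": 100, "strand": "+", "pam": "NGG"},
    {"chrom": "chr1", "start": 100, "strand": "+", "pam": "NGA"},  # duplicate site
    {"chrom": "chrUn_KI270742v1", "start": 5, "strand": "+", "pam": "NGG"},  # non-standard contig
    {"chrom": "chrX", "start": 200, "strand": "-", "pam": "NAG"},
]

kept = dedup_and_filter(sites)
print(f"before: {len(sites)} sites, after: {len(kept)} sites")
```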
6. CRISPRme Off-Target Prediction Service Content
| Service Process | Service Content |
| Project Consultation and Assessment | Develop personalized analysis plans and provide project quotation |
| Information Collection | Collect sgRNA sequence information (sequence, PAM type, genome localization, etc.) |
| CRISPRme Analysis | Execute standardized analysis workflow: Reference genome hg38, variant database integration (1000G + HGDP), multi-PAM type analysis (NGG/NAG/NGA), CFD score calculation, seed region analysis, variant risk assessment |
| Professional Report Delivery | Provide standardized analysis reports, including technical interpretation and consulting services |
| IND Application Support | Technical documentation compliant with FDA and NMPA requirements can be provided based on client needs |
Turnaround Time: 5-10 business days for the standard process
7. Information Requirements
| Category | Specific Requirements |
| Basic Service Options | CRISPRme off-target prediction and analysis services available (client provides sgRNA sequence information) |
| sgRNA Sequence Information | sgRNA sequence (20 bp spacer sequence); PAM sequence (e.g., NGG, NAG); Genome localization information (chromosome, start position, end position, strand orientation); Target gene name (optional). |
| Analysis Parameters | Reference genome: hg38 (standard); PAM types: NGG/NAG/NGA (standard coverage of all); Mismatch tolerance: ≤5 mismatches, ≤1 bulge (standard); Variant database: 1000 Genomes + HGDP (standard). |
Note: ① All analysis parameters can be adjusted and customized according to client needs; ② For special analysis needs, please communicate with the technical team in advance (Tel: 400-6309596; Product ordering/Technical support: service@generulor.com).
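Before submitting sgRNA information, the requirements above can be checked with a short script. The following Python sketch is one possible pre-submission check; the `SgRnaRequest` class, its field names, and the example coordinates are hypothetical and are not part of any submission format defined here.

```python
import re
from dataclasses import dataclass
from typing import List

SUPPORTED_PAMS = {"NGG", "NAG", "NGA"}


@dataclass
class SgRnaRequest:
    spacer: str  # 20 bp spacer sequence
    pam: str     # PAM pattern, e.g. NGG
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def validate(req: SgRnaRequest) -> List[str]:
    """Return a list of problems found in the request (empty if none)."""
    problems = []
    if not re.fullmatch(r"[ACGT]{20}", req.spacer.upper()):
        problems.append("spacer must be exactly 20 bases of A/C/G/T")
    if req.pam.upper() not in SUPPORTED_PAMS:
        problems.append("PAM must be one of NGG, NAG, NGA")
    if req.strand not in {"+", "-"}:
        problems.append("strand must be '+' or '-'")
    if req.start >= req.end:
        problems.append("start position must be smaller than end position")
    return problems


# Hypothetical requests for illustration only.
good = SgRnaRequest("GACGTTACCGGATCAAGCTT", "NGG", "chr1", 1000, 1020, "+")
bad = SgRnaRequest("GACGTTACC", "NTT", "chr1", 50, 40, "?")
print(validate(good))
print(validate(bad))
```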
References
[1] Cancellieri S, Zeng J, Lin LY, et al. Human genetic diversity alters off-target outcomes of therapeutic gene editing. Nat Genet. 2023;55(1):34-43. doi:10.1038/s41588-022-01257-y
[2] U.S. Food and Drug Administration. (2024). Human Gene Therapy Products Incorporating Human Genome Editing - Guidance for Industry.
[3] Center for Drug Evaluation, National Medical Products Administration. (2022). Technical Guidelines for Pharmaceutical Research and Evaluation of In Vivo Gene Therapy Products (Trial).