sgRNA Design

1. Background

Since its inception in 2012, CRISPR/Cas9 gene editing technology has become one of the most revolutionary tools in life sciences research and gene therapy. This technology employs a single guide RNA (sgRNA) to direct the Cas9 nuclease to recognize and cleave a target DNA sequence, enabling site-specific genome modifications. However, the quality of sgRNA design directly determines the success or failure of gene editing: a poorly designed sgRNA may lead to off-target effects, low editing efficiency, and even unintended genomic variations, potentially causing serious safety risks in clinical applications.

CRISPOR[1] software systematically integrates multiple scoring models to simultaneously evaluate sgRNA cleavage efficiency (Doench '16/RuleSet3-Score), targeting specificity (MIT/CFD-Score), frameshift mutation probability (Out-of-Frame-Score), and genome-wide off-target risk. It provides researchers with a transparent, quantifiable multidimensional data framework, greatly improving CRISPR experimental success rates.

2. Technical Principles and Core Innovations

2.1 CRISPOR Design Principles

CRISPOR designs sgRNAs based on the molecular mechanisms of the CRISPR/Cas system. For the classical SpCas9 system, sgRNAs must meet the following basic requirements: (1) a target sequence length of 20 nucleotides; (2) a PAM sequence (NGG, the protospacer adjacent motif essential for Cas9 recognition) immediately 3' of the target; (3) GC content that is neither too high (>80%) nor too low (<20%); (4) no run of four consecutive T's (TTTT, which can terminate transcription from the U6 promoter). After the user inputs the target sequence, CRISPOR automatically scans all eligible potential target sites and ranks them using multiple algorithms.
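The four basic requirements above can be illustrated with a minimal Python sketch. It is not CRISPOR's implementation; the function names (`find_spcas9_sites`, `passes_basic_rules`) are hypothetical, and the scan assumes an unambiguous A/C/G/T input sequence.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 to 1.0)."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def find_spcas9_sites(seq: str):
    """Yield (strand, start, protospacer, pam) for every 20-nt site followed by NGG.

    For the "-" strand, `start` is an offset into the reverse complement,
    not into the input sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            yield strand, m.start(), m.group(1), m.group(2)

def passes_basic_rules(protospacer: str) -> bool:
    """Apply the GC (20%-80%) and TTTT rules listed in the text."""
    gc = gc_fraction(protospacer)
    return 0.20 <= gc <= 0.80 and "TTTT" not in protospacer
```

Calling `list(find_spcas9_sites(sequence))` returns every candidate on both strands; `passes_basic_rules` can then be applied to each protospacer before any scoring is done.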

The platform employs a hierarchical screening strategy: first, candidate sgRNAs are identified based on PAM sequences; then, genome-wide alignment technology detects off-target sites (allowing up to 4 mismatches); finally, machine learning algorithms predict editing efficiency. This systematic approach ensures that recommended sgRNAs possess both high specificity (minimizing off-target risk) and high activity (maximizing editing efficiency). CRISPOR integrates multidimensional evaluation metrics including MIT specificity score, CFD score, Doench efficiency score, and frameshift mutation score, providing a solid data foundation for subsequent intelligent screening.
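To make the "up to 4 mismatches" criterion concrete, the following toy sketch counts mismatches between a guide and every NGG-adjacent 20-mer in a short sequence. It is only an illustration under simplifying assumptions: it scans the forward strand only, uses a naive linear scan instead of the indexed genome alignment a real tool relies on, and the function names are hypothetical.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide, site))

def find_offtargets(guide: str, genome: str, max_mismatches: int = 4):
    """Return (position, site, mismatches) for NGG-adjacent sites within the limit.

    Toy version: forward strand only, O(len(genome) * len(guide)) scan.
    """
    genome = genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 3 + 1):
        site = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":          # require an NGG PAM after the site
            continue
        mm = count_mismatches(guide, site)
        if mm <= max_mismatches:
            hits.append((i, site, mm))
    return hits
```

A perfect match (0 mismatches) at the intended locus is expected in the output; every additional hit is a potential off-target site.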

2.2 Core Technical Innovation: Intelligent Screening System

Building upon the rich scoring data provided by CRISPOR, we have independently developed a professional intelligent screening system, which represents the core technical advantage of our service. Targeting the specific requirements of gene knockout experiments, we have established a scientifically rigorous two-stage screening workflow: hard filtering + four-level intelligent ranking, ensuring that the final delivered sgRNA sequences achieve optimal balance in functionality, specificity, efficiency, and safety.

2.2.1 Stage One: Hard Quality Control Filtering

Based on extensive literature validation, we have established three key quality thresholds to initially screen all candidate sgRNAs output by CRISPOR, eliminating sequences that do not meet basic quality standards:

(1) GC Content Control (40%-60%): GC content directly affects sgRNA stability and Cas9 binding efficiency. Low GC content (<40%) leads to unstable RNA secondary structure, reducing Cas9-sgRNA complex formation efficiency; high GC content (>60%) may cause non-specific binding and off-target effects. We strictly limit GC content to the optimal range of 40%-60%. This standard is based on large-scale screening data published by Doench et al. in Nature Biotechnology, where sgRNAs within this range demonstrated significantly higher editing success rates (approximately 30% median efficiency improvement).

(2) MIT Specificity Score (>50): The MIT score calculates off-target probability based on mismatch position and number, with a range of 0-100, where higher scores represent better specificity. We set 50 as the minimum threshold; sgRNAs below this score pose significant off-target risk. According to validation studies of the MIT scoring system, sgRNAs with scores <50 have multiple high-frequency cleavage off-target sites genome-wide (CFD score >0.2), potentially causing confusion in experimental results and unintended phenotypic changes.

(3) Graf et al. Status Annotation (GrafEtAlStatus = OK): CRISPOR integrates the sgRNA quality assessment system developed by Graf et al., which classifies and annotates sgRNAs based on experimental validation data. Only sequences annotated as 'OK' are considered suitable for gene knockout experiments. Sequences marked as 'AVOID' or 'WARNING' typically have the following issues: located in repetitive sequence regions, containing polynucleotide repeats (such as AAAA/TTTT/GGGG/CCCC), adjacent to known SNP hotspots, or possessing overly complex predicted secondary structures. This filtering criterion effectively eliminates approximately 15%-20% of potentially problematic sequences.
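The three thresholds above can be expressed as a single predicate. The sketch below is an illustration only: the `Guide` class and its field names are hypothetical, and the status string "OK" follows the description in this section, so the exact labels should be checked against the actual CRISPOR export before use.

```python
from dataclasses import dataclass

@dataclass
class Guide:
    seq: str          # 20-nt protospacer, without the PAM
    mit_score: float  # MIT specificity score, 0-100
    graf_status: str  # status label as described in the text, e.g. "OK"

def gc_percent(seq: str) -> float:
    """GC content of the sequence as a percentage."""
    seq = seq.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)

def passes_hard_filter(g: Guide, gc_min: float = 40.0, gc_max: float = 60.0,
                       mit_min: float = 50.0) -> bool:
    """Stage-one filter: GC within [gc_min, gc_max], MIT > mit_min, status OK."""
    return (gc_min <= gc_percent(g.seq) <= gc_max
            and g.mit_score > mit_min
            and g.graf_status == "OK")
```

Exposing the thresholds as parameters makes it straightforward to tighten them for a stricter project, for example `passes_hard_filter(g, mit_min=70)`.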

2.2.2 Stage Two: Four-Level Intelligent Priority Ranking

Candidate sequences that pass hard filtering enter a four-level priority ranking workflow. This workflow assigns different weights to different performance metrics based on the core objectives of gene knockout experiments, performing level-by-level ranking until the Top 4 optimal sequences are selected.

(1) Priority 1: Out-of-Frame Score (sorted descending, highest first). Core Objective: Functional Knockout.

This score predicts the probability of frameshift mutations generated by the NHEJ repair mechanism and is a direct indicator of achieving gene function loss. We assign it the highest priority to maximize the possibility of obtaining functional knockout alleles.

(2) Priority 2: CFD Specificity Score (cfdSpecScore; sorted descending, highest first). Core Objective: High Fidelity.

While ensuring high knockout potential, we employ the more precise CFD (Cutting Frequency Determination) algorithm to assess off-target risk, prioritizing sequences with the highest specificity to ensure experimental result purity.

(3) Priority 3: Doench RuleSet3 Efficiency Score (sorted descending, highest first). Core Objective: High Efficiency.

This score uses the updated algorithm developed by Doench's team[3], where higher scores represent higher cleavage activity. Under equal specificity conditions, high-efficiency sequences are prioritized to reduce experimental cytotoxicity and improve positive rates.

(4) Priority 4: Off-target Count (offtargetCount; sorted ascending, lowest first). Core Objective: Final Risk Control.

As the final ranking step, we directly compare the total number of off-target sites with 0-4 mismatches. When all upstream metrics are similar, we select the sequences with the fewest off-target sites as a final safety check.

Figure 1. sgRNA Design Workflow Diagram
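The four-level ranking described in Section 2.2.2 can be sketched as a lexicographic sort. This is a minimal illustration, not the production code: `cfdSpecScore` and `offtargetCount` are the identifiers named in the text, while `outOfFrame`, `doenchRs3`, and `rank_guides` are hypothetical names chosen for the example. A strict lexicographic sort only falls through to the next level on exact ties; treating "similar" scores as tied would require rounding or binning, which is left out here.

```python
def rank_guides(guides, top_n=4):
    """Rank candidate guides by four levels and return the best `top_n`.

    Levels, in order: out-of-frame score (high first), CFD specificity
    (high first), Doench efficiency (high first), off-target count (low first).
    """
    def sort_key(g):
        # Negate the "higher is better" scores so one ascending sort handles all levels.
        return (-g["outOfFrame"], -g["cfdSpecScore"], -g["doenchRs3"], g["offtargetCount"])
    return sorted(guides, key=sort_key)[:top_n]

candidates = [
    {"id": "a", "outOfFrame": 70, "cfdSpecScore": 90, "doenchRs3": 60, "offtargetCount": 10},
    {"id": "b", "outOfFrame": 70, "cfdSpecScore": 90, "doenchRs3": 60, "offtargetCount": 5},
    {"id": "c", "outOfFrame": 80, "cfdSpecScore": 50, "doenchRs3": 40, "offtargetCount": 30},
    {"id": "d", "outOfFrame": 70, "cfdSpecScore": 95, "doenchRs3": 10, "offtargetCount": 50},
    {"id": "e", "outOfFrame": 60, "cfdSpecScore": 99, "doenchRs3": 99, "offtargetCount": 1},
]
top4_ids = [g["id"] for g in rank_guides(candidates)]
```

In this made-up example, guide `c` ranks first on its out-of-frame score alone, and guides `a` and `b` are separated only at the fourth level, by off-target count.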

2.3 Batch Design of CRISPR Knockout Libraries

In addition to single-gene sgRNA design, we provide professional batch design services supporting the construction of genome-wide or targeted CRISPR knockout libraries. Based on the CRISPOR tool, we can batch process sgRNA design requirements for thousands of genes. Users need only provide a target gene list (supporting multiple formats such as gene symbols, Ensembl IDs, RefSeq IDs, etc.), and the system automatically completes the following tasks:

(1) Automatically identify all exons of each gene, prioritizing exons concentrated with functional domains or 5' terminal exons;

(2) Design 4-6 sgRNAs for each gene, ensuring backup options even if some sgRNAs fail;

(3) Apply our intelligent screening system to rank candidate sgRNAs for each gene, outputting the Top 4 sequences.
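The per-gene selection in the steps above can be sketched as a group-filter-rank loop. The function below is a hypothetical illustration: `select_per_gene`, the `keep` predicate, and the `sort_key` function are assumed names, with the filter and ranking logic supplied by the caller. It also reports genes that end up with fewer guides than requested, so they can be reviewed manually.

```python
from collections import defaultdict

def select_per_gene(candidates, keep, sort_key, top_n=4):
    """Group candidates by gene, filter, rank, and keep the best `top_n` per gene.

    Returns (selected, shortfall): a dict mapping gene -> ranked guides, and a
    list of genes with fewer than `top_n` guides left after filtering.
    """
    by_gene = defaultdict(list)
    for c in candidates:
        if keep(c):
            by_gene[c["gene"]].append(c)

    selected, shortfall = {}, []
    # dict.fromkeys preserves the order in which genes first appear in the input.
    for gene in dict.fromkeys(c["gene"] for c in candidates):
        ranked = sorted(by_gene[gene], key=sort_key)[:top_n]
        selected[gene] = ranked
        if len(ranked) < top_n:
            shortfall.append(gene)
    return selected, shortfall
```

Passing the filter and ranking as arguments keeps the batch loop independent of any particular thresholds, so the same loop can serve projects with different screening criteria.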

Our batch design service has successfully supported multiple high-throughput screening projects, including genome-wide CRISPR screens and targeted drug target screens. Compared to using public libraries (such as Brunello, GeCKO, etc.), custom-designed libraries offer higher species specificity, better knockout efficiency, and lower off-target risk, making them particularly suitable for non-human species or research projects requiring high precision.

3. Technical Advantages and Methodological Validation

3.1 Core Technical Advantage Comparison

Compared to directly using the CRISPOR online tool or other sgRNA design platforms, our service offers the following significant advantages:

| Comparison Dimension | CRISPOR Original Output | Our Intelligent Screening Service |
| --- | --- | --- |
| Candidate Number | Typically outputs 10-50 candidate sgRNAs, requiring user screening with no clear selection criteria | Two-stage screening precisely outputs Top 4 optimal sgRNAs, saving client screening time |
| Screening Strategy | Provides multiple scoring metrics but no clear priorities or weights; users must judge how to balance different indicators themselves | Experiment logic-based four-level ranking system with clear indicator weights (functionality > specificity > efficiency > off-target count), scientifically sound |
| Quality Control | Provides color markers (green/yellow/red) as rough quality classification, but still includes many marginal quality sequences | Hard filtering standards (GC 40-60%, MIT>50, GrafStatus=OK) eliminate all low-quality sequences, ensuring basic quality |
| Personalization Level | Standardized output, does not consider specific experimental scenarios and client special needs | Screening parameters can be adjusted based on client needs (e.g., clinical-grade applications can tighten specificity thresholds), providing personalized solutions |
| Batch Processing | CRISPOR Batch supports batch input, but output results still require manual individual screening, heavy workload | Automated batch screening can simultaneously process thousands of genes, automatically outputting Top 4 per gene, suitable for library construction |
| Technical Support | Only provides online tools and documentation, no professional technical support | Provides professional bioinformatics support and experimental consultation, including one-stop services such as follow-up vector construction and validation scheme design |

4. Application Scenarios and Service Advantages

4.1 Application Scenarios

(1) Gene Function Research: Design highly efficient sgRNAs for gene knockout/knock-in to study the biological functions of target genes. Our strategy of prioritizing high frameshift scores is particularly suitable for functional analysis of gene family members: by precisely targeting specific exons, it achieves functional knockout of a single member without affecting other paralogous genes.

(2) High-Throughput Functional Screening: Construct genome-wide or targeted CRISPR knockout libraries for drug target screening, synthetic lethal screening, etc. Our batch screening service can process thousands of genes, automatically outputting 4 quality-optimized sgRNAs per gene. Compared to public libraries, this offers approximately 15% higher average editing efficiency and lower off-target risk, improving screening signal-to-noise ratio and false positive control.

(3) Gene Therapy Product Development: Design ultra-high specificity sgRNAs for clinical applications such as CAR-T therapy (e.g., TRAC, B2M gene knockout) and in vivo gene editing. We can provide clinical-grade screening standards (CFD >0.9, off-target sites <3), and support subsequent genome-wide off-target NGS validation services, providing off-target safety assessment reports compliant with FDA/EMA/NMPA regulatory requirements to support IND/CTA submissions.

(4) Agricultural Biotechnology and Synthetic Biology: Support crop improvement (disease resistance gene editing, quality improvement, abiotic stress resistance, etc.) and industrial microbial modification. CRISPOR supports major crops such as rice, wheat, corn, tomatoes, and soybeans, as well as industrial strains like brewer's yeast and E. coli. Our batch design service can rapidly construct multi-gene metabolic pathway editing schemes.

4.2 Service Advantages

(1) Core Technical Advantage: Intelligent Screening Algorithm. Our independently developed two-stage screening system (hard filtering + four-level ranking) is the greatest highlight of our service. Based on a deep understanding of CRISPR experimental logic, this system integrates complex multidimensional scores into clear priority rankings, screening the Top 4 optimal sgRNAs from dozens of candidate sequences, which improves experimental success rates and saves clients' screening time.

(2) Batch Processing and Library Construction Capability: Supports various scale requirements from single-gene design to genome-wide library construction. Our automated workflow can complete batch design for 1000 genes within 2-3 business days, outputting oligonucleotide sequence files directly usable for chip synthesis, significantly shortening library construction cycles. Custom libraries offer better species compatibility and higher overall quality compared to commercial public libraries.

5. sgRNA Design Result Examples

In the initial stage, the CRISPOR platform may generate dozens of candidate sgRNAs for each target. Our two-stage screening system first applies strict hard quality filtering, then employs a four-level intelligent priority ranking algorithm to precisely select the Top 4 optimal sgRNA candidates for each target. This process not only greatly improves screening efficiency but, more importantly, ensures that final recommended sequences achieve optimal balance across multiple dimensions including functionality, specificity, and efficiency.

6. sgRNA Design Service Content

| Service Process | Service Content |
| --- | --- |
| Requirement Communication | Clarify research objectives (gene knockout/knock-in/base editing, etc.), species information, target gene list or target sequences, nuclease type (SpCas9/SaCas9/Cpf1, etc.), experimental scale (single gene/batch/genome-wide library), and special requirements (such as clinical-grade high specificity requirements, priority targeting of exons, etc.) |
| Initial Design | Perform initial sgRNA design based on the CRISPOR platform, obtaining all candidate sequences and their multidimensional scoring data (MIT/CFD specificity, Doench efficiency, Out-of-Frame score, off-target site statistics, etc.) |
| Intelligent Screening | Apply our two-stage screening system: Stage one performs hard quality filtering (GC 40-60%, MIT>50, GrafStatus=OK); Stage two executes four-level intelligent ranking (Out-of-Frame score, highest first → CFD specificity, highest first → Doench efficiency, highest first → off-target count, lowest first), screening the Top 4 optimal sgRNAs |
| Detailed Report Delivery | Provide standardized professional reports, including: Top 4 recommended sgRNA list (with complete sequences, all scoring metrics, and screening criteria explanations) |

*Service Cycle: Single gene design 2 business days; Batch design (<100 genes) 3-5 business days; Large-scale library design (100-1000 genes) 5-10 business days; Genome-wide library design (>1000 genes) 10-15 business days

7. Information Required from Clients

| Information Category | Specific Requirements |
| --- | --- |
| Required Information | Species information (Latin scientific name or common name, such as Homo sapiens, mouse, zebrafish, etc.); Target gene name or gene ID (such as human HGNC symbol, mouse MGI symbol, Ensembl ID, RefSeq ID, etc.), or directly provide target sequence (FASTA format, recommended length >200bp to ensure sufficient candidate sgRNAs); Nuclease type (SpCas9/SaCas9/Cpf1, etc.); Editing type (gene knockout/knock-in/base editing/prime editing, etc.) |
| Recommended Information | Genetic background information of cell lines or animal strains (for SNP screening and personalized design); Preferred target exon numbers or functional domains (such as priority targeting of N-terminal exons, catalytic domains, etc.) |
| Additional Information for Batch Design | Target gene list (Excel format, including gene symbols, gene IDs, etc.); Number of sgRNAs needed per gene (default 4, customizable 2-10); Library construction strategy; Whether control sgRNAs need to be added (negative controls, positive controls) |

*Note: For non-conventional model organisms or custom genomes, clients must provide reference genome sequence files (FASTA format) and gene annotation files (GFF/GTF format). For clinical-grade applications or genes with numerous high-risk off-target sites, NGS off-target validation services are strongly recommended to meet regulatory requirements and ensure experimental safety. We support flexible service customization and can adjust screening criteria and deliverables based on specific project needs.

8. References

[1] Concordet, J. P., & Haeussler, M. (2018). CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens. Nucleic Acids Research, 46(W1), W242-W245.

[2] Haeussler, M., et al. (2016). Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biology, 17, 148.

[3] Doench, J. G., et al. (2016). Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nature Biotechnology, 34, 184-191.

[4] Doench, J. G., et al. (2014). Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nature Biotechnology, 32, 1262-1267.

[5] Moreno-Mateos, M. A., et al. (2015). CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo. Nature Methods, 12, 982-988.

[6] Bae, S., Kweon, J., Kim, H. S., & Kim, J. S. (2014). Microhomology-based choice of Cas9 nuclease target sites. Nature Methods, 11, 705-706.

[7] Najm, F. J., et al. (2018). Orthologous CRISPR-Cas9 enzymes for combinatorial genetic screens. Nature Biotechnology, 36, 179-189.

[8] Canver, M. C., et al. (2018). Integrated design, execution, and analysis of arrayed and pooled CRISPR genome-editing experiments. Nature Protocols, 13, 946-986.

[9] Pinello, L., et al. (2016). Analyzing CRISPR genome-editing experiments with CRISPResso. Nature Biotechnology, 34, 695-697.