Mouse Cd44 Knockout Project (CRISPR/Cas9) · 2020. 12. 2. · Gene ID: 12505, updated on...


Page 1: Mouse Cd44 Knockout Project (CRISPR/Cas9) · 2020. 12. 2. · Gene ID: 12505, updated on 10-Oct-2019 Gene summary Genomic context Exon count: 20 Annotation release Status Assembly

Mouse Cd44 Knockout Project (CRISPR/Cas9)

Objective: To create a Cd44 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cd44 gene (NCBI Reference Sequence: NM_009851; Ensembl: ENSMUSG00000005087) is located on mouse chromosome 2. 19 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 19 (Transcript: ENSMUST00000005218). Exon 4~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for targeted null mutations exhibit impaired T lymphocyte trafficking resulting in muted inflammatory responses, altered myeloid progenitor distribution, reduced growth of tumors, and impaired uterine involution and maintenance of lactation.

Exon 4 starts from about 16.37% of the coding region. Exon 4~7 covers 22.82% of the coding region.

The size of the effective KO region: ~9565 bp.

The KO region does not contain any other known gene.
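
As a rough cross-check of the percentages above, the following sketch converts them into approximate nucleotide counts. It rests on an assumption the report does not state: that the coding region is three nucleotides per residue of the 780 aa protein listed for transcript ENSMUST00000005218 (Cd44-201), with the stop codon excluded. The function name percent_to_nt is hypothetical.

```python
# Sketch: convert the reported coding-region percentages into approximate
# nucleotide counts. Assumption (not stated in the report): the coding region
# is 3 nt per residue of the 780 aa Cd44-201 protein, stop codon excluded.

PROTEIN_LENGTH_AA = 780                 # Cd44-201 protein length (transcript table)
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3   # assumed coding length without the stop codon


def percent_to_nt(percent: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Return the nucleotide count closest to the given percentage of the CDS."""
    return round(percent / 100 * cds_length)


# Coding sequence that lies before exon 4 (exon 4 starts at about 16.37%)
coding_nt_before_exon4 = percent_to_nt(16.37)
# Coding sequence contained in the targeted exons 4~7 (22.82%)
coding_nt_in_exons_4_to_7 = percent_to_nt(22.82)

print(coding_nt_before_exon4, coding_nt_in_exons_4_to_7)
```

Under that assumption, exon 4 begins after roughly 383 coding nucleotides and exons 4~7 contain roughly 534 coding nucleotides. These are back-calculated estimates, not figures from the report.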

http://www.alphaknockout.com/

Page 1 of 9


Overview of the Targeting Strategy

Figure: wild-type allele drawn 5' to 3', with exons 1, 4, 5, 6, 7, 8 and 19 labeled and two gRNA regions marked. Legend: exon of mouse Cd44; knockout region.


Overview of the Dot Plot (up)

Dot plot: Sequence 1 vs. Sequence 2 (Forward and Reverse Complement matches). Window size: 15 bp

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Dot plot: Sequence 1 vs. Sequence 2 (Forward and Reverse Complement matches). Window size: 15 bp

Note: The 2000 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
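
The self-alignment described in the notes above can be illustrated with a small sketch. The report does not say which dot plot tool was used, so the function below (self_dot_plot, a hypothetical name) is only one simple way to do it: it records every pair of positions whose windows match exactly, either directly (Forward) or as reverse complements (Reverse Complement). Real dot plot tools may also tolerate mismatches, which this sketch does not.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact window matches.

    Returns (forward, reverse):
      forward: pairs (i, j) with i < j whose windows are identical
               (the trivial main diagonal i == j is left out);
      reverse: pairs (i, j) where the window at i equals the reverse
               complement of the window at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)  # window sequence -> list of start offsets
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        # Identical windows at different offsets give off-diagonal forward dots.
        forward.extend((i, j) for i in starts for j in starts if i < j)
        # .get() avoids adding keys to the dict while iterating over it.
        for j in positions.get(reverse_complement(kmer), []):
            reverse.extend((i, j) for i in starts)
    return sorted(forward), sorted(reverse)


# Toy example with a short window: the 7 bp unit GATTACA occurs twice.
fwd, rev = self_dot_plot("GATTACACCGATTACA", window=7)
print(fwd, rev)
```

For a 2000 bp flank one would call self_dot_plot(flank, window=15). Long runs of consecutive off-diagonal forward matches would point to direct or tandem repeats; few or no such runs is the situation the notes describe.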


Overview of the GC Content Distribution (up)

GC content plot: Sequence 1 and Sequence 2. Window size: 300 bp

Summary: Full Length (2000 bp) | A (30.45%, 609) | C (18.9%, 378) | G (19.6%, 392) | T (31.05%, 621)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

GC content plot: Sequence 1 and Sequence 2. Window size: 300 bp

Summary: Full Length (2000 bp) | A (32.4%, 648) | C (18.15%, 363) | G (19.9%, 398) | T (29.55%, 591)

Note: The 2000 bp section downstream of Exon 7 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
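
The GC analysis above can be sketched in a few lines. The report does not name the tool it used, so the helper names below (gc_percent, sliding_gc, gc_from_counts) and the step size are assumptions for illustration. The base counts are copied from the two Summary lines.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC%) for every full window of the given size."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_percent(seq[start:start + window])


def gc_from_counts(counts: dict) -> float:
    """Overall GC% from a dict of base counts with keys A, C, G and T."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


# Base counts copied from the Summary lines of the two 2000 bp flanks.
upstream_counts = {"A": 609, "C": 378, "G": 392, "T": 621}
downstream_counts = {"A": 648, "C": 363, "G": 398, "T": 591}

print(gc_from_counts(upstream_counts))
print(gc_from_counts(downstream_counts))
```

From the reported counts, the overall GC content is about 38.5% for the upstream flank and about 38.05% for the downstream flank. With the actual flank sequence, list(sliding_gc(flank, window=300)) would give the per-window profile that the plots summarize.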


BLAT Search Results (up)

QUERY    SCORE START   END QSIZE IDENTITY  CHROM STRAND     START       END  SPAN
---------------------------------------------------------------------------------
YourSeq   2000     1  2000  2000   100.0%  chr2    -    102855968 102857967  2000
YourSeq     40   442   826  2000    64.3%  chr7    -     24613618  24613905   288
YourSeq     33   800   832  2000   100.0%  chr6    -     72937683  72937715    33
YourSeq     31  1953  1988  2000    87.9%  chr3    -     68782027  68782060    34
YourSeq     30  1948  1977  2000   100.0%  chr4    +     59078247  59078276    30
YourSeq     29   799   831  2000    94.0%  chr5    -     43916484  43916516    33
YourSeq     29   800   832  2000    94.0%  chr7    +      4512919   4512951    33
YourSeq     29   800   830  2000    96.8%  chr12   +    107203007 107203037    31
YourSeq     28   818   857  2000    96.8%  chr5    -    146877180 146877219    40
YourSeq     28   802   833  2000    93.8%  chr12   +     58540621  58540652    32
YourSeq     28  1950  1977  2000   100.0%  chr11   +    106086990 106087017    28
YourSeq     27   803   833  2000    86.7%  chr6    -     98430498  98430527    30
YourSeq     26   803   832  2000    93.4%  chr1    -    119075438 119075467    30
YourSeq     25  1944  1968  2000   100.0%  chr8    -     25486432  25486456    25
YourSeq     25   800   832  2000    87.9%  chr5    +      9324636   9324668    33
YourSeq     25   800   832  2000    87.9%  chr13   +     99247540  99247572    33
YourSeq     24   802   827  2000    96.2%  chr11   -    113681741 113681766    26
YourSeq     23   818   856  2000    79.5%  chr9    -     42572686  42572724    39
YourSeq     23   800   832  2000    84.9%  chr7    -     35464344  35464376    33
YourSeq     23  1526  1551  2000    96.0%  chr16   -     58179945  58179971    27

Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE START   END QSIZE IDENTITY  CHROM STRAND     START       END  SPAN
---------------------------------------------------------------------------------
YourSeq   2000     1  2000  2000   100.0%  chr2    -    102844403 102846402  2000
YourSeq    228   305   579  2000    91.0%  chr3    -    142254509 142254776   268
YourSeq     29  1671  1707  2000    75.8%  chr1    +     37560141  37560173    33
YourSeq     26    29    59  2000    77.8%  chr17   +     63716431  63716457    27
YourSeq     25  1460  1488  2000    93.2%  chr7    +    103121389 103121417    29
YourSeq     22   260   281  2000   100.0%  chrX    +     47202950  47202971    22
YourSeq     20   628   673  2000    71.8%  chr3    -     53298161  53298206    46

Note: The 2000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
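
The chr2 coordinates of the two full-length (100.0% identity) BLAT hits can be used to cross-check the stated size of the effective KO region. The sketch below assumes that the two 2000 bp flanks directly abut the KO region and that the coordinates are 1-based and inclusive (which is consistent with each flank's SPAN of 2000). The function name region_between is hypothetical.

```python
# Genomic coordinates (chr2, minus strand) of the two 2000 bp flanks,
# copied from the 100.0% identity rows of the BLAT results.
UPSTREAM_FLANK = (102855968, 102857967)    # 2000 bp upstream of exon 4
DOWNSTREAM_FLANK = (102844403, 102846402)  # 2000 bp downstream of exon 7


def region_between(flank_low, flank_high):
    """Return (start, end, size) of the interval strictly between two flanks.

    Coordinates are 1-based and inclusive. flank_low must lie at lower
    genomic coordinates than flank_high.
    """
    start = flank_low[1] + 1
    end = flank_high[0] - 1
    if end < start:
        raise ValueError("flanks overlap or are given in the wrong order")
    return start, end, end - start + 1


# Cd44 is on the reverse strand, so the flank downstream of exon 7 has the
# lower genomic coordinates and the flank upstream of exon 4 the higher ones.
ko_start, ko_end, ko_size = region_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)
print(ko_start, ko_end, ko_size)
```

Under these assumptions the interval between the flanks is chr2:102846403-102855967, which is 9565 bp and agrees with the reported size of the effective KO region (~9565 bp).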


Gene and protein information:

Cd44 CD44 antigen [ Mus musculus (house mouse) ]

Gene ID: 12505, updated on 10-Oct-2019

Gene summary

Official Symbol: Cd44 (provided by MGI)

Official Full Name: CD44 antigen (provided by MGI)

Primary source: MGI:MGI:88338

See related: Ensembl:ENSMUSG00000005087

Gene type: protein coding

RefSeq status: VALIDATED

Organism: Mus musculus

Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus

Also known as: Ly-24; Pgp-1; HERMES; AU023126; AW121933; AW146109

Expression: Broad expression in lung adult (RPKM 13.1), spleen adult (RPKM 8.5) and 23 other tissues

Orthologs: human; all

Genomic context

Location: 2 E2; 2 54.13 cM

Exon count: 20

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 2 | NC_000068.7 (102811141..102901669, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 2 | NC_000068.6 (102651298..102741822, complement)

Chromosome 2 - NC_000068.7
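
The Location strings in the genomic context table follow a simple "accession (start..end, complement)" pattern. As a minimal sketch, they could be parsed as follows to recover the strand and the span of the gene. The regular expression and the function name parse_location are assumptions for illustration, not part of any NCBI tool.

```python
import re

# Pattern for strings such as "NC_000068.7 (102811141..102901669, complement)".
LOCATION_RE = re.compile(
    r"(?P<accession>\S+)\s*"
    r"\((?P<start>\d+)\.\.(?P<end>\d+)(?P<complement>,\s*complement)?\)"
)


def parse_location(text: str) -> dict:
    """Parse a location string into accession, start, end, strand and length."""
    match = LOCATION_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    return {
        "accession": match["accession"],
        "start": start,
        "end": end,
        # "complement" marks a gene on the reverse strand
        "strand": "-" if match["complement"] else "+",
        "length": end - start + 1,  # 1-based, inclusive coordinates
    }


current = parse_location("NC_000068.7 (102811141..102901669, complement)")
previous = parse_location("NC_000068.6 (102651298..102741822, complement)")
print(current["length"], previous["length"])
```

With the coordinates from the table, the gene spans 90,529 bp on the current assembly and 90,525 bp on the previous one, in both cases on the reverse strand.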


Description: CD44 antigen [Source:MGI Symbol;Acc:MGI:88338]

Gene Synonyms: HERMES, Ly-24, Pgp-1

Location: Chromosome 2: 102,811,141-102,901,665 reverse strand. GRCm38:CM000995.2

About this gene: This gene has 9 transcripts (splice variants), 184 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 84 phenotypes.

Transcripts

Transcript information: This gene has 9 transcripts

Gene: Cd44 ENSMUSG00000005087

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Cd44-201 | ENSMUST00000005218.14 | 5614 | 780aa | ENSMUSP00000005218.8 | Protein coding | CCDS16472 | A2APM2 | TSL:1, GENCODE basic, APPRIS P4
Cd44-202 | ENSMUST00000060516.13 | 3495 | 580aa | ENSMUSP00000062330.7 | Protein coding | CCDS16470 | Q80X37 | TSL:1, GENCODE basic, APPRIS ALT2
Cd44-203 | ENSMUST00000099673.8 | 2848 | 365aa | ENSMUSP00000097265.2 | Protein coding | CCDS16471 | Q3U8S1 | TSL:1, GENCODE basic, APPRIS ALT2
Cd44-208 | ENSMUST00000111198.8 | 1981 | 657aa | ENSMUSP00000106829.2 | Protein coding | CCDS50650 | A2APM1 | TSL:1, GENCODE basic, APPRIS ALT2
Cd44-205 | ENSMUST00000111191.8 | 1533 | 498aa | ENSMUSP00000106822.2 | Protein coding | CCDS50649 | A2APM4 | TSL:1, GENCODE basic, APPRIS ALT2
Cd44-204 | ENSMUST00000111190.8 | 1412 | 464aa | ENSMUSP00000106821.2 | Protein coding | CCDS50648 | A2APM3 | TSL:1, GENCODE basic, APPRIS ALT2
Cd44-207 | ENSMUST00000111194.7 | 5045 | 577aa | ENSMUSP00000106825.1 | Protein coding | - | E9QKM8 | TSL:5, GENCODE basic, APPRIS ALT2
Cd44-206 | ENSMUST00000111192.2 | 1305 | 434aa | ENSMUSP00000106823.2 | Protein coding | - | A2APM5 | TSL:5, GENCODE basic, APPRIS ALT2
Cd44-209 | ENSMUST00000124624.1 | 1907 | No protein | - | Retained intron | - | - | TSL:1


Figure: Ensembl region view of chromosome 2 (110.53 kb, axis ticks from 102.82 Mb to 102.90 Mb), forward and reverse strands. Tracks: Genes (Comprehensive set...) with Gm23731-201 (snRNA); Contigs (AL844155.6); Cd44 transcripts Cd44-207, Cd44-201, Cd44-202, Cd44-203, Cd44-205, Cd44-204, Cd44-208 and Cd44-206 (protein coding) and Cd44-209 (retained intron); Regulatory Build.

Gene Legend: Protein Coding (merged Ensembl/Havana; Ensembl protein coding); Non-Protein Coding (processed transcript; RNA gene).

Regulation Legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.


Transcript: ENSMUST00000005218

Figure: Cd44-201 (protein coding), reverse strand, 90.48 kb, with protein feature tracks for ENSMUSP00000005...: Transmembrane heli...; MobiDB lite; Low complexity (Seg); Cleavage site (Sign...; Superfamily (C-type lectin fold); SMART (Link domain); Prints (CD44 antigen; Link domain); Pfam (Link domain); PROSITE profiles (Link domain); PROSITE patterns (Link domain); PANTHER (PTHR10225; CD44 antigen); Gene3D (C-type lectin-like/link domain superfamily); CDD (cd03516); Sequence variants (dbSNP and all other sources).

Variant Legend: missense variant; splice region variant; synonymous variant.

Scale bar: 0 to 780.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
