mouse cd44 knockout project (crispr/cas9) · 2020. 12. 2. · gene id: 12505, updated on...
Mouse Cd44 Knockout Project (CRISPR/Cas9)
Objective: To create a Cd44 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.
Strategy summary: The Cd44 gene (NCBI Reference Sequence: NM_009851; Ensembl: ENSMUSG00000005087) is located on mouse chromosome 2. Nineteen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 19 (Transcript: ENSMUST00000005218). Exons 4~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Homozygotes for targeted null mutations exhibit impaired T lymphocyte trafficking resulting in muted inflammatory responses, altered myeloid progenitor distribution, reduced growth of tumors, and impaired uterine involution and maintenance of lactation.
Exon 4 starts at about 16.37% of the coding region. Exons 4~7 cover 22.82% of the coding region.
The size of effective KO region: ~9565 bp.
The KO region does not have any other known gene.
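The percentages above can be turned into approximate base-pair figures. The sketch below is only illustrative: it assumes the coding region is 780 codons long (the 780 aa protein length listed for Cd44-201 in the transcript table) and excludes the stop codon. The report does not state which coding length its percentages are based on, so the helper name `pct_to_bp` and the constants are assumptions.

```python
# Assumption: coding length = 780 codons (Cd44-201 protein length from the
# transcript table); whether the report also counts the stop codon is unknown.
CODONS = 780
CDS_BP = CODONS * 3  # 2340 bp, stop codon excluded


def pct_to_bp(pct: float, cds_bp: int = CDS_BP) -> int:
    """Convert a percentage of the coding region to a rounded bp count."""
    return round(pct / 100 * cds_bp)


exon4_offset_bp = pct_to_bp(16.37)      # coding bp that precede exon 4
exons4_7_coding_bp = pct_to_bp(22.82)   # coding bp carried by exons 4~7

print(exon4_offset_bp, exons4_7_coding_bp)
```

Under these assumptions, exon 4 begins roughly 383 bp into the coding sequence and exons 4~7 carry roughly 534 coding bp. The ~9565 bp figure for the effective KO region is a genomic span that includes introns, so it is much larger than the coding portion.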
http://www.alphaknockout.com/
Page 1 of 9
Overview of the Targeting Strategy
[Figure: wild-type allele of mouse Cd44 drawn 5' to 3', with exons 1, 4, 5, 6, 7, 8 and 19 labeled and two gRNA regions marked. Legend: exon of mouse Cd44; knockout region.]
Overview of the Dot Plot (up)
[Figure: dot plot of Sequence 1 against Sequence 2, forward and reverse complement. Window size: 15 bp]
Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
Overview of the Dot Plot (down)
[Figure: dot plot of Sequence 1 against Sequence 2, forward and reverse complement. Window size: 15 bp]
Note: The 2000 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
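The self-alignment described in the notes can be sketched as a word-match dot plot. This is a minimal illustration, not the tool used for the report: the function name `self_dot_plot` is hypothetical, only exact matches on the forward strand are considered (the report's plots also show the reverse complement), and the 15 bp window is taken from the figure label.

```python
def self_dot_plot(seq: str, window: int = 15):
    """Return off-diagonal (i, j) pairs, i < j, where the window-length
    substrings starting at i and j are identical (forward strand only)."""
    seq = seq.upper()
    positions = {}
    # Index every window-length word by its start positions.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    # Any word that occurs more than once yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy input: a 16 bp unit repeated twice in tandem.
unit = "ACGTTGCAAGCTTAGC"
print(self_dot_plot(unit * 2))
```

A tandem repeat shows up as a run of dots parallel to the main diagonal, offset by the repeat length. An empty result on a real flank would correspond to the "no significant tandem repeat" conclusion for exact 15 bp matches.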
Overview of the GC Content Distribution (up)
[Figure: GC content distribution plot. Window size: 300 bp]
Summary: Full Length (2000 bp) | A (30.45%, 609) | C (18.9%, 378) | G (19.6%, 392) | T (31.05%, 621)
Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
Overview of the GC Content Distribution (down)
[Figure: GC content distribution plot. Window size: 300 bp]
Summary: Full Length (2000 bp) | A (32.4%, 648) | C (18.15%, 363) | G (19.9%, 398) | T (29.55%, 591)
Note: The 2000 bp section downstream of Exon 7 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
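The overall GC content of each flank follows directly from the base counts in the two summary lines. The sketch below recomputes it and shows one way a 300 bp sliding-window profile could be produced. The function names are hypothetical, and the report does not say what threshold it uses for a "significant high GC-content region", so none is assumed here.

```python
# Base counts copied from the two "Summary" lines (2000 bp each).
UP = {"A": 609, "C": 378, "G": 392, "T": 621}
DOWN = {"A": 648, "C": 363, "G": 398, "T": 591}


def gc_percent(counts: dict) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window of the given size along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


print(gc_percent(UP), gc_percent(DOWN))
```

From the reported counts, the upstream flank is 38.5% GC and the downstream flank is 38.05% GC.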
BLAT Search Results (up)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr2   -       102855968  102857967  2000
YourSeq     40    442   826   2000     64.3%  chr7   -        24613618   24613905   288
YourSeq     33    800   832   2000    100.0%  chr6   -        72937683   72937715    33
YourSeq     31   1953  1988   2000     87.9%  chr3   -        68782027   68782060    34
YourSeq     30   1948  1977   2000    100.0%  chr4   +        59078247   59078276    30
YourSeq     29    799   831   2000     94.0%  chr5   -        43916484   43916516    33
YourSeq     29    800   832   2000     94.0%  chr7   +         4512919    4512951    33
YourSeq     29    800   830   2000     96.8%  chr12  +       107203007  107203037    31
YourSeq     28    818   857   2000     96.8%  chr5   -       146877180  146877219    40
YourSeq     28    802   833   2000     93.8%  chr12  +        58540621   58540652    32
YourSeq     28   1950  1977   2000    100.0%  chr11  +       106086990  106087017    28
YourSeq     27    803   833   2000     86.7%  chr6   -        98430498   98430527    30
YourSeq     26    803   832   2000     93.4%  chr1   -       119075438  119075467    30
YourSeq     25   1944  1968   2000    100.0%  chr8   -        25486432   25486456    25
YourSeq     25    800   832   2000     87.9%  chr5   +         9324636    9324668    33
YourSeq     25    800   832   2000     87.9%  chr13  +        99247540   99247572    33
YourSeq     24    802   827   2000     96.2%  chr11  -       113681741  113681766    26
YourSeq     23    818   856   2000     79.5%  chr9   -        42572686   42572724    39
YourSeq     23    800   832   2000     84.9%  chr7   -        35464344   35464376    33
YourSeq     23   1526  1551   2000     96.0%  chr16  -        58179945   58179971    27
Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
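To make the "no significant similarity" reading concrete, the rows of a BLAT table can be parsed and the strongest hit other than the query's own locus picked out. The sketch below uses the first four upstream rows copied from the report. The `Hit` type, the helper names, and the convention that the self-hit is the row whose score equals the query size are assumptions made for illustration.

```python
from typing import NamedTuple

# First four rows of the upstream BLAT table, copied from the report
# (the QUERY column is omitted).
ROWS = """\
2000 1 2000 2000 100.0% chr2 - 102855968 102857967 2000
40 442 826 2000 64.3% chr7 - 24613618 24613905 288
33 800 832 2000 100.0% chr6 - 72937683 72937715 33
31 1953 1988 2000 87.9% chr3 - 68782027 68782060 34
"""


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> Hit:
    """Parse one whitespace-separated BLAT summary row."""
    f = line.split()
    return Hit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
               float(f[4].rstrip("%")), f[5], f[6],
               int(f[7]), int(f[8]), int(f[9]))


def span_ok(h: Hit) -> bool:
    """SPAN should equal the inclusive length of the genomic interval."""
    return h.t_end - h.t_start + 1 == h.span


hits = [parse_hit(line) for line in ROWS.splitlines()]
# Assumed convention: the self-hit is the row whose score equals the query size.
secondary = [h for h in hits if h.score < h.q_size]
best = max(secondary, key=lambda h: h.score)
print(best.chrom, best.score, best.identity)
```

For these rows, the strongest secondary hit is the chr7 match with score 40 against a self-hit score of 2000.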
BLAT Search Results (down)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
-----------------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr2   -       102844403  102846402  2000
YourSeq    228    305   579   2000     91.0%  chr3   -       142254509  142254776   268
YourSeq     29   1671  1707   2000     75.8%  chr1   +        37560141   37560173    33
YourSeq     26     29    59   2000     77.8%  chr17  +        63716431   63716457    27
YourSeq     25   1460  1488   2000     93.2%  chr7   +       103121389  103121417    29
YourSeq     22    260   281   2000    100.0%  chrX   +        47202950   47202971    22
YourSeq     20    628   673   2000     71.8%  chr3   -        53298161   53298206    46
Note: The 2000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
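The chr2 self-hits of the two flanks also allow a consistency check on the stated KO region size. The sketch below assumes that the effective KO region is exactly the interval lying between the downstream and upstream 2000 bp flanks. The report does not say how its ~9565 bp figure was derived, so this is an interpretation, and `region_between` is a hypothetical helper.

```python
# chr2 coordinates of the 100% self-hits in the two BLAT tables.
# Cd44 lies on the reverse strand, so the flank upstream of exon 4
# has the higher genomic coordinates.
UP_FLANK = (102855968, 102857967)
DOWN_FLANK = (102844403, 102846402)


def region_between(low_flank, high_flank):
    """Return (start, end, size) of the inclusive interval between two flanks."""
    start = low_flank[1] + 1
    end = high_flank[0] - 1
    return start, end, end - start + 1


ko_start, ko_end, ko_size = region_between(DOWN_FLANK, UP_FLANK)
print(ko_start, ko_end, ko_size)
```

Under that assumption the interval is chr2:102846403-102855967, which is 9565 bp and agrees with the size of the effective KO region given in the strategy summary.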
Gene and protein information:
Cd44 CD44 antigen [ Mus musculus (house mouse) ]
Gene ID: 12505, updated on 10-Oct-2019
Gene summary
Official Symbol: Cd44 (provided by MGI)
Official Full Name: CD44 antigen (provided by MGI)
Primary source: MGI:MGI:88338
See related: Ensembl:ENSMUSG00000005087
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ly-24; Pgp-1; HERMES; AU023126; AW121933; AW146109
Expression: Broad expression in lung adult (RPKM 13.1), spleen adult (RPKM 8.5) and 23 other tissues
Orthologs: human; all
Genomic context
Location: 2 E2; 2 54.13 cM
Exon count: 20
Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (102811141..102901669, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (102651298..102741822, complement)
[Figure: genomic context, Chromosome 2 - NC_000068.7]
Description: CD44 antigen [Source:MGI Symbol;Acc:MGI:88338]
Gene Synonyms: HERMES, Ly-24, Pgp-1
Location: Chromosome 2: 102,811,141-102,901,665 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 9 transcripts (splice variants), 184 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 84 phenotypes.
Transcripts
Transcript information: This gene has 9 transcripts
Gene: Cd44 ENSMUSG00000005087
Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Cd44-201  ENSMUST00000005218.14  5614  780aa       ENSMUSP00000005218.8  Protein coding   CCDS16472  A2APM2   TSL:1, GENCODE basic, APPRIS P4
Cd44-202  ENSMUST00000060516.13  3495  580aa       ENSMUSP00000062330.7  Protein coding   CCDS16470  Q80X37   TSL:1, GENCODE basic, APPRIS ALT2
Cd44-203  ENSMUST00000099673.8   2848  365aa       ENSMUSP00000097265.2  Protein coding   CCDS16471  Q3U8S1   TSL:1, GENCODE basic, APPRIS ALT2
Cd44-208  ENSMUST00000111198.8   1981  657aa       ENSMUSP00000106829.2  Protein coding   CCDS50650  A2APM1   TSL:1, GENCODE basic, APPRIS ALT2
Cd44-205  ENSMUST00000111191.8   1533  498aa       ENSMUSP00000106822.2  Protein coding   CCDS50649  A2APM4   TSL:1, GENCODE basic, APPRIS ALT2
Cd44-204  ENSMUST00000111190.8   1412  464aa       ENSMUSP00000106821.2  Protein coding   CCDS50648  A2APM3   TSL:1, GENCODE basic, APPRIS ALT2
Cd44-207  ENSMUST00000111194.7   5045  577aa       ENSMUSP00000106825.1  Protein coding   -          E9QKM8   TSL:5, GENCODE basic, APPRIS ALT2
Cd44-206  ENSMUST00000111192.2   1305  434aa       ENSMUSP00000106823.2  Protein coding   -          A2APM5   TSL:5, GENCODE basic, APPRIS ALT2
Cd44-209  ENSMUST00000124624.1   1907  No protein  -                     Retained intron  -          -        TSL:1
[Figure: Ensembl genome browser view of the Cd44 region, 102.82 Mb to 102.90 Mb (110.53 kb, forward and reverse strands). Tracks: Genes (Comprehensive set) with Gm23731-201 (snRNA) on the forward strand; Contigs (AL844155.6); Cd44-201 to Cd44-208 (protein coding) and Cd44-209 (retained intron) on the reverse strand; Regulatory Build. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene). Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site.]
Transcript: ENSMUST00000005218
[Figure: protein domain and variant view of Cd44-201 (protein coding, reverse strand, 90.48 kb), protein ENSMUSP00000005... Feature tracks: Transmembrane heli...; MobiDB lite; Low complexity (Seg); Cleavage site (Sign... Domain tracks: Superfamily (C-type lectin fold); SMART (Link domain); Prints (CD44 antigen, Link domain); Pfam (Link domain); PROSITE profiles (Link domain); PROSITE patterns (Link domain); PANTHER (PTHR10225, CD44 antigen); Gene3D (C-type lectin-like/link domain superfamily); CDD (cd03516). Sequence variants (dbSNP and all other sources), variant legend: missense variant, splice region variant, synonymous variant. Scale bar: 0 to 780.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.