Bioinformatics Analysis of an Unknown Gene from Lactobacillus acidophilus
Deborah Perez
BI357: Bioinformatics and Computational Biology
Queensborough Community College Honors Conference 2016
Image Source: http://vivatfor.com/bulk-probiotics/
What Do We Know So Far?
Background Information
• Probiotic used in dairy products such as yogurt
• Gram positive
• Low pH environment
• Human gut
Image Source: http://www.optibacprobiotics.co.uk/
The Good Bacterium
Unknown Gene in Lactobacillus acidophilus
>DEBBIEMPLENDLDKVLIIGSWPTLIGSVAEMDLMATEAIDALTEEGIQVVLVNPNPATISTDKRPDVTVYLEPMTLDFLKRILRMEEPDAIITEYGSTNGLKVAHKLLQDGILEQMGIQLLTLNSRMLQMGNQQKRNELLKKLGIDTGKSWELNQGIPDSINSNELTEKITFPVLVTKYNRYVHNEHLHFDNAQDLIDFFKKEKQNDNFNWKNYRLTEDLSSWEEVIVDVIRDKDGNTVFINFAGSIEPVKINSGDSAVTMPALTLNNDHIQELRESVKKIINNLNLIGFSSFHFAIKHYGTQIKSKLLTIRPRLTRSAVWTQRIGLYDVGYIVSKVAIGYRLNEIIDPLSGLNASIEPTLDAIAIKMPYWSFAESGYNHYRLSNRMQASGEAMGVGRNFETAFLKGLHATIDLELGWNAFIQETQKNKDKILEDLANPDELHLVKLLAAIKQGITFAELQKVTGLHPIYYQKLLHIINIANRLISDKDNLSFDLLEEAKVYGFFNTLLAKILNKSVEDVQEIIEQYNLTPSFLKIDGSAGVYKPNVCAYYSAYNVQNEANTLAADKKILILGMLPLQVSVTSEFDYMIAHAAKTLHNNGYVTVLLSNNDESVSSRYKDIDRVYFESITLENILTVANRENIKDILLQFSGKKVSALSKRLEECGLHVIGQVPTNDVHDKIDNLLKENLANLDRVPALKTTQEDDVFEFADQHGFPILIGGMNKDNKQKSAVVYDIPAIEKYLTENQLDQIAVSQFIEGNKYEVTAISDGENVTLPGIIEHLEQTGSHASDSIAVIQPQNLMIKQQNRIEKESIKIIKRLKTRGIFNLHYLFVNDDLYLLQIKPYAGHNVAFLSKSLNKDITACATEVLIGKNLIDMGYPDGLWQTSNFIHIKMPVFSFLNYTSGNTFDSNMKSSGSVMGRDTQLAKALYKGYEASDLHIPSYGTIFISVRDEDKEKVTQLARRFDRLGFKLVATEGTANIFAEAGITTGIVEKVHNNPRNLLEKIRQHKIVMVVNITNLSDAASEDALRIRDQALYTHIPVFSSIETAELILDVLESLALTTQPI
FASTA format
1061-amino-acid sequence
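As a rough illustration of what "FASTA format" means here, the sketch below reads FASTA text into (header, sequence) pairs and reports each sequence length. It assumes the header on the slide is DEBBIE and that the protein begins at MPLEN; the example record uses only the first 20 residues, and the function name parse_fasta is made up for this sketch.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may be wrapped over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Truncated, illustrative record: NOT the full 1061-residue sequence.
example = ">DEBBIE\nMPLENDLDKV\nLIIGSWPTLI\n"
records = parse_fasta(example)
for name, seq in records:
    print(name, len(seq))
```

Running the same function on the full record should report the length stated on the slide, which is a quick check that nothing was lost when copying the sequence between tools.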
Bioinformatics Techniques
• TMHMM
• SignalP
• Phobius
• PSORTb
• Pfam
• PDB (Protein Data Bank)
• CDD
• KEGG
• T-Coffee
• WebLogo
• Phylogenetic Tree
Who, what, where, when, and why?
● What is the protein?
● What does it do?
● Where in the cell does it function?
● When is it active?
● Why does Lactobacillus acidophilus inherit this gene?
Annotation Results
TMHMM Output
Outside - Secretory
No transmembrane helices detected!
SignalP
No signal peptides detected!
Phobius
Non-cytoplasmic – Not found inside cell
PSORTb
Cytoplasmic!
Pfam
Three types of domains found; the hit with the lowest E-value is:
carbamoyl phosphate synthetase large chain!
TIGRFAM
Carbamoyl phosphate synthase (CPSase II)!
Protein Data Bank
3-D cartoon image. EC number 6.3.5.5.
Carbamoyl Phosphate Synthetase Large Subunit
Conserved Domain Finder
Various conserved domains found. Also noted in Pfam.
KEGG Reference Pathway
Pyrimidine Metabolism
Who, what, where, when, and why?
● Who: EC number 6.3.5.5, carbamyl phosphate synthetase large subunit
●Other names: CarB, CPSase II, Carbamyl phosphate synthase
Image Source: http://themedicalbiochemistrypage.org/nucleotide-metabolism.php
Carbamyl phosphate synthetase 3-D structure
Who, what, where, when, and why? (Cont.)
● What: ATP-dependent synthesis of carbamyl phosphate from glutamine. Part of pyrimidine metabolism.
● Where: Cytosol
● When: In the presence of ATP and glutamine
● Why?
Who, what, where, when, and why? (Cont.)
Why does Lactobacillus acidophilus inherit this gene?
Let's Continue the Results
Now to compare the query gene with its counterparts in other organisms.
T-Coffee and WebLogo
Presence of highly conserved regions as well as pockets of diversity.
Figure labels: Diversity; Relatively Conserved Region
Protein BLAST: 6 homologous sequences aligned
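One way to make "highly conserved regions" and "pockets of diversity" concrete is to score each alignment column by its information content, which is roughly the quantity a sequence logo draws as stack height. The sketch below is a simplified version under stated assumptions: the four-sequence alignment is a made-up toy (not the six BLAST homologs from the slide), gaps are ignored, and the small-sample correction that logo tools typically apply is left out.

```python
import math
from collections import Counter


def column_information(alignment):
    """Per-column information content in bits: log2(20) minus Shannon entropy.

    `alignment` is a list of equal-length aligned protein strings.
    Gap characters ('-') are ignored within each column.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(20)  # 20 possible amino acids
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no information
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(max_bits - entropy)
    return scores


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKALS"]
scores = column_information(toy_alignment)
for position, bits in enumerate(scores, start=1):
    print(position, round(bits, 2))
```

Columns where every sequence carries the same residue score the maximum (about 4.32 bits), while mixed columns score lower, which mirrors the tall and short stacks in the logo.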
Phylogenetic Tree
⦿ Only 4% divergence
⦿ Many paralogs detected within Lactobacillus acidophilus and other related species.
⦿ Evidence suggests the paralogs arose during speciation within the Lactobacillus genus.
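The slides do not say how the 4% divergence figure was obtained. One simple reading is the proportion of aligned positions that differ between two sequences (sometimes called the p-distance); tree-building tools may instead report a corrected substitutions-per-site distance. The sketch below shows only the simple reading, on hypothetical toy sequences; percent_divergence is a name invented for this example.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of aligned, non-gap positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Hypothetical 25-residue toy pair that differs at exactly one position.
toy_a = "ACDEFGHIKLMNPQRSTVWYACDEF"
toy_b = toy_a[:-1] + "G"
divergence = percent_divergence(toy_a, toy_b)
print(divergence)
```

One mismatch in 25 compared positions gives 4.0, which shows the scale of difference a 4% value would imply under this assumption.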
Who, what, where, when, and why? (Cont.)
Why does Lactobacillus acidophilus inherit this gene?
● RNA synthesis
● Balances the supply of pyrimidines
● Conclusively a paralog
● Evidence shows the paralog came from speciation events
● Whether it has an alternate function remains inconclusive
Questions?
Acknowledgements
Dr. Peter Novick
The internet