daTAA server
Server daTAA: http://toolkit.tuebingen.mpg.de/dataa
Paweł Szczęsny
MPI for Developmental Biology, Tuebingen, Germany
Institute of Biochemistry and Biophysics PAS, Warsaw, Poland
Internal complexity of TAAs
MFWMCFVIFFIGEFIMKKLSVTSKRQYNLYASPISRRLSLLMKLSLETVTVMFLLGASPVLA/SNLALTGAKNLSQNSPGVNYSKGSHGSIVLSGDDDFCGADYVLGRGGNSTVRNGIPISVEEEYERFVKQKLMNNATSPYSQSSEQQVWTGDGLTSKGSGYMGGKSTDGDKNILPEAYGIY-------------------------SFATGCGSSAQGNY-------------------------SVAFGANATALTGG-------------------------SQAFGVAALASGRV-------------------------SVAIGVGSEATGEA-------------------------GVSLGGLSKAAGAR-------------------------SVAIGTRANAYGEE-------------------------SIAIGGGLKQGSDNKIGSAVAQGLK-------------------------AISIGSDSVGFQHY-------------------------AVAIGAKSRALLLK-------------------------SVALGSYSVADVDAGVRGYDPVEDEPSKNVSFVWKSSVGAVSVGNRKEGLTRQIIGVAAG---TEDTDAVNVAQLKALR:GMISEK|GGWNLTVNNDNNTVVSSGGALDLSSGSKNLKIAKDGKKNNVTFDVARDLTLKSIKLDGVTLNETGLFIANGPQITASGINAGSQKITGVAEG---TDANDAVNFGQL-----------------------------------------------------------------------------------KKI|ETEVKE-----QVAASGFVKQDSDTK:YLTIGKDTDGDTINIANNKSDKRTLMGIKEGDISKDSSEAITGSQLFTTNQNVKTVSDNLQTAATNIAKTFGGDAKYE-DGEWTAPTFKVKTVTGEGKE-EEKTYQNVADALAGVGSSITNVQ-------NKVTEQVNNAIT--KVEGDALLWSDEANAFVARHEKSKLEKGASKATQENSKITYLLDGDVSKDSTDAITGKQLYSLGD--------------KIASYLGGNAKYE-NGEWTAPTFKVKTVKEDGKE-EEQTYHNVAAAFEGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKDDK-NGSINYASVTLGKGKDSAAVTLHNVAAGNIAKDSHDAINGSQIYSLNE--------------QLATYFGGGAGYNKEGKWTAPTFTVKTVKEDGEE-EEKTYQNVAEALTGVGTSFTNIK-------SEITKQIANEIS--NVTGDSLVKKDLDTNLITIGKEVAGTEINIASVSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------------------------DKGLKHLSDSLQSEDSAVVHYDKKTDETGGINYTSVTLG-GKDKTPVALHNVADGSISKDSHDAINGGQIHTIGE--------------DVAKFLGGAASFN-NGAFTGPTYKLSNIDAKGDV-QQSEFKDIGSAFAGLDTNIKNVNNNVTNKFNELTQNITNVTQ--QVKGDALLWSDEANAFVARHEKSKLGKGASKATQENSKITYLLDGDVSKDSTDAITGKQLYSLGD--------------KIASYLGGNAKYE-DGEWTAPTFKVKTVKEDGKE-EEKTYQNVAEALTGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKNKDETGGINYASVTLGKGKDSAAVTLHNVADGSISKDSRDAINGSQIYSLNE--------------QLATYFGGGAKYE-NGQWTAPIFKVKTVKEDGEE-EEKTYQNVAEALTGVGTSFTNIK-------SEITKQIANEIS--SVTGDSLVKKDLATNLITIGKEVAGTEINIAS
VSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPTFKVKTVNGEGKE-EEQTYQNVAEALTGVGASFMNVQNKIT---NEITNQVNNAIT--KVEGDSLVKQDNLG-IITLGKERGGLKVDFANRDGLDRTLSGVKEA---VNDNEAVNKGQL---------------------------------------------------------------------DADISKVNNNVTNKFNELTQNITNVTQ--QVKGDALLWSDEANAFVARHEKSKLEKGVSKATQENSKITYLLDGDISKGSTDAVTGGQLYSLNE--------------QLATYFGGDAKYE-NGQWTAPTFKVKTVNGEGKE-EEQTYHNVAAAFEGVGTSFTNIK-------SEITKQINNEIS--NVKGDSLVKKDLATNLITIGKEVAGTEINIASVSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPTFKVKTVNGDGKE-EEQTYQNVAEALTGVGTSFTNVQNKIT---NEITNQVNNAIT--KVEGDSLVKQDNLG-IITLGKERGGLKVDFANRDGLDRTLSGVKEA---VNDNEAVNKGQL---------------------------------------------------------------------DANISKVNNNVTNKFNELTQNITNVTQ--QVQGDTLLWSDEANAFVARHEKSKLEKGVSKATQENSKITYLLDGDISKGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGEWTAPTFKVKTVNGEGKE-EEQTYHNVAAAFEGVGTSFTNIK-------SEITKQIDNEII--NVKGDSLVKRDLATNLITIGKEIEGSAINIANKSGEARTISGVKEA---VNNNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPSFKVKTVKEDGKE-EEQTYQNVAEALTGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKNKDETGTINYASVTLGKGKDSAAVTLHNVADGSISKDSRDAINGGQIHTIGE--------------DVAKFLGGDAAFK-DGAFTGPTYKLSNIDAKGDV-QQSEFKDIGSAFAGLDTNIKNVNNNVTNKFNELTQSITNVTQ--QVKGDSLLWSDEANAFVARHEKSKLEKGASKAIQENSKITYLLDGNVSKGSTDAVTGGQLYSMSN--------------MLATYLGGNAKYE-NGEWTAPTFKVKTVNGEGKE-EEQTYQNVAEALTGVGTSFTNIK-------SEIAKQINHL----QSDDSAVIHYDKNKDETGTINYASVTLGKGEDSAAVALHNVAAGNIAKDSRDAINGSQLYSLNE--------------QLLTYFGGNAGYK-DGQWIAPKFQVSQFKSDGSSGEKESYDNVAAAFEGVNKSLAGM--------NERINNVVTAGQ-
-NVSSNSLNWNETEGGYDARHNGVDSKLTHVENGDVSEKSKEAVNGSQLWNTNEKVEAVEKDVKNIEKKVQDIATVADSAVKYEKDSTGKKTNVIKLVGGSESDPVLIDNVADGDIKEGSKQAVNGGQLRDYTEKQMKIVLEDAKKYTDERFNDVVNNGVNEAKAYTDMKFEALSYAVEDVRKEARQAQLLVWRYLTYVTMIYRDL AAIGLAVSNLRYYDIPGSLSLSFGTGIWRSQSAFAVGAGYTSEDGNIRSNLSITNAGGHWGVGAGITLRLK
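Automated vs manual annotation

Coverage of annotation

| Domain type | PFAM | manually |
|---|---|---|
| Present in PFAM | 28% | 35% |
| Not present in PFAM | - | 18% |
| Coiled coils | - | 3% |
| Total | 28% | 56% |

| Domain type | PFAM | manually |
|---|---|---|
| Present in PFAM | 26% | 31% |
| Not present in PFAM | - | 36% |
| Coiled coils | - | 25% |
| Total | 26% | 92% |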
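Automated vs manual annotation

Coverage of annotation

| Domain type | PFAM | daTAA | manually |
|---|---|---|---|
| Present in PFAM | 28% | 32% | 35% |
| Not present in PFAM | - | 13% | 18% |
| Coiled coils | - | 5% | 3% |
| Total | 28% | 50% | 56% |

| Domain type | PFAM | daTAA | manually |
|---|---|---|---|
| Present in PFAM | 26% | 28% | 31% |
| Not present in PFAM | - | 27% | 36% |
| Coiled coils | - | 11% | 25% |
| Total | 26% | 66% | 92% |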
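The slides do not say how the coverage percentages were computed. As an illustration only, the sketch below assumes that coverage means the fraction of residues lying inside at least one annotated region. The function names, the 1-based inclusive interval convention, and the example numbers are all hypothetical and are not taken from daTAA or PFAM.

```python
from typing import Iterable, List, Tuple

# Assumed convention: 1-based, inclusive residue positions (start, end).
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals so no residue is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def coverage(seq_length: int, intervals: Iterable[Interval]) -> float:
    """Fraction of residues that fall inside at least one annotated interval."""
    if seq_length <= 0:
        raise ValueError("seq_length must be positive")
    covered = sum(end - start + 1 for start, end in merge_intervals(intervals))
    return covered / seq_length


# Made-up annotation of a hypothetical 200-residue protein.
example = [(1, 50), (40, 80), (150, 169)]
print(f"{coverage(200, example):.0%}")  # prints 50%
```

Merging intervals before counting keeps overlapping domain hits from inflating the total. That matches the tables above only if the per-category percentages refer to non-overlapping regions, which the slides do not confirm.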
Prediction of individual repeats in YadA
|----------Hep_Hag---------|---------Hep_Hag-
|---Ylhead---|---Ylhead----|-----
ASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALG
----------|
|---------Hep_Hag-------Ylhead--|---Ylhead---|---Ylhead---|----Ylhead--
DSAVTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIG
---|
|----------Hep_Hag---------|-|----Ylhead-----|----Ylhead---|
HSSHVAANHGYSIAIGDRSKTDRENSVSIGHESL
Key points
The approach of a human annotator, implemented in software
Improved coverage and accuracy compared with general annotation servers
Unique workflow with knowledge-based rules
Visual helpers for interpretation of the results
Acknowledgements
MPI for Developmental Biology: Andrei Lupas, Dirk Linke, Toolkit development team
Institute of Biochemistry and Biophysics PAS: Piotr Zielenkiewicz, Marcin Grynberg