Decoding our bacterial overlords - Melbourne Knowledge Week - Tue 28 Oct 2014

Decoding our bacterial overlords

Dr Torsten Seemann

A bacterium

Bacteria are diverse

5,000,000,000,000,000,000,000,000,000,000,000

000,000,000,000,000,000,000,000,000,000,000,

000,000,000,000,000,000,000,000.

Bacteria run the show

100,000,000,000,000

1,000,000

90% microbial

Help digest our food

Essential for human life

Immune system

Synthesize vitamins

E. coli: “good” (colon), “bad” (bladder)

Bacteria are not malicious

6,000,000,000 letters

The blueprint of life

Genome

A T G C

4,000,000 letters

Reading the genome

Extract the DNA

Chop it into small pieces

Read DNA of each piece

We had a bunch of nice long DNA (each 4 million letters long)

We got back millions of short DNA (each only 200 letters long)

We want our nice long DNA back! (please)
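
To make the "long DNA in, short pieces out" step concrete, here is a minimal Python sketch that samples fixed-length reads from random positions in a genome string. It is an illustration only, not the sequencing protocol from the talk: the function name `shotgun_reads`, the `coverage` parameter, and the random toy genome are all assumptions, and the genome is scaled down from 4 million to 4,000 letters so it runs instantly.

```python
import random


def shotgun_reads(genome, read_length=200, coverage=10, seed=0):
    """Sample fixed-length substrings ("reads") from random positions.

    `coverage` (an assumed parameter) is how many times, on average,
    each letter of the genome ends up inside some read.
    """
    rng = random.Random(seed)
    n_reads = coverage * len(genome) // read_length
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_length + 1)
        reads.append(genome[start:start + read_length])
    return reads


# Toy genome: 4,000 random letters standing in for ~4,000,000.
toy_rng = random.Random(42)
genome = "".join(toy_rng.choice("ATGC") for _ in range(4_000))

reads = shotgun_reads(genome)
print(len(reads), "reads of", len(reads[0]), "letters")  # 200 reads of 200 letters
```

The reads come back with no record of where they started, which is why the original long sequence has to be reconstructed afterwards.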

Can’t always get what you want

Genome assembly

Reconstruct the DNA of the chromosome(s)

Like a jigsaw puzzle, but ...

● No box

● Millions of pieces

● Missing and duplicate pieces

● Broken pieces

● No corner or edge pieces

→ Usually end up with ~200 sequences
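
The jigsaw idea can be sketched in a few lines of Python. This is a toy greedy approach (repeatedly join the two pieces that overlap the most), chosen here only because it is easy to read; it is not the method used in the talk, and production assemblers rely on more elaborate graph-based algorithms. The function names `overlap` and `greedy_assemble` and the `min_len` threshold are assumptions made for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that is a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of pieces until nothing overlaps."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate pieces
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # no remaining overlaps: these are the final pieces
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Four overlapping pieces of ATGGCGTACCTGAAT, plus one duplicate piece.
pieces = ["ATGGCGT", "GCGTACC", "TACCTGA", "CTGAAT", "GCGTACC"]
print(greedy_assemble(pieces))  # ['ATGGCGTACCTGAAT']
```

When pieces are missing and nothing bridges a gap, the loop stops with several separate sequences instead of one, which mirrors why a real assembly usually ends in many sequences rather than a single chromosome.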

Finding genes

Genome is ~4,000,000 letters long

Contains ~4,000 genes

Each gene is ~800 letters long

Genes start and end with special triplets

←ATGCATGATTAGCTTTTAGTCTTATAATGTCTTATATATCGCATTTAAGCCCTGATTCTATGAATG→
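
A rough sketch of what "start and end with special triplets" means in practice: scan the DNA three letters at a time, open a candidate gene at the start triplet ATG, and close it at the first stop triplet (TAA, TAG or TGA) in the same reading frame. This simplified Python version is an assumption-laden illustration, not the gene finder from the talk: it reads only the forward strand, recognises only ATG as a start, and the names `find_orfs` and `min_codons` are made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each forward-strand candidate gene.

    A candidate runs from an ATG to the first in-frame stop triplet.
    `min_codons` (an assumed cutoff) discards very short hits.
    """
    orfs = []
    for frame in range(3):  # three ways to split the letters into triplets
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return sorted(orfs)


print(find_orfs("CCATGAAATAGCC"))  # [(2, 11, 'ATGAAATAG')]

# The example sequence shown on the slide:
slide_seq = "ATGCATGATTAGCTTTTAGTCTTATAATGTCTTATATATCGCATTTAAGCCCTGATTCTATGAATG"
for start, end, seq in find_orfs(slide_seq):
    print(start, end, seq)
```

Since DNA is double-stranded, a fuller version would run the same scan on the reverse complement as well, and a real gene would be hundreds of triplets long rather than a handful.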

Applications

● Identify new species

● Find resistance genes

● Understand evolution

● Trace outbreak origin

Computational challenge

2,000 finished genomes

10,000 assembled draft genomes

200,000 downloadable genomes

2,000,000 sitting on USB disks?

Genome assembly is different

- RAM more useful than CPU
