Bioinformatics and data mining: application in dairy cattle nutrition and physiology Juan J. Loor Associate Professor Department of Animal Sciences and Division of Nutritional Sciences University of Illinois, Urbana-Champaign, USA

Upload: vothuan

Post on 29-Apr-2018


Page 1: Bioinformatics and data mining: application in dairy ... · Bioinformatics and data mining: application in dairy cattle nutrition and physiology ... Enrichment tools systematically

Bioinformatics and data mining: application in

dairy cattle nutrition and physiology

Juan J. Loor Associate Professor

Department of Animal Sciences and Division of Nutritional Sciences

University of Illinois, Urbana-Champaign, USA


Outline

1. Biological context:

a. Dairy cattle physiology and metabolism

b. Nutrition

c. Interaction outcomes at the physiological level

2. Bioinformatics, data mining, and systems biology

3. Application of these tools in dairy cattle


Optimized nutrition depends on the target(s)

[Figure: production-cycle timeline. Stage labels: 2+ calving, lactation, re-bred, dry period, calving. Time labels: 0, 10 months, 2-4 months, 12 months. Diet labels: TMR, TMR.]

Common and important goals from a nutrition standpoint:

• Minimize incidence of disease around calving (“transition”)

• Enhance the ability of the cow to get pregnant

• Early lactation: enhance nutrient use for milk synthesis

• Post-peak lactation: enhance body tissue replenishment

• Dry period: minimize body fat deposition

• Fetal growth and development?? (epigenetics and heredity)


The transition cow: a complex and dynamic system


Energy (NEL, Mcal) requirements 2 days before versus 2 days after calving

                     725-kg Cow        570-kg Heifer
Function             Pre     Post      Pre     Post
Maintenance          11.2    10.1       9.3     8.5
Pregnancy             3.3    ---        2.8    ---
Growth               ---     ---        1.9     1.7
Milk production      ---     18.7      ---     14.9
Total (Mcal)         14.5    28.8      14.0    25.1

Calculated from NRC (2001). Assumes milk production of 25 kg/d for cow and 20 kg/d for heifer, each containing 4% fat.

Typical intake: 14-17; 19-21
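The totals in the table can be checked with a few lines of arithmetic. The sketch below only re-adds the values transcribed from the table; the dictionary layout and the function names `total_nel` and `fold_increase` are illustrative choices, not part of the original slides.

```python
# NEL requirements (Mcal/d) transcribed from the table above (NRC, 2001).
# Keys and function names are illustrative, not from the source.
requirements = {
    ("cow", "pre"): {"maintenance": 11.2, "pregnancy": 3.3},
    ("cow", "post"): {"maintenance": 10.1, "milk": 18.7},
    ("heifer", "pre"): {"maintenance": 9.3, "pregnancy": 2.8, "growth": 1.9},
    ("heifer", "post"): {"maintenance": 8.5, "growth": 1.7, "milk": 14.9},
}


def total_nel(animal, stage):
    """Sum the component requirements for one animal and stage."""
    return round(sum(requirements[(animal, stage)].values()), 1)


def fold_increase(animal):
    """Ratio of the postpartum to the prepartum total requirement."""
    return total_nel(animal, "post") / total_nel(animal, "pre")


for animal in ("cow", "heifer"):
    print(animal, total_nel(animal, "pre"), total_nel(animal, "post"),
          round(fold_increase(animal), 2))
```

In both cases the requirement close to doubles within four days, which is the point of the slide.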

[Figure: lipid and glucose metabolism in a cow in negative energy balance (modified from Drackley, 1999). Labels: feed intake; adipose tissue (TG, NEFA; insulin, NE, Epi); liver (NEFA, TG, VLDL, ketone bodies, CO2, mitochondria, propionate, glucose, amino acids, glycerol, Met, P-Choline); mammary gland (NEFA, TG, milk fat).]


Incidence of transition period health phenotypes in

high-producing herds (US National Animal Health Monitoring System)

                            Mean (%)                          Range (%) [1]
                            1993 [1]  1996    2001    2006
Clinical mastitis           14.1      13.4    14.7    16.5    0 to 20
Milk fever                   7.2       5.9     5.2     4.9    0 to 44
Displaced abomasum*          3.3       2.8     3.5     3.5    0 to 14
Clinical ketosis*            3.7       4.8     4.1     --     0 to 20
Retained fetal membranes     9.0       7.8     7.8     7.8    0 to 22
Metritis                    12.8       --      --      --     0 to 66

(modified from Goff, 2006)

*Strong association with liver lipidosis

[1] Jordan and Fourdraine (1993)

Subclinical ketosis most predominant


Majority of cows leave herd soon after calving

[Figure: timing of culling (% of cows culled, 0% to 12%) by week postcalving (1 to 60); 624,614 cows leaving 5,749 herds, 1996-2001; annotation: 25% of cullings. Godden et al. (2003); source: Overton and Waldron (2004).]


How might nutrition and physiology impact

incidence of health problems?

• Immune system (metritis, mastitis, retained

placenta)

• Body fat mobilization (ketosis, fatty liver)

• Intakes of dry matter and energy (ketosis,

milk fever, displaced abomasum)

• Rumen function (ketosis, displaced

abomasum)

• Supply of glucose precursors (ketosis)

• Calcium and mineral balances (milk fever,

subclinical hypocalcemia)


Cows can consume enough energy to meet requirements during transition period from a variety

of diets

Dietary NEL (Mcal/kg)        DMI (kg) for 15 Mcal
1.30 (high straw)            11.5
1.40                         10.7
1.50                         10.0
1.60 (typical close-up)       9.4

Dry cows will easily consume more energy than they require
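The dry matter intake (DMI) column follows from dividing the energy target by the dietary energy density. A minimal sketch of that arithmetic, where the function name `dmi_for_energy` is an illustrative assumption:

```python
def dmi_for_energy(target_mcal, nel_mcal_per_kg):
    """Dry matter intake (kg/d) needed to supply target_mcal of NEL
    from a diet with the given energy density (Mcal NEL per kg DM)."""
    return target_mcal / nel_mcal_per_kg


# Energy densities taken from the table above; the target is 15 Mcal/d.
for density in (1.30, 1.40, 1.50, 1.60):
    print(f"{density:.2f} Mcal/kg -> {dmi_for_energy(15, density):.1f} kg DM/d")
```

Because even the high-straw diet needs only about 11.5 kg of dry matter to reach 15 Mcal, a cow eating more than that exceeds her requirement, which is the slide's conclusion.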


Energy balance is altered by prepartal energy intake

[Figure: energy intake (% of required, 0 to 200) by weeks relative to parturition (-10 to 8) in a one-diet dry period feeding program; overfed energy (1.63 Mcal/kg) vs. controlled energy (1.30 Mcal/kg). Statistics shown: diet, diet × week: P < 0.001; diet: P < 0.002; diet × week: P < 0.10. (Modified from Janovick et al., 2010)]


Overfeeding of moderate-energy diets increases

postpartal hepatic lipid storage and risk of metabolic

disorders

[Figure: liver TAG (% of wet wt, 0 to 7) by day relative to parturition (pretrial, -14, 1, 14, 28); Controlled vs. Overfed.]

Variable                CON    OVER    P
Displaced abomasum      0      4       0.01
Ketosis                 1      6       0.03
Mastitis                2      3       0.11
Cows with >1 problem    1      6       0.06

Janovick et al., 2011

TAKE HOME MESSAGE ON TRANSITION PERIOD

[Figure: four panels by day relative to parturition (-14, -7, 0, 7, 14, 21, 28, 42, 63): non-esterified fatty acids (NEFA, mmol/L, 0.1 to 0.7); insulin (IU/mL, 1.0 to 4.5); haptoglobin (g/L, 0.0 to 0.5); reactive oxygen species (mg H2O2/100 mL, 11.0 to 14.0). Bionaz et al. (2008); Bionaz et al. (2012).]

TAKE HOME MESSAGE ON TRANSITION PERIOD: METABOLIC, INFLAMMATORY & OXIDATIVE STRESS

[Figure: the same four panels (NEFA, insulin, haptoglobin, reactive oxygen species by day relative to parturition), overlaid with the message above. Bionaz et al. (2008); Bionaz et al. (2012).]


Systems biology, functional genomics, and

bioinformatics: terminology/concepts,

techniques, and applications


What is bioinformatics?

• The unified discipline formed from the

combination of biology, computer science, and

information technology.

• The mathematical, statistical and computing

methods that aim to solve biological problems

using DNA and amino acid sequences and

related information.

• Essential tool for understanding interrelationships

among components of biological systems

NCBI (National Center for Biotechnology

Information)


“Dogma of molecular biology” as it relates to metabolism and physiology

[Figure: schematic from genes to tissue function phenotype (modified from Loor and Cohick, JAS 2009). Labels: G1, G2, G3 promoters with CpG, CH3 and Ac marks; transcription regulators (PPAR, P); transcriptomics; proteomics (CPT-1A, ACACA, FASN); metabolomics (16:0, MalCoA, AcCoA); protein/protein, protein/DNA and metabolite/protein interactions; stoichiometry, kinetics, metabolic control; metabolon; tissue function phenotype.]

[Figure: nutrient effects on gene expression. Labels: nutrient; TF1 and TF2 at RE; transcription; mRNA (m7G, AAAA); translation; protein; metabolites. Omics layers: transcriptomics, miRNAomics, proteomics, metabolomics.]


Most-common “transcriptomics” tools

“Microarray” dates back to ~1995

“RNA-Sequencing” newer (e.g. yeast transcriptome; 2008 Science 320:1344)


Transcriptome data analysis pipeline

Quality evaluation of hybridizations (GPRparser, Perl):
- Pick only good spots → 3 SD above median background
- Total number of good spots (min 20,000)
- Mean intensity per channel (dye), min 200 RFU
- SD per channel (compare to mean)
- Intensity among channels (check difference)

Statistical analysis:
- Lowess normalization and PROC MIXED in SAS

Filtering:
- Minimum P-value and False Discovery Rate (FDR)
- Minimum degrees of freedom (= presence of data)

Database for bioinformatics/data mining
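The spot-quality rules on this slide can be sketched in a few lines. This is not the GPRparser Perl script named above; it is a minimal Python illustration that assumes "3 SD above median background" means a spot's foreground intensity must exceed the median background plus three standard deviations of the background. The function names and default thresholds simply mirror the numbers on the slide.

```python
from statistics import mean, median, pstdev


def good_spots(foreground, background, n_sd=3.0):
    """Keep foreground intensities above median(background) + n_sd * SD.
    Assumption: this is how the slide's '3 SD above median background'
    rule is applied; the original script may differ."""
    threshold = median(background) + n_sd * pstdev(background)
    return [value for value in foreground if value > threshold]


def passes_qc(foreground, background, min_good=20000, min_mean=200.0):
    """Apply the slide's per-channel minimums: number of good spots and
    mean intensity (RFU) of those spots."""
    good = good_spots(foreground, background)
    if not good or len(good) < min_good:
        return False
    return mean(good) >= min_mean


# Toy channel with four spots; a real array would have thousands.
background = [100, 110, 90, 105, 95]
foreground = [120, 130, 500, 90]
print(good_spots(foreground, background))
print(passes_qc(foreground, background, min_good=2))
```

The toy call lowers `min_good` to 2 only so that a four-spot example can pass; the slide's minimum is 20,000.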


What can we do with all the data ??

6,579 DEG with FDR ≤ 0.001

Bovine mammary transcriptomics

(Bionaz et al., 2012)
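A list of differentially expressed genes (DEG) at a given FDR threshold comes from adjusting the per-gene p-values for multiple testing. The slides do not say which FDR procedure was used; the sketch below assumes the Benjamini-Hochberg step-up adjustment, and the function name and toy p-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy p-values for four genes (not data from the study).
pvalues = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvalues)
deg = [i for i, q in enumerate(adjusted) if q <= 0.03]
print(adjusted, deg)
```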


Individual datasets ► several objectives ► multiple approaches

Affected genes

Unsupervised approach: temporal transcriptomic adaptations
- Finding similar functions/gene regulation
- k-means clusters
- Promoter motifs

Supervised approach: known functions
- adipogenesis
- inflammation
- energy generation
- protein synthesis
- lipid synthesis
- vesicle transport
- etc.
- Metabolic, adipogenic, lactation gene sets
- Genes consistently affected by a physiological state, e.g., growth or lactation
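The unsupervised route groups genes by the shape of their expression profile over time. A minimal k-means sketch in plain Python is shown below; it assumes squared Euclidean distance and, for reproducibility, takes the first k profiles as starting centroids. The gene names and values are invented for illustration, and real analyses normally use a statistics package with better initialization.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, max_iter=100):
    """Cluster temporal profiles; returns (labels, centroids)."""
    centroids = [list(p) for p in profiles[:k]]  # deterministic start
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical log2 fold changes at three time points.
names = ["gene_up_1", "gene_down_1", "gene_up_2", "gene_down_2"]
profiles = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0], [0.1, 1.1, 2.1], [2.1, 0.9, 0.1]]
labels, centroids = kmeans(profiles, k=2)
print(dict(zip(names, labels)))
```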


Putting it all together….

the Systems concept


Systems biology: study of interactions between the components of biological systems; function and behavior

“…is about putting together rather than taking apart, integration rather than reduction…” (Wikipedia)

“Every object that biology studies is a system of systems.” Francois Jacob (1974)

System network of interconnected components unified whole. Every system exhibits emergent behavior individual components (Anthony Trewavas, The Plant Cell, 2006)

[Photo caption: Lee Hood, 2011 National Medal of Science]

Holism / Reductionism

The systems concept


Bovine Genome Sequencing and Analysis Consortium: Elsik et al. (2009)

Bovine genome annotation consortium: Reecy et al. (2010)

Improvement and introduction of new technologies (omics)

Critical for systems biology of cattle in the genome era


Number of scientific papers using transcriptomics

(microarray) or proteomics approaches to study bovine has

increased over time (PubMed, through May, 2012)

(Loor et al., 2013)

[Figure: systems workflow in three stages. Experiment: diet, physiology, etc. Measurements: milk, tissue composition, blood (fatty acids, proteins, carbohydrates, lipid, glycogen), RNAseq, functional assays. Bioinformatics analyses: groupings and networks, biological functions, build an all-encompassing model.]


[Figure: the same workflow (Experiment, Measurements, Bioinformatics analyses), with the added conclusion:]

Learning about the whole system …and the dynamic interactions


So, how do we put it all together ??


Traditional analysis…..before transcriptomics

Gene 1: Apoptosis; Cell-cell signaling; Protein phosphorylation; Mitosis

Gene 2: Growth control; Mitosis; Oncogenesis; Protein phosphorylation

Gene 3: Growth control; Mitosis; Oncogenesis; Protein phosphorylation

Gene 4: Nervous system; Pregnancy; Oncogenesis; Mitosis

Gene 100: Positive ctrl. of cell prolif.; Mitosis; Oncogenesis; Glucose transport


There is a lot of biological research output

You are interested in specific genes

You get 1,893 results!

How will you ever find what you want?


Help! You work hard to read… more and more and more!!

http://www.teamtechnology.co.uk/f-scientist.jpg


The Gene Ontology (GO) consortium


[Screenshot labels: definition of mesoderm development; gene products involved in mesoderm development]


Different ways of grouping genes: e.g. by biological process

Apoptosis: Gene 1, Gene 53

Mitosis: Gene 2, Gene 5, Gene 45, Gene 7, Gene 35

Positive ctrl. of cell prolif.: Gene 7, Gene 3, Gene 12

Growth: Gene 5, Gene 2, Gene 6

Glucose transport: Gene 7, Gene 3, Gene 6

• Annotations give ‘function’ label to genes

• Ask meaningful questions of omics data e.g.

– genes/proteins involved in the same process, same/different expression patterns?
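Moving from a gene-by-gene list to groups by biological process amounts to inverting the annotation table. A small sketch, using the gene-to-function labels from the "traditional analysis" slide as toy input; the function name `group_by_term` is an illustrative assumption and no real GO identifiers are involved.

```python
from collections import defaultdict

# Toy annotations copied from the gene-by-gene slide.
gene_annotations = {
    "Gene 1": ["Apoptosis", "Cell-cell signaling", "Protein phosphorylation", "Mitosis"],
    "Gene 2": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 3": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 4": ["Nervous system", "Pregnancy", "Oncogenesis", "Mitosis"],
    "Gene 100": ["Positive ctrl. of cell prolif.", "Mitosis", "Oncogenesis", "Glucose transport"],
}


def group_by_term(annotations):
    """Invert gene -> terms into term -> genes."""
    groups = defaultdict(list)
    for gene, terms in annotations.items():
        for term in terms:
            groups[term].append(gene)
    return dict(groups)


groups = group_by_term(gene_annotations)
for term, genes in sorted(groups.items(), key=lambda item: -len(item[1])):
    print(f"{term}: {', '.join(genes)}")
```

Once genes are grouped this way, questions such as "do genes in the same process share an expression pattern?" become simple lookups.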


Biological process

regulation of gluconeogenesis


Commercial software packages

use GO annotation information

but can be costly! ($10K/year)


Cellular Functions

[Figure: number of oligos (0 to 600; overall, upregulated, downregulated) over the time course (-34, -14, -4, 0, 7, 14, 21, 28; all time points vs. previous), FDR < 0.05, P < 0.001.]

[Figure: cellular functions down-regulated and up-regulated in the chronological comparison 0 vs. -4 d. Functions listed: cell-to-cell signaling and interaction; cell assembly and organization; nucleic acid metabolism; carbohydrate metabolism; cell morphology; cell signaling; cell growth and proliferation; molecular transport; gene expression; small molecular biochemistry; cellular development; cellular function and maintenance; cell death; drug metabolism; cellular movement; cellular compromise; amino acid metabolism; lipid metabolism; free radical scavenging; protein degradation; post-translational modification. Highlighted: carbohydrate metabolism; molecular transport; small molecular biochemistry; amino acid metabolism; lipid metabolism; FA, anions, amino acids; secretion (exocytosis).]

(Tramontana et al., 2008)


(freely accessible)


KEGG Pathway Database


There is a wealth of bioinformatics resources….


Bioinformatics tools for data mining


The enrichment analysis concept in bioinformatics

Why useful ? Tissue biological processes and functions –

many genes rather than an individual gene

Enrichment tools systematically map a large number of

affected genes in an experiment to an associated

biological annotation term, function, and pathway

Goal: Annotation terms with enriched gene members will

give important insights to understand biological meaning

behind the large gene list

“…path towards comprehensive functional analysis of large gene lists.” (Huang da et al., 2009)
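To make the enrichment idea concrete, the sketch below computes an over-representation p-value for one annotation term. The slides do not name a specific test; a one-sided hypergeometric test is assumed here because it is a common choice for this kind of analysis. The function and argument names are illustrative.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_selected, n_overlap):
    """One-sided hypergeometric p-value: probability of seeing at least
    n_overlap term members among n_selected genes drawn from a
    background of n_background genes, n_term of which carry the term."""
    upper = min(n_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes on the array, 3 annotated to the term,
# 3 affected genes, 2 of which are annotated to the term.
print(round(enrichment_p_value(10, 3, 3, 2), 4))
```

In practice one p-value is computed per term and the whole set is then corrected for multiple testing.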


“Enrichment” analysis: maps genes, metabolites, proteins to biological functions and pathways

[Diagram: components of an enrichment tool]
- User to input gene list (physiological context, nutrition)
- Back-end annotation database: “Association of biological terms to a gene/s”
- Data mining: algorithms (sort and organize annotation terms); statistics (calculate enrichment p-values with suitable methods)
- Presentation of results: enriched terms


An example of affected pathways in a nutrigenomics

experiment: liver

[Figure: metabolic map of affected pathways in liver, drawn at the metabolite level. Pathways labeled: Glycolysis/Gluconeogenesis; TCA Cycle; Oxidative Phosphorylation; Urea Cycle; Fatty Acid Metabolism; Glutamate Metabolism; Glutathione Metabolism; Fructose Metabolism; Galactose Metabolism; Amino Sugar and Nucleotide Sugar Metabolism. Legend: treatments A vs control and B vs control; identical regulation in both diets; unequal regulation in both diets.]


Existing pathway analysis methods


Limitations of enrichment approach (ORA):

“Uses only the most-significant genes and discards others.”:

biological information is lost

“Assumes the behavior of each gene or pathway is

independent from another.”:

biological control is not a function of a gene, but

groups of genes

“Cannot handle time-course datasets”.


What’s new?

Accounts for the proportion of Differentially Expressed Genes (DEG), fold change, and p-value

Allows the impact on pathways/functions to be followed through time and between multiple treatments

Provides the overall direction (“activation”/“inhibition”) of the impact on a pathway/function based on the genes affected

Can use any publicly available annotation database
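The slide lists the ingredients of the score (proportion of DEG, fold change, p-value) and a direction, but not the formula. The sketch below is one plausible way to combine them and should be read as an assumption, not as the published computation: impact is taken as the proportion of DEG in the pathway times their mean absolute log2 fold change times their mean -log10 p-value, and direction as the impact of up-regulated DEG minus that of down-regulated DEG. All names and numbers are illustrative.

```python
from math import log10


def impact(n_pathway_genes, degs):
    """Assumed impact score for one pathway at one comparison.
    degs: list of (log2_fold_change, p_value) for DEG in the pathway."""
    if not degs or n_pathway_genes <= 0:
        return 0.0
    proportion = len(degs) / n_pathway_genes
    mean_abs_fc = sum(abs(fc) for fc, _ in degs) / len(degs)
    mean_neg_log_p = sum(-log10(p) for _, p in degs) / len(degs)
    return proportion * mean_abs_fc * mean_neg_log_p


def direction(n_pathway_genes, degs):
    """Assumed direction: positive suggests activation, negative inhibition."""
    up = [d for d in degs if d[0] > 0]
    down = [d for d in degs if d[0] < 0]
    return impact(n_pathway_genes, up) - impact(n_pathway_genes, down)


# Hypothetical pathway with 10 measured genes, 3 of them DEG.
degs = [(1.0, 0.01), (2.0, 0.001), (-1.0, 0.01)]
print(round(impact(10, degs), 3), round(direction(10, degs), 3))
```

Computing both values for every pathway at every time point gives the kind of time series that the later DIA slides plot.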


6,579 DEG with FDR ≤ 0.001

Mammary transcriptomics during lactation

Illinois bovine oligoarray (>13,000 elements)


Functional analysis using ORA tools

plus Gene Ontology

[Figure: -log P-value by time point (-15, 1, 15, 30, 60, 120, 240, 300) for six functions: Cell Cycle; Cell Death; Cellular Assembly and Organization; Lipid Metabolism; Molecular Transport; Protein Synthesis.]

Outputs: functions, pathways, networks, transcription factors

Almost no functions/pathways significant with False Discovery Rate (FDR) correction!


Data mining bioinformatics tools used with DIA

Input data:
- Microarrays (Affy, Agilent, etc.)
- RNA-Seq data also could be used
- Statistical cut-offs (e.g. FDR < 0.05 and P value < 0.05)

Canonical Pathway Analysis:
- Kyoto Encyclopedia of Genes and Genomes (>200 manually-curated pathways)

Functional Annotation Analysis:
- Database for Annotation, Visualization, and Integrated Discovery
- Chromosomes (32); Biological Processes (2,300); Lipid-related (100); Gene expression-related (141)


How to interpret DIA output

Impact = Impact of the condition (diet, physiol. state)

on the biological term

Direction of the impact (or “flux”) = Biological effect of the

condition

[Figure: example DIA output; impact and direction of the impact (y-axis, -60 to 120) over time (-15, 1, 15, 30, 60).]


Bioinformatics analysis of functional adaptations

of the mammary gland using DIA

(Bionaz and Loor, 2012)


PATHWAYS

Galactose metabolism

Glycosylphosphatidylinositol(GPI)-anchor biosynthesis

PPAR signaling pathway

Ascorbate and aldarate metabolism

Proximal tubule bicarbonate reclamation

Biosynthesis of unsaturated fatty acids

Synthesis and degradation of ketone bodies

O-Mannosyl glycan biosynthesis

Citrate cycle (TCA cycle)

Antigen processing and presentation

Limonene and pinene degradation

ABC transporters

Hedgehog signaling pathway

Sulfur metabolism

Drug metabolism - other enzymes

Adipocytokine signaling pathway

Steroid biosynthesis

TGF-beta signaling pathway

Glutathione metabolism

ECM-receptor interaction

Phagosome

Peroxisome

Cell adhesion molecules (CAMs)

Hematopoietic cell lineage

Fc epsilon RI signaling pathway

Jak-STAT signaling pathway

Ether lipid metabolism

Arachidonic acid metabolism

Riboflavin metabolism

Valine, leucine and isoleucine degradation

[Heat map columns: time points -15, 1, 15, 30, 60, 120, 240, 300, and 240 vs 120]

30 most impacted KEGG pathways by DIA

Carbohydrate and energy metabolism

[Figure: impact (0 to 600) and direction of the impact (-300 to 500) by day relative to parturition (-15, 1, 15, 30, 60, 120, 240, 300) for six pathways: TCA cycle; pyruvate metabolism; galactose metabolism; oxidative phosphorylation; pentose & glucuronate interconversions; glycolysis/gluconeogenesis.]

Carbohydrate and energy metabolism

[Figure: the same six pathway panels (impact and direction of the impact by day relative to parturition), overlaid with the curve of lactation (kg/d, 0 to 50; x-axis 0 to 300).]


Application of DIA to liver transcriptome data



Dynamics of liver transcriptome in response to

plane of nutrition during the dry period

• Multiparous Holstein cows (Loor et al. 2005, 2006 Physiol. Genomics)

• Energy intake during late pregnancy:

- Ad libitum (Over, ca. 150% of NRC requirements)

- Control (Con, ca. 100% of NRC requirements)

- Restricted (Rest, ca. 80% of NRC requirements)

• Aims: study the liver transcriptome and physiological outcomes

[Timeline: -65, -30, -14, 1, 14, 28, 49 d relative to parturition; groups Over, Con, Rest]


Dietary energy prepartum affects the liver transcriptome

Overfed: >140% NRC prepartum

Control: ~100% NRC prepartum

Restricted: ~80% NRC prepartum

4,790 genes with diet × time effect

(Bionaz and Loor, 2012)

Page 64: Bioinformatics and data mining: application in dairy ... · Bioinformatics and data mining: application in dairy cattle nutrition and physiology ... Enrichment tools systematically

Most impacted biological pathways using DIA

[Figure: impact and direction of impact by day relative to parturition (-30, -14, 1, 14, 28, 49) for the Restrict, Control, and Ad libitum groups. Panels: Pentose phosphate pathway; Glycolysis / Gluconeogenesis; Citrate cycle (TCA cycle); Oxidative phosphorylation; Synthesis and degradation of ketone bodies; Fatty acid metabolism; Steroid biosynthesis; Glycerolipid metabolism; PPAR signaling pathway; Ribosome; Cell cycle; Antigen processing and presentation.]

22 most impacted:
• Ribosome
• Terpenoid backbone biosynthesis
• Sulfur metabolism
• Phenylalanine, tyrosine and tryptophan biosynthesis
• Complement and coagulation cascades
• Synthesis and degradation of ketone bodies
• Glycosphingolipid biosynthesis - globo series
• Pentose phosphate pathway
• PPAR signaling pathway
• Butanoate metabolism
• Fatty acid metabolism
• Folate biosynthesis
• N-Glycan biosynthesis
• Pyruvate metabolism
• Fructose and mannose metabolism
• O-Glycan biosynthesis
• ECM-receptor interaction
• Limonene and pinene degradation
• Glycolysis / Gluconeogenesis
• Steroid biosynthesis
• Ubiquinone and other terpenoid-quinone biosynthesis
• Vitamin B6 metabolism
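The figure reports two DIA quantities per pathway, an "impact" and a "direction of impact". The slides do not give the formulas, so the sketch below is only an illustration under stated assumptions: impact is taken as the proportion of differentially expressed genes (DEG) in the pathway times their mean absolute log2 fold change times their mean -log10 p-value, and direction as the impact of upregulated DEG minus the impact of downregulated DEG. The function names, the scaling, and the example values are hypothetical.

```python
import math


def impact(degs, n_pathway_genes):
    """Assumed impact: proportion of DEG x mean |log2 FC| x mean -log10(p).

    `degs` is a list of (log2_fold_change, p_value) tuples for the DEG
    in one pathway; `n_pathway_genes` is the number of measured genes
    annotated to that pathway.
    """
    if not degs or n_pathway_genes == 0:
        return 0.0
    proportion = len(degs) / n_pathway_genes
    mean_abs_fc = sum(abs(fc) for fc, _ in degs) / len(degs)
    mean_neg_log_p = sum(-math.log10(p) for _, p in degs) / len(degs)
    return proportion * mean_abs_fc * mean_neg_log_p


def dia_scores(degs, n_pathway_genes):
    """Return (total impact, direction of impact) for one pathway."""
    up = [d for d in degs if d[0] > 0]
    down = [d for d in degs if d[0] < 0]
    total = impact(degs, n_pathway_genes)
    # Positive direction suggests net induction, negative net inhibition.
    direction = impact(up, n_pathway_genes) - impact(down, n_pathway_genes)
    return total, direction


# Hypothetical pathway: 10 measured genes, 4 of them DEG (not real data).
example_degs = [(1.0, 0.01), (2.0, 0.01), (1.5, 0.001), (-0.5, 0.01)]
total_impact, direction = dia_scores(example_degs, 10)
print(round(total_impact, 3), round(direction, 3))
```

Under these assumptions, a pathway with many strongly changed genes gets a large impact, while the sign of the direction says whether induction or inhibition dominates at that time point.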

Cluster analysis plus ORA applied to bovine liver longitudinal transcriptomics

[Figure: four gene clusters, each plotted for the Overfed, Control, and Restricted groups. Y-axis: log2 fold change relative to -65 days in milk (dry-off), from -1 to 1.]

Enriched terms:

GOTERM_BP_FAT:
• activation of plasma proteins involved in acute inflammatory response
• complement activation, classical pathway
• humoral immune response
KEGG_PATHWAY:
• Complement and coagulation cascades
GOTERM_CC_FAT:
• extracellular region

GOTERM_BP_FAT:
• translation
KEGG_PATHWAY:
• Ribosome
GOTERM_CC_FAT:
• basement membrane
• proteinaceous extracellular matrix
• cytosolic ribosome

GOTERM_BP_FAT:
• ubiquitin-dependent protein catabolic process
• response to protein stimulus
GOTERM_CC_FAT:
• mitochondrion
• nuclear lumen
• organelle membrane
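Over-representation analysis (ORA) asks whether a term such as "Ribosome" is annotated to more genes in a cluster than chance would predict. The slides do not state which test statistic the enrichment tool applied, so the following is a minimal sketch assuming a plain one-sided hypergeometric test. The function name and the counts are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_term, n_list, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_universe: genes measured in total (the background)
    n_term:     background genes annotated to the term
    n_list:     genes in the cluster being tested
    n_overlap:  cluster genes annotated to the term
    """
    upper = min(n_term, n_list)
    total = comb(n_universe, n_list)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 annotated to the term,
# a cluster of 4 genes of which 3 carry the annotation (not real data).
p = ora_p_value(n_universe=20, n_term=5, n_list=4, n_overlap=3)
print(p)
```

Because many terms are tested per cluster, the resulting p-values would normally be corrected for multiple testing before any term is reported as enriched.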

What practical knowledge have we gained from the bioinformatics approach?

Overfeeding or restricting energy prepartum:
Coordinated inhibition of genes related to the immune system:
• Plasma inflammatory proteins
• Complement system activation
• Antigen processing and presentation

Restricting energy prepartum:
Coordinated upregulation of:
• Fatty acid oxidation and energy production: mitochondrial elements

Role for PPARα signalling pathway?
Pros: long-chain fatty acid supplementation?

The transition cow: a complex and dynamic system

[Figure: diagram of tissue functions in the transition cow. Function labels: lipolysis, adipokines, lipogenesis; immune response; milk synthesis, FA oxidation; FA oxidation, AA metabolism, immune response, gluconeogenesis, glucose oxidation. Other organs shown: rumen/intestine, muscle, pancreas, brain, bone, others.]

Bionaz and Loor (2012)
Loor et al. (2013)

Application of DIA for integrative systems physiology

[Figure: expression profiles relative to day -15, on a log scale from 0.1 to 100, for three tissues. Mammary: days -15, 1, 15, 30, 60; gene list FDR 0.05 (9,567 genes). Adipose (control only): days -15, 1, 15; gene list FDR 0.05 (3,355 genes). Liver: days -15, 1, 15, 30, 60; all genes (9,004). Sources: Bionaz et al. (2012); Loor et al. (2005); Janovick et al. (2009).]

[Figure: Number of Differentially Expressed Genes (DEG), 0 to 5,000, by day relative to parturition (-15, 1, 15, 30, 60) for mammary, adipose, and liver.]

[Figure: direction of the impact by day relative to parturition (-15, 1, 15, 30, 60) for mammary, adipose, and liver. Panels: Overall metabolism; Carbohydrate Metabolism; Lipid Metabolism; PPAR signaling; Glycolysis/Gluconeogenesis; Pyruvate metabolism; Fatty acid biosynthesis; Biosynthesis of Unsaturated FA.]

Summary and Perspectives

• Most value from nutrigenomics and metabolomics (i.e. expensive) comes from examining multiple tissues or biological fluids:
- Potential crosstalk, e.g. visceral fat to liver (dairy); myocyte and adipocyte (beef)
- Plasma, serum, milk, ruminal fluid, etc.
• If focused only on nutrition (beef or dairy) as a management tool, diets:
- Must be applicable in the field (of practical value)
- Supplemental nutrients: essential amino acids (rumen by-pass), organic trace minerals, long-chain fatty acids
• Build up knowledge within a “systems framework” through the use of bioinformatics. Link expression networks or gene/s to:
- Blood indicators: metabolism, immune response
- Health status: ketosis, fatty liver, mastitis, metritis, etc.

Summary and Perspectives (continued)

• Identify susceptibility/marker genes:
- Probably tissue-specific
• Field applications:
- Management strategies based on marker genes?
- “Personalized nutrition”? e.g. grouping cows/steers and feeding accordingly
• How to deliver outcomes?
- Training packages for students and industry professionals
- Marker-assisted selection for more disease-resistant or more efficient animals
- “Biologicals” or “metabolic modifiers” that can be used in the short or long term to modify metabolism and health: dietary fatty acids, amino acids, trace minerals, etc.