WOW: World of Walkover-weight
“My God, it’s full of cows!” (David Bowman, 2001)
Can walkover-weight suggest a cow needs attention?
Join with breeding information …
Position at the outset …
  Obstacle: No health information!!!
  Suggested: Milking order (i.e. where a cow is in the herd/line-up) is hierarchical and affected by health issues
  Proposed goal: to predict a drop in milking order using WOW and other facts
Assumptions … deck of cards
  Same cows come in for milking each time
  Cows are well-behaved (e.g. arrive in a nice queue)
  Data is in good shape (e.g. one reading per cow per milking)
Data problems
  Multiple entries for cows (e.g. four entries for 22719193 in QBH2005)
  Delete duplicate weights (SQL problem?)
  Cow skipped and recycled back into order
  Use average if more than one value
  About a quarter of the data are zeroes …
            instances   0 weights   λ weights
  BBYG2006    182,935      57,288         815
  BBYG2007    206,545      39,726       1,193
  JJVX2007      7,850           0          73
  QBH2005       7,850           0          73
  QBH2006     324,365      80,362          72
  QBH2007     222,300      67,109       2,118
  QBH2008      48,534      10,535         224
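
The two clean-up steps named above (averaging repeated readings and measuring the share of zero weights) might look roughly like this. It is a minimal sketch: the record layout `(cow, milking, weight)` and the sample values are hypothetical, not taken from the project's database.

```python
from collections import defaultdict


def average_duplicates(readings):
    """Collapse repeated (cow, milking) readings into their mean weight."""
    grouped = defaultdict(list)
    for cow, milking, weight in readings:
        grouped[(cow, milking)].append(weight)
    return {key: sum(ws) / len(ws) for key, ws in grouped.items()}


def zero_share(weights):
    """Fraction of readings whose recorded weight is exactly zero."""
    weights = list(weights)
    if not weights:
        return 0.0
    return sum(1 for w in weights if w == 0) / len(weights)


# Hypothetical readings: (cow tag, milking number, walkover weight)
readings = [
    ("22719193", 1, 510.0),
    ("22719193", 1, 514.0),  # second entry for the same cow and milking
    ("17102150", 1, 0.0),    # a "zero" weight
    ("17102150", 2, 498.0),
]
cleaned = average_duplicates(readings)
print(cleaned[("22719193", 1)])
print(zero_share(cleaned.values()))
```

Averaging is only one of the options listed on the slide; deleting the extra rows would need a rule for which reading to keep.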
“zero” problems
  Differentiate between a missing cow, a missing weight and a “zero” weight
  Ignore missing cows
  Cow skipped and recycled back into order
  Time-based interpolation
    Can be problematic if cow has been missing for a while
  Add flag to indicate weight was “guessed”
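
Time-based interpolation with a “guessed” flag could be sketched as below. This assumes linear interpolation between the nearest known readings and treats both missing (`None`) and zero weights as unusable; the function name and the `max_gap` cut-off are illustrative choices, not something the slides specify.

```python
def fill_missing_weights(series, max_gap=None):
    """Fill unusable weights by linear interpolation over time.

    series: list of (time, weight) tuples sorted by time; a weight of None
    or 0 is treated as unusable (an assumption of this sketch).
    max_gap: optional limit on the time between the two known neighbours,
    so that long absences are left unfilled.
    Returns a list of (time, weight, guessed) tuples.
    """
    known = [(t, w) for t, w in series if w]
    filled = []
    for t, w in series:
        if w:
            filled.append((t, w, False))
            continue
        before = [p for p in known if p[0] < t]
        after = [p for p in known if p[0] > t]
        estimate = None
        if before and after:
            (t0, w0), (t1, w1) = before[-1], after[0]
            if max_gap is None or t1 - t0 <= max_gap:
                estimate = w0 + (w1 - w0) * (t - t0) / (t1 - t0)
        # The flag is True only when a value was actually interpolated.
        filled.append((t, estimate, estimate is not None))
    return filled


series = [(0, 500.0), (1, 0), (2, None), (3, 506.0)]
for row in fill_missing_weights(series):
    print(row)
```

Passing a small `max_gap` leaves the weight empty when the cow has been away too long, which is the case the slide warns about.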
Other issues in data preparation
  Change milking date to milk index
  Change birthdate to age in months
  Change parturition date to days since last calved
  Additional derivatives
    milking index - cow’s position in milk order
    ∆-index - change in index for a cow over various time periods (1, 3 and 7 days)
    mu-weight - average weight over varying-length periods (3, 7, 14, 21 and 28 milkings)
    ∆-mu-weight - change in mu-weight for a cow (1, 3 and 7 days)
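
The derived attributes above can be illustrated with a short sketch. It assumes one value per cow per step (milking or day) after clean-up, and it uses a trailing window for mu-weight; the slides do not say whether the window was trailing or centred, so that is an assumption, as are the function names.

```python
def milking_index(arrivals):
    """Map each cow to its 1-based position in the milking order.

    arrivals: dict of cow tag -> arrival time within one milking.
    """
    ordered = sorted(arrivals, key=arrivals.get)
    return {cow: position for position, cow in enumerate(ordered, start=1)}


def mu_weight(weights, window):
    """Trailing mean over the last `window` values (shorter at the start)."""
    means = []
    for i in range(len(weights)):
        recent = weights[max(0, i - window + 1): i + 1]
        means.append(sum(recent) / len(recent))
    return means


def delta(values, lag):
    """Change versus `lag` steps earlier; None where no earlier value exists."""
    return [None if i < lag else values[i] - values[i - lag]
            for i in range(len(values))]


# Hypothetical data for illustration only
print(milking_index({"A": 7.5, "B": 3.0, "C": 5.25}))
weights = [500.0, 503.0, 506.0, 509.0]
print(mu_weight(weights, 3))
print(delta(mu_weight(weights, 3), 1))
```

The same `delta` helper serves for both ∆-index and ∆-mu-weight, since each is a difference against an earlier value of the same series.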
Does [change in] milk order correlate to WOW?
Correlation coefficients QBH2006 (dense)
  WOW to index == 0.12
  WOW to 14-day mu-weight == 0.93
  Index to 10-day mu-weight == 0.14
  3-day ∆-order to ∆-weight == 0.045
3-day ∆-order and 3-day ∆-weight
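
For reference, a correlation coefficient of this kind can be computed as follows. This sketch assumes the figures are Pearson's r, which the slides do not state; the sample values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical weights and milking-order positions for one cow
weights = [500.0, 503.0, 499.0, 507.0, 510.0]
positions = [40, 38, 45, 30, 28]
print(round(pearson(weights, positions), 3))
```

A value near 0.93 between a weight and its own 14-day average is expected, because the average is built from the weight itself; the low values against milking order are the informative ones.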
Predict change in milking order
  Use M5P to predict how the milking order will change for a cow at the next milking
  Approx. 205,000 QBH2006 samples (with fewer than 5 of 25 attributes missing)
    2/3 training
    1/3 testing
  Re-running took too long … but … you’ve all seen it before, where accuracy was 51.89% (discrimination 0.527) and the model tree was hugely ugly (65 nodes, 33 leaves).
  Also tried predicting cow’s index as decile and as ratio to herd size.
  <missing results go here when available>
[Chart: Cow’s position (index) as ratio to herd size; y-axis 0 to 1.2, x-axis 1 to 85]
QBH2008 Tag:17102150
[Chart: Cow index vs. herd size; y-axis 0 to 1200, x-axis 1 to 85]
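
The two alternative encodings of a cow's index mentioned earlier, as a ratio to herd size and as a decile, could be computed along these lines. The sketch assumes a 1-based index and uses the herd size at that milking as the denominator; both are assumptions, since the slides do not define either.

```python
def index_ratio(index, herd_size):
    """Cow's 1-based position divided by the number of cows at that milking."""
    return index / herd_size


def index_decile(index, herd_size):
    """Decile (1 to 10) of a 1-based position, using integer ceiling division."""
    return (10 * index + herd_size - 1) // herd_size


# Hypothetical herd of 200 cows
for position in (1, 20, 21, 200):
    print(position, index_ratio(position, 200), index_decile(position, 200))
```

Expressing position relative to herd size keeps values comparable when the number of cows milked changes from one milking to the next.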
Where to? …
  Data must still be scrubbed so that milking order makes sense (if milking order is going to be relevant)
  Perhaps cow order needs to be described in completely different terms (e.g. cow buddies)
  Easy visualization of herds/cows/breeds/dates/trends is needed
    this segued into another area of the project …
    Visualization tools (alpha and beta)
In the meantime … health data is obtained …
Can WOW predict onset of illness?
  Combine original attributes and derivatives with health judgments
  Cows with unknown health are considered healthy
  Need equal number of positive and negative instances
Health data becomes available
  farm   year   Qty > 50
  BBYG   2006         73
  BBYG   2007         95
  BBYG   2008        220
  QBH    2005        113
  QBH    2006        282
  QBH    2007        481
  QBH    2008        253
Not so much health data
  1613 recorded instances of health
  913 different cows with health info
  2540 cows with milking info
  788 milked cows with health data
  7 broad categories of illness:
    Calving disorder
    Metabolic disorder
    Udder disorder (only one with >50 in herd)
    Reproductive disorder
    Lameness
    Infectious diseases
    Other ailments
Data sparseness
  QBH2006: 75 instances out of 324,291 have a health label
    63 udder disorder
    10 metabolic disorder
    2 lameness
  Only about 0.02% positives → will never be isolated → must subsample negatives
  Random selection of 75 negatives → data sparseness → over-fitting likely
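
The proportion of positives follows directly from the two counts on this slide. The short calculation below also shows why a learner cannot isolate them: a model that always answers "healthy" is already almost perfectly accurate.

```python
positives = 75
instances = 324_291

rate = positives / instances
print(f"positive rate: {rate:.6f} ({rate:.3%})")

# Accuracy of a trivial classifier that labels every instance as healthy
baseline_accuracy = 1 - rate
print(f"always-negative accuracy: {baseline_accuracy:.3%}")
```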
Data sparseness
  QBH2006: 36 cows have illness at some time, so just learn those?
  11,966 records for those cows, 76 of which have illness (still <1% positive)
  Random selection of 1% as negatives (about 120)
Refinements to approach
  QBH2006
  Restrict target objective to UDDER DISORDER
  Randomly select equal number of negatives from cows who have health problem at some point
  Goal: differentiate between healthy and unhealthy state
Detecting mastitis amidst random normal cows
  QBH2006
  Restrict learning objective to UDDER DISORDER
  Randomly select equal number of negatives from all cows that have been milked (63+, 63-)
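
Drawing an equal number of negatives at random, as in both sampling schemes above, might be sketched like this. The record layout and the `is_positive` predicate are hypothetical; restricting the pool to cows with a health problem at some point would simply mean filtering `records` before the call.

```python
import random


def balanced_sample(records, is_positive, seed=0):
    """Keep every positive record and draw as many negatives at random.

    If there are fewer negatives than positives, all negatives are kept.
    A fixed seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    positives = [r for r in records if is_positive(r)]
    negatives = [r for r in records if not is_positive(r)]
    k = min(len(positives), len(negatives))
    return positives + rng.sample(negatives, k)


# Hypothetical records: (cow tag, has udder disorder at this milking)
records = [(f"cow{i}", i < 5) for i in range(100)]
sample = balanced_sample(records, lambda r: r[1])
print(len(sample), sum(1 for r in sample if r[1]))
```

With only a few dozen positives, different seeds give noticeably different negative sets, so repeating the experiment over several draws would give a fairer picture than a single sample.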
When is a cow sick?
  So far, attempted to predict health label at point of milking, but …
    … when was the health label attached?
    before, during or after the current milking?
  Goal: predict whether cow needs attention at the next milking (i.e. time series)
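
Recasting the task as "needs attention at the next milking" amounts to pairing each milking's attributes with the label of the following milking. A minimal sketch, assuming one cow's rows are already in milking order and each row is a hypothetical `(features, label)` pair:

```python
def shift_to_next_milking(rows):
    """Pair the features of milking t with the health label of milking t+1.

    rows: list of (features, label) tuples for ONE cow, in milking order.
    The last milking is dropped because its successor is unknown.
    """
    return [(rows[i][0], rows[i + 1][1]) for i in range(len(rows) - 1)]


# Hypothetical rows: features are (weight, milking index)
rows = [
    ((500.0, 40), "NONE"),
    ((497.0, 55), "NONE"),
    ((489.0, 80), "UDDER DISORDER"),
]
for features, next_label in shift_to_next_milking(rows):
    print(features, "->", next_label)
```

This does not settle when the label was actually attached; it only ensures the model never sees attributes recorded at the same milking as its target.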
=== Summary ===
Correctly Classified Instances          90               70.3125 %
Incorrectly Classified Instances        38               29.6875 %
Kappa statistic                          0.4026
Mean absolute error                      0.3446
Root mean squared error                  0.4532
Relative absolute error                 68.8933 %
Root relative squared error             90.5974 %
Total Number of Instances              128

=== Detailed Accuracy By Class ===
TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
0.508     0.108     0.821       0.508    0.627       0.707      UDDER DISORDER
0.892     0.492     0.652       0.892    0.753       0.707      NONE

=== Confusion Matrix ===
  a  b   <-- classified as
 32 31 |  a = UDDER DISORDER
  7 58 |  b = NONE
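
The headline figures can be re-derived from the confusion matrix alone, which is a useful sanity check on the output. The sketch below uses the standard definitions of accuracy, precision, recall and Cohen's kappa, treating UDDER DISORDER as the positive class.

```python
def summarize(tp, fn, fp, tn):
    """Accuracy, precision, recall and Cohen's kappa for a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Agreement expected by chance, from the row and column totals
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "kappa": kappa}


# Rows of the matrix: actual UDDER DISORDER (32, 31), actual NONE (7, 58)
metrics = summarize(tp=32, fn=31, fp=7, tn=58)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The recall of about 0.51 is the number to watch: roughly half of the udder-disorder instances are still classified as healthy.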
Agenda
  Replace quantified attributes with simpler (e.g. boolean, nominal) ones
  Characterise exceptions
    Below average weight for cow/herd/breed/age
    Dropped decile/>50 in order
  Broad statistical measures
    How many std. devs. from mean
    z-score (probability of variation)
  Choose negative instances more carefully (select fewer interpolates)
  Spend more time with people who know cows
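
One way the "how many std. devs. from mean" idea could turn a quantified weight into a boolean attribute is sketched below. The comparison group here is the cow's own recent history, and the threshold of -2 is an arbitrary illustrative value; herd, breed or age group would work the same way with a different `history`.

```python
import statistics


def z_score(value, history):
    """Number of sample standard deviations `value` lies from the mean of `history`."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    if spread == 0:
        return 0.0
    return (value - mean) / spread


def unusually_light(value, history, threshold=-2.0):
    """Boolean attribute: is this weight far below the reference group's mean?"""
    return z_score(value, history) <= threshold


# Hypothetical recent weights for one cow
history = [500.0, 502.0, 498.0, 501.0, 499.0]
print(round(z_score(490.0, history), 2))
print(unusually_light(490.0, history))
```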