New Ways of Looking at Binary Data Fitting in R. Yoon G Kim, ygk1@humboldt.edu. Colloquium Talk
Appetizer
Can we “stabilize” this?
[Figure: line plot against x; x-axis from 5 to 20, y-axis from 0 to 250.]
After taking LOG …
> y1 <- rep(c(100,200),times=10)
> y2 <- rep(c(10,20),times=10)
> x <- c(1:20)
> data <- cbind(x,y1,y2)
> data[1:3,]
     x  y1 y2
[1,] 1 100 10
[2,] 2 200 20
[3,] 3 100 10
> par(mfrow=c(1,2))
> plot(y1~x,type="l",ylim=c(0,250),col="blue",ylab="")
> lines(y2~x,type="l",col="red")
> plot(log(y1)~x,type="l",ylim=c(0,6),col="blue",ylab="")
> lines(log(y2)~x,type="l",col="red")
Log transformed
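A small check of why the log helps here, offered as a sketch rather than as part of the talk: y1 is exactly ten times y2, so on the raw scale its swing is ten times larger, while on the log scale both series swing by the same amount, log(2). The names `swing_raw` and `swing_log` are introduced only for this illustration.

```r
# Same toy series as on the slide
y1 <- rep(c(100, 200), times = 10)
y2 <- rep(c(10, 20), times = 10)

# Raw scale: the swing of y1 is ten times the swing of y2
swing_raw <- c(y1 = diff(range(y1)), y2 = diff(range(y2)))

# Log scale: both series swing by the same amount, log(2)
swing_log <- c(y1 = diff(range(log(y1))), y2 = diff(range(log(y2))))

print(swing_raw)
print(swing_log)
```

A constant ratio becomes a constant difference after taking logs, which is the sense in which the spread is "stabilized".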
[Figure: two panels. Left: y1 (blue) and y2 (red) against x, y-axis from 0 to 250. Right: log(y1) (blue) and log(y2) (red) against x, y-axis from 0 to 6.]
Outline
Exploring options available when assumptions of classical linear models are untenable.
In this talk: What can we do when observations are not continuous and the residuals are neither normally nor identically distributed?
Classical Linear Models
Defined by three assumptions:
(1) the response variable is continuous.
(2) the residuals (ε) are normally distributed and ...
(3) ... independently (3a) and identically distributed (3b).
Today, we will consider a range of options available when assumptions (1), (2) and/or (3b) do not hold.
Non-continuous response variable
Many situations exist. The response variable could be
(1) a count (number of individuals in a population; number of species in a community)
(2) a proportion (proportion "cured" after treatment; proportion of threatened species)
(3) a categorical variable (breeding/non-breeding; different phenotypes)
(4) a strictly positive value (especially time to success, or time to failure)
( ... ) and so forth
Added difficulties
These types of non-continuous variables also tend to deviate from the assumptions of Normality (assumption #2) and Homoscedasticity (assumption #3b).
(1) A count variable often follows a Poisson distribution (where the variance increases linearly with the mean).
(2) A proportion often follows a Binomial distribution (where the variance reaches a maximum for intermediate values and a minimum at either end: 0% or 100%).
(3) A categorical variable tends to follow a Binomial distribution (when the variable has only two levels) or a Multinomial distribution (when the variable has more than two levels).
(4) Time to success/failure can follow an exponential distribution or an inverse Gaussian distribution (the latter having a variance increasing much more quickly than the mean).
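The mean-variance relationships named for counts and proportions can be checked numerically. The sketch below is illustrative only: the Poisson mean of 4 and the n = 20 trials are arbitrary choices, not values from the talk.

```r
# Poisson with mean 4: compute mean and variance from the mass function
k <- 0:200
w <- dpois(k, lambda = 4)
pois_mean <- sum(k * w)
pois_var  <- sum((k - pois_mean)^2 * w)

# Proportion out of n = 20 trials: variance p(1 - p)/n over a grid of p
p <- seq(0, 1, by = 0.1)
prop_var <- p * (1 - p) / 20

print(c(mean = pois_mean, variance = pois_var))
print(round(prop_var, 4))
```

The Poisson variance equals its mean, and the variance of a proportion peaks at p = 0.5 and vanishes at 0 and 1, so neither can satisfy homoscedasticity.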
Fortunately
Many of these situations can be unified under a central framework, since all these distributions (and a few more) belong to the exponential family of distributions.
Canonical form:
$$f(y;\theta,\phi)=\exp\left\{\frac{y\theta-b(\theta)}{a(\phi)}+c(y,\phi)\right\}$$
$f$: probability density function (if y is continuous) or probability mass function (if y is discrete)
$\theta$: canonical (location) parameter
$\phi$: dispersion parameter
Mean: $E[Y]=b'(\theta)$
Variance: $\operatorname{var}(Y)=a(\phi)\,b''(\theta)$
The Normal distribution
Probability density function:
$$f(y;\mu,\sigma^2)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y-\mu)^2}{2\sigma^2}\right\}$$
Canonical form:
$$f(y;\mu,\sigma^2)=\exp\left\{\frac{y\mu-\mu^2/2}{\sigma^2}-\frac{1}{2}\left[\frac{y^2}{\sigma^2}+\log(2\pi\sigma^2)\right]\right\}$$
Canonical (location) parameter: $\theta=\mu$
Dispersion parameter: $\phi=\sigma^2$
Mean: $E[Y]=b'(\theta)=\mu$
Variance: $\operatorname{var}(Y)=a(\phi)\,b''(\theta)=\sigma^2$
The Poisson distribution
Probability mass function:
$$f(y;\lambda)=\frac{\lambda^{y}e^{-\lambda}}{y!}$$
Canonical form:
$$f(y;\lambda)=\exp\left\{y\ln\lambda-\lambda-\ln y!\right\}$$
Canonical (location) parameter: $\theta=\ln\lambda$
Dispersion parameter: $\phi=1$
$b(\theta)=\exp(\theta)$
Mean: $E[Y]=b'(\theta)=\lambda$
Variance: $\operatorname{var}(Y)=a(\phi)\,b''(\theta)=\lambda$
The Binomial distribution
Probability mass function:
$$f(y;n,p)=\binom{n}{y}p^{y}(1-p)^{n-y}$$
Canonical form:
$$f(y;n,p)=\exp\left\{y\ln p+(n-y)\ln(1-p)+\ln\binom{n}{y}\right\}=\exp\left\{y\ln\frac{p}{1-p}+n\ln(1-p)+\ln\binom{n}{y}\right\}$$
Canonical (location) parameter: $\theta=\ln\frac{p}{1-p}$
Dispersion parameter: $\phi=1$
$b(\theta)=-n\ln(1-p)=n\log\left(1+\exp\theta\right)$
Mean: $E[Y]=b'(\theta)=np$
Variance: $\operatorname{var}(Y)=a(\phi)\,b''(\theta)=np(1-p)$
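As a check on the binomial mean and variance (a worked derivation added for illustration, using only the symbols already defined on these slides), differentiating $b(\theta)$ once and twice gives:

```latex
\begin{aligned}
b(\theta) &= n\log\left(1+e^{\theta}\right), \qquad
\theta=\ln\frac{p}{1-p} \;\Longleftrightarrow\; p=\frac{e^{\theta}}{1+e^{\theta}},\\
b'(\theta) &= \frac{n\,e^{\theta}}{1+e^{\theta}} = np = E[Y],\\
b''(\theta) &= \frac{n\,e^{\theta}}{\left(1+e^{\theta}\right)^{2}} = np(1-p),
\qquad \operatorname{var}(Y)=a(\phi)\,b''(\theta)=np(1-p)\ \text{with } a(\phi)=1 .
\end{aligned}
```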
Why is that remotely useful?
1) A single algorithm (maximum likelihood) will cope with all these situations.
2) Different types of variance can be accommodated:
When Var is constant -> Normal (Gaussian)
When Var increases linearly with the mean -> Poisson
When Var has a humped shape -> Binomial
When Var increases as the square of the mean -> Gamma (meaning the coefficient of variation remains constant)
When Var increases as the cube of the mean -> inverse Gaussian
3) Most types of data are thus effectively covered.
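These variance patterns can be read directly from R's family objects, each of which carries a `variance` function of the mean. The sketch below is illustrative; the three values of `mu` are arbitrary and the matrix name `v` is introduced only here.

```r
mu <- c(0.2, 0.5, 0.8)

# Variance functions V(mu) stored in base R family objects
v <- rbind(
  gaussian         = gaussian()$variance(mu),         # constant
  poisson          = poisson()$variance(mu),          # mu
  binomial         = binomial()$variance(mu),         # mu * (1 - mu)
  Gamma            = Gamma()$variance(mu),            # mu^2
  inverse.gaussian = inverse.gaussian()$variance(mu)  # mu^3
)
colnames(v) <- paste0("mu=", mu)
print(v)
```

Choosing a family in `glm()` therefore amounts to choosing one of these mean-variance relationships.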
Non-independent Observations
Two ways to cope with non-independent observations:
When the design is balanced ("equal sample size"): we can use factors to partition our observations into different "groups" and analyze them as an ANOVA or ANCOVA (when factors are "crossed" or when they are "nested").
When the design is unbalanced ("uneven sample size"): mixed effect models are then called for.
How does it work?
1) You need to specify the family of distribution to use
2) You need to specify the link function
$$g(y_i)=\beta_0+\beta_1x_1+\beta_2x_2+\cdots+\beta_px_p$$
Here $g$ is the link function, applied to the expected value of $y_i$, and the right-hand side is the linear predictor.
For each type of variable the "natural" link function to use is indicated by the canonical parameter.

| Distribution | Link |
|---|---|
| Normal | Identity |
| Poisson | Log |
| Binomial | Logit, $\ln\frac{p}{1-p}$ |
| Gamma | Inverse |
| Inv.Gaussian | Inverse square |
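The family-to-link pairing in the table can be confirmed from base R, where each family object reports its default link. This is a minimal sketch; the list name `fams` and its labels are chosen here to mirror the table.

```r
# Default link of each family, as stored in the family object
fams <- list(
  Normal           = gaussian(),
  Poisson          = poisson(),
  Binomial         = binomial(),
  Gamma            = Gamma(),
  Inverse.Gaussian = inverse.gaussian()
)
default_links <- sapply(fams, function(f) f$link)
print(default_links)
```

R labels the inverse-square link of the inverse Gaussian family as `1/mu^2`.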
Binary variable
The response variable contains only 0’s and 1’s. The probability that a place is “occupied” is p, and we write
$$P(y)=p^{y}(1-p)^{1-y}$$
The objective is to determine how the explanatory variables influence p.
The family to use is Binomial and the canonical link is logit.
Example: The response is occupation of territories and the explanatory variable is the resource availability in each territory.
> occupy <- read.table("D:\\STAT999\\RBook\\occupation.txt",header=T)
> dim(occupy)
[1] 150   2
> occupy[1:3,]
  resources occupied
1  14.18154        0
2  18.68306        0
3  20.22156        0
> attach(occupy)
Crawley, M.J. (2007) The R Book: 597-598
Binary variable
> table(occupied)
occupied
 0  1 
58 92 
> modell <- glm(occupied~resources, family=binomial)
> plot(resources, occupied, type="n")
> rug(jitter(resources[occupied==0]))
> rug(jitter(resources[occupied==1]),side=3)
> xv <- 0:1000
> yv <- predict(modell, list(resources=xv),type="response")
By default, the link for a Binomial is the logit.
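Since `occupation.txt` is not reproduced here, the following sketch uses simulated data as a stand-in to show the same workflow. The data frame `sim`, the coefficients -4 and 0.01, and the seed are all assumptions made for illustration, not values from the talk.

```r
set.seed(1)  # simulated stand-in; NOT the occupation.txt data

# Hypothetical territories: occupancy probability rises with resources
sim <- data.frame(resources = runif(150, 0, 1000))
sim$occupied <- rbinom(150, size = 1, prob = plogis(-4 + 0.01 * sim$resources))

# family = binomial uses the logit link unless another link is requested
fit_default <- glm(occupied ~ resources, family = binomial, data = sim)
fit_logit   <- glm(occupied ~ resources, family = binomial(link = "logit"), data = sim)

# Fitted probabilities on a grid, back-transformed to the response scale
grid <- data.frame(resources = seq(0, 1000, by = 100))
grid$p_hat <- predict(fit_default, newdata = grid, type = "response")
print(coef(fit_default))
print(grid)
```

Passing `data = sim` instead of attaching the data frame keeps the variables tied to one object, which avoids name clashes later on.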
[Figure: occupied (0.0 to 1.0) against resources (0 to 1000).]
> cutr <- cut(resources,5)
> tapply(occupied,cutr,sum)
(13.2,209]  (209,405]  (405,600]  (600,796]  (796,992] 
         0         10         25         26         31 
> table(cutr)
cutr
(13.2,209]  (209,405]  (405,600]  (600,796]  (796,992] 
        31         29         30         29         31 
> probs <- tapply(occupied,cutr,sum)/table(cutr)
> probs
(13.2,209]  (209,405]  (405,600]  (600,796]  (796,992] 
 0.0000000  0.3448276  0.8333333  0.8965517  1.0000000 
attr(,"class")
[1] "table"
> probs <- as.vector(probs)
> resmeans <- tapply(resources,cutr,mean)
> resmeans <- as.vector(resmeans)
> points(resmeans,probs,pch=16,cex=2)
> se <- sqrt(probs*(1-probs)/table(cutr))
> up <- probs + as.vector(se)
> down <- probs - as.vector(se)
> for(i in 1:5) {
+   lines(c(resmeans[i],resmeans[i]),c(up[i],down[i]))
+ }
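The error bars above use the binomial standard error sqrt(p(1-p)/n) within each resource class. The sketch below recomputes them from the counts printed on the slide; the vector names `occupied_n` and `total_n` are introduced only for this illustration.

```r
# Counts reported on the slide for the five resource classes
occupied_n <- c(0, 10, 25, 26, 31)   # tapply(occupied, cutr, sum)
total_n    <- c(31, 29, 30, 29, 31)  # table(cutr)

probs <- occupied_n / total_n
se    <- sqrt(probs * (1 - probs) / total_n)  # binomial standard error

print(round(cbind(probs, se, down = probs - se, up = probs + se), 4))
```

In the first and last classes the observed proportion is exactly 0 or 1, so this standard error is zero and the bars collapse to a point.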
[Figure: occupied (0.0 to 1.0) against resources (0 to 1000), with the binned proportions and their error bars added.]
Various Link Functions
> grid_x <- seq(10,990,by=0.5)
> modell_p <- predict(modell,newdata=data.frame(resources=grid_x),type="response")
> modelp <- glm(occupied~resources, family=binomial(link=probit))
> modelp_p <- predict(modelp,newdata=data.frame(resources=grid_x),type="response")
> modelcl <- glm(occupied~resources, family=binomial(link=cloglog))
> modelcl_p <- predict(modelcl,newdata=data.frame(resources=grid_x),type="response")
> modelca <- glm(occupied~resources, family=binomial(link=cauchit))
> modelca_p <- predict(modelca,newdata=data.frame(resources=grid_x),type="response")
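To see how the four links differ, it helps to evaluate their inverse functions at a few values of the linear predictor. This is a small sketch using `make.link()` from base R; the values in `eta` are arbitrary.

```r
links <- c("logit", "probit", "cloglog", "cauchit")
eta <- c(-2, 0, 2)  # values of the linear predictor

# make.link() returns the link, its inverse (linkinv) and derivative
p <- sapply(links, function(l) make.link(l)$linkinv(eta))
rownames(p) <- paste0("eta=", eta)
print(round(p, 4))
```

Logit, probit and cauchit are symmetric around p = 0.5 at a linear predictor of zero and differ mainly in how heavy their tails are; the complementary log-log link is asymmetric and gives 1 - exp(-1) there.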
To draw …
> newdata <- data.frame(grid_x,modell_p,modelp_p,modelcl_p,modelca_p)
> library(lattice)
> print(xyplot(modell_p+modelp_p+modelcl_p+ modelca_p ~ grid_x,
+   data=newdata, type ="l", xlab="resources",
+   ylab="p",lwd=1.5, lty=c(1,2,3,4), col=c(1:4),
+   panel = function(x, y, ...) {
+     panel.xyplot(x, y, ...)
+     panel.text(occupy$resources,occupy$probs,"x", cex=1.5, type="p", ...)
+   }))
> legend("topleft", legend=c("logit","probit","cloglog","cauchit"),lty=c(1:4), col=c(1:4), lwd=1.5)
> par(new=F)
> points(resmeans,probs,pch=16,cex=2)
> for (i in 1:5){
+   lines(c(resmeans[i],resmeans[i]),c(up[i],down[i]))}
Binary variable
> summary(modell)

Call:
glm(formula = occupied ~ resources, family = binomial)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -3.744592   0.669923  -5.590 2.28e-08 ***
resources    0.009762   0.001568   6.227 4.77e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 200.170  on 149  degrees of freedom
Residual deviance:  97.152  on 148  degrees of freedom
AIC: 101.15

Number of Fisher Scoring iterations: 6

Only valid if the response variable is indeed a binomial.
$$D=2\sum_{i=1}^{n}\left[y_i\ln\left(\frac{y_i}{\hat{\mu}_i}\right)-(y_i-\hat{\mu}_i)\right]$$
also called G-statistic
Binary variable
> (dp <- sum(residuals(modell, type="pearson")^2)/modell$df.res)
[1] 0.8472199
Pearson's residuals
This dispersion parameter ($\phi$) must be calculated.
$$\hat{\phi}=\frac{\chi^2}{n-p}=\frac{1}{n-p}\sum_{i}\frac{(y_i-\hat{\mu}_i)^2}{\hat{\mu}_i}$$
Residual degrees of freedom: $n-p$
(For a binomial model the denominator inside the sum is the binomial variance $\hat{\mu}_i(1-\hat{\mu}_i)$, which is what type="pearson" uses.)
Suggests that the variance is about 0.85 times the value implied by the model.
In statistical terms there is no overdispersion.
In biological terms, it suggests that the observations are independent of each other and are not aggregated (i.e. clumped).
Typically, overdispersed count data follow a Negative Binomial distribution, which is not part of the exponential family of distributions.
It won't be covered here, but overdispersion can be approximated with a quasi-likelihood family (family="quasibinomial").
If you need it in your future work with count data, you can also try glm.nb (in the MASS package).
Binary variable
> summary(modell, dispersion=dp)

Call:
glm(formula = occupied ~ resources, family = binomial)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -3.744592   0.616628  -6.073 1.26e-09 ***
resources    0.009762   0.001443   6.765 1.33e-11 ***
---
(Dispersion parameter for binomial family taken to be 0.8472199)

    Null deviance: 200.170  on 149  degrees of freedom
Residual deviance:  97.152  on 148  degrees of freedom
AIC: 101.15

Number of Fisher Scoring iterations: 6

The summary table can be adjusted with the dispersion parameter.
These values can now be taken at face value.
How good is the model? 1 - (Res. Dev. / Null Dev.) = 51.47 %
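The two adjustments on this slide are simple arithmetic and can be reproduced from the printed values. In the sketch below the numbers are copied from the summary output; the variable names are introduced only for illustration.

```r
# Values copied from the summary output on the slides
null_dev  <- 200.170
resid_dev <- 97.152
dp        <- 0.8472199
se_unit   <- c(intercept = 0.669923, resources = 0.001568)  # dispersion = 1

explained <- 1 - resid_dev / null_dev   # proportion of deviance explained
se_scaled <- se_unit * sqrt(dp)         # standard errors after rescaling

print(round(100 * explained, 2))
print(signif(se_scaled, 4))
```

Supplying `dispersion=dp` multiplies each standard error by the square root of the dispersion estimate, while the coefficient estimates and the deviances are unchanged.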
> summary(modell)
Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -3.744592   0.669923  -5.590 2.28e-08 ***
resources    0.009762   0.001568   6.227 4.77e-10 ***

(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 200.170  on 149  degrees of freedom
Residual deviance:  97.152  on 148  degrees of freedom
AIC: 101.15

> summary(modelp)
Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.1437759  0.3448511  -6.217 5.08e-10 ***
resources    0.0055046  0.0007811   7.047 1.82e-12 ***

(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 200.170  on 149  degrees of freedom
Residual deviance:  97.024  on 148  degrees of freedom
AIC: 101.02
> summary(modelcl)
Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.5902574  0.4293153  -6.033 1.60e-09 ***
resources    0.0053519  0.0008337   6.419 1.37e-10 ***

(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 200.17  on 149  degrees of freedom
Residual deviance: 102.30  on 148  degrees of freedom
AIC: 106.30

> summary(modelca)
Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -5.540198   1.644250  -3.369 0.000753 ***
resources    0.014612   0.004205   3.475 0.000510 ***

(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 200.17  on 149  degrees of freedom
Residual deviance:  99.69  on 148  degrees of freedom
AIC: 103.69
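The four fits can be lined up side by side using only the numbers printed in the summaries. This is a sketch for comparison purposes; the data frame `fits` is built by hand from the slide output.

```r
# AIC and residual deviance as reported in the four summaries
fits <- data.frame(
  link      = c("logit", "probit", "cloglog", "cauchit"),
  resid_dev = c(97.152, 97.024, 102.30, 99.69),
  AIC       = c(101.15, 101.02, 106.30, 103.69)
)

# Each model has 2 coefficients, so AIC should equal deviance + 2 * 2
fits$check <- fits$resid_dev + 2 * 2
print(fits[order(fits$AIC), ])
```

For a 0/1 response the residual deviance equals minus twice the log-likelihood, which is why the AIC is just the deviance plus twice the number of coefficients. On these numbers the probit and logit fits are nearly indistinguishable, with cauchit and cloglog somewhat behind.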
Bootstrapping
> modell <- glm(occupied~resources,family=binomial)
> bcoef <- matrix(0,1000,2)
> for (i in 1:1000){
+   indices <- sample(1:150,replace=T)
+   x <- resources[indices]
+   y <- occupied[indices]
+   modelb <- glm(y~x, family=binomial)   # separate name, so modell is not overwritten
+   bcoef[i,] <- modelb$coef }
> par(mfrow=c(1,2))
> plot(density(bcoef[,2]),xlab="Coefficient of x",main="")
> abline(v=quantile(bcoef[,2],c(0.025,0.975)),lty=2, col=4)
> plot(density(bcoef[,1]),xlab="Intercept",main="")
> abline(v=quantile(bcoef[,1],c(0.025,0.975)),lty=2, col=4)
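The dashed lines drawn above are percentile bootstrap limits. A self-contained version of the same idea is sketched below on simulated data, since the original file is not available here; the data frame `sim`, its generating coefficients, the seed and the choice of 500 replicates are all assumptions for illustration.

```r
set.seed(42)  # simulated stand-in; NOT the occupation.txt data
n <- 150
sim <- data.frame(resources = runif(n, 0, 1000))
sim$occupied <- rbinom(n, size = 1, prob = plogis(-4 + 0.01 * sim$resources))

# Resample whole rows (cases) with replacement and refit each time
B <- 500
bcoef <- matrix(NA_real_, nrow = B, ncol = 2,
                dimnames = list(NULL, c("intercept", "slope")))
for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)
  bcoef[b, ] <- coef(glm(occupied ~ resources, family = binomial,
                         data = sim[idx, ]))
}

# Percentile interval: the 2.5% and 97.5% quantiles of the replicates
ci_slope <- quantile(bcoef[, "slope"], c(0.025, 0.975))
print(ci_slope)
```

Resampling rows keeps each response paired with its own explanatory value, which is what the index vector does in the loop on the slide.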
[Figure: bootstrap densities in two panels. Left: coefficient of x (about 0.005 to 0.020). Right: intercept (about -8 to -2). Dashed vertical lines mark the 2.5% and 97.5% quantiles.]
Jackknifing
> jcoef <- matrix(0,150,2)
> for (i in 1:150) {
+   modelj <- glm(occupied[-i]~resources[-i], family=binomial)
+   jcoef[i,] <- modelj$coef
+ }
> par(mfrow=c(1,2))
> plot(density(jcoef[,2]),xlab="Coefficient of x",main="")
> abline(v=quantile(jcoef[,2],c(0.025,0.975)),lty=2, col=4)
> plot(density(jcoef[,1]),xlab="Intercept",main="")
> abline(v=quantile(jcoef[,1],c(0.025,0.975)),lty=2, col=4)
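Leave-one-out estimates differ from each other by only one observation, so they cluster far more tightly than bootstrap replicates. One common way to turn them into an uncertainty measure is the standard jackknife standard error, sketched below on simulated data. The data frame `sim`, its generating coefficients and the seed are assumptions for illustration, and this is offered as a complement to the quantile lines on the slide, not as the speaker's method.

```r
set.seed(7)  # simulated stand-in; NOT the occupation.txt data
n <- 150
sim <- data.frame(resources = runif(n, 0, 1000))
sim$occupied <- rbinom(n, size = 1, prob = plogis(-4 + 0.01 * sim$resources))

# Leave-one-out slopes: drop row i, refit, keep the coefficient
jslope <- vapply(seq_len(n), function(i) {
  coef(glm(occupied ~ resources, family = binomial, data = sim[-i, ]))[["resources"]]
}, numeric(1))

# Standard jackknife standard error: note the (n - 1) / n inflation factor
se_jack <- sqrt((n - 1) / n * sum((jslope - mean(jslope))^2))

# For comparison: the raw spread of the leave-one-out estimates
print(c(sd_leave_one_out = sd(jslope), se_jackknife = se_jack))
```

The jackknife standard error equals the raw standard deviation of the leave-one-out estimates multiplied by (n - 1)/sqrt(n), which is why it is so much larger than the spread seen in the density plot.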
[Figure: jackknife densities in two panels. Left: coefficient of x (about 0.0098 to 0.0106). Right: intercept (about -4.00 to -3.70). Dashed vertical lines mark the 2.5% and 97.5% quantiles.]
C.I.’s
> library(boot)
> reg.boot <- function(regdat, index){
+   x <- regdat$resources[index]   # use the data passed in by boot()
+   y <- regdat$occupied[index]
+   modell <- glm(y~x, family=binomial)
+   coef(modell) }
> reg.model <- boot(occupy,reg.boot,R=10000)
> boot.ci(reg.model,index=2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 10000 bootstrap replicates

Intervals : 
Level      Normal                Basic         
95%   ( 0.0059,  0.0128 )   ( 0.0051,  0.0120 )  

Level     Percentile             BCa          
95%   ( 0.0075,  0.0144 )   ( 0.0070,  0.0132 )  
Calculations and Intervals on Original Scale
> jack.after.boot(reg.model,index=2)
[Figure: jackknife-after-bootstrap plot. x-axis: standardized jackknife value (-6 to 1); y-axis: 5, 10, 16, 50, 84, 90, 95 %-iles of (T*-t) (-0.004 to 0.004); observation numbers are printed along the bottom, the first of them being 108.]
108th observation?
> occupy[105:110,]
    resources occupied
105  703.1783        1
106  710.1274        1
107  716.7298        1
108  717.1994        0
109  733.3538        1
110  736.3060        1
> plot(resources, occupied)
> text(resources[108],occupied[108],"Here",cex = 1.5,col="blue",pos=3)
OR
> fat.arrow <- function(size.x=0.5,size.y=0.5,ar.col="red"){
+   size.x <- size.x*(par("usr")[2]-par("usr")[1])*0.1
+   size.y <- size.y*(par("usr")[4]-par("usr")[3])*0.1
+   pos <- locator(1)
+   xc <- c(0,1,0.5,0.5,-0.5,-0.5,-1,0)
+   yc <- c(0,1,1,6,6,1,1,0)
+   polygon(pos$x+size.x*xc,pos$y+size.y*yc,col=ar.col) }
> fat.arrow()
[Figure: occupied (0.0 to 1.0) against resources (0 to 1000), with the 108th observation marked.]