
Methods of Psychological Research Online 2001, Vol.6, No.2 Institute for Science Education

Internet: http://www.mpr-online.de © 2001 IPN Kiel

Joint Effect of Noise, Personality and Environmental

Factors on the Intelligibility of Speech

Agustín Turrero1, Pilar Zuluaga and Carmen Santisteban

Abstract

The performance of students in speech intelligibility tests is influenced by individual

characteristics such as sex and age, personality factors such as neuroticism (N), extra-

version, attention and sensitivity to noise, and environmental conditions such as the

location of the scholars in the classroom (LI), the location of the classroom itself with

regard to extraneous noise (LO) and background noise (BN). The first aim of this study

was to analyse the role of these factors in predicting performance. From a mathematical

point of view the problem was to establish a model to reflect accurately the relationship

between the expected proportion of successes and a set of covariates. We used a logistic

regression model mainly because of its high mathematical flexibility. A further aim was

to study in depth methodological questions such as the choice and assessment of the

model, including its extension to a random-effects model. One hundred and seventy stu-

dents participated in the study. The results indicate that only four of the factors studied

had any significant bearing upon their performance: N, LI, BN and LO, and that the

effect of the classroom on performance was a random one. The covariate pattern corresponding
to the best performance is given by the following levels: (N) high, (LI) front

row, (LO) playground and (BN) normal. For this pattern the estimated proportion of

successes is 0.66.

Keywords: Speech Intelligibility, Neuroticism, Noise, Multiple Logistic Regression,

Random Effects

1 Correspondence should be addressed to Agustín Turrero, Departamento de Estadística e I.O., Facultad

de Medicina, Universidad Complutense, 28040 Madrid, Spain.

e-mail: [email protected]


1. Introduction

The interference of noise with speech is a masking process which affects communica-

tion and learning processes. An important aspect of communication interference in edu-

cational situations is the failure of scholars to hear words or sentences correctly, leading

to a lowering of performance in cognitive tasks. For example, there are abundant cross-

sectional and some longitudinal studies which report negative associations between

chronic exposure to high-noise sources (principally aircraft or road traffic noise) and

deficits in reading acquisition (Cohen et al., 1980; Evans, 1990; Hygge, Bullinger and

Evans, 1994). The effects of noise on performance are complex and have been widely

studied by many researchers (cf. Broadbent, 1957, 1981; Jones and Broadbent, 1979;

Kryter, 1970, 1994; Jones, Smith and Broadbent, 1979; Santisteban and Santalla,

1990a; Smith and Broadbent, 1991; Hygge et al. 1992; Santisteban, Sebastian and San-

talla, 1994; Santalla, Alvarado and Santisteban, 1999). Accurate predictions of the

audibility of a particular speech sound in the presence of a specific noise can be established
for a normal-hearing listener (Webster 1969, 1974; Kryter 1985, 1994) and the
speech interference level (SIL) or the articulation index (AI) (Beranek, 1947) can be
used in predicting the speech-masking capacity of a large variety of noises. Communication
processes are, however, affected by numerous factors, including those referring to

the learner's own personality. Thus, the results of many studies showing that factors

such as neuroticism and sensitivity to noise are relevant in establishing individual differ-

ences in response to noise and its effects upon cognitive tasks have led us to include

these factors in our study (cf. Smith, 1988, 1993). Since Broadbent's initial studies

(1972) there has also been considerable interest in the role of individual differences in

sensitivity to noise when determining noise response. We have included gender in the

light of the findings of such authors as Gulian and Thomas (1986), who found that noise

affected female workers but had little apparent effect on males. Nevertheless, other

studies have found no differences in response to noise according to gender.

Much research has involved measuring speech intelligibility by using either nonsense

syllables and isolated words in phonetically balanced lists or whole sentences made up of

a series of these and taking the percentage of words that are correctly perceived. Thus,

for example, the percentage of words correctly perceived from a list of isolated words

and the corresponding percentage when these words are key words in a sentence have

been estimated by Kryter (1985, 1994). Studies into the effects of simulated impaired

frequency selectivity (IFS) on the intelligibility of speech in the presence of background


noise and of interfering speech have also been made (Baer and Moore, 1994), which

confirmed that performance was seriously impaired in some cases in the presence of

both types of spectral smearing.

We have tried to predict the performance of school children in speech-

intelligibility tests from a set of factors including acoustic conditions and individual

characteristics represented by a vector of k components. In mathematical terms the

problem has been to establish a model that describes adequately the relationship between
a probability, π, representing the expected proportion of successes, and a set of

covariates measured on different scales. The logistic distribution is a function capable

of modeling this kind of relationship, and preferable to others mainly due to its extreme

flexibility in mathematical terms and easy use.

The following section contains the logistic regression model and advances, presented

in a general way, the philosophy for the selection of the variables for the model. Next

the subjects and measurements used in the study are described. The subsequent section

shows in detail the results of fitting the logistic model to the data and this is followed

by an analysis of the natural extension of the model by taking into account random ef-

fects. Our findings are set out in the final section.

2. Logistic Regression Model

In the logistic regression model the relationship between π and the k-vector of explanatory
variables $X' = (x_1, x_2, \dots, x_k)$ associated with π is given by

\[ \pi(X) = \frac{e^{g(X)}}{1 + e^{g(X)}}, \tag{1} \]

where

\[ g(X) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \tag{2} \]

and $\beta' = (\beta_0, \beta_1, \dots, \beta_k)$ is a $(k+1)$-vector of parameters.
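Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function names and the coefficient and covariate values are hypothetical and are not taken from the study.

```python
import math

def linear_predictor(beta, x):
    """g(X) = beta_0 + beta_1*x_1 + ... + beta_k*x_k, as in equation (2)."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def success_probability(beta, x):
    """pi(X) = exp(g(X)) / (1 + exp(g(X))), as in equation (1)."""
    g = linear_predictor(beta, x)
    return 1.0 / (1.0 + math.exp(-g))  # algebraically equal to e^g / (1 + e^g)

# Hypothetical coefficients (beta_0, beta_1, beta_2) and covariates (x_1, x_2).
beta = [0.5, -0.3, 0.2]
x = [1.0, 2.0]
pi = success_probability(beta, x)
print(f"g(X) = {linear_predictor(beta, x):.3f}, pi(X) = {pi:.3f}")
```

Whatever the value of g(X), the resulting π(X) lies strictly between 0 and 1, which is what makes the logistic form suitable for modeling a proportion.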


2.1. Methodology for the Variable Selection

The main aim of any strategy of variable selection is to obtain the model that fits the

data best. Nevertheless, “the goodness of fit of the model should never be taken into

account without also taking into account the parsimony of the model” (Mulaik et al.,

1989, p. 437). As Hosmer and Lemeshow (1989, p. 83) note, “The rationale for mini-

mizing the number of variables in the model is that the resultant model is more likely

to be numerically stable and is more easily generalized”. Therefore, the model-building

process should seek the most parsimonious model that still explains the data. In order to

achieve this goal we had to decide on a procedure to select the variables for the model

and the measurements to assess the fitting of the model. There are several approaches to

variable selection, based on different analytic philosophies and different statistical

methods. Hosmer and Lemeshow (1989, p. 83) suggest a general method of variable se-

lection in the logistic regression context. The main feature of this method is that selec-

tion is controlled throughout the whole process by the analyst, in contrast to the me-

chanical selection procedures used by computers, such as stepwise or best subsets regres-

sion. In this way all scientifically relevant variables may be included irrespective of their

statistical contribution to the model. Quantitative epidemiologists have adopted this

methodological stance and for similar reasons we chose to adopt this approach in quan-

titative behavioral research. This model-building methodology is applied to our set of

performance data concerning speech intelligibility which, taking into account the role of

the analyst, is in itself a methodological contribution in this respect. Furthermore, we

discuss the inappropriate use, for inference, of the estimated odds ratios in our
case. A second methodological issue dealt with in the paper is that of extending the
fixed-effects model to a random-effects model, i.e. the analysis continues taking into account
any possible random effects that isolate sources of variation. We propose the
logistic-binomial model as being more suitable for modeling this specific variation.

3. Subjects and variables

The subjects for this study were 170 schoolchildren at a secondary school in Madrid

(Spain) affected by high noise levels from two sources: a main road carrying heavy traf-

fic and Madrid airport. The students were between 14 and 19 years old and were tested

during normal school hours in six classrooms.


The variables taken into account were:

1. Gender (G), with values of 0 or 1, corresponding to males and females respectively.

2. Age (A), the age of the student in years.

3. Location in the classroom (LI), an indicator of the distance between the student

and the source of the speech signals, with values of 0, 1, 2, corresponding to near,

intermediate and far distances respectively.

4. Location of the classroom (LO), an indicator of the site of the classroom, with val-

ues of 0 or 1 depending upon whether it gives onto the playground or onto the

road respectively.

5. Neuroticism (N) and Extraversion (E); two measurements of personality using the

Eysenck Personality Inventory (Eysenck and Eysenck, 1987). Both variables were

treated as ordinal variables with four categories: low, medium, high and very high.

6. Sensitivity to noise (SN), a measurement of the sensitivity to noise obtained using

the SENSIT-NA Questionnaire (Santisteban and Santalla, 1990b). Four levels were

considered: very little, little, sensitive and very sensitive.

7. Attention (AT), measured by the Spanish version of Thurstone's Identical Forms

Test (1958). Three levels were considered for this variable: inattentive, normal and

very attentive, depending upon the number of items solved correctly out of the 60

graphical elements presented.

8. Background noise (BN), a measurement of the level of background noise present

in the classroom during the course of the intelligibility test. The chosen value was

the continuous equivalent sound level, Leq.

All the above variables were treated as independent variables because none of them

were viewed as being subject to change during the study. The outcome or dependent

variable is that of speech intelligibility, i.e. the number of words correctly heard by the

subject as measured by a test developed at the Instituto de Acústica in Madrid (Del-

gado, 1968), consisting of 100 two-syllable words phonetically balanced in 10 groups.

The successive groups of words of the intelligibility test were reproduced at emission

levels decreasing in steps of 5 dBA, from 80 dBA to 35 dBA. This gives for each student

a mark of 0 to 10 according to the number of words heard correctly at each fixed emis-

sion level. As we only found significant differences in the subjects' performance at emis-

sion levels from 45 dBA to 55 dBA we selected an emission level of 50 dBA for the pur-


poses of this study. The outcome variable in the analysis is the proportion of words

correctly heard in the group of 10 words reproduced at 50 dBA.

The data were processed using the statistical packages EGRET (1991) and BMDP

(1992).

4. Fitting the Logistic Regression Model

4.1 Variable Selection

The selection process began with a careful univariate analysis of each variable. The

results of fitting the univariate logistic regression models to the data are shown in Table

1, in which the nominal and ordinal scaled variables have been modeled by creating design
variables according to the “reference cell coding” or “partial” method used in the
programs BMDPLR and EGRET. The intercept term refers to the constant-only

model. To assess the significance of the coefficient(s) for each variable we used the like-

lihood ratio test statistic, G, which is obtained in terms of the difference between the

deviances for the constant-only model and the model containing the variable in question.
Under the null hypothesis, the statistic G asymptotically follows a χ² distribution with
p degrees of freedom, where p is the number of coefficients (categories minus one) of the variable.

In accordance with Hosmer and Lemeshow (1989, p.86) and the publications of

Bendel and Afifi (1977) and Mickey and Greenland (1989) on linear and logistic regres-

sion respectively, we used a p-value of 0.25 as screening criterion to select candidate

variables for the multivariate model. Thus, on the basis of the output set out in Table

1, all of the variables, except for E and SN, appeared to be associated in some way with

the outcome, speech intelligibility.


Table 1: Parameter Estimates, Their Standard Error Estimates, Deviances, L.R. Test
Statistics and Significance Levels for the Fitted Univariate Logistic Regression Models.

Variable   Estimate   Std. error   Deviance   G        p
Constant   -0.263     0.048        419.72
A          -0.064     0.032        415.77     3.95     0.047
G           0.120     0.098        418.23     1.49     0.221
LI1        -0.363     0.120        392.60     27.12    <0.001
LI2        -0.610     0.119
LO         -1.042     0.101        310.49     109.23   <0.001
N1         -0.125     0.126        405.65     14.07    0.003
N2          0.176     0.149
N3         -0.414     0.156
E1         -0.015     0.129        419.71     0.01     0.999
E2         -0.013     0.144
E3         -0.017     0.154
SN1        -0.019     0.147        416.50     3.22     0.359
SN2        -0.200     0.141
SN3        -0.051     0.172
AT1        -0.333     0.104        409.16     10.56    0.005
AT2        -0.056     0.188
BN         -0.310     0.034        322.47     97.25    <0.001
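As a worked instance of the likelihood ratio statistic, the entries for A (one coefficient) and LI (two coefficients) in Table 1 can be recovered from the deviances. Here $D_0$ denotes the deviance of the constant-only model and $D_A$, $D_{LI}$ those of the univariate models; these symbols are introduced only for this illustration.

```latex
\begin{align*}
G_{A}  &= D_0 - D_{A}  = 419.72 - 415.77 = 3.95  \quad (1 \text{ df}),\\
G_{LI} &= D_0 - D_{LI} = 419.72 - 392.60 = 27.12 \quad (2 \text{ df}).
\end{align*}
```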

Table 2 shows the results of fitting the multivariate logistic model including all the

variables except E and SN.


Table 2: Parameter Estimates, Their Standard Error Estimates, Wald Test Statistics
and Two-tailed Significance Levels for the Multivariate Model.

Variable   Estimate   Std. error   Wald z   p
Constant    8.846     1.920         4.60    <0.001
A          -0.033     0.035        -0.94    0.351
G           0.103     0.110         0.94    0.348
LI1        -0.311     0.127        -2.45    0.014
LI2        -0.691     0.127        -5.44    <0.001
LO         -0.695     0.133        -5.23    <0.001
N1         -0.046     0.135        -0.34    0.733
N2          0.217     0.161         1.35    0.176
N3         -0.292     0.175        -1.67    0.094
AT1        -0.163     0.113        -1.44    0.150
AT2        -0.017     0.202        -0.08    0.934
BN         -0.171     0.044        -3.89    <0.001

Deviance: 247.49 (158 df)

The relevance of each variable is mainly verified through an examination of its

Wald statistic. Also, the comparison of the parameter estimate with the corresponding

estimate from the univariate model in Table 1 completes that examination. On the basis

of the results set out in Table 2, variables A, G, and AT should be excluded from the

analysis. Whether to include the variable N in the model was more questionable. Tables

3 and 4 show the results of fitting two new multivariate logistic models containing the

significant variables from the old model, the first including the variable N and the sec-

ond excluding it.
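The Wald statistic used in this examination is the estimate divided by its standard error, referred to a standard normal distribution. A minimal sketch follows, using three rows copied from Table 2; the helper name `wald_test` is hypothetical, and because the published estimates are rounded the recomputed values agree with the table only approximately.

```python
import math

def wald_test(beta_hat, se):
    """Return the Wald z statistic and its two-tailed normal p-value."""
    z = beta_hat / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

# (estimate, standard error) pairs copied from Table 2.
table2 = {"A": (-0.033, 0.035), "LI2": (-0.691, 0.127), "N3": (-0.292, 0.175)}

for name, (b, se) in table2.items():
    z, p = wald_test(b, se)
    print(f"{name}: z = {z:.2f}, two-tailed p = {p:.3f}")
```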


Table 3: Parameter Estimates, Their Standard Error Estimates and Two-tailed Significance
Levels of Wald Test Statistic for the Multivariate Model Containing Variables
LI, LO, N and BN.

Variable   Estimate   Std. error   p
Constant    9.290     1.870        <0.001
LI1        -0.296     0.126        0.019
LI2        -0.675     0.126        <0.001
LO         -0.673     0.127        <0.001
N1         -0.044     0.133        0.743
N2          0.256     0.158        0.107
N3         -0.262     0.168        0.119
BN         -0.193     0.041        <0.001

Deviance: 251.30 (162 df)


Table 4: Parameter Estimates, Their Standard Error Estimates and Two-tailed Significance
Levels of Wald Test Statistic for the Multivariate Model Containing Variables LI,
LO and BN.

Variable   Estimate   Std. error   p
Constant    8.860     1.850        <0.001
LI1        -0.287     0.125        0.022
LI2        -0.661     0.124        <0.001
LO         -0.720     0.125        <0.001
BN         -0.183     0.041        <0.001

Deviance: 260.48 (165 df)

The models in Tables 2 and 3 were compared via the likelihood ratio test. The statis-

tic of this test takes the value 3.80, which, compared to a χ2 distribution with 4 degrees

of freedom, yields a p-value of 0.43, showing that the variables A, G, and AT added

little information to the model once the other variables had been included. Also, the

observation of the estimated coefficients for the remaining variables supports that fact

since they were nearly identical in both models.

The likelihood ratio statistic, LRS, for the difference between the models in Tables 3

and 4 (a test for the significance of N) had a value of 9.18, which yielded a p-value of

0.027, thus demonstrating that N contributed significantly to the model. Nevertheless,


the estimated coefficients for the remaining variables did not change appreciably in ei-

ther model. Observation of the estimated coefficients for the variable N in Table 3 sug-

gested that we should consider a new grouping for this variable. A new variable, de-

noted by NE, was chosen to replace N by regrouping two of its categories into one. The

variable NE thus obtained contains three categories, the first, or reference group in-

cludes the first two categories of N, i.e. the low and medium levels of neuroticism, the

rest of the categories being the same for both variables. A univariate analysis of NE

shows that the high level of neuroticism is the most favourable category with an esti-

mated proportion of successes of 0.5, whilst the values for the low-to-medium and the

very high levels are 0.44 and 0.36 respectively.

Table 5 shows the results of fitting a new multivariate logistic model including the

variable NE instead of N.


Table 5: Parameter Estimates, Their Standard Error Estimates and Two-tailed Significance
Levels of Wald Test Statistic for the Multivariate Model Containing Variables LI,
LO, NE and BN.

Variable   Estimate   Std. error   p
Constant    9.291     1.870        <0.001
LI1        -0.297     0.126        0.019
LI2        -0.672     0.125        <0.001
LO         -0.675     0.127        <0.001
NE1         0.283     0.135        0.036
NE2        -0.234     0.146        0.107
BN         -0.193     0.041        <0.001

Deviance: 251.40 (163 df)

The models in Tables 3 and 5 are nested and so we could use LRS to compare them.

This statistic took the value 0.1, which, compared to a χ2 distribution with 1 degree of

freedom, yielded a p-value of 0.75. Because the simpler model in Table 5 fitted the data
essentially as well as that in Table 3, we concluded that it represented an improvement.
Moreover, the LRS for the difference between
the models in Tables 4 and 5 (a test for the significance of NE) had a value of

9.08 with an associated p-value of 0.010, which endorses the contribution of neuroticism

to the model.
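The nested-model comparisons reported in this section can be retraced from the published deviances and degrees of freedom. The sketch below is a check under stated assumptions: `chi2_sf` is a hypothetical stdlib-only helper for the upper tail of a χ² distribution with integer degrees of freedom, and it is not the routine used by EGRET or BMDP.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail and add one term per extra 2 df.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(3, df + 1, 2):
        total += term
        term *= x / k
    return total

# (deviance, residual df) copied from Tables 2 to 5.
fits = {"Table 2": (247.49, 158), "Table 3": (251.30, 162),
        "Table 4": (260.48, 165), "Table 5": (251.40, 163)}

def compare(reduced, full):
    """Likelihood ratio statistic, its df and p-value for two nested fits."""
    (d_red, df_red), (d_full, df_full) = fits[reduced], fits[full]
    lrs, df = d_red - d_full, df_red - df_full
    return lrs, df, chi2_sf(lrs, df)

for reduced, full in [("Table 3", "Table 2"), ("Table 4", "Table 3"),
                      ("Table 5", "Table 3"), ("Table 4", "Table 5")]:
    lrs, df, p = compare(reduced, full)
    print(f"{reduced} vs {full}: LRS = {lrs:.2f}, df = {df}, p = {p:.3f}")
```

The recomputed values match those quoted in the text to within rounding of the published deviances.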

We then focused our attention upon the assumption of linearity in the logit for the

variables that were modeled as being continuous. The only variable we needed to check


was background noise (BN). This variable differs between rooms but not within the

same room. Moreover only four different levels of background noise were observed during
the tests in the six classrooms. These levels were 45 dBA, 45.5 dBA, 47 dBA and 50

dBA. One approach to assessing the scale of the logit was to categorize the variable BN

into groups and so we created two design variables using the 45 and 45.5 values as the

reference group. These design variables, BNG1 and BNG2, were then used in the multi-

variate model instead of BN.

Table 6 shows the results of this fitting with regard to the variables BNG1 and

BNG2.

Table 6: Estimated Coefficients, Estimated Standard Errors and Two-tailed Significance
Levels of Wald Test Statistic of BNG1 and BNG2 from the Multivariate Model Containing
LI, LO, NE and BNG.

Variable   Estimate   Std. error   p
BNG1        0.071     0.184        0.699
BNG2       -0.976     0.194        <0.001

Deviance: 240.3 (162 df)

The estimated coefficients and p-values in Table 6 suggest a binary model: the coefficient
of BNG1 is close to zero, whereas that of BNG2 is large and significant. Thus, we
created a new dichotomous variable, BNC, taking a value of 1 if BN was greater than
47 dBA and 0 otherwise. The results of fitting the multivariate model with the new
variable BNC are given in Table 7.
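The recoding of background noise can be sketched as follows. The text states the reference group (45 and 45.5 dBA) and the definition of BNC; the assignment of BNG1 to the 47 dBA level and BNG2 to the 50 dBA level is an assumption made here for illustration, as is the function name.

```python
def code_background_noise(bn_dba):
    """Return (BNG1, BNG2, BNC) for a background-noise level in dBA."""
    bng1 = 1 if bn_dba == 47.0 else 0  # assumed: design variable for the 47 dBA level
    bng2 = 1 if bn_dba == 50.0 else 0  # assumed: design variable for the 50 dBA level
    bnc = 1 if bn_dba > 47.0 else 0    # dichotomy as defined in the text
    return bng1, bng2, bnc

# The four levels observed in the six classrooms.
for level in (45.0, 45.5, 47.0, 50.0):
    print(level, code_background_noise(level))
```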

Table 7: Results of Fitting the Multivariate Model Containing Variables LI, LO, NE
and BNC.

Variable   Estimate   Std. error   p
Constant    0.525     0.106        <0.001
LI1        -0.294     0.127        0.020
LI2        -0.664     0.126        <0.001
LO         -0.744     0.114        <0.001
NE1         0.268     0.135        0.047
NE2        -0.247     0.146        0.090
BNC        -1.003     0.180        <0.001

Deviance: 240.45 (163 df)


These results show that students in a noisy classroom (50 dBA) obtained an esti-

mated proportion of successes of 0.38, which is much lower than the score of 0.63 ob-

tained by students in classrooms with low noise levels (≤ 47 dBA). Once again, a nu-

merical comparison of the deviance in Table 7 with that in Table 6, together with the

respective degrees of freedom, indicated an improvement over the last model.
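The two proportions quoted above appear to follow from the coefficients in Table 7 when all other variables are held at their reference levels. That choice of covariate values is an assumption, since the text does not state it, but it reproduces the reported figures:

```latex
\begin{align*}
\hat\pi(\mathrm{BNC}=0) &= \frac{e^{0.525}}{1+e^{0.525}} \approx 0.63,\\
\hat\pi(\mathrm{BNC}=1) &= \frac{e^{0.525-1.003}}{1+e^{0.525-1.003}}
                         = \frac{e^{-0.478}}{1+e^{-0.478}} \approx 0.38.
\end{align*}
```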

It should be pointed out at this juncture that the variable BN was treated as a

categorical variable because of the particular conditions of this study. Generally speak-

ing, when there are a lot of values for BN the rational thing to do would be to treat it

as a continuous variable in the model.

Once we had ascertained that the continuous variable was in the correct scale we

were able to consider the main-effects model as being complete. We began the multi-

variate model in Table 2 with a deviance of 247.49 and 158 degrees of freedom, and fin-

ished in Table 7 with a deviance of 240.45 and 163 degrees of freedom.

At this stage in the model-building process we felt we should check for interactions.
The interaction between the variables LO and BNC made no sense because BNC

= 0 for all the classrooms giving onto the playground. The remaining interactions were

certainly of greater interest. The results of adding each interaction to the main-effects

model are shown in Table 8.

Table 8: Deviances, LRS, Degrees of Freedom and p-Value for Interactions of Interest to
be Added to the Main Effects Only Model.

Interaction          Deviance   LRS     df   p-value
Main effects only²   240.45
LI x LO              236.55     3.90    2    0.142
LI x NE              238.11     2.34    4    0.674
LI x BNC             238.95     1.50    2    0.473
LO x NE              237.31     3.14    2    0.209
NE x BNC             229.08     11.37   2    0.003

It can be seen from the p-values associated with LRS in Table 8 that only the
NE × BNC interaction affords a significant improvement over the main-effects model.

2 Main effects model from Table 7.

Consequently, the final fixed-effects model contains the main effects set out in Table 7
plus this latter interaction. The results of fitting this model are given in Table 9.

Table 9: Results of Fitting the Multivariate Model Containing Variables LI, LO, NE,
BNC and NE × BNC Interaction.

Variable    Estimate   Std. error   p
Constant     0.556     0.107        <0.001
LI1         -0.298     0.130        0.022
LI2         -0.632     0.127        <0.001
LO          -0.734     0.114        <0.001
NE1          0.111     0.144        0.440
NE2         -0.344     0.157        0.028
BNC         -1.589     0.277        <0.001
BNC x NE1    1.256     0.401        0.002
BNC x NE2    0.969     0.449        0.031

Deviance: 229.08 (161 df)

4.2 Assessing the Fitting of the Model

Once the model was constructed we needed to assess its overall fitting and suitability.

Some combinations of variable levels were not found in the results and thus only 25

different covariate patterns occur in the final model, as shown in Table 9. In this situa-

tion an appropriate statistic for assessing the fitting is the Hosmer-Lemeshow test

(Hosmer and Lemeshow, 1980; Lemeshow and Hosmer, 1982). The value of this statistic,

computed from the fitted logistic model in Table 9, is χ = 3.51, and the corresponding

p-value computed from the χ² distribution with 8 degrees of freedom is 0.898,

which indicates that the model seems to fit the data quite well. A complete analysis

might include an examination of the individual residuals but this would involve consid-

erable effort and would not answer any outstanding questions.
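For reference, the Hosmer-Lemeshow statistic compares observed and expected successes within groups of similar fitted probability. The sketch below uses its usual grouped form; the three groups are invented for illustration and are not the study's data, and `hosmer_lemeshow` is a hypothetical helper name.

```python
def hosmer_lemeshow(groups):
    """Sum over groups of (O - n*p)^2 / (n*p*(1 - p)).

    groups: iterable of (observed_successes, n_trials, mean_fitted_probability).
    """
    stat = 0.0
    for observed, n, p_bar in groups:
        expected = n * p_bar
        stat += (observed - expected) ** 2 / (expected * (1.0 - p_bar))
    return stat

# Hypothetical grouped data: (successes, trials, mean fitted probability).
groups = [(30, 100, 0.28), (45, 100, 0.47), (66, 100, 0.64)]
print(f"Hosmer-Lemeshow statistic = {hosmer_lemeshow(groups):.3f}")
```

With g groups the statistic is usually referred to a χ² distribution with g − 2 degrees of freedom, so the 8 degrees of freedom quoted above would correspond to ten groups.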

4.3 Inferences from the Fitted Model

In epidemiological research inferences from a logistic regression model usually begin

with an estimation of the odds ratios for the various risk factors in the model. The main

reason for this is that in many instances the odds ratios approximate the relative risks

and those can easily be obtained from the coefficients of the logistic model. In our con-


text this approximation requires that the expected proportion of failures be small for all

categories of each variable, which is unlikely. Furthermore, we are interested in esti-

mating the proportion of successes for the different levels of the variables and in estab-

lishing an order between these levels on the basis of these proportions. An initial ap-

proach for evaluating the effect that each variable in the model has on the proportion of

successes consists of “adjusting for all other variables”, which involves comparing the

different levels of the variable at certain common values of the remaining variables,

these values being their respective reference values. For each variable we can choose the

most favourable category (the greatest proportion of successes) and the most unfavour-

able category (the smallest proportion of successes). Table 10 shows the favourable and

unfavourable categories for each variable, together with their estimated proportion of

successes (E.P.S.).

Table 10: Estimated Proportion of Successes for the Favourable and Unfavourable Categories
of Each Variable in the Model and Ratio of Proportions.

Variable     Favourable    E.P.S.   Unfavourable           E.P.S.   Ratio of Proportions
LI           First Range   0.636    Far Range              0.481    1.32
LO           Yard          0.636    Highway                0.456    1.39
N, BN≤47     High Level    0.661    Very High Level        0.553    1.20
N, BN>47     High Level    0.583    Low or Medium Levels   0.263    2.22

Since neuroticism and background noise are included in one interaction, both variables
are jointly analysed in Table 10. The ratio of proportions (favourable/unfavourable)
is also given in the last column of this table. Note that the proportion of successes

for a student in the front row of the class is 1.32 times that for a student in a back row.

In a noisy classroom (BN>47 dBA) the proportion of successes for a student with a high

level of neuroticism is 2.22 times that of a student with a low-to-medium level.
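The entries of Table 10 can be retraced from the coefficients in Table 9 by holding every variable not named at its reference level. The sketch below does this for the LI row and the noisy-classroom row, and also prints the odds ratio for LI to show how far it lies from the ratio of proportions; the helper names are hypothetical.

```python
import math

def expit(g):
    """Inverse logit: maps a linear predictor to a proportion."""
    return 1.0 / (1.0 + math.exp(-g))

# Coefficients copied from Table 9.
coef = {"const": 0.556, "LI1": -0.298, "LI2": -0.632, "LO": -0.734,
        "NE1": 0.111, "NE2": -0.344, "BNC": -1.589,
        "BNC:NE1": 1.256, "BNC:NE2": 0.969}

def proportion(*terms):
    """Estimated proportion of successes with the listed terms switched on."""
    return expit(coef["const"] + sum(coef[t] for t in terms))

front = proportion()                               # all variables at reference
far = proportion("LI2")
noisy_high = proportion("NE1", "BNC", "BNC:NE1")   # high N, BN > 47 dBA
noisy_low = proportion("BNC")                      # low-to-medium N, BN > 47 dBA

print(f"LI: {front:.3f} vs {far:.3f}, ratio {front / far:.2f}, "
      f"odds ratio {math.exp(-coef['LI2']):.2f}")
print(f"N, BN>47: {noisy_high:.3f} vs {noisy_low:.3f}, "
      f"ratio {noisy_high / noisy_low:.2f}")
```

For LI the odds ratio, exp(0.632), is clearly larger than the ratio of proportions, which illustrates why the odds ratios are not used for inference here.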


5. Extension to Random-Effects Model

5.1 Approaches

A natural extension of a fixed-effects model, when there is grouping in the data, is

the random-effects model, which offers an alternative to isolate sources of heterogeneity.

Fixed-effects models explicitly model a location parameter corresponding to the baseline

response in each stratum, whilst random-effects models address this fact by assuming

that each stratum baseline parameter is a realization from a probability distribution

specifiable with a small fixed number of parameters.

In our context it seems reasonable to believe that, because of its internal and external

acoustic conditions, the classroom (C) can play the role of a homogeneity factor.

For categorical data two approaches are available for modeling the so-called
extra-binomial variation. The pioneering work of Crowder (1978) postulated that the
success probability for the i-th stratum derives from a beta distribution. This model is
reported as the beta-binomial regression model.

Another approach, with a special intuitive appeal, is to postulate that the said probability
is perturbed on the logit scale. Pierce and Sands (1975) proposed this method in
an unpublished paper, assuming a standard normal distribution for this perturbation.
This model is known as the logistic-normal regression model. Finally the
logistic-binomial model (Mauritsen, 1984) generalises the logistic regression model in a manner
similar to the logistic-normal regression model. In this case a standardised binomial
distribution is assumed for the random perturbation.

Mauritsen (1984) compares the beta-binomial with the logit-scale prior models by
analyzing such features as goodness of fit, speed of fit and utility, using real data sets
and computer-simulated data. On the basis of this comparison, and taking into account
that we are handling distinguishable responses, we chose to use the logistic-binomial
model for modeling any possible extra-binomial variation.

Let $r_{ij}$ denote the number of successes for the j-th student in the i-th classroom. We
are assuming

\[ r_{ij} \sim \mathrm{Binomial}(10, \pi_{ij}). \tag{3} \]

The logistic regression model assumes that there are no classroom effects and that $\pi_{ij}$,
the success probability for the (i, j) subscripts, can be written, in terms of its k-vector
of associated covariates $X_{ij}' = (x_{ij1}, x_{ij2}, \dots, x_{ijk})$, as follows:

\[ \pi_{ij} = \frac{e^{g(x_{ij})}}{1 + e^{g(x_{ij})}}, \tag{4} \]

which can also be written, in terms of the logit transformation, as

\[ \operatorname{logit}(\pi_{ij}) = g(x_{ij}), \tag{5} \]

where $g(x_{ij}) = \beta_0 + \beta_1 x_{ij1} + \beta_2 x_{ij2} + \dots + \beta_k x_{ijk}$,
and $\beta' = (\beta_0, \beta_1, \dots, \beta_k)$ is a $(k+1)$-vector of parameters³.

The logistic-binomial regression model assumes

\[ \operatorname{logit}(\pi_{ij}) = g(x_{ij}) + \sigma v_i, \tag{6} \]

where $v_i$ is a realization from a symmetric, standardized binomial distribution, and
$\sigma \ge 0$ is a scale parameter, i.e.

\[ v_i = \frac{w_i - E(w_i)}{\sqrt{\mathrm{Var}(w_i)}} = \frac{2 w_i - K}{\sqrt{K}}, \tag{7} \]

where $w_i \sim \mathrm{Binomial}(K, 1/2)$. In particular $v_i$ is the same for all students in the i-th
classroom.

5.2 Fitting the Logistic-Binomial Regression Model

Table 11 shows the results of extending the model in Table 9 to include the σ parameter,
that is to say, the new model fitted to the data is the logistic-binomial model
formulated in (6). For the fitting, we used a six-point prior distribution, i.e. K=5 in (7).
The choice of this prior distribution is based upon the comparisons made by Mauritsen
(1984).

3 Models (1) and (4) are the same. Equation (4) is more explicit than (1) due to notational necessity.
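The six-point prior in (7) with K = 5, and the way it enters the likelihood of one classroom under (3) and (6), can be sketched as follows. This is a minimal illustration of the model as written, not a description of how EGRET fits it; the function names and the example data are hypothetical.

```python
import math

def standardized_binomial_prior(K):
    """Support points v = (2w - K)/sqrt(K) and weights P(w) for w ~ Binomial(K, 1/2)."""
    points = [(2 * w - K) / math.sqrt(K) for w in range(K + 1)]
    weights = [math.comb(K, w) / 2 ** K for w in range(K + 1)]
    return points, weights

def classroom_likelihood(successes, linear_predictors, sigma, K=5, n_words=10):
    """Marginal likelihood of one classroom: average over the shared effect v_i."""
    points, weights = standardized_binomial_prior(K)
    total = 0.0
    for v, wt in zip(points, weights):
        lik = 1.0
        for r, g in zip(successes, linear_predictors):
            p = 1.0 / (1.0 + math.exp(-(g + sigma * v)))  # equation (6)
            lik *= math.comb(n_words, r) * p ** r * (1.0 - p) ** (n_words - r)
        total += wt * lik
    return total

points, weights = standardized_binomial_prior(5)
mean = sum(w * v for v, w in zip(points, weights))
variance = sum(w * v * v for v, w in zip(points, weights))
print(f"{len(points)} support points, mean {mean:.3f}, variance {variance:.3f}")

# Hypothetical classroom of three students (words heard out of 10, linear predictor).
print(classroom_likelihood([4, 6, 5], [0.2, 0.2, -0.1], sigma=0.183))
```

Because the prior is standardized, σ is directly the standard deviation of the classroom effect on the logit scale, and σ = 0 recovers the ordinary logistic regression likelihood.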


Table 11: Results of Fitting the Logistic-Binomial Regression Model Containing All the
Fixed Effects Terms in Table 9 and Using C as Matching Variable.

Variable               Estimate   Std. error   p
Constant                0.538     0.147        <0.001
LI1                    -0.306     0.130        0.019
LI2                    -0.653     0.128        <0.001
LO                     -0.691     0.208        <0.001
NE1                     0.073     0.145        0.616
NE2                    -0.353     0.157        0.025
BNC                    -1.610     0.367        <0.001
BNC x NE1               1.292     0.402        0.001
BNC x NE2               0.976     0.450        0.030
Excess variation (σ)    0.183     0.075        0.007

Deviance: 223.72 (160 df)

Note. The Wald-test statistic for the excess variation term is a one-tailed test.

To test whether there is any statistically significant excess variation (σ > 0) we have

to compare the model containing no random-effects terms (Table 9) with that contain-

ing an excess-variation term (Table 11) using the likelihood ratio statistic. The square

root of the likelihood ratio statistic is treated as a one-tailed z-statistic since the linear
predictor for the random-effects portion of the regression is restricted to being
non-negative. This LRS results in a value of 5.36, which yields a p-value of 0.010, indicating
a significant excess of variation. Note that the standard errors for LO and BNC
in Table 11 have increased in relation to the corresponding ones derived according to
the standard logistic regression model (Table 9). This is the practical effect of the
heterogeneity, or extra-binomial variation, in the data, since both variables are related to
C, the matching variable.
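The one-tailed test described above can be retraced from the deviances in Tables 9 and 11. In this sketch the helper name is hypothetical, and the normal tail is computed with the standard library's complementary error function.

```python
import math

def one_tailed_p_from_lrs(lrs):
    """Treat sqrt(LRS) as a z-statistic and return (z, upper-tail normal p-value)."""
    z = math.sqrt(lrs)
    return z, 0.5 * math.erfc(z / math.sqrt(2.0))

# Deviances copied from Table 9 (no random effects) and Table 11 (with sigma).
lrs = 229.08 - 223.72
z, p = one_tailed_p_from_lrs(lrs)
print(f"LRS = {lrs:.2f}, z = {z:.3f}, one-tailed p = {p:.3f}")
```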

The variable BNC contributes to the model together with the variable NE, whilst the

variable LO has a separate effect. Thus, it may be that once σ is in the model LO is no

longer significant. This question is analysed in Tables 12 and 13. The intention is to

find the best place for LO in the model.


Table 12: Results of Fitting Six Models to the Data, Using Logistic Regression With
and Without Random Effects.

Fit   Fixed Effects Parameters   Random Effects Parameters   Deviance (df)
A     Model I                    -                           271.19 (162)
B⁴    Model I, LO                -                           229.08 (161)
C     Model I                    EV                          231.33 (161)
D⁵    Model I, LO                EV                          223.72 (160)
E     Model I                    EV, LO                      228.16 (160)
F     Model I, LO                EV, LO                      221.86 (159)

Table 13: Analysis of the Fits Reported in Table 12.

Test   Explanation                                                                       Comparison   LRS     df   p
1      Test for LO differences                                                           A vs. B      42.11   1    <0.001
2      Test for excess variation given no LO differences                                 A vs. C      39.86   1    <0.001⁶
3      Test for excess variation given LO differences                                    B vs. D      5.36    1    0.010⁶
4      Test for LO differences in the presence of excess variation                       C vs. D      7.61    1    0.005
5      Test if the two LO groups need to be fit separately                               D vs. F      1.86    1    0.163
6      Test for LO differences, while fitting separate amounts of excess of variation    E vs. F      6.3     1    0.012

Note. All tests assume the presence of LI, NE and BNC differences.

In Table 12 we set out the results of fitting six different regressions to the data, the first two using logistic regression and the last four using logistic-binomial regression. ‘Model I’ represents the fixed-effects model containing the variables LI, NE and BNC and the NE × BNC interaction. ‘EV’ denotes the term that parametrizes the excess variation.

⁴ Fit B is the final fixed-effects model in Table 9.

⁵ Fit D is the random-effects model in Table 11.

⁶ This test compares the square root of the likelihood ratio statistic against a one-tailed normal distribution.


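The first three tests in Table 13 are significant, as was to be expected after the model-building process concluded in Table 11. Test 4 shows that the variable LO must be present in the fixed-effects portion of the model. Test 5 is not significant (p = 0.163), so the two LO groups do not need separate amounts of excess variation, while Test 6 shows that the LO differences remain significant even when separate amounts are fitted. These last two tests allow us to conclude that the best fit of all is D, i.e. the random-effects model in Table 11.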
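Each likelihood ratio statistic in Table 13 is the difference between the deviances of two nested fits in Table 12, with degrees of freedom equal to the difference between their residual degrees of freedom. A minimal sketch of that bookkeeping follows; the dictionary layout and function name are illustrative.

```python
# Deviance and residual degrees of freedom of each fit, from Table 12.
fits = {
    "A": (271.19, 162),
    "B": (229.08, 161),
    "C": (231.33, 161),
    "D": (223.72, 160),
    "E": (228.16, 160),
    "F": (221.86, 159),
}

# Comparisons of Table 13, as (smaller model, larger model).
comparisons = [("A", "B"), ("A", "C"), ("B", "D"),
               ("C", "D"), ("D", "F"), ("E", "F")]


def likelihood_ratio(smaller, larger):
    """Return (LRS, df) for two nested fits identified by their letters."""
    dev_small, df_small = fits[smaller]
    dev_large, df_large = fits[larger]
    return round(dev_small - dev_large, 2), df_small - df_large


results = [likelihood_ratio(s, l) for s, l in comparisons]

for test, ((s, l), (lrs, df)) in enumerate(zip(comparisons, results), start=1):
    print(f"Test {test}: {s} vs. {l}  LRS = {lrs:.2f}  df = {df}")
```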

6 Conclusions

We have applied the logistic regression model for estimating the effects of noise, personality and other factors upon the performance of students in speech intelligibility tests. The model-building methodology used in the analysis leaves the selection of variables in the hands of the analyst rather than in the computer's control. The rationale for this approach is to provide as complete control of confounding as possible within the given data set. On the basis of this analysis only four factors appear to have a close relationship with performance results. Two of these are related to the acoustic conditions in the classroom, location (LO) and background noise (BN); the third, student neuroticism (N), is related to the subject's own personality, and the last one is the distance between the student and the source of the speech signals (LI).

The extension of the model to include random effects substantially improves the fit and provides the best tool for forecasting purposes.

The main behavioural implications of the models may be summarised as follows:

1. - The variables LI and LO represent separate effects on performance. The proportion of successes for a student in the row nearest to the sound source, within a classroom giving onto the playground, is estimated to be from 12% to 25% greater than for a student in the middle rows, and from 30% to 65% greater than for a student seated farthest away, the rest of the covariates remaining the same. In a classroom giving onto the main road the corresponding percentages range from 19% to 29% and from 45% to 75% respectively. Overall, schoolchildren in classrooms giving onto the playground perform better than those in classrooms giving onto the road: the proportion of successes for a student in the former situation is estimated to be from 37% to 91% greater than for a student in the latter situation, all other factors being equal.

2. - The variables N and BN exert a joint effect on performance. For any combination (LI, LO), the categories of N × BN rank in the following order of performance: high × normal, low-to-medium × normal, high × noisy, very high × normal, very high × noisy and low-to-medium × noisy, where the adjectives ‘normal’ and ‘noisy’ refer to BN ≤ 47 dBA and BN > 47 dBA respectively. Thus the most favourable combination is a student with a high level of neuroticism in a normal classroom. The least favourable combination, on the other hand, is a student with a low-to-medium level of neuroticism in a noisy classroom. It is also noteworthy that a student with a high level of neuroticism in a noisy classroom has a better forecast than a student with a very high level of neuroticism in a normal classroom.

3. - The classroom represents a block factor in the data with a random effect on performance. This means that two students with the same covariates but belonging to two different classrooms have estimated proportions of successes which differ by a random amount, the approximate distribution of which is N(0, 0.18).

4. - Finally, we may conclude that the covariate pattern with the best performance is given by the following values: front row (LI), playground (LO), high-level neuroticism (N) and normal noise (BN). For this pattern, the estimated proportion of successes, without taking into account the random effect, is 0.661. The values far from the speech source (LI), road (LO), low-to-medium level of neuroticism (N) and noisy background (BN) constitute the worst covariate pattern, with an estimated proportion of successes, irrespective of the random effect, of 0.083.
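The ordering of the N × BN categories stated in point 2 can be checked against the coefficients of Table 11 with the short sketch below. The dummy coding is an assumption, since it is not restated in this section: low-to-medium neuroticism and normal background noise are taken as reference levels, NE1 and NE2 are taken to indicate high and very high neuroticism, and BNC = 1 is taken to indicate a noisy classroom. All names in the code are illustrative.

```python
# Fixed-effects coefficients on the logit scale, copied from Table 11.
COEF = {
    "const": 0.538, "LI1": -0.306, "LI2": -0.653, "LO": -0.691,
    "NE1": 0.073, "NE2": -0.353, "BNC": -1.610,
    "BNCxNE1": 1.292, "BNCxNE2": 0.976,
}

# ASSUMED dummy coding (not stated in this section of the paper).
NE_LEVELS = {"low-to-medium": (0, 0), "high": (1, 0), "very high": (0, 1)}
BN_LEVELS = {"normal": 0, "noisy": 1}


def linear_predictor(ne, bn, li1=0, li2=0, lo=0):
    """Logit of the proportion of successes, ignoring the random effect."""
    ne1, ne2 = NE_LEVELS[ne]
    bnc = BN_LEVELS[bn]
    return (COEF["const"]
            + COEF["LI1"] * li1 + COEF["LI2"] * li2 + COEF["LO"] * lo
            + COEF["NE1"] * ne1 + COEF["NE2"] * ne2 + COEF["BNC"] * bnc
            + COEF["BNCxNE1"] * bnc * ne1 + COEF["BNCxNE2"] * bnc * ne2)


# Rank the six N x BN categories from best to worst performance.
categories = [(ne, bn) for ne in NE_LEVELS for bn in BN_LEVELS]
ranking = sorted(categories, key=lambda c: linear_predictor(*c), reverse=True)

for ne, bn in ranking:
    print(f"{ne} x {bn}: logit = {linear_predictor(ne, bn):+.3f}")
```

Because LI and LO enter the linear predictor additively and do not interact with N or BN, the same ranking is obtained for any combination (LI, LO).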

It is important to note that one of the main results is the significant interaction of BNC and NE; that is to say, although several variables may exert main effects on the speech comprehension rate (intelligibility), only an analysis of the interactions of these variables with the effect of background noise reveals significant information concerning the influences upon susceptibility.


