valente et al-2015-international journal of language & communication disorders

17
7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 1/17 INT J LANG COMMUN DISORD,  JANUARY FEBRUARY  2015, VOL. 50,  NO. 1, 14–30 Review Event- and interval-based measurement of stuttering: a review  Ana Rita S. Valente†‡, Luis M. T. Jesus§, Andreia Hall¶ and Margaret Leahy  Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal Department of Education (DE), University of Aveiro, Aveiro, Portugal §School of Health Sciences (ESSUA), University of Aveiro, Aveiro, Portugal ¶Department of Mathematics, University of Aveiro, Aveiro, Portugal Department of Clinical Speech and Language Studies, Trinity College Dublin, Dublin, Ireland (Received October 2013; accepted April 2014 )  Abstract Background: Event- and interval-based measurements are two different ways of computing frequency of stuttering. Interval-based methodology emerged as an alternative measure to overcome problems associated with reproducibil- ity in the event-based methodology. No review has been made to study the effect of methodological factors in interval-based absolute reliability data or to compute the agreement between the two methodologies in terms of inter-judge, intra-judge and accuracy (i.e., correspondence between raters’ scores and an established criterion).  Aims: To provide a reviewrelated toreproducibilityofevent-basedandtime-intervalmeasurement,andtoverifythe effect of methodological factors (training, experience, interval duration, sample presentation order and judgment conditions) on agreement of time-interval measurement; in addition, to determine if it is possible to quantify the agreement between the two methodologies  Methods & Procedures: Thefirst twoauthorssearchedforarticlesonERIC,MEDLINE, PubMed,B-on,CENTRAL and Dissertation Abstracts during January–February 2013 and retrieved 495 articles. Forty-eight articles were selected for review. Content tables were constructed with the main findings.  Main Contribution:  Articles related to event-based measurements revealed values of inter- and intra-judge greater than 0.70 and agreement percentages beyond 80%. The articles related to time-interval measures revealed that, in general, judges with more experience with stuttering presented significantly higher levels of intra- and inter-  judge agreement. Inter- and intra-judge values were beyond the references for high reproducibility values for both methodologies. Accuracy (regarding the closeness of raters’ judgements with an established criterion), intra- and inter-judge agreement were higher for trained groups when compared with non-trained groups. Sample presentation order and audio/video conditions did not result in differences in inter- or intra-judge results. A duration of 5 s for an interval appears to be an acceptable agreement. Explanation for high reproducibility values as well as parameter choice to report those data are discussed. Conclusions & Implications:  Both interval- and event-based methodologies used trained or experienced judges for inter- and intra-judge determination and data were beyond the references for good reproducibility values. Inter- and intra-judge values were reported in different metric scales among event- and interval-based methods studies, making it unfeasible to quantify the agreement between the two methods. Keywords:  stuttering, event-based measurement, interval-based measurement, inter-judge, intra-judge, accuracy.  What this paper adds?  A systematic review has been completed regarding event- and interval-based measurements of frequency of stuttering. Interval-based measurement was developed to overcome reproducibility problems from event-based methodology. This study was developed to verify if it is possible to quantify the agreement between the two methods to conclude  what is the most reproducible or more accurate way to assess stuttering frequency and to analyse the effect of methodological factors on agreement of time-interval measurement. Both methodologies presented agreement values beyond the highest references for all metric scales. It was not possible to perform a comparison study between methods, so it would be useful in future research to quantify the agreement between the two assessment methods.  Address correspondence to: Ana Rita S. Valente, Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, P-3810–193 Aveiro, Portugal; e-mail: [email protected]. International Journal of Language & Communication Disorders ISSN 1368-2822 print/ISSN 1460-6984 online  C  2014 Royal College of Speech and Language Therapists DOI: 10.1111/1460-6984.12113

Upload: marjorie-tovar

Post on 18-Feb-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 1/17

INT J LANG COMMUN DISORD,   JANUARY–FEBRUARY   2015,

VOL. 50,  NO. 1, 14–30

Review 

Event- and interval-based measurement of stuttering: a review 

 Ana Rita S. Valente†‡, Luis M. T. Jesus†§, Andreia Hall†¶ and Margaret Leahy †Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal‡Department of Education (DE), University of Aveiro, Aveiro, Portugal§School of Health Sciences (ESSUA), University of Aveiro, Aveiro, Portugal¶Department of Mathematics, University of Aveiro, Aveiro, PortugalDepartment of Clinical Speech and Language Studies, Trinity College Dublin, Dublin, Ireland

(Received October 2013; accepted April 2014 )

 Abstract 

Background: Event- and interval-based measurements are two different ways of computing frequency of stuttering.Interval-based methodology emerged as an alternative measure to overcome problems associated with reproducibil-

ity in the event-based methodology. No review has been made to study the effect of methodological factors ininterval-based absolute reliability data or to compute the agreement between the two methodologies in terms of inter-judge, intra-judge and accuracy (i.e., correspondence between raters’ scores and an established criterion). Aims: To provide a review related to reproducibility of event-based and time-interval measurement, and to verify theeffect of methodological factors (training, experience, interval duration, sample presentation order and judgmentconditions) on agreement of time-interval measurement; in addition, to determine if it is possible to quantify theagreement between the two methodologies Methods & Procedures: The first twoauthors searched for articles on ERIC, MEDLINE, PubMed, B-on, CENTRALand Dissertation Abstracts during January–February 2013 and retrieved 495 articles. Forty-eight articles wereselected for review. Content tables were constructed with the main findings. Main Contribution:  Articles related to event-based measurements revealed values of inter- and intra-judge greaterthan 0.70 and agreement percentages beyond 80%. The articles related to time-interval measures revealed that,in general, judges with more experience with stuttering presented significantly higher levels of intra- and inter- judge agreement. Inter- and intra-judge values were beyond the references for high reproducibility values for

both methodologies. Accuracy (regarding the closeness of raters’ judgements with an established criterion), intra-and inter-judge agreement were higher for trained groups when compared with non-trained groups. Samplepresentation order and audio/video conditions did not result in differences in inter- or intra-judge results. A duration of 5 s for an interval appears to be an acceptable agreement. Explanation for high reproducibility valuesas well as parameter choice to report those data are discussed.Conclusions & Implications: Both interval- and event-based methodologies used trained or experienced judges forinter- and intra-judge determination and data were beyond the references for good reproducibility values. Inter-and intra-judge values were reported in different metric scales among event- and interval-based methods studies,making it unfeasible to quantify the agreement between the two methods.

Keywords:  stuttering, event-based measurement, interval-based measurement, inter-judge, intra-judge, accuracy.

 What this paper adds?

 A systematic review has been completed regarding event- and interval-based measurements of frequency of stuttering.Interval-based measurement was developed to overcome reproducibility problems from event-based methodology.This study was developed to verify if it is possible to quantify the agreement between the two methods to conclude

 what is the most reproducible or more accurate way to assess stuttering frequency and to analyse the effect of methodological factors on agreement of time-interval measurement. Both methodologies presented agreement valuesbeyond the highest references for all metric scales. It was not possible to perform a comparison study betweenmethods, so it would be useful in future research to quantify the agreement between the two assessment methods.

 Address correspondence to: Ana Rita S. Valente, Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro,P-3810–193 Aveiro, Portugal; e-mail: [email protected].

International Journal of Language & Communication Disorders 

ISSN 1368-2822 print/ISSN 1460-6984 online   C 2014 Royal College of Speech and Language TherapistsDOI: 10.1111/1460-6984.12113

Page 2: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 2/17

Event- and interval-based measurement of stuttering: a review    15

Introduction

Several authors have shown that therapy for adults whostutter (AWS) using behavioural treatment (e.g., fluency shaping and/or stuttering modification approaches) canbe effective, resulting in a significant decrease in stutter-ing frequency (Andrews et al. 1980, Bothe et al. 2006,

Herder   et al.   2006). The frequency or percentage of stuttered syllables (%SS) or words (%SW) has beenused to assess changes in fluency as a result of inter-vention. Stuttering behaviours that include part-wordrepetitions, prolongations and blocks (or other definedstutters) are considered as core behaviours of the disorder(Van Riper 1982) andthese are determined andcounted,and their occurrence reported as a frequency or percent-age (Brundage et al. 2006). As an important part of theassessment researchers and clinicians suggest that repro-ducibility of frequency estimation should be discussed todevelop ‘reliable and valid measurements [ . . . ] for doc-umenting treatment candidacy and outcomes, and forresearch and replications’ (Brundage et al. 2006: 272).

To assess the ‘degree to which repeated measurementin stable study objects, often persons, provide similarresults’ reliability and agreement parameters were esti-mated (de Vet et al. 2006: 1033). The focus of the con-cepts were different: agreement (i.e., absolute reliability)refers to the closeness of repeated measurements (by dif-ferent judges or by the same judge on various situations)concerning measurement error (Cordes 1994, Cordesand Ingham 1994b); reliability (i.e., relative reliability)explains how well subjects can be distinguished regard-less of measurement error, and is related to dependabil-

ity of obtained data or test scores (Cordes 1994). As anumbrella term, the word ‘reproducibility’ has been usedto refer both concepts (de Vet  et al.  2006). Percentageagreement, standard error of measurement (SEM) andlimits of agreement are typical parameters for agreement,and intra-class correlation (ICC) is a common index forreliability reports (de Vet et al. 2006, Gisev  et al. 2013).

Frequency counts in the assessment of stuttering are a popular outcome measure, with a widespread use.However, studies of listener agreement for the occur-rence of stuttering events have reported unit-by-unitinter-judge agreement to be below 60% (Curlee 1981,

Coyle and Mallard 1979, Emerick 1960, MacDonaldand Martin 1973, Young 1975, Martin and Haroldson1981, Martin etal. 1988). There is also evidence of inter-clinic disagreement for elements of frequency counts,including: total count of stuttering events (Kully andBoberg 1988, Ham 1989, Cordes et al.  1992); speech judged as stuttered based on perceptual definitions, i.e.,threshold for stuttering is not fixed for any judge or formany judges; speech judged as stuttered based on differ-ent lists of disfluency types (Cordes 2000, Einarsdottirand Ingham 2005, Young 1975). Such lack of agreement

can lead to different diagnosis or severity descriptionat different clinics, or even to different definitions of stuttering (Brundage et al.  2006). In addition, the rel-evance of estimating severity on the basis of frequency of stuttering events has been criticized by Leith  et al.(1993) because of the lack of valid and reliable severity instruments, and because elements of stuttering (such asblocks accompanied by tension) may indicate the emo-tional involvement of people who stutter (PWS) whenstuttering.

Other studies based on total counts of stuttering (Craig 1990, Gaines   et al.  1991, Guitar   et al.  1992,Lewis 1991, Onslow   et al.  1990, Prins and Hubbard1990, Zebrowski 1991) reported high estimates of inter-and intra-judge agreement (above 90%). Many prob-lems have been demonstrated in frequency counts of the studies mentioned above, e.g., the high agreementfor total counts of stuttering judgments produced highvalues, but many of these agreement estimates neither

presented judgment agreement for total events counts,nor corroborated that the data and experimental analy-sis were unaffected by differences between and among  judges. Such problems lead to the conclusion that dif-ferences between low agreement reported in stuttering measurement research and the high agreement reportedin other stuttering research may be more apparent thanreal (Cordes and Ingham 1994b).

 An alternative measurement system is based onbehaviours occurring within a defined time-interval(Cordes   et al.   1992). This methodological variation(time-interval analysis) was introduced by Schloss   et 

al. (1987) and does not focus on individual stuttering events but on the presence of a stuttering event within a time interval (Alpermann et al. 2012). Alpermann et al.(2012) showed that the ease of judging intervals insteadof individual events could make this type of measure-ment more feasible for clinicians.

In the time-interval methodology, video or audiosamples of speech are divided into short intervals of timeallowing judges to make a binary judgment (Brundageet al.   2006, Cordes   et al.   1992). Intervals that con-tain at least one stuttering event judgment are denomi-nated stuttered intervals and intervals with no stuttering event judgments are labelled as non-stuttered intervals

(Cordes et al.  1992). In this context, stuttering eventsare defined as ‘whatever specified observers point to asstuttering’ (Bloodstein 1990: 392) based on a modifiedversion of the perceptual definition of stuttering (Mar-tin and Haroldson 1981, Bloodstein and Ratner 2008).Two additional supplements are added in order to con-trol judgment idiosyncrasies: judges should only identify stuttered disfluencies; judges should consider that onestuttering event may encompass more than one syllableor more than one word (Cordes et al. 1992).

Page 3: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 3/17

16   Ana Rita S. Valente  et al.

One important implication of time-interval analysisis that training based on this methodology increasesagreement and accuracy (i.e., correspondence judgmentof stuttered or non-stuttered interval with the criterionestablished by a group of authoritative judges (Cordesand Ingham 1999)) of inexperienced judges (Cordes andIngham 1996, 1999, Einarsdottir and Ingham 2005,Ingham et al. 1993b, Yaruss 1997, Cordes and Ingham1994a), making judgments more replicable and precise(Cordes and Ingham 1994a). Ingham et al. (1998b) de-veloped a software and hardware system based on time-interval methodology—Stuttering Measurement Assess-ment and Training (SMAAT)—that include judgmentstimulus from PWS (adults) divided into 5 s intervalsand as judged by 10 highly experienced authorities. Theuse of the SMAAT program improves inter-judge agree-ment, intra-judge agreement and accuracy in untrained judges (Cordes and Ingham 1999). Ingham et al. (1999)developed the Stuttering Measurement System (SMS),

 which is a standardized training programme to makereal-time judgments who stutter in terms of speech rate,stuttering frequency and speech naturalness, in childrenand adults samples previously assessed by experts.The frequency data are computed in an event-basedmethodology, but the SMS program allows the option of calculating stuttering frequency using an interval-basedmethodology (the number of non-stuttered intervalscan also be used to estimate the stutter-free speechrate).

 A modified approach of time-interval analysis withthree categories (fluent intervals, stuttered intervals and

intervals with trained speech pattern), is currently be-ing studied (Alpermann et al. 2010, 2012). The modi-fied time-interval analysis was introduced by Natke and Alpermann (2010) to distinguish the use of trainedspeech patterns (acquired during stuttering therapy)from stuttered and fluent speech by means of time-interval analysis. Alpermann  et al.  (2010, 2012) con-cluded that modified time-interval analysis is a reliableand valid measurement technique.

Time-interval-based measurement is proposed as a focus of attention here, based on concerns regarding theagreement of counting instances of stuttering (Yaruss1997). The comparison between a new measurement

method (i.e., time-interval) and one previouslyproposed(i.e., event-based) is necessary to assess how this new methodology is likely to differ from the established one,by computing their agreement (Bland and Altman 2010,Bartlett and Frost 2008).

This paper presents a review of event-based mea-surement and time-interval analysis in terms of repro-ducibility data, as well as a discussion related to the com-parison between the two methodologies. It also presentsa review regarding the effect of different methodologi-cal factors (i.e., training, experience, interval duration,

sample presentation order and judgment conditions) oninter- and intra-judge agreement related to the time-interval method.

 Aims

This paper aims:

    To review and discuss reproducibility data of event-based analysis and time-interval analysis.

    To discuss the effect of different methodologi-cal factors (training, experience, interval duration,sample presentation order and judgment con-ditions) on inter- and intra-judge agreement of time-interval method.

    To discuss the comparison between the twomethodologies in terms of which one representsthe most reproducible or more accurate way to as-sess stuttering frequency (event-based or interval-based) based on selected studies.

Methods

Review 

The first two authors of this paper conducted inde-pendent literature searches to locate as many relevantstudies as possible. A systematic search of literature was conducted on electronic databases (ERIC, MED-LINE, PubMed, B-on and Cochrane Central registerof Controlled Trials and Dissertation Abstracts) dur-

ing January–February 2013. Reviewers looked for stud-ies related to accuracy and reproducibility of time-interval measurement and with reproducibility data for event-based measurement, from 1992 (year af-ter which the two measurement systems coexisted)to the present year. Keywords included: ‘stuttering’,‘stammering’, ‘stuttering judgment’, ‘time-interval mea-surement’, ‘time-interval analysis’, ‘intra-judge time-interval’, ‘inter-judge time-interval’, ‘inter-judge agree-ment’, ‘inter-judge reliability’, ‘intra-judge agreement’,‘intra-judge reliability’, ‘percentage of syllable stuttered’,and ‘agreement of percentage of syllable stuttered’. Thetwo reviewers also attempted to identify studies pub-

lished in languages other than English.Each citation abstract retrieved was reviewed inde-

pendently by the two reviewers in order to assess theappropriateness of the study for inclusion in the system-atic review. If it was indicated by the abstract that thestudy met the inclusion criteria (or if the abstract wasunclear), the full text of the article was obtained and as-sessed for inclusion in the review, based on the inclusioncriteria presented below.

 When a complete copy of all potential studies wasobtained, an independent evaluation of each study was

Page 4: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 4/17

Event- and interval-based measurement of stuttering: a review    17

conducted by the two reviewers, in order to produce a final inclusion decision. In case of disagreement, review-ers discussed key issues until a consensus decision wasattained.

Inclusion and exclusion criteria 

For a study to be included in this review, the reviewersused the following inclusion and exclusion criteria:

    Participants characteristics: the participants werethose who had been diagnosed as PWS (ado-lescents and adults) without other co-occurring disorders, using labels such as ‘stuttering’ or‘stammering’.

     Judges characteristics: the judges were profession-als with different backgrounds that have diverseexperience levels related to stuttering.

   

Nature of the intervention: studies that re-ported an intervention refer to the use of a training programme using judgment training based on event- and interval-based method-ologies. The reviewers excluded studies basedon other methodologies, such as self-judgment,time-modified intervals and transcription-basedtechniques.

    Outcome variables: the outcomes included wereaccuracy, number of stuttered intervals, %SS,ICC, Pearson product-moment coefficient cor-relation and Cohen’s kappa. When the metricscale was not described, the studies were excluded.The outcome variables related to validity, graph-ical representation and disfluency-types were notincluded.

    Methodological quality: no restrictions were im-posed regarding methodological quality.

Coding 

Papers meeting the inclusion criteria were coded for judgment experience, outcome and methodological fac-tor as well as statistical method used. The coding wasconducted independently by the two reviewers, each

using a complete copy of the retrieved article. Agree-ments between the reviewers occurred for 95% of theoccasions and the disagreements were solved through a discussion.

Biostatistical methods 

Both reviewers individually extracted data related tointer- and intra-judge values as well as accuracy fromselected studies. Data from each study are summarizedin tables A1, B1 and B2 in the appendices.

Results

Overview of the literature search

The reviewers identified a total of 495 articles via initialsearch, based on different word combinations. Thearticles were flagged based on title and abstract andrepeated articles were excluded. Following evaluationof the abstract and methodology, the number of articles was reduced to 70 for further review. The articles wereexcluded due to repetitions observed on electronicdatabases and because of the absence of inter- andintra-judge values. The detailed review of the 70studies based on the exclusion criteria (i.e., articles withreproducibility data related to self-judgement, modifiertime-interval, transcription-based or disfluency types,as well as graphical representation and a logarithmicscale transformation of values were excluded) revealedthat 48 articles presented inter-judge and/or intra-judgevalues that could be used in the review. Data are shown

in tables A1, B1 and B2 in the appendices.

Characteristics of selected studies and results summary 

The majority of studies related to event-based method-ology reported indices of reliability. However, some se-lected studies used the concepts of agreement and re-liability interchangeably, not being specific about thereproducibility concept reported. For that reason, thereviewers decided to use the term ‘reproducibility’ whenreferring to event-based methodology and agreement

 when mentioning time-interval methodology.

Event-based methodology 

 A total of 32 studies related to event-based methodology  were selected (see table A1 in appendix A). The aims andobjectives of the selected studies refer to a broad scope(not specific to the study of the impact of methodolog-ical factors on inter- and intra-judge agreement). How-ever, reproducibility data of %SS was a common out-come measure between the studies. Those studies weremostly related to factors that included: the study of sev-

eral stuttering intervention programmes (e.g., Camper-down, SpeechEasy, Stuttering Modification Programs orcognitive–behavioural therapy); the effect of delayed au-ditory feedback (DAF) and frequency altered feedback (FAF); the effect of personality traits and temperamenton stuttering frequency.

Event-based methodology is used for frequency calculation in assessment instruments, such as theStuttering Severity Instrument—4 (SSI-4) (Riley 2009)or the Test of Childhood Stuttering (TOCS) (Gillamet al.  2009). SSI-4 is a frequently used instrument for

Page 5: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 5/17

18   Ana Rita S. Valente  et al.

determining aspects of stuttering severity in childrenand adults. This instrument uses %SS for frequency computing. TOCS is used to assess speech fluency skills and stuttering-related behaviours in children4–12 years old (Gillam et al. 2009). For the assessmentof stuttering frequency, TOCS uses the percentage of stuttered words. The use of event-based methodology in standardized tests, in research and in clinicaldecision-making process led to the widespread use of this methodology (Yaruss 1997).

Similar to some interval-based methodology studies(Finn 1997, Godinho et al. 2006, Ingham and Cordes1997, Ingham   et al.   1995), the majority of the se-lected event-based studies described judges as experi-enced (researchers or speech and language therapists—SLT), trained or enrolled in a training programme, asexperienced and trained judges were, in general, morereliable than judges with no experience or with no train-ing (Ingham et al.  1993b, Cordes and Ingham 1994a,

1996, 1999, Einarsdottir and Ingham 2008). This wasobserved across the studies, given that the inter- andintra-judge reproducibility exhibit high values for all in-dices (Hinkle et al. 2003, Landis and Koch 1977, Nun-nally and Bernestein 1994, Baer  et al.  1987, McHugh2012, Cordes et al. 1992).

Interval-based methodology 

 A total of 16 studies related to time-interval analy-sis methodology were selected (see table B1 and B2in appendix B). Ten studied the influence of differ-

ent methodological factors (training, experience, inter-val duration, sample presentation order and judgmentconditions) on inter- and intra-judge agreement.

 All but one of the selected studies (Einarsdottir andIngham 2008) was conducted with speech samples of adolescent/adult population.

Five studies conducted training programmes with judges to discriminate between agreed intervals of stut-tered and non-stuttered speech (Ingham  et al.  1993b,Cordes and Ingham 1994a, 1996, 1999, Einarsdottirand Ingham 2008). Two (Einarsdottir and Ingham2008; Ingham  et al.  1998a) of these five studies useda software and hardware system known as Stuttering 

Measurement Assessment and Training (Ingham  et al.1998a).

Four studies conducted an investigation with judges who had different stuttering judgment experiences (In-gham et al.  1993a, Brundage et al.  2006, Cordes et al.1992, Cordes and Ingham 1994c). These studies showedthat, in general, judges with more experience with stut-tering presented significantly higher levels of interval-by-interval intra- and inter-judge agreement than judges with less experience (Cordes et al. 1992).

Two studies investigated the effects of interval dura-tion on observers’ judgments of stuttering (Cordes and

Ingham 1994c, Cordes et al. 1992). There does not ap-pear to be one obvious interval duration that satisfies allthe factors that appear to be relevant in selecting intervalduration. Nevertheless, some findings suggest the use of intervals with a duration between 3 and 5 s (Cordes andIngham 1994c).

Two studies reported inter-clinic comparisons of stuttering judgments. One was related to graduate andundergraduate students (Ingham et al. 1993a); the other was about recognized authorities in stuttering research(Cordes and Ingham 1995). Inter-clinic comparisons re-vealed that judges from the same research centre tendedto do similar stuttered judgments. However, there wasa great amount of disagreement between authorities instuttering research from different research centres andan uniformly high intra-judge agreement among judges(Cordes and Ingham 1995).

One of the selected studies explored differences instuttering judgments related to sample presentation or-

der (Ingham  et al.  1993b) and one other study exam-ined differences in audiovisual and audio-only judgmentconditions (Ingham et al. 1993a). No significant differ-ences were found across sample presentation order or foraudiovisual/audio-only conditions (Ingham etal. 1993a,1993b).

Six studies (Ingham   et al.  1995, 1997, Fox   et al.2000, Finn 1997, Godinho  et al.  2006, Ingham andCordes 1997) had different aims and objectives, but re-ported inter- and intra-judge agreement of stuttering frequency using time-interval analysis technique. Threeof these six studies (Ingham et al. 1995, Finn 1997, In-

gham and Cordes 1997) describe judges as experiencedresearchers or SLT. Accuracy, intra- and inter-judge agreement was

higher for trained groups when compared with non-trained groups (Ingham  et al.  1993b, Cordes and In-gham 1994a, 1996, 1999, Einarsdottir and Ingham2008, Godinho et al. 2006). Comparing stuttered inter-vals judgments and non-stuttered intervals judgmentsmade by students and clinicians with judgments madeby highly experienced judges, the accuracy levels werehigher for non-stuttered intervals than for stuttered in-tervals (Brundage et al. 2006).

Discussion

Inter- and intra-judge reproducibility of event-based methodology 

Reproducibility data related to %SS were reported asPearson product-moment correlation, Cohen’s kappa,ICC and agreement percentage (see table A1 in ap-pendix A). Pearson product-moment correlation val-ues for inter-judge ranged between 0.88 and 0.99 andintra-judge data ranged between 0.86 and 1.0. The

Page 6: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 6/17

Event- and interval-based measurement of stuttering: a review    19

values represent a high correlation (Hinkle   et al.2003).

Cohen’s Kappa inter-judge values ranged from 0.73to 0.98 and intra-judge values varied between 0.76 and0.93, which can be interpreted as a substantial to almostperfect agreement correlation (Landis and Koch 1977).

ICC inter-judge values ranged was between 0.83and 1.00; intra-judge ranged between 0.89 and 0.99.The reported values mean that more than 89% of theobserved score variance is due to true score varianceand that the balance of the variance (1 minus the ICCvalue) is attributable to error (Weir 2005). The reportedoutcomes were greater than 0.70, which is classified asgood (Nunnally and Bernestein 1994).

Studies that report agreement percentage presentinter-judge values that vary between 88% and 98% andintra-judge varying from 95% to 99%. These values were above the minimum acceptable agreement (i.e., >80%) (McHugh 2012, Baer et al. 1987).

The low number of samples/syllables as well as thesmall number of judges used to assess them can helpto explain the high levels of intra- and inter-judge val-ues. Also the use of experienced and trained judges,could also have contributed to the high values reported,as stated in the SSI-4 manual ‘with practice most clini-cians can achieve self-agreement of 85% or higher; [ . . . ]more stringent training procedures should achieve 90%’(Riley 2009: 17). Frequency estimation through event-based methodology presents some other problems re-lated to the reproducibility indices chosen. ICC andPerson product-moment correlation (the more usual in-

dices) are not a report of absolute reliability but relativereliability. Absolute reliability (i.e., agreement) is ‘prefer-able in all situations in which the instrument will be usedfor evaluation purposes’ and is a characteristic of the in-strument instead of a performance with a specific sample(relative reliability) (de Vet et al.  2006: 1038). Relativereliability is ‘highly dependent on the variation in thepopulation sample’ and is ‘only generalisable to sam-ples with a similar variation’ (de Vet et al. 2006: 1038).However, there is no published report of absolute re-liability using more appropriate indices, e.g., limits of agreement, ratio limits of agreement or standard errorof measurement (Gisev  et al. 2013, de Vet et al. 2006).

Inter- and intra-judge agreement of interval-based methodology 

Inter- and intra-judge agreement data related to interval-based technique were reported as agreement percentagebetween or within judges. Agreement will be discussedin terms of the impact of methodological factors.

The studies (Ingham et al. 1993a, Cordes and Ing-ham 1994c, 1995, Brundage  et al.  2006, Cordes et al.

1992) in which training was not applied showed thatintra-judge agreement (reported as agreement percent-age) of time-interval measurement ranged between a mean of 76% and 98% for more experienced judges,and between 60% and 96%, for inexperienced judges.Values of inter-judge agreement (reported as agreementpercentage) for experienced judges ranged between 57%and 87%. Inexperienced inter-judge agreement valuesranged between 74% and 89%. Generally, studies con-cluded that inter- and intra-judge agreement were higherin more experienced judges (Ingham et al. 1993a). Nev-ertheless, different studies presented distinct results con-cerning the experience with stuttering, mainly due toill-defined judges’ experience with stuttering (Inghamet al.   1993a). In this sense, more recent studies thatdo not analyse the influence of methodological factors(Ingham et al.  1995, Finn 1997, Godinho et al.  2006,Ingham and Cordes 1997) used experienced judges foragreement purposes based on assumption that, gener-

ally, more experienced judges reached better agreementvalues.

The studies in which training was applied to in-experienced judges led to a higher increment in accu-racy, inter- and intra-judge agreement in experimentalgroups, when compared with control groups. Inghamet al.’s (1993b) study of inter-judge agreement results were 71% in pre-training and increased to 86% aftertraining. The accuracy decreased from 5.0 errors (i.e.,a non-correspondence between judges’ scores and the judgment defined by a group of authorities in stutter-ing assessment) on pre-training to 1.6 errors on post-

training. Cordes and Ingham’s (1999) study, training lead to a greater accuracy, whether judges needed to learnto match more stuttering eventsor learn not to over iden-tify non-stuttered intervals. Accuracy and inter-judgeagreement also increased for samples who had not beenused for training, which means that training had effectfor familiar (i.e., samples used during training sessions)and unfamiliar samples (i.e., samples not used during training sessions). Related to training with highly agreedintervals (i.e., 100% agreement), it has been shown thatit is more effective for intra-judge agreement (increasefrom 82% in pre-training to 88% in post-training) andinter-judge (increase from 79–80% to 84–88% in post-

training) than training or exposure to poorly agreed(intra-judge increase from 82% in pre-training to 86%in post-training; inter-judge decrease from 81% in pre-training to 78% in post-training) or randomly selectedintervals (Ingham   et al.   1993b, Cordes and Ingham1994a). Software for judges’ training (SMAAT) withsamples from AWS (Ingham et al.  1998a) or preschoolchildren who stutter (Einarsdottir and Ingham 2008)used highly agreed intervals to improve the correspon-dence between judges’ score and the criterion judge-ment established by a group of authoritative judges in

Page 7: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 7/17

20   Ana Rita S. Valente  et al.

stuttering (Brundage et al. 2006). Cordes and Ingham’s(1999) study also showed that the accuracy increased whether it was assessed with respect to event judgmentsor in terms of interval judgment, either in agreed stut-tered and agreed non-stuttered intervals.

Therefore, results of the studies based on interval-recording training (Ingham  et al.  1993b, Cordes andIngham 1994a, 1996, 1999, Einarsdottir and Ingham2008) or interval analyses of event judgments (Cordesand Ingham 1999) do provide ambiguous evidence thatinterval-based procedures will solve the many problemsrelated to stuttering measurement.

Sample presentation order and audiovisual or audio-only conditions did not reveal different results in inter-and intra-judge agreement (Cordes and Ingham 1996,Ingham et al.  1993a). Audio-only experimental condi-tions (Alpermann et al. 2010) have shown similar resultsto audiovisual conditions (Cordes and Ingham 1995).Martin and Haroldson’s (1992) study on speech nat-

uralness and severity of stuttering, revealed that intra- judge agreement and inter-judge agreement were similarin audio-only or audiovisual conditions. However, thespeech naturalness of stutter samples (judge in a 1–9 naturalness scale) was judged systematically as moreunnatural in audiovisual conditions than under audio-only conditions (Martin and Haroldson 1992). Theauthors argued that the difference could be explainedby the rating procedure (more than as a function of particular speech behaviour). Nevertheless, Alpermannet al.   (2010), concluded that the use of audio-only conditions might hinder the judgment of some types

of stuttering (e.g., inaudible blocks) and Martin andHaroldson (1992) suggested that the audiovisual signalappends visible struggle behaviours to the audio-only signal.

 With regard to sample presentation order, thismethodological factor was studied by Ingham   et al.(1993b), but since then, only two of the selected stud-ies used samples without intervening recording inter-vals (Cordes and Ingham 1994a, 1999). Ingham   et al.   (1993b) argued that randomizing the sample andseparating it by a recorded silent interval provided anadditional control of the judgment task, in order toavoid the influence of behavioural cues from preceding 

intervals. Furthermore, Ingham  et al.   (1993b) arguedthat the absence of differences in inter-judge agreementon different sample presentation order may be due tothe experience acquired during the study. Cordes andIngham (1994a, 1999) showed that judges could betrained to achieve better agreement and accuracy val-ues in a continuous-interval format, which means thattime-interval procedures could be used effectively withspontaneous speech samples in clinical settings.

Duration is another methodological factor exploredin two of the selected studies (Cordes and Ingham1994c, Cordes et al. 1992). Cordes et al. (1992) foundthat inter-judge agreement was maximized at greaterthan chance levels for intervals between 3 and 4 s.Two further studies (Ingham   et al.   1993a, 1993b)used 4 s interval-recordings. However, the 4 s inter-val duration was chosen based on interval analysis of event-recorded data, which means that this intervalcould not be the best time interval for interval judg-ments (Cordes and Ingham 1994c). Cordes and Ingham(1994c) made direct comparisons between different in-terval durations, within an interval-recording judgmentsystem and proposed the use of intervals with a du-ration between 3 and 5 s with acceptable agreement. After Cordes and Ingham’s (1994c) study, all the se-lected studies for the systematic review used 5 s intervalduration.

The appropriateness of using interval-based

methodology has been discussed in the literature. It was stated that the subdivision of speech into intervalspresented randomly to the judge affects the assessmentof co-articulatory gestures, insofar as the ‘speech gesturescan be influenced by the preceding and following ges-tures [ . . . ]’ and its duration can be longer than the in-terval duration (Guntupalli et al. 2006, p. 10). The sameauthors also refer to under- or over-count of stuttering events that can occur in time-interval based methodol-ogy. Dueto issues that were discussed by Guntupalli etal.(2006), the increase in agreement produces a decrease invalidity. The duration of time intervals has also been dis-

cussed. Howell et al. (1998) and Howell (2005) suggestthat long intervals lead to a ceiling effect with all intervalsbeing judged as stuttered, which impairs the detection of changes during intervention. Howell (2005) also foundthat the effect of interval duration is dependent on theseverity of stuttering (i.e., the more severe the stuttering,the more likely intervals were to be judged consistentlyasstuttered).

 Yaruss (1997) also discusses some concerns relatedto time-interval methodology. The author argues thatthis methodology cannot be easily transferred to clinicalsettings and nor can it be employed during the clinicaldecision making process with clinical utility (i.e., time-

interval does not present a frequency value with clinicalsignificance).

Despite the disadvantages noted in the literaturerelated to the use of interval based methodology inthe assessment of stuttering, time interval methodshave often been used to estimate behaviours dueto the impracticality of focusing on multiple cate-gories of behaviour/multiple subjects (Meany-Daboulet al. 2007).

Page 8: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 8/17

Event- and interval-based measurement of stuttering: a review    21

Is it possible to quantify which of the twomethodologies represents the most reproducible or more accurate way to assess stuttering frequency based on selected studies? 

To analyse which of the two methodologies presentsbetter inter-judge, intra-judge or accuracy related to as-

sessment of the frequency, a method comparison study  would be necessary in order to inspect the differencesbetween measurements made by the two methods. Thetwo methods should be in the same metric or scale (i.e.,equal agreement index estimated from both methodolo-gies during a pair of measurements from each subject)to quantify their agreement (Bartlett and Frost 2008).

The stuttering frequency of the same sample of sub- jects needs to be measured (and the calculation of inter- judge, intra-judge or accuracy needs to be made) withboth measurement methodologies (Bartlett and Frost2008) to quantity the agreement between methods and

to assess which method would give the least inter-raterand inter-method variation (Bowers  et al.   2009). Re-searchers can visually assess the agreement/disagreementbetween measurement techniques based on graphicaltechniques such as the one proposed by Bland and Altman (1986). Data from the subject’s measurementsmade by the two methods could be plot (i.e., the dif-ference between measurements against the mean) andlimits of agreement could be calculated. Such limits of agreement would represent the range within which it isexpected that most of the differences (95%) lie.

Considering reproducibility data of event-basedmethodology and interval-based methodology it is im-portant to mention the use of the two methodologies inthe same speech samples in two studies, those of Bothe(2008) and Cordes and Ingham (1999). Bothe (2008)assessed the similarities between judgments made in thesame speech sample with interval-based methodology and disfluency types present (a categorical event-basedapproach) in each 5 s interval from 20 children whostutter. The author concluded that the agreement fordisfluency types was very low and that between 54%and 55% of the intervals agreed by the highly experi-enced judges (to be stuttered or non-stuttered) could becategorized similarly on the basis of disfluency types data 

(if a disfluency type was accepted as present based on80% of judgments). In spite of the comparison of binary stuttered/non-stuttered judgment and disfluency types,it is not possible to quantify theagreement/disagreementbecause variables are different and with obviously dis-tinct metrics. The other study (Cordes and Ingham1999) used a previously developed software and hard- ware system for training (SMAAT) to investigate if a training programme based on time-interval methodol-ogy could improve the judgement of stuttering events. After training, inter-judge, intra-judge and accuracy val-

ues increased (agreement was only calculated for time-interval methodology and accuracy for both methodolo-gies). Even though the accuracy data were calculated forboth methods on the same speech sample, quantitativespecific data does not exist for event-based methodology,making it unfeasible to compare the two methods.

Based on the data of selected studies, presented intables A1, B1 and B2 in the appendices, related to inter- judge, intra-judge or accuracy for the two methodolo-gies, it can be concluded that it was not possible toquantify which of the methodologies represent the mostreproducible and more accurate method to assess stut-tering frequency because the majority of event-basedmethodology studies reported relative reliability indicesand interval-based methodology studies reported indicesof absolute reliability (i.e., percentage of agreement);further, in the majority of studies, the two measure-ment methods were not used in the same speech sample(Bartlett and Frost 2008). It was not possible to assess

the consistency of study outcomes due to the lack of data (i.e., confidence intervals reported or the necessary data to compute).

 Although a method comparison study has not beenimplemented, many studies reported that time-interval-based measure may lead to a ‘markedly’ (Ingham andCordes 1997) or ‘demonstrably’ (Brundage et al. 2006)better inter- or intra-judge agreement than the tradi-tional obtained with procedures required to identify events (%SS). Some studies (Curlee 1981, Coyle andMallard 1979, Emerick 1960, MacDonald and Martin1973, Young 1975, Martin and Haroldson 1981, Mar-

tin et al.  1988) about listeners’ agreement of stuttering events (which reported unit-by-unit inter-judge agree-ment below 60%, as reported by Cordes   et al.  1992)used an agreement index developed by Young (1975)based on the number of observers (n), the total numberof words marked as stuttered (T ) and the total numberof different words marked as stuttered (T d):

 Agreement index (Young 1975) =

  1

n − 1

[(

 T 

T d) − 1]

The interval-based methodology estimates the per-centage of judges’ agreement for the occurrence of stut-tering based on the positive agreement index (PA index),

in which the number of positive intervals (i.e., NP, aninterval in which 80% or more of the judges recog-nized at least one stuttering event) and the number of ambiguous intervals (NA) were considered as factors:

PA index =N P /(N P  + N A)

Despite the same type of metric scale (percentage of agreement, which is an index of absolute reliability), thetwo measurement techniques are still not comparabledue to estimation differences inherent in the formulasused for calculation.

Page 9: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 9/17

22   Ana Rita S. Valente  et al.

Study limitations 

The main limitation of the present study was related tothe statistical methods used in the selected studies forthe review. Specifically, the majority of selected studiesthat used event-based methodology present indices of relative reliability (e.g., Pearson product-moment corre-

lation and ICC), which are less stable over different pop-ulation than agreement indices. The number of studies with reproducibility values reported is reduced (n= 48),compared with the number of studies retrieved from thedatabase initial search (n  =  495). The lack of studiesreporting agreement indices leads to uncertainty relatedto the closeness of the scores stemming from the event-based methodology during repeated measurements (deVet et al.  2006). It should also be noted that althoughthe large number of studies (n = 495) identified in ini-tial search, only 48 present sufficient agreement data suitable for review. The lack of agreement data makesit impossible to generalize experimental effects to otherpopulations, as well make conclusions about their con-sistency or stability in replications (Breakwell etal. 2012,de Vet et al. 2006).

Conclusions

‘The frequency of fluency breaks is often one of the mostobvious aspects of the problem and to some degree im-pacts the perception of severity, particularly by the lis-tener’ (Manning 2010: 163). Researchers and cliniciansconcur that objective definitions of stuttering, and reli-

able and valid measurements procedures are both desir-able and a core necessity for treatment outcome researchand replication (Brundage et al. 2006).

Both interval- and event-based methodologies useexperienced and/or trained judges for reproducibility purposes. Trained judges as well as the small number of samples used for reassessment in event-based methodol-ogy studies could contribute to agreement values beyondthe references for good reproducibility values in all in-dices (Hinkle et al. 2003, Landis and Koch 1977, Nun-nally and Bernestein 1994, Baer  et al.  1987, McHugh2012). It can also be concluded that it was impractica-bleto quantify theagreement between inter-judge, intra-

 judge or accuracy measurements made by the two meth-ods in the selected studies (Bartlett and Frost 2008).

Frequency is an aspect that should be relatively easy to calculate, but it is difficult to obtain reliable tabula-tions of event counts (Curlee 1981, Coyle and Mallard1979, Emerick 1960, MacDonald and Martin 1973, Young 1975, Martin and Haroldson 1981, Martinet al. 1988, Ham 1989, Kully and Boberg 1988, Cordeset al.   1992). Researchers attempted to minimize thisproblem by developing better descriptions of stuttering (Packman and Onslow 1998), developing new methods

for the detailed written analysis of stuttering (Yarusset al. 1998), developing training methods (Cordes andIngham 1999, Ingham and Cordes 1997) or using con-sensus judgment procedures (Cordes 2000). Stuttering,as a complex behaviour, can be measured with differentmethods, based on the desired parameters and theaims of the assessment. With the ideas reported in thepresent paper, clinicians/researchers should reflect uponthe influence of methodological factors and choose themost appropriate frequency assessment method.

 A suggestion for future research is to implement a method comparison study between event- and interval-based methodologies, both in the same metric scale of agreement values. The aim of such a (method compar-ison) study would be to verify whether the measure-ments made by the two methods are ‘sufficiently close’or whether the methods can coexist due to good agree-ment in terms of reproducibility values (Bartlett andFrost 2008).

 Acknowledgments

This study was developed as part of the Ph.D. Thesis of thefirst author at the University of Aveiro, Portugal. This work waspartially funded by National Funds through FCT—Foundationfor Science and Technology, in the context of project PEst-OE/EEI/UI0127/2014. This research was also partly supported by a doctoral grant (SFRH/BD/78311/2011) from FCT to Ana Rita S. Valente. Declaration of interest : The authors report no conflictsof interest. The authors alone are responsible for the content and writing of the paper.

References

 A LPERMANN, A., HUBER , W., N ATKE, U. and W ILLMES, K., 2010,Measurement of trained speech patterns in stuttering: inter- judge and intra-judge agreement of experts by means of mod-ified time-interval analysis.  Journal of Fluency Disorders ,  35 ,299–313.

 A LPERMANN, A., HUBER , W., N ATKE, U. and W ILLMES, K., 2012,Construct validity of modified time-interval analysis in mea-suring stuttering and trained speaking patterns.  Journal of  Fluency Disorders , 37, 42–53.

 A NDREWS,G.,GUITAR ,B.andHOWIE, P., 1980, Meta-analysis of theeffect of stuttering treatment.  Journal of Speech and Hearing Research, 45, 287–307.

 A NTIPOVA , E. A., PURDY , S. C., BLAKELEY , M. and W ILLIAMS, S.,

2008, Effects of altered auditory feedback (AAF) on stuttering frequency during monologue speech production.   Journal of   Fluency Disorders , 33, 274–290.

 A RMSON, J . an d K  IEFTE, M., 2008, The effect of SpeechEasy on stut-tering frequency, speech rate, and speech naturalness. Journal of Fluency Disorders , 33, 120–134.

 A RMSON, J., K IEFTE, M., M ASON, J. and DE CROOS, D., 2006, Theeffect of SpeechEasy on stuttering frequency in laboratory conditions. Journal of Fluency Disorders , 31, 137–152.

 A RMSON, J. and STUART, A., 1998, Effect of extended exposureto frequency-altered feedback on stuttering during reading and monologue.   Journal of Speech, Language, and Hearing Research, 41, 479–490.

Page 10: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 10/17

Event- and interval-based measurement of stuttering: a review    23

B AER , D. M., W OLF, M. M. and R ISLEY , T. R., 1987, Some still-current dimensions of applied behavior analysis.   Journal of    Applied Behavior Analysis , 20, 313–327.

B ARTLETT, J. W. and FROST, C., 2008, Reliability, repeatability andreproducibility: analysis of measurement errors in continuousvariables. Ultrasound in Obstetrics and Gynecology ,  31 , 466–475.

B AUERLY , K. R. and DE   NIL, L. F., 2011, Speech sequence skill

learning in adults who stutter.   Journal of Fluency Disorders ,36, 349–360.

BEILBY , J. M., B YRNES, M. L. and Y  ARUSS, J. S., 2012, Acceptanceand commitment therapy for adults who stutter: psychosocialadjustment and speech fluency.  Journal of Fluency Disorders ,37, 289–299.

BLAND, J. M. and A LTMAN, D. G., 1986, Statistical methods forassessing agreement between two methods of clinical mea-surement. Lancet , i, 307–310.

BLAND, J. M. and A LTMAN, D. G., 2010, Statistical methods for as-sessing agreement between two methods of clinical measure-ment. International Journal of Nursing Studies , 47, 931–936.

BLOODSTEIN, O., 1990, On pluttering, skivering, and floggering:a commentary. Journal of Speech and Hearing Disorders ,  55,392–393.

BLOODSTEIN, O., and R  ATNER , N. B., 2008, A Handbook On Stut-tering  (Clifton Park: Delmar).

BOBERG, E. and K ULLY , D., 1994, Long term results of an intensivetreatment program for adults and adolescents who stutter. Journal of Speech and Hearing Research, 37, 1050–1059.

BOTHE, A., 2008, Identification of children’s stuttered and nonstut-tered speech by highly experienced judges: binary judgmentsand comparisons with disfluency-types definitions. Journal of  Speech, Language and Hearing Research, 51, 867–878.

BOTHE, A. K., D AVIDOW , J. H., BRAMLETT, R. E. and INGHAM, R.,2006, Stuttering treatment research 1970–2005: I. Systematicreview incorporating trial quality assessment of behavioral,cognitive, and related approaches. American Journal of Speech– Language Pathology , 15, 321–341.

BOWERS, K. M., BELL, D., CHIODINI, P. L., B ARNWELL, J., INCAR -DONA , S., Y EN, S., LUCHAVEZ, J. and W  ATT, H., 2009, Inter-rater reliability of malaria parasite counts and comparison of methods. Malaria Journal , 8, 267–275.

BREAKWELL,G.,SMITH,J .andW RIGHT, D. (Editors), 2012, Research Methods in Psychology , 4th Edition (London: Sage).

BRUNDAGE, S., BOTHE, A., LENGELING, A. K. and EVANS, J., 2006,Comparing judgments of stuttering made by students, clini-cians, and highly experienced judges. Journal of Fluency Dis-orders , 31, 271–283.

CORDES, A., 1994, The reliability of observational data: I. Theoriesand methods for speech–language pathology. Journal of Speechand Hearing Research, 37, 264–278.

CORDES, A. K., 2000, Individual and consensus judments of dis-fluency types in the speech of persons who stutter. Journal of  Speech, Language, and Hearing Research, 43, 951–964.

CORDES, A. and INGHAM, R., 1994a, Time-interval measurementof stuttering: effects of training with highly agreed or poorly agreed exemplars. Journal of Speech and Hearing Research, 37,1295–1307.

CORDES, A. K. and INGHAM, R., 1994b, The reliability of observa-tional data: II. issues in the identification and measurementof stuttering events. Journal of Speech and Hearing Research,37, 279–294.

CORDES, A. K. and INGHAM, R., 1994c, Time-interval measurementof stuttering: effects of interval duration. Journal of Speech and Hearing Research, 37, 779–788.

CORDES, A. K. and INGHAM, R., 1995, Judgments of stutteredand nonstuttered intervals by recognized authorities in stut-tering research.   Journal of Speech and Hearing Research,  38,33–41.

CORDES, A. K. and INGHAM, R., 1996, Time-interval measurementof stuttering: establishing and modifying judgment accuracy. Journal of Speech and Hearing Research, 39, 298–310.

CORDES, A. K. and INGHAM, R., 1999, Effects of time-interval

 judgment training on real-time measurement of stutter-ing.   Journal of Speech, Language, and Hearing Research,  42,862–879.

CORDES,A.K.,INGHAM,R.,FRANK , P. and INGHAM, J.,1992, Time-interval analysis of inter-judge and intra-judge agreement forstuttering event judgment. Journal of Speech and Hearing Re-search, 35, 483–494.

COYLE, M. M. and M ALLARD, A. R., 1979, Word-by-word analysisof observer agreement utilizing audio and audiovisual tech-niques. Journal of Fluency Disorders , 4, 23–28.

CRAIG, A., 1990, An investigation into the relationship betweenanxietyand stuttering. Journal of Speech and Hearing Disorders ,55, 290–294.

CREAM, A., O’BRIAN, S., JONES, M., BLOCK , S., H ARRISON, E.,LINCOLN, M., HEWAT, S., P ACKMAN, A., MENZIES, R. and

ONSLOW , M., 2010, Randomized controlled trial of videoself-modeling following speech restructuring treatment forstuttering. Journal of Speech, Language, and Hearing Research ,53, 887–897.

CURLEE, R. F., 1981, Observer agreement on disfluency and stutter-ing. Journal of Speech and Hearing Research, 24, 595–600.

DE V ET, H. C. W., TERWEE, C. B., K NOL, D. L. and BOUTER , L.M., 2006, When to use agreement versus reliability measures. Journal of Epidemiology , 59, 1033–1039.

EICHST¨ ADT, A., W  ATT, N. and GIRSON, J., 1998, Evaluation of the efficacy of a stutter modification program with particularreference to two new measures of secondary behaviors andcontrol of stuttering.  Journal of Fluency Disorders ,  23, 231–246.

EINARSDOTTIR , J. and INGHAM, R. J., 2005, Have disfluency-typemeasures contributed to the understanding and treatmentof developmental stuttering?   American Journal of Speech– Language Pathology , 14, 231–243.

EINARSDOTTIR , J. and INGHAM, R., 2008, The effect of stuttering measurement training on judging stuttering occurrence inprechool children who stutter.  Journal of Fluency Disorders ,33, 167–179.

EMERICK , L. L., 1960, Extensional definition and attitude towardstuttering.  Journal of Speech and Hearing Research,  3, 181–186.

FINN, P., 1997, Adults recovered from stuttering without formaltreatment: perceptual assessment of speech normalcy. Journal of Speech, Language, and Hearing Research, 40, 821–831.

FOX , P. T., INGHAM, R., INGHAM, J., Z AMARRIPA , F., X IONG, J.and L ANCASTER , J., 2000, Brain correlates of stuttering andsyllable production: a PET performance-correlation analysis.Brain, 123, 1985–2004.

G AINES, N. D., R UNYAN, C. M. and MEYERS, S. C., 1991, A com-parison of young stutterers’ fluent versus stuttered utteranceson measures of length and complexity.  Journal of Speech and Hearing Research, 34, 37–42.

G ALLOP, R. F. and R UNYANB, C. M., 2012, Long-term effectivenessof the SpeechEasy fluency-enhancement device.   Journal of   Fluency Disorders , 37, 334–343.

GILLAM, R. B., LOGAN, K. J. and PEARSON, N. A., 2009,  Test of   Childhood Stuttering  (Austin, TX: PRO-ED).

Page 11: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 11/17

24   Ana Rita S. Valente  et al.

GISEV , N., BELL, J. S. and CHEN, T. F., 2013, Interrater agree-ment and interrater reliability: key concepts, approaches, andapplications.   Research in Social Administrative Pharmacy ,  9,330–338.

GODINHO, T., INGHAM, R., D AVIDOW , J. and COTTON, J., 2006,The distribution of phonated intervals in the speech of indi-viduals who stutter. Journal of Speech, Language and Hearing Disorders , 49, 161–171.

GUITAR , B., SCHAEFER , H. K., DONAHUE-K ILBURG, G. and BOND,L., 1992, Parent verbal interactions and speech rate: a casestudy in stuttering.   Journal of Speech and Hearing Research,35, 742–754.

GUNTUPALLI, V. K., K  ALINOWSKI, J. and S ALTUKLAROGLU, T., 2006,The need for self-report data in the assessment of stuttering therapy efficacy: repetitions and prolongations of speech. Thestuttering syndrome.  International Journal of Language and Communication Disorders , 41, 1–18.

H AM, R. E., 1989, What are we measuring?  Journal of Fluency Dis-orders , 14, 231–243.

HERDER , C., HOWARD, C., N YE, C. and V  ANRYCKEGHEM, M., 2006,Effectiveness of behavioral stuttering treatment: a systematicreview and meta-analysis. Contemporary Issues in Communi-cation Science and Disorders , 33, 61–73.

HINKLE,D.E.,W IERSMA ,W.andJURS, S.G.,2003, Applied Statistics  for the Behavioral Sciences  (Boston, MA: Houghton Mifflin).

HOWELL, P., 2005, The effect of using time intervals of differentlength on judgements about stuttering. Stammering Research,1, 364–374.

HOWELL, P., STAVELEY , A., S ACKIN, S. and R USTIN, L., 1998, Meth-ods of interval selection, presence of noise and their effects ondetectability of repetitions and prolongations. Journal of the  Acoustical Society of America , 104, 3558–3567.

HUBBARD, C. P., 1998, Stuttering, stressed syllables, andword onsets. Journal of Speech, Language, and Hearing Research,  41, 802–808.

HUINCK , W. and R IETVELD, T., 2007, The validity of a simple out-come measure to assess stuttering therapy. Folia Phoniatrica et Logopaedica , 59, 91–99.

INGHAM, R. J., B AKKER , K., MOGLIA , R. and K ILGO, M., 1999,Stuttering Measurement System (SMS)   (Santa Barbara, CA:University of California).

INGHAM, R. J. and CORDES, A. K., 1997, Identifying the authorita-tive judgments of stuttering: comparisons of self-judgmentsand observer judgments.   Journal of Speech, Language and Hearing Research, 40, 581–594.

INGHAM, R., CORDES, A. K. and FINN, P., 1993a, Time-intervalmeasurement of stuttering: systematic replication of Ingham,Cordes, and Gow (1993).  Journal of Speech and Hearing Re-search, 36, 1168–1176.

INGHAM, R., CORDES, A. K. and GOW , M., 1993b, Time-intervalmeasurement of stuttering: modifying inter-judge agreement. Journal of Speech and Hearing Research, 36, 503–515.

INGHAM, R., CORDES, A. K., INGHAM, J. and GOW , M. L., 1995,Identifying the onset and offset of stuttering events.  Journal of Speech and Hearing Research, 38, 315–326.

INGHAM, R. J., CORDES, A. K., K ILGO, M. and MOGLIA , R., 1998b.Stuttering Measurement Assessment and Training (SMAAT)(Santa Barbara, CA: University of California).

INGHAM, R., CORDES, A. K., K ILGO, M. and MOGLIA , R., 1998a,Stuttering Measurement Assessment and Training (SMAAT)(Santa Barbara, CA: University of California).

INGHAM, R . , MOGLIA , R . , FRANK , P.,INGHAM, J . an d CORDES,A.K.,1997, Experimental investigation of the effects of frequency-altered auditory feedback on the speech of adults who stutter.

 Journal of Speech, Language and Hearing Research,  40 , 361–372.

K  ALINOWSKI, J., A RMSON, J. and STUART, A., 1995, Effect of normaland fast articulatory rates on stuttering frequency. Journal of  Fluency Disorders , 20, 293–302.

K  ALINOWSKI, J., STUART, A., S ARK , S. and A RMSON, J., 1996, Stut-tering amelioration at various auditory feedback delays andspeech rates. European Journal of Disorders of Communication,

31, 259–269.K  ALINOWSKI, J., STUART, A., W  AMSLEY , L. and R  ASTATTER , M. P.,

1999, Effects of monitoring condition and frequency-alteredfeedback on stuttering frequency. Journal of Speech, Language,and Hearing Research, 42, 1347–1354.

K ULLY , D. and BOBERG, E., 1988, An investigation of inter-clinicagreement in the identification of fluent and astuttered sylla-bles. Journal of Fluency Disorders , 13, 309–318.

L ANDIS, J . R . an d K  OCH, G. G., 1977, The measurement of observeragreement for categorical data. Biometrics , 33, 159–174.

L ANGEVIN,M.andBOBERG, E., 1993, Results of an intensive stutter-ing therapy program.  Canadian Journal of Speech–Language Pathology and Audiology , 17, 158–166.

L ANGEVIN, M., HUINCK , W. J., K ULLY , D., PETERS, H. F. M.,LOMHEIM, H. and TELLERS, M., 2006, A cross-cultural, long-

term outcome evaluation of the ISTAR Comprehensive Stut-tering Program acrossDutch and Canadian adults who stutter. Journal of Fluency Disorders , 31, 229–256.

L ANGEVIN,M.,K ULLY ,D.,TESHIMA ,S.,H AGLER , P. and N ARASIMHA ,P. N. G., 2010, Five-year longitudinal treatment outcomes of the ISTAR Comprehensive Stuttering Program.   Journal of   Fluency Disorders , 35, 123–140.

LEITH, W., M AHR , G. C. and MILLER , L. D., 1993, The Assessment of Speech-Related Attitudes and Beliefs of People who Stutter (Rockville, MD: American Speech–Language–Hearing Asso-ciation).

LEWIS, K. E., 1991, The structure of disfluency behaviors in thespeech of adult stutterers.  Journal of Speech and Hearing Re-search, 34, 492–500.

LINCOLN, M., P ACKMAN, A. and ONSLOW , M., 2010, An experi-mental investigation of the effect of altered auditory feedback on the conversational speech of adults who stutter.  Journal of  Speech, Language, and Hearing Research, 53, 1122–1131.

M ACDONALD,J .andM ARTIN, R. R.,1973, Stuttering anddisfluency as two reliable and unambiguous response classes. Journal of  Speech and Hearing Research, 16, 691–699.

M ACLEOD, J., K  ALINOWSKI, J., STUART, A. and A RMSON, J., 1995,Effect of single and combined altered auditory feedback onstuttering frequency at two speech rates.  Journal of Commu-nication Disorders , 28, 217–228.

M ANNING, W. H., 2010, Clinical Decision Making in Fluency Disor-ders  (Clifton Park: Delmar).

M ANNING, W. H. and BECK , J. G., 2013, Personality dysfunction inadults who stutter: another look. Journal of Fluency Disorders ,38, 184–192.

M ARTIN, R. R. and H AROLDSON, S. K., 1981, Stuttering identifica-tion: standard definition and moment of stuttering.  Journal of Speech and Hearing Research, 24, 59–63.

M ARTIN, R. R. and H AROLDSON, S. K., 1992, Stuttering and speechnaturalness: audio and audiovisual judgments.   Journal of   Speech and Hearing Research, 35, 521–528.

M ARTIN, R. R., H AROLDSON, S. K. and W OESSNER , G. L., 1988,Perceptual scaling of stuttering severity.  Journal of Fluency Disorders , 13, 27–47.

M AX , L. and B ALDWIN, C. J., 2010, The role of motor learning instuttering adaptation: repeated versus novel utterances in a 

Page 12: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 12/17

Event- and interval-based measurement of stuttering: a review    25

practice-retention paradigm. Journal of Fluency Disorders , 35,33–43.

MCHUGH, M. L., 2012, Interrater reliability: the kappa statistic.Biochemia Medica , 22, 276–282.

MEANY -D ABOUL, M. G., R OSCOE, E. M., BOURRET, J. C. and A HEARN, W. H., 2007, A comparison of momentary timesampling and partial-interval recording for evaluating func-tional relations. Journal of Applied Behavior Analysis , 40, 501–

514.MENZIES,R.G.,O’BRIAN,S.,ONSLOW ,M.,P ACKMAN,A.,ST CLARE,

T. and BLOCK , S., 2008, An experimental clinical trial of a cognitive–behavior therapy package for chronic stuttering. Journal of Speech, Language, and Hearing Research, 51, 1451–1464.

N ATKE, U. and A LPERMANN, A., 2010, Stottern: Erkenntnisse, Theo-rien, Behandlungsmethoden (Bern: Hans Huber).

NUNNALLY , J. C. and BERNESTEIN, I. H., 1994, Psychometric Theory (New York, NY: McGraw-Hill).

O’BRIAN, S . , ONSLOW , M . , CREAM, A. an d P ACKMAN, A., 2003, TheCamperdown Program: outcomes of a new prolonged-speechtreatment model.  Journal of Speech, Language, and Hearing Research, 46, 933–946.

O’BRIAN, S., P ACKMAN, A. and ONSLOW , M., 2008, Telehealth de-

livery of the Camperdown Program for adults who stutter:a phase I trial.  Journal of Speech, Language, and Hearing Re-search, 51, 184–195.

O’DONNELL, J. J., A RMSON, J. and K IEFTE, M., 2008, The effective-ness of SpeechEasy during situations of daily living. Journal of Fluency Disorders , 33, 99–119.

ONSLOW ,M.,COSTA ,L.andR UE, S.,1990, Directearly intervention with stuttering: some preliminary data. Journal of Speech and Hearing Disorders , 55, 405–416.

P ACKMAN, A. and ONSLOW , M., 1998, The behavioral data languageofstuttering. InA. K.C. R.J. Ingham (ed.), Treatment Efficacy  for Stuttering: A Search for Empirical Bases  (San Diego, CA:Singular), 27–50.

P ACKMAN, A., ONSLOW , M. and   VAN DOOM, J., 1994, Prolongedspeech and modification of stuttering: perceptual, acoustic,and electroglottographic data. Journal of Speech and Hearing Research, 37, 724–737.

PRINS, D. and HUBBARD, C. P., 1990, Acoustical durations of speechsegments during stuttering adaptation. Journal of Speech and Hearing Research, 31, 696–709.

R ILEY , G., 2009,  Stuttering Severity Instrument—Fourth Edition(Austin, TX: PRO-ED).

SCHLOSS, P. L., FREEMAN, C. A. and SMITH, M. A., 1987, Influenceof assertiveness training on the stuttering rates exhibited by three young adults.  Journal of Fluency Disorders ,  12, 333–353.

SPARKS, G., GRANT, D. E., MILLAY , K., W  ALKER -B ATSON, D. andH YNANB, L. S., 2002, The effect of fast speech rate on stut-

tering frequency during delayed auditory feedback. Journal of  Fluency Disorders , 27, 187–201.

STUART, A. , K  ALINOWSKI, J . , A RMSON, J . , STENSTROM, R. and JONES, K., 1996, Fluency effect of frequency alterations of plus/minus onehalf and onequarter octave shifts in auditory feedback of people who stutter. Journal of Speech and Hearing Research, 39, 396–401.

UNGER , J. P., GLUCKB, C. W. and CHOLEWAA , J., 2012, Immedi-ate effects of AAF devices on the characteristics of stutter-ing: a clinical analysis. Journal of Fluency Disorders , 37, 122–134.

V  AN  R IPER , C., 1982, The Nature of Stuttering   (Englewood Cliffs,NJ: Prentice-Hall).

V INCENT, I., GRELA , B. G. and GILBERT, H. R., 2012, Phonologicalpriming in adults who stutter.   Journal of Fluency Disorders ,37, 91–105.

 W EIR , J. P., 2005, Quantifying test–retest reliability using the intra-class correlation coefficient and the SEM. Journal of Strengthand Conditioning Research, 19, 231–240.

 Y  ARUSS, J. S., 1997, Clinical measurement of stuttering behaviors.Contemporary Issues in Communication Science and Disorders ,24, 33–44.

 Y  ARUSS,J .S. ,M AX ,M.,NEWMAN,R.andC AMPBELL, J.,1998, Com-paring real-time and transcript-based techniques for measur-ing stuttering. Journal of Fluency Disorders , 23, 137–151.

 Y OUNG, M. A., 1975, Observer agreement for marking moments of stuttering. Journal of Speech and Hearing Research,  29, 484–489.

ZEBROWSKI, P. M., 1991, Duration of the speech disfluencies of beginning stutterers. Journal of Speech and Hearing Research,34, 483–491.

ZIMMERMAN, S., K  ALINOWSKI, J., STUART, A. and R  ASTATTER , M.,1997, Effect of altered auditory feedback on people who stut-ter during scripted telephone conversations. Journal of Speech,Language, and Hearing Research, 40, 1130–1134.

Page 13: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 13/17

Page 14: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 14/17

Event- and interval-based measurement of stuttering: a review    27

     T   a

     b     l   e     A    1

 .     C

   o   n

    t     i   n   u   e

     d

    S   t  u    d  y

    J  u    d  g  e  s

    N  u   m    b  e   r  o    f  s  a   m   p    l  e  s    /  s  y    l    l  a    b    l  e  s  a  s  s  e  s  s  e    d

    I   n   t  e   r  -    j   u    d  g  e

    I   n   t   r  a  -    j   u    d  g  e

    S   t  a   t    i  s   t    i  c  a    l   m  e   t    h  o    d

    A   r   m  s  o   n  a   n    d    K    i  e    f   t  e    (    2    0    0    8    )

    T  w  o   r  e  s  e  a   r  c    h  a  s  s    i  s   t  a   n   t  s

    1    3    %  o    f   t    h  e   t  o   t  a    l  s  a   m   p    l  e    (    6    0    0  s  y    l    l  a    b    l  e  s    )

    0 .    9    3

    N  o   t  c  a    l  c  u    l  a   t  e    d

    C  o    h  e   n    ’  s    k  a   p   p  a

    M  a  x  a   n    d    B  a    l    d  w    i   n    (    2    0    1    0    )

    T  w  o    j   u    d  g  e  s

    4    0    0    0  w  o   r    d  s    f   r  o   m  a   t  o   t  a    l  o    f    6    0    0    0    0

  w  o   r    d  s

    0 .    8    9

    N  o   t  c  a    l  c  u    l  a   t  e    d

    C  o    h  e   n    ’  s    k  a   p   p  a

    B  a  u  e   r    l  y  a   n    d    D  e    N    i    l    (    2    0    1    1    )

    I   n    f  o   r   m  a   t    i  o   n   n  o   t  a  v  a    i    l  a    b    l  e

    2    5    %  o    f   t  o   t  a    l    d  a   t  a    (   n  o    d  a   t  a   r  e    l  a   t  e    d   t  o   t    h  e

   t  o   t  a    l    i   t  y  o    f  s  y    l    l  a    b    l  e  s    )

    0

 .    9    8    (    f  o   r   r  e  a    d    i   n  g

  s  a   m   p    l  e  s  o    f    P    W    S    )  ;

    0 .    9    6    (    f  o   r  s   p  e  a    k    i   n  g

  s  a   m   p    l  e  s  o    f    P    W    S    )

    N  o   t  c  a    l  c  u    l  a   t  e    d

    C  o    h  e   n    ’  s    k  a   p   p  a

    H  u    i   n  c    k  a   n    d    R    i  e   t  v  e    l    d    (    2    0    0    7    )

    T    h   r  e  e   r  a   t  e   r  s

    1    7    %  o    f   t    h  e   t  o   t  a    l  s  a   m   p    l  e    f  o   r    i   n   t  e   r  -    j   u    d  g  e

   r  e    l    i  a    b    i    l    i   t  y    (    9    5    2  s  a   m   p    l  e  s    )  ;    1    2    %  o    f   t    h  e

   t  o   t  a    l  s  a   m   p    l  e    f  o   r    i   n   t   r  a  -    j   u    d  g  e   r  e    l    i  a    b    i    l    i   t  y

    (    9    5    2  s  a   m   p    l  e  s    )

    0 .    9    5

    0 .    9    5

    I    C    C

    A   n   t    i   p  o  v  a     e      t     a

         l  .    (    2    0    0    8    )

    T  w  o  :   t    h  e    fi   r  s   t  a  u   t    h  o   r  a   n    d  a   n

    i   n    d  e   p  e   n    d  e   n   t    /   r  e  s  e  a   r  c    h  c    l    i   n    i  c    i  a   n

    1    0    %  o    f   t    h  e   t  o   t  a    l  s  a   m   p    l  e    (    1    5    0  s  a   m   p    l  e  s    )

    0 .    9    1

    0 .    9    9

    I    C    C

    L  a   n  g  e  v    i   n     e      t     a

         l  .    (    2    0    1    0    )

    F  o  u   r   t   r  a    i   n  e    d   r  e  s  e  a   r  c    h  a  s  s    i  s   t  a   n   t  s

    1    3  s  a   m   p    l  e  s  c    h  o  s  e   n    f   r  o   m  a   t  o   t  a    l    i   t  y  o    f

    1    2    6  s  a   m   p    l  e  s

    0

 .    9    8    (    C    I   =

    0 .    9    4 ,

    0 .    9    9    )

    N  o   t  c  a    l  c  u    l  a   t  e    d

    I    C    C

    U   n  g  e   r     e      t     a

         l  .    (    2    0    1    2    )

    T   r  a    i   n  e    d   r  e  s  e  a   r  c    h  a  s  s    i  s   t  a   n   t  s    (   n  u   m    b  e   r   n  o   t

   m  e   n   t    i  o   n  e    d    )  a   n    d   t    h  e    fi   r  s   t  a  u   t    h  o   r

    I   n    f  o   r   m  a   t    i  o   n   n  o   t  a  v  a    i    l  a    b    l  e

    1 .    0    0

    N  o   t  c  a    l  c  u    l  a   t  e    d

    I    C    C

    B  e    i    l    b  y     e      t     a

         l  .    (    2    0    1    2    )

    T  w  o    S    L    T

    2    0    0    0  s  y    l    l  a    b    l  e  s    (   t  o   t  a    l    i   t  y  o    f   t    h  e  s  a   m   p    l  e    )

    0 .    9    1    (  o   n  e  -  w  a  y

    i   n    d  e   p  e   n    d  e   n   t  g   r  o  u   p

   r  a   n    d  o   m  e    f    f  e  c   t

   m  o    d  e    l  a   n  a    l  y  s    i  s    )

    0 .    8    9

    I    C    C

    G  a    l    l  o   p  a  a   n    d    R  u   n  y  a   n    b    (    2    0    1    2    )

    T  w  o

    9    0    0  s  y    l    l  a    b    l  e  s    (   t  o   t  a    l    i   t  y  o    f   t    h  e  s  a   m   p    l  e    )

    0 .    8    3

    N  o   t  c  a    l  c  u    l  a   t  e    d

    I    C    C

    M  a  c    l  e  o    d     e      t     a

         l  .    (    1    9    9    5    )

    T  w  o  :  o   n  e   t   r  a    i   n  e    d  s   p  e  e  c    h  –    l  a   n  g  u  a  g  e

   p  a   t    h  o    l  o  g  y  g   r  a    d  u  a   t  e  s   t  u    d  e   n   t  a   n    d  o   n  e

   t   r  a    i   n  e    d   r  e  s  e  a   r  c    h  a  s  s    i  s   t  a   n   t

    3    0    %  o    f   t  o   t  a    l  s  a   m   p    l  e    (    3    0    0    0  s  y    l    l  a    b    l  e  s    )

    8    8    %

    9    5    %

    A  g   r  e  e   m  e   n   t   p  e   r  c  e   n   t  a  g  e

    H  u    b    b  a   r    d    (    1    9    9    8    )

    T  w  o    S    L    T

    1    5    3    8  s  y    l    l  a    b    l  e  s    (   t    h  e   t  o   t  a    l    i   t  y  o    f   t    h  e

  s  a   m   p    l  e    )

    9    8    %    (   p  o    i   n   t  -   t  o  -   p  o    i   n   t

  a  g   r  e  e   m  e   n   t    f  o   r

  s   t  u   t   t  e   r    i   n  g

  o  c  c  u   r   r  e   n  c  e  a   n    d

   n  o   n  -  o  c  c  u   r   r  e   n  c  e    )

    9    8    %    (   p  o    i   n   t  -   t  o  -   p  o    i   n   t

  a  g   r  e  e   m  e   n   t    f  o   r

  s   t  u   t   t  e   r    i   n  g

  o  c  c  u   r   r  e   n  c  e  a   n    d

   n  o   n  -  o  c  c  u   r   r  e   n  c  e    )

    A  g   r  e  e   m  e   n   t   p  e   r  c  e   n   t  a  g  e

    V    i   n  c  e   n   t     e      t     a

         l  .    (    2    0    1    2    )

    T  w  o  :  o   n  e    S    L    T  a   n    d  o   n  e  g   r  a    d  u  a   t  e  s   t  u    d  e   n   t

    i   n  s   p  e  e  c    h  –    l  a   n  g  u  a  g  e   p  a   t    h  o    l  o  g  y ,  w    h  o

    h  a    d    b  e  e   n   t   r  a    i   n  e    d    i   n    i    d  e   n   t    i    f  y    i   n  g

  s   t  u   t   t  e   r    i   n  g   m  o   m  e   n   t  s

    1    2    3    9    0  s  y    l    l  a    b    l  e  s    (   t    h  e   t  o   t  a    l    i   t  y  o    f   t    h  e

  s  a   m   p    l  e    )

    9    5    %

    9    9    %

    A  g   r  e  e   m  e   n   t   p  e   r  c  e   n   t  a  g  e

Page 15: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 15/17

Page 16: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 16/17

Event- and interval-based measurement of stuttering: a review    29

     T   a

     b     l   e     B    1

 .     C

   o   n

    t     i   n   u   e

     d

    S   t  u    d  y  a   n    d    j   u    d  g  e  s

    I   n   t  e   r  -    j   u    d  g  e  a  g   r  e  e   m  e   n   t

    I   n   t   r  a  -    j   u    d  g  e  a  g   r  e  e   m  e   n   t

    F  a  c   t  o   r

    S   t  a   t    i  s   t    i  c  a    l   m  e   t    h  o    d

    C  o   r    d  e  s  a   n    d    I   n  g    h  a   m    (    1    9    9    9    )  :    2    0

  u   n    i  v  e   r  s    i   t  y  s   t  u    d  e   n   t  s  s  e   m    i  -   r  a   n    d  o   m    l  y

  a    l    l  o  c  a   t  e    d   t  o   t  w  o    j   u    d  g  e  g   r  o  u   p  s

    G   r  o  u   p    1    i  :    P   r  e  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d    f   r  o   m

    7    1    %  a   n    d    7    6    %    P  o  s   t  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d

    f   r  o   m    8    2    %  a   n    d    9    1    %    G   r  o  u   p

    2    j   :    P   r  e  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d    f   r  o   m    7    5    %

  a   n    d    8    7    %    P  o  s   t  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d    f   r  o   m

    8    3    %  a   n    d    8    8    %

    G   r  o  u   p    1  :    P   r  e  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d    f   r  o   m

    8    1    %  a   n    d    8    7    %  ;    P  o  s   t  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d

    f   r  o   m    8    9    %  a   n    d    9    2    %  ;    G   r  o  u   p

    2  :    P   r  e  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d    f   r  o   m    8    6    %  a   n

    d

    9    2    %  ;    P  o  s   t  -   t   r  a    i   n    i   n  g  :   r  a   n  g  e    d    f   r  o   m

    9    2    %  a   n    d    9    4    %

    N  u   m    b  e   r  o    f  s   t  u   t   t  e   r    i   n  g

  e  v  e   n   t  s  ;    S   t  u   t   t  e   r    i   n  g

    d  u   r  a   t    i  o   n  ;    N  u   m    b  e   r

  o    f    i   n   t  e   r  v  a    l  s    I   n   t  e   r  v  a    l  -

    b  y  -    i   n   t  e   r  v  a    l

    i   n   t  e   r  -    j   u    d  g  e  ;

    I   n   t  e   r  v  a    l  -    b  y  -    i   n   t  e   r  v  a    l

    i   n   t   r  a  -    j   u    d  g  e  ;

    A  c  c  u   r  a  c  y

    A  g   r  e  e   m  e   n   t   p  e   r  c  e   n   t  a  g  e

    E    i   n  a   r  s    d    ´  o   t   t    i   r  a   n    d    I   n  g    h  a   m    (    2    0    0    8    )  :    2    0

    f  e   m  a    l  e   p   r  e  s  c    h  o  o    l   t  e  a  c    h  e   r  s   r  a   n    d  o   m    l  y

  a    l    l  o  c  a   t  e    d   t  o  a   n  e  x   p  e   r    i   m  e   n   t  a    l  g   r  o  u   p

  a   n    d   t  o  a  c  o   n   t   r  o    l  g   r  o  u   p

    E  x   p  e   r    i   m  e   n   t  a    l  g   r  o  u   p  :   n  o   t

  c  a    l  c  u    l  a   t  e    d    C  o   n   t   r  o    l  g   r  o  u   p  :   n  o   t

  c  a    l  c  u    l  a   t  e    d

    E  x   p  e   r    i   m  e   n   t  a    l  g   r  o  u   p  :   n  o   t  c  a    l  c  u    l  a   t  e    d  ;

    C  o   n   t   r  o    l  g   r  o  u   p  :   n  o   t  c  a    l  c  u    l  a   t  e    d

    T   r  a    i   n    i   n  g

    A  g   r  e  e   m  e   n   t   p  e   r  c  e   n   t  a  g  e

Page 17: Valente Et Al-2015-International Journal of Language & Communication Disorders

7/23/2019 Valente Et Al-2015-International Journal of Language & Communication Disorders

http://slidepdf.com/reader/full/valente-et-al-2015-international-journal-of-language-communication-disorders 17/17