the role of symbolic system in relational reasoning
The Pennsylvania State University
The Graduate School
College of Education
THE ROLE OF SYMBOLIC SYSTEM IN RELATIONAL REASONING
A Thesis in
Educational Psychology
by
Alexa M. Kottmeyer
© 2017 Alexa M. Kottmeyer
Submitted in Partial Fulfillment of the Requirements
for the Degree of
Master of Science
May 2017
The thesis of Alexa M. Kottmeyer was reviewed and approved* by the following:

Peggy N. Van Meter
Associate Professor of Education
Thesis Advisor

Jonna M. Kulikowich
Professor of Education

David L. Lee
Acting Head of the Department of Educational Psychology, Counseling, and Special Education

*Signatures are on file in the Graduate School.
ABSTRACT
Relational reasoning, the ability to identify relations among apparently unrelated
information, is necessary for cognitive tasks including problem solving and integrating
information in all domains. This study focuses on the role of the symbolic system (verbal or
non-verbal) in learners’ relational reasoning abilities, as measured by the verbal Test of Relational
Reasoning (vTORR) and the non-verbal Test of Relational Reasoning (TORR). Confirmatory
factor analysis on each measure supports conceptualization of relational reasoning as comprised
of four distinct forms. However, a combined factor analysis, convergent and discriminant
validity correlations, and a multitrait-multimethod matrix reveal that the symbolic system of the
task impacts relational reasoning abilities. Overall, findings indicate that these two measures of
relational reasoning provide promising avenues for future research, especially for further
exploration of the specific processes underlying verbal and non-verbal relational reasoning.
TABLE OF CONTENTS
List of Tables
List of Figures
Introduction
    Framework of Relational Reasoning
    Framework for Symbolic Systems
    Measurement of Relational Reasoning
    The Current Study
Methods
    Participants
    Measures
    Procedure
Results
    Performance on the Measures
    Internal Stability
    Factor Structure
    Convergent and Discriminant Validity
Discussion
    Connections
    Limitations and Future Work
Bibliography
Appendix A: Sample Items from the Test of Relational Reasoning
Appendix B: Sample Items from the verbal Test of Relational Reasoning
Appendix C: CFA Model Results for the TORR
Appendix D: CFA Model Results for the vTORR
Appendix E: CFA Model Results for the Combined Tests
LIST OF TABLES
Table 1: Construct and item descriptions for the four relational reasoning test sections
Table 2: Descriptive Statistics
Table 3: Reliability estimates for subtest scores
Table 4: Model Fit Statistics for vTORR Confirmatory Factor Analyses
Table 5: Model Fit Statistics for TORR Confirmatory Factor Analyses
Table 6: Model Fit Statistics for Combined vTORR/TORR Confirmatory Factor Analyses
Table 7: Spearman rho Correlation Coefficients Between vTORR, TORR, Spatial Ability, and Reading Comprehension
Table 8: Multitrait-Multimethod Matrix
Table C1: Factor loadings and latent variable correlation estimates for the TORR Model A
Table C2: Factor loadings and latent variable correlation estimates for the TORR Model B
Table C3: Factor loadings and latent variable correlation estimates for the TORR Model C
Table D1: Factor loadings and latent variable correlation estimates for the vTORR Model A
Table D2: Factor loadings and latent variable correlation estimates for the vTORR Model B
Table D3: Factor loadings and latent variable correlation estimates for the vTORR Model C
Table E1: Factor loadings and latent variable correlation estimates for the combined Model D
Table E2: Factor loadings and latent variable correlation estimates for the combined Model E
Table E3: Factor loadings and latent variable correlation estimates for the combined Model F
Table E4: Factor loadings and latent variable correlation estimates for the combined Model G
Table E5: Factor loadings and latent variable correlation estimates for the combined Model H
LIST OF FIGURES
Figure A1: Sample vTORR Analogy item
Figure A2: Sample vTORR Anomaly item
Figure A3: Sample vTORR Antinomy item
Figure A4: Sample vTORR Antithesis item
Figure B1: Sample TORR Analogy item
Figure B2: Sample TORR Anomaly item
Figure B3: Sample TORR Antinomy item
Figure B4: Sample TORR Antithesis item
Figure C1: Confirmatory factor analysis models for the TORR
Figure D1: Confirmatory factor analysis models for the vTORR
Figure E1: Confirmatory factor analysis models for the combined tests
Introduction
A key component for learning and problem solving across any subject or context is the
ability to find and comprehend similarities and differences (Alexander, Singer, Jablansky, &
Hattan, 2016; Gentner & Markman, 1997; Goldstone & Son, 2005; Quine, 1969). Because of this,
different manifestations of the study of reasoning around similarity and difference relations have
occurred for decades (Alexander, Dumas, Grossnickle, List, & Firetto, 2016; Gentner, 1983;
Holyoak & Morrison, 2005; Sternberg, 1977). The capacity to identify these meaningful relations
and patterns within a set of information is known as relational reasoning (Alexander & the
Disciplined Reading and Learning Research Laboratory (DRLRL), 2012).
Relational reasoning plays an important role across many domains and populations: in
verbal tasks such as during medical discourse (Dumas, Alexander, Baker, Jablansky, & Dunbar,
2014), student discussions (Jablansky, Alexander, Dumas, & Compton, 2016), knowledge
revision and conceptual change (Chinn & Malhotra, 2002; Kendeou, Butterfuss, Boekel, &
O’Brien, 2016); in mainly non-verbal tasks like problem solving and learning in mathematics and
engineering (DeWolf, Bassok, & Holyoak, 2015; Dumas & Schmidt, 2015; Richland, Begolli,
Simms, Frausel, & Lyons, 2016); and in tasks involving both verbal and non-verbal stimuli,
including scientific and philosophical writing (Gentner & Jeziorski, 1993; Johnson-Laird, 2005)
and verbal discussions of scientific texts (Murphy, Firetto, & Greene, 2016).
Though we can see that relational reasoning has been studied in a variety of contexts,
there has been little exploration into the role of the verbal or non-verbal symbolic system used in
the relational reasoning task. Previous research has shown differences in knowledge and cognition
due to the symbolic system (Ainsworth, 2006; Krawczyk, 2012; Mayer, 2014; Paivio, 2007), such
as the distinction between the sequential nature of verbal stimuli versus the more simultaneous
processing possible with non-verbal objects (Paivio, 2007). These processing differences across
symbolic system may impact how relational reasoning functions in varied contexts.
Relational reasoning plays an important role in thinking and learning in varied domains,
each of which involves tasks that are verbal, non-verbal, and tasks that involve both symbolic
systems. Given these contexts, it is essential to understand how people solve problems involving
relational reasoning and how the symbolic system of the task affects this reasoning. To address
these questions, we explore differences in relational reasoning across symbolic system using two
measures of relational reasoning, a verbal and a non-verbal test.
Framework of Relational Reasoning
Over the years, the conceptualization and measurement of relational reasoning has
changed and grown. At its start, relational reasoning research focused on the study of similarity
and analogical reasoning (Gentner, 1983; Gentner & Markman, 1997; Goldstone & Son, 2005;
Holyoak, 2005; Sternberg, 1977). Recently, however, there has been an increase in evidence for
the importance of other forms of relational reasoning resulting in a current focus on four forms:
analogy, anomaly, antinomy, and antithesis (Alexander & the DRLRL, 2012). Each of these four
forms focuses on the discernment of a different type of higher order relation (Dumas et al.,
2014). Analogical reasoning focuses on higher order similarity relations, anomalous reasoning
on higher order discrepancy relations, antinomous reasoning on higher order relations around
incompatibility, and antithetical reasoning on higher order oppositional relations (Dumas,
Alexander, & Grossnickle, 2013).
The first type of relational reasoning, analogical reasoning, is defined as the ability to
identify structural similarity between seemingly unlike concepts or events. Because of its
historical place as the main form of relational reasoning, analogy is the most widely studied of
the four forms (Gentner, 1983; Holyoak & Morrison, 2005; Krawczyk, 2012; Sternberg, 1977).
Early studies framed analogical reasoning as a component or indicator of intelligence (Raven,
1941; Sternberg, 1977). Later studies focused on the construct independently from theories of
intelligence, and thus included the development of a foundational theoretical framework for the
concept of analogy called the structure-mapping theory (Gentner, 1983). This theory of analogy
focused on the alignment of relational structure (Gentner & Markman, 1997). Specifically, an
analogous base and target must be structurally consistent, in that matching relations have
matching arguments and have a one-to-one correspondence. The analogy must also have a
relational focus, where there are common relations but not necessarily common objects or
attributes. Finally, these relations must not only match but map through higher order constraining
relations as well (Gentner, 1983; Holyoak, 2005).
Anomalous reasoning is the ability to recognize an abnormality or deviation from an
established pattern. Similar to analogy, this type of relational reasoning can be expressed as
reasoning based around a higher order discrepancy relation (Dumas et al., 2014), though to
identify an anomaly one must first identify the pattern that the anomaly does not fit. In
educational settings, reasoning around anomalies is used to promote conceptual change (Chinn &
Malhotra, 2002).
Antinomous reasoning deals with identifying incompatibilities within and between sets of
information (Dumas et al., 2013). Antinomy has been the least studied of the four forms, but recent
studies have identified this form of relational reasoning in medical discourse about patient
diagnoses and in student discussion of science texts (Dumas et al., 2014; Murphy et al., 2016).
Specifically, Dumas et al. (2014) found that in discourse about medical diagnoses, the more
expert attending physician expressed more antinomous reasoning than the more novice residents.
On the other hand, antinomy was the second most frequent type of relational reasoning found in
the high school students’ discussions (Murphy et al., 2016).
Finally, antithetical reasoning involves the identification or use of two directly
oppositional statements or ideas. The role of antithesis in science dates back to Aristotle, who
counted antithesis, along with analogy, as an important structure for both communicating and
reasoning about science (Fahnestock, 1999). More recently, understanding and learning using
oppositional ideas is often studied for its role in argumentation, refutation text, and conceptual
change (Braasch, Goldman, & Wiley, 2013; Dumas et al., 2013; Kendeou et al., 2016).
Though the current four-form relational reasoning framework was introduced recently
(Alexander & the DRLRL, 2012), many studies have already collected evidence that these forms
of reasoning are used in learning and problem solving (Dumas et al., 2014; Dumas & Schmidt,
2015; Jablansky et al., 2016; Murphy et al., 2016). These studies have identified each of the four
forms verbally in medical discourse between an attending neurologist and his residents about
patient symptoms and diagnoses during team meetings (Dumas et al., 2014) and in student
conversations about the form and function of familiar and unfamiliar objects in primary and
secondary school (Jablansky et al., 2016), in mainly non-verbal tasks in engineering (Dumas &
Schmidt, 2015), and in mixed system tasks such as verbal student discourse during discussions of
science materials including non-verbal representations (Murphy et al., 2016). While the symbolic
system varies across the contexts, these studies all support the existence of analogy, anomaly,
antinomy, and antithesis during learning and problem solving.
Framework for Symbolic Systems
To understand the possible differences in relational reasoning based on the different
symbolic systems, we first must examine how verbal versus non-verbal information is processed.
The question of differences between processing verbal and non-verbal stimuli has been studied
for decades. Larkin & Simon (1987) began by classifying the possible types of external stimuli a
learner may encounter, distinguishing sentential (such as verbal) and non-verbal diagrammatic
representations. They posited that the key difference between these representations is in the types
of component relations the representation explicitly presents to the participant. For example, a
verbal sentence is inherently sequential, and therefore more able to explicitly reveal a temporal
sequence of events; alternatively, because of the multitude of possible connections expressed, a
diagram has the ability to explicitly show physical relations among its components (Larkin &
Simon, 1987; Paivio, 2007). Another distinction between how learners process external verbal
and non-verbal representations is based on the notational rules of the system: verbal information
is governed by syntax, and non-verbal information has certain format rules depending on the
representation (Ainsworth, 2006).
A related classification for these representations draws the line between ‘depictive’ and
‘descriptive’ forms rather than verbal/non-verbal (Schnotz, 2014). A descriptive representation is
symbolic, where the symbols used do not have structural commonality with the idea they are
representing. Spoken and written text are descriptive, as are mathematical equations. On the
other hand, depictive representations are more similar to the idea they intend to represent.
Depictive representations include diagrams and pictures, for example. Like Larkin & Simon
(1987), Schnotz (2014) also discussed the differences between these two forms: descriptive
representations are more able to express abstract knowledge such as general concepts or
connected information while depictive representations, though they are limited to concrete ideas,
inherently display complete information and therefore are more useful for making inferences.
Other researchers have focused on differences in how verbal and non-verbal stimuli are
processed internally. Dual-code theory (Paivio, 1979; 2007) states that verbal and non-verbal
information are processed internally through two separate but connected channels. Within this
theory, one main distinction between the symbolic systems relates to their use in more abstract
tasks. Due to the more concrete nature of a non-verbal representation, Paivio (1979) argues that
verbal processes are increasingly useful as the task becomes more abstract.
The role of symbolic system in processing specifically within relational reasoning has
also been addressed through a review of neurological studies of relational reasoning (Krawczyk,
2012). This review determined that while some portions of the brain were active throughout
relational reasoning, other areas were specific to reasoning during either semantic or visuo-
spatial problems. Though this supports the possibility of differences in relational reasoning
depending on the symbolic system, it is unclear if and how these neurological distinctions would
translate to performance on cognitive tasks. However, these differences in internal processing
and external properties of stimuli expressed using verbal versus non-verbal symbolic systems
raise the question of possible dissimilarities in learners’ relational reasoning when faced with
tasks involving different symbolic systems. This is the question that we begin to tackle in the
current study using two measures of relational reasoning, one verbal and one non-verbal.
Measurement of Relational Reasoning
To fully understand the role of symbolic system in relational reasoning, we must examine
the measurement of relational reasoning through verbal and non-verbal means. Many methods
have been used throughout the years to evaluate relational reasoning (Dumas, 2016).
Historically, the most common measures of relational reasoning included Raven’s Progressive
Matrices (Raven, 1941), a figural measure that requires test-takers to choose the correct figure to
complete a 3x3 matrix, and four-term A:B::C:D analogies using either verbal or non-verbal terms
(Dumas et al., 2013). These measures focus exclusively on analogical reasoning, and so the
expansion of the conceptualization of relational reasoning required the development of a new
measure: The Test of Relational Reasoning (TORR; Alexander, Dumas, et al., 2016).
The TORR was developed specifically to measure the four forms of relational reasoning,
and to facilitate the exploration of their relations with learning. To this end, the test is partitioned
into four multiple-choice subtests, with each section corresponding to one of the four types. All
of the items on the TORR are figural, a design chosen to limit “the influence of prior knowledge
and culturally relevant experiences” (Alexander, Dumas, et al., 2016). Each item requires
students to identify patterns within and between sets of geometric figures based on the
instructions given for the section. Sample items from each section of the TORR can be found in
Appendix A.
A measure such as the TORR allows for continued investigation of the role of relational
reasoning in learning and problem solving. In an initial validation study, performance on the
TORR was shown to predict student performance on four-term verbal analogy questions and
math calculation questions from the SAT (Alexander, Dumas, et al., 2016). This study also tested
the factor structure of the TORR using confirmatory factor analysis, and determined that the best
fitting model was the four-factor model matching the conceptualization of relational reasoning as
analogy, anomaly, antinomy, and antithesis (Alexander & the DRLRL, 2012; Alexander, Dumas,
et al., 2016). Scores on the TORR also predicted the change in the novelty of engineering
students’ design solutions following a design instruction called TRIZ; students with higher
relational reasoning ability initially had greater increase in novelty scores for their designs during
the instruction, which indicates that strategies for reasoning relationally may support design in
engineering (Dumas & Schmidt, 2015).
However, relational reasoning has also been shown to manifest in verbal tasks such as
learning from text (Alexander & the DRLRL, 2012). To further explore the foundational role of
relational reasoning as a cognitive ability during any type of task, and to fully address the role of
relational reasoning in verbal tasks, a second measure of the four forms was developed using
linguistic items: The Verbal Test of Relational Reasoning (vTORR; Alexander, Singer, et al.,
2016). The overall structure of the vTORR was designed to be parallel to the TORR through four
multiple-choice subtests, one for each type of relational reasoning (Alexander, Singer, et al.,
2016). However, for the vTORR, all items are given in verbal form and respondents must
determine the required relations amongst these sentences or paragraphs. Sample items from each
section of the vTORR can be found in Appendix B. Table 1 presents descriptions of the items in
each section from both the TORR and vTORR as well as the definition of the relational
reasoning form measured by each section. Although the exact instructions for each section vary
somewhat from the TORR to the vTORR due to the constraints of the symbolic system of the
items (verbal or non-verbal), each section’s design ties clearly to the relevant form of relational
reasoning.
Table 1
Construct and item descriptions for the four relational reasoning test sections

Form | Definition | vTORR items | TORR items
Analogy | The ability to identify structural similarity between seemingly unlike concepts or events. | • 1–2 sentences describing an event • Choose the most similar answer option | • 8 figures in a 3 by 3 grid • Choose the missing figure
Anomaly | The ability to recognize deviation from an established pattern. | • 4 sentences • Choose the option that does not fit | • 4 figures • Choose the option that does not fit
Antinomy | The ability to identify incompatibilities within and between sets of information. | • 2 paragraphs describing the same event from different perspectives • Choose the option that is consistent with only one of the paragraphs | • 1 given set of objects and 4 possible answer sets • Choose the option that could never share an object with the given set
Antithesis | The ability to identify or use two directly oppositional statements or ideas. | • A sentence with two or more key features • Choose the option that is the opposite | • A process described by two figures connected by an arrow • Choose the option that is the opposite process
An initial validation study for the vTORR supported the expected four factor structure
through confirmatory factor analysis (Alexander, Singer, et al., 2016), the same structure that
was found to fit the TORR (Alexander, Dumas, et al., 2016). The study also found that the two
relational reasoning tests were moderately correlated, which indicated that while student
performance on the two tests was related, the vTORR provided information beyond the TORR
about participants’ relational reasoning abilities (Alexander, Singer, et al., 2016). Considering
the role of relational reasoning in a variety of both verbal and non-verbal tasks, it is important to
examine more in depth the differences in relational reasoning as measured by the TORR and
vTORR.
The Current Study
This study aims to explore the theory and measurement of relational reasoning,
particularly as it relates to the symbolic system of the task. Relational reasoning has been posited
as a general ability affecting learning in different contexts (Alexander & the DRLRL, 2012;
Dumas, 2016; Dumas, Alexander, & Grossnickle, 2013; Holyoak & Morrison, 2005). However,
we know that information is processed differently based on the nature of its symbolic system
(Ainsworth, 2006; Krawczyk, 2012; Mayer, 2014; Paivio, 2007). Previously, Alexander, Singer,
et al. (2016) found that students’ performance on the verbal test of relational reasoning explained
some unique information about students’ relational reasoning ability beyond the non-verbal test.
To further explore these differences, in the current study we examine the relationships between
relational reasoning, measured by the verbal vTORR and the non-verbal TORR, and symbolic system
abilities, measured by reading comprehension and spatial ability. Specifically, we address the
following six research hypotheses.
First, though the measures use different symbolic systems, because relational reasoning is
a general ability (Alexander & the DRLRL, 2012) we hypothesize that students do not perform
significantly differently on the TORR and vTORR. Second, we expect to replicate previous
results (Alexander, Dumas, et al., 2016; Alexander, Singer, et al., 2016) demonstrating the
reliability of the scores on the TORR and vTORR, as well as on shortened forms of the vTORR
and TORR used in the current study to avoid fatigue effects (see Methods for details). Third, we
will offer continued validity evidence for the 4-form model of relational reasoning through
confirmatory factor analyses of the TORR and the vTORR. As seen previously, we expect the
four factor model to be the best fitting model for both tests (Alexander, Dumas, et al., 2016;
Alexander, Singer, et al., 2016).
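For reference, the four-factor structure hypothesized here can be written in standard confirmatory factor analysis notation. This is a generic textbook sketch, not the exact specification estimated in this study: the symbols below are introduced only for illustration, and item scores are treated as continuous indicators for simplicity.

```latex
\[
  x_i = \lambda_i \, \xi_{f(i)} + \delta_i, \qquad
  f(i) \in \{\text{analogy},\ \text{anomaly},\ \text{antinomy},\ \text{antithesis}\}
\]
\[
  \Sigma = \Lambda \Phi \Lambda^{\top} + \Theta
\]
```

In this notation, x_i is the score on item i, f(i) is the section to which item i belongs, each item loads only on the factor for its own section, \Phi holds the covariances among the four factors, and \Theta is the diagonal matrix of residual variances.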
Fourth, we examine the claim that the TORR and vTORR measure the same four
forms of relational reasoning by analyzing the factor structure of
the vTORR and TORR scores combined. To support this claim, we would expect the best fitting
model to have a four-factor structure of analogy, anomaly, antinomy, and antithesis at the highest
level. However, because the symbolic system is hypothesized to also play a role in the scores
(Alexander, Singer, et al., 2016), we would expect that a two-level model would be best, with the
four second-level relational reasoning factors each having the two symbolic system factors
nested within.
Fifth, we address the convergent and discriminant validity of the TORR and vTORR by
examining their relation to measures of reading comprehension and spatial ability. Given that the
TORR and vTORR are both measures of relational reasoning, our hypothesis is that these two
tests will be significantly and moderately correlated; we also expect the TORR – vTORR
correlations to be higher than either of the within symbolic system correlations (vTORR –
Reading Comprehension and TORR – Spatial Ability).
Finally, we explore the convergent and discriminant validity of the subtests of the TORR
and vTORR using a multi-trait multi-method matrix (Campbell & Fiske, 1959) to address the
role of the symbolic system (verbal or non-verbal) on relational reasoning. Though we predict
that the correlations between verbal subtests and between non-verbal subtests may be significant,
we expect the highest correlations to be those between corresponding sections, such as the
correlation between the verbal analogy section from the vTORR and the non-verbal analogy
section from the TORR.
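The logic of a multitrait-multimethod comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the function names (`ranks`, `spearman`, `mtmm`) and the dictionary layout are hypothetical, Spearman's rho is used because it is the coefficient named for Table 7, and any scores passed in are the caller's own data.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.
    Raises ZeroDivisionError if either variable is constant."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    numerator = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    denominator = (sum((a - mx) ** 2 for a in rx)
                   * sum((b - my) ** 2 for b in ry)) ** 0.5
    return numerator / denominator

def mtmm(scores):
    """scores maps (method, trait) -> list of subtest scores, one per person.
    Returns {(key_a, key_b): (cell_type, rho)} for every pair of subtests."""
    keys = sorted(scores)
    cells = {}
    for a in range(len(keys)):
        for b in range(a + 1, len(keys)):
            (method_a, trait_a), (method_b, trait_b) = keys[a], keys[b]
            if method_a != method_b and trait_a == trait_b:
                cell_type = "convergent: same trait, different method"
            elif method_a == method_b:
                cell_type = "same method, different trait"
            else:
                cell_type = "different method, different trait"
            rho = spearman(scores[keys[a]], scores[keys[b]])
            cells[(keys[a], keys[b])] = (cell_type, rho)
    return cells
```

Under the hypothesis stated above, the "convergent" cells (for example, TORR analogy with vTORR analogy) would be expected to show the largest coefficients, while large "same method, different trait" cells would instead point to an influence of the symbolic system.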
Methods
Participants
Participants were 768 undergraduate students recruited from two introductory-level
undergraduate courses, one in the College of Science and the other in the College of Education, at
a large university in the eastern United States. All participants consented to participate in the
study and received course extra credit for completion of the study. An alternative activity was
offered for equivalent extra credit for anyone who did not wish to participate in the study. Eight
participants’ data were removed due to missing outcome measures. There were 171 males
(22.3%) and 597 females (77.7%); no participants classified their gender as other. Seven percent
of participants indicated that English was their second language. The sample was composed of
81.9% White/Caucasian, 6.8% Asian, 4.9% African American, 4.3% Hispanic, and 2.2% other.
About half of participants were of freshman semester standing (49.5%), followed by sophomore
standing (32.7%), then junior standing (10.7%) and senior standing or above (7.2%). Majors
from a variety of colleges were represented in the sample: Health and Human Development
(37.7%), Education (24.8%), Engineering (11.6%), Science (11.4%), Liberal Arts (5.4%), and
Nursing (5.0%); the remaining participants were either Undecided (3.1%) or in other colleges
within the university (1.0%).
Measures
Demographic survey. The demographic survey included questions regarding intended
major, semester standing, gender, ethnicity, verbal and quantitative SAT score, and age.
Test of Relational Reasoning. As described above, a relational reasoning test was
developed to measure the 4 types of relational reasoning (Alexander, Dumas, et al., 2016) in
college-age individuals. The measure consists of 4 non-verbal tasks, each targeting one of the
types of relational reasoning and each consisting of two sample questions with answers followed
by 8 multiple-choice questions. Each item is scored dichotomously as 0 if incorrect and 1 if
correct. Section scores are determined by the number of correct answers in each section. The
total score is the sum of the correct answers on the entire measure, or equivalently the sum of the
four section scores. Overall reliability for the TORR was calculated in a previous study using
Cronbach’s alpha, 𝛼 = 0.84, and using test-retest reliability, 𝑟 = 0.71 (Alexander, Dumas, et al.,
2016; Dumas & Alexander, 2016). We address the reliability based on the sample in our study in
the Results section.
However, since participants were taking both relational reasoning measures as well as the
spatial ability and reading comprehension tests, a modified version of the TORR was used to
minimize fatigue effects. Based on a similar sample of data collected by colleagues at a Mid-
Atlantic university (Alexander, Singer, et al., 2016), we identified and removed the three poorest
functioning items from each section, leaving 5 items per section. We chose which items to
eliminate based on their reliability, discrimination, and difficulty. First we determined the four
items in each section that had the lowest reliability. Then we considered their discrimination by
identifying the items with lowest discrimination. The three items that had both low reliability and
low discrimination were removed from each section. Since all item p-values fell between 0.2 and
0.8, difficulty of the items was only considered for any subtest where fewer than three items had
both low reliability and low discrimination.
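The item screening described above can be illustrated with a short sketch. The response matrix is invented for illustration, and the three indices used here (proportion correct as the item p-value, corrected item-total correlation as discrimination, and Cronbach's alpha with the item deleted as the item's contribution to reliability) are common choices that we assume for the example; the text does not specify which indices were actually computed.

```python
from statistics import mean, pvariance

# Hypothetical 0/1 responses: rows are examinees, columns are items of one section.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of examinee response rows."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def item_report(rows):
    """Per-item difficulty, discrimination, and alpha-if-deleted."""
    report = []
    for j in range(len(rows[0])):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]          # total score without item j
        without = [r[:j] + r[j + 1:] for r in rows]   # response rows without item j
        report.append({
            "item": j + 1,
            "p_value": mean(item),
            "discrimination": pearson(item, rest),
            "alpha_if_deleted": cronbach_alpha(without),
        })
    return report

for row in item_report(responses):
    print(row)
```

Under these assumptions, items whose removal raises alpha and whose discrimination is low would be the first candidates for elimination, with the p-value used only to break ties.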
Verbal Test of Relational Reasoning. A verbal test was also developed to measure
relational reasoning in college-age individuals. The measure consists of 4 verbal subtests, each
targeting one of the types of relational reasoning and each consisting of two sample questions
with answers followed by 8 multiple-choice questions. As for the TORR, each item is scored
dichotomously as 0 if incorrect and 1 if correct. Section scores are determined by the number of
correct answers in each section, and the total score is the sum of the correct answers on the
entire measure, or equivalently the sum of the four section scores. Overall reliability for the
vTORR was calculated in a previous study using test-retest
reliability, 𝑟 = 0.62 (Alexander, Singer, et al., 2016). vTORR reliability calculated using our
sample is discussed in the Results section. Again, a modified version was created and used to
minimize fatigue effects. The three poorest functioning items were eliminated from each section,
leaving 5 items per section, following the item selection procedure described for the TORR
above.
Reading comprehension measure. The Davis Reading Test assesses individual reading
comprehension ability in early post-secondary populations (Davis & Davis, 1962). The test
requires examinees to read 6 small passages and then respond to 40 total multiple-choice items.
The original Davis Reading Test consists of two sections of 40 items each, where the first 40
items are scored for comprehension while the second set are scored for speed. Since the score on
the first 40 items determines the participant’s reading comprehension ability, only these items
were administered. The reliability estimate (Cronbach’s alpha) for participants’ scores was 0.77.
Spatial ability measure. The Paper Folding Test assesses spatial reasoning ability
(Ekstrom, French, Harman, & Derman, 1976). This test consists of 2 sets of 10 items that ask the
test taker to envision a piece of paper being folded in different ways with a hole being punched
in that folded paper. Each item stem shows paper folded in a variety of ways and where the holes
are punched. The test taker must identify what the paper would look like completely unfolded
from 5 options to the right of the stem. Each set is timed for 3 minutes, and scores are adjusted
for guessing. Reliability estimates (Cronbach’s alpha) reported in the literature are around 0.75
(Ekstrom et al., 1976; Miyake, Friedman, Rettinger, Shah, & Hegarty, 2001).
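The text states only that scores are adjusted for guessing. A minimal sketch of one conventional adjustment, assuming the standard formula-scoring rule (number right minus number wrong divided by one less than the number of options), is given below; the rule and the example counts are assumptions, not details taken from this study.

```python
def formula_score(n_right, n_wrong, n_options=5):
    """Correction-for-guessing score: rights minus wrongs / (options - 1).

    Omitted items are neither rewarded nor penalized.
    """
    return n_right - n_wrong / (n_options - 1)

# Hypothetical examinee: 12 right, 4 wrong, 4 omitted out of 20 items.
score = formula_score(12, 4)
print(score)
```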
Procedure
Participants received and completed the informed consent and then completed the
demographic survey. The research team then administered the spatial ability test to the entire
group, with each of the two sets being timed for 3 minutes then collected. Following the spatial
ability measure, participants completed the reading comprehension test, vTORR, and TORR.
Due to space limitations, a small subsample (𝑛 = 148, 19.3%) completed the measures
on paper. The remaining participants (𝑛 = 620, 80.7%) completed the vTORR, TORR, and
reading comprehension measures on Qualtrics. Both administrations counterbalanced the order
of the vTORR, TORR, and reading comprehension measures, as well as the vTORR and TORR
subtests, to control for order and fatigue effects. Previous studies found no significant differences
across paper and online administration for the TORR (Alexander, Dumas, et al., 2016), and the
previous study of the vTORR used only online administration (Alexander, Singer, et al., 2016).
However, in our sample there were significant differences across paper and online administration
for the vTORR scores for students sampled from the College of Education (Paper 𝑀 =
14.347, Online 𝑀 = 13.160, 𝑡(273.51) = −2.915, 𝑝 = 0.004, 𝑑 = 0.33). No significant
differences were found for the TORR in the same sample (Paper 𝑀 = 10.898, Online 𝑀 =
10.104, 𝑡(289) = −1.839, 𝑝 = 0.067). It is important to note, though, that these differences
may be attributable to the two different College of Education samples rather than to the
administration method for the test, since the sample who completed the measure on paper was
drawn in the Fall semester, while the online sample was recruited in the Spring semester.
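The fractional degrees of freedom reported for the vTORR comparison (273.51) suggest an unequal-variance (Welch) t-test, although the test is not named in the text. The sketch below shows how such a statistic and a pooled-SD effect size could be computed from summary statistics. The two means are taken from the text; the standard deviations and group sizes are placeholders, so the printed values are not expected to reproduce the reported results.

```python
from math import sqrt

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled

# Online group first, which matches the negative sign of the reported t.
# SDs (3.6, 3.2) and group sizes (150, 140) are placeholders, not study values.
t, df = welch_t(13.160, 3.6, 150, 14.347, 3.2, 140)
print(round(t, 3), round(df, 2))
```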
Results
Performance on the Measures
To address the study’s first objective, we examined overall student performance on the
measures (Table 2). Subscale scores on the TORR and vTORR ranged from 0-5, with the total
score on each measure therefore ranging from 0-20. The possible scores on the reading
comprehension measure ranged from 0-40, though no participant scored below 2 or above 37
points. Scores on the spatial ability measure ranged from 0-20.
Though the scale is the same for both the vTORR and TORR, participants in this study
received significantly higher scores on the vTORR than they did on the TORR (𝑡(767) =
17.211, 𝑝 < 0.001, 𝑑 = 0.622). They also received higher scores on each section of the vTORR
than on the corresponding section of the TORR; this difference was significant for the analogy
section (𝑡(767) = 10.562, 𝑝 < 0.001, 𝑑 = 0.380), the anomaly section (𝑡(767) = 16.865, 𝑝 <
0.001, 𝑑 = 0.608), and the antithesis section (𝑡(767) = 11.830, 𝑝 < 0.001, 𝑑 = 0.427). The
participants’ significantly different average performance on the verbal and non-verbal relational
reasoning measures is a result that is further explored throughout the next analyses.
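As a rough consistency check on the effect sizes above, the reported 𝑑 values are close to 𝑡/√𝑛, one common effect size for paired comparisons. The sketch below makes that comparison using only the statistics reported in the text; it is an observation about the reported numbers and not a confirmed description of how 𝑑 was computed.

```python
from math import sqrt

n = 768  # participants, so each paired t-test has n - 1 = 767 degrees of freedom

# (t statistic, reported d) for the vTORR-minus-TORR comparisons in the text.
reported = {
    "Total":      (17.211, 0.622),
    "Analogy":    (10.562, 0.380),
    "Anomaly":    (16.865, 0.608),
    "Antithesis": (11.830, 0.427),
}

def d_from_paired_t(t, n):
    """Paired-samples effect size implied by a t statistic: t / sqrt(n)."""
    return t / sqrt(n)

for name, (t, d) in reported.items():
    print(f"{name:10s} t/sqrt(n) = {d_from_paired_t(t, n):.3f}  reported d = {d:.3f}")
```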
Table 2
Descriptive Statistics
Verbal Measures      Mean (SD)        Non-verbal Measures   Mean (SD)
vTORR                13.55 (3.578)    TORR                  10.91 (3.842)
vTORR Analogy         3.39 (1.222)    TORR Analogy           2.76 (1.377)
vTORR Anomaly         3.34 (1.270)    TORR Anomaly           2.27 (1.436)
vTORR Antinomy        3.10 (1.443)    TORR Antinomy          2.97 (1.530)
vTORR Antithesis      3.72 (1.376)    TORR Antithesis        2.92 (1.503)
Davis Reading        19.49 (6.266)    Spatial Ability        9.81 (4.244)
However, the lack of significant difference between the vTORR and TORR antinomy
sections drew our attention to an interesting pattern that emerges from the relative order
of the four sections’ mean scores within the vTORR and within the TORR. For both measures,
participants scored on average higher on the antithesis section than on the analogy section, and
higher on analogy than anomaly. Antinomy does not fit this pattern: participants’ mean antinomy
score was the highest mean section score on the TORR but the lowest on the vTORR. This result
indicates that there may be a unique relationship between symbolic system and antinomous
reasoning ability, an idea that is revisited during the examination of the combined factor
structure below.
Internal Stability
Our next research objective was to establish the internal consistency of the two Relational
Reasoning measures, the TORR and the vTORR. Cronbach’s alpha was calculated for both tests.
For the shortened form of the TORR, Cronbach’s alpha was 𝛼 = 0.72; for the shortened form of
the vTORR, Cronbach’s alpha was 𝛼 = 0.71. However, since the shortened form of each test
contained 20 items instead of the full 32, the Spearman-Brown prediction formula was used to
predict the reliability of scores from the full measure, 1.6 times as long. The reliability under the
prediction formula for the full TORR scores was calculated as 𝛼 = 0.80 and the reliability for
the full vTORR scores was also calculated as 𝛼 = 0.80.
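The Spearman-Brown step can be reproduced directly from the values reported above. The sketch below applies the prediction formula with a length factor of 32/20 = 1.6; the function name is ours.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability of a test lengthened by `length_factor`."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

length_factor = 32 / 20  # full form (32 items) relative to shortened form (20 items)

torr_full = spearman_brown(0.72, length_factor)
vtorr_full = spearman_brown(0.71, length_factor)
print(f"TORR:  {torr_full:.2f}")
print(f"vTORR: {vtorr_full:.2f}")
```

Both predictions round to 0.80, matching the values reported for the full-length measures.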
The prediction formula allows for comparisons between these reliability calculations and
previous reliability calculations for data from these measures; previous studies involving the
TORR found similar values for internal stability using Cronbach’s alpha (Alexander, Dumas, et
al., 2016; Dumas & Alexander, 2016) but the prior study of the vTORR found a reliability value
of 𝑟 = 0.62 using a test-retest correlation (Alexander, Singer, et al., 2016). The higher reliability
value in our study may stem from the use of different reliability calculations, but it may also be
due to the elimination of weaker items from the vTORR in the current study.
These previous studies also addressed the reliability of the subtests of the TORR and the
vTORR (Alexander, Dumas, et al., 2016; Alexander, Singer, et al., 2016; Dumas & Alexander,
2016). In the current study, reliability estimates for the subtests’ scores were calculated using
Cronbach’s alpha with the Spearman-Brown prediction formula and are reported in Table 3.
Previous studies of the TORR found subtest reliabilities ranging from 𝛼 = 0.51 to 0.65
(Alexander, Dumas, et al., 2016); our adjusted reliability estimates are higher, likely due again to
the elimination of the weakest three items from each section and the use of the prediction
formula. However, the prior study of the vTORR used coefficient H through a Confirmatory
Factor Analysis to determine the reliability of the 4 factors corresponding to the four subtests and
found that while reliability for the Analogy, Antinomy, and Antithesis section scores was
acceptable, the Anomaly section scores had very low reliability (Alexander, Singer, et al., 2016).
Our more consistent reliability values across the sections suggest that the low Anomaly score
reliability may have been caused by weaker items that we eliminated in our study.
Table 3
Reliability estimates for subtest scores (Cronbach’s alpha with Spearman-Brown prediction)
         Analogy   Anomaly   Antinomy   Antithesis
vTORR    .52       .54       .68        .73
TORR     .62       .62       .73        .71
Factor Structure
Individual test structure. To address the validity of the claim that relational reasoning
as measured by the TORR and vTORR consists of four types: analogy, anomaly, antinomy, and
antithesis, we ran a Confirmatory Factor Analysis on each of the two tests separately. The CFA
tested three models as depicted in Appendices C and D. Model A depicts a one-factor model
with relational reasoning as the single underlying latent construct, representing the view of
relational reasoning as a unitary construct. Model B is a four-factor model where the latent
constructs of analogy, anomaly, antinomy, and antithesis are separate, but related factors, and
this model supports the hypothesis that relational reasoning has four distinct sub-constructs
(Alexander & the DRLRL, 2012; Alexander, Dumas, et al., 2016; Alexander, Singer, et al.,
2016). The final model, Model C, is a two-factor model where the underlying constructs are
represented by a similarities factor (analogy) and a differences factor (anomaly, antinomy, and
antithesis). This model corresponds to the hypothesis that relational reasoning is two related but
distinct sub-constructs of similarities and differences.
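The three competing models differ only in how many correlated factors the 20 items load on, which can be seen in their degrees of freedom. The sketch below counts degrees of freedom under an assumed parameterization for binary items (one threshold and one loading per item, factor variances fixed to 1, all factor correlations free, no cross-loadings); the actual model setup may have used a different but equivalent identification.

```python
def cfa_df(n_items, n_factors):
    """Degrees of freedom for a first-order CFA of binary items.

    Sample statistics: one threshold per item plus all pairwise correlations.
    Free parameters: one threshold and one loading per item plus the
    correlations among factors (assumed parameterization, see lead-in).
    """
    sample_stats = n_items + n_items * (n_items - 1) // 2
    free_params = 2 * n_items + n_factors * (n_factors - 1) // 2
    return sample_stats - free_params

for name, k in [("Model A", 1), ("Model B", 4), ("Model C", 2)]:
    print(name, cfa_df(20, k))
```

Under these assumptions the counts are 170, 164, and 169, which agree with the degrees of freedom reported in Tables 4 and 5.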
The Confirmatory Factor Analyses were run using Mplus 6 (Muthén & Muthén, 2010).
To accommodate the dichotomous nature of the data and given the sufficient size of the sample,
we chose to use weighted least squares mean- and variance-adjusted estimation (WLSMV;
Hancock & Mueller, 2013). A priori guidelines to decide fit were determined based on the
WLSMV literature (Hancock & Mueller, 2013; Yu, 2002): χ² p-value > 0.05, comparative fit
index (CFI) and Tucker-Lewis index (TLI) ≥ 0.95, and root mean square error of approximation
(RMSEA) ≤ 0.05. Beyond the chi-square, additional measures of global fit such as the goodness-
of-fit index (GFI) and the adjusted goodness-of-fit index (AGFI) were not included because of
the impact of sample size on these statistics as well as because they are based on chi-square, and
so are often considered unnecessary (Sharma, Mukherjee, Kumar, & Dillon, 2005). Fit statistics
for the three models on the vTORR are reported in Table 4, for the TORR in Table 5.
Table 4
Model Fit Statistics for vTORR Confirmatory Factor Analyses
Model      χ²        df    χ² p-value   CFI     TLI     RMSEA
Model A    423.635   170   0.0000       0.857   0.840   0.044
Model B    185.415   164   0.1208       0.988   0.986   0.013
Model C    409.558   169   0.0000       0.865   0.848   0.043
Table 5
Model Fit Statistics for TORR Confirmatory Factor Analyses
Model      χ²        df    χ² p-value   CFI     TLI     RMSEA
Model A    794.389   170   0.0000       0.713   0.680   0.069
Model B    234.851   164   0.0002       0.967   0.962   0.024
Model C    740.611   169   0.0000       0.738   0.705   0.066
For both the vTORR and the TORR, the four-factor Model B was the best fitting model
under the above cutoff values. Graphic representations of each standardized model with latent
variable correlations and factor loadings (Jackson, Gillaspy, & Purc-Stephenson, 2009) are
included in the appendices: Appendix C for the vTORR and Appendix D for the TORR. Under
the guidelines stated above we can see that the two-factor Model C does not fit the data well for
either test; this indicates that anomaly, antinomy, and antithesis are separate constructs, not a
single differences factor. These results, along with the poor fit of Model A, replicate the prior
findings for the two measures (Alexander, Dumas, et al., 2016; Alexander, Singer, et al., 2016)
and indicate that both tests are measuring four latent constructs of analogy, anomaly, antinomy,
and antithesis as hypothesized in the literature. The matching factor structure of the TORR and
vTORR leads naturally to our next research question.
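The RMSEA values in Tables 4 and 5 can be approximately recovered from the reported chi-square statistics with the usual point-estimate formula, assuming a sample size of 768. The sketch below is a sanity check on the reported numbers; the software's exact computation under WLSMV estimation may differ in detail.

```python
from math import sqrt

def rmsea(chi_square, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

N = 768  # assumed analysis sample size

# (chi-square, df, reported RMSEA) taken from Tables 4 and 5.
fits = {
    "vTORR A": (423.635, 170, 0.044),
    "vTORR B": (185.415, 164, 0.013),
    "vTORR C": (409.558, 169, 0.043),
    "TORR A":  (794.389, 170, 0.069),
    "TORR B":  (234.851, 164, 0.024),
    "TORR C":  (740.611, 169, 0.066),
}

for name, (chi2, df, reported) in fits.items():
    print(f"{name}: computed {rmsea(chi2, df, N):.3f}, reported {reported:.3f}")
```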
Combined test structure. Our interest in the TORR and vTORR goes beyond their
individual structure: we seek to understand the impact of symbolic system of the stimuli on
relational reasoning through these two tests. To address this question, we began by running a
Confirmatory Factor Analysis when the two tests are combined rather than examined separately.
We focused on five possible models as seen in Appendix E. The first model, Model D, is a two-factor
model with all verbal items loaded on one factor and all the non-verbal items loaded
on a different but related factor. This model represents the hypothesis that the two tests are
mainly measuring students’ abilities within a symbolic system. Model E is a four-factor model
where items from each section of both the TORR and vTORR are loaded together to form four
distinct but related latent factors and Model F loads vTORR and TORR items separately to form
an eight-factor model, though the eight distinct factors are still related. The hypothesis behind the
four-factor Model E is that relational reasoning consists of the 4 forms regardless of the verbal or
non-verbal nature of the task; Model F’s eight factor structure addresses the possibility that
relational reasoning has four verbal forms and four non-verbal forms that are directly related.
The final two models are higher-order models, where the eight factors from Model F are
only indirectly related through higher order latent factors of either relational reasoning form
(Model G) or symbolic system (Model H). These final models examine whether the relational
reasoning scores reflect four higher-order form factors with the two symbolic systems nested within
each (Model G), or two higher-order symbolic system factors with the constructs of
analogy, anomaly, antinomy, and antithesis nested within each (Model H).
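The five combined models differ only in how the 40 items, or the eight section-level factors, are grouped. The sketch below expresses each grouping as a plain data structure; the item tuples and variable names are illustrative and do not correspond to the actual item identifiers or model syntax used in the analysis.

```python
from itertools import product

FORMS = ["analogy", "anomaly", "antinomy", "antithesis"]
SYSTEMS = ["verbal", "nonverbal"]  # vTORR and TORR items, respectively
ITEMS_PER_SECTION = 5

# Each item is represented as a (system, form, item number) tuple.
items = [
    (system, form, number)
    for system, form in product(SYSTEMS, FORMS)
    for number in range(1, ITEMS_PER_SECTION + 1)
]

def assign(key):
    """Group items into factors according to a key function."""
    factors = {}
    for item in items:
        factors.setdefault(key(item), []).append(item)
    return factors

model_d = assign(lambda it: it[0])           # 2 factors: symbolic system
model_e = assign(lambda it: it[1])           # 4 factors: reasoning form
model_f = assign(lambda it: (it[0], it[1]))  # 8 factors: system by form

# Higher-order models group the eight Model F factors rather than the items.
model_g = {form: [(system, form) for system in SYSTEMS] for form in FORMS}
model_h = {system: [(system, form) for form in FORMS] for system in SYSTEMS}

for name, model in [("D", model_d), ("E", model_e), ("F", model_f),
                    ("G", model_g), ("H", model_h)]:
    print(f"Model {name}: {len(model)} top-level factors")
```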
The Confirmatory Factor Analyses were again run using WLSMV estimation in Mplus 6
(Hancock & Mueller, 2013; Muthén & Muthén, 2010), with the same guidelines to decide fit
(Hancock & Mueller, 2013; Yu, 2002): χ² p-value > 0.05, CFI and TLI ≥ 0.95, and RMSEA ≤
0.05. Fit statistics for the models on the combined tests are reported in Table 6.
Table 6
Model Fit Statistics for Combined vTORR/TORR Confirmatory Factor Analyses
Model      χ²        df    χ² p-value   CFI     TLI     RMSEA
Model D    1405.61   739   0.0000       0.822   0.812   0.034
Model E    1521.13   734   0.0000       0.790   0.777   0.037
Model F    784.559   712   0.0301       0.981   0.979   0.012
Model G    943.644   726   0.0000       0.942   0.938   0.020
Model H    809.365   731   0.0229       0.979   0.978   0.012
Interestingly, neither the four-factor Model E nor the dual level four-factor Model G fit
the data well, indicating that there is some effect being measured differently in the verbal and
non-verbal tests beyond the four forms of relational reasoning seen in the structure of each
individual test. The poor fit of the two-factor Model D, however, supports that the tests are also
measuring something beyond just symbolic system. The eight-factor Model F and the two-level
Model H are the best fitting models, meeting the cutoffs for CFI, TLI, and RMSEA and with χ²
values smaller than those of Models D, E, and G. Graphic representations of standardized
Models F and H with latent variable correlations and factor loadings are included in Appendix E
(Jackson et al., 2009). These results indicate that relational reasoning scores measure two latent
factors representing the symbolic systems, with the four forms of relational reasoning nested
within each.
While Models F and H both fit the data well, the two-level Model H may be more helpful
in further understanding the relationship between the symbolic system and the forms of relational
reasoning because it separates the two symbolic system factors from the four relational reasoning
factors nested within. Examining the coefficients for the nested variables in this model as
reported in Appendix E, we can see that within both the non-verbal and verbal factors, the
respective analogy variable has the highest coefficient, with anomaly next, and then antithesis
with a smaller coefficient. Antinomy again breaks this pattern, performing differently
depending on the symbolic system in which it was measured: non-verbal antinomy has the
smallest coefficient within the non-verbal construct while verbal antinomy does not.
Convergent and Discriminant Validity
We continued to explore the role of symbolic system in relational reasoning through
investigating evidence related to convergent and discriminant validity of the TORR and vTORR.
To this end, we first examined the correlations between scores on the total TORR, the total
vTORR, and scores on two individual difference measures: the Davis reading comprehension
test and the Paper Folding spatial ability measure.
Full test correlations. If the TORR and vTORR both measure relational reasoning, we
would expect the correlation between these measures to be positive and strong. We also expect
the correlations between the TORR and the spatial ability measure and between the vTORR and
the reading comprehension measure to each be positive but weak to moderate. Finally, for
relational reasoning as measured by the TORR and vTORR to be considered a system-
independent construct, the correlations between the relational reasoning measures should be
stronger than their correlations with the relevant individual difference measures.
None of those expectations were met by the data. While the correlation between the
vTORR and the TORR was significant, the correlation was moderate, lower than would be
expected for two tests measuring the same construct. Next, the correlation between the two
verbal tests was also significant and moderate, as was the correlation between the two non-verbal
tests; both of these correlations were larger than the correlation between the vTORR and the
TORR. Table 7 contains the matrix of all of the correlations between the four measures.
As unexpected as these results are, they are consistent with the results from the combined
factor analysis, specifically that the measurement of relational reasoning is affected by the
symbolic system used in the measure. However, there is an additional piece of data that is
revealed in the correlation matrix: the correlation between the TORR and the measure of
Reading Comprehension is also significant and moderate. This raises an interesting question
about the role of reading comprehension in the non-verbal TORR.
Table 7
Spearman rho Correlation Coefficients Between vTORR, TORR, Spatial Reasoning Ability, and
Reading Comprehension
                   vTORR   TORR    Reading Comp   Spatial Ability
vTORR                      .34**   .49**          .22**
TORR                               .42**          .44**
Reading Comp                                      .23**
Spatial Ability
* 𝑝 < 0.05, ** 𝑝 < 0.001
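The system-independence criterion described earlier (the correlation between the two relational reasoning tests should exceed each test's correlation with its same-system individual difference measure) can be checked directly against Table 7. The sketch below uses only the coefficients from the table; the variable names are ours.

```python
# Spearman rho coefficients from Table 7.
rho = {
    ("vTORR", "TORR"): 0.34,
    ("vTORR", "Reading"): 0.49,
    ("vTORR", "Spatial"): 0.22,
    ("TORR", "Reading"): 0.42,
    ("TORR", "Spatial"): 0.44,
    ("Reading", "Spatial"): 0.23,
}

between_tests = rho[("vTORR", "TORR")]        # same construct, different system
within_verbal = rho[("vTORR", "Reading")]     # both verbal measures
within_nonverbal = rho[("TORR", "Spatial")]   # both non-verbal measures

system_independent = between_tests > max(within_verbal, within_nonverbal)
print(system_independent)  # False
```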
Multi-Trait Multi-Method Matrix. To further examine the convergent and discriminant
validity of the TORR and vTORR as measures of four sub-constructs of relational reasoning, we
constructed a Multi-Trait Multi-Method matrix of Spearman Rho correlation coefficients
(MTMM; Campbell & Fiske, 1959). In our case, the four sub-constructs of relational reasoning
served as the multiple traits and the symbolic system employed in the measure distinguished the
methods.
The first question examined through a MTMM matrix focuses on convergent validity: are
the corresponding TORR and vTORR sections, which theoretically measure the same trait
through different methods, correlated? The MTMM matrix also has the potential to provide
evidence related to discriminant validity through the comparison of the same trait-different
method correlations to other correlation values in the matrix. The same trait-different method
values should be higher than the different trait-different method correlations; they should also be
higher than different trait-same method correlations. Our MTMM matrix is displayed in Table 8.
The data again deviates from expectation. Though the convergent validity correlation
values that address the relationship between each corresponding section across tests are
significant, they are low. In addition, these are some of the lowest correlation values in the
matrix. Verbal sections like the vTORR Analogy section are more highly correlated to the other
three verbal sections than they are to the theoretically corresponding section on the TORR.
Similarly, non-verbal sections, such as the TORR Anomaly section, are more highly correlated to
the other non-verbal TORR sections than to the parallel vTORR section. This evidence from the
MTMM matrix further supports that these tests of relational reasoning are impacted strongly by
the symbolic system of the measure.
Table 8
Multitrait-Multimethod Matrix
                    vTORR     vTORR     vTORR      vTORR        TORR      TORR      TORR       TORR
                    Analogy   Anomaly   Antinomy   Antithesis   Analogy   Anomaly   Antinomy   Antithesis
vTORR Analogy                 0.29**    0.29**     0.31**       0.19**    0.21**    0.15**     0.09*
vTORR Anomaly                           0.28**     0.20**       0.23**    0.17**    0.15**     0.10**
vTORR Antinomy                                     0.24**       0.20**    0.21**    0.17**     0.13**
vTORR Antithesis                                                0.16**    0.15**    0.12**     0.17**
TORR Analogy                                                              0.38**    0.22**     0.25**
TORR Anomaly                                                                        0.21**     0.28**
TORR Antinomy                                                                                  0.14**
TORR Antithesis
* 𝑝 < 0.05, ** 𝑝 < 0.001
The MTMM matrix also reveals another curious set of data. The convergent validity
correlation values between the corresponding TORR and vTORR sections are also not as strong
as the relationships between non-corresponding TORR and vTORR sections. This result is
challenging to interpret, but it indicates that the four constructs measured by the TORR and
vTORR may not be parallel in the way we theorized, even after considering the symbolic system
used in the measure.
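The comparisons drawn from the MTMM matrix can be made explicit with a short sketch. It reads the upper triangle of Table 8 and, for each form, compares the same trait-different method (convergent) correlation with the mean of the different trait-same method correlations in each symbolic system and with the largest different trait-different method correlation. The summary statistics chosen here are ours, for illustration, and assume the table rows are aligned as an upper triangle.

```python
TRAITS = ["Analogy", "Anomaly", "Antinomy", "Antithesis"]
# "v" marks vTORR (verbal) sections and "n" marks TORR (non-verbal) sections.
LABELS = [("v", t) for t in TRAITS] + [("n", t) for t in TRAITS]

# Upper triangle of Table 8, row by row.
UPPER = [
    [0.29, 0.29, 0.31, 0.19, 0.21, 0.15, 0.09],
    [0.28, 0.20, 0.23, 0.17, 0.15, 0.10],
    [0.24, 0.20, 0.21, 0.17, 0.13],
    [0.16, 0.15, 0.12, 0.17],
    [0.38, 0.22, 0.25],
    [0.21, 0.28],
    [0.14],
]

# Symmetric lookup keyed by pairs of section labels.
r = {}
for i, row in enumerate(UPPER):
    for offset, value in enumerate(row, start=1):
        a, b = LABELS[i], LABELS[i + offset]
        r[(a, b)] = r[(b, a)] = value

def summary(trait):
    """Convergent r, mean same-method r (verbal, non-verbal), max cross r."""
    others = [t for t in TRAITS if t != trait]
    convergent = r[(("v", trait), ("n", trait))]
    same_verbal = sum(r[(("v", trait), ("v", o))] for o in others) / len(others)
    same_nonverbal = sum(r[(("n", trait), ("n", o))] for o in others) / len(others)
    cross_other = max(
        max(r[(("v", trait), ("n", o))], r[(("n", trait), ("v", o))])
        for o in others
    )
    return convergent, same_verbal, same_nonverbal, cross_other

for trait in TRAITS:
    c, sv, sn, co = summary(trait)
    print(f"{trait:10s} convergent={c:.2f} verbal={sv:.2f} "
          f"nonverbal={sn:.2f} max cross={co:.2f}")
```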
Discussion
Relational reasoning is a key cognitive ability that has been framed as general and
applicable across all domains and contexts (Alexander & the DRLRL, 2012; Alexander, Singer,
et al., 2016). However, research has shown that the symbolic system affects cognition in tasks
that are verbal, non-verbal, and in tasks involving both systems (Ainsworth, 2006; Larkin &
Simon, 1987; Paivio, 2007; Schnotz, 2014). In this study, we explored the role of these symbolic
systems in relational reasoning through two measures, one verbal and one non-verbal.
Overall, the results show that the TORR and vTORR provide promising avenues for
further study of relational reasoning. Our initial findings support those of previous studies of the
TORR and vTORR (Alexander, Dumas, et al., 2016; Alexander, Singer, et al., 2016): both tests are
reasonably reliable, even given their shortened length in this study, and the TORR and vTORR each
separately support a 4-factor conceptualization of relational reasoning as proposed in the literature
(Alexander & the DRLRL, 2012). However, the remaining evidence indicates that relational
reasoning and its four forms as measured by the TORR and vTORR are impacted by the
symbolic system of the tasks in the tests.
The differences in performance on the TORR and vTORR are evident descriptively.
Contrary to our hypothesis, participants performed significantly better on the vTORR than on the
TORR, and better on three of the four verbal subtests than on their non-verbal counterparts. This
difference in performance on sections designed to be parallel is the first indicator that the
symbolic system plays a role in relational reasoning ability. A previous study found a similar
result, where participants scored higher on the verbal than non-verbal measure; this difference
was theorized to be because of the ability of words to convey more subtleties (Alexander, Singer,
et al., 2016). While this may account for the overall performance differences, it does
not account for the inconsistent behavior of the antinomy scores as compared to the other
sections on the same test. This implies there may be a unique impact of symbolic system on
antinomous reasoning: whatever property of non-verbal stimuli makes the other items more
challenging for students may not apply to antinomous reasoning. However,
antinomies could also appear to be differentially affected by the symbolic system because of the
design of the items on the two measures; antinomy items on the vTORR ask participants to
determine the incompatibility of a single statement with one of two given paragraphs, while
antinomy items on the TORR ask participants to determine the incompatibility of entire sets with
a given set.
The relationship between symbolic system and relational reasoning in these two measures
continues to come into focus in the Confirmatory Factor Analysis for the scores on the TORR
and vTORR combined. As hypothesized, both relational reasoning and symbolic system played a
role in the best fitting model. Unexpectedly, however, the analogy, anomaly, antinomy, and
antithesis factors nested in the verbal and non-verbal systems rather than the other way around.
This evidence may indicate that the symbolic system of the task plays a more influential role in
relational reasoning than previously considered. We also again see antinomy playing a unique
role, as evidenced by the higher-order factor loadings: antinomy loads more strongly on the
verbal factor than it does on the non-verbal factor. One possible reason for this stems from the
abstract nature of verbal representations versus the more constrained, concrete nature of non-
verbal representations (Larkin & Simon, 1987; Schnotz, 2014). The ability to convey
abstractions is important for determining general rules or themes that apply to different sets of
information, and the ability to determine these overarching rules is key for identifying
incompatibilities whether the initial stimuli are verbal or non-verbal.
To further explore the role of symbolic system in relational reasoning, we examined
convergent and discriminant validity for the measures using a reading comprehension test and a
spatial ability test as measures of the participants’ individual differences in ability for the two
symbolic systems. We found that although scores on the two relational reasoning tests are
significantly correlated, this correlation is not stronger than each relational reasoning
measure’s relationship with the relevant individual difference measure. These results indicate
that a participant’s relational reasoning score in a given symbolic system is more closely related
to their general ability in that system than to their relational reasoning score in the other
system, relationships that may be due to familiarity with the
system (Jablansky et al., 2016) or due to a better understanding of the formatting and syntax
rules within a symbolic system (Ainsworth, 2006).
Finally, the Multitrait-Multimethod matrix (MTMM) showed that correlations between
corresponding sections across the two tests were consistently lower than some or all of the
correlations between the sections measured through the same symbolic system. This evidence
further supports that the method, in our case the symbolic system, plays a strong role in
participants’ relational reasoning abilities.
These results tell a consistent story about the important impact of symbolic system on
relational reasoning ability. Looking back, we can see there are a variety of distinct properties of
the two symbolic systems that could have an effect on the way learners reason relationally
within that system. For instance, the sequential nature of verbal representations (Larkin &
Simon, 1987; Paivio, 2007) may influence a reasoner to take a more linear and systematic
approach to relational reasoning in a verbal problem while using a more holistic, spatial
approach to a non-verbal one. Verbal representations also conform to a commonly understood
syntax that could help the participants to identify which components of a sentence or paragraph
are particularly important to connect; non-verbal representations also conform to format rules,
but these rules are often less familiar, especially for the geometric figures used in the TORR
(Ainsworth, 2006; Alexander, Singer, et al., 2016; Jablansky et al., 2016).
These differences across symbolic systems can explain both why students performed
better on average on the verbal relational reasoning questions and why the task’s symbolic
system has such a consistent impact on relational reasoning ability. Together, the results from
this study lead us to conclude that relational reasoning can and should be measured verbally and
non-verbally, as the system of the task plays a role in relational reasoning ability.
Connections
The importance of understanding the impact of symbolic system on relational reasoning
ability goes beyond simply further understanding the construct of relational reasoning. On their
own, the differences in relational reasoning across symbolic systems should be considered when
designing instruction using relational reasoning and especially when developing any instruction
specifically intending to target learners’ relational reasoning abilities. If relational reasoning
abilities are impacted by the symbolic system of the task, then any training in relational
reasoning will need to consciously consider the symbolic systems used in the tasks that students
are being trained for.
Recent work on relational reasoning, however, extends the relevance of this study’s
findings through connecting relational reasoning implicitly and explicitly to understanding
information presented through multiple representations, and specifically text-graphic processing
(Danielson & Sinatra, 2016; Murphy et al., 2016). Ideally, to create a fully integrated
understanding of a concept described through both verbal and non-verbal representations, a
learner must identify and use the relations between the representations (Danielson & Sinatra,
2016; Mayer, 2014). This form of relational reasoning involves connecting stimuli within each of
the symbolic systems but also drawing connections across the systems – a task that could prove
more challenging if students’ relational reasoning abilities are affected by the
symbolic system of the stimuli.
Limitations and Future Work
Some data from this study, however, do not immediately seem to fit the consistent
pattern described above. The first outlying result was the high correlation between the TORR and
reading comprehension when we addressed the convergent and discriminant validity of the
TORR and vTORR. This high value may seem anomalous at first, but an examination of the
TORR reveals a possible cause: instructions for each section of the non-verbal test are given in
words. Because of this, some reading comprehension skill is necessary for participants to
understand the items. Relatedly, the MTMM revealed unexpectedly high correlations between
non-corresponding sections of the TORR and vTORR. These results may indicate that the
relationship between symbolic system and the four forms of relational reasoning is more
complex than expected, or that some connections exist between the processes used for solving tasks involving
different forms of relational reasoning across systems. While this study cannot confirm
either explanation, the finding introduces an important avenue for future work: exploring in depth the
processes underlying each form of relational reasoning, both verbally and non-verbally, and
determining how the symbolic system impacts those processes.
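The Campbell and Fiske (1959) logic behind this comparison can be sketched in a few lines. The block below is an illustration only: it sorts the latent variable correlations reported for Model F (Table E3, Appendix E) into the three MTMM comparison types and averages each type. Using latent correlations in place of the observed-score correlations of the MTMM itself is an assumption made for illustration, and the names CORRELATIONS, classify, and summarize are hypothetical.

```python
from statistics import mean

# Latent variable correlations transcribed from Table E3 (Model F).
# Prefix "v" = vTORR (verbal); prefix "g" = TORR (non-verbal, figural).
CORRELATIONS = {
    ("gAnalogy", "vAnalogy"): 0.429,
    ("vAnomaly", "vAnalogy"): 0.685, ("vAnomaly", "gAnalogy"): 0.526,
    ("gAnomaly", "vAnalogy"): 0.460, ("gAnomaly", "gAnalogy"): 0.748,
    ("gAnomaly", "vAnomaly"): 0.335,
    ("vAntinomy", "vAnalogy"): 0.622, ("vAntinomy", "gAnalogy"): 0.375,
    ("vAntinomy", "vAnomaly"): 0.598, ("vAntinomy", "gAnomaly"): 0.363,
    ("gAntinomy", "vAnalogy"): 0.275, ("gAntinomy", "gAnalogy"): 0.386,
    ("gAntinomy", "vAnomaly"): 0.242, ("gAntinomy", "gAnomaly"): 0.373,
    ("gAntinomy", "vAntinomy"): 0.294,
    ("vAntithesis", "vAnalogy"): 0.649, ("vAntithesis", "gAnalogy"): 0.286,
    ("vAntithesis", "vAnomaly"): 0.441, ("vAntithesis", "gAnomaly"): 0.309,
    ("vAntithesis", "vAntinomy"): 0.437, ("vAntithesis", "gAntinomy"): 0.197,
    ("gAntithesis", "vAnalogy"): 0.166, ("gAntithesis", "gAnalogy"): 0.446,
    ("gAntithesis", "vAnomaly"): 0.207, ("gAntithesis", "gAnomaly"): 0.505,
    ("gAntithesis", "vAntinomy"): 0.192, ("gAntithesis", "gAntinomy"): 0.250,
    ("gAntithesis", "vAntithesis"): 0.290,
}


def split(factor):
    """Split a factor name such as 'vAnalogy' into (method, trait)."""
    return factor[0], factor[1:]


def classify(a, b):
    """Name the MTMM comparison type for a pair of distinct factors."""
    (method_a, trait_a), (method_b, trait_b) = split(a), split(b)
    if trait_a == trait_b:
        return "monotrait-heteromethod"  # convergent validity values
    if method_a == method_b:
        return "heterotrait-monomethod"
    return "heterotrait-heteromethod"


def summarize(correlations):
    """Return {comparison type: (count, mean correlation)}."""
    groups = {}
    for (a, b), r in correlations.items():
        groups.setdefault(classify(a, b), []).append(r)
    return {kind: (len(rs), mean(rs)) for kind, rs in groups.items()}


for kind, (count, average) in summarize(CORRELATIONS).items():
    print(f"{kind}: n = {count}, mean r = {average:.3f}")
```

Under this sketch, individual heterotrait-heteromethod values can be compared directly with the four monotrait-heteromethod values, which is the comparison at issue when non-corresponding sections of the two tests correlate more highly than expected.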
The main methodological issue remaining in this study has to do with the use of
shortened test forms. As discussed, these shortened versions were used to minimize fatigue
effects given the five measures participants were asked to complete. To allow for comparison,
specifically comparison of the reliabilities, we used the Spearman-Brown prediction formula to
predict what the reliability of each full test would be based on its shortened form. The problem with
this technique, though, is that the predicted reliabilities are likely to be overestimates
because we removed the weakest items. A future study could develop a few new
items to replace the particularly poor items that we eliminated; it would also allow for replication
of the results from this study, which would help solidify our understanding of the role of
symbolic system in relational reasoning.
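For readers unfamiliar with the Spearman-Brown prediction formula, a minimal sketch follows. It implements the standard form, predicted reliability = k·r / (1 + (k − 1)·r), where r is the reliability of the existing form and k is the ratio of the new test length to the current length. The function name spearman_brown and the example values (a reliability of .70 on a 20-item form projected to a 32-item form) are illustrative assumptions, not values taken from this study.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after changing test length by length_factor.

    reliability   -- reliability of the existing (shortened) form
    length_factor -- new length divided by current length (e.g., 32 / 20)
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a 20-item short form with reliability .70,
# projected to a 32-item full form.
predicted = spearman_brown(0.70, 32 / 20)
print(f"Predicted full-form reliability: {predicted:.3f}")
```

The formula assumes that the added items are of the same quality as the retained ones. Projecting from a form built by dropping the weakest items therefore tends to overstate the reliability of the full test, which is the concern raised above.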
Finally, further study could explore how the differences identified in the current study
apply to tasks involving a mix of symbolic systems, such as those studied in Danielson and Sinatra (2016) and
Murphy et al. (2016), using the vTORR and TORR as measures of verbal and non-verbal
relational reasoning ability, respectively. Ideally, the availability of stable, carefully studied verbal and
non-verbal measures of relational reasoning will allow research on the process and applicability of
relational reasoning across symbolic systems to progress in new and interesting directions.
34
References
Ainsworth, S. (2006). DeFT: A conceptual framework for considering learning with multiple
representations. Learning and Instruction, 16(3), 183-198.
doi:10.1016/j.learninstruc.2006.03.001
Alexander, P. A., & The Disciplined Reading and Learning Research Laboratory. (2012).
Reading into the future: Competence for the 21st century. Educational Psychologist,
47(4), 259–280. doi:10.1080/00461520.2012.722511
Alexander, P. A., Dumas, D., Grossnickle, E. M., List, A., & Firetto, C. M. (2016). Measuring
relational reasoning. The Journal of Experimental Education, 84(1), 119-133.
doi:10.1080/00220973.2014.963216
Alexander, P. A., Singer, L. M., Jablansky, S., & Hattan, C. (2016). Relational reasoning in word
and in figure. Journal of Educational Psychology. Advance online publication.
doi:10.1037/edu0000110
Braasch, J. L. G., Goldman, S. R., & Wiley, J. (2013). The influences of text and reader
characteristics on learning from refutations in science texts. Journal of Educational
Psychology, 105(3), 561-578. doi:10.1037/a0032627
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the
multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.
doi:10.1037/h0046016
Chinn, C. A., & Malhotra, B. A. (2002). Children's responses to anomalous scientific data: How
is conceptual change impeded? Journal of Educational Psychology, 94(2), 327-343.
doi:10.1037//0022-0663.94.2.327
Danielson, R. W., & Sinatra, G. M. (2016). A relational reasoning approach to text-graphic
processing. Educational Psychology Review. Advance online publication.
doi:10.1007/s10648-016-9374-2
Davis, F. B., & Davis, C. C. (1962). Davis Reading Test. New York: The Psychological
Corporation.
DeWolf, M., Bassok, M., & Holyoak, K. J. (2015). Conceptual structure and the procedural
affordances of rational numbers: Relational reasoning with fractions and decimals.
Journal of Experimental Psychology: General, 144(1), 127-150. doi:10.1037/xge0000034
Dumas, D. (2016). Relational reasoning in science, medicine, and engineering. Educational
Psychology Review. Advance online publication. doi:10.1007/s10648-016-9370-6
Dumas, D., & Alexander, P. A. (2016). Calibration of the test of relational reasoning.
Psychological Assessment, 28(10), 1303-1318. doi:10.1037/pas0000267
Dumas, D., Alexander, P. A., Baker, L. M., Jablansky, S., & Dunbar, K. N. (2014). Relational
reasoning in medical education: Patterns in discourse and diagnosis. Journal of
Educational Psychology, 106(4), 1021–1035. doi:10.1037/a0036777
Dumas, D., Alexander, P. A., & Grossnickle, E. M. (2013). Relational reasoning and its
manifestations in the educational context: A systematic review of the literature.
Educational Psychology Review, 25(3), 391–427. doi:10.1007/s10648-013-9224
Dumas, D., & Schmidt, L. (2015). Relational reasoning as predictor for engineering ideation
success using TRIZ. Journal of Engineering Design, 26(1-3), 74.
doi:10.1080/09544828.2015.1020287
Ekstrom, R. B., French, J. W., Harman, H. H., & Dermen, D. (1976). Kit of factor-referenced
cognitive tests. Princeton, NJ: Educational Testing Service.
Fahnestock, J. (1999). Rhetorical figures in science. New York: Oxford University Press.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science,
7(2), 155–170. doi:10.1016/S0364-0213(83)80009-3
Gentner, D., & Jeziorski, M. (1993). The shift from metaphor to analogy in western science. In
A. Ortony (Ed.), Metaphor and thought (2nd ed., pp. 447-480). Cambridge, England:
Cambridge University Press.
Gentner, D., & Markman, A. B. (1997). Structure mapping in analogy and similarity. American
Psychologist, 52(1), 45-56. doi:10.1037/0003-066X.52.1.45
Goldstone, R. L., & Son, J. Y. (2005). Similarity. In K. J. Holyoak & R. G. Morrison (Eds.), The
Cambridge handbook of thinking and reasoning (pp. 13-36). New York: Cambridge
University Press.
Hancock, G. R., & Mueller, R. O. (2013). Structural equation modeling: A second course (2nd
ed.). Charlotte, NC: Information Age Publishing, Inc.
Holyoak, K. J. (2005). Analogy. In K. J. Holyoak & R. G. Morrison (Eds.), The Cambridge
handbook of thinking and reasoning (pp. 117-142). New York: Cambridge University
Press.
Holyoak, K. J., & Morrison, R. G. (Eds.). (2005). The Cambridge handbook of thinking and
reasoning. New York: Cambridge University Press.
Jablansky, S., Alexander, P. A., Dumas, D., & Compton, V. (2016). Developmental differences
in relational reasoning among primary and secondary school students. Journal of
Educational Psychology, 108(4), 592-608. doi:10.1037/edu0000070
Jackson, D. L., Gillaspy, J. A., & Purc-Stephenson, R. (2009). Reporting practices in
confirmatory factor analysis: An overview and some recommendations. Psychological
Methods, 14(1), 6-23. doi:10.1037/a0014694
Johnson-Laird, P. N. (2005). Mental models and thought. In K. J. Holyoak & R. G. Morrison
(Eds.), The Cambridge handbook of thinking and reasoning (pp. 185-208). New York:
Cambridge University Press.
Kendeou, P., Butterfuss, R., Van Boekel, M., & O’Brien, E. J. (2016). Integrating relational
reasoning and knowledge revision during reading. Educational Psychology Review.
Advance online publication. doi:10.1007/s10648-016-9381-3
Krawczyk, D. C. (2012). The cognition and neuroscience of relational reasoning. Brain
Research, 1428, 13-23. doi:10.1016/j.brainres.2010.11.080
Larkin, J. H., & Simon, H. A. (1987). Why a diagram is (sometimes) worth ten thousand words.
Cognitive Science, 11(1), 65-100. doi:10.1016/S0364-0213(87)80026-5
Mayer, R. E. (2014). Cognitive theory of multimedia learning. In R. E. Mayer (Ed.), The
Cambridge handbook of multimedia learning (2nd ed., pp. 43-71). New York: Cambridge
University Press.
Miyake, A., Friedman, N. P., Rettinger, D. A., Shah, P., & Hegarty, M. (2001). How are
visuospatial working memory, executive functioning, and spatial abilities related? A
latent-variable analysis. Journal of Experimental Psychology: General, 130(4), 621-640.
doi:10.1037/0096-3445.130.4.621
Murphy, P. K., Firetto, C. M., & Greene, J. A. (2016). Enriching students’ scientific thinking
through relational reasoning: Seeking evidence in texts, tasks, and talk. Educational
Psychology Review. Advance online publication. doi:10.1007/s10648-016-9387-x
Muthén, L. K., & Muthén, B. O. (2010). Mplus user’s guide (6th ed.). Los Angeles, CA:
Authors.
Paivio, A. (1979). Imagery and verbal processes. Hillsdale, NJ: Lawrence Erlbaum Associates.
Paivio, A. (2007). Mind and its evolution: A dual coding theoretical approach. Mahwah, NJ:
Lawrence Erlbaum Associates.
Quine, W. V. (1969). Ontological relativity: And other essays. New York: Columbia University
Press.
Raven, J. C. (1941). Standardization of Progressive Matrices, 1938. British Journal of Medical
Psychology, 19, 137–150. doi:10.1111/j.2044-8341.1941.tb00316.x
Richland, L. E., Begolli, K. N., Simms, N., Frausel, R. R., & Lyons, E. A. (2016). Supporting
mathematical discussions: The roles of comparison and cognitive load. Educational
Psychology Review. Advance online publication. doi:10.1007/s10648-016-9382-2
Schnotz, W. (2014). Integrated model of text and picture comprehension. In R. E. Mayer (Ed.),
The Cambridge handbook of multimedia learning (2nd ed., pp. 72-103). New York:
Cambridge University Press.
Sharma, S., Mukherjee, S., Kumar, A., & Dillon, W. R. (2005). A simulation study to investigate
the use of cutoff values for assessing model fit in covariance structure models. Journal of
Business Research, 58(7), 935-943. doi:10.1016/j.jbusres.2003.10.007
Sternberg, R. J. (1977). Intelligence, information processing, and analogical reasoning: The
componential analysis of human abilities. Lawrence Erlbaum Associates.
Yu, C. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with
binary and continuous outcomes. Unpublished doctoral dissertation, University of
California, Los Angeles.
39
Appendix A
Sample Items from the verbal Test of Relational Reasoning
Figure A1. Sample vTORR Analogy item.
Directions: For each problem, the given sentence describes a situation. Select the sentence from the answer choices below that describes the most similar situation.
The prison guard walked the inmate back to his cellblock and put him in his jail cell.
A. The grizzly bear climbed into her cave and went to sleep for the winter.
B. The teacher led the kindergartners to the parking lot and placed them on the wrong bus.
C. The teenager carried a box of cereal into the kitchen and left it on the counter.
D. The librarian carried the encyclopedia back to its shelf and slid it into its proper space.
Figure A2. Sample vTORR Anomaly item.
Directions: Three sentences in each question follow a particular pattern or rule. Find this pattern or rule and select the sentence that does not follow the pattern.
A. The pride of lions devoured a wildebeest.
B. The school of piranhas feasted on the bird.
C. The herd of buffalo munched on the grass.
D. The flock of vultures fed on the carrion.
40
Figure A3. Sample vTORR Antinomy item.
Directions: In each problem, read the two paragraphs. Then select the sentence from the answer choices that includes an idea that could be reflected in one paragraph, but not the other.
Running is a great way to get in shape and relieve stress. Long runs leave me feeling energized and excited for the day. They also boost my self-confidence and make me feel like I can accomplish anything.
Team sports provide opportunities to spend time with friends, with the added benefit of exercise! I enjoy bonding with my friends during a Saturday morning soccer game or Wednesday night kickball tournament.
A. Exercise is valuable for your body.
B. Running can be exhausting.
C. Exercise should be done socially.
D. Kickball isn’t really a sport.
Figure A4. Sample vTORR Antithesis item.
Directions: For each problem, the given sentence describes a situation. Select the sentence from the answer choices below that describes the opposite situation.
Lost in the wilderness, the adventurer started a fire to keep warm.
A. While sitting in their home, the family turned up the thermostat to heat the house.
B. Running along the marked course, the runner splashed water over his head to stay cool.
C. Lost at sea, the sailors huddled in a group to conserve body heat.
D. Wandering the tundra, the Eskimo wore a sealskin coat to shield him from the cold.
41
Appendix B
Sample Items from the Test of Relational Reasoning
Directions: Below is a pattern that is not yet complete. Select the figure from those shown below that completes the pattern.
Figure B1. Sample TORR Analogy item.
42
Directions: All these figures but one follow a particular pattern or rule. Find the one figure that does not follow the pattern.
Figure B2. Sample TORR Anomaly item.
43
Directions:
• The problems in this section ask you to compare sets of objects that vary in certain features.
• Each set has a specific rule that decides what objects can be included in that set. Some of the objects included in each set are pictured, enough to allow you to determine its rule for inclusion.
• Every problem asks you to identify which ONE of the four sets that are shown could NEVER have an object in common with the Given set, based on the compatibility of their rules for inclusion.
There will always be EXACTLY 1 set that is incompatible with the Given set.
Figure B3. Sample TORR Antinomy item.
44
Directions: The given figure below depicts a process in which X becomes Y. In the figure, the arrow represents the rule by which the change occurs. Select the answer choice that shows the opposite of the given process.
Figure B4. Sample TORR Antithesis item.
45
Appendix C
CFA Model Results for the TORR
Model A
Model B
Model C
Figure C1. Confirmatory factor analysis models for the TORR.
46
Table C1 Factor loadings and latent variable correlation estimates for the TORR Model A
Factor loading Standard error z-value p-value
Relational reasoning by
ANALO1 0.396 0.056 7.035 0.000
ANALO2 0.606 0.053 11.403 0.000
ANALO3 0.414 0.055 7.530 0.000
ANALO4 0.715 0.051 13.944 0.000
ANALO5 0.520 0.054 9.675 0.000
ANOM1 0.595 0.042 14.268 0.000
ANOM2 0.361 0.049 7.388 0.000
ANOM3 0.361 0.048 7.479 0.000
ANOM4 0.562 0.042 13.437 0.000
ANOM5 0.364 0.048 7.547 0.000
ANTIN1 0.491 0.049 9.945 0.000
ANTIN2 0.550 0.042 13.187 0.000
ANTIN3 0.546 0.043 12.727 0.000
ANTIN4 0.383 0.047 8.113 0.000
ANTIN5 0.387 0.048 8.141 0.000
ANTITH1 0.423 0.051 8.341 0.000
ANTITH2 0.405 0.046 8.785 0.000
ANTITH3 0.493 0.045 11.003 0.000
ANTITH4 0.360 0.048 7.565 0.000
ANTITH5 0.532 0.044 12.126 0.000
47
Table C2 Factor loadings and latent variable correlation estimates for the TORR Model B
Factor loading Standard error z-value p-value
Analogy by
ANALO1 0.399 0.055 7.230 0.000
ANALO2 0.620 0.051 12.077 0.000
ANALO3 0.402 0.054 7.388 0.000
ANALO4 0.711 0.050 14.246 0.000
ANALO5 0.516 0.053 9.743 0.000
Anomaly by
ANOM1 0.709 0.044 16.116 0.000
ANOM2 0.421 0.053 7.896 0.000
ANOM3 0.433 0.052 8.262 0.000
ANOM4 0.681 0.044 15.544 0.000
ANOM5 0.417 0.053 7.825 0.000
Antinomy by
ANTIN1 0.694 0.047 14.776 0.000
ANTIN2 0.740 0.044 16.688 0.000
ANTIN3 0.739 0.044 16.903 0.000
ANTIN4 0.534 0.049 10.934 0.000
ANTIN5 0.557 0.048 11.573 0.000
Antithesis by
ANTITH1 0.539 0.055 9.788 0.000
ANTITH2 0.581 0.049 11.955 0.000
ANTITH3 0.688 0.048 14.480 0.000
ANTITH4 0.497 0.051 9.696 0.000
ANTITH5 0.725 0.046 15.621 0.000
Latent variable correlations Standard error z-value p-value
Anomaly with
Analogy 0.745 0.055 13.534 0.000
Antinomy with
Analogy 0.382 0.058 6.627 0.000
Anomaly 0.373 0.057 6.600 0.000
Antithesis with
Analogy 0.447 0.059 7.531 0.000
Anomaly 0.502 0.056 8.915 0.000
Antinomy 0.252 0.056 4.483 0.000
48
Table C3 Factor loadings and latent variable correlation estimates for the TORR Model C
Factor loading Standard error z-value p-value
Similarities (analogy) by
ANALO1 0.396 0.056 7.035 0.000
ANALO2 0.606 0.053 11.403 0.000
ANALO3 0.414 0.055 7.530 0.000
ANALO4 0.715 0.051 13.944 0.000
ANALO5 0.520 0.054 9.675 0.000
Differences by
ANOM1 0.595 0.042 14.268 0.000
ANOM2 0.361 0.049 7.388 0.000
ANOM3 0.361 0.048 7.479 0.000
ANOM4 0.562 0.042 13.437 0.000
ANOM5 0.364 0.048 7.547 0.000
ANTIN1 0.491 0.049 9.945 0.000
ANTIN2 0.550 0.042 13.187 0.000
ANTIN3 0.546 0.043 12.727 0.000
ANTIN4 0.383 0.047 8.113 0.000
ANTIN5 0.387 0.048 8.141 0.000
ANTITH1 0.423 0.051 8.341 0.000
ANTITH2 0.405 0.046 8.785 0.000
ANTITH3 0.493 0.045 11.003 0.000
ANTITH4 0.360 0.048 7.565 0.000
ANTITH5 0.532 0.044 12.126 0.000
Latent variable correlations Standard error z-value p-value
Differences with
Similarities 0.680 0.046 14.807 0.000
49
Appendix D
CFA Model Results for the vTORR
Model A
Model B
Model C
Figure D1. Confirmatory factor analysis models for the vTORR.
50
Table D1 Factor loadings and latent variable correlation estimates for the vTORR Model A
Factor loading Standard error z-value p-value
Relational reasoning by
ANALO1 0.341 0.049 7.009 0.000
ANALO2 0.390 0.055 7.079 0.000
ANALO3 0.293 0.050 5.864 0.000
ANALO4 0.529 0.045 11.712 0.000
ANALO5 0.503 0.051 9.930 0.000
ANOM1 0.295 0.050 5.912 0.000
ANOM2 0.400 0.048 8.315 0.000
ANOM3 0.464 0.047 9.825 0.000
ANOM4 0.242 0.050 4.797 0.000
ANOM5 0.392 0.056 7.015 0.000
ANTIN1 0.400 0.050 8.023 0.000
ANTIN2 0.379 0.047 8.094 0.000
ANTIN3 0.389 0.048 8.099 0.000
ANTIN4 0.543 0.043 12.654 0.000
ANTIN5 0.620 0.042 14.621 0.000
ANTITH1 0.525 0.044 11.825 0.000
ANTITH2 0.437 0.048 9.076 0.000
ANTITH3 0.617 0.042 14.648 0.000
ANTITH4 0.593 0.043 13.875 0.000
ANTITH5 0.536 0.049 10.851 0.000
51
Table D2 Factor loadings and latent variable correlation estimates for the vTORR Model B
Factor loading Standard error z-value p-value
Analogy by
ANALO1 0.396 0.055 7.168 0.000
ANALO2 0.458 0.061 7.525 0.000
ANALO3 0.329 0.057 5.785 0.000
ANALO4 0.613 0.052 11.736 0.000
ANALO5 0.583 0.058 10.098 0.000
Anomaly by
ANOM1 0.376 0.058 6.496 0.000
ANOM2 0.528 0.056 9.386 0.000
ANOM3 0.614 0.058 10.535 0.000
ANOM4 0.313 0.059 5.296 0.000
ANOM5 0.526 0.066 8.019 0.000
Antinomy by
ANTIN1 0.492 0.055 8.890 0.000
ANTIN2 0.473 0.051 9.359 0.000
ANTIN3 0.483 0.053 9.173 0.000
ANTIN4 0.674 0.047 14.269 0.000
ANTIN5 0.773 0.048 16.138 0.000
Antithesis by
ANTITH1 0.635 0.047 13.419 0.000
ANTITH2 0.536 0.051 10.486 0.000
ANTITH3 0.755 0.042 17.904 0.000
ANTITH4 0.714 0.044 16.414 0.000
ANTITH5 0.622 0.053 11.629 0.000
Latent variable correlations Standard error z-value p-value
Anomaly with
Analogy 0.687 0.078 8.822 0.000
Antinomy with
Analogy 0.623 0.064 9.698 0.000
Anomaly 0.598 0.065 9.175 0.000
Antithesis with
Analogy 0.646 0.062 10.502 0.000
Anomaly 0.440 0.068 6.482 0.000
Antinomy 0.436 0.056 7.846 0.000
52
Table D3 Factor loadings and latent variable correlation estimates for the vTORR Model C
Factor loading Standard error z-value p-value
Similarities (analogy) by
ANALO1 0.392 0.055 7.074 0.000
ANALO2 0.457 0.061 7.481 0.000
ANALO3 0.330 0.057 5.800 0.000
ANALO4 0.617 0.052 11.780 0.000
ANALO5 0.582 0.058 10.076 0.000
Differences by
ANOM1 0.298 0.050 5.941 0.000
ANOM2 0.402 0.048 8.302 0.000
ANOM3 0.468 0.048 9.831 0.000
ANOM4 0.243 0.051 4.796 0.000
ANOM5 0.394 0.056 6.992 0.000
ANTIN1 0.405 0.050 8.069 0.000
ANTIN2 0.384 0.047 8.150 0.000
ANTIN3 0.393 0.048 8.125 0.000
ANTIN4 0.549 0.043 12.670 0.000
ANTIN5 0.626 0.043 14.670 0.000
ANTITH1 0.531 0.045 11.919 0.000
ANTITH2 0.442 0.048 9.120 0.000
ANTITH3 0.623 0.042 14.704 0.000
ANTITH4 0.598 0.043 13.901 0.000
ANTITH5 0.541 0.050 10.895 0.000
Latent variable correlations Standard error z-value p-value
Differences with
Similarities 0.795 0.051 15.533 0.000
53
Appendix E
CFA Model Results for the combined tests
Model D
Model E
Model F
Model G
Model H
Figure E1. Confirmatory factor analysis models for the combined tests.
55
Table E1 Factor loadings and latent variable correlation estimates for the combined Model D
Factor loading Standard error z-value p-value
Verbal by
VANALO1 0.326 0.050 6.530 0.000
VANALO2 0.399 0.056 7.076 0.000
VANALO3 0.308 0.050 6.120 0.000
VANALO4 0.526 0.046 11.452 0.000
VANALO5 0.517 0.052 9.994 0.000
VANOM1 0.303 0.051 5.989 0.000
VANOM2 0.410 0.049 8.420 0.000
VANOM3 0.480 0.048 9.983 0.000
VANOM4 0.251 0.051 4.904 0.000
VANOM5 0.414 0.057 7.266 0.000
VANTIN1 0.403 0.051 7.913 0.000
VANTIN2 0.376 0.048 7.880 0.000
VANTIN3 0.415 0.048 8.581 0.000
VANTIN4 0.530 0.045 11.847 0.000
VANTIN5 0.630 0.042 14.831 0.000
VANTITH1 0.512 0.045 11.302 0.000
VANTITH2 0.404 0.049 8.208 0.000
VANTITH3 0.569 0.045 12.583 0.000
VANTITH4 0.599 0.043 13.874 0.000
VANTITH5 0.540 0.050 10.726 0.000
Non-verbal by
GANALO1 0.342 0.049 6.977 0.000
GANALO2 0.513 0.045 11.390 0.000
GANALO3 0.352 0.049 7.168 0.000
GANALO4 0.599 0.044 13.664 0.000
GANALO5 0.437 0.048 9.026 0.000
GANOM1 0.578 0.042 13.707 0.000
GANOM2 0.359 0.049 7.384 0.000
GANOM3 0.375 0.048 7.781 0.000
GANOM4 0.574 0.041 13.919 0.000
GANOM5 0.394 0.047 8.354 0.000
GANTIN1 0.464 0.050 9.209 0.000
GANTIN2 0.506 0.044 11.410 0.000
GANTIN3 0.478 0.045 10.668 0.000
GANTIN4 0.414 0.047 8.767 0.000
GANTIN5 0.386 0.048 8.074 0.000
GANTITH1 0.401 0.052 7.769 0.000
GANTITH2 0.368 0.047 7.866 0.000
GANTITH3 0.446 0.046 9.598 0.000
GANTITH4 0.330 0.049 6.747 0.000
GANTITH5 0.507 0.045 11.193 0.000
Latent variable correlations Standard error z-value p-value
Non-verbal with
Verbal 0.485 0.044 10.919 0.000
Table E2 Factor loadings and latent variable correlation estimates for the combined Model E
Factor loading Standard error z-value p-value
Analogy by
vANALO1 0.293 0.051 5.768 0.000
vANALO2 0.385 0.058 6.608 0.000
vANALO3 0.288 0.051 5.616 0.000
vANALO4 0.488 0.048 10.188 0.000
vANALO5 0.492 0.055 8.920 0.000
gANALO1 0.344 0.050 6.953 0.000
gANALO2 0.525 0.047 11.275 0.000
gANALO3 0.351 0.050 6.963 0.000
gANALO4 0.598 0.044 13.444 0.000
gANALO5 0.441 0.049 8.993 0.000
Anomaly by
vANOM1 0.295 0.052 5.707 0.000
vANOM2 0.395 0.051 7.808 0.000
vANOM3 0.459 0.051 9.088 0.000
vANOM4 0.234 0.052 4.461 0.000
vANOM5 0.413 0.059 7.000 0.000
gANOM1 0.567 0.044 12.792 0.000
gANOM2 0.352 0.050 7.025 0.000
gANOM3 0.378 0.049 7.646 0.000
gANOM4 0.580 0.042 13.833 0.000
gANOM5 0.402 0.050 8.106 0.000
Antinomy by
vANTIN1 0.406 0.055 7.388 0.000
vANTIN2 0.383 0.051 7.490 0.000
vANTIN3 0.428 0.052 8.297 0.000
vANTIN4 0.537 0.048 11.081 0.000
vANTIN5 0.674 0.046 14.681 0.000
gANTIN1 0.544 0.050 10.820 0.000
gANTIN2 0.559 0.047 11.797 0.000
gANTIN3 0.524 0.046 11.289 0.000
gANTIN4 0.492 0.049 10.048 0.000
gANTIN5 0.450 0.049 9.149 0.000
Antithesis by
vANTITH1 0.556 0.048 11.587 0.000
vANTITH2 0.420 0.052 8.036 0.000
vANTITH3 0.590 0.048 12.239 0.000
vANTITH4 0.654 0.045 14.690 0.000
vANTITH5 0.575 0.054 10.676 0.000
gANTITH1 0.432 0.055 7.878 0.000
gANTITH2 0.415 0.050 8.371 0.000
gANTITH3 0.485 0.051 9.487 0.000
gANTITH4 0.365 0.053 6.888 0.000
gANTITH5 0.570 0.047 12.137 0.000
Latent variable correlations Standard error z-value p-value
Anomaly with
Analogy 0.904 0.045 19.920 0.000
Antinomy with
Analogy 0.602 0.046 13.153 0.000
Anomaly 0.586 0.049 12.072 0.000
Antithesis with
Analogy 0.582 0.051 11.514 0.000
Anomaly 0.571 0.052 11.012 0.000
Antinomy 0.414 0.049 8.377 0.000
Table E3 Factor loadings and latent variable correlation estimates for the combined Model F
Factor loading Standard error z-value p-value
vAnalogy by
vANALO1 0.370 0.056 6.606 0.000
vANALO2 0.463 0.062 7.487 0.000
vANALO3 0.342 0.057 5.998 0.000
vANALO4 0.611 0.053 11.633 0.000
vANALO5 0.594 0.058 10.189 0.000
vAnomaly by
vANOM1 0.379 0.058 6.494 0.000
vANOM2 0.523 0.058 9.083 0.000
vANOM3 0.605 0.059 10.288 0.000
vANOM4 0.311 0.060 5.174 0.000
vANOM5 0.543 0.066 8.252 0.000
vAntinomy by
vANTIN1 0.488 0.058 8.466 0.000
vANTIN2 0.465 0.053 8.830 0.000
vANTIN3 0.507 0.054 9.314 0.000
vANTIN4 0.651 0.049 13.260 0.000
vANTIN5 0.786 0.050 15.799 0.000
vAntithesis by
vANTITH1 0.639 0.048 13.278 0.000
vANTITH2 0.513 0.052 9.890 0.000
vANTITH3 0.714 0.044 16.137 0.000
vANTITH4 0.745 0.044 16.799 0.000
vANTITH5 0.648 0.055 11.746 0.000
gAnalogy by
gANALO1 0.397 0.054 7.294 0.000
gANALO2 0.622 0.050 12.526 0.000
gANALO3 0.411 0.054 7.661 0.000
gANALO4 0.707 0.049 14.354 0.000
gANALO5 0.514 0.052 9.834 0.000
gAnomaly by
gANOM1 0.678 0.045 15.131 0.000
gANOM2 0.411 0.054 7.674 0.000
gANOM3 0.443 0.053 8.369 0.000
gANOM4 0.683 0.044 15.606 0.000
gANOM5 0.452 0.053 8.515 0.000
gAntinomy by
gANTIN1 0.699 0.049 14.361 0.000
gANTIN2 0.728 0.048 15.125 0.000
gANTIN3 0.696 0.046 14.962 0.000
gANTIN4 0.587 0.051 11.440 0.000
gANTIN5 0.572 0.050 11.416 0.000
gAntithesis by
gANTITH1 0.555 0.057 9.741 0.000
gANTITH2 0.569 0.050 11.362 0.000
gANTITH3 0.666 0.051 13.107 0.000
gANTITH4 0.493 0.053 9.216 0.000
gANTITH5 0.746 0.049 15.289 0.000
Latent variable correlations Standard error z-value p-value
gAnalogy with
vAnalogy 0.429 0.075 5.709 0.000
vAnomaly with
vAnalogy 0.685 0.078 8.769 0.000
gAnalogy 0.526 0.073 7.242 0.000
gAnomaly with
vAnalogy 0.460 0.076 6.088 0.000
gAnalogy 0.748 0.055 13.517 0.000
vAnomaly 0.335 0.074 4.502 0.000
vAntinomy with
vAnalogy 0.622 0.064 9.646 0.000
gAnalogy 0.375 0.063 5.999 0.000
vAnomaly 0.598 0.065 9.163 0.000
gAnomaly 0.363 0.061 5.923 0.000
gAntinomy with
vAnalogy 0.275 0.067 4.072 0.000
gAnalogy 0.386 0.058 6.705 0.000
vAnomaly 0.242 0.068 3.550 0.000
gAnomaly 0.373 0.057 6.561 0.000
vAntinomy 0.294 0.059 4.980 0.000
vAntithesis with
vAnalogy 0.649 0.061 10.554 0.000
gAnalogy 0.286 0.064 4.505 0.000
vAnomaly 0.441 0.068 6.490 0.000
gAnomaly 0.309 0.060 5.174 0.000
vAntinomy 0.437 0.056 7.874 0.000
gAntinomy 0.197 0.058 3.389 0.001
gAntithesis with
vAnalogy 0.166 0.070 2.370 0.018
gAnalogy 0.446 0.059 7.502 0.000
vAnomaly 0.207 0.068 3.024 0.002
gAnomaly 0.505 0.056 8.967 0.000
vAntinomy 0.192 0.060 3.183 0.001
gAntinomy 0.250 0.056 4.444 0.000
vAntithesis 0.290 0.057 5.101 0.000
60
Table E4 Factor loadings and latent variable correlation estimates for the combined Model G
Factor loading Standard error z-value p-value
vAnalogy by
vANALO1 0.366 0.058 6.288 0.000
vANALO2 0.470 0.065 7.288 0.000
vANALO3 0.342 0.059 5.767 0.000
vANALO4 0.602 0.056 10.794 0.000
vANALO5 0.602 0.061 9.900 0.000
vAnomaly by
vANOM1 0.378 0.059 6.357 0.000
vANOM2 0.527 0.059 8.946 0.000
vANOM3 0.603 0.060 9.984 0.000
vANOM4 0.306 0.061 4.981 0.000
vANOM5 0.547 0.067 8.145 0.000
vAntinomy by
vANTIN1 0.489 0.059 8.305 0.000
vANTIN2 0.467 0.054 8.642 0.000
vANTIN3 0.518 0.055 9.345 0.000
vANTIN4 0.648 0.051 12.809 0.000
vANTIN5 0.778 0.051 15.129 0.000
vAntithesis by
vANTITH1 0.638 0.049 12.975 0.000
vANTITH2 0.510 0.053 9.683 0.000
vANTITH3 0.710 0.045 15.611 0.000
vANTITH4 0.754 0.046 16.530 0.000
vANTITH5 0.644 0.057 11.307 0.000
gAnalogy by
gANALO1 0.393 0.055 7.095 0.000
gANALO2 0.622 0.051 12.185 0.000
gANALO3 0.417 0.054 7.680 0.000
gANALO4 0.704 0.051 13.907 0.000
gANALO5 0.515 0.053 9.664 0.000
gAnomaly by
gANOM1 0.671 0.047 14.414 0.000
gANOM2 0.408 0.055 7.490 0.000
gANOM3 0.444 0.054 8.204 0.000
gANOM4 0.689 0.045 15.189 0.000
gANOM5 0.456 0.054 8.395 0.000
gAntinomy by
gANTIN1 0.702 0.049 14.446 0.000
gANTIN2 0.723 0.049 14.799 0.000
gANTIN3 0.686 0.047 14.669 0.000
gANTIN4 0.596 0.052 11.553 0.000
gANTIN5 0.577 0.050 11.527 0.000
gAntithesis by
gANTITH1 0.555 0.058 9.509 0.000
gANTITH2 0.574 0.051 11.177 0.000
gANTITH3 0.654 0.053 12.304 0.000
gANTITH4 0.498 0.055 9.080 0.000
gANTITH5 0.750 0.051 14.813 0.000
Higher-order Factor Loadings Standard error z-value p-value
Analogy by
vANALOGY 0.667 0.070 9.492 0.000
gANALOGY 0.643 0.060 10.752 0.000
Anomaly by
vANOMALY 0.574 0.074 7.796 0.000
gANOMALY 0.584 0.066 8.842 0.000
Antinomy by
vANTINOMY 0.654 0.068 9.604 0.000
gANTINOMY 0.449 0.059 7.603 0.000
Antithesis by
vANTITHESIS 0.598 0.069 8.707 0.000
gANTITHESIS 0.486 0.058 8.305 0.000
Latent variable correlations Standard error z-value p-value
Anomaly with
Analogy 1.625 0.197 8.241 0.000
Antinomy with
Analogy 1.143 0.137 8.318 0.000
Anomaly 1.222 0.166 7.342 0.000
Antithesis with
Analogy 1.108 0.144 7.713 0.000
Anomaly 1.177 0.175 6.732 0.000
Antinomy 0.912 0.140 6.538 0.000
62
Table E5 Factor loadings and latent variable correlation estimates for the combined Model H
Factor loading Standard error z-value p-value
vAnalogy by
vANALO1 0.373 0.057 6.591 0.000
vANALO2 0.465 0.063 7.430 0.000
vANALO3 0.342 0.058 5.918 0.000
vANALO4 0.606 0.053 11.338 0.000
vANALO5 0.595 0.059 10.103 0.000
vAnomaly by
vANOM1 0.382 0.059 6.502 0.000
vANOM2 0.525 0.058 9.046 0.000
vANOM3 0.610 0.060 10.210 0.000
vANOM4 0.307 0.061 5.077 0.000
vANOM5 0.537 0.066 8.085 0.000
vAntinomy by
vANTIN1 0.491 0.058 8.509 0.000
vANTIN2 0.467 0.053 8.885 0.000
vANTIN3 0.511 0.055 9.362 0.000
vANTIN4 0.654 0.049 13.305 0.000
vANTIN5 0.776 0.050 15.593 0.000
vAntithesis by
vANTITH1 0.637 0.048 13.126 0.000
vANTITH2 0.515 0.052 9.894 0.000
vANTITH3 0.718 0.044 16.184 0.000
vANTITH4 0.744 0.045 16.644 0.000
vANTITH5 0.645 0.056 11.592 0.000
gAnalogy by
gANALO1 0.399 0.055 7.293 0.000
gANALO2 0.617 0.050 12.335 0.000
gANALO3 0.411 0.054 7.628 0.000
gANALO4 0.708 0.050 14.198 0.000
gANALO5 0.516 0.053 9.813 0.000
gAnomaly by
gANOM1 0.678 0.045 15.000 0.000
gANOM2 0.413 0.054 7.676 0.000
gANOM3 0.441 0.053 8.276 0.000
gANOM4 0.683 0.044 15.477 0.000
gANOM5 0.453 0.053 8.491 0.000
gAntinomy by
gANTIN1 0.696 0.049 14.290 0.000
gANTIN2 0.731 0.048 15.320 0.000
gANTIN3 0.701 0.046 15.132 0.000
gANTIN4 0.582 0.051 11.395 0.000
gANTIN5 0.570 0.050 11.407 0.000
gAntithesis by
gANTITH1 0.555 0.058 9.602 0.000
gANTITH2 0.572 0.051 11.287 0.000
gANTITH3 0.664 0.052 12.855 0.000
gANTITH4 0.496 0.054 9.195 0.000
gANTITH5 0.744 0.050 14.953 0.000
Higher-order Factor Loadings Standard error z-value p-value
Verbal by
vANALOGY 0.888 0.060 14.764 0.000
vANOMALY 0.776 0.060 12.936 0.000
vANTINOMY 0.724 0.047 15.296 0.000
vANTITHESIS 0.644 0.050 12.815 0.000
Non-verbal by
gANALOGY 0.860 0.051 16.883 0.000
gANOMALY 0.848 0.048 17.670 0.000
gANTINOMY 0.482 0.050 9.546 0.000
gANTITHESIS 0.532 0.049 10.759 0.000
Latent variable correlations Standard error z-value p-value
Non-verbal with
Verbal 0.600 0.054 11.176 0.000