HR AVATAR ASSESSMENT SOLUTION TECHNICAL MANUAL 2019 1
HR Avatar Assessment Solution
Technical Manual
Updated: December 2, 2019
www.hravatar.com 41101 Haybine Lane, Aldie VA 20105
703-688-3981
Contents
Overview
Testing Standards
Solution Summary
Cognitive Work Simulations
  Previous Research
  Content Development
    Attention to Detail
    Analytical Thinking
  Reliability
  Validity
  Fairness
Attitudes, Interests and Motivations Assessment
  Previous Research
  Content Development
    Needs Structure
    Innovative and Creative
    Enjoys Problem-Solving
    Competitive
    Seeks Perfection
    Develops Relationships
    Expressive and Outgoing
    Corporate Citizenship
    Exhibits a Positive Work Attitude
    Adaptable
  Reliability
  Validity
  Fairness
Emotional Intelligence Module
  Previous Research
  Content Development
    Self-Control
    Self-Awareness
    Empathy
  Reliability
  Fairness
Workplace Competency Assessment
  Previous Research
  Content Development
    Coaching and Developing Others
    Exercising Political Savvy
    Guiding, Directing, and Motivating Others
    Resolving Conflicts and Meeting Customer Needs
    Team Building
Behavioral History Survey
  Previous Research
  Content Development
    Performance
    Tenure
  Reliability
  Fairness
Knowledge and Skills Tests
  Development Overview
Sales Situation Analysis
  Test Development
  Reliability
  Fairness
Typing Speed and Accuracy
  Test Development
  Reliability
  Fairness
Data Entry
  Test Development
  Reliability
  Fairness
Essay Test
  Test Development
  Descriptives
  Fairness
Solution Scoring
  Construct Validity Evidence
Technical Requirements
Future Research
  Reliability
  Validity
  Norms
References
Appendix A: Summary of the HR Avatar Solutions
Appendix B: Historical Validity Evidence for the Cognitive Scales
Appendix C: Historical Validity Evidence for AIMS
Appendix D: Scoring Rubric for Essays
Appendix E: Directions for Rating Essay Tests
Appendix F: Validity Evidence for HR Avatar Tests
Appendix G: Validity Evidence for HR Avatar Tests
Tables
Table 3. Descriptive Statistics and Reliability Evidence for the Cognitive Workplace Simulation Scales
Table 4. Evaluation of Cognitive Score Differences by Gender
Table 5. Evaluation of Cognitive Score Differences by Age Group
Table 6. Descriptive Statistics and Reliability Evidence for the AIMS Scales
Table 7. Evaluation of AIMs Score Differences by Gender
Table 8. Evaluation of AIMs Score Differences by Ethnicity
Table 9. Evaluation of AIMs Score Differences by Age Group
Table 10. Evaluation of AIMs Score Differences by Race Groups: Asian and White
Table 11. Evaluation of AIMs Score Differences by Race Groups: Black or African American and White
Table 12. Descriptive Statistics and Reliability Evidence for the Emotional Intelligence Module
Table 7. Evaluation of AIMs Score Differences by Gender
Table 8. Evaluation of AIMs Score Differences by Ethnicity
Table 9. Evaluation of AIMs Score Differences by Age Group
Table 11. Evaluation of AIMs Score Differences by Race Groups: Black or African American and White
Table 12. Descriptive Statistics and Reliability Evidence for the Professional Behavioral History Survey
Table 13. Descriptive Statistics and Reliability Evidence for the Entry-Level Behavioral History Survey
Table 14. Evaluation of Biographical History Survey Score Differences by Gender
Table 15. Evaluation of Biographical History Survey Score Differences by Ethnicity
Table 16. Evaluation of Biographical History Survey Score Differences by Age Group
Table 17. Evaluation of Biographical History Survey Score Differences by Race Groups: Asian and White
Table 21. Evaluation of Biographical History Survey Score Differences by Race Groups: Black and White
Table 18. Descriptive Statistics and Reliability Evidence for the Sales Situation Analysis Assessment
Table 19. Evaluation of Sales Situation Analysis Score Differences
Table 20. Descriptive Statistics and Reliability Estimates for the Typing Test
Table 21. Evaluation of Business Typing Differences by Subgroup
Table 21. Evaluation of Academic Typing Differences by Subgroup
Table 20. Descriptive Statistics and Reliability Estimates for the Data Entry Tests
Table 21. Evaluation of 10-key Score Differences by Subgroup
Table 21. Evaluation of Alphanumeric Score Differences by Subgroup
Table 21. Evaluation of Oral Alphanumeric Score Differences by Subgroup
Table 22. Descriptive Statistics for Essay Scores
Table 23. Evaluation of Essay Score Differences by Subgroup – Living in a Big City Prompt
Table 23. Evaluation of Essay Score Differences by Subgroup – Working from Home Prompt
Table 24. Correlations between the AIMs Scales
Table 25. Correlations between the Professional Behavioral History Scales
Table 26. Correlations between the Entry-Level Behavioral History Scales
Table 27. Correlations between HR Avatar AIMs and Behavioral History Scales
Table 28. Correlations between the Two Cognitive Workplace Simulation Scales: Attention to Detail and Analytical Thinking
Table 29. Correlations between the AIMs and the Cognitive Workplace Simulation Scales
Table 30. Correlations between the Behavioral History Survey and the Cognitive Workplace Simulation Scales
Table 31. Compiled Validity Evidence for the Original Content of the Cognitive Work Simulation Scales
Table 32. Study 1 Results: Concurrent Validation Study for Insurance Consultants N=122-136
Table 33. Study 2 Results: Concurrent Validation Study for Inside Sales N=105
Table 34. Study 3 Results: Concurrent Validation Study for an Internet Services Order Processing N=84
Table 35. Study 4 Results: Concurrent Validation Study for an Internet and Cable Sales and Service Position N=72-93
Table 36. Study 5 Results: Concurrent Validation Study Auto Rental Sales Role N=80-92
Table 37. Study 6: Concurrent Validation Study for Paramedics N=85
Table 38. Summary of Relationships between Performance Measures and Original AIMS Scales
Figures
Figure 1. Summary of HR Avatar Solution
Figure 2. Cognitive Work Simulation
Figure 3. Hypothesized Relationships between the AIMs Scales and the Big Five
Overview
The HR Avatar Employment Assessment series was designed as a flexible assessment instrument for
helping corporations obtain quality hires by measuring cognitive abilities, biographical data, personality
characteristics, and job knowledge related to performance and tenure in the workplace. To meet differing
client needs, HR Avatar offers both individual assessments and complete assessment
solutions. By combining assessments and measuring multiple competencies, the assessment solutions allow
for a broader evaluation of overall candidate fit with a particular job.
This technical manual contains a summary of testing standards for the development of psychological
assessments in the workplace and an overall summary of the development of the assessment solution. There
is a section for each assessment within the solution covering previous research, content development,
reliability evidence, validity evidence, and evidence of fairness. The technical manual concludes with a
summary of HR Avatar’s research agenda.
Testing Standards
The Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014), the Principles for the Validation and Use of Personnel Selection Procedures (Society for Industrial and Organizational Psychology, 2003), the Uniform Guidelines on Employee Selection Procedures (Equal Employment Opportunity Commission, Civil Service Commission, Department of Labor, & Department of Justice, 1978), and Testing and Assessment: An Employer’s Guide to Good Practices (Saad, Carter, Rothenberg, & Israelson, 2000) are the documents that “govern” the development and use of tests in employment settings. The developers of the HR Avatar Employment Assessment series have made, and continue to make, efforts to adhere to the recommendations for test development put forth by these documents. The evidence presented in this technical manual summarizes all of the theoretical and empirical research conducted on the tool to date. We anticipate frequent updates as we complete additional research. The following list summarizes the key test development standards from the documents listed above. The documents include additional standards governing test usage, and we encourage test users to review and adhere to these standards.
• Define the psychological construct measured by the test
• Document the intended use of the test
• Document how test scores should be interpreted
• Demonstrate validity for the specific use of the assessment and interpretation of scores including:
o Previous research on the construct
o Previous empirical research that demonstrates the extent to which the validity evidence may
be generalizable (e.g. meta-analytic studies)
o Research supporting the link between the content of the assessment and the tasks,
knowledge, skills, and abilities required for the job (i.e., content-related validity evidence)
o Research showing that a specific response process takes place, if one is inferred in the use or
interpretation of the test
o Evidence that the internal structure of the assessment conforms to theoretical expectations
o Relationships between scores and any outcomes the assessment claims to be able to predict,
such as job performance (i.e., criterion-related validity evidence)
o Evidence demonstrating that the test is statistically related to other measures in ways
that are consistent with theoretical expectations (i.e., convergent and divergent validity
evidence)
• Document sufficient details related to validity research in order to allow the reader an opportunity to
independently evaluate the quality of research
• Document reliability estimates and the Standard Error of Measurement (SEM) for each score,
subscore, and combination of scores
• Use a type of reliability estimate (e.g., internal consistency, test-retest) that is appropriate for the test
• Document sufficient details related to reliability research in order to allow the reader an opportunity
to independently evaluate the quality of research
• Design the test in such a way that it allows all test takers an equal opportunity to demonstrate their
standing on the construct regardless of subgroup membership (e.g., age, disability status, ethnicity,
gender, or race)
• Provide empirical evidence demonstrating that the construct is measured the same way across
subgroups
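As an illustration of the reliability standard above, the Standard Error of Measurement can be derived from a scale's standard deviation and its reliability estimate using the classical test theory formula SEM = SD × sqrt(1 − reliability). The sketch below is a generic example and does not reproduce HR Avatar's own calculations; the function names and the numeric values are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(score: float, sd: float, reliability: float, z: float = 1.96):
    """Approximate band around an observed score (z = 1.96 for roughly 95%)."""
    sem = standard_error_of_measurement(sd, reliability)
    return (score - z * sem, score + z * sem)


# Hypothetical scale: SD of 10 points and a reliability estimate of .84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
low, high = score_band(score=50.0, sd=10.0, reliability=0.84)
print(f"SEM = {sem:.2f}; band = {low:.1f} to {high:.1f}")
```

A larger SEM means that individual scores should be interpreted with a wider band, which is why the standards ask for it to be reported alongside each reliability estimate.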
Solution Summary
The HR Avatar Employment Assessment series has been designed for use in identifying or screening out
those job applicants with the lowest potential for success in a given role by assessing job-relevant cognitive
abilities, personality characteristics, behavioral background, and knowledge and skills. There are currently
254 HR Avatar solutions (see Appendix A) in the HR Avatar Employment Assessment series. Each solution
has been designed for specific job roles and contains several components: one of 30 different Cognitive
Work Simulations, one of two different Attitudes, Interests, and Motivations assessment forms (AIMs), an
emotional intelligence module, one of two different Behavioral History Surveys, and job-specific Knowledge
and Skills assessments. The 31 Cognitive Work Simulations represent 31 distinct work contexts and are
discussed in more detail below. The AIMs assessment and the Behavioral History Survey are both available
in two forms. One form is designed for professional positions and the other is designed for entry-level
positions.
Figure 1. Summary of HR Avatar Solution
The assessments are delivered over the internet on a computer or mobile device. Applicants are sent an
email with a link to the assessment. Once the applicant clicks on the link in the email, they are brought
to a page in their web browser. The assessment begins with an animated “host” who guides the
applicant through the process and provides instructions prior to the start of each assessment. The animated
host is used to provide the applicant with a more engaging and positive assessment experience. Most
solutions take about 40 minutes to complete.
Prior to implementing the solution for selecting employees, HR Avatar recommends completing a job
analysis to ensure that the competencies measured by the HR Avatar assessments are important for success
in your organization. Additionally, HR Avatar recommends that the organization conduct a pilot study to
examine the relationships between the assessments and relevant job performance criteria and to evaluate the
potential for adverse impact. This pilot study should be done prior to using the assessment for screening out
employees.
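One common screen for adverse impact in a pilot study is the four-fifths rule described in the Uniform Guidelines: a group's selection rate that is less than 80% of the rate for the group with the highest selection rate is generally treated as evidence of adverse impact. The sketch below shows the arithmetic only; the function names and applicant counts are hypothetical, and it is not a description of HR Avatar's analysis procedure.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Proportion of applicants in a group who passed the screen."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def adverse_impact_ratio(focal_rate: float, reference_rate: float) -> float:
    """Ratio of the focal group's selection rate to the reference group's rate."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return focal_rate / reference_rate


# Hypothetical pilot data: 12 of 40 focal-group applicants pass the cut score,
# compared with 30 of 60 applicants in the highest-scoring (reference) group.
focal = selection_rate(selected=12, applicants=40)
reference = selection_rate(selected=30, applicants=60)
ratio = adverse_impact_ratio(focal, reference)
flagged = ratio < 0.8  # four-fifths threshold

print(f"impact ratio = {ratio:.2f}, flagged = {flagged}")
```

With small pilot samples the ratio can be unstable, so it is usually read alongside a statistical test of the difference in selection rates.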
Cognitive Work Simulations
Previous Research
Individuals with higher cognitive abilities are better equipped to solve complex problems and learn
new skills. Cognitive ability is necessary for successful performance on a broad range of tasks including
written documentation, oral communication, identifying and managing details, solving problems, reading
and responding to email messages, quickly learning and applying new information, quantitative
computations and analyses, and making well-reasoned decisions.
The published research evidence consistently demonstrates a strong, predictive relationship between
cognitive ability, or general mental ability, and job performance and training success. In fact, several
meta-analyses have been completed, summarizing and combining the results of many research studies examining
the validity of cognitive ability (Hunter & Hunter, 1984; Schmidt & Hunter, 1998), and the results generalize
across geographical regions (Bertua, Anderson, & Salgado, 2005; Salgado & Anderson, 2003; Salgado,
Anderson, Moscoso, Bertua, & De Fruyt, 2003a; Salgado et al., 2003b). The operational validity estimates
from these meta-analytic studies tend to be in the .5 to .6 range. Though cognitive ability predicts performance
at all job levels, the relationship is moderated by complexity such that the relationship is stronger for more
complex jobs (Bertua et al., 2005; Salgado et al., 2003b).
Figure 1 components (text recovered from the figure):
Cognitive Work Simulation
• Measures 1-3 specific cognitive abilities
• 1 of 30 job family-specific versions
Attitudes, Interests, Motivations
• 10 personality scales
• 2 versions (professional and entry-level)
Emotional Intelligence
• 3 emotional intelligence scales
Behavioral History Survey
• 2 biographical data scales
• 2 versions (professional and entry-level)
Knowledge and Skills Tests
• 1 or more of 55 job-specific knowledge and skills tests
• Sales Situation Analysis, Customer Service, etc.
• Typing Speed/Accuracy, Data Entry, Essay Writing, etc.
Currently, the most commonly accepted taxonomy of cognitive abilities is the Cattell-Horn-Carroll (CHC)
model (Cattell, 1941; Horn, 1985; Carroll, 1993). This hierarchical taxonomy emerged from the merging
of two independent lines of research examining the taxonomy and structure of cognitive abilities (McGrew,
2008). The taxonomy posits general cognitive ability at Stratum III, broad abilities at Stratum II, and more
specific (narrow) abilities at Stratum I. For the prediction of job performance, we are most interested in measuring
abilities at the second stratum as these are deemed to be broad enough to apply to multiple positions, but
specific enough to yield incremental prediction in job performance. Furthermore, recent research has
demonstrated that using specific abilities in selection decisions can reduce adverse impact without
decreasing the validity of the selection process (Kehoe, 2002; Wee, Newman, & Joseph, 2014).
The broad abilities listed in Stratum II consist of: fluid reasoning, comprehension-knowledge, short-term
memory, visual processing, auditory processing, long-term storage and retrieval, cognitive processing speed,
decision and reaction speed, reading and writing, quantitative knowledge, general (domain-specific)
knowledge, tactile abilities, kinesthetic abilities, olfactory abilities, psychomotor abilities, and psychomotor
speed (McGrew, 2008). Based on our understanding of these abilities, we believe that fluid reasoning,
short-term memory, reading and writing, quantitative knowledge, and general (domain-specific) knowledge would
be the most generalizable across jobs. However, general (domain-specific) knowledge was determined to
be better measured by the knowledge and skills assessments. HR Avatar designed two scales for each job
family: Attention to Detail and Analytical Thinking. The development of these scales is discussed further in
the Content Development section, but, to summarize, Attention to Detail taps short-term memory, whereas
Analytical Thinking is more closely related to fluid reasoning and quantitative knowledge. Both scales
require that applicants read information; therefore, to some extent, the reading component of reading and
writing is also measured by both scales.
Current meta-analytic research has examined the relationship of these specific abilities with overall job
performance and training performance. In a European sample, Salgado et al. (2003b) found that perceptual
ability and memory (similar to the competency measured by the Attention to Detail scale) had corrected
mean validity coefficients of .52 and .56 with job performance and .25 and .34 with training performance,
respectively. Additionally, they found that verbal ability and numerical ability (similar to the competency
measured by the Analytical Thinking scale) had corrected mean validity coefficients of .35 and .52 with job
performance and .44 and .48 with training performance, respectively. In a UK sample, Bertua et al. (2005)
found that perceptual abilities had a corrected mean validity coefficient of .50 with job performance and .50
with training performance. Verbal ability and numerical ability had corrected mean validity coefficients
of .39 and .42 with job performance and .49 and .54 with training performance, respectively.
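The "corrected" coefficients reported in meta-analyses of this kind are observed correlations adjusted for statistical artifacts, typically unreliability in the performance criterion and range restriction in the predictor. The sketch below shows the two textbook corrections in general form. It is an illustration only: the exact procedures used in the cited studies are not described here, and the function names and input values are hypothetical.

```python
import math


def correct_for_criterion_unreliability(r_observed: float,
                                        criterion_reliability: float) -> float:
    """Disattenuate an observed validity for unreliability in the criterion."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return r_observed / math.sqrt(criterion_reliability)


def correct_for_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Thorndike Case II correction for direct range restriction.

    sd_ratio is the unrestricted predictor SD divided by the restricted SD.
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    u = sd_ratio
    return (r_restricted * u) / math.sqrt(1.0 + r_restricted ** 2 * (u ** 2 - 1.0))


# Hypothetical inputs: observed r = .30, criterion reliability = .36,
# applicant-pool SD is 1.5 times the SD among those hired.
r_reliability_corrected = correct_for_criterion_unreliability(0.30, 0.36)
r_range_corrected = correct_for_range_restriction(0.30, 1.5)
print(round(r_reliability_corrected, 2), round(r_range_corrected, 2))
```

Because corrected coefficients estimate the relationship under ideal measurement conditions, they will generally be larger than the uncorrected correlations an organization observes in a local validation study.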
Content Development
The Cognitive Work Simulations measure two cognitive competencies: Attention to Detail and Analytical
Thinking. Attention to Detail most closely represents short-term memory from the CHC model (see above), and
Analytical Thinking is a combination of fluid reasoning and quantitative reasoning.
Attention to Detail
The competency is defined by the ability to process information, recall information accurately, and identify
appropriate resources for locating specific pieces of information. This competency is important for data
entry, identifying errors in data, processing orders, working with tables, understanding policies and
procedures, and working with numerical data such as the type found in financial reports.
Analytical Thinking
The competency is defined by the ability to understand language and numerical relationships. This
competency is characterized by the ability to process complex information, synthesize data, identify and
solve problems, and make well-reasoned decisions.
The Cognitive Work Simulation requires that candidates complete various job-related tasks in a virtual
environment designed to replicate the workplace. There are 31 versions of the Cognitive Work Simulation,
and each is designed to represent scenarios common to a specific job family (listed in Table 3). Candidates
are evaluated on how they respond to avatars representing customers and colleagues and how well they
solve problems in various work-related scenarios. The realistic experience is designed to be more engaging
for applicants and to provide a better measure of ability. During a simulation, the candidate might be
required to read email, listen to voicemail, and perform basic keyboard and screen navigation tasks to solve
typical business problems. Note that not all 31 versions were analyzed in this report because adequate
response data were not available for some versions.
Figure 2. Cognitive Work Simulation
The items used in the Cognitive Work Simulation were based on an existing library of ability items with a
history of predicting job performance in various workplace settings. Both the original items and the new
content were developed by examining the results of multiple job analyses. There are between 5 and 25 items
in each competency measured by the simulation. Responses to assessment items are scored dichotomously.
In some items, there is more than one correct response. In these situations, an applicant is awarded a point
for each correct response selected. It takes approximately 18-25 minutes to complete a simulation.
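The scoring rule described above can be sketched as follows. This is a minimal illustration, not the operational scoring engine: the item identifiers, the answer key, and the function names are hypothetical, and the sketch assumes that incorrect selections are simply ignored, since the manual does not say whether they are penalized.

```python
def score_item(selected, correct):
    """Return the points earned on one item.

    One point is awarded for each selected option that is keyed as correct,
    so a single-answer item yields 0 or 1 and a multi-answer item can yield more.
    Assumption: incorrect selections carry no penalty.
    """
    return len(set(selected) & set(correct))


def score_scale(responses, answer_key):
    """Sum item points for one candidate across every item in the key.

    Items the candidate did not answer contribute zero points.
    """
    return sum(
        score_item(responses.get(item_id, ()), correct)
        for item_id, correct in answer_key.items()
    )


# Hypothetical three-item key; item "q3" has two correct responses.
answer_key = {"q1": {"B"}, "q2": {"D"}, "q3": {"A", "C"}}
responses = {"q1": {"B"}, "q2": {"A"}, "q3": {"A", "C"}}
total = score_scale(responses, answer_key)
print(total)
```

Here the candidate earns one point on q1, none on q2, and two on q3.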
All Cognitive Simulations
Basic Entry-Level
Face-to-Face Customer Service
Administration
Business Sales
Business/Finance
Entry-Level Administration
Entry-Level Business & Finance
Entry-Level Office
First-Line Supervisor
General Office Workplace
Information Technology
Manager
Customer Service (with Email)
Technician
Retail Sales (Hardware Store)
Teller
Teller with Sales
Customer Service (With Email And Calls)
Collections
Driver
Warehouse
Medical Assistant
Construction*
Flight Attendant*
Fast Food Worker*
Restaurant Worker*
Real Estate Agent*
Hospitality Worker*
Retail Sales (Electronics Store)*
Retail Sales (Fashion Store)*
Retail Sales (Sunglasses Store)*
Train Station Ambassador*
Train Station Operations Manager*
Inside Sales (Call Center Sales)*
* Indicates that this module was not analyzed in this report because insufficient data were available.
Reliability
To evaluate the properties of the cognitive assessments, as well as other assessments in the solution, HR
Avatar conducted a research study. Participants in the study were live applicants to the job families listed
above under All Cognitive Simulations. Data were collected from March 2016 to August 2018 on the current
versions of each of the assessments. All product testing data, product demonstrations, and retests (the first
test was kept) were excluded from analysis. Additionally, only participants with complete data for each
competency were included in the reliability analyses. This analytic strategy is intended to ensure that the
estimates reported here match, as closely as possible, the reliability that organizations can expect in
operational use. Only simulations with sufficient applicant data to calculate reliability estimates are
included in the table below.
Table 1 contains the descriptive statistics for each scale and the alpha estimates of internal consistency.
Please note that, based on these findings, further analyses were conducted at the item and distractor level.
These analyses led the developers to make substantial edits to the items, and we anticipate that these edits
will substantially improve the reliability of the tools. Further research evaluating the effectiveness of these
changes is forthcoming.
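For readers who wish to reproduce an alpha estimate of internal consistency from their own item-level data, a minimal sketch follows. It uses the standard coefficient alpha formula with sample variances. The response matrix is made up for illustration and is not drawn from the study data, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a candidates-by-items score matrix.

    item_scores is a list of rows, one row per candidate and one column per item.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    item_variances = [
        variance([row[j] for row in item_scores]) for j in range(k)
    ]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up dichotomous responses: five candidates, four items.
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))  # about 0.80 for this made-up data
```

The function requires at least two items and at least two candidates, and it does not guard against a total-score variance of zero.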
Table 1. Descriptive Statistics and Reliability Evidence for the Cognitive Workplace Simulation Scales
Cognitive Simulation                     Analytical Thinking             | Attention to Detail
                                         Items     N      M    SD  Alpha | Items     N      M    SD  Alpha
Basic Entry-Level                           19  1201  15.90  2.92   0.81 |     5  1201   3.78  1.40   0.68
Face-to-Face Customer Service                6   251   2.68  2.17   0.84 |     8   251   5.39  2.01   0.69
Administration                               5  1166   2.37  1.22   0.30 |    11  1163   7.26  2.99   0.75
Business Sales                               6  2006   3.01  1.68   0.58 |     8  2000   3.39  2.04   0.64
Business/Finance                             7  3932   4.07  1.59   0.37 |     6  3918   3.86  1.50   0.50
Entry-Level Administration                  10  2204   7.82  3.54   0.78 |    14  2210   1.79  0.46   0.89
Entry-Level Business & Finance               7   314   3.03  1.85   0.66 |     5   312   3.23  1.46   0.59
Entry-Level Office                          25  4991  17.20  4.72   0.81 |    13  4991   7.04  2.89   0.75
First-Line Supervisor                        9  4921   6.41  1.63   0.49 |     8  4921   5.21  2.04   0.63
General Office Workplace                     7   946   3.17  1.72   0.55 |     6   975   3.04  1.38   0.39
Information Technology                       7  1894   3.94  1.76   0.51 |     7  1844   4.28  1.77   0.59
Manager                                     10  2901   5.37  2.19   0.54 |     5  2901   3.23  1.36   0.49
Remote Customer Service                      6  3995   3.07  1.51   0.50 |    10  3999   5.94  2.68   0.76
Technician                                  25   263   17.5  4.37   0.79 |     9   264   4.11  2.98   0.84
Retail Sales                                10    85   2.49  2.39   0.76 |     7    85   3.97  1.86   0.69
Teller                                       7   611   1.94  0.94   0.58 |    17   611  11.39  1.72   0.50
Teller with Sales                            7   138   2.25  0.99   0.64 |    17   138  11.27  1.39   0.44
Customer Service (With Email And Calls)      6  3551   2.70  1.38   0.39 |    11  2264   6.41  2.58   0.69
Collections                                  8  2614   4.32  1.24   0.33 |    14  2607   8.84  1.85   0.32
Driver                                       7   145   2.50  1.62   0.51 |     5   148   2.90  1.28   0.52
Warehouse                                    8   138   5.21  2.02   0.73 |     6   138   4.66  1.47   0.68
Medical Assistant                            7   275   3.48  1.48   0.39 |    14   272  10.88  1.97   0.60
Please note that ALL cognitive scales listed above have been supplemented by additional items using one of
several Cognitive Supplement Modules, described below.
Validity
As mentioned above, the simulations were developed based on an existing library of content. The original
developer has provided the results of historical validation studies done with the original content and these
are provided in Appendix B.
Two validation studies provide evidence for the validity of the assessments. The first is a study with 130
managers and the second is a study with 64 managers. In the first study, the assessment predicted job
performance (r=.25; p<.01), measured with a simple “High” or “Marginal” categorization. In the second
study, the assessment predicted performance on a 100-point administrative performance appraisal, using a
Spearman correlation (r=.25; p<.05). We expect validity evidence to be more robust in future studies with
better criterion measures. A brief summary of the results is presented in Appendix G.
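The two kinds of validity coefficient mentioned above can be illustrated with a short sketch. It assumes that the correlation with the “High” or “Marginal” categorization was computed as a Pearson correlation against a 0/1 coding (a point-biserial correlation), which the manual does not state. The Spearman coefficient is computed as a Pearson correlation on average ranks. All data values and function names below are hypothetical and are not taken from the validation studies.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data for six managers.
assessment_scores = [12, 15, 9, 20, 17, 11]
rated_high = [0, 1, 0, 1, 1, 0]          # 1 = "High", 0 = "Marginal"
appraisal_points = [55, 70, 40, 90, 65, 60]

point_biserial = pearson_r(assessment_scores, rated_high)
rank_correlation = spearman_rho(assessment_scores, appraisal_points)
```

Neither function guards against a variable with zero variance, which would cause a division by zero.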
Fairness
In order to evaluate how different demographic subgroups perform on the assessments and to assess the
likelihood of adverse impact, analyses were conducted to compare mean scores of the subgroups. The
results are provided in Tables 2 through 6. Included in the tables are the descriptive statistics for each
subgroup and Cohen’s d. Cohen’s d is an effect size that expresses the difference between two group means
in standard deviation units. For reference, according to Cohen (1992), effect sizes of .20 are small, .50 are
medium, and .80 are large.
Future research will continue to examine subgroup differences as more data become available.
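The manual does not state the exact formula used to compute Cohen’s d or the weights used for the weighted averages. The sketch below assumes the common pooled-standard-deviation form, with the sign convention from the table notes (positive values mean the referent group scored higher), and assumes the averages are weighted by combined sample size. The function names are hypothetical. Under these assumptions, the Basic Entry-Level Analytical Thinking values in Table 2 (female n = 718, M = 16.01, SD = 2.69; male n = 264, M = 15.92, SD = 2.63) give approximately -0.03, which agrees with the tabled value.

```python
from math import sqrt


def cohens_d(n_focal, mean_focal, sd_focal, n_ref, mean_ref, sd_ref):
    """Cohen's d from summary statistics, using the pooled standard deviation.

    Positive values mean the referent group scored higher.
    """
    pooled_variance = (
        (n_focal - 1) * sd_focal ** 2 + (n_ref - 1) * sd_ref ** 2
    ) / (n_focal + n_ref - 2)
    return (mean_ref - mean_focal) / sqrt(pooled_variance)


def weighted_average_d(rows):
    """Average d across simulations, weighting each by its combined sample size.

    rows is an iterable of (d, total_n) pairs. Weighting by total n is an
    assumption; the manual does not specify the weights.
    """
    rows = list(rows)
    total_n = sum(n for _, n in rows)
    return sum(d * n for d, n in rows) / total_n


# Basic Entry-Level, Analytical Thinking, gender comparison (values from Table 2).
d_basic = cohens_d(718, 16.01, 2.69, 264, 15.92, 2.63)
print(round(d_basic, 2))  # -0.03
```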
For gender, the weighted-average Cohen’s d values are -0.003 for Analytical Thinking and 0.073 for
Attention to Detail. Negative Cohen’s d values indicate that females scored higher. Overall, these are very
small effect sizes, indicating that there are little-to-no differences between men and women on the cognitive
scales. In Table 2, the larger effect sizes tend to be associated with smaller sample sizes, suggesting
these results are likely due to sampling error. Based on these results, it is unlikely there will be significant
adverse impact against either men or women when the cognitive scales are used for selection decisions.
For age, the weighted-average Cohen’s d values are -0.023 for Analytical Thinking and -0.060 for Attention
to Detail. Negative Cohen’s d values indicate that applicants 40 and older scored higher. Overall, these are
very small effect sizes, indicating that there are little-to-no differences between age groups on the cognitive
scales. In Table 3, the larger effect sizes tend to be associated with smaller sample sizes, suggesting these
results are likely due to sampling error. Based on these results, it is unlikely there will be significant adverse
impact against either age group when the cognitive scales are used for selection decisions.
For ethnicity, the weighted-average Cohen’s d values are -0.031 for Analytical Thinking and 0.012 for
Attention to Detail. Negative Cohen’s d values indicate that Hispanic or Latino applicants scored higher.
Overall, these are very small effect sizes, indicating that there are little-to-no differences between ethnicities
on the cognitive scales. In Table 4, the larger effect sizes tend to be associated with smaller sample
sizes, suggesting these results are likely due to sampling error. Based on these results, it is unlikely there will
be significant adverse impact against either ethnic group when the cognitive scales are used for selection
decisions.
For Black-White differences, the weighted-average Cohen’s d values are 0.403 for Analytical Thinking
and 0.280 for Attention to Detail. Positive Cohen’s d values indicate that White applicants scored
higher. These are small to moderate effect sizes that can lead to adverse impact against Black applicants;
however, these are smaller than values typically seen in the research literature.
For Asian-White differences, the weighted-average Cohen’s d values are 0.312 for Analytical Thinking
and 0.574 for Attention to Detail. Positive Cohen’s d values indicate that White applicants scored
higher. These are moderate effect sizes that can lead to adverse impact against Asian applicants and are
slightly higher than is seen in the research literature. One plausible explanation for these differences is a
language barrier; future research will examine how to mitigate these differences.
Table 2. Evaluation of Cognitive Score Differences by Gender
Cognitive Simulation Scale Female Male
n M SD n M SD d
Basic Entry-Level Analytical Thinking 718 16.01 2.69 264 15.92 2.63 -0.03
Attention to Detail 718 3.64 1.43 264 3.85 1.36 0.15
Face-to-Face Customer Service
Analytical Thinking 41 2.34 2.21 184 2.76 2.18 0.19
Attention to Detail 41 4.98 2.26 184 5.47 2.00 0.22
Administration Analytical Thinking 184 2.46 1.13 749 2.41 1.24 -0.05
Attention to Detail 184 7.08 2.73 749 7.62 2.98 0.19
Business Sales Analytical Thinking 965 3.05 1.67 746 3.00 1.71 -0.03
Attention to Detail 965 3.31 2.00 746 3.58 2.08 0.13
Business/Finance Analytical Thinking 1333 4.25 1.60 2111 4.00 1.54 -0.16
Attention to Detail 1333 3.92 1.51 2111 3.85 1.46 -0.05
Entry-Level Administration
Analytical Thinking 564 7.26 3.62 1449 8.13 3.48 0.24
Attention to Detail 564 1.74 0.49 1449 1.82 0.44 0.19
Entry-Level Business/Finance
Analytical Thinking 166 3.48 1.76 110 2.54 1.88 -0.51
Attention to Detail 166 3.59 1.35 110 2.88 1.52 -0.48
Entry-Level Office Analytical Thinking 381 18.89 4.78 4246 17.12 4.7 0.37
Attention to Detail 381 7.6 2.83 4246 7.02 2.9 0.2
First-Line Supervisor Analytical Thinking 1907 6.45 1.67 2341 6.41 1.58 -0.02
Attention to Detail 1907 5.25 2.11 2341 5.24 1.99 0.00
General Office Workplace
Analytical Thinking 242 3.56 1.79 556 2.96 1.64 -0.34
Attention to Detail 242 3.29 1.37 556 2.95 1.38 -0.25
Information Technology
Analytical Thinking 1290 4.04 1.76 355 3.55 1.72 -0.28
Attention to Detail 1290 4.36 1.77 355 4.18 1.71 -0.10
Manager Analytical Thinking 1266 5.58 2.29 1114 5.18 2.12 -0.18
Attention to Detail 1266 3.20 1.36 1114 3.44 1.28 0.18
Customer Service (with Email)
Analytical Thinking 889 2.88 1.56 1314 2.98 1.42 0.07
Attention to Detail 889 5.53 2.78 1314 6.22 2.52 0.26
Technician Analytical Thinking 184 16.98 4.55 71 18.69 3.58 0.44
Attention to Detail 184 3.91 2.96 71 4.69 2.98 0.26
Retail Sales Analytical Thinking N/A
Attention to Detail N/A
Teller Analytical Thinking 144 2.03 0.92 423 1.92 0.94 -0.12
Attention to Detail 144 11.63 1.55 423 11.35 1.78 -0.18
Teller with Sales Analytical Thinking N/A
Attention to Detail N/A
Customer Service (With Email And Calls)
Analytical Thinking 845 2.68 1.47 2329 2.72 1.34 0.03
Attention to Detail 845 6.23 2.74 2329 6.53 2.53 0.09
Collections Analytical Thinking 419 4.80 1.17 2022 4.21 1.22 -0.49
Attention to Detail 419 8.97 1.88 2022 8.85 1.83 -0.07
Driver Analytical Thinking N/A
Attention to Detail N/A
Warehouse Analytical Thinking N/A
Attention to Detail N/A
Medical Assistant Analytical Thinking N/A
Attention to Detail N/A
Note. Positive d-values mean the referent group scored higher.
Table 3. Evaluation of Cognitive Score Differences by Age Group
Cognitive Simulation Scale 40 and Older Under 40
n M SD n M SD d
Basic Entry-Level Analytical Thinking 218 16.16 2.51 675 15.90 2.79 -0.10
Attention to Detail 218 3.86 1.38 675 3.49 1.41 -0.27
Face-to-Face Customer Service
Analytical Thinking 83 2.95 2.18 147 2.22 2.13 -0.34
Attention to Detail 83 5.54 2.07 147 5.12 1.95 -0.21
Administration Analytical Thinking 275 2.38 1.20 667 2.57 1.24 0.15
Attention to Detail 275 7.28 2.95 667 8.19 2.74 0.32
Business Sales Analytical Thinking 283 2.96 1.68 1436 3.32 1.70 0.21
Attention to Detail 283 3.30 2.00 1436 3.93 2.19 0.29
Business/Finance Analytical Thinking 335 4.18 1.55 3132 3.28 1.52 -0.58
Attention to Detail 335 3.93 1.46 3132 3.34 1.57 -0.37
Entry-Level Administration
Analytical Thinking 412 7.83 3.50 1581 8.41 3.50 0.16
Attention to Detail 412 1.77 0.47 1581 1.94 0.33 0.48
Entry-Level Business/Finance
Analytical Thinking 69 3.09 1.95 216 3.20 1.45 0.07
Attention to Detail 69 3.31 1.49 216 3.41 1.25 0.07
Entry-Level Office Analytical Thinking 2194 17.24 4.88 2368 17.28 4.57 0.01
Attention to Detail 2194 7.19 2.95 2368 6.98 2.80 -0.07
First-Line Supervisor Analytical Thinking 1122 6.45 1.60 3206 6.34 1.63 -0.07
Attention to Detail 1122 5.33 2.04 3206 4.94 1.99 -0.19
General Office Workplace
Analytical Thinking 52 3.10 1.69 787 3.21 1.64 0.07
Attention to Detail 52 3.03 1.39 787 2.92 1.40 -0.08
Information Technology
Analytical Thinking 476 3.86 1.78 1160 4.06 1.69 0.12
Attention to Detail 476 4.16 1.82 1160 4.59 1.55 0.26
Manager Analytical Thinking 1046 5.30 2.15 1266 5.56 2.27 0.12
Attention to Detail 1046 3.29 1.32 1266 3.34 1.33 0.04
Customer Service (with Email)
Analytical Thinking 534 3.02 1.47 1408 2.63 1.43 -0.28
Attention to Detail 534 6.39 2.47 1408 4.90 2.78 -0.54
Technician Analytical Thinking 72 17.61 4.33 177 17.44 4.56 -0.04
Attention to Detail 72 4.14 3.02 177 4.31 2.89 0.06
Retail Sales Analytical Thinking N/A
Attention to Detail N/A
Teller Analytical Thinking 73 1.95 0.95 504 2.01 0.88 0.07
Attention to Detail 73 11.56 1.65 504 10.63 1.94 -0.49
Teller with Sales Analytical Thinking 48 2.26 0.98 83 2.21 1.03 -0.04
Attention to Detail 48 11.43 1.24 83 10.96 1.62 -0.32
Customer Service (With Email And Calls)
Analytical Thinking 845 2.60 1.37 2396 2.95 1.38 0.25
Attention to Detail 845 6.44 2.61 2396 6.45 2.51 0.00
Collections Analytical Thinking 740 4.24 1.25 1680 4.55 1.16 0.26
Attention to Detail 740 8.86 1.85 1680 8.85 1.82 0.00
Driver Analytical Thinking N/A
Attention to Detail N/A
Warehouse Analytical Thinking N/A
Attention to Detail N/A
Medical Assistant Analytical Thinking 53 3.23 1.49 146 3.66 1.45 0.29
Attention to Detail 53 10.91 1.88 146 10.10 2.38 -0.35
Note. Positive d-values mean the referent group scored higher.
Table 4. Evaluation of Cognitive Score Differences by Ethnicity
Cognitive Simulation Scale Hispanic or Latino Not Hispanic or Latino
n M SD n M SD d
Basic Entry-Level Analytical Thinking 68 16.23 2.50 648 16.03 2.14 -0.09
Attention to Detail 68 3.85 1.33 648 3.37 1.67 -0.29
Face-to-Face Customer Service
Analytical Thinking 89 2.79 2.23 112 2.85 2.14 0.03
Attention to Detail 89 5.20 2.33 112 5.73 1.71 0.27
Administration Analytical Thinking 82 2.47 1.23 735 2.57 1.13 0.09
Attention to Detail 82 7.66 2.94 735 7.56 2.92 -0.03
Business Sales Analytical Thinking 126 3.08 1.69 1320 3.30 1.65 0.13
Attention to Detail 126 3.56 2.10 1320 3.43 1.84 -0.07
Business/Finance Analytical Thinking 214 4.17 1.56 2943 3.94 1.59 -0.14
Attention to Detail 214 3.94 1.44 2943 3.79 1.43 -0.10
Entry-Level Administration
Analytical Thinking 252 8.13 3.49 1421 8.56 3.16 0.13
Attention to Detail 252 1.81 0.44 1421 1.95 0.30 0.42
Entry-Level Business/Finance
Analytical Thinking N/A
Attention to Detail N/A
Entry-Level Office Analytical Thinking 699 17.62 4.75 3033 17.37 4.78 -0.05
Attention to Detail 699 7.28 2.92 3033 6.86 3.04 -0.14
First-Line Supervisor Analytical Thinking 354 6.49 1.60 3373 6.42 1.67 -0.04
Attention to Detail 354 5.33 2.01 3373 5.45 1.95 0.06
General Office Workplace
Analytical Thinking 68 3.21 1.69 620 3.09 1.80 -0.07
Attention to Detail 68 3.05 1.39 620 3.43 1.37 0.27
Information Technology
Analytical Thinking 130 4.01 1.74 1135 3.91 1.89 -0.05
Attention to Detail 130 4.48 1.70 1135 4.62 1.49 0.09
Manager Analytical Thinking 206 5.55 2.25 1615 5.04 2.24 -0.22
Attention to Detail 206 3.44 1.29 1615 3.43 1.31 -0.01
Customer Service (with Email)
Analytical Thinking 377 2.99 1.47 1134 2.98 1.38 0.00
Attention to Detail 377 6.22 2.56 1134 6.47 2.45 0.10
Technician Analytical Thinking 37 17.79 4.32 169 19.03 3.96 0.31
Attention to Detail 37 4.24 3.00 169 4.51 3.12 0.09
Retail Sales Analytical Thinking N/A
Attention to Detail N/A
Teller Analytical Thinking 189 2.03 0.96 328 1.90 0.91 -0.14
Attention to Detail 189 11.35 1.87 328 11.64 1.44 0.18
Teller with Sales Analytical Thinking N/A
Attention to Detail N/A
Customer Service (With Email And Calls)
Analytical Thinking 527 2.76 1.38 2186 2.77 1.35 0.01
Attention to Detail 527 6.53 2.57 2186 6.49 2.59 -0.01
Collections Analytical Thinking 256 4.34 1.23 1788 4.41 1.16 0.06
Attention to Detail 256 8.89 1.83 1788 8.89 1.80 0.00
Driver Analytical Thinking N/A
Attention to Detail N/A
Warehouse Analytical Thinking N/A
Attention to Detail N/A
Medical Assistant Analytical Thinking 94 3.69 1.40 105 3.02 1.53 -0.45
Attention to Detail 94 10.78 1.94 105 10.79 2.03 0.01
Note. Positive d-values mean the referent group scored higher.
Table 5. Evaluation of White-Black Cognitive Score Differences
Cognitive Simulation Scale Black White
n M SD n M SD d
Basic Entry-Level Analytical Thinking 442 15.45 2.97 319 16.73 2.02 0.49
Attention to Detail 442 3.64 1.43 319 3.86 1.32 0.16
Face-to-Face Customer Service
Analytical Thinking 53 2.19 2.13 78 3.23 2.13 0.49
Attention to Detail 53 5.17 2.00 78 5.90 1.79 0.39
Administration Analytical Thinking 69 2.51 1.08 352 2.79 1.23 0.23
Attention to Detail 69 8.01 2.20 352 9.08 2.23 0.48
Business Sales Analytical Thinking 48 3.06 1.59 237 4.13 1.44 0.72
Attention to Detail 48 4.42 2.03 237 5.24 2.00 0.41
Business/Finance Analytical Thinking 224 3.92 1.53 225 3.76 1.46 -0.10
Attention to Detail 224 3.73 1.62 225 3.93 1.49 0.13
Entry-Level Administration
Analytical Thinking 270 8.74 3.14 469 9.48 2.97 0.24
Attention to Detail 270 2.00 0.21 469 2.01 0.20 0.03
Entry-Level Business/Finance
Analytical Thinking 47 2.91 1.30 160 3.93 1.59 0.66
Attention to Detail 47 3.48 1.22 160 3.81 1.19 0.28
Entry-Level Office Analytical Thinking 2468 16.01 4.49 1815 18.80 4.49 0.62
Attention to Detail 2468 6.59 2.70 1815 7.70 2.97 0.40
First-Line Supervisor Analytical Thinking 41 5.75 1.77 85 6.47 1.56 0.44
Attention to Detail 41 5.58 2.15 85 6.40 2.12 0.38
General Office Workplace
Analytical Thinking N/A
Attention to Detail N/A
Information Technology
Analytical Thinking 308 4.31 1.68 397 4.44 1.78 0.08
Attention to Detail 308 4.56 1.54 397 5.08 1.40 0.35
Manager Analytical Thinking 561 5.65 1.98 623 6.11 2.41 0.21
Attention to Detail 561 3.18 1.27 623 3.91 1.12 0.61
Customer Service (with Email)
Analytical Thinking 69 2.57 1.47 257 3.20 1.33 0.47
Attention to Detail 69 6.61 2.45 257 6.93 2.18 0.14
Technician Analytical Thinking 32 17.69 3.65 70 19.19 3.57 0.42
Attention to Detail 32 4.31 2.47 70 5.33 3.05 0.35
Retail Sales Analytical Thinking N/A
Attention to Detail N/A
Teller Analytical Thinking 186 1.91 0.95 144 1.93 0.94 0.02
Attention to Detail 186 11.56 1.57 144 11.57 1.64 0.01
Teller with Sales Analytical Thinking N/A
Attention to Detail N/A
Customer Service (With Email And Calls)
Analytical Thinking 1114 2.64 1.29 1037 3.15 1.38 0.38
Attention to Detail 1114 6.62 2.55 1037 7.03 2.43 0.16
Collections Analytical Thinking 1460 4.17 1.24 729 4.63 1.20 0.37
Attention to Detail 1460 8.80 1.83 729 9.02 1.81 0.12
Driver Analytical Thinking N/A
Attention to Detail N/A
Warehouse Analytical Thinking N/A
Attention to Detail N/A
Medical Assistant Analytical Thinking 42 3.52 1.58 108 3.53 1.41 0.00
Attention to Detail 42 10.65 2.11 108 10.89 2.00 0.12
Note. Positive d-values mean the referent group scored higher.
Table 6. Evaluation of White-Asian Cognitive Score Differences
Cognitive Simulation Scale Asian White
n M SD n M SD d
Basic Entry-Level Analytical Thinking 76 15.70 2.47 319 16.73 2.02 0.49
Attention to Detail 76 2.80 1.56 319 3.86 1.32 0.77
Face-to-Face Customer Service
Analytical Thinking 32 1.97 2.02 78 3.23 2.13 0.60
Attention to Detail 32 4.00 2.24 78 5.90 1.79 0.98
Administration Analytical Thinking 447 2.15 1.13 352 2.79 1.23 0.54
Attention to Detail 447 6.38 2.87 352 9.08 2.23 1.04
Business Sales Analytical Thinking 1285 2.88 1.65 237 4.13 1.44 0.77
Attention to Detail 1285 3.13 1.87 237 5.24 2.00 1.12
Business/Finance Analytical Thinking 2848 4.18 1.57 225 3.76 1.46 -0.27
Attention to Detail 2848 3.93 1.45 225 3.93 1.49 0.00
Entry-Level Administration
Analytical Thinking 1070 7.07 3.56 469 9.48 2.97 0.71
Attention to Detail 1070 1.67 0.51 469 2.01 0.20 0.76
Entry-Level Business/Finance
Analytical Thinking 68 1.34 1.21 160 3.93 1.59 1.74
Attention to Detail 68 2.04 1.35 160 3.81 1.19 1.42
Entry-Level Office Analytical Thinking 107 16.77 5.28 1815 18.80 4.49 0.45
Attention to Detail 107 6.38 2.96 1815 7.70 2.97 0.44
First-Line Supervisor Analytical Thinking 4032 6.47 1.60 85 6.47 1.56 0.00
Attention to Detail 4032 5.24 2.01 85 6.40 2.12 0.57
General Office Workplace
Analytical Thinking 712 3.15 1.69 31 3.32 1.68 0.10
Attention to Detail 712 3.04 1.36 31 3.77 1.61 0.54
Information Technology
Analytical Thinking 659 3.52 1.65 397 4.44 1.78 0.54
Attention to Detail 659 3.94 1.81 397 5.08 1.40 0.68
Manager Analytical Thinking 864 4.86 2.12 623 6.11 2.41 0.56
Attention to Detail 864 3.19 1.37 623 3.91 1.12 0.56
Customer Service (with Email)
Analytical Thinking 1171 2.87 1.49 257 3.20 1.33 0.23
Attention to Detail 1171 5.82 2.65 257 6.93 2.18 0.43
Technician Analytical Thinking 112 16.39 4.76 70 19.19 3.57 0.64
Attention to Detail 112 3.50 2.89 70 5.33 3.05 0.62
Retail Sales Analytical Thinking N/A
Attention to Detail N/A
Teller Analytical Thinking 135 2.07 0.97 144 1.93 0.94 -0.14
Attention to Detail 135 10.96 2.08 144 11.57 1.64 0.32
Teller with Sales Analytical Thinking N/A
Attention to Detail N/A
Customer Service (With Email And Calls)
Analytical Thinking 627 2.12 1.24 1037 3.15 1.38 0.77
Attention to Detail 627 5.17 2.50 1037 7.03 2.43 0.76
Collections Analytical Thinking 57 4.46 1.18 729 4.63 1.20 0.14
Attention to Detail 57 7.67 2.15 729 9.02 1.81 0.74
Driver Analytical Thinking N/A
Attention to Detail N/A
Warehouse Analytical Thinking N/A
Attention to Detail N/A
Medical Assistant Analytical Thinking N/A
Attention to Detail N/A
Note. Positive d-values mean the referent group scored higher.
Supplemental Modules
In order to improve the reliability of the cognitive simulations, supplemental modules were introduced to
expand the item pool. These supplemental modules are administered alongside the original simulation for a
seamless candidate experience. In each supplemental module, a proportion of the items in the pool is
drawn at random for inclusion in the assessment. The effect is that candidates take a slightly longer
simulation and the scores from the test are more reliable.
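The random draw described above can be sketched as a simple sample without replacement. The pool size of 78 and the draw of 26 items follow the Basic Entry-Level row of Table 7. The item identifiers, the function name, and the use of a seed are hypothetical, and the operational selection logic may differ.

```python
import random


def draw_supplemental_items(pool, n_given, seed=None):
    """Draw n_given distinct items at random from the supplemental pool.

    A seed can be supplied to make the draw repeatable, for example when testing.
    """
    rng = random.Random(seed)
    return rng.sample(pool, n_given)


# Hypothetical item identifiers for a pool of 78 supplemental items.
pool = [f"item_{i:02d}" for i in range(1, 79)]
drawn = draw_supplemental_items(pool, 26, seed=1)
```

Because the draw is made without replacement, no candidate sees the same supplemental item twice within one administration.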
The table below contains information on the supplemental modules. For each cognitive scale, the size of the
supplemental item pool, the number of items given, and the reliability estimated using the Spearman-Brown
formula are reported. Note that due to small sample sizes, fairness analyses could not be computed.
These will be conducted when sufficient data become available.
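The Spearman-Brown formula projects the reliability of a lengthened test from the reliability of the original. The sketch below assumes that the lengthened scale consists of the original items plus the supplemental items given, which the manual does not state explicitly. Under that assumption, the Administration Analytical Thinking scale (5 items, alpha of 0.30 in Table 1, with 14 supplemental items given) projects to approximately 0.62, which agrees with the value reported in Table 7. The function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor.

    length_factor is the new number of items divided by the original number.
    """
    return (length_factor * reliability) / (
        1 + (length_factor - 1) * reliability
    )


# Administration, Analytical Thinking: values from Table 1 and Table 7.
base_items, base_alpha = 5, 0.30
supplemental_given = 14
# Assumption: the lengthened scale is the original items plus those given.
factor = (base_items + supplemental_given) / base_items
estimate = spearman_brown(base_alpha, factor)
print(round(estimate, 2))  # 0.62
```

The formula assumes the added items are parallel to the original items, so projected values should be treated as estimates until they are confirmed with observed data.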
Table 7. Estimated Reliability for the Supplemental Cognitive Workplace Simulation Scales
Cognitive Simulation | Analytical Thinking: # of Items in Pool, # of Items Given, Alpha | Attention to Detail: # of Items in Pool, # of Items Given, Alpha
Basic Entry-Level 78 26 0.93
Administration 39 14 0.62 87 29 0.84
Business Sales 78 26 0.88 50 17 0.85
General Office Workplace 45 15 0.79 81 27 0.78
Manager 48 16 0.75 66 22 0.84
Customer Service 96 24 0.76 56 14 0.84
All cognitive simulations now have a sequential cognitive supplement module in operation to bolster
reliability.
Attitudes, Interests and Motivations Assessment
Previous Research
The Big Five framework for understanding personality is the most widely accepted taxonomy of personality
traits (Goldberg, 1992; 1993; McCrae & Costa, 1997; McCrae & John, 1992). The five factors, or traits, are
Agreeableness, Conscientiousness, Emotional Stability, Extraversion, and Openness. Agreeableness is
characterized as warmth, ability to get along with others, and friendliness. Individuals who are
conscientious are characterized as being reliable, responsible, dependable, thorough, dutiful,
achievement-oriented, and competent. Individuals considered Emotionally Stable are more resilient, stable,
and centered. Extraverted individuals are characterized as being sociable, outgoing, and high in energy.
Openness describes individuals who are “creative, flexible, curious, and unconventional” (Judge & Ilies,
2002).
A substantial amount of research, including meta-analytic research, has demonstrated the relationship
between personality and job performance (Barrick & Mount, 1991; Bartram, 2005; Hogan & Holland, 2003;
Hurtz & Donovan, 2000; Mount & Barrick, 1995; Salgado, 1997; Tett, Jackson, & Rothstein, 1991). In a
meta-analysis of 25 years of personality and performance research drawn from 117 studies of managers,
professionals, salespeople, and skilled and semi-skilled workers, researchers concluded that extraversion
predicted success in management and sales; openness to experience predicted training ability; and
conscientiousness correlated with success in sales, management, professional, skilled and semi-skilled
positions (Barrick & Mount, 1991). The findings were similar in the European community (Salgado, 1997).
Personality traits are stable and slow to change (McCrae, Jang, Livesley, Riemann, & Angleitner, 2001). The
stability of personality traits and their relationship to job performance makes them extremely important to
measure before making a hiring decision.
Theory on personality in the workplace has advanced and researchers have demonstrated that validity
coefficients for personality assessments are stronger for those job performance criteria that can be
theoretically linked to a given personality trait. For example, Borman and Motowidlo (1993, 1997) found
support for their hypotheses that personality would have a stronger relationship with contextual rather than
task performance. The researchers defined contextual performance as those areas of performance that are
important to an organization but not explicitly part of the job duties of an individual (e.g., following
organizational rules, giving additional effort when needed, providing support to other organizational
members, etc.). Hogan and Holland (2003) found that the Hogan Personality Inventory, a personality
assessment based on the Big Five, was substantially related to multiple measures of job performance across
a wide variety of positions. The relationships were stronger for personality traits that were theoretically
linked to the job performance measure. Bartram (2005) had similar findings, supporting the notion that
theoretically linked personality constructs and job performance domains had stronger relationships than
personality traits with overall performance. The trait-activation model proposed by Tett and Burnett (2003)
provides a theoretical model allowing for more detailed predictions regarding when a specific personality
trait might be linked to job performance.
Another approach to improving prediction with personality assessment is to measure narrower traits that are
specific facets of the Big Five (Paunonen, Rothstein, & Jackson, 1999; Schneider, Hough, & Dunnette,
1996). Researchers who argue for this approach posit that some facets of a trait may relate to a particular job
performance dimension, but that others may not - or may even be negatively related. Measuring at the broad
level may dilute the relationship between the narrower facet and the job performance dimension. For
example, achievement and reliability are both narrow facets of conscientiousness. One might expect a strong
positive relationship between reliability and retention, but there may not be a relationship between
achievement and retention. Using an assessment that measures the broad trait of conscientiousness may
under-predict retention. Researchers conducted a meta-analysis and demonstrated the superiority of using
narrow facets to predict specific job performance dimensions (Woo, Chernyshenko, Stark, & Conz, 2014).
Furthermore, many personality assessments used for pre-employment selection are actually compound traits
– or constellations of narrow facets (Schneider et al., 1996). These constellations may come from one or
more of the broader personality traits. In fact, instruments used in both the Hogan and Holland (2003) and
Bartram (2005) meta-analyses contain measures of compound traits.
AIMS v3 Content Development
A review of the literature on personality in the workplace allowed us to identify several competencies, or
compound traits, that are related to successful job performance for many jobs. The ten scales, their
definitions, and example items are listed below. Please note, example items are similar to, but do not reflect,
actual items from the assessment.
Needs Structure
Often, following rules and procedures is critical to successful job performance. When employees do not
follow established organizational policies, work processes may be performed incorrectly, work products may
be flawed, customer service levels may suffer, the organization may become a victim to fraudulent wage and
expense billings, and in some cases the organization may be liable for the actions of the employee. The
Needs Structure scale was designed to evaluate an applicant’s tendency to adhere to organizational rules and
procedures and should be important in jobs where rule-abiding behaviors are related to job performance.
Example: I prefer to make extensive plans before I start any project.
Innovative and Creative
For many positions, Innovation and Creativity is critical to successful job performance. Individuals with
high levels of this competency should be more likely to generate new products, develop improved methods
for producing work, and develop solutions to problems.
Example: Some of my suggestions are really eccentric.
Enjoys Problem-Solving
Enjoys Problem-Solving is a competency designed to predict which applicants will be more successful in
roles dealing with data, completing analyses, and conducting research. Individuals who enjoy problem
solving should be more successful in roles that require analytical thinking.
Example: I enjoy learning about how things work.
Competitive
The Competitive competency evaluates the extent to which an individual is likely to do what is necessary to
accomplish their goals – which may be of concern if the individual’s goals are not aligned with the
organization’s goals. Individuals who score highly on this competency are characterized as having concern
for outcomes rather than the feelings of coworkers.
Example: I am not above using people to get my way.
Seeks Perfection
When the quality of work (as opposed to pace) is important for success in a role, the Seeks Perfection
competency should predict performance. Those who score high on this competency are more likely to
double-check their work and have higher rates of accuracy. Additionally, individuals high on this
competency are more likely to be detail-oriented.
Example: My work tends to be faultless.
Develops Relationships
For many roles, working in teams and with others is critical for successful performance. The Develops
Relationships scale was designed to predict which individuals are more likely to develop productive working
relationships and be successful in a team situation.
Example: I have never deliberately said anything that hurt someone's feelings.
Expressive and Outgoing
Individuals who are Expressive and Outgoing are more likely to engage with others. Being Expressive and
Outgoing is also characterized by the ability to influence and persuade others. This competency is important
for success in roles where influence is necessary such as leadership and sales positions.
Example: I tend to take control in most work situations.
Corporate Citizenship
The Corporate Citizenship competency was designed to measure applicants’ tendency to conduct
themselves ethically and responsibly in an organization. Individuals who score highly on this competency are
likely to be honest with managers and colleagues and avoid taking advantage of people and/or situations
just because it suits their self-interest.
Example: I would never use manipulation as a tactic to advance my goals.
Exhibits a Positive Work Attitude
This competency evaluates the tendency for an individual to feel positively toward their job and the
organization they work for. Individuals who score highly on this competency are characterized as feeling
satisfied with their work and are more likely to put in extra effort when needed.
Example: I volunteer for additional work.
Adaptable
The Adaptable competency measures an individual’s tendency to adjust to changes in their work
environment. Individuals who score highly on this competency thrive in fast-paced settings and respond
well to variety.
Example: I think organizational changes are fun and exciting.
The figure below outlines the hypothesized relationships between the AIMs scales and the Big Five traits.
As mentioned above, the AIMs scales are compound traits and measure competencies related to multiple
narrow facets of personality and these facets may be under one or more of the five broader personality
traits. Therefore, an AIMs scale may be expected to relate, albeit moderately, to multiple Big Five traits.
Figure 3. Hypothesized Relationships between the AIMs Scales and the Big Five
The original assessment consisted of 100 Likert-type items. Each item consists of a single statement and
candidates are asked to indicate their level of agreement with the statement. Preliminary data analyses
allowed us to shorten the scales such that the assessment contains a total of 65 items. The AIMs assessment
is available in two forms. The professional form measures all ten competencies. The entry-level form
measures eight competencies as it was determined that Expressive and Outgoing and Innovative and
Creative would be less critical competencies for these roles. The time to take the professional form is
estimated to be between 6 and 7 minutes, and the time to take the entry-level form is estimated to be between
5 and 6 minutes. The estimates were calculated using a sample of 402 individuals.
Reliability
The descriptive statistics and reliability estimates of the AIMs scales were estimated using the data collected
from live applicants in the course of applying for jobs. Analyses were restricted to US samples. As can be
seen from the table below, most AIMs scales have acceptable levels of internal consistency. Please note,
based on these findings, further analyses were done at the item and distractor level. These analyses led the
developers to make substantial edits to these items, and we anticipate that these edits will substantially
improve the reliability of the tools. Further research evaluating the effectiveness of these changes is forthcoming.
Table 8. Descriptive Statistics and Reliability Evidence for the AIMS v3 Scales
Competency N M SD Alpha
Needs Structure 1999 30.47 3.75 .69
Innovative & Creative 1999 26.94 4.34 .77
Enjoys Problem-Solving 1999 27.94 4.71 .84
Competitive 1999 32.22 6.15 .77
Seeks Perfection 1999 27.95 4.46 .73
Develops Relationships 1999 29.39 3.96 .70
Expressive & Outgoing 1999 19.54 5.52 .69
Corporate Citizenship 1999 47.87 5.68 .81
Exhibits a Positive Work Attitude 1999 30.92 3.85 .74
Adaptable 1999 29.49 3.70 .48
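For readers who want to see how an internal consistency estimate of this kind is typically obtained, the sketch below computes Cronbach's alpha from item-level responses. It is a minimal illustration only: the function name and the demo responses are hypothetical, and it does not reproduce the analysis code or data behind Table 8.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total (summed) score
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert responses (4 respondents x 3 items)
demo = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [3, 3, 2]]
alpha = cronbach_alpha(demo)
print(round(alpha, 2))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why shortening a scale or mixing loosely related items tends to lower it.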
Validity
A Confirmatory Factor Analysis (CFA) was run using R (version 3.1.2) and the “sem” package to evaluate
whether or not the internal structure of the assessment conformed to the hypothesized structure. In other
words, we assessed the extent to which the items loaded onto the particular scale they were assigned to. The
Root Mean Square Error of Approximation (RMSEA), an indicator of model fit, indicates that the model fit
was satisfactory, RMSEA=.0675, χ2 (1970, N=567) =7057.80, p<.001. This analysis provides validation support for the
internal structure of the assessment.
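As a check on how the fit index relates to the chi-square statistic reported above, the sketch below computes the RMSEA point estimate from the chi-square value, its degrees of freedom, and the sample size. The function name is illustrative, and the use of N - 1 in the denominator is an assumption (some software uses N); with these inputs either convention rounds to the reported value.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA point estimate; the (n - 1) denominator is one common convention."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported for the AIMs CFA: chi-square = 7057.80, df = 1970, N = 567
value = rmsea(7057.80, 1970, 567)
print(round(value, 4))
```

With the reported inputs this yields approximately .0675, in line with the value given in the text.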
As mentioned previously, the content for the AIMs scales was modified and adapted from a longer
assessment that had been used in previous validation research. The developer of the original content
provided the results of several criterion-related validity studies and tables with these results can be found in
Appendix C.
Validation evidence is available from two studies. The first included 130 managers and the second included
64 managers. In the first study, assessment scores predicted job performance (r=.25; p<.01) using a simple
“High” or “Marginal” categorization of job performance. In the second study, assessment scores predicted
performance on a 100-point administrative performance appraisal, using a Spearman correlation (r=.25; p<.05). We
expect validity evidence to be more robust in future studies with better criterion measures. A brief summary
of the results is presented in Appendix G.
Fairness
In order to evaluate how different demographic subgroups perform on the assessments, mean differences
between protected subgroups were compared (Tables 9-13). There were too few cases (n<30) to run
analyses for the American Indian or Alaska Native and Native Hawaiian or Other Pacific Islander
subgroups. There were some substantive mean differences on the AIMs scales between males and females;
however, the differences were inconsistent in direction and relatively small, with an average standardized
mean difference of 0.029, very slightly favoring males. There were no substantive mean differences based on
ethnicity, with an average standardized mean difference of 0.033, very slightly favoring Not Hispanic or
Latino people. There were a few substantive mean differences with respect to age group, with a small average
standardized mean difference favoring individuals under 40 (0.108). There were a few substantive mean
differences between the Asian and White subgroups, notably on the Corporate Citizenship scale. However,
these differences might reflect cultural differences; care should be taken to document the business necessity
for use of scales with large group differences. Overall, there was a small average standardized mean
difference favoring the White subgroup (0.113). There were some small differences between the Black or
African American subgroup and the White subgroup; the average standardized mean difference was -0.230,
with most scales favoring the Black or African American subgroup.
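The d-values in the fairness tables can be illustrated with a short sketch. It assumes the standardized mean difference is Cohen's d with a pooled standard deviation; the manual does not state the formula, but this form reproduces the printed value for the Competitive scale by gender. The average reported in the text corresponds to the simple mean of the ten scale-level d-values. Function and variable names are illustrative.

```python
import math


def cohens_d(n1, m1, sd1, n2, m2, sd2):
    """Standardized mean difference (group 2 minus group 1) using a pooled SD."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m2 - m1) / math.sqrt(pooled_var)


# Competitive scale by gender: males are group 1, females are group 2
d_competitive = cohens_d(462, 33.90, 6.37, 1383, 30.42, 6.18)

# Scale-level d-values as printed in the gender table
gender_d = [-0.56, 0.38, 0.13, -0.16, 0.19, -0.40, -0.08, 0.07, 0.00, 0.14]
mean_d = sum(gender_d) / len(gender_d)

print(round(d_competitive, 2), round(mean_d, 3))
```

Because positive and negative scale-level values offset each other, a near-zero average can coexist with sizable differences on individual scales, so the per-scale values should be reviewed alongside the average.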
Table 9. Evaluation of AIMs v3 Score Differences by Gender
Male    Female
Scale n M SD n M SD d
Competitive 462 33.90 6.37 1383 30.42 6.18 -0.56
Corporate Citizenship 462 48.84 5.27 1383 50.63 4.52 0.38
Develops Relationships 462 29.02 4.04 1383 29.54 3.99 0.13
Enjoys Problem Solving 462 29.11 4.71 1383 28.37 4.71 -0.16
Exhibits a Positive Work Attitude 462 31.02 3.79 1383 31.73 3.70 0.19
Expressive and Outgoing 462 19.71 5.24 1383 17.68 4.96 -0.40
Innovative and Creative 462 27.48 4.50 1383 27.11 4.53 -0.08
Needs Structure 462 30.53 3.79 1383 30.81 3.96 0.07
Seeks Perfection 462 28.55 4.57 1383 28.55 4.48 0.00
Adaptable 462 29.11 3.80 1383 29.62 3.65 0.14
Note. Positive d-values indicate females scored higher
Table 10. Evaluation of AIMs v3 Score Differences by Ethnicity
Hispanic or Latino    Not Hispanic or Latino
Scale n M SD n M SD d
Competitive 233 30.48 6.20 1327 31.33 6.44 0.13
Corporate Citizenship 233 51.11 4.12 1327 50.06 4.80 -0.22
Develops Relationships 233 28.74 4.13 1327 29.46 4.01 0.18
Enjoys Problem Solving 233 28.34 4.52 1327 28.59 4.76 0.05
Exhibits a Positive Work Attitude 233 31.81 3.66 1327 31.43 3.83 -0.10
Expressive and Outgoing 233 17.27 4.84 1327 18.38 5.14 0.22
Innovative and Creative 233 26.74 4.31 1327 27.23 4.50 0.11
Needs Structure 233 31.03 4.13 1327 30.60 3.93 -0.11
Seeks Perfection 233 28.28 4.62 1327 28.59 4.44 0.07
Adaptable 233 29.46 3.82 1327 29.46 3.65 0.00
Note. Positive d-values indicate Not Hispanic or Latino scored higher
Table 11. Evaluation of AIMs v3 Score Differences by Age Group
40 and Older Less than 40
Scale n M SD n M SD d
Competitive 623 30.00 6.13 1238 31.95 6.42 0.31
Corporate Citizenship 623 51.45 3.32 1238 49.58 5.17 -0.41
Develops Relationships 623 28.74 3.87 1238 29.78 4.07 0.26
Enjoys Problem Solving 623 27.95 4.57 1238 28.92 4.74 0.21
Exhibits a Positive Work Attitude 623 31.72 3.33 1238 31.53 3.90 -0.05
Expressive and Outgoing 623 17.04 4.36 1238 18.77 5.36 0.34
Innovative and Creative 623 26.29 4.37 1238 27.75 4.50 0.33
Needs Structure 623 30.48 3.99 1238 30.91 3.88 0.11
Seeks Perfection 623 28.32 4.48 1238 28.71 4.54 0.09
Adaptable 623 29.81 3.51 1238 29.39 3.76 -0.11
Note. Positive d-values indicate people under 40 scored higher
Table 12. Evaluation of AIMs v3 Score Differences by Race Groups: Asian and White
Asian White
Scale n M SD n M SD d
Competitive 601 32.32 6.41 637 30.47 6.29 -0.29
Corporate Citizenship 601 46.99 5.77 637 51.66 2.77 1.04
Develops Relationships 601 29.56 4.04 637 29.08 3.84 -0.12
Enjoys Problem Solving 601 27.82 4.81 637 28.45 4.43 0.14
Exhibits a Positive Work Attitude 601 30.75 4.03 637 31.54 3.40 0.21
Expressive and Outgoing 601 20.10 5.50 637 17.66 4.47 -0.49
Innovative and Creative 601 26.95 4.35 637 26.83 4.27 -0.03
Needs Structure 601 30.28 3.65 637 30.16 4.15 -0.03
Seeks Perfection 601 27.96 4.47 637 28.48 4.42 0.12
Adaptable 601 27.96 3.50 637 29.96 3.41 0.58
Note. Positive d-values indicate White applicants scored higher
Table 13. Evaluation of AIMs v3 Score Differences by Race Groups: Black or African American and White
Black or African American    White
Scale n M SD n M SD d
Competitive 503 31.32 6.25 637 30.47 6.29 -0.13
Corporate Citizenship 503 52.19 2.70 637 51.66 2.77 -0.19
Develops Relationships 503 29.93 4.20 637 29.08 3.84 -0.21
Enjoys Problem Solving 503 29.94 4.68 637 28.45 4.43 -0.33
Exhibits a Positive Work Attitude 503 32.61 3.44 637 31.54 3.40 -0.31
Expressive and Outgoing 503 16.87 4.73 637 17.66 4.47 0.17
Innovative and Creative 503 28.28 4.94 637 26.83 4.27 -0.32
Needs Structure 503 32.12 3.68 637 30.16 4.15 -0.50
Seeks Perfection 503 29.64 4.57 637 28.48 4.42 -0.26
Adaptable 503 30.73 3.59 637 29.96 3.41 -0.22
Note. Positive d-values indicate White applicants scored higher
AIMS v4 Content Development
A common request from employers who use selection assessments is to have a shortened form so that
candidates for employment spend less time being assessed. This is designed to improve the applicant
experience and make for less administrative overhead. Psychometrically, however, this poses a challenge, as
shortening scales can reduce the assessment’s reliability and validity, thereby reducing its usefulness for
employers.
To address these concerns, HR Avatar created a shortened form of the AIMS assessment that focuses on
only a few scales. These scales were carefully chosen to represent the most predictive and broadly applicable
scales that would work for nearly all positions. The advantage of this strategy is that it shortens the
assessment while at the same time maintaining (or improving) the reliability and validity of the individual
scales. Utilizing a mix of old and new items, the following scales comprised the shortened AIMS assessment:
Integrity
Integrity is one of the most widely predictive personality scales in addition to being an important
psychological construct. Individuals high in integrity do the right thing even when no one is watching and
strive to act with ethical motivations at all times. Individuals low in integrity frequently break the rules. This
scale is designed to predict both overall job performance as well as counter-productive behaviors at work
(e.g., absenteeism, theft).
Example: I frequently ignore the rules at work.
Teamwork
With the nature of work increasingly involving collaboration with others, it is important to have a scale that
distinguishes those who would thrive in such an environment from those who prefer individual work.
Individuals high on the Teamwork scale derive energy from working with others and enjoy tasks involving a
lot of teamwork. Individuals low on the Teamwork scale prefer to work solo and may find extensive
teamwork tiring.
Example: I do my best work when working as a member of a team.
Drive
Conscientiousness is arguably the most well-validated personality trait for the prediction of job performance
(Barrick & Mount, 1991). The Drive scale is designed to measure a subset of conscientiousness, focusing on
the most predictive parts, achievement and dependability (Dudley, Orvis, Lebiecki, & Cortina, 2006).
Individuals high on this scale are driven to achieve as much as possible at work while being detailed.
Individuals low on this scale may not achieve as much, nor are they particularly detailed.
Example: I always try to achieve as much as possible at work.
Empathy and Emotional Self-Control
Empathy and Emotional Self-Control is derived from a model of emotional intelligence (see the next
section) and is included in this version of the AIMS assessment. High scores on this scale indicate people
are able to appropriately regulate their own emotions and be sensitive to the needs of others. Low scores on
this scale indicate people may struggle to understand others and be more expressive in terms of their own
emotions.
Example: I can tell when my coworkers are upset.
Reliability and Fairness
The descriptive statistics and reliability estimates of the AIMs v4 scales were estimated using the data
collected from live applicants in the course of applying for jobs. Analyses were restricted to US samples. As
can be seen from the table below, all these scales have acceptable levels of internal consistency.
Unfortunately, due to a low volume of testing, fairness analyses could not be completed at this time; a
future update will include these analyses once sufficient data are available.
Table 14. Descriptive Statistics and Reliability Evidence for the AIMS v4 Scales
Competency N M SD Alpha
Integrity 271 69 8.9 .78
Teamwork 271 58 5.2 .79
Drive 271 64 5.2 .79
Empathy and Emotional Self-Control 271 61 6.3 .77
Emotional Intelligence Module
Previous Research
Emotional Intelligence (EI) is a psychological construct that broadly refers to a person’s capability to
understand and reason through emotions and their effects on others. There are two primary models of EI:
ability models view EI as a form of intelligence, while trait models view EI as primarily a set of personality
variables. Substantial work has been done explaining why different models of EI relate to job
performance (e.g., Joseph, Jin, Newman, & O’Boyle, 2015), and the utility of EI as a predictor of job
performance is well established.
Several meta-analyses have established the validity of the relationship between EI and job performance.
Joseph and Newman (2010) found a validity of .47 for trait-based EI and .23 for ability-based EI. O’Boyle,
Humphrey, Pollack, Hawver, and Story (2011) expanded upon this work and found a validity of .28 for
commercially available EI measures built upon the trait model of EI. Additionally, research has consistently
shown that trait-based EI measures provide incremental validity in predicting job performance over
cognitive ability and personality measures (Joseph & Newman, 2010; O’Boyle et al., 2011; Andrei, Siegling,
Aloe, Baldaro, & Petrides, 2016).
EI has also been linked to several other important criteria, including academic performance, with a validity
of .20 for trait models (Perera & DiGiacomo, 2013) and job satisfaction, with a validity of .39 for trait
models (Miao, Humphrey, & Qian, 2017). With a measure that predicts a wide variety of important criteria,
Emotional Intelligence is a critical trait to assess.
Content Development
The Emotional Intelligence module consists of three dimensions intended to represent a trait approach to
the construct: self-control, self-awareness, and empathy. Questions were developed by PhD-level
Industrial/Organizational psychologists with expertise in emotional intelligence. Likert-type items were
developed targeting the three scales; initial pilot testing settled on the set of items included in this measure.
The three Emotional Intelligence subscales have six items each and are (note that example items are
representative of, but not included on the final scale):
Self-Control
The Self-Control scale asks questions about a person’s ability to regulate their emotions.
Example: I am able to stay calm in tense situations at work.
Self-Awareness
The Self-Awareness scale asks questions about a person’s ability to understand their emotions and the
effects they have on others.
Example: I can see how my moods affect my colleagues.
Empathy
The Empathy scale asks questions about a person’s ability to understand and respond appropriately to the
emotions and feelings of others.
Example: I can usually tell when my colleagues are upset.
Reliability
The descriptive statistics and reliability estimates of the Emotional Intelligence scales were estimated using
the data collected from live applicants in the course of applying for jobs. Analyses were restricted to US
samples. As can be seen from the table below, the overall Emotional Intelligence scale has an acceptable level
of reliability, while the individual subscale scores are provided for developmental purposes only. Please note,
based on these findings, further analyses were done and will inform future scale revisions.
Table 15. Descriptive Statistics and Reliability Evidence for the Emotional Intelligence Module
Competency N M SD ρxx
Emotional Intelligence Overall 413 78.95 9.42 .77
- Self-Control 413 30.31 3.12 .60
- Self-Awareness 413 26.23 4.65 .63
- Empathy 413 25.31 4.68 .69
Fairness
In order to evaluate how different demographic subgroups perform on the assessments, mean differences
between protected subgroups were compared (Tables 16-19). There were too few cases (n<30) to run
analyses for most racial groups; only Black/White differences are reported here. There were no substantive
mean differences on the Emotional Intelligence scales between males and females. There were some small
to moderate group differences on the Emotional Intelligence scales for Ethnicity, Age, and Race; care
should be taken when using these scores so as not to create unnecessary adverse impact. Further research
will investigate reasons for these differences.
Table 16. Evaluation of EI Score Differences by Gender
Male    Female
Scale n M SD n M SD d
Emotional Intelligence Overall 56 80.39 9.57 317 78.87 9.43 -0.16
Self-Control 56 30.48 3.03 317 30.37 3.10 -0.04
Self-Awareness 56 27.00 4.49 317 26.10 4.78 -0.19
Empathy 56 25.45 4.67 317 25.37 4.72 -0.02
Note. Positive d-values indicate females scored higher
Table 17. Evaluation of EI Score Differences by Ethnicity
Hispanic or Latino    Not Hispanic or Latino
Scale n M SD n M SD d
Emotional Intelligence Overall 84 76.61 9.33 272 79.39 9.32 0.30
Self-Control 84 30.38 2.67 272 30.35 3.12 -0.01
Self-Awareness 84 24.86 4.29 272 26.40 4.67 0.34
Empathy 84 24.12 5.07 272 25.58 4.59 0.31
Note. Positive d-values indicate Not Hispanic or Latino scored higher
Table 18. Evaluation of EI Score Differences by Age Group
40 and Older Less than 40
Scale n M SD n M SD d
Emotional Intelligence Overall 160 76.76 9.19 232 80.63 9.26 0.42
Self-Control 160 29.99 3.12 232 30.70 2.97 0.23
Self-Awareness 160 25.33 4.84 232 26.87 4.45 0.33
Empathy 160 24.36 4.68 232 25.95 4.67 0.34
Note. Positive d-values indicate people under 40 scored higher
Table 19. Evaluation of EI Score Differences by Race Groups: Black or African American and White
Black or African American    White
Scale n M SD n M SD d
Emotional Intelligence Overall 152 81.75 10.11 183 77.56 7.87 -0.47
Self-Control 152 31.27 2.99 183 29.66 3.04 -0.53
Self-Awareness 152 27.61 4.91 183 25.43 3.77 -0.50
Empathy 152 25.89 5.06 183 25.34 4.17 -0.12
Note. Positive d-values indicate White applicants scored higher
Workplace Competency Assessment
Previous Research
Sometimes called low-fidelity simulations (e.g., Motowidlo, Dunnette, & Carter, 1990), situational judgment
tests (SJTs) have a long history in personnel selection. These assessments present a job-related situation
followed by several possible responses to that situation. The applicant is then asked to either choose a
response or rate the effectiveness of the responses.
The predictive validity of SJTs has long been established. In the first meta-analysis on the issue, McDaniel,
Morgeson, Finnegan, Campion, and Braverman (2001) found that SJTs predicted overall job performance
quite well, with a validity of .34. This has been confirmed in updated meta-analyses (McDaniel,
Hartman, Whetzel, & Grubb, 2007) and extended to note that SJTs are predictive of multiple facets of job
performance (Christian, Edwards, & Bradley, 2010).
Regarding expected adverse impact, there are some small-to-moderate race and ethnic group differences
on SJTs (Whetzel, McDaniel, & Nguyen, 2008). For African-American, Asian, and Hispanic groups, the
differences range from .20 to .40, depending on the type of test used, and could lead to adverse impact
against these groups. Roth, Bobko, and Buster (2013) extended these findings, noting that differences were
smaller for SJTs focusing on interpersonal skills than for cognitively loaded situations. No substantial
gender differences were found (Whetzel et al., 2008).
Content Development
After a literature review, HR Avatar worked with a team of subject matter experts to define target
competencies predictive of leadership and job performance. From these definitions, a PhD-trained I/O
Psychologist with expertise in SJT test development wrote the items, which were then reviewed with the
team of SMEs. After the items were finalized, they were pilot tested to rate the effectiveness of each
response option. The final version of the Workplace Competency assessment consists of the five
competencies listed below.
Coaching and Developing Others
Identifies the development needs of others and coaches, mentors, or otherwise helps others to improve
their knowledge or skills. Starts coaching and developing by building a relationship of mutual trust,
then works together to decide what to accomplish, sets a goal, makes a roadmap for reaching the goal, and
gives feedback along the way. Provides specific behavioral examples when giving feedback on performance
issues, clarifies expectations, and gets a commitment from the employee to act.
Exercising Political Savvy
Understands how to position self and communicate objectives in the context of organizational issues and
other personnel, to maximize outcomes both for one’s group and the organization. Gets people to
cooperate with oneself, socializes ideas and builds bridges to meet others halfway.
Guiding, Directing, and Motivating Others
Provides direction and guidance to subordinates, including setting performance standards and monitoring
performance. Coordinates the work and activities of others. Encourages goal accomplishment. Makes
detailed plans that consider what is most important. Communicates priorities to team members. Holds team
accountable for their work. Provides advice that is reasonable and socially aware.
Resolving Conflicts and Meeting Customer Needs
Handles complaints. Looks for ways to solve problems collectively and agree on next steps. Settles disputes
and resolves grievances and conflicts, or otherwise negotiates with others. Works to understand the views of
both sides of a conflict, ensures relevant information is shared and considered, and helps parties in a conflict
to find common objectives.
Team Building
Engages and participates in activities that support improved team social relations, building mutual trust,
respect, communication, understanding and cooperation among team members. Focuses on providing a
team environment that is conducive to collaboration, fostering innovation and creativity, promoting
increased comfort level and celebration among team members.
Reliability
The descriptive statistics and reliability estimates of the Workplace Competency scales were estimated using
the data collected from live applicants in the course of applying for jobs. Applicants provide a rating of
effectiveness on each scenario, and their score is calculated as the absolute value of the z-score difference
from the true effectiveness; as such, the minimum possible score is zero, and lower scores are better. As can
be seen from the table below, the Workplace Competency scales have acceptable levels of reliability. Please
note, based on these findings, further analyses will be done to inform and improve future scale revisions.
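The distance-based scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the production scoring algorithm: the manual does not specify how the z-scores are formed or how per-item distances are combined, so the keyed means, the standard deviations used for standardization, and the averaging across response options are all hypothetical choices, as is the function name.

```python
def sjt_scale_score(ratings, key_means, key_sds):
    """Mean absolute z-score distance from the keyed effectiveness.

    ratings   -- applicant's effectiveness rating for each response option
    key_means -- keyed ("true") effectiveness per option (hypothetical values)
    key_sds   -- SD used to standardize each option (hypothetical values)
    """
    distances = [
        abs((rating - mean) / sd)
        for rating, mean, sd in zip(ratings, key_means, key_sds)
    ]
    # Assumption: the scale score is the average distance across options
    return sum(distances) / len(distances)


# An applicant who matches the key exactly scores 0, the best possible score
perfect = sjt_scale_score([4, 2], [4, 2], [1.0, 1.0])
# Ratings that deviate from the key produce a larger (worse) score
off_key = sjt_scale_score([5, 1], [4, 2], [1.0, 0.5])
print(perfect, off_key)
```

Under this sketch the score cannot fall below zero and grows with disagreement from the key, which matches the statement that lower scores are better.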
Table 20. Descriptive Statistics and Reliability Evidence for the Workplace Competency Assessment
Competency N M SD ρxx
Coaching and Developing Others 996 0.827 0.278 .80
Exercising Political Savvy 996 0.823 0.250 .77
Guiding, Directing, and Motivating Others 996 0.816 0.248 .71
Resolving Conflicts and Meeting Customer Needs 996 0.797 0.254 .69
Team Building 996 0.814 0.270 .76
Note. Lower scores are better.
Fairness
In order to evaluate how different demographic subgroups perform on the assessments, mean differences
between protected subgroups were compared (Tables 21-23). There were too few cases (n<30) to run
analyses for racial groups. Most of the group differences were quite small, suggesting only minimal risk for
adverse impact against the groups investigated. Further research will monitor these differences.
Table 21. Evaluation of Workplace Competency Score Differences by Gender
Male    Female
Scale n M SD n M SD d
Coaching and Developing Others 306 0.806 0.271 233 0.872 0.288 0.24
Exercising Political Savvy 306 0.809 0.244 233 0.841 0.246 0.13
Guiding, Directing, and Motivating Others 306 0.820 0.256 233 0.843 0.243 0.09
Resolving Conflicts and Meeting Customer Needs 306 0.799 0.259 233 0.801 0.264 0.01
Team Building 306 0.812 0.274 233 0.841 0.266 0.10
Note. Positive d-values indicate females scored higher
Table 22. Evaluation of Workplace Competency Score Differences by Ethnicity
Hispanic or Latino    Not Hispanic or Latino
Scale n M SD n M SD d
Coaching and Developing Others 63 0.765 0.226 318 0.891 0.298 0.44
Exercising Political Savvy 63 0.790 0.251 318 0.841 0.255 0.20
Guiding, Directing, and Motivating Others 63 0.795 0.247 318 0.850 0.265 0.21
Resolving Conflicts and Meeting Customer Needs 63 0.817 0.245 318 0.825 0.277 0.03
Team Building 63 0.788 0.246 318 0.854 0.278 0.24
Note. Positive d-values indicate Not Hispanic or Latino scored higher
Table 23. Evaluation of Workplace Competency Score Differences by Age Group
40 and Older Less than 40
Scale n M SD n M SD d
Coaching and Developing Others 198 0.816 0.268 334 0.865 0.294 0.17
Exercising Political Savvy 198 0.796 0.254 334 0.840 0.259 0.17
Guiding, Directing, and Motivating Others 198 0.783 0.248 334 0.858 0.261 0.29
Resolving Conflicts and Meeting Customer Needs 198 0.784 0.231 334 0.828 0.288 0.16
Team Building 198 0.777 0.248 334 0.861 0.291 0.30
Note. Positive d-values indicate people under 40 scored higher
Behavioral History Survey
Previous Research
Past behavior often predicts future behavior. Biographical data, or bio-data, assessments contain items
developed by identifying patterns associated with high productivity and low turnover. Hunter and Hunter
(1984) report the average validity coefficient for bio-data assessments to be .38. Other researchers have
estimated the validity to be .35 for supervisors (Rothstein, Schmidt, Erwin, Owens, & Sparks, 1990) and .53
for managers (Carlson, Scullen, Schmidt, Rothstein, & Erwin, 1999).
For example, Rothstein et al. (1990) identified several bio-data factors that could be applied to many jobs.
These included having a pervasive feeling of self-worth and confidence; believing that he or she
works better and faster than others in his or her area of specialization; having been recognized for
accomplishments; being outgoing; being a good communicator; taking clear positions; and feeling healthy
and satisfied with current life situations. The authors screened each biographical item for cross-validity and
then meta-analyzed data from 11,000 first-line supervisors from different organizations, age levels, genders,
job experience levels and tenures. They concluded that in all cases, validity estimates for these factors were
generalizable, stable across time, and did not appear to stem from acquired skills, knowledge or abilities.
McDaniel (1989) evaluated biographical questions about school suspensions, drug use, quitting school, prior
employment experience, grades, club memberships, contacts within the legal system, and socioeconomic
status. The results successfully predicted discharge from the military for problems such as alcohol and drug
use, desertion, imprisonment, and “discreditable incidents.”
Oswald, Schmitt, Kim, Ramsay, and Gillespie (2004) reported statistically significant bio-data correlations
with 12 dimensions of college student performance: knowledge, learning, artistic ability, multicultural
sensitivity, leadership, interpersonal skills, citizenship, health, careers, adaptability, perseverance, and ethics.
Their work showed incremental validity over the traditional use of SAT and ACT with fewer differences
between subgroups than traditional admission measures.
In a similar vein, Kanfer, Crosby, and Brandt (1988) identified correlations between bio-data and tenure, and
in a study of 555 real estate agents, Klimoski and Childs (1986) identified five major bio-data factors
associated with job, personal and career success: social orientation, economic stability, work ethic
orientation, educational achievement and interpersonal confidence.
Content Development
The Behavioral History Survey consists of three biographical history areas that generalize across most jobs:
tenure, performance, and unproductive behavior. Questions and forms were developed for both
professional and entry-level positions using a panel of experienced managers. Each bio-data item was
reviewed by an expert management panel and scored using a modified Angoff method. There are a few
versions of each form available because some of the items were relevant for some positions but irrelevant
for others. For example, an item asking about experience with an industry would not be appropriate for a
position that spans multiple industries (e.g. Administrative Assistant). Scores for each competency are
calculated by averaging across items within that competency. The use of averaging, rather than summing
across items, allows HR Avatar to compare similar scales across multiple versions of the assessment. The
entry-level form contains 14-15 items and takes approximately 2 minutes to complete. The professional
form contains 18-20 items and takes approximately 6 minutes to complete. The completion times were estimated
with two samples (N=70 and N=371, respectively).
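The averaging step described above can be illustrated with a minimal sketch. The item scores and function name below are hypothetical and are not taken from the HR Avatar scoring engine; the point is only that averages stay on the same metric when versions of a scale differ in length.

```python
from statistics import mean


def scale_score(item_scores):
    """Return the average of the scored items on one competency scale."""
    if not item_scores:
        raise ValueError("a scale needs at least one scored item")
    return mean(item_scores)


# Hypothetical item scores for two versions of the same scale.
short_version = [3, 2, 3]        # three items, sum = 8
long_version = [3, 2, 3, 2, 3]   # five items, sum = 13

# The sums differ with form length, but the averages remain comparable.
short_average = scale_score(short_version)
long_average = scale_score(long_version)
```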
Performance
The Performance scale asks questions related to past performance on the job and should predict multiple
dimensions of job performance.
Example: How many times have you been promoted at work?
Tenure
The Tenure scale asks questions specifically related to tenure on previous jobs. This scale should predict
retention.
Example: What is your longest tenure with an organization?
Reliability
The scales in the Behavioral History Survey are relatively short – some with as few as three items.
Additionally, the scales were not expected to be internally consistent as there is no evidence that the
individual item responses should be highly correlated with one another. Therefore, a test-retest reliability
estimate was determined to be more appropriate than one of internal consistency (e.g., Cronbach’s alpha).
Descriptive statistics and reliability estimates were calculated using data collected via MTurk. This data
collection effort is summarized in the Cognitive Work Simulations section. Because participants were
allowed to take multiple versions of the solution, several participants took the Behavioral History Surveys
more than once. Test-retest reliability estimates were calculated by correlating the scale scores between the
first and second administrations of the form.
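As a rough sketch of this calculation, the example below correlates scale scores from two administrations. It assumes a Pearson correlation, which the manual does not name explicitly, and the scores shown are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between paired scores from two administrations."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired scores")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("scores must vary in both administrations")
    return cov / sqrt(var_1 * var_2)


# Hypothetical Tenure scale scores for five participants.
time_1 = [2.0, 2.5, 3.0, 2.2, 2.8]
time_2 = [2.1, 2.4, 2.9, 2.0, 3.0]
retest_reliability = pearson_r(time_1, time_2)
```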
For the Professional form, the sample was restricted to those individuals who had at least two days between
administrations. The largest number of days between administrations was 25. On average, there were 11
days between administrations (SD=7.49). The descriptive statistics and test-retest reliability estimates are in
Table 24, and all reliability estimates are acceptable.
For the Entry-Level form, the sample was restricted to those individuals who had at least two days, but less
than 30 days between administrations. The largest number of days between administrations was 20. On
average, there were 7 days between administrations (SD=4.76). The descriptive statistics and test-retest
reliability estimates are in Table 25, and all reliability estimates are acceptable.
Please note, there are multiple versions of the Performance scale for both the Professional and the Entry-
Level forms as described above. The descriptive statistics, test-retest reliability estimates and fairness
analyses for this scale are estimated by averaging across various versions of the scales.
Table 24. Descriptive Statistics and Reliability Evidence for the Professional Behavioral History Survey
Competency N M SD ρxx N for ρxx
Performance 4567 2.45 0.24 .81 65
Tenure 4567 2.42 0.37 .71 65
Table 25. Descriptive Statistics and Reliability Evidence for the Entry-Level Behavioral History Survey
Competency N M SD ρxx N for ρxx
Performance 1594 2.25 0.47 .74 151
Tenure 1594 2.50 0.33 .71 151
Fairness
In order to evaluate how different demographic subgroups perform on the assessments, mean comparisons
were computed for each of the subgroups (Tables 26-30). There were too few cases (n<25) to run analyses
for the American Indian or Alaska Native, Black or African American, Native Hawaiian or Other Pacific
Islander, Two or More Races, and Other subgroups. There were significant mean differences on the
Behavioral History scales between males and females. However, all differences were relatively small or they
were in favor of the focal subgroup, females. There were a few significant mean differences based on
ethnicity. Unfortunately, the sample is too small for the Hispanic and Latino subgroup to draw any
conclusions. As data become available, further research will explore these differences at the item and
response level. There were significant mean differences with respect to age group, but all were in favor of
the focal subgroup, 40 and older. There were a few significant mean differences between the Asian and
White subgroups. However, individuals in this sample were from multiple countries and it’s possible that
these differences might be reflections of cultural differences. Additional research is needed to further
evaluate these differences, to calculate country-specific norms, and to evaluate subgroup differences within
country.
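The manual does not state how the d values in the fairness tables were computed. The sketch below assumes the common pooled-standard-deviation form of Cohen's d, taken as the second group's mean minus the first group's mean. Under that assumption, the Professional Tenure values reported by gender reproduce the tabled effect size.

```python
from math import sqrt


def cohens_d(n_1, mean_1, sd_1, n_2, mean_2, sd_2):
    """Standardized difference (group 2 minus group 1) using the pooled SD."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_2 - mean_1) / sqrt(pooled_var)


# Professional Tenure by gender: male values first, female values second.
d_tenure = cohens_d(1310, 2.36, 0.38, 2945, 2.45, 0.37)
print(round(d_tenure, 2))  # 0.24
```

Because the tabled means and standard deviations are rounded to two decimals, values recomputed this way can differ from the reported d by about .01.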
Table 26. Evaluation of Biographical History Survey Score Differences by Gender
Male Female
Scale n M SD n M SD d
Professional Performance 1310 2.48 0.23 2945 2.43 0.24 -0.22
Professional Tenure 1310 2.36 0.38 2945 2.45 0.37 0.24
Entry-Level Performance 694 2.23 0.48 742 2.29 0.47 0.14
Entry-Level Tenure 694 2.48 0.34 742 2.54 0.31 0.17
Note. *p<.05
Table 27. Evaluation of Biographical History Survey Score Differences by Ethnicity
Hispanic or Latino Not Hispanic or Latino
Scale n M SD n M SD d
Professional Performance 448 2.44 0.24 3155 2.47 0.24 0.11
Professional Tenure 448 2.40 0.37 3155 2.49 0.38 0.23
Entry-Level Performance 139 2.28 0.47 1060 2.40 0.42 0.27
Entry-Level Tenure 139 2.52 0.34 1060 2.56 0.31 0.13
Note. *p<.05
Table 28. Evaluation of Biographical History Survey Score Differences by Age Group
40 and Older Less than 40
Scale n M SD n M SD d
Professional Performance 982 2.43 0.23 3323 2.49 0.24 0.21
Professional Tenure 982 2.37 0.37 3323 2.59 0.32 0.66
Entry-Level Performance 359 2.21 0.49 1084 2.43 0.36 0.53
Entry-Level Tenure 359 2.46 0.33 1084 2.68 0.26 0.79
Note. *p<.05
Table 29. Evaluation of Biographical History Survey Score Differences by Race Groups: Asian and White
Asian White
Scale n M SD n M SD d
Professional Performance 2626 2.45 0.24 753 2.47 0.24 0.08
Professional Tenure 2626 2.29 0.35 753 2.63 0.28 1.01
Entry-Level Performance 529 2.03 0.45 463 2.41 0.38 0.91
Entry-Level Tenure 529 2.37 0.32 463 2.63 0.29 0.85
Note. *p<.05
Table 30. Evaluation of Biographical History Survey Score Differences by Race Groups: Black and White
Black or African American White
Scale n M SD n M SD d
Professional Performance 687 2.43 0.23 753 2.47 0.24 0.17
Professional Tenure 687 2.70 0.25 753 2.63 0.28 -0.26
Entry-Level Performance 257 2.52 0.41 463 2.41 0.38 -0.28
Entry-Level Tenure 257 2.64 0.26 463 2.63 0.29 -0.04
Note. *p<.05
Industry-specific Supplements
To improve the reliability and validity of the biodata scales, some industry-specific items were added. These
additional items were only administered to candidates applying for jobs in specific industries. Data are
currently being collected on these items; reliability, validity, and fairness information will be updated once
sufficient data have been collected.
Knowledge and Skills Tests
Development Overview
Candidates need more than just the right combination of abilities, personality characteristics and
background in order to be successful in a job. Most often, specific knowledge and/or specific skills are also
required. Job knowledge tests measure job-relevant declarative knowledge such as technical information,
standards, and best practices as well as knowledge of specific processes and procedures. Job knowledge
tests serve as an indicator of previous job performance and serve as proximal predictors of future job
performance. Skills assessments evaluate a candidate’s ability to perform a specific task. Candidates who
begin work with a sufficient level of knowledge and/or skill should require less training and be able to
perform better faster. Hunter and Hunter (1984) report the average validity coefficient for job knowledge
tests to be .48. In their meta-analysis, Dye, Reck, and McDaniel (2007) found the average corrected
correlation coefficient for job knowledge tests to be .45 for job performance and .47 for training.
Correlations were even higher for complex jobs and higher to the extent that the job knowledge test was
similar to the job.
HR Avatar has developed dozens of knowledge assessments. See Appendix A for a complete listing. The
knowledge assessments were developed by reviewing multiple resources to identify appropriate items. For
example, the Food Safety Fundamentals assessment was developed by reviewing, among other resources,
the USDA Safe Food Handling Fact Sheets. Each knowledge test contains an item bank and half of the
items are randomly selected to be administered to an applicant. Research on the validity, reliability, and
fairness of these assessments is forthcoming. HR Avatar has three skills assessments: Sales Situation
Analysis, Typing Test, and Essay Test. The development and research on these assessments is described in
more detail below.
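The random selection of half of an item bank can be sketched as follows. The item identifiers are hypothetical, and rounding down for odd-sized banks is an assumption, since the manual does not say how such banks are handled.

```python
import random


def draw_items(item_bank, rng=None):
    """Randomly select half of the item bank, without replacement."""
    rng = rng or random.Random()
    n_to_draw = len(item_bank) // 2  # assumption: round down for odd-sized banks
    return rng.sample(item_bank, n_to_draw)


# Hypothetical bank of 20 item identifiers.
bank = [f"item_{i:02d}" for i in range(1, 21)]
administered = draw_items(bank, random.Random(2019))
```

Passing a seeded `random.Random` instance makes a draw reproducible for auditing; omitting it gives each applicant an independent draw.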
Validity
Initial validation evidence comes from a study with 130 managers. In that study, the assessment
predicted job performance (r=.25; p<.01) using a simple
“High” or “Marginal” categorization of job performance. We expect validity evidence to be more robust in
future studies with better criterion measures. A brief summary of the results is presented in Appendix G.
Sales Situation Analysis
Test Development
The Sales Situation Analysis is a scale that is integrated into the Business Sales Cognitive Work Simulation
(described above). In the simulation, the candidate is asked to assist and respond to customers and
colleagues and solve business-related problems. The Sales Situation Analysis scale specifically evaluates a
candidate’s ability to understand a customer’s needs and identify the most appropriate follow-up actions. In
the simulation, the candidate must read an email communication from a customer and then identify the
customer’s primary concern. Next, the candidate must identify which of several action items are most
appropriate for the specific sales situation.
Reliability
The descriptive statistics and reliability estimate for the Sales Situation Analysis were estimated using live
applicant data for sales positions. Table 31 contains the descriptive statistics and reliability estimate for this
scale. The Sales Situation Analysis scale is not hypothesized to be unidimensional, so the Cronbach’s alpha
estimate below is a significant underestimate of the actual reliability of the scale; unfortunately, test-retest
reliability estimates are unavailable at this time. Further research evaluating the effectiveness of these scales
and their revisions is forthcoming.
Table 31. Descriptive Statistics and Reliability Evidence for the Sales Situation Analysis Assessment
# of Items N M SD Alpha
Sales Situation Analysis 6 2004 2.92 1.15 .30
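For readers who want to see why a multidimensional scale can yield a low alpha, the following minimal sketch computes Cronbach's alpha from an item-response matrix. The response data are hypothetical and are not drawn from the applicant sample.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: two items that move together, then two unrelated items.
consistent_items = [[1, 1], [2, 2], [3, 3]]
unrelated_items = [[1, 0], [0, 1], [1, 1], [0, 0]]

alpha_consistent = cronbach_alpha(consistent_items)  # items perfectly aligned
alpha_unrelated = cronbach_alpha(unrelated_items)    # items uncorrelated
```

When items are uncorrelated, the sum of the item variances equals the variance of the total score and alpha falls to zero, even if each item is measured reliably.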
Fairness
In order to evaluate how different demographic subgroups perform on the assessments, mean differences
between protected subgroups were compared. The results are provided in Table 32. Small-to-zero
differences were observed between genders and between ethnicities. Applicants 40 and older scored slightly
higher on average. Data were also collected for racial groups; however, only Asian and Black or African
American subgroups had sufficient data to report. Some differences between racial groups were observed,
with the White subgroup scoring higher. The Black-White effect size is relatively small; the Asian-White
difference is moderately sized; future research will examine the source of these differences.
Table 32. Evaluation of Sales Situation Analysis Score Differences
n M SD d
Female 746 2.98 1.13
Male 965 2.89 1.18 .08
40 and older 283 3.16 1.21
Less than 40 1436 2.87 1.14 .25
Hispanic or Latino 126 2.93 1.16
Not Hispanic or Latino 1320 2.93 1.14 .00
White 237 3.49 1.10
Asian 1285 2.84 1.12 -.58
Black or African American 48 3.20 1.20 -.26
Typing Speed and Accuracy
Test Development
The typing test consists of three typing tasks. For each task, applicants are asked to type a short passage.
Each typing task is randomly selected from a group of five passages (15 total passages). For each task, the
words per minute are calculated. This score is modified to an accuracy-adjusted words per minute by
calculating and factoring in the rate of typing errors. The typing scores are calculated by averaging the words
per minute and accuracy-adjusted words per minute across the three tasks. There are two forms of this test
depending on the intended usage: Business and Academic.
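The manual does not give the exact formulas behind these scores. The sketch below is one plausible reading and rests on two labeled assumptions: a word is counted as five characters, and the accuracy adjustment multiplies gross speed by one minus the error rate. The task data are hypothetical.

```python
def words_per_minute(characters_typed, seconds, chars_per_word=5):
    """Gross speed; a 'word' is taken to be five characters (assumption)."""
    return (characters_typed / chars_per_word) / (seconds / 60)


def accuracy_adjusted_wpm(wpm, error_rate):
    """Discount gross speed by the proportion of errors (assumed formula)."""
    return wpm * (1 - error_rate)


def typing_scores(tasks):
    """Average both metrics across tasks.

    tasks: list of (characters_typed, seconds, error_rate) tuples.
    """
    gross = [words_per_minute(chars, secs) for chars, secs, _ in tasks]
    adjusted = [accuracy_adjusted_wpm(w, task[2]) for w, task in zip(gross, tasks)]
    return sum(gross) / len(gross), sum(adjusted) / len(adjusted)


# Three hypothetical tasks: (characters typed, seconds, error rate).
tasks = [(300, 60, 0.05), (250, 60, 0.00), (350, 60, 0.10)]
mean_wpm, mean_adjusted_wpm = typing_scores(tasks)
```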
Reliability
The typing test was piloted via MTurk (N=155). The majority of the sample was from the US (92.9%). The
pilot assessment consisted of five typing tasks and each typing task was randomly selected from a group of
five passages (25 total passages). Five one-way analyses of variance were completed to evaluate differences
in scores across the passages available for each item. No significant differences were found, F(4, 150)=.56,
p=.70, F(4, 150)=.27, p=.89, F(4, 150)=.79, p=.54, F(2, 121)=1.12, p=.33, F(2, 121)=.22, p=.80.
Descriptive statistics and the alpha estimate of internal consistency were estimated using live applicant data.
The internal consistency of the assessment is well above accepted standards.
Table 33. Descriptive Statistics and Reliability Estimates for the Typing Test
Typing Test N M SD alpha
Business 727 37.08 13.64 .97
Academic 1005 28.57 12.19 .98
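The one-way analyses of variance used to compare passages in the pilot can be sketched as shown below. This is a generic textbook computation with hypothetical scores, not the analysis script used for the pilot. With five passages and 155 participants, the degrees of freedom work out to (4, 150), which matches the first three F tests reported above.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical words-per-minute scores for three passages of one typing task.
passages = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova(passages)
```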
Fairness
In order to evaluate how different demographic subgroups perform on the assessment, independent
samples t-tests were performed to compare mean scores of the subgroups (Tables 34 and 35). Although data were
collected on ethnic group, the sample size for the ‘Hispanic or Latino’ subgroup was too small for analyses
(n=11). There were too few cases (n<10) to run analyses for the African American or Black, American
Indian or Alaska Native, Native Hawaiian or Other Pacific Islander, Two or More Races, and Other
subgroups. Significant mean differences were found for gender and between the Asian and White racial
subgroups. Given that this sample consisted of people piloting the assessment but not necessarily taking the
assessment in order to obtain a job, it is possible that the individuals were not attempting to perform their
best. Further research will evaluate norms using an applicant sample.
Table 34. Evaluation of Business Typing Differences by Subgroup
n M SD d
Female 452 36.72 13.37 .17
Male 172 39.05 13.27
40 and Older 131 36.36 14.62 .10
Less than 40 500 37.65 12.69
Hispanic or Latino 58 38.90 12.08 -.04
Not Hispanic or Latino 491 38.39 13.44
Asian 77 34.92 12.02 .66
Black 239 33.08 10.53 .86
White 241 43.57 13.51
Note. Negative d-values mean the referent group scored lower
Table 35. Evaluation of Academic Typing Differences by Subgroup
n M SD d
Female 587 30.55 11.77 -.40
Male 259 25.82 11.96
40 and Older 196 31.48 12.86 -.29
Less than 40 665 27.97 11.69
Hispanic or Latino 79 30.44 11.80 -.10
Not Hispanic or Latino 677 29.26 12.38
Asian 437 26.00 10.52 .91
Black 101 27.63 10.00 .72
White 230 36.29 12.73
Note. Negative d-values mean the referent group scored lower
Data Entry
Test Development
The data entry tests consist of three versions: 10 Key data entry where applicants enter data using a numeric
10-key keyboard, alphanumeric data entry where applicants type what appears on the screen, and oral
alphanumeric data entry where applicants type what they hear. For each test, applicants are asked to type the
data from a prompt, with each task randomly selected from a group of equivalently difficult prompts. For
each task, the keystrokes per hour are calculated. This score is modified to an accuracy-adjusted keystrokes
per hour by calculating and factoring in the rate of data entry errors. The data entry scores are calculated by
averaging the keystrokes per hour and accuracy-adjusted keystrokes per hour across the tasks.
Reliability
Descriptive statistics, reported as keystrokes per hour, and the alpha estimate of internal consistency were
estimated using live applicant data. The internal consistency of the assessment is well above accepted
standards.
Table 36. Descriptive Statistics and Reliability Estimates for the Data Entry Tests
Data Entry Test N M SD # of tasks alpha
10 Key 112 4597.76 1908.68 3 .83
Alphanumeric 104 4921.03 2726.08 10 .96
Oral Alphanumeric 256 3280.97 1075.49 5 .93
Fairness
In order to evaluate how different demographic subgroups perform on the assessment, mean scores were
compared between subgroups (Tables 37-39). Although data were collected on ethnic group, racial group,
and age, there were not enough data to calculate stable differences (N<30). There were small differences
between genders on each of the three test versions. As such, it is unlikely these tests will give rise to adverse
impact. Further research will investigate differences in other subgroups as more data accumulate.
Table 37. Evaluation of 10-key Score Differences by Subgroup
n M SD d
Female 35 4803.78 2155.19 -.14
Male 67 4521.90 1865.07
Note. Negative d-values mean the referent group scored lower
Table 38. Evaluation of Alphanumeric Score Differences by Subgroup
n M SD d
Female 39 4621.73 2369.35 .16
Male 64 5048.81 2915.12
Note. Negative d-values mean the referent group scored lower
Table 39. Evaluation of Oral Alphanumeric Score Differences by Subgroup
n M SD d
Female 75 3407.07 1082.38 -.08
Male 45 3380.43 1032.88
Note. *p<.05
Essay Test
Test Development
Written communication is a key skill in many positions. Communicating via email, writing reports, and
creating presentations all require the ability to communicate effectively. The HR Avatar Essay Test is a fast
method for evaluating an applicant’s written communication skills. The HR Avatar Essay Test consists of
one of two writing prompts. The writing prompts were designed to be general enough to provide an
opportunity for anyone to be able to write a short essay. The writing prompts are included below.
1. Describe the pros and cons of working from home.
2. Describe the pros and cons of living in a big city.
Applicants are asked to write a short essay with a minimum of 100 words and are given unlimited time to
do so.
The essays are scored using Discern, an open source, machine learning program. Discern was designed by
edX, a nonprofit organization founded by Harvard and the Massachusetts Institute of Technology (MIT)
(edX, 2015; Markoff, 2013). A published YouTube video provides additional information on
how the program works: https://www.youtube.com/watch?v=zFeP678054U (Paruchuri, 2015). The system
produces a score that ranges from 0 to 100. A confidence estimate for the score is also computed. The
confidence estimate can range from 0 to 1. Scores with confidence estimates less than .10 are not considered
valid.
In order to calibrate the program, HR Avatar used MTurk to collect writing samples for the two writing
prompts (N=170 and N=163). The essays were scored on three areas: Grammar, Structure and Content, by
three independent raters. Prior to rating each essay, the raters were provided with scoring rubrics (Appendix
D) and training (Appendix E) for how to score the essays using the rubrics. A total score was calculated for
each essay by aggregating scores on Grammar, Structure, and Content and then averaging across the raters
and linearly transforming the scores to a scale of 0-100. Scores were entered into the program as the
calibration sample for Discern.
In order to evaluate the reliability of the ratings provided by the three raters, intra-class correlation
coefficients were calculated using a two-way, mixed effects model (ICC3) (Shrout & Fleiss, 1979). The
ICC3 was chosen because it is the reliability of the average rating made by the specific raters in this study
that was of interest, and it is not necessary to generalize the reliability estimate to the population of
potential raters. There were ratings available for all three raters for 317 essays and the reliability of the
average aggregate rating was acceptable (ICC (3,3)=.74). There was also a large relationship between scores
on the two essay prompts, r(152)=.62, p<.01.
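For reference, the reliability of the average of k fixed raters, ICC(3,k) in Shrout and Fleiss's notation, is (BMS - EMS) / BMS, where BMS is the between-essay mean square and EMS is the residual mean square from a two-way essays-by-raters layout. The sketch below applies that formula to hypothetical ratings; it is not the analysis code used in the calibration study.

```python
def icc3k(ratings):
    """ICC(3,k) for a complete matrix: one row per essay, one column per rater."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between essays
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_error = ss_total - ss_rows - ss_cols                  # residual
    bms = ss_rows / (n - 1)
    ems = ss_error / ((n - 1) * (k - 1))
    return (bms - ems) / bms


# Hypothetical ratings: raters differ in leniency but rank the essays identically.
ratings = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
reliability = icc3k(ratings)
```

Because rater leniency is removed as a column effect in this model, raters who differ only by a constant offset still produce a coefficient of 1.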
In early January 2015 additional improvements were made to the Essay scoring. First, the 196 essays that
were completed since the initial rollout were manually scored by an individual rater and re-entered into the
Discern program to further calibrate the system. Second, HR Avatar added some additional safeguards to
the automated scoring engine. The essays are truncated to 800 words. Essays with fewer than 100 words,
more than 25% spelling errors, or more than 25% grammar or style errors are
given a score of 0. An additional program was written to search the HR Avatar database for matching essay
content. The system uses a combination of three methods to determine the similarity between the essay and
existing content in the system as a way of detecting plagiarism: the Levenshtein distance (Navarro,
2001), the Jaro distance metric (Jaro, 1989; Jaro, 1995), and the Jaro-Winkler distance metric (Winkler,
1990). If matching content is discovered, it is assumed that the essay is plagiarized and applicants will receive
a score of 0.
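To illustrate the first of the three similarity methods, the sketch below implements the standard Levenshtein edit distance and rescales it to a 0-1 similarity. The rescaling and the example strings are assumptions made for illustration. The manual does not describe how the three metrics are combined or what threshold counts as matching content, so neither is shown here.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to 0-1, where 1 means identical text (assumed scaling)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


# Hypothetical essay fragments.
submitted = "Working from home saves commuting time."
stored = "Working from home saves commute time."
score = similarity(submitted, stored)
```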
Descriptives
HR Avatar has collected essays from applicants applying to numerous jobs. Descriptive statistics for the
machine score and confidence have been provided for the essays in the table below.
Table 40. Descriptive Statistics for Essay Scores
n M SD
Prompt 1 (Working from home)
Score 930 55.24 12.12
Confidence 930 0.76 0.14
Prompt 2 (Living in a big city)
Score 943 49.43 11.03
Confidence 943 0.76 0.14
Fairness
In order to evaluate how different demographic groups perform on the assessment, mean scores were
compared between subgroups on the machine score. As can be seen in the tables below, the differences
between protected groups tend to be fairly small. Only comparisons between racial groups showed some
moderate to large differences; however, with a relatively small sample size for the referent group, these
results should be interpreted with caution. Future research will further investigate these differences.
Table 41. Evaluation of Essay Score Differences by Subgroup – Living in a Big City Prompt
n M SD d
Female 478 50.35 10.76 -.17
Male 370 48.42 11.68
40 and Older 100 52.06 11.66 -.26
Less than 40 709 49.17 11.12
Hispanic or Latino 63 48.49 11.23 .14
Not Hispanic or Latino 646 50.04 11.05
Asian 686 49.05 10.21 .91
Black or African American 34 53.47 13.75 .39
White 62 58.47 11.97
Table 42. Evaluation of Essay Score Differences by Subgroup – Working from Home Prompt
n M SD d
Female 528 56.40 10.24 -.25
Male 301 53.43 14.42
40 and Older 100 55.98 11.69 -.05
Less than 40 738 55.34 12.02
Hispanic or Latino 51 55.22 15.44 .05
Not Hispanic or Latino 649 55.84 11.73
Asian 658 55.19 12.19 .28
Black or African American 47 55.85 10.57 .24
White 65 58.57 11.79
Validity
Validation evidence is available in two studies that demonstrate the validity of the assessments. The first is a
study with 130 managers and the second is a study with 64 managers. In the first study, the assessment
demonstrated that it predicts job performance (r=.25; p<.01), using a simple “High” or “Marginal”
categorization of job performance, and a second study demonstrated the assessment predicted performance
on a 100 point administrative performance appraisal, using a Spearman correlation (r=.25; p<.05). We
expect validity evidence to be more robust in future studies with better criterion measures. A brief summary
of the results is presented in Appendix G.
Solution Scoring
In addition to providing scores at the scale level, HR Avatar also provides an overall score as an indication
of overall fit between the candidate and a given job. All competencies are grouped within four broad
categories: Cognitive Abilities, Skills and Knowledge, AIMs, and Behavioral History. Each competency, or
scale, is converted to a z-score. For the Cognitive Abilities, Skills and Knowledge, and Behavioral History
categories, a category score is created by averaging z-scores within each category.
For the AIMs and Cognitive Ability categories, O*NET is used to determine which of the competencies are
relevant for a given job and to determine the appropriate weights. O*NET is an online database that
contains specific information about hundreds of occupations (Peterson et al., 2001). The information
gathered about each job is categorized using strongly supported theoretical models about behavior in the
workplace. Additionally, the process for gathering the information and documenting it is a collaborative
effort. The O*NET Skills and Abilities Importance ratings are the average importance ratings given by at
least eight occupational analysts (Fleisher & Tsacoumis, 2012a, 2012b). The Occupational Analysts all have
two or more years of work experience, two or more years of graduate level education in a program related to
human resources or workplace psychology, and coursework in research methods and job analysis.
Occupational Analysts are provided with extensive training and detailed information about each job prior to
making ratings including job description, knowledge requirements, task descriptions and work context. The
AIMs and Cognitive scales were mapped onto the O*NET Worker Characteristics and Worker
Requirements. Weights are applied to the scales such that each scale receives a weight that is equivalent to
the proportion of its importance rating within O*NET. When a competency is listed more than once, which
is sometimes the case given that several Worker Characteristics or Worker Requirements might be mapped
to a given competency, the weight given to the competency corresponds to the highest importance rating.
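The weighting rule can be sketched as follows. The competency names and importance ratings are hypothetical, and reading "proportion of its importance rating" as each rating divided by the sum of ratings for the job is an assumption.

```python
def competency_weights(mappings):
    """Convert (competency, importance) pairs into proportional weights.

    When a competency appears more than once, its highest rating is kept.
    """
    highest = {}
    for competency, importance in mappings:
        highest[competency] = max(importance, highest.get(competency, importance))
    total = sum(highest.values())
    return {name: value / total for name, value in highest.items()}


def weighted_category_score(z_scores, weights):
    """Weighted sum of competency z-scores; the weights already sum to 1."""
    return sum(weights[name] * z_scores[name] for name in weights)


# Hypothetical O*NET importance ratings mapped to three competencies.
mappings = [
    ("Analytical Thinking", 4.0),
    ("Attention to Detail", 2.0),
    ("Analytical Thinking", 3.0),  # duplicate mapping with a lower rating
    ("Adaptability", 2.0),
]
weights = competency_weights(mappings)
z_scores = {"Analytical Thinking": 1.0, "Attention to Detail": 0.0, "Adaptability": -1.0}
category_score = weighted_category_score(z_scores, weights)
```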
For the knowledge tests, the raw scores are the percent correct. As of yet, these assessments do not have
sufficient data to estimate stable normative parameters. Therefore, a mean of .70 and an SD of .25 will be
used to estimate z-scores for applicants.
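Using the provisional parameters stated above, the conversion from percent correct to a z-score can be written as a short sketch. The function name is illustrative.

```python
PROVISIONAL_MEAN = 0.70  # provisional mean proportion correct, per the manual
PROVISIONAL_SD = 0.25    # provisional standard deviation, per the manual


def knowledge_z(proportion_correct):
    """Standardize a knowledge test score against the provisional norms."""
    return (proportion_correct - PROVISIONAL_MEAN) / PROVISIONAL_SD


z_high = knowledge_z(0.95)     # 25 points above the provisional mean
z_average = knowledge_z(0.70)  # exactly at the provisional mean
```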
An overall z-score is calculated by computing a weighted average of the competency categories. The
following weights are assigned to each category:
• Cognitive Ability competencies = 1
• Skills/Knowledge competencies = 0.8
• AIMs competencies = 0.7
• Behavioral History competencies = 0.4
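A minimal sketch of the weighted average follows. The dictionary keys are illustrative labels. Dividing by the sum of the weights for the categories actually administered is an assumption about how solutions that omit a category are handled.

```python
CATEGORY_WEIGHTS = {
    "cognitive_ability": 1.0,
    "skills_knowledge": 0.8,
    "aims": 0.7,
    "behavioral_history": 0.4,
}


def overall_z(category_z):
    """Weighted average of the category z-scores that are present."""
    total_weight = sum(CATEGORY_WEIGHTS[name] for name in category_z)
    weighted_sum = sum(CATEGORY_WEIGHTS[name] * z for name, z in category_z.items())
    return weighted_sum / total_weight


# Hypothetical candidate with only two categories administered.
partial = overall_z({"cognitive_ability": 1.0, "behavioral_history": -1.0})
```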
The overall z-score is transformed to a Normal Curve Equivalent (NCE) score. NCE scores have a mean of
50 and a standard deviation of 21.06 and maintain their equal-interval properties.
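Given the stated mean and standard deviation, the transformation is linear in the z-score. The sketch below shows it; whether the production system also bounds the result (NCE scales are conventionally reported from 1 to 99) is not stated in this manual, so no bounding is applied here.

```python
NCE_MEAN = 50.0
NCE_SD = 21.06


def z_to_nce(z):
    """Linear transformation of a z-score to the NCE metric."""
    return NCE_MEAN + NCE_SD * z


average_candidate = z_to_nce(0.0)
one_sd_above = z_to_nce(1.0)
```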
Construct Validity Evidence
Tables 43-47 contain the correlations between the scales in the HR Avatar Assessment Solution. Generally
speaking, the relationships conform to what would be expected. For example, within the AIMs assessment,
there is a strong negative relationship between the Competitive scale and the Exhibits a Positive Work
Attitude scale. The Enjoys Problem-Solving and Innovative and Creative scales are highly correlated,
which is to be expected as they are both facets of the Openness trait of the Big Five. Correlations between
various assessments can vary depending on the solution and providing all possible combinations is beyond
the scope of this technical report. Specific analyses and tables are available upon request.
Table 43. Correlations between the AIMs Scales
Scale 1. EaPWA 2. CC 3. C 4. E&O 5. I&C 6. SP 7. DR 8. EP-S 9. NS 10. A
1. Exhibits a Positive Work Attitude 1
2. Corporate Citizenship 0.559 1
3. Competitive -0.067 -0.195 1
4. Expressive and Outgoing -0.260 -0.471 0.497 1
5. Innovative and Creative 0.249 0.088 0.407 0.310 1
6. Seeks Perfection 0.224 0.165 0.456 0.161 0.571 1
7. Develops Relationships 0.252 0.113 0.335 0.290 0.646 0.535 1
8. Enjoys Problem-Solving 0.344 0.229 0.351 0.162 0.723 0.561 0.534 1
9. Needs Structure 0.313 0.252 0.238 -0.033 0.469 0.576 0.452 0.520 1
10. Adaptability 0.426 0.401 0.106 -0.062 0.418 0.357 0.400 0.441 0.331 1
Note. N = 1999
Table 44. Correlations between the Professional Behavioral History Scales
Performance Tenure
Performance 1
Tenure .037 1
Note. N = 2063
Table 45. Correlations between the Entry-Level Behavioral History Scales
Performance Tenure
Performance 1
Tenure .263 1
Note. N = 1035
Table 46. Correlations between the Emotional Intelligence Scales
Self Control Self Awareness Empathy
Self Control 1
Self Awareness .179 1
Empathy .081 .686 1
Note. N = 413
Table 47. Correlations between the Workplace Competency Scales
Scale 1. CaDO 2. EPS 3. GDaMO 4. RCaMCN 5. TB
1. Coaching and Developing Others 1
2. Exercising Political Savvy 0.684 1
3. Guiding, Directing, and Motivating Others 0.709 0.679 1
4. Resolving Conflicts and Meeting Customer Needs 0.502 0.613 0.554 1
5. Team Building 0.711 0.640 0.679 0.530 1
Note. N = 996
Technical Requirements
The HR Avatar solution is designed to be taken on a personal computer or a mobile device
including tablets and mobile phones. A high bandwidth connection is recommended, but not
required. All HR Avatar videos are compressed to less than 500kbps. Lower bitrate versions are
used for mobile devices.
The following web browsers are supported:
• Internet Explorer 6 and above with Flash 9.1.115 or above
• Internet Explorer 9 and above without Flash
• Chrome
• Firefox
• Opera
• Safari
The Flesch-Kincaid Reading Grade Level score is estimated to be 5.8. This indicates that an
applicant must have a reading level similar to that of a 5th or 6th grader in order to comprehend the
text in the assessment.
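For context, the Flesch-Kincaid grade level is computed from average sentence length and average syllables per word. The sketch below applies the standard formula to counts supplied by the caller. The counts are hypothetical, and how words, sentences, and syllables were tallied for the assessment text is not described in this manual.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid grade level from raw text counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage: 100 words, 10 sentences, 130 syllables.
grade = flesch_kincaid_grade(words=100, sentences=10, syllables=130)
print(round(grade, 2))  # 3.65
```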
HR Avatar recommends that the applicant take the assessment in a quiet setting that is free from
distractions. This will allow the applicant the best opportunity for demonstrating their skills and
abilities.
Future Research
HR Avatar is committed to providing employers with high quality and legally defensible assessments
for hiring employees. To that end, HR Avatar plans to continue accumulating reliability, validity, and
fairness evidence to support the use of the solutions. The list below contains several items from our
research agenda. Please contact us if you have any interest in partnering with HR Avatar on any of
the projects below.
Reliability
• Establish the internal consistency of the knowledge tests.
• Establish the test-retest reliability of the knowledge tests, the Typing Test and the Essay
Test.
• Establish the test-retest reliability of the composite scores for each solution.
Validity
• Conduct studies to establish the content-related validity of the knowledge assessments,
Typing Test and Essay Test.
• Conduct a study to examine the convergent validity evidence for the Cognitive Work
Simulation by comparing scores on the Cognitive Work Simulation and other established
measures of cognitive ability.
• Conduct a study to examine the convergent and divergent validity evidence for the AIMs
assessment by comparing scores on the AIMs assessment with other established measures of
personality – particularly those that measure the Big Five. The hypothesized relationships
can be found in Figure 1.
• Accumulate additional criterion-related validity evidence at the individual assessment level and
the composite level by conducting multiple studies on each solution. Using these studies, conduct
meta-analyses to provide evidence for validity generalization. Validation evidence is currently
available from two studies, the first with 130 managers and the second with 64 managers. In the
first study, the assessment predicted job performance (r = .25; p < .01), using a simple “High” or
“Marginal” categorization of job performance. In the second study, the assessment predicted
performance on a 100-point administrative performance appraisal, using a Spearman correlation
(r = .25; p < .05). We expect validity evidence to be more robust in future studies with better
criterion measures. A brief summary of the results is presented in Appendix G.
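To make the planned meta-analytic step concrete, the sketch below pools the two reported correlations with a sample-size-weighted mean, which is the starting point of a bare-bones meta-analysis, and computes the usual t statistic for testing a single correlation against zero. The function names are hypothetical. Pooling a correlation against a dichotomous criterion with a Spearman correlation is a simplification made only for illustration, and two studies are far too few for a real validity generalization analysis.

```python
from math import sqrt


def weighted_mean_r(studies):
    """Sample-size-weighted mean correlation across (n, r) pairs."""
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing whether a correlation differs from zero."""
    return r * sqrt((n - 2) / (1 - r ** 2))


if __name__ == "__main__":
    # (sample size, observed validity) for the two manager studies cited in the text.
    studies = [(130, 0.25), (64, 0.25)]
    print("weighted mean r:", round(weighted_mean_r(studies), 3))
    for n, r in studies:
        print(f"n = {n}: t({n - 2}) = {t_for_r(r, n):.2f}")
```

The t statistics can be compared against a t table with n - 2 degrees of freedom to check the reported significance levels.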
Fairness
• Conduct a sensitivity review of all assessment content.
• Conduct Differential Item Functioning (DIF) analyses for the items in the assessments to
determine if any of the items behave differently for subgroups as defined by race, ethnicity,
gender, and age group.
• Evaluate mean score differences on each assessment and at the composite level by subgroup.
• Simulate selection ratios for each group at various passing rates to estimate adverse impact
ratios.
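The adverse impact simulation described in the last item can be sketched as follows. For each target passing rate, a cut score is chosen from the pooled score distribution, the selection rate of each group is computed, and the focal group's rate is divided by the reference group's rate. Under the four-fifths rule of thumb, a ratio below 0.80 is commonly treated as a flag for possible adverse impact. The function names and the scores are hypothetical and are not HR Avatar data.

```python
def selection_rate(scores, cut):
    """Proportion of a group scoring at or above the cut score."""
    return sum(s >= cut for s in scores) / len(scores)


def adverse_impact_ratio(focal, reference, cut):
    """Focal-group selection rate divided by reference-group selection rate."""
    ref_rate = selection_rate(reference, cut)
    if ref_rate == 0:
        return None  # ratio is undefined when no one in the reference group passes
    return selection_rate(focal, cut) / ref_rate


def cut_for_pass_rate(pooled, pass_rate):
    """Cut score at which roughly `pass_rate` of the pooled sample passes.

    Tied scores at the cut can push the realized passing rate above the target.
    """
    ranked = sorted(pooled, reverse=True)
    n_pass = max(1, round(pass_rate * len(ranked)))
    return ranked[n_pass - 1]


def simulate(focal, reference, pass_rates):
    """Return (pass_rate, cut, ratio, flagged) for each target passing rate."""
    results = []
    for p in pass_rates:
        cut = cut_for_pass_rate(focal + reference, p)
        ratio = adverse_impact_ratio(focal, reference, cut)
        flagged = ratio is not None and ratio < 0.8  # four-fifths rule of thumb
        results.append((p, cut, ratio, flagged))
    return results


if __name__ == "__main__":
    # Hypothetical composite scores for two subgroups.
    reference = [50, 60, 70, 80, 90]
    focal = [40, 50, 60, 70, 80]
    for p, cut, ratio, flagged in simulate(focal, reference, [0.3, 0.5, 0.7]):
        print(p, cut, None if ratio is None else round(ratio, 2), flagged)
```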
Norms
• Update estimates of global norms and estimate norms at the country and/or region level.
• Compare scores across formats (PCs, tablets, mobile phones).
References
American Educational Research Association, American Psychological Association, National Council
on Measurement in Education. (2014). The Standards for Educational and Psychological Testing.
Washington DC: American Educational Research Association.
Andrei, F., Siegling, A. B., Aloe, A. M., Baldaro, B., & Petrides, K. V. (2016). The incremental
validity of the Trait Emotional Intelligence Questionnaire (TEIQue): A systematic review
and meta-analysis. Journal of Personality Assessment, 98, 261-276.
Barrick, M. R., & Mount, M. K. (1991). The big five personality dimensions and job performance: A
meta-analysis. Personnel Psychology, 44, 1-26.
Bartram, D. (2005). The great eight competencies: A criterion-centric approach to validation. Journal
of Applied Psychology, 90, 1185-1203.
Bertua, C., Anderson, N., & Salgado, J. F. (2005). The predeictive validity of cognitive ability tests: A
UK meta-analysis. Journal of Occupational and Organizational Psychology, 78, 387-409.
Borman, W. C., & Motowidlo, S. J. (1993). Expanding the criterion domain to include elements of
contextual performance. In N. Schmitt, & W. C. Borman, Personnel Selection in Organizations.
San Francisco: Jossey-Bass.
Borman, W. C., & Motowidlo, S. J. (1997). Task performance and contextual performance: The
meaning for personnel selection research. Human Performance, 10, 99-109.
Buhrmester, M., Kwang, T., & Gosling, S. D. (n.d.). Amazon's Mechanical Turk: A new source of
inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6, 3-5.
Carlson, K. D., Scullen, S. E., Schmidt, F. L., Rothstein, H., & Erwin, F. (1999). Generalizable
biographical data validity can be achieved without multi-organziational development and
keying. Personnel Psychology, 52, 731-755.
Carroll, J. (1993). Human Cognitive Abilities: A Survey of Factor Analytic Studies. New York: Cambridge
University Press.
Cattell, R. B. (1941). Some theoretical issues in adult intelligence testing. Psychological Bulletin, 38, 592.
HR AVATAR ASSESSMENT SOLUTION TECHNICAL MANUAL 2019 67
Christian, M. S., Edwards, B. D., & Bradley, J. C. (2010). Situational judgment tests: Constructs
assessed and a meta-analysis of their criterion-related validities. Personnel Psychology, 63, 83-
117.
Cohen, J. (1992). A Power Primer. Psychological Bulletin, 112, 155-159.
Dudley, Nicole M., Orvis, Karin A., Lebiecki, Justin E., & Cortina, Jose M. (2006). A meta-analytic
investigation of conscientiousness in the prediction of job performance: Examining the
intercorrelations and incremental validity of narrow traits. Journal of Applied Psychology, 91, 40-
57.
Dye, D. A., Reck, M., & McDaniel, M. A. (2007). The validity of job measures. International Journal of
Selection and Assessment, 1, 153-157.
edX. (2015, February 25). Retrieved from discern: NY Times
Equal Employment Opportunity Commission, C. S. C. U. S. D. L. U., & Equal Employment
Opportunity Commission. (1978). Unifrom Guidelines on Employee Selection Procedures. Federal
register, 43(166), 38295-38309.
Fleisher, M. S., & Tsacoumis, S. (2012). O*NET analysit occupational skills ratings: Procedures update.
(Tech. Rep. No. FR-11-67) Alexandria, VA.: Human Resources Research Orgnaization
(HumRRO).
Fleisher, M. S., & Tsacoumis, S. (2012). O*NET analyst occupational abilities ratings: procedure update.
(Tech. Rep. No. FR-11-66). Alexandria, VA: Human Resources Research Organization
(humRRO).
Goldberg, L. R. (1992). The development of markers for the Big-Five factor structure. Psychological
Assessment, 4, 26-42.
Goldberg, L. R. (1993). The structure of phenotypic personlity traits. American Psychologist, 48, 26-34.
Hogan, J., & Holland, B. (2003). Using theory to evaluate personality and job-performance relations:
A socioanalytic perspective. Journal of Applied Psychology, 88, 100-112.
Horn, J. (1985). Handbook of Intelligence. New York: Wiley.
Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job
performance. Psychological Bulletin, 96, 72-98.
HR AVATAR ASSESSMENT SOLUTION TECHNICAL MANUAL 2019 68
Hurtz, G. M., & Donovan, J. J. (2000). Personality and job performance: The Big Five revisited.
Journal of Applied Psychology, 85, 869-879.
Jaro, M. A. (1989). Advances in record linkage methodology as applied to the 1985 census of Tampa
Florida. Journal of the American Statistical ASsociation, 84, 414-420.
Jaro, M. A. (1995). Probalistic linkage of large public health data file. Statistics in Medicine, 14, 491-498.
Johnson, D. R., & Borden, L. A. (2012). Particiants at your fingertips: Using Amazon's Mechanical
Turk to increase student-faculyy collaborative research. Teaching of Psychology, 39, 245-251.
Joseph, D. L., Jin, J., Newman, D. A., & O'Boyle, E. H. (2015). Why does self-reported emotional
intelligence predict job performance? A meta-analytic investigation of Mixed EI. Journal of
Applied Psychology, 100, 298-342.
Joseph, D. L., & Newman, D. A. (2010). Emotional intelligence: An integrative meta-analysis and
cascading model. Journal of Applied Psychology, 95, 54-78.
Judge, T. A., & Ilies, R. (2002). Relationship of personality to performance motivation: A meta-
analytic review. Journal of Applied Psychology, 87, 797-807.
Kanfer, R., Crosby, J. R., & Brandt, D. M. (1988). Investigating behavioral antecedenty of turnover
at three job tenure levels. Journal of Applied Psychology, 73, 331-335.
Kehoe, J. (2002). General mental ability and selection in private sector organizations: A commentary.
Human Performance, 15, 97-106.
Klimoski, R. J., & Childs, A. (1986). Successfully predicting career success: An application of the
biographical inventory. Journal of Applied Psychology, 71, 3-8.
Markoff, J. (2013, April 4). Essay-Grading Software Offers Professors a Break. NY Times. Retrieved
February 25, 2015, from http://www.nytimes.com/2013/04/05/science/new-test-for-
computers-grading-essays-at-college-level.html?_r=0
McCrae, R. R., & Costa, P. T. (1997). Personality trait structure as a human universal. American
Psychologist, 52, 509-516.
McCrae, R. R., & John, O. P. (1992). An introduction to the five-factor model and its applications.
Journal of Personality, 60, 175-215.
HR AVATAR ASSESSMENT SOLUTION TECHNICAL MANUAL 2019 69
McCrae, R. R., Jang, K. L., Livesley, W. J., Riemann, R., & Angleitner, A. (2001). Sources of
structure: Genetic, environmental, and artifactual influences on the covariation of personality
traits. Journal of Personality, 69, 511-535.
McDaniel, M. A., Hartman, N. S., Whetzel, D. L., & Grub, W. L., III (2007). Situational judgment
tests, response instructions, and validity: A meta-analysis. Personnel Psychology, 60, 63-91.
McDaniel, M. A., Morgeson, F. P., Finnegan, E. B., Campion, M. A., & Braverman, E. P. (2001).
Use of situational judgment tests to predict job performance: A clarification of the literature.
Journal of Applied Psychology, 86, 730-740.
McGrew, K. (2008). CHC theory and the human cognitive abilities project: Standing on the shoulers
of the giants of psychometric intelligence research. Intelligence, 37, 1-10.
Miao, C., Humphrey, R. H., & Qian, S. (2017). A meta-analysis of emotional intelligence and work
attitudes. Journal of Occupational and Organizational Psychology, 90, 177-202.
Miller, J. D., Gentile, B., Wilson, L., & Campbell, W. K. (2013). Grandiose and vulnerable narcissism
and the DSM-5 Pathelogical Personality Trait Model. Journal of Personality Assessment, 95, 284-
290.
Minton, E., Gurel-Atay, E., Kahle, L., & Ring, K. (2013). Comparing data collection alternatives:
Amazon mTurk, college students, and secondary analysis. AMA Winter Educators' Conference
Proceedings, 24, (pp. 36-37).
Motowidlo, S. J., Dunnette, M. D., & Carter, G. W. (1990). An alternative selection procedure: The
low-fidelity simulation. Journal of Applied Psychology, 75, 640-647.
Mount, M. K., & Barrick, M. R. (1995). The Big Five personality dimensions: Implications for
research and practice in human resources management. In K. R. (Eds.), Research in Personnel
and Human Resources Management (Vol. 13) (pp. 153-200). Greenwich, CT: JAI Press.
Navarro, G. (2001). A guided tour to approximate string matching. ACM Computing Surveys, 33, 31-
88.
O'Boyle, E. H., Humphrey, R. H., Pollack, J. M., Hawver, T. H., and Story, P. A. (2011). The
relation between emotional intelligence and job performance: A meta-analysis. Journal of
Organizational Behavior, 32, 788-818.
HR AVATAR ASSESSMENT SOLUTION TECHNICAL MANUAL 2019 70
Oswald, F. L., Schmitt, N., Kim, B. H., Ramsay, L. J., & Gillespie, M. A. (2004). Developing a
biodata measure and situational jusgment inventory as predictors of college student
performance. Journal of Applied Psychology, 89, 187-207.
Paruchuri, V. (2015, February 25). Retrieved from YouTube:
https://www.youtube.com/watch?v=zFeP678054U
Paunonen, S. V., Rothstein, M. G., & Jackson, D. (1999). Narrow reasoning about the use of broad
personaloty measures for personnel selection. Journal of Organizational Behavior, 20, 389-405.
Perera, H. N., & DiGiacomo, M. (2013). The relationship of trait emotional intelligence with
academic performance: A meta-analytic review. Learning and Individual Differences, 23, 20-33.
Peterson, N. G., Mumford, M. D., Borman, W. C., Jeanneret, P. R., Fleishman, E. A., Levin, K.
Y., . . . Dye, D. M. (2001). Understanding work using the occupational information network
(O*NET): Implications for practice and research. Personnel Psychology, 54, 451-492.
Roth, P. L., Bobko, P, & Buster, M. A. (2013). Situational judgment tests: The influence and
importance of applicant status and targeted constructed on estimates of Black-White
subgroup differences. Journal of Occupational and Organizational Psychology, 86, 394-409.
Rothstein, H. R., Schmidt, F., Erwin, F. W., Owens, W. A., & Sparks, C. P. (1990). Biographical data
in employment selection: Can validities be made generalizable? Journal of Applied Psychology,
75, 175-184.
Saad, S., Carter, G. W., Rothenberg, M., & Israelson, E. (2014, March 9). Testing and Assessment: An
Employer's Guide to Good Practices. Retrieved from O*NET:
http://www.onetcenter.org/dl_files/empTestAsse.pdf
Salgado, J. E. (1997). The Five Factor Model of personality and job performance in the European
community. Journal of Applied Psychology, 82, 30-43.
Salgado, J. F., Anderson, N., Moscoso, S., Bertua, C., & De Fruyt, F. (2003). International validity
generalization of GMA and cognitive abilities: A European community meta-analysis.
Personnel Psychology, 56, 573-605.
Salgado, J. F., Anderson, N., Moscoso, S., Bertua, C., de Fruyt, F., & Rolland, J. P. (2003). A Meta-
Analytic Study of General Mental Ability Validity for Different Occupations in the
Eurpopean Community. Journal of Applied Psychology, 88, 1068-1081.
HR AVATAR ASSESSMENT SOLUTION TECHNICAL MANUAL 2019 71
Salgado, J., & Anderson, N. (2003). Validity generalization of GMA tests across countries in the
European community. European Journal of Work & Organizational Psychology, 12, 1-17.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel
psychology: Practical and theoretical implications of 85 years of research findings.
Psychological Bulletin, 124, 262-274.
Schneider, R. J., Hough, L. M., & Dunnette, M. D. (1996). Broadsided by broad traits: How to sink
science in five dimensions or less. Journal of Organizational Behavior, 17, 639-655.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass Corrlations: Uses in Assessing Rater Reliability.
Psychological Bulletin, 86, 420-428.
Society for Industrial Organizational Psychology. (2003). Principles for the Validation and Use of Personnel
Selection Procedures (4th ed.). Bowling Green, OH: Author.
Tett, R. P., & Burnett, D. D. (2003). A personality trait-based interactionist model of job
performance. Journal of Applied Psychology, 88, 500-517.
Tett, R. P., Jackson, D. N., & Rothstein, M. (1991). Personality measures as predictors of job
performance. Personnel Psychology, 44, 703-742.
U.S. Census Bureau. (2015, February 17). American Fact Finder. Retrieved from factfinder.census.gov:
factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_12_1YR
_CP05&prodType=table
Wee, S., Newman, D. A., & Joseph, D. L. (2014). More than g: Selection quality and adverse impact
implications of considering second-stratum cognitive abilities. Journal of Applied Psychology, 99,
547-563.
Whetzel, D. L., McDaniel, M. A., & Nguyen, N. T. (2008). Subgroup differences in situational
judgment test performance: A meta-analysis. Human Performance, 21, 291-309.
Winkler, W. E. (1990). String comparator metrics and enhanced decision rules in the Fellegi-Sunter
Model of record linkage. Proceedsings of the Section on Survey Research Methods (pp. 354-359).
American Statistical Association.
Woo, S. E., Chernyshenko, O. S., Stark, S. E., & Conz, G. (2014). Validity of six openness facets in
predicting wok behaviors: A meta-analysis. Journal of Personality Assessment, 96, 76-86.
Appendix A: Summary of the HR Avatar Solutions
| Job Title | O*Net SOC | Simulation Module (Cognitive) | Essay | AIMS Module | Emotional Intelligence | Biodata Module | Skill Module 1 | Skill Module 2 |
| Sales Representative - Wholesale & Manufacturing | 41-4012.00 | Business Sales | Yes | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Customer Service Representative (with Email) | 43-4051.00 | Customer Service (with email) | Yes | Professional | EQ | Professional | Core Customer Service Concepts |  |
| Customer Service Representative (with Email and Calls) | 43-4051.00 | Customer Service (with email)Calls | Yes | Professional | EQ | Professional | Core Customer Service Concepts |  |
| Telemarketer | 41-9041.00 | Customer Service (with email)Calls |  | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Computer Applications Software Developer | 15-1132.00 | Information Technology | Yes | Professional | EQ | Professional |  |  |
| First-Line Supervisor - Office and Administrative Support | 43-1011.00 | First-Line Supervisor | Yes | Professional | EQ | Professional | Supervisor Fundamentals |  |
| Sales Representative - Technical and Scientific | 41-4011.00 | Business Sales | Yes | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Sales Representative - Services | 41-3099.00 | Business Sales | Yes | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Computer Systems Analyst | 15-1121.00 | Information Technology | Yes | Professional | EQ | Professional |  |  |
| Collections Specialist | 43-3011.00 | Collections |  | Professional | EQ | Professional |  |  |
| Cashier | 41-2011.00 | Customer Service (with email)Face |  | Professional | EQ | Hourly |  |  |
| Chief Executive | 11-1011.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Retail Salesperson (Home Goods Store) | 41-2031.00 | Retail Sales (Hardware Store) |  | Professional | EQ | Hourly | Fundamental Sales Concepts |  |
| Retail Salesperson (Fashion Store) | 41-2031.00 | Retail Sales (Fashion) |  | Professional | EQ | Hourly | Fundamental Sales Concepts |  |
| Retail Salesperson (Electronics Store) | 41-2031.00 | Retail Sales (Electronics) |  | Professional | EQ | Hourly | Fundamental Sales Concepts |  |
| Retail Salesperson (Sunglasses Store) | 41-2031.00 | Retail Sales (Sunglasses) |  | Professional | EQ | Hourly | Fundamental Sales Concepts |  |
| General Manager | 11-1021.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Clerk - Bookkeeping, Accounting, and Auditing | 43-3031.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly | Accounting Fundamentals | MS Excel 2019 Simulation |
| Sales Agent - Insurance | 41-3021.00 | Business Sales |  | Professional | EQ | Professional | Insurance Fundamentals |  |
| Secretary / Administrative Assistant | 43-6014.00 | Admin Assistant | Yes | Professional | EQ | Hourly | Typing Speed and Accuracy | MS Word 2019 Simulation |
| Manager - Financial | 11-3031.00 | Manager | Yes | Professional | EQ | Professional | Accounting Fundamentals | MS Excel 2019 Simulation |
| Specialist - Computer User Support | 15-1151.00 | Information Technology |  | Professional | EQ | Professional | Core Customer Service Concepts |  |
| First-Line Supervisor - Non-Retail Sales | 41-1012.00 | First-Line Supervisor | Yes | Professional | EQ | Professional | Supervisor Fundamentals |  |
| Computer Systems Software Developer | 15-1133.00 | Information Technology | Yes | Professional |  | Professional |  |  |
| Analyst - Information Security | 15-1122.00 | Information Technology | Yes | Professional |  | Professional |  |  |
| Management Analyst | 13-1111.00 | Business and Finance | Yes | Professional | EQ | Professional | Accounting Fundamentals | MS Excel 2019 Simulation |
| Bank Teller | 43-3071.00 | Bank Teller |  | Professional | EQ | Professional | Bank Teller Fundamentals |  |
| Bank Teller with Sales | 43-3071.00 | Bank TellerSales |  | Professional | EQ | Professional | Bank Teller Fundamentals |  |
| First-Line Supervisor - Retail Sales | 41-1011.00 | First-Line Supervisor |  | Professional | EQ | Professional | Supervisor Fundamentals |  |
| Clerk - General Office | 43-9061.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly |  |  |
| Clerk - Billing and Posting | 43-3021.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly |  |  |
| Manager - Computer and Information Systems | 11-3021.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Driver - Sales and Delivery | 53-3031.00 | Driver |  | Professional | EQ | Professional |  |  |
| Analyst - Market Research | 13-1161.00 | Business and Finance | Yes | Professional | EQ | Professional | Core Marketing Concepts | MS Excel 2019 Simulation |
| Manager - Administrative Services | 11-3011.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Computer Programmer | 15-1131.00 | Information Technology | Yes | Professional |  | Professional |  |  |
| Computer Programmer - Java | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core Java Programming |  |
| Computer Programmer - C | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core C Programming |  |
| Computer Programmer - C++ | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core C++ Programming |  |
| Computer Programmer - Web Developer | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core Web Programming |  |
| Computer Programmer - PHP | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core PHP Programming |  |
| Computer Programmer - Javascript | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core Javascript Programming |  |
| Computer Programmer - Actionscript | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core Actionscript Programming |  |
| Computer Programmer - Python | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core Python Programming |  |
| Computer Programmer - ASP.NET Web Pages | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core ASP.NET Programming (Web Pages) |  |
| Computer Programmer - Ruby | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Core Ruby |  |
| Computer Programmer - Java EE | 15-1131.00 | Information Technology | Yes | Professional |  | Professional | Java EE Fundamentals |  |
| Manager - Food Service | 11-9051.00 | Manager | Yes | Professional | EQ | Professional | Food Safety Fundamentals |  |
| Aide - Home Health | 31-1011.00 | Customer Service (with email)Face |  | Professional | EQ | Professional |  |  |
| Network and Computer Systems Administrator | 15-1142.00 | Information Technology | Yes | Professional | EQ | Professional | Computer Networking Concepts |  |
| Claims Adjuster, Examiner, Investigator | 13-1031.00 | Business and Finance (Entry Level) | Yes | Professional | EQ | Professional |  |  |
| Manager - Other | 11-9199.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Manager - Security | 11-9199.07 | Manager | Yes | Professional | EQ | Professional |  |  |
| Manager - Sales | 11-2022.00 | Manager | Yes | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| First-Line Supervisor - Production / Operations | 51-1011.00 | First-Line Supervisor | Yes | Professional | EQ | Professional | Supervisor Fundamentals |  |
| Clerk - Insurance Claims / Policy Processing | 43-9041.00 | Business and Finance (Entry Level) |  | Professional |  | Professional | Insurance Fundamentals |  |
| Hospitality Industry Customer Service Worker | 43-4081.00 | Hospitality |  | Professional | EQ | Professional | Core Hospitality Concepts |  |
| Technician - Medical Records and Health Information | 29-2071.00 | General Office (Entry-Level) |  | Professional |  | Professional | Core Health Administration (U.S.) |  |
| Loan Officer | 13-2072.00 | Business and Finance | Yes | Professional | EQ | Professional | Banking Fundamentals |  |
| Executive Secretary / Administrative Assistant | 43-6011.00 | Admin Assistant | Yes | Professional | EQ | Professional | Typing Speed and Accuracy |  |
| Accountant / Auditor | 13-2011.00 | Business and Finance | Yes | Professional | EQ | Professional | Accounting Fundamentals | MS Excel 2019 Simulation |
| Analyst - Budget | 13-2031.00 | Business and Finance | Yes | Professional | EQ | Professional |  |  |
| Manager - Real Estate and Community Association | 11-9141.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Analyst - Financial | 13-2051.00 | Business and Finance | Yes | Professional | EQ | Professional | Accounting Fundamentals | MS Excel 2019 Simulation |
| Specialist - Computer Network Support | 15-1152.00 | Information Technology |  | Professional |  | Professional | Computer Networking Concepts |  |
| Manager - Architectural and Engineering | 11-9041.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Analyst - General | 13-1081.02 | General Office | Yes | Professional | EQ | Professional | MS Excel 2019 Simulation |  |
| Meeting, Convention, and Event Planner | 13-1121.00 | General Office | Yes | Professional | EQ | Professional |  |  |
| Paralegal / Legal Assistant | 23-2011.00 | General Office | Yes | Professional | EQ | Professional | Typing Speed and Accuracy | MS Word 2019 Simulation |
| Legal Secretary | 43-6012.00 | Admin Assistant | Yes | Professional | EQ | Professional | Typing Speed and Accuracy | MS Word 2019 Simulation |
| Technician - General Maintenance and Repair | 49-9071.00 | Technician |  | Professional |  | Professional | Mechanical Aptitude |  |
| Loan Interviewer | 43-4131.00 | Business and Finance | Yes | Professional | EQ | Professional | Banking Fundamentals |  |
| Manager - Marketing | 11-2021.00 | Manager | Yes | Professional | EQ | Professional | Core Marketing Concepts | MS Excel 2019 Simulation |
| Manager - Industrial Production | 11-3051.00 | Manager | Yes | Professional | EQ | Professional |  |  |
| Restaurant Waiter / Waitress | 35-3031.00 | Restaurant |  | Hourly | EQ | Hourly |  |  |
| Sales Agent - Real Estate | 41-9022.00 | Real Estate Worker | Yes | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Clerk - Payroll and Timekeeping | 43-3051.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly | Typing Speed and Accuracy |  |
| Data Entry Keyers | 43-9021.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly | Typing Speed and Accuracy |  |
| Aide - Personal Care | 39-9021.00 | Customer Service (with email)Face |  | Hourly | EQ | Hourly |  |  |
| Specialist - Human Resources | 13-1071.00 | Manager | Yes | Professional | EQ | Professional | Human Resource Fundamentals | MS Word 2019 Simulation |
| Clerk - Counter / Rental | 41-2021.00 | Customer Service (with email)Face |  | Hourly | EQ | Hourly |  |  |
| Compliance Officer | 13-1041.00 | General Office | Yes | Professional | EQ | Professional |  |  |
| Specialist - Regulatory Affairs | 13-1041.07 | General Office | Yes | Professional | EQ | Professional |  |  |
| Technician - Pharmacy | 29-2052.00 | Technician |  | Professional |  | Professional |  |  |
| Sales Agent - Securities, Financial Services | 41-3031.00 | Business Sales | Yes | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Clerk - Production, Planning, and Expediting | 43-5061.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly |  |  |
| First-Line Supervisor - Mechanics, Installers, Repairers | 49-1011.00 | First-Line Supervisor |  | Professional | EQ | Professional | Supervisor Fundamentals | Mechanical Aptitude |
| Customer Service Representative - With Sales | 41-9041.00 | Business Sales |  | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Agent - Purchasing | 13-1023.00 | Business and Finance |  | Professional | EQ | Professional |  |  |
| Salesperson - Parts and Accessories | 41-2022.00 | Retail Sales (Hardware Store) |  | Professional | EQ | Professional | Fundamental Sales Concepts |  |
| Driver - Heavy and Tractor-Trailer | 53-3032.00 | Driver |  | Professional |  | Professional |  |  |
| Specialist - Public Relations | 27-3031.00 | General Office | Yes | Professional | EQ | Professional | Core Marketing Concepts |  |
| Clerk - File | 43-4071.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly | Typing Speed and Accuracy |  |
| Nursing Assistant | 31-1014.00 | Medical Assistant |  | Professional | EQ | Professional |  |  |
| Specialist - Training and Development | 13-1151.00 | General Office | Yes | Professional | EQ | Professional |  |  |
| Instructional Designer | 25-9031.00 | Admin Assistant | Yes | Professional | EQ | Professional | Project Management Fundamentals |  |
| Graphic Designer | 27-1024.00 | General Office |  | Professional |  | Professional |  |  |
| Art Director | 27-1011.00 | General Office |  | Professional | EQ | Professional |  |  |
| Graphic Designer - Web Development | 27-1024.00 | General Office |  | Professional |  | Professional | HTML4 and CSS 2 | Core HTML5 |
| Manager - Medical and Health Services (General) | 11-9111.00 | Manager | Yes | Professional | EQ | Professional | Core Health Administration (U.S.) |  |
| Manager - Medical and Health Services (US) | 11-9111.00 | Manager | Yes | Professional | EQ | Professional | Core Health Administration (U.S.) |  |
| Receptionist | 43-4171.00 | Admin Assistant (Entry-Level) |  | Hourly | EQ | Hourly |  |  |
| Secretary - Medical (General) | 43-6013.00 | Medical Assistant |  | Professional | EQ | Professional | Typing Speed and Accuracy | MS Word 2019 Simulation |
| Secretary - Medical (US) | 43-6013.00 | Medical Assistant |  | Professional | EQ | Professional | Typing Speed and Accuracy | Core Health Administration (U.S.) |
| Recruiter | 43-4111.00 | General Office | Yes | Professional | EQ | Professional | Core Recruiting Knowledge |  |
| Childcare Worker | 39-9011.00 | General Office (Entry-Level) |  | Professional | EQ | Professional | Childcare Fundamentals |  |
| First-Line Supervisor - Food Preparation / Serving | 35-1012.00 | First-Line Supervisor |  | Professional | EQ | Professional | Food Safety Fundamentals |  |
| Host / Hostess - Restaurant | 35-9031.00 | Restaurant |  | Hourly | EQ | Hourly | Core Hospitality Concepts |  |
| Specialist - Office and Administrative Support | 43-9199.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly | Typing Speed and Accuracy | MS Word 2019 Simulation |
| Clerk - Order Processing | 43-4151.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly | Typing Speed and Accuracy |  |
| Driver - Transit and Intercity Bus | 53-3021.00 | Driver |  | Professional | EQ | Professional |  |  |
| Driver - Light Truck / Delivery | 53-3033.00 | Driver |  | Professional |  | Professional |  |  |
| Mechanic - Heating, Air Conditioning, Refrigeration | 49-9021.00 | Technician |  | Professional |  | Professional | HVAC Fundamentals | Mechanical Aptitude |
| Pharmacist (General) | 29-1051.00 | General Office | Yes | Professional | EQ | Professional | Core Health Administration (U.S.) |  |
| Pharmacist (US) | 29-1051.00 | General Office | Yes | Professional | EQ | Professional | Core Health Administration (U.S.) |  |
| Trainer - Athletic | 39-9031.00 | General Office (Entry-Level) |  | Professional | EQ | Professional |  |  |
| Attorney | 23-1011.00 | General Office | Yes | Professional | EQ | Professional |  |  |
| Medical Assistant | 31-9092.00 | Medical Assistant | Yes | Professional | EQ | Professional |  |  |
| Administrator - Elementary and Secondary School | 11-9032.00 | General Office | Yes | Professional | EQ | Professional |  |  |
| Inspector, Tester, Sorter, Sampler, Weigher | 51-9061.00 | Technician |  | Professional |  | Professional | Mechanical Aptitude |  |
| Counselor - Educational, Guidance, School, Vocational | 21-1012.00 | General Office | Yes | Professional | EQ | Professional |  |  |
| Personal Financial Advisor | 13-2052.00 | Business and Finance | Yes | Professional | EQ | Professional | Accounting Fundamentals |  |
| Hairdresser, Hairstylist, Cosmetologist | 39-5012.00 | Basic Entry Level |  | Hourly | EQ | Hourly |  |  |
| Social / Human Service Assistant | 21-1093.00 | General Office (Entry-Level) |  | Professional | EQ | Professional | Social Work Fundamentals |  |
| Security Guard | 33-9032.00 | General Office (Entry-Level) |  | Professional | EQ | Professional |  |  |
| Medical / Clinical Laboratory Technologist | 29-2011.00 | Technician |  | Professional |  | Professional |  |  |
| Fast Food Worker | 35-3021.00 | CogFastFoody |  | Hourly | EQ | Hourly |  |  |
| Clerk - Shipping / Receiving | 43-5071.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly |  |  |
| Teacher - Preschool | 25-2011.00 | Customer Service (with email)Face |  | Hourly | EQ | Hourly | Education Delivery Fundamentals |  |
| Dental Assistant | 31-9091.00 | Basic Entry Level |  | Hourly | EQ | Hourly |  |  |
| Engineer - Mechanical | 17-2141.00 | General Office | Yes | Professional |  | Professional | Mechanical Aptitude |  |
| Engineer - Other | 17-2199.00 | General Office | Yes | Professional |  | Professional |  |  |
| Engineer - Industrial | 17-2112.00 | General Office | Yes | Professional |  | Professional |  |  |
| Electrical Engineer | 17-2071.00 | General Office | Yes | Professional |  | Professional |  |  |
| Specialist - Radio Frequency Identification (RFID) | 17-2072.01 | Technician | Yes | Professional | EQ | Professional |  |  |
| Nurse - Registered | 29-1141.00 | General Office | Yes | Professional | EQ | Professional |  |  |
| Production Worker | 51-9199.00 | General Office (Entry-Level) |  | Hourly |  | Hourly | Mechanical Aptitude |  |
| Operator - Power Plant | 51-8013.00 | Technician | Yes | Professional | EQ | Hourly | Mechanical Aptitude |  |
| Dental Hygienist | 29-2021.00 | Basic Entry Level |  | Professional | EQ | Professional |  |  |
| Machinist | 51-4041.00 | Basic Entry Level |  | Hourly |  | Hourly | Mechanical Aptitude |  |
| Radiologic Technologist | 29-2034.00 | General Office |  | Professional |  | Professional |  |  |
| Taxi Driver / Chauffeur | 53-3041.00 | Driver |  | Hourly | EQ | Hourly |  |  |
| Technician - Emergency Medical / Paramedic | 29-2041.00 | Technician |  | Professional | EQ | Professional |  |  |
| Operator - Packaging / Filling Machines | 51-9111.00 | Technician |  | Hourly |  | Hourly |  |  |
| Mail Carrier | 43-5052.00 | Customer Service (with email)Face |  | Hourly | EQ | Hourly |  |  |
| Social Worker - Child, Family, School | 21-1021.00 | Customer Service (with email)Face |  | Hourly | EQ | Hourly | Social Work Fundamentals |  |
| Technician - Medical and Clinical Laboratory | 29-2012.00 | Technician |  | Professional |  | Professional |  |  |
| Clerk - Stockroom | 43-5081.00 | Admin Assistant (Entry-Level) |  | Hourly |  | Hourly |  |  |
| Technician - Automotive Service | 49-3023.00 | Technician |  | Hourly |  | Hourly | Mechanical Aptitude |  |
| Operating Engineer | 47-2073.00 | General Office |  | Professional |  | Professional |  |  |
| Engineer - Civil | 17-2051.00 | General Office | Yes | Professional |  | Professional | Construction Fundamentals |  |
| Physical Therapist | 29-1123.00 | General Office (Entry-Level) |  | Professional | EQ | Professional |  |  |
Correctional Officer 33-3012.00 General Office
Professional EQ Professional
Dispatcher - General 43-5032.00 General Office
Professional EQ Professional
Maid / Housekeeping
Cleaner
37-2012.00 Basic Entry Level
Hourly EQ Hourly
Welder, Cutter, Solderer, Brazer
51-4121.00 Technician Hourly Hourly Mechanical Aptitude
Bartender 35-3011.00 Restaurant Hourly EQ Hourly
Attendant - Food Services
35-3022.00 Restaurant Hourly EQ Hourly
Laborer - Freight and Warehouse
53-7062.00 Warehouse Hourly Hourly
Recreation Worker 39-9032.00 Basic Entry Level
Hourly EQ Hourly
Enforcement Officer 33-3051.00 General Office
Professional EQ Professional
Food Server - Nonrestaurant
35-3041.00 Basic Entry Level
Hourly EQ Hourly
Teacher - Elementary School
25-2021.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Mechanic - Industrial Machinery
49-9041.00 Technician Hourly Hourly Mechanical Aptitude
Teacher - Other 25-3099.00 General Office
Professional EQ Professional Education Delivery
Fundamentals
Food Preparation Worker
35-2021.00 Basic Entry Level
Hourly EQ Hourly Food Safety Fundamentals
Nurse - Licensed Practical / Vocational
29-2061.00 General Office
Professional EQ Professional
Assembler / Fabricator - Other
51-2099.00 Basic Entry Level
Hourly Hourly Mechanical Aptitude
Team Assembler 51-2092.00 Basic Entry Level
Hourly EQ Hourly
Laborer - Packing / Packaging
53-7064.00 Warehouse Hourly Hourly
Carpenter 47-2031.00 Construction Hourly Hourly Carpentry Fundamentals
Janitor 37-2011.00 Basic Entry Level
Hourly Hourly
Plumber, Pipefitter, Steamfitter
47-2152.00 Construction Hourly Hourly Plumbing Fundamentals
Mechanical Aptitude
Cook - Restaurant 35-2014.00 Basic Entry Level
Hourly Hourly Food Safety Fundamentals
Helper - Dining Room and Cafeteria
35-9011.00 Restaurant Hourly EQ Hourly
Teacher - Secondary School
25-2031.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Laborer - Landscaping and Groundskeeping
37-3011.00 Basic Entry Level
Hourly Hourly
Laborer - Construction
47-2061.00 Construction Hourly Hourly Construction Fundamentals
Teacher - Middle School
25-2022.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Cook - Fast Food 35-2011.00 CogFastFoody Hourly Hourly Food Safety Fundamentals
Dishwasher 35-9021.00 Basic Entry Level
Hourly Hourly
Firefighter 33-2011.00 General Office (Entry-
Level)
Professional EQ Professional
Cleaner - Vehicles and Equipment
53-7061.00 Basic Entry Level
Hourly Hourly
Operator - Industrial Trucks / Tractors
53-7051.00 Driver Hourly Hourly Mechanical Aptitude
First-Line Supervisor - Construction /
Extraction
47-1011.00 First-Line Supervisor
Professional EQ Professional Supervisor Fundamentals
Construction Fundamentals
Helper - Production 51-9198.00 Basic Entry Level
Hourly Hourly Mechanical Aptitude
Attendant - Amusement /
Recreation
39-3091.00 Basic Entry Level
Hourly EQ Hourly
Teacher Assistant 25-9041.00 General Office (Entry-
Level)
Professional EQ Professional Education Delivery
Fundamentals
Cook - Short Order 35-2015.00 Basic Entry Level
Hourly Hourly Food Safety Fundamentals
Electrician 47-2111.00 Technician Hourly Hourly Electrician Fundamentals
Mechanical Aptitude
Teacher - Substitute 25-3098.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Cook - Institution and Cafeteria
35-2012.00 Basic Entry Level
Hourly Hourly Food Safety Fundamentals
Driver - School Bus 53-3022.00 Basic Entry Level
Hourly EQ Hourly
Laborer - Agricultural 45-2092.00 Construction Hourly Hourly
Mechanic - Bus,Truck, Diesel
Engine
49-3031.00 Technician Hourly Hourly Mechanical Aptitude
Installer / Repairer - Telecommunications
Equipment
49-2022.00 Basic Entry Level
Hourly Hourly Mechanical Aptitude
Manager - Construction
11-9021.00 Manager Professional EQ Professional Construction Fundamentals
Teacher - Postsecondary
25-1199.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Coach / Scout 27-2022.00 General Office
Yes Professional EQ Professional
Laundry and Dry-Cleaning Worker
51-6011.00 Basic Entry Level
Hourly EQ Hourly
Teacher - Special Education
25-2052.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Assembler - Electrical and
Electronic Equipment
51-2022.00 Basic Entry Level
Hourly Hourly Electrician Fundamentals
First-Line Supervisor - Transportation and
Material-Moving
53-1031.00 First-Line Supervisor
Professional EQ Professional Supervisor Fundamentals
Cost Estimator 13-1051.00 Business and Finance
(Entry Level)
Professional EQ Professional MS Excel 2019
Simulation
Painter - Construction and
Maintenance
47-2141.00 Basic Entry Level
Hourly Hourly
Operator - Machine - Metal and Plastic
51-4031.00 Technician Hourly Hourly Mechanical Aptitude
Teacher - Self-Enrichment Education
25-3021.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Clerk - Information and Record Clerks
43-4199.00 Admin Assistant
(Entry-Level)
Hourly Hourly Typing Speed and Accuracy
Printing Press Operators
51-5112.00 Technician Hourly Hourly
First-Line Supervisor - Housekeeping and
Janitorial
37-1011.00 First-Line Supervisor
Professional EQ Professional
First-Line Supervisor - Helpers, Laborers, and Material Movers
53-1021.00 First-Line Supervisor
Professional EQ Professional
Trimmer - Meat, Poultry, and Fish
51-3022.00 Basic Entry Level
Hourly Hourly Food Safety Fundamentals
Teacher - Kindergarten
25-2012.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
Baker 51-3011.00 Basic Entry Level
Hourly Hourly
Teacher - Health Specialties -
Postsecondary
25-1071.00 General Office
Yes Professional EQ Professional Education Delivery
Fundamentals
General Project Manager
11-1021.00 Manager Yes Professional EQ Professional Project Management Fundamentals
MS Excel 2019
Simulation
Information Technology Project
Manager
15-1199.09 Information Technology
Yes Professional EQ Professional Project Management Fundamentals
MS Excel 2019
Simulation
Information Systems Architect/Engineer
15-1199.02 Information Technology
Yes Professional EQ Professional
Analyst - Business Intelligence
15-1199.08 Business and Finance
Yes Professional EQ Professional MS Excel 2019
Simulation
Database Administrator
15-1141.00 Information Technology
Yes Professional Professional Relational Database Concepts
Computer Programmer - Web
Developer with jQuery
15-1131.00 Information Technology
Yes Professional Professional Core HTML5 JQuery Fundamentals
Manager - Human Resources
11-3121.00 Manager Yes Professional EQ Professional Human Resource
Fundamentals
Account Manager 43-4051.00 General Office
Yes Professional EQ Professional Core Customer
Service Concepts
Computer Programmer - C
Sharp (C#)
15-1131.00 Information Technology
Yes Professional Professional Core C Fundamentals
Bank Teller / Universal Banker
43-3071.00 Bank TellerSales
Yes Professional EQ Professional Universal Banker
Fundamentals
Flight Attendant 53-2031.00 Flight Attendant
Professional EQ Professional
Airline Pilot, Copilot, or Flight Engineer
53-2011.00 General Office
Professional EQ Professional
Technical Writer 27-3042.00 General Office
Yes Professional EQ Professional Mechanical Aptitude
MS Word 2019
Simulation
Manager - Hospitality
11-9081.00 Manager Yes Professional EQ Professional Core Hospitality Concepts
Basic Cognitive & Behavioral
Assessment - Entry Level Version
51-9195.07 Basic Entry Level
Hourly
Fraud Examiner, Investigator, Analyst
13-2099.04 Business and Finance
(Entry Level)
Yes Professional EQ Professional
Specialist - Risk Management
13-2099.02 Business and Finance
Yes Professional EQ Professional MS Excel 2019
Simulation
Actuary 15-2011.00 Business and Finance
Yes Professional EQ Professional
Travel Agent 41-3041.00 General Office
Yes Professional EQ Professional Fundamental Sales Concepts
Appendix B: Historical Validity Evidence for the
Cognitive Scales
Table 48. Compiled Validity Evidence for the Original Content of the Cognitive Work Simulation Scales
Performance Factor (various organizations)
Attention to Detail
Analytical Thinking
Listening Score (Organization A) .37
Performance Rating .50
Listening Score (Organization B) -.43
Listening Score (Organization C) -.35 -.34
Performance Rating (Organization A) .42
Performance Rating (Organization B) .39
Sales/Hour -.26 -.34
Cross Selling .37 .39
Response Quality .33
Average 2nd Contact .21
Schedule Conformance -.09
Performance Rating (Organization C) .39 .16
*Study results provided by original content developer. Sample sizes are all larger than 200 and p<.05
Appendix C: Historical Validity Evidence for AIMS
Tables 49-54 contain the results of criterion-related validity studies conducted by the developer of the original AIMS content. Please note that
these results are based on the longer, 100-item version of the assessment. Table 55 summarizes the statistically significant relationships (p<.05)
between the original AIMS scales and various measures of performance. The table includes results for more than 5,000 applicants and more than 12
organizations from multiple industries, including Financial Services, Insurance, Hospitality, Market Research, and Pharmaceuticals. Note that all
significant correlations are reported. Although some of these correlations are negative, we would not expect every scale to be positively
related to every job performance dimension (see the earlier section on previous personality research).
Table 49. Study 1 Results: Concurrent Validation Study for Insurance Consultants N=122-136
Competency Performance
Appraisal Average
Policy Count Average
QRF Level Average
Idle Time Average CPH
Service Average Second
Contact
Needs Structure .18 -.15 .18
Innovative & Creative .16 .19
Enjoys Problem-Solving .24
Seeks Perfection
Exhibits a Positive Work Attitude -.19
Table 50. Study 2 Results: Concurrent Validation Study for Inside Sales N=105
Competency Perfectionism Quality
Innovative & Creative .26
Enjoys Problem-Solving .26
Competitive
Seeks Perfection .19
Develops Relationships .28 .16
Expressive & Outgoing .27 .18
Corporate Citizenship -.17
Exhibits a Positive Work Attitude -.18
Adaptable -.17
Table 51. Study 3 Results: Concurrent Validation Study for an Internet Services Order Processing N=84
Competency Quality
Enjoys Problem-Solving .19
Develops Relationships -.27
Competitive -.22
Table 52. Study 4 Results: Concurrent Validation Study for an Internet and Cable Sales and Service Position N=72-93
Competency Performance Quality Attendance
Needs Structure .30 .26
Enjoys Problem-Solving .20
Competitive
Seeks Perfection .21
Develops Relationships .24
Exhibits a Positive Work Attitude .25
Adaptable -.20
Table 53. Study 5 Results: Concurrent Validation Study Auto Rental Sales Role N=80-92
Competency Quality Yield Productivity
Needs Structure .30 .25 .29
Innovative & Creative .32
Seeks Perfection .34 .23
Adaptable .19
Table 54. Study 6: Concurrent Validation Study for Paramedics N=85
Competency Reading Learning
Office Procedures
Assessing Following Procedures
Dealing with
Difficulty
Treating Others
with Respect
On time
Overtime Flexibility Sick
Needs Structure -.21 -.25
Innovative & Creative -.26 -.15
Enjoys Problem-Solving -.27 .18 -.26 -.20
Competitive -.22 .20 .24
Seeks Perfection -.21 .18
Develops Relationships .19 -.21
Expressive & Outgoing -.24 -.29
Corporate Citizenship .24 .23
Exhibits a Positive Work Attitude -.24 -.22
Adaptable .36 .26
Table 55. Summary of Relationships between Performance Measures and Original AIMS Scales
Performance Measure
Enjoys Problem Solving
Innovative and Creative
Needs Structure
Adaptable
Develops Relationships
Expressive and Outgoing
Competitive
Seeks Perfection
Positive Work Attitude
Corporate Citizenship
Total Score -.15 .20
Conversion Rate -.15 -.16 .21 -.12
Calls Per Hour -.28 .22 -.17 -.17
Unavailability .18 .34 .14
Positive Attitude
Team Attitude -.14 .13 .14
Service Attitude
Caring Attitude .12
Ownership -.12
Performance -.13 .13
Range .13
Ranking -.12 .23 -.17
Sales -.13 -.15 -.10
Talk Time -.20
Account Weight -.14 -.18 -.14 -.14
Total Score -.49 .58
Conversion Rate .47
Ranking (A) .20 .19
Ranking (B) .20 .28 .21 .21
Call Management .26 .28 .27 .26 .19 .15 -.13
Average Hold Time .15
Weighted Rating .27
Supervisor Rating of Overall Performance .23 .15
Supervisor Rating of Skill Acquisition .34 .29 .25
Supervisor Rating of Summary Performance -.41 .39
Supervisor Rating of Overall Performance -.36 .28 .30
Appendix D: Scoring Rubric for Essays
Areas rated
Grammar: Syntax, vocabulary, spelling, sentence structure, natural-sounding, command of English
Content: Conciseness, appropriateness, addressed prompt
Structure: Logic, flow, organization, format
Score 0 - 3
Grammar: Laden with mechanical errors. May include errors in spelling, punctuation use, contractions, prepositions, missing words, verb conjugation and tense, etc. Shows a generally poor command of English. Errors are so severe that the writing is barely understandable, if at all.
Content: Does not address more than half of the requirements set up by the prompt. Response may be incomplete or extremely off-topic. Response may have been written in a completely inappropriate tone for the intended audience. May be extremely unclear.
Structure: Ideas are so jumbled by illogical organization that they may be hard or impossible to follow. Format is not apparent or is completely inappropriate for the intended audience. There may be little to no transition between different ideas.
Score 4 - 6
Grammar: There are a few errors in grammar, but they do not severely hinder comprehension of the writing. English/wording may sound somewhat unnatural. Main idea of the writing is still understood.
Content: Addresses many or all requirements set up by the prompt, but they may not have been thoroughly developed, may be missing information, or may be unclear. Response may be slightly off-topic.
Structure: Organization of ideas is slightly off-putting and confusing, but the reader should be able to follow them and the main idea is still communicated. Ideas may lack transition when needed.
Score 7 - 10
Grammar: There may be one or two small errors, but they are very minor and do not affect comprehension of the writing. English sounds natural and flows well.
Content: All requirements set up by the prompt are addressed clearly and well developed. There are no confusing spots. Response is not off-topic or missing information.
Structure: Organization of ideas is logical and easy to follow. Transitions are usually or always apparent where appropriate.
Appendix E: Directions for Rating Essay Tests
Scoring with a Rubric:
1. Read the rubric.
2. Read the prompt.
3. Outline all requirements listed in the prompt.
4. Read the response.
5. Choose an appropriate score for the writing sample in each of the areas listed on the
rubric. Be sure to rate each area separately, and not to allow a good/bad score in one area to
affect the way you score another area in the same writing sample; a writing sample with
many spelling errors may still reflect all the required content.
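The arithmetic behind the rubric can be sketched in a few lines. This is a minimal illustration, assuming each of the three areas is rated independently on the rubric's 0 to 10 scale and the overall score is their simple sum out of 30; the function names are hypothetical and are not part of any HR Avatar software.

```python
# Hypothetical helper names; the three areas and score bands come from the rubric.
AREAS = ("grammar", "content", "structure")

def band(score):
    """Return the rubric band that a single 0-10 area score falls into."""
    if not 0 <= score <= 10:
        raise ValueError("area scores must be between 0 and 10")
    if score <= 3:
        return "0 - 3"
    if score <= 6:
        return "4 - 6"
    return "7 - 10"

def total_score(ratings):
    """Sum the three independently assigned area scores (maximum 30)."""
    missing = [area for area in AREAS if area not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for area in AREAS:
        band(ratings[area])  # validates the 0-10 range for each area
    return sum(ratings[area] for area in AREAS)

# Each area is rated on its own, so a strong grammar score
# does not raise a weak content score.
sample = {"grammar": 7, "content": 3, "structure": 4}
print(total_score(sample), {area: band(sample[area]) for area in AREAS})
```

Keeping the three ratings in separate fields until the final sum mirrors the instruction to rate each area separately.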
Example of a writing sample with a perfect score:
Prompt:
Pretend you are an administrative assistant. Your boss wants to have an offsite team meeting to set
goals for you and the rest of the team next year. Write an email asking team members to attend an
all-day meeting on the first Monday of next month. Tell team members to write down their goals for
the year and come prepared to present them to the group. Additionally, ask if anyone wants to
volunteer to plan a team activity.
Response:

Team Members:
Arthur is hosting a mandatory team meeting to set goals for next year. The meeting will take
place January 30, 2014 from 1:00 P.M. to 4:00 P.M. at the Ritz Carlton in Vienna, VA in
Conference Room B.
Arthur is expecting each of you to attend. Also, he is expecting you to prepare for the meeting
by documenting your goals and being prepared to present them to the team.
Arthur is looking for a volunteers to lead a team activity during the meeting. If you would like to
volunteer, contact Arthur with your idea by January 15th.
Thank you,

This person scored a 30/30 on their writing sample, on the basis of three factors: grammar, content,
and structure.
There are no errors in grammar or spelling. The English sounds natural and the writing flows well.
The writer scored a 10/10 for grammar.
The writer answered the question thoughtfully and thoroughly. Despite the fact that the author was
not presented with a date, time, or place in the prompt, he/she recognized that, in an actual work
setting, these figures would be required in a successful email, and included all necessary information.
The response addresses all requirements outlined in the prompt and leaves nothing unclear. 10/10
content.
The sample is structured logically so that it flows without abrupt jumps or changes in idea. It is easy
to read and follow. The writer also included a subject line and list of recipients, an appropriate
introduction (Team Members:) and conclusion (Thank you,). The sample received a 10/10 in
structure, earning it a 30/30 overall.
Example of a poor response to the same prompt:
Response:

We are going have a meeting on the first Monday next month. I want everybody to think of
some goals and write them to present to everyone. Does anyone want to plan a team activity?

This person scored a 14/30 on their writing sample based on the same three factors: grammar,
content, and structure.
The grammar is understandable and there are no glaring errors; however, there is a missing word
(We are going have) and the wording is not always clear. The writer scored a 7/10 for grammar.
Only a few of the requirements set up by the prompt are addressed here, and when they are, the
ideas are hardly developed or elaborated on. Were this a real email, much would be left to confusion.
This person received a 3/10 for content.
There is little transition between ideas or logical flow in this response. Also, this response is not
structured in an email format. The sample received a 4/10 for structure, giving it a 14/30 overall.
Appendix F: Validity Evidence for HR Avatar Tests
The HR Avatar Automated Essay Scoring System
January, 2016
Introduction
Written communication is a key skill in many positions. Communicating via email, writing
reports, and creating presentations all require the ability to communicate effectively.
Traditionally, essays written for assessment purposes are scored using human raters and a pre-
defined scoring rubric. However, the cost of human scorers is relatively high and humans can
become fatigued and erratic in high volume situations.
Luckily, machine learning has advanced to the point where computers can substitute for human
raters reliably. The HR Avatar Essay Test is an implementation of this technique, which results
in lower cost and faster scoring turn-around.
The HR Avatar Essay Test consists of several writing prompts. The writing prompts were
designed to be general enough to provide an opportunity for anyone to be able to write a short
essay. It is easy to add additional prompts for specific situations or for general use.
Applicants are asked to write a short essay with a minimum of 100 words and are given an
unlimited time to do so. The essays are scored using Discern, an open source, machine learning
program. Discern was designed by edX, a nonprofit organization founded by Harvard and the
Massachusetts Institute of Technology (MIT) (edX, 2015; Markoff, 2013). The system produces
a score that ranges from 0 to 100. A confidence estimate for the score is also computed, which
ranges from 0 to 1. Scores with confidence estimates less than .10 are not considered valid.
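The validity rule described above can be expressed as a simple check. This is only a sketch: the function and constant names are hypothetical, and the only facts taken from the text are the 0 to 100 score range, the 0 to 1 confidence range, and the .10 cutoff.

```python
# Cutoff stated in the text: confidence estimates below .10 are not considered valid.
MIN_CONFIDENCE = 0.10

def is_valid_result(score, confidence, min_confidence=MIN_CONFIDENCE):
    """Return True when a machine score may be reported (hypothetical helper)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be between 0 and 1")
    # A confidence exactly at the cutoff is treated as valid, because the
    # text excludes only estimates that are less than .10.
    return confidence >= min_confidence

print(is_valid_result(72, 0.45))  # True
print(is_valid_result(72, 0.05))  # False
```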
How it works
HR Avatar uses open source essay scoring software originally published by edX, a
nonprofit organization founded by Harvard and the Massachusetts Institute of Technology (MIT).
The software performs regression to produce a score for each submitted essay along a
continuous scale. This is different from classification, in which the software would simply
attempt to categorize each essay into one or more 'groups', and from ranking, in which it would
order the essays relative to one another.
Each essay is written according to a predetermined set of instructions typically referred to as the
"Prompt." A typical prompt might be: "In a short essay of 100-400 words, explain whether it's
better to be a planner or to be a dreamer."
All essays are scored by the machine learning algorithm based on a "Training Set" upon which a
regression model has been built. The algorithm essentially analyzes all of the training essays and
produces a best guess at how the human scorers who created the training set would have judged
the new essay.
The application is written in Python and uses several open source machine-learning tools. It
is built around a machine-learning library called scikit-learn (http://scikit-learn.org), which in
turn uses a number of other open source mathematical and data manipulation packages.
To perform its task, the application converts each essay into a number of different
"features." Features are measurable aspects of the essay, such as spelling errors per character or
grammar errors per character. Conceptually, the essay is reduced to an N-dimensional vector
containing all of the essay's feature scores, although some features are themselves vectors.
Each feature is measured using specialized software for text analysis. For example, grammar
errors are identified by looking for good and bad n-grams (short sequences of adjacent tokens),
which serve as models of good or bad grammar.
Another important feature known as a "bag of words" is also generated. The bag-of-words model
is a simplifying representation used in natural language processing and information retrieval
(IR). In this model, a text (such as a sentence or a document) is represented as the bag (multiset)
of its words, disregarding grammar and even word order but keeping multiplicity. This results in
a vector with a length equal to the number of unique words in the largest essay evaluated.
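The bag-of-words idea can be sketched with the Python standard library alone. This is an illustrative toy, not the code used by the scoring application: the tokenizer, the function names, and the two sample texts are assumptions, and here the vector length is set by the vocabulary built from all of the texts supplied.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(documents):
    """Collect every distinct word across the documents, in sorted order."""
    return sorted({word for doc in documents for word in tokenize(doc)})

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs; word order is discarded."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

documents = ["the cat sat on the mat", "the dog sat"]
vocabulary = build_vocabulary(documents)
print(vocabulary)                              # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(bag_of_words(documents[0], vocabulary))  # [1, 0, 1, 1, 1, 2]
print(bag_of_words(documents[1], vocabulary))  # [0, 1, 0, 0, 1, 1]
```

Note that "the" keeps its count of 2 in the first vector (multiplicity is preserved), while nothing in the vector records where in the sentence each word appeared.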
It's helpful to understand the Bag of Words approach in terms of how it's used to filter out junk
email.
In Bayesian spam filtering, used by many spam filters, an e-mail message is modeled as an
unordered collection of words selected from one of two probability distributions: one
representing spam and one representing legitimate e-mail ("ham"). Imagine that there are two
literal bags full of words. One bag is filled with words found in spam messages, and the other
bag is filled with words found in legitimate e-mail. While any given word is likely to be found
somewhere in both bags, the "spam" bag will contain spam-related words such as "stock",
"Viagra", and "buy" much more frequently, while the "ham" bag will contain more words related
to the user's friends or workplace.
To classify an e-mail message, the Bayesian spam filter assumes that the message is a pile of
words that has been poured out randomly from one of the two bags, and uses Bayesian
probability to determine which bag it is more likely to be.
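The two-bag analogy can be made concrete with a small naive Bayes classifier. This is a toy sketch under stated assumptions: the four training messages are invented, add-one (Laplace) smoothing is one common choice among several, and none of this code comes from the essay scoring application.

```python
import math
from collections import Counter

def train(messages):
    """Build one word 'bag' per label from (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary

def log_posterior(text, label, model):
    """Log prior plus the log likelihood of each word drawn from the label's bag."""
    word_counts, label_counts, vocabulary = model
    total_words = sum(word_counts[label].values())
    score = math.log(label_counts[label] / sum(label_counts.values()))
    for word in text.lower().split():
        # Add-one smoothing keeps unseen words from zeroing out the probability.
        score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
    return score

def classify(text, model):
    """Pick the bag the message is more likely to have been poured from."""
    return max(("spam", "ham"), key=lambda label: log_posterior(text, label, model))

# Invented training messages, for illustration only.
model = train([
    ("buy stock now", "spam"),
    ("buy cheap stock", "spam"),
    ("lunch with the team", "ham"),
    ("team meeting notes", "ham"),
])
print(classify("buy stock", model))   # spam
print(classify("team lunch", model))  # ham
```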
Along with the bag of words feature, another feature is generated that represents how 'topical' the
essay is, by using the bag of words vector that was generated.
Once the features are generated, the application formulates a model, using all training essays,
and their accompanying human-generated scores. The model is essentially a catalog of all feature
measurements for all of the training essays, along with their scores. Once created, this model can
then be used to determine where in the score space a new, unscored essay lies, based on its
feature measurements. In addition to score values, error values, which indicate how consistent
the training essay set was, can be calculated. This can provide a confidence value for the final
score.
The software uses a technique called Gradient Boosting Regression to pinpoint the score within
the model for a given essay by comparing the features for the new essay against the feature sets
of the pre-scored or 'training' essays. This is a well-established machine learning technique
that generally performs well on regression problems such as essay scoring.
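To show what gradient boosting regression does in principle, here is a deliberately small pure-Python sketch. It is not the scikit-learn implementation and not the application's code: it assumes squared-error loss, uses one-split "stumps" as the weak learners, and trains on four invented essays described by two hypothetical features (errors per character and word count).

```python
def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    if best is None:  # every row is identical, so no split exists
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value

def fit_gradient_boosting(X, y, n_rounds=100, learning_rate=0.1):
    """Start from the mean score, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual.
        residuals = [target - p for target, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, row)
                       for p, row in zip(predictions, X)]
    return base, learning_rate, stumps

def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

# Invented training data: [errors per character, word count] -> human score.
X = [[0.10, 120.0], [0.08, 150.0], [0.03, 220.0], [0.01, 300.0]]
y = [20.0, 35.0, 70.0, 90.0]
model = fit_gradient_boosting(X, y)
print(round(predict(model, [0.015, 280.0]), 1))  # score for a new, unscored essay
```

Each round corrects a fraction of the remaining error, which is why many weak learners combined can approximate the human scores closely on the training essays.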
How does it perform?
The best indication of how well the machine learning algorithm works is to measure how well it
predicts the score a human rater would have come up with for any given essay. To do this we can
evaluate the machine-human rater reliability.
Reliability is a critical aspect of any assessment. It describes whether the score is consistent, and
it puts an upper limit on the validity of the assessment. The data were analyzed to determine how
reliably the machine scores reproduce the scores of human essay raters.
A total of 1,214 essays (N=1,214) were scored by both human raters and the machine.
The correlation between the two sets of ratings was .73, representing an inter-rater reliability
of .73, which indicates that the machine scores essays similarly to human raters.
In the world of testing, a reliability value of .73 is generally considered
acceptable.
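The machine-human agreement statistic is an ordinary Pearson correlation between two lists of scores for the same essays. The sketch below shows the computation; the six score pairs are invented for illustration and are not drawn from the 1,214-essay study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_x * spread_y)

# Hypothetical scores for the same six essays, invented for illustration only.
human_scores = [55, 70, 40, 85, 60, 90]
machine_scores = [50, 72, 48, 80, 66, 84]
agreement = pearson_r(human_scores, machine_scores)
print(f"machine-human correlation: {agreement:.2f}")
```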
Therefore, the machine scoring of the HR Avatar essay test was demonstrated to be a reliable
method for scoring essays that is similar to human ratings, but significantly more efficient,
requiring little or no human time or effort to arrive at an assessment of a large number of
applicants’ writing skills.
Appendix G: Validity Evidence for HR Avatar Tests
Results for the HR Avatar High Potential Solution
February 23, 2016
Validation studies have been conducted to confirm that the HR Avatar High Potential Solution
predicts job performance. Two separate studies are reported below. The first compared performance
on the assessment to a dichotomous outcome measure. In the second study, scores on the
assessment were correlated with a 1-100 performance rating.
Study 1
Three companies were included in the first study of the ability of the HR Avatar assessment to
predict manager job performance. The three companies were Johnson & Johnson, Harte Hanks, and
Manulife. They were identified as leaders in the Philippines in terms of using best practices in
personnel management. Individually, their sample sizes were too small to conduct separate
validation studies, but together the number of participants was high enough to enable confidence in
the results.
Results are reported below. The managers received a job performance rating of either “high” or
“marginal.” The sample included 130 managers. It is likely that the performance measure attenuated
the correlation because it has only two levels, which limits the variance, accuracy, and
consistency it can provide. However, it does provide a measure of performance. Based on the
literature review of the components in the HR Avatar Assessment, we expect validity to be very
strong, approximately .35 - .40 uncorrected, and in the .50 - .60 range when corrected for criterion
unreliability.
Table 1 presents the results of the correlation analysis. The overall score predicted job performance
significantly (r=.25; p<.01). Although we are pleased to see a statistically significant correlation that
approaches .30, we believe this is an underestimate of the actual validity, due to the two-level
performance measure. Among the subcomponents, Attention to Detail was particularly robust in
predicting performance (r=.22; p<.05), as was Adaptable (r=.21; p<.05).
The high/marginal performance rating probably has low reliability. If we assume a low criterion
reliability of .40 when correcting for criterion unreliability, the validity increases to .63. We also
applied a higher criterion reliability estimate of .60, a value typically used for well-developed
multi-level performance ratings. The result was a more conservative correction for attenuation that
yielded a corrected correlation of .42.
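The arithmetic of the correction can be checked with a short calculation. The text does not state which formula was applied, so the following is an assumption-laden sketch: dividing the observed correlation of .25 by the reliability itself gives about .625 and .417, in line with the .63 and .42 reported above, whereas the textbook correction for attenuation divides by the square root of the criterion reliability and gives roughly .40 and .32.

```python
import math

def corrected_validity(r_xy, r_yy):
    """Textbook correction for criterion unreliability: r_xy / sqrt(r_yy)."""
    if not 0 < r_yy <= 1:
        raise ValueError("criterion reliability must be in (0, 1]")
    return r_xy / math.sqrt(r_yy)

observed = 0.25  # uncorrected correlation reported for Study 1
for reliability in (0.40, 0.60):
    square_root_form = corrected_validity(observed, reliability)
    simple_ratio = observed / reliability  # reproduces the reported figures
    print(f"r_yy={reliability:.2f}: "
          f"r/sqrt(r_yy)={square_root_form:.3f}, r/r_yy={simple_ratio:.3f}")
```

Readers comparing these results with other validity studies may want to note which form of the correction is being quoted.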
Further support for the assessment’s prediction of job performance was provided by t-test analyses,
which indicated that the high-performing group had a significantly higher average overall score
(X=63.16) on the assessment (p<.01) than the low-to-average job performance
group (X=54.31).
In addition to predicting job performance, various scores on the assessment were related to
Leadership Engagement and Leadership Aspiration, suggesting that the underlying constructs are
related to important attitudes for leadership potential. For example, Needs Structure, Develops
Relationships, and Corporate Citizenship were related to Leadership Engagement. Further, several
test component scores were related to Leadership Aspiration, including Innovative and Creative,
Enjoys Problem Solving, Develops Relationships, and Adaptable.
Table 1: Correlations between Test Components and Job Performance and Job Attitude Measures

Test Score                           Performance        Leadership    Leadership
                                     (High/Marginal)    Engagement    Aspiration
Overall Score                          .25**              .00           .16
Writing                               -.06               -.15           .09
Analytical Thinking                    .17               -.07          -.07
Attention to Detail                    .22*               .01           .09
Adaptable                              .21*               .10           .39**
Develops Relationships                 .02                .35**         .50**
Enjoys Problem Solving                 .16                .26**         .51**
Expressive and Outgoing               -.01               -.13           .02
Innovative and Creative                .18*               .30**         .67**
Needs Structure                        .02                .35**         .36**
Seeks Perfection                       .13                .24**         .36**
Frontline Management Fundamentals      .03                .03           .16
Corporate Citizenship                  .04                .28**         .28**
Exhibits A Positive Work Attitude     -.04                .11           .06
Competitive                           -.01               -.07           .00

Notes. *=p<.05; **=p<.01. N=130.
Study 2
The second study was conducted for an organization called UCPB LEAP. The sample was 64
managers who completed the HR Avatar assessment, and their scores were compared to their
previous year’s performance appraisal. The sample was small, which necessitated using a
nonparametric form of correlation called Spearman’s Rho, which indicates the extent to which the
rank order on the test was similar to the rank order on the performance measure. The correlation
was r=.25, which was significant (p<.05). This supports the assertion that managers who scored
higher on the assessment achieved higher performance ratings.
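To make the rank-order idea concrete, the sketch below computes Spearman's rho as the Pearson correlation of the two sets of ranks, with tied values given their average rank. It is a minimal illustration rather than the software used in the study: the function names are hypothetical and the example values are made up.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical assessment scores and appraisal ratings for five managers.
scores = [1, 2, 3, 4, 5]
ratings = [2, 1, 4, 3, 5]
print(f"rho = {spearman_rho(scores, ratings):.2f}")
```

Because only the ordering of the values matters, rho is unaffected by the scale of either measure, which is why it suits small samples with ordinal performance ratings.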
Conclusion
Based on the two studies presented above, we can say with confidence that managers who score
higher on the assessment perform better on the job. The uncorrected correlation of .25 between
overall score and the high/marginal performance measure is significant. When corrected for
criterion unreliability using a conservative approach, the correlation becomes .42. The t-test
results support the same conclusion. More research is needed to build on these initial yet promising
results. We expect larger effect sizes and more robust prediction of job performance once we
obtain job performance measures that have more variance, rather than simply placing subjects into
“high” and “marginal” categories. Once we have had time to adjust the scoring and weighting of
the overall score and have better criterion measures, we believe the uncorrected validity will be
approximately .35 or higher, which would yield a corrected validity of approximately .60.