Mettl Test For Abstract Reasoning
Revised in 2022
Mettl Test for Abstract Reasoning Manual© may not, in whole or in part, be copied, photocopied, reproduced, translated, or converted to any electronic
or machine-readable form without prior written consent of Mercer Mettl.
Executive Summary
The purpose of this technical manual is to describe the process of standardization and validation of Mettl’s
Test for Abstract Reasoning (MTAR). The Mettl Test for Abstract Reasoning (MTAR) is a nonverbal test
designed to measure an individual's fluid intelligence: the ability to make meaning out of ambiguity, manage new information and solve novel problems. Organizations across the globe use ability tests as part of
their hiring process. Empirical research has shown that cognitive ability tests are extremely useful in assessing
candidates' capability to reason, solve problems and make appropriate decisions, all of which lead to better work
outcomes. In comparison with other methods of employment testing, especially interviews, which are
prone to subjective bias, cognitive tests are unbiased and objective in nature. In addition, in
increasingly global and diverse employment settings there is a growing need for non-verbal reasoning tests
that are free from any form of cultural bias. These tests are helpful for candidates from diverse backgrounds
for whom English is not their first language. In our experience, abstract reasoning tests are among the most
widely used and effective tests for predicting job performance. The previous version of this test was used in hiring and
developmental initiatives across major industries such as e-commerce, financial services, manufacturing, retail, IT and
ITES, and the results indicate a positive relationship between the MTAR and competencies such as ambiguity
tolerance, learning agility and innovation.
Mettl’s Test for Abstract Reasoning is a test of inductive, rather than deductive, reasoning. That is, it requires
respondents to look for patterns in information and then generalise those patterns to the next space in a
sequence. It is a non-verbal and abstract measure that uses shapes and patterns to test respondents’ lateral
thinking abilities. It measures the following capabilities of test takers:
• Ability to understand and detect the meaning behind data or given information.
• Ability to identify the relationship between subtle ideas.
• Ability to think abstractly, finding patterns and relationships to solve novel problems.
• Ability to grasp the bigger picture, think clearly and effectively solve complex problems.
• Ability to process and analyse ambiguous information.
• Ability to think creatively and come up with innovative solutions.
The test consists of increasingly difficult pattern matching tasks and has little dependency on language
abilities. Each item in the MTAR comprises a pattern of diagrammatic puzzles with one piece missing. The
candidate’s task is to choose the correct missing piece from a series of possible answers. The following goals
guided the development of the MTAR. The test must be:
Relevant: The test is designed to measure an individual’s ability to find patterns in information, solve
problems and deal with abstract situations.
Credible: This manual outlines the statistical evidence of reliability, validity and therefore credibility
of the assessment.
Easy to Use and Interpret: The assessment has been designed to have simple and easy to understand
instructions. The feedback reports are also simple to interpret.
Convenient: The assessment is short, typically taking 20-30 minutes to complete.
The assessment is available online and accessible from anywhere in the world.
Free from cultural biases: The test has undergone statistical analysis to ensure it is free from any bias
or adverse impact.
In line with International Standards of Psychological testing: The MTAR has been developed in line
with the Uniform Guidelines on Employee Selection Procedures (EEOC, 1978), the Society for
Industrial and Organizational Psychology's Principles for the Validation and Use of Personnel Selection
Procedures (SIOP, 2003), EFPA Test Review Model and the Standards for Educational and
Psychological Testing developed jointly by the American Educational Research Association, American
Psychological Association, and National Council on Measurement in Education (1999).
Kanazawa (2004)i, Sternberg (1997)ii and Weinberg (1989)iii separately defined intelligence as:
• The mental abilities that enable one to adapt to, shape, or select one’s environment.
The intellectual or cognitive ability of an individual cannot be reduced to a single function or capacity. Psychologists
have therefore attempted to identify the various components of intelligence. This has resulted in theories and
models of intelligence such as Spearman’s two-factor theory, Cattell’s theory of fluid and crystallized
intelligence, Thurstone’s theory of primary mental abilities, Gardner’s theory of multiple intelligences and
Sternberg’s triarchic theory. Spearman’s two-factor theory proposed that intelligence is a general cognitive
ability which energizes diverse mental faculties and functions. According to Spearman there are two
components of intelligence: general intelligence, or ‘G’, which influences performance on all mental tasks,
and specific intelligence, which influences ability on a particular task. Thurstone, on the other hand, proposed that
intelligence consists of seven primary abilities, namely reasoning, verbal comprehension, numerical ability,
word fluency, perceptual speed, spatial visualization and associative memory. Gardner alternatively proposed
eight distinct types of intelligence, including musical, kinaesthetic, spatial, and inter- as well as intra-
personal ability. Sternberg’s triarchic theory of intelligence involves three different factors, namely analytical,
creative and practical intelligence. In summary, despite considerable debate on its definition and exact nature,
intelligence is still not distinctly conceptualized. However, Spearman’s two-factor theory and Horn &
Cattell’s theory of fluid and crystallized intelligence are the two most dominant theories of intelligence and
they are also the most psychometrically sound and empirically tested. We therefore used these theories in
conceptualizing our cognitive tests, especially the Mettl Test for Abstract Reasoning and the Mettl General
Mental Ability Test.
The MTAR is based on Horn & Cattell’s (1967)iv theory of fluid and crystallized intelligence. According to Cattell
(1987)v, intelligence is broadly classified into two distinct factors – fluid and crystallized intelligence. Fluid
intelligence is the ability to reason with and use novel information; it includes the ability to distinguish
relationships, solve novel or unfamiliar problems, and expand one’s knowledge base with new information. On
the other hand, crystallized intelligence is the capability to acquire skills and knowledge and apply that
knowledge in specific situations.
Cattell (1987) believed the label ‘fluid intelligence’ reflected the construct’s quality of being applicable to
almost any problem which is why it is assessed with nonverbal or graphical items. The term fluid is intended
to indicate that fluid intelligence is not tied to any specific habits or sensory, motor, or memory area (Cattell,
1987). Fluid intelligence is a basic reasoning ability that can be applied to any problem, including unfamiliar
ones. It is an essential aspect of human cognition because it allows us to adapt to novel and challenging
situations and helps in figuring things out. It also represents the ability to detect meaningful patterns and
relationships.
Literature Review
Intelligence is one of the most investigated and significant predictors of real-world outcomes such as academic
performance, training performance and on-the-job performance (Kuncel & Hezlett, 2007vi; Salgado, Anderson,
Moscoso, Bertua, & de Fruyt, 2003vii; Schmidt & Hunter, 1998viii). As per the findings of a meta-analysis
conducted by Postlethwaite (2011)ix fluid intelligence is a significant predictor of performance in high
complexity occupations. Fluid intelligence includes basic cognitive abilities which are essential to assimilate
critical evidence about a problem or decision. To answer [abstract reasoning] questions, a person must
generate hypotheses, test them, and infer rules (Carpenter, Just, & Shell, 1990). Fluid intelligence is also
significantly related to metacognition and high reasoning and problem-solving ability (Cattell, 1971). Duncan,
Burgess, and Emslie (1995)x believed that fluid intelligence relies on prefrontal cortex activation and it may
be the best measure of executive functioning. Zook et al. (2004)xi also reported the significant role of fluid
intelligence in executive functioning which is measured in terms of solving complex and goal-directed
problem-solving tasks successfully.
Kuncel, Hezlett, and Ones (2004)xii believe that both fluid and crystallized intelligence play important roles in
the work setting. Effective job performance depends on both the effective processing of new information and
prior learning and experience. For efficient workplace functioning, it is important that employees
possess both technical knowledge and the ability to acquire new knowledge, allowing them to
use new information efficiently to solve novel problems. In sum, “selecting employees for their ability to solve
problems that don’t exist today…to be able to learn new technologies quickly” is the need of the contemporary
organization (Baker, 1996)xiii. In order to predict job performance accurately, we also offer numerical and
verbal reasoning tests, which measure a candidate’s crystallized intelligence, as well as a broad
measure of ‘G’ through the Mettl General Mental Ability Test.
Fluid intelligence has also proven to be a significant predictor of an individual’s ability to multitask (Ben-
Shakhar & Sheffer, 2001xiv; König & Mürling, 2005xv). Individuals who score high on fluid intelligence/abstract
reasoning tests are good at managing large amounts of information and prioritising. A large body of
research also suggests a strong link between fluid intelligence and working memory (Ackerman, Beier, &
Boyle, 2005xvi; Kane & Engle, 2002xvii). Lastly, fluid intelligence has been shown to be a significant determinant
of learning, specifically in novel conditions (Kvist & Gustafsson, 2008xviii; Watkins, Lei & Canivez, 2007xix). This is
because an individual’s early learning phase is generally disorganized and ambiguous, and the ability to
conceptualize and make meaning out of ambiguity matters most at this stage. Therefore, fluid intelligence plays a
particularly important role in learning during these early, novel phases.
Item Banking
MTAR is developed using an item banking approach to generate multiple equivalent forms to support item
randomization. The term ‘item bank’ is used to describe a group of items which are organized, classified and
catalogued systematically. According to research conducted by Nakamura (2000)xxi, Item Response Theory
(IRT) facilitates item bank standardization by calibrating and positioning all items in the test bank on the
same latent continuum by means of a common metric. This method can further be used to add items
to the bank and increase its strength. IRT also allows the construction of equivalent and parallel test forms from the same bank.
Our item bank is developed as per the test composition plan, which is based on two parameters: representation
of all types of item content and inclusion of easy, medium and difficult items.xxii In an abstract reasoning test
individual items are designed as per certain rules like shape, size, addition or subtraction of elements,
movement etc. Test composition is defined by a specified number or percentage of items from various content
domains/rules as well as equal numbers of easy, medium and difficult items. It is used to develop a uniform
content outline which is crucial to confirm the construct validity of the test. In an item bank there are more
questions than are needed for each candidate. This enables random generation of items within certain
parameters to ensure each test is no more or less difficult than the last. Although item characteristics can be
estimated with the help of both Classical Test Theory (CTT) and IRT models, the psychometric literature
indicates the IRT method is more suitable for an item banked test (Embretson & Reise, 2013xxiii; Van der
Linden, 2018xxiv). The classical item and test statistics based on the CTT model vary depending on sample
characteristics whereas an IRT model provides ‘sample free’ indices of item and test statistics. Therefore, we
use item response theory to standardize our item banks.
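As a concrete illustration of the composition-plan and random-generation idea described above, the following Python sketch (hypothetical, not Mettl's production code) draws a randomized form from a calibrated bank while holding the composition plan fixed, so every generated form contains the same mix of easy, medium and difficult items; the item IDs and band sizes are illustrative assumptions.

```python
import random

# Hypothetical calibrated bank of 67 items, each tagged with a difficulty band.
item_bank = {
    "easy":   [f"E{i:02d}" for i in range(1, 23)],   # 22 easy items
    "medium": [f"M{i:02d}" for i in range(1, 23)],   # 22 medium items
    "hard":   [f"H{i:02d}" for i in range(1, 24)],   # 23 difficult items
}

# Composition plan: 9 easy + 9 medium + 5 difficult = 23 items per form.
composition = {"easy": 9, "medium": 9, "hard": 5}

def generate_form(bank, plan, seed):
    """Draw a randomized form that respects the fixed composition plan."""
    rng = random.Random(seed)
    form = []
    for band, n_items in plan.items():
        form.extend(rng.sample(bank[band], n_items))
    rng.shuffle(form)
    return form

print(generate_form(item_bank, composition, seed=42))  # 23 shuffled item IDs
```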
The MTAR item bank was developed through the following steps:
1. Item construction
2. Item review by a panel of experts
3. Pilot testing of items
4. Review of item properties based on pilot data
5. Test administration on representative sample
6. Analysis of item properties and test properties
7. Item finalization and development of item bank
Item Writing
The MTAR consists of matrices with black and white geometrical figures. Candidates are given a three-by-three
matrix in which eight cells contain geometric patterns and the ninth cell is left blank.
Candidates must find the logical rules that govern how the sequence progresses horizontally or
vertically and, from these, identify the shape that should fill the blank space. Little or no use of language
or pre-existing knowledge is required when completing the questions. The development of this test was
done in four broad stages: item creation, multiple rounds of item review, pilot testing and standardization. In
the first stage, a large pool of 170 items was developed by subject matter experts and
psychometricians. Detailed in-depth interviews were conducted with the SMEs to explore which item images and
item logic should be used.
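To make the item format tangible, here is a deliberately simplified, hypothetical representation of one matrix item, with cells reduced to the number of elements they contain and a single "add one element per column" rule; real MTAR items use graphical shapes and can combine several rules.

```python
# Toy 3x3 matrix item: each cell holds a count of elements; the rule is
# "+1 element moving left to right in every row". The candidate must pick
# the option that completes the bottom-right cell.
matrix = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, None],  # missing cell
]
options = [3, 4, 5, 6, 7]

step = matrix[0][1] - matrix[0][0]   # infer the rule from a complete row (+1)
answer = matrix[2][1] + step         # apply it to the incomplete row -> 5

print(f"Key: option {options.index(answer) + 1} (value {answer})")
```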
The following general rules were followed when designing the items:
a. Images/ shapes used should be neutral and not include any culturally specific elements.
b. Images/ shapes should be clear to comprehend, and unambiguous e.g. no blurred lines.
c. There should be a balanced mix of easy, medium and high level of difficulty items.
d. There should be a balanced mix of items with a different number of logical rules included in an item.
Item Review
Item reviews were conducted by our in-house psychometricians, who have over 10 years of research
experience. Items and answer keys were both reviewed in depth, and the difficulty level and item logic of each
item were examined thoroughly. The items were also analysed for cultural neutrality so that no ethnic
or cultural group would be advantaged or disadvantaged by culturally specific images. All items that did
not meet these strict standards were removed. Out of the 170 original items, a pool of 90 items was finalized for pilot testing.
Procedure: In the first stage, we conducted a pilot study and individual item parameters were estimated using
a Rasch Model. The objective of the pilot study was to ascertain basic item properties, especially
item difficulty, for all 90 items. The 90 items were divided into three equivalent sets and data was
collected through online administration of all three sets. All items were mandatory, and participants were not
allowed to skip an item without responding. Only respondents with at least a 90% completion rate were
included in the final data set. This resulted in 233, 234 and 243 responses in the three sets respectively.
Sample Details: In the first stage, data was collected from 710 respondents. 45.5% of respondents were male,
44% were female, 1.7% chose ‘other’ as their gender and 9% preferred not to disclose. English was the native
language of 32% of respondents, and the mean age of the sample was 31 years. A detailed description of the
sample is reported in Appendix 1.
Analysis: A Rasch Model was used to ascertain item properties at stage 1 because of the smaller sample size;
this model provides stable estimates with fewer than 30 responses per item. The Rasch Model is the one-parameter
model of Item Response Theory, which estimates the probability of a correct response to a given test item based
on two variables: the difficulty of the item and the ability of the candidate. The primary function of this model is to
provide information on item difficulty, which helps to organize the test items according to difficulty level,
spread of item difficulty and test length. This ultimately helps to increase measurement accuracy and test
validity. Based on the findings of the Rasch model, items exhibiting extreme b parameters were rejected at this
stage. Values substantially less than -3 or greater than +3 were regarded as extreme. 21 items from the initial
pool of 90 were removed at this stage.
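The sketch below (illustrative only; the difficulty values are invented) shows the Rasch response function described above and how the ±3 logit screening rule would be applied to item difficulty estimates obtained from any IRT calibration.

```python
import numpy as np

def rasch_probability(theta, b):
    """Rasch (1PL) model: P(correct) given candidate ability theta and item difficulty b (logits)."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Hypothetical pilot-stage difficulty estimates (logits) for a few items.
item_difficulties = {"item_01": -0.8, "item_02": 3.4, "item_03": 0.2, "item_04": -3.6}

# Screening rule: reject items whose difficulty falls outside the -3 to +3 range.
retained = {item: b for item, b in item_difficulties.items() if -3 <= b <= 3}
rejected = sorted(set(item_difficulties) - set(retained))

print("Retained:", sorted(retained))                       # ['item_01', 'item_03']
print("Rejected (extreme difficulty):", rejected)           # ['item_02', 'item_04']
print("P(correct | theta=0, b=-0.8) =", round(rasch_probability(0.0, -0.8), 3))  # ~0.69
```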
Procedure: A total of 69 items survived the pilot study stage. These were arranged in terms of difficulty
parameters and then divided into 3 sets of 23 items each for final-stage data collection. The objective of the
second stage of data collection was to standardize the item bank and ascertain the essential psychometric
properties (reliability and validity) of the test. All items were mandatory at this stage and participants
were not allowed to skip an item without responding. Only respondents with at least a 90% completion rate were
included in the final data set. This resulted in 486, 365 and 367 responses in the three sets respectively.
Sample: In the second stage, data was collected from 1218 respondents. 52.6% of respondents were male,
44.5% were female, and 2.8% identified their gender as ‘other’. English was the native language of 28% of
respondents, and the mean age of the sample was 31.9 years.
A detailed description of the sample is reported in Appendix 2.
Analysis: In the second stage of analysis we used a two-parameter model, which posits that the probability
of a correct response is a function of both item difficulty and the respondent’s proficiency. The two-parameter
IRT model provides meaningful estimates of item difficulty and item discrimination. For the
finalization of items in the item bank, the following procedure was followed:
Items displaying a b parameter (item difficulty) less than -3 or greater than +3 were removed from the data set.
Items displaying an a parameter (item discrimination) less than .2 were also removed at this stage.
Two of the 69 items were removed, meaning the final bank consists of 67 items with a balanced spread of
easy, medium and difficult items.
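Analogously, the hypothetical sketch below shows the two-parameter response function and how the finalization rules above (reject items with |b| greater than 3 or a below .2) would filter calibrated items; the parameter values are made up for illustration.

```python
import numpy as np

def two_pl_probability(theta, a, b):
    """2PL IRT model: P(correct) given ability theta, discrimination a and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical second-stage calibration results.
items = {
    "item_10": {"a": 1.10, "b": -0.40},
    "item_11": {"a": 0.15, "b": 0.90},   # poor discrimination (a < .2)
    "item_12": {"a": 0.85, "b": 3.60},   # extreme difficulty (b > 3)
}

final_bank = {
    name: p for name, p in items.items()
    if -3 <= p["b"] <= 3 and p["a"] >= 0.2
}
print(sorted(final_bank))                                      # ['item_10']
print(round(two_pl_probability(0.0, **items["item_10"]), 3))   # ~0.61
```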
Validity
Validity is the most fundamental property of any psychological test. It involves accumulating relevant scientific
evidence to support test score interpretation. The APA Standardsxxv identify four major sources of evidence
to consider when evaluating the validity of a test: evidence based on test content, evidence based on response
processes, evidence based on internal structure and evidence based on relationships with other variables,
especially criterion variables. In order to ascertain the validity of the MTAR we collected evidence based on
internal structure (construct validity) and evidence based on relationships with other variables, especially
criterion variables (criterion-related validity).
Construct Validity
The purpose of construct validation is to ascertain whether the test measures the proposed construct or
something else. The most common methods of ascertaining the construct validity of an assessment are
exploratory and confirmatory factor analysis. We used the CFA method because our objective was to test a
predefined unidimensional measurement model. One of the most important assumptions of using an IRT
model as a measurement system is that the item bank comprises unidimensional items. Therefore,
confirmatory factor analysis was used to establish construct validity evidence. The CFA results
confirmed the unidimensional factor structure with satisfactory fit statistics; the fit indices
met conventional thresholds (IFI = .927; RMSEA = .02; CFI = .919; TLI = .903).
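For readers who want to run a comparable check on their own data, the sketch below specifies a one-factor CFA with the open-source semopy package; the file name, item column names and package choice are illustrative assumptions rather than Mettl's actual tooling, and dichotomous item data would normally call for a categorical estimator rather than the default continuous treatment shown here.

```python
import pandas as pd
from semopy import Model, calc_stats  # open-source SEM package: pip install semopy

# Hypothetical scored responses: one 0/1 column per item (item_01 ... item_23).
data = pd.read_csv("mtar_item_scores.csv")

# One-factor (unidimensional) measurement model: every item loads on a single factor.
model_desc = "fluid =~ " + " + ".join(data.columns)

model = Model(model_desc)
model.fit(data)

stats = calc_stats(model)
print(stats[["CFI", "TLI", "RMSEA"]])
```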
Criterion Validity
Criterion-related validity evidence indicates the extent to which assessment outcomes are predictive of
employee performance in a specified job or role. In order to establish the criterion-related validity, there are
two major methods used:
1. Concurrent Validity: In this method, data on the criterion measures are obtained at the same time
as the psychometric test scores. This indicates the extent to which the psychometric test scores
accurately estimate an individual’s present job performance.
2. Predictive Validity: In this method, data on criterion measures are obtained after the test. This
indicates the extent to which the psychometric test scores accurately predict a candidate’s future
performance. In this method, tests are administered to candidates when they apply for the job and
their performance is reviewed after six months or a year. Afterwards, their scores on the two
measures are correlated to estimate the criterion validity of the psychometric test.
In order to ascertain the MTAR’s validity, concurrent criterion-related validity evidence was gathered: the
performance data and MTAR scores were collected at the same time, the relationship between these
two variables was tested, and significant relationships were found. It is important to note that in criterion-
related validity analysis, the precision and relevance of the criterion data (employee performance data) is
extremely vital. Error in measurement of the criterion is a threat to accurate assessment of the test’s validity.
Error in criterion measurement may attenuate the relationship between test score and criterion variables, and
thus lead to an erroneous criterion-related validity estimate. The basic criteria of appropriateness and quality
are as follows. The criterion measure should:
• Have a clear and objective definition and calculation of performance levels.
• Be aligned with the key demands of the role.
• Have crucial implications for business outcomes.
• Produce reasonable variance to effectively separate various performance levels.
Study Procedure: In the present study MTAR scores were used as the predictor variable and respondents’
competency scores, based on line-manager ratings, were used as the criterion variable. Data was collected
from a multinational company which specializes in HR consulting. A sample of 150 employees from this
organization was invited to participate in the study and the purpose of conducting the assessments was
explained to them in detail. After collecting responses from the employees on the MTAR, a detailed
competency-based performance rating form was completed by their respective line managers. In the
competency-based performance rating form all competencies were defined, and the line managers were asked to
rate each competency on a 10-point rating scale (1 = low and 10 = high). The Pearson product-moment correlation
method was used to test the relationship between MTAR scores and competency ratings.
Sample: A total of 114 employees participated in the study and completed the MTAR. We received managerial
ratings on competencies for only 88 of these respondents. The mean age of the sample was 35 years; 57% of
respondents were male and 43% were female. 73% of the respondents worked as Analysts and Consultants
and the remaining 27% were Leaders and Product Owners.
Analysis: The Pearson product-moment correlation method was used to test the relationship between MTAR scores and
line-manager competency ratings. Results indicate significant positive correlations between the MTAR score
and competency ratings. The MTAR score is positively correlated with analytical ability (r = .325, p < .01), critical
thinking (r = .28, p < .01), innovation (r = .309, p < .05) and high potential (r = .244, p < .05). These correlation
coefficients are not corrected for attenuation or range restriction. The MTAR score is also positively correlated
with learning orientation, employability and ability with numbers (refer to Appendix 3, table 1).
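For reference, a correlation of this kind can be reproduced on any paired data set with a few lines of Python; the scores below are fabricated purely to show the calculation and are not Mettl's study data.

```python
import numpy as np
from scipy.stats import pearsonr

# Fabricated example data: MTAR sten scores and line-manager ratings (1-10)
# of analytical ability for the same ten employees.
mtar_sten = np.array([6, 4, 8, 5, 7, 3, 9, 6, 5, 7])
analytical_rating = np.array([7, 5, 8, 6, 6, 4, 9, 7, 5, 8])

r, p_value = pearsonr(mtar_sten, analytical_rating)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```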
Group Differences
In the present study, group differences in MTAR scores based on age, gender, and ethnicity were examined and
are reported in tables 1-3 (refer to Appendix 4). Table 1 presents the comparison of mean scores
between male and female respondents. Results suggest that there is a significant difference in mean score
between male and female respondents; however, based on traditional ranges for interpreting effect sizes
(Cohen’s d; Cohen, 1988), the difference is medium. Table 2 presents mean scores for two age groups: those 40
years of age or older and those less than 40 years of age. Results indicate these differences are statistically significant,
but an examination of effect sizes indicates the difference is small. We also examined the mean difference
in MTAR scores between two groups, White respondents (reference group) and non-White respondents (focal group);
these differences were not statistically significant.
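The group comparisons above pair a significance test with an effect size; the sketch below shows one common way to compute both (an independent-samples t-test plus Cohen's d with a pooled standard deviation) on simulated data, since the actual group scores are not reproduced in this manual.

```python
import numpy as np
from scipy.stats import ttest_ind

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * group_a.var(ddof=1) +
                  (n_b - 1) * group_b.var(ddof=1)) / (n_a + n_b - 2)
    return (group_a.mean() - group_b.mean()) / np.sqrt(pooled_var)

# Simulated MTAR raw scores for two demographic groups (illustrative only).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=14.0, scale=4.0, size=120)
group_b = rng.normal(loc=13.2, scale=4.0, size=110)

t_stat, p_value = ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}, d = {cohens_d(group_a, group_b):.2f}")
```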
Additionally, in order to test the impact of English language skills on MTAR scores, we examined the mean
difference in MTAR scores between native English speakers and non-native English speakers. Results indicate these
differences were statistically significant, but the effect size was small. This finding indicates that the MTAR
is largely free from language bias and can be used as a global, culture-agnostic tool (refer to Appendix 4, table 4).
Test Administration
MTAR is an online test administered through an internet-based testing system designed by Mettl for the
administration, scoring, and reporting of occupational tests. Test takers are sent a test link to complete the
test and candidate/test taker data is instantly captured for processing through the online system. Test scores
and interpretive reports are instantly generated. Tests can also be administered remotely; most
importantly, all candidate data, question banks, reports and benchmarks are stored in a well-encrypted,
highly regarded cloud service. In order to prevent cheating and all forms of malpractice, Mettl’s platform also
offers AI-powered anti-cheating solutions that include live monitoring, candidate authenticity checks, and
secure browsing.
Scoring
Responses to the MTAR are scored based on how many correct answers a respondent chooses. Each item has
5 answer options, of which only one is correct. Each item answered correctly is awarded 1 mark; items
answered incorrectly or not attempted are given a 0 (zero) mark. An individual’s overall raw score is the average
across all items (the proportion of items answered correctly). Raw scores are then converted into sten scores,
which place them on a 10-point scale, using the formula below.
Sten = (Z × 2) + 5.5, where Z = (X − M) / SD
(X = candidate’s raw score, M = norm group mean, SD = norm group standard deviation)
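A minimal sketch of this conversion is shown below; the norm-group mean and standard deviation are invented, and the rounding and clipping to the 1-10 range are illustrative choices not specified in the manual.

```python
import numpy as np

def raw_to_sten(raw_scores, norm_mean, norm_sd):
    """Convert raw scores to sten scores: sten = 2 * z + 5.5, kept on the 1-10 scale."""
    z = (np.asarray(raw_scores) - norm_mean) / norm_sd
    return np.clip(np.round(2 * z + 5.5), 1, 10)

# Hypothetical norm-group statistics and candidate raw scores (items correct out of 23).
print(raw_to_sten([5, 13, 18, 22], norm_mean=13.0, norm_sd=4.0))  # [ 2.  6.  8. 10.]
```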
Test Composition
Each test taker will be asked to complete 23 items in 30 minutes. Sample items and a sample report are reported
in Appendix 5.
Interpretation
The MTAR measures the abstract reasoning ability of test takers working in a variety of individual
contributor or managerial roles. The test is suitable for use in both recruitment and development settings.
Abstract reasoning is defined as the ability to think laterally, examine problems in unique and unusual ways and
make fresh connections between different concepts. A high score on the MTAR indicates that the test taker
has a strong ability to solve complex problems by identifying patterns and their underlying rules. A high
score also indicates a greater ability to solve problems effectively and perform well in novel situations.
Mettl recommends that the MTAR be used with the following caveats and tips in mind:
Use with other tests: The MTAR, like any other hiring tool, is best used as part of a systematic selection
process, along with other scientifically developed and job-relevant predictors of future success.
Ideally, the MTAR should be administered to job applicants who possess the minimum requirements
for the job. The assessment results can serve as an important part of the hiring decision – but not the
only one.
Aggregate results: The MTAR, when used with large numbers of job applicants in recommended ways,
will yield a better-quality workforce over time. However, as with any assessment of human abilities,
it is not infallible and should be used in conjunction with other information and followed up with
behavioural tools such as structured interviews and competency assessments.
Simple to complex: If the primary focus is to screen out candidates unlikely to succeed, hiring managers
should focus on eliminating those “not recommended for hire” from the pool first. Those remaining
should then be prioritized, considering those “recommended for hire” first and then those “cautiously
recommended for hire”.
Appendices
Appendix 1: Demographic details of Pilot study (N = 710)
Table 1: Gender
Gender Frequency Percent
Male 323 45.5
Female 312 43.9
Others 12 1.7
Prefer not to say 63 8.9
Table 2: Age
Age Frequency Percent
20 - 30 years 399 56.2
31-40 years 196 27.6
41-50 years 74 10.4
51-60 years 41 5.8
Organizational Level Frequency Percent
Level 2: Senior Managers/Directors: Senior Management (Three Levels Below CEO) 73 10.3
Level 3: Managers/Supervisors: Middle management to first-level managers (Five Levels Below CEO) 135 19.0
Level 4: Entry Level: Non-management/individual contributor (including entry level) 193 27.2
Table 9: Nationality
Nationality Frequency Percent
Africa 72 10.1
Asia 231 32.5
Australia & NZ 29 4.1
Europe 162 22.8
LATAM 26 3.7
UK 104 14.6
US & Canada 54 7.6
Not disclosed 32 4.5
Appendix 2: Demographic details of the standardization study (N = 1218)
Table 1: Gender
Gender Frequency Percent
Table 2: Age
Age Frequency Percent
Organizational Level Frequency Percent
Level 1: Executive Officers: Senior-most Leaders (CEO + One Level Below) 151 12.4
Level 2: Senior Managers/Directors: Senior Management (Three Levels Below CEO) 137 11.2
Level 3: Managers/Supervisors: Middle management to first-level managers (Five Levels Below CEO) 230 18.9
Level 4: Entry Level: Non-management/individual contributor (including entry level) 382 31.4
Table 7: Industry
Industry Frequency Percent
Table 9: Nationality
Nationality Frequency Percent
Sample Report
Norms
The new norms for the Abstract Reasoning (Standardized English) test for the Indian region have been developed on 4886
respondents. These norms are based on responses from candidates who attempted 23 questions (9 easy, 9
medium and 5 difficult). The average time taken to complete this assessment was 23.59 minutes.
References
vi Kuncel, N. R., & Hezlett, S. A. (2007). Standardized tests predict graduate students’ success. Science,
315, 1080-1081.
vii Salgado, J. F., Anderson, N., Moscoso, S., Bertua, C., & de Fruyt, F. (2003). International validity
generalization of GMA and cognitive abilities: A European Community meta-analysis. Personnel Psychology, 56(3), 573-605.
x Duncan, J., Burgess, P. W., & Emslie, H. (1995). Fluid intelligence after frontal lobe lesions. Neuropsychologia, 33(3), 261-268.
xi Zook, N. A., Davalos, D. B., DeLosh, E. L., & Davis, H. P. (2004). Working memory, inhibition, and fluid
intelligence as predictors of performance on Tower of Hanoi and London tasks. Brain and Cognition, 56(3), 286-292.
xii Kuncel, N. R., Hezlett, S. A., & Ones, D. S. (2004). Academic performance, career potential, creativity,
and job performance: Can one construct predict them all?. Journal of personality and social
psychology, 86(1), 148.
xiii Baker, T. G. (1996). Essence of intelligence. Practice Network. Society for Industrial & Organizational Psychology.
xv König, C. J., Bühner, M., & Mürling, G. (2005). Working memory, fluid intelligence, and attention are
predictors of multitasking performance, but polychronicity and extraversion are not. Human Performance, 18(3), 243-266.
xvi Ackerman, P. L., Beier, M. E., & Boyle, M. O. (2005). Working memory and intelligence: The same or
different constructs? Psychological Bulletin, 131(1), 30-60.
xviii Kvist, A. V., & Gustafsson, J. E. (2008). The relation between fluid intelligence and the general
factor as a function of cultural background: A test of Cattell's Investment theory. Intelligence, 36(5), 422-436.
xix Watkins, M. W., Lei, P. W., & Canivez, G. L. (2007). Psychometric intelligence and achievement: A
cross-lagged panel analysis. Intelligence, 35(1), 59-68.
xx Primi, R., Ferrão, M. E., & Almeida, L. S. (2010). Fluid intelligence as a predictor of learning: A
longitudinal multilevel approach applied to math. Learning and Individual Differences, 20(5), 446-451.
xxi Nakamura, Y. (2001). Rasch Measurement and Item Banking: Theory and Practice.
xxii Bergstrom, B. A., & Lunz, M. E. (1999). CAT for certification and licensure. In Innovations in
computerized assessment. Lawrence Erlbaum Associates.
xxiv Van der Linden, W. J. (2018). Handbook of item response theory, three volume set. Chapman and
Hall/CRC.
xxv American Educational Research Association, American Psychological Association, & National Council
on Measurement in Education. (1985). Standards for educational and psychological testing. American
Educational Research Association.
xxviUniform Guidelines on Employee Selection Procedures (EEOC, 1978), Retrieved September 11,