B4T4 Utility Written Report
College of Science
DEPARTMENT OF PSYCHOLOGY
Topic:
UTILITY
Submitted to:
Submitted by:
March 2024
UTILITY
CONTENT OUTLINE
1. What is utility?
● In the context of testing and assessment, utility refers to the usefulness or
practical value of using a test to improve efficiency.
2. Cost
● Cost most often refers to money, and it is one of the most basic elements in
any utility analysis, specifically the financial cost of the selection device
used in the study. In the context of test utility, the term covers
disadvantages, losses, or expenses in both economic and noneconomic terms.
● In test utility decisions, cost is traditionally interpreted in the economic
sense, as the expenditure associated with testing or not testing.
○ If a test is to be conducted, funds must be allocated for the purchase of:
1. a particular test;
2. a supply of blank test protocols; and
3. computerized test processing, scoring, and interpretation services.
○ In private clinics, these costs are covered by the fees charged to
test-takers. In research organizations, they are paid from the test user's
funds, which may in turn come from private donations or government grants.
● An example of economic cost is a commercial airline facing high fuel costs
that decides to save money through cost cutting. Although this reduces
expenditure, it has potential consequences: it may lead to significant losses
in customer trust and revenue, and to safety-related incidents.
● An example of noneconomic cost concerns the utility of four X-ray views
compared with two X-ray views in detecting fractured ribs in victims of child
abuse. According to Hansen (2008), the four-view X-ray identifies structures
better than the two-view X-ray. The researchers recommended adopting the
enhanced protocol, despite the additional financial cost, to better detect abuse.
3. Benefit
● Benefit refers to the profits, gains, or advantages associated with testing,
which can be viewed in both economic and noneconomic terms.
● In economic terms, an example is the implementation of a new test that selects
more productive employees, resulting in increased productivity among
employees and, in turn, greater overall company profit.
● Some of the noneconomic benefits in an industrial setting are…
○ Increase in the quality and quantity of workers' performance
○ Decrease in time needed to train workers
○ Reduction in the number of accidents
○ Reduction in worker turnover
Utility Analysis
Discussant: Bettina Alessandra B. Mallo, Gwynette S. Cobayas, & Ian Karlo B. Nuñez
Some utility analyses are straightforward and easy to comprehend, offering
answers to relatively simple questions, while others are quite complex, relying on
mathematical models and intricate weighting schemes for the variables being
considered. For example, while developing a new diagnostic test, researchers may employ
statistical techniques to evaluate large data sets and simulate the test's sensitivity, specificity,
and predictive values across different patient populations.
In simplified terms, utility analysis facilitates the selection of the best assessment tool
among various alternatives by considering both the costs and benefits.
Advantages:
➢ Easy to understand and apply.
➢ Provides a straightforward way to interpret scores and predict
outcomes.
➢ Significantly aids decision-making processes, particularly
those concerning individuals or groups scoring within a specific range on
a predictor.
Limitations:
➢ Oversimplifies the evaluation as it dichotomizes performance into
success and failure.
➢ Focuses on predicting outcomes without considering the cost of
testing.
● Taylor-Russell tables
○ Developed by H.C. Taylor and J.T. Russell in 1939.
○ A method for evaluating the validity of a test in relation to the amount of
information it contributes beyond the base rate. In other words, the tables
provide an estimate of the percentage of employees recruited through the
use of a specific test who will demonstrate success in their respective jobs.
○ To use Taylor-Russell tables, the following information must be present:
■ Definition of success: Success must be clearly defined by
dichotomizing some outcome variable. For example, a general
weighted average of 2.0 or better may be defined as success in
college and those below 2.0 may be defined as failures.
■ Determination of base rate: Base rate is the percentage of people
hired under the existing system who are considered successful.
■ Definition of selection ratio: Selection ratio is the ratio of the
number of individuals to be hired to the number of available
candidates for employment.
● For example, with 50 available positions and 500 applicants:
○ Selection ratio = Number of available positions /
Number of applicants
= 50 / 500 = 0.1
■ Determination of validity coefficient: The validity coefficient is the
correlation of the test with some criterion, for example, a measure of
work quality.
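The four inputs above can be combined in a short calculation. The sketch below is illustrative only and does not reproduce the published tables: it assumes that predictor and criterion scores follow a bivariate normal distribution, and the function names `selection_ratio` and `taylor_russell` are hypothetical. It estimates the proportion of selected applicants expected to be successful by numerical integration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def selection_ratio(positions: int, applicants: int) -> float:
    """Selection ratio = number of available positions / number of applicants."""
    return positions / applicants


def taylor_russell(base_rate: float, sel_ratio: float, validity: float,
                   steps: int = 4000) -> float:
    """Estimate the proportion of selected applicants who will be successful.

    Assumption: predictor (x) and criterion (y) are bivariate normal with
    correlation `validity`, which must lie strictly between -1 and 1.
    """
    x_cut = STD_NORMAL.inv_cdf(1 - sel_ratio)  # predictor cutoff as a z score
    y_cut = STD_NORMAL.inv_cdf(1 - base_rate)  # criterion cutoff as a z score
    spread = sqrt(1 - validity ** 2)           # SD of y at a fixed value of x

    def integrand(x: float) -> float:
        # Density of x multiplied by P(y > y_cut | x)
        p_success = 1 - STD_NORMAL.cdf((y_cut - validity * x) / spread)
        return STD_NORMAL.pdf(x) * p_success

    # Trapezoidal rule over the selected region, truncated 8 SDs above the cutoff
    upper = x_cut + 8.0
    width = (upper - x_cut) / steps
    total = 0.5 * (integrand(x_cut) + integrand(upper))
    for i in range(1, steps):
        total += integrand(x_cut + i * width)
    joint = total * width                      # P(selected and successful)
    return joint / sel_ratio


sr = selection_ratio(50, 500)                  # the 50 / 500 example above
print(round(taylor_russell(base_rate=0.50, sel_ratio=sr, validity=0.50), 2))
```

With a validity coefficient of 0, the estimate falls back to the base rate, which is one way to see that the tables describe what a test adds beyond the base rate.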
Advantages:
➢ Easy to use and provides valuable insight into the
relationships among selection ratio, criterion-related validity, and
existing base rate.
➢ Facilitates decision-making by quantifying the impact of different
selection procedures on hiring success.
➢ Compares the utility of various tests for the same purpose.
Limitations:
➢ Assumes a linear relationship between predictor scores and
criteria.
➢ Identifying a criterion value to separate successful from
unsuccessful performance can be challenging.
➢ Does not directly indicate the likely average increase in
performance.
➢ Similar to the expectancy table, it dichotomizes performance into
successful and unsuccessful categories.
● Naylor-Shine tables
○ Provide a logical way to assess the contribution of an assessment tool
within the context of an established procedure for selection or evaluation.
This involves obtaining the difference between the means of the selected
and unselected groups. By identifying this difference, organizations can
determine the additional value provided by the test beyond existing selection
methods.
○ In Naylor-Shine tables, utility is defined in terms of the increase in mean
criterion performance (e.g., final GPA), given the predictive validity of the
selection procedure and the selection ratio.
○ Purposes of the Naylor-Shine model:
■ Estimate the increase in mean criterion performance based on a
certain selection procedure
■ Determine a cutoff value for the predictor to meet a desired level
of mean criterion performance in the selected group.
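Both purposes can be illustrated with a short calculation. The sketch below is not the published Naylor-Shine tables: it assumes bivariate normal scores, a linear predictor-criterion relationship, and strict top-down selection, and the function names are hypothetical. Under those assumptions, the mean criterion score of the selected group, in standard deviation units relative to the whole applicant pool, equals the validity coefficient multiplied by the mean predictor z score of those selected.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def mean_selected_predictor(z_cut: float) -> float:
    """Mean predictor z score of applicants scoring above z_cut."""
    return STD_NORMAL.pdf(z_cut) / (1 - STD_NORMAL.cdf(z_cut))


def mean_criterion_gain(validity: float, sel_ratio: float) -> float:
    """Purpose 1: expected gain in mean criterion performance (in SD units)."""
    z_cut = STD_NORMAL.inv_cdf(1 - sel_ratio)
    return validity * mean_selected_predictor(z_cut)


def cutoff_for_gain(validity: float, target_gain: float) -> float:
    """Purpose 2: predictor cutoff (z score) that yields the target gain.

    Assumes validity > 0 and a target that is reachable with a cutoff
    between -6 and +6 SDs. Solved by bisection, since the mean of the
    selected group rises as the cutoff rises.
    """
    low, high = -6.0, 6.0
    for _ in range(60):
        mid = (low + high) / 2
        if validity * mean_selected_predictor(mid) < target_gain:
            low = mid
        else:
            high = mid
    return (low + high) / 2


gain = mean_criterion_gain(validity=0.50, sel_ratio=0.10)
print(f"Expected gain: {gain:.2f} SD")
print(f"Cutoff needed for that gain: z = {cutoff_for_gain(0.50, gain):.2f}")
```

Because the result is in standard deviation units, it still has to be translated into practical terms, which matches the limitation noted for these tables.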
Advantages:
➢ Provides information needed to use the Brogden-Cronbach-Gleser
utility formula.
➢ Avoids dichotomizing criterion performance and considers the
non-linear relationship between predictors and criteria.
➢ Offers a clear framework for communicating the meaning of test
scores to stakeholders, aiding decision-making processes.
➢ Can be used for showing average performance gain or determining
the selection ratio needed to achieve a particular performance gain.
Limitations:
➢ Overestimates utility unless top-down selection is employed.
➢ These tables express utility in terms of standardized performance
gains.
➢ Does not address financial aspects such as testing costs.
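The Brogden-Cronbach-Gleser formula named above is not written out in this report. The sketch below uses the form commonly given in testing textbooks, stated here as an assumption: utility gain = (number selected)(average tenure)(validity)(SD of performance in money terms)(mean standardized predictor score of those selected) minus (number of applicants)(cost of testing per applicant). The function name and every figure in the example are hypothetical.

```python
def bcg_utility_gain(n_selected: int, tenure_years: float, validity: float,
                     sd_performance: float, mean_z_selected: float,
                     n_applicants: int, cost_per_applicant: float) -> float:
    """Estimated monetary gain from using a test, net of testing costs."""
    # Benefit: performance gain in money terms over the tenure of those hired
    benefit = (n_selected * tenure_years * validity
               * sd_performance * mean_z_selected)
    # Cost: every applicant is tested, not only those who are hired
    cost = n_applicants * cost_per_applicant
    return benefit - cost


# Hypothetical figures: 10 hires staying 2 years, validity 0.40,
# SD of yearly performance worth 10,000, mean z of selected = 1.0,
# 100 applicants tested at a cost of 50 each.
print(bcg_utility_gain(10, 2, 0.40, 10_000, 1.0, 100, 50))
```

Unlike the standardized gain from the tables, this estimate includes testing costs, which is the financial aspect the limitation above refers to.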
● In general, the same sorts of approaches to utility analysis are put to work for
positions that vary greatly in terms of complexity.
● The same sorts of data are gathered, the same sorts of analytic methods may be
applied, and the same sorts of utility models may be invoked for corporate
positions ranging from assembly line worker to computer programmer.
● Yet as Hunter et al. (1990) observed, the more complex the job, the more people
differ on how well or poorly they do that job. Whether or not the same utility
models apply to jobs of varied complexity, and whether or not the same utility
analysis methods are equally applicable, remain matters of debate.
To determine the cut score for predictors, test developers draw on either
subject-matter experts or data from a representative sample, applying whichever of the
following methods is most efficient for them or for the decision-makers.
C. IRT-Based Methods
Under classical test score theory, cut scores are typically based on testtakers' total scores,
that is, on the number of items that must be answered “correctly” for testtakers to be
regarded as possessing the trait or attribute. IRT-based methods instead draw on item
response theory (IRT), which attempts to uncover the relationship between unobservable
traits, attributes, or abilities and the responses of a subject. In setting cut scores under
IRT, experts also associate a certain level of difficulty with each item.
a. Item-mapping method
● Experts are first trained to estimate the minimal competence required
for the trait or attribute being scored. The items are then arranged in
a histogram in which each column contains items of equivalent value,
and the experts judge whether future testtakers with minimal competence
would be able to answer the items in a column correctly. The difficulty
level identified in this way is then set as the cut score.
● The process can involve several rounds of judgment and feedback from
the same or different expert ratings until the appropriate difficulty level
has been selected.
b. Bookmark method
● Experts are first trained on the minimal competence, in terms of
knowledge, skill, and/or ability, required of future testtakers. They are
then handed a book that contains one item per page, arranged in ascending
order of difficulty. The experts are tasked to place a bookmark between
the pages that separate future testtakers who have the minimal competence
required from those who do not.
● Additional rounds of bookmarking and feedback can occur; however, the
level of difficulty to use as the cut score rests on the decision of the
test developers themselves.
Concerns: Training received by experts, floor and ceiling effects, and the
optimal length of the item booklet.
D. Other Methods
a. Decision-theoretic approach (Hambleton & Novick, 1973)
Additional information: Decision theory is commonly used in economics. It
considers a set of possible actions and a loss function that quantifies the value
of each action to the decision-maker (Hirano, 2010). In determining cut scores,
miss rates are treated as the loss function in order to identify the cut score that
minimizes miss rates and/or maximizes utility (de Gruijter & Hambleton, 1984).
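The idea of choosing a cut score by minimizing a loss can be sketched as follows. This is an illustration under stated assumptions, not the procedure published by Hambleton and Novick: the function name, the loss weights, and the sample data are all hypothetical. Each candidate cut score is evaluated by counting the two kinds of misses, false positives (selected but unsuccessful) and false negatives (rejected but would have succeeded), and weighting each by its assumed loss.

```python
def best_cut_score(scores, succeeded, loss_false_pos=1.0, loss_false_neg=1.0):
    """Return (cut score, total loss) for the cut score with the lowest loss.

    A testtaker is selected when score >= cut score. On ties, the lowest
    cut score with the minimum loss is returned.
    """
    best_cut, best_loss = None, float("inf")
    for cut in sorted(set(scores)):
        false_pos = sum(1 for s, ok in zip(scores, succeeded)
                        if s >= cut and not ok)
        false_neg = sum(1 for s, ok in zip(scores, succeeded)
                        if s < cut and ok)
        loss = loss_false_pos * false_pos + loss_false_neg * false_neg
        if loss < best_loss:
            best_cut, best_loss = cut, loss
    return best_cut, best_loss


# Hypothetical validation sample: test scores and later success on the criterion
scores = [40, 45, 50, 55, 60, 65, 70, 75]
succeeded = [False, False, False, True, False, True, True, True]

print(best_cut_score(scores, succeeded))                      # equal losses
print(best_cut_score(scores, succeeded, loss_false_pos=3.0))  # costly false positives
```

Raising the loss assigned to false positives pushes the chosen cut score upward, which shows how the decision-maker's values, and not the data alone, determine the result.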
b. Method of predictive yield (R. L. Thorndike, 1949)
A norm-referenced method for personnel selection that considers the following:
the number of positions available, estimates of the likelihood of offer
acceptance, and the score distribution of applicants.
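The three inputs can be combined in a simple calculation. The sketch below is one possible reading of the method, not Thorndike's own procedure: it assumes a single overall acceptance rate, and the function name and applicant scores are hypothetical. It estimates how many offers are needed to fill the positions and sets the cut score at the score of the last applicant who would receive an offer.

```python
import math


def predictive_yield_cut(applicant_scores, positions, acceptance_rate):
    """Return (cut score, number of offers needed to fill the positions)."""
    # Offers needed so that expected acceptances cover the open positions
    offers_needed = math.ceil(positions / acceptance_rate)
    if offers_needed > len(applicant_scores):
        raise ValueError("Not enough applicants to fill the positions.")
    ranked = sorted(applicant_scores, reverse=True)
    # Cut score = score of the lowest-ranked applicant who receives an offer.
    # If several applicants tie at this score, more offers may go out.
    return ranked[offers_needed - 1], offers_needed


# Hypothetical pool: 8 applicants, 2 positions, half of all offers accepted
applicant_scores = [88, 92, 75, 60, 81, 95, 70, 66]
print(predictive_yield_cut(applicant_scores, positions=2, acceptance_rate=0.5))
```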
c. Discriminant analysis or discriminant function analysis
Makes use of a family of related statistical techniques to identify the
relationship between scores (on a battery of tests) and two naturally occurring
groups that contrast with one another.
REFERENCES
Cohen, R. J., & Swerdlik, M. E. (2017). Psychological testing and assessment (9th ed.).
McGraw-Hill Education.
Kaplan, R., & Saccuzzo, D. (2018). Psychological testing (9th ed.). Cengage Learning.