
Central Luzon State University
Science City of Muñoz 3120
Nueva Ecija, Philippines

Instructional Module for the Course
PSYCH 3140
(Psychological Assessment)

Module 1: Basic Concepts in Psychological Testing

Overview

In Module 1, we will discuss what a psychological test is. After exploring the history of psychological testing, we will discuss the defining characteristics of psychological tests and the assumptions we must make when using psychological tests. After discussing how tests are classified, we distinguish four commonly confused concepts: psychological assessment, psychological tests, psychological measurement, and surveys.

I. Objectives
At the end of this lesson, you should be able to:
1. Know the definition of a test and its features.
2. Understand the difference between psychological assessment and psychological
testing.
3. Know the major landmarks in the history of psychological testing.
4. Know the different types of tests.
5. Understand basic concepts of standardized and non-standardized testing and
other assessment techniques, including norm-referenced and criterion-
referenced assessment, environmental assessment, performance assessment,
individual and group tests, inventory methods, psychological testing, and
behavioral observations.
6. Understand the varied purposes of psychological testing in addition to the
various settings in which tests are employed.
7. Apply technical concepts, basic principles, and tools of measurement to
psychological processes.

Topic 1: Definition of a Test

A test is a standardized procedure for sampling behavior and describing it with categories or scores. For example, all of the following could be tests according to the definition used in this book: a checklist for rating the social skills of a youth with mental retardation; a nontimed measure of mastery in adding pairs of three-digit numbers; microcomputer appraisals of reaction time; and even situational tests, such as observing an individual working on a group task with two “helpers” who are obstructive and uncooperative.
In sum, tests are enormously varied in their formats and applications. Nonetheless, most
tests possess these defining features:

• Standardized procedure
• Behavior sample
• Scores or categories
• Norms or standards
• Prediction of non-test behavior

Standardized procedure is an essential feature of any psychological test. A test is considered to be standardized if the procedures for administering it are uniform from one examiner and setting to another. Standardization, therefore, rests largely on the directions for administration found in the instructional manual that typically accompanies a test.
The formulation of directions is an essential step in the standardization of a test. In
order to guarantee uniform administration procedures, the test developer must provide
comparable stimulus materials to all testers, specify with considerable precision the oral
instructions for each item or subtest, and advise the examiner how to handle a wide range
of queries from the examinee.

A psychological test is also a limited sample of behavior. Neither the subject nor
the examiner has sufficient time for truly comprehensive testing, even when the test is
targeted to a well-defined and finite behavior domain. Thus, practical constraints dictate
that a test is only a sample of behavior. Yet, the sample of behavior is of interest only
insofar as it permits the examiner to make inferences about the total domain of relevant
behaviors.

A psychological test must also permit the derivation of scores or categories. Thorndike (1918) expressed the essential axiom of testing in his famous assertion, “Whatever exists at all exists in some amount.” Psychological testing sums up performance in numbers or classifications. The purpose of the testing is to estimate the amount of the trait or quality possessed by an individual.

The imprecision of testing is simply unavoidable: Tests must rely on an external sample of behavior to estimate an unobservable and, therefore, inferred characteristic. Psychometricians often express this fundamental point with an equation:

X = T + e

where X is the observed score, T is the true score, and e is a positive or negative error component.
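
Since e is random and averages out to zero over many measurements, observed scores cluster around the true score T. Below is a minimal sketch in Python, assuming a normally distributed error component; the true score of 100 and error spread of 5 are arbitrary illustrative values:

```python
# Minimal sketch of the classical test theory model X = T + e.
# All numeric values are illustrative, not from any real test.
import random

random.seed(42)

TRUE_SCORE = 100   # T: the examinee's unobservable true score
ERROR_SD = 5       # spread of the random error component e

# Simulate many observed scores X = T + e
observed = [TRUE_SCORE + random.gauss(0, ERROR_SD) for _ in range(10_000)]

mean_observed = sum(observed) / len(observed)
print(f"Mean observed score: {mean_observed:.2f}")  # close to 100: errors cancel
```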

A psychological test must also possess norms or standards. An examinee’s test score is usually interpreted by comparing it with the scores obtained by others on the same test. For this purpose, test developers typically provide norms—a summary of test results for a large and representative group of subjects (Petersen, Kolen, & Hoover, 1989). The norm group is referred to as the standardization sample.

The selection and testing of the standardization sample is crucial to the usefulness of a test. This group must be representative of the population for whom the test is intended, or else it is not possible to determine an examinee’s relative standing. In the extreme case when norms are not provided, the examiner can make no use of the test results at all. An exception to this point occurs in the case of criterion-referenced tests, discussed later.
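
To see how a norm group is used in practice, the sketch below expresses a raw score as a z-score and a percentile rank relative to a standardization sample. The norm scores and the examinee’s raw score are hypothetical values invented for illustration:

```python
# Illustrative sketch: interpreting a raw score against a standardization sample.
from statistics import mean, stdev

norm_sample = [42, 47, 50, 51, 53, 55, 55, 58, 60, 64]  # hypothetical norm group
examinee_raw = 58                                       # hypothetical raw score

m, s = mean(norm_sample), stdev(norm_sample)
z = (examinee_raw - m) / s  # standing relative to the norm group, in SD units
percentile = 100 * sum(x <= examinee_raw for x in norm_sample) / len(norm_sample)

print(f"z-score: {z:.2f}")
print(f"Percentile rank: {percentile:.0f}")  # % of norm group at or below the score
```

Real standardization samples contain hundreds or thousands of cases; ten scores are used here only to keep the example readable.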

In general, the ultimate purpose of a test is to predict additional behaviors, other than those directly sampled by the test. Thus, the tester may have more interest in the non-test behaviors predicted by the test than in the test responses per se. The ability of a test to predict non-test behavior is determined by an extensive body of validational research, most of which is conducted after the test is released. A test may have a fancy title, precise instructions, elaborate norms, attractive packaging, and preliminary findings—but if, in the dispassionate study of independent researchers, the test fails to predict appropriate non-test behaviors, then it is useless.

Further Distinctions in Testing

The chief features of a test previously outlined apply especially to norm-referenced tests, which constitute the vast majority of tests in use. In a norm-referenced test, the performance of each examinee is interpreted in reference to a relevant standardization sample (Petersen, Kolen, & Hoover, 1989). However, these features are less relevant in the special case of criterion-referenced tests, since these instruments suspend the need for comparing the individual examinee with a reference group. In a criterion-referenced test, the objective is to determine where the examinee stands with respect to very tightly defined educational objectives (Berk, 1984).

Another important distinction is between testing and assessment, which are often considered equivalent. However, they do not mean exactly the same thing. Assessment is a more comprehensive term, referring to the entire process of compiling information about a person and using it to make inferences about characteristics and to predict behavior. Assessment can be defined as appraising or estimating the magnitude of one or more attributes in a person. The assessment of human characteristics involves observations, interviews, checklists, inventories, projectives, and other psychological tests. In sum, tests represent only one source of information used in the assessment process. In assessment, the examiner must compare and combine data from different sources. This is an inherently subjective process that requires the examiner to sort out conflicting information and make predictions based on a complex gestalt of data.

The contrast between assessment and testing can be summarized along several bases:

Objective
Assessment: To answer a referral question, solve a problem, or arrive at a decision through the use of tools of evaluation.
Testing: To obtain some gauge, usually numerical in nature, with regard to the ability or attribute.

Process
Assessment: Typically individualized; focuses on how the individual processes rather than simply the results of that processing.
Testing: May be individual or group in nature, with little or less regard for the mechanics of contents and processes.

Role of the Evaluator
Assessment: The assessor is key to the process of selecting tests and/or other tools of evaluation, as well as drawing conclusions from the entire evaluation.
Testing: The tester is not key to the process; practically speaking, one tester may be substituted for another without appreciably affecting the evaluation.

Skill of the Evaluator
Assessment: Typically requires an educated selection of tools of evaluation, skill in evaluation, and thoughtful organization and integration of data.
Testing: Typically requires technician-like skills in administering and scoring a test as well as in interpreting a test result.

Outcome
Assessment: Entails a logical problem-solving approach that brings many sources of data to bear on a referral question.
Testing: Typically yields a test score or series of test scores.

Complexity
Assessment: More complex; involves various procedures and dimensions.
Testing: Simple; involves one uniform procedure, frequently one dimension.

Duration
Assessment: Longer.
Testing: Shorter.

Sources of Data
Assessment: Several.
Testing: One.

Focus
Assessment: The uniqueness of a group, individual, or situation.
Testing: How one person or group compares with others.

Qualifications
Assessment: Knowledge of methods and the field of assessment.
Testing: Knowledge of tests and testing.

Procedure
Assessment: Subjective; clinical judgement is critical.
Testing: Objective; quantification is critical.

Cost
Assessment: High.
Testing: Low.

Purpose
Assessment: Arriving at a decision concerning the referral question or problem.
Testing: Obtaining data to make decisions.

Structure
Assessment: Entails both structured and unstructured aspects.
Testing: Highly structured.

Evaluation of Results
Assessment: Relatively hard.
Testing: Relatively hard.

Topic 2: Major Landmarks in the History of Psychological Testing (Gregory, 2014)

2200 B.C. Chinese begin civil service examinations.
1838 Jean Esquirol distinguishes between mental illness and mental retardation.
1862 Wilhelm Wundt uses a calibrated pendulum to measure the “speed of
thought.”
1866 O. Edouard Seguin writes the first major textbook on the assessment and
treatment of mental retardation.
1879 Wundt founds the first experimental laboratory in psychology in Leipzig,
Germany.
1884 Francis Galton administers the first test battery to thousands of citizens at
the International Health Exhibit.
1890 James McKeen Cattell uses the term mental test in announcing the agenda
for his Galtonian test battery.
1896 Emil Kraepelin provides the first comprehensive classification of mental
disorders.
1901 Clark Wissler discovers that Cattellian “brass instruments” tests have no
correlation with college grades.
1904 Charles Spearman proposes that intelligence consists of a single general
factor g and numerous specific factors s1, s2, s3, and so forth.
1904 Karl Pearson formulates the theory of correlation.
1905 Alfred Binet and Theodore Simon invent the first modern intelligence test.
1908 Henry H. Goddard translates the Binet-Simon scales from French into
English.
1912 Stern introduces the IQ, or intelligence quotient: the mental age divided by
chronological age.
1916 Lewis Terman revises the Binet-Simon scales, publishes the Stanford-Binet;
revisions appear in 1937, 1960, 1986, and 2003.
1917 Robert Yerkes spearheads the development of the Army Alpha and Beta
examinations used for testing WWI recruits.

1917 Robert Woodworth develops the Personal Data Sheet, the first personality
test.
1921 The Rorschach inkblot test is published.
1921 Psychological Corporation—the first major test publisher—is founded by
Cattell, Thorndike, and Woodworth.
1926 Florence Goodenough publishes the Draw-A-Man Test.
1926 The first Scholastic Aptitude Test is published by the College Entrance
Examination Board.
1927 The first edition of the Strong Vocational Interest Blank is published.
1935 The Thematic Apperception Test is released by Morgan and Murray at
Harvard University.
1936 Lindquist and others publish the precursor to the Iowa Tests of Basic Skills.
1936 Edgar Doll publishes the Vineland Social Maturity Scale for assessment of
adaptive behavior in those with mental retardation.
1938 L. L. Thurstone proposes that intelligence consists of about seven group
factors known as primary mental abilities.
1938 Raven publishes the Raven’s Progressive Matrices, a nonverbal test of
reasoning intended to measure Spearman’s g factor.
1938 Lauretta Bender publishes the Bender Visual Motor Gestalt Test, a
design-copying test of visual-motor integration.
1938 Oscar Buros publishes the first Mental Measurements Yearbook.
1938 Arnold Gesell releases his scale of infant development.
1939 The Wechsler-Bellevue Intelligence Scale is published; revisions are
published in 1955 (WAIS), 1981 (WAIS-R), 1997 (WAIS-III), and 2008
(WAIS-IV).
1939 Taylor–Russell tables published for determining the expected proportion
of successful applicants with a test.
1939 The Kuder Preference Record, a forced-choice interest inventory, is
published.
1942 The Minnesota Multiphasic Personality Inventory (MMPI) is published.
1948 Office of Strategic Services (OSS) uses situational techniques for selection
of officers.
1949 The Wechsler Intelligence Scale for Children is published; revisions are
published in 1974 (WISC-R), 1991 (WISC-III), and 2003 (WISC-IV).
1950 The Rotter Incomplete Sentences Blank is published.
1951 Lee Cronbach introduces coefficient alpha as an index of reliability (internal
consistency) for tests and scales.
1952 American Psychiatric Association publishes the Diagnostic and Statistical
Manual (DSM-I).
1953 Stephenson develops the Q-technique for studying the self-concept and
other variables.
1954 Paul Meehl publishes Clinical vs. Statistical Prediction.
1956 The Halstead-Reitan Test Battery begins to emerge as the premier test
battery in neuropsychology.
1957 C. E. Osgood describes the semantic differential.
1958 Lawrence Kohlberg publishes the first version of his Moral Judgment Scale;
research with it expands until the mid-1980s.
1959 Campbell and Fiske publish a test validation approach known as the
Multitrait-multimethod matrix.
1963 Raymond Cattell proposes the theory of fluid and crystallized intelligences.
1967 In Hobson v. Hansen the court rules against the use of group ability tests to
“track” students on the grounds that such tests discriminate against
minority children.
1968 American Psychiatric Association publishes DSM-II.
1969 Nancy Bayley publishes the Bayley Scales of Infant Development (BSID).
The revised version (BSID-2) is published in 1993.
1969 Arthur Jensen proposes the genetic hypothesis of African American versus
white IQ differences in the Harvard Educational Review.
1971 In Griggs v. Duke Power the Supreme Court rules that employment test
results must have a demonstrable link to job performance.
1971 George Vaillant popularizes a hierarchy of 18 ego adaptive mechanisms and
describes a methodology for their assessment.
1972 The Model Penal Code rule for legal insanity is published and widely
adopted in the United States.
1974 Rudolf Moos begins publication of the Social Climate Scales to assess
different environments.
1974 Friedman and Rosenman popularize the Type A coronary-prone behavior
pattern; their assessment is interview-based.
1975 The U.S. Congress passes Public Law 94-142, the Education for All
Handicapped Children Act.
1978 Jane Mercer publishes SOMPA (System of Multicultural Pluralistic
Assessment), a test battery designed to reduce cultural discrimination.
1978 In the Uniform Guidelines on Employee Selection adverse impact is defined
by the four-fifths rule; also guidelines for employee selection studies are
published.
1979 In Larry P. v. Riles the court rules that standardized IQ tests are culturally
biased against low-functioning black children.
1980 In Parents in Action on Special Education v. Hannon the court rules that
standardized IQ tests are not racially or culturally biased.
1985 The American Psychological Association and other groups jointly publish the
influential Standards for Educational and Psychological Testing.
1985 Sparrow and others publish the Vineland Adaptive Behavior Scales, a
revision of the pathbreaking 1936 Vineland Social Maturity Scale.
1987 American Psychiatric Association publishes DSM-III-R.
1989 The Lake Wobegon Effect is noted: Virtually all states of the union claim
that their achievement levels are above average.
1989 The Minnesota Multiphasic Personality Inventory-2 is published.
1992 American Psychological Association publishes a revised Ethical Principles of
Psychologists and Code of Conduct (American Psychologist, December
1992).
1994 American Psychiatric Association publishes DSM-IV.
1994 Herrnstein and Murray revive the race and IQ heritability debate in The Bell
Curve.
1999 APA and other groups publish revised Standards for Educational and
Psychological Testing.
2003 New revision of APA Ethical Principles of Psychologists and Code of Conduct
goes into effect.

Topic 3: Types of Tests

Tests can be broadly grouped into two camps: group tests versus individual
tests. Group tests are largely pencil-and-paper measures suitable to the testing of large
groups of persons at the same time. Individual tests are instruments that by their design
and purpose must be administered one on one. An important advantage of individual tests
is that the examiner can gauge the level of motivation of the subject and assess the
influence of other factors (e.g., impulsiveness or anxiety) on the test results.

The Main Types of Psychological Tests

Intelligence Tests: Measure an individual’s ability in relatively global areas such as
verbal comprehension, perceptual organization, or reasoning
and thereby help determine potential for scholastic work or
certain occupations.
Aptitude Tests: Measure the capability for a relatively specific task or type of
skill; aptitude tests are, in effect, a narrow form of ability
testing.
Achievement Tests: Measure a person’s degree of learning, success, or
accomplishment in a subject or task.
Creativity Tests: Assess novel, original thinking and the capacity to find unusual
or unexpected solutions, especially for vaguely defined
problems; that is, the ability to produce new ideas, insights, or
artistic creations that are accepted as being of social, aesthetic,
or scientific value.
Personality Tests: Measure the traits, qualities, or behaviors that determine a
person’s individuality; such tests include checklists, inventories,
and projective techniques.
Interest inventories: Measure an individual's preference for certain activities or topics
and thereby help determine occupational choice.
Behavioral procedures: Objectively describe and count the frequency of a behavior,
identifying the antecedents and consequences of the behavior.
Neuropsychological Tests: Measure cognitive, sensory, perceptual, and motor performance
to determine the extent, locus, and behavioral consequences of
brain damage.

Other Test Classifications:

Classifications according to:

A. Form
Paper and Pencil test
Performance test

B. Time Element
Speed tests
Power tests
Tests without time limits

C. Responses
Verbal responses
Non-verbal responses

D. Scoring Procedure
Objectively scored test
Subjectively scored test

E. Standardization
Standardized test
Non-standardized test

F. Levels
Level A
Level B
Level C

Topic 4: Uses of Testing

By far the most common use of psychological tests is to make decisions about
persons. For example, educational institutions frequently use tests to determine placement
levels for students, and universities ascertain who should be admitted, in part, on the basis
of test scores. State, federal, and local civil service systems also rely heavily on tests for
purposes of personnel selection.

Even the individual practitioner relies on tests, in the main, for decision making.
Examples include the consulting psychologist who uses a personality test to determine that
a police department should hire one candidate and not another, and the neuropsychologist
who employs tests to conclude that a client has suffered brain damage.

But simple decision making is not the only function of psychological testing. It is
convenient to distinguish five uses of tests:

• Classification
• Diagnosis and treatment planning
• Self-knowledge
• Program evaluation
• Research

Rating: To rate people when test data help determine where they fall
relative to either (1) their peers or (2) some standard of
performance.

Placement: Involves the evaluation of people so that they can be matched
with the appropriate services or environments.

Selection: Tests are frequently used for the selection of a group of people
from a larger pool of applicants or candidates.

Competency and Proficiency: Tests can be used to indicate whether or not an
examinee’s performance meets a preselected criterion.

Diagnosis: In diagnosis, tests are used to determine the nature and
typicality of an individual’s underlying characteristics. Schools
use tests to identify potential learning problems in children and
suggest areas of strength useful in planning remediation.
Clinicians use tests to identify areas of pathology or adjustment
problems and to plan treatment approaches.

Outcome evaluation: Tests also can be used to make decisions by evaluating an
outcome, such as the value of a program, a product, or a course
of action.

Uses of Psychological Tests in Various Settings
(Murphy & Davidshofer, 1988; as cited in Suba, 2013)

Education Setting
• School readiness and school admission
• Classroom selection or classification of students with reference to their ability to
profit from different types of school instruction.
• Identification of exceptionality
• Diagnosis of academic failures and learning disabilities
• Educational planning and career counseling
• Evaluation of student competencies
• Evaluation of teacher competencies
• Evaluation of instructional programs

Business/Industrial Setting
Psychological tests are used in conjunction with other methods of obtaining information
about individuals, e.g., biographical data, application forms, interviews, work samples, and
employment records.
• Selection of new employees: Hiring, Classification, and Job Assignment
• Evaluation of current employees: Job Transfer, Training, Promotion, Termination
• Evaluation of programs and/or products
• Assessment of consumer behavior

Counseling or Clinical Setting
The use of tests in counseling has broadened from educational/vocational planning to
involvement in all aspects of the person’s life. Tests are used to enhance self-understanding
and personal development.
• Identification of intellectual deficiencies
• Psychodiagnosis/differential diagnosis of psychopathology
• Clinical assessment of emotional/behavioral disorders
• Marital and family assessment
• Assessment in Health and Legal Context

Research and Others


• Data gathering/theory verification
• Environmental assessment

Limitations of Psychological Tests

• Scores cannot reveal how or why an individual obtained a certain score.
• Examinees may give deliberately favorable responses.
• Tests do not measure the ability or potential to apply and appreciate the information gained.
• Results cannot make decisions for the examinee.
• Chance error complicates the individual interpretation of scores; three error indices describe it (see the sketch after this list):
o SEM (standard error of measurement): the reasonable limits within which an obtained score may vary while the test maintains its reliability.
o SEdiff (standard error of the difference): used to test whether the difference between two scores is significant.
o SEest (standard error of estimate): the margin of error expected in an individual’s predicted criterion score.
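
These three indices follow directly from the test’s reliability and the spread of its scores. Below is a minimal sketch using the common textbook formulas SEM = SD × sqrt(1 − r), SEdiff = SD × sqrt(2 − r1 − r2), and SEest = SDy × sqrt(1 − rxy²); every numeric value is hypothetical:

```python
# Sketch of the three chance-error indices; all values are hypothetical.
import math

sd = 15.0     # standard deviation of test scores
r_xx = 0.90   # reliability coefficient of the test

# SEM: the band of chance error expected around an obtained score
sem = sd * math.sqrt(1 - r_xx)

# SEdiff: error of the difference between two scores (here, the same test twice)
se_diff = sd * math.sqrt(2 - r_xx - r_xx)

# SEest: margin of error in a predicted criterion score, given the criterion's
# standard deviation and the test-criterion validity coefficient
sd_criterion, r_xy = 10.0, 0.60
se_est = sd_criterion * math.sqrt(1 - r_xy ** 2)

print(f"SEM = {sem:.2f}, SEdiff = {se_diff:.2f}, SEest = {se_est:.2f}")
```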

Why is Psychological Assessment Less Precise?

• Psychological tests measure only a sample of the behavior or property under study.
• Psychological tests use a more limited scale.
• Psychological measurement is more easily affected by extraneous variables.

Why is Psychological Assessment Less Direct?

• Many psychological tests are designed to draw inferences about underlying attributes or characteristics.
• Many psychological tests are designed to measure constructs, hypothetical dimensions on which individuals differ. Because psychological constructs are theoretical abstractions, they cannot be measured directly. Instead, we must infer their presence from measurements of specific behaviors.

Topic 5: Characteristics of a Good Test

A good test is designed carefully and evaluated empirically to ensure that it generates accurate, useful information. There are four important characteristics of a good test: (1) reliability, (2) validity, (3) objectivity, and (4) usability.

Characteristic 1. Reliability:
The dictionary meaning of reliability is consistency, dependability, or trust. In
measurement, then, reliability is the consistency with which a test yields the same result in
measuring whatever it does measure. A test score is called reliable when we have reason
to believe the score is stable and trustworthy. Stability and trustworthiness depend upon
the degree to which the score is an index of true ability, that is, free from chance error.
Therefore, reliability can be defined as the degree of consistency between two
measurements of the same thing. For example, suppose we administered an achievement
test to Group A and found a mean score of 55. If, after three days, we administered the
same test to Group A and again found a mean score of 55, the measuring instrument (the
achievement test) is providing a stable, dependable result. On the other hand, if the second
measurement yielded a mean score of around 77, we would say that the test scores are
not consistent.
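
In practice, this kind of consistency is quantified as the correlation between the two administrations rather than by comparing group means alone. A minimal sketch, with hypothetical scores for ten examinees tested twice:

```python
# Test-retest reliability as the correlation between two administrations.
# The scores below are hypothetical.
from statistics import correlation  # requires Python 3.10+

first_admin  = [55, 60, 48, 70, 52, 65, 58, 45, 62, 50]
second_admin = [54, 62, 47, 69, 55, 64, 60, 46, 61, 52]

r_tt = correlation(first_admin, second_admin)
print(f"Test-retest reliability estimate: r = {r_tt:.2f}")  # close to 1.0
```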

In the words of Gronlund and Linn (1995), “reliability refers to the consistency of
measurement—that is, how consistent test scores or other evaluation results are from one
measurement to another.” C.V. Good (1973) has defined reliability as the “worthiness with
which a measuring device measures something; the degree to which a test or other
instrument of evaluation measures consistently whatever it does in fact measure.”

According to Ebel and Frisbie (1991), “the term reliability means the consistency with which
a set of test scores measure whatever they do measure.” Theoretically, reliability is defined
as the ratio of true-score variance to observed-score variance, as written below. According
to Davis (1980), “the degree of relative precision of measurement of a set of test scores is
defined as reliability.”
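
In symbols, if σ²T, σ²e, and σ²X denote true-score, error, and observed-score variance (so that σ²X = σ²T + σ²e under the model X = T + e with independent error), the theoretical definition reads:

reliability = σ²T / σ²X = σ²T / (σ²T + σ²e)

A test with no error variance has a reliability of 1; a test whose scores are all error has a reliability of 0.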

Thus, reliability answers the following questions (Gronlund and Linn, 1995):

• How similar are test scores if the test is administered twice?
• How similar are test scores if two equivalent forms of the test are administered?
• To what extent do the scores of an essay test differ when it is scored by different
teachers?

It is not always possible to obtain perfectly consistent results, because several
factors (physical health, memory, guessing, fatigue, forgetting, and so on) may affect the
results from one measurement to another. These extraneous variables may introduce some
error into our test scores. This error is called measurement error. So, while determining
the reliability of a test, we must take into consideration the amount of error present in the
measurement.

Characteristic 2. Validity:

“In selecting or constructing an evaluation instrument, the most important question
is: To what extent will the results serve the particular uses for which they are intended?
This is the essence of validity.” —GRONLUND

Validity is the most important characteristic of an evaluation program, for unless a
test is valid it serves no useful function. Psychologists, educators, and guidance counselors
use test results for a variety of purposes. Obviously, no purpose can be fulfilled, even
partially, if the tests do not have a sufficiently high degree of validity. Validity means the
truthfulness of a test: the extent to which the test measures what the test maker intends
to measure.

Validity concerns what is measured, whereas reliability concerns how consistently
it is measured. Validity is not a characteristic of the test itself; rather, it refers to the
meaning of the test scores and the ways we use the scores to make decisions. The following
definitions given by experts give a clear picture of validity. According to Gronlund and Linn
(1995), validity refers to the appropriateness of the interpretation made from test scores
and other evaluation results with regard to a particular use.

From the above definitions it is clear that the validity of an evaluation device is the
degree to which it measures what it is intended to measure. Validity is always concerned
with the specific use of the results and the soundness of our proposed interpretation.

A test that is reliable is not necessarily valid. For example, suppose a clock is set
forward ten minutes. If the clock is a good timepiece, the time it tells us will be reliable,
because it gives a consistent result. But it will not be valid as judged by ‘standard time.’
This indicates “the concept that reliability is a necessary but not a sufficient condition for
validity.”
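
The clock analogy can be made concrete with a short sketch: readings from a clock set ten minutes forward correlate perfectly with standard time (perfectly reliable), yet every reading is wrong by ten minutes (not valid). The times below are illustrative values in minutes past noon:

```python
# Reliable but not valid: a clock set ten minutes fast.
from statistics import correlation  # requires Python 3.10+

true_times  = [0, 15, 30, 45, 60]            # "standard time"
clock_reads = [t + 10 for t in true_times]   # clock set ten minutes forward

# Consistency: readings track true time perfectly, so reliability is high
print(correlation(true_times, clock_reads))              # 1.0

# Accuracy: every reading deviates from standard time, so validity is poor
print([c - t for c, t in zip(clock_reads, true_times)])  # [10, 10, 10, 10, 10]
```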

Characteristic 3. Objectivity:

Objectivity is an important characteristic of a good test. It affects both the validity
and reliability of test scores. Objectivity of a measuring instrument means the degree to
which different persons scoring the answer script arrive at the same result. C.V. Good
(1973) defines objectivity in testing as “the extent to which the instrument is free from
personal error (personal bias), that is, subjectivity on the part of the scorer.”

Gronlund and Linn (1995) state that “objectivity of a test refers to the degree to
which equally competent scorers obtain the same results.” So, a test is considered objective
when it makes for the elimination of the scorer’s personal opinion and biased judgement.
In this context, there are two aspects of objectivity which should be kept in mind while
constructing a test:

• Objectivity in scoring.
• Objectivity in interpretation of test items by the test taker.

Objectivity of Scoring:

Objectivity of scoring means that the same person or different persons scoring the
test at any time arrive at the same result without any chance error. A test, to be objective,
must necessarily be so worded that only one correct answer can be given to each item. In
other words, the personal judgement of the individual who scores the answer script should
not be a factor affecting the test scores. The result of a test can then be obtained in a
simple and precise manner when the scoring procedure is objective. The scoring procedure
should be such that there is no doubt as to whether an item is right or wrong, or partly
right or partly wrong.
Objectivity of Test Items:

By item objectivity we mean that the item must call for a definite single answer.
Well-constructed test items should lend themselves to one and only one interpretation by
students who know the material involved. This means the test items should be free from
ambiguity. A given test item should mean the same thing to all students, namely, what the
test maker intends to ask. Sentences with dual meanings and items having more than one
correct answer should not be included in the test, as they make the test subjective.

Characteristic 4. Usability:

Usability is another important characteristic of a measuring instrument, because
the practical considerations of evaluation instruments cannot be neglected. The test must
have practical value from the standpoint of time, economy, and administration. This may
be termed usability.

So, while constructing or selecting a test the following practical aspects must be taken into
account:

1. Ease of Administration:

This means the test should be easy to administer, so that general classroom teachers can
use it. Therefore, simple and clear directions should be given. The test should possess very
few subtests, and the timing of the test should not be difficult to manage.

2. Time required for administration:

An appropriate time limit to take the test should be provided. If, in order to provide ample
time, we make the test shorter, then the reliability of the test will be reduced. Gronlund
and Linn (1995) are of the opinion that “somewhere between 20 and 60 minutes of testing
time for each individual score yielded by a published test is probably a fairly good guide.”

3. Ease of Interpretation and Application:

Another important aspect is the interpretation of test scores and the application of test
results. If the results are misinterpreted, it is harmful; on the other hand, if they are not
applied, the test is useless.

4. Availability of Equivalent Forms:

Equivalent forms of a test help to verify questionable test scores. They also help to
eliminate the factor of memory when retesting pupils on the same domain of learning.
Therefore, equivalent forms of the same test, comparable in content, level of difficulty,
and other characteristics, should be available.

5. Cost of Testing:
A test should be economical from the point of view of preparation, administration, and scoring.

References:

Berk, R. A. (Ed.). (1984). A guide to criterion-referenced test construction.
Baltimore: Johns Hopkins University Press.

Davis, C. (1980). Perkins-Binet Tests of Intelligence for the blind. Watertown, MA:
Perkins School for the Blind.

Ebel, R. L., & Frisbie, D. A. (1991). Essentials of educational measurement.
Englewood Cliffs, NJ: Prentice-Hall.

Gregory, R. J. (2014). Psychological testing: History, principles, and applications
(7th ed.). Pearson Education. ISBN 978-0-205-95925-9.

Gronlund, N. E., & Linn, R. L. (1995). Measurement and assessment in teaching.
New York: Macmillan.

Murphy, K. R., & Davidshofer, C. O. (1988). Psychological testing. Englewood Cliffs,
NJ: Prentice Hall.

Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1989). Scaling, norming, and
equating. In R. L. Linn (Ed.), Educational measurement (3rd ed.). New York:
American Council on Education/Macmillan.

Suba, E. S. (2013). Teaching guide in psychological assessment. Unpublished
manual, Central Luzon State University, Science City of Muñoz, Nueva Ecija.
