8602
Test Administration
Test administration procedures are developed for an exam program in order to
help reduce measurement error and to increase the likelihood of fair, valid, and
reliable assessment. Specifically, appropriate standardized procedures improve
measurement by increasing consistency and test security. Consistent,
standardized administration of the exam allows you to make direct comparisons
between examinees' scores, despite the fact that the examinees may have taken
their tests on different dates, at different sites, and with different proctors.
Furthermore, administration procedures that protect the security of the test help
to maintain the meaning and integrity of the score scale for all examinees.
Importance of Test Administration
Consistency
Standardized tests are designed to be administered under consistent procedures
so that the test-taking experience is as similar as possible across examinees. This
shared experience makes the test fairer and makes examinees' scores more
directly comparable. Typical guidelines for test administration sites state that
every site should be comfortable and should have good lighting, adequate
ventilation, and access for examinees with disabilities. Interruptions and
distractions, such as excessive noise, should be prevented. The time limits that
have been established should be adhered to for all test administrations. The test
should be administered by trained proctors who maintain a positive atmosphere
and who carefully follow the administration procedures that have been developed.
Test Security
Test security consists of methods designed to prevent cheating, as well as to
protect the test items and content from being exposed to future test-takers. Test
administration procedures related to test security may begin as early as the
registration procedures. Many exam programs restrict examinees from registering
for a test unless they meet certain eligibility criteria. When examinees arrive at
the test site, additional provisions for test security include verifying each
examinee's identification and restricting materials (such as photographic or
communication devices) that an examinee is allowed to bring into the test
administration. If the exam program uses multiple, parallel test forms, these may
be distributed in a spiraled fashion, so that neighboring examinees receive
different forms and cannot simply copy from one another. (Form A is distributed
to the first examinee, Form B to the second examinee, Form A to the third
examinee, and so on.) The test proctors
should also remain attentive throughout the test administration to prevent
cheating and other security breaches. When testing is complete, all test-related
materials should be carefully collected from the examinees before they depart.
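The spiraled distribution described above can be sketched in a few lines of Python. This is only an illustration, not a procedure prescribed by any exam program: the function name `assign_forms_spiraled`, the examinee labels, and the assumption that examinees are listed in seating order are all hypothetical.

```python
from itertools import cycle


def assign_forms_spiraled(examinees, forms=("A", "B")):
    """Pair each examinee, in seating order, with the next form in rotation.

    `examinees` is assumed to be listed in the order the booklets are handed
    out; `forms` holds the labels of the parallel forms (hypothetical names).
    """
    if not forms:
        raise ValueError("at least one test form is required")
    # cycle() repeats A, B, A, B, ...; zip() stops when the examinees run out.
    return list(zip(examinees, cycle(forms)))


if __name__ == "__main__":
    seating = ["examinee_1", "examinee_2", "examinee_3", "examinee_4"]
    for examinee, form in assign_forms_spiraled(seating):
        print(examinee, "-> Form", form)
```

With two forms, adjacent examinees in the list always receive different forms. With three or more forms the same rotation applies; pass them as `forms=("A", "B", "C")`.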
Test Reliability
Reliability refers to how dependably or consistently a test measures a
characteristic. If a person takes the test again, will he or she get a similar test
score, or a much different score? A test that yields similar scores for a person who
repeats the test is said to measure a characteristic reliably.
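One common way to put a number on this idea, although the text above does not name it, is to correlate the scores from two administrations of the same test to the same group (often called a test-retest coefficient). The sketch below assumes that approach; the function name `pearson_r` and the scores are made up purely for illustration.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sums of cross-products and squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    var_x = sum((x - mean_x) ** 2 for x in first)
    var_y = sum((y - mean_y) ** 2 for y in second)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical scores for five people on two occasions.
    first_sitting = [72, 85, 90, 64, 78]
    second_sitting = [70, 88, 91, 66, 75]
    print(round(pearson_r(first_sitting, second_sitting), 3))
```

A coefficient close to 1 means people keep roughly the same relative standing across the two sittings, which is what the paragraph above calls measuring a characteristic reliably.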
How do we account for an individual who does not get exactly the same test score
every time he or she takes the test? Some possible reasons are the following:
Test taker's temporary psychological or physical state. Test
performance can be influenced by a person's psychological or physical
state at the time of testing. For example, differing levels of anxiety, fatigue,
or motivation may affect the test taker's results.
Environmental factors. Differences in the testing environment, such as
room temperature, lighting, noise, or even the test administrator, can
influence an individual's test performance.
Test form. Many tests have more than one version or form. Items differ on
each form, but each form is supposed to measure the same thing. Different
forms of a test are known as parallel forms or alternate forms. These
forms are designed to have similar measurement characteristics, but they
contain different items. Because the forms are not exactly the same, a test
taker might do better on one form than on another.
Multiple raters. In certain tests, scoring is determined by a rater's
judgments of the test taker's performance or responses. Differences in
training, experience, and frame of reference among raters can produce
different test scores for the test taker.
These factors are sources of chance or random measurement error in the
assessment process. If there were no random errors of measurement, the
individual would get the same test score, the individual's "true" score, each time.
The degree to which test scores are unaffected by measurement errors is an
indication of the reliability of the test.
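The relationship between the observed score, the "true" score, and random error is usually written in the notation of classical test theory. The symbols below do not appear in the original text; they are the standard textbook formulation, given here as a sketch and assuming that the error is random, averages to zero, and is uncorrelated with the true score.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Here X is the observed score, T is the true score, E is the random measurement error, and \rho_{XX'} is the reliability coefficient. If there were no random error (\sigma_E^2 = 0), every observed score would equal the true score and the reliability would be 1; as the error variance grows relative to the observed-score variance, the reliability falls toward 0.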
Reliable assessment tools produce dependable, repeatable, and consistent
information about people. In order to meaningfully interpret test scores and make
useful employment or career-related decisions, you need reliable tools. This
brings us to the next principle of assessment.