Assessment of Learning PPT 201012014906


ASSESSMENT OF LEARNING
It focuses on the development and utilization of
assessment tools to improve the teaching-learning process.
MEASUREMENT: refers to the quantitative aspect of evaluation.
EVALUATION: refers to the qualitative aspect, the judgment made about what was measured.
TEST: consists of questions, exercises, or
other devices for measuring the outcomes of
learning.
CLASSIFICATION OF
TESTS
1. According to manner of response
a. oral
b. written
2. According to method of preparation
a. subjective
b. objective
3. According to nature of answer
a. personality test
b. intelligence test
c. aptitude test
d. achievement or summative test
e. sociometric test
f. diagnostic or formative test
g. trade or vocational test
• Objective test: a test whose items have definite answers
and is therefore not subject to personal bias.
• Teacher-made test: constructed by the teacher based on the
content of the different subjects taught.
• Diagnostic test: given to measure a student's strengths
and weaknesses.
• Formative test: given to monitor students' attainment of the
instructional objectives.
• Summative test: given at the conclusion of instruction.
• Standardized test: a test for which the contents have been selected
and for which norms or standards have been established.
• Criterion-referenced measure: a measuring device
with a predetermined level of success or
standard on the part of the test takers.
• Norm-referenced measure: a test that is scored on the
basis of the norm or standard level of accomplishment
of the whole group taking the test.
CRITERIA OF A GOOD
EXAMINATION
VALIDITY:
Refers to the degree to which the test measures what it is intended to
measure. It is the usefulness of the test for a given purpose.
RELIABILITY:
Pertains to the degree to which a test measures consistently whatever it measures.
The test of reliability is the consistency of the results when the test is administered to
different groups of individuals with similar characteristics in different places at
different times.
OBJECTIVITY:
Is the degree to which personal bias is eliminated in the
scoring of the answers.
When we refer to the quality of
measurement, we essentially mean the amount of
information contained in a score generated by the
measurement.
Measurements may differ in the amount of information the
numbers contain. The types of measurement are:
Nominal Measurement
Nominal scales are the least sophisticated; they merely classify objects
by assigning numbers to them. These numbers are arbitrary and
imply no quantification, but the categories must be mutually exclusive and
exhaustive.
Ordinal Measurement
Ordinal scales classify, but they also assign rank order. An example is ranking
individuals in a class according to their test scores. Their scores are ordered
from highest to lowest.
Interval Measurement
In order to be able to add and subtract scores, we use interval scales.
This measurement contains the nominal and ordinal properties and is
also characterized by equal units between score points.
Ratio Measurement
The most sophisticated type of measurement includes all the
preceding properties, but in a ratio scale the zero point is not
arbitrary; a score of zero indicates the absence of what is being
measured.
Norm-Referenced and Criterion-Referenced Measurement
Norm-referenced measurement has long been used in education, and
norm-referenced tests continue to comprise a substantial portion of the
measurement in today's schools. Criterion-referenced measurement
emphasizes that the type of measurement or testing depends on how
the scores are interpreted. Both types can be used effectively by
the teacher.

Norm-referenced interpretation
It stems from the desire to differentiate among individuals or to
discriminate among the individuals of some defined group on
whatever is being measured.
Criterion-referenced interpretation
It means referencing an individual's performance to some criterion that is a
defined performance level. A second meaning of this term involves the idea of
a defined behavioral domain, that is, a defined body of learner behaviors.

Distinction
Norm-referenced tests are usually more general and comprehensive and
cover a large domain of content and learning tasks.
Criterion-referenced tests focus on a specific group of learner behaviors.

STAGES IN TEST
CONSTRUCTION
I. Planning the test
A. Determining the objectives
B. Preparing the table of specifications
C. Selecting the appropriate item format
D. Writing the test items
E. Editing the test items
II. Trying out the test
A. Administering the first tryout, then item analysis
B. Administering the second tryout, then item analysis
C. Preparing the final form of the test
III. Establishing test validity
IV. Establishing test reliability
V. Interpreting the test scores
MAJOR CONSIDERATIONS IN
TEST CONSTRUCTION

TYPE OF TEST
If it is a take-home test rather than an in-class test, how do you make sure
that students work independently, have equal access to sources and
resources, and spend a sufficient but not enormous amount of time on the task?
The test plan must cover a wide array of such issues. Anticipating these potential
problems allows the test constructor to develop positions or policies that are
consistent with his or her testing philosophy.
TEST LENGTH
A major decision in test planning is how many items should be
included on the test. There should be enough to cover the content
adequately, but the length of the class period or the attention span or
fatigue limits of the students usually restrict the test length.
ITEM FORMATS
Determining what kind of items to include on the test is a
major decision. Once the planning decisions are made, the item
writing begins. This task is often the one most feared by
beginning test constructors. However, the procedures are
more common sense than formal rules.
POINTS TO BE CONSIDERED IN
PREPARING A TEST

1. Are the instructional objectives clearly defined?
2. What knowledge, skills, and attitudes do you want to measure?
3. Did you prepare a table of specifications?
4. Did you formulate well-defined and clear test items?
5. Did you employ correct English in writing the items?
6. Did you avoid giving clues to the correct answer?
7. Did you test the important ideas rather than the trivial?
8. Did you adapt the test's difficulty to your students' ability?
9. Did you avoid using textbook jargon?
10. Did you cast the items in positive form?
11. Did you prepare a scoring key?
12. Does each item have a single correct answer?
13. Did you review your items?
DIFFERENT TYPES OF
TEST
1. The test items should be selected very carefully. Only important facts should be
included.
2. The test should have extensive sampling of items.
3. The test items should be carefully expressed in simple, clear, definite, and
meaningful sentences.
4. There should be only one possible correct response for each test item.
5. Each item should be independent. Leading clues to other items should be
avoided.
6. Lifting sentences from books should not be done, to encourage thinking and
understanding.
7. The first person personal pronouns I and we should not be used.
8. Various types of test items should be made to avoid monotony.
9. The majority of test items should be of moderate difficulty. Few difficult and few easy items should be
included.
10. The test items should be arranged in ascending order of difficulty. Easy items should be at the
beginning to encourage the examinee to pursue the test, with the most difficult items at the end.
11. Clear, concise, and complete directions should precede the test. Sample test items may be provided for expected
responses.
12. Items which can be answered by previous experience alone, without knowledge of the subject matter,
should not be included.
13. Catchy words should not be used in the test items.
14. Test items must be based upon the objectives of the course and upon the course content.
15. The test should measure the degree of achievement or determine the difficulties of the learners.
16. The test should emphasize the ability to apply and use facts as well as knowledge of facts.
17. The test should be of such length that it can be completed within the time allotted by all or nearly all of the
pupils. The teacher should take the test herself to determine its approximate time allotment.
18. Rules governing good language expression, grammar, spelling, punctuation, and
capitalization should be observed in all items.
19. Information on how scoring will be done should be provided.
20. Scoring keys for correcting and scoring tests should be provided.
POINTERS TO BE OBSERVED IN
CONSTRUCTING AND SCORING THE
DIFFERENT KINDS OF TEST
A. RECALL TYPES
1. Simple recall type
a. This type consists of questions calling for a single word or expression as an answer.
b. Items usually begin with who, where, when, and what.
c. Score is the number of correct answers.
2. Completion type
a. Only important words or phrases should be omitted to avoid confusion.
b. Blanks should be of equal length.
c. The blank, as much as possible, is placed near or at the end of the sentence.
d. Articles a, an, and the should not be provided before the omitted word or phrase, to avoid clues for
answers.
e. Score is the number of correct answers.
3. Enumeration type
a. The exact number of expected answers should be stated.
b. Blanks should be of equal length.
c. Score is the number of correct answers.
4. Identification type
a. The items should make an examinee think of a word, number, or group of
words that would complete the statement or answer the
problem.
b. Score is the number of correct answers.
B. RECOGNITION TYPES
1. True-false or alternate-response type
a. Declarative sentences should be used.
b. The number of "true" and "false" items should be more or less equal.
c. The truth or falsity of the sentence should not be too evident.
d. Negative statements should be avoided.
e. The "modified true-false" is preferable to the "plain true-false".
f. In arranging the items, avoid the regular recurrence of "true" and "false"
statements.
g. Avoid using specific determiners like all, always, never, none,
nothing, most, often, some, etc., and avoid weak statements such as may,
sometimes, as a rule, in general, etc.
h. Minimize the use of qualitative terms like few, great, many, more, etc.
i. Avoid leading clues to answers in all items.
j. Score is the number of correct answers in "modified true-false" and
right answers minus wrong answers in "plain true-false".
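The two scoring rules for true-false items can be sketched in a few lines of Python. This is only an illustration: the function name `score_true_false`, the sample key, and the choice to count an unanswered item (`None`) as neither right nor wrong are assumptions, not part of the original pointers.

```python
def score_true_false(responses, key, plain=True):
    """Score a true-false test.

    plain=True  -> right answers minus wrong answers ("plain true-false")
    plain=False -> number of correct answers ("modified true-false")
    Assumption: unanswered items (None) count as neither right nor wrong.
    """
    right = sum(1 for r, k in zip(responses, key) if r is not None and r == k)
    wrong = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return right - wrong if plain else right


# Hypothetical answer key and one examinee's responses.
key = [True, False, True, True, False]
responses = [True, True, True, None, False]

print(score_true_false(responses, key, plain=True))   # 3 right - 1 wrong = 2
print(score_true_false(responses, key, plain=False))  # 3 correct answers
```

Subtracting wrong answers in the plain form is meant to offset the 50 percent chance of guessing an item correctly.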
2. Yes-No type
a. The items should be interrogative sentences.
b. The same rules as in "true or false" are applied.
3. Multiple-response type
a. There should be three to five choices. The number of choices
should be the same in every item.
b. The choices should be numbered or lettered so that only the number
or letter can be written on the space provided.
c. If the choices are figures, they should be arranged in ascending order.
d. Avoid the use of "a" or "an" as the last word prior to the listing of the
responses.
e. Random occurrence of responses should be employed.
f. The choices, as much as possible, should be at the end of the
statement.
g. The choices should be related in some way or should belong to the
same class.
h. Avoid the use of "none of these" as one of the choices.
i. Score is the number of correct answers.
4. Best-answer type
a. There should be three to five choices, all of which are right
but vary in their degree of merit, importance, or desirability.
b. The other rules for multiple-response items are applied here.
c. Score is the number of correct answers.
5. Matching type
a. There should be two columns. Under column "A" are the stimuli, which should be
longer and more descriptive than the responses under column "B". The
response may be a word, a phrase, a number, or a formula.
b. The stimuli under column "A" should be numbered and the responses
under column "B" should be lettered. Answers will be indicated by
letters only on the lines provided in column "A".
c. The number of pairs usually should not exceed twenty items.
d. The number of responses in column "B" should be two or more
than the number of items in column "A" to avoid guessing.
e. Only one correct answer for each item should be
possible.
f. Matching sets should be neither too long nor too short.
g. All items should be on the same page.
h. Score is the number of correct answers.
C. ESSAY TYPE EXAMINATION
1. Comparison of two things
2. Explanation of the use or meaning of a statement or
passage
3. Analysis
4. Decision for or against
5. Discussion
How to construct essay examinations
1. Determine the objectives or essentials for each question to be evaluated.
2. Phrase questions in simple, clear, and concise language.
3. Suit the length of the questions to the time available for answering the essay
examination. The teacher should try to answer the test herself.
4. Scoring:
a. Have a model answer in advance.
b. Indicate the number of points for each question.
c. Score a point for each essential.
ADVANTAGES AND DISADVANTAGES OF
THE OBJECTIVE TYPE OF TEST

Advantages
a. The objective test is free from personal bias in scoring.
b. It is easy to score. With a scoring key, the test can be corrected by different
individuals without affecting the accuracy of the grades given.
c. It has high validity because it is comprehensive, with wide sampling of
essentials.
d. It is less time-consuming since many items can be answered in a given time.
e. It is fair to students since the slow writers can accomplish the test as fast as the fast
writers.
Disadvantages
a. It is difficult to construct and requires more time to prepare.
b. It does not afford the students the opportunity for training in self-expression and
thought organization.
c. It cannot be used to test ability in theme writing or
journalistic writing.
ADVANTAGES AND DISADVANTAGES OF
THE ESSAY TYPE OF TEST

Advantages
a. The essay examination can be used in practically all subjects of the school
curriculum.
b. It trains students for thought organization and self-expression.
c. It affords students opportunities to express their originality and independence
of thinking.
d. Only the essay test can be used in some subjects like composition writing and
journalistic writing, which cannot be tested by the objective type of test.
e. The essay examination measures higher mental abilities such as comparison,
interpretation, defense of opinion, and decision.
f. The essay test is easily prepared.
g. It is inexpensive.
Disadvantages
a. The limited sampling of items makes the test an unreliable measure of
achievements or abilities.
b. Questions usually are not well prepared.
c. Scoring is highly subjective due to the influence of the corrector's
personal judgment.
d. Grading of the essay test is an inaccurate measure of pupils'
achievements due to the subjectivity of scoring.
STATISTICAL MEASURES OR TOOLS USED
IN INTERPRETING NUMERICAL DATA

FREQUENCY DISTRIBUTION
A simple, common-sense technique for describing a
set of test scores is the use of a frequency
distribution. It is merely a listing of the possible score
values and the number of persons who achieved each
score.
STEPS THAT ARE INVOLVED IN CREATING
THE FREQUENCY DISTRIBUTION

First, list the possible score values in rank
order, from highest to lowest. Then, in a second column,
indicate the frequency or number of persons who
received each score.
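The two steps can be sketched in Python as follows. The scores used here are hypothetical and are not the data of Table 2, and the helper name `frequency_distribution` is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical scores for illustration only (not the data of Table 2).
scores = [50, 48, 48, 47, 45, 45, 45, 42, 40, 40]


def frequency_distribution(scores):
    """Return (score, frequency) pairs for every possible score value,
    listed from the highest to the lowest observed score.
    Score values that nobody obtained appear with a frequency of 0."""
    counts = Counter(scores)
    highest, lowest = max(scores), min(scores)
    return [(value, counts[value]) for value in range(highest, lowest - 1, -1)]


print("Score  Frequency")
for value, freq in frequency_distribution(scores):
    print(f"{value:>5}  {freq:>9}")
```

Listing every possible value, including those with zero frequency, is what makes the table long when the range of scores is wide.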
When there is a wide range of scores in a frequency distribution, the distribution can be quite
long, with a lot of zeros in the column of frequencies. Such a frequency distribution can make
interpretation difficult and confusing. A grouped frequency distribution would be more
appropriate in this kind of situation: groups of score values are listed rather than each separate
possible score value.
If we were to change the frequency distribution in Table 2 into a grouped frequency
distribution, we might choose intervals such as 48-50, 45-47, and so forth. The frequency
corresponding to the interval 48-50 would be 9 (1 + 3 + 5). The choice of the width of the interval is
arbitrary, but it must be the same for all intervals. In addition, it is a good idea to have
an odd-numbered interval width so that the midpoint of the interval is a whole number. This
strategy will simplify subsequent graphs and descriptions of the data. The grouped frequency
distribution is presented in Table 3.
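A minimal sketch of the grouping step, assuming an interval width of 3 and a highest interval of 48-50 as in the text. The counts for scores 50, 49, and 48 reuse the 1, 3, and 5 quoted above; the remaining scores and the function name `grouped_frequency` are hypothetical.

```python
from collections import Counter

# Scores 50, 49, 48 follow the counts 1, 3, 5 quoted in the text;
# the lower scores are hypothetical.
scores = [50, 49, 49, 49, 48, 48, 48, 48, 48, 46, 45, 43]


def grouped_frequency(scores, top, width=3):
    """Group scores into equal-width intervals, highest interval first.

    `top` is the upper limit of the highest interval and must be at least
    as large as the highest score. Returns (lower, upper, midpoint, frequency).
    """
    counts = Counter()
    for s in scores:
        # Index 0 is the highest interval, 1 the next one down, and so on.
        counts[(top - s) // width] += 1
    n_intervals = (top - min(scores)) // width + 1
    rows = []
    for i in range(n_intervals):
        upper = top - i * width
        lower = upper - width + 1
        midpoint = (lower + upper) // 2  # a whole number when width is odd
        rows.append((lower, upper, midpoint, counts[i]))
    return rows


for lower, upper, midpoint, freq in grouped_frequency(scores, top=50, width=3):
    print(f"{lower}-{upper}  midpoint {midpoint}  frequency {freq}")
```

With an odd width such as 3, each midpoint (49, 46, 43) is a whole number, which is the reason the text recommends odd-numbered widths.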
Measures of Central Tendency

Frequency distributions are helpful for indicating the shape of
a distribution of scores, but we need more information than shape to describe
the distribution adequately. We need to know where on the scale of measurement a
distribution is located and how the scores are dispersed in the distribution. For the
former, we compute measures of central tendency, and for the latter we compute
measures of dispersion. Measures of central tendency are points on the scale of
measurement, and they are representative of where the scores tend to cluster. There are
three commonly used measures of central tendency: the mean, the median, and the mode,
but the mean is by far the most widely used.
The Mean
The mean of a set of scores is the arithmetic average. It is
found by summing the scores and dividing the sum by the
number of scores. The mean is the most commonly used
measure of central tendency because it is easily understood and
is based on all the scores in the set; hence, it summarizes a
lot of information. The formula of the mean is $\bar{X} = \frac{\sum X}{N}$,
where $\sum X$ is the sum of the scores and $N$ is the number of scores.
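As a small illustration of the computation, the following Python sketch sums a set of scores and divides by their number. The scores are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def mean(scores):
    """Arithmetic average: the sum of the scores divided by the number of scores."""
    if not scores:
        raise ValueError("the mean of an empty set of scores is undefined")
    return sum(scores) / len(scores)


# Hypothetical scores: their sum is 308 and there are 7 of them.
scores = [50, 48, 46, 44, 42, 40, 38]
print(mean(scores))  # 308 / 7 = 44.0
```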
The Median
Another measure of central tendency is the median, which is the point that divides the distribution in
half; that is, half of the scores fall above the median and half of the scores fall below the
median.
When there are only a few scores, the median can often be found by inspection. If there is
an odd number of scores, the middle score is the median. When there is an even number of
scores, the median is halfway between the two middle scores. However, when there are tied
scores in the middle of the distribution, or when the scores are in a frequency distribution, the
median may not be so obvious.
Consider again the frequency distribution in Table 2. There were 25 scores in the distribution,
so the middle score should be the median. A straightforward way to find this median is to
augment the frequency distribution with a column of cumulative frequencies. Cumulative
frequencies indicate the number of scores at or below each score. Table 4 indicates the
cumulative frequencies for the data in Table 2.
For example, 7 persons scored at or below a score of 40, and 21
persons scored at or below a score of 48.
To find the median, we need to locate the middle score in the
cumulative frequency column, because this score is the
median. Since there are 25 scores in the distribution,
the middle one is the 13th, a score of 46. Thus, 46 is the median of
this distribution; half of the people scored above 46 and half
scored below.
When there are ties in the middle of the distribution,
there may be a need to interpolate between scores to get the exact
median.
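The cumulative-frequency approach can be sketched as below. The frequencies are hypothetical (15 scores, not the 25 of Table 2), the function name is an assumption, and the sketch deliberately skips interpolation: it simply returns the score that holds the middle position when the number of scores is odd.

```python
def median_from_frequencies(freq):
    """Find the median from a frequency distribution.

    `freq` maps each score to its frequency. The function walks up the
    cumulative frequencies and returns the score at the middle position.
    Simplification: only an odd number of scores is handled, and no
    interpolation is done for ties.
    """
    n = sum(freq.values())
    if n % 2 == 0:
        raise ValueError("this sketch handles an odd number of scores only")
    middle = (n + 1) // 2          # e.g. the 13th score when n = 25
    cumulative = 0
    for score in sorted(freq):     # lowest score first
        cumulative += freq[score]  # number of scores at or below `score`
        if cumulative >= middle:
            return score


# Hypothetical distribution with 15 scores; the 8th score is the middle one.
freq = {40: 2, 43: 3, 46: 4, 48: 5, 50: 1}
print(median_from_frequencies(freq))  # cumulative 2, 5, 9 -> 46
```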
The Mode
The measure of central tendency that is the easiest to find is
the mode. The mode is the most frequently occurring score in the
distribution. The mode of the scores in Table 1 is
48. Five persons had scores of 48, and no other score
occurred as often.
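Finding the mode amounts to counting how often each score occurs and keeping the most frequent one. The sketch below uses hypothetical scores and returns a list, because a distribution can have more than one score tied for the highest frequency; the name `modes` is an assumption.

```python
from collections import Counter


def modes(scores):
    """Return the most frequently occurring score(s), lowest first."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(s for s, c in counts.items() if c == highest)


# Hypothetical scores: 48 occurs three times, more often than any other score.
scores = [50, 49, 48, 48, 48, 46, 46, 44]
print(modes(scores))  # [48]
```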
Each of the three measures of central tendency (the mean, the median, and the
mode) provides a legitimate definition of "average" performance on this test.
However, each provides different information: half of the
people scored at or below 46, and more people received 48 than any other score.
When a distribution has a small number of very extreme scores, though, the
median may be a better definition of central tendency. The mode provides the least
information and is used infrequently as an "average". The mode can be used with
nominal scale data, just as an indicator of the most frequently appearing category. The
mean, the median, and the mode all describe central tendency:
The mean is the arithmetic average.
The median divides the distribution in half.
The mode is the most frequent score.
Measures of Dispersion
Measures of central tendency are useful for summarizing average
performance, but they tell us nothing about how the scores are distributed or
spread out around the averages. Two sets of test scores may have equal measures
of central tendency, but they may differ in other ways. One of the distributions
may have the scores tightly clustered around the average, and the other
distribution may have scores that are widely separated. As you may have
anticipated, there are descriptive statistics that measure dispersion, which are also
called measures of variability. These measures indicate how spread out the
scores tend to be.
The Range
The range indicates the difference between the highest and lowest scores in a
distribution. It is simple to calculate, but it provides limited information. We
subtract the lowest from the highest score and add 1 so that we include both
scores in the spread between them. For the scores of Table 2 the range is 50 -
34 + 1 = 17.
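The computation of the range, with the extra 1 that makes it inclusive of both end scores, can be sketched as follows. Only the highest score (50) and the lowest score (34) are taken from the text; the function name is an assumption.

```python
def inclusive_range(scores):
    """Highest score minus lowest score, plus 1 so both end scores are included."""
    return max(scores) - min(scores) + 1


# Only the two extreme scores matter; 50 and 34 are the values quoted in the text.
print(inclusive_range([50, 34]))  # 50 - 34 + 1 = 17
```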
A problem with using the range is that only the two most extreme scores are
used in the computation. There is no indication of the spread of scores between the
highest and lowest. Measures of dispersion that take into consideration every score
in the distribution are the variance and the standard deviation. The standard
deviation is used a great deal in interpreting scores from standardized tests.
The Variance
• The variance measures how widely the scores in the
distribution are spread about the mean. In other
words, the variance is the average squared
difference between the scores and the mean.
• The computation of the variance for the scores of Table
1 is illustrated in Table 5. The data for
students K through V are omitted to save space,
but these values are included in the column totals
and in the computation.
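The definition above, the average squared difference between the scores and the mean, can be sketched as follows. The scores are hypothetical (they are not the data of Table 5), and dividing by N rather than N - 1 is an assumption based on the word "average" in the definition.

```python
def variance(scores):
    """Average squared difference between the scores and their mean.

    Assumption: the sum of squared differences is divided by N, the
    number of scores, following the "average" wording of the definition.
    """
    n = len(scores)
    m = sum(scores) / n
    return sum((x - m) ** 2 for x in scores) / n


# Hypothetical scores with a mean of 44.
# Squared differences: 36, 16, 4, 0, 4, 16, 36, which sum to 112.
scores = [50, 48, 46, 44, 42, 40, 38]
print(variance(scores))  # 112 / 7 = 16.0
```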
The Standard Deviation

The standard deviation also indicates how spread out the scores are, but it is expressed in
the same units as the original scores. The standard deviation is computed by finding the square root
of the variance:
$S = \sqrt{S^2}$
For the data in Table 1, the variance is 22.8. The standard deviation is $\sqrt{22.8}$, or
4.77. The scores of most norm groups have the shape of a normal distribution, a
symmetrical bell-shaped distribution with which most people are familiar. With the normal
distribution, about 95% of the scores are within 2 standard deviations of the mean.
Even when the scores are not normally distributed, most of the scores will be within two standard
deviations of the mean. In the example, the mean minus two standard deviations is 34.46, and
the mean plus two standard deviations is 53.54. Therefore only one score is outside of this interval;
the lowest score, 34, is slightly more than two standard deviations from the mean.
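The arithmetic in this example can be checked with a short sketch. The variance of 22.8 is taken from the text; the mean of 44 is not stated directly and is inferred here from the midpoint of the quoted interval (34.46 to 53.54), so treat it as an assumption. The standard deviation is rounded to two decimals before the interval is computed, which is how the quoted limits appear to have been obtained.

```python
import math


def two_sd_interval(mean, variance):
    """Return (sd, lower, upper), where the limits are mean -/+ 2 standard deviations.

    The standard deviation is rounded to two decimals first, matching the
    rounded value 4.77 used in the worked example.
    """
    sd = round(math.sqrt(variance), 2)
    lower = round(mean - 2 * sd, 2)
    upper = round(mean + 2 * sd, 2)
    return sd, lower, upper


# Variance from the text; mean inferred from the quoted interval (assumption).
sd, lower, upper = two_sd_interval(mean=44, variance=22.8)
print(sd, lower, upper)  # 4.77 34.46 53.54
print(34 < lower)        # True: the lowest score, 34, falls outside the interval
```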
Graphing Distributions
A graph of a distribution of test scores is often better understood than is the frequency
distribution or a mere table of numbers. The general pattern of scores, as well as any unique
characteristics of the distribution, can be seen easily in simple graphs. There are several
kinds of graphs that can be used, but a simple bar graph, or histogram, is as useful as any.
The general shape of the distribution is clear from the graph. Most of the scores in this
distribution are high, at the upper end of the graph. Such a shape is quite common for the
scores of classroom tests. That is, test scores will be toward the right end of the
measurement scale.
A normal distribution has most of the test scores in the middle of the distribution and
progressively fewer scores toward the extremes. The scores of norm groups are seldom graphed,
but they could be if we were concerned about seeing the specific shape of the distribution of
scores. Usually, we know or assume that the scores are normally distributed.
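A simple histogram can be sketched even without a plotting library by printing one row of asterisks per score value. The scores below are hypothetical and chosen to pile up at the high end, as the text describes for classroom tests; the function name `text_histogram` is an assumption.

```python
from collections import Counter


def text_histogram(scores):
    """Build a sideways histogram: one line per score value, one '*' per person."""
    counts = Counter(scores)
    lines = []
    for value in range(min(scores), max(scores) + 1):
        lines.append(f"{value:>3} | " + "*" * counts[value])
    return "\n".join(lines)


# Hypothetical classroom scores, with most scores at the upper end.
scores = [40, 43, 45, 46, 46, 47, 47, 47, 48, 48, 48, 48, 49, 49, 50]
print(text_histogram(scores))
```

Reading the rows from top to bottom corresponds to moving from the low end to the high end of the measurement scale, so the longest rows near the bottom show scores concentrated at the upper end.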
