Assessment 1 and 2 Review Material
Assessment refers to the process of gathering, describing, or quantifying information about student performance. It includes paper-and-pencil tests, extended responses, and performance assessments; the latter are usually referred to as “authentic assessment” tasks.
Measurement is a process of obtaining a numerical description of the degree to which an individual possesses
a particular characteristic. Measurement answers the question “How much?”
Evaluation refers to the process of examining the performance of students. It also determines whether or not
the student has met the lesson’s instructional objectives.
A test is an instrument or systematic procedure designed to measure the quality, ability, skill, or knowledge of
students by giving a set of questions in a uniform manner. Since a test is a form of assessment, it also answers
the question “How does an individual student perform?”
Testing is a method used to measure the level of achievement or performance of the learners. It also refers to
the administration, scoring and interpretation of an instrument designed to elicit information about
performance in a sample of a particular area of behavior.
TYPES OF MEASUREMENT
There are two ways of interpreting student performance in relation to classroom instruction: norm-referenced
tests and criterion-referenced tests.
A Norm-referenced Test is a test designed to measure the performance of a student compared with other
students. Each individual is compared with other examinees and assigned a score, usually expressed as a
percentile, a grade-equivalent score, or a stanine. The achievement of students is reported for broad skill areas,
although some norm-referenced tests do report student achievement for individual skills.
The purpose is to rank students with respect to the achievement of others in broad areas of knowledge and
to discriminate between high and low achievers.
A Criterion-referenced Test is a test designed to measure the performance of students with respect to some
particular criterion or standard. Each individual is compared with a predetermined set of standards for
acceptable achievement; the performance of the other examinees is irrelevant. A student’s score is usually
expressed as a percentage, and student achievement is reported for individual skills.
The purpose is to determine whether each student has achieved specific skills or concepts, and to find out how
much students know before instruction begins and after it has finished.
Common characteristics of Norm-referenced Tests and Criterion-referenced Tests (Linn et al., 1995):
1. Both require a specification of the achievement domain to be measured.
2. Both require a relevant and representative sample of test items.
3. Both use the same types of test items.
4. Both use the same rules for item writing (except for item difficulty).
5. Both are judged by the same qualities of goodness (validity and reliability).
6. Both are useful in educational assessment.
Differences between Norm-referenced Tests and Criterion-referenced Tests:
Norm-Referenced Tests:
1. Typically covers a large domain of learning tasks, with just a few items measuring each specific task.
2. Emphasizes discrimination among individuals in terms of their relative level of learning.
3. Favors items of average difficulty and typically omits very easy and very hard items.
4. Interpretation requires a clearly defined group.
Criterion-Referenced Tests:
1. Typically focuses on a delimited domain of learning tasks, with a relatively large number of items measuring each specific task.
2. Emphasizes description of what learning tasks individuals can and cannot perform.
3. Matches item difficulty to learning tasks, without altering item difficulty or omitting easy and hard items.
4. Interpretation requires a clearly defined and delimited achievement domain.
TYPES OF ASSESSMENT
There are four types of assessment in terms of their functional role in relation to classroom instruction. These
are the placement assessment, diagnostic assessment, formative assessment and summative assessment.
A. Placement Assessment is concerned with the entry performance of students. The purpose of placement
evaluation is to determine the prerequisite skills, the degree of mastery of the course objectives, and the
best mode of learning.
B. Diagnostic Assessment is a type of assessment given before the instruction. It aims to identify the
strengths and weaknesses of the students regarding the topics to be discussed. The purpose of
diagnostic assessment is:
1. to determine the level of competence of the students;
2. to identify the students who already have knowledge about the lessons; and
3. to determine the causes of learning problems and formulate a plan for remedial action.
C. Formative Assessment is a type of assessment used to monitor the learning progress of the students
during and after instruction. Purposes of formative assessment:
1. to provide immediate feedback to both student and teacher regarding the successes and failures
of learning;
2. to identify the learning errors that are in need of correction; and
3. to provide information to the teacher for modifying instruction and for improving learning and
instruction.
D. Summative Assessment is a type of assessment usually given at the end of a course or unit. Purposes of
summative assessment:
1. to determine the extent to which the instructional objectives have been met;
2. to certify student mastery of the intended outcome and used for assigning grades;
3. to provide information for judging appropriateness of the instructional objectives; and
4. to determine the effectiveness of instruction.
MODES OF ASSESSMENT
A. Traditional Assessment
1. Assessment in which students typically select an answer or recall information to complete the
assessment. Tests may be standardized or teacher-made, and may be multiple-choice,
fill-in-the-blanks, true-false, or matching type.
2. Indirect measures of assessment since the test items are designed to represent competence by
extracting knowledge and skills from their real life context.
3. Items on standardized instruments tend to test only the domain of knowledge and skill to avoid
ambiguity to the test takers.
4. One-time measures that rely on a single correct answer to each item. There is limited potential for
traditional tests to measure higher-order thinking skills.
B. Performance Assessment
1. Assessment in which students are asked to perform real-world tasks that demonstrate meaningful
application of essential knowledge and skills.
2. Direct measures of student performance because tasks are designed to incorporate contexts,
problems, and solution strategies that students would use in real life.
3. Designed as ill-structured challenges, since the goal is to help students prepare for the complex
ambiguities of life.
4. Focus on processes and rationales. There is no single correct answer; instead students are led to
craft polished, thorough and justifiable responses, performances and products.
5. Involve long-range projects, exhibits, and performances that are linked to the curriculum.
6. The teacher is an important collaborator in creating tasks, as well as in developing guidelines for
scoring and interpretation.
C. Portfolio Assessment
1. A portfolio is a collection of a student’s work specifically selected to tell a particular story about the
student.
2. A portfolio is not a pile of student work that accumulates over a semester or a year.
3. A portfolio contains a purposefully selected subset of student work.
4. It measures growth and development of students.
Factors to Consider when Constructing Good Test Items
A. Validity is the degree to which the test measures what it is intended to measure. It is the usefulness of
the test for a given purpose. A valid test is always reliable.
B. Reliability refers to the consistency of scores obtained by the same person when retested using the
same instrument or one that is parallel to it.
C. Administrability refers to the uniform administration of the test to all students so that the scores obtained
will not vary due to factors other than differences in the students’ knowledge and skills. There should
be clear instructions for the students, the proctors, and even the scorer who will check the test.
D. Scorability. The test should be easy to score: directions for scoring are clear, and an answer sheet
and answer key are provided.
E. Appropriateness. The test item that the teacher constructs must assess the exact performances called
for in the learning objectives. The test item should require the same performance of the student as
specified in learning objectives.
F. Adequacy. The test should contain a wide sampling of items to determine the educational outcomes or
abilities, so that the resulting scores are representative of the total performance in the areas
measured.
G. Fairness. The test should not be biased to the examinees. It should not be offensive to any examinee
subgroups. A test can only be good if it is also fair to all test takers.
H. Objectivity represents the agreement of two or more raters or test administrators concerning the
score of a student. If two raters who assess the same student on the same test cannot agree on the
score, the test lacks objectivity and the score of neither judge is valid; thus, lack of objectivity reduces
test validity in the same way that lack of reliability influences validity.
TABLE OF SPECIFICATION
A Table of Specification is a device for describing test items in terms of the content and process
dimensions, that is, what a student is expected to know and what he or she is expected to do with that
knowledge. Each item is described by a combination of content and process in the table of specification.
Sample one-way table of specification in Linear Function:

Content                               Number of Class Sessions   Number of Items   Test Item Distribution
1. Definition of Linear Function                 2                      4                  1-4
2. Slope of a Line                               2                      4                  5-8
3. Graph of Linear Function                      2                      4                  9-12
4. Equation of Linear Function                   2                      4                  13-16
5. Standard Forms of a Line                      3                      6                  17-22
6. Parallel and Perpendicular Lines              4                      8                  23-30
7. Applications of Linear Functions              5                     10                  31-40
TOTAL                                           20                     40                  40
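The arithmetic behind the table is proportional allocation: each topic’s share of the 40 items mirrors its share of the 20 class sessions, so a 4-session topic earns 8 items. A minimal Python sketch of that computation, with the topics and session counts taken from the table above (the rounding rule is an assumption):

```python
# Allocate test items to topics in proportion to class sessions,
# as in the one-way table of specification above.
topics = {
    "Definition of Linear Function": 2,
    "Slope of a Line": 2,
    "Graph of Linear Function": 2,
    "Equation of Linear Function": 2,
    "Standard Forms of a Line": 3,
    "Parallel and Perpendicular Lines": 4,
    "Applications of Linear Functions": 5,
}
total_items = 40
total_sessions = sum(topics.values())  # 20

start = 1
for topic, sessions in topics.items():
    n_items = round(sessions / total_sessions * total_items)
    end = start + n_items - 1
    print(f"{topic}: {n_items} items (items {start}-{end})")
    start = end + 1
```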
ITEM ANALYSIS
Item analysis refers to the process of examining the students’ responses to each item in the test. An item may
have desirable or undesirable characteristics: an item with desirable characteristics can be retained for
subsequent use, while one with undesirable characteristics is either revised or rejected.
Three criteria in determining the desirability and undesirability of an item:
a. difficulty of an item
b. discriminating power of an item
c. measures of attractiveness
Difficulty index (DF) refers to the proportion of the number of students in the upper and lower groups who
answered an item correctly. In a classroom achievement test, the desired indices of difficulty are not lower than
0.20 nor higher than 0.80, with an average index of difficulty from 0.30 or 0.40 to a maximum of 0.60.
DF = (PUG + PLG) / 2
where PUG = proportion of the upper group who got an item right
      PLG = proportion of the lower group who got an item right
Level of Difficulty of an Item
Index Range Difficulty Level
0.00 – 0.20 Very Difficult
0.21-0.40 Difficult
0.41-0.60 Moderately Difficult
0.61-0.80 Easy
0.81-1.00 Very Easy
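A minimal Python sketch of the difficulty-index formula and the classification table above (the function names and sample proportions are illustrative):

```python
def difficulty_index(pug: float, plg: float) -> float:
    """DF = (PUG + PLG) / 2, the mean proportion correct
    in the upper and lower groups."""
    return (pug + plg) / 2

def difficulty_level(df: float) -> str:
    """Classify DF using the index ranges in the table above."""
    if df <= 0.20:
        return "Very Difficult"
    elif df <= 0.40:
        return "Difficult"
    elif df <= 0.60:
        return "Moderately Difficult"
    elif df <= 0.80:
        return "Easy"
    return "Very Easy"

df = difficulty_index(0.27, 0.18)   # 0.225
print(df, difficulty_level(df))     # 0.225 Difficult
```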
Index of Discrimination
The discrimination index is the difference between the proportion of high performing students who got the item
right and the proportion of low performing students who got the item right. The high and low performing
students are usually defined as the upper 27% and the lower 27% of the students, based on the total
examination score. The discrimination index is the degree to which the item discriminates between the high
performing group and the low performing group in relation to scores on the total test.
Indices of discrimination are classified into positive discrimination, negative discrimination, and zero
discrimination.
Positive Discrimination – if the proportion of the students who got an item right in the upper performing group
is greater than that of the low performing group.
Negative Discrimination – if the proportion of the students who got an item right in the low performing group is
greater than that of the upper performing group.
Zero Discrimination – if the proportion of the students who got an item right in the upper performing group
and low performing group are equal.
Discrimination Index   Item Evaluation
0.40 and up            Very good item
0.30 – 0.39            Reasonably good item, but possibly subject to improvement
0.20 – 0.29            Marginal item, usually needing and being subject to improvement
0.19 and below         Poor item, to be rejected or improved by revision
Maximum discrimination is the sum of the proportions of the upper and lower groups who answered the item
correctly. The highest possible discrimination occurs when half or fewer of the combined upper and lower
groups answer an item correctly.
Discriminating efficiency is the index of discrimination divided by the maximum discrimination.
Notations: PUG = proportion of the upper group who got an item right
           PLG = proportion of the lower group who got an item right
           Di  = discrimination index
           DM  = maximum discrimination
           DE  = discriminating efficiency
Formulas:  Di = PUG - PLG
           DM = PUG + PLG
           DE = Di / DM
Example: Eighty students took an examination in Algebra. On item number 6, six students in the upper group
and four students in the lower group got the correct answer. Find the discriminating efficiency.

Given: Number of students who took the exam = 80
27% of 80 = 21.6 or 22, which means that there are 22 students in the upper performing group and
22 students in the lower performing group.

PUG = 6/22 = 27%
PLG = 4/22 = 18%

Di = PUG - PLG = 27% - 18% = 9%
DM = PUG + PLG = 27% + 18% = 45%
DE = Di / DM = 0.09 / 0.45 = 0.20 or 20%
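The same worked example, expressed as a short Python computation (the variable names are illustrative):

```python
# Reproduce the worked example: 22 students per group,
# 6 correct in the upper group, 4 in the lower group.
n_group = 22
pug = 6 / n_group          # about 0.27
plg = 4 / n_group          # about 0.18

di = pug - plg             # discrimination index, about 0.09
dm = pug + plg             # maximum discrimination, about 0.45
de = di / dm               # discriminating efficiency = 0.20

print(f"Di = {di:.2f}, DM = {dm:.2f}, DE = {de:.2f}")
# Di = 0.09, DM = 0.45, DE = 0.20
```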
VALIDITY OF A TEST
Validity refers to the appropriateness of score-based inferences, or decisions made based on students’ test
results; it is the extent to which a test measures what it is supposed to measure.
Important Things to Remember About Validity
1. Validity refers to the decisions we make, and not to the test itself or to the measurement.
2. Like reliability, validity is not an all or nothing concept; it is never totally absent or absolutely perfect.
3. A validity estimate, called a validity coefficient, refers to a specific type of validity. It ranges from 0 to 1.
4. Validity can never be finally determined; it is specific to each administration of the test.
TYPES OF VALIDITY
1. Content Validity – a type of validation that refers to the relationship between a test and the instructional
objectives; it establishes content so that the test measures what it is supposed to measure. Things to remember
about content validity:
a. The evidence of the content validity of your test is found in the Table of Specification.
b. This is the most important type of validity for you, as a classroom teacher.
c. There is no coefficient for content validity. It is determined judgmentally, not empirically.
2. Criterion-related Validity – a type of validation that refers to the extent to which scores from a test relate to
theoretically similar measures. It is a measure of how accurately a student’s current test score can be used to
estimate a score on a criterion measure, such as performance in courses, classes, or another measurement
instrument. It has two subtypes:
a. Predictive Validity – a type of validation that refers to a measure of the extent to which a person’s
current test results can be used to estimate accurately what that person’s performance on some other
criterion, such as test scores, will be at a later time.
b. Concurrent Validity – a type of validation that requires the correlation of the predictor or concurrent
measure with the criterion measure. Using this, we can determine whether a test is useful to us as a predictor
or a substitute (concurrent) measure. The higher the validity coefficient, the better the validity evidence of the
test. In establishing concurrent validity evidence, no time interval is involved between the administration of
the new test and the criterion or established test.
3. Construct Validity – a type of validation that refers to a measure of the extent to which a test measures a
hypothetical and unobservable variable or quality, such as intelligence, math achievement, or performance
anxiety. It is established through intensive study of the test or measurement instrument.
Types of Rubrics
1. Holistic Rubrics – the teacher or rater scores the overall process or finished product as a whole, without
judging the component parts separately.
Example:
3 – Excellent Researcher
- includes 10-12 sources
- no apparent historical inaccuracies
- can easily tell where the sources of information were drawn from
- all relevant information is included
2 – Good Researcher
- includes 5-9 sources
- few historical inaccuracies
- can tell with difficulty where information came from
- bibliography contains most relevant information
1 – Poor Researcher
- includes 1-4 sources
- lots of historical inaccuracies
- cannot tell from which source information came
- bibliography contains very little information
2. Analytic Rubrics – the teacher or rater identifies and assesses the components of a finished product. The
final product is broken down into its component parts, and each part is scored independently. The total score is
the sum of the ratings for all of the parts that are assessed or evaluated, as the sketch after the example below
shows. In analytic scoring, it is very important for the rater to treat each part as separate, to avoid bias toward
the whole product.
Advantages: more detailed feedback; scoring is more consistent across students and graders.
Disadvantage: time consuming to score.
Example:
Criterion: Made good observations
  1 (Limited): observations are absent or vague
  2 (Acceptable): most observations are clear and detailed
  3 (Proficient): all observations are clear and detailed
Criterion: Made good predictions
  1 (Limited): predictions are absent or irrelevant
  2 (Acceptable): most predictions are reasonable
  3 (Proficient): all predictions are reasonable
Criterion: Appropriate conclusion
  1 (Limited): conclusion is absent or inconsistent with observations
  2 (Acceptable): conclusion is consistent with most observations
  3 (Proficient): conclusion is consistent with observations
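Because analytic scoring treats each part separately, the total is simply the sum of the independent criterion ratings. A minimal Python sketch using the rubric above (the sample ratings are hypothetical):

```python
# Total an analytic rubric: each criterion is scored
# independently (1-3 here) and the parts are summed.
ratings = {
    "Made good observations": 3,
    "Made good predictions": 2,
    "Appropriate conclusion": 3,
}
total = sum(ratings.values())
maximum = 3 * len(ratings)
print(f"Score: {total}/{maximum}")  # Score: 8/9
```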
2. Showcase Portfolio
It is also known as a best works portfolio or display portfolio. This kind of portfolio focuses on the
student’s best and most representative work; it exhibits the best performance of the student. A best works
portfolio may document student efforts with respect to curriculum objectives; it may also include evidence
of student activities beyond school.
It is just like an artist’s portfolio, where a variety of work is selected to reflect breadth of talent. Hence, in
this portfolio, the student selects what he or she thinks is representative work.
The most rewarding use of student portfolios is the display of the students’ best work, the work that
makes them proud. It encourages self-assessment and builds self-esteem in students. The
pride and sense of accomplishment that students feel make the effort well worthwhile and contribute to a
culture for learning in the classroom.
3. Progress Portfolio
It is also known as a Teacher Alternative Assessment Portfolio. It contains examples of students’ work of
the same types done over a period of time, and these are utilized to assess their progress.
Uses of Portfolios
1. It can provide both formative and summative opportunities for monitoring progress toward reaching
identified outcomes.
2. Portfolios allow students to document aspects of their learning that do not show up well in traditional
assessments.
3. Portfolios are useful to showcase periodic or end of the year accomplishments of students such as in
poetry, reflections on growth, samples of best works, etc.
4. Portfolios may also be used to facilitate communication between teachers and parents regarding their
child’s achievement and progress in a certain period of time.
5. Administrators may use portfolios for national competency testing, to grant high school credit, and to
evaluate educational programs.
6. Portfolios may be assembled for a combination of purposes, such as instructional enhancement and progress
documentation. A teacher reviews students’ portfolios periodically and makes notes for revising instruction
for next year’s use.
According to Mueller (2010), there are seven steps in developing portfolios of students. Below are the
discussions of each step.
1. Purpose: What is the purpose(s) of the portfolio?
2. Audience: For what audience(s) will the portfolio be created?
3. Content: What samples of student work will be included?
4. Process: What processes (e.g., selection of work to be included, reflection on work, conferencing) will be
engaged in during the development of the portfolio?
5. Management: How will time and materials be managed in the development of the portfolio?
6. Communication: How and when will the portfolio be shared with pertinent audiences?
7. Evaluation: If the portfolio is to be used for evaluation, when and how should it be evaluated?
Guidelines for Assessing Portfolios
1. Include enough documents (items) on which to base judgment.
2. Structure the contents to provide scorable information.
3. Develop judging criteria and a scoring scheme for raters to use in assessing the portfolios.
4. Use observation instruments such as checklists and rating scales when possible to facilitate scoring.
5. Use trained evaluators or assessors.
Traditional Assessment – refers to the use of pen-and-paper objective tests.
Alternative Assessment – refers to the use of methods other than pen-and-paper objective tests, which
include performance tests, projects, portfolios, journals, and the like.
Authentic Assessment – refers to the use of assessment methods that simulate true-to-life situations. These
could be objective tests that reflect real-life situations or alternative methods that parallel what we
experience in real life.
Performance-based Assessment
Performance-based assessment is a process of gathering information about student’s learning
through actual demonstration of essential and observable skills and creation of products that are grounded in
real world contexts and constraints. It is an assessment that is open to many possible answers and judged
using multiple criteria or standards of excellence that are pre-specified and public.
Examples of paper-and-pencil test items with equivalent performance tasks:

3. How many grams (g) are there in 1 kilogram (kg)?
   a. 1,000 g   b. 1,050 g   c. 1,100 g
   Performance task: Get a table balance with sets of weights. Place 1 kg of mangoes on the table balance
   and 10 sets of weights of 100 g each. Count the sets of weights you put on the table balance and
   multiply (100 x 10). Ask: How many grams are there in 1 kilogram?

4. How many cups are there in 1 gallon?
   a. 14 cups   b. 15 cups   c. 16 cups
   Performance task: Get an empty 1-gallon ice cream container and a measuring cup. Let the student fill
   the cup with water and pour it into the empty container until it is filled up. Ask: How many cups of
   water did you pour into the gallon container of ice cream? How many cups are there in 1 gallon?
PERFORMANCE-BASED ASSESSMENT
- a form of testing that requires students to perform a task rather than select an answer from a ready-made
list.
Scoring Rubrics – a scoring scale used to assess student performance along a task-specific set of criteria. It
contains the essential criteria for the task and appropriate levels of performance for each criterion.
Descriptors – tell students precisely what performance looks like at each level and how their work may be
distinguished from the work of others for each criterion.
Example:
Criterion: Number of appropriate hand gestures (x1)
  1: 1-4
  2: 5-9
  3: 10-12
Criterion: Appropriate facial expression (x1)
  1: lots of inappropriate facial expressions
  2: few inappropriate expressions
  3: no apparent inappropriate facial expressions
Criterion: Voice inflection (x2)
  1: monotone voice used
  2: can vary voice inflection with difficulty
  3: can easily vary voice inflection
Criterion: Incorporates proper ambiance through feelings in the voice (x3)
  1: recitation contains very little feeling
  2: recitation has some feeling
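The multipliers x1, x2, and x3 in the rubric above suggest a weighted total, where each criterion’s rating (1-3) is multiplied by its weight before summing. A minimal Python sketch under that assumption (the sample ratings are hypothetical):

```python
# Weighted rubric total: rating (1-3) times the criterion weight.
criteria = [
    # (criterion, weight, rating)
    ("Appropriate hand gestures", 1, 2),
    ("Appropriate facial expression", 1, 3),
    ("Voice inflection", 2, 2),
    ("Proper ambiance through feelings in the voice", 3, 1),
]
total = sum(weight * rating for _, weight, rating in criteria)
maximum = sum(3 * weight for _, weight, _ in criteria)
print(f"Score: {total}/{maximum}")  # Score: 12/21
```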
Task Designing – the design of the task depends on what the teacher wants to observe as outputs of the
students.
Scoring Rubrics – descriptive scoring schemes that are developed by teachers to guide the analysis of the
products or processes of students’ efforts.
Criteria Setting – statements which identify “what really counts” in the final output. Some of the most
often used major criteria for product assessment are Quality, Creativity, Comprehensiveness, Accuracy, and
Aesthetics.
From the major criteria, the next task is to identify sub-statements that would make the major criteria
more focused and objective. It will be noted that each score category describes the characteristics of a
response that would receive the respective score. Describing the characteristics of responses within each score
category increases the likelihood that two independent evaluators would assign the same score to a given
response. In effect, this increases the objectivity of the assessment procedure using rubrics. In the language of
test and measurement, we are actually increasing the “inter-rater reliability”.
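One simple way to estimate inter-rater reliability is the percent of exact agreement between two raters; more rigorous statistics such as Cohen’s kappa also exist. A minimal Python sketch (the sample scores are hypothetical):

```python
# Percent exact agreement between two raters scoring the same responses.
rater_a = [3, 2, 3, 1, 2, 3, 2, 1]
rater_b = [3, 2, 2, 1, 2, 3, 2, 2]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
print(f"Agreement: {agreements / len(rater_a):.0%}")  # Agreement: 75%
```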
Attitudes influence the way a person acts and thinks in the social community to which he or she belongs. They
can function as frameworks and references for forming conclusions and for interpreting or acting for or against
an individual, a concept, or an idea. Attitudes influence behavior.
Motivation – a reason or set of reasons for engaging in a particular behavior. The reasons may include basic
needs, an object, a goal, a state of being, or an ideal that is desirable. It also refers to the initiation, direction,
intensity, and persistence of human behavior.
There are two kinds of motivation: intrinsic motivation and extrinsic motivation.
Intrinsic Motivation – brings pleasure or makes people feel that what they are learning is morally significant.
Extrinsic Motivation – comes when a student is compelled to do something because of factors external
to him or her.
Motivation in education can have several effects on how students learn and how they behave towards subject
matter. It can direct behavior toward particular goals; lead to increased effort and energy; increase
initiation of, and persistence in, activities; enhance cognitive processing; and determine which consequences
are reinforcing, leading to improved performance.
Self-efficacy – the impression that one is capable of performing in a certain manner or attaining certain goals.
It is a belief that one has the capability to execute the courses of action required to manage prospective
situations. It relates to a person’s perception of his or her ability to reach a goal.
Assessment tools in the affective domain are those used to assess attitudes, interests, motivation,
and self-efficacy. These include:
1. Self-Report – the most common measurement tool in the affective domain. It essentially requires an
individual to provide an account of his or her attitude or feelings toward a concept, idea, or people. Self-reports
are also called “written reflections”.
2. Rating Scales – refer to a set of categories designed to elicit information about a quantitative attribute in
social science. Common examples are the Likert scale and 1-10 rating scales, in which a person selects the
number considered to reflect the perceived quality of a product. The basic feature of any rating scale is
that it consists of a number of categories, usually assigned integers.
3. Semantic Differential (SD) Scales – try to assess an individual’s reaction to specific words, ideas, or concepts
in terms of ratings on bipolar scales defined with contrasting adjectives at each end.

Good ___ ___ ___ ___ ___ ___ ___ Bad
      3   2   1   0   1   2   3
(3 – extreme; 2 – quite; 1 – slight; 0 – neutral)
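Scoring a semantic differential item typically means converting the marked blank into a signed value, positive toward the favorable adjective and negative toward the unfavorable one. A minimal Python sketch under that assumption:

```python
# Convert a mark on a 7-blank semantic differential item
# (Good ... Bad) into a signed score from +3 to -3.
def sd_score(position: int) -> int:
    """position: 0 = leftmost blank (Good) ... 6 = rightmost (Bad)."""
    return 3 - position

print(sd_score(0))  # 3 (extremely good)
print(sd_score(3))  # 0 (neutral)
print(sd_score(6))  # -3 (extremely bad)
```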
Thurstone Scale
Thurstone is considered the father of attitude measurement. He addressed the issue of how favorable
an individual is with regard to a given issue and developed an attitude continuum to determine the
position of favorability on the issue.
Directions: Put a check mark in the blank if you agree with the item:
____ 1. Blacks should be considered the lowest class of human beings. (scale value = 0.9)
____ 2. Blacks and whites must be kept apart in all social affairs where they might be taken as equals.
(scale value = 3.2)
____ 3. I am not interested in how blacks rate socially. (scale value = 5.4)
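A respondent’s Thurstone score is commonly taken as the median (or mean) of the scale values of the statements he or she checked. A minimal Python sketch under that assumption, using the scale values above:

```python
from statistics import median

# Scale values of the items above; suppose a respondent checks items 2 and 3.
scale_values = {1: 0.9, 2: 3.2, 3: 5.4}
checked = [2, 3]

score = median(scale_values[i] for i in checked)
print(round(score, 2))  # 4.3, the midpoint of 3.2 and 5.4
```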
Likert Scales
Likert developed the method of summated ratings (or Likert scale), which is widely used. It requires
an individual to tick a box to report whether they “strongly agree”, “agree”, are “undecided”,
“disagree”, or “strongly disagree” in response to a large number of items concerning an attitude object
or stimulus. The steps are:
a. Pick individual items to include. Choose individual items that you know correlate highly with the
total score across items.
b. Choose how to scale each item, or construct labels for each scale value to represent the
interpretation to be assigned to the number.
c. Ask your target audience to mark each item.
d. Derive a target’s score by adding the values that the target identified on each item (see the sketch below).
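A minimal Python sketch of step d, deriving a summated rating: each response label is mapped to a value from 1 to 5 and the values are added. The mapping direction is an assumption, and negatively worded items would be reverse-keyed:

```python
# Summated (Likert) rating: map each response to a value and add them up.
values = {
    "strongly agree": 5,
    "agree": 4,
    "undecided": 3,
    "disagree": 2,
    "strongly disagree": 1,
}
responses = ["agree", "strongly agree", "undecided", "agree"]  # hypothetical

score = sum(values[r] for r in responses)
print(score)  # 16 out of a possible 20
```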
Checklists
- the most common and perhaps the easiest instrument in the affective domain. A checklist consists of simple
items that the student or teacher marks as “absent” or “present”. Here are the steps in the construction of a
checklist (a short scoring sketch follows):
a. Enumerate all the attributes and characteristics you wish to observe.
b. Arrange these attributes as a “shopping list” of characteristics.
c. Ask students to mark those attributes which are present and to leave blank those which are not.
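Since a checklist marks each attribute as simply present or absent, scoring reduces to counting the marked attributes. A minimal Python sketch (the attribute names are hypothetical):

```python
# A checklist score is the count of attributes marked "present" (True).
checklist = {
    "Brings required materials": True,
    "Participates in discussion": True,
    "Submits work on time": False,
}
present = sum(checklist.values())
print(f"{present} of {len(checklist)} attributes present")  # 2 of 3
```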
PORTFOLIO ASSESSMENT
Portfolio assessment is the systematic, longitudinal collection of student work created in response to specific,
known instructional objectives and evaluated in relation to the same criteria. A student portfolio is a purposeful
collection of student work that exhibits the student’s efforts, progress, and achievements in one or more areas.
The collection must include student participation in selecting contents, the criteria for selection, the criteria for
judging merit, and evidence of student self-reflection.
Comparison of Portfolio and Traditional Forms of Assessment

Traditional Assessment:
1. Measures a student’s ability at one time.
2. Done by the teacher alone; students are not aware of the criteria.
3. Conducted outside instruction.
4. Assigns the student a grade.
5. Does not capture the student’s language ability.
6. Does not include the teacher’s knowledge of the student as a learner.
7. Does not give the student responsibility.

Portfolio Assessment:
1. Measures a student’s ability over time.
2. Done by the teacher and the students; the students are aware of the criteria.
3. Embedded in instruction.
4. Involves the student in his or her own assessment.
5. Captures many facets of language learning performance.
6. Allows for expression of the teacher’s knowledge of the student as a learner.
7. Students learn how to take responsibility.