
Psychometric evaluation of a knowledge-based examination using Rasch analysis


Abstract
Classical Test Theory (CTT) has traditionally been used to carry out post-examination analysis of objective test data. It uses descriptive methods and aggregated data to help identify sources of measurement error and unreliability in a test, in order to minimize them. Item Response Theory (IRT), and in particular Rasch analysis, uses more complex methods to produce outputs that not only identify sources of measurement error and unreliability, but also identify the way item difficulty interacts with student ability. In this Guide, a knowledge-based test is analyzed by the Rasch method to demonstrate the variety of useful outputs that can be provided. IRT provides a much deeper analysis, giving a range of information on the behavior of individual test items and individual students, as well as the underlying constructs being examined. Graphical displays can be used to evaluate the ease or difficulty of items across the student ability range, as well as providing a visual method for judging how well the difficulty of items on a test matches student ability. By displaying data in this way, problem test items are more easily identified and modified, allowing medical educators to iteratively move towards the ‘perfect’ test, in which the distribution of item difficulty is mirrored by the distribution of student ability.

Introduction
The quality of assessment methods and processes is as important as the quality of the teaching and
learning process in any form of educational activity.

Practice points
1. Rasch analysis is a particular method used in IRT.

2. IRT supersedes CTT, in that it takes into consideration the interaction between student ability and item difficulty.

3. The characteristics of a test that fits the Rasch model can be identified, so that test developers can iteratively move towards the ‘perfect’ test.

4. The ‘perfect’ test is one on which the distribution of student ability is perfectly mirrored by the distribution of item difficulty.

Comparing Classical Test Theory with Item Response Theory
This section of the Guide describes and compares the concepts and methods that underpin Classical Test Theory (CTT), which is the more traditional approach to psychometric analysis, and Item Response Theory (IRT), which is a more developed and contemporary approach.

Methods
The Rasch model
Despite the complexity of the statistical and measurement methods used by the Rasch model, the results can answer some simple questions, given below and illustrated in the sketch that follows the list.

1. How well does a student answer a question if we know the student’s ability and the item’s difficulty?

2. What is the probability of a student answering an item correctly, given a measure of item difficulty?

3. If student ability equals item difficulty, what is the probability of answering the item correctly?

4. What is the probability of a less or more able student answering an easy or difficult item?
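The dichotomous Rasch model answers all four questions with a single expression: the probability of a correct response depends only on the difference between student ability (theta) and item difficulty (b), both on the logit scale. The following is a minimal sketch in Python, not taken from the Guide; the function and variable names are illustrative.

import math

def p_correct(theta: float, b: float) -> float:
    # Rasch probability of a correct response:
    # P = exp(theta - b) / (1 + exp(theta - b))
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Question 3: when ability equals difficulty, the probability is exactly 0.5.
print(p_correct(1.0, 1.0))   # 0.5
# Question 4: a less able student on a hard item, and an able student on an easy item.
print(p_correct(-1.0, 1.0))  # ~0.12
print(p_correct(2.0, -1.0))  # ~0.95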

Unidimensionality
One of the assumptions of Rasch modeling is that a test optimally measures a single underlying
construct; this is termed unidimensionality. For example this underlying single construct can be identified
with cognitive ability in a knowledge-based test or practical performance in an OSCE. Unidimensionalty
implies that all items in a test or all OSCE stations assess a single construct or dimension .
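A common way to probe unidimensionality is a principal component analysis of the standardized Rasch residuals (the check Winsteps performs as "PCA of residuals"). The sketch below is a simplified, hedged version, assuming that ability (theta) and difficulty (b) estimates and the 0/1 response matrix X (students x items) are already available; the threshold of roughly 2 for the largest residual eigenvalue is a commonly cited rule of thumb, not a value from this Guide.

import numpy as np

def largest_residual_eigenvalue(X, theta, b):
    # Expected score for each student-item pair under the Rasch model.
    P = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    # Standardized residuals: what is left after the Rasch dimension is removed.
    resid = (X - P) / np.sqrt(P * (1.0 - P))
    # Item-by-item correlations of the residuals, then their eigenvalues.
    corr = np.corrcoef(resid, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # descending order
    return eigvals[0]  # values much above ~2 hint at a secondary dimension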

Response dependency
Another assumption of Rasch analysis is local independency of items. This means that the probability of
answering one item correctly should be independent of the answer to other items. When the value of an
item is predicted by the value of another item, the assumption of independency is violated. In the context
of the Rasch model, items with a high positive correlation indicate that one of the two questions is
redundant for the test. Correlations greater than 0.50 between items are considered an indication of
response dependency and items should be investigated. For example if item 1 has a correlation coefficient
of 70% with item 2 this indicates a local item dependency between item 1 and item 2, suggesting both
item 1 and item 2 are required for the test..
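The following minimal sketch flags item pairs whose residual correlations exceed the 0.50 threshold mentioned above. It reuses the standardized residual matrix from the previous sketch; the function name and 1-based item labels are illustrative.

import numpy as np

def dependent_pairs(resid, threshold=0.50):
    # resid: standardized Rasch residuals, students x items.
    corr = np.corrcoef(resid, rowvar=False)
    n = corr.shape[0]
    return [(i + 1, j + 1, round(float(corr[i, j]), 2))
            for i in range(n) for j in range(i + 1, n)
            if corr[i, j] > threshold]

# A returned triple such as (1, 2, 0.70) reproduces the example above:
# items 1 and 2 are locally dependent, so one of them is likely redundant.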

Item difficulty invariance


Another feature is ‘item difficulty invariance’, which provides valuable information about the invariance or stability of item values within a test. Invariance in this context means that the properties of an item are not influenced by the ability of the students answering it. A scatter plot of item difficulty values from high- and low-ability students can display a correlation that reveals the extent to which item difficulties vary between the two groups. By inserting 95% confidence interval control limits onto such plots, items that are not invariant (i.e. unstable) with respect to ability can be easily identified. Item difficulty invariance also allows us to identify items that are useful across the ability range, in order to calibrate questions for item banks. This means that assessors will have convenient access to a large number of tested questions which are classified according to student ability and item difficulty. Such questions can also be used for computer adaptive testing (CAT), where the questions administered to students can be modified according to their performance on the previous questions. A simplified invariance check is sketched below.
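As a rough, hedged sketch of the invariance check described above: students are split into low- and high-ability groups on total score, and an item difficulty is estimated within each group, here crudely as the logit of the proportion incorrect rather than by a full Rasch recalibration with confidence bands. All names are illustrative.

import numpy as np

def group_difficulties(X):
    # X: 0/1 response matrix, students x items.
    totals = X.sum(axis=1)
    low = X[totals < np.median(totals)]
    high = X[totals >= np.median(totals)]

    def logit_difficulty(group):
        p = group.mean(axis=0).clip(0.01, 0.99)  # proportion correct per item
        return np.log((1.0 - p) / p)             # higher value = harder item

    return logit_difficulty(low), logit_difficulty(high)

# Plotting the two returned vectors against each other (one point per item)
# should give points close to the identity line for invariant items.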

Response dependency
The local independence assumption is not violated provided that the order of the questions in an examination does not affect their difficulty. Response dependency was assessed for the complete test and for each case.

Reliability and separation estimates


The person separation reliability (PSR) for the whole test was 0.65, with a person separation index (PSI) of 1.37. A PSI value less than 2 indicates that the spread, or separation, of students on the construct being measured was not satisfactory, suggesting that the questions had low discrimination.
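The two reported values are consistent with the standard relationship between separation and reliability (this derivation is ours, not stated in the Guide):

\[
\mathrm{PSI} = \sqrt{\frac{\mathrm{PSR}}{1 - \mathrm{PSR}}} = \sqrt{\frac{0.65}{0.35}} \approx 1.36
\]

which matches the reported 1.37 to within rounding.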

Rasch item fit


Table 3 shows item difficulty, standard error, and item fit for each case. The outfit statistics show that Q11 and Q16 are not within the acceptable range (for both MNSQ and ZSTD), implying that they should be investigated, as they did not contribute towards the underlying test construct.
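A minimal sketch of how such flagging can be automated is given below. The MNSQ range of 0.7-1.3 and the |ZSTD| cut-off of 2.0 are commonly cited conventions, assumed here rather than taken from Table 3, and exact cut-offs vary between authors.

def misfitting(items, mnsq_lo=0.7, mnsq_hi=1.3, zstd_max=2.0):
    # items: iterable of (label, outfit_mnsq, outfit_zstd) tuples.
    return [label for label, mnsq, zstd in items
            if not (mnsq_lo <= mnsq <= mnsq_hi) or abs(zstd) > zstd_max]

# Example (hypothetical values): an item with outfit MNSQ 1.6 or ZSTD 3.1
# would be flagged for investigation, as Q11 and Q16 were here.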

Participants
The examination data used in this Guide were processed from results obtained from 355 medical students in their final clinical knowledge-based exam. We used Winsteps software (Linacre 2011) to produce simulated modifications of the data to create examples for the purposes of this Guide. We did not require approval from our research ethics committee, as this study was carried out using data acquired from normal exams within the curriculum, with the goal of monitoring the quality of individual questions in order to improve student assessment.

Data collection
Knowledge-based test
The simulated knowledge-based questions were used to assess the cognitive performance of students in this study. The test consisted of 43 questions assessing two clinical cases. Case 1 consisted of 24 questions on Clinical Laboratory Sciences and Case 2 consisted of 19 questions on chronic illness in General Practice. Each question was marked dichotomously, i.e. students received 1 mark if they answered the question correctly and 0 if they answered incorrectly. The maximum score for Case 1 and Case 2 was 24 and 19, respectively. There was no negative marking for incorrect answers. Students responded to the questions through an online assessment system (Rogō, University of Nottingham) during a normal summative examination.
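The marking scheme described above amounts to simple dichotomous scoring with no penalty for wrong answers; the sketch below illustrates it, with an entirely hypothetical answer-key format.

def score_case(responses, key):
    # responses, key: equal-length lists of answer strings, one per question.
    marks = [1 if r == k else 0 for r, k in zip(responses, key)]
    return marks, sum(marks)  # per-item marks and total (max 24 for Case 1, 19 for Case 2)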

Psychometric software
The Rasch measurement model (Rasch 1980) was used to analyse the different response patterns obtained, using Winsteps software.

Results
In this section, we demonstrate the results of the Rasch analysis of our simulated exam data under the headings previously discussed.
