NCE Assessment and Testing PDF

The document discusses key concepts in psychological assessment including purposes of assessment, evaluating validity and reliability, types of intelligence and personality tests, and statistical concepts like mean, median, mode, and standard deviation. It provides information on evaluating the validity of tests, including content, criterion, and construct validity. It also defines key research concepts like independent and dependent variables, internal and external validity, and statistical significance.

Uploaded by

Amy D.

Measurement: The process of defining and estimating the magnitude of human attributes and behavioral expressions

Purposes of assessment in counseling:
Dx and treatment planning
Placement
Monitoring client progress
Evaluate counseling outcome

Outcome Research:
>The degree to which clients change
>The factors that play a role in client change

Ethics of Assessment
Competence to use and interpret instruments
Informed consent
Release of results to qualified professionals
Instrument selection
Conditions of administering
Scoring and Interpretation
Obsolete assessments and outdated results
Assessment construction

Mental Measurements Yearbook (MMY)
Most comprehensive
>All profile info
>Validity and reliability
>Norming data
>Reviews by experts

Tests in Print
Companion to MMY, without psychometric info or reviews

Tests
> Concise descriptions
> No data on norms, validity, or reliability
> No reviews

Test Critiques
> Everything, plus comprehensive reviews
> User-friendly, non-jargony

Validity
Content – content is appropriate to the test's purpose
Criterion – effectiveness in predicting performance on a specific criterion/how accurately a test measures the outcome it was designed to measure
>Concurrent
>Predictive
Construct – Ability to measure the construct
>Determined by experimental designs, factor analysis, convergent evidence, or discriminant evidence
Validity coefficient – correlation between a test score and the criterion measure
Standard error of estimate: The expected margin of error in a predicted criterion score due to the less than 100% validity of any test
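
The validity coefficient and the standard error of estimate can be illustrated with a short calculation. This is a minimal sketch, assuming the validity coefficient is a Pearson correlation and using the textbook formula SEE = SD of the criterion × √(1 − r²); the score lists and function names are hypothetical and not taken from the notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def standard_error_of_estimate(sd_criterion, validity_coefficient):
    """Expected margin of error in a predicted criterion score."""
    return sd_criterion * math.sqrt(1 - validity_coefficient ** 2)

# Hypothetical data: admission test scores and later GPA (the criterion).
test_scores = [50, 55, 60, 65, 70]
criterion_scores = [2.0, 2.4, 2.9, 3.1, 3.6]

validity = pearson_r(test_scores, criterion_scores)
see = standard_error_of_estimate(sd_criterion=0.6, validity_coefficient=validity)
print(f"validity coefficient = {validity:.2f}, SEE = {see:.2f}")
```

The closer the validity coefficient is to 1.00, the smaller the standard error of estimate becomes; with a coefficient of 0 the error equals the full standard deviation of the criterion.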
Reliability: Consistency of scores attained by the same person on different administrations of the same test
A person's observed score (X) is true score (T) plus the amount of error present during administration
Reliability coefficient – 1.00 max, .80 - .95 usually acceptable range

Reliability can be approximated through:
Test-retest reliability – Stability. Same test given to same person or group twice with time between
Alternate form/parallel or equivalent – One group takes two different tests that are the same in content and difficulty
Internal consistency – Consistency of responses among test items in a single administration. Estimated by:
Split-half (Hard and requires Spearman-Brown formula)
Inter-item - Kuder-Richardson 20 or Cronbach's alpha
Inter-rater reliability – consistency of ratings by two or more scorers

Intelligence
First test Binet & Simon
IQ = MA/CA x 100
IQ now replaced w. Standard Age Score (SAS)
Galton – IQ is a single or unitary factor, normal distribution curve, and genetic
Guilford – 120 factors, individual differences, convergent & divergent thinking
Wechsler mean of 100, SD of 15
Adult WAIS-IV
Kids 6-16 WISC-V
Kids 2-7 WPPSI-IV
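
The ratio IQ formula and the two internal-consistency estimates named above can be worked through in a few lines. This is a sketch under stated assumptions: the Spearman-Brown and Cronbach's alpha formulas are the standard textbook versions, and all function names and sample data are hypothetical.

```python
from statistics import pvariance

def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    return mental_age / chronological_age * 100

def spearman_brown(half_test_r, length_factor=2):
    """Estimate full-test reliability from the correlation between two halves."""
    return length_factor * half_test_r / (1 + (length_factor - 1) * half_test_r)

def cronbach_alpha(item_scores):
    """item_scores: one row per test taker, one column per item."""
    k = len(item_scores[0])
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical examples.
print(ratio_iq(mental_age=10, chronological_age=8))
print(spearman_brown(half_test_r=0.6))
print(cronbach_alpha([[1, 2, 1], [2, 3, 3], [3, 3, 4], [4, 5, 4]]))
```

A child with a mental age of 10 and a chronological age of 8 gets a ratio IQ of 125, and a split-half correlation of .60 corrects to roughly .75 for the full-length test.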

Objective Personality Tests

MMPI-2 – Adult psychopathology
Millon Multiaxial – DSM personality disorders
Myers-Briggs Type Indicator – Ages 14+
Extraversion v introversion – where your energy is directed
Sensing v intuition – perceive world around you
Thinking v feeling – make decisions
Judging v perceiving - deal w/external world
Sixteen Personality Factors (16PF) – Ages 16+
Cattell, sixteen traits of normal people
NEO Personality Inventory (NEO-PI-3) – Teens up
Normal personality via Big Five (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism)

Projective Personality Tests

Rorschach Inkblots
Thematic Apperception Test (TAT)
Picture cards, ages 4+, Murray
House-Tree-Person
Client draws
Sentence Completion Tests
Null hypothesis (NH) – There is no relationship between the IV and DV

Significance level: Threshold for rejecting the NH. A p value of less than .05 = significance (informally, 95% sure you're right)

Type I error – Reject the NH when the NH is actually true
Type II error – Retain the NH when the NH is actually false

Power – The likelihood of detecting a significant relationship between variables when one is there. Power can be increased by: increasing sample size; increasing effect size; minimizing error; using a one-tailed test; or using a parametric statistic

Internal Validity of a study – Changes in the DV are in fact due to the IV

Threats to internal validity:
History
Selection
Statistical Regression
Testing
Instrumentation
Attrition
Maturation
Diffusion of treatment
Experimenter effects (Halo & Hawthorne)
Subject effects (role influence or demand characteristics)
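
The link between the significance level and the Type I error rate can be seen in a small simulation. This is an illustrative sketch, not part of the original notes: it assumes a two-tailed z-test on samples drawn from a population where the null hypothesis is true (mean 0, SD 1), and the function name and parameters are hypothetical.

```python
import math
import random

def type_one_error_rate(trials=2000, n=30, critical_z=1.96, seed=0):
    """Share of true-null experiments that wrongly reject the NH."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The NH is true here: every sample comes from a mean-0, SD-1 population.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) / (1 / math.sqrt(n))
        if abs(z) > critical_z:  # two-tailed test at the .05 level
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"Type I error rate: {rate:.3f}")
```

Because the NH is true in every simulated study, each rejection is a Type I error, and the observed rate should land near the .05 significance level.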

External Validity - Results can be generalized to a larger group

Threats to external validity:
Novelty effect
Experimenter effect
History by treatment effect
Measurement of the dependent variable
Time of measurement by treatment effect

Variance: A statistical average of the amount of dispersion around the mean in a distribution of scores. It is the Standard Deviation squared.

Standard Deviation: The amount of dispersion in a set of scores. The square root of the average squared deviations from the mean of a set of scores. It is simply the square root of the variance.

*Of the three measures, the STANDARD DEVIATION is most affected by outliers.
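
The definitions of variance and standard deviation translate directly into code. This is a minimal sketch that follows the wording above (average squared deviation from the mean, then its square root), which corresponds to the population formulas; the scores are hypothetical.

```python
import math

def variance(scores):
    """Average of the squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical set of scores, mean = 5
print(variance(scores))            # 4.0
print(standard_deviation(scores))  # 2.0
```

Squaring the standard deviation returns the variance, which is the relationship the notes describe.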
Independent variable – The thing being controlled or manipulated (cause; the treatment or intervention)

Dependent variable – The outcome that is influenced by the IV (effect)

Control group – Doesn’t get the IV

Experimental or Treatment group – Gets the IV

Internal validity – Changes in the DV are actually due to the IV

External validity – Can the results be generalized to a wider group

Mean - Average score from a group of tests

Median - Score that’s exactly in the middle when ordered from low to high

Mode - The score that occurs most frequently in a set of scores

> In a normal distribution, Mean, Median, and Mode are the same
> Of the three, the MEAN is most affected by outliers or extreme scores.

Range - Difference between the largest and smallest scores
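
A quick calculation shows why the mean is the measure of central tendency most affected by extreme scores. This sketch uses Python's standard `statistics` module; the score lists are hypothetical.

```python
from statistics import mean, median, mode

scores = [70, 75, 80, 80, 85]   # hypothetical test scores
with_outlier = scores + [200]   # same scores plus one extreme score

for label, data in (("original", scores), ("with outlier", with_outlier)):
    score_range = max(data) - min(data)
    print(label, mean(data), median(data), mode(data), score_range)
```

Adding the single score of 200 leaves the median and mode at 80 but pulls the mean from 78 to about 98, and the range grows from 15 to 130.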

Validity
Whether a test actually measures what it purports to measure
Content validity – Test items reflect all major content areas in that domain
Criterion validity – Test can predict a person's performance on a specific criterion

Standard error of measurement
How scores from repeated administrations of the same test to the same individual are distributed around the true score
> Often reported as a confidence interval
Standard error of measurement (SEM) - a statistical range that will include a test taker's score – calculated by multiplying the test's standard deviation by the square root of one minus the reliability coefficient: SEM = SD x √(1 – r)
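
The SEM calculation and the confidence band built from it can be sketched as follows. The formula SEM = SD × √(1 − reliability) is the standard one; the function names, the example test (SD 15, reliability .91), and the default of 1.96 for a 95% band are illustrative assumptions.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

def confidence_band(observed_score, sem, z=1.96):
    """Range around the observed score; z = 1.96 gives roughly a 95% band."""
    return (observed_score - z * sem, observed_score + z * sem)

sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_band(observed_score=100, sem=sem)
print(f"SEM = {sem:.1f}, 95% band = {low:.1f} to {high:.1f}")
```

A more reliable test has a smaller SEM, so the band around an observed score narrows as the reliability coefficient approaches 1.00.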
Positively Skewed: Most scores clustered at the lower end of the curve, with a few very high scores creating a long tail to the right. The mean is greater than the median, and the median is greater than the mode.

Negatively Skewed: Most scores clustered at the upper end of the curve, with a few very low scores creating a long tail to the left. The mean is less than the median, and the median is less than the mode.

Statistical Significance: Probability that the results obtained were due to chance (represented by the value of 'p'). A p-value of .05 or less means results were statistically significant (i.e. not due to chance).

t-test: A statistical procedure designed to test the difference between the means of two groups

Sensitivity – Test's ability to identify true positives; the presence of a thing

Specificity - Test's ability to identify true negatives; the absence of a thing

Stanine or Standard Nine - a way of scaling test scores into nine divisions, with a mean of 5 and an SD of 2
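
Sensitivity, specificity, and the t statistic for two group means can each be computed by hand. This is a sketch under stated assumptions: the t statistic uses the pooled-variance (equal variances) independent-samples formula, and the counts and group scores are hypothetical.

```python
import math
from statistics import mean, variance

def sensitivity(true_positives, false_negatives):
    """Share of people who have the condition that the test flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Share of people without the condition that the test clears."""
    return true_negatives / (true_negatives + false_positives)

def pooled_t_statistic(group_a, group_b):
    """Independent-samples t statistic, assuming equal group variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical screening results and group scores.
print(sensitivity(true_positives=45, false_negatives=5))
print(specificity(true_negatives=80, false_positives=20))
print(pooled_t_statistic([1, 2, 3], [2, 3, 4]))
```

The t statistic would then be compared with a critical value for the chosen significance level and degrees of freedom (n_a + n_b − 2) to decide whether to reject the null hypothesis.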

Standardized scores

Z-score – How many standard deviation units above or below the mean a raw score is. Raw score minus the sample mean, divided by the sample's standard deviation

z = (X – M) / SD

Factors that influence Reliability:

Test length – Longer tests are more reliable

Homogeneity of items – High variation in content leads to lower reliability

Heterogeneity of test group – Test-takers who vary on the characteristic being measured lead to higher reliability

Speed test – Falsely high reliability because most score high (because easy questions)
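
The z-score formula, z = (X − M) / SD, can be checked with a couple of worked values. This is a minimal sketch; the function name is hypothetical, and the example borrows the Wechsler scale's mean of 100 and SD of 15.

```python
def z_score(raw_score, mean, sd):
    """Standard deviation units above (+) or below (-) the mean."""
    return (raw_score - mean) / sd

# Scores on a scale with mean 100 and SD 15.
print(z_score(130, mean=100, sd=15))  # 2.0, two SDs above the mean
print(z_score(85, mean=100, sd=15))   # -1.0, one SD below the mean
```

The parentheses matter: the mean is subtracted from the raw score first, and only then is the difference divided by the standard deviation.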
Scales
Nominal – naming data
Ordinal – Classifies & assigns rank-order (Likert)
Interval – includes ordinal and has equidistant points
Ratio – Most advanced, preserves nominal, ordinal, and interval properties and has a true zero (Height)

Likert – Strongly agree to strongly disagree
Semantic differential/Self-anchored – span between two dichotomous adjectives (Good vs. Bad)
Thurstone – Agree/Disagree
Guttman – Are you willing to permit ___?

Scatterplot
A graph of plotted dots that show the relationship between two sets of data/two variables
If from lower left to upper right = a positive correlation between the variables
If from upper left to lower right = a negative correlation.
>When left had the upper hand, sinister

Probability Sampling:

Simple random
Systematic - every nth
Stratified random - divided into subgroups by a characteristic (age, gender)
Cluster - Existing subgroups (pick 20 random schools) -> Can be multi-stage (randomly pick 20 schools, then randomly pick 10 classes from each)

Nonprobability Sampling:

Convenience - Easily accessible
Purposeful - Seek out participants who have the needed characteristics (students who have been exposed to DV)
Quota - Similar to cluster & stratified, but no randomization
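
The four probability sampling methods can be sketched with Python's standard `random` module. The function names, the numbered population, and the school clusters are hypothetical and only illustrate how each method selects participants.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Every nth member, starting from a random position."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Divide into subgroups by a characteristic, then sample within each."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole existing groups and include all their members."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)
people = list(range(100))  # hypothetical participant IDs
schools = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}

print(simple_random_sample(people, 10, rng))
print(systematic_sample(people, 10, rng))
print(stratified_sample(people, lambda p: p % 2, 5, rng))
print(cluster_sample(schools, 2, rng))
```

A multi-stage cluster design would repeat the random selection inside each chosen cluster instead of taking every member.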
Causal-comparative/ex post facto - attempt to determine the cause or consequences of differences that already exist between groups -> The independent variable occurred BEFORE the study.

Histogram - Graph w/ rectangles to show the frequency of data items in successive numerical intervals of equal size
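
The counting step behind a histogram can be sketched in a few lines. This is an illustrative example with hypothetical scores and a hypothetical function name: it tallies how many scores fall into each equal-width interval and prints a text bar per interval.

```python
def histogram_counts(scores, low, width, n_bins):
    """Count scores in successive intervals of equal size, starting at `low`."""
    counts = [0] * n_bins
    for score in scores:
        index = int((score - low) // width)
        if 0 <= index < n_bins:  # scores outside the covered range are skipped
            counts[index] += 1
    return counts

scores = [1, 2, 3, 11, 12, 25]  # hypothetical data
counts = histogram_counts(scores, low=0, width=10, n_bins=3)
for i, count in enumerate(counts):
    start = i * 10
    print(f"{start}-{start + 9}: " + "#" * count)
```

Each printed bar stands in for one rectangle of the histogram, with its length showing the frequency for that interval.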
