Topic Introduction
- measure complex constructs / concepts
- assumption of additivity
o respondents asked to answer several items that constitute the scale → all answers added up to obtain an overall scale score (see the sketch after this list)
- reasons for measuring concept by using multiple indicators instead of one
o reflects complexity of subject
o develop more valid measures → avoid misclassification that can occur with single-item measures
o increase reliability
§ if one question is poorly worded / not understood, the other questions will offset it
o greater precision
§ example
· using suburb of residence as measure of person's social status
· mix of indicators: education, occupation, income
o simplified data analysis
§ instead of analyzing each question separately, analyse one variable
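A minimal sketch of the additivity assumption referenced above (Python; the item responses are hypothetical):

import numpy as np

# hypothetical responses: 3 respondents x 4 Likert items scored 1-5
responses = np.array([
    [4, 5, 3, 4],
    [2, 1, 2, 3],
    [5, 5, 4, 5],
])

# additivity: overall scale score = sum of a respondent's item scores
scale_scores = responses.sum(axis=1)
print(scale_scores)   # [16  8 19]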
Steps in Scale Development Process
- definition of construct
- generation of item pool
- choice of response format
- review of items / pilot testing of scale / administration of scale to development sample
- evaluation, including
o reliability
o validity
Step One – Definition of Construct
- sound understanding of literature
- conceptual model of basic construct needs to be developed.
- range of definitions
- identify common elements
- develop definition based on these
- delineate dimensions of construct → decide whether to measure all dimensions or focus on some
Generation of Item Pool
- develop set of questions which seem to measure that concept
- use informants from target group
Review of Items / Pilot Testing / Developmental Testing of Scale
- expert reviewers / fresh set of eyes
- test sample should be representative of population for which scale was developed.
- test for reliability / validity
Evaluation of Scale
- scales supposed to be unidimensional
- only if dimensions are strongly correlated does it make sense to construct single scale
Testing Reliability of Scale
- internal consistency
- items are highly correlated
- ideally, scale items
o show relatively high variance
§ items with low variance do not discriminate among individuals
o mean scores falling close to centre of range
- reliability
o intercorrelations among its items (item–item correlations)
o items with low / negative correlation should be deleted.
- Cronbach's alpha
o most commonly used method of testing reliability of a scale
o formula: α = (N · c̄) / (v̄ + (N - 1) · c̄)
§ N = number of items
§ c̄ = average inter-item covariance among items
§ v̄ = average item variance
o ranges from 0 to 1; higher values indicate higher levels of internal consistency
§ i.e. how similar the items on the scale are
o minimum level of 0.7 recommended
o if alpha is above 0.9, some items on the scale are very similar and can be discarded
o size of alpha affected by reliability of individual items
§ increase alpha – delete unreliable items
o to identify unreliable items, examine various statistical properties of the items:
§ item-total correlations
§ alpha if item deleted
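A minimal sketch of the alpha formula above (Python; the data are hypothetical and the helper name cronbach_alpha is my own):

import numpy as np

def cronbach_alpha(items):
    """items: respondents x items matrix of scores."""
    cov = np.cov(items, rowvar=False)             # item covariance matrix
    n = cov.shape[0]                              # N = number of items
    v_bar = np.diag(cov).mean()                   # v-bar: average item variance
    c_bar = cov[~np.eye(n, dtype=bool)].mean()    # c-bar: average inter-item covariance
    return (n * c_bar) / (v_bar + (n - 1) * c_bar)

# hypothetical development-sample data: 5 respondents x 3 items
data = np.array([[4, 5, 4], [2, 2, 3], [5, 4, 5], [3, 3, 3], [1, 2, 1]])
print(round(cronbach_alpha(data), 2))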
- item-total correlation
o correlation between item and total score of scale (without item being investigated)
o good item-total correlations should be at least 0.3
- alpha if item deleted
o calculate what alpha would be if item was dropped
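A sketch of both diagnostics (Python; reuses the cronbach_alpha helper and the hypothetical data from the previous sketch):

import numpy as np
from scipy.stats import pearsonr

def item_diagnostics(items):
    """Corrected item-total correlation and alpha-if-item-deleted for each item."""
    for i in range(items.shape[1]):
        rest = np.delete(items, i, axis=1)        # scale without item i
        rest_total = rest.sum(axis=1)             # total score of the remaining items
        r, _ = pearsonr(items[:, i], rest_total)  # corrected item-total correlation
        a = cronbach_alpha(rest)                  # alpha if this item were deleted
        print(f"item {i}: item-total r = {r:.2f}, alpha if deleted = {a:.2f}")

item_diagnostics(data)   # candidates for deletion: items with r below about 0.3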
Reliability Based on Correlation Between Scale Scores
- reliability computation
o same set of people complete two separate versions of scale
§ alternate forms reliability
o same version on multiple occasions
§ test-retest reliability
- test – retest reliability / temporal stability
o how constant scores remain from one occasion to another
o correlation between scores
o if correlation high (0.8 or higher) → scale considered to be reliable
o problems
§ often difficult to give test to same people twice
§ memory
§ attitudes may change
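A minimal test-retest sketch (Python; the scores from the two administrations are hypothetical):

from scipy.stats import pearsonr

# hypothetical total scale scores for the same respondents on two occasions
time1 = [16, 8, 19, 12, 14, 10]
time2 = [15, 9, 18, 13, 14, 11]

r, _ = pearsonr(time1, time2)
print(f"test-retest r = {r:.2f}")   # 0.8 or higher -> considered reliable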
Testing Validity of Scale
- how well a scale measures what it intends to measure
- validity of scale depends on how we have defined the concept it is designed to measure
- content validity
o adequacy with which a measure / scale has sampled from intended universe of content.
o behaviour domain → are all major aspects covered / in right proportions
o qualitative measure
o expert judges
- face validity
o often confused with content validity
o how respondents perceive appropriateness of test
o content – whether the items actually are about what you are measuring
o face – whether the items appear (to respondents) to be about what you are measuring
o whilst not validity in technical sense, is a desirable feature
§ taken seriously by respondents
- criterion validity
o relationship that exists between scale scores and some specified measurable criterion
§ concurrent validity
· new measure relates concurrently to some other measure of same concept.
· correlation coefficient
· highly correlated → assume new measure is valid
· issue
o choice of appropriate criterion
o justify why new scale required
o may be no other measure available
§ criterion groups
· test of political conservatism
o give test to radical & conservative politicians
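A sketch of the criterion-groups idea above (Python; the group scores are hypothetical, and an independent-samples t-test is just one way to compare them):

from scipy.stats import ttest_ind

# hypothetical conservatism-scale scores for the two criterion groups
radical_politicians = [12, 15, 10, 14, 11, 13]
conservative_politicians = [24, 27, 22, 25, 26, 23]

t, p = ttest_ind(conservative_politicians, radical_politicians)
print(f"t = {t:.2f}, p = {p:.4f}")
# a clear difference in the expected direction supports criterion validity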
- Predictive validity
o usefulness in predicting future events etc
o correlation between initial test and secondary outcome.
- Construct validity
o test scale performance in terms of theoretically derived hypotheses concerning the nature of the underlying variable or construct.
§ convergent validity – relationship with a related construct
§ discriminant validity – little or no relationship with an unrelated construct
o compare scale scores for groups of people who are known to differ in terms of trait / characteristic under investigation
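A sketch of convergent vs. discriminant evidence (Python; all scores and the measure names are hypothetical):

from scipy.stats import pearsonr

new_scale         = [16,  8, 19, 12, 14, 10, 17,  9]
related_measure   = [15, 10, 18, 11, 15,  9, 16, 10]   # theoretically related construct
unrelated_measure = [ 6,  6,  4,  5,  7,  5,  4,  3]   # theoretically unrelated construct

r_conv, _ = pearsonr(new_scale, related_measure)     # expect a high correlation
r_disc, _ = pearsonr(new_scale, unrelated_measure)   # expect a low correlation
print(f"convergent r = {r_conv:.2f}, discriminant r = {r_disc:.2f}")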
- Incremental validity
o ability of scale to contribute something over and above that offered by existing scales
o compare predictive power of new scale to established scale
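A sketch of incremental validity as a comparison of nested regressions (Python; existing_scale, new_scale and outcome are made-up variables): the question is how much the new scale adds to the variance explained in the outcome.

import numpy as np

def r_squared(X, y):
    """R-squared from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

# hypothetical data
existing_scale = np.array([10, 14,  9, 16, 12, 15, 11, 13])
new_scale      = np.array([ 5,  8,  4,  9,  7,  8,  5,  7])
outcome        = np.array([20, 30, 18, 34, 26, 31, 21, 28])

r2_existing = r_squared(existing_scale, outcome)
r2_both     = r_squared(np.column_stack([existing_scale, new_scale]), outcome)
print(f"R2 existing only = {r2_existing:.2f}, R2 with new scale = {r2_both:.2f}")
# the gain in R2 is the incremental contribution of the new scale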
Summary
- points to be covered when presenting newly developed scale
o statement of what the scale measures
o justification for the scale
§ uses
§ advantages over existing scales
o how the pool of items was drawn up
§ sources
§ special steps re content / face validity
o description of sample used for testing
o descriptive statistics
o reliability statistics
o validity statistics
o scale / instructions / questions