HMS777: Scale Development Seminar

Scale Development Seminar – Lucy Busija

- Why measurement is important

o Examples

§ Exams

§ Consumer research

§ Performance appraisal

§ Myers-Briggs score

§ [GW – get Anna to do thi]

- Reflection of underlying condition

o Eg, tone of voice as one indicator of suicidal tendency

- Questions can bias response

o Qualifiers → can "force" an answer

o Double barrelled

o Interpretation

o Graduated scale – scale needs sufficient granularity to capture detail of respondent's position

o Reactivity → eg questions on meaning of life

§ Can prompt reflection and therefore change perception

- Holmes-Rahe Social Readjustment Rating Scale

o Does giving scale of impact influence response

o Good vs bad stress → lumps together

- Reversing direction of questions

o Traditional wisdom

§ Make people stop and think

o Now

§ People do genuinely differ in how they respond to positively worded questions vs negatively worded questions.

- Issue of choosing sides vs fence sitting

- Theories of measurement

o Classical

o Generalisability

o Item Response

- Classical

o All error is random

o Cronbach's alpha (CA) → does CA relate to Generalisability Theory

o Systematic error → criticism of Classical Theory

o Systematic error doesn't cancel out

TRUE SCORE + random error + systematic error → observed score
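
The diagram can be written out as a short sketch. The symbols X (observed score), T (true score), E_r (random error) and E_s (systematic error) are notation assumed here, not taken from the seminar.

```latex
\begin{align*}
X &= T + E_r + E_s
  && \text{(observed = true + random + systematic)} \\
\sigma_X^2 &= \sigma_T^2 + \sigma_{E_r}^2
  && \text{(classical case: } E_s = 0,\ \operatorname{Cov}(T, E_r) = 0\text{)} \\
\text{reliability} &= \frac{\sigma_T^2}{\sigma_X^2}
\end{align*}
```

Over repeated measurement the mean of E_r tends to zero, so random error averages out. E_s has a non-zero mean, so it stays in the average, which is the criticism noted in the bullets.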

Response to shortcomings of Classical Theory is Generalisability Theory

[GW: look at missing data classification]

- Intervention

o Placebo effect

o Reactivity

ANOVA model → sources of variation

IRT

[GW: need to get some understanding on this]

- Range of difficulty

o From easy

o To hard

Cross cultural differences

- What makes a good scale

- Setting / context / use

- Purpose / context

- Population

- No absolute measure of reliability / validity

can only say a scale is good / effective in context of purpose of scale

ruler analogy

o measure

o weigh

- Sensitivity / responsiveness → can a test distinguish between 2 levels of a characteristic

o Ie high vs low self esteem

- Effect size of intervention

- Reliability

- Consistency / robust

- Amount of error in test score

- Variability

- Types

o Test / retest

o Equivalent forms

o Split half

o Inter rater agreement

o Internal consistency → Cronbach's alpha – what is the math
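
On "what is the math": the usual textbook form of Cronbach's alpha for k items is alpha = k/(k-1) × (1 − sum of item variances / variance of total score). Below is a minimal Python sketch of that formula. The function name and the data are made up for illustration and do not come from the seminar.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one inner list per item, each holding that item's score
    for every respondent (respondents in the same order throughout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent = sum of that respondent's item scores.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(pvariance(item) for item in items)
    total_variance = pvariance(totals)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 3 items answered by 5 respondents.
example_items = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 5, 1],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 3))
```

When items move together, the variance of the total is much larger than the sum of the item variances, and alpha approaches 1.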

- Volatile characteristics → mood / anxiety

- Don't want real change to occur between test / retest

- Memory

- Difficulty

- Intangible measure

- Pearson's correlation

- what happens → all score 10% higher in retest?

- systematic change → Pearson's would not have detected this → what is the test that would have detected this change?

- why does this impact reliability

- would not normally administer test twice

- like treatment + placebo effect vs just placebo
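
A small sketch of the "all score 10% higher" question, with made-up scores. Pearson's r only measures how well the two sets of scores line up, so a uniform shift leaves it at 1. One candidate for detecting the shift (an assumption here, the seminar did not name the test) is a paired t-test on the test/retest differences. An agreement-type coefficient such as the ICC is another.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def paired_t(x, y):
    """Paired t statistic for the mean of the differences (y - x)."""
    diffs = [b - a for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / sqrt(n))


# Hypothetical scores: everyone scores 10% higher at retest.
test_scores = [50, 60, 70, 80, 90]
retest_scores = [s * 1.1 for s in test_scores]

r = pearson_r(test_scores, retest_scores)          # stays at 1 (up to rounding)
mean_shift = sum(retest_scores) / 5 - sum(test_scores) / 5
t_stat = paired_t(test_scores, retest_scores)      # large, flags the shift
print(round(r, 6), round(mean_shift, 2), round(t_stat, 2))
```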

- ICC

- agreement coefficient

- split half reliability

- Spearman-Brown correction

- why are long tests more reliable

- [GW → why not do multiple splits with resampling]
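
A sketch of the Spearman-Brown formula and of the "multiple splits" idea. The formula predicts reliability when a test is lengthened by a factor n: r_new = n·r / (1 + (n − 1)·r), which grows with n and is one way to see why longer tests come out more reliable. The helper names and the data are hypothetical. Averaging over random splits is the GW suggestion, not something the seminar prescribed.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r, length_factor=2):
    """Predicted reliability when test length is multiplied by length_factor."""
    return length_factor * r / (1 + (length_factor - 1) * r)


def mean_split_half(items, n_splits=200, seed=0):
    """Average Spearman-Brown corrected split-half estimate over random splits.

    items: one inner list per item, each with every respondent's score.
    Assumes an even number of items so that the halves are equal in length.
    """
    rng = random.Random(seed)
    k = len(items)
    n_people = len(items[0])
    estimates = []
    for _ in range(n_splits):
        order = list(range(k))
        rng.shuffle(order)
        half_a, half_b = order[: k // 2], order[k // 2:]
        score_a = [sum(items[i][p] for i in half_a) for p in range(n_people)]
        score_b = [sum(items[i][p] for i in half_b) for p in range(n_people)]
        estimates.append(spearman_brown(pearson_r(score_a, score_b)))
    return sum(estimates) / len(estimates)


# Hypothetical data: 4 items, 6 respondents.
items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 4, 3],
    [1, 3, 3, 4, 5, 2],
    [2, 1, 4, 4, 5, 3],
]
full_from_half = spearman_brown(0.6)        # half-test r of 0.6 -> full test
four_times_longer = spearman_brown(0.6, 4)  # same r, test made 4x longer
resampled_estimate = mean_split_half(items)
print(round(full_from_half, 3), round(four_times_longer, 3), round(resampled_estimate, 3))
```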

- Internal Consistency

- Cronbach's alpha (CA)

- how does this differ from previous

- longer tests → duplicate or redundant questions → artificially increase CA

- complexity of concept

o fatigue – not very complex

o depression – very complex

- not suitable for questionnaires with internal order like

o from easy to hard questions

- CA → proportion of total score variance that is true score variance

- FDA guidelines

- nature of concept → diffuse / intangible

- reliability → absence of variance

- item – total correlation

o partial auto correlation

o item – remainder correlation

- squared multiple correlation
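
A sketch of the difference between item-total and item-remainder correlation. The item-total correlation includes the item in the total it is correlated with (the "partial auto correlation" in the bullet). The item-remainder version removes the item from the total first. Data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def item_correlations(items):
    """Return (item-total, item-remainder) correlations for every item.

    items: one inner list per item, each with every respondent's score.
    """
    totals = [sum(scores) for scores in zip(*items)]
    results = []
    for item in items:
        # Remainder = total score with this item's contribution removed.
        remainder = [t - s for t, s in zip(totals, item)]
        results.append((pearson_r(item, totals), pearson_r(item, remainder)))
    return results


# Hypothetical data: 4 items, 6 respondents.
items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 4, 3],
    [1, 3, 3, 4, 5, 2],
    [2, 1, 4, 4, 5, 3],
]
correlations = item_correlations(items)
for number, (with_item, without_item) in enumerate(correlations, start=1):
    print(number, round(with_item, 3), round(without_item, 3))
```

With few items the gap between the two values is larger, because each item makes up a bigger share of the total.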

- validity

- conceptual properties of scale

- valid scale must be reliable

- does a scale always need face validity

- content validity

o eg → depression

o scale should cover all aspects of depression

§ loss of sleep

§ loss of appetite

§ etc

- example → walking up stairs → look at this example again

o how often

o comfort / discomfort

- inkblot test → how do you test that for reliability and validity

- convergent / discriminant comparisons

- where is the boundary of a measure.

- Area under ROC → data prediction
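
A sketch of what the area under the ROC curve measures for a scale used to predict group membership. It equals the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as half. The scores and group labels are invented for illustration.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison.

    Counts, over every (positive, negative) pair, how often the positive
    case has the higher scale score. Ties count as half a win.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical scale scores for diagnosed and non-diagnosed respondents.
diagnosed = [14, 18, 22, 25]
not_diagnosed = [5, 9, 14, 16]
auc = roc_auc(diagnosed, not_diagnosed)
print(auc)  # 0.90625
```

An AUC of 0.5 means the scale separates the groups no better than chance, and 1.0 means perfect separation.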

- prepare responses, then write question

- scale development

- DeVellis & Nunnally JC 1994 Psychometric Theory

- writing skill / editing

- IRT

- some items measure low ability / some items measure high ability

o time mgt example → they all measure average time mgt ability

o if all scores have mean around 5, best able to score middle of the range time mgt skills

- a clinical measure that targets high levels of depression – may not work well with low level depression.
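
A sketch of why an item aimed at one level of a trait tells you little about people far from that level. It uses the Rasch (one-parameter logistic) model, which is only one of several IRT models. The seminar notes do not say which model was discussed, so this choice and all names and numbers are assumptions.

```python
from math import exp


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under the Rasch model.

    theta: the person's level on the trait.
    difficulty: the trait level at which endorsement is 50% likely.
    """
    return 1.0 / (1.0 + exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at trait level theta: p * (1 - p)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)


# Hypothetical item that targets a high trait level (difficulty = 2.0).
hard_item = 2.0
info_at_high_level = item_information(2.0, hard_item)   # 0.25, the maximum
info_at_low_level = item_information(-2.0, hard_item)   # close to zero
print(info_at_high_level, round(info_at_low_level, 4))
```

Information peaks where the person's level matches the item difficulty, so a scale whose items all sit at the high end measures low levels imprecisely.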

- how does variance relate to min / max / mean → investigate

o same units?
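
A possible starting point for this investigation, in notation assumed here: scores x_1 … x_N with minimum m, maximum M and mean μ. The bounds are the standard Bhatia-Davis and Popoviciu inequalities.

```latex
\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 \\
0 \;\le\; \sigma^2 &\;\le\; (M - \mu)(\mu - m) \;\le\; \frac{(M - m)^2}{4}
\end{align*}
```

On units: m, M, μ and the standard deviation σ are all in score units, while the variance σ² is in squared score units. The middle bound also shows that when the mean sits close to either end of the scale, the variance is forced to be small.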

- what is the problem if scores bunch up at one end of scale → investigate

- what type of scale works with Cronbach's alpha

o 10 item scale

o 3 item scale

o 2 item scale

- does amalgamating various skewed distributions result in a different distribution

- what is distribution of combination → investigate
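
A small simulation as a first look at this question. It assumes 10 independent, strongly right-skewed items (exponential draws), which is an idealisation. Real scale items are correlated, so the effect would be weaker in practice. Under this assumption the total score is noticeably less skewed than any single item.

```python
import random


def skewness(values):
    """Moment-based sample skewness: third central moment / sd cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(42)  # fixed seed so the run is repeatable
n_respondents, n_items = 5000, 10

# Hypothetical items: each is an independent, right-skewed exponential draw.
items = [
    [rng.expovariate(1.0) for _ in range(n_respondents)]
    for _ in range(n_items)
]
totals = [sum(scores) for scores in zip(*items)]

skew_single = skewness(items[0])
skew_total = skewness(totals)
print(round(skew_single, 2), round(skew_total, 2))
```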

- ability

- perception of ability

- perception can change without ability changing

- reactivity effect

- need to redo intraclass correlation

- why ANOVA

- different from correlation

- when to use
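
A sketch of why the intraclass correlation is built from ANOVA mean squares and how it differs from a plain correlation. This is the one-way random-effects form, ICC = (MSB − MSW) / (MSB + (k − 1)·MSW), which is one of several ICC variants. The seminar notes do not say which variant was meant, and the data are made up.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a single measurement.

    ratings: one inner list per subject, each holding that subject's
    k repeated scores (for example test and retest).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-subjects mean square: spread of subject means.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subjects mean square: disagreement between a subject's own scores.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test/retest pairs for 5 subjects.
identical = [[10, 10], [12, 12], [14, 14], [16, 16], [18, 18]]
shifted = [[10, 15], [12, 17], [14, 19], [16, 21], [18, 23]]  # retest = test + 5

icc_identical = icc_oneway(identical)  # 1.0
icc_shifted = icc_oneway(shifted)      # about 0.23
print(icc_identical, round(icc_shifted, 2))
```

Pearson's r is 1 for both data sets because the ordering and spacing of subjects is unchanged. The ICC drops for the shifted set because the within-subject mean square counts the systematic change as disagreement.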

- known-groups validity hypothesis


Monday, April 11, 2011
