Monday, April 11, 2011

Scale Development Seminar – Lucy Busija

 

-          Why measurement is important

o   Examples

§  Exams

§  Consumer research

§  Performance appraisal

§  Myers-Briggs score

§  [GW – get Anna to do thi]

-          Reflection of underlying condition

o   Eg, tone of voice as one indicator of suicidal tendency

-          Questions can bias response

o   Qualifiers → can "force" an answer

o   Double barrelled

o   Interpretation

o   Graduated scale – scale needs sufficient granularity to capture detail of respondent's position

o   Reactivity → eg questions on meaning of life

§  Can prompt reflection and therefore change perception

-          Holmes-Rahe Social Readjustment Rating Scale

o   Does giving scale of impact influence response

o   Good vs bad stress → lumps together

-          Reversing direction of questions

o   Traditional wisdom

§  Make people stop and think

o   Now

§  People do genuinely differ in how they respond to positively worded questions vs negatively worded questions.

-          Issue of choosing sides vs fence sitting

-          Theories of measurement

o   Classical

o   Generalisability

o   Item Response

-          Classical

o   All error is random

o   Cronbach's alpha (CA) → does CA relate to Generalisability Theory?

o   Systematic error → criticism of Classical Theory

o   Systematic error doesn't cancel out
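
The last two points can be checked with a small simulation. This is only a sketch under my own assumptions (a true score of 50, normally distributed random error, and a constant bias of 3 points as the systematic error); none of these numbers come from the seminar.

```python
import random
import statistics


def simulate_observed(true_score, n, bias=0.0, noise_sd=5.0, seed=0):
    """Return n observed scores: true score + systematic bias + random error."""
    rng = random.Random(seed)
    return [true_score + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


true_score = 50.0
random_only = simulate_observed(true_score, n=10_000)
with_bias = simulate_observed(true_score, n=10_000, bias=3.0)

# Random error averages out: the mean lands close to the true score.
print(round(statistics.mean(random_only) - true_score, 2))
# Systematic error does not: the mean stays about 3 points too high.
print(round(statistics.mean(with_bias) - true_score, 2))
```

Averaging more observations shrinks the first number towards zero but leaves the second one near 3.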

[Diagram: TRUE SCORE → observed score, with random error and systematic error both acting on the observed score]

Response to shortcomings of Classical Theory is Generalisability Theory

 

[GW: look at missing data classification]

 

-          Intervention

o   Placebo effect

o   Reactivity

 

 

ANOVA model → sources of variation

 

 

IRT

[GW: need to get some understanding on this]

 

-          Range of difficulty

o   From easy

o   To hard

 

Cross cultural differences

 

 

-          What makes a good scale

-          Setting / context / use

-          Purpose / context

-          Population

-          No absolute measure of reliability / validity

 

can only say a scale is good / effective in context of purpose of scale

ruler analogy

o   measure

o   weigh

 

-          Sensitivity / responsiveness → can a test distinguish between 2 levels of a characteristic

o   Eg high vs low self-esteem

 

 

-          Effect size of intervention

 

 

-          Reliability

-          Consistency / robust

-          Amount of error in test score

-          Variability

 

 

-          Types

o   Test / retest

o   Equivalent forms

o   Split half

o   Inter rater agreement

o   Internal consistency → Cronbach's alpha – what is the math
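
On the "what is the math" question: the usual textbook form of Cronbach's alpha for a scale of k items is given below. The symbols are my own choice, not the seminar's: \sigma_i^2 is the variance of item i and \sigma_X^2 is the variance of the total score.

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^{2}}{\sigma_X^{2}}\right)
```

When items covary strongly, the total-score variance is much larger than the sum of the item variances, so the ratio is small and alpha approaches 1.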

 

-          Volatile characteristics → mood / anxiety

 

-          Don't want real change to occur between test / retest

-          Memory

-          Difficulty

 

-          Intangible measure

 

 

-          Pearson's correlation

 

-          what happens → all score 10% higher in retest?

-          systematic change → Pearson's would not have detected this → what is the test that would have detected this change?

-          why does this impact reliability

-          would not normally administer test twice

-          like treatment + placebo effect vs just placebo
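
A quick check of the "10% higher in retest" question, as a sketch on made-up scores. The paired t statistic is my suggestion for a test that would pick up the shift; the seminar did not name one in these notes.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic for the mean of the differences y - x."""
    diffs = [b - a for a, b in zip(x, y)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


first = [40.0, 50.0, 60.0, 70.0, 80.0]       # hypothetical test scores
second = [score * 1.10 for score in first]   # everyone scores 10% higher on retest

print(round(pearson_r(first, second), 6))    # correlation ignores the shift
print(round(paired_t(first, second), 2))     # a large t flags the systematic change
```

Pearson's only looks at whether people keep the same relative position, so a shift that moves everyone in the same direction is invisible to it.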

 

 

-          ICC

-          agreement coefficient
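
To see how an agreement coefficient differs from a plain correlation, here is a sketch of the intraclass correlation from a two-way ANOVA breakdown (subjects by occasions). I am assuming the single-measure "consistency" and "absolute agreement" forms; the notes do not say which ICC variant was presented, and the scores are made up.

```python
def icc_two_way(scores):
    """scores: one row per subject, one column per occasion or rater.

    Returns (consistency ICC, absolute-agreement ICC), single-measure forms.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sources of variation: subjects (rows), occasions (columns), residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    consistency = (msr - mse) / (msr + (k - 1) * mse)
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return consistency, agreement


# Hypothetical test / retest data: everyone scores exactly 6 points higher on retest.
scores = [[40, 46], [50, 56], [60, 66], [70, 76], [80, 86]]
consistency, agreement = icc_two_way(scores)
print(round(consistency, 3), round(agreement, 3))
```

The consistency form behaves like a correlation and ignores the constant shift, while the agreement form is pulled below 1 because the occasion (column) variance counts against it.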

 

 

-          split half reliability

-          Spearman-Brown correction (steps the half-test correlation up to full test length)

-          why are long tests more reliable

-          [GW: why not do multiple splits with resampling]
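
A sketch of the resampling idea: split the items at random many times, correlate the two half scores, apply the Spearman-Brown step-up r_full = 2 r_half / (1 + r_half), and average. The data are simulated under my own assumption that each item is the respondent's trait plus independent noise.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to the full test length."""
    return 2 * r_half / (1 + r_half)


def random_split_half(data, n_splits=200, seed=0):
    """Average Spearman-Brown corrected reliability over random item splits."""
    rng = random.Random(seed)
    n_items = len(data[0])
    estimates = []
    for _ in range(n_splits):
        items = list(range(n_items))
        rng.shuffle(items)
        half_a, half_b = items[: n_items // 2], items[n_items // 2:]
        score_a = [sum(row[i] for i in half_a) for row in data]
        score_b = [sum(row[i] for i in half_b) for row in data]
        estimates.append(spearman_brown(pearson_r(score_a, score_b)))
    return statistics.mean(estimates)


# Hypothetical data: 200 respondents, 10 items, each item = trait + noise.
rng = random.Random(1)
traits = [rng.gauss(0, 1) for _ in range(200)]
data = [[t + rng.gauss(0, 1) for _ in range(10)] for t in traits]
print(round(random_split_half(data), 2))
```

The same step-up formula also answers the "long tests" question: adding items with the same quality raises the correlation between parallel forms, because random item-level error averages out over more items.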

 

 

 

-          Internal Consistency

-          Cronbach's alpha

-          how does this differ from previous

-          longer tests → duplicate or redundant questions → artificially increase CA

-          complexity of concept

o   fatigue – not very complex

o   depression – very complex

-          not suitable for questionnaires with internal order like

o   from easy to hard questions

-          CA → estimates the proportion of total score variance that is true-score variance

-          FDA guidelines

-          nature of concept → diffuse / intangible
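
The point about redundant questions can be illustrated with a sketch that computes Cronbach's alpha on simulated data and then again after every item is duplicated verbatim. The data model (trait plus independent noise, 4 items, 300 respondents) is my assumption, not something from the seminar.

```python
import random
import statistics


def cronbach_alpha(data):
    """data: list of respondents, each a list of k item scores."""
    k = len(data[0])
    item_vars = [statistics.variance([row[i] for row in data]) for i in range(k)]
    total_var = statistics.variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


rng = random.Random(0)
traits = [rng.gauss(0, 1) for _ in range(300)]
data = [[t + rng.gauss(0, 1.5) for _ in range(4)] for t in traits]

# Appending an exact copy of every item adds no new information about the
# respondent, yet alpha goes up.
padded = [row + row for row in data]
alpha_original = cronbach_alpha(data)
alpha_padded = cronbach_alpha(padded)
print(round(alpha_original, 2), round(alpha_padded, 2))
```

A high alpha on a long questionnaire is therefore not proof of a good scale; it can simply reflect near-duplicate wording.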

 

 

-          reliability → absence of error variance

 

-          item – total correlation

o   partial auto correlation

o   item – remainder correlation

-          squared multiple correlation
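
A sketch of the difference between the item-total and item-remainder correlations, on a small made-up data set. The item-total version includes the item in the total it is correlated with, which is the "partial auto correlation" problem; the item-remainder version removes the item first.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_total_correlations(data):
    """Return one (item-total, item-remainder) pair of correlations per item."""
    k = len(data[0])
    totals = [sum(row) for row in data]
    results = []
    for i in range(k):
        item = [row[i] for row in data]
        remainder = [t - x for t, x in zip(totals, item)]  # total without this item
        results.append((pearson_r(item, totals), pearson_r(item, remainder)))
    return results


# Hypothetical responses: 6 respondents, 4 items scored 1 to 5.
data = [
    [1, 2, 1, 2],
    [2, 2, 3, 1],
    [3, 4, 3, 4],
    [4, 3, 5, 4],
    [5, 5, 4, 5],
    [2, 1, 2, 2],
]
for item_total, item_remainder in item_total_correlations(data):
    print(round(item_total, 2), round(item_remainder, 2))
```

The item-remainder value is never larger than the item-total value, and the gap is widest on short scales where one item makes up a big share of the total.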

 

 

 

-          validity

-          conceptual properties of scale

-          valid scale must be reliable

 

 

-          does a scale always need face validity

 

 

-          content validity

o   eg → depression

o   scale should cover all aspects of depression

§  loss of sleep

§  loss of appetite

§  etc

-          example → walking up stairs → look at this example again

o   how often

o   comfort / discomfort

 

 

-          inkblot test → how do you test that for reliability and validity

 

 

-          convergent / discriminant comparisons (construct validity)

-          where is the boundary of a measure.

 

-          Area under ROC curve → data prediction

 

 

 

 

-          prepare responses, then write question

 

 

-          scale development

-          DeVellis; Nunnally JC, 1994, Psychometric Theory

 

 

-          writing skill / editing

-          IRT

-          some items measure low ability / some items measure high ability

o   time mgt example → they all measure average time mgt ability

o   if all scores have mean around 5, best able to score middle of the range time mgt skills

-          a clinical measure that targets high levels of depression – may not work well with low level depression.
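
The idea that items target different ability levels can be sketched with the simplest IRT model, the one-parameter logistic (Rasch) model. Choosing this model is my assumption, as are the item difficulties; the seminar notes do not say which IRT model was shown.

```python
import math


def p_endorse(theta, difficulty):
    """Rasch item: probability of a positive response at trait level theta."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Information of a Rasch item, p * (1 - p); it peaks where theta == difficulty."""
    p = p_endorse(theta, difficulty)
    return p * (1.0 - p)


def scale_information(theta, difficulties):
    """Information of a whole scale is the sum over its items."""
    return sum(item_information(theta, b) for b in difficulties)


# Hypothetical clinical scale whose items all target severe depression.
severe_items = [2.0, 2.5, 3.0]
for theta in (-1.0, 0.0, 2.5):
    print(theta, round(scale_information(theta, severe_items), 3))
```

Information is highest near the item difficulties and falls away for respondents far below them, which is why a scale built for severe cases says little about mild ones.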

 

 

-          how does variance relate to min / max / mean → investigate

o   same units ?
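
A partial answer to both questions, using my own symbols: for a score X with minimum m, maximum M and mean \mu, the variance is in squared score units (the standard deviation is in the same units as min, max and mean), and it is bounded as follows.

```latex
\sigma^{2} = \mathrm{E}\!\left[(X-\mu)^{2}\right]
\;\le\; (M-\mu)(\mu-m)
\;\le\; \frac{(M-m)^{2}}{4}
```

The middle bound shrinks as the mean approaches either end of the scale, so scores that bunch up near the minimum or maximum cannot have much variance.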

 

 

-          what is the problem if scores bunch up at one end of scale → investigate

 

 

-          what type of scale works with C alpha

o   10 item scale

o   3 item scale

o   2 item scale
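
One way to look at the effect of scale length is the standardised form of alpha, which depends only on the number of items k and the mean inter-item correlation. This is a sketch; the mean correlation of 0.3 is an arbitrary value I picked for illustration.

```python
def standardized_alpha(k, mean_r):
    """Standardised alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Same item quality (mean correlation 0.3), different scale lengths.
for k in (2, 3, 10):
    print(k, round(standardized_alpha(k, 0.3), 2))
```

With the same item quality, very short scales get a much lower alpha than a 10 item scale, so alpha values are hard to compare across scales of different length.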

 

 

-          does amalgamating various skewed distributions result in a different distribution

-          what is distribution of combination → investigate
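
A first look at this by simulation, as a sketch under a strong assumption of mine: the items are independent and each follows the same right-skewed (exponential) distribution. Real scale items are correlated, so the effect would be weaker in practice.

```python
import random
import statistics


def skewness(values):
    """Population skewness: third central moment divided by sd cubed."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)


rng = random.Random(0)
# Hypothetical data: 5000 respondents, 10 right-skewed items each.
items = [[rng.expovariate(1.0) for _ in range(10)] for _ in range(5000)]

single_item = [row[0] for row in items]
total_score = [sum(row) for row in items]
print(round(skewness(single_item), 2), round(skewness(total_score), 2))
```

Under these assumptions the total score is still skewed in the same direction, but clearly less so than any single item.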

 

 

-          ability

-          perception of ability

-          perception can change without ability changing

 

 

-          reactivity effect

 

 

-          need to redo intraclass correlation

 

 

-          why ANOVA

-          different from correlation

-          when to use

 

 

-          known-groups validity hypothesis
