Thursday, March 31, 2011

de Vaus – Chapter 4 – Developing Indicators for Concepts


-          must translate concepts into form in which they are measurable

 

Clarifying the Concepts

 

-          concept – abstract summary of a whole set of behaviours, attitudes and characteristics → which have something in common

-          define concept, then develop indicators for concept as it has been defined.

-          how to clarify concepts

o   obtain a range of definitions of the concept.

o   identify common elements

o   develop definition based on these

o   or

o   distinguish between different ways concept has been used

-          decide on a definition

o   justify decision

o   to assign a definition → give it a nominal definition / working definition

-          delineate the dimensions of the concept

o   eg, social, political, psychological, etc

o   concept mapping

 

 

Developing Indicators

 

-          descending the ladder of abstraction

-          sub dimensions

-          to develop indicators

o   how many indicators to use

o   how to develop the indicators

o   how to form items into a questionnaire

-          how many indicators to use

o   if no agreed way of measuring a concept, develop indicators for range of definitions

o   if concept is multidimensional, are you really interested in all dimensions

o   ensure key concepts are thoroughly measured

o   number of questions to capture scope of concept

o   pilot test to eliminate unnecessary questions

o   practical considerations – eg, overall length of questionnaire

-          how to develop indicators

o   where possible, use well established indicators

o   some groups – use a less structured approach to data collection

o   use informants

 

Evaluating Indicators

 

-          reliability

o   obtain same result on repeated occasions

o   sources of unreliability

§  bad wording of question

§  interviewer effect

§  coding error

§  questions on which respondent has no opinion

o   testing reliability

§  best methods only apply to measuring reliability of scales where we have a set of questions to measure one concept – rather than single item indicators

§  single question to measure concept

·         test-retest method (see sketch below)

o   increasing reliability

§  use multiple item indicators

§  careful question wording

§  interviewer training

§  working out coding methods

-          validity

o   measures what it is intended to measure

o   not the measure itself → the use to which the measure is put

o   criterion validity

§  compare how people answered our new measure of the concept with existing, well-accepted measures of the concept

·         assume existing measure is valid?

§  criterion groups

·         eg, new measure of political conservatism given to members of both conservative and radical groups (see sketch below)

o   content validity

§  eg, test of arithmetic ability

o   construct validity

§  how well measure conforms with theoretical expectations

-          problem of meaning
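
A minimal sketch of the test-retest and criterion-groups checks above, assuming numpy/scipy are available; the scores and group data are hypothetical.

```python
# Sketch only: illustrative data, hypothetical variable names.
import numpy as np
from scipy import stats

# Test-retest reliability: the same single-item measure given to the same
# respondents on two occasions; a high correlation suggests the item is reliable.
wave1 = np.array([4, 2, 5, 3, 4, 1, 5, 2])
wave2 = np.array([4, 3, 5, 3, 5, 1, 4, 2])
r, _ = stats.pearsonr(wave1, wave2)
print(f"test-retest correlation r = {r:.2f}")

# Criterion-groups validity: give the new conservatism measure to members of
# known conservative and radical groups; the group scores should differ clearly.
conservative_group = np.array([8, 7, 9, 8, 7])
radical_group = np.array([3, 4, 2, 3, 5])
t, p = stats.ttest_ind(conservative_group, radical_group)
print(f"group difference t = {t:.2f}, p = {p:.3f}")
```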

 

LO – Topic 5 – Coding and Cleaning Data


-          Understand purpose of coding

-          Be familiar with standard code frames, such as those developed by ABS

-          Be able to develop code frames for open response questions

-          Be able to prepare variables for analysis

-          Know how to change, collapse and reorder categories of variables

-          Know how to create new variables from existing ones

-          Know how to deal with missing data

 


-          Reasons why open-ended questions are used: range of possible responses is large / response options unknown / general feelings / reasons for opinions

-          Codes – pre-existing versus developed from responses

-          Coding missing data / reasons for missing data

-          Sources of coding error

-          checking for error – valid range / filter checks / logical checks

-          changing categories

o   reduce number of categories

§  substantive – categories have something in common (eg industry based categories)

§  distributional

o   rearrange → more logical order – rearrange industries by level of unionization

o   reverse coding

-          Create new variables

o   develop scales

o   conditional transformations

o   arithmetic transformations

-          standardizing variables

o   relativity

o   units of measure not comparable

o   different distributions

-          Missing data

o   check for bias – split group in two

o   dealing with

§  delete

·         listwise

·         pairwise

·         variable

§  statistical imputation

·         sample mean

·         group mean

·         random assignment

·         regression analysis

 


 

 

Monday, March 28, 2011

SG: Topic 5 : Coding and Cleaning Survey Data


 

Coding Open Ended Questions

-          classifying answers / converting to numbers

-          open ended

o   attribute information – range of answers is too large

o   attitudinal – response options are unknown / feedback is required.

o   general feelings

o   reasons for opinions

-          other category

-          trade-off: more codes, more detail / fewer codes, easier analysis

-          codes

o   pre-existing

§  systematic / developed by experts

§  publicly available / coding transparent

§  same coding for repeated surveys

§  facilitate comparisons

o   developed based on responses

§  read selection of responses

§  summarise responses into themes

§  if required – group themes into broad topics

§  generate frequency distribution for each theme
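
A minimal sketch of the last steps above, assuming pandas; the responses, themes and numeric codes are hypothetical.

```python
# Sketch only: open-ended responses already summarised into themes.
import pandas as pd

responses = pd.DataFrame({
    "why_moved": ["work", "family", "housing cost", "work", "family", "work"],
})

# Map each theme to a numeric code (illustrative codes, not a standard frame).
code_frame = {"work": 1, "family": 2, "housing cost": 3}
responses["why_moved_code"] = responses["why_moved"].map(code_frame)

# Frequency distribution for each theme.
print(responses["why_moved"].value_counts())
```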

 

Thematic Coding

 

 

 

Coding Missing Data

 

-          different from valid code

-          reasons

o   not required to answer

o   not ascertained

o   refused to answer

o   did not know answer / no opinion
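
One common convention (a sketch only, not de Vaus's exact codes) is to give each reason its own out-of-range code and convert them all to missing before analysis; pandas assumed.

```python
# Sketch only: hypothetical missing-data codes for a 1-5 attitude item.
import pandas as pd

MISSING_CODES = {
    96: "not required to answer",
    97: "not ascertained",
    98: "refused to answer",
    99: "did not know / no opinion",
}

satisfaction = pd.Series([3, 5, 99, 2, 98, 4, 96, 1])

# Keep the reasons distinct in the raw file, but treat them all as missing
# when analysing the substantive codes 1-5.
clean = satisfaction.mask(satisfaction.isin(list(MISSING_CODES)))
print(clean)
```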

 

Checking for Coding Error

 

-          sources of error

o   data entered in wrong column

o   miscoding

§  data collection

§  manual coding

§  data entry

-          methods for checking coding errors

o   valid range checks

o   filter checks

o   logical checks
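
A sketch of the three checks with pandas; the file layout and column names are hypothetical.

```python
# Sketch only: hypothetical survey file.
import pandas as pd

df = pd.DataFrame({
    "sex": [1, 2, 2, 7, 1],                  # valid codes: 1, 2 (9 = missing)
    "employed": [1, 0, 1, 1, 0],             # 1 = employed, 0 = not employed
    "job_satisfaction": [4, 3, 5, 2, None],  # should be blank if not employed
    "parent_age": [35, 42, 29, 51, 38],
    "child_age": [10, 44, 3, 20, 12],
})

# Valid range check: flag codes outside the allowed set.
bad_range = df[~df["sex"].isin([1, 2, 9])]

# Filter check: respondents skipped past a question should not have answers to it.
bad_filter = df[(df["employed"] == 0) & df["job_satisfaction"].notna()]

# Logical check: a child's age must be plausible given the parent's age.
bad_logic = df[df["child_age"] >= df["parent_age"] - 12]

print(bad_range, bad_filter, bad_logic, sep="\n\n")
```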

 

 

Preparing Variables for Analysis

 

-          Changing categories

o   initial coding results in more categories than we require

§  recode occupational categories into white / blue collar

o   too few subjects in some categories

o   collapsing categories can highlight patterns in data (but can also mask a relationship)

o   approaches

§  substantive

·         combining categories that have something in common

o   industry based categories

o   amount of training

·         divide categories of variables into equal lots  [gw ?]

§  distributional

·         restricted to ordinal and interval variables

·         divide sample into roughly equal sized groups of cases

-          rearranging categories

o   arrange categories in more logical order

§  more appropriate to focus of analysis

§  tables easier to read

§  changing level of measurement of variable and thus affecting the methods of analysis that can be applied to variable

o   example

§  organize industry categories according to level of unionization

-          reverse coding

o   when constructing scales

o   change direction of scale to be consistent
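
A sketch of the substantive and distributional approaches plus reordering, assuming pandas; category labels and cut points are hypothetical.

```python
# Sketch only: hypothetical occupation and income data.
import pandas as pd

df = pd.DataFrame({
    "occupation": ["manager", "clerk", "labourer", "technician", "clerk", "driver"],
    "income": [90_000, 45_000, 38_000, 60_000, 52_000, 41_000],
})

# Substantive collapsing: combine categories that have something in common.
collar_map = {
    "manager": "white collar", "clerk": "white collar", "technician": "white collar",
    "labourer": "blue collar", "driver": "blue collar",
}
df["collar"] = df["occupation"].map(collar_map)

# Distributional collapsing: split an ordinal/interval variable into
# roughly equal-sized groups (here income tertiles).
df["income_band"] = pd.qcut(df["income"], q=3, labels=["low", "medium", "high"])

# Rearranging: impose a more logical category order so tables print that way.
df["collar"] = pd.Categorical(df["collar"],
                              categories=["blue collar", "white collar"],
                              ordered=True)
print(df)
```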

 

 

Creating New Variables

 

-          create new variables

o   developing scales

o   conditional transformations

§  eg, marital history of both husband and wife

o   arithmetic transformations

§  age difference between husband and wife
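
A sketch of both kinds of transformation, assuming pandas; the couple-level variables are hypothetical.

```python
# Sketch only: hypothetical couple-level records.
import pandas as pd

couples = pd.DataFrame({
    "husband_age": [34, 58, 45],
    "wife_age": [31, 60, 39],
    "husband_prev_married": [0, 1, 0],   # 0 = no, 1 = yes
    "wife_prev_married": [0, 1, 1],
})

# Arithmetic transformation: a new variable from arithmetic on existing ones.
couples["age_difference"] = couples["husband_age"] - couples["wife_age"]

# Conditional transformation: combine the two marital-history variables
# into one couple-level variable.
def couple_history(row):
    if row["husband_prev_married"] and row["wife_prev_married"]:
        return "both previously married"
    if row["husband_prev_married"] or row["wife_prev_married"]:
        return "one previously married"
    return "neither previously married"

couples["marital_history"] = couples.apply(couple_history, axis=1)
print(couples)
```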

 

Standardising Variables

 

-          interested in scores relative to other people in sample

-          comparing across studies where units of measure are not comparable (eg, income)  ??

-          remove inflation

-          interval level → z-scores

-          ordinal level → percentiles
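
A sketch of z-scoring an interval-level variable, pandas assumed; the incomes are hypothetical.

```python
# Sketch only: standardise income so scores are expressed relative to the sample.
import pandas as pd

income = pd.Series([30_000, 45_000, 52_000, 75_000, 120_000])

# z-score = (value - mean) / standard deviation: how far each case sits from
# the sample mean, in standard deviation units.
z = (income - income.mean()) / income.std()
print(z.round(2))
```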

 

 

Dealing with missing data

 

-          checking for missing data bias

o   divide sample into 2 groups based on whether particular variable is missing data or not

o   cross tab

-          methods for dealing with missing data

o   deleting either cases or variables

§  listwise deletion

·         any case with missing data deleted

·         issues

o   loss of data / reduction in sample size

§  pairwise deletion

·         use only cases with complete data for each calculation

§  deletion of variable

o   statistical imputation

§  sample means

·         value of mean of that variable

§  group means

·         divide sample into groups on background variable

·         issue

o   exaggerates extent to which people in a group are similar

o   inflates correlation between variables

§  random assignment within groups

·         divide sample into groups on background variable

·         replace missing value with value of same variable from the nearest preceding case

·         maintains variability

§  regression analysis
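
A sketch of the bias check and some of the simpler treatments listed above, assuming pandas; variables and values are hypothetical.

```python
# Sketch only: hypothetical data with missing income values.
import pandas as pd

df = pd.DataFrame({
    "sex":    ["m", "f", "f", "m", "f", "m"],
    "income": [50_000, None, 42_000, None, 61_000, 55_000],
    "age":    [25, 31, 44, 52, 29, 38],
})

# Check for missing-data bias: split the sample on whether income is missing
# and cross-tabulate against another variable.
df["income_missing"] = df["income"].isna()
print(pd.crosstab(df["income_missing"], df["sex"]))

# Listwise deletion: drop any case with missing data (loses cases).
listwise = df.dropna()

# Pairwise deletion happens implicitly in many routines: each calculation
# uses only the cases complete for the variables involved.

# Sample-mean imputation: replace missing values with the overall mean.
sample_mean = df["income"].fillna(df["income"].mean())

# Group-mean imputation: replace with the mean of the case's group
# (exaggerates within-group similarity and can inflate correlations).
group_mean = df.groupby("sex")["income"].transform(lambda s: s.fillna(s.mean()))

print(listwise, sample_mean, group_mean, sep="\n\n")
```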

 

 

 

 

 

 

 

Gw comment

 

-          is a relationship being masked as a result of collapsing – same as / similar to Simpson's paradox?
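
A tiny numeric sketch of that worry (the counts are purely illustrative – the classic Simpson's-paradox pattern, not data from this course): one category has the higher rate within both groups, yet the comparison reverses once the grouping variable is collapsed away.

```python
# Sketch only: illustrative counts showing how collapsing can mask/reverse a relationship.
import pandas as pd

counts = pd.DataFrame({
    "group":   ["small", "small", "large", "large"],
    "method":  ["A", "B", "A", "B"],
    "success": [81, 234, 192, 55],
    "total":   [87, 270, 263, 80],
})
counts["rate"] = counts["success"] / counts["total"]
print(counts)          # method A has the higher rate within BOTH groups

collapsed = counts.groupby("method")[["success", "total"]].sum()
collapsed["rate"] = collapsed["success"] / collapsed["total"]
print(collapsed)       # ...but B looks better once the groups are collapsed
```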

 

Sunday, March 27, 2011

The Effect of the Question on Survey Responses: A Review – Graham Kalton / Howard Schuman, 1982

 

Introduction

-          survey responses – sensitive to exact wording, format and placement of questions asked.

-          factual / non factual

o   factual component overlaid with evaluation

§  how is your health now

§  low level of correlation between perception and fact.

-          validity studies

o   possible with factual only

-          methods effect

o   artifact of the set of measuring instruments employed

-          consistency vs validity

o   split ballot experiments

-          possible influences on responses

o   factual

§  definition

§  comprehension

§  memory

§  social desirability

o   non factual

§  issues of balance

§  offer of middle alternatives

§  order of presentation of alternatives

 

 

Question Effects with Factual Questions

 

 

-          precise definition of fact to be collected → apparently marginal changes in definition can have profound effects

o   unemployment / labor force

o   definition of a room

-          respondents need to comprehend question

o   meaning of 'weekday'

o   meaning of proportion

-          recall

o   length of recall period

o   salience

o   choice of appropriate [time] reference period

o   bounded recall

o   minimise memory errors

§  use of records

§  aided recall techniques

§  diaries

-          social desirability bias

o   randomized response techniques (see the estimator sketch below)

§  any gain in bias reduction has to be offset with sizable increase in sampling error

§  hampers analysis of relationship between threatening question and other variables

-          long question vs short question

o   keep it short vs keep it simple

o   difficulties from long questions probably derive from their complexity rather than their length per se

-          instructions

-          use of feedback

-          securing respondent feedback
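
The notes above only name the randomized response technique; as a sketch, the classic Warner design works roughly like this: a randomising device privately tells each respondent to answer either the sensitive statement (with probability p) or its negation, so the interviewer only sees the overall proportion of "yes" answers and backs out the sensitive proportion.

```python
# Sketch only: Warner-style randomized response estimator.
def estimate_sensitive_proportion(prop_yes: float, p: float) -> float:
    """Estimate the proportion holding the sensitive attribute.

    prop_yes : observed proportion answering "yes"
    p        : probability the device selected the sensitive statement
               rather than its negation (must not be 0.5)
    """
    return (prop_yes - (1.0 - p)) / (2.0 * p - 1.0)

# If 40% answered "yes" and the device pointed to the sensitive statement
# 70% of the time, the estimated sensitive proportion is 0.25.
print(estimate_sensitive_proportion(0.40, 0.70))
```

The (2p − 1) denominator is what inflates the sampling error as p approaches 0.5 – the trade-off flagged in the notes above.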

 

 

Question Effects with Non factual Questions

 

-          conceptualization of construct to be measured

-          don't worry about marginal distribution of answers

-          look for some form of correlational analysis

 

 

-          treatment of don't knows

o   explicit don't know category

o   Specifically ask – do you have an opinion on ……

-          open or closed questions

o   open

§  responses nominal in nature and sizable in number

-          use of balance

-          acquiescence

-          middle alternatives

-          order of alternatives

 

 

General Question Effects

 

-          effects noted for one type of question can apply to other type – ie sensitivity

-          question re income

-          presence of other questions in questionnaire / relative position

o   question order / context

-          examples found of order effects

o   general question on issue with specific question on same issue

o   position of response on long lists

o   crime survey – inclusion of attitude questions – increased reported victimization rates

§  attitude question may have stimulated respondent memory / awareness

 

 

Concluding Remarks

 

-          survey questioning not a precision tool

-          need to understand nature of context effects

-          strong defence against artifacts described in this paper – use multiple questions / contexts and modes of research

-          by tying an important concept to at least a few items that differ among themselves in form, wording, and context, the investigator is unlikely to be trapped into mistaking a response artifact for a substantive finding.

 

 

Discussion

 

-          tacit support for view that since marginal results are difficult to measure, the concentration on comparisons is acceptable

-          surveys need to derive valid methods to measure proportion supporting particular issue

 

→ comment by Hedges

o   much could be learnt from more open discussion and criticism of questions, even in the absence of specific experiments

o   implicit assumption in literature that in absence of hard evidence there is little point in speculating or reasoning about question effects because any opinion is likely to be as good as any other

→ [GW comment: why non-probability surveys can provide valuable info]

 

 

 

-          this raises the question of whether we model bias in the mean brought about by inaccurate measurement and / or bias in measures of relationships brought about by correlated errors, which may be a more appropriate distinction than the standard one proposed in the paper between "factual" and "opinion" questions

 

-          a lot of survey research is concerned with breaking down concepts into their component parts and asking questions on each one of them

 

o   income

o   takes 20 or 30 questions to actually disentangle it

 

 

-          example of chronic sickness and long term incapacity

 

 

 

-          questioning problems

o   imprecision

§  will lead to individual respondents applying their own interpretations

o   method effects

o   respondent task problems

§  eg, memory / recall

o   asking different questions

 

 

 

-          donkey vote

-          using electoral research → position on ballot paper

 

 

 

 

-          questions and responses

-          should be : stimuli and responses

 

 

-          long questions → give people a chance to think and you get different replies

 

 

 

Topic 5 : de Vaus : Preparing Variables for analysis - Chapter 10


 

Changing Categories

-          collapsing categories

o   recode occupational into white collar / blue collar

o   categories with few subjects

o   for cross tabs etc, too many categories is cumbersome

o   eg, change direction & strength to strength only

-          approaches to collapsing categories

o   substantive approach

§  combining categories that are contiguous in meaning

o   distributional approach

§  recoding variables where categories / values have natural order

·         coding incomes into high / medium / low

·         divide distribution into equal sized groups

o   rearranging categories

§  new logic

§  eg, order industries by decreasing unionism

§  moving, say, a 'no qualification' category

o   reverse coding

 

 

Creating New Variables

 

-          developing scales

-          conditional transformations

-          creating new variables with arithmetic transformations

o   age difference (between spouses)

 

Standardizing Variables

 

-          standardizing using z scores

-          standardizing for different distributions

o   same units, different distributions

o   eg, VCE scores

-          adjustments with ordinal level variables

o   ranking
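
A sketch of the ranking adjustment, pandas assumed; the scores are hypothetical.

```python
# Sketch only: convert raw ordinal scores to percentile ranks so two
# differently distributed variables sit on the same 0-1 scale.
import pandas as pd

score = pd.Series([12, 18, 25, 25, 40, 55])
print(score.rank(pct=True))
```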

 

Problem of Missing Data

 

-          checking for missing data bias

o   group sample into 2 groups : missing / valid → how did they answer other questions

-          minimizing effect of missing data

o   delete cases – list wise

o   delete variables

o   pairwise deletion

o   sample mean approach

o   group means approach

o   random assignment within groups (see sketch below)

§  locate case with missing data → take the value of the same variable from the nearest preceding case with a valid code

§  introduces randomness

§  does not affect strength of correlations

o   regression analysis
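
A sketch of the "nearest preceding case within the group" idea as a group-wise forward fill in pandas; the variables are hypothetical.

```python
# Sketch only: replace a missing value with the value of the same variable
# from the nearest preceding case in the same group.
import pandas as pd

df = pd.DataFrame({
    "sex":    ["m", "m", "m", "f", "f", "f"],
    "income": [50_000, None, 47_000, 61_000, None, None],
})

# Unlike group-mean imputation, this keeps some of the natural variability.
df["income_imputed"] = df.groupby("sex")["income"].ffill()
print(df)
```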

 

Topic 5 : de Vaus : Chapter 9 : Coding


 

-          classifying / coding

 

Classifying Responses

-          creation of classification system → imposes order on data

-          political / cultural – eg apartheid system – classifying by race

-          human constructions

-          precoding → forced response

-          post-coding → open-ended

o   systematic pre existing

o   based on responses

-          importance of coding for tracking changes over time (consistency)

-          standardized coding

 

Allocating Codes to each variable

 

-          multilevel classification schemes

-          what level of detail is required

-          developing set of codes from responses

-          multiple answers

-          coding multiple responses to closed questions

o   Example – 12 qualities of children

§  multiple dichotomy

·         12 variables

o   respond yes or no to each quality

o   requires 12 variables

§  multiple response

·         respondents pick say, two most important qualities

o   variables

§  choice1

§  choice2

-          multiple responses to open questions

o   using multiple response approach can lead to some difficulties with analysis (see the counting sketch below)

§  eg, how many people selected one particular quality

§  would have to look at information in both variables

-          coding numerical data

-          coding missing data

o   not required

o   not ascertained

o   refused to answer

o   did not know answer / no opinion
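
A sketch of the two layouts, and of the extra work the multiple-response layout creates when counting one quality; pandas assumed, column names hypothetical.

```python
# Sketch only: the "qualities of children" example in two layouts.
import pandas as pd

# Multiple dichotomy: one yes/no variable per quality (only 3 of the 12 shown).
dichotomy = pd.DataFrame({
    "honesty":      [1, 0, 1],
    "obedience":    [0, 1, 1],
    "independence": [1, 1, 0],
})
print(dichotomy["honesty"].sum())   # counting a quality is a single column sum

# Multiple response: respondents pick their two most important qualities.
response = pd.DataFrame({
    "choice1": ["honesty", "obedience", "honesty"],
    "choice2": ["independence", "independence", "obedience"],
})
# To count one quality you must look across both variables.
picked_honesty = (response[["choice1", "choice2"]] == "honesty").any(axis=1)
print(picked_honesty.sum())
```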

 

Allocating Column Numbers to each Variable

 

 

Producing a Codebook

 

-          exact wording of question

-          name of variables

-          data type

-          first / last column number

-          valid codes for question

-          missing data codes

-          special coding instructions
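
A sketch of what one codebook entry might record, written as a small Python dict; every detail here is hypothetical.

```python
# Sketch only: one illustrative codebook entry.
codebook_entry = {
    "question": "What is your current marital status?",
    "variable": "marital",
    "data_type": "numeric, single digit",
    "columns": (23, 23),   # first / last column in a fixed-width file
    "valid_codes": {1: "never married", 2: "married", 3: "separated",
                    4: "divorced", 5: "widowed"},
    "missing_codes": {8: "refused to answer", 9: "not ascertained"},
    "coding_instructions": "code de facto relationships as 2",
}
print(codebook_entry["valid_codes"][2])
```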

 

Checking for Coding Errors

 

-          valid range checks

-          filter checks – ie an unemployed person should not have an answer to a job-satisfaction question

-          logical checks – age of child given age of parent

 

 

Entering Data

 

 

 

Issues That Complicate Coding

 

-          multivariable questions

-          several questions being used to code one variable

o   scaling
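
A sketch of the scaling case: several Likert items combined into one variable, with a negatively worded item reverse-coded first; pandas assumed, item names hypothetical.

```python
# Sketch only: build a simple summated scale from three 1-5 Likert items.
import pandas as pd

items = pd.DataFrame({
    "q1_life_satisfying": [5, 2, 4],
    "q2_future_hopeful":  [4, 1, 5],
    "q3_life_pointless":  [1, 5, 2],   # negatively worded item
})

# Reverse-code the negatively worded item so a high score means the same
# thing on every item (for a 1-5 item: new = 6 - old).
items["q3_life_pointless"] = 6 - items["q3_life_pointless"]

# The scale score is the sum (or mean) of the items.
items["wellbeing_scale"] = items.sum(axis=1)
print(items)
```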

 

 

 

 

 

 

GW Comments

 

-          human constructions → bra example